Google recently announced that they use data from search logs to fight webspam
Data from search logs is one tool we use to fight webspam and return cleaner and more relevant results. Logs data such as IP address and cookie information make it possible to create and use metrics that measure the different aspects of our search quality (such as index size and coverage, results “freshness,” and spam).
Whenever we create a new metric, it’s essential to be able to go over our logs data and compute new spam metrics using previous queries or results. We use our search logs to go “back in time” and see how well Google did on queries from months before. When we create a metric that measures a new type of spam more accurately, we not only start tracking our spam success going forward, but we also use logs data to see how we were doing on that type of spam in previous months and years.
The IP and cookie information is important for helping us apply this method only to searches that are from legitimate users as opposed to those that were generated by bots and other false searches. For example, if a bot sends the same queries to Google over and over again, those queries should really be discarded before we measure how much spam our users see. All of this–log data, IP addresses, and cookie information–makes your search results cleaner and more relevant.
Source: Google Blog Post via: Dave Naylor
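To make the bot-filtering idea in the quote above a bit more concrete, here is a minimal Python sketch. It is not Google's actual method, which the post does not describe. The log format (dicts with `ip`, `cookie`, `query` and `saw_spam` keys), the function names and the `max_repeats` threshold are all assumptions made up for illustration.

```python
from collections import Counter


def drop_bot_queries(log_entries, max_repeats=3):
    """Discard searches whose (ip, cookie, query) triple repeats too often.

    Assumption: a legitimate user rarely sends the exact same query more
    than `max_repeats` times, whereas a bot does so over and over.
    """
    counts = Counter((e["ip"], e["cookie"], e["query"]) for e in log_entries)
    return [
        e for e in log_entries
        if counts[(e["ip"], e["cookie"], e["query"])] <= max_repeats
    ]


def spam_rate(log_entries):
    """Fraction of logged searches whose results contained spam."""
    if not log_entries:
        return 0.0
    return sum(1 for e in log_entries if e["saw_spam"]) / len(log_entries)


# Hypothetical log: one bot hammering the same query, plus two real users.
log = [{"ip": "198.51.100.7", "cookie": "bot", "query": "cheap pills", "saw_spam": True}] * 10
log += [
    {"ip": "192.0.2.1", "cookie": "a", "query": "firefox proxy", "saw_spam": False},
    {"ip": "192.0.2.2", "cookie": "b", "query": "tor speed", "saw_spam": True},
]

print("spam rate with bot traffic:   ", spam_rate(log))
print("spam rate without bot traffic:", spam_rate(drop_bot_queries(log)))
```

Without the IP and cookie there would be no way to tell that the ten identical searches came from one source, which is the point the quote is making.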
In his post, Dave Naylor pointed out some techniques to try to avoid detection. These included:
- Make sure you purge your cookies on closing your browser
- Install FoxyProxy or another Firefox proxy plugin
The first part is quite easy: in Firefox, go to Options – Privacy tab, then set cookies to be kept until "I close Firefox". I would also recommend that you clear your private data when closing Firefox (this can include cookies as well). I typically clear my cache, cookies and authenticated sessions.
Dave also mentioned that you can block Google cookies altogether, but this prevents you from using some of the Google services, so I would not bother too much with it.
The next step is preventing Google from obtaining your IP freely. I am not a privacy expert and I am relatively new to the whole proxy/VPN game, so the following information may be a little inaccurate, and there are probably better services out there.
One of the services I have used over the past 12+ months is Xerobank. This is an encrypted virtual private network that is designed to completely hide your identity from the Internet. It basically works by you connecting to their network (via OpenVPN), after which all your traffic is routed through them. Unlike a proxy, it routes all your data through their network rather than just your browser requests. It is not the cheapest option out there at $35 a month, but I find it has excellent performance: I can enable the VPN and browse the Internet with little if any noticeable difference in speed. Apart from the OpenVPN software, they also offer the xB Browser, a free open-source anonymous web browser that can be installed on your PC or run directly from a USB drive. I have not really used the xB Browser, but if I remember correctly it is a modded version of Firefox that incorporates all the Xerobank privacy features. I think it also allows you to use Tor.
The next option is Tor (The Onion Router). This is typically used via plugins such as FoxyProxy and Torbutton. Tor seems to be quite a popular option, and I can see why: it is free and very secure (depending on whose viewpoint you believe). Tor basically operates by using the computers on the network to route encrypted traffic from the start node to the end node. The traffic takes a random route across the network, and to any observer the traffic will appear to originate from the end node. The problem I personally find with Tor is that its greatest strength is also its greatest weakness. The onion routers are operated by volunteers using their own bandwidth at their own cost, and the performance of Tor is reliant on the routers your traffic is passed across. If any one of those routers is running very slowly then performance will be greatly reduced. I am unsure how many routers the traffic passes across, but it only takes one slow router to make browsing the net very, very slow, which frequently happens.
**Edit** I have just installed Tor and tried it out for the first time in ages. It is not exactly blazing fast, but it appears to be running at an acceptable speed. FoxyProxy does warn that if pages don’t load, the cause will be the Tor network being slow or down, so I expect it will still have issues.
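The "one slow router ruins everything" point can be shown with a few lines of Python. This is a toy model only: it assumes the usual three-relay Tor circuit, picks relays uniformly at random (real Tor weights its choice by bandwidth and other factors), and the relay names and speeds are invented.

```python
import random


def pick_circuit(relays, hops=3, rng=random):
    """Choose `hops` distinct relays at random (a simplification of Tor's path selection)."""
    return rng.sample(list(relays), hops)


def circuit_speed(relays, circuit):
    """A circuit cannot move data faster than its slowest relay."""
    return min(relays[name] for name in circuit)


# Hypothetical relays and their spare bandwidth in KB/s.
relays = {"relay_a": 900, "relay_b": 750, "relay_c": 40, "relay_d": 820}

circuit = pick_circuit(relays)
print(circuit, "->", circuit_speed(relays, circuit), "KB/s")
```

Any circuit that happens to include `relay_c` is capped at its speed, no matter how fast the other two relays are.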
Another option is to use a normal proxy server. These can again be set up with FoxyProxy, QuickProxy or SwitchProxy, or you can input the settings directly into Firefox via Options – Advanced – Network – Connection Settings. There will be similar options for IE and Opera. If you are using Firefox I would recommend one of the plugins, as it will allow you to switch the proxy on and off quickly and easily. Proxies work more or less the same as the two techniques above, though only your web traffic is routed through the proxy (unlike the VPN option) and it is normally only routed through one server/router (unlike Tor).
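The same host-and-port settings you would type into the Firefox connection dialog can also be used from a script. Below is a minimal sketch using Python's standard `urllib`. The address `127.0.0.1:8080` is a placeholder, not a real proxy, and the helper names are my own.

```python
import urllib.request


def proxy_settings(host, port):
    """Build the scheme-to-proxy mapping that urllib expects."""
    url = f"http://{host}:{port}"
    return {"http": url, "https": url}


def build_opener_via_proxy(host, port):
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler(proxy_settings(host, port))
    return urllib.request.build_opener(handler)


# Placeholder address: swap in the proxy you actually use.
opener = build_opener_via_proxy("127.0.0.1", 8080)

# Uncomment once a proxy is really listening on that address:
# page = opener.open("http://example.com/", timeout=10).read()
```

As with the browser setting, only requests made through this opener go via the proxy. Everything else on the machine still uses your normal connection.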
There are a large number of proxies available to use, many or most of them free. Proxies provide different levels of anonymity, with some being classed as transparent, anonymous, or high anonymity. Anonymous and high anonymity are the ones to choose if you are trying to hide your IP. As with Tor, the big problem with proxies is that the free ones are prone to running very slowly or going down completely. This is because they typically have a large volume of traffic passing through them while only having limited bandwidth. There are also paid-for proxy services, which should offer faster and more reliable performance; however, I have not tried any of these so cannot recommend any specific ones.
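The three anonymity classes are usually told apart by what the destination server can see in the request headers. The sketch below follows the common convention: a transparent proxy passes your real IP along (typically in `X-Forwarded-For`), an anonymous proxy hides your IP but still announces itself as a proxy (for example with a `Via` header), and a high anonymity proxy does neither. The exact header list and the function name are assumptions, and proxy list sites may use slightly different rules.

```python
# Headers that commonly reveal a proxy is in use (an illustrative, not exhaustive, list).
PROXY_HEADERS = {"via", "x-forwarded-for", "forwarded", "proxy-connection"}


def classify_proxy(headers_seen_by_server, real_ip):
    """Classify a proxy from the headers that reached the destination server."""
    # Simple substring check on header values, good enough for a sketch.
    all_values = " ".join(str(v) for v in headers_seen_by_server.values())
    if real_ip in all_values:
        return "transparent"
    if any(name.lower() in PROXY_HEADERS for name in headers_seen_by_server):
        return "anonymous"
    return "high anonymity"


# 203.0.113.9 stands in for your real IP address.
print(classify_proxy({"X-Forwarded-For": "203.0.113.9", "Via": "1.1 proxy"}, "203.0.113.9"))
print(classify_proxy({"Via": "1.1 proxy"}, "203.0.113.9"))
print(classify_proxy({"User-Agent": "Firefox"}, "203.0.113.9"))
```

To test a real proxy you would request a page that echoes back the headers it received and feed those into the function.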
There are some other issues that you need to be aware of with all the above solutions. Your traffic is being routed via another machine, so if you are based in the UK like myself then Google, or whoever looks at your IP, may perceive you as being located somewhere else. In the case of Xerobank I believe the IP address will be located in Germany. With Tor I think your IP could end up being located absolutely anywhere, and with a proxy the location depends on the proxy you choose.
Then there is the issue of the actual security of the data. It is impossible to know whether the proxy you are using is spying on your data, and for this reason I would only recommend using proxy services for normal browsing. For anything you log into, I would recommend just going through your ISP. I believe Tor is a little safer, as someone looking at the traffic does not actually know where it originated from; however, I would still not recommend logging onto your bank account while on Tor.
Finally, there is also the issue of Google tracking you via the proxy/VPN you are actually using. I am not 100% sure whether Xerobank issues a different IP each time, but if it is the same IP every time then there is only limited benefit to routing your traffic via the VPN. With a proxy I would recommend keeping a list of reliable proxies and using different ones on different days. FoxyProxy should allow you to use multiple proxies.
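Using a different proxy on different days is easy to automate. Here is a minimal sketch that cycles through a list by date. The function name is my own, and the addresses are reserved documentation IPs rather than real proxies.

```python
import datetime


def proxy_for_day(proxies, day=None):
    """Pick a proxy from the list based on the date, cycling through in order."""
    if not proxies:
        raise ValueError("need at least one proxy")
    if day is None:
        day = datetime.date.today()
    # toordinal() increases by one each day, so consecutive days get consecutive proxies.
    return proxies[day.toordinal() % len(proxies)]


# Placeholder list: replace with proxies you have found to be reliable.
my_proxies = ["192.0.2.10:8080", "192.0.2.20:3128", "192.0.2.30:8080"]

print("Today's proxy:", proxy_for_day(my_proxies))
```

A fixed rotation like this is predictable, so picking at random from the list each session would be a reasonable alternative.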
In my next post I will explain how to set up Tor/Vidalia and FoxyProxy.



















