The following text is
copyright 2006 by Network World. Permission is hereby given for reproduction,
as long as attribution is given and this notice is included.
Thanks AOL, you have endangered all of us
By Scott Bradner
As you all know by now, in early
August someone over at AOL had a brain fart and posted a few months of raw
search queries by 650,000 AOL customers.
The queries were semi-anonymized by using ID numbers rather than IP addresses
for the query source. But as a
number of people quickly found out, lots of individuals could be easily
identified by looking at what they searched for. The New York Times even published a front page story about
one of the searchers they were able to identify. What AOL did was stupid and, to their credit, AOL said so,
but we are all in greater danger because of what AOL did even if we are not all
AOL users.
Even though AOL removed their
posting quite quickly, the data has been mirrored in a number of places around
the net (see http://www.gregsadetsky.com/aol-data/). AOL's data is quite an interesting and sometimes deeply
troubling snapshot of the psychology of the US in the Internet age. A number of web sites that provide
analysis or access to the data have sprung up in the last week or so (e.g.,
http://www.dontdelete.com/). In
addition to the Times, a number of other newspapers have done their own
analysis of the data or have asked outside researchers to do so. For example, the Wall Street Journal
reported that over 3 million of the 26 million queries were "some form of
explicit sexual search" vs. a little over 3,000 for "god".
I grabbed a copy of the data and
poked around a bit. There are a
lot of people looking for not-so-nice things: over 800 queries included
"child porn," 38 included both the words "build" and
"bomb," over 400 included "bestiality,"
over 131,000 included "nude," almost 39,000
included the word "kill" (of which 900 were queries for "To Kill
a Mockingbird"), over 20,000 included "HIV," over
11,000 included "AIDS," and 284 included the word
"toothpick."
Somewhat worrisome for AOL (and
Google) was the fact that over half a million of the searches on the AOL search
engine were in fact requests to use Google. Google may not like it, but "to google" is becoming
the generic term for performing an Internet search.
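For anyone who wants to poke around in the same way, the kind of counting described above can be sketched in a few lines of Python. This is a minimal sketch and not necessarily how the numbers in this column were produced. It assumes the log is a tab-separated text file with a header row containing a column named "Query"; that layout, the file name, and the function names read_queries and count_containing are all assumptions made for illustration.

```python
import csv
from typing import Iterable, Iterator


def read_queries(path: str) -> Iterator[str]:
    """Yield the query text from each row of a tab-separated log.

    Assumes a header row with a column named "Query" (an assumed layout).
    """
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Rows with a missing field come back as None; treat them as empty.
            yield row.get("Query") or ""


def count_containing(queries: Iterable[str], *terms: str) -> int:
    """Count queries that contain every one of the terms, ignoring case.

    This is a plain substring match, so "kill" also matches "skill".
    """
    wanted = [term.lower() for term in terms]
    return sum(1 for query in queries if all(term in query.lower() for term in wanted))


if __name__ == "__main__":
    # Hypothetical file name; substitute whatever copy of the data you have.
    print(count_containing(read_queries("queries.txt"), "build", "bomb"))
```

Because the match is on substrings, a count for "kill" will sweep up book titles and words like "skill" along with the worrisome queries, which is one more reason raw numbers like these need a second look before anyone draws conclusions from them.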
The biggest problem with AOL
releasing this data is not the potential privacy invasion of its own users,
although that can be significant.
I expect a number of police agencies have already served or will soon serve AOL
with subpoenas demanding identification information for specific people who
searched for suspicious content, such as building bombs or child porn.
The biggest problem with what AOL
did is that it destroyed the security through obscurity cover over all the data
that the search engine companies have been collecting on their users. Sure, a lot of people have been worried
about this data, particularly after the US government, in some type of wild
goose chase, subpoenaed huge amounts of raw search info from these
companies. (See "King George
I on privacy" - http://www.networkworld.com/columnists/2006/013006bradner.html)
But now every divorce lawyer, private eye or local cop will first think of this
easy way to get dirt on someone.
Have fun trying to explain some of the searches you have done in the
past to your soon-to-be ex's lawyer.
There are ways to search without
leaving a trail, for example by using anonymizing networks such as Tor (see
http://tor.eff.org/). See this story on how to set it up to anonymize your
Google searches:
http://www.infoshop.org/inews/article.php?story=20060813064045562
It would be great if Congress
would just put a halt to these massive databases, but your privacy is apparently
not a Congressional concern. (See "Congress fails to grasp security
risk" http://www.networkworld.com/columnists/2006/081406bradner.html) So
have your browser remove Google's cookies (See "Are Microsoft's cookies
super?" http://www.networkworld.com/columnists/2006/051506bradner.html)
and use an anonymizing net or public Internet kiosk if you want to do a search
that might be hard to explain five years from now.
Disclaimer: Harvard is infrequently anonymous and
even less frequently does web searches, so the above is my own ramble and
warning.