The following text is copyright 2005 by Network World. Permission is hereby
given for reproduction, as long as attribution is given and this notice is
included.
Refusal, ignorance, arrogance or PR?
By Scott Bradner
In mid-March, French news service Agence France Presse (AFP) sued Google in a
U.S. District Court for copyright violations. AFP demanded that Google News
stop including AFP's material on the Google News site and asked for $17.5
million to compensate AFP for the damage that Google News had caused. You will
pardon me if I express some doubts about the actual motivation for this
lawsuit.
I've written in the past about Google News
(http://www.nwfusion.com/columnists/2004/0308bradner.html). I consider it one
of the most useful sites on the Internet. I use it to fill out the news
snippets that I get from most other news sources. That said, I do get
frustrated at Google News links to subscription-only sites, since I cannot
access some of the stories that look interesting. I have always assumed that
such sites welcome Google's pointers because the sites get free advertising for
themselves and thus may get some additional customers.

In that context, the AFP suit makes me wonder what's up with them. Google News
does not show full articles, so I find it hard to understand what damage could
mount up to over $17 million -- maybe AFP has a very high opinion of its
ability to come up with inventive headlines and feels that other news
organizations will rip them off if the headlines, which Google News does show,
are visible. Or maybe the reason that AFP does not want Google News to point to
AFP's material is that AFP fears that getting more subscribers will mean that
AFP would have to hire more people to deal with them.
Even if I do not understand why a company in the business of selling its
services does not want more people to know about those services, it does not
look like it would be all that hard for AFP to ensure that the AFP sites are
skipped over by Google. Google has an easy-to-find web page that says quite
clearly how to keep a site from being scanned
(http://www.google.com/remove.html). Basically, all you do if you want Google
to skip all or part of your site is put one or more files named "robots.txt" on
your web site. For example, your whole site will be skipped if you have such a
file at the root of your web server containing these two lines:
User-agent: *
Disallow: /
Robots.txt files can get quite fancy; see
http://www.searchengineworld.com/robots/robots_tutorial.htm for more
information.
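For anyone who wants to check what a given robots.txt file actually tells a crawler, here is a minimal sketch using the robots.txt parser in Python's standard library. The user-agent string "Googlebot", the example.com URLs, and the /archive/ path are assumptions picked for illustration, and the helper name is_allowed is made up for this sketch; it shows how a well-behaved crawler would read the rules, not how Google's crawlers are implemented.

```python
from urllib.robotparser import RobotFileParser

# The two-line file quoted above: every crawler is asked to skip the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# A hypothetical partial rule set: only the /archive/ directory is off limits.
PARTIAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "Googlebot" and the example.com URLs are illustrative assumptions.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/news/story.html"))
    print(is_allowed(PARTIAL_ROBOTS_TXT, "Googlebot", "http://www.example.com/archive/old.html"))
    print(is_allowed(PARTIAL_ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"))
```

With the two-line file, every URL on the site is reported as off limits; with the partial file, only URLs under /archive/ are.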
I suppose it is possible that the Google News web crawlers do not pay attention
to the robots.txt files that Google says it respects for its other web
crawling, but that does not seem all that likely. It is more likely that AFP
somehow did not know how easy it would be to do two minutes' worth of work on
their own web site to ensure that their material would not be included. That
tactic would have taken far less effort than pestering Google to stop scanning,
as they claim to have done. It would also have taken far less effort than
filing a lawsuit. Well, maybe it is not all that likely that no one at AFP knew
about robots.txt files -- maybe there is some other reason that AFP did not
take the easy path. The two that spring to mind are arrogance ('stop' said King
Canute to the tide, 'splash' said the tide to King Canute) or a desire for
publicity.
disclaimer: Of course, you never see either arrogance or a desire for publicity
in relation to Harvard, so the above observation is mine alone.