This story appeared on Network World at
http://www.networkworld.com/columnists/2006/052906bradner.html
The future library: A 50-petabyte iPod?
'Net Insider
By Scott Bradner, Network World, 05/29/06
I started playing with digitized
literature almost 25 years ago. A lot has changed in the digital books biz
since then.
Some of the history, current
status, future possibilities and clashing business models in this area were
recently explored in a cover "manifesto" in The New York Times
Magazine by Wired writer Kevin Kelly. Spoiler: It will all come out fine in the
end, but the length of time you will have to wait depends on when Congress
stops moving the copyright goal posts.
In the summer of 1982, a classics
graduate student working in the computer lab I ran in the Harvard psychology
department got a copy of the Thesaurus Linguae Graecae, a large batch of
classical Greek literature that had been typed into computers someplace outside
the United States, with HP co-founder David Packard paying the bill. I, along
with people in the Harvard Classics and English departments, convinced the
university administration to pay for a huge - at the time - 300MB disk drive to
store this text as well as a collection of Middle English literature.
Over the next few years the
graduate student, Greg Crane, now a professor at Tufts University, put together
the first version of what became the Perseus Project. This is a Weblike mixture
of text and clickable links to other material, done many years before the Web
and search engines showed up.
This well-indexed online text
changed what sort of things would be reasonable Ph.D. dissertation topics.
Before Crane's work, a student could arrive at a topic after years of
index-card-based investigations of how specific words were used in classical
Greek; after Crane's effort, that became a weekend task.
Kelly's Times Magazine story
explores what happens in a future where you might have petabytes of digital
material being attacked by cutting-edge search engines. Kelly estimates that a
50-petabyte disk farm could hold all the 32 million books, 750 million stories
and essays, 25 million songs, 500 million images, 500,000 movies, TV shows and
short films and 100 billion public Web pages.
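For a rough sense of scale, the figures quoted above can be put side by side with the 300MB drive from 1982. This is only a back-of-envelope sketch: it assumes decimal units (1MB = 10**6 bytes, 1PB = 10**15 bytes), the variable names are mine, and the per-item average is a crude number dominated by the Web-page count, not anything Kelly claims.

```python
# Back-of-envelope scale check using only figures quoted in the column.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 PB = 10**15 bytes).
MB = 10**6
PB = 10**15

disk_1982 = 300 * MB   # the "huge" drive bought in 1982
farm = 50 * PB         # Kelly's estimated disk farm

# Item counts as quoted from Kelly's article.
items = {
    "books": 32_000_000,
    "stories and essays": 750_000_000,
    "songs": 25_000_000,
    "images": 500_000_000,
    "movies, TV shows and short films": 500_000,
    "public Web pages": 100_000_000_000,
}

# How many whole 1982-era drives would fit in the disk farm.
drives_needed = farm // disk_1982

# Total number of items and the crude average space per item.
total_items = sum(items.values())
avg_bytes_per_item = farm / total_items

print(f"300MB drives needed: {drives_needed:,}")
print(f"Total items: {total_items:,}")
print(f"Average bytes per item: {avg_bytes_per_item:,.0f}")
```

Under these assumptions the disk farm works out to roughly 167 million of the 1982 drives, holding a little over 101 billion items.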
Quite a bit of the material is
already digitized, including new books, DVD movies and CD music. The story
describes multiple projects under way to try to catch up with digitizing older
books and discusses the legal and access issues caused by Congress' continual
extension of the copyright period.
A few years ago in a column I
quoted a student who told me "if it is not on the Web, then it does not
exist." The same point was reinforced last week when I suggested that a
graduate student see whether he could find some information on a particular
topic in the library that was one floor down from my office, and he admitted to
having been in the library only once or twice - and to not having looked
anything up.
Kelly paints a picture in which
physical libraries might not be needed, other than for books published by
companies whose lawyers are not ready to embrace a searchable digital world. In
Kelly's future world, books are no longer individual items but are parts of a
vast relational database on steroids where your biggest problems will be
figuring out how to ask the question you want answered and figuring out what
is left that could be a good dissertation topic. All in all, a very good read.
Disclaimer: If physical libraries
fade away, Harvard is going to wind up with a lot of prime real estate that
will be bitterly fought over, but I did not ask the view of the university
library folk about The New York Times story, so the above is my own review.
All contents copyright 1995-2006
Network World, Inc. http://www.networkworld.com