The Chronicle Review

Thank You, Judge Chin

Why the Google Books setback is an opportunity to reassert academic values

Chronicle Illustration by Ellen Winkler

March 24, 2011

When federal Judge Denny Chin rejected the complex settlement among Google, large commercial publishers, and the elite Authors Guild this week, he may have set back Google's plans to digitize millions of books and offer them for sale online, thereby establishing monopoly control over the sale of electronic versions of the vast majority of those published in the 20th century.

But Chin also generated some outstanding opportunities for scholars and libraries to build something far better than a used bookstore.

Google might decide to scale back its ambitions for its Google Book Search program. Or it might decide to fight on for two, three, or five more years to be able to build the sort of dominant, comprehensive service it has always wanted. Regardless, we now have time to strengthen and champion the values of academe and librarianship instead of crass commercialism.

The first thing that we must do is realize what is within our grasp. We have the technological systems in place to connect the vast majority of people in the world with much, if not most, of the greatest collections of knowledge. We have impressive digital databases. We have millions of hours of sound recordings. We have 100 years of film and video available. We have, of course, millions of books.

We could conceivably build a system that could ensure that a child growing up in South Africa has the same opportunities to explore science, agricultural techniques, or the poetry of Japan as a child growing up in Sweden or Canada has today. That should be our goal, anyway.

We lack only one thing: the political will to fight for a great and noble information system—a global digital library. I'm not talking about the haphazard rush we've seen to date to digitize the stacks of major research libraries. Nor a commercial venture like Google's. I'm proposing what I call the "Human Knowledge Project" in my book, The Googlization of Everything (And Why We Should Worry). What I mean is a truly global digital library. To generate support for that, we need to identify the political and legal constraints, as well as articulate the payoffs.

That entails a formidable series of tasks. It might take 10, 20, or even 50 years. But there is no reason we should settle for expediency at the expense of excellence. After a few conversations, we might decide it's not worth the effort or cost. But at least we would have tried. And that's so much healthier than waiting for the Big Rich Magic Company in the Clouds to do all this for us—on its terms.

I call this the "Human Knowledge Project" to echo the Human Genome Project. Back in the early 1990s, when attempts to annotate the human genome were scattered and uncoordinated across three continents, the methods were slow. No one predicted a completed database within 20 years. Then a brilliant, ambitious company called Celera announced it would do the job itself and ask for no public money. Celera would use newer, faster methods. And all it asked was the right to extract a royalty for any marketable treatments that might emerge from the use of its database.

Unlike some university libraries, which celebrated Google's announcement of its Book Search project in 2004 because they would no longer have to spend money to get digital files of their books, scientists working on the genome stood up and organized. They appealed to the governments that had been supporting genome research to step up financing and coordinate efforts so that this essential knowledge would never be exclusively privatized.

Those scientists showed the virtue of political activism in the face of market fundamentalism. They stood up for the public interest and won. The rest of us in academe could use some of their vision and gumption.

For any global digital library to work, we need to reform the global copyright system. Google would never have run into trouble had it opted to change copyright through the legislatures of the world rather than through the courts of the United States, where it used a class-action settlement to try to create a monopolistic licensing system over unclaimed works. The alternative would not have been easy, perhaps not even likely to succeed. But courts should not be making law at the behest of big companies. That's what Congress is for.

Correcting the "orphan works" problem should be job No. 1. Every historian working on the 20th century understands the frustration of trying to quote from an obscure text or use an image from the 1930s in a published work. The burden is entirely on the scholar to identify the copyright holder (if there is one) and negotiate rights. All too often, the work goes unused because the publisher declines to take the chance that a copyright holder will rise up after publication and shake the publisher down. Congress has considered bills in recent years to limit the liability for scholars who make a good-faith effort to find copyright holders. And that challenge, how to negotiate publication rights with an unfindable or unidentifiable person, is what the judge ruled Google's proposed settlement had not dealt with appropriately. It's one of the main reasons that similar efforts to create digital collections have shied away from material published after 1923, which may still be covered by copyright.

There is a simpler and better way to defeat the orphan-work problem. Lawrence Lessig of the Harvard Law School proposed it a few years back. If Congress would just go back to a system that requires active renewal of copyright instead of terms that last so long they might as well be perpetual, we could accomplish two goals. For a nominal transaction fee, Disney and Random House could maintain the rights to their older works, for which there is still a vibrant market. But the vast majority of work that has no market value would be liberated by default. Copyright holders would have to step forward to renew and retain the rights to work they hoped to exploit. Most older work would enter the public domain, harming no one yet helping everyone. Of course big copyright-holding companies would oppose such a plan. But if Google, Yahoo, Microsoft, Amazon, and every major university joined a coalition to push for more rational copyright, we could build a system people actually trust and value rather than our absurd and maddening system that prevents people from doing creative work.

Meanwhile, we should support and invest in three major projects. Two have the advantage of actually already existing. And the third is a practical and executable plan that is gaining momentum thanks to the vision of some of the finest scholars in the country.

The Open Book Alliance has been making preservation-quality scans of public-domain library material since 2005. It is a collaborative not-for-profit project working with support from Internet companies, foundations, and dozens of university and public libraries. It's the model of how to build a high-quality, noncommercial, open digital collection. It was the brainchild of the Internet visionary Brewster Kahle. If it had received half the support and attention that Google Book Search did over the past six years, the alliance would be considered the chief medium for digitized material for research. One of its key strengths, however, is also its weakness: It does not include materials that could infringe on copyright. So it's a less useful collection for those who wish to explore the 20th and 21st centuries than what Google offers. Judge Chin did not rule on copyright issues, only on the terms of the settlement among the company, authors, and publishers. At this point, no court has ordered Google to stop digitizing copyrighted books. So Google still has the best collection out there.

In contrast to the Open Book Alliance's oppositional stance toward Google, the HathiTrust program is an effort to make the most out of university partnerships with Google Books. Hathi partners, including more than 50 leading research universities, support repositories for digitized works. While that includes material made available by Google, it is not limited to it. Hathi corrects for many of the threats that a privately digitized library raises and that many librarians and scholars (including myself) have been warning about since the Google scanning project was announced: for example, the exploitation of user information for marketing, or search results that favor more profitable content over less profitable content. And Hathi's site is easy to use and brilliantly constructed. It's a model of how a library search system should work.

Relying on Google to do the scanning as Hathi does, however, generates some glaring problems as well. First, the quality of Google scans is notoriously poor. Second, because nothing in the proposed settlement ever insulated universities from liability, a copyright holder could still sue participating universities for using unauthorized scans that Google had made for them. The potential liability of universities over the barter transaction (access to books in exchange for a digital copy from Google) is still an open and festering legal question. And because of the terms of the deal some university libraries have made with Google, full use of their collections is limited to their faculty and students. But at least Hathi offers those scholars and students access to the 20th century. And much of the content comes to Hathi from sources beyond Google. Hathi is working with publishers directly to enhance availability and preservation. So access to that material could be much easier—and legal. It's an invaluable resource and deserves our full support.

The third project that deserves our support is the most exciting: the Digital Public Library of America. The Berkman Center for Internet & Society at Harvard University is in the process of developing plans for a national digital library. The historian Robert Darnton, for the past few years the head of libraries at Harvard, inspired the effort through a series of articles in The New York Review of Books that began as a critique of Google's plans and moved toward a positive vision of what we might better build. Of all the efforts at work today to extend the utility of and access to digitized material, the DPLA is the closest to realizing the values of scholarship and librarianship. Those who have been active in the planning process (I am tangentially involved) have identified hurdles and opportunities, like copyright complications and user-privacy concerns. Many of the most important leaders in digitizing materials are participating, including my colleague at the University of Virginia, the textual scholar Jerome McGann, and the University of Michigan library head, Paul N. Courant.

Each of these projects supports values that are dear to academe and academic libraries: a commitment to the public good rather than the bottom line; respect for users; a concern for preservation and open standards; and open conversations about policy. They all have the potential to serve the public. But they all depend on continued and coordinated commitment by individuals, foundations, and academic institutions. We can have it all. We can have a universal digital library that enhances preservation, protects user privacy, and resists the perils of privatization.

It's all as feasible as it is desirable. We just have to want it badly enough. Judge Chin has given us the opening we need.

Siva Vaidhyanathan is a professor of media studies and law at the University of Virginia. His most recent book, The Googlization of Everything (And Why We Should Worry), was published this month by the University of California Press.