The ABC’s of GBS: Part 3
Google Library, The Lawsuits, and Is Charkin Barking Up the Right Tree?

June 18th 2007

Below, Evan Schnittman shares his opinion, based on his knowledge as a rights guy, on Google Library. This isn’t Oxford University Press’s official stance – but represents just one of the many opinions floating around our office on this very tricky subject.

By Evan Schnittman

To avoid confusion, let’s get everyone on the same page. Google Library (GL, as opposed to Google Book Search) is a program that has scanning facilities set up at 17+ libraries around the globe. These facilities digitize the print books in a given collection and then index the text so that it can be discovered by Google’s search engine. If the book is in copyright, the search engine displays only a snippet (250 characters or so) of the book when there is a search hit. In exchange for the libraries sharing their collections, Google gives each library a digital file of each book for its archives. GL should not be confused with Google Book Search (GBS), which is a publisher-sanctioned program in which Google licenses from publishers the right to digitize, index, and display 20% of a book for the purpose of making it “discoverable” in Google’s search engine. See The ABC’s of GBS, Part 1 for a complete description.

Over the last couple of weeks there has been some buzz in the tech and publishing blogosphere over a stunt pulled by Macmillan’s UK-based CEO Richard Charkin at BEA (Book Expo America). In an effort to illustrate his view on GL, Charkin went into the Google stand with an accomplice, took two laptops, and waited nearby to see what would happen (see Charkblog). After some time, a Google rep asked what was going on – Charkin pointed out that he was doing exactly what Google was doing to publishers: as “there was no sign that said ’do not steal the laptops,’” he felt he had the right to walk off with one. While I found this extremely amusing as a prank (Charkin Punk’d Google!), I think the effort missed a major point.

Google interpreted copyright law in a search-engine-friendly manner and decided that the act of digitizing books found in libraries, indexing that content, and then displaying only the smallest “snippet” of that content (250 characters) was no different from what they do when spidering the internet and displaying snippet results. This is where the worlds of the internet and book publishing collide culturally – Charkin sees this as theft, Google sees it as how they operate on the internet – indexing content in order to make it discoverable without having to ask permission. However, this is not the most disturbing aspect of the GL program. Had Charkin taken the laptop and given it to a third party, he would have been closer to what is going on with GL.

Even better, imagine Charkin had gone to one of the thousands of server farms that Google employs and convinced a few to allow him to extract Google’s sacred and vaunted search algorithm from their servers. His promise to them would be that he would only use the algorithm to make his business better at achieving higher search results and he would give a copy of the algorithm back to them so they could archive it. (You need to pretend for a minute that Google’s search algorithm is actually on servers that aren’t in their direct control.)

Google’s reaction to this would have been swift and severely punitive. Google puts their search algorithm on third-party servers with the understanding that a server farm will have use of and access to the algorithm (Google intellectual property) only in the manner that was agreed (keeping the search running, etc.). The server farm would not have any rights to the algorithm – even though it came on the server that Google set up at the server farm. Copying the algorithm, even for the seemingly innocent purpose of archiving it for posterity, would not be permissible in any way or form.

As absurd as this example may be, this is what Google and its institutional partners are doing in the GL program. Google is going to owners of a physical manifestation (books) of Publisher and Author intellectual property and using the intellectual property in a manner not considered in the original agreement (sale) of the book. Google isn’t just digitizing and indexing the books – they are then giving the files to the library – all without a license to do so from the IP owners.

The problem is that, as lofty as the intentions of the libraries and Google may be, they do not have the right to do what they are doing. Libraries do not have the right to give their books to anyone to digitize, for any purpose, without the explicit permission of the copyright holder (they have the right to make one single copy – not two, just one – themselves for the purpose of archiving). Google doesn’t have the right to digitize and index the content without permission either – and they especially don’t have the right to digitize the content and then give it to a third party!

This last point is where things have really gone pear-shaped. Google and libraries are doing something they need a license to do – and rather than ask for one, they are asking the copyright holders to provide a list of properties they wish to protect or exclude from this program. Neither Google nor the participating libraries are following the well-established and longstanding protocol of rights licensing. By pushing onto copyright holders the onus of identifying individual copyrights that should NOT be included in an unlicensed program, they are turning the very nature of licensing upside down.

To put a finer point on this – OUP (or any other publisher for that matter) does not produce a list of their copyrights they DO NOT WISH TO LICENSE and send the list to every potential licensee in the world saying “please do not license the following titles.” Publishers are not being asked to review lists of titles found in libraries and decide which they want in GL – they are being asked to provide a list of titles to GL that should not be included. It’s the alternate universe of intellectual property management where Mr. Spock has that funky Fu Manchu mustache.

Google Library can be an important program for publishers and authors when we find a way back to the normal Spock universe of content licensing. The wholesale digitization and discoverability of the most important content in our collective intellectual history is a really amazing concept. While publishers applaud the ambition, vision, and strength of resolve Google has shown, we need to live in a world that respects all kinds of property rights – physical and intellectual. Until then, we will not allow Google or anyone else to use our content as they see fit – unless they ask.

Mr. Charkin, next time submit a request for a laptop from Google – they may just say yes.

Evan Schnittman is OUP’s Vice President of Business Development and Rights for the Academic and USA Divisions. His career in publishing spans nearly 20 years and includes positions as varied as Executive Vice President at The Princeton Review and Professor at New York University’s Center for Publishing. He lives in New Jersey with his wife and two children.

Recent Comments

John Cowan, 18th June 2007

What Google does in GL is the same thing they do in web search: they display a snippet of copyrighted content. Indeed, web search also provides a cached version of the content. If doing that is fair use, indexing books and displaying snippets obviously is as well.

Adam Hodgkin, 19th June 2007

Comment well put and point well made. Next time the AAP wants a public spokesperson they should call on you. Google will probably have to climb down in the direction you indicate… [Google spokesperson in 2009 announcing negotiated standoff with publishers]: “We goofed. We have done the work, the books have been copied, the texts have been indexed, but we will not display snippets of unsanctioned copyright works until the rights situation is less murky, or until there is broad and general agreement about how to handle orphan copyrights.”

Michael, 20th June 2007

@John Cowan: There’s a huge difference between GL and web search. With web search, Google is adding each page by asking for a copy of that page from the page owner, who presumably has the right to make copies of that page. (That’s how the web works at the simplest level — a browser sends an http request for a page, and the server sends a copy of the page.) Google is only indexing web pages which are made public at no charge, and they’re doing it by asking the owners of that page. With GL, Google is adding a book by borrowing it from someone who does NOT have the right to reproduce the book, making a couple of complete copies of that book for themselves and the library, and NOT asking the copyright holder who does have the right to make copies.

Google allows web sites to opt out of being added to Google web search by having a “noindex” tag that Google respects. Publishers feel they already have the equivalent in books — the copyright notice. It’s unclear why Google refuses to respect it.

Richard Ahlquist, 20th June 2007

@Michael, you have a very good point. There is a substantial difference. What are we as consumers to do, then? I would love to be able to search the great works of the world but can’t, because there is no way to search them all in one place.

Shall I wait till copyright holders can all agree upon a system, much like the movie industry did with Blu-ray and HD DVD? What a boon for the consumer. I can imagine the decades it would take to settle on a system, and then have each publisher decide to go with their own custom system and charge everyone to search. That would be helpful too.

This is the most selfish generation in the history of the world. You have one side who wants to charge to share their intellectual property and another side who wants to consume that intellectual property in the manner and time they want.

@Evan, you too have a good point. Going by the letter of the law, they do not have a right to do what they are doing. So let’s look at it this way.

I can go in with a notepad and a pen and take notes about passages I may need for, say, a term paper, so that in the future I won’t have to search the whole volume for the content I want. Are you trying to tell me that the copyright holder would have a problem with me doing this? What if I typed my notes on a laptop so I could search my notes more easily? What if I included direct quotes in my laptop? The area gets a bit more grey. Can I pay someone to provide that service for me? (Remember, paying someone for the service is not the same as paying them for the copyrighted content.)

Now say I am blind, and the works aren’t available in braille, but I have a terminal with voice synthesis capability and a braille keyboard. Are you saying it is the direct intent of the copyright holders to prevent this differently abled person from ever being able to access their work? Or are you just saying that the copyright holder deserves to make more for making their book available in this different format?

They may not have the legal right, but as with all media and all intellectual property, the world has moved on, much like when Gutenberg arrived on the scene. Perhaps it’s time for publishers to pay for the worth of the creation, rather than the quantity it has sold?

John Cowan, 21st June 2007

Michael writes: “Google is only indexing web pages which are made public at no charge, and they’re doing it by asking the owners of that page.”

But Google isn’t only doing that. It is also excerpting part of the web page and using the snippet to make something new, a search page served from their own site. That is the reuse of someone else’s copyrighted content to make a derivative work, and the only reason Google can do so is that the snippets make only fair use (“fair dealing” in some jurisdictions) of the original content.

The point about the original full-text copies of the books is disposed of readily: Google is dealing here with U.S. libraries, which explicitly have special rights (under 17 U.S.C. § 108) to make archival copies of the works they hold. Google is returning the full-text copies to those libraries and, as far as I know, to nobody else, except where the works in question are in the public domain.

Furthermore, the implicit license to copy a web page that is provided by putting the page on a server is itself pretty limited. You can’t, for example, copy the whole page into your own by framing it or otherwise: some copyright holders aggressively defend their rights against that kind of infringement.

So I continue to think that snippets of text, whether from print or online sources, are legitimate fair use of copyright content.

Disclaimer: I am not a lawyer; this is not legal advice.

CaptainBooshi, 21st June 2007

Actually, to make what he did exactly what GL is doing, he would have to walk up to the booth, take the laptop and make a perfect copy of it, give the laptop back to Google, and walk off with the copy. Then, if anybody wanted to look through the laptop, he would show very small portions of any files relevant to the questions they asked, and say that if they wanted to know any more, they would have to go to Google.

Much less malicious than you’d like to admit.

CaptainBooshi, 21st June 2007

Sorry, I submitted too soon. To finish:
Although I have to admit the legality of it is still dubious, the situation is not helped by histrionics and exaggerations, as seen here.

Michael, 22nd June 2007

Richard asks, “What are we as a consumer to do then?” My answer would be “Push for changes in the laws.” Copyright laws need to be reevaluated and changed to keep up with the changing needs of our society, and not just the changing desires of our largest corporations. If we want to create a compulsory license for Google to make as many complete digitized copies as they need on their back end as long as they only provide snippets to the public, let’s do that. If we want to create a new fair use exception that says that those complete back-end copies are fair use, let’s do that. There are ways to balance competing interests that should benefit society as a whole and allow people and companies to follow the law while doing what they generally want to do. Sadly, we’re not finding those balances even in relatively simple areas like file-sharing of music. No wonder we’re unprepared to find that balance for an area like Google Library, where Google pretends that the full-copy back end is magic and the snippet-based front end is all that matters. Even when a column like this one focuses explicitly on how Google is starting their back end by making complete copies, folks like the first commenter immediately talk about the snippet-based front end as if the back end is magic.

Dirtboy, 23rd June 2007

Indexing books is as much “fair use” as I can possibly think of. GL is not providing third parties with a usable version of the books. It is simply providing potential readers with the means to find that book based on their interests and needs. Once upon an analog time, we had to sift through card catalogs in libraries to find books of interest. Those cards did not index books in their entirety, but provided only the information necessary for the user to decide if the book was worth opening (although most of the time the cards said too little and the book was not what the card made it appear). GL will provide an expanded version of the card catalog, which will “only” differ in that the results are more detailed. This does not provide third parties with more than a snippet of text. And laws allow for publication of short quotes from copyrighted material so long as they are not substantive. Google is doing book publishers a favor because GL will make their books much more accessible to readers – readers who will have to buy the book or check it out from a library.

If Charkin wanted to replicate this activity, he should have taken a picture of the computer, pasted it to a sign with a short snippet of content, and walked around the expo sign in hand. People could have then seen what Google had to offer at their booth and thus been encouraged to check it out.

Publishers are like spoiled children. They have had the sandbox all to themselves for a millennium and now they are rabid at having to share it with someone. The new kid on the block isn’t stealing anything… it is pointing at what publishers have done and saying, “Hey, like it? You should come over here and check it out.”

John Cowan, 25th June 2007

Google, or rather the libraries for whom it is an agent, already possesses the compulsory license, as I noted above.

Analogy Boy, 26th June 2007

Remind me to take Dirtboy’s car for a joyride for a couple of weeks. I won’t give anyone else rides in it; I’ll only let other people see the outside, and I’ll tell them that they should go buy the car from Dirtboy if they want it. Since I’m helping out Dirtboy by sending him car buyers, and since I’m not letting anyone else drive his car, it’s apparently all ok.

Ignore the fact that I stole the car in the first place. Focus on the fact that I’m not loaning the car out to other people.
