Oxford University Press's
Academic Insights for the Thinking World

The ABC’s of GBS: Part 1

Google Book Search explained as a tool for marketing books through discoverability on the internet.

For a while, Evan Schnittman (his bio is at the bottom of the post) and I have been planning a series of blog posts which would provide a look “Inside Oxford.” So last week, when the Financial Times article about digitizing books came out, I was eager to post his full response (a shorter response appeared in the paper on Saturday). This week we present the first installment of Evan’s series on “The ABC’s of GBS.” With his help, we hope to untangle the intricacies of, and express our excitement about, the future of publishing. Be sure to check back next week for Part 2 of this series.

GBS (Google Book Search) is the publisher-sanctioned effort by Google to “organize the world’s information” by licensing the right to scan (or upload a digital file), host, index, and display search-relevant portions of books. This effort is not to be confused with Google Library (not publisher-sanctioned) or Google Scholar (journals focused) or even the newly named initiative, Google Print (Print ad space sales for magazines).

In its purest form, Google Book Search is a marketing tool for books on the internet. The theory behind GBS is that Google indexes the entire content of a book and applies its powerful search algorithm to the indexed content, which enables internet searches to “discover” relevant book content. The search “result,” when clicked, leads directly to an image of the page in the book where the result first appears. The consumer is allowed to read a few pages of content before jumping to the next result. There are options to do additional searches within the book, but it all ends after about 20% of the content has been viewed. This 20% freeze lasts for a month, after which the counter restarts and the customer can begin searching and reading again.
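For readers who think in code, the viewing limit described above can be sketched roughly as follows. This is only an illustration of the rule as stated in this post, not Google's implementation: the class name `PreviewCounter`, the method `request_page`, the 30-day reading of “a month,” and the choice to start the freeze at the moment the cap is hit are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Set

CAP_PERCENT = 20                     # "about 20%" of the book, per the post
FREEZE_LENGTH = timedelta(days=30)   # assumption: "a month" modelled as 30 days


@dataclass
class PreviewCounter:
    """Hypothetical counter for one reader and one book."""
    total_pages: int
    viewed: Set[int] = field(default_factory=set)
    frozen_at: Optional[datetime] = None

    def request_page(self, page: int, now: datetime) -> bool:
        """Return True if the page may be shown, recording the view."""
        # Once the freeze has run its course, the counter restarts.
        if self.frozen_at is not None and now - self.frozen_at >= FREEZE_LENGTH:
            self.viewed.clear()
            self.frozen_at = None
        # While frozen, nothing more is shown.
        if self.frozen_at is not None:
            return False
        # Pages already seen do not count against the cap a second time.
        if page in self.viewed:
            return True
        # Integer arithmetic avoids floating-point surprises at the boundary.
        if (len(self.viewed) + 1) * 100 > self.total_pages * CAP_PERCENT:
            self.frozen_at = now
            return False
        self.viewed.add(page)
        return True
```

In this sketch, a reader of a 10-page book could see two distinct pages, would be refused a third, and could start again 30 days after that refusal.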

Though it sounds quite simple, it is a very, very complex task to integrate simple full-text search into Google’s algorithm, because the components that drive ranked results for internet sites are completely different. Internet sites show up on search engines because of key factors such as site popularity, number of links to the site, metadata, length of time in the index, etc. Book content has a hard time competing in those terms, so the engineers at Google have been tweaking and massaging their systems to help books become appropriately discoverable.

Think about it this way: if a searcher on Google types in the phrase “Washington’s Crossing,” the search engine has to make assumptions about what the average person is looking to find. Are they looking for information on the historic park, an illustrated timeline of the event, directions to the park, the name of the artist who did the painting, or the name of the author of the book? This is a simplistic example, but one that gets more complicated when you add book content to an internet search.

The good news is that the more books are indexed, and the more use the indexed content gets, the better the algorithm gets at making book content show up in searches where the intended or best results lead to book content. Furthermore, the more people discover books on the internet through search engines, the more they will look to the internet as a place to discover books. That is the key benefit of GBS: book content made discoverable where most of the world looks for information, the internet.

This will make the internet the most cost-effective marketing tool for books.

Oxford University Press is participating in Google Book Search, as well as Microsoft’s Live Search Books (LSB) and Amazon’s Search Inside The Book (SITB), because we strongly believe that discoverability and access lead to customer acquisition and book purchasing. By indexing and placing our book content where it can be most easily discovered, Google has changed how books are marketed. Yes, Amazon has been doing SITB for 3 years, but it is a merchandizing program, not a discovery program. As good as SITB is (and it is very good), it cannot reach out across the internet and bring customers to Amazon.

Where SITB is an enhancement to the book shopping experience, GBS and LSB can introduce the notion that a book may hold the answers. GBS and LSB are a means for internet users to discover book content in the course of general internet usage. This is a landmark development in book marketing: never before have books been afforded the opportunity for so much exposure. In fact, I believe one can argue that this is the first time in the history of book marketing that the full contents of a book will directly market the book. GBS and LSB are turning the entire notion of book marketing upside down.

However, I don’t want to let expectations of this marketing revolution get carried away. Just because the contents of a book are made discoverable doesn’t mean that they will be discovered, or that, when discovered, they will be appropriate or of enough immediate value to produce a sale. There are many, many factors in a consumer purchase online; discovery is only the first test.


Evan Schnittman is OUP’s Vice President of Business Development and Rights for the Academic and USA Divisions. His career in publishing spans nearly 20 years and includes positions as varied as Executive Vice President at The Princeton Review and Professor at New York University’s Center for Publishing. He lives in New Jersey with his wife and two children. (Full Disclosure: Evan is a member of Google’s Publisher Advisory Board. As the name implies, it is simply an advisory group, and Google can take or leave its suggestions. Additionally, OUP is participating not only in Google Book Search, but also Microsoft’s Live Search Books (LSB) and Amazon’s Search Inside The Book (SITB).)

Recent Comments

  1. language hat

    I trust one of the articles will deal with the extremely annoying and unhelpful “snippet view,” which often enough does not even contain the term searched for.

  2. Grant Barrett

    Seconding Language Hat, I’d hope that you’d also include discussions of reliability. As Ben Zimmer has pointed out, long runs of periodicals (mostly journals) are often listed under the date of the first issue of the run. Which means that a periodical run that started in 1931 will have that date for every issue for decades. Here’s an example for Dialect Notes, where the official information in the upper left shows 1896 but you can plainly see in the snippet view dates of 1928 and 1930. Surely fair use would permit Google to display the copyright pages of each issue.

    Further, Google has included many works that are clearly no longer under copyright, yet which cannot be viewed in full text. Some of the out-of-copyright works are full text, but many are not.

    Even more, there seems to be no facility for submitting corrections. Such as, what the hell is this? And this? “By American Dialect Society, New York (State). Committee for the Adoption of the Constitution, New York (State)”?

    Finally for now, but by no means is this the end of my carping, I’m a vice president of the American Dialect Society who’s been authorized by our executive board to grant Google permission to put all issues of our former journal Dialect Notes—even those still ostensibly in copyright—in full text view on Google Print. Google wants me to sign up as a “partner” to submit my existing books. Yet all of our titles are already scanned and included by various libraries and I see no facility for picking these books out and saying, “We hereby authorize full-text view.” Further, even if I wanted to submit them myself, all the issues of the journal predate ISBNs, and, therefore, could not be submitted anyway.

  3. Rebecca

    Evan Schnittman says: The purpose of posting this series on Search Engines and Book publishing is to share our understanding of the programs and explain how they work in the macro sense. The comments made by Language Hat and Grant Barrett refer to either Google Library or Google Scholar – both programs cited in the piece as not being covered – “…not to be confused with Google Library (not publisher-sanctioned) or Google Scholar (journals focused) or even the newly named initiative, Google Print (Print ad space sales for magazines).” That said, we will be covering Google Library in a later posting, but again, from a macro view and not in the kind of detail of these comments.
    Rebecca says: While we may not be addressing these projects on a micro level, we encourage you to keep discussing and asking questions. We will answer as many as we can. After all, we don’t have all the answers.

  4. Grant Barrett

    Well, yes, our comments are relevant. It is the same interface, is it not? The same scanning, the same OCR, the same presentation? All your work comes to naught if presentation by Google makes it difficult to use, publisher-sanctioned or not.

  5. Stephen Cole

    Evan, I have to congratulate you. In this brief piece you’ve managed to do what Google has patently failed to do — you’ve made their various book initiatives (and their rationales) clearly comprehensible. Next time someone wants me to explain what’s going on with Google and books I’ll give them this link.

    Stephen

  6. […] read more of what Schnittman has to say read his OUPblog posts, (The ABC’s of GBS and Playing Nice With Google) and come back later this week for the next installment of his column. […]

  7. […] Currently, Google (and Microsoft with its Live Book search) have full book contents on their servers which are indexed for the purpose of discoverability (See the ABC’s of GBS – Part 1) […]

  8. […] The ABC’s of GBS: Part 1 – The OUP Blog talks about a more positive attitude to Google Book Search – helping people find what they want, and selling more books. […]

  9. […] 20% of a book for the purpose of making it “discoverable” in Google’s search engine. See The ABC’s of GBS, Part 1 for a complete […]
