Peter Brantley on the Google Books Settlement

On April 9, 2009, Peter Brantley from the Internet Archive gave a talk at KEI on the proposed “Google Book Search Copyright Class Action Settlement.” The following are my rough notes from the presentation.

Peter has been a director of the Internet Archive (IA) for about three weeks. He was accompanied by Will Rodger, a Managing Director of the Law Media Group (LMG), a firm that represents Microsoft and the IA on the Google Books settlement. Attending the event were the following persons:

Rick Johnson, SPARC, ARL
Will Rodger, Law Media Group
Corey Williams, ALA
Carrie Russell, ALA

Heather Joseph, SPARC
Kat Walsh, Wikimedia
Wendy Seltzer, Berkman Center, Chilling Effects
Marc Rotenberg, EPIC
Katitza Rodriguez, EPIC
Chip Pitts, EPIC/Bill of Rights Defense Council
Peter Brantley, Internet Archive
Daniel McCartney, Public Knowledge
Jef Pearlman, Public Knowledge

James Love, KEI
Meredith Filak, KEI
Malini Aisola, KEI
Manon Ress, KEI
Judit Rius, KEI

Peter gave an initial talk of about 20 minutes, followed by more than an hour of questions and answers. The following notes attempt to summarize some of Peter’s points.

[edited 4:20pm, April 11, 2009]

Background:

The Google Book project began by scanning entire library collections, but soon ran into copyright issues. Threats of suits led to negotiations between Google and rights holders to determine acceptable conditions for the project. Google created an opt-out system for books (authors or publishers could ask that their work be taken down once it was up), but that wasn’t enough. Google was quickly hit with lawsuits filed by the Authors Guild (AG) (a class action suit, which raised concerns about the Guild’s class status, as well as about the suit itself) and then separately by the Association of American Publishers (AAP) (a suit brought by a small group of publishers). Given the suits, Google had an obvious incentive to settle. After some negotiations, a super-class was created from the AAP and AG plaintiffs that included publishers and rights holders. After 2 years of “extraordinarily painful negotiations” and several false-alarm announcements, an agreement was released last October.

Overview of settlement:
The settlement covers both known in-copyright works and “middle ground” works whose status is uncertain. The latter include works that may in fact be in the public domain, and works whose rights holders are likely absent or unknown (also known as “orphan works”).

There was some discussion of public domain (PD) works, prompted by fear that Google might monetize PD works in the process.

The settlement provides no admission of wrongdoing on Google’s part. It instead offers rights-holders a payment of $45M, and secures Google’s ability to monetize works in their search program via advertising and sales mechanisms. The settlement outlines Google’s right to undertake various services (not yet implemented/exercised) with regard to the orphan works, among them print-on-demand, pay-per-view, renting/lending rights, and others.

There is speculation that the settlement will create a merger of consumption profiles between Google’s two major Books programs, the Publishers and Library Programs. (As a result of the settlement, publishers have to decide whether to pull works out of the Library program and into the Publishers program, etc.)

The settlement proposes to create a Book Rights Registry (BRR). The BRR will manage rights and various policy issues for the books covered under the terms of the settlement. It serves two primary functions:
(1) To record known info about holders of rights to works covered under the settlement. It is unclear whether this information will be available to the public under open APIs. (Google has publicly assumed yes; meanwhile, “publishers aren’t sure what an API is”.)
(2) The BRR will serve as a negotiating agent, on behalf of rights-holders of books covered by the settlement, with Google re: pricing and economic models. The BRR will also negotiate on behalf of other actors who have claimed rights to orphaned works.

There is also a Most Favored Nation (MFN) clause between Google and the BRR. A number of details are still unclear, but the bottom line is that the BRR can’t enter into a business relationship with any other entity that would grant that entity a competitive advantage over Google.

Libraries:

Public libraries can request a free subscription for one public terminal per building, or one terminal per 10k FTE (full-time equivalent students) at research/university libraries. K-12 facilities aren’t covered. Libraries can pay for subscriptions for additional terminals, which Google will offer as business products.

Participating libraries are libraries that allowed Google to scan their collections. Each participating library will receive a digital copy of the books scanned from its own collection, although apparently not necessarily of the same “quality” as the one held by Google. If that library is part of a consortium, it receives additional digitized copies from the entire consortium’s collection, but only of books that are in its own collection.

Two universities will maintain (highly regulated) research-only copies of EVERYTHING digitized under Google Books. These will probably be at UMichigan and Stanford. Queries to these databases can originate elsewhere, but the actual database copies remain at UM and Stanford.

Issues:
Prima facie: When you move reading to a data cloud format, it creates tremendous privacy issues. Google knows what everyone’s reading. They can use this data to make recommendations, or to track readers and sell advertising.

Privacy is not addressed in the settlement. There are terse references to “not utilizing data,” but nothing on user intentionality data, user behavior data, etc. is discussed explicitly. This raises a number of concerns, among them: data that is subject to subpoena, potentially discriminatory pricing, and profiling by usage patterns. Several actors are currently trying to impose regulations on use of that data. ALA and EPIC both indicated they would raise privacy issues in their submissions to the court.

There’s also concern about the MFN. There’s obvious recognition that Google was the first mover of this project, but the clause was presented by the IA as a potential roadblock to additional innovation. Perhaps more importantly, Peter stressed that Google is the only company released from liability for exploiting orphan works. If a second mover–such as Microsoft, Amazon, or anyone else–attempted the same thing, they would be vulnerable to lawsuits.

Peter and other participants explored several issues. Only Google will have access to nearly the entire corpus of published literature held by the leading U.S. research libraries, a unique resource. They will have primary control over published works, orphan or “middle-ground” works (the purpose of the BRR), and will offer access to public domain works. For either public domain or private works, third parties could still create contracts with publishers, but Google would still have sole access to “middle-ground” works.

This presents another barrier to entry by competitors. This market structure would eat into others’ abilities to claim fair use (since fair use claims depend on a competitor’s ability to show that their use has a negligible effect on the market).

Google Books’ impact on/treatment of libraries

Google has never provided the “cleanest copies” of scans back to any participating library. They won’t provide some libraries with text coordinates on files (critical for search capability). Books on Google are incomplete: graphs, maps, and images are not included, due to the challenge of dealing with additional rights issues.

There were concerns about access in larger libraries (for example, the Chicago Public Library, where one VERY large building gets one terminal) and about implementing a one-terminal policy in an already-stressed library system. Libraries will be paying subscriptions for access to database copies of books they already own. During the original scanning, some libraries pressed Google for compensation on a per-contract basis to recoup the labor costs associated with finding, shipping, cataloging, and re-cataloging their books, but most did not.

Publishers have essentially made the point that libraries can’t make database copies of their own books. The infringement fee is removed only for Google. Historically, libraries have battled with both publishers and authors. According to one participant, text to speech (think Kindle/Authors Guild dispute) is allowed at the library terminals. [Clarified after the event to extend only to some reading-disabled users, and not to all users.]

Is there any incentive for Google to identify rights holders on orphan works and reimburse them? Google is in the middle of the largest notification effort ever undertaken. They have an opt-in system for claims: if you’re a rights holder, you have to opt in to claim the work. This differs from the “diligent search” standards which normally regulate institutions. While Google cannot commercialize public domain works directly, the settlement has NO PROVISIONS for authors who choose to cede work to the public domain (for example, under a Creative Commons license). Peter thought that when possible and appropriate, Google would likely “let the work go” into the public domain, and indeed would have incentives to do so.

Authors can choose to mark their works as free, but the BRR has no obligation to inform authors of this or any other licensing option.

The exclusivity of one company having a huge database of orphan works creates important pricing issues. Google will likely seek to establish an “on the ground” market price, rather than a cost-recovery model. This could engender unnecessarily high prices and endanger fair use.

A number of use issues were also raised. The use of covered books is extremely limited: they can’t be downloaded, and they can’t be copied and pasted. They can be printed at library printing costs. Specifics of use vary from book to book. Commercially available books are generally permitted to have only bibliographic and front matter displayed. Books that are not commercially available are permitted “display use,” which is NOT full view but, by default, roughly a 20 percent view.

There were also concerns that the BRR would be a captive agent in the process.

Reflections:

One participant noted that Google has traditionally opposed database legislation, arguing that the act of compiling databases out of existing data does not constitute copyright infringement. They’ve reversed or partially compromised their stance on this issue with the settlement, as publishers and authors have essentially demanded payment for the ability to create an index. This may impede the creation of indices and databases in the future.

Peter quoted Marybeth Peters in saying this creates, in effect, a compulsory license for the exclusive benefit of one company: a private agreement which appears as a form of legislative action. Is that a good thing? Congress is silent, and doesn’t seem to see that this settlement is dangerous to the public interest.

Peter was asked, “What can people do?” He said, “a lot, legally, formally, and informally.”

Formally: Now is the time to write Op-Eds. NYT, WSJ have already published stuff relating to the settlement. Statements and blogs make a difference.

Legally: If you’re a rights-holder (as many at the meeting were), file to make a statement before the court. Rights holders can file amicus briefs or interventions (representing any party whose interests were not included in the settlement but should have been).

The Harvard Berkman Center group is apparently planning to seek to intervene in the case. The Internet Archive will “very likely” file to intervene as well. EFF is reconsidering where they want to be on the issue.

After the KEI event, Peter was on his way to speak with the DOJ litigation unit, and to discuss the case with the American Antitrust Institute. According to the discussion, DOJ has expressed interest in the settlement. According to one participant, some at DOJ apparently view Google as having a near-monopoly on advertising revenue. DOJ is still fact-finding at this point, but could claim interest or intervene. Some Attorneys General (specifically Massachusetts, and Cuomo in New York) have also expressed interest. As noted by one participant, over 15 AGs participated in the Yahoo! merger investigation, so more will likely get involved.

One possible “fix” to the settlement (mentioned by Jamie) would be to limit Google’s monopoly over the database to a fixed period of time. He noted that rights in pharmaceutical clinical trial data last only 5 years in the U.S., so maybe Google does not need a near-perpetual right. Peter objected to that plan; he thought five years is too long for this “critical juncture” in ebooks and online reading.

Additional Links for more info:

Full text of the settlement and the Official settlement homepage

Google Book Search Settlement: Public Access Service – a look at several aspects of the settlement as they impact libraries. (via Disruptive Technology Library Jester)

Google Claims Orphan Books, Raising Alarm in Academia (New York Times)

EFF’s Reader’s Guide to the Settlement

In Google Books Settlement, Business Trumps Ideals (PC World)

Consumer Watchdog calls on the Department of Justice to intervene in Google Books settlement