September 16, 2013

A report from Google on its anti-piracy measures leaves some questions unanswered

by Sal Robinson

Last week, Google released a report that appears to have been aimed at pacifying some of its critics who claim that the company doesn’t do enough to combat piracy. The 25-page report, “How Google Fights Piracy,” lays out the company’s overall approach to combating piracy, and then gives details about the methods Google uses to do so.

It’s a chance to look under the hood of Google’s operations, albeit fairly shallowly. For instance, the report describes Content ID, the system that allows copyright holders to monitor and control their content on YouTube. It works like this: first, copyright holders provide a reference file of the video or audio, along with metadata. YouTube then scans all uploaded material for matches, and if it finds one, the copyright holder chooses how to handle the infringing content: have it taken down, leave it up and receive information about the viewing statistics, or make money off the upload (YouTube will run ads on the content and give 80 percent of the revenue to the rightsholder).
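To make that workflow concrete, here is a toy sketch in Python of the register, scan, and choose-a-policy loop described above. It is an illustration only, not Google's implementation: the report doesn't explain how matching actually works, so an exact SHA-256 hash stands in for whatever fingerprinting YouTube uses, and every name here (`ToyContentId`, `Reference`, `Policy`, `rightsholder_payout`) is hypothetical. The 80 percent share is simply the figure quoted above.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Policy(Enum):
    """The three choices a copyright holder has when a match is found."""
    BLOCK = "take the upload down"
    TRACK = "leave it up and receive viewing statistics"
    MONETIZE = "run ads and share the revenue"


@dataclass
class Reference:
    owner: str      # the copyright holder
    title: str      # metadata supplied with the reference file
    policy: Policy  # what the owner wants done with matches


def fingerprint(media: bytes) -> str:
    # Assumption: an exact hash is a stand-in. It only catches
    # byte-identical copies; a real system would need fuzzy matching.
    return hashlib.sha256(media).hexdigest()


class ToyContentId:
    def __init__(self) -> None:
        self._references: Dict[str, Reference] = {}

    def register(self, media: bytes, reference: Reference) -> None:
        """Step 1: the copyright holder provides a reference file."""
        self._references[fingerprint(media)] = reference

    def scan_upload(self, media: bytes) -> Optional[Reference]:
        """Step 2: check an upload against every registered reference."""
        return self._references.get(fingerprint(media))


def rightsholder_payout(ad_revenue_cents: int) -> int:
    # 80 percent to the rightsholder, per the figure quoted in this post.
    return ad_revenue_cents * 80 // 100
```

An upload that matches a registered reference comes back with the owner's chosen policy; one that doesn't comes back as `None` and is left alone. Everything hard about the real system lives in the part this sketch fakes, namely deciding what counts as a match.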

This is part of a larger strategy to fight piracy with flexible responses, not just takedowns: for example, the report also touts Google’s vigorous creation of and engagement with sites and services that provide an alternative to pirated content—in essence, trying to drown out the illegal material with attractive, easily findable, and not-too-pricey legit content.

But plenty of takedowns are still happening on the search side, and that’s where the report fails to address some disturbing underlying questions. Volume has something to do with it: the number of requests Google has received to remove material from search results has grown massively in the past couple of years. The report says that Google now receives “removal requests for more URLs every week than we did in the twelve years from 1998 to 2010 combined,” upwards of 4 million pages per week.

The report describes the various ways Google deals with this flood, but the strain is clearly showing: there’s a whole page devoted to examples of abuses of the Digital Millennium Copyright Act, including the following:

A major U.S. motion picture studio requested removal of the Internet Movie Database (IMDb) page for a movie released by its own studio, as well as the official trailer posted on a major authorized online media service
A driving school in the UK requested the removal of a competitor’s homepage from Google Search, on the grounds that the competitor had copied an alphabetized list of cities and regions where instruction was offered
A company in the U.S. requested the removal of search results that link to an employee’s blog posts about unjust and unfair treatment

The report notes that Google removed the URLs for none of these, which you could see (and Google would very much like you to see) as a triumph of Google’s vetting systems, but I consider it more an indication of the problems with the DMCA. Kyle Wagner and Corynne McSherry explained the basic problem in a Gizmodo post earlier this year:

Basically, in order to retain Safe Harbor status [where sites are protected from being held responsible for the material their users upload], outlets like YouTube have to remove any content copyright holders claim is infringing, in an “expedient” fashion. And there’s no limit to the number of notices you can send, or any oversight to how valid they must be. So companies like Viacom have taken to machine-gunning hundreds of thousands of requests to sites like YouTube, demanding everything come down—from basic definitely-a-violation uploads to things that have only the most tangential (if any) relation to the copyright.

And Google’s charts of the top requesters (on pg. 16 of the report) confirm it: RIAA, BPI, and similar companies and organizations are sending millions upon millions of removal requests to Google every week. The likelihood that some or many of them are insufficiently grounded (not to mention completely antithetical to a lot of creative activity) is high.

But because the burden of proof is on the alleged infringer, and very often those infringers are individuals who can’t afford to mount legal challenges against entertainment industry giants, the picture emerges of a very lopsided system. If Google’s standards for determining whether or not the request should be complied with were clear—if they were spelled out in this report, for instance—some re-balancing of the situation might seem possible. But simply stating that one complies with the DMCA is avoiding some measure of responsibility.

Besides, Google’s vetting is far from foolproof: when TechDirt ran a post last year summarizing its objections to SOPA and PIPA, an “anti-piracy firm,” Armovore, filed a request on behalf of a porn company, Paper Street Cash, claiming that the post used content from another site, TeamSkeet. Nothing about the claim was legit: the blog post didn’t use material from TeamSkeet, had nothing to do with porn, and didn’t violate any copyright laws along the way. Nevertheless, Google removed the post from its search results, right in the midst of the debates over SOPA. As Mike Masnick put it:

This is a 100% bogus DMCA takedown — something we only discovered by complete accident over a month later — hiding one of our key articles in an important fight about abusing copyright law to take down free speech. Seems like a perfect example of how copyright can be — and is — abused to suppress free speech.

As long as the system is this broken, Google’s trumpeting of its anti-piracy record, though generally admirable given the complexity and size of the issue it’s facing, is a little hard to swallow.

Sal Robinson is an editor at Melville House. She's also the co-founder of the Bridge Series, a reading series focused on translation.