October 7, 2013

Piracy.lab study investigates e-book piracy


Piracy.lab’s website does have a Word Cloud. But don’t hate them for their Word Cloud.

When the science fiction publisher Tor announced in April 2012 that they would be removing DRM from their books, it was a decision that promised to send significant ripples through the publishing world: on the one hand, readers were delighted, as were many of Tor’s authors, among them Cory Doctorow and China Miéville, who lauded the company for being the first to take this long-desired step.

Other publishers, meanwhile, looked on with bemused interest: Tor was essentially running an experiment, on behalf of the rest of the publishing industry (and of course, on their own behalf as well), in how the removal of DRM might affect e-book piracy, and whatever conclusions they came to could be applied at other houses.

One year later, the experiment was proclaimed a success: Tor’s UK Publishing Director Julie Crisp wrote that they’d “seen no discernible increase in piracy on any of our titles, despite them being DRM-free for nearly a year.”

They’ve since been followed by just one pretty big house, Image Comics, which publishes The Walking Dead and Spawn. And around the same time Tor went DRM-free, the Harry Potter books were released on Pottermore, similarly unencumbered: at the time, this seemed like it might “change ebooks forever.” Though there hasn’t been an avalanche of publishers following suit, there’s a sense that DRM’s days might be numbered… or at least, that it’s worth considering the role piracy plays in book circulation — how prevalent it is, how much of a threat to the interests of creators, and whether controls like DRM are the right way to respond to it.

And in recent months, a research collective up at Columbia University called Piracy.lab has begun to investigate these very questions. Piracy.lab, headed up by Professor Dennis Tenen in the English and Comparative Lit department, is dedicated to studying “illicit knowledge, information ‘leaks,’ and underground archives,” beginning with book-sharing sites, such as the ones that are particularly prevalent outside of the U.S.

Tenen, interviewed for the Columbia Journalism Review by Sarah Laskow, explains that his interest in book-sharing sites was originally piqued by talking to colleagues in Comp Lit departments:

“I started hearing the story more and more from people from India or China or Russia or even Europe, and they say that I am here, in this program, because I had access to this particular book-sharing site.” These are places that offer PDFs of expensive books for free. Tenen heard his colleagues say, again and again, that without these resources, they wouldn’t have had access to the academic texts they needed to do their research. These books, he says, are “usually in the English language; they’re very difficult to get; they’re even more expensive overseas, and sometimes they’re just not available.”

In other words, we’re not just talking about ripped-off copies of The Da Vinci Code and Game of Thrones, but academic books that, because of the limitations of international book distribution networks, and the high prices charged by academic publishers to begin with, are otherwise inaccessible to a population that wants to use them.

Tenen and his team have begun their research by examining Library Genesis, a site based in Russia and specializing in books on engineering and the natural sciences, with over 800,000 books in its coffers. They’re gathering data from the site’s forums to answer questions about how and why the illicit economy functions:

Are there 10 people pirating everything? Or is it 100? A thousand? A million? Is there a core group that’s driving everything? What’s driving them? Are they ideological? Do they have a political agenda about freedom of information? Or are they acting more like collectors?

Early findings suggest that there is a small group of core contributors, though their motivations haven’t yet been sifted out. As Piracy.lab includes more sites in its analysis, the results should be increasingly useful for publishing houses in figuring out who’s reading what and how publishers could ultimately make these texts more easily accessible to the readers looking for them. And, as with the Tor experiment, other accepted truths about e-book piracy could also be tested.

In 2011, The Pirate Bay surveyed its own users, looking for answers to similar sorts of questions, and the results of this survey were released in August, searchable at The Research Bay. Among other findings, the responses indicate that the motives of the 75,000 users who took the survey vary by region, in ways that echo the impetus behind Piracy.lab. From Emma Woollacott’s Forbes article on the data release:

“The motivation for file sharing is different between different countries. It’s quite obvious that in the US and Europe it seems to be a question of ease of access, maybe getting the latest movies and TV shows. The release dates of television series are still very domestically controlled,” says de Kaminski [a researcher at the Lund University Internet Institute and the Cybernorms Research Group, which collaborated with The Pirate Bay on the study]. “Also, a lot of responses from other countries show that there it tends to be more about accessing information at all – it might be hard to get the content in a legal way.”

The numbers are currently being crunched, but it’ll be interesting to see if The Pirate Bay also depends on a relatively limited group of regular file-sharers, an internet community model familiar from Wikipedia.

In both cases, the gathering of hard data is a welcome development in understanding the real landscape of piracy. As with Robert Darnton’s study The Forbidden Best-Sellers of Pre-Revolutionary France, there’s much to be gained here in terms of immediate responses to the threats and opportunities of digital publishing, but also, in a longer view, in tracing imbalances in the flow of knowledge and content, and seeing how they’re reversed by a resourceful string of new participants in an old, old game.


Sal Robinson is an editor at Melville House. She's also the co-founder of the Bridge Series, a reading series focused on translation.