This past January, a hero in the world of information-age activism committed suicide after years of legal hounding by powerful institutions – including the United States government. His crime? Stealing the complete contents of the JSTOR academic database – a veritable treasure trove of academic research produced over many decades by publicly-funded scholars working in many different disciplines at institutions around the world – and potentially making them freely available to any and all who wanted to read the vast collection of papers this database contained.
Faced with decades of prison time and millions in fines if convicted, and pursued by a federal prosecutor bent on making an example out of him, our young hero chose to end his life – the only way out of a crushing legal and financial bind that he could find.
If you have not heard of the sad tale of the late Aaron Swartz, hero to millions of information serfs around the world, then you are missing out, for his story is likely to become the modern-day equivalent of Robin Hood. Only instead of stealing money from the rich and giving it to the poor, Mr. Swartz liberated valuable, taxpayer-funded research from the prison in which its controllers had locked it away for their own benefit and then dared to distribute it to its real owners – you and me.
The tale begins in late 2010 when Swartz, a research fellow at Harvard and a leading figure in the Internet freedom movement, began downloading a massive number of files from the JSTOR electronic archive, a scholarly database he had open access to via his Harvard fellowship and which is also made freely available through MIT’s free, “open campus” network.
It should be noted that JSTOR is a non-profit entity, and at no time did Mr. Swartz have unauthorized access to material contained within the archive. Rather, if JSTOR can be considered a library, then what Mr. Swartz did was check out every book in that library and keep an electronic copy of each article and book it contained – material that was, for all intents and purposes, free anyway. Legal action by MIT and JSTOR swiftly followed.
The issue here, of course, is that though JSTOR is a non-profit entity, that status does not mean it makes no profits. Instead, libraries, research institutes and other members of the scholarly community must pay exorbitant amounts for subscription access to JSTOR’s digital library. Full subscription for complete access to the entire archive can easily cost upwards of $50,000 a year, per institution, though many pick and choose from a menu of subscription options that JSTOR offers. As there are many, many research institutions and libraries that subscribe to JSTOR, either in part or in full, this adds up quickly.
A few years ago one blogger, curious as to where this money goes, did some digging and looked into JSTOR’s annual 990 filings – the IRS forms nonprofits must file every year in order to demonstrate the validity of their non-taxable status. In 2009, JSTOR and its parent, Ithaka Harbors Inc. – another non-profit corporation – took in a little over $60 million and employed 211 people. Costs included $3 million for digitizing documents – the heart of what JSTOR does – though this varies from year to year. $5 million was paid out for administrative and travel costs, $3 million for IT services and $11.5 million for salary and staff costs. Once other costs were subtracted out, JSTOR and its parent had an $8 million surplus, but, again, this seems to vary from year to year.
Of interest here is the role that overhead plays. JSTOR’s executive director made approximately $300,000 while its senior staff, 12 people, averaged $155,000 each. Others averaged $67,000 per year. Once one includes the cost of operating out of a Manhattan office and the extensive traveling and conference participation JSTOR staff apparently engage in, this adds up to a rather large amount of “padding” for an ostensibly non-profit organization dedicated to the simple mission of digitizing and making available the world’s scholarly output.
As the blogger investigating JSTOR’s finances put it, if a library spends $10,000 on an annual subscription, then “$3,000 goes to the academic publishers, $1,000 goes to servers to host the digitized files and $6,000 goes to people and to feed and water those servers” – or about 60 percent of the cost of a subscription. Much of the cost of subscribing to JSTOR, then, appears to go directly to publisher fees and overhead costs that could likely be substantially lower. Why, for instance, does JSTOR operate out of expensive offices in one of the most expensive cities in the world? Why does its executive director get paid so much?
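For readers who want to check the arithmetic, the split quoted above can be tallied in a few lines of Python. This is only an illustrative sketch: the dollar figures are the blogger's estimate for a hypothetical $10,000 subscription, and the category labels and the function name `subscription_shares` are invented here for the example, not drawn from JSTOR's filings.

```python
# Illustrative only: the dollar amounts are the blogger's quoted estimate,
# and the category labels and function name are hypothetical.
def subscription_shares(split):
    """Return each category's fraction of the total subscription cost."""
    total = sum(split.values())
    return {name: amount / total for name, amount in split.items()}


split = {
    "publishers": 3_000,             # fees passed on to academic publishers
    "servers": 1_000,                # hosting the digitized files
    "people_and_operations": 6_000,  # staff and running the servers
}

shares = subscription_shares(split)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

On these numbers, the share going to publishers and the share going to people and operations together account for nine-tenths of the subscription price.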
Why? Because JSTOR, which has effectively monopolized the digital distribution of academic research in a wide variety of fields, has no serious competition and can thus force researchers to pay whatever it feels it can squeeze out of them. JSTOR is so valuable an academic research tool that to not have access to it effectively relegates your research institution to the lower ranks of the professional research community in many fields. Libraries, therefore, are often forced to make do by limiting services in other vital areas to maintain access to these increasingly expensive databases.
Compounding the problem, however, is the very nature of profit-driven academic publishing to begin with. Academic publishers of journals are, in theory, controlled by and organized for members of a respective academic discipline. Scholars publish in selected journals because, for better or worse, the body of researchers working in a particular area have created a community of peer-reviewers that serve as a gatekeeper for those publishing in that journal. The more stringent and rigorous the peer-review process, in general, the more highly-regarded a particular journal is likely to be and the more researchers will want to both read the journal and publish in it.
All well, good and important. Unfortunately, since academics are not publishers and do not wish to oversee the physical production, marketing and distribution of scholarly journals themselves, this task is often contracted out to professional academic publishers which, because they are profit-making enterprises, charge outrageous amounts to put these works together. They are so expensive, goes the argument, because demand for academic research is so low that the small print runs it supports must be offset by high per-unit prices – prices that only libraries, with their large budgets, can usually afford.
This has the effect of making most academic research ridiculously expensive for individuals or poorer institutions – especially in the developing world – to gain access to. It also, because journals were and still are made of print and paper, limits the amount of scholarly material that can be published due to page counts and other archaic vestiges of the pre-digital era. Under a regime where information was held in dusty books on library shelves around the country, this status quo prevailed. The result, as any undergraduate venturing to the campus bookstore for the first time can confirm, is a product that is outrageously overpriced.
Digitization, however, changes all of this. Now the effective cost of production is next to nothing and distribution even less than that. As the late Mr. Swartz demonstrated, entire libraries can be downloaded and traded or given away for nothing. The only cost is the time it takes to download and the memory needed to store the relevant electronic files. Similarly, academic output need not be limited by page lengths, word counts or any of the other limits of the analog, print-based world. There is no reason in today’s digitized world for academic journals to be printed on paper at all – far more can be made available to more people at a far lower price through electronic data archives and virtual academic journals than can be provided through traditional print models.
The logic of cheap, universal access to digitized scholarly archives makes even more sense once one considers that the vast majority of this academic research is, in fact, publicly funded to begin with – either directly through the paying of academic salaries at public universities or indirectly through the provision of government grants to researchers both inside and outside of academia.
In the days before speedy Internet connections and cheap data storage, the profits of the academic publishing industry made some degree of sense. Even though most, if not all, scholarly research was effectively publicly funded, the distribution of that knowledge was tremendously expensive and it made sense to leverage the power of the marketplace to spread it. Profits, as such, were a motivation to spread expensive-to-distribute information as broadly as possible. In this, public and private interests were in sync.
This no longer holds because the strategic role played by academic publishers – indeed publishers and distributors of all information, not just academic scholarship – has been radically changed by the information revolution. Distribution is no longer a “natural” strategic bottleneck behind which profits naturally accrue due to the physical difficulty of carting information to and fro. It remains a bottleneck, to be sure, but one that stays firmly in place mostly due to onerous intellectual property rules that give legal rights to the information to whoever holds the copyright. In most cases, that is still the publisher-distributor, like JSTOR and the academic publishing houses that JSTOR gives 30 percent of its revenues to.
Maintaining this bottleneck in the face of staggering technological change which has effectively destroyed the raison d’etre of much of the publishing industry explains why usually mild-mannered, non-profit distributors of scholarly work decided to bring out the big guns of federal prosecution against young Mr. Swartz.
His brazen “theft” – if it could be called such – and threatened liberation of what is effectively the public’s research held the potential to destroy not just JSTOR’s lucrative monopoly over the distribution of digitized academic research, but the wider, profit-motivated prerogatives of the publishing industry at large – an industry that effectively no longer has an economic reason for being.
The issues raised by the Swartz case go well beyond the world of scholarly research, however, and raise grave concerns about the nature of intellectual property rights in an information-driven, Internet-saturated economy. Patent holders in the biosciences can, for example, hold potentially life-saving medical research, or applications of said research, hostage to their whims, much like feudal lords who levied taxes on peasants and passing merchants simply because they could. Likewise, while copyright may protect some artists in the short run, in the long run the indefinite ownership of such cultural staples as the song “Happy Birthday to You” threatens to strangle artistic creativity in its crib.
In a perfect world, producers and consumers of information would meet in virtual marketplaces and freely conduct business to their own mutual advantage, making everyone better off in the process. In the world as it exists, however, the legacy legal rights of an earlier technological era empower powerful vested interests – mostly middlemen like academic publishers – who seek to maintain their privileged stranglehold on the lifeblood of the emerging, information-based economy by ruthlessly maintaining and enforcing those rights by whatever means possible – as the case of Aaron Swartz demonstrates clearly.
As the 21st century progresses and, as a society, we become ever more dependent on the products and services provided by the still nascent digital revolution, vast profits – and power – will flow to those who manage to control the information that drives it. Right now deeply entrenched vested interests are moving heaven and earth to retain their status as the feudal lords of the old information economy. Looking back, we can see that demolishing the old feudal order premised on aristocratic birth and ownership of land took centuries and is still not entirely complete. How long it will take to overcome this new information-based feudalism, or whether such a thing is even possible, is an open question.