Image source: Wikimedia Commons.
Lately I’ve been surprised by how much I’ve had to learn about the morals and politics of web publishing, in addition to technical skills like XML coding and the protocols for marking up Greek and Latin inscriptions and other documents. But as I prepare data about the ancient world for publication online, it’s perfectly understandable that I’d need to bone up on the humanities side of my digital humanities work, as well as learning the digital aspects.
In recent months the public discussion about free and open access to scholarship has increased and sharpened following the tragic story of the young activist and highly gifted programmer Aaron Swartz. Due to face trial this April for allegedly downloading millions of articles from JSTOR (available only by subscription), Swartz committed suicide in January and was mourned widely and very publicly by advocates for freely available information, who seem only to have become more committed to ensuring that scientific and cultural breakthroughs (especially taxpayer-funded research) are accessible to the public at no cost.
The humanities have yet to fully embrace open access scholarly publications, although a few publications like Digital Humanities Quarterly and the ISAW Papers are blazing the trail in my field, classical studies. (JSTOR has recently made openly available its “early content” that is now outside copyright.) The question over the next years will be how academics meet the costs of publishing research online that still meets their peer-review standards, and the short answer is that the costs might simply be transferred to the researchers, as happens currently with many scientific publications like PLOS. With any luck the specialized European periodicals for Greek and Latin epigraphy will open up online too.
But there is also the question of making publicly available the source information. In my case, I want to publish online the texts and information from all the unpublished tombstones in Butler’s Rare Book and Manuscript Library (RBML). Traditionally these kinds of data sets have been hoarded by scholars, protective of the efforts they have put into gathering and curating information. But it is only once all the data is freely available and in an organized form that it can be analyzed comprehensively. With epigraphy the tension has been between a quick and dirty digitization of the vast reference volumes of the 19th and 20th centuries and a more thorough and careful publication of new editions of texts, with accompanying images and metadata, from Roman Italy and the provinces, as well as Christian inscriptions.
One low-cost, high-visibility platform for publishing data has been blogs. Here scientists can publish their source information for public re-use, but still receive credit for their work. In a similar way, I’ve been able to encode a log of my activities inside the source code for each inscription I’ve edited, so everyone can read exactly who worked on the Columbia inscriptions, and when. There’s really no reason to be protective of this information, and the RBML are only too willing to get their information out there.
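To give a rough idea of what such an embedded log can look like, here is a minimal sketch in Python. It assumes the inscription files follow the TEI convention of a `revisionDesc` element holding dated `change` entries; the sample file, the editor IDs and the dates are all invented for illustration and are not taken from the Columbia files.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment: editors, dates and notes are made up.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <revisionDesc>
      <change when="2013-03-04" who="#editorB">Added translation and checked readings</change>
      <change when="2013-02-11" who="#editorA">Transcribed text from the stone</change>
    </revisionDesc>
  </teiHeader>
</TEI>"""

# TEI elements live in this XML namespace.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def read_change_log(xml_text):
    """Return (date, editor, note) tuples from the revision log, oldest first."""
    root = ET.fromstring(xml_text)
    entries = []
    for change in root.findall(".//tei:revisionDesc/tei:change", NS):
        note = (change.text or "").strip()
        entries.append((change.get("when"), change.get("who"), note))
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return sorted(entries)


for when, who, note in read_change_log(SAMPLE):
    print(f"{when}  {who}: {note}")
```

Because the log travels inside the same file as the text, anyone who downloads an inscription can run something like this and see who did what, and when, without asking the editor.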
Open source information has also been invaluable because I’ve basically been able to teach myself how to code simply by observing what other editors have written before me. I can see precisely what level and amount of information previous coders have supplied. This transparency means anyone can also check the quality of the information before deciding how to re-use it.
The whole point of my project is to connect the ancient data kept at Columbia University to the rest of the world — just one small link in a chain that will only grow longer and stronger as other collections also digitize their holdings to the same standards and join up with each other. The only way to do this is through openly publishing texts and information whose source can be inspected freely.
It drives me crazy whenever I find the gates on 114th St. or at 117th St. are closed, because it seems like a waste of time having to go the long way around to the College Walk. The very least we can do is take the bolt cutters to the treasure troves of information we have here at Columbia because quite frankly they’ll end up open anyway.
A lot of digital ink has already been spilled on these topics, but the EFF is a good place to start and the ProfHacker blog regularly touches on these issues. You can search the Directory of Open Access Journals for public research, and check out Columbia’s online medical journal Tremor.