Thanks to Victoria Horrocks, Caro Bratnober, Matthew Baker, and the students at the Burke Library who assisted with this project.
The Digital Scholarship unit is pleased to announce that the archive of Union Seminary Quarterly Review (USQR) is now openly available in Academic Commons (AC).
“USQR was founded in 1939 as a platform for inspired and socially-engaged religious thought. Prominent contributors in the early days of the review included Reinhold Niebuhr, Paul Tillich, Martin Buber, and Albert Einstein. Over the years, USQR expanded its vision to embrace a multicultural outlook in theology and an interdisciplinary approach to the study of religion. Reflecting this new perspective, it featured articles and book reviews by established and emergent biblical scholars, historians, ethicists, theologians, and other academic, church, and social commentators. Although editorially independent of Union Theological Seminary, USQR was managed by doctoral students of Union, with support and participation from faculty and administration.” USQR was published semi-quarterly from 1945 until 2016. Beginning in 2010, USQR was published for the web on a site that was maintained by the Columbia University Libraries. In 2016, the journal ceased publication and no longer required a live, dynamic website for editors to access. It was a perfect candidate for Academic Commons, which exists to provide global access to research and scholarship produced at Columbia University and its affiliate institutions, including acting as the repository for archival copies of many journals and series produced at Columbia.
The project to move the publication into Academic Commons had several parts. First, the USQR website, originally created and maintained by CDRS, had to be backed up and archived in Academic Commons, along with the articles hosted on the website, which were published in AC with individual DOIs. These materials spanned 2011 to 2016. Web Services and Libraries Information Technology (LITO) assisted with this archiving and decommissioned the website once the archive was complete.
The older content was more complicated. In addition to the materials hosted on the defunct USQR website, DS staff were given 22,724 TIFFs representing 197 issues. The TIFFs were galleys, stored in one folder per issue, but the folders were not named in a meaningful way. The individual TIFFs were named with some consistency, but the order of the files in each folder did not match the order of the pages in the issue. Staff used Python to create new PDFs from the galley TIFFs and optical character recognition (OCR) software to transcribe the text. Then, an intern assisted with transcribing the tables of contents from the issues, as the OCR was not always consistent on title pages with varying degrees of color and contrast. After a request to ATLA to reuse their metadata was denied, DS staff created a spreadsheet with the appropriate metadata for each issue and used the spreadsheet and the Hyacinth transfer server to publish the materials in Academic Commons.
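As a rough illustration of the page-ordering problem described above, the sketch below sorts file names in natural (number-aware) order, so that page 10 follows page 2 rather than page 1. It is not the script DS staff used. The file names and the `natural_key` and `order_pages` helpers are hypothetical, and the real galley names may have needed different handling.

```python
import re


def natural_key(name):
    """Split a file name into text and integer chunks.

    re.split with a capturing group always alternates text, digits, text,
    so keys from different names stay comparable position by position.
    """
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def order_pages(filenames):
    """Return only the TIFF file names, in natural page order."""
    tiffs = [f for f in filenames if f.lower().endswith((".tif", ".tiff"))]
    return sorted(tiffs, key=natural_key)


# Hypothetical names, listed the way a plain alphabetical sort returns them.
listing = ["issue_p1.tif", "issue_p10.tif", "issue_p11.tif",
           "issue_p2.tif", "notes.txt"]
ordered = order_pages(listing)
print(ordered)
```

Once the pages are in order, an imaging library could combine them into a single PDF per issue. That step is left out here because the source does not say which library was used.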
Ideally, each article would have been individually published in AC. However, because page enumeration in the galleys was inconsistent, taking this additional step would have required significant staff time. As a compromise, the Table of Contents metadata for each issue was transcribed in a standardized way, including punctuation marks used as delimiters, so that the metadata can be easily reused in the future should there be more resources to catalog at the article level. Additionally, users searching for a particular article title on the web should be able to find its full containing issue, download the PDF, and navigate to the desired article. Scans and metadata for the several missing issues were provided thanks to Burke’s Public Services Librarian, Caro Bratnober, and the Union students who staff the Burke’s circulation and scanning operations.
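To show why consistent delimiters make the Table of Contents metadata reusable, here is a minimal sketch that splits one transcribed line into title, author, and page. The source does not say which punctuation marks were used, so the " / " and " ; " delimiters, the `TocEntry` class, and the sample lines are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TocEntry:
    title: str
    author: str
    page: str


def parse_toc_line(line, author_sep=" / ", page_sep=" ; "):
    """Split 'Title / Author ; page' into its parts (assumed format)."""
    if page_sep not in line:
        raise ValueError(f"missing page delimiter: {line!r}")
    # Split from the right so punctuation inside a title is left alone.
    rest, _, page = line.rpartition(page_sep)
    if author_sep in rest:
        title, _, author = rest.rpartition(author_sep)
    else:
        # Some entries, such as section headings, may have no author.
        title, author = rest, ""
    return TocEntry(title.strip(), author.strip(), page.strip())


# Hypothetical transcribed lines, not taken from an actual issue.
sample = [
    "An Example Article Title / A. Author ; 12",
    "Book Reviews ; 87",
]
entries = [parse_toc_line(line) for line in sample]
for entry in entries:
    print(entry)
```

With a format like this, article-level records could later be generated from the issue-level metadata without retyping the tables of contents.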
This project, now complete, drew on staff time and effort from many stakeholders across various divisions and units within the Libraries. While complex, it is a good example of the nuanced problems that staff solve every day regarding legacy content, both web and print; of the challenge of making Columbia University research openly and broadly accessible; and of how satisfying the hard-won successes can be.
In my role as student cataloging assistant in the Digital Scholarship division, supporting the Academic Commons platform, I helped transcribe the tables of contents for individual issues of the USQR. The OCR-generated content was not uniform across tables of contents and title pages, so the text had to be entered manually. Titles, headings, author names, and page numbers were transcribed according to a standardized format, with consistent syntax so that the metadata can be easily accessed for any future use. It was a pleasure to assist with the completion of this project and contribute to Academic Commons’ broader mission of making this research more open and accessible.
–Victoria Horrocks, Master’s Student in the Department of Art History & Archaeology