Ford International Fellowships Program (IFP) Archive

Columbia’s Ford International Fellowships Program Archive was officially launched on May 16, 2016. The archive includes content from the more than twelve years the IFP program was active (2001-2013), awarding fellowships to more than 4,300 individuals with the goal of improving access to higher education worldwide and promoting community development and social justice. It provides access to comprehensive planning and administrative files of the program as well as to related audiovisual materials. The launch of the public Ford IFP Digital Archive was timed to coincide with the May 16 announcement of those selected for research fellowships being awarded as part of the grant project. (See also the Libraries News Release.) The awardees will come to Columbia in Summer 2016 and work with material in both the paper and digital archive and present their findings at a symposium to be held in September.

The IFP Digital Archive presented interesting challenges and surprises that required the Libraries Digital Program Division (LDPD) to develop innovative approaches, workflows, and solutions. Thanks to our work on the IFP Archives, LDPD was able to implement a new processing workflow for long-term content preservation and craft a new online access model that can be reused for the increasing number of born-digital archival collections we now acquire.

Among the surprises we had with the IFP Archives, perhaps the most unexpected was the amount of digital content. We were expecting approximately 90% of the material to arrive in print, paper, or other hard copy formats, but instead received more than 70% of the material electronically, processing more than 350,000 files in 250 different file formats, 10 languages, and multiple alphabets. The challenges we faced in processing the digital content influenced the approaches we took for searching, navigation, and access.

Our most interesting challenge for the online archive was to develop ways to leverage our search and discovery systems to return meaningful search result sets for digital content that had no real metadata, only file names and directory paths. We also needed to create both a public online access system and a separate restricted access mode in order to protect some types of confidential information. Before the twenty-two IFP offices closed, we worked with them on separating out fellows’ private information, but in the end we had to review all of the content again. Since there was no way to do this programmatically, our Digital Preservation Librarian, Dina Sokolova, working with Jane Gorjevsky in RBML and an outside consultant, opened each document from the non-embargoed portion of the archive (over 100,000 to date) to check for personally identifiable information.
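As a rough illustration of what indexing content with "only file names and directory paths" can involve, the sketch below turns a path into a list of search terms that a discovery system could index. It is not LDPD's actual implementation: the function name `path_to_search_terms` and the sample path are hypothetical, and the splitting rules (directory separators, underscores, punctuation, camelCase boundaries) are assumptions chosen for the example.

```python
import re
from pathlib import PurePosixPath


def path_to_search_terms(path: str) -> list[str]:
    """Split a file path into lowercase search terms, dropping the extension."""
    p = PurePosixPath(path)
    # Use every directory name plus the file name without its extension.
    parts = list(p.parts[:-1]) + [p.stem]
    terms = []
    for part in parts:
        # Insert a space at lowercase-to-uppercase boundaries (camelCase).
        spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", part)
        # Split on underscores and any non-word characters. \W is
        # Unicode-aware, so letters from non-Latin scripts are kept.
        terms.extend(t.lower() for t in re.split(r"[\W_]+", spaced) if t)
    return terms


# Hypothetical path, invented for this example.
print(path_to_search_terms("Offices/Vietnam/Selection_2005/FinalReport.pdf"))
# -> ['offices', 'vietnam', 'selection', '2005', 'final', 'report']
```

Terms derived this way could be stored alongside the full path so that a search for "selection 2005" finds the file even though no descriptive metadata exists for it.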

The IFP material also required that we develop ways to process and work with many different file formats and with content in multiple languages and scripts in the context of our existing technology framework (Hydra, Fedora, Blacklight, Solr). Data loads are still in process.

For this collection there are three types of access, which will likely also be the case for similar collections in the future:

  • a public website that currently includes approximately 55,000 items, with more still in process
  • specialized restricted-access laptops that can be requested and used in RBML to access the on-site-only electronic archive, which currently includes approximately 4,000 items, as well as the publicly available website
  • a conventional on-site paper archive and finding aid, the latter with links into the online database where relevant.

We are continuing work on making the audio and video content easier to access. This content currently is downloadable as well as playable, provided one has the appropriate playback software. By the end of the project, all audio and video content will also be available via our new streaming media server, implemented as part of the IFP project. In addition, the IFP Digital Archive will undergo formal usability testing this summer so we can understand how well our interface design is working and where we might still be able to make improvements.

LDPD has been fortunate to work closely with Jane Gorjevsky and RBML, as well as with the Preservation and Digital Conversion Division, to make the Ford IFP Archive available. More background information about the project, including links to relevant articles and presentations, is available on our Behind the Scenes website. We are looking forward to continuing to learn from our work with this incredibly rich material.

Project contacts: Dina Sokolova, LDPD; Jane Gorjevsky, RBML

The Making and Knowing Project


The Libraries Digital Program Division and the Digital Humanities Center are collaborating with Dr. Pamela Smith (Columbia’s Seth Low Professor of History & Director of the Center for Science & Society) on the Making and Knowing Project, which was launched by Dr. Smith in 2014 as the flagship research initiative of Columbia’s Center for Science & Society. The project involves working with a sixteenth-century French manuscript held by the Bibliothèque nationale de France, BnF MS. Fr. 640, a so-called “book of secrets” that contains recipes and instructions for creating a dizzying variety of crafts, metalwork items, casts, and other material objects. It builds on Dr. Smith’s work on the connections between the early modern craft workshop and the scientific laboratory, expanding the model of material reconstruction and textual analysis that she has been applying to the manuscript. The manuscript is being used as the focal point for innovative pedagogical approaches that offer students and researchers the opportunity to experience the history of science in an immersive way in a lab setting and to learn about translation, experimentation, materials science, language, deep data, technology, and artisanship. The Lab Seminar taught each year combines historical research and materials science, and gives students hands-on experience with the recipes and materials referenced in the manuscript. Each summer, a paleography workshop gives advanced graduate students an opportunity to transcribe and translate the Middle French manuscript collaboratively. This interdisciplinary project connects natural scientists, scholars from across the humanities, and conservators, and offers students a unique opportunity to explore the challenges of hands-on lab work and scientific experimentation, the importance of paleography, and the power of collaboration for building understanding and producing knowledge.

The online critical edition, which will include visual annotations, electronic field notes, images, commentaries, and video from the lab work, as well as support for the transcription and translation process, is being developed with the assistance of the Columbia University Libraries Digital Program Division (LDPD) and the Digital Humanities Center. Terry Catapano, a digitization and metadata expert in LDPD, has been working closely with Dr. Smith on the project. His work has centered on identifying and providing appropriate technologies for the online critical edition, and on directing the application of XML markup using the Text Encoding Initiative (TEI) Guidelines for tagging cultural heritage content. Along with the translation, two transcriptions are being developed: a “diplomatic” transcription that is a literal representation of how the text appears on the page in the original, and a normalized transcription in which abbreviations are expanded, readings are provided for hard-to-discern words, and deletions are removed. The markup of the translation and transcriptions identifies the structural and textual units of interest, e.g., page breaks, recipe text, main text, and marginal notes. Tagging is also being used to identify features of interest in the manuscript, including the materials discussed, animals, places and place names, professions, and other identifiable objects. Students learn the Middle French script, create transcriptions by deciphering the text, and then add XML tags in Google Docs. Although Google Docs provides a poor interface for text encoding markup, it is nonetheless effective in enabling collaborative engagement with the transcription by students and members of the team. The next stage of the text encoding process will be to align the markup from the two previous paleography workshops (in 2014 and 2015), where the markup schemes differed to some extent. Once the collaborative features of Google Docs are no longer as crucial, the transcription will be moved into a more formal XML structure directly using TEI guidelines.
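To make the difference between the diplomatic and normalized transcriptions concrete, here is a minimal sketch of how both views might be generated from a single encoded passage. It uses the standard TEI elements `<choice>`, `<abbr>`, `<expan>`, and `<del>`, but the sample sentence is invented for illustration and is not taken from BnF MS. Fr. 640. The function names and the rendering rules are assumptions, not the project's actual tag set or tooling.

```python
import xml.etree.ElementTree as ET

# Invented sample, not manuscript text: one deletion and one abbreviation
# with its expansion, encoded with standard TEI elements.
SAMPLE = (
    "<ab>Prenez <del>du plomb</del> de l'"
    "<choice><abbr>estai</abbr><expan>estain</expan></choice> fin</ab>"
)


def _render(elem, view):
    """Collect the text of elem and its children for the requested view."""
    if view == "normalized" and elem.tag in ("del", "abbr"):
        own = ""  # normalized view drops deletions and abbreviated forms
    elif view == "diplomatic" and elem.tag == "expan":
        own = ""  # diplomatic view drops editorial expansions
    else:
        own = (elem.text or "") + "".join(_render(c, view) for c in elem)
    # Text following the element belongs to the parent and is always kept.
    return own + (elem.tail or "")


def render_view(xml_string, view):
    """Return the 'diplomatic' or 'normalized' reading as plain text."""
    text = _render(ET.fromstring(xml_string), view)
    return " ".join(text.split())  # collapse whitespace left by removed spans


print(render_view(SAMPLE, "diplomatic"))  # Prenez du plomb de l'estai fin
print(render_view(SAMPLE, "normalized"))  # Prenez de l'estain fin
```

A real diplomatic display would probably show the deleted words struck through rather than as plain text. The sketch only shows that one marked-up source can yield both readings.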

Terry is also collaborating on defining the specifications for the features of the online critical edition of the manuscript. This work is being done using open tools and standards (e.g., XML, HTML, and CSS), with the text encoded in Unicode/UTF-8 to facilitate the long-term viability and reusability of the electronic edition. Using open standards liberates the text from the restrictions of formats such as MS Word or Adobe PDF, and also enables semantic markup that can, for example, define text segments and convey what each segment is about. A string of text can be described as a paragraph, a name, or a recipe, and that same string can also be identified as referring to an animal, a place, or a profession. This semantic markup then facilitates more analytical searching by researchers, allowing, for instance, a search that shows all references to an animal, e.g., lizards, in recipes having to do with metalworking. The related material, including textual commentary and the reconstruction of the recipes, will also be available to shed light on the text. The online edition will support various views of the text and allow researchers to view the translation and the transcriptions side by side.
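The kind of analytical search described above (all references to an animal in recipes about metalworking) can be sketched against a small encoded sample. The element and attribute choices here are assumptions made for the example: recipes are `<div type="recipe">`, the recipe topic is carried in an `ana` attribute, and animals are `<name type="animal">`. The project's final schema may differ, and the sample recipes are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the tag and attribute conventions are hypothetical.
SAMPLE = """
<text>
  <div type="recipe" n="1" ana="metalworking"><head>Casting</head>
    <ab>Mould a <name type="animal">lizard</name> in sand.</ab></div>
  <div type="recipe" n="2" ana="painting"><head>Colours</head>
    <ab>Paint a <name type="animal">lizard</name> green.</ab></div>
  <div type="recipe" n="3" ana="metalworking"><head>Gilding</head>
    <ab>Gild the <name type="animal">crayfish</name>.</ab></div>
</text>
"""


def recipes_mentioning(root, animal, category):
    """Return the n values of recipes in `category` that tag `animal`."""
    hits = []
    for div in root.iter("div"):
        if div.get("type") != "recipe" or div.get("ana") != category:
            continue
        # Gather every animal name tagged anywhere inside this recipe.
        animals = [
            name.text.strip().lower()
            for name in div.iter("name")
            if name.get("type") == "animal" and name.text
        ]
        if animal in animals:
            hits.append(div.get("n"))
    return hits


root = ET.fromstring(SAMPLE)
print(recipes_mentioning(root, "lizard", "metalworking"))  # ['1']
```

A plain keyword search for "lizard" would also return the painting recipe. The markup is what lets the query combine what a passage mentions with what kind of recipe it belongs to.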

Terry has been working with the project since 2014. He developed the ad hoc markup tag set that was used initially and then instructed 2015 Paleography Workshop students in applying the corresponding standard XML and TEI markup. He will be responsible for creating the final markup to be used for the project and establishing the schema for the transcription, which defines a structure for the tags and identifies where they belong and in what order. Encoded according to the rules expressed by the schema, the text can be readily rendered in multiple ways. This allows, for example, the identification of where original text pages begin and where sections, divisions, and paragraphs begin and end. The schema and tags work together to display the data logically and to support analytically oriented semantic search. Terry will be co-instructing the June 2016 Workshop, which will combine paleography, TEI, and data visualization. He will instruct students on how to do the TEI markup and how to then use it for data visualization.

The Making and Knowing Project gives students and researchers from a variety of disciplines the opportunity to experience the potential of the digital humanities firsthand. The expertise that Terry Catapano of LDPD and Alex Gil of the Digital Humanities Center provide to support this unique digital humanities research and lab work enriches both the project and the Libraries. There will undoubtedly be more to learn from this multi-dimensional, interdisciplinary project.

Support for the Libraries’ Online Exhibitions: LDPD & Omeka

This winter and spring, LDPD supported five projects built using Omeka, our online exhibition platform.  These online exhibitions range from the very current Love In Action project, which actively gathers supporting documents and photos from Union Theological Seminary’s present-day activism, to early modern literature from the Elizabethan and Jacobean period.  The exhibitions accompany a conference, in the case of Early Modern Futures; a civil rights and protest movement in the case of Love in Action; and an important publication by a prominent Columbia University faculty member, in the case of the Sydney Howard Gay exhibition. LDPD explored new technologies including a self-deposit feature which we implemented for the Love In Action exhibition, and has supported varying requirements and needs to help make each of these exhibitions possible.

LDPD facilitates the creation and presentation of our online exhibitions using Omeka, an open-source, exhibit-focused content management system.  We help curators navigate the process and take the necessary steps to move from the idea for an exhibition to curating, creating, and publishing that exhibition.  Melanie Wacker and Robbie Blitz begin by scheduling a metadata consultation with the curator.  After the curator has filled out the metadata spreadsheet, we review it and send any corrections; coordinate the imaging of the items with the Digital Imaging Lab; bulk upload the digital assets and completed metadata into Omeka; provision the curator’s Omeka account; and provide training.  Exhibitions are structured and include a narrative about the items that are included and the relationships between those items.  An introduction to curating online exhibitions was created by LDPD and offers information about how to conceptualize and actualize one’s curatorial ideas.  (The Online Exhibition Planning Form and checklist are available on the wiki.)

LDPD began using Omeka to support the Libraries’ Online Exhibitions in 2009.  The system is flexible and has been easy for contributors to use.  Supporting both the system and the creation of the exhibitions has been straightforward and easy to streamline.  We developed templates that allow users to customize the colors of their exhibition, and the consistent Columbia Libraries’ look and feel clearly indicates to users that they are in a trusted Columbia University online environment, while providing links to other online Libraries services.  There is an active community of Omeka users at other institutions that we can work with to customize the software when needed, ensuring that we don’t reinvent a plugin or recode a customization when someone else has already done that work.


Our exhibitions are sometimes created and published in conjunction with on-site exhibitions at the Libraries, and because we no longer publish printed catalogs for exhibitions, our Omeka online exhibitions often function as such.  They offer the additional benefits of being freely and widely available, and of providing ongoing access to the items once the physical exhibition has ended.  They are a great way to begin to explore some of our treasured collections and to be introduced to the riches that we have available to researchers in our physical collections.  Over time all digital content created for our Omeka exhibitions that we own will be added to Fedora.  Depending on copyright restrictions, the content will also be made available through our Digital Library Collections portal and within the Digital Public Library of America.

Sydney Howard Gay’s Record of Fugitives

Sydney Howard Gay was the editor of the weekly abolitionist publication, the National Anti-Slavery Standard, and a key operative in the Underground Railroad in New York City.  Over two years, he meticulously recorded the arrival of fugitive slaves at his office in two volumes he called The Record of Fugitives, which is included in the Gay Papers at the Rare Book & Manuscript Library of Columbia University.  An accomplished journalist, Gay interviewed over two hundred men, women, and children and recorded the stories of well-known and lesser-known, yet instrumental, fugitives, including Harriet Tubman.  More than half of the fugitives arrived by train via Philadelphia, and also appear in a similar set of records maintained there by the black abolitionist William Still.

The Omeka site, Sydney Howard Gay’s Record of Fugitives, reproduces the recently rediscovered Record of Fugitives, both in high-resolution images of all of its pages and in a searchable transcript.  It also includes documentation about 172 individual slaves, compiled by Professor Eric Foner, that summarizes the information noted by Gay and William Still combined with data from other sources.  The Still records enabled the historian to check the consistency of the runaways’ stories.

Dr. Foner initiated this project in conjunction with the publication of his book, Gateway to Freedom: The Hidden History of the Underground Railroad (W. W. Norton and Co., 2015).  He had a transcription of the journal done by graduate students.  The Digital Imaging Lab in the Preservation & Digital Conversion Division carried out digital photography of the journals.  Terry Catapano and Robbie Blitz, in consultation with Foner and RBML Curator Thai Jones, facilitated the creation and design of this exhibition, and Foner provided the text for the exhibition homepage.

Using our Omeka exhibition platform, Robbie Blitz put each digitized journal page up along with its transcription.  Individual pages were also created for each fugitive that Dr. Foner identified, including the name of the fugitive, information about their former owners, census information, who they escaped with and where they went (when available), and what they did after gaining their freedom.  We also provide downloadable PDFs of both volumes of the journal in their entirety as well as a PDF of the transcription, and our programmer, Fred Duby, installed a special Omeka PDF viewer that can now be used for other projects.

We have also actively worked to open this exhibition up to a broader audience beyond Columbia. Catapano created a Wikipedia entry for Sydney Howard Gay (none existed) and included links to Dr. Foner’s books, to the archive of the National Anti-Slavery Standard (Gay’s abolitionist newspaper), and to our exhibition.  He  uploaded the digital facsimile of the two-volume journal to the Internet Archive and WikiSource.

News Clipping: A Peep at Slavery
For a full analysis of the Record of Fugitives, as well as a history of the Underground Railroad in New York City and the northeastern metropolitan corridor, see Eric Foner’s book Gateway to Freedom: The Hidden History of the Underground Railroad (W. W. Norton and Co., 2015).


The #LoveInAction project is an entirely new direction for our Omeka work, more a real-time digital archive than an online exhibition of curated items.  Union Theological Seminary (UTS) faculty, students, and alumni have long been engaged in social justice activism.  Working from Cornel West’s premise that “justice is what love looks like in public,” the Burke Library at UTS has been actively seeking to document the participation of UTS students, alumni/ae, and faculty in the #BlackLivesMatter and Occupy movements, among others, by gathering their photos, videos, audio, flyers, pamphlets, and other writing into a living, ongoing, real-time digital archive.

LDPD worked with Elizabeth Call from the Burke Library to support the #LoveInAction project’s needs, particularly the need for students, faculty, and alumni to contribute content to the project via Omeka, the open-source content management system that the Libraries use to create and curate online exhibitions.  To allow participants to directly contribute digital materials into the exhibition, our programmer, Fred Duby, first needed to upgrade our Omeka web application.  He then installed and customized the externally-developed Contribution plugin so that it could be used in this context.

After the site launched, Elizabeth Call hosted a contribute-a-thon to encourage UTS activists to contribute their content.  The site was advertised to UTS in newsletters and highlighted at faculty and student meetings.  Burke staff plan to evaluate the project over the summer.  More than 50 items have been contributed, including photographs, video footage, flyers, posters, and other documents, that cover a variety of social justice causes such as fair wages, Occupy Wall Street, the Millions March and other protests of police brutality, the People’s Climate March, notices of what to expect when engaging in civil disobedience, and more.  It’s a remarkable and wonderful way for the Burke Library to actively engage with UTS faculty and students, and meet an immediate archiving need.

Cornelius Vander Starr, His Life and Work

Who, one might wonder, was C.V. Starr?  More than one East Asian Library bears his name, including ours at Columbia.  This new exhibition works to answer that question and offer information about the patron who championed both scholarship and a better understanding of Asia.  His legacy lives on in The Starr Foundation, which awards grants in many areas, including Education, Medicine and Healthcare, Human Needs, Public Policy, Culture, and the Environment.  The exhibition explores his life and his relationship to Asia.

The East Asian Library worked with the Starr Foundation and with Robbie Blitz to use Omeka to put together the C.V. Starr, His Life and Work exhibition.  The impetus for this exhibition came from The Starr Foundation, and Ria Koopmans-de Bruijn curated the presentation of the photographs and ephemera and the accompanying information.  LDPD worked with the Starr Foundation and Koopmans-de Bruijn to create and import the metadata and upload the assets.  Generally our exhibitions involve items that we own, but this exhibition is unusual in that the Starr Foundation provided all the content from their own collections, requiring that we identify the content as belonging to a private collection in the metadata.  We also worked to ensure that the attribution of the images was handled appropriately, that the exhibition “went live” without a hitch, and that the functionality, such as moving through the exhibition and enlarging the items, worked smoothly.


Early Modern Futures Exhibition

The online Early Modern Futures exhibition accompanied the Early Modern Futures Conference held in April 2015 as well as a physical exhibition in the Rare Book & Manuscript Library (RBML).  Developed by graduate students working with Karla Nielsen, RBML’s Curator of Literature, the online exhibition explores how early modern literature conceived of future events, how and whether those conceptions shaped people’s lives, and what this literature offers us in both its historical perspectives and its forward-thinking projections.  From the well-known writers Jonson, Marlowe, and Milton to central literary figures such as Katherine Phillips and works on land surveying, tobacco farming, and the end times of Revelations, this online exhibition offers a fascinating perspective on the ways that early modern literature influenced and was influenced by the histories and futures it encompassed.

LDPD provided the Omeka platform, metadata and imaging assistance, and training for its curators.  The curators of this exhibition were graduate students in Columbia’s Department of English and Comparative Literature.  LDPD has worked with RBML curators to enable graduate students to effectively contribute to the creation and curation of online exhibitions. Graduate students often work closely with RBML on their dissertation research, so providing this support allows LDPD to support the teaching and learning missions of the libraries.

A Church is Born: Church of South India Inauguration

In the exhibition, A Church is Born: Church of South India Inauguration, Omeka is serving a greater purpose than simply content management for an exhibition: it is providing a way of digitally preserving a deteriorating filmstrip that was discovered in the Burke Library Archives.  Documenting one of the most historically significant processes in the Church Union movement, this filmstrip offers a look at the unification process of the Anglican, Methodist, Reformed, Presbyterian, and Congregationalist Churches, which led to the establishment of the Church of South India in 1947.

Project Archivist Brigette C. Kamsler in her work on the Missionary Research Library Archives identified the need to preserve the deteriorating filmstrip and went to great lengths to try to find copyright information about it.  No copyright holder was identified, so the decision was made to digitize the filmstrip and curate it within an online exhibition in order to preserve it for the future, highlight this important chapter in the ecumenical movement, and provide access to the commentary and script that were found with the film.  In order to document our efforts to identify rights holders, Kamsler’s work to locate copyright information is clearly described in the online exhibition itself.

The exhibition includes images from the filmstrip accompanied by the portion of the script that goes with that segment.  The items have been thoroughly described in the metadata, which is available (as always in our online exhibitions) by clicking on the link to item-level information.

The Old York Library

Seymour Durst’s Old York Library Collection

Ebbets Field Brooklyn (Seymour Durst)

Comprising more than 20,000 items, including postcards, books, maps, and more, Seymour Durst’s Old York Library Collection has been digitized as part of the $4 million gift from the Durst family. $1.2 million of that gift was earmarked for the cataloging, housing, and digitization of the collection, and preparation of the digital Old York Library is underway.  The digital Old York Library Collection is the first specialized collection to be built on and take advantage of the Digital Library Collections (DLC) website infrastructure, and it showcases the ways that LDPD is streamlining the digital project website creation process by leveraging the DLC Blacklight and Hydra code base.  Using this strategy, custom collections take advantage of already-existing functionality while also pushing the development of new features and functionality that can then be leveraged by the other collections in the site.  The Durst project, for instance, brings the ability to display search results on an interactive map and the integration of geodata using the OpenStreetMap API.  The content capabilities native to the DLC will be seen to great advantage when exploring this wonderful, and often whimsical, collection of NY- and NYC-related memorabilia.  Some gems to watch for include fascinating photo books that exemplify a then-current cultural trend (perhaps the initial instigation for Instagram?).

The Columbia Spectator


The Spectator is fully digitized and online! Issues from 1877 to 2014 are available.  Whenever possible, pages have been scanned from original paper copies and digitized using state-of-the-art technology that provides full-page, searchable reproductions of articles, photographs, and advertisements.  The digitization of the materials required repairing fragile, heavily used items, which took longer but resulted in high-quality, color digital reproductions.  The project has successfully addressed both archival preservation and information access, offering excellent, detailed searching of article content, advertisements, and images.  The Spectator Archive is a tremendously rich and important part of Columbia University history, including that of its libraries.  The Library News columns from the 1800s include articles and short pieces covering Melvil Dewey’s tenure and influence here at Columbia, particularly his implementation of the Dewey organization and classification system and the establishment of his “School of Library Economy”.  Check out this article from The Columbia Daily Spectator, Volume XXIII, Number 7, 17 January 1889.