Columbia’s Ford International Fellowships Program (IFP) Archive was officially launched on May 16, 2016. The archive includes content from the more than twelve years that IFP was active (2001-2013), during which it awarded fellowships to more than 4,300 individuals with the goal of improving access to higher education worldwide and promoting community development and social justice. It provides access to comprehensive planning and administrative files of the program as well as to related audiovisual materials. The launch of the public Ford IFP Digital Archive was timed to coincide with the May 16 announcement of the recipients of the research fellowships awarded as part of the grant project. (See also the Libraries News Release.) The awardees will come to Columbia in Summer 2016 to work with material in both the paper and digital archives, and will present their findings at a symposium to be held in September.
The IFP Digital Archive presented interesting challenges and surprises that required the Libraries Digital Program Division (LDPD) to develop innovative approaches, workflows, and solutions. Thanks to our work on the IFP Archives, LDPD was able to implement a new processing workflow for long-term content preservation and craft a new online access model that can be reused for the growing number of born-digital archival collections we now acquire.
Among the surprises the IFP Archives held for us, perhaps the most unexpected was the amount of digital content. We were expecting approximately 90% of the material to arrive in print, paper, or other hard copy formats, but instead received more than 70% of the material electronically, and processed more than 350,000 files in 250 different file formats, 10 languages, and multiple alphabets. The challenges we faced in processing the digital content influenced the approaches we took for searching, navigation, and access.
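To give a sense of what taking stock of so many formats involves, here is a minimal sketch of a format inventory that tallies files by extension. It is an illustration only, not the workflow LDPD used: the function name `format_inventory` and the sample paths are hypothetical, and a file extension is only a rough proxy for the actual format.

```python
from collections import Counter
from pathlib import PurePosixPath


def format_inventory(paths):
    """Count files per extension (hypothetical helper, for illustration only)."""
    counts = Counter()
    for path in paths:
        # Normalise case and drop the leading dot: "report.DOC" -> "doc".
        ext = PurePosixPath(path).suffix.lower().lstrip(".")
        # Files without any extension are grouped under a placeholder label.
        counts[ext or "(none)"] += 1
    return counts


# Made-up sample paths, not taken from the archive.
sample = ["a/report.DOC", "a/b/photo.jpg", "notes.doc", "README"]
inventory = format_inventory(sample)
for ext, n in inventory.most_common():
    print(ext, n)
```

A tally like this only shows where the bulk of the material sits. Confirming what each file really is would require a dedicated format-identification step.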
Our most interesting challenge for the online archive was to find ways to use our search and discovery systems to return meaningful search result sets for digital content that had no real metadata, only file names and directory paths. We also needed to create both a public online access system and a separate restricted access mode in order to protect some types of confidential information. Before the twenty-two IFP offices closed, we worked with them to separate out fellows’ private information, but in the end we had to review all of the content again. Since there was no way to do this programmatically, our Digital Preservation Librarian, Dina Sokolova, working with Jane Gorjevsky in RBML and an outside consultant, opened each document from the non-embargoed portion of the archive (over 100,000 to date) to check for personally identifiable information.
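As an illustration of the general idea of searching content that has only file names and directory paths, the sketch below turns a path into a few index-ready fields. It is a minimal example under stated assumptions, not a description of our actual indexing code: the field names (`id`, `folders`, `extension`, `keywords`), the helper functions, and the sample path are all hypothetical.

```python
import re
from pathlib import PurePosixPath


def split_words(name):
    """Split a file or folder name into lowercase search tokens."""
    # Treat underscores, hyphens, dots, and whitespace as word separators.
    spaced = re.sub(r"[_\-.\s]+", " ", name)
    # Break ASCII camelCase boundaries, e.g. "FinalReport" -> "Final Report".
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", spaced)
    return [word.lower() for word in spaced.split()]


def path_to_fields(path):
    """Derive searchable fields from a relative path (hypothetical schema)."""
    p = PurePosixPath(path)
    folders = list(p.parent.parts)
    keywords = [w for folder in folders for w in split_words(folder)]
    keywords += split_words(p.stem)
    return {
        "id": path,                                    # the path doubles as an identifier
        "folders": folders,                            # could back a browse-by-folder facet
        "extension": p.suffix.lower().lstrip("."),     # rough format hint
        "keywords": keywords,                          # free-text search terms
    }


# Made-up example path, not taken from the archive.
fields = path_to_fields("Brazil_Office/Selection/FinalReport_2009.doc")
print(fields["keywords"])
```

Fields like these could then be loaded into a search index so that a query on a word from a folder name or file name returns the document even though it has no catalog record. Names in scripts without letter case or word separators would need language-specific tokenization that this sketch does not attempt.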
The IFP material also required that we develop ways to process and work with many different file formats and with content in multiple languages and scripts within our existing technology framework (Hydra, Fedora, Blacklight, Solr). Data loads are still in process.
For this collection there are three types of access, which will likely also be the case for similar collections in the future:
- a public website that currently includes approximately 55,000 items, with more still in process
- specialized restricted-access laptops that can be requested and used in RBML to access an on-site-only electronic archive that currently includes approximately 4,000 items, as well as the publicly available website
- a conventional on-site paper archive and finding aid, the latter with links into the online database where relevant.
We are continuing work on making the audio and video content easier to access. This content is currently downloadable as well as playable, provided one has the appropriate playback software. By the end of the project, all audio and video content will also be available via our new streaming media server, implemented as part of the IFP project. In addition, the IFP Digital Archive will undergo formal usability testing this summer so we can understand how well our interface design is working and where we can still make improvements.
LDPD has been fortunate to work closely with Jane Gorjevsky and RBML, as well as with the Preservation and Digital Conversion Division, to make the Ford IFP Archive available. More background information about the project, including links to relevant articles and presentations, is available on our Behind the Scenes website. We are looking forward to continuing to learn from our work with this incredibly rich material.
Project contacts: Dina Sokolova, LDPD (ds2057@columbia.edu); Jane Gorjevsky, RBML (jg2138@columbia.edu).