Hello, fellow digital humanists and scientists,
In my quest to learn more about digital methodologies and their application in humanities research, my project has been through numerous twists, turns, and rogue experiments during the past few months. I began my project with the intent to map the animals and monsters of the medieval Icelandic Sagas, but in the process of engaging closely with both texts and technology, I found that my research was raising different questions than I had anticipated. In order to share my methodological adventures and ultimate research findings, I will present two blog posts. The first (this one!) explores the project’s methodology through my three most important takeaways. The second, highlighting my research findings and data processing, will be posted shortly.
First things first:
ASK THE RIGHT QUESTION
When I began to annotate the Icelandic Sagas and use Quantitative GIS software to map their citations of animals and monsters, I became increasingly interested in the place names that comprised my data. In the Sagas, place names reveal more than just geographic location; they act as indicators of personal affiliations with areas, proxies for legal gatherings, and possessive nouns to indicate property. Place names also appear in poetry and speeches, as well as the prose of the sagas, suggesting that an analysis of place names—perhaps more so than animals or monsters, I concluded for purposes of this project—would best reveal what is unique about geographic depiction in literature.
In my first blog post, “Mapping the Medieval Perspective,” I explained my project goals as I conceived of them in November, noting that “one of the primary functions of a text is to describe, and thus create, space…by characterizing its geography, a work of literature [makes] room for a new mental space that challenges the boundaries of the physical.” As I continued to read and map, I realized that the Sagas’ use of place names represented the very crux of what it means to mention place and space in literature: is it imaginative or geographic, conceptual or literal? What are the convergent and divergent characteristics of the aforementioned “mental space,” perhaps best explored through semantic and syntactic qualities of literature, and the “boundaries of the physical” that allow for the geographic mapping of place name in texts?
In order to manageably address this inquiry, I chose the three outlaw sagas within the corpus of the Icelandic Sagas—Grettir’s Saga, the Saga of Hord and the People of Holm, and Gisli’s Saga—for their demonstrated interest in geography as a function of outlawry. Narrowing my field of inquiry, as well as deciding that I wanted to specifically analyze the function of place name from a geographic and literary perspective, were invaluable initial steps that took months of experimentation to shape into a functioning research question: how do place names create literary and geographic meaning in the outlaw sagas?
TAKEAWAY: When beginning a digital project, give yourself a grace period to experiment and explore the question you think you’d like to answer. Also use this time to familiarize yourself with any new software and technology you’ll be using before officially beginning your project and its documentation. Sometimes in this process of discovery, you will find that you are in fact asking a different question, which may require renegotiating your methodological approach.
LOOK FOR FLEXIBILITY
When I first began the project, I realized that my initial task would require transforming qualitative material into quantitative data. How could I convert a hard-cover, printed copy of the Icelandic Sagas from Columbia’s Butler Library into digital data, which could then be read and processed by computer commands? By scanning my books at the Digital Humanities Center and running OCR (optical character recognition) on them with ABBYY FineReader 10.1, I performed the initial step of transforming text into data. However, in order to give this data a greater degree of granularity, I would have to annotate the text in a way that a computer could read.
I initially used the software NVivo, which allows users to annotate texts according to self-designated nodes, or categories. However, I found that once I had annotated information in NVivo, it was difficult to export and process outside the software, and I could not open my files on non-Columbia computers because the software is proprietary rather than open-source. Instead of pursuing NVivo, I researched alternative annotation options and decided to use XML to encode my sagas. XML, or Extensible Markup Language, is both human- and computer-readable and operates similarly to HTML: both languages use tags to categorize data, but unlike HTML’s fixed set of tags, XML allows users to define their own tags, whose contents can then be exported into tabular data.
To give a semblance of order to tagging protocol in XML documents, the Text Encoding Initiative, or TEI, has created a standard system of tags for encoding humanities documents and projects. Organizations from the Folger Shakespeare Library’s Digital Texts to Perseus Digital Library (both excellent resources!) have used TEI protocol to encode XML documents for scholarly work. Furthermore, since it requires no specialized software, and is also the standard for peer-reviewed journals and publications, TEI protocol and XML were ideal for my encoding. In the Digital Humanities Center, I used oXygen XML editor to encode my document, but TEI documents can even be created in Notepad or any other plain-text editor.
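For readers who have not seen TEI before, here is a minimal sketch of how tagged place names might be pulled out of an XML document into tabular rows. It is an illustration under stated assumptions, not the encoding or export used in this project: the sample sentences, the chapter divisions, and the `P001`-style IDs are all invented, and the fragment omits the header a complete TEI file would need. The `placeName` element and its `key` attribute do come from the TEI guidelines, and the code uses only Python's standard library.

```python
import csv
import io
import xml.etree.ElementTree as ET

# TEI elements live in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical fragment written for this example (not a quotation from any
# saga translation, and not a complete TEI file: the teiHeader is omitted).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="chapter" n="1">
      <p>He rode to <placeName key="P001">Bjarg</placeName> and then on to
         <placeName key="P002">Drangey</placeName>.</p>
    </div>
    <div type="chapter" n="2">
      <p>They met again at <placeName key="P001">Bjarg</placeName>.</p>
    </div>
  </body></text>
</TEI>"""


def place_name_rows(tei_xml):
    """Return one dict per <placeName>, recording the chapter it occurs in."""
    root = ET.fromstring(tei_xml)
    rows = []
    for div in root.iterfind(".//tei:div[@type='chapter']", TEI_NS):
        for place in div.iterfind(".//tei:placeName", TEI_NS):
            rows.append({
                "chapter": div.get("n"),
                "place_id": place.get("key"),  # assumed ID scheme
                "place_name": "".join(place.itertext()).strip(),
            })
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["chapter", "place_id", "place_name"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(place_name_rows(SAMPLE)))
```

Each mention becomes one row, so the same place can appear several times with different chapter numbers, which is the shape a map or network join typically needs.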
TAKEAWAY: Before committing to software for any step of your project, ask the following questions: Can I only use the software on a certain computer, and will that hinder my research? Can I export data that I create in this software into other programs? Will this software increase or decrease my ability to share my data with others or process data in a variety of ways? Does the software have adequate documentation so I can use it successfully?
MANAGE AND DOCUMENT YOUR DATA CAREFULLY
After encoding my TEI document, I exported it to an Excel document in tabular form. The export required clean-up in order to make it readable, and the nature of metadata in my project required numerous .csv, .xls, and .xlsx documents with different information in order to join data for my maps and network visualization. Keeping track of these various documents became very important, especially as I created different versions of the data for various purposes, and small mistakes (such as mislabeling a shared ID number across spreadsheets) often had large consequences in data joins that needed correction. More conscientious tracking of my data across its different versions in spreadsheets would have significantly reduced this issue.
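One way to catch a mislabeled shared ID before it derails a join is to compare the ID columns of two tables directly. The sketch below is not a step from this project; it assumes two spreadsheets have been exported to rows that share a `place_id` column, and the sample rows (including the deliberately mistyped `P0O2`) are invented for illustration.

```python
def check_join_keys(left_rows, right_rows, key):
    """Report IDs that appear in only one of two tables sharing a key column."""
    left_ids = {row[key].strip() for row in left_rows}
    right_ids = {row[key].strip() for row in right_rows}
    return {
        "only_in_left": sorted(left_ids - right_ids),
        "only_in_right": sorted(right_ids - left_ids),
    }


# Hypothetical data: a table of places and a table of mentions of them.
places = [
    {"place_id": "P001", "place_name": "Bjarg"},
    {"place_id": "P002", "place_name": "Drangey"},
]
mentions = [
    {"place_id": "P001", "chapter": "1"},
    {"place_id": "P0O2", "chapter": "1"},  # typo: letter O instead of zero
]

if __name__ == "__main__":
    report = check_join_keys(places, mentions, "place_id")
    for side, ids in report.items():
        print(side, ids)
```

If both lists come back empty, every ID has a partner in the other table. Anything listed is a row that a join would silently drop or leave unmatched, so it is worth checking by hand.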
In order to clean up my data to prevent future errors in processing, I created a spreadsheet of every document that I had used in the project to record its file name, its file type, its folder location, a brief description of its contents, and whether it was used for reference or for data. I also created a sheet of information on every piece of hardware and software that I had used in the project, so that my project could be replicated by a third party should a peer review system wish to check my work.
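A file inventory like this can be started automatically and then finished by hand. The following is a minimal sketch, assuming all project files sit under one folder. The column names mirror the fields described above, but the function names and the output file name are hypothetical, and the description and reference-or-data columns are left blank because only the researcher can fill them in.

```python
import csv
from pathlib import Path

FIELDS = ["file_name", "file_type", "folder", "description", "role"]


def build_inventory(root):
    """List every file under root with its name, extension, and folder."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rows.append({
            "file_name": path.name,
            "file_type": path.suffix.lower(),
            # Folder relative to the project root ("." means the root itself).
            "folder": path.parent.relative_to(root).as_posix(),
            "description": "",  # to be written by hand
            "role": "",         # "reference" or "data", filled in by hand
        })
    return rows


def write_inventory(rows, out_path):
    """Save the inventory as a CSV file that opens in any spreadsheet program."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Inventory the current folder; the output name is only an example.
    write_inventory(build_inventory("."), "file_inventory.csv")
```

Rerunning the script overwrites the blank columns, so in practice it makes sense to generate the list once at the start and add new files to the spreadsheet as they are created.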
TAKEAWAY: While I completed a file description spreadsheet and hardware/software documentation at the end of the project, it would have been wise to begin this process immediately. As a humanities scholar, I was not familiar with the practice of lab notebooks or documentation in forms other than the classic bibliography; however, it is important to document your research and manage your data carefully. Without a continuously updated data management and documentation plan, research is difficult to replicate and therefore authenticate. Furthermore, without documentation it can become complicated to organize and keep track of various versions of data, thus increasing the likelihood of simple errors that can be increasingly difficult to find later.
So, once I had created and formatted all of my place name data from the Outlaw Sagas, how did I map, process, and analyze my information? My next post will show two maps that I have made–a GIS map using Leaflet, and a network visualization generated by Gephi–and explore how digital technologies allow for productive approaches to questions of geographic space in medieval literature.
-Mary Catherine Kinniburgh.