Author Archives: Wendell Marsh

Islamic Africa in Search of a Database

Just as the emergence of a digital paradigm is transforming other fields of scholarship, providing new approaches to old questions as well as entirely new questions, Islam in Africa, a field at the intersection of African studies and Islamic studies, faces new openings in research. However, scholars of Islam in Africa have been slow to adapt emergent technologies to their own ends, leaving the question of the digital to the realm of manuscript preservation or stalling it at complaints about the digital divide. At the same time, the ongoing conversation about the Digital Humanities has, with a few notable exceptions, seldom made African studies or Islamic studies (somewhat less true of the latter), let alone Islam in Africa, a research priority.

Despite this state of affairs, the field is ripe for research, with as many opportunities as there are challenges. The current moment can perhaps be compared to the first days of the field of African history in the 1960s, when Africa was in search of an archive. Building archives for the sake of African history in the first days of independence entailed not only the collection and preservation of documents; it also required defining what constituted a “document,” developing research methodologies, and setting standards of professional practice. Similarly, the database is not simply a digital storage space; it has corresponding ways of thinking and attendant scholarly practices that should be intentionally defined for the sake of creating a “usable future.”

These are the stakes that provide the larger backdrop of my project. The immediate query, I assure you, is far more modest. I want to generate a list of local textual production in Arabic using the contents of the Segou collection of the Bibliothèque nationale de France, a library of some 4,000 documents from present-day Mali, West Africa, that was captured in 1890 by French military forces. I then want to use that information to make an assessment of the types of works that were locally produced. While there has been significant, if not magisterial, work done on the Arabic literature of Africa, it has been more bibliographic than analytical. There has also been work on the “Core Curriculum” of West African Islamic learning. However, the scale has been too large to capture local production.

I have used the West African Arabic Manuscript Database as the source of the textual metadata for a process I have been calling “scanning.” By scanning I do not mean digitization; rather, I am trying to invoke the impression one has when walking quickly through the stacks of a library. The operation would probably be recognizable to library scientists as a statistical corpus analysis with an added geographical dimension.
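To make the idea of “scanning” concrete, here is a minimal sketch of tallying a metadata list by subject and by place using only the Python standard library. The record fields (`title`, `subject`, `place`) and the sample entries are hypothetical illustrations, not the actual schema or contents of the West African Arabic Manuscript Database.

```python
# A sketch of "scanning" manuscript metadata: count works by subject
# and by place of production. The records below are invented examples,
# not real catalogue entries.
from collections import Counter

records = [
    {"title": "Work A", "subject": "jurisprudence", "place": "Segou"},
    {"title": "Work B", "subject": "devotional poetry", "place": "Segou"},
    {"title": "Work C", "subject": "jurisprudence", "place": "Timbuktu"},
]

by_subject = Counter(r["subject"] for r in records)
by_place = Counter(r["place"] for r in records)

print(by_subject.most_common())  # subjects ranked by frequency
print(by_place.most_common())    # places ranked by frequency
```

Run over a full catalogue export, a tally like this gives a first, rough impression of what kinds of texts cluster where, which is all that “walking quickly through the stacks” promises.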

DH: The Data Turn or Back to Texts?

The discussion around the Digital Humanities often seems to focus on transforming texts into data. Turning language into information, humanists are told, renders their increasingly archaic materials available to a range of computational methods. The majority of these methods, however, are not part of our training, and so we look toward the other side of the two-cultures divide. Graphs, code, and algorithms were not part of the deal when we decided to study the fuzzy arts. Yet we recognize that the world is now data-driven, and we, naturally, want to go along for the ride.

So we give ground to the number crunching and to the people who seem to know it best. We take our cues from them, adopting their methods, and we try visualizing our material, because our narratives seem to have become so impoverished.

Don’t get me wrong: I think there is great potential in thinking about texts as data and incorporating computational methods. At the same time, however, the move to the digital should not necessarily mean relinquishing the techniques, approaches, and procedures inherited through the humanities. We should not give up ground to the quantitative so easily. I want to argue that thinking about data as text, by scholars trained in “the old way,” might be as productive and important as adapting our texts to the digital and rendering them as data.

By way of reflection on this, I want to provide two excerpts from things I have been reading recently.

Karin Barber has written a theoretical reflection on what a comparative historical anthropology of texts would look like. She is specifically interested in oral and written culture in Africa but her discussion is quite global. She starts with a basic proposition: “A text is a tissue of words” (1). Defining text beyond its materiality is critical for her as she is trying to think of the oral and the written simultaneously. Textuality, then, is in effect “the idea of weaving or fabricating — connectedness, the quality of having been put together, of having been made by human ingenuity” (21) and not its medium. Leaving the question of “the human” aside for now, I think keeping her basic argument in mind can help us think of data and code as texts, to essentially think of the oral, the written, and the digital together.

The basic recognition I find useful is that, following Barber, texts are things, say things, and do things. That texts, whether oral, written, or digital, differ in modality is not an indication that they are fundamentally different.

The second text is from the Natural Language Processing with Python textbook.

[Figure: excerpt from Natural Language Processing with Python defining “text”]

Here, the understanding of “text” is quite similar, although this definition compels us to think more about the progressive levels of a structure. The chapter, of course, is about taking something already recognizable as a text and rendering it available to computational methods. In the example immediately following this one, the method is determining lexical diversity.
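The lexical-diversity measure in question is simply the ratio of distinct word types to total word tokens. A minimal sketch, using only the standard library rather than the NLTK corpora the textbook works with, and with an invented sample sentence rather than any manuscript text:

```python
# Lexical diversity: the ratio of distinct word types to total tokens.
# The sample sentence is an illustrative assumption.

def lexical_diversity(tokens):
    """Return the type-token ratio of a list of word tokens."""
    return len(set(tokens)) / len(tokens)

tokens = "the quick brown fox jumps over the lazy dog".split()
print(round(lexical_diversity(tokens), 3))  # 8 types / 9 tokens ≈ 0.889
```

A higher ratio means a writer repeats words less; it is the kind of structural measurement that only becomes possible once a text is treated as a sequence of countable units.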

Isn’t this what “theory” has been doing for decades?

You might say that this is a bad example because it does not show how “data” can be read as text. My main point, however, is that we should not let the materiality of the text, that is, its medium, trick us into surrendering a textualist orientation to the material. The materiality has changed, but the object is still fundamentally a text: an intentional weaving together of structural elements that must manage the limitations and opportunities of the form itself in order to produce meaning.

This has been more of a musing than an argument. Accordingly, I would like to end with a question.
What would a text-based, or a textualist approach to data look like?