Learning from the crowd: the CULHebrewmss Twitter bot

Magical manuscript that had been erroneously labeled “Talmud” prior to metadata correction (MS X893.16 Se35)

In 2018, we partnered with developer Russel Neiss to create an automated Twitter account that randomly selects and posts images from the Hebrew manuscript collection on the Internet Archive. In doing so, we have not only made the manuscripts available to an audience that includes people who could not or would not step into RBML, but we have also learned a tremendous amount from scholars, students, and sharp-eyed Twitter scrollers. Beyond small corrections of human error in cataloging (a language error here, a typographical error there, a metadata error in a third place), we have gained far more knowledge about the materials we care for and provide access to. Catalogers rarely have time to read an entire manuscript or book; their job is usually to provide clear (and often, due to time constraints, basic) information so a user can see if an item might be useful for a particular research topic. Putting the images with basic metadata on the web, however, invites everyone to come and read what they would like, and often yields new and interesting discoveries. We are very grateful when researchers share these findings, as we can share them in turn with future scholars and students!

Example from the Mendelsohn catalog with two Yuds as one keyboard stroke (חיים)

One of the very first comments on the bot when it premiered was from Nick Block, a professor of German and Yiddish at Boston College, who suggested that Isaac Mendelsohn’s Descriptive catalogue of Semitic manuscripts (mostly Hebrew) in the Libraries of Columbia University must have been typed on a Yiddish typewriter. Block pointed to a distinctive feature of the Hebrew letter Yud (י): in Yiddish, a doubled Yud (two of the same letter in a row) has its own single key on a Yiddish keyboard. The close spacing of the double Yuds in the catalog suggested that they were typed on a Yiddish typewriter rather than a Hebrew one. Interestingly, the Libraries’ archives specifically note the acquisition of a “Hebrew typewriter” in the 1940s when Mendelsohn began his catalog – but perhaps this was actually a Yiddish typewriter (which uses Hebrew characters, and has some characters unique to Yiddish as well).

Images and transcription of ‘Olam Hafukh, MS X893 Ab8

In the last year, the most commented-on manuscript was probably ‘Olam Hafukh by Abraham Catalano, which discusses the Jewish community’s response to the plague in Padua in the 1630s. Dr. Joshua Teplitsky (Stony Brook University) wrote an essay on Catalano’s manuscript for the Institute for Israel and Jewish Studies’ Magazine, which was then posted to the RBML blog. In June of 2020, Twitter user @eliyyahou read and transcribed some of the poetry from the manuscript, so that interested readers without expertise in 17th-century Italian Hebrew handwriting could also read the text (pictured). Earlier, in March of 2020, before the full impact of the COVID-19 pandemic was known, S.D. Homnick commented on some of the more tragic aspects of the plague documented in the manuscript, including the story of an infant who was fed goat’s milk after the death of its mother, but sadly died 15 days later.

Detail of ‘Olam Hafukh (MS X893 Ab8), telling the story of the baby fed with goat’s milk.
Amulet, General MS 194

In April of 2021, Aharon Varady identified an amulet for a woman in labor (specifically, Mirḳada d.m Ṿiadah bat Donah) as the same one that had been described by Richard James Horatio Gottheil (Professor of Rabbinical Literature and Semitic Languages at Columbia from 1890 until his death in 1936) as “Amulet No. 42” in James Montgomery’s Aramaic Incantation Texts from Nippur (1913). As Varady tells the tale, Montgomery wasn’t sure whether the text came from an amulet bowl or something else, and even Gottheil himself couldn’t remember where he saw it. Gottheil’s collection was donated to Columbia in 1936, but this manuscript was added to a box – without identifying information. It was rediscovered and cataloged by Yoram Bitton in 2011 as part of a project to process the uncataloged Hebrew manuscripts, but it wasn’t until it was digitized and made widely available online that it could be identified as Gottheil’s Amulet No. 42!

India ha-hadasha; Nahuatl glossary with Hebrew explanations. MS X893 K82.

One of the most exciting findings brought to our attention through the Twitter bot was the discovery of a Hebrew-Native American dictionary from 1557. We’ve discussed the India ha-Hadashah manuscript here in a previous post, detailing how the manuscript was translated by the 16th-century Jewish historian Yosef ha-Kohen from a book originally printed in Spanish: Francisco López de Gómara’s “Historia General de las Indias.” Thanks to its conservation (funded by the Berg Foundation) and digitization, we were able to upload it to the Hebrew Manuscripts page on the Internet Archive, which allowed a scholar at the Princeton Geniza Lab to read its contents more closely. The Geniza Lab itself is another wonderful project promoting the manuscripts of the Cairo Geniza, and its Twitter account is filled with interesting materials covering a variety of languages and topics, and spanning millennia. A tweet thread from the Geniza Lab identified a glossary of words in a dialect of Nahuatl, a language used by the Nahua, an indigenous people in modern-day central Mexico. Terms in the glossary include Ayotochtli (armadillo), Okpatli (a kind of root), Teocalli (the name of an altar), Tlachtli (a court used in an Aztec ball game), Tianquiztli (marketplace), and many others. According to the author of the tweets, who read the text thoroughly, the glossary was not in de Gómara’s original volume, but was the creation of Yosef ha-Kohen, the Jewish translator.

Liturgical supplications according to the rite of Ioannina (Greece, 1876). MS X893 J5183.

In May of 2020, a Twitter user named Moshe took the time to read a liturgical manuscript from Greece – in doing so, he identified a particular prayer as one written for the coronation of Sultan Murad V, asking for his success (it seems that Murad wasn’t very successful; the sultan was deposed after only 93 days).

Some threads show discussions among scholars, both lay and professional, of where one manuscript or another may have been written based on clues in the text; others discuss who the scribe might have been. Often users will provide citations for editions of particular manuscripts or further sources to provide context (such as this suggestion by Rabbi Elli Fischer that Columbia’s copy of Hayim Yosef David Azulai’s Shem ha-gedolim was actually the proof copy for the first printed edition, followed by confirmation in Isaac Mendelsohn’s catalog!).

So what have we learned from this experience? Oh, so much. But the most important lesson has been how critical it is to provide open data about, and open access to, our manuscripts. Posting high-quality images on the Internet Archive was only the first step. Creating a space where Internet users could “happen” across the materials has yielded far more data for us: data which, of course, we’ll be putting back out for people to read, study, and ultimately (we hope) send back to us significantly enhanced so we can continue the cycle!
