OpenGulf is a research group that aims to study the past and present of the Gulf region through the lens of data.
One of the primary aims of the research group is to create openly available datasets containing information extracted from sources originating in the Gulf region.
Whereas many other regions of the world have systematically collected, catalogued and digitized archives for academic study, in the Gulf region such resources are dispersed and (under)catalogued. Because of the postcolonial situation of the current states of the region, the archives we work with are multilingual. Some of them have been digitized, but many others have not. OpenGulf is neither a digitization project nor an archiving project. Rather, we focus on documents that emerge from the Gulf region, engaging them for data creation and digital storytelling.
OpenGulf research teams use well-known digital humanities tools for data creation, including annotation and extraction (Recogito), optical character recognition (Tesseract), automated transcription of handwriting (Transkribus) and crowd-sourced transcription (FromThePage). We work with both handwritten and typewritten text as well as raster maps and images. Since humanities data often carries a significant degree of uncertainty and incompleteness, we attempt to document the provenance of our datasets as well as the decisions that went into the data models and data creation.
Wherever possible, we aim to create data from publicly available archival materials in an accessible, transparent manner, and to publish it under clear reuse licenses. Since our teams are heterogeneous and work across a variety of infrastructures, our workflows center on minimal computing practices (cite minimal computing) and open access tools. The kinds of data OpenGulf currently creates include transcribed archival documents, structured datasets, annotated digital text, aligned lists of transliterated names with their original spellings, and vectorized polygons derived from historical maps of the region. Visualization of this data plays an important role in its dissemination.
In addition to putting the Gulf “on the map” when it comes to historical data, OpenGulf foregrounds community building around data that is regionally meaningful. We employ many student assistants to create and contribute to projects for the research group, and we train them in a variety of skills essential to the rapidly changing landscape of digital and computational humanities.