Resource “Accessibility” Is More Than Just “Posting It Online”

Quinn Remington, Xinyue Yu, Makenna Page

Not everyone has the time and money to book a flight across the world to look at an artifact in person, so how do researchers with limited funding access one-of-a-kind resources? The Internet is a godsend for collaboration, letting us share photos of ancient pottery fragments and 3D scans of mummified tissue, and even create virtual tours of ancient Egyptian tombs. However, sharing becomes more complicated when the artifact consists of thousands of individual pages in 61 diaries, handwritten by a steamship clerk living in nineteenth-century Iraq.

The Svoboda Diaries Project (SDP) focuses on exactly that. For nearly two decades, this project has used new and exciting digital preservation methods and extensive collaboration to make these diaries accessible to everyone.

The SDP employs interns to transcribe and proof the diaries. After initial transcription by individual interns, the first proofing stage identifies basic errors such as typos and misspellings. In the second proofing stage, team members work together to identify damaged or illegible fragments and standardize transcriptions for publication. During onboarding, interns learn rigorous guidelines that ensure consistency and fidelity to the original text.

(Transcription Pipeline)

This process makes the volumes easier to read, but that’s still roughly forty years’ worth of writing, which is a lot of reading.

The Text Encoding Initiative (TEI) seemed to be a solution. TEI allows encoders to ‘tag’ specific parts of transcriptions with different labels. Once tagged, the diaries become searchable, much as typing a keyword into Google returns relevant articles. For example, if a researcher wants to find examples of medicines used during this time, they can search for “medicine”, and all mentions of terms labeled as medicines will be listed.

(A Diary 55 transcription, an excerpt of ailments and prescribed medication) 
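
The tag-and-search idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample text is invented (it is not a quotation from the diaries), the choice of the TEI `<term>` element with a `type` attribute is one plausible encoding rather than the SDP's actual guidelines, and the helper name `find_tagged_terms` is hypothetical. The fragment also omits the header that a complete TEI file would carry.

```python
import xml.etree.ElementTree as ET

# The standard TEI namespace; the "tei" prefix is just a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample text for illustration only, NOT taken from the diaries.
# The tagging scheme (<term type="...">) is an assumption, not the SDP's.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="entry" n="1">
        <p>Took <term type="medicine">quinine</term> for the
           <term type="illness">fever</term> this morning.</p>
      </div>
      <div type="entry" n="2">
        <p>The doctor prescribed <term type="medicine">castor oil</term>.</p>
      </div>
    </body>
  </text>
</TEI>"""


def find_tagged_terms(xml_text, label):
    """Return (entry number, term text) pairs for every term tagged `label`."""
    root = ET.fromstring(xml_text)
    hits = []
    # Walk each diary entry, then each tagged term inside it.
    for entry in root.iterfind(".//tei:div[@type='entry']", TEI_NS):
        for term in entry.iterfind(f".//tei:term[@type='{label}']", TEI_NS):
            # Join nested text and collapse line breaks or extra spaces.
            text = " ".join("".join(term.itertext()).split())
            hits.append((entry.get("n"), text))
    return hits


if __name__ == "__main__":
    for number, text in find_tagged_terms(SAMPLE, "medicine"):
        print(f"entry {number}: {text}")
```

Because the search matches the label rather than the word itself, a query for "medicine" finds "quinine" and "castor oil" even though neither passage contains the word "medicine". That is the advantage tagging has over plain keyword search, and also why someone has to decide in advance which labels exist.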

However, tagging text takes a lot of time, and we have a lot of text. With a limited number of interns who work on transcription and a limited number who work on TEI, how do we choose our priorities? If our goal is to make this extensive resource accessible to researchers, what research do we prioritize?

This year, our team worked on TEI guidelines, building on previous work at SDP, to tag measurements, names, titles, weather, climate, medicine, illness, food, locations, groups of people, and more. This way, researchers can make broad searches for information from a wide range of topics spanning thousands of pages. But then we have the question of how to develop guidelines further. Do we assign this task to just two people and make them responsible for deciding what future interns from different backgrounds might want to make searchable? We could do that, but we opted for a more inclusive method that allows all our voices to be heard.

Every week, interns gather for an hour-long meeting where everyone is invited to ask questions and share ideas about the project or their progress. These meetings are also a time to hold group brainstorming sessions, share updates from other sections of the team, and volunteer any other information. Additionally, a yearly “All Team Meeting” brings together interns and managers in a group conference. Members update each other on the progress of their projects and work together to establish goals for the next year.

(Management Meeting Notes 9/21/20)

These weekly and yearly meetings allow interns to receive updates on each other’s progress, make suggestions for or against specific tags, and educate participants on the ongoing work. This year’s all-staff meeting featured an activity to familiarize the rest of the team with TEI and a conversation about possible improvements, which the team later implemented.

(Meeting Notes 5/1/24)

Strong communication between teams allows those working on TEI to write guidelines inclusive of interests from varied fields. However, some might not see the merits of writing guidelines. We have limited time, so why focus on this instead of encoding?

In short, this project lacks funding and relies solely on limited volunteer efforts. We cannot tag every piece of information with every tag that could be applied to it; we have to pick and choose which tags might be most useful to the most people. However, this project has been at UW since 2006, and it began even earlier, in the 1970s, at Al-Hikma University. Hundreds of hands have passed through this project, and as long as it exists, more will continue to pass through.

Working on the guidelines creates a set of conventions for those future hands to build on. The guidelines contain the tags that the current content team thought were most necessary. If future collaborators want to expand the tags applied to the published transcriptions, the guidelines provide a starting point. It’s still difficult to prioritize tags because we can see uses for so many of them, but collaborating on wide-ranging guidelines lets us expand our project’s reach; if we can’t implement these tags now, we can at least provide a foundation for the people who will.

(Slide from All-Team Meeting 4/13/24)

We don’t know what novel technologies and AI applications will streamline the future of this project, but we do know that this project exists for the sake of researchers. Our internal documentation is itself a resource. Those who take the SDP in a new direction will have our ideas to consider. The SDP has always had a collaborative mindset, and this mindset extends to the future; participants are not limited to working with just our present company. Instead, we’ve set aside resources and documentation to collaborate with people who haven’t yet discovered our project.

We preserve and publish these diaries to broaden their accessibility, and that extends to different interpretations of how they could best be used. Our teams rely on consistent communication to make sure all our voices are heard, and because we document our processes, guidelines, and ideas, we make sure that those in the future have the option to listen to the voices of the past.

Cite this article in APA as: Remington, Q., Yu, X., & Page, M. Resource “accessibility” is more than just “posting it online.” (2024, May 17). Information Matters, Vol. 4, Issue 5. https://informationmatters.org/2024/05/resource-accessibility-is-more-than-just-posting-it-online/


Quinn Remington

Quinn Remington, B.A., M.A., has interests in linguistics, primary resource preservation, translation, and other adjacent fields. She is currently pursuing her MRes in Clinical Linguistics funded by the Erasmus Mundus scholarship.