Towards Augmented Participatory Archives: What Role for Citizens’ Collective Intelligence in the Age of AI?

Siham Alaoui, Ph.D.

Introduction

The advent of artificial intelligence (AI) has redefined the way heritage institutions (Galleries, Libraries, Archives, and Museums) manage documentary resources. In the archival sector, AI can be leveraged to support various processes. For instance, natural language processing (NLP) techniques can be used to describe, index, and classify archives, helping identify organic links between historical records. Similarly, Named Entity Recognition (NER), a subset of NLP, is useful for detecting specific expressions or words, such as the names of places, organizations, and people, in processed archival materials. Moreover, NER can be used to automatically detect personal data in archives, which helps heritage institutions better comply with national laws on personal data protection. NER techniques can also improve archival descriptions by identifying harmful expressions in metadata fields associated with archival records. AI can therefore support cultural institutions in their commitments to equity, diversity, and inclusion (EDI) values by ensuring that archival descriptions do not harm the reputation of marginalized groups. In the same vein, Generative AI (GenAI), and more precisely large language models (LLMs), along with handwritten text recognition (HTR) techniques, support the automatic transcription of manuscripts to ensure their readability. Furthermore, the automatic conversion of text to speech, or vice versa, makes audiovisual documents more accessible (e.g. subtitles on videos, storytelling based on textual materials, etc.), which supports EDI values by offering people with disabilities a better user experience with archives. Finally, computer vision can support the indexing, description, and classification of photographs and audiovisual materials and, therefore, improve their discoverability online.
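To make the personal data use case more concrete, here is a minimal sketch in Python. It is only an illustration: it swaps in simple pattern matching (regular expressions) as a stand-in for a trained NER model, and the patterns, the `flag_personal_data` function, and the sample record are all hypothetical. It shows how candidate personal data in metadata fields could be flagged for an archivist to review.

```python
import re

# Hypothetical patterns: a simplified, rule-based stand-in for a trained NER model.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def flag_personal_data(record: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (field, kind, matched_text) for each candidate found in a record."""
    findings = []
    for field, text in record.items():
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                findings.append((field, kind, match.group()))
    return findings


# Hypothetical metadata record for a single archival item.
record = {
    "title": "Letter to the municipal council",
    "note": "Donor contact: jane.doe@example.org, tel. +1 418 555 0199",
}

findings = flag_personal_data(record)
for field, kind, text in findings:
    print(f"{field}: possible {kind} -> {text}")
```

A sketch like this only produces candidates. Deciding whether a flagged value is actually personal data, and what the applicable law requires, remains a human judgment.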

However, AI also has its limitations. Because NLP and GenAI algorithms rely on probabilistic techniques, AI outputs may contain errors. Furthermore, AI analyzes archival materials based on their content, whereas archival description practices also take context into account. Context helps assess the authenticity, reliability, and intelligibility of historical records. Accuracy problems also arise, especially when detecting specific entities (e.g. names, organizations, etc.), because of the morphological variation of proper nouns and the evolution of their accepted forms over time. Multilingualism is another issue to be tackled: algorithms are trained on data available in the most widely used languages worldwide, which engenders linguistic biases against local languages spoken by specific communities. Finally, a major part of a nation’s heritage relies on oral tradition: customs, experiences, and stories are transmitted orally, which means that this knowledge resides with citizens, who draw on their individual and collective experiences to better describe archives.

This raises the question of human agency in this algorithmic context. For decades, archival institutions have relied on the tradition of volunteering, inviting citizens, historians, researchers, and students to take part in participatory archives, using crowdsourcing techniques and leveraging Web 2.0 features to describe documentary heritage (Huvila, 2008). Participatory archives projects are part of this tradition of volunteerism, in which citizens are called upon to describe, annotate, and transcribe archives made available online on participatory platforms, and to highlight their historical and evidential values (Eveleigh, 2016). How viable are these initiatives in the algorithmic era? To answer this question, I propose the expression “augmented participatory archives”: a collaborative practice in which AI and citizens’ collective intelligence coexist to assist archivists in describing documentary heritage. More precisely, I will discuss the role of citizens as co-creators in crowdsourced archives and as auditors of AI systems and their outputs.

Citizens as co-creators

While AI can automate archival description, some key contextual aspects remain beyond the machine’s grasp. Consider the symbolic aspect of archives: these records hold subjective, symbolic, and identity-related value, embodying citizens’ sense of belonging to communities that advocate common social causes (e.g. justice). AI systems will not sufficiently capture this symbolic dimension, because the individuals involved in the events documented in the archives, or their families, ancestors, or even friends, are best placed to provide an accurate archival description of these records. Such description is often enriched by an emotional charge, because citizens recall remarkable events, national celebrations, and feelings of pride, illustrating their collective identity and how it is valued by different members of society. Other emotions may also be felt, such as a sense of injustice or sadness due to traumatic experiences. It is this emotional aspect, transmitted through historical records, that motivates citizens to participate in the collaborative description of archives. They are co-creators who enrich archival descriptions with appropriate comments and keywords, enhancing the value of these cultural artifacts and ensuring their usability by other citizens.

Co-creation also helps address AI limitations, especially in the automatic transcription of historical records using HTR and computer vision, which struggle with linguistic aspects and with the morphological variation of expressions found in handwritten texts. LLMs may not be trained on data in all languages, as the most widely spoken ones are prioritized, which means that some local languages are under-represented. This may have a negative impact on the online discoverability of some archives, as well as on their reuse. In this context, citizens, especially those who belong to communities where under-represented languages are spoken, may contribute to multilingualism by helping ensure that the datasets used to train AI models are diverse enough to support cultural heritage management. Translating datasets into local languages, or even donating archives in those languages, are examples of roles citizens could assume to promote linguistic diversity in the algorithmic era.

Thus, citizens act as co-creators who should collaborate with archivists to enrich AI-generated archival descriptions, by highlighting the symbolic and emotional aspects conveyed by these records. This role is complemented by that of citizens who serve as auditors, verifying the quality of the outputs generated by disruptive technologies.

Citizens as auditors

As noted above, hallucinations and accuracy issues are among the challenges of automated natural language processing. In an archival context, descriptive metadata generated by AI systems can be imprecise or irrelevant, leading to information overload and retrieval issues. Algorithmic biases are also a serious challenge, as the data used to train LLMs is selected not only for automation purposes but also according to subjective criteria (e.g. personal experience of AI experts, social purposes, commercial goals, etc.). Algorithmic biases could perpetuate social inequalities and prejudices against specific social groups by generating harmful content that reproduces social exclusion and injustice. To address these issues, it is essential to assess the compliance of AI tools, audit them to minimize the risks of bias, and enrich AI models’ training data. Explainable AI, as a branch of AI, helps reduce the “black box” nature of AI systems by capturing documentation that explains how decisions are made by disruptive technologies (Bennetot et al., 2024), thereby supporting the traceability of their design and deployment.

In this context, citizens will draw on their collective intelligence to act as auditors of the results produced by AI technologies. They will detect errors, assess the quality of archival descriptions, check for potential biases, and suggest ways to improve the performance of AI systems. These suggestions may include diversifying and prioritizing training datasets to improve algorithmic performance, prevent AI from reproducing social injustices, and ensure that all social groups are equally and fairly represented. Thus, citizens, especially those with appropriate digital skills, assist archivists in their efforts to audit AI systems, working in collaboration with AI experts and data analysts. Citizen participation in AI governance is becoming one of the pillars of democratic states, which is why collective intelligence is important to ensure that AI tools comply with national and local standards and regulations.
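One way to picture this auditing role is as a simple triage workflow. The Python sketch below is illustrative only: the `Review` structure, the verdict labels, the `triage` function, and the thresholds are assumptions made for this example and do not describe any existing participatory platform. It groups citizens’ verdicts on AI-generated descriptions and escalates to an archivist whenever reviewers disagree or report inaccuracy or bias.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Review:
    record_id: str
    reviewer: str
    verdict: str  # assumed labels: "accurate", "inaccurate", or "biased"


def triage(reviews, min_reviews=3, min_agreement=0.66):
    """Group verdicts per record and decide what needs an archivist's attention."""
    by_record = {}
    for review in reviews:
        by_record.setdefault(review.record_id, []).append(review.verdict)

    decisions = {}
    for record_id, verdicts in by_record.items():
        top_verdict, count = Counter(verdicts).most_common(1)[0]
        agreement = count / len(verdicts)
        if len(verdicts) < min_reviews:
            decisions[record_id] = "needs more reviews"
        elif top_verdict == "accurate" and agreement >= min_agreement:
            decisions[record_id] = "accept"
        else:
            # Disagreement, or a majority reporting inaccuracy or bias.
            decisions[record_id] = "escalate to archivist"
    return decisions


# Hypothetical reviews of three AI-generated descriptions.
reviews = [
    Review("A", "r1", "accurate"), Review("A", "r2", "accurate"),
    Review("A", "r3", "accurate"),
    Review("B", "r1", "biased"), Review("B", "r2", "biased"),
    Review("B", "r3", "accurate"),
    Review("C", "r1", "inaccurate"),
]

decisions = triage(reviews)
print(decisions)
```

The thresholds are deliberately conservative in this sketch: an AI-generated description is accepted only when enough reviewers agree that it is accurate, and every other case is routed to a human archivist.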

To sum up, in the context of augmented participatory archives, citizens apply their critical thinking to improve the quality of archival processes. Whether acting as co-creators or auditors, citizens leverage their collective intelligence to make archives a means of supporting social justice, a reflection of a society and its achievements, and a tool for continuous learning. These roles show that machines can only enhance human capabilities, allowing archivists and citizens to focus on more complex tasks.

References

Bennetot, A, Donadello, I, El Qadi El Haouari, A, Dragoni, M, Frossard, T, Wagner, B, … & Diaz-Rodriguez, N (2024) A practical tutorial on explainable AI techniques. ACM Computing Surveys 57, 2: 1-44. https://doi.org/10.1145/3670685

Eveleigh, A (2016) Crowding out the archivist? Locating crowdsourcing within the broader landscape of participatory archives. In Mia Ridge (ed.), Crowdsourcing our cultural heritage (pp. 211-230). Routledge. https://doi.org/10.4324/9781315575162

Huvila, I (2008) Participatory archive: towards decentralised curation, radical user orientation, and broader contextualisation of records management. Archival Science 8, 1: 15-36. https://doi.org/10.1007/s10502-008-9071-0

Cite this article in APA as: Alaoui, S. (2026, June 29). Towards augmented participatory archives: What role for citizens’ collective intelligence in the age of AI? Information Matters. https://informationmatters.org/2026/06/towards-augmented-participatory-archives-what-role-for-citizens-collective-intelligence-in-the-age-of-ai/

Author

  • Siham Alaoui

    Siham Alaoui is an Assistant Professor in information management at Université Laval, where she also serves as head of the undergraduate programs in information management. She holds a PhD in archival science and public communication (2024, Université Laval, Québec, Canada), a master's degree in Information Science (obtained in 2015 from the University of Montreal), and a bachelor's degree in Information Science (obtained in 2013 from the School of Information Science, Morocco). She is interested in digital documentary mediation (information and data management) and its deployment in universities, governments, and non-profit organizations. She is the author of several scientific and professional articles published in specialized information science journals (e.g. Archives, Canadian Journal of Information and Library Science, Documentation et Bibliothèques, Comma, Records Management Journal). She has also presented at many conferences and symposia.
