About - Computational Archives AHRC Research Network

The large-scale digitisation of analogue archives, the emergence of diverse forms of born-digital archives, and the new ways in which researchers across disciplines (as well as the public) wish to engage with archival material are disrupting traditional archival theories and practices and presenting challenges for practitioners and researchers who work with archival material. They also offer enhanced possibilities for scholarship, through the application of computational methods and tools to the archival problem space, and, more fundamentally, through the integration of ‘computational thinking’ with ‘archival thinking’. This potential has led the collaborators in this proposal to identify Computational Archival Science (CAS) as a new field of study.

The increasingly digital nature of the archive provides opportunities as well as challenges for addressing this question, by using a range of computational methods to meet the increasingly complex demands of both archival users and practitioners. While much of the digital data currently held by institutions is the result of digitisation, born-digital archives are also in scope for the Network, given their anticipated growth, the typical loss of inherent structure when these collections are captured into the archive, and the great potential of automation for re-contextualisation and access to such records.

The context of a record is key for understanding its value as historical evidence, and the ability to map out and provide access to that context is key for conferring value on what would otherwise be (relatively) disconnected pieces of information, enabling them to be used effectively – found, understood and re-purposed – by historians and other archive-centric scholars drawing on the archival evidence base.

This Network will organise a series of events to explore this question of contextualisation, whether through capturing metadata, enhancing records by semantic tagging, or indeed contextualising records with other records, and thus connecting up previously disconnected information into ‘knowledge graphs’. We will not focus on specific technologies, but rather examine a range of technologies with potential for meeting this challenge, including natural language processing, graph technologies, machine learning, probabilistic approaches, and other methods from the broad field of data science and AI. The focus will thus be on the research question rather than the technology.