Link-Lives, Historical Big Data: Reconstructing Millions of Life Courses from Archival Records Using Domain Experts and Machine Learning: Proceedings of Linked Archives International Workshop 2021

Research output: Contribution to journal, Journal article, Research, peer-review



The Danish archives offer some of the world’s most comprehensive source coverage, but despite large-scale digitization and transcription projects by diverse actors, there are no shared standards or possibilities for data linkage. The Denmark-based Link-Lives research project (2019-2024) is tackling this disparity by linking individual-level Danish records from census and parish record sources covering 1787-1968. Using a combination of domain expertise and machine learning techniques, the project aims to create a multigenerational database for research. In contrast to small-sample linking or fully automated processes, Link-Lives is creating its own manually linked data to train machine learning models, as well as exploring the impacts of different approaches to linking. Due to personal data protection legislation and proprietary agreements, the data cannot be fully open access, but data outputs will be made available to both researchers and the general public via a website. The project’s interdisciplinary team is based at the Danish National Archives and the University of Copenhagen, in partnership with Copenhagen City Archives, and funded by Carlsberg and Innovation Fund Denmark.
Original language: English
Journal: CEUR Workshop Proceedings
Pages (from-to): 135-143
Number of pages: 9
Publication status: Published - 2021


ID: 324598873