The Multigenerational Register

With support from the Novo Nordisk Foundation, the Danish National Archives is developing a new multigenerational register documenting family relationships among Danes since 1920.

Cross-generational research made possible

Danish research is internationally recognized for its ability to analyze long-term historical data from registries, biobanks, and archives.

Currently, data from various sources can be linked through the personal identification number (CPR number) assigned in the Central Person Register. There is a limitation, however: studies of health and social mobility can cover only a couple of generations, because the CPR register records family relationships only for children born after 1960.

Data predating the CPR register

With the creation of a groundbreaking multigenerational register, the Danish National Archives aims to extend research possibilities by linking data from periods predating the CPR register.

The register will identify familial relationships among all Danes born as far back as 1920, enabling studies of hereditary and familial influences on aspects such as health and disease.

Establishing family relationships

The multigenerational register will combine information on family relationships from the CPR register with historical records from parish registers. This allows for the reconstruction of family ties among Danes since 1920.
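To illustrate the idea of combining two sources of family relationships, the following is a minimal sketch in Python. It assumes each source can be reduced to a mapping from a person to that person's known parents. The identifiers, the function names `merge_parent_links` and `ancestors_by_generation`, and the example data are all hypothetical and are not taken from the project.

```python
from collections import defaultdict


def merge_parent_links(*sources):
    """Combine several child -> parents mappings into a single mapping."""
    merged = defaultdict(set)
    for source in sources:
        for child, parents in source.items():
            merged[child].update(parents)
    return dict(merged)


def ancestors_by_generation(person, parent_links, max_generations):
    """Return a list of sets: index 0 holds parents, index 1 grandparents, etc.

    Each ancestor is reported once, at the first generation where they are
    reached, which also guards against accidental cycles in linked data.
    """
    generations = []
    current = {person}
    seen = {person}
    for _ in range(max_generations):
        next_generation = set()
        for individual in current:
            next_generation.update(parent_links.get(individual, ()))
        next_generation -= seen
        if not next_generation:
            break
        generations.append(next_generation)
        seen |= next_generation
        current = next_generation
    return generations


# Hypothetical example: the CPR-style source only knows the younger links,
# the parish-style source supplies the older ones.
cpr_links = {
    "child_b1990": {"mother_b1965", "father_b1963"},
    "mother_b1965": {"grandmother_b1940"},
}
parish_links = {
    "grandmother_b1940": {"great_grandfather_b1920"},
}

links = merge_parent_links(cpr_links, parish_links)
lineage = ancestors_by_generation("child_b1990", links, max_generations=5)
for depth, people in enumerate(lineage, start=1):
    print(depth, sorted(people))
```

Neither source alone reaches the fourth generation in this example; only the merged mapping does, which is the point of combining the two.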

The project involves collaboration with researchers from the Center for Register Research at Aarhus University, who are responsible for linking individuals across the CPR and parish records.
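The text does not describe the linkage method itself. As a rough illustration of what linking individuals across two sources can involve, here is a minimal sketch of deterministic matching on a normalized name and birth date. The record layout (`PersonRecord`), the key definition, and the rule of accepting only unambiguous matches are assumptions made for this example, not the project's actual procedure.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical minimal record; real sources carry many more fields."""
    record_id: str
    full_name: str
    birth_date: date


def linkage_key(record):
    """Build a comparison key: case-folded, whitespace-normalized name plus birth date."""
    name = " ".join(record.full_name.casefold().split())
    return (name, record.birth_date)


def group_by_key(records):
    """Index records by their linkage key."""
    groups = defaultdict(list)
    for record in records:
        groups[linkage_key(record)].append(record)
    return groups


def link_records(cpr_records, parish_records):
    """Link records only when a key is unique in BOTH sources."""
    cpr_groups = group_by_key(cpr_records)
    parish_groups = group_by_key(parish_records)
    links = {}
    for key, cpr_group in cpr_groups.items():
        parish_group = parish_groups.get(key, [])
        if len(cpr_group) == 1 and len(parish_group) == 1:
            links[cpr_group[0].record_id] = parish_group[0].record_id
    return links


# Hypothetical example data.
cpr = [
    PersonRecord("cpr-1", "Anna  Jensen", date(1931, 5, 2)),
    PersonRecord("cpr-2", "Peter Hansen", date(1940, 1, 15)),
]
parish = [
    PersonRecord("parish-7", "anna jensen", date(1931, 5, 2)),
    PersonRecord("parish-8", "Peter Hansen", date(1940, 1, 15)),
    PersonRecord("parish-9", "Peter Hansen", date(1940, 1, 15)),
]

print(link_records(cpr, parish))
```

Ambiguous keys are left unlinked rather than guessed. A production approach would likely need additional fields or probabilistic matching to resolve them.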

Health and disease across 3–5 generations

Many social and health issues are thought to be influenced by familial factors.

The ability to study phenomena over 3–5 generations will provide valuable insights into how hereditary and family conditions impact health and social trajectories. This knowledge could be instrumental in developing better treatments, personalized medicine, and prevention strategies for social and health challenges.

Digitizing analog sources with AI

To link the CPR register with data from the parish records, researchers at the University of Copenhagen’s Center for AI are developing algorithms to interpret the handwritten records. This is a challenging task, given the diverse handwriting styles in parish records from over 2,000 parishes spanning nearly 60 years.

The algorithms are trained using both manually transcribed records and a dataset combining names and dates from parish records with digital representations in the CPR register.

Training data

In collaboration with the Center for AI at the University of Copenhagen, the team behind the multigenerational register has created a dataset containing images of birth dates and serial numbers from parish records. Each image is labelled with the corresponding content, i.e. the transcribed date or serial number.

This dataset is used to train machine learning models to recognize handwritten dates and numbers. Its use is documented in the article "Date Recognition in Historical Parish Records". The dataset is open source and available on GitHub.
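A labelled dataset of this kind makes it possible to score a recognition model against the transcribed labels. The sketch below shows one simple way to do that: exact-match accuracy on the whole date, plus accuracy per field. The `DD-MM-YYYY` label format, the function names, and the example values are assumptions made for illustration; the dataset's actual label format may differ.

```python
FIELDS = ("day", "month", "year")


def parse_label(label):
    """Split a 'DD-MM-YYYY' string (assumed format) into day, month, year strings."""
    day, month, year = label.split("-")
    return day, month, year


def date_accuracy(predictions, labels):
    """Return exact-match accuracy and per-field accuracy.

    Assumes every prediction and label is a well-formed 'DD-MM-YYYY' string.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one labelled example is required")

    total = len(labels)
    exact_hits = 0
    field_hits = dict.fromkeys(FIELDS, 0)
    for predicted, expected in zip(predictions, labels):
        exact_hits += predicted == expected
        pairs = zip(FIELDS, parse_label(predicted), parse_label(expected))
        for field, predicted_part, expected_part in pairs:
            field_hits[field] += predicted_part == expected_part

    scores = {"exact": exact_hits / total}
    for field in FIELDS:
        scores[field] = field_hits[field] / total
    return scores


# Hypothetical labels and model outputs.
labels = ["02-05-1931", "15-01-1940", "30-11-1925", "07-07-1922"]
predictions = ["02-05-1931", "15-01-1948", "30-11-1925", "01-07-1922"]

scores = date_accuracy(predictions, labels)
print(scores)
```

Reporting per-field accuracy alongside the exact-match score shows whether errors concentrate in the day, the month, or the year.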

Collaborations and funding

The multigenerational register project is funded by a grant of 38 million DKK from the Novo Nordisk Foundation. It is led by the Danish National Archives in collaboration with researchers from Aarhus University, the University of Copenhagen, and the Coordinating Body for Register Research (KOR).

For more information, visit the Coordinating Body for Register Research’s website or read the November 2020 press release on the project’s launch.