Can data from spreadsheets be preserved forever?

14 September 2023

The Danish National Archives has established an international collaboration to address the issue of long-term preservation of spreadsheets.


Preserving data from spreadsheets

Today, spreadsheets are converted to an image format for long-term preservation, which makes it difficult to extract and reuse the data. Therefore, a number of archives in Europe, as well as the Open Preservation Foundation, are working to find solutions to this problem.

Preserving governments’ original data from spreadsheets is an international challenge. When data cannot be properly extracted and reused, it has consequences for both research and historical analysis.

International collaboration

Converting data into a spreadsheet format ensures high data quality, but it is an uncertain method for long-term preservation.

The TIFF format, which the Danish National Archives currently uses as a preservation format, is a secure format for long-term preservation. However, reuse is cumbersome because the data is saved as an image and must therefore be processed with optical character recognition (OCR) before it can be reused.

To address this dilemma, the Danish National Archives has established a collaboration with the Open Preservation Foundation, the international membership organisation for digital preservation, and with the national archives of the Netherlands and Estonia to find a solution that works internationally.

Making a difference in data quality

The Danish Archivist General, Morten Ellegaard, states:

“Denmark is at the forefront of this international collaboration, which aims to solve the challenges of preserving data in spreadsheets. The collaboration is a great example of how we can collectively develop our capabilities to archive digitally created data and ensure that data is not lost or altered. It can help make a difference in the quality of data for the future and for various purposes, including research.”

The international collaborative group will work to create a new standard for preserving spreadsheets in the OpenDocument Spreadsheet (ODS) file format. The finished product will be released as open source on GitHub by the end of 2023.

Julie Allen, the director of the Open Preservation Foundation, looks forward to the outcome:

“We are really happy with this cooperation and the results that it will achieve. It supports our mission as a global not-for-profit membership organisation, working to advance shared standards and solutions for the long-term preservation of digital content.”