World Digital Preservation Day 2024

As well as making 1000 years’ worth of historical documents available to present-day researchers, Essex Record Office has a role in preserving current information for researchers of the future to access. As it is World Digital Preservation Day, we thought we would share some of the work we have been doing on the latter: preserving digital records for future generations.

A minidisc, CD-R, mini-DV tape, and floppy disk laid flat on a yellow background.

Types of media in the archive: clockwise from top left – minidisc, CD-R, floppy disk and mini-DV tape

Digital preservation is defined as ‘the activities necessary to ensure the continued access to digital materials as long as necessary … beyond the limits of media failure or technological and organisational change’ (definition taken from the Digital Preservation Handbook, 2nd Edition, https://www.dpconline.org/handbook, Digital Preservation Coalition © 2015, accessed on 29 August 2024). It encompasses documents and files that are created and only exist digitally, known as born-digital; scans of paper records that have been destroyed, known as digitised records; and digital copies of existing paper and analogue records, known as digital surrogates.

Essex Record Office does not have many digitised records, but we do have a considerable number of digital surrogates and a growing number of born-digital records in our collections. Put together, we hold over 83 terabytes of digital records, up from 64 terabytes in 2021. Of these, 97% are digital surrogates, e.g. images of parish registers and wills. Only 3% are born-digital, but these include word-processed documents, images, and sound and video recordings that form part of the Essex Sound and Video Archive.

Looking after digital records poses some challenges that are very different to looking after paper and parchment. Digital records are at risk because the software needed to view them can become obsolete, and so can the hardware that is sometimes needed to read them. For example, we have quite a number of floppy disks in our collections. How many people have computers that still have floppy disk drives? Some risks are, however, similar to those faced by physical documents. Just as paper can decay through high acidity levels or damp conditions, digital records can decay electronically, often when being copied from one file location to another.

We have been looking at the risks our digital holdings face and how we can mitigate them, and have been benchmarking our activities against digital preservation standards devised by the National Digital Stewardship Alliance and the Digital Preservation Coalition. The risk to the holdings has been assessed against a framework provided by the UK National Archives. This was first done in 2022, and revisiting the framework this year has shown that we have made significant progress in lowering the risk of the records becoming inaccessible. This is largely because, in 2022, our digital holdings were not well documented, especially in terms of technical information. We now have a lot more information about the digital records that we hold, which makes it easier to establish where vulnerabilities exist. This means, for example, that we can transfer content held in the file types most at risk of obsolescence into file types with more longevity.
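To give a flavour of what gathering this kind of technical information can involve, the sketch below walks a folder, counts files and bytes by file extension, and then picks out any extensions that appear on an at-risk list. It is an illustration only, not the tooling described in this post: the function names and the at-risk list are hypothetical, and a file extension is only a rough hint of a file's real format, which dedicated tools identify from the file's contents.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root):
    """Count files and total bytes per file extension under root.

    Extensions are lower-cased so that .DOC and .doc are counted together.
    Files without an extension are grouped under "(none)".
    """
    counts = Counter()
    sizes = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes


def flag_at_risk(counts, at_risk_extensions):
    """Return the counts for extensions on a (hypothetical) at-risk list."""
    return {ext: n for ext, n in counts.items() if ext in at_risk_extensions}
```

Running inventory_by_extension over a folder of deposits and passing the resulting counts to flag_at_risk with a set such as {".wps", ".doc"} (an invented list, for illustration) would show how many files might be candidates for moving to a longer-lived format.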

We are also making progress against the standards, partly through the extra information we now have, but also because we have been busy copying all the files kept on CDs, floppy disks and other portable media onto the cloud. This reduces the risk that the information will become inaccessible because the media can no longer be read, either because readers are not kept or maintained, or because the CDs or floppy disks themselves have degraded.

We have also invested in a dedicated computer to carry out digital preservation work, and with it, some specialist software to help. When digital records are deposited with us, we now check them to see what file formats are included and what size the files are. File formats are also checked against The National Archives’ PRONOM registry, which helps us see how much longevity they have and therefore whether we need to move anything to a new format. If we move records from one file location to another, we now always use a piece of software to check that the transfer has completed successfully and has not caused any damage to the files.
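A check of this kind on a transfer is often called a fixity check: a checksum is calculated for the file before and after copying, and the two values must match. The sketch below shows the general idea, assuming SHA-256 checksums and Python's standard library. It is not the specialist software mentioned above, and the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_fixity_check(source, destination):
    """Copy a file and confirm the copy is bit-for-bit identical to the original.

    Raises OSError if the checksums differ; otherwise returns the checksum,
    which can be recorded and re-checked later.
    """
    before = sha256_of(source)
    Path(destination).parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also tries to keep timestamps
    after = sha256_of(destination)
    if before != after:
        raise OSError(f"Checksum mismatch after copying {source}")
    return after
```

Keeping the returned checksum alongside the file means the same comparison can be repeated in future to detect any later decay.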

We are additionally trying to plan for the types of records that we are likely to receive in the future, to make sure we can take them when they are offered to us. Two examples of this are websites and emails. We have a system that can capture copies of websites and present them offline as they would have looked. Consequently, if anyone runs a website of Essex interest that they feel should be preserved, please contact us at ero.enquiry@essex.gov.uk to discuss giving us a copy to look after. This is particularly pertinent if the website can no longer be maintained, but a copy of it is wanted for posterity.

Emails are particularly complicated, as there can be replies from multiple people, they can include attachments and links, and they are littered with personal information. We now have software that provides quite a sophisticated search function for email collections. Names of people and keywords can be searched for, as well as labels that we can allocate to an email or group of emails. It also allows personal information to be identified and redacted, and access restrictions to be put on emails where necessary. We can currently only accept email mailboxes in mbox format, which limits deposits to people with Hotmail or Gmail email addresses, but we would certainly welcome deposits of email mailboxes from these accounts.
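For readers curious about what an mbox file is, it is a single file containing all the messages in a mailbox one after another, and it can be read with widely available tools. The sketch below uses the mailbox module from Python's standard library to do a very simple keyword search over senders and subject lines. It is a much-simplified illustration, not the software described above, and the function name is hypothetical.

```python
import mailbox


def search_mbox(mbox_path, keyword):
    """Return (sender, subject) pairs for messages whose sender or
    subject line contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    matches = []
    # create=False means a missing file raises an error instead of
    # silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            sender = str(message.get("From", ""))
            subject = str(message.get("Subject", ""))
            if keyword in sender.lower() or keyword in subject.lower():
                matches.append((sender, subject))
    finally:
        box.close()
    return matches
```

A real email archive tool also has to deal with message bodies, attachments, threads of replies and personal information, which is where much of the complexity mentioned above comes from.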

A VHS tape, reel-to-reel tape, cartridge, and cassette tape laid flat on a green background.

More media: clockwise from top left – VHS tape, reel-to-reel tape, cassette tape and cartridge

Much more work needs to be done here in the area of digital preservation, particularly relating to how we provide access to these records. Furthermore, even the cloud is not infallible, and back-up copies need to be made of the information kept on it in case of disaster. We also need to develop our email preservation to include Outlook mailboxes. Importantly, we are beginning to work on a long-term plan for digital preservation activities alongside how our records are presented online generally.

Digital preservation is going to become ever more relevant, with increasing quantities of information now being digital only. It is very likely, for example, that people will now have collections of digital rather than printed photographs, and we have done a lot of preparation to make sure that we are ready to accept these types of collections. This is a rapidly developing area and one with many future uncertainties, but it is one we feel we can tackle and advise on.

The Digital Preservation Coalition are launching a toolkit for community archives today, so if this article has prompted questions about how you safeguard your own digital records for the future or those of an organisation you are part of, please feel free to make use of this, or ask us for guidance. More details can be found at https://www.dpconline.org/.