Who’s afraid of the dark age?

FOIMan explains why he’s not afraid of the dark age.

In my last post I recounted how pioneers in the UK have contributed to the development of digital preservation solutions over the last 20 years. This was inspired by several news articles at the end of last week reporting on Google Vice-President Vint Cerf’s comments heralding a “digital dark age”. In this piece I want to give my personal reaction to this apocalyptic prediction.

As I indicated in my previous post, the issues raised by Cerf are not new. And indeed he isn’t the first to warn of the dire consequences of inaction.

But are such visions realistic? My personal view is that they’re not. Let’s consider what happened in the past.

Acts of Parliament on parchment have survived for over 500 years

We tend to assume that electronic formats are somehow more fragile than previous media. This isn’t in fact the case. As any archivist or conservator will tell you, failure to keep paper or parchment at the right levels of temperature and humidity can lead to it becoming unreadable. In one job early in my career I found records being stored in damp, dank conditions under the Town Hall steps. They were covered in mould and fungi – to all intents and purposes unreadable. Some of the records were less than 10 years old.

Just as servers can be hacked, intruders or employees with a grievance can access offices and pick up files they shouldn’t have seen. Careless employees can leave files on trains or even in evacuated premises. Fire or flood can destroy whole warehouses of physical records without the insurance of a backup to restore the files.

These risks have always existed. And until more enlightened times, even governments failed to keep their records in suitable storage. Just read Caroline Shenton’s excellent book about the fire that destroyed the Palace of Westminster if you want some illustrations of this.

And yet… Record Offices hold vast quantities of physical records – they complain of lack of space and have significant backlogs requiring cataloguing. Historians will always want more, but the fact is that despite the poor quality of storage in previous eras, the limited literacy of earlier generations, and in some cases the passing of many years, archivists hold vast volumes of evidence on our past.

The problem in our era is not a scarcity of information. It’s a glut of it. And with so much information – whatever format it is originally created in – it is inevitable that a huge proportion of it will survive. Indeed, it is the fear that information will live on indefinitely that feeds the current debate over the right to be forgotten.

It will survive because it is popular – the more copies of a file that exist, the more likely it is that some will remain (take, for example, the four copies of Magna Carta recently exhibited together in London). It will survive because people are interested in it. FOI will play its role – copies of government documents can now be found in many personal collections and on websites, as well as stored by their creators. A proliferation of information – facilitated in the digital world – will guarantee that vast quantities of it remain accessible to future historians.

The real problem is not whether information will remain accessible, but which information should do so. As I’ve suggested, lots of it will live on purely through chance. But it is important that organisations (and individuals too) identify the records that have most value – especially long-term value – and take deliberate action to preserve them. This too will happen, because there are commercial, governmental or sentimental reasons to retain them. In my last post I explained the need for pharmaceutical companies to retain digital records – so they took steps to ensure that those records would be preserved. Similarly, the digital photographs that you look at the most – of your children, of your significant experiences – will almost certainly survive, because you will regularly look at them, and if you have problems accessing them you will do something about it.

Digital records require specific techniques to ensure their preservation (as indeed do records printed on paper or written on parchment). That’s why the work of the pioneers I wrote about in my last post is so important. But in principle at least, preserving digital records is no different to preserving records created in other formats. It requires the organisation or individual first to identify what needs to be kept (a point made by the National Archives’ Chief Executive, Jeff James, on Saturday’s Today programme on BBC Radio 4). How it will be kept is a secondary and technical question, but one that will be answered if the information really does have value.

This is why, short of a nuclear holocaust (in which case I suspect we will have more pressing concerns should we survive), I don’t think a dark age is coming, digital or otherwise.

Preserving our digital past

FOIMan highlights the important work of UK pioneers to preserve digital records for future generations.

Thank goodness for Vint Cerf. Cerf’s up because he has been speaking at a conference in the US about the dangers of a “forgotten century”. He is highlighting the problem of digital preservation, which is an important one. And because he’s a “web pioneer” and a Google Vice-President, the media are listening to him.

If you read the BBC News website or the Guardian this weekend, you’d be forgiven for thinking that this problem has only just dawned on very clever people in the US. The truth is that whilst it is welcome that the issue has finally attracted journalists’ attention, archivists and people in the IT industry have not only been aware of it for some time, but have also been putting forward solutions. Many of them are right here in the UK.

Good luck accessing your 1980s project report (or loading Manic Miner) from this today

In a nutshell, the problem is twofold. One, the hardware that runs computer programs is regularly superseded. On my records management courses, I illustrate this by producing a 5¼″ floppy disk from the 1980s. The hardware that can read that disk now exists only in a handful of museums, but when I was at school, it was used there and in most offices.

Two, the software – the programs used to manage your email, to retain your photographs, to write your letters – also changes all the time. Each time you get a message saying that a new version is available, the chances of being able to open a document created in the original version are reduced. Software manufacturers are focussed on creating something that will do lots of sexy new things rather than something that will continue to open your dissertation from ten years ago.
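
To make the format problem concrete: most files begin with a few telltale “magic bytes”, and digital preservation tools use such signatures to work out what a file is, even when nothing installed on your computer can open it. Here is a minimal, hypothetical sketch in Python – a toy sample of signatures of my own choosing, not the code or signature data of any real tool:

```python
# A toy sketch of signature-based format identification. The magic bytes
# below are a small illustrative sample; real tools draw on registries
# of thousands of signatures.

from pathlib import Path

SIGNATURES = [
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP container (also .docx/.xlsx)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. Word 97-2003 .doc)"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
]

def identify(path: Path) -> str:
    """Best-guess format name from a file's leading bytes."""
    with path.open("rb") as f:
        header = f.read(16)  # longest signature above is 8 bytes
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown - flag for manual appraisal"

if __name__ == "__main__":
    import sys
    for filename in sys.argv[1:]:
        print(f"{filename}: {identify(Path(filename))}")
```

Identification is only the first step, of course – knowing that a file is a Word 97 document doesn’t open it – but it tells you which files are at risk and need migrating while software that can read them still exists.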

Twenty years ago this was already a problem in the pharmaceutical industry. Research increasingly depended upon technology that produced data which could not be recorded or preserved by traditional methods. This was a concern because regulatory approval for drugs required the experimental data to be available for long periods. And if you wanted to demonstrate, for patent purposes, that you had discovered a drug, then again you needed the records to prove it. A drug like Viagra, say, discovered by scientists in the UK working for Pfizer, is worth billions to the company. So the records proving its discovery are also worth billions.

Back then I was starting my career in records management, and one of the reasons I pursued it was that at Pfizer I saw the discipline at its cutting edge. Not only did my colleagues invite experts in the field like David Bearman to visit us to discuss the problems, but they were proposing solutions too. They developed, with the help of a UK company called Tessella, a system called the Central Electronic Archive, specifically to retain – and preserve – this important experimental data. The CEA has since been retired, but the work done in establishing it helped to ensure that its contents remained accessible and could be migrated to its successor systems.
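
What does a system like that actually do to “preserve” records? I can’t speak to the CEA’s internals, but digital archives generally rest on two techniques: bit-level integrity (or “fixity”) checking, so that silent corruption is caught while another copy still exists, and format migration, so that files can still be opened. A minimal hypothetical sketch of the fixity side, using only the Python standard library (the manifest filename is my own invention):

```python
# A minimal, hypothetical fixity-checking sketch: record a checksum for each
# file at ingest, then re-verify on a schedule. A mismatch means the bits
# have changed and the file should be restored from another copy.

import hashlib
import json
from pathlib import Path

MANIFEST = Path("fixity_manifest.json")  # illustrative filename

def checksum(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record(paths: list[Path]) -> None:
    """Ingest: store the current checksum of every file in the manifest."""
    manifest = {str(p): checksum(p) for p in paths}
    MANIFEST.write_text(json.dumps(manifest, indent=2))

def verify() -> list[str]:
    """Audit: return the files whose bits no longer match the manifest."""
    manifest = json.loads(MANIFEST.read_text())
    return [name for name, expected in manifest.items()
            if checksum(Path(name)) != expected]
```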

This work started in the pharmaceutical industry but its benefits can now be felt in the public sector. My former boss, David Ryan, was headhunted from Pfizer to set up the National Archives’ Digital Preservation Unit. The Unit has produced useful tools such as PRONOM, a registry of file formats and of the software that can read them, which organisations can use to work out how to open documents created in older versions of a program. It has also established a programme to extract digital data from central government departments. One example was a 3D reconstruction of a shipping disaster used at an inquest, which could otherwise never have been captured and preserved. In establishing the technical infrastructure for these services, the National Archives has continued to rely on Tessella, who won the Queen’s Award for Enterprise in 2011 for their work on this.
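
To illustrate how a registry like PRONOM gets used: each format it describes has a unique identifier (a “PUID”), and once a file has been identified against the registry, a preservation system can look up what the file is and decide what to do about it. The sketch below is hypothetical – the migration advice is my own invented example, not National Archives guidance – though fmt/40 (Word 97-2003) and x-fmt/111 (plain text) are, to the best of my recollection, genuine PRONOM identifiers:

```python
# A hypothetical sketch of registry-driven preservation planning: map PRONOM
# unique identifiers (PUIDs) to a format name and a suggested action. The
# actions here are invented for illustration only.

MIGRATION_PLAN = {
    "fmt/40": ("Microsoft Word 97-2003", "migrate to an open, documented format"),
    "x-fmt/111": ("Plain text", "retain as-is; plain text is highly durable"),
}

def plan_action(puid: str) -> str:
    """Report the format and suggested action for a PUID, if we know it."""
    name, action = MIGRATION_PLAN.get(
        puid, ("unknown to this plan", "look the PUID up in PRONOM"))
    return f"{puid} ({name}): {action}"

# "fmt/12345" is a placeholder to show the fallback for unlisted formats.
for puid in ("fmt/40", "x-fmt/111", "fmt/12345"):
    print(plan_action(puid))
```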

The Digital Preservation Unit’s next head, Adrian Brown, subsequently went on to establish a digital preservation programme in the UK Parliament, and is widely recognised as an expert in preserving digital formats. Adrian has recently written a handbook on digital preservation to help others looking to ensure their records will be available for decades to come.

There is still a long way to go here, and Vint Cerf is absolutely right to highlight the issue. But much good work has already been done around the world, including here in the UK where pioneers in industry and in the public sector have shown the way.