In the mid-nineties, major initiatives in Digital Preservation started to coalesce at an international level. Right from the outset, Pat Manson, from her various positions within the EC, was a significant influence in forming, developing and promoting policy and accompanying funding schemes, to support Digital Preservation within a range of domains. This paper reflects on the lasting outcomes of this EU-funded Digital Preservation activity, but due to the wealth and breadth of the subject, it is necessarily non-exhaustive. We lean heavily on the comprehensive analysis of EC-funded Digital Preservation by Strodl et al. in 2011, which listed the main relevant EC policies/actions, together with an overview of the key projects and their main aims and objectives. With the benefit of five years’ hindsight, this analysis is revisited, and later projects are included and examined for the first time. We draw on Adrian Brown’s excellent Practical Digital Preservation (2013)1 with its comprehensive overview of useful tools and standardisation activities. The analysis is interwoven with contemporaneous input from Pat’s own papers, presentations and official EC publications, to give a sense of how she guided and shaped EC-funded Digital Preservation. A key focus is on how standardisation activities/best practices within Digital Preservation developed over this period.
2. Early Days of Digital Preservation
First of all, what is meant by Digital Preservation? “Digital preservation research tackles the problems of keeping—preserving—digital content, particularly that which is born digital and, therefore, by definition does not exist in any other format. As early as the mid 1990s the European Commission recognised that this was an emerging and important issue and started funding pioneering research projects in digital preservation” (Manson, 2010). The seminal paper from the United States by Waters and Garrett (1996, p. 43) corroborates this statement and delineates the international efforts at that time by the European Union, British university research libraries and Australian workshop members. It was the Luxembourg workshop in December 1995 (European Commission, 1996) that drew together national libraries and publishers to consider how best to tackle the legal and preservation issues connected to dealing with electronic publications. At this time, Pat was working at the EC as a project officer, and she wrote an article (Manson, 1995) on electronic libraries for VINE, a journal for which she was editor from 1981 to 1988. It is clear that her considerable technical experience in the library world stood her in good stead for leading the EC in its work on funding Digital Preservation (DP), initially from her position as project officer, and then Head of the Cultural Heritage and Technology Enhanced Learning Unit. So what precisely were these DP beginnings?
|ERPANET (2001–2004), a “Preparatory, accompanying and support measure” for organisations to get together and share knowledge/experience of DP. The last posting on the website http://www.erpanet.org/index.php is dated July 2007. The key founders here were Professor Seamus Ross, HATII (Humanities Advanced Technology and Information Institute) at the University of Glasgow, UK; Professor Mariella Guercio, the University of Urbino, Italy; Hans Hofman, the National Archives of the Netherlands; and Niklaus Bütikofer of the Swiss Federal Archives.|
|DELOS (2004–2008), a Network of Excellence comprising 57 partners focussed on digital libraries as the knowledge exchange hub of the future. The project made great advances developing the technical infrastructure components such as the Testbed [see, e.g., Strodl et al. (2006)] necessary for digital libraries, and http://delos-old.isti.cnr.it/news.html provides a glimpse of the wide range of activities undertaken|
|DigitalPreservationEurope (2006–2009), a Coordination Action to pool European DP expertise which produced major outcomes for digital repositories, including the DRAMBORA toolkit (McHugh, Ross, Innocenti, Ruusalepp, & Hofman, 2008) for auditing repositories; and the PLATTER planning tool for electronic repositories.|
These projects led to the setting up of:
- The WePreserve initiative (http://archive.is/*.wepreserve.eu and https://www.youtube.com/user/wepreserve), with its suite of digital preservation videos/animations.
Having provided the impetus for these early DP projects, what was Pat’s vision for developing DP across Europe in the years that followed? We will start by looking at the mid 2000s, after the early projects outlined above were finished or drawing to a close.
3. Digital Preservation Research: An Evolving Landscape
Pat’s “Digital Preservation challenges and actions at European level” 2006 slides (Manson, 2006) outlined the importance of our societal memory, and the key place that DP occupies within it, together with the concomitant negative impact if our digital heritage is lost. She emphasised the need to raise awareness of the DP problem; galvanise relevant stakeholders to tackle the issues at a European level; and acquire EC policy support and funding. Audio Visual (AV) archives were highlighted as being particularly at risk. IPR challenges such as dealing with embedded Digital Rights Management (DRM) systems and complex, dynamic objects were pointed out. She stressed the need for cost-effective migration strategies and new measures for building Trusted Repositories. She drew attention to previous projects DELOS and ERPANET; together with PRESTOSPACE (2004-8) (http://prestospace.org/) which supported the use of migration to move AV and film archives to digital formats.
At this time, the focus of the Information Society Technologies (IST) research programme 2005-6 was on developing systems and tools to “support the accessibility and use over time of cultural and scientific resources”, specifically targeting new access environments, and complex objects together with their attendant metadata and contexts. The accompanying IST 2005-6 DP work programme set out the agenda of using Integrated Projects (IPs) to develop experimental platforms; web archiving; and trans-discipline alliances; with a longer view towards addressing the issues of coping with large volumes of digital material, and more dynamic and interactive digital content; and mobilising researchers at the European level. Throughout, Pat delineated the EC policy required to underpin this DP effort: for example, the Council Resolution on Preserving Tomorrow’s Memory which required action at Member State and European Union (EU) level.
In 2007 the plans were laid out for the i2010 digital libraries initiative which had digitisation as the key strategy for preserving digital cultural heritage (Manson, 2007a). Integral to these plans was the EC policy framework comprising an EC Communication (2005); the EC Recommendation to Member States (2006); the European Council Conclusions (2006) and the European Parliament’s own initiative report (2007). These were the first steps in establishing Europeana (http://www.europeana.eu/portal/en) as the European Digital Library, as can be seen from this timeline https://ec.europa.eu/digital-single-market/en/news/timeline-digitisation-and-online-accessibility-cultural-heritage
The 2007 “permanent access to digital knowledge—the challenges for digital preservation” presentation (Manson, 2007b) also drew attention to the need to preserve “not only data, but context of meaning and use”, and to produce models “for digital objects capable of supporting self-preservation features” and the need to assure “integrity, authenticity and accessibility”. OAIS-based systems and tools were held up as the standard ones, and gaps in infrastructure provision were pointed out: namely accompanying registries, certification and accreditation services.
|Planets (2006–2010): an IP to build tools and services for long-term DP (Farquhar & Hockx-Yu, 2007)||http://www.planets-project.eu/|
|CASPAR (2006–2009): an IP to develop a framework to support an end-to-end digital lifecycle for a range of diverse digital material from the cultural and scientific domains (Giaretta, 2007)||http://cordis.europa.eu/project/rcn/92920_en.html|
|SHAMAN (2007–2011): an IP to develop a framework and a theory of preservation that operates across distributed repositories, particularly focussed on e-science (Barateiro & Borbinha, 2012)||https://www.sub.uni-goettingen.de/en/projects-research/project-details/projekt/shaman/|
|PROTAGE (2007–2010): a research project to explore the use of software agents to automate digital processes (Jin, Jiang, & de la Rosa, 2010)||http://www.arhiiv.ee/protage|
|KEEP (2009–2012): a research project to examine the practical use of emulation as a preservation strategy (Anderson, Delve, & Pinchbeck, 2010)||http://cordis.europa.eu/project/rcn/89496_en.html|
|LIWA (2008–2011): a research project to investigate the concept of living web archives (Denev, Mazeika, Spaniol, & Weikum, 2011)||http://www.liwa-project.eu/3|
|IMPACT (2008–2012): to develop OCR and mass digitisation in response to the i2010 challenge||http://www.impact-project.eu/|
Additionally there were a number of follow-on projects, as shown in Table 3.
|SCIDIP-ES (2011–2015): a Coordination and Support Action SCIence Data Infrastructure with a focus on Earth Science (Riddick et al., 2013)||http://www.scidip-es.eu/|
|APARSEN (2011–2015): a Network of Excellence to develop a virtual DP research centre, building upon the Alliance for Permanent Access (APA)||http://www.alliancepermanentaccess.org/|
|SCAPE (2011–2014): an IP to develop an infrastructure for scalable DP via automated workflows based on policy-based preservation planning (Becker, Faria, & Duretec, 2014)||http://scape-project.eu/|
|4C (2013–2015): a Coordination and Support Action to calculate the costs associated with digital curation||http://4cproject.eu/|
|PRESTO PRIME to keep AV contents alive (Oomen et al., 2010)||http://www.prestoprime.org/|
|DAVID: Digital AV Media Damage Prevention and Repair||http://david-preservation.eu/|
|TIMBUS (2011–2014): Timeless Business Processes and Services, to bring DP into the realm of Business Continuity Management (Dappert, Peyrard, Chou, & Delve, 2013)||http://timbusproject.net/|
|E-ARK (2014–2017): European Archival Records and Knowledge preservation—to set up the infrastructure for digital archiving (Aas, Wilson, & Delve, 2016)||http://www.eark-project.eu/|
|THOR (2015–2017), which involves the use of persistent identifiers to build lasting interoperability into research e-infrastructures in order to access open resources||https://project-thor.eu/|
Finally, in 2011 at an IMPACT conference, Pat opined: “[a]mongst the challenges are the need to improve the cost-effectiveness of digitisation, through improved technologies and tools, and to expand the competences in digitisation across Europe’s cultural institutions. At the core of this is the concept of (virtual) centres of competence which aim to exploit the results of research and to leverage national and other initiatives.”
4. Project Outcomes
With the benefit of hindsight, what may we say of the outcomes of these EC-funded projects? What is the DP legacy that Pat has left behind? Before turning to the direct outcomes from this work, it is, perhaps, worth drawing attention to some of the fruits of Pat’s labours that happened outside EC-funded activity. In 2001, following Pat’s exhortations to work together across Europe and beyond, the Digital Preservation Coalition (DPC) was formed, and has recently become an international membership organisation, taking DP to industry and commerce as well as to the GLAM4 sector. DPC founding member Neil Beagrie states (N. Beagrie, personal correspondence with J. Delve and D. Anderson, July 25, 2016):
“By its very nature digital preservation is an international issue that spreads beyond national borders and the capacity of any single sector or institution. It is a long-term strategic requirement underpinning our digital economy and society. Successive European programmes have had a major sustained impact in fostering the necessary collaborations at scale, catalysing new solutions, and spreading best practices across Europe.”
DPC Executive Director Dr William Kilbride comments (W. Kilbride, personal correspondence with J. Delve and D. Anderson, August 1, 2016):
“The constantly evolving topic of digital preservation has two distinguishing features which the EU has provided significant leadership.
Firstly it is a global challenge which requires a global response. National agencies have an important contribution to make but to be impactful, leadership needs to be given at the right level. The EU’s trans-national support to research in digital preservation has been critical to the growth of the sector and has been a very practical realisation of the principle of subsidiarity on which the EU is founded.
Secondly it is a cross-sectoral and cross-disciplinary challenge. It routinely involves libraries, archives, computing scientists, economists, research engineers, system architects and end users. No one sector can solve the problem on its own, therefore a common understanding becomes vital. So standardisation activities, viewed as a codified encapsulation of good practice, are vital for any meaningful long-term progress that can engage all the relevant parties.”
Alongside the DPC (http://www.dpconline.org/), but responding very much to the climate Pat created, there is the JISC-sponsored Digital Curation Centre (DCC, http://www.dcc.ac.uk/) in the UK, the NCDD in the Netherlands (http://www.ncdd.nl/) and nestor (http://www.dnb.de/EN/Wir/Kooperation/nestor/nestor_node.html) in Germany.
Returning to the projects themselves—what are the direct outcomes from them? “Project midwife” Neil Sandford reflects (N. Sandford, personal correspondence with J. Delve and D. Anderson, June 14, 2016):
“My introduction to the world of digital preservation came in the form of a question: “have you ever thought about what a computer will be like in a hundred years’ time?” which led me to reflect on the changes over the third of a century since I first worked with a computer and preservation consisted of a pattern of holes on a paper tape. In the 1990s, many of the challenges being promoted at the time by DG13 (as was) in Luxembourg were concerned with formats (both of file objects and the data contained in them). Other standards-related issues emerged such as interoperability, scalability, the cloud, alliance between public and private sector stakeholders and the development of communities of practice. Through successive framework programmes, some ground-breaking projects were conceived. As their midwife I was proud to be able to contribute to the birth of Planets, Impact and SCAPE as well as projects building on those foundations in the Research Infrastructures and ICT PSP domains such as E-ARK.”
We now turn to the “virtual centres of excellence” to which Pat referred in 2011.
5. Virtual Centres of Excellence/DP Foundations
Joachim Jung, Executive Director of the Open Preservation Foundation (OPF) explains (J. Jung, personal correspondence with J. Delve and D. Anderson, August 2, 2016):
“PLANETS helped shape the terminology that allows the community to talk about digital preservation in an accessible way. Considering the context of digital preservation, the project partners soon recognised the importance of sustaining their results. The PLANETS project was one of the first to create a follow-on organisation to safeguard the results and established the Open PLANETS Foundation in 2010. The organisation was re-named as the Open Preservation Foundation in 2014 to reflect that it now sustains not only the PLANETS results, but provides a safe home for other digital preservation project outputs such as the SCAPE project results, and provides stewardship for key digital preservation tools including JHOVE”.
JHOVE5 (JSTOR/Harvard Object Validation Environment) is a file format identification, validation and characterisation tool. It is implemented as a Java application and is usable on any Unix, Windows, or OS X platform with an appropriate Java installation.
The IMPACT membership organisation, which was established with the intention of making text digitisation better, faster and cheaper by providing online expertise and access to tools for all parts of the digitisation workflow, set up a Centre of Competence.6
Similarly, the PrestoCentre7 was the culmination of the PRESTOSPACE and PRESTO PRIME projects (and later the DAVID8 project), and is the international Virtual Centre of Excellence for preserving AV material. PrestoCentre is a non-profit organisation led by five of the major audiovisual archive organisations in Europe9 to facilitate the digitisation and preservation of audiovisual collections across Europe.
APARSEN (Alliance for Permanent Access to the Records of Science in Europe Network)10, which followed on from CASPAR, and the APARSEN Virtual Centre of Excellence in DP, brought about certification for Trusted Digital Repositories, based on the ISO 16363 standard that they developed11, which was, in turn, based on NARA12’s prior TRAC work. APARSEN also offers peer review of digital repositories, based on the Data Seal of Approval, DIN 31644 and ISO 16363. They also host a link to the SCIDIP-ES tools. Dr David Giaretta, Co-ordinator of CASPAR, APARSEN, SCIDIP-ES and Director of the APA, comments (D. Giaretta, personal correspondence, 2016):
“The vision and support which Pat Manson provided for the digital preservation community was vital for the development of many new ideas. Although the constraints of EC funding did force projects to overpromise and the proposed solutions to fragment, I believe that many of these ideas will be taken up in the e-Infrastructure and Research Data projects funded by the EU and in individual nations around the globe. Pat’s legacy will be the benefits to the global information society which rely on the seeds which she helped to plant.”
The International Internet Preservation Consortium (http://www.netpreserve.org/about-us) was formed in 2003 and is a key hub for preserving websites.
These virtual centres of excellence, and other similar organisations, are recognised worldwide for their expertise and resources. But what of other outcomes? There follows a non-exhaustive look at some of the main results.
6. Standardisation Initiatives
As well as the OPF, the PLANETS project brought about several major standardisation initiatives. Their Project Manager, Clive Billenness says (C. Billenness, personal correspondence with J. Delve and D. Anderson, July 27, 2016):
“Throughout the first decade of the 21st Century, Europe witnessed an exponential growth in digital data, with only limited consideration about how such data, once created, could be preserved for the long-term without a need for continuous expenditure to sustain the original investment. Decisions about the formats to be used for digital data and what metadata would be required were often taken in an uninformed and inconsistent way, even within different parts of a single organisation.
We are grateful to Pat for her commitment in this area of her Directorate’s work. This has enabled the EC to support the creation of standardisation activities for the representation of data and so to facilitate its cost-effective, long-term preservation. It has also encouraged organisations entrusted with developing both Europe’s digital heritage and its digital economy to collaborate in developing common, shared approaches.
We are now at the time that legacy information systems which were “preservation-aware” are starting to be replaced, and so we are beginning to witness the practical and economic advantages of adherence to standardisation activities which are being achieved by organisations across the European Union and beyond.”
Specifically, PLANETS created/further developed the outcomes listed in Table 4.
|The SIARD (Software Independent Archiving of Relational Databases) specification and suite of tools. These were created by the Swiss Federal Archives, who have since collaborated with the E-ARK project to launch the second version of the specification, SIARD 2.0 (https://www.bar.admin.ch/bar/en/home/archiving/tools/siard-suite.html and http://www.eark-project.com/resources/specificationdocs/32-specification-for-siard-format-v20). SIARD is the de facto database archiving standard used worldwide, and is the basis for database archiving specifications in E-ARK.|
|The PLATO DP planning tool, for organisations to draw up detailed and practical DP plans|
|The PLANETS testbed for investigating the suitability of new DP software|
|The PRONOM file format registry|
|File format characterisation tools (FIDO, DROID)|
|ExCEL, preservation fidelity factor software|
|The PLANETS interoperability framework|
Additionally, some of those involved in working in PLANETS went on to make key contributions to metadata standards, for example the British Library’s Dr Angela Dappert, through her membership of the Executive Committee of the international PREMIS preservation metadata standard. She is currently the British Library co-ordinator for the EC THOR13 project on persistent identifiers.
The APARSEN standards register http://fenugreek.fernuni-hagen.de:8080/StandardsWeb/home/standardsRegister.xhtml provides access to a plethora of DP standards, and similarly the DAVID project’s http://david-preservation.eu/publications/public-deliverables/ points to AV preservation standards.
Many projects have built upon the OAIS archiving standard: Planets, CASPAR, APARSEN, KEEP and E-ARK, to name but a few. The KEEP project was the first to put emulation forward as a realistic alternative/addition to migration, and to insist on the need to capture technical environment information. The project produced an emulation framework which was used by the Computer Games Museum in Berlin to demonstrate legacy computer games. The project also produced the Trustworthy Online Technical Environment Metadata (TOTEM) registry (http://www.keep-totem.co.uk/), which links the software and hardware versions needed to describe computing platforms. The TOTEM data model was the basis for the environment data model aspect of the German bwFLA project, which heralded Emulation as a Service (EaaS) (http://bw-fla.uni-freiburg.de/). All of these emulation initiatives were finalists/winners of the DPC’s biennial preservation awards over the last few years. The TOTEM work is now being developed as part of the UNESCO PERSIST project (http://www.ica.org/en/networking/about-unescopersist), and it also led directly into the environment work of the PREMIS 3.0 data model.
The last project to which we will draw attention is the current pilot B project, E-ARK. Mariella Guercio (2012, pp. 9–10) pointed out that archives had seen relatively little support in the DP funding arena compared with libraries. The Policy Support Programme (PSP) pilot B funding call, intended in part to address that imbalance, resulted in the E-ARK project, which has now produced a common specification for OAIS Information Packages for use in digital archiving. This common specification builds on the MoReq specification for records management, and is playing a crucial role in enabling those in e-government to submit their data to the archives. There has also been collaboration between E-ARK, Joint Working Group 7 of ISO and other standards committees to develop the EPUB 3 standard. For the first time, specifications have been developed for the OAIS Submission, Archival and Dissemination Information Packages. In addition to developing open source tools and infrastructure for scalable digital archiving, based on Big Data tools and techniques, E-ARK has produced a Knowledge Base hosted by the DLM Forum, the organisation set up in 1994 by the EC for national archives and associated institutions to exchange best practice. A lightweight internet-based digital archiving tool, earkweb, has the potential to cater for the digital archiving needs of regional and local archives, plus those from business, academia and many other types of digital archives.
Guercio, who has been in DP from the very beginning, provides our final valediction (M. Guercio, personal correspondence with J. Delve and D. Anderson, July 26, 2016):
“Pat has played a very important role and has strongly contributed to the capacity of the European institutions to play a crucial role in the field of digital preservation. I had the opportunity to work with her for many years (thanks to projects such as Delos, Erpanet, Caspar and APARSEN) and I have appreciated her capacity in supporting a strategic view in this area.”
The vast majority of the projects and tools listed above are included in Adrian Brown’s Practical Digital Preservation (Brown, 2013), thus attesting to their continued importance and utility (Adrian himself was a founding member of PLANETS and previously Head of DP at the UK National Archives). As previously stated, this can only be a partial and incomplete survey of the development of DP across the last twenty years.
To conclude, the considerable long-lasting expertise, experience and collaboration in DP across many disciplines and domains has been stimulated, supported and led by Pat Manson in her various EC roles. Under her guidance, the digital preservation community in Europe has grown from a few individuals articulating a set of well-founded concerns into a vibrant international community of practice, providing industrial-strength solutions. Digital preservation activity in Europe is world leading, and internationally engaged. Of course, a great deal remains to be done, and the fast pace of technological change will continue to provide those tasked with the care and protection of our shared cultural heritage with challenges long into the future. The contribution of Pat Manson has been seminal, and much of the credit for the current healthy state of digital preservation worldwide is hers. In the words of a famous Frenchman, “Tout cela est bien quelque chose!”14 (All that is really something!).