The LIBER MARC Harmonization Task Force has its origins in an initiative of the past President of LIBER, Professor Elmar Mittler from the Göttingen State and University Library. Professor Mittler asked Dr Elisabeth Niggemann from Die Deutsche Bibliothek to take part in the meeting of the MARC Harmonization Coordinating Committee in Ottawa in May 2001. Following that meeting the LIBER MARC Harmonization Task Force was founded at the LIBER Annual Conference in July 2001 in London.
The LIBER MARC Harmonization Task Force held its first meeting on 14 January 2002 at Die Deutsche Bibliothek Frankfurt am Main, with the aim of gaining an overview of format activities in Europe. The group's aim was to concentrate on European developments and to build up stronger cooperation in the library world in order to strengthen Europe's international influence.
The LIBER MARC Harmonization Task Force held a second meeting at the IFLA 2002 Conference in Glasgow and discussed the first draft of its report and recommendations to LIBER. After final discussion within the group, the report was further revised and submitted to LIBER.
The aim of the report is to give an overview of format activities in European countries and to make recommendations to LIBER regarding the use and development of data formats in Europe. The annex includes reports on migration activities from different countries. The report is based on information about data formats gathered through a questionnaire distributed to the Conference of European National Librarians (CENL).
Cataloguing issues were further discussed at the 1st IFLA Meeting of Experts on an International Cataloguing Code held in Frankfurt in July 2003. Further meetings will be held at the IFLA conferences in Buenos Aires (2004) and Seoul (2006).
The most common data formats in use are UNIMARC and MARC21, with local MARC adaptations (e.g. danMARC2, MARC21-Fin, NORMARC) achieving the highest score (10 responses). There are also some other local formats such as MAB2 (as a kind of “foreign language”) and OCLC PICA (as a kind of “hybrid language”), but their level of use is, not surprisingly, much lower than that of MARC.
The most frequently used formats are UNIMARC and MARC21 (10 responses each; 31%), followed by “local” formats (9 responses; 28%) and “other” formats (3 responses; 9%).
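The percentages can be checked with a little arithmetic. The sketch below assumes that each share is calculated over the 32 responses obtained by adding the four counts; that denominator is not stated in the report and is an assumption here.

```python
# Response counts as reported in the text.
responses = {"UNIMARC": 10, "MARC21": 10, "local": 9, "other": 3}

# Assumption: percentages are shares of the summed counts,
# not shares of the number of responding institutions.
total = sum(responses.values())

# Share of each format, rounded to whole percentage points.
shares = {name: round(100 * count / total) for name, count in responses.items()}

for name, share in shares.items():
    print(f"{name}: {responses[name]} of {total} ({share}%)")
```

Under this assumption the rounded shares agree with the figures quoted above.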
The most frequently used cataloguing rules are national ones (14 responses), followed by local adaptations of AACR2 (8) and AACR2 (6). Only three institutions use modifications or short versions of national cataloguing codes.
The reasons for a decision not to migrate are also interesting. Some institutions have already migrated, are content with the existing format, or use a format which is very close to MARC. Others are constrained by the information system and format of their national library landscape, or by the systems supported by library system providers. Another important consideration is the requirement to handle additional new formats, including XML-based ones, Dublin Core and ONIX.
The advantages of establishing a MARC21 European Interest Group are seen as manifold. The most important aspect seems to be the possibility of forging stronger links between European library culture and the widespread MARC format, thereby bringing the two closer together. The representation of European viewpoints and interests concerning format applications and cataloguing questions could thus be better guaranteed, and building European consortia on these topics would become more practicable. Further benefits may include the exchange of experience, a greater understanding of the value of standardization to the European library community, and the opportunity to participate in the development of MARC21 formats.
The Group could be a forum where the harmonization of MARC21 and other related systems can be reviewed and information coordinated so that cooperation with Anglo-American institutions is easier and more efficient. Being able to cooperate and to build a unified community would result in greater influence on MARBI (Machine Readable Bibliographic Information Committee), and would avoid duplication of effort. The group should be a group of experts with the aim of promoting the format and keeping it up to date, and with responsibility for information and knowledge transfer in relation to format questions. It could provide for a better harmonization and interoperability between formats developed for international purposes.
One of the major goals of the LIBER MARC Harmonization Task Force is the discussion of the data format activities of libraries at the European level. Therefore it is desirable to cooperate with the Conference of European National Librarians (CENL) in order to get an overview of the format activities in all European countries. Dr Elisabeth Niggemann has kept the CENL and the CoBRA+ Forum informed about the activities of the LIBER MARC Harmonization Task Force, and she will continue to provide further reports in the future.
The group recommends close cooperation with the international library world in the field of bibliographic data formats. In particular, it recommends intensive cooperation with IFLA (the IFLA UNIMARC Programme (UP), the Permanent UNIMARC Committee (PUC) and the ICABS Programme), the MARC Harmonization Coordinating Committee (MHCC) and the Machine Readable Bibliographic Information Committee (MARBI), with the intention of exchanging information and bringing European ideas into the international discussion on data formats.
Because of the importance of data formats in cataloguing matters, cooperation with cataloguing standardization bodies is imperative, as is cooperation with the IFLA Cataloguing Section and Division of Bibliographic Control, and the Joint Steering Committee for Revision of Anglo-American Cataloging Rules (JSC).
It can be seen from the results of the CENL questionnaire presented above that there are two principal standards of data format used by European countries: UNIMARC as an IFLA standard and MARC21.
Country reports, with overviews of format activities and format conversions (see annex of this paper), should be placed on the LIBER website, following a standard template, as practical examples, and the results of the CENL questionnaire should be made available on the website.
At its Groningen meeting in January 2004, the LIBER Executive Board fully agreed the previous recommendations. The LIBER MARC21 Interest Group will be established and assigned to the Access Division of LIBER. The Access Division serves to foster and promote access to information resources for the benefit of the patrons of university and research libraries, and to stimulate the development of modern information services.
The following paragraphs provide a short overview of the format activities in the member countries of the LIBER MARC Harmonization Task Force, as presented at the meeting at IFLA 2002. Although the situation is different in each country, the data format is highly important for all. All countries use (or intend to use) the international data formats UNIMARC and MARC21 as the national data format and exchange format. In the countries that have made a transition to a new data format, the demand for training and documentation on that format has become more pronounced.
The adoption of UNIMARC as the national and exchange format was due to the fact that Croatia has always had a very strong tradition in cataloguing theory and practice based on IFLA standards, as evidenced by the fact that Eva Verona wrote the Croatian national cataloguing rules.
The National Library does not have any problem with the use of UNIMARC formats, as it has been actively involved in their development since 1991. It has also been involved in different projects, has developed an ISSN-UNIMARC-ISSN conversion, and has taken part in the CERL/RLG conversion project UNIMARC-MARC21-UNIMARC.
Die Deutsche Bibliothek, as well as some regional library systems, uses the Pica format, created by the Dutch organization of the same name, now part of OCLC Pica. The Pica format is derived from MARC and consists of two types. Pica3 is the cataloguing format (with tags of 3 or 4 digits and ISBD punctuation); Pica+ is the internal and storage format (with tags of 3 digits, 2 indicators, and a subfield structure). The Pica format is also used as a reference format (“turntable”) for format conversions.
Other institutions have systems with their own internal formats, covering a wide range from simple to very complicated data structures. Formats originally designed as exchange formats are also used as internal formats, in a more or less abridged and adapted form.
The German exchange format (which is also used in Austria) is MAB (Maschinelles Austauschformat für Bibliotheken - Machine-readable Exchange Format for Libraries). It can be used offline (for delivery on tape and disk or by FTP transfer) and online (e.g. via Z39.50 gateways). The Expertengruppe MAB-Ausschuss (= Expert Group MAB Committee), under the auspices of Die Deutsche Bibliothek, is responsible for maintenance and development.
The MARC-based formats UNIMARC, USMARC, UKMARC and MARC21 are used for data import and export of German data as well as foreign data from other national libraries and data suppliers.
DC (Dublin Core) is used for the import of metadata of online theses of German university libraries and will be used in the future for the export of data, using the protocol for metadata harvesting of the Open Archives Initiative.
ONIX (Online Information eXchange), the standard format used by publishers and vendors to distribute electronic information about books, is used for data exchange between Neuerscheinungsdienst (Die Deutsche Bibliothek) and German Books in Print (MVB GmbH).
Recently the issue of a transition from German to international cataloguing rules and formats was put on the agenda again. In December 2001 the German Committee for Library Standards opted for migration. The expected benefits are improved international decentralized bibliographic data systems, a simplified international data exchange, the dissemination of German information and data and the selection and acquisition of new software for bibliographic data systems. The conditions, consequences and time schedule for the transition are to be examined in an 18-month research project funded by the German Research Foundation before migration is introduced. The project, in which Die Deutsche Bibliothek is the project leader, was launched in autumn 2002.
The National Library, being a National Bibliographic Agency, has played a very important role in the application, development and diffusion of the format in Portugal. It has constructed and disseminated working tools (such as manuals and guidelines), which have helped Portuguese libraries understand and apply UNIMARC in Portuguese cataloguing procedures; it has translated the UNIMARC Manual (bibliographic and authority); it has organized workshops and UNIMARC training, which are still held on a regular basis; and it continues to follow and collaborate on the development of the format.
The National Library has been involved in international projects including USEMARCON (user controlled generic MARC converter) and it has cooperated in international databases including CERL/RLG.
UNIMARC is very important in Portugal, and even libraries that are acquiring new systems request the UNIMARC format. UNIMARC is therefore the national format.
Several other Higher Education libraries are planning to migrate from UKMARC to MARC21 within the next 1-3 years. A number of public library services, such as Inverclyde Council, are also planning to move to MARC21 from non-MARC metadata formats. This follows advice from SLIC for improving interoperability in the People's Network project for widening access to digital information.
SLIC is also supporting the use of MARC21 for cataloguing digital reproductions through the use of OCLC's CORC service. Several libraries are participating in a pilot project to use CORC for collaborative cataloguing of online resources, including digital reproductions of local history materials created during the NOF-digitise programme. SLIC, CDLR, NLS and Strathclyde University are collaborating on a pilot project to create MARC21 authority records for Scottish names and place names via the NACO programme.
The Co-operative Information Retrieval Network for Scotland (CAIRNS) uses Z39.50 to create a cross-searching service for many of the university and research libraries in Scotland. UKMARC and MARC21 are supported by CAIRNS, which maps data from both to a common display format. The first public library, East Ayrshire, was added to the service in June 2002; this is a non-MARC site, but its Z39.50 server can output data in pseudo-UKMARC format.
There is a growing demand for training and documentation on MARC21 for non-MARC cataloguers, and on migration from UKMARC to MARC21. The Cataloguing and Indexing Group in Scotland (CIGS) has held a series of short seminars on these topics during the past two years, and is planning more. It is also negotiating with commercial suppliers to arrange training sessions in Scotland; potential customers and suppliers have been finding it difficult to synchronize sufficient numbers of trainees to make such sessions worthwhile, although one or two have taken place. CIGS has also made documentation such as standards, local migration notes, and seminar presentations available on the Web.
Not surprisingly, a number of conversion errors have been identified, and they are being dealt with according to a priority list. Some, but not all, can be handled automatically. A concise format manual and a complete format manual for cataloguers were finished in August 2003. LC's format comments should be more comprehensive. One example: it is, in principle, possible to use designators for non-sorting beginning and end, but LC's comments are silent about it. On the whole, MARC21 is still heavily influenced by the catalogue card format, which consequently influences system solutions.
The much-debated issue of ISBD punctuation must also be mentioned. There are cataloguing systems that generate the proper punctuation based on subfield codes, but this is not feasible for all subfields in 245 or 250. In particular, $b is used for different things, which require different punctuation.
To sum up, the previous format took pretty good care of the need to structure the information for search purposes (not all search purposes, though!), and linking records was easy. As a consequence, identification of the physical objects sometimes suffered. The MARC21 format, however, takes better care of the identification of the physical objects, but the linking is “manual” and has to be achieved in a cumbersome manner. Some duplication of information is also needed to cover searching requirements, which should ideally be achieved in a more elegant manner.
UNIMARC is an IFLA standard, which means that it supports all IFLA UBC standards, guidelines, lists etc., as well as relevant ISO standards. It was designed in the mid-1970s based on the experiences gained in designing MARC I, LCMARC, BNB MARC etc., INTERMARC, and the ISO 2709 standard for exchange of bibliographic data. This is important to mention because in designing UNIMARC the experts were free not to follow precedents imposed on the design by national practices, uses and considerations, but to design the format according to the concepts that had their foundations in the new information technology and information theory. Also, the format was designed following the agreed international principles and standard bibliographic description into which national cataloguing rules and practices should fit, while not imposing one over the other. National cultural “flavour” is thus preserved!
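The difficulty with $b can be illustrated with a minimal sketch. It assumes a hypothetical rule table that assigns one fixed ISBD separator to each 245 subfield code; the names `PUNCTUATION_BEFORE` and `render_245` are invented for this illustration and do not describe any particular cataloguing system.

```python
# Hypothetical rule table: the separator placed before a 245 subfield,
# chosen from the subfield code alone.
PUNCTUATION_BEFORE = {
    "b": " : ",  # assumes $b always holds other title information
    "c": " / ",  # statement of responsibility
}

def render_245(subfields):
    """Join (code, value) pairs into a display string using the rule table."""
    parts = []
    for position, (code, value) in enumerate(subfields):
        if position > 0:
            # Codes without a rule fall back to a plain space.
            parts.append(PUNCTUATION_BEFORE.get(code, " "))
        parts.append(value)
    return "".join(parts)

# $b holds other title information: the fixed rule gives the right result.
subtitle_case = render_245(
    [("a", "Cataloguing"), ("b", "an introduction"), ("c", "A. Author")]
)

# $b holds a parallel title: ISBD calls for " = ", but the rule emits " : ".
parallel_case = render_245([("a", "Katalogisierung"), ("b", "Cataloguing")])
```

Because the code $b alone does not say whether its content is other title information or a parallel title, a rule keyed only on the subfield code cannot choose between " : " and " = "; the information has to come from the cataloguer or from the data itself.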
The central mission of the Permanent UNIMARC Committee (PUC) in developing and maintaining the UNIMARC suite of formats is to adopt and support the development of particular IFLA, ISO and Internet standards and also to lead research and development in related fields. The formats are: UNIMARC Manual: Bibliographic Format (2nd ed., update 4, 2004); UNIMARC Manual: Authorities Format (2nd ed. 2001); UNIMARC for Classification Format (Worldwide reviewed; to be posted on IFLANET by August 2002); and UNIMARC for Holdings Data: Draft (to be posted on IFLANET for worldwide review in autumn 2002).
The Permanent UNIMARC Committee has based its work on cooperation not only with IFLA's divisions and sections but also with other IFLA experts and non-IFLA organizations, such as ICA, ISSN, ISBN and the publishing industry, as well as with CERL, EROMM and INTERPARTY.
· 1993 - opening talks with LC
· 1995 - consultation in UK
· 1997 - progressive convergence strategy
· 1999 - UK consultative body - big bang
· 2000 - consultation exercise
· 2001 - decision to move to MARC21
· costs of maintaining separate national standards - at local and national level
· overcoming obstacles to collaboration - availability of derivable catalogue records
· strong support and development of MARC21-based systems
· shift to MARC21 by major UK libraries
Since the British Library is the body responsible for developing and maintaining the UKMARC format and for data services based on that format, it has been essential that the UK national consultative body, the Book Industry Communication Bibliographic Standards Technical Working Group, and the UKMARC user community should be in agreement with any decisions taken by the Library in relation to harmonization with, or transition to, MARC21.
A further alternative, progressive convergence, was worked out in 1997 and a joint MARC Harmonization Coordinating Committee was formed. Each of the national libraries (BL, LC and NLC) agreed to keep future developments in step, in order to prevent any further divergence. As a result, the British Library began progressively adopting USMARC fields, starting with unique fields that would have little impact on users and working towards the more complex changes that would require systems or database changes by users. The intention was to consult the community at each stage and stop when users felt the process had gone as far as they were prepared to go. In the meantime Canada and the US harmonized their formats to create MARC21.
In 1999, as part of the ongoing dialogue with UKMARC users, the British Library consulted the Book Industry Communication (BIC) Bibliographic Standards Technical Working Group, the UK consultative body, on the next phase of harmonization. The Working Group advised that users might prefer the big-bang approach, given the changed circumstances since the last major consultation, especially the increasing implementation of MARC21-based systems and collaboration with North American bibliographic utilities.
Whilst the result of the major consultation exercise in Autumn 2000 showed that there was still a large minority (30%) that wished to retain the unique features of UKMARC, there was a clear mandate (57%) for adopting MARC21, with only 7% who wanted to keep UKMARC largely unchanged. With this level of support for MARC21, the British Library decided to move to MARC21 and, significantly, without any preconditions.
During the consultation exercise, a number of concerns were expressed about how the British Library would implement MARC21, and what practical consequences this might have for UK users of MARC. In order to answer some of the questions raised, and to describe the proposed transition process, the British Library has issued a “White Paper” entitled The MARC21 Format and the UK Library Community - Proposals by the British Library. The following sections give an indication of the issues covered by the proposals.
· roles and responsibilities for format development and revision;
· roles and responsibilities for publication and related activities.
Each library will be responsible for consultation within its national structure and for drafting, reviewing and editing any proposals that result from this consultation. In addition, the Library of Congress consults worldwide for input from MARC21 users.
The British Library will consult with its formal national consultative committee to determine which changes to the format should be formally proposed. It is intended that the BIC (Book Industry Communication) Bibliographic Standards Technical Subgroup, which currently oversees UKMARC development, will adopt this role. The membership of the BIC group will be reviewed, in order to ensure adequate UK stakeholder representation. Any UK user or group of users will be able to propose changes to the format through BIC. New proposals will be passed from BIC to the Library of Congress and then disseminated throughout the MARC21 community worldwide via the MARC forum discussion list. These proposals will be discussed and recommendations made at the biannual open US MARC Advisory Committee meetings, held in the context of the MARBI meetings and at the Canadian Committee on MARC (CCM) meeting. The cycle for consideration of format revision proposals will be twice a year, in January/February and June/July.
· an open forum for discussing and developing proposals and other input to MARBI through the medium of the BIC Bibliographic Standards Technical Subgroup,
· an open forum for reporting and discussing news and developments from the Library of Congress, and progress on the implementation of proposals.
· USEMARCON - This is a conversion application for UKMARC to MARC21, and vice versa, which runs on a PC under Windows or Linux and can be downloaded free of charge from the USEMARCON web page.
· UKMARC to MARC21 Conversion Tables - In order to support the consistent and accurate conversion of UKMARC data to MARC21, the British Library has prepared mapping tables from UKMARC to MARC21 which are available free of charge. Requests for the tables should be made by e-mail to firstname.lastname@example.org.
· Consultancy - In addition to the above tools, the British Library is also able to offer priced consultancy services to systems suppliers in the form of training in the use of USEMARCON and the creation of the required rules.
The project will be undertaken in three phases.
Phase 1 = Data cleaning and enhancement
Phase 2 = Conversion to MARC21
Phase 3 = Conversion to a common character set (probably Unicode, but dependent on the ILS vendor chosen)
Both catalogues use epixtech's Dynix library management system. SLIC and NULIS have collaborated on catalogue development for several years, and agreed to work together on migrating to MARC21. Because the SLIC database is smaller, and covers a narrower range of resources, it was decided, after discussion with epixtech, to migrate SLIC first, with NULIS following immediately. The migration is not yet completed for either organization, and the schedule has been extended by two months, until the end of July 2003.
SLIC did not have much to do for its catalogue, because substantial checks on data integrity had been carried out in 2001 when the SPL records were added.
The NULIS catalogue, being much larger, more diverse, and with older records dating from 1987, contained many known problem areas. These included minor data corruption caused by previous migration projects, the presence of several thousand non-MARC records imported during a catalogue merger in 1996, and specific tags and subfields not used by NULIS but attached to MARC records downloaded from utilities and other libraries.
Counts of records affected by each type of problem were carried out using the system's reporting functions. These were used to decide whether to correct the problems prior to migration, use the migration process itself to correct them, or leave them unresolved. Where counts were low and machine and staff resources were available for manual intervention, work programmes were implemented to resolve problems before migration. Some anomalies can be corrected during migration, for example by deleting unwanted tag or subfield contents.
The supplier also carried out pre-migration checks. In particular, a list of all tags and subfields used in each catalogue was generated, giving numbers of instances and mappings to indexes and authority files. SLIC and NULIS compared the lists against the UKMARC and MARC21 structures to check for erroneous codings; any problems identified were included in the local preparation activities.
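A tag and subfield inventory of this kind can be sketched in a few lines. The example assumes a simplified in-memory record structure (a list of `(tag, subfields)` pairs) and a hypothetical set of permitted tag and subfield combinations; neither reflects the supplier's actual tooling or the full UKMARC or MARC21 definitions.

```python
from collections import Counter

def inventory(records):
    """Count how often each (tag, subfield code) pair occurs across all records."""
    counts = Counter()
    for record in records:
        for tag, subfields in record:
            for code, _value in subfields:
                counts[(tag, code)] += 1
    return counts

def unknown_codings(counts, allowed):
    """Return the (tag, code) pairs found in the data but absent from the format."""
    return sorted(pair for pair in counts if pair not in allowed)

# Two illustrative records; the content is invented for this sketch.
records = [
    [("245", [("a", "First title"), ("c", "A. Author")]),
     ("260", [("a", "London"), ("b", "Publisher")])],
    [("245", [("a", "Second title")]),
     ("999", [("z", "local note")])],
]

# Hypothetical extract of the combinations a format permits.
allowed = {("245", "a"), ("245", "c"), ("260", "a"), ("260", "b")}

counts = inventory(records)
suspect = unknown_codings(counts, allowed)
```

The counts show how widespread each coding is, and the list of unknown pairs gives the candidates for correction during local preparation.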
The lists were used by the supplier to ensure that every tag and subfield was represented in a set of test records. NULIS also created a false UKMARC record containing as many tags and subfields, with appropriate content, as possible, so that mappings from UKMARC to MARC21 could be more easily compared.
The British Library did not release the final version of its mapping tables until 1 July 2002, but a draft version became available shortly after the pre-migration activities had begun. The supplier has not conducted a UKMARC to MARC21 migration before, and has therefore had to make several attempts to develop a completely correct mapping of tags and subfields and add the punctuation required by MARC21. This activity is nearly complete for the SLIC catalogue, but will take a little longer for NULIS, which has four times as many tags to map. SLIC is now checking the mapping of authority records, which is not expected to take more than a couple of days.
The migration will be carried out and made available to the library for checking. If problems are found, the mappings will be corrected and the full migration repeated. When the library is satisfied, it will carry out appropriate amendments to the display mappings for staff and public catalogue views and test them. When the parallel MARC21 catalogue is ready for use, it will be switched into the live system, and the UKMARC catalogue removed. It is expected that this stage of the process will take at least a week, during which all cataloguing activity will remain suspended. For this reason, university libraries intending to migrate usually arrange schedules so that this disruption occurs during vacation periods. This may result in bottlenecks for the system suppliers.
Documentation, record templates, and MARC import and export profiles will require appropriate amendment.
External sources of bibliographic records will be reviewed to identify low-cost, quality providers; for example, Library of Congress MARC21 records do not carry any copyright charges.
LIBER Quarterly, Volume 14 (2004), No. 1