The Cedars Project
CURL exemplars in digital archives

Cedars Metadata Brainstorming:
A report from a meeting held in the Royal Historical Society Meeting Room, Main Library, University College London, 24 May 1999.
Cedars Project Document AIS03
Status: Draft
Last updated: 18-Jun-1999
Created: 01-Jun-1999
Availability: Public
Present
Kelly Russell (Chair, Cedars), Lou Burnard (Oxford), Michael Day (UKOLN), Michael Popham (Oxford), Derek Sergeant (Cedars, Leeds), Andy Stone (Cedars, Oxford), Ellis Weinberger (Cedars, Cambridge).
Background documentation
Andy Stone [AS] had circulated some documents before the meeting:
- AIW02 - Cedars Preservation Metadata Elements by Andy Stone and Michael Day (25 February 1999) - still the 'baseline' metadata elements document.
- AIW03 - Rethinking PDI for the Cedars Project by Michael Day (20 May 1999).
- AIW05 - Content Information and Representation Information by Andy Stone (21 March 1999).
- AIW06 - Introduction to Leeds Perspectives on CI and RI by Derek Sergeant and Andy Stone (19 May 1999).
- AIW07 - Rights metadata (19 May 1999).
Introduction
Kelly Russell opened the meeting with a reminder that its purpose was to contribute to the production of a Cedars metadata specification, and that a desirable outcome would be a list of action points.
An important topic of discussion would be the potential overlap between the OAIS concepts of Preservation Description Information (PDI) and Representation Information (RI), and the relative roles within Cedars of the Access Issues Working Group (AIWG) and the Digital Preservation Strategies Working Group (DPSWG).
An introduction to XML
Lou Burnard [LB] got the meeting started with a presentation introducing the Extensible Markup Language (XML). Descriptions of XML exist elsewhere (e.g. Bosak and Bray 1999) and will not be repeated here; it suffices to say that XML is:
- An 'activity' of the World Wide Web Consortium (W3C), intended to be the enabling layer that sits below everything that happens on the Web.
- A simple, flexible text format based on SGML (ISO 8879).
LB concluded that XML could be used within Cedars in three ways:
- To think with - XML refocuses attention on the data needing to be processed and stored.
- To express Cedars recommendations - the project will need to express its recommendations in some formalism, and XML is a logical choice.
- To use as an in-house tool within the project - to help produce documentation, etc.
Discussion mainly centred on two issues:
- The relational database vs. XML
- Available tools
Preservation Description Information
Michael Day [MD] introduced Cedars project document AIW03. He explained that the document was produced to review the elements identified as PDI in AIW02 and re-state them in a more consistent manner. Four main issues were discussed:
- MD raised the potentially ambiguous relationship between Descriptive Information (DI) and PDI in the OAIS model. In OAIS, PDI is supposed to exist in close proximity to the Content Information (CI), while DI provides information about the Information Package itself (see: Reich and Sawyer 1999, p. 18, Figure 2-3). However, many DI elements would be identical to PDI elements, the only difference being their function (preserving context, etc. as against resource discovery). There would be no point in repeating identical metadata elements in different parts of any Cedars metadata schema. To avoid duplication, therefore, all bibliographic-type information should be recorded as part of the PDI, with DI metadata extracted from it (if necessary) to create finding aids.
- MD was of the opinion that Cedars need not spend much time specifying Descriptive Information in detail. Other initiatives, like Dublin Core, are currently considering similar issues so it should be possible within the project to concentrate on PDI and related issues and 'import' bibliographic-type elements from existing schemes.
- LB raised the important issue of what should happen to pre-existing metadata. Items being preserved would potentially arrive with (or be associated with) important descriptive or contextual information. This could be in any format; it could be added to the PDI as it stands, converted using crosswalks, or both. MD pointed out some problems of format conversion based on experience with other projects.
- MD wondered whether more consideration needed to be given to how (and by whom) Cedars metadata should be created. It could be a time-consuming process and would need to be part of the ingest procedure.
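The first and third points above (recording bibliographic-type information once in the PDI, and using crosswalks to relate it to other schemes) can be illustrated with a minimal sketch. It is not taken from AIW02 or AIW03: the PDI field names, the sample values and the `extract_di` helper are all hypothetical. Only the target names `identifier`, `title` and `creator` are genuine Dublin Core elements.

```python
# Hypothetical PDI record. The field names and values are illustrative
# assumptions, not elements defined in any Cedars document.
pdi_record = {
    "reference.identifier": "example-0001",
    "reference.title": "Example digitised pamphlet",
    "reference.creator": "Example, A.",
    "provenance.date_ingested": "1999-05-24",
    "fixity.checksum": "0f3a",  # made-up placeholder value
}

# Hypothetical crosswalk from PDI field names to Dublin Core element names.
PDI_TO_DC = {
    "reference.identifier": "identifier",
    "reference.title": "title",
    "reference.creator": "creator",
}


def extract_di(pdi, crosswalk):
    """Derive a Descriptive Information view from a PDI record.

    Only fields named in the crosswalk are copied, so bibliographic-type
    data is stored once (in the PDI) and DI is generated on demand.
    """
    return {target: pdi[field] for field, target in crosswalk.items() if field in pdi}


finding_aid_entry = extract_di(pdi_record, PDI_TO_DC)
print(finding_aid_entry)
```

In this sketch the preservation-specific fields (provenance, fixity) never reach the finding aid, and a change to the title in the PDI is reflected automatically the next time the DI view is generated.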
Content Information
After lunch, consideration moved on to a discussion of AIW05. Andy Stone briefly explained the origins of the document as an ongoing revision of AIW02 but including only the elements identified there as Representation Information.
- All major elements identified in AIW05 were discussed and most of them were found to need some further refinement or restructuring.
- Derek Sergeant [DS] questioned whether many of these elements were in fact PDI (rather than Representation Information). DS commented that RI - within OAIS - need only contain that information necessary to convert the bit sequences in the Digital Object into meaningful information (Reich and Sawyer 1999, p. 46).
- Granularity issues were also raised as a potential difficulty. Resources within the Cedars scope will need description at a number of levels. More detailed consideration of OAIS (or experience with the demonstrators) may help in this regard.
- The notion of 'dependency' needed further definition.
DS then gave a brief presentation outlining how Representation Information was understood by the Digital Preservation Strategies Working Group.
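DS's point about the narrow role of RI can be illustrated with a small sketch: the same bit sequence yields different information depending on the Representation Information applied to it. The dictionary layout and the `interpret` function are assumptions made for illustration only; they are not part of OAIS or of any Cedars document.

```python
import struct

# A digital object is just a sequence of bits (here, six bytes).
digital_object = bytes([0x43, 0x65, 0x64, 0x61, 0x72, 0x73])

# Two hypothetical pieces of Representation Information for the same bits.
ri_text = {"kind": "text", "encoding": "ascii"}
ri_numbers = {"kind": "uint16", "byte_order": "big"}


def interpret(bits, ri):
    """Turn a bit sequence into meaningful information using RI."""
    if ri["kind"] == "text":
        return bits.decode(ri["encoding"])
    if ri["kind"] == "uint16":
        prefix = ">" if ri["byte_order"] == "big" else "<"
        # Each unsigned 16-bit integer occupies two bytes.
        return struct.unpack(f"{prefix}{len(bits) // 2}H", bits)
    raise ValueError("unknown representation: " + ri["kind"])


as_text = interpret(digital_object, ri_text)
as_numbers = interpret(digital_object, ri_numbers)
print(as_text)
print(as_numbers)
```

Nothing in either RI record describes where the object came from or who holds rights in it; on DS's reading of OAIS, information of that kind would belong in the PDI instead.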
References
Bosak, J. and Bray, T., 1999, XML and the second-generation Web. Scientific American, 280 (5), May, pp. 79-83.
Reich, L. and Sawyer, D. (eds.), 1999, Reference Model for an Open Archival Information System (OAIS). Consultative Committee for Space Data Systems, White Book, Issue 5 (CCSDS 650.0-W-5.0). Washington, D.C.: CCSDS Secretariat, National Aeronautics and Space Administration. Available from: <URL:http://ssdoo.gsfc.nasa.gov/nost/isoas/ref_model.html>
Cedars is a Consortium of University Research Libraries (CURL) Project funded by the Joint
Information Systems Committee of the UK higher education funding councils through its Electronic
Libraries Programme (eLib).
Maintained by: Michael Day of UKOLN: The UK Office for Library and Information Networking,
University of Bath.