From: BIBLINK WP4: Format Conversion Feasibility, ed. Rachel Heery, (contributors: Robina Claphan, Michael Day, Brian Holt, Neil Wilson). September 1997.


Format Conversion Feasibility

Notes on mapping of national libraries' metadata requirements to Dublin Core

The BIBLINK Project

1.1 Introduction

Section 6 (Metadata Requirements of National Libraries) of BIBLINK D1.1 Metadata Formats identified the metadata requirements of the national libraries participating in BIBLINK. These metadata requirements were reduced in BIBLINK D4.1 Format Conversion Feasibility to support only a CIP-type function, and were further refined at a meeting in Mo I Rana in July 1997.

The purpose of this document is to map these metadata requirements to Dublin Core, and in particular to identify where qualifiers would be needed and where extensions to the Dublin Core elements would have to be made.

D1.1 suggests that national libraries wish to create records in various flavours of MARC and intend to apply detailed cataloguing rules to the content of those records. The formats used by the participating libraries vary, but are usually based on ISBD or AACR2. For the purpose of this paper, it is assumed that ISBD or AACR2 style formats would be desired for the national libraries' metadata requirements for electronic publications.

1.2 DC Qualifiers

The DC-4 meeting in Canberra (Weibel, et al. 1997) proposed the formal identification of the structure of elements and possible qualifiers in Dublin Core. In response, Rebecca Guenther (1997) has recently produced a proposal for Dublin Core qualifiers/substructure, which includes specific proposals for the qualifiers "scheme" and "type". These proposals are currently under discussion in the Dublin Core community. Guenther reiterates the Canberra meeting's insistence that the "type" qualifier should only be used to refine elements, not to extend their semantics and that each element should have a default meaning. The qualifiers can be understood as follows:

If SCHEME and TYPE cannot meet these principles, then an extensibility mechanism should be used.

1.3 Extensibility

Where the national libraries' metadata requirements cannot be described using Dublin Core, with or without qualifiers, new elements can be proposed.

1.4 The mapping

BIBLINK Data Element    Dublin Core
Author                  Creator

If a distinction needs to be made between personal and corporate authors, DC can differentiate between Creator.Personal and Creator.Corporate. Additionally, the SCHEME "Library of Congress Name Authority File" can be used if appropriate.

Contributor             Contributor

As with the Creator element, distinctions between Contributor.Personal and Contributor.Corporate can be made, as can a SCHEME for the Library of Congress Name Authority File.

Date of publication     Date

The proposed default SCHEME for the DC Date element is ISO 8601, with six levels of granularity:

· Year: YYYY (e.g. 1997)

· Year and month: YYYY-MM (e.g. 1997-07)

· Complete date: YYYY-MM-DD (e.g. 1997-07-16)

· Complete date plus hours and minutes: YYYY-MM-DDThh:mmTZD (e.g. 1997-07-16T19:20+01:00)

· Complete date plus hours, minutes and seconds: YYYY-MM-DDThh:mm:ssTZD (e.g. 1997-07-16T19:20:30+01:00)

· Complete date plus hours, minutes, seconds and decimal fractions of a second: YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00)

It is unlikely that more than the first three levels would be relevant in the context of BIBLINK.
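
The levels of granularity listed above can be told apart by their shape alone. The following Python sketch is illustrative only: the function name date_granularity and the level labels are hypothetical, and it assumes that TZD is either "Z" or a signed hh:mm offset. It checks the pattern of the string, not whether the date actually exists in the calendar.

```python
import re

# Assumption: a time zone designator (TZD) is "Z" or a +hh:mm / -hh:mm offset.
_TZD = r"(?:Z|[+-]\d{2}:\d{2})"

# One pattern per level listed above, from coarsest to finest.
# The labels are hypothetical names chosen for this sketch.
_LEVELS = [
    ("year", r"\d{4}"),
    ("year and month", r"\d{4}-\d{2}"),
    ("complete date", r"\d{4}-\d{2}-\d{2}"),
    ("hours and minutes", r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}" + _TZD),
    ("seconds", r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}" + _TZD),
    ("fractional seconds", r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+" + _TZD),
]


def date_granularity(value):
    """Return the label of the level whose pattern matches, or None."""
    for label, pattern in _LEVELS:
        if re.fullmatch(pattern, value):
            return label
    return None


# Only the first three levels are likely to matter for BIBLINK records.
for sample in ("1997", "1997-07", "1997-07-16", "19970716"):
    print(sample, "->", date_granularity(sample))
```

A value written without hyphens, such as 19970716, matches none of the six patterns and is reported as None.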

Description             Description
Edition/version         Extension to DC required.
Extent (size)           Extension to DC required.
Format                  Format

In DC terms, Format refers to the data representation of the resource, for example PostScript or text/html. There is still some debate about the desirability of using enumerated lists of format types, but the default as currently proposed is free text.

Frequency               Extension to DC required.
Hash Value              Extension to DC required.
Identifier              Identifier

It is proposed that the URL should be the default DC identifier. Any other identifiers used, e.g. ISBNs, ISSNs or DOIs, will therefore have to include a SCHEME in the DC record.
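
The rule above (a URL needs no SCHEME, anything else does) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a defined BIBLINK or Dublin Core interface: the function name identifier_line is hypothetical, the output follows the informal notation used in the examples of section 1.6, and a value is treated as a URL simply if it starts with one of a few common prefixes.

```python
# Assumption: these prefixes are enough to recognise a URL for this sketch.
_URL_PREFIXES = ("http://", "https://", "ftp://")


def identifier_line(value, scheme=None):
    """Format an identifier in the 'DC.identifier [SCHEME=x]: value' notation.

    A URL may be given without a scheme because it is the proposed default;
    any other identifier must name its scheme explicitly.
    """
    if scheme is None:
        if not value.startswith(_URL_PREFIXES):
            raise ValueError("non-URL identifiers need an explicit SCHEME")
        return "DC.identifier: " + value
    return "DC.identifier SCHEME=" + scheme + ": " + value


print(identifier_line("http://www.lib.cam.ac.uk/Taylor-Schechter/"))
print(identifier_line("0-89887-113-1", scheme="ISBN"))
```

Calling identifier_line("0-89887-113-1") without a scheme raises ValueError, which mirrors the requirement that non-default identifiers carry a SCHEME.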

Keywords                Subject

Keyword is the default for DC Subject.

Language                Language

It has been suggested that the content of this element in DC should coincide with the NISO Z39.53 three-character codes, although the default scheme is free text. If USMARC/Library of Congress style language codes are used (e.g. as in UNIMARC), the SCHEME given should be "Z39.53".

Place of publication    Extension to DC required.
Price                   Extension to DC required.
Publisher               Publisher
System requirements     Extension to DC required.
Terms and conditions    Rights

The default for Rights in DC is free text, although the element is intended to hold a link to a URL.

Title                   Title
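
The mapping in this section can be summarised as a simple lookup table. The Python sketch below is one possible representation and not part of any BIBLINK deliverable: the names BIBLINK_TO_DC and needs_extension are hypothetical, and None marks the elements for which the mapping says an extension to DC is required.

```python
# BIBLINK data element -> Dublin Core element.
# None means "Extension to DC required" in the mapping above.
BIBLINK_TO_DC = {
    "Author": "Creator",
    "Contributor": "Contributor",
    "Date of publication": "Date",
    "Description": "Description",
    "Edition/version": None,
    "Extent (size)": None,
    "Format": "Format",
    "Frequency": None,
    "Hash Value": None,
    "Identifier": "Identifier",
    "Keywords": "Subject",
    "Language": "Language",
    "Place of publication": None,
    "Price": None,
    "Publisher": "Publisher",
    "System requirements": None,
    "Terms and conditions": "Rights",
    "Title": "Title",
}


def needs_extension():
    """Return the BIBLINK elements that have no Dublin Core equivalent."""
    return sorted(name for name, dc in BIBLINK_TO_DC.items() if dc is None)


print(needs_extension())
```

Of the 18 BIBLINK elements, 11 map directly onto a Dublin Core element and 7 would need an extension.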

1.5 Notes

1.6 Some examples

Please note that the following examples are not intended to be definitive.

1.6.1 Web Page:

Metadata for a Web page, when translated into the BIBLINK data elements (incorporating Dublin Core), might look like the following:


Author (corporate)      DC.creator.corporate: Cambridge University Library
Title                   DC.title: Taylor-Schechter Unit Home Page
Date                    DC.date: 1997-06-05
Language                DC.language: eng
Format                  DC.format: text/html
Keywords                DC.subject: Taylor-Schechter Genizah Research Unit; Cairo Geniza, papyrus
Identifier              DC.identifier: http://www.lib.cam.ac.uk/Taylor-Schechter/
Place of publication    Cambridge
Publisher               DC.publisher: University of Cambridge

1.6.2 CD-ROM:

In comparison, a commercially published CD-ROM could be described using the BIBLINK data elements (incorporating Dublin Core) in the following way:


Author (personal)       DC.creator.personal SCHEME=Library of Congress Name Authority File: Migne, J.P. (Jacques Paul), 1800-1875
Title                   DC.title: Patrologia Latina Database
Date                    DC.date: 1993
Language                DC.language: lat
Format                  CD-ROM
Extent                  2 computer laser optical disks ; 4 3/4 in
Description             DC.description: The Patrologia Latina Database is an electronic version of the 221 volumes of the first edition of Jacques-Paul Migne's Patrologia Latina which was published between 1844 and 1865. The Patrologia Latina comprises the works of the Church Fathers from Tertullian in 200 AD to the death of Pope Innocent III in 1216. The database is fully searchable.
System requirements     Multimedia PC 486x or higher, 8 MB memory, CD-ROM drive, sound card, SVGA 256-colour monitor, Windows 95 or Windows 3.1
Keywords                DC.subject: Early Christian Literature; Patristics
                        DC.subject SCHEME=LCSH: Christian literature, Early -- Latin authors -- Texts
                        DC.subject SCHEME=LCSH: Fathers of the church, Latin -- Texts
Identifier              DC.identifier SCHEME=ISBN: 0-89887-113-1
Place of publication    Cambridge
Publisher               DC.publisher: Chadwyck-Healey
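
The notation used in the two examples (DC.element, an optional .type, an optional SCHEME=, then a colon and the value) is regular enough to be split into its parts. The Python sketch below assumes that informal notation exactly as it appears in this document; it is not a defined Dublin Core syntax, and the names DCField and parse_dc_line are hypothetical. It is meant to be applied to the Dublin Core part of a row, i.e. the text from "DC." onwards.

```python
import re
from typing import NamedTuple, Optional


class DCField(NamedTuple):
    element: str                   # e.g. "subject"
    type_qualifier: Optional[str]  # e.g. "corporate", or None
    scheme: Optional[str]          # e.g. "LCSH", or None
    value: str


# DC.<element>[.<type>][ SCHEME=<scheme>]: <value>
# Assumption: a scheme name never contains a colon.
_LINE = re.compile(
    r"DC\.(?P<element>[A-Za-z]+)"
    r"(?:\.(?P<type>[A-Za-z]+))?"
    r"(?: SCHEME=(?P<scheme>[^:]+))?"
    r": (?P<value>.*)"
)


def parse_dc_line(line):
    """Split one 'DC....: value' line into its parts, or return None."""
    match = _LINE.fullmatch(line.strip())
    if match is None:
        return None
    return DCField(
        match["element"].lower(),
        match["type"],
        match["scheme"],
        match["value"],
    )


print(parse_dc_line("DC.subject SCHEME=LCSH: Fathers of the church, Latin -- Texts"))
print(parse_dc_line("DC.identifier SCHEME=ISBN: 0-89887-113-1"))
```

Values for elements that have no Dublin Core tag in the examples (such as the place of publication) do not start with "DC." and are returned as None, which reflects their status as proposed extensions.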

1.7 References

Guenther, R. Dublin Core qualifiers/substructure: a proposal. 15 April 1997. http://www.loc.gov/marc/dcqualif.html

Weibel, S., Iannella, R. and Cathro, W. The 4th Dublin Core Metadata Workshop report. D-Lib Magazine, June 1997. http://www.dlib.org/dlib/june97/metadata/06weibel.html


Maintained by Michael Day of UKOLN, University of Bath.
Last updated 18-Sep-1997.