Interoperability between metadata formats

Mapping Dublin Core to ROADS templates

Michael Day
UKOLN: The UK Office for Library and Information Networking,
University of Bath, Bath, BA2 7AY, United Kingdom
http://www.ukoln.ac.uk/
m.day@ukoln.ac.uk

November 1997


Summary

Table 1: Summary mapping from Dublin Core to ROADS/IAFA templates.

Dublin Core element    IAFA template
-------------------    -------------
Title                  Title
Creator                Author-name (From Author (USER)* cluster)
Subject                Keyword
                       Subject-Descriptor-Scheme
                       Subject-Descriptor
Description            Description
Publisher              Publisher-name (From Publisher (ORGANISATION)* cluster)
Contributors           No direct equivalent
Date                   Creation-date
Type                   Category
Format                 Format-v*
                       Requirements
Identifier             URI-v*
                       ISBN
                       ISSN
Source                 Source
Language               Language
Relation               No direct equivalent
Coverage               No direct equivalent
Rights                 No direct equivalent
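
As an illustration only, the summary mapping in Table 1 could be held as a simple lookup structure by conversion software. The sketch below is written in Python; the names DC_TO_ROADS and roads_targets are hypothetical and do not come from the ROADS software. An empty list stands for "No direct equivalent".

```python
# Hypothetical lookup table transcribed from Table 1.
# Each Dublin Core element maps to a list of candidate ROADS/IAFA
# attributes; an empty list means "No direct equivalent".
DC_TO_ROADS = {
    "Title": ["Title"],
    "Creator": ["Author-name"],
    "Subject": ["Keyword", "Subject-Descriptor-Scheme", "Subject-Descriptor"],
    "Description": ["Description"],
    "Publisher": ["Publisher-name"],
    "Contributors": [],
    "Date": ["Creation-date"],
    "Type": ["Category"],
    "Format": ["Format-v*", "Requirements"],
    "Identifier": ["URI-v*", "ISBN", "ISSN"],
    "Source": ["Source"],
    "Language": ["Language"],
    "Relation": [],
    "Coverage": [],
    "Rights": [],
}


def roads_targets(dc_element):
    """Return the candidate ROADS attributes for a Dublin Core element.

    Unknown element names are treated like elements with no equivalent.
    """
    return DC_TO_ROADS.get(dc_element, [])


if __name__ == "__main__":
    for element in ("Creator", "Identifier", "Rights"):
        print(element, "->", roads_targets(element) or "No direct equivalent")
```

Where an element has several candidate attributes, the choice between them depends on qualifiers such as SCHEME, as discussed in the detailed comments.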

Introduction

The Dublin Metadata Core Element Set (Dublin Core for short) was devised as a simple set of data elements so that Internet publishers and authors would be able to create their own metadata records.

The Dublin Core elements were originally agreed at a workshop held in March 1995 at Dublin, Ohio (Weibel, et al. 1995). The workshop report commented that "automatically generated records often contain too little information to be useful, while manually generated records are too costly to create and maintain for the large number of electronic documents currently available on the Internet". Dublin Core elements were designed to mediate between these extremes. A reference description of the Dublin Core element set can be found at the URL:http://purl.org/metadata/dublin_core_elements.

ROADS templates are used by the subject services which use the ROADS software. The templates are a development of the Internet Anonymous FTP Archive (IAFA) templates outlined in an IETF Internet Draft in 1994 (Deutsch, et al., 1994). Mapping Dublin Core to these templates tests Dublin Core's potential role as an interchange format between metadata types, in particular in relation to the ROADS project (Heery 1996; Knight and Hamilton 1996). More information on the ROADS project can be found at the URL:http://www.ukoln.ac.uk/roads/.

Detailed comments on the mappings

Title

This should map neatly across to the ROADS template Title.

Creator

Dublin Core Author or Creator elements are defined as the "person(s) or organization(s) primarily responsible for the intellectual content of the resource". Dublin Core Creator elements could be mapped to the Author-Name part of the ROADS template Author-(USER)* cluster. Differences in format are not as crucial here as they would be when mapping to a more complex scheme like MARC. If a Dublin Core SCHEME is added, e.g.:

Author (type=USMARC): 100 1 Doyle, Arthur Conan $c Sir $d 1859-1930,

things get more complicated.

Subject

In Dublin Core a SCHEME sub-element can be used to note which controlled indexing terms are being used, or which classification system is in use. e.g.:

Subject (scheme=LCSH): UNIX (Computer system)
Subject (scheme=Dewey Decimal System): 004.251 Supercomputers--systems design

If the sub-element names a well known indexing or classification system, then this could be extracted and placed in the ROADS template "Subject-Descriptor-Scheme" and the data itself could be placed in a "Subject-Descriptor". Presumably, widely used indexing or classification schemes could be listed in an authority file so that the machine could identify them accurately. Alternatively, the SCHEME sub-element could map directly to the ROADS "Subject-Descriptor-Scheme", and the attached data to "Subject-Descriptor". However, this would rely on abbreviations for the schemes being used in a consistent manner.

If no SCHEME sub-element is used, the subject terms could be assumed to be suitable for the ROADS template "Keywords". In Dublin Core, the Subject element can contain any "keywords or phrases that describe the subject or content of the resource".

In Dublin Core the data elements are repeatable, so a record may contain several Subject elements, some with SCHEME sub-elements and some without. All will have to map to their relevant place in a ROADS template.

e.g.:

Keywords: Supercomputers
Keywords: UNIX
Subject-Descriptor-Scheme-v1: DDC
Subject-Descriptor-Scheme-v2: LCSH
Subject-Descriptor-v1: 004.251 Supercomputers--systems design
Subject-Descriptor-v2: UNIX (Computer system)
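
A minimal sketch of this Subject handling, in Python, is given below. It assumes a small authority file (the hypothetical SCHEME_AUTHORITY dictionary) that normalises scheme names to abbreviations, and it assumes that subjects with an unrecognised scheme fall back to Keywords; neither assumption is specified by ROADS or Dublin Core. The attribute name Subject-Descriptor-Scheme follows the spelling used in the text above.

```python
# Hypothetical authority file: scheme names as they might appear in a
# Dublin Core SCHEME sub-element, mapped to a consistent abbreviation.
SCHEME_AUTHORITY = {
    "LCSH": "LCSH",
    "DDC": "DDC",
    "Dewey Decimal System": "DDC",
}


def map_subjects(subjects):
    """Map (scheme, value) pairs to ROADS template lines.

    scheme is None when the Subject element has no SCHEME sub-element.
    Each recognised scheme is given its own variant number (v1, v2, ...).
    """
    lines = []
    variants = {}  # normalised scheme abbreviation -> variant number
    for scheme, value in subjects:
        abbreviation = SCHEME_AUTHORITY.get(scheme) if scheme else None
        if abbreviation is None:
            # No scheme, or a scheme not in the authority file:
            # treat the value as a free keyword (an assumption).
            lines.append(f"Keywords: {value}")
            continue
        if abbreviation not in variants:
            variants[abbreviation] = len(variants) + 1
            lines.append(
                f"Subject-Descriptor-Scheme-v{variants[abbreviation]}: {abbreviation}"
            )
        lines.append(f"Subject-Descriptor-v{variants[abbreviation]}: {value}")
    return lines


if __name__ == "__main__":
    record = [
        (None, "Supercomputers"),
        (None, "UNIX"),
        ("Dewey Decimal System", "004.251 Supercomputers--systems design"),
        ("LCSH", "UNIX (Computer system)"),
    ]
    print("\n".join(map_subjects(record)))
```

Unlike the hand-written example, this sketch emits each scheme line immediately before its first descriptor rather than grouping all scheme lines together; the variant numbers pair them up either way.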

Description

In Dublin Core, this element holds a textual description of the content of a resource. It will map fairly accurately to the ROADS template Description field.

Publisher

Dublin Core Publisher elements are defined as the "entity responsible for making the resource available in its present form". Dublin Core Publisher elements could be mapped to the Publisher-Name part of the ROADS template Publisher-(ORGANISATION)* cluster.

Contributors

The Other Contributors element is intended to describe roles like editing, illustrating and compiling: in essence, intellectual contributions to the resource that are not covered by the Creator element. It can take the form of a free text string:

Contributors: Transcribed by the University of Maryland at College Park Libraries 
Humanities Electronic Text Center

or can be defined by a "type" or "role" sub-element:

Contributors: (role=Editor): Harnad, Stevan
Contributors: (role=Illustrator): Bailey, Sian

Whichever is used, there is no obvious place in ROADS template-based records where this data could be accurately mapped. The closest is the Author-(USER)* cluster.

Date

The Dublin Core Date of publication element is intended to reflect "the date the resource was made available in its current form". Recommended practice is for an ANSI X3.30-1985 term (YYYYMMDD) to be used. Modified dates can be identified with a qualifier.

Date: May 6, 1995
Date: 19950506
Date (modified): 19970206

It should map to the ROADS template element Creation-Date. There are potential problems with compatibility between date formats. ROADS templates do not specify what form of date should be used. A conversion program might have to convert from ANSI X3.30-1985 terms to a more human-readable form for the ROADS template. Modified dates could map to Last-Revision-Date.
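
The date handling described above could be sketched in Python as follows. The function names, the "modified" qualifier string and the chosen human-readable form (day, month name, year) are assumptions made for illustration; ROADS templates do not prescribe a date format. Values that are not valid YYYYMMDD strings are passed through unchanged.

```python
from datetime import date

# English month names are listed explicitly so that the output does not
# depend on the locale of the machine running the conversion.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def dc_date_to_roads(value):
    """Convert a YYYYMMDD string to 'D Month YYYY'; otherwise return it as is."""
    if len(value) == 8 and value.isdigit():
        try:
            parsed = date(int(value[:4]), int(value[4:6]), int(value[6:]))
        except ValueError:
            # Eight digits that do not form a real calendar date.
            return value
        return f"{parsed.day} {MONTHS[parsed.month - 1]} {parsed.year}"
    return value


def map_date(value, qualifier=None):
    """Return a ROADS template line for a Dublin Core Date element."""
    if qualifier == "modified":
        attribute = "Last-Revision-Date"
    else:
        attribute = "Creation-Date"
    return f"{attribute}: {dc_date_to_roads(value)}"


if __name__ == "__main__":
    print(map_date("May 6, 1995"))
    print(map_date("19950506"))
    print(map_date("19970206", qualifier="modified"))
```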

Type

The Resource Type element defines the genre or category of the object, and would probably map best to the ROADS template "Category". There are no problems with semantics, although an authority list of the most commonly used terms might be useful.

Format

The Dublin Core Format element is intended to provide information about the hardware and software requirements to display or operate the object. To this extent, examples like Windows 3.1 executable file, HTML file or ASCII file, would best map to the ROADS template "Format-v*". If there is more than one format given in a Dublin Core record, conversion software would have to generate additional Format-v* elements automatically. If, however, Dublin Core Format elements are free text descriptions of how the object can be displayed or operated, they would map better to the ROADS template "Requirements".

Identifier

The Resource Identifier in Dublin Core is the string or number used to uniquely identify an object. This includes things like ISBNs and ISSNs, as well as URLs. The type of identifier could be identified by a scheme.

Identifier (scheme=ISBN) = 0-19-097636-X
Identifier (scheme=URL) = http://www.ukoln.ac.uk/metadata/home.html

ROADS Templates include attributes for ISBN, ISSN and URI-v*. With schemes present, URLs, ISBNs and ISSNs could be adequately mapped to ROADS. Other, non-standard, identifiers would not, however, necessarily fit into the ROADS templates.
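
The routing of identifiers by scheme could be sketched in Python as below. The function name map_identifiers and the decision to collect unmappable identifiers in a separate list are assumptions for illustration; the text only says that such identifiers would not necessarily fit into the ROADS templates.

```python
def map_identifiers(identifiers):
    """Route (scheme, value) pairs to ROADS attributes.

    Returns a pair: the ROADS template lines, and the list of
    identifiers that have no obvious ROADS attribute.
    """
    lines = []
    unmapped = []
    uri_count = 0
    for scheme, value in identifiers:
        key = (scheme or "").upper()
        if key in ("URL", "URI"):
            # Each URL becomes a new variant of the URI attribute.
            uri_count += 1
            lines.append(f"URI-v{uri_count}: {value}")
        elif key in ("ISBN", "ISSN"):
            lines.append(f"{key}: {value}")
        else:
            # No scheme, or a non-standard scheme.
            unmapped.append((scheme, value))
    return lines, unmapped


if __name__ == "__main__":
    mapped, left_over = map_identifiers([
        ("ISBN", "0-19-097636-X"),
        ("URL", "http://www.ukoln.ac.uk/metadata/home.html"),
        ("OCLC", "155653163X"),
    ])
    print("\n".join(mapped))
    print("Unmapped:", left_over)
```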

Source

The Dublin Core source element refers to the object from which the object being catalogued is derived, e.g. the previous version of a document.

The Source element in ROADS templates is designed to give information about the source of the object. It is not used in the SERVICE template, but can be included in the DOCUMENT template. It is not necessarily related to an identifier as defined by Dublin Core but is usually a short piece of text. A short prefix, e.g. "Derived from:", could presumably be inserted if necessary.

Language

Language in Dublin Core specifies the language of the intellectual content of the object. Where practical, the guidelines note that the content of this field should coincide with the Z39.50 three character codes for written languages.

Again, abbreviations can be used and the source can be included as a scheme:

Language (scheme=USMARC) = spa

In ROADS templates, the Language-v* attribute is used for the language in which the object is written. Note that it can also be used for the programming language in a SOFTWARE template.

Relation

The Dublin Core relation element gives the relationship of the object to other objects. This could be to other documents in a hierarchy, or maybe to the parent electronic journal, although other relationships are possible. The use of this element is currently under discussion.

ROADS templates do not currently contain relation elements, so the Dublin Core Relation element will not map to ROADS templates. However, discussion is currently taking place within the ROADS project to ensure that basic relationships (e.g. Parent and Child relationships) can be identified in some way.

Coverage

The Dublin Core coverage element describes spatial and temporal characteristics of an object. It would be used for GIS or geospatial data, or something requiring time elements. It has a possible "type" qualifier.

Coverage (type = spatial) = The Atlantic Ocean
Coverage (type = spatial, scheme = LATLONG) = {West - 180,
East = 180, North = 90, South = 90}
Coverage (type = temporal, scheme = ANSI X3.30-1985) =
{Begin = 19910101, End = 19930601}

There is no ROADS template equivalent of this element, although its content could form part of the Description.

Rights

The Rights Management field is intended to provide a link to a rights-management statement or copyright notice, so that these conditions can be linked to the record.

There is no ROADS/IAFA equivalent to this.


Examples of Mapping

Example 1

The 1995 OCLC Dublin Core metadata workshop report gave some examples of records encoded using the Dublin Core. The first was created by a subject specialist without specific library cataloguing experience. The Dublin Core elements have been amended to reflect current practice.

Dublin Core record:
Title: A Unifying Syntax for the Expression of Names and Addresses of Objects 
on the Network as used in the World-Wide Web.
Title: (Subtitle) Universal Resource Identifiers in WWW
Creator: Berners-Lee, T.
Subject: IETF, URI, Uniform Resource Identifiers
Publisher: CERN
Date: 1994
Type: Internet RFC
Format (scheme=IMT): text/plain
Identifier (scheme=URL): gopher://gopher.es.net:70/0R0-57601-/pub/rfcs/rfc1630.txt
Relation (type=child)(identifier=URL): http://ds.internic.net/ds/dspg1intdoc.html
Relation (type=sibling)(identifier=URL): http://ds.internic.net/rfc/rfc1738.txt

IAFA / ROADS template record:
Author-Name: Berners-Lee, T.
Category: Internet RFC
Creation-Date: 1994
Format: text/plain
Keyword: IETF, URI, Uniform Resource Identifiers
Publisher-Name: CERN
Title: A Unifying Syntax for the Expression of Names and Addresses of 
Objects on the Network as used in the World-Wide Web.
Title: Universal Resource Identifiers in WWW
Template-Type: DOCUMENT
URI-v1: gopher://gopher.es.net:70/0R0-57601-/pub/rfcs/rfc1630.txt

Notes:

Most of the record maps quite easily onto the ROADS template. Conversion software will somehow have to work out whether a DOCUMENT, SERVICE or other Template-Type is required.

The Title and Title (subtitle) elements in Dublin Core are potentially confusing, and would result in two Title elements in ROADS templates. If, however, conversion software could recognise the (subtitle) qualifier, then it could conceivably join the two with the relevant punctuation:

Title: A Unifying Syntax for the Expression of Names and 
Addresses of Objects on the Network as used in the World-Wide 
Web: Universal Resource Identifiers in WWW
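
A minimal Python sketch of this title merging is given below. The function name merge_titles and the way the qualifier is detected (a leading "(Subtitle)" marker in the element value, matched case-insensitively) are assumptions based on the form of the example record, not a defined Dublin Core syntax.

```python
import re

# Matches a leading "(subtitle)" marker in any letter case.
SUBTITLE_MARKER = re.compile(r"^\(subtitle\)\s*", re.IGNORECASE)


def merge_titles(titles):
    """Join each subtitle onto the title that precedes it.

    titles is the list of raw Title values in record order.
    A subtitle with no preceding title is kept unchanged.
    """
    merged = []
    for raw in titles:
        value = raw.strip()
        match = SUBTITLE_MARKER.match(value)
        if match and merged:
            # Drop a trailing full stop from the main title before
            # appending ": subtitle".
            main = merged[-1].rstrip(".")
            merged[-1] = f"{main}: {value[match.end():]}"
        else:
            merged.append(value)
    return merged


if __name__ == "__main__":
    record_titles = [
        "A Unifying Syntax for the Expression of Names and Addresses of "
        "Objects on the Network as used in the World-Wide Web.",
        "(Subtitle) Universal Resource Identifiers in WWW",
    ]
    for title in merge_titles(record_titles):
        print("Title:", title)
```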

The Relation elements in Dublin Core are completely ignored.


Example 2

Dublin Core record:
Title: On the Pulse of Morning
Author: Maya Angelou
Publisher: University of Virginia Library Electronic Text Center
Contributors: Transcribed by the University of Virginia Electronic Text 
Center
Date: 1993
Type: Poem
Format: 1 ASCII file
Source: Newspaper stories and oral performance of text at the 
presidential inauguration of Bill Clinton
Language: English

ROADS template record:
Author-Name: Maya Angelou
Category: Poem
Creation-Date: 1993
Format-v1: 1 ASCII file
Language-v1: English
Publisher-Name: University of Virginia Library Electronic Text Center
Source: Newspaper stories and oral performance of text at the 
presidential inauguration of Bill Clinton
Template-Type: DOCUMENT
Title: On the Pulse of Morning

Notes:

The DC "Contributors" element, without a further qualifier (e.g. publisher, compiler) does not map onto a ROADS attribute. In this case, however, this is not a major problem as the transcriber is also the publisher.
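
The conversion of a simple record such as this one could be sketched in Python as follows. The SIMPLE_MAP table, the convert_record function, the fixed Language-v1 variant and the default Template-Type of DOCUMENT are all assumptions for illustration; in practice the Template-Type would have to be determined by the conversion software. Elements with no ROADS attribute are returned separately so that they can be reported rather than silently lost.

```python
# Hypothetical one-to-one mappings for elements that need no special
# handling. "Author" is accepted as an older name for "Creator".
SIMPLE_MAP = {
    "Title": "Title",
    "Author": "Author-Name",
    "Creator": "Author-Name",
    "Description": "Description",
    "Publisher": "Publisher-Name",
    "Date": "Creation-Date",
    "Type": "Category",
    "Format": "Format-v1",
    "Source": "Source",
    "Language": "Language-v1",
}


def convert_record(dc_pairs, template_type="DOCUMENT"):
    """Convert (element, value) pairs into ROADS template lines.

    Returns the template lines and the list of dropped pairs.
    """
    lines = [f"Template-Type: {template_type}"]
    dropped = []
    for element, value in dc_pairs:
        attribute = SIMPLE_MAP.get(element)
        if attribute is None:
            dropped.append((element, value))
        else:
            lines.append(f"{attribute}: {value}")
    return lines, dropped


if __name__ == "__main__":
    record = [
        ("Title", "On the Pulse of Morning"),
        ("Author", "Maya Angelou"),
        ("Contributors",
         "Transcribed by the University of Virginia Electronic Text Center"),
        ("Date", "1993"),
        ("Type", "Poem"),
        ("Language", "English"),
    ]
    template, not_mapped = convert_record(record)
    print("\n".join(template))
    print("Not mapped:", [element for element, _ in not_mapped])
```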


Example 3

A more complex record

Dublin Core record:
Title:		Assessing Information on the Internet: Toward
			Providing Library Services for Computer 
			Mediated Communication

Creator:		Martin Dillon
Creator:		Erik Jul
Creator:		Mark Burge
Creator:		Carol Hickey

Subject: 
	scheme=LCSH:	Internet (Computer network)
			Cataloging of computer files
			Information networks
			Computer networks
			Libraries--Communication systems
			Information storage and retrieval systems


Publisher:		OCLC

Date:			1994

Type:			ResearchPaper

Format: 		7 postscript files
			1 Unix tar file

Identifier:
	Scheme=OCLC:	155653163X

Source:			Martin Dillon, Erik Jul, Mark Burge and Carol
			Hickey. Assessing Information on the Internet:
			Toward Providing Library Services for Computer
			Mediated Communication. OCLC Technical Report Number,
			1234567. Dublin, OH.:OCLC, 1993.

Language:		English

Relation:		For a Web page listing Internet accessible
			OCLC research publications go to: 
			http://www.oclc.org/oclc/menu/reschdoc.htm

ROADS template record:
Author-Name: Carol Hickey
Author-Name: Erik Jul
Author-Name: Mark Burge
Author-Name: Martin Dillon
Category: ResearchPaper
Creation-Date: 1994
Format-v1: 7 postscript files, 1 Unix tar file
Language-v1: English
Publisher-Name: OCLC
Source: Martin Dillon, Erik Jul, Mark Burge and Carol
	Hickey. Assessing Information on the Internet: Toward Providing
	Library Services for Computer Mediated Communication. OCLC Technical
	Report, 1234567. Dublin, OH.:OCLC, 1993.
Subject-Descriptor-Scheme-v1: LCSH
Subject-Descriptor-v1: Cataloging of computer files
Subject-Descriptor-v1: Computer networks
Subject-Descriptor-v1: Information networks
Subject-Descriptor-v1: Information storage and retrieval systems
Subject-Descriptor-v1: Internet (Computer network)
Subject-Descriptor-v1: Libraries--Communication systems
Template-Type: DOCUMENT
Title: Assessing Information on the Internet: Toward Providing 
Library Services for Computer Mediated Communication

Notes:

The Relation element is missing from the ROADS template.


References and Bibliography

Caplan, P. and Guenther, R., 1996, Metadata for Internet resources: the Dublin Core Metadata Elements Set and its mapping to USMARC. Cataloging & Classification Quarterly, Vol. 22, nos. 3/4.

Day, M., 1997, Mapping between metadata formats.
URL:http://www.ukoln.ac.uk/metadata/interoperability/

Deutsch, P., Emtage, A., Koster, M. and Stumpf, M., 1994, Publishing information on the Internet with Anonymous FTP. IETF Internet Draft, September.
URL:http://info.webcrawler.com/mak/projects/iafa/iafa.txt

Heery, R., 1996, ROADS: Resource Organisation and Discovery in Subject-based Services. Ariadne, No. 3.
URL:http://ukoln.bath.ac.uk/ariadne/issue3/roads/

Knight, J.P. and Hamilton, M.T., 1996, Overview of the ROADS software. (LUT CS-TR 1010). Loughborough: Loughborough University of Technology, Department of Computer Studies, March.
URL:http://www.roads.lut.ac.uk/Reports/arch/arch.html

Kunze, J.A., 1996, Guide to Creating Core Descriptive Metadata. Draft 3, 18 September.
URL:http://www.ckm.ucsf.edu/meta/mguide3.html

Library of Congress, Network Development and MARC Standards Office, 1997, Dublin Core/MARC Crosswalk
URL:http://lcweb.loc.gov/marc/dccross.html

MARBI, 1995, Mapping the Dublin Core Metadata Elements to USMARC. Discussion Paper, no. 86.
URL:gopher://marvel.loc.gov:70/00/.listarch/usmarc/dp86.doc

MARBI, 1997, Metadata, Dublin Core, and USMARC: a review of current efforts. Discussion Paper No. 99.
URL:gopher://marvel.loc.gov:70/00/.listarch/usmarc/dp99.doc

ROADS, 1995, Field descriptions for DOCUMENT, SOFTWARE, IMAGE, SOUND, VIDEO, MAILARCHIVE, USENET and FAQ IAFA Template types. From ROADS Manual.
URL:http://www.roads.lut.ac.uk/v1/IAFA-help/document.html

Weibel, S., Godby, J., Miller, E. and Daniel, R., 1995, OCLC/NCSA Metadata Workshop report.
URL:http://www.oclc.org:5046/conferences/metadata/dublin_core_report.html

Weider, C., 1994, The Internet anonymous FTP archive templates: towards an Internet resource location system. Journal of Information Networking, Vol. 1, no. 3, pp. 256-260.


Acknowledgements

UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee (JISC) of the UK Higher Education Funding Councils, as well as by project funding from JISC's eLib Programme and the European Union. UKOLN also receives support from the University of Bath, where it is based.


This document is a revision of an earlier draft (November 1996) dealing with mapping from the original proposed Dublin Core elements (1995) to ROADS/IAFA templates. Anyone interested in the earlier form of this document should look at: <URL:http://www.ukoln.ac.uk/metadata/interoperability/dcv1_iafa.html>.

For mappings from ROADS/IAFA to Dublin Core, see <URL:http://www.ukoln.ac.uk/metadata/interoperability/iafa_dc.html>.


This work was carried out for the Resource Organisation And Discovery in Subject-based services (ROADS) project funded by the Electronic Libraries (eLib) Programme.

More information on ROADS can be found on the project's Web pages: <URL:http://www.ilrt.bris.ac.uk/roads/>


Maintained by: Michael Day of UKOLN: The UK Office for Library and Information Networking, University of Bath.
Document created: 3-Nov-1997
Last updated: 12-Aug-1998
