Guidelines for encoding identifiers in Dublin Core metadata

Andy Powell
UKOLN, University of Bath

1st draft

1. Introduction

This document provides guidelines for encoding a number of commonly used identifiers in Dublin Core [DCMI] metadata.

2. Definitions

Identifier
Based on the definition and comment for the Identifier element in the Dublin Core Metadata Element Set definition [DCMES], we can define an identifier as:
An unambiguous reference to [a] resource within a given context. Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Example formal identification systems include the Uniform Resource Identifier (URI) (including the Uniform Resource Locator (URL)), the Digital Object Identifier (DOI) and the International Standard Book Number (ISBN).
As indicated in the Dublin Core Qualifiers [DCQ] document, identifiers are typically used as values for the Identifier, Source and Relation elements, though they may also be used elsewhere, for example with the Rights element as a reference to a service providing a rights statement.

3. General guidelines

Recommendation 1. All identifiers should be encoded in DC metadata as Uniform Resource Identifiers [URI]. URIs provide a simple, extensible and widely deployed mechanism for identifying resources that supports the encoding of existing identification schemes including the common Uniform Resource Locator (URL) [URI-CLARIFICATION]. For example:


<meta name="DC.Identifier" content="http://www.ukoln.ac.uk/">
<meta name="DC.Relation" content="http://purl.org/net/ukoln">

A list of registered URI schemes is maintained by IANA [URI-SCHEMES].

A URL is simply a type of URI that identifies a resource via a representation of its primary access mechanism (e.g., its network "location"), rather than by some other attributes it may have. Thus, an 'http' URL is a URI (or to put it another way, 'http' is a URI scheme).

A Uniform Resource Name (URN) is a URI that uses the 'urn' URI scheme and that is intended to 'name' a resource in a persistent way. The 'urn' scheme defines sub-spaces, called 'namespaces', each of which is assigned a namespace identifier. Thus, 'isbn' is a URN namespace identifier (see below):


<meta name="DC.Source" content="urn:isbn:1-56592-149-6">

A list of registered URN namespace identifiers is maintained by IANA [URN-NIDS].

Recommendation 2. In qualified DC applications, the 'URI' encoding scheme should be indicated. Note that URLs and URNs are both URIs and should therefore be given the 'URI' encoding scheme in DC metadata. For example:


<meta name="DC.Identifier" scheme="URI" content="http://www.ukoln.ac.uk/">
<meta name="DC.Relation" scheme="URI" content="http://purl.org/net/ukoln">
<meta name="DC.Source" scheme="URI" content="urn:isbn:1-56592-149-6">
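
As a rough illustration of Recommendations 1 and 2, the following Python sketch generates meta elements of the form shown above. The function name dc_meta and its parameters are hypothetical and are not part of any DC specification or library; the sketch assumes the HTML meta encoding used in this document's examples.

```python
from html import escape

def dc_meta(element, uri, qualified=True):
    """Render one DC element as an HTML meta tag (hypothetical helper)."""
    # In qualified DC, identifiers carry the 'URI' encoding scheme.
    scheme = ' scheme="URI"' if qualified else ""
    # Escape the values so characters such as '&' or '"' cannot break the attributes.
    return '<meta name="DC.{0}"{1} content="{2}">'.format(
        escape(element, quote=True), scheme, escape(uri, quote=True))

print(dc_meta("Identifier", "http://www.ukoln.ac.uk/"))
print(dc_meta("Source", "urn:isbn:1-56592-149-6", qualified=False))
```

Passing qualified=False yields the unqualified form used in the examples for Recommendation 1.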

4. Specific identifier scheme guidelines

4.1 DOI

The Digital Object Identifier [DOI] is a persistent identifier of intellectual property entities. DOIs are most commonly encoded as URIs using the 'doi' URI scheme or as URLs using the http://dx.doi.org/ prefix. For example:

doi:10.1000/182
http://dx.doi.org/10.1000/182
Note: at the time of writing, the 'doi' URI scheme had not been registered with IANA.

Recommendation 3. Encode DOIs as 'doi' URIs or 'http' URLs in DC metadata, indicating a 'URI' encoding scheme in qualified DC applications. For example:


<meta name="DC.Identifier" scheme="URI" content="doi:10.1000/182">
<meta name="DC.Identifier" scheme="URI" content="http://dx.doi.org/10.1000/182">
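
The two encodings in Recommendation 3 can be derived mechanically from a bare DOI. The Python sketch below shows one way to do so; the function name doi_to_uris is hypothetical, and the validity check is a simplifying assumption (a leading '10.' and a '/' separating prefix from suffix) rather than a full implementation of the DOI syntax. DOIs containing characters that are not legal in a URI would additionally need percent-encoding, which this sketch does not attempt.

```python
def doi_to_uris(doi):
    """Return the 'doi' URI and 'http' URL forms of a bare DOI (hypothetical helper)."""
    doi = doi.strip()
    # Assumption: a bare DOI begins with '10.' and has a '/' before its suffix.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError("not a bare DOI: %r" % doi)
    # Both forms are URIs, so both take scheme="URI" in qualified DC.
    return "doi:" + doi, "http://dx.doi.org/" + doi

doi_uri, doi_url = doi_to_uris("10.1000/182")
print(doi_uri)
print(doi_url)
```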

4.2 Handle

The Handle System [HANDLE] is a comprehensive system for assigning, managing, and resolving persistent identifiers, known as "handles," for digital objects and other resources on the Internet. Handles are most commonly encoded as URIs using the 'hdl' URI scheme or as URLs using the http://hdl.handle.net/ prefix. For example:

hdl:4263537/4069
http://hdl.handle.net/4263537/4069
Note: at the time of writing, the 'hdl' URI scheme had not been registered with IANA.

Recommendation 4. Encode Handles as 'hdl' URIs or 'http' URLs in DC metadata, indicating a 'URI' encoding scheme in qualified DC applications. For example:


<meta name="DC.Identifier" scheme="URI" content="hdl:4263537/4069">
<meta name="DC.Identifier" scheme="URI" content="http://hdl.handle.net/4263537/4069">

4.3 ISBN

An International Standard Book Number (ISBN) identifies an edition of a monographic work and is defined by the standard NISO/ANSI/ISO 2108:1992. RFC-3187 [RFC3187] defines the mechanism for encoding an ISBN as a URN.

Recommendation 5. Encode ISBNs as URNs in DC metadata, using the 'isbn' URN namespace identifier and indicating a 'URI' encoding scheme in qualified DC applications. For example:


<meta name="DC.Source" scheme="URI" content="urn:isbn:1-56592-149-6">
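
Before an ISBN is encoded as a URN it may be worth verifying its check digit. The Python sketch below does this for 10-character ISBNs, using the standard modulus-11 test with weights 10 down to 1. The function name isbn10_to_urn is hypothetical, and retaining the hyphens in the URN is an assumption that simply follows the example above.

```python
def isbn10_to_urn(isbn):
    """Validate a 10-character ISBN and return it as a URN (hypothetical helper)."""
    chars = [c for c in isbn.strip().upper() if c not in "- "]
    if len(chars) != 10:
        raise ValueError("expected 10 characters, got %d" % len(chars))
    total = 0
    for position, c in enumerate(chars):
        if c == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only allowed as the check digit
        elif c in "0123456789":
            value = int(c)
        else:
            raise ValueError("unexpected character: %r" % c)
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    if total % 11 != 0:
        raise ValueError("check digit mismatch")
    # Assumption: hyphens are kept, as in this document's example.
    return "urn:isbn:" + isbn.strip()

print(isbn10_to_urn("1-56592-149-6"))
```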

4.4 ISSN

The International Standard Serial Number (ISSN) is an eight-digit number which identifies periodical publications, including electronic serials [ISSN]. RFC-3044 [RFC3044] defines the mechanism for encoding an ISSN as a URN.

Recommendation 6. Encode ISSNs as URNs in DC metadata, using the 'issn' URN namespace identifier and indicating a 'URI' encoding scheme in qualified DC applications. For example:


<meta name="DC.Source" scheme="URI" content="urn:issn:1361-3200">
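
The eighth character of an ISSN is a check character, so an ISSN can be verified before it is encoded. The Python sketch below applies the standard modulus-11 test, with weights 8 down to 2 over the first seven digits. The function name issn_to_urn is hypothetical, and the output is normalised to the hyphenated NNNN-NNNN form used in the example above.

```python
def issn_to_urn(issn):
    """Validate an ISSN and return it as a URN (hypothetical helper)."""
    chars = issn.strip().replace("-", "").upper()
    if len(chars) != 8:
        raise ValueError("expected 8 characters, got %d" % len(chars))
    total = 0
    for position, c in enumerate(chars[:7]):
        total += (8 - position) * int(c)  # int() raises ValueError on non-digits
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)  # 'X' stands for 10
    if chars[7] != expected:
        raise ValueError("check character mismatch")
    return "urn:issn:" + chars[:4] + "-" + chars[4:]

print(issn_to_urn("1361-3200"))
```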

4.5 PURL

A PURL is a Persistent Uniform Resource Locator [PURL]. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service.

Recommendation 7. Encode PURLs as URLs in DC metadata, indicating a 'URI' encoding scheme in qualified DC applications. For example:


<meta name="DC.Relation" scheme="URI" content="http://purl.org/net/ukoln">

4.6 SICI

The Serial Item and Contribution Identifier (SICI) standard defines a variable length code that provides unique identification of serial items (e.g., issues) and the contributions (e.g., articles) contained in a serial title. SICI is specified in NISO/ANSI Z39.56-1996. [SICI-NID] defines the mechanism for encoding an SICI as a URN.

Note: at the time of writing, the SICI URN namespace identifier had not been formally approved by IANA.

Recommendation 8. Encode SICIs as URNs in DC metadata, using the 'sici' URN namespace identifier and indicating a 'URI' encoding scheme in qualified DC applications. For example:


<meta name="DC.Source" scheme="URI" content="urn:sici:1046-8188(199501)13:1%3C69:FTTHBI%3E2.0.TX;2-4">
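
Note that the '<' and '>' characters of the SICI appear percent-encoded (%3C and %3E) in the example above. The Python sketch below reproduces that encoding. The function name sici_to_urn is hypothetical, and the set of characters left unencoded is an assumption chosen to match the example; [SICI-NID] should be consulted for the authoritative rules.

```python
from urllib.parse import quote

def sici_to_urn(sici):
    """Percent-encode a SICI and return it as a URN (hypothetical helper)."""
    # Assumption: parentheses, colons and semicolons are left as-is, matching
    # the example in this document; letters, digits, '-', '.', '_' and '~'
    # are never encoded by quote(). Everything else, including '<' and '>',
    # is percent-encoded.
    return "urn:sici:" + quote(sici, safe="():;")

print(sici_to_urn("1046-8188(199501)13:1<69:FTTHBI>2.0.TX;2-4"))
```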

References

[DCMI] Dublin Core Metadata Initiative
http://dublincore.org/

[DCMES] Dublin Core Metadata Element Set, Version 1.1: Reference Description
http://dublincore.org/documents/dces/

[DCQ] Dublin Core Qualifiers
http://dublincore.org/documents/dcmes-qualifiers/

[URI] Uniform Resource Identifiers (URI): Generic Syntax
http://www.ietf.org/rfc/rfc2396.txt

[URI-CLARIFICATION] URIs, URLs, and URNs: Clarifications and Recommendations 1.0
http://www.w3.org/TR/uri-clarification/

[URI-SCHEMES] Uniform Resource Identifier (URI) SCHEMES
http://www.iana.org/assignments/uri-schemes

[URN-NIDS] URN Namespaces
http://www.iana.org/assignments/urn-namespaces

[DOI] The Digital Object Identifier System
http://www.doi.org/

[HANDLE] The Handle System
http://www.handle.net/

[RFC3187] RFC-3187: Using International Standard Book Numbers as Uniform Resource Names
http://www.ietf.org/rfc/rfc3187.txt

[ISSN] International Standard Serial Number
http://www.issn.org/

[RFC3044] RFC-3044: Using The ISSN (International Serial Standard Number) as URN (Uniform Resource Names) within an ISSN-URN Namespace
http://www.ietf.org/rfc/rfc3044.txt

[PURL] Persistent Uniform Resource Locator
http://purl.org/

[SICI-NID] Using Serial Item and Contribution Identifiers as Uniform Resource Names
http://www.ietf.org/internet-drafts/draft-hakala-sici-00.txt


Web page by: Andy Powell
Last updated: 07-Jan-2002