Guidelines for assigning identifiers to metadata terms
UKOLN, University of Bath, UK
Is Replaced By:
Status of Document: This is a DRAFT DCMI Recommended Resource.
Description of Document: This document provides some simple guidelines for assigning identifiers to non-DCMI metadata terms (elements, element refinements, encoding schemes and vocabulary terms).
The DCMI Abstract Model [DCMI-AM] requires that every term (element, element refinement, encoding scheme or controlled vocabulary term) used in a metadata application profile compliant with the model be assigned a URI [RFC3986] that identifies the term. An XML namespace [XML-NAMES] is a collection of names, identified by a URI, that are used in XML documents as element types and attribute names. By convention, all DCMI recommended encodings [DCMI-ENCODINGS] use a concatenation of an XML namespace URI and the term name to provide a mechanism for encoding the term URI. The use of XML namespaces and URIs to uniquely identify metadata terms allows those terms to be used unambiguously across applications, promoting the possibility of shared semantics. As indicated in the DCMI Namespace Policy [DCMI-NAMESPACE], DCMI has adopted this mechanism for the identification of all DCMI terms.
This document provides some simple guidelines for assigning URIs to metadata terms in non-DCMI namespaces. This includes non-DCMI elements, element refinements, encoding schemes and controlled vocabulary terms.
Although these guidelines are mainly intended for metadata application profiles that conform to the DCMI Abstract Model, it is hoped that they are generic enough to be useful in the context of other metadata applications as well.
All metadata terms must be assigned a URI. The use of fragment identifiers in the URI used to identify metadata terms is optional and is left to the discretion of the implementor.
For the purposes of encoding, the term URI may be partitioned into an XML namespace URI and the term name. Note that, for convenience, it is commonly the case that XML namespace URIs end with either a '#' (hash) or '/' (slash) character.
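As an illustration of this partitioning, the following Python sketch splits a term URI at its last '#' or '/' character. The function name split_term_uri is hypothetical, and the rule "split at the last hash or slash" is an assumption based on the convention noted above, not a requirement of any DCMI specification.

```python
def split_term_uri(term_uri):
    """Split a term URI into (XML namespace URI, term name).

    Assumption: the namespace URI ends at the last '#' or '/' in the term URI.
    """
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    if cut == -1 or cut == len(term_uri) - 1:
        raise ValueError("cannot find a term name in %r" % term_uri)
    return term_uri[: cut + 1], term_uri[cut + 1 :]


# A DCMI term URI, split into its namespace URI and term name.
namespace_uri, term_name = split_term_uri("http://purl.org/dc/elements/1.1/title")
```

Concatenating the two parts again gives back the original term URI, which is the property the DCMI recommended encodings rely on.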
Groups of related terms (for example, all the terms within a controlled vocabulary) should be assigned URIs within the same XML namespace.
All XML namespace and term URIs should resolve to human and/or machine-readable descriptions of the namespace or term.
Any valid URI [RFC3986] may be used to identify a metadata term. However, the use of a registered URI scheme is recommended [URI-SCHEMES].
All XML namespace and term URIs should be assigned with the intention that they be unique and persistent. This means that a URI must not be used to identify anything else and that it should be expected to last as long as the Internet.
Four simple strategies for assigning URIs to metadata terms are described below.
Where a term is created within the context of a particular project, service or other initiative, the use of a project or service-specific URL may be appropriate. This is probably the simplest strategy in terms of ease of assignment and resolution. However, it is also the most prone to lack of persistence.
Notice that example 1 defines a metadata property while example 2 defines a term within a controlled vocabulary. Remember that in example 2 it will probably also be necessary to define an encoding scheme name for the vocabulary itself, for example http://myproject.org/metadata/terms/Color.
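A minimal Python sketch of this strategy follows. It assumes a project namespace of http://myproject.org/metadata/terms/, inferred from the encoding scheme URI given above. The term names paintColor and DarkRed and the helper term_uri are hypothetical, introduced only for illustration.

```python
# Assumed project namespace, inferred from the Color encoding scheme URI in the text.
PROJECT_NS = "http://myproject.org/metadata/terms/"


def term_uri(namespace_uri, term_name):
    """Concatenate an XML namespace URI and a term name into a term URI.

    This sketch enforces the common '#' or '/' ending as a design choice;
    the guidelines describe it as a convention, not a requirement.
    """
    if not namespace_uri.endswith(("#", "/")):
        raise ValueError("namespace URI should end with '#' or '/'")
    return namespace_uri + term_name


property_uri = term_uri(PROJECT_NS, "paintColor")  # hypothetical metadata property
scheme_uri = term_uri(PROJECT_NS, "Color")         # encoding scheme named in the text
value_uri = term_uri(PROJECT_NS, "DarkRed")        # hypothetical vocabulary term
```

Keeping all three URIs under one namespace follows the guideline that groups of related terms should share an XML namespace.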
A similar approach, but one that is likely to offer more persistent URIs, is to use PURLs [PURL]. A PURL is a Persistent Uniform Resource Locator. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. This provides a level of resilience against changes in project or service URLs. The use of PURLs to identify metadata terms has already been adopted by a number of metadata-related initiatives such as DCMI itself and RDF Site Summary (RSS) 1.0 [RSS10].
Note that in example 1, the RSS implementors have chosen to embed a version number into the XML namespace URI. This allows them to use the same term name within a new XML namespace in future versions of the application profile. This has advantages in some scenarios. However, implementors should be cautious when using this technique because it may result in URIs being assigned to new terms that have the same semantics as existing terms.
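The risk can be sketched in a few lines of Python. RSS_10_NS is the RSS 1.0 namespace URI. RSS_NEXT_NS is a hypothetical later version invented for this illustration, and the helper term_uris is likewise an assumption.

```python
RSS_10_NS = "http://purl.org/rss/1.0/"    # RSS 1.0 namespace URI
RSS_NEXT_NS = "http://purl.org/rss/1.1/"  # hypothetical later version, illustration only


def term_uris(term_name, namespaces):
    """Return the term URI formed in each namespace for the same term name."""
    return [ns + term_name for ns in namespaces]


uris = term_uris("title", [RSS_10_NS, RSS_NEXT_NS])

# The same term name yields a different URI in each versioned namespace, so
# software that compares URIs will treat the two as different terms.
all_distinct = len(set(uris)) == len(uris)
```

If the semantics of the term have not changed between versions, the two URIs identify what is effectively the same term, and applications have no way to know that without additional information.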
The "info" URI scheme provides a "mechanism for assigning URIs to information assets that have identifiers in public namespaces" but that do not have an appropriate existing URI scheme [INFO-URI-SPEC] [INFO-REGISTRY]. The phrase 'information assets' includes all the metadata terms discussed here. Thus, it is appropriate to consider assigning "info" URIs to metadata terms.
Note that, somewhat confusingly, the draft "info" URI specification uses different terminology from that used here. In the terminology of the specification, ddc is the "info URI namespace component" and 22/eng//004.678 is the "info URI identifier component".
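To make the two components concrete, the Python sketch below splits an "info" URI at the first '/' after the scheme. The URI info:ddc/22/eng//004.678 is reconstructed here from the two components named above, and the function name split_info_uri is hypothetical. The sketch performs only this split and does not attempt full validation against the "info" URI specification.

```python
def split_info_uri(uri):
    """Split an "info" URI into (namespace component, identifier component)."""
    scheme, colon, rest = uri.partition(":")
    if colon != ":" or scheme.lower() != "info":
        raise ValueError("not an info URI: %r" % uri)
    # The namespace component runs up to the first '/'; the rest is the identifier.
    namespace, slash, identifier = rest.partition("/")
    if not namespace or slash != "/" or not identifier:
        raise ValueError("missing namespace or identifier in %r" % uri)
    return namespace, identifier


namespace, identifier = split_info_uri("info:ddc/22/eng//004.678")
```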
Note also that "info" URIs cannot be resolved using current Web browsers (i.e. by using a simple HTTP GET request). Indeed, "info" URIs are designed to be non-dereferenceable, i.e. it is not possible to dereference an "info" URI in order to retrieve a representation of the identified resource. Unfortunately, this has serious consequences for their utility in identifying metadata terms. Since it is not possible to easily obtain a representation of the identified term (typically some metadata about the term), it is not possible to obtain any information about the relationships between the identified term and other terms. This means that the "info" URI is of limited use in the context of the Semantic Web, since software applications cannot reason automatically based on knowledge about the relationships between multiple metadata terms.
At the time of writing, "info" was not a registered URI scheme.
xmlns.com provides a network space for simple Web namespace management. "The rationale for registering xmlns.com was to secure a short, memorable domain suitable for naming concepts for use in RDF and XML vocabularies" [XMLNS]. The FOAF vocabulary [FOAF] uses xmlns.com to provide an XML namespace URI for its terms.
Note that, at the time of writing, the status and ownership of the xmlns.com domain were unclear, and it is therefore not possible to be sure of the long-term persistence of URIs based on this domain.
All terms used in metadata application profiles must be assigned a URI before they can be used in the encoding syntaxes recommended by DCMI. It is recommended that implementors assign URIs to terms following the guidelines provided here. Of the four strategies for assigning URIs to terms listed in this document, the use of PURLs is recommended for the identification of all metadata terms.
[DCMI-AM] DCMI Abstract Model
[XML-NAMES] Namespaces in XML, W3C Recommendation, 14 January 1999
[RFC3986] IETF (Internet Engineering Task Force) RFC 3986: Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. January 2005.
[DCMI-ENCODINGS] DCMI Encoding Guidelines
[DCMI-NAMESPACE] Namespace Policy for the Dublin Core Metadata Initiative (DCMI), 26 October 2001
[URI-SCHEMES] Uniform Resource Identifier (URI) SCHEMES
[RSS10] RDF Site Summary 1.0
[INFO-URI-SPEC] The "info" URI Scheme for Information Assets with Identifiers in Public Namespaces, 9 July 2004
[DDC] Dewey Decimal Classification
[INFO-REGISTRY] "info" URI registry
[FOAF] FOAF Vocabulary Specification