Recommendations for XML Schema for Qualified Dublin Core

Proposal to DC Architecture Working Group

Ad-hoc Committee: Timothy Cole (UIUC), Thomas Habing (UIUC), Diane Hillmann (Cornell), Jane Hunter (DSTC), Pete Johnston (UKOLN), Carl Lagoze (Cornell), Andy Powell (UKOLN)

Document Date: 2003-02-28

Introduction

This document is a follow-on to three efforts:

  1. Publication of the DCMI Usage Board's consolidated description of all current DCMI terms at http://dublincore.org/usage/terms/dc/current-elements/ and http://dublincore.org/usage/terms/dc/current-schemes/.

  2. Publication of the DCMI proposed recommendation "Guidelines for Implementing Dublin Core in XML" at http://dublincore.org/documents/dc-xml-guidelines/.

  3. Joint work between the Open Archives Initiative and the DCMI to define an XML schema for unqualified Dublin Core, available at http://www.openarchives.org/OAI/2.0/oai_dc.xsd.  This work was motivated by the requirements of the base metadata format for the OAI Protocol for Metadata Harvesting, but is useful for other applications that exchange unqualified Dublin Core records. 

The schemas presented in this document conform to the W3C XML Schema (1.0) recommendations. They are suggested rather than prescribed and may co-exist with other schemas for exchanging Dublin Core metadata.  XML schemas are interoperability vehicles: the more applications that agree on a single schema, the easier it is to share Dublin Core metadata.  Therefore, while the committee that formulated this proposal hopes that the proposed schemas will be useful to a breadth of applications, we recognize that some applications may require different functionality, provided by different schemas.

While the schemas presented here are only suggested, the functionality they support is congruent with the qualification model in the Dublin Core Qualifiers document.  Therefore, applications that employ other schemas to express additional functionality should recognize that doing so compromises interoperability with applications that use these schemas.

Features and Motivation of the Schema

The proposed set of schemas meets the following requirements:

The Proposed Schemas and their Use

Base schemas

These three schemas declare XML elements to represent the Dublin Core elements and element refinements and a number of complexTypes to represent encoding schemes:

  1. dc.xsd

  2. dcterms.xsd

  3. dcmitype.xsd

Container schemas

These schemas declare XML elements to act as containers for specified subsets of the Dublin Core elements and element refinements declared in the base schemas:

  1. simpledc.xsd

  2. qualifieddc.xsd

Sample application schemas

These schemas provide examples of how a container schema might be used in an application:

  1. appsimpledc.xsd

  2. appqualifieddc.xsd

Schema: dc.xsd
Target XML Namespace: http://purl.org/dc/elements/1.1/

The schema dc.xsd defines a complexType called SimpleLiteral:

  <xs:complexType name="SimpleLiteral">
   <xs:complexContent mixed="true">
    <xs:restriction base="xs:anyType">
     <xs:sequence>
      <xs:any processContents="lax" minOccurs="0" maxOccurs="0"/>
     </xs:sequence>
     <xs:attribute ref="xml:lang" use="optional"/>
    </xs:restriction>
   </xs:complexContent>
  </xs:complexType>
    

The SimpleLiteral complexType is defined in terms of mixed complexContent. However, the cardinality attributes on the xs:any element dictate that this complexType does not permit child elements.
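
The effect of this constraint can be illustrated with an instance fragment. The following is a minimal sketch, assuming the simpledc container element described later in this document; the element values and the em child element are invented for illustration.

  <simpledc xmlns:dc="http://purl.org/dc/elements/1.1/">
   <!-- Accepted: character content only -->
   <dc:title>An example resource</dc:title>
   <!-- Accepted: the optional xml:lang attribute is present -->
   <dc:title xml:lang="en">An example resource</dc:title>
   <!-- Rejected: the xs:any particle has maxOccurs="0",
        so no child element may appear -->
   <dc:description>An <em>emphasised</em> word</dc:description>
  </simpledc>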

The fifteen Dublin Core elements in this namespace are represented as XML elements. The schema declares an abstract element any with a type of SimpleLiteral. Because it is declared as abstract, this element cannot be used in an instance document. Each XML element representing a Dublin Core element is declared as a non-abstract element that is substitutable for the any element, for example:

  <xs:element name="title" substitutionGroup="any"/>
  

Finally, the schema defines a group elementsGroup and a complexType elementContainer. With the dc:any element, these two constructs provide mechanisms by which external schemas can reference the set of elements declared in this schema without referencing each element individually - though it is still possible for an external schema to reference individual elements if desired.
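
One way such a group and container type could be declared is sketched below. This is an illustration of the technique, not the text of dc.xsd: the abstract any element is typed as xs:string here purely to keep the sketch self-contained (the schema itself uses SimpleLiteral), and only two of the fifteen elements are shown.

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns="http://purl.org/dc/elements/1.1/"
             targetNamespace="http://purl.org/dc/elements/1.1/"
             elementFormDefault="qualified">

   <!-- Abstract head of the substitution group (simplified type) -->
   <xs:element name="any" type="xs:string" abstract="true"/>
   <xs:element name="title" substitutionGroup="any"/>
   <xs:element name="creator" substitutionGroup="any"/>

   <!-- Any number of members of the substitution group, in any order -->
   <xs:group name="elementsGroup">
    <xs:sequence>
     <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="any"/>
     </xs:choice>
    </xs:sequence>
   </xs:group>

   <!-- A type that external schemas can assign to a container element -->
   <xs:complexType name="elementContainer">
    <xs:choice>
     <xs:group ref="elementsGroup"/>
    </xs:choice>
   </xs:complexType>

  </xs:schema>

Because the group refers only to the abstract head element, any element later declared as substitutable for any becomes available in the container without changing the group.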

For example, a schema can simply import the dc.xsd schema and use the elementContainer complexType as the type of an element, and this would make the DC elements available as child elements.

   <xs:import namespace="http://purl.org/dc/elements/1.1/"
                 schemaLocation="dc.xsd"/>

   <xs:element name="simpledc" type="dc:elementContainer"/>
  

An example of such a schema is provided as simpledc.xsd.

The simpledc.xsd schema does not use a targetNamespace. It is possible to validate an instance directly against this schema. Where an application wishes to specify a namespace for the container element, it can be assigned when this schema is included in an application schema.
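
The mechanism relied on here is that a schema without a targetNamespace takes on the targetNamespace of the schema that includes it. A minimal sketch of such an including schema follows; the namespace URI http://example.org/myapp/ is hypothetical, and the sketch is not the text of any schema distributed with this proposal.

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns="http://example.org/myapp/"
             targetNamespace="http://example.org/myapp/"
             elementFormDefault="qualified">

   <!-- simpledc.xsd declares no targetNamespace, so its simpledc
        element is placed in this schema's targetNamespace -->
   <xs:include schemaLocation="simpledc.xsd"/>

  </xs:schema>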

An example of such an application schema is provided as appsimpledc.xsd.

An example of an instance document which validates against that application schema is provided as testsimpledc.xml.

An example of an instance document which fails to validate against that application schema is provided as testsimpledc2.xml. (dcterms:modified not permitted.)
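
For orientation, an instance document of the valid kind might look like the following sketch. It assumes an application schema that places the simpledc container in the hypothetical namespace http://example.org/myapp/; the element values are invented and are not taken from testsimpledc.xml.

  <app:simpledc xmlns:app="http://example.org/myapp/"
                xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title xml:lang="en">An example resource</dc:title>
   <dc:creator>Example, Author</dc:creator>
   <dc:date>2003-02-28</dc:date>
   <!-- Adding an element from http://purl.org/dc/terms/, such as
        dcterms:modified, would make this document invalid, because
        the container admits only elements declared in dc.xsd -->
  </app:simpledc>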

Schema: dcterms.xsd
Target XML Namespace: http://purl.org/dc/terms/

The schema dcterms.xsd imports the schema dc.xsd. The Dublin Core elements and element refinements in this namespace are all represented as XML elements, and importing the dc.xsd schema makes the any abstract element and the SimpleLiteral complexType available for use. Importing the dc.xsd schema also enables the indication of relationships between DC element refinements and the elements that they refine, using substitutionGroups.

An XML element which represents a DC element in this namespace is declared as substitutable for the any abstract element:

   <xs:element name="audience" substitutionGroup="dc:any"/>
  

And an XML element which represents a DC element refinement is declared as substitutable for the element it refines:

   <xs:element name="alternative" substitutionGroup="dc:title"/>
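
Substitution group membership is transitive, so an element that substitutes for dc:title can also appear wherever the abstract dc:any element is referenced. The following instance fragment is a sketch of the result, assuming the qualifieddc container element described later in this section; the element values are invented.

  <qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:dcterms="http://purl.org/dc/terms/">
   <dc:title>An example resource</dc:title>
   <!-- Element refinement: substitutable for dc:title -->
   <dcterms:alternative>A sample resource</dcterms:alternative>
   <!-- Element in the dcterms namespace: substitutable for dc:any -->
   <dcterms:audience>Metadata implementers</dcterms:audience>
  </qualifieddc>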
  

Encoding schemes are mechanisms for constraining the "value spaces" of DC elements and element refinements. In this schema, they are represented as named complexTypes derived from the SimpleLiteral complexType. For example, the complexType corresponding to the encoding scheme for "W3CDTF" is as follows:

  <xs:complexType name="W3CDTF">
   <xs:simpleContent>
    <xs:restriction base="dc:SimpleLiteral">
        <xs:simpleType>
           <xs:union memberTypes="xs:gYear xs:gYearMonth xs:date xs:dateTime"/>
        </xs:simpleType>
        <xs:attribute ref="xml:lang" use="prohibited"/>
    </xs:restriction>
   </xs:simpleContent>
  </xs:complexType>
  

N.B. Some schema-validating XML parsers may not support this construct. See notes.

The use of one of these complexTypes is specified by the use of the xsi:type attribute in the instance document. The value of the xsi:type attribute is a QName corresponding to the name of the complexType:

   <dc:date xsi:type="dcterms:W3CDTF">2002-07-09</dc:date>
  

Use of this datatype means that a validating parser will check that the element content conforms to one of the built-in date/time types.
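
The following instance fragment sketches which values the union would accept. It assumes the qualifieddc container element described later in this section and a parser that supports the W3CDTF construct shown above; the date values are illustrative.

  <qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:dcterms="http://purl.org/dc/terms/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <!-- Accepted as xs:gYear -->
   <dc:date xsi:type="dcterms:W3CDTF">2002</dc:date>
   <!-- Accepted as xs:gYearMonth -->
   <dc:date xsi:type="dcterms:W3CDTF">2002-07</dc:date>
   <!-- Accepted as xs:date -->
   <dc:date xsi:type="dcterms:W3CDTF">2002-07-09</dc:date>
   <!-- Accepted as xs:dateTime -->
   <dc:date xsi:type="dcterms:W3CDTF">2002-07-09T14:30:00Z</dc:date>
   <!-- Rejected: slashes match none of the member types -->
   <dc:date xsi:type="dcterms:W3CDTF">1963/08/17</dc:date>
  </qualifieddc>

Note that xs:dateTime requires a seconds field, so a value given only to the minute, such as 2002-07-09T14:30Z, would be rejected by this type even though the W3CDTF profile permits that form.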

Not all of the complexTypes associated with encoding schemes impose such "tight" validation. For example, the complexType for "LCSH" prescribes only that the element content is a character string:

  <xs:complexType name="LCSH">
   <xs:simpleContent>
    <xs:restriction base="dc:SimpleLiteral">
        <xs:simpleType>
          <xs:restriction base="xs:string"/>
        </xs:simpleType>
        <xs:attribute ref="xml:lang" use="prohibited"/>
    </xs:restriction>
   </xs:simpleContent>
  </xs:complexType>
  

In theory at least, it is possible to define a complexType that enumerates every Library of Congress Subject Heading, but it would be impractical to validate against such a list. However, the principle of validating against an enumerated list of values is illustrated in the schema dcmitype.xsd for the DCMI Type Vocabulary (see next section).

An example schema which takes this approach for ISO639-2 language codes is available at http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_schemas/iso639-2.xsd.

Like the dc.xsd schema, the dcterms.xsd schema defines a group, elementsAndRefinementsGroup, as a means of referring to all the elements and element refinements. A complexType elementOrRefinementContainer is also defined.

A schema can simply import the dcterms.xsd schema and use the elementOrRefinementContainer complexType as the type of an element, and this would make the DC elements and element refinements available as child elements.

   <xs:import namespace="http://purl.org/dc/terms/"
                 schemaLocation="dcterms.xsd"/>

   <xs:element name="qualifieddc" type="dcterms:elementOrRefinementContainer"/>
  

An example of such a schema is provided as qualifieddc.xsd.

Like the simpledc.xsd schema, the qualifieddc.xsd schema does not use a targetNamespace. An implementation may validate directly against this schema or it may specify a namespace for the container element by including this schema in an application schema.

An example of such an application schema is provided as appqualifieddc.xsd.

An example of an instance document which validates against that application schema is provided as testqualifieddc.xml.

An example of an instance document which fails to validate against that application schema is provided as testqualifieddc2.xml. ('1963/08/17' is not a valid W3CDTF date.)

Schema: dcmitype.xsd
Target XML Namespace: http://purl.org/dc/dcmitype/

The dcmitype.xsd schema includes only a named simpleType that defines an enumerated list of values for the DCMI Type Vocabulary.

This simpleType is referenced in a complexType in the dcterms.xsd schema.
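
An enumerated simpleType of this kind could be sketched as follows. The type name DCMIType and the base type xs:Name are assumptions made for illustration, and only four terms of the vocabulary are listed; this is not the text of dcmitype.xsd.

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             xmlns="http://purl.org/dc/dcmitype/"
             targetNamespace="http://purl.org/dc/dcmitype/"
             elementFormDefault="qualified">

   <!-- Hypothetical type name; values restricted to a closed list -->
   <xs:simpleType name="DCMIType">
    <xs:restriction base="xs:Name">
     <xs:enumeration value="Collection"/>
     <xs:enumeration value="Dataset"/>
     <xs:enumeration value="Image"/>
     <xs:enumeration value="Text"/>
     <!-- remaining DCMI Type Vocabulary terms omitted -->
    </xs:restriction>
   </xs:simpleType>

  </xs:schema>

A complexType in dcterms.xsd could then wrap this simpleType in the same way that the W3CDTF complexType wraps its union of date types, so that a validating parser rejects any dc:type value outside the list when that complexType is selected with xsi:type.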