Title:

DC-TEXT: A Text Syntax for Dublin Core Metadata

Creator:
Pete Johnston
Date Issued:
2006-03-14
Identifier:
http://www.ukoln.ac.uk/metadata/dcmi/dc-text/2006-03-14/
Replaces:
http://www.ukoln.ac.uk/metadata/dcmi/dc-text/2006-01-13/
Is Replaced By:
Not applicable
Latest Version:
http://www.ukoln.ac.uk/metadata/dcmi/dc-text/
Description of Document: This document specifies a Text Syntax for representing Dublin Core metadata description sets.

Contents

  1. Introduction
  2. The DCMI Abstract Model (Summary)
  3. The Syntax
  4. Examples
  5. Appendix A. Grammar
  6. Notes
  7. References

1. Introduction

The DCMI Abstract Model [DCAM] describes the components which make up DC metadata description sets and the relationships between them. This document specifies a syntax for serialising, or representing, a DC metadata description set in plain text. The format is referred to as "DC-TEXT". A plain text format for serialising such description sets is useful as a means of presenting examples in a way which highlights the constructs of the Abstract Model, and also as a means of comparing the information represented in other formats such as DC-XML, RDF/XML and XHTML/HTML.

2. The DCMI Abstract Model (Summary)

According to the DCMI Abstract Model [DCAM]:

3. The Syntax

A formal description of the DC-TEXT syntax is presented in Appendix A. This section presents an overview of the syntax and a set of examples illustrating how the various constructs of the Abstract Model are represented.

3.1 The Structure of a DC-TEXT Document

The general structure of a DC-TEXT document is as follows:

namespace declaration

label (
      label ( content )
      label (
             label ( [...] )
      [ ... ]
            )
      )

Each of the primary components of a DC metadata description set defined by the DCMI Abstract Model is represented in DC-TEXT by a syntactic structure of the form:

label ( content )

where label is replaced by one of the following strings:

DescriptionSet, Description, DescriptionId, ResourceURI, Statement, PropertyURI, DescriptionRef,
VocabularyEncodingSchemeURI, ValueURI, ValueString, Language, SyntaxEncodingSchemeURI,
RichRepresentation, Base64, MIME

and content is either:

  • a sequence of one or more syntactic structures of the form label( content ) (i.e. these structures are "nested"); or
  • a string of the form "literal", which represents that Unicode literal; or
  • a string of the form <uri>, which represents a URI; or
  • a string of the form prefix:name, which represents a "qualified name" used as an abbreviation for a URI; or
  • a string which represents a language tag; or
  • a string which is a locally-scoped identifier used to establish relationships between values and their descriptions.

For each label value in the list above, the permitted form of content is determined by the syntax rules specified in Appendix A. These are explained through examples below.

The DC-TEXT syntax supports the representation of a single DC description set, so a DC-TEXT document consists of zero or more namespace declarations followed by a single label ( content ) syntactic structure with a label of DescriptionSet and, as content, one or more nested label ( content ) structures with a label of Description. That is, a DC-TEXT document has the following outline form:

@prefix prefix: <uri> .

DescriptionSet (
     Description (
          Statement ( ... )
          Statement ( ... )
          )
     Description (
          Statement ( ... )
          Statement ( ... )
          )
     )
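
The nesting described above can be read with a small recursive parser. The following Python sketch is illustrative only and is not part of the DC-TEXT specification: the function name parse and the tuple representation are assumptions, namespace declarations and comments are assumed to have been removed beforehand, and string escapes are handled only in a simplified way.

```python
import re

# Tokens of the nested "label ( content )" structure. Whitespace between
# tokens is skipped implicitly by finditer.
TOKEN_RE = re.compile(r'''
    (?P<string>"(?:[^"\\]|\\.)*")   # "literal"
  | (?P<uri><[^>]*>)                # <uri>
  | (?P<open>\()
  | (?P<close>\))
  | (?P<word>[^\s()"<>]+)           # label, prefix:name, language tag or id
''', re.VERBOSE)


def parse(text):
    """Parse one 'label ( content )' structure into (label, [content...])."""
    tokens = [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]
    pos = 0

    def structure():
        nonlocal pos
        kind, label = tokens[pos]
        if kind != "word" or tokens[pos + 1][0] != "open":
            raise ValueError(f"expected 'label (' at token {pos}")
        pos += 2
        content = []
        while tokens[pos][0] != "close":
            if tokens[pos][0] == "word" and tokens[pos + 1][0] == "open":
                content.append(structure())      # nested structure
            else:
                content.append(tokens[pos][1])   # literal, URI, name or tag
                pos += 1
        pos += 1                                 # consume ')'
        return (label, content)

    return structure()
```

Unbalanced parentheses surface as an IndexError in this sketch; a real parser would report them more helpfully and would follow the grammar in Appendix A.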

3.2 URIs, Qualified Names, and Namespace Declarations

The DCMI Abstract Model uses URIs to refer to resources and to metadata terms (properties, vocabulary encoding schemes and syntax encoding schemes). In the DC-TEXT syntax, URIs may be written in full or may be represented as qualified names. A qualified name is made up of two parts, a prefix and a name, separated by a colon (":"). In DC-TEXT, wherever a qualified name is used, it is used to represent a URI. The URI represented by the qualified name is determined by appending the name part of the qualified name to the URI with which the prefix is associated in a namespace declaration (sometimes called the namespace URI).

Namespace declarations occur at the start of a DC-TEXT document, and have the following form:

@prefix prefix: <uri> .

For example, the following declarations associate the prefix dc with the URI http://purl.org/dc/elements/1.1/ and the prefix ex with the URI http://example.org/resources/:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/resources/> .

Note that the limitations on the characters which can occur in the name part of a qualified name mean that there are URIs that can not be expressed as qualified names. For example the URIs http://example.org/resources/12345 and http://example.org/resources#12345 can not be represented as qualified names, because the name part can not include the "/" or "#" characters, and can not begin with a numeric character.
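
The expansion rule, and the limitation just described, can be sketched in Python. This is a minimal illustration, not part of the specification: the names PREFIXES, expand and abbreviate are hypothetical, and NAME_RE is an ASCII-only approximation of the name production in Appendix A.

```python
import re

# Hypothetical prefix table mirroring the namespace declarations above.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/resources/",
}

# ASCII-only approximation of the `name` production (assumption: the
# non-ASCII ranges allowed by the grammar are ignored here).
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def expand(qname, prefixes=PREFIXES):
    """Expand prefix:name by appending the name to the namespace URI."""
    prefix, _, name = qname.partition(":")
    return prefixes[prefix] + name


def abbreviate(uri, prefixes=PREFIXES):
    """Return a qualified name for uri, or None if none can be formed."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            name = uri[len(namespace):]
            if NAME_RE.fullmatch(name):
                return f"{prefix}:{name}"
    return None
```

With these declarations, expand("dc:title") gives http://purl.org/dc/elements/1.1/title, while abbreviate returns None for both http://example.org/resources/12345 and http://example.org/resources#12345, so those URIs would have to be written in full.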

3.3 Comments

Comments can be inserted anywhere in a DC-TEXT document. A comment starts with a "#" and ends with a newline.

# A comment at the start of the document
@prefix prefix: <uri> .

DescriptionSet (
     Description (
# A comment at the start of a description
          Statement ( ... ) # A comment following a statement
          Statement ( ... )
          )
     Description (
          Statement ( ... )
          Statement ( ... )
          )
     )
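
Because "#" also occurs inside URIs and may occur inside literals, a comment cannot be removed simply by cutting each line at the first "#". The following Python sketch shows one possible approach; the function name strip_comments is hypothetical, and the handling of backslash escapes is an assumption, since string escapes are not yet specified in this document.

```python
def strip_comments(text):
    """Remove '#' comments, leaving '#' inside <uri> or "literal" untouched."""
    out = []
    closer = None   # character that ends the current <uri> or "literal"
    i = 0
    while i < len(text):
        ch = text[i]
        if closer is not None:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                out.append(text[i + 1])   # keep an escaped character verbatim
                i += 1
            elif ch == closer:
                closer = None
        elif ch == "#":
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] not in "\r\n":
                i += 1
            continue
        else:
            out.append(ch)
            if ch == '"':
                closer = '"'
            elif ch == "<":
                closer = ">"
        i += 1
    return "".join(out)
```

The sketch does not treat triple-quoted long strings specially, so a long string containing an unescaped single quotation mark would need extra handling.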

3.4 String Escapes

To follow.

4. Examples

This section provides examples of how the DC-TEXT syntax represents all the constructs of the DCMI Abstract Model.

4.1 Statements Using Value Strings and Value URIs

The first example shows a description set containing a single description with a single statement, which consists of a property URI and a value string representing the value:

DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/elements/1.1/title> )
      ValueString ( "DCMI Home Page" )
      )
    )
  )

Example 1: Value Strings

The second example introduces a resource URI which identifies the subject of the description, using the ResourceURI ( <uri> ) syntactic structure:

DescriptionSet (
  Description(
    ResourceURI( <http://dublincore.org/pages/home> )
    Statement (
      PropertyURI ( <http://purl.org/dc/elements/1.1/title> )
      ValueString ( "DCMI Home Page" )
      )
    )
  )

Example 2: Resource URI

By introducing namespace declarations, the qualified name mechanism can be used to abbreviate both the resource URI and the property URI. The same description set as in the previous example might be encoded as follows.

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
      )
    )
  )

Example 3: Qualified Names

The value string may be associated with a language tag, represented using the Language( tag ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
        )
      )
    )
  )

Example 4: Language Tags

A single statement may include multiple value strings to represent the value. In DC-TEXT this is represented by repeating the ValueString ( "literal" ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
        )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
        )
      )
    )
  )

Example 5: Multiple Value Strings

A statement may include a value URI to identify the value, using the ValueURI ( <uri> ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
        )
      ValueString ( "El Home Page de DCMI"
        Language( es-ES )
        )
      )
    Statement(
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
      )
    )
  )

Example 6: Value URIs
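
Statements like those in Example 6 can also be generated programmatically. The Python sketch below is a hedged illustration rather than a reference serialiser: the (label, [content...]) tuple shape and the function name render are assumptions, and literals are expected to arrive already quoted because string escapes are not yet specified here.

```python
def render(node, depth=0):
    """Render a (label, [content...]) tuple as indented DC-TEXT."""
    label, content = node
    pad = "  " * depth
    atoms = [c for c in content if isinstance(c, str)]       # literals, names
    nested = [c for c in content if not isinstance(c, str)]  # child structures
    head = f"{pad}{label} ( " + " ".join(atoms)
    if not nested:
        return head + " )"
    lines = [head.rstrip()]
    lines += [render(child, depth + 1) for child in nested]
    lines.append(f"{pad}  )")   # closing parenthesis, indented as in the examples
    return "\n".join(lines)


statement = ("Statement", [
    ("PropertyURI", ["dc:title"]),
    ("ValueString", ['"DCMI Home Page"', ("Language", ["en-GB"])]),
    ("ValueString", ['"El Home Page de DCMI"', ("Language", ["es-ES"])]),
])
print(render(statement))
```

Repeating the ValueString tuple within a statement corresponds to repeating the ValueString ( "literal" ) structure in the text form.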

4.2 Vocabulary and Syntax Encoding Scheme URIs

A statement may include a vocabulary encoding scheme URI to specify the type of the value, a class of which the value is an instance. In DC-TEXT this is represented using the VocabularyEncodingSchemeURI ( <uri> ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
        )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
        )
      )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
      )
    Statement (
      PropertyURI ( dc:subject )
      VocabularyEncodingSchemeURI ( dcterms:LCSH )
      ValueString ( "Information technology")
      )
    )
  )

Example 7: Vocabulary Encoding Scheme URIs

A value string may be associated with a syntax encoding scheme URI, using the SyntaxEncodingSchemeURI( <uri> ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
        )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
        )
      )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
      )
    Statement (
      PropertyURI ( dc:subject )
      VocabularyEncodingSchemeURI ( dcterms:LCSH )
      ValueString ( "Information technology" )
      )
    Statement (
      PropertyURI ( dcterms:modified )
      ValueString ( "2006-02-14"
        SyntaxEncodingSchemeURI ( xs:date )
        )
      )
    )
  )

Example 8: Syntax Encoding Scheme URIs

4.3 Multiple Descriptions in Description Set

A description set may contain multiple descriptions, represented by a list of Description( content ) syntactic structures. The order has no significance.

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
      )
    )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
      )
    )
  )

Example 9: Multiple Descriptions

A description may be the description of a resource which is a value in a statement in another description within the description set. If the resource has been assigned a URI, then that URI appears as a value URI in the statement where the resource is the value and as a resource URI in the description of that resource.

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
      )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
      )
    )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
      )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
      )
    )
  Description (
    ResourceURI ( agent:DCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
      )
    )
  )
  

Example 10: Multiple Related Descriptions

In some cases the resources may not have URIs assigned, but such a resource may still be a value in a statement and the subject of another description. In DC-TEXT, the association between the value of one statement and the description of that resource is made by labelling the description using a DescriptionId ( id ) syntactic structure. The id value may then be cited using the DescriptionRef ( id ) syntactic structure in one or more statements elsewhere in the same description set. This is a syntactic mechanism for linking references to values to their descriptions: the id itself does not appear in the Abstract Model.

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
      )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
      )
    )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
      )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
      )
    )
  Description (
    DescriptionId ( descDCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
      )
    )
  )
  

Example 11: Multiple Related Descriptions Using DescriptionId and DescriptionRef
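
A consumer of DC-TEXT might want to check that every DescriptionRef cites an id declared by some DescriptionId in the same description set. This document does not state how an unmatched reference should be treated, so the Python sketch below only reports such ids. The function name dangling_refs is hypothetical, and the regular expressions would also match inside comments or literals, which a full parser would avoid.

```python
import re

# Match "DescriptionId ( id )" and "DescriptionRef ( id )", with or
# without whitespace around the parentheses.
ID_RE = re.compile(r"DescriptionId\s*\(\s*([^\s()]+)\s*\)")
REF_RE = re.compile(r"DescriptionRef\s*\(\s*([^\s()]+)\s*\)")


def dangling_refs(dc_text):
    """Return the cited ids that no DescriptionId in the text declares."""
    declared = set(ID_RE.findall(dc_text))
    cited = set(REF_RE.findall(dc_text))
    return sorted(cited - declared)
```

Applied to Example 11, the result would be an empty list, since descDCMI is both declared and cited.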

4.4 Rich Representations

A value may be represented not simply by a value string, but also by a rich representation: an XML fragment or a piece of binary data.

In DC-TEXT, an XML fragment is represented using the RichRepresentation ( "literal" ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:description )
      RichRepresentation ( "<p xmlns=\"http://www.w3.org/1999/xhtml\">This is the DCMI<br />Home Page</p>" )
      )
    )
  )

Example 12: Rich Representations - XML

In DC-TEXT, a binary data object is encoded as a Base64-encoded literal and represented using the Base64 ( "literal" MIME ( "MIME-type" ) ) syntactic structure, nested within a RichRepresentation structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:description )
      RichRepresentation ( 
        Base64 ( "abcdefghij" MIME ( "image/png" ) )
        )
      )
    )
  )

Example 13: Rich Representations - Binary Data
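
As a sketch of how such a structure might be produced, the following Python function Base64-encodes a byte string with the standard library and wraps it in the structure shown in Example 13. The function name base64_rich_representation is hypothetical, and the four sample bytes merely stand in for real image data.

```python
import base64


def base64_rich_representation(data, mime_type):
    """Format bytes as RichRepresentation ( Base64 ( "..." MIME ( "..." ) ) )."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'RichRepresentation ( Base64 ( "{encoded}" MIME ( "{mime_type}" ) ) )'


# The first four bytes of a PNG file, used here only as sample data.
print(base64_rich_representation(b"\x89PNG", "image/png"))
```

The Base64 alphabet contains no quotation marks or backslashes, so the encoded literal needs no escaping.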

Appendix A. Grammar

A DC-TEXT document is a sequence of Unicode characters, encoded in UTF-8, defined by the grammar below. The grammar is specified using the version of Extended BNF used in XML 1.0 (Third Edition) [XML].

DCAM Text - EBNF
[1] dcTextDoc ::= comment* ws* namespaceDeclaration* ws* comment* ws* descriptionSet ws* comment* ws*
[2] namespaceDeclaration ::= '@prefix' ws+ prefixName? ':' ws+ uriref ws* '.'
[3] descriptionSet ::= 'DescriptionSet(' ws* comment* ws* (description ws* comment* ws*)+ ')'
[4] description ::= 'Description(' ws* comment* ws* (descriptionId ws* comment* ws*)? (resourceURI ws* comment* ws*)? (statement ws* comment* ws*)+ ')'
[5] statement ::= 'Statement(' ws* comment* ws* propertyURI ws* comment* ws* (vocabEncSchemeURI ws* comment* ws*)? (valueURI ws* comment* ws*)? (valueRepresentation ws* comment* ws*)* (descriptionReference ws* comment* ws*)? ')'
[6] valueRepresentation ::= valueString | richRepresentation
[7] valueString ::= 'ValueString(' ws* comment* ws* quotedString ws* comment* ws* (languageTag ws* comment* ws*)? (valueURI ws* comment* ws*)? (syntaxEncSchemeURI ws* comment* ws*)? ')'
[8] languageTag ::= 'Language(' ws* language ws* ')'
[9] richRepresentation ::= 'RichRepresentation(' ws* comment* ws* ( quotedString | base64 ) ws* comment* ws* ')'
[10] base64 ::= 'Base64(' ws* comment* ws* quotedString ws* mime ws* comment* ws* ')'
[11] mime ::= 'MIME(' ws* quotedString ws* ')'
[12] descriptionReference ::= 'DescriptionRef(' ws* name ws* ')'
[13] descriptionId ::= 'DescriptionId(' ws* name ws* ')'
[14] resourceURI ::= 'ResourceURI(' ws* resourceRef ws* ')'
[15] propertyURI ::= 'PropertyURI(' ws* resourceRef ws* ')'
[16] valueURI ::= 'ValueURI(' ws* resourceRef ws* ')'
[17] vocabEncSchemeURI ::= 'VocabularyEncodingSchemeURI(' ws* resourceRef ws* ')'
[18] syntaxEncSchemeURI ::= 'SyntaxEncodingSchemeURI(' ws* resourceRef ws* ')'
[19] resourceRef ::= uriref | qualifiedName
[20] qualifiedName ::= prefixName? ':' name?

The following productions are as defined by Turtle [TURTLE]:

DCAM Text - EBNF - continued
[21] comment ::= '#' ( [^#xA#xD] )*
[22] ws ::= #x9 | #xA | #xD | #x20
[23] uriref ::= '<' relativeURI '>'
[24] language ::= [a-z]+ ('-' [a-z0-9]+ )*
[25] nameStartChar ::= [A-Z] | "_" | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[26] nameChar ::= nameStartChar | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[27] name ::= nameStartChar nameChar*
[28] prefixName ::= ( nameStartChar - '_' ) nameChar*
[29] relativeURI ::= ucharacter*
[30] quotedString ::= string | longString
[31] string ::= #x22 scharacter* #x22
[32] longString ::= #x22 #x22 #x22 lcharacter* #x22 #x22 #x22
[33] character ::= '\u' hex hex hex hex |
'\U' hex hex hex hex hex hex hex hex |
'\\' |
[#x20-#x5B] | [#x5D-#x10FFFF]
See String Escapes for full details.
[34] echaracter ::= character | '\t' | '\n' | '\r'
See String Escapes for full details.
[35] hex ::= [#x30-#x39] | [#x41-#x46]
hexadecimal digit (0-9, uppercase A-F)
[36] ucharacter ::= ( character - #x3E ) | '\>'
[37] scharacter ::= ( echaracter - #x22 ) | '\"'
[38] lcharacter ::= echaracter | '\"' | #x9 | #xA | #xD
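
As a reading aid, productions [24], [27] and [28] can be approximated with Python regular expressions. This is a sketch under stated assumptions: only the ASCII parts of nameStartChar and nameChar are covered, and the helper names are hypothetical.

```python
import re

# ASCII-only readings of productions [24], [27] and [28]; the non-ASCII
# ranges of nameStartChar and nameChar are deliberately left out.
LANGUAGE_RE = re.compile(r"[a-z]+(-[a-z0-9]+)*")
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")
PREFIX_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_language(tag):
    """True if tag matches a literal reading of production [24]."""
    return LANGUAGE_RE.fullmatch(tag) is not None


def is_name(text):
    """True if text matches the ASCII subset of production [27]."""
    return NAME_RE.fullmatch(text) is not None


def is_prefix_name(text):
    """True if text matches the ASCII subset of production [28]."""
    return PREFIX_NAME_RE.fullmatch(text) is not None
```

Under a literal reading of production [24], a tag such as en-gb matches while the mixed-case form en-GB does not; whether the production is meant to be applied case-insensitively is not stated in this document. The only difference between [27] and [28] is that a prefix name may not begin with an underscore.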

Notes

References

[DCAM]
DCMI Abstract Model
http://dublincore.org/documents/abstract-model/

[XML]
Extensible Markup Language (XML) 1.0 (Third Edition). W3C Recommendation 04 February 2004.
http://www.w3.org/TR/REC-xml

[XMLSCHEMA]
XML Schema Part 0: Primer Second Edition. W3C Recommendation 28 October 2004.
http://www.w3.org/TR/xmlschema-0/

[TURTLE]
Turtle - Terse RDF Triple Language
http://www.dajobe.org/2004/01/turtle/

