Title: DC-TEXT: A Text Syntax for Dublin Core Metadata
Creator: Pete Johnston
Date Issued: 2006-03-14
Identifier:
Replaces:
Is Replaced By: Not applicable
Latest Version:
Description of Document: This document specifies a Text Syntax for representing Dublin Core metadata description sets.
The DCMI Abstract Model [DCAM] describes the components which make up DC metadata description sets and the relationships between them. This document specifies a syntax for serialising, or representing, a DC metadata description set in plain text. The format is referred to as "DC-TEXT". A plain text format for serialising such description sets is useful as a means of presenting examples in a way which highlights the constructs of the Abstract Model, and also as a means of comparing the information represented in other formats such as DC-XML, RDF/XML and XHTML/HTML.
According to the DCMI Abstract Model [DCAM]:
- a description set is made up of one or more descriptions
- a description is made up of
  - zero or one resource URI and
  - one or more statements
- a statement is made up of
  - exactly one property URI and
  - zero or one references to a value in the form of a value URI
  - zero or more representations of a value, each in the form of a value representation
  - zero or one vocabulary encoding scheme URI
- a value representation is either
  - a value string or
  - a rich representation
- a value string may be associated with a value string language
- a value string may be associated with a syntax encoding scheme URI
- a value may be the subject of a related description
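The components listed above can be pictured as a small data model. The following Python sketch is only an illustration, not part of DC-TEXT or of the Abstract Model: the class and field names are assumptions, the placement of the resource URI inside a description follows the examples later in this document, and rich representations and related descriptions are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValueString:
    literal: str
    language: Optional[str] = None                    # value string language
    syntax_encoding_scheme_uri: Optional[str] = None  # zero or one


@dataclass
class Statement:
    property_uri: str                                 # exactly one
    value_uri: Optional[str] = None                   # zero or one
    vocabulary_encoding_scheme_uri: Optional[str] = None
    value_strings: List[ValueString] = field(default_factory=list)


@dataclass
class Description:
    statements: List[Statement]
    resource_uri: Optional[str] = None                # as in the ResourceURI examples


@dataclass
class DescriptionSet:
    descriptions: List[Description]

    def __post_init__(self) -> None:
        # "a description set is made up of one or more descriptions"
        if not self.descriptions:
            raise ValueError("a description set needs at least one description")


title = Statement(
    property_uri="http://purl.org/dc/elements/1.1/title",
    value_strings=[ValueString("DCMI Home Page", language="en-GB")],
)
home = Description(statements=[title], resource_uri="http://dublincore.org/pages/home")
description_set = DescriptionSet([home])
```

Keeping the optional components as None by default mirrors the "zero or one" wording of the list.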
A formal description of the DC-TEXT syntax is presented in Appendix A. This section presents an overview of the syntax and a set of examples illustrating how the various constructs of the Abstract Model are represented.
The general structure of a DC-TEXT document is as follows:
Each of the primary components of a DC metadata description set defined by the DCMI Abstract Model is represented in DC-TEXT by a syntactic structure of the form label ( content ), where label is replaced by one of the label strings that appear in the examples below (such as DescriptionSet, Description, ResourceURI, Statement, PropertyURI, ValueURI, VocabularyEncodingSchemeURI, ValueString, Language, SyntaxEncodingSchemeURI, RichRepresentation, DescriptionId and DescriptionRef), and content is either:
- label ( content ) (i.e. these structures are "nested"); or
- "literal", which represents that Unicode literal; or
- <uri>, which represents a URI; or
- prefix:name, which represents a "qualified name" used as an abbreviation for a URI

For each label, the permitted form of content is determined by the syntax rules specified in Appendix A. These are explained through examples below.
The DC-TEXT syntax supports the representation of a single DC description set, so a DC-TEXT document consists of zero or more namespace declarations followed by a single label ( content ) syntactic structure with a label of DescriptionSet and, as content, one or more nested label ( content ) structures with a label of Description. That is, a DC-TEXT document has the outline form DescriptionSet ( Description ( content ) ), preceded by any namespace declarations and with the Description ( content ) structure repeated once for each description.
The DCMI Abstract Model uses URIs to refer to resources and to metadata terms (properties, vocabulary encoding schemes and syntax encoding schemes). In the DC-TEXT syntax, URIs may be written in full or may be represented as qualified names. A qualified name is made up of two parts, a prefix and a name, separated by a colon (":"). In DC-TEXT, wherever a qualified name is used, it is used to represent a URI. The URI represented by the qualified name is determined by appending the name part of the qualified name to the URI with which the prefix is associated in a namespace declaration (sometimes called the namespace URI).
Namespace declarations occur at the start of a DC-TEXT document, and have the following form:
  @prefix prefix: <uri> .
For example, the following declarations associate the prefix dc with the URI http://purl.org/dc/elements/1.1/ and the prefix ex with the URI http://example.org/resources/:
  @prefix dc: <http://purl.org/dc/elements/1.1/> .
  @prefix ex: <http://example.org/resources/> .
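As an illustration of how such declarations might be read by a program, the Python sketch below collects prefix-to-URI associations using the @prefix form shown in the examples of this document. The regular expression is an assumption that accepts ASCII prefixes only (the grammar in Appendix A allows a wider character range), and the function name read_namespaces is hypothetical.

```python
import re

# Assumed form: @prefix <prefix>: <uri> .   (ASCII prefixes only in this sketch)
PREFIX_DECL = re.compile(r'@prefix\s+([A-Za-z][A-Za-z0-9_-]*)\s*:\s*<([^>]*)>\s*\.')


def read_namespaces(text):
    """Return a dict mapping each declared prefix to its namespace URI."""
    return {prefix: uri for prefix, uri in PREFIX_DECL.findall(text)}


declarations = """
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/resources/> .
"""
namespaces = read_namespaces(declarations)
```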
Note that the limitations on the characters which can occur in the name part of a qualified name mean that there are URIs that cannot be expressed as qualified names. For example, the URIs http://example.org/resources/12345 and http://example.org/resources#12345 cannot be represented as qualified names, because the name part cannot include the "/" or "#" characters, and cannot begin with a numeric character.
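The expansion rule and its limitation can be sketched in Python as follows. This is a simplified illustration: the NAME pattern is an ASCII-only approximation of the name production in Appendix A, and the function names expand and abbreviate are hypothetical.

```python
import re

# ASCII-only approximation of the `name` production: it must not start with
# a digit and must not contain "/" or "#".
NAME = re.compile(r'[A-Za-z_][A-Za-z0-9_-]*')


def expand(qname, namespaces):
    """Append the name part to the URI associated with the prefix."""
    prefix, name = qname.split(":", 1)
    return namespaces[prefix] + name


def abbreviate(uri, namespaces):
    """Return a qualified name for uri, or None if none can be formed."""
    for prefix, namespace_uri in namespaces.items():
        name = uri[len(namespace_uri):]
        if uri.startswith(namespace_uri) and NAME.fullmatch(name):
            return f"{prefix}:{name}"
    return None


ns = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/resources/",
}
full_uri = expand("dc:title", ns)
no_qname = abbreviate("http://example.org/resources/12345", ns)  # name would start with a digit
```

A URI for which abbreviate returns None would simply be written in full, using the <uri> form.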
This section provides examples of how the DC-TEXT syntax represents all the constructs of the DCMI Abstract Model.
The first example is of a description set containing a single description with a single simple statement with a property URI and a value string to represent the value:
DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/elements/1.1/title> )
      ValueString ( "DCMI Home Page" )
    )
  )
)
Example 1: Value Strings
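To illustrate how the nested label ( content ) structures can be read mechanically, the following Python sketch tokenises and parses the text of Example 1 into nested tuples. It is not an implementation of the grammar in Appendix A: the token classes and the function names tokenize and parse_structure are assumptions made for this illustration, and namespace declarations and comments are not handled.

```python
import re

TOKEN = re.compile(r'''
    \s+                               # whitespace (skipped)
  | (?P<string>"(?:[^"\\]|\\.)*")     # quoted literal, backslash escapes allowed
  | (?P<uri><[^>]*>)                  # URI written in full
  | (?P<open>\()
  | (?P<close>\))
  | (?P<word>[^\s()"<>]+)             # label, qualified name, language tag or id
''', re.VERBOSE)


def tokenize(text):
    """Split text into (kind, value) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if match.lastgroup is not None:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse_structure(tokens, i=0):
    """Parse one label ( content ) structure; return (node, next_index)."""
    kind, label = tokens[i]
    if kind != "word" or i + 1 >= len(tokens) or tokens[i + 1][0] != "open":
        raise SyntaxError(f"expected 'label (' at token {i}")
    i += 2
    content = []
    while i < len(tokens) and tokens[i][0] != "close":
        nested = (tokens[i][0] == "word" and i + 1 < len(tokens)
                  and tokens[i + 1][0] == "open")
        if nested:
            child, i = parse_structure(tokens, i)   # nested structure
            content.append(child)
        elif tokens[i][0] == "open":
            raise SyntaxError(f"unexpected '(' at token {i}")
        else:
            content.append(tokens[i])               # literal, URI, qualified name, tag or id
            i += 1
    if i >= len(tokens):
        raise SyntaxError(f"missing ')' for {label}")
    return (label, content), i + 1


example_1 = ('DescriptionSet ( Description ( Statement ( '
             'PropertyURI ( <http://purl.org/dc/elements/1.1/title> ) '
             'ValueString ( "DCMI Home Page" ) ) ) )')
tree, _ = parse_structure(tokenize(example_1))
```

Each node is a (label, content) pair whose content list holds either further nodes or the raw tokens, so the nesting of the text is preserved directly in the result.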
The second example introduces a resource URI which identifies the subject of the description, using the ResourceURI ( <uri> ) syntactic structure:
DescriptionSet (
  Description (
    ResourceURI ( <http://dublincore.org/pages/home> )
    Statement (
      PropertyURI ( <http://purl.org/dc/elements/1.1/title> )
      ValueString ( "DCMI Home Page" )
    )
  )
)
Example 2: Resource URI
By introducing namespace declarations, the qualified name mechanism can be used to abbreviate both the resource URI and the property URI. The same description set as in the previous example might be encoded as follows.
@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
  )
)
Example 3: Qualified Names
The value string may be associated with a language tag, represented using the Language ( tag ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" Language ( en-GB ) )
    )
  )
)
Example 4: Language Tags
A single statement may include multiple value strings to represent the value. In DC-TEXT this is represented by repeating the ValueString ( "literal" ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" Language ( en-GB ) )
      ValueString ( "El Home Page de DCMI" Language ( es-ES ) )
    )
  )
)
Example 5: Multiple Value Strings
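Going in the other direction, a statement with several value strings can be written out from program data. The Python sketch below is one possible way to do this under simple assumptions: the helper names quote and statement_text are hypothetical, the property is passed already abbreviated as a qualified name, and only backslashes and double quotes are escaped in literals.

```python
def quote(literal):
    """Wrap a literal in double quotes, escaping backslashes and quotes."""
    return '"' + literal.replace("\\", "\\\\").replace('"', '\\"') + '"'


def statement_text(property_uri, value_strings):
    """Render one Statement; value_strings is a list of (literal, language or None)."""
    parts = [f"PropertyURI ( {property_uri} )"]
    for literal, language in value_strings:
        lang = f" Language ( {language} )" if language else ""
        parts.append(f"ValueString ( {quote(literal)}{lang} )")
    return "Statement ( " + " ".join(parts) + " )"


text = statement_text(
    "dc:title",
    [("DCMI Home Page", "en-GB"), ("El Home Page de DCMI", "es-ES")],
)
```

The ValueString ( "literal" ) structure is simply repeated once per value string, as in Example 5.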
A statement may include a value URI to identify the value, using the ValueURI ( <uri> ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" Language ( en-GB ) )
      ValueString ( "El Home Page de DCMI" Language ( es-ES ) )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
  )
)
Example 6: Value URIs
A statement may include a vocabulary encoding scheme URI to specify the type of the value, i.e. a class of which the value is an instance. In DC-TEXT this is represented using the VocabularyEncodingSchemeURI ( <uri> ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" Language ( en-GB ) )
      ValueString ( "El Home Page de DCMI" Language ( es-ES ) )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
    Statement (
      PropertyURI ( dc:subject )
      VocabularyEncodingSchemeURI ( dcterms:LCSH )
      ValueString ( "Information technology" )
    )
  )
)
Example 7: Vocabulary Encoding Scheme URIs
A value string may be associated with a syntax encoding scheme URI, using the SyntaxEncodingSchemeURI ( <uri> ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" Language ( en-GB ) )
      ValueString ( "El Home Page de DCMI" Language ( es-ES ) )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
    Statement (
      PropertyURI ( dc:subject )
      VocabularyEncodingSchemeURI ( dcterms:LCSH )
      ValueString ( "Information technology" )
    )
    Statement (
      PropertyURI ( dcterms:modified )
      ValueString ( "2006-02-14" SyntaxEncodingSchemeURI ( xs:date ) )
    )
  )
)
Example 8: Syntax Encoding Scheme URIs
A description set may contain multiple descriptions, represented by a list of Description ( content ) syntactic structures. The order has no significance.
@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
  )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
    )
  )
)
Example 9: Multiple Descriptions
A description may be the description of a resource which is a value in a statement in another description within the description set. If the resource has been assigned a URI, then that URI appears as a value URI in the statement where the resource is the value and as a resource URI in the description of that resource.
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
  )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
  )
  Description (
    ResourceURI ( agent:DCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
    )
  )
)
Example 10: Multiple Related Descriptions
In some cases the resources may not have URIs assigned, but such a resource may still be a value in a statement and the subject of another description. In DC-TEXT, the association between the value of one statement and the description of that resource is made by labelling the description using a DescriptionId ( id ) syntactic structure. The id value may then be cited using the DescriptionRef ( id ) syntactic structure in one or more statements elsewhere in the same description set. This is a syntactic mechanism for linking references to values to their descriptions: the id itself does not appear in the Abstract Model.
@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
    )
  )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
    )
  )
  Description (
    DescriptionId ( descDCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
    )
  )
)
Example 11: Multiple Related Descriptions
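A consumer of a description set has to follow each DescriptionRef ( id ) to the description labelled with the matching DescriptionId ( id ). The Python sketch below shows one way this linking could be done once the text has been read into plain dictionaries. The dictionary layout (the keys "id", "ref", "statements" and "value_description") and the function name resolve_refs are assumptions made for this illustration only.

```python
def resolve_refs(descriptions):
    """Attach to each statement that has a 'ref' the description it points to."""
    by_id = {d["id"]: d for d in descriptions if "id" in d}
    for description in descriptions:
        for statement in description["statements"]:
            if "ref" in statement:
                ref = statement["ref"]
                if ref not in by_id:
                    raise KeyError(f"no DescriptionId ( {ref} ) in this description set")
                statement["value_description"] = by_id[ref]
    return descriptions


# The three descriptions of Example 11, reduced to the parts needed here.
descriptions = [
    {"resource_uri": "page:home",
     "statements": [{"property": "dc:creator", "ref": "descDCMI"}]},
    {"resource_uri": "page:althome",
     "statements": [{"property": "dc:creator", "ref": "descDCMI"}]},
    {"id": "descDCMI",
     "statements": [{"property": "foaf:name",
                     "value_string": "Dublin Core Metadata Initiative"}]},
]
resolve_refs(descriptions)
```

Both dc:creator statements end up pointing at the same description, and the id is used only for this lookup, which matches the remark that it is a purely syntactic device.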
A value may be represented not simply by a value string, but also by a rich representation: an XML fragment or a piece of binary data.
In DC-TEXT, an XML fragment is represented using the RichRepresentation ( "literal" ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:description )
      RichRepresentation ( "<p xmlns=\"http://www.w3.org/1999/xhtml\">This is the DCMI<br />Home Page</p>" )
    )
  )
)
Example 12: Rich Representations - XML
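Because the XML fragment is carried inside a quoted literal, any double quotes in it have to be escaped, as in Example 12. The Python sketch below shows one way to build such a structure. It assumes that the fragment has a single root element (so that the standard-library parser can check it is well-formed), and the function name rich_representation is hypothetical.

```python
import xml.etree.ElementTree as ET


def rich_representation(xml_fragment):
    """Return a RichRepresentation structure for a well-formed XML fragment."""
    ET.fromstring(xml_fragment)  # raises xml.etree.ElementTree.ParseError if not well-formed
    escaped = xml_fragment.replace("\\", "\\\\").replace('"', '\\"')
    return f'RichRepresentation ( "{escaped}" )'


fragment = '<p xmlns="http://www.w3.org/1999/xhtml">This is the DCMI<br />Home Page</p>'
structure = rich_representation(fragment)
```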
In DC-TEXT, a binary data object is encoded as a Base64-encoded literal and represented using the Base64 ( "literal" MIME ( "MIME-type" ) ) syntactic structure:
@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:description )
      RichRepresentation ( Base64 ( "abcdefghij" MIME ( "image/png" ) ) )
    )
  )
)
Example 13: Rich Representations - Binary Data
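The literal in Example 13 is a placeholder. As a sketch of how a real binary object might be prepared, the following Python function Base64-encodes some bytes with the standard library and wraps the result in the structure shown above. The function name base64_representation is hypothetical.

```python
import base64


def base64_representation(data: bytes, mime_type: str) -> str:
    """Return a RichRepresentation structure carrying data as a Base64 literal."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'RichRepresentation ( Base64 ( "{encoded}" MIME ( "{mime_type}" ) ) )'


structure = base64_representation(b"hi", "text/plain")
```

The Base64 alphabet contains no double quotes or backslashes, so the encoded literal needs no further escaping.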
A DC-TEXT document is a sequence of Unicode characters, encoded in UTF-8, defined by the grammar below. The grammar is specified by means of the version of Extended BNF used in XML 1.0 (Third Edition) [XML].
The following productions are as defined by Turtle [TURTLE]:
[21] comment       ::= '#' ( [^#xA#xD] )*
[22] ws            ::= #x9 | #xA | #xD | #x20
[23] uriref        ::= '<' relativeURI '>'
[24] language      ::= [a-z]+ ('-' [a-z0-9]+ )*
[25] nameStartChar ::= [A-Z] | "_" | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[26] nameChar      ::= nameStartChar | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[27] name          ::= nameStartChar nameChar*
[28] prefixName    ::= ( nameStartChar - '_' ) nameChar*
[29] relativeURI   ::= ucharacter*
[30] quotedString  ::= string | longString
[31] string        ::= #x22 scharacter* #x22
[32] longString    ::= #x22 #x22 #x22 lcharacter* #x22 #x22 #x22
[33] character     ::= '\u' hex hex hex hex | '\U' hex hex hex hex hex hex hex hex | '\\' | [#x20-#x5B] | [#x5D-#x10FFFF]   /* See String Escapes for full details. */
[34] echaracter    ::= character | '\t' | '\n' | '\r'   /* See String Escapes for full details. */
[35] hex           ::= [#x30-#x39] | [#x41-#x46]   /* hexadecimal digit (0-9, uppercase A-F) */
[36] ucharacter    ::= ( character - #x3E ) | '\>'
[37] scharacter    ::= ( echaracter - #x22 ) | '\"'
[38] lcharacter    ::= echaracter | '\"' | #x9 | #xA | #xD
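The character, echaracter, ucharacter and scharacter productions define a small set of backslash escapes. As an illustration, the Python sketch below decodes those escapes in the body of a quoted string or URI. The function name unescape is hypothetical, and the sketch leaves any other backslash sequence untouched rather than rejecting it.

```python
import re

# \uXXXX and \UXXXXXXXX use uppercase hex digits, as in production [35];
# the single-character escapes are \t, \n, \r, \", \\ and \>.
ESCAPE = re.compile(r'\\(u[0-9A-F]{4}|U[0-9A-F]{8}|[tnr"\\>])')
SIMPLE = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\", ">": ">"}


def unescape(body):
    """Replace each recognised escape sequence in body by the character it denotes."""
    def replace(match):
        code = match.group(1)
        if code[0] in "uU":
            return chr(int(code[1:], 16))
        return SIMPLE[code]
    return ESCAPE.sub(replace, body)


decoded = unescape(r'<p xmlns=\"http://www.w3.org/1999/xhtml\">caf\u00E9</p>')
```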
[DCAM]
DCMI Abstract Model
http://dublincore.org/documents/abstract-model/
[XML]
Extensible Markup Language (XML) 1.0 (Third Edition). W3C Recommendation 04 February 2004.
http://www.w3.org/TR/REC-xml
[XMLSCHEMA]
XML Schema Part 0: Primer Second Edition. W3C Recommendation 28 October 2004.
http://www.w3.org/TR/xmlschema-0/
[TURTLE]
Turtle - Terse RDF Triple Language
http://www.dajobe.org/2004/01/turtle/