Metadata Application Profile: eBank UK project

Descriptive header

dc:title Metadata Application Profile: eBank UK project
dc:creator Traugott Koch
dc:contributor Monica Duke
dc:contributor Simon Coles
dc:description Application Profile for the eBank UK project and service
dc:publisher UKOLN, University of Bath, UK
dc:date 2005-11-02
dc:identifier http://www.ukoln.ac.uk/projects/ebank-uk/schemas/profile/2005/11/02/

Latest version: http://www.ukoln.ac.uk/projects/ebank-uk/schemas/profile/


Preamble

This is the documentation of the application profile for the eBank UK project repository and service, described according to the Dublin Core Application Profile Guidelines (CEN/ISSS CWA14855, ftp://ftp.cenorm.be/PUBLIC/CWAs/e-Europe/MMI-DC/cwa14855-00-2003-Nov.pdf).
Metadata properties described in this application profile are those that are used in external metadata supplied by the public services of eBank UK.

In addition, this Application Profile is configured according to the Dublin Core Metadata Initiative Abstract Model.

DC Definition and DC Comment are taken from: http://dublincore.org/documents/dcmi-terms/

XML Schemas

Based on this Application Profile, the project has defined two XML Schemas.

The first declares the ebank_dc container element, and the second reuses the elementOrRefinement container defined by the qualified DC schema and specifies the types of the encoding schemes (enumerating their values where appropriate).

They are consistent with the model for qualified Dublin Core and are intended to define the allowed contents of a record that describes a resource (see http://dublincore.org/documents/2003/04/02/dc-xml-guidelines/ for the use of the terms resource and record). The main resource described in the crystallography application of eBank UK is a dataholding, an aggregation of datasets about one crystal structure.

The XML Schemas have been defined to be compatible with the encoding guidelines for implementing Dublin Core in XML: http://dublincore.org/documents/2003/04/02/dc-xml-guidelines/

Metadata exchange by OAI-PMH

eBank UK currently exports metadata using the OAI-PMH 2.0 protocol in two different metadata formats: oai_dc (simple DC) and ebank_mets.

For the export in simple DC, all qualified DC elements listed below are "dumbed down" to the main DC element they refine (as listed under Refines:). Encoding schemes (as listed under Has Encoding Scheme:) are not included.
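The dumbing-down step can be illustrated with a minimal Python sketch. It assumes a record is held as a list of (property, encoding scheme, value) tuples, which is a representation chosen for this illustration only and not the project's actual implementation. The mapping table is read from the Refines: rows of this profile; the example values and the example.org URL are placeholders.

```python
# Map each element refinement in this profile to the DC element it refines.
REFINES = {
    "dcterms:modified": "dc:date",
    "dcterms:created": "dc:date",
    "dcterms:isReferencedBy": "dc:relation",
    "dcterms:hasPart": "dc:relation",
}


def dumb_down(statements):
    """Reduce qualified DC statements to simple DC.

    `statements` is a list of (property, encoding_scheme, value) tuples;
    the result is a list of (property, value) pairs.
    """
    simple = []
    for prop, _scheme, value in statements:
        # Replace a refinement with the element it refines; drop the scheme.
        simple.append((REFINES.get(prop, prop), value))
    return simple


# Placeholder statements for illustration only.
qualified = [
    ("dc:title", None, "example compound name"),
    ("dcterms:modified", "dcterms:W3CDTF", "2005-11-02"),
    ("dcterms:hasPart", None, "http://example.org/dataset/1"),
]
simple = dumb_down(qualified)
```

In this sketch, plain DC elements pass through unchanged, so only the refinements and the encoding scheme information are affected.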

Describing each dataset of the dataholding

Since users have indicated that some information about datasets (namely, their type) is required, a further minimal description of each dataset is exchanged as an addendum to the dataholding description. Each dataset description is also an ebank_dc description and currently uses only the dc:identifier element and the dc:type element. The dc:identifier element has as its value the URL of the dataset (in the current eprints.org implementation this is found within the jump-off page for the crystal structure dataholding). The dc:type element takes a value from those defined in the ebankterms schema.

Additional Wrapper elements using METS

Since each record exchanged by OAI-PMH may contain more than one ebank_dc element (one ebank_dc element for the dataholding and one for each of the datasets), a wrapper element is used as a package container for the ebank_dc elements.
Rather than redefine an ebank container element, METS is used as a packaging standard.
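As a rough illustration of this packaging, the following Python sketch wraps several ebank_dc descriptions in a METS document. The METS and Dublin Core namespaces are the published ones, but the ebank_dc namespace URI, the use of one dmdSec per description, the MDTYPE value, and the example values are assumptions made for this sketch; the layout actually used by eBank UK is defined by its schemas, not by this example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace for the ebank_dc container; not taken from the schema.
EBANK_DC_NS = "http://example.org/ebank_dc/"


def ebank_dc(fields):
    """Build one ebank_dc element from (dc element name, value) pairs."""
    container = ET.Element(f"{{{EBANK_DC_NS}}}ebank_dc")
    for name, value in fields:
        ET.SubElement(container, f"{{{DC_NS}}}{name}").text = value
    return container


def wrap_in_mets(descriptions):
    """Place each ebank_dc element in its own METS dmdSec/mdWrap/xmlData."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    for index, description in enumerate(descriptions, start=1):
        dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", ID=f"dmd{index}")
        wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", MDTYPE="OTHER")
        ET.SubElement(wrap, f"{{{METS_NS}}}xmlData").append(description)
    return mets


# One description for the dataholding and one for a dataset (placeholder values).
holding = ebank_dc([
    ("title", "example compound name"),
    ("type", "Crystal structure data holding"),
])
dataset = ebank_dc([
    ("identifier", "http://example.org/dataset/1"),
    ("type", "example dataset type"),
])
record = wrap_in_mets([holding, dataset])
xml_text = ET.tostring(record, encoding="unicode")
```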

A schematic overview is provided at http://www.ukoln.ac.uk/projects/ebank-uk/private/schematic-view.gif

Namespaces

Term URI http://purl.org/dc/elements/1.1/
Name dc:
Label Dublin Core

Term URI http://purl.org/dc/terms/
Name dcterms:
Label Dublin Core terms

Term URI http://purl.org/ebank/terms/
Name ebankterms:
Label Ebank Terms


Data Types

Name <string>
Label Character string
Definition A string of ASCII characters. No formatting tags may be included. The following characters must be encoded for XML: '&' as '&amp;'; '<' as '&lt;'; '>' as '&gt;'. A limited list of non-ASCII characters may be included, encoded as character entities.
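
The character encoding required for the string datatype can be sketched in Python with the standard library function xml.sax.saxutils.escape, which replaces exactly the three characters named above. The wrapper name encode_string and the sample value are illustrative only.

```python
from xml.sax.saxutils import escape


def encode_string(value):
    """Escape '&', '<' and '>' so the value is safe as XML character data."""
    return escape(value)


encoded = encode_string("C6H6 & <related> compounds")
```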

Name <date>
Label Date
Defined By http://www.w3.org/TR/NOTE-datetime
Definition Character string representing a date to the complete date level of the W3CDTF profile of ISO 8601, of the form: [ YYYY-MM-DD | YYYY-MM | YYYY ]
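
A minimal Python sketch of a check for the three permitted date forms follows. The function name is_valid_date is hypothetical, and the additional calendar check (rejecting, for example, month 13) is an assumption that goes slightly beyond the pattern given in the definition.

```python
import re
from datetime import date

# YYYY, optionally followed by -MM, optionally followed by -DD.
DATE_PATTERN = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_valid_date(value):
    """Return True for YYYY, YYYY-MM or YYYY-MM-DD naming a real date."""
    match = DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1 for the calendar check.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True
```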

Name <URI>
Label URI
Defined By http://www.ietf.org/rfc/rfc2396.txt
Definition Character string for a URI


Describing the dataholding

Elements and element refinements

Term URI http://purl.org/dc/elements/1.1/title
Name title
Label Title
Defined by:
DC Definition A name given to the resource.
DC Comment Typically, a Title will be a name by which the resource is formally known.
eBank UK Definition
eBank UK Comment
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Mandatory
Condition
Datatype string
Occurrence Min: 1; Max: 1
Best practice The title of the dataholding is identical to the IUPAC compound name, as specified by the author according to the IUPAC guidelines outlined in the archive software documentation
Open questions


Term URI http://purl.org/dc/elements/1.1/creator
Name creator
Label Creator
Defined by:
DC Definition An entity primarily responsible for making the content of the resource.
DC Comment Examples of a Creator include a person, an organisation, or a service. Typically, the name of a Creator should be used to indicate the entity.
eBank UK Definition
eBank UK Comment Creator(s) of the dataholding
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Mandatory
Condition
Datatype string
Occurrence Min: 1; Max: unbounded
Best practice Encoding: lastname, firstname, initials. Where more than one creator is provided, a separate property should be used for each
Open questions


Term URI http://purl.org/dc/elements/1.1/subject
Name subject
Label Subject and Keywords
Defined by:
DC Definition The topic of the content of the resource.
DC Comment Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
eBank UK Definition
eBank UK Comment Keywords selected from an adapted version of the IUCr World Directory of Crystallographers list
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Vocabulary Encoding Scheme http://purl.org/ebank/terms/Keywords
Obligation Strongly recommended
Condition
Datatype string
Occurrence Min: 0; Max: unbounded
Best practice A restricted list of keywords is provided. Where more than one keyword is provided, a separate property should be used for each
Open questions


Term URI http://purl.org/dc/elements/1.1/subject
Name subject
Label Subject and Keywords
Defined by:
DC Definition The topic of the content of the resource.
DC Comment Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
eBank UK Definition
eBank UK Comment Chemical compound, identified with an InChI (International Chemical Identifier)
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/ebank/terms/InChI
Obligation Mandatory
Condition
Datatype string
Occurrence Min: 1; Max: 1
Best practice Must be encoded according to NIST guidelines. It is recommended that the eBank toolbox is used. Encoding scheme to be mentioned in ebank_mets metadata export
Open questions


Term URI http://purl.org/dc/elements/1.1/subject
Name subject
Label Subject and Keywords
Defined by:
DC Definition The topic of the content of the resource.
DC Comment Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
eBank UK Definition
eBank UK Comment Chemical compound, identified with a Chemical Formula
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/ebank/terms/ChemicalFormula
Obligation Mandatory
Condition
Datatype string
Occurrence Min: 1; Max: 1
Best practice Deposition guidelines given in the archive documentation should be followed. Encoding scheme to be mentioned in ebank_mets metadata export
Open questions


Term URI http://purl.org/dc/elements/1.1/subject
Name subject
Label Subject and Keywords
Defined by:
DC Definition The topic of the content of the resource.
DC Comment Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
eBank UK Definition
eBank UK Comment Chemical category: CompoundClass
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Vocabulary Encoding Scheme http://purl.org/ebank/terms/CompoundClass
Obligation Mandatory
Condition Only one from the list of Compound Classes can be assigned
Datatype string
Occurrence Min: 1; Max: 1
Best practice Encoding scheme to be mentioned in ebank_mets metadata export
Open questions


Term URI http://purl.org/dc/elements/1.1/publisher
Name publisher
Label Publisher
Defined by:
DC Definition An entity responsible for making the resource available
DC Comment Examples of a Publisher include a person, an organisation, or a service. Typically, the name of a Publisher should be used to indicate the entity.
eBank UK Definition
eBank UK Comment
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Mandatory
Condition
Datatype string
Occurrence Min: 1; Max: 1
Best practice Affiliation of the creator(s)
Open questions See internal Guidelines, Implement. note


Term URI http://purl.org/dc/terms/modified
Name modified
Label Date modified
Defined by:
DC Definition Date on which the resource was changed.
DC Comment -
eBank UK Definition
eBank UK Comment Date on which the dataholding was changed
Type of term Element Refinement
Refines http://purl.org/dc/elements/1.1/date
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/dc/terms/W3CDTF
Obligation Mandatory
Condition For retrieval reasons, every record needs to have a "date modified"
Datatype date
Occurrence Min: 1; Max: 1
Best practice Date of the last modification of the dataholding. Can be identical to Date Created. This is the only date exported as metadata. Format: YYYY-MM-DD
Open questions


Term URI http://purl.org/dc/terms/created
Name created
Label Date created
Defined by:
DC Definition Date of creation of the resource
DC Comment -
eBank UK Definition
eBank UK Comment Date of creation of the data holding
Type of term Element Refinement
Refines http://purl.org/dc/elements/1.1/date
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/dc/terms/W3CDTF
Obligation Mandatory
Condition
Datatype date
Occurrence Min: 1; Max: 1
Best practice Not displayed in exported metadata, only on reports page. Format: YYYY-MM-DD
Open questions


Term URI http://purl.org/dc/elements/1.1/type
Name type
Label Resource Type
Defined by:
DC Definition The nature or genre of the content of the resource.
DC Comment -
eBank UK Definition
eBank UK Comment Type of the data holding.
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Mandatory
Condition
Datatype string
Occurrence Min: 1; Max: 1
Best practice All dataholdings are given the type: "Crystal structure data holding" as a fixed value
Open questions


Term URI http://purl.org/dc/elements/1.1/identifier
Name identifier
Label Resource Identifier
Defined by:
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Example formal identification systems include the Uniform Resource Identifier (URI) (including the Uniform Resource Locator (URL)), the Digital Object Identifier (DOI) and the International Standard Book Number (ISBN).
eBank UK Definition
eBank UK Comment
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/dc/terms/URI
Obligation Mandatory
Condition
Datatype URI
Occurrence Min: 1; Max: 1
Best practice This is the Crystal Structure Report URL
Open questions


Term URI http://purl.org/dc/elements/1.1/identifier
Name identifier
Label Resource Identifier
Defined by:
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Example formal identification systems include the Uniform Resource Identifier (URI) (including the Uniform Resource Locator (URL)), the Digital Object Identifier (DOI) and the International Standard Book Number (ISBN).
eBank UK Definition
eBank UK Comment
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/ebank/terms/DOI
Obligation Mandatory
Condition
Datatype URI
Occurrence Min: 1; Max: 1
Best practice String has the following syntax: DOI:10.1594/ecrystals.chem.soton.ac.uk/[acronym/number of our choice]
Open questions


Term URI http://purl.org/dc/elements/1.1/relation
Name relation
Label Relation
Defined by:
DC Definition A reference to a related resource.
DC Comment Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system.
eBank UK Definition
eBank UK Comment Pointer(s) to relevant article(s)
Type of term Element
Refines -
Refined by http://purl.org/dc/terms/isReferencedBy
Encoding Scheme For -
Has Encoding Scheme -
Obligation Optional
Condition
Datatype URI
Occurrence Min: 0; Max: unbounded
Best practice Not implemented
Open questions


Term URI http://purl.org/dc/terms/isReferencedBy
Name isReferencedBy
Label Is Referenced By
Defined by:
DC Definition The described resource is referenced, cited, or otherwise pointed to by the referenced resource.
DC Comment -
eBank UK Definition
eBank UK Comment Holds a citation of a referencing article
Type of term Element Refinement
Refines http://purl.org/dc/elements/1.1/relation
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Optional
Condition
Datatype URI
Occurrence Min: 0; Max: unbounded
Best practice Not implemented
Open questions


Term URI http://purl.org/dc/terms/hasPart
Name hasPart
Label Has Part
Defined by:
DC Definition The described resource includes the referenced resource either physically or logically.
DC Comment -
eBank UK Definition
eBank UK Comment References to data files which are part of the data holding.
Type of term Element Refinement
Refines http://purl.org/dc/elements/1.1/relation
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Mandatory
Condition All data files which are part of the data holding need to be pointed to.
Datatype URI
Occurrence Min: 1; Max: unbounded
Best practice All data files are to be enumerated in both oai_dc and ebank_mets metadata export. Where more than one pointer is provided, a separate property should be used for each. Each value is identified by a URI.
Open questions


Term URI http://purl.org/dc/elements/1.1/rights
Name rights
Label Rights Management
Defined by:
DC Definition Information about rights held in and over the resource.
DC Comment Typically, a Rights element will contain a rights management statement for the resource, or reference a service providing such information. Rights information often encompasses Intellectual Property Rights (IPR), Copyright, and various Property Rights. If the Rights element is absent, no assumptions can be made about the status of these and other rights with respect to the resource.
eBank UK Definition
eBank UK Comment
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Encoding Scheme -
Obligation Strongly recommended
Condition
Datatype URI
Occurrence Min: 0; Max: 1
Best practice Fixed value: URL pointing to a general plain text rights statement
Open questions
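
The Occurrence rows of the dataholding description above can be collected into a table and checked mechanically. The Python sketch below is one possible way to do so, assuming a record is represented as a list of (property, encoding scheme, value) tuples; that representation, the prefixed names used as keys, and the function name check_occurrence are choices made for this illustration.

```python
from collections import Counter

UNBOUNDED = None

# (property, encoding scheme or None) -> (min, max) from the Occurrence rows.
OCCURRENCE = {
    ("dc:title", None): (1, 1),
    ("dc:creator", None): (1, UNBOUNDED),
    ("dc:subject", "ebankterms:Keywords"): (0, UNBOUNDED),
    ("dc:subject", "ebankterms:InChI"): (1, 1),
    ("dc:subject", "ebankterms:ChemicalFormula"): (1, 1),
    ("dc:subject", "ebankterms:CompoundClass"): (1, 1),
    ("dc:publisher", None): (1, 1),
    ("dcterms:modified", "dcterms:W3CDTF"): (1, 1),
    ("dcterms:created", "dcterms:W3CDTF"): (1, 1),
    ("dc:type", None): (1, 1),
    ("dc:identifier", "dcterms:URI"): (1, 1),
    ("dc:identifier", "ebankterms:DOI"): (1, 1),
    ("dc:relation", None): (0, UNBOUNDED),
    ("dcterms:isReferencedBy", None): (0, UNBOUNDED),
    ("dcterms:hasPart", None): (1, UNBOUNDED),
    ("dc:rights", None): (0, 1),
}


def check_occurrence(statements):
    """Return a list of problems for (property, scheme, value) statements."""
    counts = Counter((prop, scheme) for prop, scheme, _value in statements)
    problems = []
    for key, (minimum, maximum) in OCCURRENCE.items():
        found = counts.get(key, 0)
        if found < minimum:
            problems.append(f"{key}: expected at least {minimum}, found {found}")
        if maximum is not UNBOUNDED and found > maximum:
            problems.append(f"{key}: expected at most {maximum}, found {found}")
    return problems
```

An empty result means that every minimum and maximum in the table is satisfied; the sketch does not check the values themselves.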


Vocabulary Encoding Schemes (Classes)

Term URI http://purl.org/ebank/terms/Keywords
Name Keywords
Label Ebank UK keywords
Defined by:
eBank UK Definition Keywords adapted from the IUCr World Directory of Crystallographers list
eBank UK Comment
Type of term Vocabulary Encoding Scheme
Vocabulary Encoding Scheme For http://purl.org/dc/elements/1.1/subject
Obligation Mandatory
Datatype string
Best practice Select the terms from the restricted list at: http://ebank.eprints.org/keywords.html
Open questions


Term URI http://purl.org/ebank/terms/CompoundClass
Name CompoundClass
Label Compound Class
Defined by:
eBank UK Definition Chemical category: CompoundClass
eBank UK Comment
Type of term Vocabulary Encoding Scheme
Vocabulary Encoding Scheme For http://purl.org/dc/elements/1.1/subject
Obligation Mandatory
Datatype string
Best practice Select the terms from http://www.rdn.ac.uk/oai/ebank/20060205/ebankterms.xsd
Open questions


Syntax Encoding Schemes (Datatypes)

Term URI http://purl.org/ebank/terms/InChI
Name InChI
Label InChI
Defined by:
eBank UK Definition InChI (International Chemical Identifier) for a chemical compound
eBank UK Comment
Type of term Syntax Encoding Scheme
Syntax Encoding Scheme For http://purl.org/dc/elements/1.1/subject
Obligation Mandatory
Datatype string
Best practice String must commence with 'InChI=' and include the version number. It is recommended that InChI strings are generated using the eBank toolbox
Open questions
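
The two requirements in the best practice note for InChI (the 'InChI=' prefix and a version number) can be checked with a short Python sketch. The pattern assumes that the version is written as digits, optionally followed by one letter, and terminated by '/'; this is an assumption about the syntax, and the sketch does not validate the remainder of the identifier. The function name looks_like_inchi is hypothetical.

```python
import re

# 'InChI=' followed by a version number (digits, optionally one letter), then '/'.
INCHI_PREFIX = re.compile(r"InChI=\d+[A-Za-z]?/")


def looks_like_inchi(value):
    """Check only the required prefix and version number, not the layers."""
    return INCHI_PREFIX.match(value) is not None
```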


Term URI http://purl.org/ebank/terms/ChemicalFormula
Name ChemicalFormula
Label Chemical Formula
Defined by:
eBank UK Definition Chemical Formula identifying a chemical compound
eBank UK Comment
Type of term Syntax Encoding Scheme
Syntax Encoding Scheme For http://purl.org/dc/elements/1.1/subject
Obligation Mandatory
Datatype string
Best practice Generate according to guidelines in the archive software and documentation
Open questions


Term URI http://purl.org/ebank/terms/DOI
Name DOI
Label DOI
Defined by:
eBank UK Definition
eBank UK Comment Digital Object Identifier (DOI), registered with TIB Hannover
Type of term Syntax Encoding Scheme
Syntax Encoding Scheme For http://purl.org/dc/elements/1.1/identifier
Obligation Mandatory
Datatype string
Best practice String has the following syntax: DOI:10.1594/ecrystals.chem.soton.ac.uk/[acronym/number of our choice]
Open questions
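
The DOI syntax given above can be checked with a minimal Python sketch. The fixed prefix is taken from the best practice note; the requirement that the local part be non-empty and free of whitespace is an assumption, since the profile leaves the "acronym/number" part to the archive. The function name has_ebank_doi_syntax and the sample suffix used in testing are illustrative.

```python
DOI_PREFIX = "DOI:10.1594/ecrystals.chem.soton.ac.uk/"


def has_ebank_doi_syntax(value):
    """Return True if value is the fixed prefix plus a non-empty local part."""
    if not value.startswith(DOI_PREFIX):
        return False
    suffix = value[len(DOI_PREFIX):]
    # Assumed rule: the local part must exist and contain no whitespace.
    return suffix != "" and not any(ch.isspace() for ch in suffix)
```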


Describing the datasets

Elements and element refinements

Term URI http://purl.org/dc/elements/1.1/identifier
Name identifier
Label Resource Identifier
Defined by:
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Example formal identification systems include the Uniform Resource Identifier (URI) (including the Uniform Resource Locator (URL)), the Digital Object Identifier (DOI) and the International Standard Book Number (ISBN).
eBank UK Definition
eBank UK Comment
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Syntax Encoding Scheme http://purl.org/dc/terms/URI
Obligation Mandatory
Condition
Datatype URI
Occurrence Min: 1; Max: 1
Best practice Each dataset has to be assigned one URL. Only exported in the metadata format ebank_mets
Open questions At the moment, all datasets belonging to a dataholding are identified by the URL of the crystal structure report


Term URI http://purl.org/dc/elements/1.1/type
Name type
Label Resource Type
Defined by:
DC Definition The nature or genre of the content of the resource.
DC Comment -
eBank UK Definition
eBank UK Comment Types of individual datasets (as part of the data holding) in crystallography
Type of term Element
Refines -
Refined by -
Encoding Scheme For -
Has Vocabulary Encoding Scheme http://purl.org/ebank/terms/EbankDatasetType
Obligation Mandatory
Condition Each dataset has to be assigned one dataset type
Datatype string
Occurrence Min: 1; Max: 1
Best practice Only exported in the metadata format ebank_mets
Open questions


Vocabulary Encoding Schemes (Classes)

Term URI http://purl.org/ebank/terms/EbankDatasetType
Name EbankDatasetType
Label Ebank Dataset-Type
Defined by:
eBank UK Definition Types of individual datasets (as part of the data holding) in crystallography
eBank UK Comment
Type of term Vocabulary Encoding Scheme
Vocabulary Encoding Scheme For http://purl.org/dc/elements/1.1/type
Obligation Mandatory
Datatype string
Best practice Select the terms from http://www.rdn.ac.uk/oai/ebank/20060205/ebankterms.xsd
Open questions


Traugott Koch, UKOLN, University of Bath, UK.

2005 eBank UK project.

Created: 2005-11-02

Last update: 2006-02-13

URL: http://www.ukoln.ac.uk/projects/ebank-uk/schemas/profile/2005/11/02/