UKOLN AHDS Introduction To Metadata



What is Metadata?

Metadata is often described as "data about data". The concept is not new: a library catalogue contains metadata about the books held in the library. What is new is the potential that metadata offers for developing rich digital library services.

The term metadata has come to mean structured information that is used by automated processes. This is probably the most useful way to think about metadata [1].

The Classic Metadata Example

The classic example of metadata is the library catalogue. A catalogue record normally contains information about a book (title, format, ISBN, author, etc.). Such information is stored in a structured, standardised form, often using an international standard known as MARC. Use of this international standard allows catalogue records to be shared across organisations.

Why is Metadata So Important?

Although metadata is nothing new, its importance has grown with the development of the World Wide Web. As is well known, the Web seeks to provide universal access to distributed resources. In order to develop richly functional Web applications which can exploit the Web's global information environment, it is becoming increasingly necessary to make use of metadata which describes the resources in a formal, standardised manner.

Metadata Standards

In order to allow metadata to be processed in a consistent manner by computer software, it is necessary for metadata to be described in a standard way. There are many metadata standards available. However, in the Web environment the best known standard is Dublin Core, which provides an agreed set of core metadata elements for use in resource discovery.

The Dublin Core standard (formally known as the Dublin Core Metadata Element Set) has defined 15 core elements: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights [2].

The core element set is clearly very basic, so a mechanism for extending Dublin Core elements has been developed. This allows what are known as Qualified Dublin Core elements to refine the core elements. For example, DC.Date.Created refines the DC.Date element by allowing the date of creation of the resource to be described, while DC.Date.Modified can be used to describe the date on which the resource was changed. Without the qualifiers, it would not be possible to tell which date related to which event. Work is in progress on defining a common framework for qualifiers.
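
The relationship between a qualified element and the core element it refines can be sketched in a few lines of Python. This is an illustrative sketch only: the function names parse_dc_name and core_element are hypothetical, the dotted naming follows the DC.Date.Created convention used above, and the sketch does not check qualifiers against any official list.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
}


def parse_dc_name(name):
    """Split a dotted name such as 'DC.Date.Created' into (element, qualifier).

    The qualifier is None for an unqualified name such as 'DC.Date'.
    Hypothetical helper: qualifiers are accepted as-is, not validated.
    """
    parts = name.split(".")
    if len(parts) not in (2, 3) or parts[0] != "DC":
        raise ValueError(f"not a Dublin Core name: {name!r}")
    element = parts[1]
    if element not in DC_ELEMENTS:
        raise ValueError(f"unknown Dublin Core element: {element!r}")
    qualifier = parts[2] if len(parts) == 3 else None
    return element, qualifier


def core_element(name):
    """Return the unqualified name that a qualified name refines."""
    element, _ = parse_dc_name(name)
    return f"DC.{element}"


print(parse_dc_name("DC.Date.Created"))   # ('Date', 'Created')
print(core_element("DC.Date.Modified"))   # DC.Date
```

Software that does not understand a qualifier can fall back to the core element in this way: both DC.Date.Created and DC.Date.Modified are still dates, even if the event they refer to is lost.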

Using Metadata

The Dublin Core standard defines a set of core elements. The standard does not specify how these elements should be deployed on the Web. Initially consideration was given to using Dublin Core by embedding it within HTML pages using the <meta> element, e.g. <meta name="DC.Creator" content="John Smith">. However, this approach has limitations: initially HTML was not rich enough to allow metadata schemes to be included (a scheme could specify, for example, that a list of keywords is taken from the Library of Congress list); it is not possible to define relationships for metadata elements (which may be needed if, for example, there are multiple creators of a resource); and processing the metadata requires the entire HTML document to be downloaded.
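
As a minimal sketch of how software might read embedded Dublin Core, the following Python uses only the standard library's html.parser module. The class name DCMetaParser, the function extract_dc, the sample page and the second creator "Jane Doe" are all hypothetical and are there for illustration only.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (name, content) pairs from <meta> elements whose name starts with 'DC.'."""

    def __init__(self):
        super().__init__()
        self.metadata = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        # Keep only Dublin Core entries; ignore other <meta> elements.
        if name.upper().startswith("DC.") and content is not None:
            self.metadata.append((name, content))


def extract_dc(html_text):
    """Return the Dublin Core name/content pairs found in an HTML document."""
    parser = DCMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Hypothetical sample page with two creators.
page = """<html><head>
<title>Example page</title>
<meta name="DC.Creator" content="John Smith">
<meta name="DC.Creator" content="Jane Doe">
<meta name="keywords" content="not Dublin Core">
</head><body><p>Text</p></body></html>"""

print(extract_dc(page))
# [('DC.Creator', 'John Smith'), ('DC.Creator', 'Jane Doe')]
```

The result is a flat list of name/value pairs, which illustrates the limitation described above: the two creators appear as unrelated entries, and the software has to be given the HTML document itself before it can find them.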

In order to address these concerns a number of alternative approaches for using metadata have been developed. RDF (Resource Description Framework) [3], for example, has been developed by W3C as a framework for describing a wide range of metadata applications. In addition OAI (Open Archives Initiative) [4] is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.
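
To give a flavour of the RDF approach, the following Python sketch builds a small RDF/XML description with the standard library's xml.etree.ElementTree module. The namespace URIs are the published RDF and Dublin Core element namespaces; the function name build_rdf, the example.org URL and the creator names are hypothetical. Repeating the creator property is only one of several ways in which RDF can express multiple creators.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes when the XML is serialised.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def build_rdf(resource_url, title, creators):
    """Return an RDF/XML string describing one resource with Dublin Core properties."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": resource_url},
    )
    ET.SubElement(description, f"{{{DC_NS}}}title").text = title
    # Each creator becomes a separate dc:creator property of the same resource.
    for creator in creators:
        ET.SubElement(description, f"{{{DC_NS}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource and creators, for illustration only.
print(build_rdf(
    "http://example.org/briefing",
    "Introduction To Metadata",
    ["John Smith", "Jane Doe"],
))
```

Unlike the embedded HTML approach, the description here is a separate document that states explicitly which resource it is about, so it can be stored, exchanged and processed without the resource itself.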

In addition to selecting the appropriate standards, use of metadata may also require a metadata management system and a metadata repository.

References

  1. Metadata Demystified, NISO,
    <http://www.niso.org/standards/resources/Metadata_Demystified.pdf>
  2. Dublin Core Metadata Element Set, DCMI,
    <http://dublincore.org/documents/dces/>
  3. Resource Description Framework (RDF), W3C,
    <http://www.w3.org/RDF/>
  4. Open Archives Initiative (OAI),
    <http://www.openarchives.org/>
  5. Information Environment Home, JISC,
    <http://www.jisc.ac.uk/index.cfm?name=ie_home>