Metadata and electronic information

Michael Day
UKOLN: The UK Office for Library and Information Networking,
University of Bath, Bath, BA2 7AY, United Kingdom
http://www.ukoln.ac.uk/
m.day@ukoln.ac.uk

  

  Based on a presentation given at the final CIRCE (Community Information Resource Service) Workshop held at the Birmingham Council House on 15 January 1999.

1. Metadata

The term metadata is increasingly used in the library and information communities and elsewhere. Metadata can be defined as 'data about data', but the term is usually applied to structured data about resources that helps support a variety of operations. This data is structured so that it is machine-understandable as well as machine-readable.

The term metadata can be applied to any descriptive data about resources, and in the library and information community can be used to describe bibliographic catalogues or databases of community information. However, metadata has largely been identified with issues of Internet resource discovery.

2. Dublin Core and RDF

The Dublin Core is an international and interdisciplinary initiative to define a core set of metadata elements for resource discovery on the Internet [1]. The format has been developed through a series of invitational workshops, the first of which was held in Dublin, Ohio in March 1995. The workshop series and related work have resulted in the definition of fifteen core metadata elements (Table 1). These elements are optional, repeatable and extensible. They can also be qualified in various ways so that, for example, the content of a subject field could contain data from an external scheme like the Library of Congress Subject Headings.

  1. Title
  2. Subject
  3. Description
  4. Creator
  5. Publisher
  6. Contributor
  7. Date
  8. Type
  9. Format
  10. Identifier
  11. Source
  12. Language
  13. Relation
  14. Coverage
  15. Rights

Table 1: Dublin Core elements.
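
The properties described above (optional, repeatable and qualifiable elements) can be illustrated with a minimal sketch of an in-memory record. The class name DublinCoreRecord, its methods and the sample values are hypothetical and are not part of any Dublin Core specification; only the fifteen element names come from Table 1.

```python
from collections import defaultdict

# The fifteen element names from Table 1.
DC_ELEMENTS = (
    "Title", "Subject", "Description", "Creator", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


class DublinCoreRecord:
    """Hypothetical record: every element is optional and repeatable."""

    def __init__(self):
        # element name -> list of (value, scheme) pairs; scheme may be None
        self._statements = defaultdict(list)

    def add(self, element, value, scheme=None):
        """Add one value; 'scheme' names an external scheme qualifying it."""
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element!r}")
        self._statements[element].append((value, scheme))

    def values(self, element):
        """Return all (value, scheme) pairs for an element, possibly none."""
        return list(self._statements.get(element, []))


record = DublinCoreRecord()
record.add("Title", "Metadata and electronic information")
record.add("Creator", "Day, Michael")
record.add("Subject", "Metadata", scheme="LCSH")  # qualified by an external scheme
record.add("Subject", "resource discovery")       # repeated, unqualified
```

An element that was never added, such as Publisher here, simply yields an empty list, which reflects the optional nature of every element.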

Current implementations of Dublin Core on the Web are often based on metadata embedded in HTML META tags. However, with the Web moving towards the Extensible Markup Language (XML), future implementations of Dublin Core on the Web may well use the Resource Description Framework (RDF). RDF is an initiative of the World Wide Web Consortium (W3C), the organisation that oversees the development of open Web standards. RDF provides an extensible, XML-based metadata framework for the Web and has been under development since 1997. The RDF Model and Syntax specification was finally released as a W3C Recommendation on 24 February 1999 [2].
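
As an illustration of the embedded approach, the following sketch renders Dublin Core statements as HTML META tags. It assumes the commonly used convention of prefixing element names with "DC." in the NAME attribute and of naming an external scheme in the SCHEME attribute; the function name dc_meta_tags and the sample values are hypothetical.

```python
from html import escape


def dc_meta_tags(statements):
    """Render (element, value, scheme) triples as HTML META tags."""
    lines = []
    for element, value, scheme in statements:
        attrs = [f'NAME="DC.{element}"']
        if scheme is not None:
            # Only qualified values carry a SCHEME attribute.
            attrs.append(f'SCHEME="{escape(scheme, quote=True)}"')
        attrs.append(f'CONTENT="{escape(value, quote=True)}"')
        lines.append("<META " + " ".join(attrs) + ">")
    return "\n".join(lines)


statements = [
    ("Title", "Metadata and electronic information", None),
    ("Creator", "Day, Michael", None),
    ("Subject", "Metadata", "LCSH"),
    ("Date", "1999-03-02", None),
]
print(dc_meta_tags(statements))
```

The generated lines would be placed in the HEAD section of the HTML page that they describe. Values are escaped so that quotation marks in a title cannot break the attribute syntax.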

The Dublin Core community has been actively involved in the development of RDF, and implementations of Dublin Core for resource discovery may be amongst the earliest RDF applications on the Web.

3. Distributed and heterogeneous metadata

A large number of different metadata formats exist. They range from relatively simple formats, like the ROADS templates used by some Internet subject services, to more complex formats based on richer concepts of resource description, often implemented in an SGML environment. Metadata about resources in any particular subject domain will therefore usually exist in a variety of formats, and will often be distributed across a number of geographical locations. This has implications for resource discovery, where there is a need to search all of this heterogeneous and distributed metadata in a systematic way. Potential solutions to these problems include the construction of gateways that apply protocols like Z39.50 and Whois++, as well as core metadata formats like Dublin Core. For example, the Arts and Humanities Data Service (AHDS), a UK-based service that gives access to materials resulting from research and teaching in the humanities, has implemented a resource discovery system that uses Dublin Core and a Z39.50 gateway [3].
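
The role of a core format in such a gateway can be sketched as a simple crosswalk: records held in different local formats are mapped onto Dublin Core elements, and the search is run against that common view. The sketch below is illustrative only. The format names, field names, records and example.org addresses are all hypothetical, it does not reproduce the AHDS system, and no network protocol such as Z39.50 or Whois++ is implemented.

```python
# Hypothetical crosswalks: local field name -> Dublin Core element.
CROSSWALKS = {
    "template": {"Title": "Title", "Keywords": "Subject", "URI": "Identifier"},
    "catalogue": {"main_title": "Title", "subject_heading": "Subject",
                  "author": "Creator", "url": "Identifier"},
}


def to_dublin_core(record, source_format):
    """Map one local record onto Dublin Core; unmapped fields are dropped."""
    crosswalk = CROSSWALKS[source_format]
    mapped = {}
    for field, value in record.items():
        element = crosswalk.get(field)
        if element is not None:
            mapped.setdefault(element, []).append(value)
    return mapped


def search(collections, element, term):
    """Search every collection through the common Dublin Core view."""
    term = term.lower()
    hits = []
    for source_format, records in collections:
        for record in records:
            dc = to_dublin_core(record, source_format)
            if any(term in value.lower() for value in dc.get(element, [])):
                hits.append(dc)
    return hits


# Two hypothetical collections, each held in a different local format.
collections = [
    ("template", [{"Title": "Medieval charters", "Keywords": "history",
                   "URI": "http://example.org/charters"}]),
    ("catalogue", [{"main_title": "Roman pottery survey",
                    "subject_heading": "Archaeology; history",
                    "author": "Example, A.",
                    "url": "http://example.org/pottery"}]),
]
hits = search(collections, "Subject", "history")
```

A single query on the Subject element reaches both collections, even though one stores its subject terms under "Keywords" and the other under "subject_heading". Fields without a Dublin Core equivalent are lost in the mapping, which is the usual cost of searching through a core format.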

4. Other metadata applications

It is becoming increasingly clear that metadata can be used for applications other than resource discovery. Metadata can, for example, be used to help manage the retrieval of resources based on the technological requirements of the user. It already has a role in the filtering of information to users through content rating services. Metadata can also help manage access to resources based on intellectual property rights information. Metadata has potential additional roles in the authentication of resources and in providing solutions to the difficult problems of digital preservation.

These areas tend to be less well developed than metadata applications in resource discovery but help emphasise how good practice in the digital environment increasingly depends on the creation and maintenance of quality metadata.

5. References

  1. Dublin Core metadata initiative: <URL:http://purl.oclc.org/dc>
  2. Resource Description Framework: <URL:http://www.w3.org/RDF/>
  3. Arts and Humanities Data Service: <URL:http://www.ahds.ac.uk/>

6. Acknowledgements

UKOLN is funded by the British Library Research and Innovation Centre (BLRIC), the Joint Information Systems Committee (JISC) of the UK higher education funding councils, as well as by project funding from the JISC's Electronic Libraries (eLib) Programme and the European Union. UKOLN also receives support from the University of Bath, where it is based.

A version of this paper has been published under the title "All you ever wanted to know about... metadata" in Electronic Public Information, March/April 1999, p. 15.


Maintained by: Michael Day of UKOLN: The UK Office for Library and Information Networking, University of Bath.
First published in this form: 02-Mar-1999.
Last updated: 09-Apr-1999.