Metadata in a nutshell

by Michael Day, UKOLN: the UK Office for Library and Information Networking, University of Bath, UK.

[Draft of an article published in Information Europe 6(2), Summer 2001, p. 11. Information Europe is the quarterly magazine of EBLIDA (the European Bureau of Library, Information and Documentation Associations)].

Metadata is sometimes defined literally as 'data about data,' but the term is normally understood to mean structured data about resources that can be used to help support a wide range of operations. These might include, for example, resource description and discovery, the management of information resources and their long-term preservation.

While the term 'metadata' originated in contexts related to digital information (chiefly with regard to databases), the general understanding of the term has since broadened to include any kind of standardised descriptive information about resources, including non-digital ones. So, for example, library catalogues, abstracting and indexing services, archival finding aids and museum documentation might all be seen as containing metadata. The advantages of this broader understanding are twofold. Firstly, it allows librarians, archivists and museum documentation specialists to co-operate usefully across professional boundaries. Secondly, it enables the cultural heritage professions to communicate more effectively with other domains that also have an interest in metadata: e.g., software developers, publishers, the recording industry, television companies, the producers of digital educational content and those concerned with geographical and satellite-based information.

Metadata standards and applications

One consequence of this wide range of communities having an interest in metadata is that there is a bewildering number of standards and formats in existence or under development. The library world, for example, has developed the MARC formats as a means of encoding metadata defined in cataloguing rules and has also defined descriptive standards in the International Standard Bibliographic Description (ISBD) series. Other domains have defined metadata standards based on implementations of the Standard Generalised Markup Language (SGML) or the Extensible Markup Language (XML). Examples of these are the Encoded Archival Description (EAD) and the CIMI Document Type Definition (DTD), an SGML DTD developed by the CIMI consortium.

As noted above, metadata is not used only for resource description and discovery. It can also be used to help administer and manage resources, e.g. to record information about their location and acquisition, to record any intellectual property rights vested in resources and to help manage user access to them. Other metadata might be technical in nature, documenting how resources relate to particular software and hardware environments or recording digitisation parameters. The creation and maintenance of metadata is also seen as an important factor in the long-term preservation management of digital resources and in helping to preserve the context and authenticity of resources. Examples of these 'richer' understandings of metadata are the development of an Australian Recordkeeping Metadata Schema (RKMS), the MPEG-7 Multimedia Content Description Interface standard for audio-visual resources and the NISO draft definition of technical metadata for digital still images.

The Dublin Core

Perhaps the best-known metadata initiative is the Dublin Core (DC). The Dublin Core defines fifteen metadata elements for simple resource discovery: title, creator, subject and keywords, description, publisher, contributor, date, resource type, format, resource identifier, source, language, relation, coverage and rights management. One of the specific purposes of DC is to support cross-domain resource discovery, i.e. to serve as an intermediary between the numerous community-specific formats being developed. It has already been used in this way in the service developed by the EU-funded EULER project and by the UK Arts and Humanities Data Service (AHDS) catalogue. The Dublin Core element set is also used by a number of Internet subject gateway services and in services that broker access to multiple gateways, e.g. the broker service being developed by the EU-funded Renardus project.
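
The intermediary role described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a description of how EULER, the AHDS or Renardus work. The fifteen names in DC_ELEMENTS are the short element names of the Dublin Core element set; the two source formats, their field names and the crosswalk tables (LIBRARY_TO_DC, ARCHIVE_TO_DC) are invented for the example, as are the sample records.

```python
from html import escape

# The fifteen element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical crosswalks from two invented community formats to Dublin Core.
LIBRARY_TO_DC = {"main_title": "title", "author": "creator", "topic": "subject"}
ARCHIVE_TO_DC = {"unit_title": "title", "originator": "creator", "unit_date": "date"}


def to_dublin_core(record, crosswalk):
    """Map a community-specific record onto Dublin Core elements."""
    dc = {}
    for field, value in record.items():
        element = crosswalk.get(field)
        if element in DC_ELEMENTS:  # fields with no mapping are dropped
            dc.setdefault(element, []).append(value)
    return dc


def as_meta_tags(dc):
    """Render a Dublin Core record as HTML meta tags, in element-set order."""
    return [
        f'<meta name="DC.{element}" content="{escape(value)}">'
        for element in DC_ELEMENTS
        for value in dc.get(element, [])
    ]


# Invented sample records in the two hypothetical source formats.
book = {"main_title": "Metadata in a nutshell", "author": "Day, Michael", "shelf": "A1"}
fonds = {"unit_title": "Project papers", "unit_date": "2001"}

records = [to_dublin_core(book, LIBRARY_TO_DC), to_dublin_core(fonds, ARCHIVE_TO_DC)]

# A single search over the shared 'title' element now covers both formats.
titles = [t for r in records for t in r.get("title", [])]
```

Each element is held as a list because Dublin Core elements are repeatable. Fields with no Dublin Core equivalent (here the invented 'shelf' field) are simply lost in the mapping, which is the price of using a simple common set for discovery across richer community formats.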

