Choosing a Metadata Standard for Resource Discovery

Background

Resource discovery metadata is an essential part of any digital resource. If resources are to be retrieved and understood in the distributed environment of the World Wide Web, they must be described in a consistent, structured manner suitable for processing by computer software. There are now many formal standards. They range from simple to rich formats, from the loosely structured to the highly structured, and from proprietary and emerging standards to international standards.

There is no set decision-making procedure to follow, but the following factors should normally be considered:

Purpose of metadata: A well-articulated definition of purposes at the outset can act as a benchmark against which to compare standards. Metadata may be for:

Choosing the correct file types and establishing required sizes
Retrieval: Can I find the resource?
Identification: Can I distinguish the resource from other similar resources (e.g. similar titles, or other editions or versions)?
Access: Can I use the resource (e.g. are there legal restrictions on access and usage and is it in a format I can handle)?

Attributes of resource: It is important that you also identify your resource type (e.g. text, image), its domain of origin (e.g. library, archive or museum), subject (e.g. visual arts, history) and the specific features that are essential to an understanding of it. Datasets, digital texts, images and multimedia objects, for instance, clearly have very different attributes. Does your resource have pagination or is it three-dimensional? Was it born digital or does it have a hard-copy source? Which attributes will the user need to know to understand the resource?

Design of standard: Metadata standards have generally been developed in response to the needs of specific resource types, domains or subjects. Therefore, once you know the type, domain and broad subject of your resource, you should be able to draw up a shortlist of likely standards. Here are some of the better-known ones:

Anglo-American Cataloguing Rules (AACR2): Library resources
See <http://www.aacr2.org/>
Data Documentation Initiative (DDI): Social sciences, datasets
See <http://www.icpsr.umich.edu/DDI/>
Dublin Core (DC): All domains, resource types and subjects
See <http://dublincore.org/>
Encoded Archival Description (EAD): Archives
See <http://www.loc.gov/ead/>
ISAD(G): Guidelines for the preparation of archival descriptions
See <http://www.hmc.gov.uk/icacds/eng/ISAD(G).pdf>
MARC 21: Libraries, bibliographic records
See <http://www.loc.gov/marc/>
RSLP Collection Level Description: Collections of all subjects, domains and types
See <http://www.ukoln.ac.uk/metadata/cld/>
SPECTRUM: Museum objects
See <http://www.mda.org.uk/spectrum.htm>
Text Encoding Initiative (TEI): Digital texts
See <http://www.tei-c.org/>
VRA Core 3.0: Visual art images
See <http://www.vraweb.org/vracore3.htm>

The key attributes of your resource can be matched against each standard in turn to find the best fit. Is there a dedicated element for each attribute? Are the categories of information relevant and at a suitable level of detail?
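
The matching exercise can be sketched in a few lines of Python. This is an illustration only: the candidate names and the element sets attributed to them below are hypothetical placeholders, and would need to be replaced with the real element sets of the standards on your shortlist.

```python
# Hypothetical element sets; replace with the real elements of each shortlisted standard.
CANDIDATES = {
    "Standard A": {"title", "creator", "date", "format", "rights"},
    "Standard B": {"title", "creator", "technique", "dimensions", "location"},
}


def coverage(required, elements):
    """Return (fraction of required attributes covered, sorted list of missing ones)."""
    required = set(required)
    if not required:
        raise ValueError("at least one required attribute is needed")
    missing = sorted(required - set(elements))
    covered = len(required) - len(missing)
    return covered / len(required), missing


def rank_standards(required, candidates):
    """Rank candidate standards by coverage of the required attributes, best first."""
    results = []
    for name, elements in candidates.items():
        score, missing = coverage(required, elements)
        results.append((name, score, missing))
    return sorted(results, key=lambda result: result[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical key attributes of an image collection.
    needed = ["title", "creator", "technique", "dimensions", "rights"]
    for name, score, missing in rank_standards(needed, CANDIDATES):
        print(f"{name}: {score:.0%} covered, missing {missing}")
```

A simple count of this kind only shows whether a dedicated element exists; whether each element is at a suitable level of detail still needs human judgement.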

Granularity: At this point it is worth considering whether your metadata should (as is usual) be created at the level of the text, image or other such item, or at collection level. Collection-level description may be provided where item-level metadata is not feasible or as an additional layer providing an overview of the resource. This could be valuable for large-scale digitisation projects or portals where item-level searching may retrieve an unmanageable number of 'hits'. Digital reproductions may be grouped like their real-world sources (e.g. by subject or provenance) or be assigned to multiple 'virtual collections'. The RSLP Collection Level Description is emerging as the leading format in this area.

Interoperability: It is important, wherever possible, to choose one of the leading standards (such as those listed above) from within your subject community or domain. This should help to make your resource accessible beyond the confines of your own project. Metadata that is in a recognisable common format may be harvested by subject or domain-wide portals and cross-searched with resources from many other institutions. In-house standards may be tailored to your precise needs but are unlikely to be compatible with other standards and should be used only where nothing suitable already exists. If your overriding need is for interoperability across all domains or subjects, Dublin Core may be the most suitable standard, but it may lack the richness required for other purposes. Care should be taken to ensure that in-house standards at least map to Dublin Core or one of the DC application profiles.
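
A mapping from an in-house schema to Dublin Core can be recorded as a simple crosswalk table. The sketch below assumes a hypothetical in-house record whose field names are invented for illustration; the target names are the fifteen elements of the Dublin Core Metadata Element Set.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical in-house field names mapped to Dublin Core elements.
CROSSWALK = {
    "object_name": "title",
    "maker": "creator",
    "keywords": "subject",
    "date_made": "date",
    "accession_no": "identifier",
    "copyright_note": "rights",
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Map an in-house record to Dublin Core.

    Returns the mapped record and a list of in-house fields with no mapping.
    """
    dc_record, unmapped = {}, []
    for field, value in record.items():
        element = crosswalk.get(field)
        if element is None:
            unmapped.append(field)
            continue
        if element not in DC_ELEMENTS:
            raise ValueError(f"{element!r} is not a Dublin Core element")
        # Dublin Core elements are repeatable, so values are collected in lists.
        dc_record.setdefault(element, []).append(value)
    return dc_record, unmapped


if __name__ == "__main__":
    in_house = {"object_name": "Vase", "maker": "Unknown", "gallery_shelf": "B12"}
    print(to_dublin_core(in_house))
```

The list of unmapped fields is a useful by-product: it shows which in-house detail would be lost if only the Dublin Core version of the record were shared.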

Support: Using a standard that is well supported by a leading institution can also bring cost benefits. Implementation guidance, user guidance, examples, XML/RDF schemas, crosswalks, multi-lingual capacity, and software tools may pre-exist, thus easing the process of development, customisation and update.

Growth: Consider too whether the standard is capable of further development. Are there regular working groups and workshops devoted to the task?

Extensibility: Also, does the standard permit the inclusion of data elements drawn from other schemas and the description of new object types? It may be necessary to 'mix and match' elements from more than one standard.
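
In XML-based formats, elements from different schemas are usually kept apart by namespaces. The following sketch, using Python's standard `xml.etree.ElementTree` module, builds a record that combines two Dublin Core elements with one local element. The Dublin Core namespace URI is the published one; the `record` wrapper element, the `local` namespace and its `technique` element are hypothetical and stand in for whatever second schema a project might draw on.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"     # Dublin Core element set namespace
LOCAL = "http://example.org/local-terms/"   # hypothetical namespace for extra elements

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("dc", DC)
ET.register_namespace("local", LOCAL)


def build_record(title, creator, technique):
    """Build a record mixing Dublin Core elements with one local element."""
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, f"{{{DC}}}title").text = title
    ET.SubElement(record, f"{{{DC}}}creator").text = creator
    ET.SubElement(record, f"{{{LOCAL}}}technique").text = technique
    return record


if __name__ == "__main__":
    record = build_record("Sunflowers", "Vincent van Gogh", "oil on canvas")
    print(ET.tostring(record, encoding="unicode"))
```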

Reputation: Funding bodies will be familiar with established, international standards - something, perhaps, to remember when applying for digitisation grants.

Ease of use: Be aware that the required level of expertise can vary greatly between standards. AACR2 and MARC 21, for instance, may produce rich bibliographic description but require the learning of rules. The simpler Dublin Core may allow creators to produce their own metadata records without extensive training.

Existing experience: Have staff at your organisation used the metadata standard before? If so, the implementation time may be reduced.

Summary

There is no single standard that is best for all circumstances. Each is designed to meet a need and has its own strengths and weaknesses. Start by considering the circumstances of the individual digital project and identify the needs or purposes that the metadata will have to satisfy. Once that is done, you can evaluate rival metadata schemas and find the best match. A trade-off will normally have to be made between the factors listed above.
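
One possible way to make such a trade-off explicit is a weighted scoring matrix. The sketch below is an assumption about how a project might do this, not a prescribed method: the criteria names, the weights and the 0 to 5 scores are all invented for illustration.

```python
# Hypothetical weights: how much each factor matters to this project.
WEIGHTS = {"interoperability": 3, "ease_of_use": 2, "richness": 1}

# Hypothetical scores (0 = poor, 5 = excellent) agreed by the project team.
SCORES = {
    "Standard A": {"interoperability": 5, "ease_of_use": 4, "richness": 2},
    "Standard B": {"interoperability": 2, "ease_of_use": 2, "richness": 5},
}


def weighted_score(scores, weights):
    """Return the weighted average score; criteria without a score count as 0."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(w * scores.get(criterion, 0) for criterion, w in weights.items())
    return weighted_sum / total_weight


def best_standard(candidates, weights):
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: weighted_score(candidates[name], weights))


if __name__ == "__main__":
    for name, scores in SCORES.items():
        print(f"{name}: {weighted_score(scores, WEIGHTS):.2f}")
    print("Best match:", best_standard(SCORES, WEIGHTS))
```

Changing the weights changes the outcome, which is the point: the numbers only record the priorities a project has already agreed on.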

Further Information

Digital libraries: metadata resources, Sophie Felfoldi, International Federation of Library Associations and Institutions.
<http://www.ifla.org/II/metadata.htm>
Application profiles: mixing and matching metadata schemas, Rachel Heery and Manjula Patel. In: Ariadne, No. 25, 24 September 2000,
<http://www.ariadne.ac.uk/issue25/app-profiles/>