Collection Level Description

A review of existing practice

...an eLib supporting study


1. Introduction

This study reviews existing practice in providing collection level descriptions within the library, archival, museum and Internet communities. It originated from discussions at MODELS workshops [MODELS], where the need for a review of different approaches to collection level description was identified, particularly in the context of phase 3 of the Electronic Libraries Programme [ELIB]. The study was taken forward by UKOLN as a MODELS recommendation.

At the simplest level one can think of a 'collection' as any aggregation of individual 'items' (also known as objects or resources). Items may be physical or digital. Physical items include books, journals, museum artefacts, photographs, papers, etc. Digital items include Web pages, databases, images, etc. In some cases the digital items are surrogates of physical items; in others the digital item is the primary (or only) manifestation of the item. Some collections are catalogues (metadata) for other collections. For example, a library catalogue, which is itself a collection, typically describes the items in one or more collections within a library. Collections may be grouped by type, by subject area, by geographic location of resources or according to some other criterion. Collections may be permanent or transient. A collection of Web resources, for instance, may exist only long enough to transfer information about the collection from one application to another.
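The relationships described above can be sketched as a small data model. This is only an illustrative sketch, not a schema proposed by the study: the class names `Item` and `Collection` and the fields `surrogate_of`, `describes` and `transient` are assumptions introduced here to mirror the wording of the paragraph.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Item:
    """A single physical or digital resource (hypothetical model)."""
    title: str
    digital: bool = False
    # Set when a digital item is a surrogate of a physical one.
    surrogate_of: Optional["Item"] = None


@dataclass
class Collection:
    """Any aggregation of items, or of other collections (hypothetical model)."""
    title: str
    members: List[Union[Item, "Collection"]] = field(default_factory=list)
    # Non-empty when this collection is a catalogue (metadata) for others.
    describes: List["Collection"] = field(default_factory=list)
    transient: bool = False

    def is_catalogue(self) -> bool:
        return bool(self.describes)

    def count_items(self) -> int:
        """Count items, descending into nested collections."""
        total = 0
        for member in self.members:
            if isinstance(member, Collection):
                total += member.count_items()
            else:
                total += 1
        return total


# Example: a library holding two sub-collections, plus a catalogue describing it.
book = Item("Printed atlas")
scan = Item("Scanned atlas", digital=True, surrogate_of=book)
maps = Collection("Map room", members=[book])
images = Collection("Digitised maps", members=[scan])
library = Collection("Library holdings", members=[maps, images])
catalogue = Collection("Library catalogue", describes=[library])
```

Modelling the catalogue as a `Collection` that points at other collections, rather than as a separate type, reflects the observation that a catalogue is itself a collection.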

Section 2 of this study discusses in more detail what the term 'collection' means, first from the perspective of libraries, archives and museums, and then as the term has more recently come to be used on the World Wide Web.

This study uses the following terminology:

Collection
An aggregation of resources. Collections are exemplified by the following non-exhaustive list: Internet catalogues (e.g. Yahoo); subject gateways (e.g. SOSIG, OMNI, ADAM, EEVL, etc.); library, museum and archival catalogues; Web indexes (e.g. Alta Vista); collections of text, images, sounds, datasets, software, other material or combinations of these (this includes databases, CD-ROMs and collections of Web resources); collections of events (e.g. a lecture series); library and museum collections; archives. A variety of mechanisms for providing collection level descriptions are described in section 3 of this study.
Item
An individual object, for example a Web page, an image file, an audio file, or a movie. Items are often referred to as resources, objects, documents or document-like objects (DLOs). The dividing line between collections and items is somewhat vague because items may themselves be collections of other objects. For example, a Web page may be a collection of text, images, applets, etc. However, because the component parts are intended to be rendered together as a whole, a Web page is typically treated as an item rather than a collection. Description of individual objects is well established within the curatorial traditions. Consider, for example, the MARC records used in libraries to describe books and journals. On the Internet, resource description is less well established, but a variety of mechanisms have been or are being developed, including GILS [GILS] and the Dublin Core [DC]. This study describes these resource description mechanisms in detail only insofar as they may be used to provide collection level descriptions.
Service
An application level service and its associated protocol. Services provide the mechanism by which end users or, in the case of digital resources, end users' client software gain access to collections and their component items. Services may be physical (for example, a library or museum service) or digital (for example, a Z39.50 server). Access to digital services is currently based, for the most part, on the client-server model, though a move towards distributed object models is likely in the future. Digital service descriptions typically provide the client with enough information to connect to the server. The information they contain depends on the particular protocol in use but is often as simple as a machine name, a port and a database name. Section 4 of this report considers a variety of mechanisms that have been, or are being, developed to describe application level services.
Service Provider
An organisation or individual who manages and provides access to resources or collections of resources. Describing organisations and individuals is well understood and a variety of mechanisms for providing directories, often called 'white-pages services', have been developed, both within ISO (X.500 [X500]) and by the Internet community (LDAP [LDAP], WHOIS++ [RFC1835]). This report does not discuss service provider description in any detail.
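
As a rough illustration of the point made under 'Service' above, that a digital service description is often as simple as a machine name, a port and a database name, a minimal sketch might look like the following. The class `ServiceDescription`, its field names, the host `z3950.example.org` and the `host:port/database` string are all assumptions made for this example; they are not taken from any of the description mechanisms reviewed in section 4. Port 210 is the port registered for Z39.50.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Minimal connection details for a digital service (hypothetical)."""
    host: str        # machine name
    port: int        # TCP port the server listens on
    database: str    # database name on that server
    protocol: str = "Z39.50"

    def target(self) -> str:
        """Return an illustrative host:port/database string for a client."""
        return f"{self.host}:{self.port}/{self.database}"


# Example values are placeholders, not a real service.
service = ServiceDescription(host="z3950.example.org", port=210, database="catalogue")
print(service.target())  # prints "z3950.example.org:210/catalogue"
```

A real description would carry whatever further detail the protocol in use requires; the three fields here are only the common minimum mentioned in the definition.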

Andy Powell, UKOLN