Daniel Greenstein, Arts and Humanities Data Service Executive (Daniel.Greenstein@ahds.ac.uk) and
Lorcan Dempsey, UK Office for Library and Information Networking (L.Dempsey@ukoln.ac.uk)
Network technologies have helped to lower many of the geographical barriers which impede scholars' access to information resources. From a desktop computer we can search the holdings of libraries, archives, museums, and digital collections worldwide. Where digital collections are concerned, such searches may open out onto the resources themselves. An online database comprising information about World Wide Web resources, for example, enables its users to discover and then, via hypertext links, to use the resources referred to in the database. The reduction of geographical obstacles, however, has revealed others that were hitherto obscured. Different approaches to resource description (more commonly known amongst librarians as cataloguing) are significant amongst them. Quite simply, historical manuscripts are described differently from books, which are in turn described differently from databases of art-historical images and archaeological excavation archives. These differences are understandable and depend in part on a resource's structure, provenance, and intellectual content, on the interests and professional practices of those who may be charged with its conservation and management, and on the wishes of users who want access to it. They can, however, frustrate resource discovery where resource discovery entails searching online catalogue databases of differently described resources.
Take two examples. A cultural historian may be interested in the image of the city in modernist literature. A preliminary investigation might wish to consider the Dublin of Joyce, the Prague of Kafka, and the discussions of Paris by Walter Benjamin. Books and serial articles, demographic and other social data sets, and images are equally appropriate. A child doing a school project on butterflies and natural selection provides a second example. S/he may wish to discover the existence of some museum objects, some textbook or encyclopaedia discussions, and some images. One might expect much of the material for these exercises to be available somewhere on the network. However, given the current diversity in how such material is described and made accessible via online information systems, discovery is both time-consuming and tedious. Library catalogues will provide access to relevant books in libraries. Information systems are under development which will provide access to relevant Internet resources, museum objects, image databases, geospatial reference data, statistical and scientific data sets, and so on. Such systems serve 'domain-specific' collection, user, curatorial, or other requirements.
Yet in these two examples, as in so many others, users have cross-domain information needs. That is, they require access to information about relevant materials irrespective of where, how (e.g. as books, audio tapes, digital objects), or by whom (e.g. librarians, data archivists, museum curators) they are stored, and regardless of the manner in which they are described or catalogued. They want to query any number of domain-specific information systems in parallel. In this respect, users require a framework which will allow resource discovery across domains. So do information managers who wish, for example, to integrate access to collections which are sufficiently diverse as to require different and separate catalogues. A university, for instance, may wish to enable students and teachers to discover scholarly materials irrespective of whether information about those materials is described and organised differently in separate library, archive, and museum information systems. It may further seek to integrate access to a particular range of externally managed Web-based information resources. In all these cases, information professionals and users demonstrate an interest in a framework that will facilitate meaningful integrated access to the intellectual record generally.
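The cross-domain need described above can be illustrated with a minimal sketch. It assumes two hypothetical catalogues (a library and a museum) whose field names and records are invented for illustration, and it maps each to three common discovery fields (title, creator, subject) before searching. It is not the framework developed in this report, only one simple way to picture a single query running over differently described collections.

```python
# Hypothetical catalogues: all field names and records are invented for illustration.
LIBRARY = [
    {
        "title_statement": "Ulysses",
        "main_entry": "Joyce, James",
        "subject_headings": ["Dublin (Ireland) -- Fiction"],
    }
]
MUSEUM = [
    {
        "object_name": "Photograph of a Dublin street",
        "maker": "Unknown",
        "keywords": ["street scenes"],
    }
]

# Assumed crosswalks: domain-specific field -> common discovery field.
CROSSWALKS = {
    "library": {
        "title_statement": "title",
        "main_entry": "creator",
        "subject_headings": "subject",
    },
    "museum": {"object_name": "title", "maker": "creator", "keywords": "subject"},
}

COMMON_FIELDS = ("title", "creator", "subject")


def to_common(record, crosswalk, source):
    """Re-express a domain-specific record using the common discovery fields."""
    common = {"source": source}
    for field_name, value in record.items():
        target = crosswalk.get(field_name)
        if target is not None:
            common[target] = value
    return common


def search(term, catalogues):
    """Return common-format records whose discovery fields contain the term."""
    term = term.lower()
    hits = []
    for source, records in catalogues.items():
        for record in records:
            common = to_common(record, CROSSWALKS[source], source)
            values = []
            for key in COMMON_FIELDS:
                value = common.get(key, [])
                values.extend(value if isinstance(value, list) else [value])
            if any(term in str(v).lower() for v in values):
                hits.append(common)
    return hits


CATALOGUES = {"library": LIBRARY, "museum": MUSEUM}
results = search("dublin", CATALOGUES)
for hit in results:
    print(hit["source"], "-", hit["title"])
```

In this sketch the user issues one query and receives records from both collections in a single shape, even though the underlying catalogues use different field names.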
This publication reports work on that part of the framework which involves metadata for resource discovery, that is, the descriptive information which is supplied for information resources to facilitate their location or discovery by interested users. The work has been conducted under the auspices of the Arts and Humanities Data Service (AHDS 1997) and the UK Office for Library and Information Networking (UKOLN 1997), and with funding from the Joint Information Systems Committee of the UK's Higher Education Funding Councils (JISC 1997). It grows out of a common understanding that scholars require access to information resources irrespective of where, how, and by whom they are stored, described, and managed. For UKOLN, the understanding grows out of research into the means of networking access to scholarly materials generally. For the AHDS, it grows out of a narrower service requirement: integrating user access to information about very different collections of scholarly data being developed by five geographically distributed Service Providers, each of which is working on behalf of a particular arts and humanities community.
The report contains detailed recommendations regarding metadata which we feel will support genuine cross-domain resource discovery. These recommendations are based closely upon the Dublin Core, which has been developed under the auspices of the Online Computer Library Center (OCLC), and which is reported by Stuart Weibel in Chapter 1. There are, however, at least two substantive differences between the broader Dublin Core and the recommendations that are made here. Firstly, the Dublin Core workshop series has had an intellectual and heuristic focus. It will continue to do so, but will increasingly be enriched by the deployment experiences and choices of work such as that reported here. Secondly, the Dublin Core reflects discussions about the resource discovery requirements associated with document-like objects (notably Web-based ones) and images. The recommendations presented here reflect the wider range of media and subject perspectives of interest to the arts and humanities communities that the AHDS represents and serves. Subject perspectives include: archaeology; history; literary, linguistic, and other textual studies; the performing arts; and the visual arts. Media perspectives are as wide ranging and include: archaeological excavation and survey data; databases; digital images and image banks; electronic texts; film and video (in both digital and non-digital formats); geospatial information; linguistic corpora; multi-media objects; and sound recordings (in both digital and non-digital formats).
Research on the framework took place in a series of carefully focused workshops, summary reports from which are presented in Chapter 2. The first workshop was hosted by UKOLN in December 1996. Sponsored by the UK's Electronic Libraries Programme and entitled 'Integrating Access to Resources Across Multiple Domains', the workshop was the fourth in a series entitled Moving to Distributed Environments for Library Services (MODELS). It involved representatives from different curatorial and scholarly communities and a number of information retrieval specialists, and evaluated the desirability and then the basic requirements of so-called cross-domain resource discovery. It prepared the groundwork for subsequent specialist workshops by affirming the essential scholarly importance of cross-domain discovery and by helping to distinguish between highly generalisable metadata for resource discovery and metadata presented in more detailed formats as appropriate for other more specialist purposes (e.g. data use, management, and preservation). The workshop also indicated that having located an information resource with the benefit of resource discovery metadata, a user might wish to 'drill down' or have access to that more detailed and specialist information. Here, the workshop articulated a precise and well-defined user requirement for the packaging of specialist metadata formats within highly generic resource discovery metadata. In this regard it reiterated recommendations arising from the second Dublin Core meeting which was hosted earlier in 1996 by OCLC and UKOLN (Weibel and Dempsey 1996).
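The 'drill down' requirement described above, in which specialist metadata is packaged within generic resource discovery metadata, can be pictured with a small sketch. The class name, the scheme label, and the example record are all hypothetical and are not taken from the workshop reports; the point is only that a generic record can carry more detailed descriptions that a user reaches after discovery.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DiscoveryRecord:
    """Generic discovery metadata with optional specialist packages (hypothetical)."""

    title: str
    creator: str
    subject: List[str]
    # Specialist metadata keyed by an assumed scheme label, e.g. for use,
    # management, or preservation purposes.
    packages: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def drill_down(self, scheme: str) -> Optional[Dict[str, Any]]:
        """Return the specialist description for a scheme, or None if absent."""
        return self.packages.get(scheme)


# Invented example record: the generic fields support discovery, and the
# packaged description holds detail a specialist might need afterwards.
record = DiscoveryRecord(
    title="Excavation archive for an example site",
    creator="Example Field Unit",
    subject=["archaeology", "excavation"],
    packages={"excavation-archive": {"site_code": "X1", "trench_count": 3}},
)

detail = record.drill_down("excavation-archive")
print(detail)
```

A search tool would match on the generic fields only; the packaged description is consulted once the resource has been located.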
Six specialist workshops followed on from the MODELS meeting and concentrated on resource discovery from the subject and media perspectives represented by the AHDS. The following workshops, each involving a large and appropriate range of scholarly and curatorial professionals, produced comprehensive reports which are available from the AHDS's Web pages (AHDS 1997):
In order to ensure comparability across the workshops, they were conducted according to a set of closely written guidelines prepared by Paul Miller, a metadata and resource discovery expert and now a member of the Archaeology Data Service. The guidelines (Miller 1997) ensured that a common approach was adopted by each workshop. Notably that each workshop from its own domain perspective:
By requiring a number of common inputs (e.g. papers about metadata, the Dublin Core, and about data description practices or standards of relevance to the domains), the guidelines also ensured that the workshops' members shared a common understanding of the issues being addressed. The guidelines finally required commonly organised workshop reports. These facilitated the identification of divergent and convergent domain-specific resource-discovery requirements and the development of a unifying metadata format. That format emerged from a final or summary meeting convened by Paul Miller and involving the workshop organisers and AHDS and UKOLN representatives. It is documented in extenso in Chapter 3 and makes up the heart of this report.
Chapters 4 and 5 set the AHDS's and UKOLN's work on resource discovery in different but essential contexts. In Chapter 4, Daniel Greenstein and Robin Murray address implementation issues with reference to the suite of resource discovery tools being developed for the AHDS. Based upon the Z39.50 network applications protocol (Library of Congress 1997a) and exploiting the commonality provided by the unifying metadata format presented in Chapter 3, these tools will enable scholars to search seamlessly across a range of differently structured collection catalogues. In Chapter 5 Lorcan Dempsey, Rosemary Russell, and Rachel Heery place AHDS/UKOLN work on metadata in the context of that on integrating access to networked information resources more generally.
Chapter 6, by Miller, acts as a conclusion and crystallises the methodological but also the organisational lessons that were learned through the AHDS's and UKOLN's investigations. It also outlines further work that is required. Finally, a bibliography provides pointers for those who wish to know more about metadata for resource discovery, about the Dublin Core and its uses, about the resource description standards and controlled vocabularies that are regularly in use in the subject areas and with the media types examined by the AHDS and UKOLN, and about the Z39.50 network applications protocol.
Although it is hoped that our detailed recommendations about the use of metadata will prove instructive to many communities, there are other lessons which derive from our work. The present report would not have been possible without the conscientious involvement of hundreds of individuals, many of whom are listed in the acknowledgements below. Representing numerous institutions, academic specialisms, information providers, curatorial professions, and others holding a particular stake in the creation, management, or use of scholarly resources, they came together in person and in sustained electronic discussion. Their dialogue defined a set of common problems in terms which members of diverse communities could agree to understand together. It identified mutually satisfactory and compatible solutions to those problems which we hope will be trialled through extensive testbed application. The AHDS, UKOLN, and the editors are deeply indebted to all those who contributed in this process whether by drafting a report, attending a workshop, participating in an electronic discussion, or responding to a draft document. Your willingness to identify and cultivate a common ground belies those who tell us that communication between different stakeholding communities is the greatest single obstacle to more effective exploitation of network and digital technologies.
This workshop process would have been impossible without generous funding from the UK Higher Education Funding Councils' Joint Information Systems Committee. Thanks are also due to the following people who attended an AHDS/UKOLN workshop or provided feedback about reports arising therefrom.
Last modified: Monday, 17-Nov-97 16:52:01 GMT by D. Greenstein
This page was originally part of the Arts and Humanities Data Service (AHDS) Website: http://ahds.ac.uk/public/metadata/disc_02.html