Collection Description Focus, Guidance Paper 2 UKOLN

Maintaining collection-level descriptions



1. The problem

A collection-level description describes attributes or characteristics of a collection, and probably describes the relationships between the collection and other entities, such as its location and the various agents who have rights over the collection or who control access to it, and attributes of those related entities.

The values of almost all of these attributes are subject to change over time, as the collection and/or its context of storage and use changes. If a collection-level description is to have continued value, then it needs to reflect these changes. Like many other forms of metadata record, collection-level descriptions should not be regarded as resources to be created, published and left unchanged. Rather, they should be seen as dynamic resources which require active management for as long as the resource described - the collection - is made available.

It is important to distinguish between the form in which your collection-level description data is stored and the forms in which it is made available to users. Quite probably, descriptions will be made available in a number of forms, for example as HTML documents for Web access or as XML documents for sharing with other applications and systems. The generation of these different "representations" may require the selection and/or combination of different subsets of the stored data, but as far as possible these various forms should be generated from a single source.
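As an illustration of generating several representations from a single source, the following minimal sketch renders one stored record as both an HTML fragment and an XML document. The record, its field names and the element names are hypothetical and are not taken from any particular schema; a real service would follow whatever schema it has adopted.

```python
from html import escape
import xml.etree.ElementTree as ET

# Hypothetical stored record; field names are illustrative only.
record = {
    "title": "Papers of A. N. Other",
    "description": "Letters & notebooks, 1900-1950",
    "location": "Example Library",
    "internal_note": "Review due next year",  # management data, not published
}

# The subset of the stored data selected for public representations.
PUBLIC_FIELDS = ("title", "description", "location")


def to_html(rec):
    """Render the public subset of the record as an HTML definition list."""
    rows = "".join(
        f"<dt>{escape(name)}</dt><dd>{escape(rec[name])}</dd>"
        for name in PUBLIC_FIELDS
    )
    return f"<dl>{rows}</dl>"


def to_xml(rec):
    """Render the same public subset as a small XML document."""
    root = ET.Element("collection")
    for name in PUBLIC_FIELDS:
        ET.SubElement(root, name).text = rec[name]
    return ET.tostring(root, encoding="unicode")
```

Because both functions read from the same record, a correction to the stored description appears in every representation the next time it is generated.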

2. Towards maintainable collection-level descriptions

2.1. Store the content of collection-level descriptions in a suitable 'management system'

Collection-level descriptions are a class of metadata record and will probably be used in a range of different contexts [1]. The data which makes up a collection-level description should be stored in a form that:

  • supports the creation and maintenance of the data at an appropriate level of granularity;
  • ensures the security and integrity of the data;
  • permits flexible querying/selection of the data;
  • allows the data to be exposed/exported/presented in different forms as required;
  • takes into account the fact that access to the data will be required over the long term.

Typically, this means storing the data as fields and records within a database, and developing or configuring software tools to manage the content and to present it in different forms as required. The data should also be stored in a way that allows it to be migrated to other storage systems in the future. The system should also allow for the management of any additional data, such as metadata about the collection-level descriptions themselves.

Recommendation: Store the content of collection-level descriptions as structured data in a suitable 'management system'.
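The following is a minimal sketch of what storing descriptions as fields and records might look like, using the SQLite database engine through Python's standard sqlite3 module. The table layout, column names and sample data are assumptions made for illustration; the choice of database product is a local decision.

```python
import sqlite3

# In-memory database for illustration; a real system would use a file or server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE collection (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        description TEXT,
        subject     TEXT,
        -- metadata about the description itself, not about the collection
        modified    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.executemany(
    "INSERT INTO collection (title, description, subject) VALUES (?, ?, ?)",
    [
        ("Maps of Example County", "Printed maps", "Cartography"),
        ("Papers of A. N. Other", "Letters and notebooks", "Literature"),
    ],
)


def find_by_subject(subject):
    """Select collection titles by subject, using a parameterised query."""
    rows = conn.execute(
        "SELECT title FROM collection WHERE subject = ? ORDER BY title",
        (subject,),
    )
    return [title for (title,) in rows]
```

Holding each attribute in its own field is what makes selective querying and export possible; the same data held as a single block of text could only be searched or republished as a whole.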

2.2. Avoid redundancy in storing collection-level descriptions

As noted above, the description of a collection typically involves the description of a number of entities, including the collection itself, its location, catalogues describing the collection, and agents related to these other entities in various ways. The relationships between these entities are typically not one-to-one. A single repository probably hosts several collections, a single individual or organisation may have some relationship with multiple collections, and so on.

Data should be stored so that, as far as possible, redundant duplication is avoided: for example, the attributes of a person or organisation are recorded once in the database and that set of data is referenced as required. This means that if an update is required, it can be made to just one instance of the data and the change is reflected in all the forms derived from that instance.
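A minimal sketch of this principle, assuming a relational database with hypothetical "agent" and "collection" tables: the agent's details are held once, and each collection refers to them by identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent (
        id    INTEGER PRIMARY KEY,
        name  TEXT,
        email TEXT
    );
    CREATE TABLE collection (
        id               INTEGER PRIMARY KEY,
        title            TEXT,
        administrator_id INTEGER REFERENCES agent(id)  -- a reference, not a copy
    );
    INSERT INTO agent VALUES (1, 'Example Library', 'old@example.org');
    INSERT INTO collection VALUES (1, 'Maps of Example County', 1);
    INSERT INTO collection VALUES (2, 'Papers of A. N. Other', 1);
""")


def administrator_emails():
    """Return (collection title, administrator email) pairs via a join."""
    rows = conn.execute("""
        SELECT c.title, a.email
        FROM collection AS c
        JOIN agent AS a ON a.id = c.administrator_id
        ORDER BY c.id
    """)
    return rows.fetchall()


# A single update to the one agent row...
conn.execute("UPDATE agent SET email = ? WHERE id = ?", ("new@example.org", 1))
# ...is then visible through every collection that references that agent.
```

Had the email address been copied into each collection record, the same change would have required one edit per collection, with the risk of some copies being missed.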

Where hierarchical hasPart/isPartOf relationships exist between collections, it may be appropriate to use an "inheritance"-based approach. The SCONE implementation of the Heaney "Analytical Model" [2] advocates this approach:

To avoid unnecessary duplication, data about catalogues, location and access are linked only once to a hierarchically-related set of collections. The link is made to the highest level Collection in the set to which the data applies; that is, at the same level of granularity. It is assumed that the information is valid for all sub-collections in the hierarchy [3].

Recommendation: Design the structure of your management system so that redundant duplication of data is avoided as far as possible.
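The "inheritance"-based approach can be sketched as follows. This is an illustration of the general idea only, not the SCONE implementation; the class, attribute and function names are hypothetical. A value such as location is recorded at the highest level to which it applies, and is resolved for a sub-collection by following its isPartOf links upwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Collection:
    title: str
    parent: Optional["Collection"] = None  # the isPartOf link
    location: Optional[str] = None         # recorded only where it first applies


def effective_location(collection):
    """Walk up the isPartOf chain until a recorded location is found."""
    node = collection
    while node is not None:
        if node.location is not None:
            return node.location
        node = node.parent
    return None  # no level of the hierarchy records a location


# Location is recorded once, on the top-level collection.
archive = Collection("University Archive", location="Main Library")
papers = Collection("Departmental Papers", parent=archive)
minutes = Collection("Committee Minutes", parent=papers)
```

If the archive moves, only the top-level record is changed, and effective_location then returns the new value for every sub-collection. A sub-collection held elsewhere can record its own location, which takes precedence over the inherited one.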

2.3. Establish responsibilities and procedures

Repositories typically apply systematic processes for the creation and management of item-level metadata, in the form of library catalogue records, archival finding aids, museum collection management systems, and so on, including procedures for the quality control of the content of the records. The management of collection-level description records should be approached in a similar way.

The effective maintenance of collection-level description records requires that the task is integrated within the core information management processes of the organisation: it should be clear who has responsibility for the task, and procedures should be established for ensuring that it is carried out. It may be appropriate to establish procedures for the periodic review of collection-level descriptions.
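A periodic review procedure might be supported by a simple report of descriptions that have not been checked recently. The sketch below assumes that each record carries a hypothetical "last_reviewed" date and a "responsible" role, and that the review interval is one year; all of these are illustrative choices rather than recommendations.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=365)  # assumed policy: review once a year

# Hypothetical management data held alongside each description.
records = [
    {"title": "Maps of Example County", "responsible": "Map Librarian",
     "last_reviewed": date(2001, 3, 1)},
    {"title": "Papers of A. N. Other", "responsible": "Archivist",
     "last_reviewed": date(2002, 9, 15)},
]


def due_for_review(recs, today):
    """Return (title, responsible) pairs whose last review is older than the interval."""
    return [
        (r["title"], r["responsible"])
        for r in recs
        if today - r["last_reviewed"] > REVIEW_INTERVAL
    ]
```

Recording who is responsible alongside the review date means the report can be sent directly to the person expected to act on it.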

The clarification of procedures may be particularly important where the creation of collection-level description (CLD) records has been primarily the initiative of an agency other than the organisations holding the collections. If that work has been the responsibility of a fixed-term project, the responsibility for the content of the records after the lifetime of the project must be clarified within the duration of the project.

Further, if a collection-level description is used in a number of systems and services, it is important to consider how any updates to its content are reflected in those systems and services. It may be the case that all such services dynamically "pull" the record from a single source, either each time the content is used or at periodic intervals (e.g. by harvesting). In other cases, it may be necessary for the provider of the record to "push" the updated record to the services which use it. An understanding of the requirements and responsibilities is essential if updates are to be reflected accurately and in a timely manner.

Recommendation: Integrate the maintenance of collection-level description records within existing processes of metadata management, and ensure that clear procedures are developed. Make sure that you understand what is required for updates to your collection-level descriptions to be reflected in other services which use them.
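The difference between "pull" and "push" can be sketched as follows. The record structure, the "modified" timestamp and both function names are hypothetical; a real service would exchange records over whatever protocol the provider and the consuming services have agreed.

```python
from datetime import datetime

# Hypothetical source records, each carrying a last-modified timestamp.
SOURCE = [
    {"id": "cld-1", "title": "Maps of Example County",
     "modified": datetime(2002, 9, 1)},
    {"id": "cld-2", "title": "Papers of A. N. Other",
     "modified": datetime(2002, 10, 20)},
]


def pull(source, last_harvest):
    """Consumer-side pull: take only records changed since the last harvest."""
    return [r for r in source if r["modified"] > last_harvest]


def push(record, subscribers):
    """Provider-side push: hand an updated record to every registered service."""
    for deliver in subscribers:
        deliver(record)
```

For example, a service that last harvested on 1 October 2002 would call pull(SOURCE, datetime(2002, 10, 1)) and receive only the second record. With push, the provider must maintain the list of subscribing services, which is one reason why responsibilities need to be agreed in advance.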

3. Some issues

3.1. The volatility of CLD elements

Dennis Nicholson and Gordon Dunsire [4] highlight that although the classes of entities described in CLDs (i.e. agents, locations and subjects) are broadly similar to those described in item-level metadata, at least for bibliographic items, the range of agents is wider: collection description incorporates "collectors", "owners" and "administrators", as well as the creators of items. The relative volatility of descriptive content is also different: for collections, locations are likely to be persistent while some of the related agents may change, and it is possible to imagine accruals to a collection resulting in additions to the subject terms associated with the CLD.

Acknowledgements

Much of the content included here has been developed from discussions with participants in the Collection Description Focus Workshop series.

References

[1] Collection Description Focus. Creating reusable collection-level descriptions. Collection Description Focus Guidance Paper 1. October 2002. Available at http://www.ukoln.ac.uk/cd-focus/guides/gp1/

[2] Heaney, Michael. An Analytical Model of Collections and their Catalogues. Third issue revised, Jan 2000. Available at http://www.ukoln.ac.uk/metadata/rslp/model/amcc-v31.pdf

[3] Dunsire, Gordon. Technical and functional description of the SCONE demonstrator service. Final Report of the RSLP SCONE project Annexe B.1. Jun 2002. Available at http://scone.strath.ac.uk/FinalReport/SCONEFPNXB1.pdf

[4] Nicholson, Dennis and Gordon Dunsire. From Functional Granularity to People Interoperability: Applied Collection Description in the SCONE project. Presentation to OCLC/SCURL seminar, New Directions in Metadata. Aug 2002. Available at http://www.oclcpica.org/content/1111/pdf/DennisNicholsonGordonDunsire.pdf