A distributed national electronic resource

 

Introduction

JISC provides a range of information services for the UK higher education community. Collectively these are the basis of a national electronic resource. However, they stand alone as network services: they are not linked together, and no higher-level services exist to combine them in useful ways. So, for example, one interacts individually with the bibliographic databases at the data centres: to search several, one must approach each in turn through its own interface and logon procedure. <another non-bibliographic example>

It is agreed that this is not a sustainable approach to the provision and use of rich learning and research materials. For this reason, the JISC intends to promote an additional layer of service which weaves these resources into a fabric of integrated use. The ‘distributed national electronic resource’ (DNER) is the name given to this policy objective.

The DNER will improve the quality of teaching, research and learning by:

• Improving the discovery and use of information resources, releasing the value of the JISC investment.

• Improving the effectiveness of user interaction with resources, providing point-and-click access to rich communication, learning and information environments.

A three-phase approach to DNER development is proposed here. The phases provide a way of managing development and can proceed in parallel. They are:

• The description phase. Concerned with resource and service description to ensure relevant rendezvous between users and resources.

• The federation phase. Concerned with management of service integration through additional federating services.

• The embedding phase. Concerned with providing interpretive, integrating and communications services which further embed the DNER services in users’ learning and working environments. (not entirely clear about this and it has not been developed very much below – it moves into some more interesting local/personal issues though.)

Alongside these phases, we identify the crucial importance of outreach and consensus activities. Users of the DNER will also be users of the National Grid for Learning, of the National Electronic Library for Health, of New Library, and of a multitude of other resources and services. It is important that the ‘content infrastructure’ that is put in place to manage the DNER is as closely aligned with developments elsewhere as possible, and, moreover, that the lessons learned in its construction are shared in targeted ways with other national initiatives and sectors.

Finally, it is worth noting that the DNER will be a significant social and engineering enterprise. Its construction should be planned. We believe that this should involve significant prior specification and design work coupled with focused competitive calls. A more general call without prior specification and preparation will not deliver the benefits we believe are possible.

Some words

A feature of current discussions is that there is no settled vocabulary. Some terms have partial or sectoral associations which are not commonly shared. We do not intend to contribute much to the solution of this problem here, but we do need some common understandings. Accordingly, we adopt the following usages:

• Resource. This is an informational or learning entity of interest. It may be a database, a newsgroup, a mailing list, a journal article, a learning environment, an image, a map, a geographic information system, and so on. Various resource typologies exist, but we do not need any finer discrimination for present purposes. Resources typically reside in collections, where a collection comprises similar resources. These collections might be databases, web sites, document supply centres or data libraries. Such collections are also, of course, resources, and collections may contain other collections.

• Network service. A resource is made network accessible through a network service. So, for example, a catalogue may be made accessible through a telnet service, an http service, and a Z39.50 service. <non bibliographic example> A ‘broker’ is a service which provides consistent access to other services, typically to heterogeneous or homogeneous services from a number of variously located service providers.
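
The nesting described above (resources sit in collections, collections are themselves resources, and collections may contain collections) can be sketched as a small recursive structure. This is only an illustration under stated assumptions: the class name `Resource`, its fields, and the example names are hypothetical and are not part of any DNER specification.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical model: an entity of interest, possibly holding other resources."""
    name: str
    kind: str  # free-text type label, e.g. "database" or "journal article"
    members: list["Resource"] = field(default_factory=list)

    def is_collection(self) -> bool:
        # Simplification for this sketch: a collection is a resource with members.
        return bool(self.members)

    def walk(self):
        """Yield this resource, then every resource nested beneath it."""
        yield self
        for member in self.members:
            yield from member.walk()


# Invented example data: a data library holding a journal holding an article.
article = Resource("An example article", "journal article")
journal = Resource("An example journal", "journal", [article])
data_library = Resource("An example data library", "data library", [journal])

for item in data_library.walk():
    label = "collection" if item.is_collection() else "item"
    print(f"{item.name} ({item.kind}, {label})")
```

Treating a collection as just another resource means that anything which can describe or locate a resource can describe or locate a collection in the same way.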

 

Phase one

The description phase is concerned with the structured disclosure of DNER resources and network services. Description supports the relevant rendezvous of users, resources and services.

Descriptions should be readable by human users, but also by machine users acting on their behalf. The following entities need to be described:

• Resources. A description of the ‘content’ of a resource to facilitate discovery, selection, and use.

• Network services. A description, or profile, of access and data exchange characteristics to facilitate client access and use. In order for a client or broker to access a network service it must know: the location of that service; the access protocol (which may be a search and retrieve protocol, a directory protocol or other form of protocol depending on the service being accessed); the request format (which may be defined by a query language); and the schema(s) relevant to the service (for example the metadata format in use).

• Conditions of use. These may be associated with particular resources and services, and vary according to user and use.

• Schema. A schema describes the structure (attributes and relationships) of a data item such as a resource description. Schema registries are central repositories which can provide human- or machine-readable schema descriptions.
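
To make the network service description concrete, the four things a client or broker must know (location, access protocol, request format, schemas) can be written down as a machine-readable record. The sketch below is an assumption-laden illustration: the names `ServiceProfile`, `ResourceDescription` and `services_for`, the example addresses, and the field values are all invented here and do not represent an agreed DNER profile format.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProfile:
    """Hypothetical profile of one network service."""
    location: str         # where the service is found
    access_protocol: str  # e.g. "telnet", "http", "z39.50"
    request_format: str   # e.g. the query language the service accepts
    schemas: list[str] = field(default_factory=list)  # metadata formats in use


@dataclass
class ResourceDescription:
    """Hypothetical description of a resource and the services exposing it."""
    title: str
    resource_type: str
    conditions_of_use: str = ""
    services: list[ServiceProfile] = field(default_factory=list)


def services_for(resource, protocols):
    """Return the profiles a client can use, given the protocols it speaks."""
    return [s for s in resource.services if s.access_protocol in protocols]


# Invented example: one catalogue reachable through two services.
catalogue = ResourceDescription(
    title="Example library catalogue",
    resource_type="catalogue",
    conditions_of_use="members of UK higher education institutions",
    services=[
        ServiceProfile("telnet://catalogue.example.ac.uk", "telnet", "menu"),
        ServiceProfile("catalogue.example.ac.uk:210/main", "z39.50",
                       "type-1 query", ["MARC"]),
    ],
)

usable = services_for(catalogue, {"z39.50", "http"})
print([s.location for s in usable])
```

A directory service holding records of this kind could answer both the human question "what is this resource?" and the machine question "how do I connect to it?".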

Approach

Two stages are necessary: a specification stage and an implementation stage. There may be some merit in managing these stages separately. The former needs to resolve some outstanding research and development issues. The latter leads to a production ‘directory’ service, which provides human and machine interfaces and reusable descriptions.

Some outstanding issues

• Metadata and retrieval. Traditionally, information retrieval applications have imposed a very ‘flat’ model of description and retrieval. Individual records describe individual resources by several attributes. Neither relationships between resources nor retrieval by resource type has been well supported. This is increasingly recognised as a severe limitation. For example, Dublin Core version 2 will support a more structured approach which recognises resource types and their relationships, and RDF will allow a richer approach to the design and encoding of metadata schemes. This becomes important as we want to describe collections, resources within collections, and so on, and work here will need to reflect that. It is not yet clear what protocols will query this data. This is an inhibitor to current resource description services.

• Persistent resource identification. Network resources are typically identified by a URL. However, a URL is not a persistent identifier, which leads to well-known difficulties. As resources become more volatile, as they are mirrored across different locations, and as our information environments become richer (e.g. we would like to bookmark a particular database query from within a learning environment), we increasingly need persistent identifiers.
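
The persistent identification issue can be illustrated with a minimal resolver: a stable identifier is mapped to one or more current locations, so that a resource can move or be mirrored without breaking references to it. This is a sketch only, not a proposal for a particular identifier scheme; the class `Resolver`, the identifier string, and the URLs are hypothetical.

```python
class Resolver:
    """Hypothetical lookup from a persistent identifier to current locations."""

    def __init__(self):
        self._locations = {}  # identifier -> list of URLs, preferred first

    def register(self, identifier, url):
        """Record a location (for example a mirror) for an identifier."""
        urls = self._locations.setdefault(identifier, [])
        if url not in urls:
            urls.append(url)

    def withdraw(self, identifier, url):
        """Forget a location that is no longer valid; unknown URLs are ignored."""
        urls = self._locations.get(identifier, [])
        if url in urls:
            urls.remove(url)

    def resolve(self, identifier):
        """Return the preferred current location for an identifier."""
        urls = self._locations.get(identifier)
        if not urls:
            raise KeyError(f"no known location for {identifier}")
        return urls[0]


resolver = Resolver()
resolver.register("example-id:0001", "http://site-a.example.ac.uk/item/1")
resolver.register("example-id:0001", "http://mirror.example.ac.uk/item/1")
first = resolver.resolve("example-id:0001")

# The original site disappears; the identifier itself does not change.
resolver.withdraw("example-id:0001", "http://site-a.example.ac.uk/item/1")
second = resolver.resolve("example-id:0001")
print(first, second)
```

A bookmark or citation that stores the identifier rather than the URL continues to work after the move, which is the property a URL alone cannot offer.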

 

Phase two

The federation phase aims to layer additional integrating services on top of current service components. Again, several stages can be identified:

• Identification of motivating service groups. The DNER has been a centrally led initiative, which has been welcomed by the community. While it is clear that a higher level of integration is desirable, it is less clear what form that should take and over what services. Accordingly, this stage will identify usage scenarios which will motivate the construction of the DNER. Examples of scenarios with a resource type, a subject, and a ?? focus are given here:

    • The DNER journal collection. A first step might be unified access to DNER abstracting and indexing services across data centres. This could be followed by work on a common way of locating holdings against catalogue files, and then by a consistent way of requesting journals. The goal is to provide an end-user with point-and-click access to the journal literature.

    • Subject hubs. The RDN will be organised around faculty-level hubs, but currently services concentrate on one collection type (internet resource descriptions). A subject hub which federated different collection types (internet resource descriptions, data sets, journal articles, mailing lists) would be interesting. Edina and EEVL are moving towards this type of arrangement.

    • ??

• Definition of network service profiles. These define the services offered on the network. Where possible and relevant, services should conform to agreed service profiles to ensure maximum participation in the DNER. A set of common profiles which cover common resource types will be defined, and conformance to them will influence the selection of resources and suppliers.

• Implementation of broker services. The DNER will be realised by providing higher-level broker access to network services. The ‘directory’ service outlined in Phase 1 is itself a broker supporting discovery and use. The long-term ambition is to develop brokers so that they provide a plug-and-play infrastructure for the use and exploitation of content. Initial broker services, however, are likely to require a good deal of customisation or bespoke development, given the lag between standards developments and requirements and the immaturity of the current infrastructure. Each broker will realise a particular service combination, as described in the scenarios above.
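
The broker idea can be sketched in a few lines: one query goes to several independently run services, and the results come back through a single consistent interface. The example below is a deliberately simplified illustration, assuming each target service is wrapped in a function that accepts a query string and returns a list of records. The class `Broker`, the target names, and the in-memory stand-in data are all hypothetical; a real broker would sit on top of protocols such as Z39.50 or http.

```python
class Broker:
    """Hypothetical broker giving one search interface over several services."""

    def __init__(self):
        self._targets = {}  # target name -> search function

    def add_target(self, name, search):
        """Register a service; `search` takes a query and returns a list of dicts."""
        self._targets[name] = search

    def search(self, query):
        """Query every target; return (merged records, failures by target name)."""
        merged, failures = [], {}
        for name, search in self._targets.items():
            try:
                records = search(query)
            except Exception as exc:  # one failing service must not stop the rest
                failures[name] = str(exc)
                continue
            for record in records:
                merged.append({**record, "source": name})
        return merged, failures


def make_target(titles):
    """Build an in-memory stand-in for a remote abstracting and indexing service."""
    def search(query):
        return [{"title": t} for t in titles if query.lower() in t.lower()]
    return search


def unavailable(query):
    raise ConnectionError("service unavailable")


broker = Broker()
broker.add_target("centre-a", make_target(["Metadata for collections", "Soil surveys"]))
broker.add_target("centre-b", make_target(["Distributed metadata registries"]))
broker.add_target("centre-c", unavailable)

records, failures = broker.search("metadata")
print(records)
print(failures)
```

Labelling each record with its source keeps the identity of the contributing service visible to the user, and reporting failures separately lets the broker return partial results when one service is down.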

 

The JISC is currently investing in a range of ‘broker’ services. Hybrid libraries, clumps, the AHDS gateway, subject gateways, and so on, all broker access to other services in various ways. These will further benefit from agreed service profiles and the directory service as discussed here. These brokers support discovery services, and their emphasis is on bibliographic or textual resources. The DNER brokers will integrate a wider range of service types in various combinations. Some broker scenarios (e.g. ??) would involve significant research and development effort, and would be considerably in advance of the state of the art. For this reason, it will be necessary to select initial broker services with an eye to practical deployment and usefulness. The technical elaboration of the MIA provides a framework within which to define brokers. Competitive bids for brokers should be invited against tight guidelines which emphasise the centrality of a strongly motivated group of services, and the importance of modular development, reuse of existing components, defined APIs, and an open source approach.

Some outstanding issues

• Distributed authentication.

• Charging.

• Protocol framework. The emergence of XML, the object web, and a potential variety of new structured applications means that there are risks associated with development. This reinforces the need for a designed, modular approach.

• Lack of experience in distributed environments. Such things as user support, resource brand and identity, user control, and so on need to be explored in such layered service environments.

 

Phase three

The embedding phase is concerned with the wider integration of the DNER with learning and work environments, in institutional and personal contexts. The DNER will be successful when it becomes invisible, a part of the furniture of people’s routine working and learning.

This phase has two strands of activity.

• The local habitat. The hybrid library projects are concerned to create integrated user environments for access to information resources. In a sense they aim to create the user side of the coin, where the DNER creates the provider’s. The next stage is to create several exemplar institutional environments where information, learning and other resources are brought together in a user’s normal working environment, together with rich communications and other tools. Whereas other programmes create the pedagogical framework for learning applications, this will provide patterns for weaving them into integrated institutional services.

• Further content infrastructure. Such integration will depend on further infrastructure components which need work, especially user profiles, agreed interfaces between emerging learning and information approaches, and a consistent approach to sharing interpretive contexts. The latter might be realised through structured, searchable, sharable ‘essays’ or ‘folders’ which provide instructional, training, learning, or other interpretive contexts for services.

The institutional habitats will be the subject of competitive calls for finite exemplar projects. Projects must show an ability to build on the state of the art. The infrastructure components will be the subject of calls for concerted actions to achieve consensual solutions.

 

Lorcan Dempsey, 10 February 1999.  Comments from Andy Powell and Tracy Gardner, 10 February 1999.

 

Comments to L.Dempsey@ukoln.ac.uk.