Distributed Systems

Identifiers for learning objects - a discussion paper

Introduction

This paper proposes a set of requirements for the identification of learning objects. It is a discussion paper, so comments are welcome - send them to the cetis-metadata@jiscmail.ac.uk mailing list. The intention is to develop a set of learning object identifier requirements that can be used to determine which of the current technologies (URI, URN, URL, PURL, Handle, DOI, POI ...) are best suited to the needs of the UK higher and further education community.

Why do we need identifiers?

The need for unique and persistent identifiers can be summed up as follows:

Unique and persistent identifiers are needed in order that, having discovered a resource, people can reliably cite it without having to perform the discovery again, and can pass the citation on for use by others. Similarly, software applications use identifiers to reliably reference resources, to share those references with other applications and to instantiate linkages between resources (for example, between a metadata record and the resource it describes). Often, though not always, there is a related requirement that people and software can use the identifier as a mechanism for accessing the resource, i.e. that there is a service that 'resolves' the identifier to the current location of the resource.

It should be remembered that, in the context of e-learning systems, identifiers need to be assigned to both learning objects and metadata records about those learning objects, i.e. learning objects and metadata records need to be treated as different 'resources'. There is no obvious requirement that the same kind of identifier be assigned to both objects and metadata, so the requirements for the two kinds of identifiers are treated separately below.

Works, expressions, manifestations, ...

As indicated above, in the context of this discussion paper, the simple answer to the question "What do we want to identify?" is:

learning objects,
metadata records about learning objects.

We do not need to try to answer the question "What is a learning object?" here. However, it is worth noting the difference between a 'work' (the abstract entity representing a "distinct intellectual or artistic creation"), an 'expression' (the "intellectual or artistic realization of a work") and a 'manifestation' (the "physical [or digital] embodiment of an expression of a work"). While it is not necessary to dwell on these definitions here - interested parties should read the IFLA report on the Functional Requirements for Bibliographic Records - it is worth considering which of these we want to identify.

There are some scenarios in which it is necessary to identify the 'work', for example when making general recommendations about software packages...

"Crystal Studio is a recommended resource for the teaching of crystallography at undergraduate level."

However, it is more common to need to cite a particular 'manifestation' (of an 'expression') of the 'work', for example when pointing a student to a particular release or language version of a learning object...

"To perform this exercise you will need a copy of Crystal Studio version 5.0 (versions 4.0 Lite and 4.0 Professional do not support the required options)."

Therefore, I would suggest that identifiers are assigned to each manifestation of a learning object, which in practice means that each new combination of release, language translation and platform version of a learning object should be assigned a separate identifier.
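
As an illustration only, the suggestion above can be sketched in a few lines of Python. The sketch assumes a manifestation is distinguished by its release, language and platform, and uses a UUID-based URN purely as a stand-in for whichever identifier scheme is eventually chosen. The Manifestation class, the assign_identifiers function and the language and platform values are hypothetical and are not taken from any existing system.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes instances hashable, so usable as dict keys
class Manifestation:
    work: str
    release: str
    language: str  # values used below are hypothetical
    platform: str  # values used below are hypothetical


def assign_identifiers(manifestations):
    """Return a mapping that gives each distinct manifestation its own identifier."""
    # uuid4().urn is only a placeholder for the chosen identifier scheme.
    return {m: uuid.uuid4().urn for m in set(manifestations)}


catalogue = [
    Manifestation("Crystal Studio", "5.0", "en", "Windows"),
    Manifestation("Crystal Studio", "5.0", "fr", "Windows"),
    Manifestation("Crystal Studio", "4.0 Lite", "en", "Windows"),
    Manifestation("Crystal Studio", "4.0 Professional", "en", "Windows"),
]
identifiers = assign_identifiers(catalogue)
```

Under these assumptions the four entries receive four different identifiers, so a citation of version 5.0 cannot be confused with a citation of version 4.0 Lite or of the French translation.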

Learning object identifier requirements

The following list forms a proposed statement of requirements for the identifiers of learning objects within the UK HE and FE community. Learning object identifiers should be:

Persistent: Learning object identifiers should be expected to work reliably for 10-15 years after they have been assigned.
Unique: Learning object identifiers should be unique, i.e. the same identifier should not be assigned to more than one learning object. (Note that a learning object may have more than one identifier assigned to it).
Resolvable: All identifiers assigned to learning objects should be resolvable, i.e. there should be a 'resolution service' that accepts the identifier and returns a URL for a current location of the learning object.
Usable in Web browsers: All identifiers assigned to learning objects should be actionable in current Web browsers, i.e. they should take the form of 'clickable' links without the need for additional plug-ins. (Ideally, the 'resolution service' should not make use of HTTP redirects which leave the URL of the current location of the learning object in the browser location bar. The reason for this is that if the end-user then bookmarks the learning object, they bookmark the URL, not the identifier. However, this may not be technically possible to achieve.)
Transportable: Learning objects should be able to move between repositories (locations) without the identifier needing to be changed.
Simple to assign: The process of assigning identifiers to learning objects should be as simple as possible. Assignment should be independent of the workflow associated with creating and managing a learning object. In particular, assignment should be independent of the process of depositing the learning object in a repository. For example, if a person creates a learning object on their PC, and packages it using a desktop tool, creating metadata about it in the process, then they should be able to assign an identifier to both the learning object and the metadata record and have those two identifiers honoured (i.e. used) by a repository (or several repositories) when the learning object is deposited. The creator doesn't want the repository to assign a different identifier to the object and metadata when they deposit them; nor do they want to have to wait until the point of deposit before they can assign an identifier.
Assignable in devolved environments: Assigners of identifiers should be able to work independently of each other, without reference to a central service, in such a way that guarantees uniqueness of each identifier.
Usable in non-digital environments: Learning object identifiers should be usable in non-Web contexts, i.e. it should be possible to print identifiers in paper articles or dictate them over the phone. For this reason, learning object identifiers should be reasonably short and reasonably intelligible to people.
URI compliant: Learning object identifiers must conform to the URI specification. (Ideally, identifiers should be based on existing standards and technologies - however, they should also be independent of any particular protocol.)
Free at the point of use: There should be no cost (at least to the end-user) associated with the assignment or use of learning object identifiers.
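
To make some of these requirements more concrete, the following Python sketch shows how devolved assignment, resolution and transportability might fit together. It is a sketch under stated assumptions, not a recommendation of any of the technologies listed in the introduction: identifiers are minted locally as UUID-based URNs, the resolver is a simple in-memory table, and the mint_identifier function, the Resolver class and the example.org URLs are all hypothetical.

```python
import uuid


def mint_identifier() -> str:
    """Mint an identifier locally, without contacting a central service."""
    # Assumption: a random UUID expressed as a URN (which is a URI).
    return uuid.uuid4().urn


class Resolver:
    """Toy resolution service: maps an identifier to its current URL."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier: str, url: str) -> None:
        # Registering again for the same identifier records a new location.
        self._locations[identifier] = url

    def resolve(self, identifier: str) -> str:
        return self._locations[identifier]


# The creator assigns identifiers before deposit, one for the learning
# object and a separate one for its metadata record.
object_id = mint_identifier()
record_id = mint_identifier()

resolver = Resolver()
resolver.register(object_id, "https://repository-a.example.org/objects/42")

# The object moves to another repository: only the resolver table changes,
# the identifier stays the same.
resolver.register(object_id, "https://repository-b.example.org/items/crystal")
current_url = resolver.resolve(object_id)
```

Two caveats apply to this sketch. Random UUIDs make collisions extremely unlikely rather than impossible, so uniqueness is not strictly guaranteed. They are also neither short nor easy to dictate, which illustrates how the requirement for devolved assignment can pull against the requirement for usability in non-digital environments.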

Metadata record identifier requirements

The requirements for identifiers of metadata records about learning objects are the same as those listed above. However, it could be argued that the fourth and eighth requirements (usable in Web browsers and usable in non-digital environments) are of much lower priority in the case of identifiers for metadata records.

Issues

Various issues are raised by the use of persistent identifiers in practice. From the point of view of learning object creators, learning object repository administrators, resolver systems administrators and end-users, we need to be able to answer the following questions for any chosen learning object identifier scheme...

Creators:

What is the cost of assigning an identifier in terms of money and effort?
How do people find and use identifiers that have been assigned?
What is the ongoing maintenance cost (money and effort) and whose responsibility is it to maintain identifiers and for how long?

Learning object repository administrators:

At what point in the workflow are identifiers assigned?
What is the cost (money and effort) to administrators of assigning an identifier, or of making use of an identifier previously assigned by the creator?
What is the ongoing maintenance cost (money and effort) and for how long?

Resolver systems administrators:

How do resolver tables get populated?
How is resolver data kept current?
How long must identifiers work for?

End-user:

How do I find out what identifier has been assigned to a given resource?
How do I use the identifier (e.g. to cite the resource to someone else)?

Andy Powell
Last updated: 27-Apr-2003
