6. Glossary

Acceptable use

Terms and conditions setting out which Service Providers can do what with metadata harvested from a particular Data Provider or group of Data Providers. At the Cornell meeting (September 2000), where the foundations for the OAI protocol were agreed upon, an explicit choice was made to hand over acceptable use issues to the communities implementing the OAI protocol.

The OAI-PMH does not address issues of acceptable use of harvested metadata, although it does allow for the inclusion of an "about" container attached to each harvested metadata record. Typically such an "about" container could be used to specify the terms and conditions of the usage of a metadata record. In this way, individual communities can express terms and conditions regarding metadata use at the level of individual records. In addition, at the level of a repository, the response to the Identify verb allows for the inclusion of an open-ended "description" container. Communities could use this container to include terms and conditions information for all metadata records in the repository. From a technical perspective, these containers are hooks that allow communities to specify terms and conditions for the usage of metadata harvested from their repositories.

Aggregator

An OAI aggregator is both a Service Provider and a Data Provider. It is a service that gathers metadata records from multiple Data Providers and then makes those records available for gathering by others using the OAI-PMH.

Archive

The term "archive" in the name Open Archives Initiative reflects the origins of the OAI in the e-prints community where the term archive is generally accepted as a synonym for repository of scholarly papers. Members of the archiving profession have justifiably noted the strict definition of an "archive" within their domain; with connotations of preservation of long-term value, statutory authorization and institutional policy. The OAI uses the term "archive" in a broader sense: as a repository for stored information. Language and terms are never unambiguous and uncontroversial and the OAI respectfully requests the indulgence of the professional archiving community with this broader use of "archive".
(OAI definition quoted from FAQ on OAI Web site)

Conformant

A repository is deemed to be OAI conformant if, upon protocol testing by OAI, it responds to each of the protocol requests with a response that validates against its XML schema, and also responds to malformed requests with the appropriate errors and exception conditions.

Container

Containers are places in OAI-PMH responses where XML complying with any external schema may be supplied. Containers are provided for extensibility and for community-specific enhancements. The OAI Implementation Guidelines list the existing optional containers and provide links to existing schemas.

Data Provider

A Data Provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata.
(OAI definition quoted from FAQ on OAI Web site)

Data representation

In this context, the format in which data of a particular type is set out in order to provide interoperability across repositories.

DC (Dublin Core)

Dublin Core (DC) is a metadata format defined on the basis of international consensus. The Dublin Core Metadata Element Set defines fifteen elements for simple resource description and discovery, all of which are recommended, and none of which are mandatory. DC has been extended with further optional elements, element qualifiers and vocabulary terms.
(Definition draws on UKOLN's metadata glossary and Metadata in a nutshell by Michael Day)
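
As a rough illustration of how the fifteen optional elements appear in practice, the sketch below reads a small unqualified Dublin Core record of the kind carried in OAI-PMH responses. The record content and the helper name read_dc are assumptions made for this example only; the namespace URIs are the ones commonly used for oai_dc and the DC element set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

# Hypothetical record, for illustration only.
RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example e-print</dc:title>
  <dc:creator>Author, A.</dc:creator>
  <dc:creator>Author, B.</dc:creator>
  <dc:date>2003-01-01</dc:date>
</oai_dc:dc>
"""

def read_dc(xml_text):
    """Map each DC element name to the list of values found (possibly empty)."""
    root = ET.fromstring(xml_text.strip())
    return {
        name: [el.text for el in root.findall(f"{{{DC_NS}}}{name}")]
        for name in DC_ELEMENTS
    }

fields = read_dc(RECORD)
print(fields["creator"])  # ['Author, A.', 'Author, B.']
print(fields["rights"])   # []
```

Every value is returned as a list because no element is mandatory and any element may be repeated.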

DCMI (Dublin Core Metadata Initiative)

The Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include consensus-driven working groups, global workshops, conferences, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.
(Definition quoted from Dublin Core Metadata Initiative)

DCMES (Dublin Core Metadata Element Set)

The Dublin Core metadata element set is a standard for cross-domain information resource description. Here an information resource is defined to be "anything that has identity". This is the definition used in Internet RFC 2396, "Uniform Resource Identifiers (URI): Generic Syntax", by Tim Berners-Lee et al. There are no fundamental restrictions to the types of resources to which Dublin Core metadata can be assigned.
(Definition quoted from Dublin Core Metadata Initiative, Dublin Core Metadata Element Set, Version 1.1: Reference Description)

Document-like object

A document-like object is a digital data unit that is comparable to a paper document. The term designates a relatively simple, stable resource, and would not cover, for example, multimedia artifacts or interactive services.

DTD (Document Type Definition)

A DTD is a formal specification of the structure of a document.

E-print

An e-print is an author self-archived document. In the sense that the term is ordinarily used, the content of an e-print is the result of scientific or other scholarly research.

Flow control

The management of the flow of data between Data Provider and Service Provider to ensure that neither end of the transaction is overloaded.
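
In OAI-PMH 2.0, one flow control mechanism is the resumptionToken: a repository may return a long list in chunks, each ending with a token that the harvester sends back to request the next chunk. The following is a minimal sketch of that loop. The canned responses in PAGES, the fetch stand-in and the function name harvest_identifiers are assumptions for illustration, and the responses are simplified (real headers also carry a datestamp).

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Canned, simplified responses standing in for a repository (hypothetical data).
# Each is keyed by the resumptionToken that requests it; None is the first request.
PAGES = {
    None: """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
      <ListIdentifiers>
        <header><identifier>oai:example.org:1</identifier></header>
        <header><identifier>oai:example.org:2</identifier></header>
        <resumptionToken>page2</resumptionToken>
      </ListIdentifiers></OAI-PMH>""",
    "page2": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
      <ListIdentifiers>
        <header><identifier>oai:example.org:3</identifier></header>
        <resumptionToken/>
      </ListIdentifiers></OAI-PMH>""",
}

def fetch(token):
    """Stand-in for an HTTP request; a real harvester would contact the repository."""
    return PAGES[token]

def harvest_identifiers(fetch_page):
    """Collect identifiers from every chunk, following resumption tokens."""
    identifiers = []
    token = None
    while True:
        root = ET.fromstring(fetch_page(token))
        for element in root.iter(OAI_NS + "identifier"):
            identifiers.append(element.text)
        token_element = root.find(f".//{OAI_NS}resumptionToken")
        # A missing or empty resumptionToken marks the final chunk.
        if token_element is None or not (token_element.text or "").strip():
            return identifiers
        token = token_element.text.strip()

print(harvest_identifiers(fetch))
# ['oai:example.org:1', 'oai:example.org:2', 'oai:example.org:3']
```

Because the repository decides the chunk size and the harvester requests one chunk at a time, neither side has to handle the complete list in a single response.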

Harvester

In OAI-PMH a harvester is a client application issuing OAI-PMH requests.
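
An OAI-PMH request is an ordinary HTTP request whose query string names one of six verbs plus any arguments. The sketch below shows how a harvester might assemble such a request URL. The base URL http://example.org/oai and the helper name build_request are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# The six request types ("verbs") defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}

def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"

url = build_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
print(url)
# http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```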

Harvesting

In the OAI context, harvesting refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store.

Identifier

In OAI-PMH an identifier is a unique key for an item in a repository.

Item

In OAI-PMH an item is a component of a repository from which metadata about a resource can be disseminated. An item has a unique identifier.

Interoperability

Interoperability is the ability of systems, services and organisations to work together seamlessly toward common or diverse goals. In the technical arena it is supported by open standards for communication between systems and for description of resources and collections, among others. Interoperability is considered here primarily in the context of resource discovery and access.

Metadata

Structured information about resources (including both digital and non-digital resources). Metadata can be used to help support a wide range of operations on those resources. In the context of services based on metadata harvested via OAI-PMH, the most common operation is discovery and retrieval of resources.

OAI (Open Archives Initiative)

OAI is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.

OAI-PMH (OAI Protocol for Metadata Harvesting)

OAI-PMH is a lightweight harvesting protocol for sharing metadata between services.

Protocol

A protocol is a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are examples of other protocols used for communication between systems across the Internet.


PURL (Persistent Uniform Resource Locator)

A PURL is a Persistent Uniform Resource Locator. Functionally a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns that URL to the client. The client can then complete the URL transaction in the normal fashion. In Web parlance, this is a standard HTTP redirect.
(Definition quoted from PURL)

Record

In OAI-PMH a record is metadata in a specific metadata format.
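
The distinction between an item and a record can be sketched as a small data model: an item has one unique identifier and may offer metadata in several formats, while a record is that metadata in one specific format. The class and method names below are assumptions for illustration, not part of the protocol, and the model is simplified (a real record also carries a datestamp).

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    """A repository constituent with a unique identifier."""
    identifier: str
    # Metadata keyed by format name (the metadataPrefix), e.g. "oai_dc".
    metadata: dict = field(default_factory=dict)

    def record(self, metadata_prefix):
        """Return the record for one format: the identifier plus that metadata."""
        if metadata_prefix not in self.metadata:
            # Mirrors the OAI-PMH error code for an unsupported format.
            raise KeyError(f"cannotDisseminateFormat: {metadata_prefix}")
        return (self.identifier, metadata_prefix, self.metadata[metadata_prefix])

# Hypothetical item offering a single metadata format.
item = Item("oai:example.org:42", {"oai_dc": "<oai_dc:dc>...</oai_dc:dc>"})
print(item.record("oai_dc")[:2])  # ('oai:example.org:42', 'oai_dc')
```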

Repository

In OAI-PMH a repository is a network accessible server that is able to process OAI-PMH requests correctly.

Resource

A resource is anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., today's "weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.
(Definition from Guidelines for implementing Dublin Core in XML by Andy Powell and Pete Johnston)

In OAI-PMH a resource is an object the metadata is "about". The nature of resources is not defined in the OAI-PMH. Thus, resources may be digital or non-digital.

Service Provider

A Service Provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services.
(OAI definition quoted from FAQ on OAI Web site)
In this manner, a Service Provider is "harvesting" the metadata exposed by Data Providers.

Set

In the OAI-PMH a Set is an optional construct for grouping items in a repository.


URI (Uniform Resource Identifier)

URI is the acronym for Uniform Resource Identifier. URIs are strings that identify things on the Web. URIs are sometimes informally called URLs (Uniform Resource Locators), although URLs are more limited than URIs. URIs are used in a number of schemes, including the HTTP and FTP URI schemes.

Value-added service

A service that is based on harvested metadata and adds value for its users, for example by normalising and enriching the harvested metadata. Types of services which may be offered include search services, citation linking, overlay journals, and peer-review services, among others.

XML (Extensible Markup Language)

XML is a language for creating other languages. It defines a means of describing data. XML can be validated against a DTD or schema setting out the elements of the language created. XML mappings exist for a number of metadata record formats.

XML namespace

An XML namespace is a collection of names, identified by a URI reference [RFC2396], which are used in XML documents as element types and attribute names. XML namespaces differ from the "namespaces" conventionally used in computing disciplines in that the XML version has internal structure and is not, mathematically speaking, a set.
(Definition quoted from W3C, Namespaces in XML)
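
To make the role of namespaces concrete, the sketch below parses a fragment in which two elements share the local name "identifier" but belong to different namespaces. The fragment is hypothetical and deliberately simplified (it is not a valid OAI-PMH response); the two namespace URIs are the ones commonly used for OAI-PMH 2.0 and the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical fragment: two elements share the local name "identifier".
FRAGMENT = """<record xmlns="http://www.openarchives.org/OAI/2.0/"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
  <identifier>oai:example.org:42</identifier>
  <dc:identifier>http://example.org/papers/42</dc:identifier>
</record>"""

root = ET.fromstring(FRAGMENT)

# ElementTree names every element as "{namespace URI}local-name",
# so the two identifiers cannot be confused.
oai_id = root.find(f"{{{OAI}}}identifier").text
dc_id = root.find(f"{{{DC}}}identifier").text
tags = [child.tag for child in root]

print(oai_id)  # oai:example.org:42
print(dc_id)   # http://example.org/papers/42
```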

XML schemas

XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents.
(Definition quoted from W3C Architecture Domain, XML schema)

