The role of classification schemes in Internet resource description and discovery
Work Package 3 of Telematics for Research project DESIRE (RE 1004)

Executive Summary

Some Internet services that give access to other Internet sites use classification schemes to organise a browsing structure for their selected resources. This is especially true of Internet subject services, which often offer a browsable classified structure in addition to a searchable index.

Using a classification scheme gives an Internet service several advantages: it helps with browsing, allows searches to be broadened and narrowed, gives context to the search terms being used, allows (under certain conditions) multilingual access to collections of material, and supports the partitioning and manipulation of a large database. If an existing classification scheme is chosen, it has a good chance of not becoming obsolete and may already be well known to users.
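As a rough illustration of how a hierarchical notation supports broadening and narrowing, the sketch below treats three-digit DDC-style class numbers as prefixes. The record titles, the `RECORDS` list and the `search` and `broaden` helpers are hypothetical, and the handling is deliberately simplified: it assumes exactly three digits and ignores decimal extensions and the 000 class.

```python
# Hypothetical resource records tagged with three-digit DDC-style notations
# (500 Science, 510 Mathematics, 512 Algebra, 516 Geometry, 530 Physics).
RECORDS = [
    ("Geometry tutorials", "516"),
    ("Algebra archive", "512"),
    ("Physics preprints", "530"),
    ("General science gateway", "500"),
]


def search(notation):
    """Return titles filed at or below the given class.

    Trailing zeros are treated as padding, so "510" matches every
    notation beginning with "51".
    """
    prefix = notation.rstrip("0")
    return [title for title, code in RECORDS if code.startswith(prefix)]


def broaden(notation):
    """Return the next broader class, or None at the top level."""
    significant = notation.rstrip("0")
    if len(significant) <= 1:
        return None
    return significant[:-1].ljust(3, "0")


if __name__ == "__main__":
    current = "516"
    while current is not None:
        print(current, search(current))
        current = broaden(current)
```

Starting from 516 and calling `broaden` repeatedly widens the result set step by step; picking a narrower class from a result does the reverse.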

Classification schemes can be defined by several categories, but can be broadly divided into:

All of these classification types are used to some extent on the Internet. Universal schemes like DDC and UDC are used by many Internet services and are readily available in machine-readable form. Subject services, however, appear more likely to use a subject-specific scheme.

The type of classification scheme chosen for an Internet service should depend upon the planned scope of the service. A subject service could, where possible, use a well-known, international, subject-specific scheme. A service that either has a more general brief or covers a subject area where there is no agreed 'standard' classification system in use could use or adapt a universal scheme.

For the widest interoperability, more than one classification scheme could be used or conversion programs designed. Alternatively, a universal scheme could be used to 'glue' different subject services together while the actual services themselves would be classified in a different, relevant, subject-specific scheme.

Classification is a time-consuming and expensive process, so research has been carried out into the automatic classification of Internet resources. Various projects have investigated how subject terms collected from a search of a database can be converted into classification notation. Two projects, the Nordic WAIS/WWW project (Lund) and Project GERHARD (Oldenburg), used UDC for the conversion, while OCLC's Project Scorpion is looking at DDC. Other projects are looking at neural networks and at automatic conversions between classification schemes.
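The general idea of converting subject terms into classification notation can be sketched as a lookup followed by a vote. This is only a toy illustration and not the method used by any of the projects named above: the `TERM_TO_CLASS` table and the `classify` function are hypothetical, and the notations are DDC class numbers (512 Algebra, 516 Geometry, 530 Physics) chosen purely as examples.

```python
from collections import Counter

# Hypothetical term-to-notation table; a real service would derive this
# from the scheme's machine-readable captions and index terms.
TERM_TO_CLASS = {
    "geometry": "516",
    "triangle": "516",
    "algebra": "512",
    "quantum": "530",
}


def classify(terms):
    """Return the notation suggested by the most terms, or None.

    Terms missing from the table are ignored; matching is
    case-insensitive.
    """
    votes = Counter(
        TERM_TO_CLASS[term.lower()]
        for term in terms
        if term.lower() in TERM_TO_CLASS
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(classify(["Geometry", "triangle", "algebra"]))
```

A simple majority vote keeps the sketch short; weighting terms by frequency or position in the document would be an obvious refinement.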

Automatic classification processes are also important if large robot-generated services want to add a browsing structure for their documents.


Page maintained by: UKOLN Metadata Group
Last updated: 14-May-1997