The role of classification schemes in Internet resource description and discovery

Work Package 3 of Telematics for Research project DESIRE (RE 1004)


2. Current use of classification schemes in existing search services

2.5. International subject specific schemes

A list of classification schemes and controlled vocabularies used in existing Internet services can be found in McKiernan (1996). Those in use include a number of internationally used subject-specific schemes like the National Library of Medicine (NLM) scheme and Ei, and national schemes like the Danish Veterinary and Agricultural Library Classification. Those considered here will be those relevant to the DESIRE test-bed services.

2.5.1. Art

An important instrument for object description in art is the AAT (Art & Architecture Thesaurus), but because the AAT is a thesaurus and not a classification scheme, it is not reviewed in this report. As more and more images become available in digital form, the need arises for a classification scheme developed especially for the subject description of visual images. Iconclass is such a scheme.

2.5.1.1 Iconclass

Iconclass <URL:http://iconclass.let.uu.nl/> is an iconographic classification system, developed by Henri van de Waal (1910-1972), Professor of Art History at the University of Leiden. Iconclass is an alpha-numerical classification, hierarchically and systematically ordered, of the subjects of Western art, offering definitions and keywords in English. With this scheme it is possible to describe objects, persons, events, situations and abstractions that appear in visual images.

It is divided into 9 main categories: 1. Religion and Magic; 2. Nature; 3. Human Being, Man in General; 4. Society, Civilisation, Culture; 5. Abstract Ideas and Concepts; 6. History; 7. Bible; 8. Literature; 9. Classical Mythology and Ancient History. The code gets longer as the concept becomes more specific. Retrieval is possible not only via the alpha-numerical notations, but also via a subject index (controlled vocabulary).
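The relationship between code length and specificity can be sketched in a few lines of Python. This is a minimal illustration, not the Iconclass Browser's logic: the function names are hypothetical, the sample notation in the usage note is only an illustrative string, and the sketch assumes that each additional character is exactly one step down the hierarchy, which ignores any refinements the full notation may have.

```python
# Main categories exactly as listed in the text above.
MAIN_CATEGORIES = {
    "1": "Religion and Magic",
    "2": "Nature",
    "3": "Human Being, Man in General",
    "4": "Society, Civilisation, Culture",
    "5": "Abstract Ideas and Concepts",
    "6": "History",
    "7": "Bible",
    "8": "Literature",
    "9": "Classical Mythology and Ancient History",
}


def main_category(notation: str) -> str:
    """Return the main category named by the first character of a notation."""
    notation = notation.strip()
    if not notation or notation[0] not in MAIN_CATEGORIES:
        raise ValueError(f"not a recognised notation: {notation!r}")
    return MAIN_CATEGORIES[notation[0]]


def broader_notations(notation: str) -> list:
    """List every prefix of a notation, broadest first.

    Assumption of this sketch: one extra character means one level deeper.
    """
    notation = notation.strip()
    return [notation[:i] for i in range(1, len(notation) + 1)]
```

For an illustrative notation such as "25F", `main_category` returns "Nature", and `broader_notations` returns the prefixes "2", "25" and "25F", which is the path a browsing interface could offer from the main category down to the specific concept.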

Usage

An overview of institutions and projects using Iconclass can be found at: <URL:http://iconclass.let.uu.nl/texts/institut.htm>

A CD-ROM with images of Dutch printers' devices from the period 1540-1700 was produced as a pilot project for the use of the computer version of Iconclass (1992).

Multilingual capability

Although the system is published in English, the alpha-numerical notations make it language independent.

Availability

The Iconclass System and Bibliography, together with a very extensive alphabetical index, were published in 17 volumes by the Koninklijke Nederlandse Akademie van Wetenschappen (KNAW), in the years between 1973 and 1985. In the years 1990-1991, a computerised version of the Iconclass System was prepared at the Department of Computers & Humanities of Utrecht University. This version was baptised the Iconclass Browser and published in 1992 by the Iconclass Research & Development Group. It runs on computers installed with Microsoft Windows. A second edition, now including the electronic Iconclass Bibliography, appeared in 1994. The Iconclass home page also offers access to a WWW version of the Iconclass Browser which gives a good impression of Iconclass's contents, though its interface is far less powerful than the interface of the Iconclass Browser for Windows.

Copyright

Copyright for the ICONCLASS Browser is held by the ICONCLASS Research & Development Group (IRDG), Universities of Utrecht and Leiden; Vakgroep Computer & Letteren, Utrecht University.

Development effort

The Iconclass Research & Development Group (IRDG) is the distributor of the Iconclass Browser, and the central editing board. It monitors the consistency of the scheme and decides about changes in new editions.

Other issues

Iconclass describes individual images and does not apply to collections of images.

Literature on Iconclass

Brandhorst and Van Huisstede (1992)

Grund (1993)

Iconclass home page: <URL:http://iconclass.let.uu.nl/>

2.5.2. Social Sciences

There are no major classification schemes for the social sciences. In many cases LC Subject Headings are used, or the ERIC thesaurus. For classification, universal schemes like DDC, UDC, or LCC have been used.

2.5.3. Medicine

2.5.3.1 NLM: National Library of Medicine

The National Library of Medicine Classification covers the field of medicine and related sciences. It is a broad classification, intended to be used for the shelf arrangement of all library materials. The NLM Classification is a system of mixed notation fashioned after the Library of Congress Classification (LCC), in which letters denoting broad subject categories are further subdivided by numbers. The NLM Classification uses schedules QS-QZ and W-WZ, which are permanently excluded from the LCC schedules, and is intended to be used with the LCC schedules, which supplement the NLM Classification for subjects bordering on medicine and for general reference materials. The LC schedules for Human Anatomy (QM), Microbiology (QR), and Medicine (R) are not used at all by the National Library of Medicine since they overlap the NLM Classification.

The headings are interpreted broadly and include the physiological system, the speciality or specialities connected with them, the regions of the body chiefly concerned and subordinate related fields. The Classification is hierarchical, and within each schedule, division by organ usually has priority. Each main schedule, as well as some sections within a schedule, begins with a group of form numbers ranging generally from 1-49 which are used to classify materials by publication type, e.g., dictionaries, atlases, laboratory manuals, etc.
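The division of labour between the NLM and LC schedules described above can be expressed as a small lookup. The following Python sketch is illustrative only: the function names and the call-number format (letters followed by a number) are assumptions, every two-letter combination inside QS-QZ and W-WZ is treated as an NLM schedule whether or not it is actually assigned, and every LC subclass beginning with R is treated as part of the Medicine (R) schedule. The form-number test uses the range 1-49, which the text gives as a general rule rather than a strict one.

```python
import re


def is_nlm_schedule(letters: str) -> bool:
    """True if the letters fall in the ranges QS-QZ or W-WZ."""
    letters = letters.upper()
    if letters == "W":
        return True
    if len(letters) != 2 or not letters.isalpha():
        return False
    first, second = letters
    if first == "Q":
        return "S" <= second <= "Z"
    return first == "W"


def describe_call_number(call_number: str) -> dict:
    """Say which scheme a call number belongs to and flag form numbers."""
    match = re.match(r"\s*([A-Za-z]{1,3})\s*(\d+)?", call_number)
    if match is None:
        raise ValueError(f"cannot parse call number: {call_number!r}")
    letters = match.group(1).upper()
    number = int(match.group(2)) if match.group(2) else None

    if is_nlm_schedule(letters):
        scheme = "NLM"
    elif letters in {"QM", "QR"} or letters.startswith("R"):
        # Assumption: all R subclasses belong to the unused Medicine schedule.
        scheme = "LC schedule not used by NLM"
    else:
        scheme = "LCC"

    # Form numbers (publication types) generally range from 1 to 49.
    form_number = scheme == "NLM" and number is not None and 1 <= number <= 49
    return {"schedule": letters, "scheme": scheme, "form_number": form_number}
```

Under these assumptions, a call number beginning "WU 100" is reported as an NLM number that is not a form number, "QS 4" as an NLM form number, "QM 23" as belonging to an LC schedule that NLM does not use, and "QA 76" as an ordinary LCC number.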

Since the NLM Classification scheme is designed as a broad classification, it can be used for both large and small collections and it can also be adapted to handle specialised collections. For example, a dental library may want to expand the WU (Dentistry) schedule to meet their specific needs.

Internet Usage

The NLM Classification is used by the OMNI (Organising Medical Networked Information) service <URL:http://www.omni.ac.uk/>. OMNI also uses UDC but will stop using it in the near future because it is considered weak in the medical field (see section 2.2.1: UDC review). OMNI offers the possibility to browse the NLM Classification in classified order or alphabetically.

Multilingual capability

The scheme is available in English, Japanese, French and Spanish.

Integration between classification scheme and controlled subject headings

Terms used in the NLM Classification (schedule headings, subheadings and class number captions) are compatible with Medical Subject Headings (MeSH) descriptors. Most of the index entries in the fifth edition of the NLM Classification are in the current forms of MeSH descriptors. Others are in non-MeSH forms when no appropriate MeSH term is available to express a concept.

Availability

The fifth edition of the National Library of Medicine Classification (1994) can be ordered from the Superintendent of Documents, U.S. Government Printing Office. The classification is not available in electronic format. When it becomes available electronically in the future, NLM anticipates that a normal charge or licensing fee will apply.

Copyright

Development effort

The NLM Cataloguing Section has an in-house online classification file which is amenable to continuous updating, so that new index records and new classification numbers can be added and existing records modified. As a rule, as new MeSH descriptors are added or existing ones changed, the Section also tries to incorporate pertinent additions and changes in the classification scheme.

The fifth edition, published in 1994, contains close to 4,000 classification numbers and a comprehensive index containing over 18,000 index terms including both the index entries and cross references. Approximately 300 new classification numbers were added, including form numbers which are repeated in applicable schedules across the entire classification scheme. Future changes and additions to the NLM Classification will be announced in NLM Technical Bulletins.

Literature on NLM

U.S. National Library of Medicine (1996)

2.5.4. Science and engineering

2.5.4.1. Engineering Information (Ei) Classification Codes

The Ei Classification Codes are a classification scheme developed by Engineering Information, Inc. Engineering Information, also known as Ei, <URL:http://www.ei.org/> was created in 1884, at Washington University, St. Louis. Ei's aim is to identify, organise and facilitate easy access to the published engineering literature of the world.

The system has been further subdivided (1993) and now comprises six main categories, subdivided into 38 subject series and over 800 individual classes. Up to four levels of increasing specificity are provided below the main categories. It is a numeric scheme; its content is enumerative rather than hierarchical, and its structure has some severe logical shortcomings.

Usage

Ei publishes the most comprehensive interdisciplinary engineering databases in the world, of which Ei Compendex is the most popular, with over three million records of journal articles, technical reports, conference papers and proceedings in electronic form, dating from 1970 onwards. It is also accessible on the WWW as a fee-based service.

Ei classification codes are used by two Internet subject services: EELS (Engineering Electronic Library, Sweden) <URL:http://eels.lub.lu.se/> and EEVL (Edinburgh Engineering Virtual Library) <URL:http://eevl.icbl.hw.ac.uk/>.

EELS is a co-operative project of The Swedish Univ. of Technology Libraries, a consortium of the six most important research libraries in Sweden in technological subjects, with some co-operation from other Nordic countries. It has been developed and maintained at Lund University Library, NetLab, since the beginning of 1994 and was one of the first selective subject services on the Internet to use a library classification scheme and thesaurus.

EELS is arranged according to the Ei subject classification for most subjects. The following subject fields are included: civil engineering; mechanical engineering; electrical engineering; computing; chemical engineering/chemistry; mathematics; physics; environmental engineering/science; engineering management. Some areas of interest which are not covered explicitly in the Ei classification, namely polar research and cold region technology, are for the moment maintained outside the classification. Descriptors (DE) from the Ei Thesaurus have been added to each resource title. Additional controlled terms, describing document types and similar properties, are added in the annotation field of every record, since the Ei Thesaurus does not cover digital and Internet documents. A robot-generated browsable and searchable database of "all" engineering WWW pages on the Internet has been added recently. It will soon be structured according to the Ei classification as well and will, in different ways, be integrated into the main quality-controlled service.

The Edinburgh Engineering Virtual Library (EEVL) project started in August 1995, funded by the Joint Information Systems Committee (JISC) for two years, to develop a gateway to Internet resources in engineering as part of the UK Electronic Libraries Programme (eLib). It is based at Heriot-Watt University and is being developed in collaboration with six other universities in the UK (Moffat 1996). EEVL is similar in concept to the subject services OMNI and SOSIG. The classification scheme adopted by EEVL is an in-house scheme which is loosely based on the Ei Classification. This approach was adopted after an investigation of a number of gateway services using conventional library classification schedules (UDC, Dewey, Ei) appeared to show that many parts of their subject trees remained empty. Rather than adopt an elaborate, rigid classification originally developed for placing books on shelves rather than for organising networked resources, the decision was taken to adopt a more fluid classification which could adapt as broad subject categories fill up.

Integration between classification scheme and controlled subject headings

All EELS resources are annotated, classified using the Ei classification and indexed with descriptors from the Ei Thesaurus. The second edition of the Ei Thesaurus (1995) contains over 8,300 descriptors and a total of 17,400 entry points, including 9,000 non-preferred terms. It is a hierarchical thesaurus which, in addition, provides the relationship between the descriptors and the classification codes from the Ei classification. This allows EELS to offer sophisticated navigational and search features integrating browsing and searching, classification and thesaurus. Ei has so far not allowed EELS to display the scheme on its own to end-users (that is, without any resources connected to codes).
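How a thesaurus that links entry terms, descriptors and classification codes can support combined searching and browsing is sketched below in Python. This is not a description of the EELS implementation, which the text does not detail. All class and function names are hypothetical, and every term and code in the sample data is a placeholder, not a real Ei descriptor or Ei classification code.

```python
from dataclasses import dataclass, field


@dataclass
class Thesaurus:
    """A toy thesaurus; all keys are stored in lower case."""
    # non-preferred entry term -> preferred descriptor
    use: dict = field(default_factory=dict)
    # preferred descriptor -> linked classification codes
    codes: dict = field(default_factory=dict)

    def descriptor_for(self, term: str) -> str:
        """Map any entry point to its preferred descriptor."""
        key = term.strip().lower()
        return self.use.get(key, key)

    def codes_for(self, term: str) -> list:
        """Return the classification codes linked to a term's descriptor."""
        return list(self.codes.get(self.descriptor_for(term), []))


@dataclass
class Resource:
    title: str
    codes: list
    descriptors: list


def search(thesaurus: Thesaurus, resources: list, term: str) -> list:
    """Titles indexed with the descriptor or classified under a linked code."""
    descriptor = thesaurus.descriptor_for(term)
    linked = set(thesaurus.codes_for(term))
    return [
        r.title
        for r in resources
        if descriptor in r.descriptors or linked.intersection(r.codes)
    ]


# Placeholder data only: NOT real Ei descriptors or classification codes.
thesaurus = Thesaurus(
    use={"entry term x": "descriptor a"},
    codes={"descriptor a": ["CODE-1"]},
)
resources = [
    Resource("Resource one", ["CODE-1"], ["descriptor a"]),
    Resource("Resource two", ["CODE-1"], ["descriptor b"]),
    Resource("Resource three", ["CODE-2"], ["descriptor c"]),
]

print(search(thesaurus, resources, "Entry term X"))
```

In this sketch a query on a non-preferred term is first resolved to its descriptor, and the descriptor's linked codes then widen the result to resources filed in the same class, so the query finds "Resource one" directly and "Resource two" through the shared code.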

Linking to third party classification data

The Ei classification codes and thesaurus serve as indexing tools for the database Ei Compendex*Plus, the printed Engineering Index and other index products. The main reason why EELS chose the Ei classification and thesaurus was the plan to build an integrated engineering service, with Compendex covering printed resources and EELS covering digital resources, presenting the user with a single search and browse access point that retrieves records from both services. This development is not yet finished, since it requires changes to other software packages, such as SilverPlatter's ERL and WebSPIRS.

Copyright

The copyright for the classification system and the thesaurus belongs to Engineering Information Inc., Hoboken, N.J., USA.

Literature on Ei:

EELS: <URL:http://eels.lub.lu.se/>

EEVL: <URL:http://eevl.icbl.hw.ac.uk/>

Ei home page: <URL:http://www.ei.org/>

Moffat (1996)

2.5.4.2. Mathematics Subject Classification

Usage

The 1991 Mathematics Subject Classification (MSC 1991) was compiled by the Editorial Offices of both Mathematical Reviews and Zentralblatt für Mathematik / Mathematics Abstracts which are the two main review journals in mathematics.

The American Mathematical Society (AMS) offers a Web page of Materials Organized by Mathematical Subject Classification. It lists main sections of the 1991 Mathematics Subject Classification with links to electronic journals, pre-prints, Web sites and pages, databases and other pertinent material in the corresponding fields: <URL:http://www.ams.org/mathweb/mi-mathbyclass.html>.

Availability

The scheme is available on the Internet: <URL:http://www.zblmath.fiz-karlsruhe.de/class/index.html>

Users are allowed to install a local version on their own machine: a tarred and compressed version is available at: <URL:http://www.ma.hw.ac.uk/~chris/mr-html.tar.Z>

Users of the system are asked to inform the authors so that they can be informed of any updates.

Development effort

The editors of Mathematical Reviews and Zentralblatt für Mathematik have initiated the process of revising the 1991 Mathematics Subject Classification. They plan to have this revision completed by the end of 1998 so that it can begin to be used in Current Mathematical Publications in mid-1999, and in Mathematical Reviews and Zentralblatt für Mathematik from 2000.

2.5.4.3. ACM Computing Classification System (CCS)

The ACM (Association for Computing Machinery) <URL:http://www.acm.org/> Computing Classification System has become a standard for identifying and categorising computing literature, as well as areas of computing interest and/or expertise. The current taxonomy for categorising the computing literature saw its first release in 1982. Until recently, CCS was named the Computing Reviews Classification System (CRCS); it was renamed in recognition of its general use as a standard for classifying the computing literature. The 1991 Classification System is a cumulative revision of the 1982 version of the Computing Reviews Classification System. The 1982 Classification System had in turn superseded the previous CR classification introduced in 1964.

The Classification has two main parts: a numbered tree containing unnumbered subject descriptors, and a General Terms list. The unnumbered subject descriptors are essentially fourth level nodes.
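The two-part structure described above can be modelled as a tree of numbered nodes, with the unnumbered subject descriptors attached to third-level nodes, plus a separate list of General Terms. The Python sketch below is an illustration under stated assumptions: the class and function names are hypothetical, and the sample entries show only a small fragment that should be checked against the published CCS before being relied on.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    number: str  # e.g. "H.3.3"
    label: str
    children: list = field(default_factory=list)
    # Unnumbered subject descriptors: in effect the fourth level.
    descriptors: list = field(default_factory=list)


def level(number: str) -> int:
    """Depth of a numbered node: "H" is 1, "H.3" is 2, "H.3.3" is 3."""
    return len(number.split("."))


def find(node: Node, number: str):
    """Depth-first search for a numbered node; None if it is absent."""
    if node.number == number:
        return node
    for child in node.children:
        hit = find(child, number)
        if hit is not None:
            return hit
    return None


# Illustrative fragment only; verify against the published scheme.
GENERAL_TERMS = ["Algorithms", "Design", "Theory"]
TREE = Node("H", "Information Systems", children=[
    Node("H.3", "Information Storage and Retrieval", children=[
        Node("H.3.3", "Information Search and Retrieval",
             descriptors=["Retrieval models", "Search process"]),
    ]),
])


def index_entry(root: Node, number: str, descriptor: str, terms=()) -> dict:
    """Build one index entry, rejecting values not present in the scheme."""
    node = find(root, number)
    if node is None:
        raise KeyError(number)
    if descriptor not in node.descriptors:
        raise ValueError(f"{descriptor!r} is not listed under {number}")
    unknown = [t for t in terms if t not in GENERAL_TERMS]
    if unknown:
        raise ValueError(f"unknown general terms: {unknown}")
    return {"category": number, "descriptor": descriptor,
            "general_terms": list(terms)}
```

Keeping the descriptors as plain strings on the third-level node, rather than as numbered children, mirrors the statement that they are unnumbered but behave as a fourth level.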

Usage

It is used in ACM's online databases and in CD-ROM files.

An Internet service using CCS is Ariadne, developed by the Medoc project in Germany: <URL:http://ariadne.inf.fu-berlin.de:8000/>

Availability

A complete copy of CCS is available in the January 1996 editions of Computing Reviews and ACM's Guide to Computing Literature. The CCS is also accessible on the WWW in both 1964 and 1991 versions: <URL:http://www.acm.org/class/>

Development effort

The ACM Publications Board is now updating the Computing Classification System (CCS).

Copyright

The copyright of the ACM Computing Classification System belongs to the Association for Computing Machinery (1995).

Literature on ACM CCS:

ACM home page: <URL:http://www.acm.org/>


Page maintained by: UKOLN Metadata Group
Last updated: 24-Jan-2000