The role of classification schemes in Internet resource description and discovery

Work Package 3 of Telematics for Research project DESIRE (RE 1004)

4. Conclusion

The Thirty-sixth Allerton Institute, held in October 1994, was a starting point for the discussion of the use of classification systems in information networks. Most of the closing remarks by Marcia Bates and Sarah Thomas, which pointed to important directions for future work, are still relevant to the issues covered in this report (Cochrane 1995):

"1. Exploit technology

a) for adding class numbers to materials in digital form

b) for linking subject access systems like LCSH and DDC.

c) for providing navigation and retrieval tools based on outlines of knowledge within classification schedules.

2. Extend the use of library classification to Internet resources …

4. Share development strategies among and between various classification systems and thesauri, creating the ability to link with one another including multilingual and specialized systems …

6. Build bridges from the past (e.g., library collections classified by DDC, LCC, etc.) to the future (e.g., digitized full text collections) …

12. Organize the classification schemes differently for the end-user than for the classifier and provide more than one scheme for users to browse and navigate before and after retrieval".

In the meantime, a variety of classification schemes are being used to bring systematic order to discovery-oriented Internet services. The major universal schemes like DDC and LCC are mainly used by services run by the library community, while UDC is used primarily in Europe for subject-specific services or for a general information gateway like NISS.

Services in several countries, such as the Netherlands and Sweden, use national general schemes. Subject-specific international schemes are the dominant choice among subject-based information gateways. However, many services develop their own browsing structures, similar to traditional classification systems, or have developed extensive local adaptations of existing schemes.

TABLE 4.1: Summary of reviewed classification schemes

                                          BC    CCS   DDC   Ei            Icon.  LCC   MSC   NLM   SAB   UDC
Number of Internet services using system  1     1     17    2             0      5     0     1     4     5
Multilingual capability                   Yes   Yes   Yes   Yes           Yes    Yes   Yes   Yes   Yes   Yes
Widely translated                         No    No    Yes   No            No     No    No    Yes   No    Yes
Integration with LCSH                     No    No    Yes   No            No     Yes   No    No    No    No
Integration with other systems            GTT   No    LCC   Ei thesaurus  No     DDC   No    MeSH  No    No
Digital availability                      Yes   Yes   Yes   Yes           Yes    Yes   Yes   No    No    Yes
Copyright                                 Yes   Yes   Yes   Yes           Yes    Yes   Yes   Yes   Yes   Yes
Extensibility                             Yes   Yes   Yes   Yes           Yes    Yes   Yes   Yes   Yes   Yes
Subject-specific                          No    Yes   No    Yes           Yes    No    Yes   Yes   No    No

Table 4.1 shows some of the features identified in section 2 for all the reviewed classification schemes. The most used scheme in Internet services is DDC, which reflects its use in traditional and other online services. All the schemes have a multilingual capability to the extent that they use Arabic numerals, sometimes with added letters from the Latin alphabet. The real constraint on their use, however, is the availability of suitable translations and only UDC and DDC have been translated to any significant degree. Both LCC and DDC are integrated to some extent with LCSH and other schemes are integrated with relevant subject schemes, like NLM to MeSH and Ei to the Ei thesaurus. Most of the schemes are available in some digital form, although the exact way this is done varies between schemes. All the classification schemes are extensible, although not always in a completely logical manner, e.g. new subjects are fitted into any remaining gaps in SAB.

The use of a classification scheme in a subject-based Internet service would be extremely useful. It offers the following advantages:

It brings together small collections of similar resources
The use of a systematic well-supported hierarchical structure supports the browsing of these collections
It gives a context for search terms and allows filtering and high precision searches
If the same classification scheme is used, more than one database can be searched with the same approach
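
The browsing and filtering advantages listed above can be illustrated with a minimal sketch. It assumes a decimal notation in which a longer class number denotes a narrower class, so that a shared prefix means a shared branch of the hierarchy. The record titles, the class numbers attached to them and the function names are all hypothetical and are not taken from any of the reviewed services.

```python
from collections import defaultdict

# Hypothetical records: (title, class notation). The notations are
# illustrative only; a longer notation stands for a narrower class.
RECORDS = [
    ("Geometry tutorials", "514"),
    ("Algebra preprints", "512"),
    ("Particle physics archive", "539"),
    ("Java programming guide", "004"),
    ("Java island geology survey", "55"),
]


def in_class(notation, ancestor):
    """True if `notation` equals `ancestor` or lies below it in the hierarchy."""
    return notation.startswith(ancestor)


def browse(records, level):
    """Group titles by the first `level` digits of their notation."""
    groups = defaultdict(list)
    for title, notation in records:
        groups[notation[:level]].append(title)
    return dict(groups)


def search(records, term, within=""):
    """Keyword search, optionally restricted to one branch of the scheme."""
    return [
        title
        for title, notation in records
        if term.lower() in title.lower() and in_class(notation, within)
    ]
```

In this sketch, `browse(RECORDS, 2)` brings the two mathematics resources together under one heading, and `search(RECORDS, "java", within="5")` keeps only the geology survey, because the class supplies the context that the bare search term lacks. The same two functions would work unchanged on any other database that assigns notations from the same scheme.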

The main criteria for the choice of classification system would normally be the scope of the service: its subject, language and geographic coverage and its user population.

In some situations the solution is quite obvious: for documents from all areas of knowledge, published throughout the world and in many languages and to be offered to an international multi-disciplinary community of users, a universal scheme can be selected, at least as a basic solution. DDC and UDC have a good multilingual capability because they are entirely numerical and their schedules have been widely translated. If, however, the collection focuses on a rather limited subject area or discipline and a suitable international subject-specific scheme is available, that scheme should be used.

Problems will occur for services covering subjects where there are several different schemes (e.g. the earth sciences), although the use of concordances may help. There will also be problems when there is no comprehensive scheme available for a service covering a particular geographic area or subject scope (e.g. the European social sciences in SOSIG).

Perceived shortcomings in classification schemes are sometimes countered by adaptations and amendments to a scheme. Examples include EEVL's variant of Ei and the use of UDC by NISS and SOSIG. Adaptations can also arise because classification schemes are being used in a different, electronic environment: one is not preparing a shelf arrangement of physical objects, but a digital, virtual display in an online system where the classification scheme itself is used as a browsing aid.

Another reason for adapting classification schemes is that, when the exact version of a library classification system is used, some parts of the scheme may remain completely empty while other parts are overcrowded. This can happen because the subjects of existing digital documents may differ widely from those found in printed collections, or because printed and digital collections in a subject area may differ in size (cf. 2.5.4.1. Ei classification).

In spite of these good reasons to locally adapt schemes, changes to a scheme will hamper interoperability and co-operation.

Interoperability between subject services could be accomplished by a hybrid use of universal and subject-specific schemes. Universal schemes could 'glue' different subject systems together and provide a coherent structuring principle at a top-entry level to subject-specific services. Then, when moving into the subject services themselves, a subject-specific scheme could be used.

With regard to subject-specific classification schemes, it is advisable that only well-established schemes should be used. Whenever feasible, especially in small services, it might help if a classification from one of the universal schemes could be added. Conversion programs between classification schemes could help accomplish interoperability as well.
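
A conversion program of the kind mentioned above could, in its simplest form, be a concordance lookup. The sketch below assumes a subject-specific scheme whose codes are dot-separated from broad to narrow, and a concordance table that maps some of those codes to classes of a universal scheme. The codes, the table entries and the function name are hypothetical and do not represent an authoritative concordance between any real schemes.

```python
from typing import Dict, Optional

# Hypothetical concordance: subject-specific code -> universal class.
# The entries are illustrative, not an authoritative mapping.
CONCORDANCE: Dict[str, str] = {
    "ENG": "62",
    "ENG.CIV": "624",
    "ENG.ELE": "621.3",
}


def convert(code: str, table: Dict[str, str]) -> Optional[str]:
    """Map `code` through the most specific matching concordance entry.

    If the exact code has no entry, the lookup walks up the hierarchy
    one level at a time and returns the mapping of the nearest broader
    class, or None if no ancestor is mapped.
    """
    parts = code.split(".")
    while parts:
        key = ".".join(parts)
        if key in table:
            return table[key]
        parts.pop()  # drop the narrowest level and retry
    return None
```

Here `convert("ENG.CIV.BRIDGES", CONCORDANCE)` falls back to the entry for `ENG.CIV`, and an unmapped code such as `"MED"` yields `None`. Falling back to a broader class loses precision but keeps every resource reachable from the universal scheme, which is the interoperability goal; a real service would also need to handle one-to-many mappings, which this sketch ignores.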

Home-grown schemes on the Web are not normally specifically designed to classify academic resources (for the research community) but aim to categorise a wider breadth of form and content: e.g. entertainment, commercial information, government information, etc. UDC, Dewey and LCC and subject-specific schemes, on the contrary, have been developed as schemes to classify the whole of knowledge and are especially useful for classifying academic resources, although as DDC shows, they can embrace popular types of content too. For an academic subject service, home-grown schemes should, therefore, not be developed.

Scheme conversion programs and methods of shared classification are considered very useful, especially for subject-specific services. Recently developed methods of derived indexing, clustering and selection technologies, agents and concept maps, and similar techniques of automatic classification are expected soon to offer good improvements in services of limited size.


Page maintained by: UKOLN Metadata Group
Last updated: 14-May-1997
