The role of classification schemes in Internet resource description and discovery
Work Package 3 of Telematics for Research project DESIRE (RE 1004)
The Dewey Decimal Classification System (DDC) was first produced by Melvil Dewey in 1876, originally for a small North American college library. It is currently in its 21st edition (Mitchell 1995; Mitchell, et al. 1996) and is published by Forest Press. DDC is distributed in Machine-Readable Cataloguing (MARC) records produced by the Library of Congress (LC) and bibliographic utilities like OCLC and RLIN. DDC is also used in the national bibliographies of the UK, Canada, Australia, Italy and other countries (Comaromi, et al. 1990, p.6). Research carried out by OCLC in the 1980s established that DDC was a suitable tool for browsing, first for library catalogues and then for the Internet (e.g. Markey 1989; Vizine-Goetz 1996a).
McKiernan (1996) lists 14 sites that use, or claim to use, DDC for the organisation of resources. Six of these sites were also available from the relevant part of the Yahoo! ontology: Computers and Internet:Internet:World Wide Web:Searching the Web:Indices to Web Documents:Dewey Decimal Classification <URL:http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Indices_to_Web_Documents/Dewey_Decimal_Classification/index.html>.
The Yahoo! page included a reference to the Champaign Public Library page, and McKiernan (1996) included the Utah State Library, but there appeared to be no Web pages organised by DDC at either of these library sites. These two sites were ignored, leaving 13 sites:
An Internet search was also undertaken for the terms 'Dewey Decimal' or 'DDC'. Web pages were searched using the major robot-based search engines (AltaVista, Excite, HotBot, InfoSeek/UltraSeek, Lycos, Webcrawler), and searches were made of USENET postings (via Dejanews) and the archives of the PACS-L and web4lib mailing lists. The following two extra sites were found:
Expanding Universe specialised in astronomy resources, and PICK in resources for information and library studies.
Two further Internet sites that use DDC are NetFirst and Biz/ed:
NetFirst has used DDC to organise a browsing structure since October 1996. DDC notations had been present in their links from the start of the service, but have only recently been made available for browsing (Oehler 1996).
Biz/ed is a subject service for business education, based at the University of Bristol and funded by a consortium of commercial and educational organisations, including Elib. Biz/ed offers an online catalogue of good quality Internet resources (using the ROADS software, and based on the SOSIG model - in fact, SOSIG and Biz/ed share the same offices). Since its inception in 1996, this catalogue has used an abridged version of DDC to classify resources, and to create browsable subject sections.
NetFirst can be considered a major search tool, and Biz/ed aims to develop into an important resource for those interested in business education. The other sites were on a smaller scale, although PICK, Expanding Universe and the Canadian Information by Subject sites were useful because of the specialised nature of the resources they gave access to. The access figures for those sites which had them did not indicate heavy use.
The sites can be categorised into those maintained by libraries and universities, and those maintained by dedicated individuals. Another categorisation relates to the type of resources held, either general or subject specific. All the sites used DDC to order resources, with the exception of Blue Web'n, a site specialising in Web resources for education. It used DDC not as an organiser but as an adjunct to a list of subjects in alphabetical order: each subject was given its 3-digit DDC number, merely as a reference.
Of the general-resource, library-based sites, Basalt Regional Library, Mid-Continent Public Library and Morton Grove Public Library (also known as WEBrary) used primarily 3-digit DDC numbers to group the resources they pointed to. Even at this general level of classification many numbers only led to single resources. The WEBrary was notable for including a short review alongside each resource. The Internet Resource at Napier University Library was the most threadbare in terms of the number of resources pointed to. Many DDC numbers had no resources attached.
The best of the general-resource, library-maintained sites was the National Library of Canada's Canadian Information by Subject. Here DDC numbers went beyond the initial decimal point, and the quantity of resources meant that clumps of resources appeared under individual DDC numbers. Nevertheless, the classification was not particularly deep.
Of the specific-resource, library/university-based sites, two made full use of DDC, using number-building facilities appropriate for their focused domains: Expanding Universe, based at the Metropolitan Toronto Reference Library, and PICK, at the Thomas Parry Library, University of Wales Aberystwyth. These sites used the longest DDC numbers of all the sites surveyed. Biz/ed had used the business section of DDC to pick out a selection of numbers and classes that could be used to form the browsable sections on the site.
Of the sites maintained by individuals, three stand out as being excellent guides to general resources: Global Network Of Silicon Information Services (GNOSIS), run by Patrick W. Clancey; CyberDewey: A Catalogue for the World Wide Web, run by David A. Mundie; and WWLib, by Peter Burden at the University of Wolverhampton. CyberDewey is perhaps the weakest of the three, with many resources being pages on Yahoo!, BUBL or the World Wide Web Virtual Library, and mainly shorter DDC numbers being used. Both GNOSIS and WWLib, despite being general resource collections, contain some long and well-built DDC numbers, especially in areas like computing that are presumably of interest to their creators. Burden, the creator of WWLib, is also an advocate of using automated techniques to classify the Web (Wallis and Burden 1995).
Alastair Smith's Bookmarks on the Net is what it says it is: his somewhat eclectic bookmarks, organised using short DDC numbers. The World Wide Web Reference Collection purports to organise reference sources by DDC but is almost devoid of resources.
In conclusion, the two specialised collections, Expanding Universe and PICK, are perhaps the most successful, as DDC gives them the facility to organise tightly-focused resources. The Biz/ed service has pragmatically used a section of DDC to structure the browsable interface of the catalogue. The general collections have the feeling of overlong bookmarks, with DDC less in evidence as a powerful organiser.
DDC is used by more libraries than any other classification scheme. It is currently used in 135 different countries and has been translated into 30 languages (Thompson, Shafer & Vizine-Goetz 1997). It is used by the Library of Congress Cataloguing Service in the bibliographic records it creates, alongside the Library of Congress Classification (LCC). DDC is found in many national bibliographies and in the records created by the major commercial bibliographic services. DDC classifications also appear (together with LCC and Library of Congress Subject Headings (LCSH)) in the Cataloguing in Publication (CIP) data produced by the Library of Congress, the British Library and some other national libraries. OCLC is now the owner and publisher (via Forest Press) of DDC, and the company maintains a Web page for Forest Press and DDC at: <URL:http://www.oclc.org/oclc/fp/fptxthm.htm>
The famous DDC 'decimal' notation, because it consists solely of digits and decimal points, is language independent. There is an attempt in DDC to attach meaning to groups of digits, albeit in a somewhat unwieldy manner.
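The way meaning attaches to groups of digits can be illustrated with a minimal Python sketch. It assumes that the first digit names a main class, the first two a division, the first three a section, and that each digit after the decimal point narrows the subject further. The function name `ddc_hierarchy` is hypothetical, and the real schedules do not always follow this pattern as neatly as the code does.

```python
import re

# A DDC-style notation: three digits, optionally followed by a decimal
# point and further digits (for example "624" or "624.1").
_DDC_PATTERN = re.compile(r"^(\d{3})(?:\.(\d+))?$")


def ddc_hierarchy(notation: str) -> list[str]:
    """Return the chain of broader notations leading to `notation`.

    Assumption: hierarchy is expressed purely by notation length.
    """
    match = _DDC_PATTERN.match(notation.strip())
    if match is None:
        raise ValueError(f"not a DDC-style notation: {notation!r}")
    whole, fraction = match.group(1), match.group(2) or ""

    # Main class, division and section come from the first three digits.
    levels = [whole[0] + "00", whole[:2] + "0", whole]
    # Each further digit after the decimal point adds one more level.
    for i in range(1, len(fraction) + 1):
        levels.append(f"{whole}.{fraction[:i]}")

    # Drop repeats, e.g. "700" is its own main class and division.
    unique = []
    for level in levels:
        if level not in unique:
            unique.append(level)
    return unique


print(ddc_hierarchy("624.1"))  # ['600', '620', '624', '624.1']
```

Because the function works on digits alone, it needs no captions in any language, which is the sense in which the notation is language independent.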
Translations of DDC exist in French, Italian, Spanish, Turkish and other languages. For more information about current products see: <URL:http://www.oclc.org/oclc/fp/products/fpprod-t.htm>.
The first edition of DDC was published in 1876. As such it was a product of its time and was imbued with a Western, Christian, pre-technological age viewpoint. Subsequent revisions (the latest being Edition 21 (Mitchell et al. 1996)) have done much to alleviate this. For example, Edition 21 included a major revision of the life sciences in 560-590, moving away from the organism-centred approach of earlier versions of DDC towards an orientation concentrating more upon processes (New and Trotter 1996). The scheme is updated more frequently than other universal schemes.
The 620s have long been a problem in DDC, as the process of revision struggles to keep up with the tangled growth of engineering. To give one example, building, as a human act that involves design, is spread between 624, 690 and 720. From Dewey's time this trifurcation has been a problem.
DDC is used successfully by several large art libraries. However, the entire 700 class has been faulted for its fragmentation and overlapping, with criticism tending to focus on the final two divisions. Moves to revise the 780 schedule have not met with success: it is still an extremely difficult schedule to use, while the 790s are still chaotic.
The 300s have seen much revision to try to iron out weaknesses, yet some remain. For example, social groups are still classed separately from their culture. While the statistics of a subject are now classed at the subject with 021 appended, no number exists for the statistics of neo- or perinatal death or, indeed, any mortality statistics with respect to a particular disease. The law schedule has seen major disagreement over whether jurisdiction or type of law should be classed first and, as a result, it allows 'options' in its interpretation.
Business and economics
Biz/ed compared the business and economics coverage of DDC with UDC and found in favour of DDC. This subject field has changed in emphasis in recent years (e.g. there is an increased emphasis on market economies), and it was found that DDC reflected this change best, having been revised and updated more frequently than UDC.
Using a universal scheme for a subject-specific service can cause problems, however, as the more detailed the classification, the more complicated the scheme is to use.
DDC numbers are linked to LCSH headings by most major bibliographic services to the extent that their bibliographic records contain LCSH headings together with DDC and LCC classification data. The USMARC record, for example, contains specific tags for several different classification schemes: DDC, LCC, UDC and NLM together with tags for subject headings including LCSH and MeSH.
DDC 21 classification numbers have also been mapped to LCSH, using statistical mappings between the two systems generated from the OCLC database.
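The idea of a statistical mapping can be sketched in Python by counting how often each subject heading co-occurs with each DDC number across a set of catalogue records. This is simple co-occurrence counting, used here only as an illustration; it is not a description of the procedure OCLC actually applied. The function name `map_headings_to_ddc`, the input shape and the sample records are all invented for the example.

```python
from collections import Counter, defaultdict


def map_headings_to_ddc(records):
    """Map each heading to the DDC number it most often co-occurs with.

    `records` is an iterable of (ddc_number, [subject headings]) pairs,
    a hypothetical simplification of real bibliographic records.
    Returns {heading: (ddc_number, share_of_records_with_that_heading)}.
    """
    counts = defaultdict(Counter)
    for ddc_number, headings in records:
        for heading in set(headings):  # count a heading once per record
            counts[heading][ddc_number] += 1

    mapping = {}
    for heading, ddc_counts in counts.items():
        total = sum(ddc_counts.values())
        # Highest count wins; ties are broken by the lower notation.
        ddc_number, hits = min(
            ddc_counts.items(), key=lambda item: (-item[1], item[0])
        )
        mapping[heading] = (ddc_number, hits / total)
    return mapping


# Invented sample records, not taken from any real database.
sample_records = [
    ("690", ["Building"]),
    ("690", ["Building"]),
    ("624", ["Building", "Civil engineering"]),
    ("720", ["Architecture", "Building"]),
]

print(map_headings_to_ddc(sample_records)["Building"])  # ('690', 0.5)
```

The share returned alongside each number gives a rough indication of how far a mapping can be trusted: a heading spread over several classes yields a low share, while a heading concentrated in one class yields a share close to 1.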
Selected new LCSH headings are individually linked to DDC numbers and are made available via: <URL:http://www.oclc.org/oclc/press/961206.htm>, although this only includes a very small proportion of the complete LCSH listings.
The USMARC format allows for links to be made between DDC and other classification systems, including LCC, UDC and NLM.
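As a rough illustration of how classification numbers and subject headings sit side by side in one record, the Python sketch below reads them out of a simplified record. The tag numbers (050 for LCC, 060 for NLM, 080 for UDC, 082 for DDC, and 650 with a second indicator of 0 for LCSH or 2 for MeSH) are taken from the MARC format documentation rather than from this report. The tuple layout, the function name `summarise_subject_access` and the sample record are assumptions made for the example.

```python
# Classification number fields in the MARC bibliographic format.
CLASSIFICATION_TAGS = {
    "050": "LCC",
    "060": "NLM",
    "080": "UDC",
    "082": "DDC",
}
# In field 650 (topical subject heading) the second indicator names
# the source of the heading: "0" for LCSH, "2" for MeSH.
SUBJECT_SOURCES = {"0": "LCSH", "2": "MeSH"}


def summarise_subject_access(fields):
    """Group classification numbers and headings by scheme.

    `fields` is a list of (tag, indicators, value) tuples, a
    hypothetical stand-in for a parsed MARC record in which `value`
    holds the content of subfield $a.
    """
    summary = {}
    for tag, indicators, value in fields:
        if tag in CLASSIFICATION_TAGS:
            summary.setdefault(CLASSIFICATION_TAGS[tag], []).append(value)
        elif tag == "650" and len(indicators) == 2:
            source = SUBJECT_SOURCES.get(indicators[1])
            if source is not None:
                summary.setdefault(source, []).append(value)
    return summary


# Invented sample record for illustration only.
sample_fields = [
    ("050", "00", "TA145"),
    ("082", "00", "624"),
    ("650", " 0", "Civil engineering"),
]

print(summarise_subject_access(sample_fields))
```

A service that wanted to link one scheme to another could start from summaries like this one, since every record carrying both a DDC number and another notation is a potential link between the two.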
DDC is maintained and edited in machine-readable format, using an Editorial Support System which was first used to create Edition 20 in 1990 (Comaromi et al. 1989). Subsets of this database have been made available under licence for certain projects.
An online summary of the first 3 digits of DDC 21 numbers is available on the WWW at: <URL:http://www.oclc.org/oclc/fp/ddc/ddcsum21/ddc21sm1.htm>. The scheme has been made available to Internet users on an experimental basis with the intention of encouraging authors and other relevant people to use DDC to classify materials on the Internet.
The software system Dewey for Windows is available on CD-ROM (for details see: <URL:http://www.oclc.org/oclc/fp/deweywin/dwytoc-t.htm>). However, tools like this and the Classification Plus component of Cataloger's Desktop are designed only to give cataloguers access to DDC and are not suitable for the application of DDC in an Internet environment.
Copyright rests with Forest Press/OCLC. They can be contacted via the Forest Press page <URL:http://www.oclc.org/oclc/fp/welcome/fpwelc-t.htm>. Those using the classification would be able to apply the notation without restraint in, say, library catalogues and WWW pages, but use of the other information in the schedules would require permission from Forest Press.
Edition 21 (Mitchell et al. 1996) is the latest revision, produced under the direction of Forest Press. Although DDC has undergone constant and sometimes radical revision from within, it has remained structurally recognisable and constantly useful.
When devised, DDC was an enumerative scheme. Subsequent revisions have absorbed and made use of the significant challenges and contributions of classification theory, chiefly the structure and methodology of faceted classification and the use of facet analysis. As a result, subsidiary tables and 'divide like' devices that reflect and can express many aspects of complex topics have been expanded, even though DDC is not a faceted scheme.
DDC stands ideologically or theoretically between its two major counterparts: decidedly more flexible than the Library of Congress Classification and certainly simpler than UDC.
DDC is well supported by an extensive explanatory and training literature. Its simple notation has accumulated considerable user familiarity over the long span of its existence.
Dewey editors are trying to code the nature of the relationships between class numbers and subject headings to see if it is possible to develop classifier and retrieval assistance tools (Vizine-Goetz 1996b).
Page maintained by: UKOLN Metadata Group
Last updated: 14-May-1997