The impact of electronic publishing on library services and resources in the UK

3.7.3 Subject retrieval

Because the whole text of digitised (character-coded) documents, including any assigned indexing terms, can be searched for the occurrence of words, phrases and other character-strings, subject retrieval should in principle be fast and effective. This holds for the relatively small amounts of text represented by a single bibliographic database or a single CD-ROM, but it does not follow that the vast amount of text scattered throughout the Internet is equally searchable. The first difficulty is to identify the networks, and then the network partners, which hold potentially relevant material. Given that the number of Internet hosts was reported to be 2.56 million in October 1993, the size of the problem can be readily appreciated.
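The kind of string searching described above can be illustrated with a minimal sketch in Python. It is not drawn from any particular retrieval system: the record layout (a "text" field and an "index_terms" list), the function name search and the three sample records are all assumptions made for illustration only.

```python
import re

def search(documents, query):
    """Return the ids of documents whose text or indexing terms contain the query string.

    Matching is case-insensitive and treats the query as a literal
    character-string, so it finds words, phrases and word fragments alike.
    """
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits = []
    for doc_id, doc in documents.items():
        # Search the full text together with the assigned indexing terms.
        searchable = doc["text"] + " " + " ".join(doc["index_terms"])
        if pattern.search(searchable):
            hits.append(doc_id)
    return hits

# Hypothetical sample collection standing in for a single small database.
documents = {
    "doc1": {
        "text": "Electronic publishing is changing library services.",
        "index_terms": ["electronic publishing", "libraries"],
    },
    "doc2": {
        "text": "A survey of CD-ROM databases in academic libraries.",
        "index_terms": ["CD-ROM", "databases"],
    },
    "doc3": {
        "text": "Network tools for locating files on remote hosts.",
        "index_terms": ["Internet", "information retrieval"],
    },
}

print(search(documents, "cd-rom"))                 # single word
print(search(documents, "information retrieval"))  # phrase found only in the indexing terms
print(search(documents, "librar"))                 # character-string (truncated word)
```

A scan of this kind is practical only because every record sits in one known collection. It has nothing to say about which of many remote hosts to search in the first place, which is the difficulty the Internet presents.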

A number of networked information retrieval tools have been developed to help users to search for and locate information held somewhere in the networks. These include Archie, Gopher, WAIS (Wide Area Information Servers), World Wide Web, Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) and probably others; for explanations of these tools, the reader is referred to J. Foster (ref19). However, these tools, which have been developed by computer specialists rather than librarians, fall well short of the precision needed to retrieve specific information. As Foster (op. cit.) says, "It is quite easy for a researcher to spend hours roaming the network looking for information of relevance and to find nothing of interest. This is partly due to the fact that information coverage is patchy, there is much duplication and in some areas there is not much of value, but it is also due to the lack of organisation and structure." There is a need for close cooperation between librarians and network specialists to identify the deficiencies of existing tools and to develop effective methods of organising and structuring the universe of information and data held on the networks.

