BRITISH LIBRARY RESEARCH AND INNOVATION REPORT 3

The Impact of Digital Resources on British Library Reading Rooms


5. THE INTERNET

This section describes some key aspects and dimensions of the Internet. It then discusses the amounts of scholarly information available on the Internet in two areas which are specifically relevant to the Library, namely periodicals and monographs; and it concludes with brief descriptions of the current levels of interaction between the British Library and the Internet.

5.1 Current state of the Internet

The Internet has grown and evolved rapidly over recent years. Growth and evolution are continuing, and are bringing great uncertainty. It is difficult to make quantitative forecasts about the Internet with any confidence.

The Internet has evolved in recent years from a predominantly academic network into today’s increasingly generalised and increasingly commercialised network. Its structures - standards, decision-making processes, and physical infrastructure - mostly evolved to suit the relatively small academic community rather than today’s large and growing general commercial network. These structures are now under pressure:

"No one is sure what business models will be successfully applied to Internet services [ ... ] now we have the ‘jump in, profits be damned’ approach, with lots of companies spending millions of dollars to stake a claim. This model will not last much longer" [4]

"Growth [ ... ] is stretching the Internet beyond its design specifications and capabilities. Its business model is becoming increasingly fragile" [5]

There is another, unrelated, factor which serves to cloud predictions: the future of network communications hardware and infrastructures. One of the liveliest unresolved debates in the IT industry at present is over the commercial future of a new workstation concept dubbed the Network Computer (NC). If it is successful, the NC may decrease the cost of the hardware required to connect to the Internet by up to 75%. If this happens, NCs would displace some PCs as Internet workstations and, critically, greatly increase the number of network users. This could lead to market changes which result in cheaper and faster telecommunications; which in turn would encourage more use. Such a spiral of lowering cost and increasing usage could radically change the nature of the Internet.

This is an important perspective for this study. All predictions of the effect of the Internet on libraries must depend on assumptions about its future, and about the volumes of information it will hold and about levels of use by different constituencies. No reliable estimates are available to support such assumptions, so any predictions of the impact on libraries must be treated with appropriate circumspection.

5.2 Internet statistics

There are several measures of Internet size and growth, all imperfect. The two most relevant are the numbers of servers connected to the Internet and the numbers of users. The number of servers* is routinely measured by network programs which check electronic directories and which validate them by attempting to confirm the existence of servers.

These measurements show exponential growth, which (currently) reflects the speed with which corporate users are connecting servers for commercial reasons. This is shown in Figure 3, which plots the rise in World-Wide Web server numbers over the past eight years.

Graph showing increase from c. 2 hosts in 1989 to c. 9,500,000 hosts in 1996
Figure 3: WWW Hosts, 1989 - 1996 [6]

This figure has two limitations in the present context:

Nevertheless, this clearly demonstrates the rapid growth, to over 10 million host computers at present.
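As a rough illustration of the growth implied by Figure 3, the sketch below takes the two endpoint values given in the figure description (about 2 hosts in 1989 and about 9,500,000 in 1996) and assumes, purely for illustration, a constant growth rate between them. The function names are hypothetical, and actual growth was certainly less uniform than this.

```python
import math


def annual_growth_factor(start, end, years):
    """Constant yearly multiplier that turns `start` into `end` over `years`."""
    return (end / start) ** (1.0 / years)


def doubling_time_years(factor):
    """Years needed to double at a constant yearly multiplier `factor`."""
    return math.log(2) / math.log(factor)


# Endpoints read from the Figure 3 description; both are approximate.
HOSTS_1989 = 2
HOSTS_1996 = 9_500_000

factor = annual_growth_factor(HOSTS_1989, HOSTS_1996, 1996 - 1989)
doubling_months = 12 * doubling_time_years(factor)

print(f"average yearly multiplier: {factor:.1f}")
print(f"implied doubling time: {doubling_months:.1f} months")
```

Under these assumptions the host count multiplies roughly ninefold each year, which corresponds to a doubling time of a little under four months.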

Measurement of the number of Internet users suffers from a number of methodological difficulties. For example, is an Internet user:

Even if this is agreed, questions remain about accounts with multiple users, and methods of gathering statistics accurately, given the Internet’s inherently distributed and unmanaged structure.

Added to these difficulties are the levels of uncertainty caused by the rapid growth and high levels of change. This uncertainty is clearly shown on the graph in Figure 4:

Graph showing figures as described in the paragraph below
Figure 4: Internet Usage Prediction

This graph, taken from a recent Gartner Group [7] report, shows a prediction of the number of Internet users worldwide to 1998. Even over this limited timescale, the prediction varies between a "conservative" 40 million and an "optimistic" forecast of some 160 million users.

To set these global statistics into a national context, a recent study by IDC suggests that there were some 1.3 million users in the UK in 1995 and forecasts that this will grow to 7.7 million by the turn of the century [8].
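The forecasts quoted above can be restated as growth rates. The sketch below is illustrative only: it assumes "the turn of the century" means the year 2000, and the function and variable names are hypothetical.

```python
def compound_annual_growth(start, end, years):
    """Average yearly growth rate (as a fraction) from `start` to `end`."""
    return (end / start) ** (1.0 / years) - 1.0


# IDC figures for the UK, as quoted in the text.
UK_USERS_1995 = 1.3e6
UK_USERS_2000 = 7.7e6  # assumption: "turn of the century" taken as 2000

uk_rate = compound_annual_growth(UK_USERS_1995, UK_USERS_2000, 2000 - 1995)

# Gartner Group worldwide forecasts for 1998, as quoted in the text.
CONSERVATIVE_1998 = 40e6
OPTIMISTIC_1998 = 160e6

forecast_spread = OPTIMISTIC_1998 / CONSERVATIVE_1998

print(f"implied UK growth: {uk_rate:.0%} per year")
print(f"optimistic forecast is {forecast_spread:.0f}x the conservative one")
```

On these assumptions the IDC forecast implies UK growth of a little over 40% a year, while the Gartner forecasts for a single year differ by a factor of four.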

The issues of growth and uncertainty have been explored at some length here because they provide an important context to the subsequent findings: all debate about any effect of the Internet must be conditioned by the high levels of growth, evolution and uncertainty associated with it.

5.3 Academic journals on the Internet

There is growing commercial interest in using the Internet as a medium for publishing and storing journals. Broadly, journals on the Internet can be grouped into three categories: library-sponsored projects, Internet-only periodicals, and parallel publications.

The first category, library-sponsored projects, has as yet had little global impact. The number of journals published and the size of their back runs present dauntingly large challenges. Projects so far tend to have impacts limited to particular fields, often in individual universities.

Internet-only periodicals are numerous across the network as a whole, but few of them are "quality" scholarly journals**. Their attraction is that they permit novel multimedia functions which are impossible on paper, such as maps with hyperlinks to databases and manipulable images of three-dimensional models, but the economic models for their long-term existence are still being sought.

The third category, parallel publications, is the most likely to be significant in the short term. This conclusion cannot be definite, but it rests on two observations: a high percentage of readers use recent issues; and some publishers have already announced parallel editions***, while other major publishers are thought to be on the verge of doing so.

The mention of publishers being "on the verge" of parallel publishing is crucial. No segment of the Internet is developing more rapidly than periodical publishing, and for that reason it is impossible to obtain accurate or up-to-date statistics. For example, the most authoritative resource in the area [9] found that the number of quality journals published electronically doubled from 1994 to 1995. There is no agreement as to whether digital editions will reach a position of great importance, with or without parallel paper editions.

The total number of periodical titles currently published on paper has been estimated [10] as around 130,000. This figure, while open to interpretation on definitional grounds, appears to be accepted as a reasonable inclusive limit. Of these, DSC subscribes to some 40,000.

The most recent study [11] of quality digital journals dates back to late 1995. Concentrating on the field of STM (science, technology and medicine), it discovered 115 titles. This can be contrasted with the large number of STM titles subscribed to by DSC: approximately 16,000. Since the survey, a number of substantial academic publishers have announced parallel editions and, as noted above, others are thought to be about to follow. We can therefore estimate that on the order of 1,000 quality titles are available on the Internet now. In other words, around one sixteenth of the STM titles taken by DSC are available electronically today; but the proportion will increase rapidly in the short term. For estimating purposes, it seems realistic to suppose that half of the academic STM titles will be available digitally within five years.
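The arithmetic behind this estimate can be checked with a short sketch. It assumes that DSC's 16,000 STM subscriptions are the baseline against which "half of the academic STM titles" is measured; that baseline and all names in the code are illustrative assumptions rather than figures from the study.

```python
# Figures quoted in the text.
DSC_STM_TITLES = 16_000   # STM titles subscribed to by DSC
ONLINE_NOW = 1_000        # estimated quality titles on the Internet today

# Assumptions for the five-year projection.
TARGET_SHARE = 0.5        # "half of the academic STM titles"
YEARS = 5

share_now = ONLINE_NOW / DSC_STM_TITLES
target_titles = TARGET_SHARE * DSC_STM_TITLES

# Constant yearly multiplier needed to grow from ONLINE_NOW to target_titles.
required_factor = (target_titles / ONLINE_NOW) ** (1 / YEARS)

print(f"share available now: {share_now:.4f}")
print(f"titles needed for a half share: {target_titles:.0f}")
print(f"required yearly multiplier: {required_factor:.2f}")
```

Reaching a half share within five years would need the number of digital titles to grow by roughly half again each year, which is slower than the doubling reported between 1994 and 1995.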

5.4 Books/Monographs on the Internet

The concept of a "book" or "monograph" on the Internet is ill-defined; much of the information on the WWW consists of defined text, with defined editions, and all is published. Very little of it is thought of as being books or monographs, however.

A vanishingly small proportion of monographs are published directly in digital form. Probably the most useful meaning of "Books/Monographs on the Internet" for this study is therefore "an existing paper book which has been digitised".

There are numerous projects and ongoing initiatives around the world which are engaged in digitising or collecting books [12], [13]. Many, though not all, make the results of their digitisation available via the Internet. Examples of some larger initiatives, together with a rough estimate of the number of texts they hold, are listed in the table in Figure 5.

Project/Initiative                               Approx. Size
Cornell University projects                      2,000
Project Gutenberg                                650
Oxford Text Archive                              2,000
Open Book Project                                2,000
University of Virginia Electronic Text Center    4,200

Figure 5: Sample Digitised Book Projects

Note that this table does not attempt to be complete in its coverage; some large initiatives are not shown. It includes only English texts (save for a small fraction of non-English language texts which are held in some of the above, notably the Oxford Text Archive). It excludes in particular the large digital collection of French language texts - 100,000 books - assembled at the Bibliothèque Nationale de France. Also excluded are the large US-based library projects, some of which expect to digitise large quantities of English language materials in the near future.

The table in Figure 5 shows that over 10,000 books are readily identified. An exact count would be time consuming - especially when other projects/initiatives [14] are included - as the various lists overlap. On this basis, it seems reasonable to suppose that there may be up to 20,000 English language books (as defined above) already available via the Internet. This number is constantly growing, as existing endeavours proceed and as new projects are announced.
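The "over 10,000" figure can be confirmed by summing the sizes listed in Figure 5. The sketch below simply adds the five rough estimates and, like the text, ignores overlaps between the collections; the variable names are illustrative.

```python
# Approximate sizes taken from the table in Figure 5.
FIGURE_5 = {
    "Cornell University projects": 2_000,
    "Project Gutenberg": 650,
    "Oxford Text Archive": 2_000,
    "Open Book Project": 2_000,
    "University of Virginia Electronic Text Center": 4_200,
}

# Simple total; titles held by more than one project are counted twice.
identified = sum(FIGURE_5.values())

print(f"books identified in Figure 5: {identified:,}")
```

Because the lists overlap, this total is an upper bound for the five projects shown, not a count of distinct titles.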

Although the focus here is on the Internet, it is appropriate to consider also books published digitally in other ways. Small specialist publishers (e.g. Spectrum Press, Voyager) and technical publishers (e.g. BSI, HMSO) publish materials on diskette; the former as a primary medium, the latter as a secondary format. The number of publications available in this way amounts to only a few hundred.

The figure of about 20,000 Internet books can be contrasted with the far larger number of texts deposited or acquired in print form at the British Library annually: some 181,000 monographs last year alone. The entire collection comprises some 12 million volumes of monographs and serials in London, with a further 3 million monographs in DSC [15].
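To put these figures side by side, the sketch below expresses the estimated 20,000 Internet books as a fraction of one year's monograph intake and of the whole collection. The comparison is only indicative: the 12 million London volumes include serials as well as monographs, and the variable names are illustrative.

```python
# Figures quoted in the text.
INTERNET_BOOKS = 20_000             # upper estimate of English books online
ANNUAL_MONOGRAPHS = 181_000         # monographs deposited or acquired last year
LONDON_VOLUMES = 12_000_000         # monographs and serials held in London
DSC_MONOGRAPHS = 3_000_000          # monographs held at DSC

collection_volumes = LONDON_VOLUMES + DSC_MONOGRAPHS

share_of_annual_intake = INTERNET_BOOKS / ANNUAL_MONOGRAPHS
share_of_collection = INTERNET_BOOKS / collection_volumes

print(f"share of one year's intake: {share_of_annual_intake:.1%}")
print(f"share of total collection: {share_of_collection:.2%}")
```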

Most of the books available digitally are heritage materials, mostly literature (old texts which are out of copyright), though some modern materials are also available. The situation may change in the near future if book publishers choose to publish parallel digital editions, for example for the academic sector.

At present, however, it is clear that the books available via the Internet represent a minuscule proportion of the Library’s holdings or acquisitions, by any measure. The numbers are growing rapidly, as projects digitise existing books, and may grow more rapidly if publishers make digital versions available.

5.5 The British Library on the Internet

The British Library currently has a presence on the Internet in the form of its Portico service, which contains some 560 "pages" of information. Portico mostly holds information about the Library rather than displaying holdings. The volume of access to Portico is measured in terms of "raw hits", and the service is receiving on the order of a third of a million raw hits per month. This apparently high level of interest is typical of the Internet phenomenon; but note that raw hits are a particularly crude measure of interest because of several limitations:

The Library has undertaken a number of small-scale digitisation projects (Burney collection, St Pancras Treasures, etc) but with the exception of the Beowulf manuscript these are not accessible via the Internet.

The Library also makes its OPAC available to some universities via the Internet. Finally, the Library recently announced accessibility of its digital catalogues through the Internet (the BLAISE-Web service), with a service providing not only OPAC functionality but also the ability to download fully formatted catalogue records for use in local catalogues.

5.6 The Internet in the British Library

Staff in some reading rooms make use of the Internet for research. Access for readers is currently being introduced on a trial basis. The trial will involve one PC in each of the reading rooms for BLISS, SRIS and DSC.

* A server can be defined in informal terms as a computer which is connected to the Internet to supply information. This contrasts with a PC or workstation connected to the Internet to obtain information from it.

** The term "quality journal" is taken as meaning a regular periodical which is peer-reviewed and which attracts academic respect.

*** e.g. American Mathematical Society, Academic Press, Institute of Physics Publishing



Converted to HTML by Isobel Stark of UKOLN, August 1996