Appendix 8 Library Technology News Articles

The following articles were published in the News section of the Library Technology News magazine.

Public Library Domain Names - February 1998

This is the first of a series of regular News items by the WebWatch project providing information on the status of Public Libraries on the Web.

WebWatch is a project aiming to develop and use robot software for analyzing various aspects of the World Wide Web. The project is funded by the British Library Research and Innovation Centre (BLRIC) and is based at UKOLN.

Domain names can indicate the nature of the organization hosting the server. The Hardens' list of public library websites is used as the input for the WebWatch surveys. The list currently shows 133 sites. From this list, the following domains were found: 43% gov.uk, 41% org.uk (of which 41% are part of EARL), 7% co.uk, 5% ac.uk, 5% com, 1.5% org and 1.5% net. The domains refer respectively to UK local government sites, UK organizations, UK commercial providers, UK academic institutions, companies, organizations and commercial providers.
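As an illustration of how such a breakdown could be produced, the following is a minimal sketch in Python. It is not the WebWatch robot software; the function names and the example URLs in the usage note are hypothetical, and it assumes the input is simply a list of entry-point URLs.

```python
from collections import Counter
from urllib.parse import urlsplit

# Domain suffixes reported in the survey above.
SUFFIXES = ["gov.uk", "org.uk", "co.uk", "ac.uk", "com", "org", "net"]


def domain_category(url):
    """Return the survey suffix that the URL's host name ends with, or 'other'."""
    host = (urlsplit(url).hostname or "").lower()
    for suffix in SUFFIXES:
        if host == suffix or host.endswith("." + suffix):
            return suffix
    return "other"


def domain_shares(urls):
    """Return the percentage of URLs falling under each suffix."""
    if not urls:
        return {}
    counts = Counter(domain_category(u) for u in urls)
    total = sum(counts.values())
    return {suffix: 100.0 * n / total for suffix, n in counts.items()}
```

Calling domain_shares on a list such as ["http://www.example.gov.uk/library/", "http://lib.example.org.uk/"] gives the share of each suffix as a percentage of the list.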

It will be interesting to see how the proportions change as more public library websites come online.

WebWatching Public Library Web Site Entry Points - April 1998

In March the WebWatch project analysed the entry-points of public library websites. The analysis looked at the total size of each entry-point (including images) and profiled the links it contained (both ordinary hyperlinks and "active maps").

Figure 1 shows a frequency distribution of the total-size of entry-points.

Figure 1 - Frequency Distribution Of The Total-Size Of Entry-Points

Figure 1 shows that the distribution of entry-point sizes is approximately normal. The mean size of an entry-point is about 23Kb and the median is around 19Kb. The tail to the right and the extreme outlier correspond to sites using larger images or more images than most, for example a large, detailed logo.
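The relationship between the mean, the median and the right-hand tail can be sketched as follows. This is only an illustrative Python fragment under stated assumptions: the function names are hypothetical, the total size of an entry-point is taken to be the HTML size plus the sizes of its inline images, and any figures passed in are the caller's own rather than WebWatch data.

```python
from statistics import mean, median


def entry_point_size(html_bytes, image_bytes):
    """Total size of an entry-point: the HTML file plus all inline images."""
    return html_bytes + sum(image_bytes)


def summarise(sizes_bytes):
    """Return the mean and median size in Kb for a list of sizes in bytes."""
    sizes_kb = [size / 1024 for size in sizes_bytes]
    return {"mean_kb": mean(sizes_kb), "median_kb": median(sizes_kb)}
```

A mean that is noticeably larger than the median, as in the survey, is what one would expect when a few image-heavy entry-points pull the average upwards.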

Figure 2 profiles the number of hyperlinks contained within each entry-point.

Figure 2 - Number Of Hyperlinks Contained Within Each Entry-Point

The average number of hyperlinks per entry-point is about 13. The outlier represents a cluster of pages that provide links to local branch libraries and other local information.

It will be interesting to see how these profiles change as public libraries gain experience in managing websites. Will, for example, the entry-point sizes grow (making more use of images) or shrink (providing a faster response)?

WebWatching Academic Library Web Sites - June 1998

Following on from UKOLN's analysis of public library websites, we analysed 86 university and college library websites.

Figure 1 - Size Of Academic Library Web Sites

The average size of an academic library website is around 4.6Mb. The histogram intervals in Figure 1 are 2Mb, so the chart indicates that most sites are less than 2Mb in total. The smallest site was around 4Kb (consisting of one page). The largest was 133Mb (not shown in Figure 1 in order to keep the scale manageable). These figures include all resources (HTML, images and so on).

Academic library sites are larger than their public library counterparts and make greater use of web technologies, such as dynamically generated pages. Note that dynamically generated pages from four web sites were excluded from the analysis because they produced recursive hyperlinks that were unsuitable for robot traversal.

A more detailed report will soon be available from the WebWatch web area.

Academic and Public Library Web Sites - August 1998

Earlier this year the WebWatch project analysed both academic library web sites and public library web sites in the UK. The reports on these web crawls can be found at the WebWatch web-area, <URL: http://www.ukoln.ac.uk/web-focus/webwatch/>. This column looks at some comparisons between the two analyses.

Academic library sites contain a lot more content. There are more resources (primarily HTML documents and images) within academic library sites than within public library sites, resulting in a larger overall site size in Kb for academic libraries. The average size of a page (HTML plus inline images) is roughly the same for both kinds of site, so a page should take about the same time to download from either. The academic libraries do, however, show a greater use of technologies such as JavaScript and CGI.

We found the structure of academic library sites more suited to robot traversal (apart from their dynamic content) than the public library sites, primarily because there is greater structuring of directories within academic library sites.

Since academic libraries are part of institutions which have been networked for some time, it is not surprising to find their web sites more developed than public libraries. It will be interesting to continue comparing the two communities as public libraries gain experience in providing information on the web.

Academic Libraries and JANET Bandwidth Charging - November 1998

UK Universities form part of the global Internet via a connection to JANET, the UK academic and research network. On 1 August 1998, charging was introduced for each institution's use of JANET transatlantic links. Currently, traffic from the US is charged at a rate of 2 pence/Mb during the hours 06:00 to 01:00 (though many universities will obtain subsidies from HEFCE or DENI).
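To give a feel for the arithmetic, the charging rule can be sketched in Python as below. This is a simplified illustration, not the actual JANET charging scheme: the function names are hypothetical, subsidies are ignored, and it assumes that the charged period runs from 06:00 up to, but not including, 01:00 the following morning.

```python
from datetime import time

RATE_PENCE_PER_MB = 2  # rate quoted above for traffic from the US


def is_charged(t):
    """True if time t falls in the charged period, which wraps past midnight."""
    # Assumption: 06:00 is charged and 01:00 itself is not.
    return t >= time(6, 0) or t < time(1, 0)


def charge_pence(megabytes, t):
    """Charge in pence for a transfer of the given size made at time t."""
    return megabytes * RATE_PENCE_PER_MB if is_charged(t) else 0
```

On these assumptions, 500Mb fetched from the US at 14:00 would cost 1000 pence (10 pounds), while the same transfer at 03:00 would be free.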

By identifying domains within links to resources in a web site, it is possible to get a rough idea of the potential of the site to generate transatlantic bandwidth (although, of course, the situation is complicated by the use of caches, users entering URLs directly etc.). WebWatch therefore analysed academic library websites to explore this idea further.

Figure 1 shows the five most popular domains hyperlinked to from academic library web pages. Assuming that the UK domain implies that packets are not routed via the US, the chart shows that most links point to resources that will not directly consume transatlantic bandwidth. If the small number of links to other domains point to very popular resources, then there is still the potential to attract charging. In this case, such hyperlinks were found relatively deep within the web sites, which may suggest that they will not be frequently selected.
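A rough idea of how the hyperlinked domains on a page might be counted is given by the following Python sketch. It is not the WebWatch robot; the class and function names are hypothetical, only top-level domains are counted, and relative links are assumed to resolve against the URL of the page on which they appear.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href> and image-map <area href> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "area"):
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def top_level_domains(html, base_url):
    """Count the top-level domain of every hyperlink found in the page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    counts = Counter()
    for link in collector.links:
        host = urlsplit(link).hostname  # None for mailto: links and similar
        if host and "." in host:
            counts[host.rsplit(".", 1)[1]] += 1
    return counts
```

Summing these counters over all pages of a site and calling most_common(5) on the result would give a top-five list of the kind shown in Figure 1. Hosts given as numeric IP addresses would need separate handling.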

Figure 1 - Top Five Hyperlinked Domains

Final WebWatch News Column - February 1999

This month sees the last WebWatch news column. Although UKOLN hopes to continue with some WebWatch activities, the project has now formally terminated and the final report is available from the WebWatch reports area at <URL: http://www.ukoln.ac.uk/web-focus/webwatch/reports/>.

WebWatch analyses have been amongst the first to attempt characterisation of UK Web communities such as public libraries and UK academic institutions. We hope that the analyses have provided useful information for relevant communities as well as laying some foundations for further work.

All reports, articles and WebWatch material can be obtained from the web site, at <URL: http://www.ukoln.ac.uk/web-focus/webwatch/ >.

Readers may find the various WebWatch services useful for analysing their own web pages. These can be found at <URL: http://www.ukoln.ac.uk/web-focus/webwatch/services/ >.