Going Digital: issues in digitisation for public libraries
By Neil Beagrie, Joint Information Systems Committee (JISC), on behalf
of EARL, The Library Association and UKOLN
As public libraries develop the People’s Network and respond to funding calls to create digital content, senior managers increasingly need to address policy issues related to digitisation and the management of electronic content and services. This paper aims to set out some of the key issues involved in ‘going digital’ and developing digitisation projects. It also points, as appropriate, to more detailed sources of advice and guidance for operational managers. It has drawn heavily on experience from relevant digitisation programmes in Higher Education and the cultural heritage sector.
Digitisation can be forward-looking, attractive to institutions and funding bodies, and an increasingly important area for public libraries. However, it is important that new technology is employed to serve rather than dictate the aims of projects. For public libraries, the context of partnership and collaboration in digitisation is also important. Many projects will only take off by virtue of their being externally funded and being a partnership led by others or a project at regional level. Managers will need to consider carefully why they should digitise material, how this might impact on their services in the long term, and how it should be funded and managed in parallel with other services. A paramount concern here is that digitisation should be set in a wider policy framework and related to an organisation's core mission. Although organisations may inevitably be responsive to external funding opportunities, it is noticeable that some of the most successful digitisation projects have had a well-established wider context, have thought through the issues, and have therefore achieved the greatest impact. A holistic ‘life cycle’ approach is promoted in this document and provides many benefits in undertaking digitisation. As most of the funding is front-loaded into the data creation phase, there are obvious benefits in thinking through longer-term issues at an early stage and seeking to build these in when the funding is being considered.
Getting it Right First Time
The initial planning and implementation phases of a digitisation project are widely recognised as being crucial to its eventual success. Many of the decisions made at this time will determine the future sustainability and usefulness of the resources created. Because of this, many guidelines, such as the New Opportunities Fund technical guidelines, recommend a holistic or ‘life cycle’ approach to projects in which all stages from data creation to future use, and the interdependencies between them, are considered.
Obtaining high quality archival masters and quality metadata will support generation of different access formats and resolutions and future re-packaging and use. The potential demand for digitisation and competition for funding also mean that opportunities to re-digitise in the future will be rare, so there are strong incentives to get it right first time and to aim for ‘one-time capture’ of the material.
Digitisation is a complex process and a lesson from past digitisation projects is that no project is likely to get it right first time through chance alone. The most successful projects have built in a pilot phase in which equipment can be calibrated and procedures tested and then adjusted. Others have advised that a pilot phase or project before the main digitisation is underway would be the one thing they would recommend to others about to embark on a digitisation project. Even when a decision has been made to outsource the digitisation, obtaining sample outputs from different potential suppliers and testing procedures is an essential first step.
Because digitisation is a complex process, good staff training at an early stage pays substantial dividends. Training in project management is valuable for those engaged in digitisation projects, particularly where many different partners/providers are involved in collaborative programmes. Project management courses are widely available through institutions or external providers. Training in digitisation is also becoming available and can make a valuable contribution to skilling up staff to either undertake or manage digitisation projects.
Although many digital imaging projects are undertaken to reach new audiences or provide new services, surprisingly few have undertaken market research or systematically involved users in the prototyping stages of the project. Such involvement can be crucial to developing a business plan, future sustainability of the resource, and the attractiveness of the proposal to funders. A good example in a local authority context of market research/user involvement in developing a proposal and prototyping a project is the Wiltshire Wills project.
Digitisation represents a substantial investment not only by external funding bodies but often by the local authority, either financially or in kind. It is important therefore to maximise the benefit of this investment, both by ensuring that there is a sound business plan for sustaining use of the resource after the initial funding for creation is expended and by giving thought to maintaining future access across changing technologies (often referred to as digital preservation strategies). Further consideration is given to this below.
The Digitisation Process
Excellent sources of advice on the digitisation process are now available online or in printed publications. They provide valuable background knowledge and can be comprehensive sources of information. A good understanding of the digitisation process is essential not only for those wishing to undertake digitisation in-house but also for those outsourcing and managing contracts. This is not solely a technical process: an equally important input to the digitisation process is the knowledge held by library staff of the material and their users.
Selection may be guided by specific library or collection development plans and a range of factors such as: potential cost of digitisation; the value of material (in terms of uniqueness, intellectual or historical value); comprehensiveness and range; its condition and ease of digitisation; how complementary it is to other local, regional, or national collections (particularly for collaborative programmes); its potential audience; and legal issues in its digitisation. A number of institutions have now prepared selection guidelines which provide a valuable general template to follow.
Copyright is of critical importance in all digitisation projects. It influences not only what can be digitised or distributed but also what can be archived. It is very easy to underestimate the amount of time required for projects to clear rights to allow digitisation to proceed. A policy issues paper devoted solely to copyright has been produced and is highly recommended.
There are many ways to achieve digitisation of materials, whether working in-house or outsourced, and a decision on this will be made in light of the specific project needs and local circumstances.
Organisations which decide to outsource their digitisation have a wide range of service providers to choose from.
Preparation is another area which it is very easy to overlook or underestimate. The condition and nature of material to be digitised will be crucial. Unique materials in poor condition may require conservation before scanning to avoid damage to the originals. Similarly, slides which are dirty may need cleaning if an image of sufficient quality is to be produced.
Use of file formats which are well documented, thoroughly tested, non-proprietary, and usable on different hardware and software platforms minimises the frequency of future migration, improves sustainability of the resource, and reduces the risk and costs of future maintenance. Similarly, utilising formats which have been widely adopted minimises risk, as it is more likely that migration paths will be provided by the manufacturers and a degree of 'backward compatibility' will be available between versions of the file format as it evolves. There are a number of excellent sources of more detailed advice on file formats and archiving and access requirements.
Catalogue and technical information associated with digital resources are crucial for searching and access, interoperability with related online materials, and future collection management. There is now a substantial body of literature dealing with metadata (both at the item and collection level) and guidelines on metadata standards are available in the NOF technical guidelines. Consideration should be given not only to metadata schemes but also to standardisation and formatting of entries using appropriate thesauri and recording standards, to allow effective retrieval and cross searching with other online resources.
Good quality-control is important not only for production of digital images but any accompanying metadata if a high-quality, reliable, and attractive product is to be produced. Formal quality-control and sampling, independent automated and/or manual checks, and agreed procedures for rectification should be built into project proposals and implementation plans.
Accurate costing is of high concern to managers and careful consideration of issues such as selection, preparation, rights clearance, and piloting projects noted above will help in this process. A number of worksheets have been produced to help estimate digitisation costs.
Digital data is very secure provided appropriate management is followed. However, if this management is not in place, digital data is easily corrupted or destroyed. Institutions will need to implement a range of procedures both during and after projects to ensure the digital resources they are creating are not subject to loss or damage and will be suitable for future access and use. Digitisation projects may generate substantial storage requirements and the need for specialised facilities. These may be available in-house, as part of a consortium, or from third party services. General guidance on data management and references to further detailed sources are available from texts on digital preservation.
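As a rough illustration of how storage requirements might be estimated at the planning stage, the sketch below calculates the size of an uncompressed raster image from its physical dimensions, scanning resolution, and bit depth, and then scales this up to a whole collection. The page size, resolution, bit depth, collection size, and number of copies are assumed values chosen purely for illustration, and the function names are hypothetical. Real archival files will differ somewhat because of file headers, embedded metadata, and any compression applied.

```python
def uncompressed_image_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Approximate size in bytes of one uncompressed raster image.

    width_in and height_in are the dimensions of the original in inches;
    dpi is the scanning resolution; bits_per_pixel is the colour depth
    (24 corresponds to 8 bits for each of red, green, and blue).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def project_storage_gb(num_images, bytes_per_image, copies=1):
    """Total storage in gigabytes (10**9 bytes) for all images and copies."""
    return num_images * bytes_per_image * copies / 10**9


# Assumed example: 10,000 A4 pages (8.27 x 11.69 inches) scanned at
# 300 dpi in 24-bit colour, held as three copies (for instance one
# working copy plus onsite and offsite backups).
per_image = uncompressed_image_bytes(8.27, 11.69, 300)
total_gb = project_storage_gb(10_000, per_image, copies=3)

print(f"Per image: {per_image / 10**6:.1f} MB")   # Per image: 26.1 MB
print(f"Project total: {total_gb:.0f} GB")        # Project total: 783 GB
```

Even with these modest assumptions the total is large, which is why storage and backup capacity are best costed when the funding bid is prepared rather than after capture has begun.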
Backups and Disaster Recovery Planning
It is essential that investment in digitisation is safeguarded through the implementation of appropriate regular backup and disaster recovery procedures. Multiple copies of the data stored both on and offsite reduce the risk of loss or damage, and procedures for regular updating of backups should always be in place. Depending on local institutional arrangements, it may be possible to automate backups over a network. It is advisable to write copies to a selection of different media formats to guard against faults introduced by media suppliers into their products.
Storage media are an important consideration and their selection will be based on a combination of factors including storage capacity, data transfer rates, the manufacturer’s market presence and reputation, and the IT infrastructure of the organisation. Different requirements and criteria may apply to storage media for access and archive copies.
Magnetic media provide a versatile and cheap storage medium. Digital Linear Tape (DLT) is frequently recommended for archival storage because of its high storage capacity and good data transfer speeds. Magnetic media are constantly evolving and, in addition to frequent changes in devices, manufacturers often undertake an almost constant evolution of production processes. It is important to be aware that faults in manufacture can occur and to build appropriate safeguards into backup and recovery procedures. Media should also be of high quality and purchased from reputable brands and suppliers.
Optical storage media such as CD-ROM, CD-R, and DVD use laser light to read from a data layer and are an increasingly popular method of storage. As with magnetic media, optical media have been subject to a constant process of evolution and changes in manufacture. Smaller storage capacity and slower data transfer rates, whether writing to or reading from the media, can be issues to consider. The quality of the media, a reputable source, and appropriate handling and storage environment will all affect its longevity. The use of light-sensitive dyes means some CD-Rs are less stable than CD-ROMs and more concerns have been raised over their use as archival media. As with magnetic media, there is considerable diversity in practice and production of CD-R, and greater care is needed in selecting high quality media from reputable suppliers for archival purposes.
In addition to the media it is important that attention is paid to the recording and access devices such as tape drives. These should be of good quality and well-maintained. Problems with the access devices e.g. head/media crashes are one of the most common causes of damage to magnetic storage media.
Media refreshing and reformatting
Media refreshing and reformatting are essential management components for all digital media to avoid media degradation and to facilitate longer term preservation strategies. Archive copies will be periodically refreshed onto identical media to address media degradation and impermanence, and reformatted/transferred to new storage media as storage technologies change. Media refreshing and reformatting should take place:
Appropriate environmental conditions will increase the longevity of digital storage media and help prevent accidental damage to a resource or its documentation.
Audit -- audit procedures provide reassurance that the resource has not been inadvertently or deliberately changed following refreshment and/or migration procedures, and check the readability and integrity of the data over time. Employ quality control procedures such as bit/byte or other checksum comparisons with originals to ensure the authenticity and integrity of items after media refreshing.
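The checksum comparison described above might be sketched as follows. This is a minimal illustration rather than a prescribed procedure: the choice of SHA-256, the idea of keeping a per-file manifest, and the function names are all assumptions made for the example. A manifest of checksums is built from the original files, and the refreshed copy is then checked against it to report any file that is missing or whose contents have changed.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its checksum."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(manifest, directory):
    """Compare a directory against a manifest.

    Returns a dict of problems keyed by relative file name; an empty
    dict means every file listed in the manifest is present and unchanged.
    """
    root = Path(directory)
    problems = {}
    for relative_name, expected in manifest.items():
        candidate = root / relative_name
        if not candidate.is_file():
            problems[relative_name] = "missing"
        elif file_checksum(candidate) != expected:
            problems[relative_name] = "checksum mismatch"
    return problems
```

In use, the manifest would be generated once from the archival masters and stored with the project documentation; running `audit(manifest, "path/to/refreshed/copy")` (the path is a placeholder) after each refresh or transfer then gives a simple record that the copy is intact.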
Data security procedures are essential for preservation and integrity of the resource by preventing alteration or loss. It is important to note that not all digital resources will require identical levels of security. Guidance on levels of security can be found in BS 7799 Information Security Management. All personal data will need to conform with the requirements of the Data Protection Act (1998).
Delivery services are an essential requirement of any digitisation project. A mixture of delivery mechanisms may be adopted, with delivery to users locally within the library, distribution of physical media such as CD-ROM, or online delivery via the Web. Delivery services may be via the local institution, a consortium, third party services, or even a combination of these (see business models below). Initial market research, user involvement, and subsequent evaluation will contribute to refining options for service delivery and interface issues. Where a third party service or consortium is being used, there are considerable benefits to early dialogue and engagement in the project, particularly to ensure consistency and compatibility, and that legal issues are addressed.
Integration with other resources
With the widespread adoption of the Web, Web delivery is likely to be a key consideration for many digitisation projects and a requirement for many funding bodies. Web delivery may allow integration and cross searching of digital resources from the project and other collections by users. Careful consideration should be given to maximising the points of access to the material and interoperability. Valuable guidance on developing Web services, design of Web pages, and metadata to support disclosure and interoperability is available from UKOLN.
Minimum technical standards for delivery services will normally be set by funders and networks, e.g. guidance for NOF digitise or the National Grid for Learning. In addition, guidance may be provided on accessibility issues.
It is important to utilise technical and design standards to support accessibility and social inclusion policies. Guidance on designing Web sites to take account of access for those with disabilities and a wide range of different hardware and software is available on the Web. A free online service to test Web pages for accessibility issues is also available.
It can be a desirable strategy to distinguish between formats (or versions) used for archiving and access on the basis of different requirements, e.g. it would be appropriate to store a high resolution image as a TIFF master file (archival format), but to distribute the image as a JPEG file (access format) of smaller size for transmission over a network. It would not be appropriate to store the JPEG image as both the access and archival format because of the irretrievable data loss this would involve.
In addition to system security issues raised under data management above, services will need to consider mechanisms to protect intellectual property rights in their material. This may include watermarking and fingerprinting of images, authentication of users, and licensing arrangements. Public Web sites may utilise low resolution images with limited commercial potential and copyright notices and user licence agreements to discourage inappropriate use.
Digitisation and digital services are an emerging area and many of the models for long-term viability are as yet untested or too new to be predictable and established. Sustainability in this sense covers two inter-dependent and related areas: the resources needed to maintain the digital material over time (internally and/or externally generated); and the preservation strategies needed to maintain future access and authenticity. The life cycle approach is important for sustainability: decisions made in creating digital resources have significant implications for their future maintenance and use. Use of open standards and investment in good digital capture and quality control help future proof and minimise risk when many variables on future strategies remain uncertain.
In addition to seeking to become mainstream activities within institutions with core funding support, many digitisation initiatives may need to explore options to raise external funding and/or utilise external services to reach niche markets. A number of commercial image providers are established and have acted for heritage organisations to market images and provide a royalty stream. However, their requirements are often quite specific and may not be suitable for all needs. Not-for-profit organisations have also been established to distribute and market digital materials, often to the educational community. Issues in developing business models may include institutional, funding body, and supplier policies on advertising, sponsorship and charging. An excellent and detailed discussion on the subject of income generation is available from the NOF technical advisory service.
A companion paper in this series on Charging and Networked Services provides valuable guidance on charging for services. The case for and against charging and the need to balance aspirations of free access and the reality of limited funds is carefully rehearsed.
Sustaining access through preservation
Sustaining access to and preserving digital resources differ substantially from the print environment. A book is immediately intelligible to the reader but digital materials require the use of software and hardware to render and display the contents. In the traditional environment, libraries can preserve the physical objects (books, newspapers, etc) to preserve access to the information. Volume printing, physical persistence, and multiple library collections, help ensure preservation and continuing access to such materials. In the digital environment different norms apply. Sustaining access and preservation of digital resources is concerned with preserving information regardless of the object on which that information is stored. This is because software and hardware used to access the information change rapidly and become obsolete and the physical media on which digital data are stored are impermanent. Digital preservation involves inter-dependent strategies for preservation in the short to medium term based on securing the computer systems, storage media, data and documentation; and strategies for longer-term preservation to address the issues of software and hardware obsolescence. Common strategies for long-term preservation can be summarised as follows:
Most digitisation projects will only be responsible for the creation and maintenance of digital resources for the lifetime of the project and will make arrangements for longer-term preservation and access within the institution or with an appropriate repository. The institution or repository will then be responsible for implementing preservation strategies and procedures. Digital preservation is an area of active research and emerging practice and further guidance on preservation strategies is available from a number of sources.
Digitisation opens up new audiences and services for public libraries and needs to be integrated into the plans and policies of the institution and sector to maximise its effectiveness. It is a complex process with many crucial dependencies between different stages over time. Utilising a holistic life-cycle approach for digitisation initiatives will help develop sustainable and successful projects. Much has already been learnt from digitisation projects within libraries and from the development of digital services. It is hoped that the approach and issues outlined in this paper and references to more detailed sources and past projects will contribute to future successful initiatives in the sector.
If you wish to comment on any issues raised in this paper, please use the Feedback option on the main Networked Services Policy Task Group Web site.