UKOLN AHDS Digitisation Of Still Images Using A Flat-Bed Scanner

Preparing For A Large-Scale Digitisation Project

The key to the development of a successful digitisation project is to separate it into a series of stages. All projects planning to digitise documents should establish a set of guidelines to help ensure that the scanned images are complete, consistent and correct. This process should consider the proposed input and output of the project, and then find a method of moving from the first to the second.

This document provides preparatory guidance to consider when approaching the digitisation of many still images using a flatbed scanner.

Choose Appropriate Scanning Software

Before digitisation can begin, the digitiser requires suitable tools to scan and manipulate the image. It is possible to scan a graphic using any image processing software that supports TWAIN (an interface for connecting to a scanner, digital camera, or other imaging device from within a software application). However, the software package should be chosen carefully to ensure it is appropriate for the task. Possible criteria for measuring the suitability of image processing software include:

Time may be saved by using a common application, such as Adobe Photoshop, Paint Shop Pro, or GIMP. For most purposes, these offer functionality that is rarely provided by the editing software supplied with the scanner.

Check The Condition Of The Object To Be Scanned

Image distortion and dark shading at page edges are common problems encountered during the digitisation process, particularly when handling spine-bound books. To avoid these and similar issues, the digitiser should ensure that:

  1. The document is uniformly flat against the document table.
  2. The document is not accidentally moved during scanning.
  3. The scanner is on a flat, stable surface.
  4. The edges of the scanner are covered with paper to block the external light that enters when the object does not lie completely flat against the scanner.

Scanning large objects that prevent the scanner lid from being closed (e.g. a thick book) often causes discolouration or blurred graphics. Removing the spine allows each page to be scanned individually; however, this is not always an option (e.g. when handling valuable books). In these circumstances you should consider a planetary camera as an alternative scanning method.

Identification Of A Suitable Policy For Digitisation

It is often costly and time-consuming to rescan the image or improve the level of detail in an image at a later stage. Therefore, the digitiser should ensure that a consistent approach to digitisation is taken in the initial stages. This will include the choice of a suitable resolution, file format and filename scheme.

Establish a consistent quality threshold for scanned images

It is difficult to improve low-quality scans at a later date. It is therefore important to digitise images at a slightly higher resolution (measured in pixels per inch) and bit depth (24-bit or higher for colour, or 8-bit or higher for greyscale) than required, and to rescale the image at a later date.
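The trade-off between resolution, bit depth and storage can be estimated before scanning begins. The following is a minimal sketch, not a prescribed method: the function names and the 6 x 4 inch example are illustrative assumptions, and the size estimate covers uncompressed pixel data only (file headers and metadata are ignored).

```python
def scan_dimensions(width_in, height_in, ppi):
    """Pixel dimensions of a scan of an original measured in inches."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Approximate size of the raw pixel data, ignoring headers and metadata."""
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical example: a 6 x 4 inch photograph scanned in 24-bit colour.
for ppi in (300, 600):
    w, h = scan_dimensions(6, 4, ppi)
    size_mb = uncompressed_bytes(w, h, 24) / 1_000_000
    print(f"{ppi} ppi: {w} x {h} pixels, about {size_mb:.1f} MB uncompressed")
```

Doubling the resolution quadruples the pixel count, so a "slightly higher" resolution should be chosen with the available storage in mind.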

Choose an appropriate image format

Before scanning the image, the digitiser should consider the file format in which it will be saved. RGB Baseline TIFF Rev 6 is the accepted format of master copies for archival and preservation (although PNG is a possible alternative file format). To preserve the quality, it is advisable to avoid compression where possible. If compression must be used (e.g. for storing data on CD-ROM), the compression format should be noted (Packbits, LZW, Huffman encoding, FAX-CCITT 3 or 4). This will avoid incompatibilities in certain image processing applications.

Data intended for dissemination should be stored in one of the more common image formats to ensure compatibility with older or limited browsers. JPEG (Joint Photographic Experts Group) is suitable for photographs, realistic scenes, or other images with subtle changes in tone; however, its 'lossy' compression is likely to blur sharp lines and lettering. When modifying an image, the digitiser should return to the master TIFF image, make the appropriate changes and resave it as a JPEG.
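One way to keep JPEG copies in step with their masters is to check which derivatives are missing or older than the corresponding TIFF. The sketch below assumes a hypothetical layout in which masters are stored as .tif files in one directory and derivatives as .jpg files with the same base name in another; it only reports what needs regenerating and does not convert any images itself.

```python
from pathlib import Path


def stale_derivatives(masters_dir, derivatives_dir):
    """List master TIFFs whose JPEG derivative is missing or out of date."""
    stale = []
    for master in sorted(Path(masters_dir).glob("*.tif")):
        # Assumed convention: the derivative shares the master's base name.
        derivative = Path(derivatives_dir) / (master.stem + ".jpg")
        if (not derivative.exists()
                or derivative.stat().st_mtime < master.stat().st_mtime):
            stale.append(master.name)
    return stale
```

Each file reported by the function would then be reopened from the master and resaved as a JPEG, rather than edited from an existing JPEG copy.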

Choose an appropriate filename scheme

Digitisation projects will benefit from a consistent approach to file naming and directory structure that organises images in a manner that avoids confusion and allows them to be located quickly. An effective naming convention should identify the categories that will aid the user when finding a specific file, for example the author, the year of creation, thematic similarities, or other notable factors. The digitiser should also consider the possibility that multiple documents will have the same filename or may lack specific information, and consider methods of resolving these problems. Guidance on this issue can be found in related QA Focus documents.
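As an illustration only, a naming convention of this kind could be expressed as a small function. The author_year_sequence pattern, the "unknown" and "nd" (no date) placeholders, and the numeric suffix used to resolve clashes are all assumptions made for this sketch, not a recommended standard.

```python
import re


def slug(text):
    """Lower-case the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(author, year, sequence, taken, ext="tif"):
    """Build a filename, filling in missing details and avoiding duplicates.

    `taken` is the set of filenames already in use; the new name is added to it.
    """
    author_part = slug(author or "") or "unknown"   # missing author
    year_part = str(year) if year else "nd"         # missing year ("no date")
    base = f"{author_part}_{year_part}_{sequence:04d}"
    candidate = base
    suffix = 1
    while f"{candidate}.{ext}" in taken:            # resolve a name clash
        suffix += 1
        candidate = f"{base}-{suffix}"
    name = f"{candidate}.{ext}"
    taken.add(name)
    return name


used = set()
print(build_filename("Smith, J.", 1923, 1, used))   # smith-j_1923_0001.tif
print(build_filename("Smith, J.", 1923, 1, used))   # smith-j_1923_0001-2.tif
print(build_filename(None, None, 7, used))          # unknown_nd_0007.tif
```

Whatever pattern is chosen, recording it in the project guidelines helps every digitiser apply it in the same way.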

Further Information