UKOLN AHDS QA Focus Case Study Documents: Print All - Digitisation



This page is for printing out the case studies on the subject of Digitisation. Note that some of the internal links may not work.


Case Study 07

Using SVG in the Artworld Project


About The Artworld Project

Artworld [1] is a consortium project funded by JISC under the 5/99 funding round. The consortium consists of The Sainsbury Centre for Visual Arts (SCVA) at The University of East Anglia (UEA) and the Oriental Museum at the University of Durham.

The main deliverable for the project is its Web site which will include a combined catalogue of parts of the two collections and a set of teaching resources.

Object images are being captured using digital photography at both sites and some scanning at SCVA. Object data is being researched at both sites independently and is input to concurrent Microsoft Access databases. Image data is captured programmatically from within the Access database. Object and image data are exported from the two independent databases and checked and imported into a Postgres database for use within the catalogue on the Web site.

There are four teaching resources either in development or under discussion: African Art and Aesthetics, Egyptian Art and Museology, An Introduction to Chinese Art, and Japanese Art. The first three are being developed by the Department of World Art Studies and Museology at UEA, the Department of Archaeology at Durham, and East Asian Studies at the University of Durham respectively. The Japanese module is currently under negotiation. These resources are stored as simple XML files ready for publication to the Web.

The target audience in the first instance is undergraduate art history, anthropology and archaeology students. However, we have tried to ensure that the underlying material is structured in such a way that re-use at a variety of levels, from 16-plus to postgraduate, is a real possibility. We hope to implement this during the final year of the project by ensuring conformance with IMS specifications.

How Use Of SVG Came About

In the early days of the project we were trying very hard to find an IT solution that would not only fulfil the various JISC requirements but would also be relatively inexpensive. After a considerable amount of time researching various possibilities we selected Apache's Cocoon system as our Web publishing engine. To help us implement this we contracted a local internet applications provider, Luminas [2].

The Cocoon publishing framework gives us an inexpensive solution in that the software is free so we can focus our resources on development.

One area that we had inadvertently missed during early planning was how to represent copyright for the images whilst providing some level of protection. We considered using watermarking; however, this would have entailed re-processing a considerable number of images at a time when we had little resource to spare.

This issue came up in conversation with Andrew Savory of Luminas, as early notification that all of the images already transferred to the server and in use through Cocoon would need to be replaced. As we talked about the issues Andrew presented a possible solution: why not insert copyright notices into the images "on the fly"? This would be done using a technology called SVG (Scalable Vector Graphics). SVG allows the system to respond to a user request for an image by combining the image with the copyright statement referenced from the database, and to present the user with a single composite image carrying that statement.

We of course asked Luminas to proceed with this solution. The only potential stumbling block was how we represent copyright from the two institutions in a unified system. The database was based on the VADS/VRA data schema so we were already indicating the originating institution in the database. It was then a relatively simple task to include a new field containing the relevant copyright statements.

It should be noted that a composite JPEG (or PNG, GIF or PDF) image is sent to the end user - there is no requirement for the end user's browser to support the SVG format. The model for this is illustrated in Figure 1.

Figure 1: Process For Creating Dynamic Images
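
As an illustration of the approach, the following is a minimal sketch of the kind of SVG document that could be generated for each image request. It is written in Python rather than as a Cocoon pipeline, and the function name, image file name, dimensions and copyright wording are all assumptions made for the example rather than details of the Artworld implementation. A server-side rasteriser would then convert the SVG into the JPEG that is sent to the user.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Serialise with a default SVG namespace and an "xlink" prefix.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_composite_svg(image_href, width, height, copyright_text):
    """Return SVG markup that overlays a copyright notice on an image.

    Hypothetical helper: in a real system the image reference and the
    copyright statement would both come from the catalogue database.
    """
    svg = ET.Element(
        f"{{{SVG_NS}}}svg", {"width": str(width), "height": str(height)}
    )
    # The object photograph fills the whole canvas.
    ET.SubElement(svg, f"{{{SVG_NS}}}image", {
        f"{{{XLINK_NS}}}href": image_href,
        "x": "0",
        "y": "0",
        "width": str(width),
        "height": str(height),
    })
    # The copyright statement is drawn near the bottom-left corner.
    notice = ET.SubElement(svg, f"{{{SVG_NS}}}text", {
        "x": "10",
        "y": str(height - 10),
        "font-size": "14",
        "fill": "white",
    })
    notice.text = copyright_text
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(build_composite_svg("object-0001.jpg", 640, 480,
                              "Copyright Example Institution"))
```

Because the notice is added at request time, the stored images stay untouched and a change to the wording only requires a change to the database field.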

Lessons Learnt

Although in this case we ended up with an excellent solution, there are a number of lessons that can be derived from the sequence of events. Firstly, the benefits of detailed workflow planning in the digitisation process cannot be overstated. If a reasonable solution (such as watermarking) had been planned into the processes from the start then a number of additional costs would not have been incurred. These costs include project staff time spent discussing solutions to the problem and consultancy costs to implement a new solution. However, there are positive aspects of these events that should be noted. Having a contingency fund ensures that unexpected additional costs can be met. Close relations with contractors, with a free flow of information, can ensure that potential solutions are found. Following a standard data schema for database construction can help to ensure important data isn't missed. In this case it expedited the solution.

About Cocoon and SVG

Cocoon [3] is an XML publishing framework that allows logic to be included in XML files. It is provided through the Apache Software Foundation.

SVG (Scalable Vector Graphics) [4] [5] is a non-proprietary language for describing two-dimensional graphics in XML. It allows for three types of objects: vector graphic shapes, images and text. Features and functions include: grouping, styling, combining, transformations, nested transformations, clipping paths, templates, filter effects and alpha masks.

References

  1. Artworld, University of East Anglia
    <http://artworld.scva.uea.ac.uk/>
  2. Luminas,
    <http://www.luminas.co.uk/>
  3. Apache Cocoon, Apache,
    <http://xml.apache.org/cocoon/>
  4. Introduction to SVG, xml.com,
    <http://www.xml.com/pub/a/2001/03/21/svg.html>
  5. SVG Specification, W3C,
    <http://www.w3.org/TR/2000/CR-SVG-20001102/>

Contact Details

Paul Child
ARTWORLD Project Manager
Sainsbury Centre for Visual Arts
University of East Anglia
Norwich NR4 7TJ

Tel: 01603 456 161
Fax: 01603 259 401
Email: p.child AT uea.ac.uk

QA Focus Comments

Citation Details

Using SVG in the Artworld Project, Child, P., QA Focus case study 07, UKOLN,
<http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-07/>

The document was published in January 2003.


Case Study 08

Crafts Study Centre Digitisation Project - and Why 'Born Digital'


Background

The Crafts Study Centre (CSC) [1], established in 1970, has an international standing as a unique collection and archive of twentieth-century British crafts. Included in its collection are textiles, ceramics, calligraphy and wood. Makers represented in the collection include the leading figures of twentieth-century crafts such as Bernard Leach, Lucie Rie and Hans Coper in ceramics; Ethel Mairet, Phyllis Barron and Dorothy Larcher, Edward Johnston, Irene Wellington, Ernest Gimson and Sidney Barnsley. The objects in the collection are supported by a large archive that includes makers' diaries, documents, photographs and craftspeople's working notes.

The Crafts Study Centre Digitisation Project

The Crafts Study Centre Digitisation Project [2] has been funded by the JISC to digitise 4,000 images of the collection and archive and to produce six learning and teaching modules. Although the resource has been funded to deliver to the higher education community, the project will reach a wide audience and will be of value to researchers, enthusiasts, schools and the wider museum-visiting public. The Digitisation Project has coincided with an important moment in the CSC's future. In 2000 it moved from the Holburne Museum, Bath, to the Surrey Institute of Art & Design, University College, Farnham, where a purpose-built museum with exhibition areas and full study facilities is scheduled to open in spring 2004.

The decision to create 'born digital' data was therefore crucial not only to the success of the project, but also to the reusability of the resource. The high-resolution images that have resulted from 'born digital' capture will have a multiplicity of uses. Not only will users of the resource on the Internet be able to obtain a sense of the scope of the CSC collection and get in-depth knowledge from the six learning and teaching modules that are being authored, but the relatively large file sizes have produced TIFF files that can be used and consulted off-line for other purposes.

These TIFF files contain amazing details of some of the objects photographed from the collection, and it will be possible for researchers and students to use this resource to obtain new insights into, for example, the techniques used by makers. These TIFF files will be available on site for consultation when the new CSC opens in 2004. In addition to this, the high-quality print output of these images means that they can be used in printed and published material to disseminate the project and to contribute to building the CSC's profile via exhibition catalogues, books and related material.

Challenges

The project team were faced with a range of challenges from the outset. Many of these were based on the issues common to other digital projects, such as the development of a database to hold the associated records that would be interoperable with the server, in our case the Visual Arts Data Service (VADS), and the need to adopt appropriate metadata standards. Visual Resources Association (VRA) version 3.0 descriptions were used for the image fields. Less straightforward was the deployment of metadata for record descriptions. We aimed for best practice by merging Dublin Core metadata standards with those of the Museum Documentation Association (mda). The end product is a series of data fields that serve firstly to make the database compatible with the VADS mapping schema, and secondly to realise the full potential of the resource as a source of information. A materials and technique field, for example, has been included to allow for the input of data about how a maker produced a piece. Users of the resource, especially students and researchers in the history of art and design, will be able to appreciate how an object in the collection was made. In some records, for example, whole 'recipes' have been included to demonstrate how a pot or textile was produced.

Other issues covered the building of terminology controls, so essential for searching databases and for achieving consistency. We consulted the Getty Art and Architecture Thesaurus (AAT) and other thesauri such as the MDA's wordhord, which acts as a portal to thesauri developed by other museums or museum working groups. This was sometimes to no avail because often a word simply did not exist and we had to rely on terminology developed in-house by curators cataloguing the CSC collection, and have the confidence to go with decisions made on this basis. Moreover, the attempt to standardise this kind of specialist collection can sometimes compromise the richness of vocabulary used to describe it.

Lessons Learnt

Other lessons learnt have included the need to establish written image file naming conventions. Ideally, all image file names should tie in with the object and the associated record. This system works well until sub-numbering systems are encountered. Problems arise because different curators, when cataloguing different areas of the collection, have used different systems, such as letters of the alphabet, decimal numbers and Roman numerals. This means that if the file name is to match the number marked on the object, then it becomes impossible to achieve a standardised approach. The lesson learnt here was that we did not establish a written convention early enough in the project, with the result that agreements on how certain types of image file names should be written before being copied onto CD were forgotten and more than one system was used.
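
One way to make a written convention enforceable is to express it as a small function that every file name passes through. The sketch below is illustrative only: the accession number format, the underscore separator, the three-digit padding and the function names are assumptions, not the convention the CSC project adopted. Note that a sub-number such as "c" or "v" is ambiguous between a letter and a Roman numeral, so the caller has to state which system the curator used, which is exactly the kind of decision a written convention needs to record.

```python
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}


def roman_to_int(text):
    """Convert a Roman numeral such as 'iv' or 'XII' to an integer."""
    total = 0
    previous = 0
    for char in reversed(text.lower()):
        if char not in ROMAN_VALUES:
            raise ValueError(f"not a Roman numeral: {text!r}")
        value = ROMAN_VALUES[char]
        if value < previous:
            total -= value  # subtractive notation, e.g. the 'i' in 'iv'
        else:
            total += value
            previous = value
    return total


def normalise_sub_number(sub, system):
    """Map a curator's sub-number onto a plain integer."""
    if system == "decimal":
        return int(sub)
    if system == "letter":
        return ord(sub.lower()) - ord("a") + 1
    if system == "roman":
        return roman_to_int(sub)
    raise ValueError(f"unknown sub-numbering system: {system!r}")


def image_file_name(accession, sub=None, system="decimal", extension="tif"):
    """Build a file name from an accession number and optional sub-number."""
    # Replace punctuation that is awkward in file names with hyphens.
    safe_accession = re.sub(r"[^A-Za-z0-9]+", "-", accession).strip("-")
    if sub is None:
        return f"{safe_accession}.{extension}"
    number = normalise_sub_number(sub, system)
    return f"{safe_accession}_{number:03d}.{extension}"
```

With this sketch, a third component recorded as "c", "iii" or "3" on the object all map to the same file name, while the number marked on the object can still be kept unchanged in the database record.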

The value of documenting all the processes of the project cannot be overemphasised. This is especially true of records kept relating to items selected for digitisation. A running list has been kept detailing the storage location, accession number, description of the item, when it was photographed and when returned to storage. This has provided an audit trail for every item digitised. A similar method has been adopted with the creation of the learning and teaching modules, and this has enhanced the process of working with authors commissioned to write the modules.

Lastly, but just as importantly, QA forms have been created on the database, based on suggestions presented by the Technical Advisory Service for Images (TASI) at the JISC Evaluation workshop in April 2002. This has established a framework for checking the quality and accuracy of an image and its associated metadata, from the moment that an object is selected for digitisation through to the finished product. Divided into two sections, dealing respectively with image and record metadata, this has been developed into an editing tool by the project's documentation officer. The QA forms allow most of the data fields to be checked off by two people before the image and record are signed off. There are comment boxes for any other details, such as faults relating to the image. A post-project fault report/action taken box has been included to allow for the reporting of faults once the project has gone live, and to allow for any item to re-enter the system.
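
The two-person check can be sketched as a small data structure. This is a hypothetical model rather than the project's database form: the class name, field names and the rule that an open fault blocks sign-off are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QAForm:
    """Hypothetical QA form for one image and its record."""

    record_id: str
    required_fields: tuple
    # Maps a field name to the set of people who have checked it.
    checks: dict = field(default_factory=dict)
    faults: list = field(default_factory=list)

    def check_off(self, field_name, checker):
        """Record that `checker` has verified `field_name`."""
        if field_name not in self.required_fields:
            raise KeyError(f"unknown field: {field_name!r}")
        self.checks.setdefault(field_name, set()).add(checker)

    def report_fault(self, note):
        """Add a free-text fault, as in the form's comment box."""
        self.faults.append(note)

    def outstanding(self):
        """Return the fields not yet checked by two different people."""
        return [
            name for name in self.required_fields
            if len(self.checks.get(name, set())) < 2
        ]

    def signed_off(self):
        """True when every field has two checkers and no fault is open.

        Treating an open fault as blocking is an assumption of this sketch.
        """
        return not self.outstanding() and not self.faults
```

Storing the checkers as a set means that the same person checking a field twice does not count as a second check.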

The bank of images created by the Digitisation Project will be of enormous importance to the CSC, not only in terms of widening access to the CSC collection, but in helping to forge its identity when it opens its doors as a new museum in 2004 at the Surrey Institute of Art & Design, University College.

References

  1. The Crafts Study Centre,
    <http://www.craftscentre.surrart.ac.uk/>
  2. The Crafts Study Centre Digitisation Project, JISC,
    <http://www.jisc.ac.uk/index.cfm?name=project_crafts>

Contact Details

Jean Vacher
Digitisation Project Officer
Crafts Study Centre Surrey Institute of Art & Design, University College

QA Focus Comments

Citation Details

Crafts Study Centre Digitisation Project - and Why 'Born Digital', Vacher, J., QA Focus case study 08, UKOLN,
<http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-08/>

Changes

15 Oct 2003
The URL for The Crafts Study Centre Web site was changed from http://www.surrart.ac.uk/whatshappening/galleries/craftstudy.html to http://www.craftscentre.surrart.ac.uk/
The URL for information about The Crafts Study Centre project on the JISC Web site was changed from http://www.jisc.ac.uk/dner/development/projects/crafts/ to http://www.jisc.ac.uk/index.cfm?name=project_crafts

Case Study 09

Image Digitisation Strategy and Technique: Crafts Study Centre Digitisation Project


Background

Information about the Crafts Study Centre (CSC) [1] and the Crafts Study Centre Digitisation Project [2] is given in another case study [3].

Problem Being Addressed

At the outset of the Crafts Study Centre (CSC) Digitisation Project, extensive research was undertaken by the project photographer to determine the most appropriate method of image capture. Taking into account the requirements of the project as regards production costs, image quality and image usage, the merits of employing either traditional image capture or digital image capture were carefully considered.

The Approach Taken

The clear conclusion to this research was that digital image capture creating born digital image data via digital camera provided the best solution to meet the project objectives. The main reasons for reaching this conclusion are shown below:

  1. All image capture and post-production is carried out in-house allowing precise control over quality and output.
  2. Fast, safe and efficient delivery of captured digital image data for transfer to CSC database and subsequent Web delivery.
  3. Photographer maintains full control over the final appearance of captured images particularly in respect of colour balance, tone, exposure and cropping thus helping to maintain high degrees of quality control.
  4. Born digital image capture provides cost efficiency as compared to traditional image capture and associated processing and scanning costs.

Items from the CSC collection are identified by members of the project team and passed to the photographer for digitisation. Once the item has been placed in position and the appropriate lighting arranged, it is photographed using a large format monorail camera (Cambo) hosting a Betterlight digital scanning back capable of producing image file sizes of up to 137 megabytes without interpolation.

Initially a prescan is made for appropriate evaluation by the photographer. Any necessary adjustments to exposure, tone, colour, etc. are then made via the camera software and then a full scan is carried out, with the resulting digital image data being automatically transferred to the photographer's image editing program, in this case Photoshop 6.

Final adjustments can then be made, if required, and the digital image is then saved and written to CD-R for onward delivery to the project database.

The main challenges in setting up this system were mostly related to issues regarding colour management, appropriate image file sizes, and standardisation wherever possible.

To this end a period of trialling was conducted by the photographer at the start of the image digitisation process using a cross section of subject matter from the CSC collection.

Identifying appropriate file sizes for use within the project, and the areas of the digital imaging process to which a level of standardisation could be applied, was fairly straightforward. Colour management issues proved slightly more problematic, but were duly resolved by careful cross-platform (Macintosh/MS Windows) adjustments and standardisation within the CSC and the use of external colour management devices.

References

  1. The Crafts Study Centre,
    <http://www.craftscentre.surrart.ac.uk/>
  2. The Crafts Study Centre Digitisation Project, JISC
    <http://www.jisc.ac.uk/index.cfm?name=project_crafts>
  3. Crafts Study Centre Digitisation Project - and Why 'Born Digital', QA Focus
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-08/>

Contact Details

David Westwood
Project Photographer
Digitisation Project Officer
Crafts Study Centre
Surrey Institute of Art & Design, University College

QA Focus Comments

Citation Details

Image Digitisation Strategy and Technique: Crafts Study Centre Digitisation Project, Westwood, D., QA Focus case study 09, UKOLN,
<http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-09/>

The document was published in January 2003.

Changes

15 Oct 2003
The URL for The Crafts Study Centre Web site was changed from http://www.surrart.ac.uk/whatshappening/galleries/craftstudy.html to http://www.craftscentre.surrart.ac.uk/
The URL for information about The Crafts Study Centre project on the JISC Web site was changed from http://www.jisc.ac.uk/dner/development/projects/crafts/ to http://www.jisc.ac.uk/index.cfm?name=project_crafts

Case Study 19

Digitisation of Wills and Testaments by the Scottish Archive Network (SCAN)


Background

Scottish Archive Network (SCAN) is a Heritage Lottery Funded project. The main aim of the project is to open up access to Scottish Archives using new technology. There are three strands to the project:

  1. Creating collection level descriptions of records held by 52 archives throughout Scotland [1].
  2. Providing all the resources that a researcher will need when accessing archive resources over the Internet [2].
  3. Digitisation of the Wills and Testaments registered in Scotland from 1500s to 1901 [3].

The digitisation of the Wills and Testaments is the focus of this case study.

Problem Being Addressed

The digitisation of the testaments is an ambitious undertaking. The main issues to be considered are:

The Approach Taken

Document Preparation

As digital objects, images of manuscript pages lack the obvious information given by a physical page bound in a volume. It is important for completeness and for sequence that the pages themselves are accurately paginated. This gives a visual indication of the page number on the image as well as being incorporated into the naming convention used to identify the file. As a result quality is improved by reducing the number of pages missed in the digitisation process and by ensuring that entire volumes are captured and in the correct sequence.
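
Because the page number is part of the file name, completeness and sequence can be checked automatically once a volume has been captured. The following is a minimal sketch under assumed conventions: the file name pattern (volume reference, underscore, four-digit page number, ".tif") and the function name are hypothetical and are not SCAN's actual naming scheme.

```python
import re

# Assumed pattern: <volume>_<4-digit page>.tif, e.g. "VOL-A_0001.tif".
PAGE_PATTERN = re.compile(r"^(?P<volume>.+)_(?P<page>\d{4})\.tif$")


def check_page_sequence(file_names, expected_last_page):
    """Return (missing, duplicated) page numbers for one volume."""
    seen = {}
    for name in file_names:
        match = PAGE_PATTERN.match(name)
        if match is None:
            raise ValueError(f"file name does not follow convention: {name!r}")
        page = int(match.group("page"))
        seen[page] = seen.get(page, 0) + 1
    # Pages that should exist but were never captured.
    missing = [p for p in range(1, expected_last_page + 1) if p not in seen]
    # Pages captured more than once under the same number.
    duplicated = sorted(p for p, count in seen.items() if count > 1)
    return missing, duplicated
```

A check of this kind only works if the pagination on the physical volume is accurate, which is why the preparation step matters.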

Image Capture

The image capture program (dCam) automated the file naming process thereby reducing operator error and automatically capturing metadata for each image. This included date, time, operator id, file name, camera id and so on which helped in identifying whether later problems related to operator training or to a specific workstation. The program also included simple options for retakes.

Post Image Capture

We have instituted a secondary quality assurance routine. This involves an operator (different to the one who captured the images) examining a selection of the images for any errors missed by the image capture operator. Initially, 100% of the images were checked, but a 30% check was soon found to be satisfactory. The quality control is carried out within 24 hours of a volume being digitised, which means that the volume is still available in the camera room should any retakes be necessary. The QA operators have a list of key criteria to assess the image - completeness, colour, consistency, clarity and correctness. When an operator finds a defective image they reject it and select the reason from a standardised list. Although the images are chosen at random, whenever an error is found the QA program will present the next sequential image, as errors are more likely to be clustered together. A report is produced by the QA program which is then used to select any retakes. The reports are also analysed for any recurring problems that may be corrected at the time of capture.

Further QA criteria: the quality of the cameras had been specified in terms of capacity (i.e. number of pixels), and we found that it is also possible to specify the quality of the CCD in terms of an acceptable level of defective pixels. This, however, does have a bearing on cost.
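
The sampling routine described above (a random selection, followed by the next sequential image whenever a defect is found) can be sketched as follows. This is an illustration, not the SCAN QA program: the function name, the `is_defective` callback that stands in for the operator's judgement, and the rule of continuing forward until a good image is reached are assumptions.

```python
import random


def run_qa(image_count, is_defective, sample_rate=0.3, seed=None):
    """Return (checked, rejected) image indices for one volume.

    `is_defective` is a hypothetical callback standing in for the
    operator's decision on a single image.
    """
    rng = random.Random(seed)
    sample_size = min(image_count, max(1, round(image_count * sample_rate)))
    sample = rng.sample(range(image_count), sample_size)

    checked, rejected = [], []
    already_checked = set()
    for index in sorted(sample):
        current = index
        # Errors tend to cluster, so after a reject keep presenting the
        # next sequential image until a good one (or the end) is reached.
        while current < image_count and current not in already_checked:
            already_checked.add(current)
            checked.append(current)
            if not is_defective(current):
                break
            rejected.append(current)
            current += 1
    return checked, rejected
```

The list of rejected indices corresponds to the report used to select retakes. Setting `sample_rate=1.0` reproduces the initial 100% check.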

Problems Experienced

Preparation

This was a time consuming process, which was slower than capture itself. It was important to build up sufficient material in advance of the digitisation getting underway.

Capture

We chose to capture colour images. The technique used was to take three separate colour images through red, green and blue filters and then to combine them into a single colour image. This worked well and produced very high quality colour images. However, it was very difficult to spot where there had been slight movement between the three colour shots. At a high level of magnification this produced a mis-registration between the three colour planes. The QA process sometimes caught this, but it was far more costly for this to be trapped later on. We discovered that where there had been slight movement, the number of distinct colours in an image was almost double the average. We used this information to provide a report to the QA operators highlighting where potential colour shift had taken place. In addition, the use of book cradles helped reduce this problem as well as enabling a focused image to be produced consistently.
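
The distinct-colour check can be sketched in a few lines. This is an illustrative reconstruction rather than the program SCAN used: pixels are represented as plain (red, green, blue) tuples to keep the example free of imaging libraries, and the threshold factor of 1.5 times the batch average is an assumed setting chosen to catch counts that are "almost double".

```python
from statistics import mean


def distinct_colour_count(pixels):
    """Count the distinct (r, g, b) values in an iterable of pixels."""
    return len(set(pixels))


def flag_possible_colour_shift(colour_counts, factor=1.5):
    """Return the names of images whose colour count is unusually high.

    `colour_counts` maps an image name to its distinct colour count.
    An image is flagged when its count is at least `factor` times the
    average for the batch (the factor is an assumption of this sketch).
    """
    if not colour_counts:
        return []
    average = mean(colour_counts.values())
    return sorted(
        name for name, count in colour_counts.items()
        if count >= factor * average
    )
```

Flagged images are candidates for closer inspection rather than automatic rejects, since a page with unusually varied content could also produce a high count.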

Things We Would Do Differently

The project has achieved successful completion within budget. For the digital capture program it proved possible to capture an additional 1 million pages as the capture and quality control workflow worked well. It is clear that the process is well suited to high throughput capture of bound manuscript material. Loose-leaf material took far more conservation effort and a much longer time to capture.

References

  1. Online Catalogues Microsite, Scottish Archive Network,
    <http://www.scan.org.uk/aboutus/indexonline.htm>
  2. Scottish Archive Network,
    <http://www.scan.org.uk/>
  3. Scottish Documents, Scottish Archive Network,
    <http://www.scottishdocuments.com/content/>

Contact Details

Rob Mildren
Room 2/1
Thomas Thomson House
99 Bankhead Crossway N
Edinburgh
EH11 4DX

for SCAN Business:
Tel: 0131-242-5802
Fax: 0131-242-5801
Email: rob.mildren@scan.org.uk
URL: http://www.scan.org.uk/

for NAS Business:
Tel: 0131-270-3310
Fax: 0131-270-3317
Email: rob.mildren@nas.gov.uk
URL: http://www.nas.gov.uk/

QA Focus Comments

This case study describes a project funded by the Heritage Lottery Fund. Although the project has not been funded by the JISC, the approaches described in the case study may be of interest to JISC projects.