UKOLN
Raising Awareness

"A centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities."

UKOLN is based at the University of Bath.
Improving Access to Text

The IMPACT Project

IMPACT is a European project that aims to speed up the process and enhance the quality of mass digitisation in Europe. The IMPACT research programme will significantly improve digital access to historical printed text through the development and use of innovative Optical Character Recognition (OCR) software and linguistic technologies.

IMPACT will also build capacity in mass digitisation across Europe. The twenty-six partners (eleven libraries, thirteen research institutes or universities, and two private sector companies) collectively constitute a Centre of Competence that will share best practice and expertise with the cultural heritage communities in Europe.

Project news

Authoritative and up-to-date IMPACT news is available from:

Project background

[Figure: IMPACT project architecture diagram]

Improving Access to Text (IMPACT) is a EUR 11.5 million research project funded by the European Union as part of the Seventh Framework Programme (FP7), and led by the National Library of the Netherlands (Koninklijke Bibliotheek). This four-year "Large-scale Integrating Project" began with 15 partners, including national and research libraries, universities and industrial partners.

The project started work in January 2008 and its sub-projects are focused on three main activity areas:

  • Text recognition - research into improving the digitisation and OCR workflow, including: the enhancement of images to maximise document segmentation and OCR; improved segmentation methodologies; adaptive OCR; the integration of language models into OCR; and exploring novel techniques for OCR processing
  • Enhancement and enrichment - research into improvements that can be made post-OCR, including: tools to support the collaborative correction of OCR output; the definition and development of historical lexica for English, Dutch and German; and the enhancement of the XML output files that combine OCR results with technical and layout metadata
  • Operational context/Capacity building - exploring the integration of IMPACT tools into their wider digitisation context, including: a technical framework for integrating all IMPACT tools and enabling their take-up by second phase partners; an evaluation framework for IMPACT tools; a requirements forum; and the development of a range of externally-facing resources, such as documentation on digitisation workflows and IMPACT tools, a helpdesk, and a series of training events

UKOLN's main role in the project is to work on the externally-facing parts of IMPACT, primarily by helping to produce and disseminate documentation (best practice guides, briefings, case studies, etc.) on text digitisation frameworks and IMPACT tools, and by contributing to a series of training events.

Project partners

There were originally fifteen full partners in IMPACT: seven libraries, six research institutes and two private sector companies:

Library partners

University and research partners

Since 2010, a further 11 partners have joined the project, primarily to help test the IMPACT tools and their extensibility to new language groups. They are a mixture of libraries and research groups based in Bulgaria, the Czech Republic, France, Poland, Slovenia and Spain.

UKOLN's Contribution to the Project

UKOLN is the leader of two work packages in IMPACT:

  • CB1: Learning resource toolbox
  • CB4: Training

UKOLN Publications and Presentations

IMPACT Metadata Best Practice Guide [paper]
Michael Day
October 2010
IMPACT Metadata Best Practice Guide (October 2010) - please send comments to the IMPACT LinkedIn Group: http://www.linkedin.com/groups?mostPopular=&gid=130648
The Improving Access to Text (IMPACT) project and other European initiatives [presentation]
Michael Day
September 2009
JISC Workshop: OCR for the Mass Digitisation of Textual Materials, University of Bath, 24 September 2009
IMPACT Conference: Optical Character Recognition in Mass Digitisation [conference report]
Lieke Ploeger, Yola Park, Jeanna Nikolov-Ramirez Gaviria, Clemens Neudecker, Fedor Bochow and Michael Day
April 2009
Ariadne 59

UKOLN staff working on the IMPACT Project

Ed Bremner
Research Officer
E-mail: e.bremner@ukoln.ac.uk

Marieke Guy
Research Officer (part time)
E-mail: m.guy@ukoln.ac.uk

Michael Day
R&D Team Leader
E-mail: m.day@ukoln.ac.uk


Seventh Framework Programme

IMPACT is funded as part of the European Union's Seventh Framework Programme.

It is managed by the Cultural Heritage and Technology Enhanced Learning unit of the European Commission's Information Society and Media Directorate General (DG INFSO).