Project SPECTRa: an Overview

Chemical information is essential to many sciences outside chemistry, including materials, life and environmental sciences, and supports major industries such as pharmaceuticals. The reporting of the synthesis and properties of new chemical compounds is central to this. However, it has been reported that 80% of all crystallographic data are never published, and we estimate that in organic chemistry 99% of all spectra (which are essential for the full analytical characterization and understanding of chemical structures) are lost.

SPECTRa (the Submission, Preservation and Exposure of Chemistry Teaching and Research Data) is an eighteen-month project which will develop a set of customized software tools to enable chemists to routinely deposit experimental data, much of which is currently lost, in Open Access digital repositories. The project is a partnership between the chemistry departments and university libraries at Cambridge and Imperial College, and has formal links with the eBank-UK project. Requirements in three user disciplines (X-ray crystallography, computational chemistry and synthetic organic chemistry) will be determined by interview and survey. A customized version of the DSpace digital repository will be developed. Additional tools and context-specific metadata will facilitate the subsequent re-use of the deposited information.

Progress so far has been mainly in the crystallographic area. Building on interviews with departmental crystallographers, and on work previously done by the eBank-UK project for its eCrystals repository, we have developed a set of software requirements for a toolset that enables crystallographers to deposit structures and associated metadata in an OAI-compliant repository. Work is currently on schedule.

A number of issues have arisen. On the technical side, we need to investigate further which digital packaging standard to adopt (METS or DIDL). On the policy side, we have to take account of researchers' views on Open Access; some degree of short-term embargo on deposited data may be required to encourage them to deposit.