Setting up an institutional ePrints archive - what is involved?

Chris Rusbridge and William J Nixon
Information Services, University of Glasgow

UKOLN Meeting:
Developing an agenda for institutional e-print archives, Wednesday 11th July, 2001

Abstract

This brief conference paper describes the set-up and implementation of a trial eprints service at the University of Glasgow. It covers the goals of the service, a range of issues, the installation and configuration of the software and feedback from users.

Introduction

The University of Glasgow has a trial ePrints service, ePrints @ Glasgow, at http://eprints.lib.gla.ac.uk:333/. The service was set up as a pilot in April 2001 and holds some 15 papers.

Goals at Glasgow

The overarching aim of the development of ePrints at the University of Glasgow is to provide an effective [further] means of ensuring the disclosure of, and access to, the scholarly work and research of this institution. The term "scholarly work" is intended to be inclusive and covers not only peer-reviewed journal articles but also books, theses, chapters, conference papers and grey literature such as meeting reports.

Our core range of goals include:

Another initial goal was to have an ePrints service available into which staff could deposit papers for the current round of RAE submissions. However, we were unable to deliver an operational service within the timescale available.

Secondary goals

In addition to these goals, a secondary range of goals has been identified which will underpin the operational service. Our eprints service should be compatible with the world-wide OAI infrastructure and be harvested by service providers. It should be compatible with Glasgow's "intranet" service and ensure seamless access; at the moment eprints requires users to have yet another username. The material should also, through the choice of document formats, be hospitable to both display and future preservation activity.
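As an illustration of what being harvested involves, a service provider sends HTTP requests to an archive's OAI interface and reads back metadata records. The following Python sketch only builds such request URLs and does not contact any server. The base URL is hypothetical [the address of our OAI interface is not given in this paper], and the verb and metadata prefix names are taken from the OAI protocol specification, not from our configuration.

```python
from urllib.parse import urlencode

# Hypothetical OAI endpoint, for illustration only.
BASE_URL = "http://eprints.example.ac.uk/perl/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return the URL for an OAI request with the given verb and arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask the archive to describe itself.
identify_url = build_oai_request(BASE_URL, "Identify")

# Ask for all records in unqualified Dublin Core.
records_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(records_url)
```

A harvester would fetch these URLs and parse the XML responses; that step is left out here.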

However, it is important to stress that a non-goal, at least in this initial phase, is solving the long-term digital preservation issues. Glasgow is a strong proponent of the need for digital preservation, but this should not impede the development and roll-out of an ePrints repository.

Getting it used

A range of questions still need to be answered before we can launch a live and operational ePrints service. These include:

Issues

The trial eprints service which we have established at http://eprints.lib.gla.ac.uk:333/ has raised a range of issues about running the service which must be addressed. We have begun to deal with some issues, such as document formats, in this initial trial; more complex issues such as copyright, digital preservation and the long-term administration of the service will be dealt with as the service develops.

Document formats

The ePrints software can support a wide range of document formats and can be customised to include new formats.

Some document formats, such as Microsoft Word, are unsuitable for long-term use. To deal with this we have added Rich Text Format, so that submitters can easily save their document from Word as RTF. The ideal would be to have users submit documents in HTML [or better yet XML]. However, the HTML should be checked at the submission stage: software such as Microsoft Word which can save to HTML includes many proprietary tags and creates code which can be browser or platform specific. eprints also includes Postscript, which we felt was a useful format to retain.
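A check of submitted HTML could be sketched along the following lines. This is a minimal heuristic in Python, not part of the eprints software: it assumes that Word-generated HTML can be recognised by namespaced tags such as <o:p>, by class names beginning with "Mso" and by style properties beginning with "mso-". The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ProprietaryMarkupFinder(HTMLParser):
    """Collect tags and attributes that look specific to a word processor."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # Namespaced tags such as <o:p> are not part of standard HTML.
        if ":" in tag:
            self.findings.append(f"namespaced tag <{tag}>")
        for name, value in attrs:
            value = value or ""
            if name == "class" and value.startswith("Mso"):
                self.findings.append(f'class "{value}" on <{tag}>')
            elif name == "style" and "mso-" in value.lower():
                self.findings.append(f"mso- style property on <{tag}>")


def find_proprietary_markup(html_text):
    """Return a list of descriptions of suspect markup (empty if none found)."""
    finder = ProprietaryMarkupFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


sample = '<p class="MsoNormal" style="mso-pagination:none">Text<o:p></o:p></p>'
for finding in find_proprietary_markup(sample):
    print(finding)
```

A submission with a non-empty list of findings could be returned to the depositor, or flagged for the Administrator to review.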

Adobe's Portable Document Format [PDF], although also proprietary, is now well established and we have included it in our list of accepted formats.
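The format policy described in this section can be pictured as a simple lookup at upload time. The Python sketch below is illustrative only and does not reproduce the eprints configuration files: the file extensions are assumptions, as is the choice to turn Word files away with advice instead of accepting them.

```python
import os

# Formats mentioned in this section; the extensions are assumptions.
ACCEPTED_FORMATS = {
    ".rtf": "Rich Text Format",
    ".html": "HTML",
    ".htm": "HTML",
    ".ps": "Postscript",
    ".pdf": "Portable Document Format",
}

# Assumption: Word files are refused with advice on what to do instead.
ADVICE = {".doc": "Please save the document from Word as RTF and resubmit."}


def check_upload(filename):
    """Return (accepted, message) for an uploaded file name."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in ACCEPTED_FORMATS:
        return True, ACCEPTED_FORMATS[extension]
    return False, ADVICE.get(extension, "Format not in the accepted list.")


print(check_upload("paper.pdf"))
print(check_upload("paper.doc"))
```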

Desirable features

The trial service has enabled us to identify a range of additional features which would be desirable in a future release of the software. 

Authentication

Currently, the ePrints service cannot take advantage of existing University of Glasgow authentication services such as the one provided by our Novell Directory Services. To use ePrints @ Glasgow, users need to be allocated a separate username and password. Ideally we would like to integrate ePrints into the authentication system for our Intranet.

The ePrints software generates an alphanumeric password which is then sent to users. We have since realised that users cannot change their own password, perhaps to keep it consistent with another one which they use; this can only be done by the Administrator. It would be useful for users to be able to change their password when they access their account details.
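The behaviour we would like can be sketched as follows. This Python example is an illustration under stated assumptions and does not describe how the eprints software stores or checks passwords: the function names, the password length of eight characters and the use of salted PBKDF2 hashes held in a dictionary are all hypothetical choices.

```python
import hashlib
import hmac
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length=8):
    """Return a random alphanumeric password (length is an assumption)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def _digest(password, salt):
    # Store a salted hash, never the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def create_account(accounts, username):
    """Create an account and return the generated password to send to the user."""
    password = generate_password()
    salt = secrets.token_bytes(16)
    accounts[username] = (salt, _digest(password, salt))
    return password


def change_password(accounts, username, old_password, new_password):
    """Let a user replace their own password after confirming the old one."""
    if username not in accounts:
        return False
    salt, stored = accounts[username]
    if not hmac.compare_digest(stored, _digest(old_password, salt)):
        return False
    new_salt = secrets.token_bytes(16)
    accounts[username] = (new_salt, _digest(new_password, new_salt))
    return True


accounts = {}
issued = create_account(accounts, "trial_user")
print(change_password(accounts, "trial_user", issued, "a-password-i-remember"))
```

The point of the design is that the change is made by the user, who proves knowledge of the old password, so the Administrator does not need to be involved.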

User file over-write option

In the early pilot stages, as we uploaded new documents onto the eprints server, we found that if we made a mistake in the document or in the number of files uploaded it was not possible simply to overwrite the file with a new copy. In one instance we uploaded a file with a link to a local stylesheet which was not available, so the paper would not display in Netscape. It was not possible for the user to upload a corrected version with the full link to the stylesheet. A new "version" of the paper needed to be uploaded and the previous one deleted by the Administrator.

Audit trail

Some of our trial users felt that, for exercises such as the RAE, it would be useful to have an audit trail of their document's history available. This would include when the document was written, the date submitted and the date approved. ePrints currently displays the date when material has been accepted into the repository.
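One way to picture such an audit trail is as a dated list of events attached to each document. The Python sketch below is hypothetical: the class name, the event labels and the dates are invented for illustration and do not correspond to anything in the eprints software.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class AuditTrail:
    """Dated history of a single deposited document."""

    title: str
    events: List[Tuple[date, str]] = field(default_factory=list)

    def record(self, when, event):
        """Add an event such as 'written', 'submitted' or 'approved'."""
        self.events.append((when, event))

    def history(self):
        """Return the events as text lines, oldest first."""
        return [f"{when.isoformat()}  {event}" for when, event in sorted(self.events)]


# Hypothetical dates, for illustration only.
trail = AuditTrail("An example conference paper")
trail.record(date(2001, 5, 14), "submitted")
trail.record(date(2001, 3, 2), "written")
trail.record(date(2001, 5, 21), "approved")

for line in trail.history():
    print(line)
```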

Additional fields for display

With the range of material which will be deposited in the Glasgow ePrints service, there are a number of fields which we will look at adding to the information screens for documents. All users, when submitting their document, must allocate a specific document type [e.g. conference paper, journal article etc] and status [e.g. Published, In Press etc] - these can be searched on when the document has been accepted. We would like to add Document Type and Status to the information display in addition to Title, Author, Keywords and Abstract. Discussions with eprints indicate that this is possible, and we will incorporate these fields into our next release.

Wider Issues

A range of wider issues will also need to be addressed before the service becomes fully operational. These include:

The wide range of technical, cultural and legal issues identified by our initial pilot service reinforces the need for extensive piloting of an ePrints service.

Software

Information Services at the University of Glasgow installed the beta version of the eprints open archive software [http://eprints.org] in January 2001 as an initial trial to gain experience of the software. In April we installed release 1.0 of the software [to date we have not yet moved to 1.1.1]. 

The ePrints project is led by the Library but the software was installed by our Computing Services department onto one of their main Unix servers. Running an ePrints service requires skills in three distinct areas:

Experience with Unix, the Apache web server, MySQL and the Perl programming language is required to install the ePrints software effectively and get it up and running.

Once installed, Library staff had to decide on the subjects to be included, any additional document types required and the overall "look and feel" of the service. The configuration was initially somewhat "trial and error" as staff spent time on familiarising themselves with the relevant set-up files and making decisions about the organization of the subject tree. 

At the moment, the range of subjects reflects the University's range of faculties and departments. The trial release has not [yet] adopted a standard classification scheme such as Dewey or Library of Congress. 

The ePrints software has a submission buffer, in which all content must sit before it becomes publicly available. This allows the Administrator to review the material, the file types and the details which the user has entered before it is accepted. The Administrator is also responsible for maintaining the ePrints database of users, and can change users' passwords, delete documents and, as appropriate, create new subject headings.
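The workflow around the submission buffer can be modelled very simply. The following Python sketch is a toy model for illustration and is not the eprints implementation: the class, its method names and the example paper are hypothetical.

```python
class Repository:
    """Toy model of a submission buffer in front of a public archive."""

    def __init__(self):
        self.buffer = {}    # submissions awaiting review by the Administrator
        self.archive = {}   # records that are publicly available
        self._next_id = 1

    def submit(self, title, filenames):
        """Place a new submission in the buffer and return its identifier."""
        eprint_id = self._next_id
        self._next_id += 1
        self.buffer[eprint_id] = {"title": title, "files": list(filenames)}
        return eprint_id

    def approve(self, eprint_id):
        """Administrator action: move a submission into the public archive."""
        self.archive[eprint_id] = self.buffer.pop(eprint_id)

    def reject(self, eprint_id):
        """Administrator action: remove a submission from the buffer."""
        return self.buffer.pop(eprint_id)


repository = Repository()
paper_id = repository.submit("An example paper", ["paper.pdf"])
print(paper_id in repository.archive)   # not yet public
repository.approve(paper_id)
print(paper_id in repository.archive)   # public after approval
```

Nothing reaches the public archive except through approve, which mirrors the review step described above.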

User Feedback

Feedback from staff who have deposited papers has been positive, and they have found the online help to be good. The papers and documents which they have submitted have provided us with a range of additional document types, such as lecture papers and project reports, which are not part of the core collection of eprints document types. We have also received a couple of papers in French and German and will be adding a Language field in our next release.

For the purposes of the RAE, staff also identified fields such as ISBN / ISSN which it would be helpful to have available.

e-Theses

In conjunction with the development of an ePrints service, Information Services at Glasgow is also considering the possibility of starting a voluntary e-theses deposit service.

The Networked Digital Library of Theses and Dissertations [http://www.ndltd.org], under the auspices of Virginia Tech in the United States, is the acknowledged leader in the development of e-theses. At the moment there is only one UK contributor: City University.

An e-theses service may use the same software as our eprints service; however, the two would remain separate, but cross-searchable, repositories.

Conclusions

Developing and implementing a trial eprints service has been harder than we thought, but it is worthwhile persevering with.


Posted: 08 August 2001