Project cluster summaries

From DigiRepWiki

For the Second Digital Repositories Programme meeting, 27th-28th March 2006, projects within the Integrating infrastructure cluster were asked to supply summaries of their projects, with questions and challenges for discussion in the cluster session. In addition, some projects from other clusters have supplied useful summaries and questions.

All of the summaries received so far are reproduced below.

Summaries from the Digital Repositories Programme projects

The Accessing and Storing Knowledge project

The ASK project began in June 2005 and will end in May 2007. It is funded under the JISC Repositories Programme. The project is a collaboration between the University of Oxford and the University of Highlands and Islands (UHI) Millennium Institute. The broad goal of the project is to integrate repository services.

The ASK project is using the e-framework to document a reference model and design for a pilot repository software system. The e-framework tries to map the vocabulary used to talk about e-learning, e-science and e-administration systems to a level of abstraction whereby developers can begin to design individual software artefacts that provide interoperable services. In short, the framework begins to correlate the services users need with the services developers can build.

Since the e-framework is at an early stage in its development, many of the service abstractions from the users' perspective, and the recommendations on which interoperability specifications service providers should implement, are not well defined. For this reason the JISC is commissioning projects to create reference models and use these to build pilot implementations. The ASK project is one of these efforts: the project team expects to feed findings into a re-definition of some of the services within the e-framework.

Read more about the ASK project: http://ask.oucs.ox.ac.uk/ask/index.php/Project_plan

MIDESS

The MIDESS project explores the management of digitised content in an institutional and cross-institutional context through the development of a digital repository infrastructure. The project addresses how support can be provided, in an integrated manner, for the use of digital content in a learning and research context. The partners in the project are the University of Leeds, University of Birmingham, London School of Economics (LSE) and University College London (UCL). MIDESS will build functional demonstrator digital content repositories at three of the partner institutions, providing a set of suitable platforms to examine the issues and validity of implementing full digital content management services.

A number of university libraries in the UK have embarked on digitisation programmes, and a large amount of content has been generated as a result of these initiatives. However, few institutions have, as yet, seriously explored the management of digitised content through a digital content infrastructure. Digitised content typically tends to be available through unstructured or semi-structured HTML pages, or managed through proprietary systems which do not provide for adequate exposure, sharing or re-use of materials.

The main areas of interest for the MIDESS project are:

  1. What are the best ways to share digital collections?
  2. What is the most appropriate metadata for sound, image and video collections?
  3. How can digital repositories of multimedia material best be used in a learning and teaching context?
  4. What are the digital preservation requirements for digital image collections?
  5. How can digital repositories best be used in medicine?

PerX

The PerX project is developing a pilot service which provides subject resource discovery across a series of repositories of interest to the engineering learning and research communities. This pilot will be used as a test-bed to explore the practical issues that will be encountered when considering the possibility of full scale subject resource discovery services.

The project has deliberately taken a broad definition of what constitutes a repository and the pilot service is currently being set up to cross search a range of repository types (e.g. Research Outputs, e-theses, Learning Materials, Technical Reports, etc.) using a variety of techniques (i.e. OAI-PMH, Z39.50, and SRU/SRW).
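As a rough illustration of how two of these techniques differ, the sketch below builds an OAI-PMH harvesting request and an SRU search request. It is not PerX code: the endpoint URLs and the function names are hypothetical, and only the request parameters (`verb`, `metadataPrefix`, `set`, `operation`, `version`, `query`, `maximumRecords`) come from the OAI-PMH and SRU specifications. No network call is made.

```python
from urllib.parse import urlencode

# Hypothetical endpoints, for illustration only; these are not real PerX targets.
OAI_BASE = "https://repository.example.org/oai"
SRU_BASE = "https://catalogue.example.org/sru"


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords URL, used to harvest metadata in bulk."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set, if the repository defines sets.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve URL, used to query a remote target live."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(oai_list_records_url(OAI_BASE, set_spec="engineering"))
    print(sru_search_url(SRU_BASE, "bridge", maximum_records=5))
```

The contrast is the point of the sketch: OAI-PMH copies metadata into a local index ahead of time, whereas SRU (like Z39.50) sends each user query to the remote target at search time.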

Issues to be investigated during the project include: the range and availability of digital repository sources; exploration of cultural barriers to the use of repositories in the subject community; functionality of software tools; advocacy to encourage participation of repository providers; maintenance issues; interactions with infrastructural shared services; enhancing metadata quality; embedding and reuse of resource discovery services; improving search and browse results presentation; service profiling for particular audiences; and service sustainability.

Questions:

1. How do subject based services effectively identify suitable repositories? IESR scope is currently quite narrow and the OpenDOAR subject classification of repositories results in many repositories being classified under multiple headings. Metadata repositories are also difficult to identify.

2. How do subject based services gain a subject perspective on multidisciplinary collections? Many repositories currently offer no effective means to subdivide collections on a subject basis and those that do, often use different classification schemes.

One possible scenario is that repositories are harvested/collected by resource type at a national level (e.g. e-prints by EprintsUK, e-theses by EThOS, Learning Objects by JORUM etc). These national services subsequently enhance metadata (including the addition of subject classification) from all UK repositories and act as data providers for subject based services.

3. How can potential future services be sustainable? Neil has asked us to think about how the work of our project might engage with the JISC investment in the 'integrating infrastructure' - we need more info on this 'integrating infrastructure' first.

PROWE

BACKGROUND

New online, accessible and cost-effective networking tools, such as wikis and blogs, encourage the open exchange of ideas. They also offer the possibility of collaborative authoring and other shared endeavours such as informal repository development that contribute to building and sustaining a community of practice. Part-time distance education tutors, without the security of a single or permanent institutional 'home', often lack the opportunity to participate in collaborative activity and community building interaction. The PROWE project seeks to explore the potential of new networking tools to foster community and realise part-time tutor professional development.

THE PROJECT

The Personal Repositories Online Wiki Environment project, PROWE for short, focuses on the potential of repositories at the informal and individual levels. The project's overall aim is to develop an understanding of how current technologies can be, and are being, used to support communities of part-time distance tutors. In particular, the project seeks to establish the role that individual and group repositories play in informing professional practice and facilitating (part-time) staff development. The project has a central research question: "In what ways could wiki and wiki-type environments be useful and useable as personal and informal repositories to support professional development within part-time tutor communities of practice?" Two institutions, The Open University (OU) and University of Leicester (UoL), are involved in the project. Both institutions make extensive use of part-time tutors although tutor duties, and thus associated professional development needs, vary quite widely.

ACTIVITY TO DATE

The project began with consultation with tutors and academic staff from both institutions, by means of face-to-face focus groups. Simultaneously an analysis was undertaken of wiki/blog/bliki solutions currently available. In assessing possible tool options, the project had to take account of the requirement that any environment developed must potentially be widely applicable across a variety of institutional contexts within the higher education sector. It must also be compatible across different VLE platforms and accessible via different internet browsers. It quickly became clear that the needs of tutors in the two partner institutions were similar in focus (need for community, etc.) but also very different (different recruitment practices, different technical capabilities, etc.) and that the project should not look to find a single solution but to map and test a variety of options and possibilities for meeting common needs across diverse contexts. Presently UoL is testing a pure wiki environment 'seeded' with core materials which tutors can use and develop, whilst the OU is testing a blog-type environment with file storage capabilities.

SOME ISSUES

The part-time distance tutor is a very diverse species inhabiting equally diverse working habitats! There is no simple solution to assisting their professional development or supporting their resource management needs. Software solutions already exist for most of the activities that tutors would wish to undertake. However, not all are available in any one particular wiki or wiki-type environment. Or, where they are, there are overriding issues to be considered, such as security, authentication or storage. Most wikis are completely open, accessible to anyone, and thus are liable to interference or spam attack. This vulnerability makes them unsuitable for stable, long-term repository-type use. Wiki technology also makes it difficult to store multiple versions of documents, which would be required to ensure that contributors do not lose control of their original contributions. One major challenge for PROWE is to propose options which will offer enough security (i.e. which respect access and the integrity of inputs) to inspire the trust and confidence of users in order to ensure that they contribute. Options must also be flexible enough to encourage the widespread uptake needed to, in turn, ensure that solutions endure as stable, secure and supportive resources over time. Another challenge for the project is to link the development of repository-deposited resources to individuals and not just to the institutions in which they are presently teaching. This will be essential to ensure sustainability. It will also offer users the incentive to contribute by providing long-term personal satisfaction and continued professional development over a career which may span multiple institutional affiliations.

Anne Hewling, The Open University (a.hewling@open.ac.uk)

Repository Bridge

Repository Bridge: project led by the University of Wales Aberystwyth (UWA), interested in the interaction between institutional and national (archival) repositories for access to and preservation of digital materials, specifically electronic copies of theses. We have developed a method for export of theses and metadata from a university's IR to another repository (the National Library of Wales, NLW) and are currently testing the software we have written to support this. The interaction between this method and the EThOS UK Database of Theses is also being explored, jointly with EThOS.

We appreciate that this is one approach to integrating repositories, though arguably not so much at the infrastructural level - our aim is the sharing and management of items and metadata. Our interest in integrating infrastructure is twofold - whether the approach we use for export of theses between UWA and NLW can be expanded both to other institutions and other types of material, so it might form an aspect of a Welsh network of repositories, and also how our sharing and management of content approach fits in with other aspects of integration of infrastructure.

QUESTIONS

  1. How does our approach tie in with the general question of infrastructure integration?
  2. How closely can / should we integrate Welsh repositories and their relationship with the NLW?

Rights and Rewards

The project is researching into 'blended' repositories of teaching material and research output. With a high focus on motivational aspects of depositing to repositories, we undertook a national survey to discover what motivates academics to contribute to repositories and what rights they would like to retain in their materials. We focused on previous, present and future use of repositories in terms of these rights and rewards.

Our most recent activity is investigating the different digital lifecycles for the different types of files and formats that are most commonly used as well as the most appropriate metadata for each of these types.

We are also carrying out a workflow mapping exercise which aims to identify the processes involved in creating and sharing teaching material and research output. We also aim to highlight the support needed for a blended research and teaching materials repository to be sustainable.

In future work, the Rights and Rewards Project aims to:

  • Develop a rights solution and a rewards scheme
  • Develop and implement best practice guidelines for both reward schemes and rights solutions within existing pilot repository architectures
  • Recommend future technical developments within current repository architectures and associated standards to support rights and rewards fully
  • Recommend best practice to the HE and FE community from a model of collaboration between the project partners

Questions:

  1. What would make a good rewards mechanism and rights solution?
  2. What stakeholders are involved in the creation and sharing of teaching material?
  3. Which types of files and formats are most commonly used when creating teaching material?
  4. Which is the most appropriate metadata for each file?
  5. What are the differences between research output and teaching material repositories, and can the two types be 'blended'?
  6. What support is needed to successfully implement, maintain and sustain a repository of teaching materials?
  7. What best practice can be recommended to others in HE/FE who may be setting up these kinds of repositories?

For further information, please visit the project website: http://rightsandrewards.lboro.ac.uk/

SHERPA Plus

SHERPA Plus is working across four areas: to extend the existing repository holdings through end-user advocacy; to look at issues with extending the content-types of existing repositories, with data-sets, multimedia etc; to extend the network of repositories in UK HEIs through advocacy and advice; and to establish UKCORR - a UK Council of Research Repositories.

The original SHERPA project was involved with the advocacy and establishment of institutional repositories in its member institutions. SHERPA Plus will develop and extend this work to address the wider UK HE community and to advocate the general establishment of institutional repositories in HE institutions.

SHERPA Plus will produce a series of deliverables towards these aims, helping to:

  • Assist all stakeholders with advocacy activities for populating existing repositories
  • Advocate resources, information and advice for institutions wanting to establish repositories
  • Broaden the existing national network of repositories through advocacy, dissemination, practical advice, information and support
  • Support repository-level, institutional and national policy development
  • Review and analyse extending repository holdings with datasets, multimedia, grey literature, learning objects and other content types

Our interest in integrating infrastructure is in the establishment of fundamental services and facilities to enable academics to use repositories as a natural and literally everyday part of their working habits.

Questions:

  1. As project workers and OA advocates we can all see a bright future of interlocking information sources, processes and protocols. However, our academic colleagues still have not bought into the idea. Why not? Can we ask the opposite of what we normally consider in the hope of getting further forward?
  2. What is wrong with repositories?
  3. What drawbacks do they have?
  4. What are the real, practical problems I will have as an academic?
  5. What are the Killer Apps (services, probably) that will make repositories a must-have for institutions? What is fundamental to institutions and can repositories address any of these needs? RAE sounds pretty strong to me. If there are difficulties this time around, how can we prepare a national turn-key solution for research assessment for the next?
  6. What are the Killer Apps (services, probably) that will make repositories must-use for academics? What else are must-use systems for academics - what do they give and what lessons can we learn from these? Email? Web? RAE? Hierarchies of esteem? Telephones? Cars? They get communications, respect and convenience from this top-of-the-head list. And do repositories offer easy advantages in these areas? Interestingly, the answer is - should do - but no - not really, not yet. What else?
  7. Are we too purist? Can we not see the wood of the lucrative logging operation for the trees in the groves of academe? Can we use repositories to generate money? If we can, then this is the best attention-getter I know. A brainstorm in the ponds of mammon might reveal some unregarded income streams.

SPIRE

The SPIRE project is looking into the feasibility of using peer-to-peer (P2P) systems in UK academic settings. We are focusing on the LionShare open source P2P system which has been created by Penn State University. The LionShare client application differs from normal P2P clients in that it authenticates with your institution's central security system and it can also use aspects of Shibboleth to create access controls for selected files.

The main areas of interest we have as a project are:

  • Infrastructure: How does an informal technology such as P2P fit into an integrated infrastructure?
  • Shibboleth: When will large federations be in place? How could a federation use Creative Commons?
  • Context / Annotation: How can context be given to digital objects in repository style systems?
  • Workflow: Can the informal P2P philosophy blend into current workflow and institutional requirements?
  • Uptake: Would people use an informal P2P system if it was formally rolled out by the institution?

A more comprehensive list of the areas we are looking into or that have arisen are as follows:

Is it possible to transfer a dynamic and freeform technology from the wider web and place it in an academic environment in a meaningful way?

How would students and academics use P2P as part of their studies / research?

Do the formal requirements of the institution negate the initial flexibility of the P2P format? Is the reason people use P2P partially due to anonymity?

Will the Virtual Home concept work for users who are not members of an institution?

Context:

  • In an academic environment, the context that comes with an object (picture, document etc) is often as important as, if not more important than, the actual object itself. The SPIRE project feels that some sort of annotation / contextualisation system should be attached to each object, which could potentially start a useful commentary.

IP

  • It would appear that the P2P network could only extend as far as any IP agreement that was in place between institutions. We envisage that this agreement could be attached to the Shibboleth federation that the P2P system was linked with.
  • Because of the informal nature of P2P we envisage that a Creative Commons licence would need to be applied to all objects within the P2P system by default.

Shibboleth

  • To account for IP (see above) and to allow the use of access controls on shared files LionShare needs to be part of a Shibboleth federation. As such the SPIRE project is interested in the general uptake of Shibboleth.

Uptake

  • A P2P network needs a critical mass of users and materials before it becomes popular. If P2P is appropriate, how do we generate initial interest?
  • Because it is technically challenging to tie any technology to centralised security systems any authenticated P2P system would have to be rolled out by a central IT department. This means that what is normally a viral model of uptake for P2P would have to change to a centralised rollout.
  • LionShare also has a federated repository search. Would a client app rolled out as part of an institutional desktop be more popular than a website for a single point of contact search?

SPECTRa

Project SPECTRa: an Overview

Chemical information is essential to many sciences outside chemistry, including material, life and environmental sciences, and supports major industries including pharmaceuticals. The reporting of the synthesis and properties of new chemical compounds is central to this. However, it has been reported that 80% of all crystallographic data are never published and we estimate that in organic chemistry 99% of all spectra (which are essential for the full analytical characterization and understanding of chemical structures) are lost.

SPECTRa - the Submission, Preservation and Exposure of Chemistry Teaching and Research Data - is an eighteen month project which will develop a set of customized software tools to enable chemists to routinely deposit experimental data, much of which is currently lost, in Open Access digital repositories. The project is a partnership between the chemistry departments and university libraries at Cambridge and Imperial College, and has formal links with the eBank-UK project. Requirements in a number of different user disciplines (X-ray crystallography, computational chemistry and synthetic organic chemistry) will be determined by interview and survey. A customized version of the DSpace digital repository will be developed. Additional tools and context-specific metadata will facilitate the subsequent re-use of the deposited information.

Progress so far has been mainly in the crystallographic area. Building on interviews with departmental crystallographers, and on work previously done by the eBank project for its eCrystals repository, we have developed a set of software requirements for a toolset to enable crystallographers to deposit structures and associated metadata in an OAI-compliant repository. Work is currently on schedule.

A number of issues have arisen. On the technical side we need to further investigate which of the digital packaging standards to adopt (METS or DIDL). On the policy side, we have to manage researchers' views on Open Access, which may require some degree of short-term embargo in order to encourage their willingness to deposit data.

TrustDR

The TrustDR project is mainly concerned with exploring the legal, organisational, cultural and technical aspects of operating an institutional digital repository of learning objects. The legal dimensions of e-learning, particularly those affecting the sharing and reuse of learning materials in the form of learning objects, are currently conceived of as presenting serious obstacles to future development, so this project is very timely.

The real challenge is how the education sector can take advantage of the new digital media and technologies without having to pay a huge cost in terms of administration, legal fees and insurance. In this, the issue of trust is central. How can the education sector conduct its business within this environment in such a way that the various creators, publishers and consumers of intellectual property retain their trust? A social or economic system that has low levels of trust tends to have much higher running costs. In a low-trust system, expensive lawyers, contracts and insurance are used as a substitute for behavioural constraint. So, if trust reduces transaction costs in an economy, how can we build and maintain it in the context of digital repositories? Some of the main barriers to the success of such repositories are not technical but legal and cultural.

The project will be interested in looking at the cultural issues that need to be addressed in developing DRM (Digital Rights Management) systems. It will be concerned with how to arrive at an agreed legal expression of rights in the form of licences (especially those developed by the ‘Creative Commons’, http://creativecommons.org/) and user agreements from various groups of stakeholders, and whether there are any common patterns that can be identified and possibly transferred for use elsewhere. The project will also be looking at how these expressions of rights can be included in rights metadata using a Digital Rights Expression Language (DREL).

We are working on a pretty broad front and are seeking to deliver materials to users at different levels in institutions that seek to 'explain and persuade' about the benefits of adopting a simple licensing regime and incorporating it into policy and DRM systems. We have come up with a simple framework built on previous work by the RoMEO project and the DRM report for JISC by Intrallect.

On the way we are having to deal with some interesting issues - like where is the value (if any) in e-learning materials?

VERSIONS

The VERSIONS Project, led by London School of Economics and Political Science, addresses the issues and uncertainties relating to versions of academic papers in digital repositories. VERSIONS aims to help build trust in open access repository content among all stakeholders: authors, researchers, librarians, publishers.

VERSIONS is investigating researchers’ needs and current practice relating to use of open access research papers. By looking at the requirements of economics researchers in major European research institutions, the project will uncover any variations in practice between different EU countries and aims to provide explanations for differences.

One of the deliverables will be a toolkit of guidelines on versions for stakeholders offering practical guidance about retention of author copies, standardised description of versions and related advice about deposit of papers in open access repositories. The project will make recommendations to JISC on standards for versions of eprints.

Questions / Challenges of interest to the VERSIONS Project

  1. Standardising the description of resources: metadata records and other parts of the repository records – cataloguing rules
  2. Ensuring interoperability between repository services and internet services to improve discovery
  3. Ensuring international interoperability (for example international services based around subjects or item types)

All of these are of interest on the specific topic of version identification, but could equally apply to other parts of the description such as unique author identification, subject description.

Summaries from other projects attending the meeting

Digital Curation Centre

Digital curation is all about maintaining and adding value to a trusted body of digital information for current and future use. Working with other practitioners, the Digital Curation Centre will support UK institutions that store, manage and preserve these data to help ensure their enhancement and their continuing long-term use. The purpose of our centre is to provide a central focus for research and development into curation issues and to cultivate expertise and promote good practice, nationally and internationally, for the management of all research outputs in digital format.

The DCC is currently involved in two pilot projects that are closely related to digital repositories and long-term preservation. The first project will look into aspects of audit and certification relating to digital repositories. Seeking to complement the existing international efforts in this area the DCC will work with the Center for Research Libraries (CRL) to undertake a pilot audit of the digital repository at the Koninklijke Bibliotheek in the Netherlands in late April 2006. This exercise represents the third in a series of pilot audits that aims to assess the viability of the RLG/NARA Certification Task Force Audit Checklist for Certifying Digital Repositories. Additional independent pilot audits will be undertaken by the DCC in 2006 within the UK, with the results informing subsequent work and contributing to the accumulated understanding.

The second pilot project is the DCC LOCKSS Technical Support Service (LTSS), funded as part of the JISC/CURL LOCKSS Pilot Programme. The Pilot Programme aims to raise awareness of LOCKSS in the UK and will monitor, assess, and support the use of LOCKSS technology for e-journal archiving and preservation. Twenty-four universities from across the UK have successfully bid to participate in this pilot and have very recently had LOCKSS boxes installed at their institutions. In addition to gathering valuable community feedback on the usability and functionality of LOCKSS, the Technical Support Service will also provide both technical and non-technical support to the UK LOCKSS community, including first line support and development of publisher specific plug-ins, as well as training and awareness raising events.

Questions:

  1. How can we best identify, promote and collaborate in the research and development activity being carried out by other JISC programme projects and other projects (e.g., in the data and eScience communities, JISC core middleware projects)?
  2. Should the various JISC-funded projects start to look into assigning 'bricks' (from the JISC/DEST e-Framework model) to the various services we are developing?
  3. Is there a need for an audit and certification framework for digital repositories in the UK? If so, what characteristics would an accredited certifying body be required to have?

GeoXwalk

geoXwalk is JISC-funded middleware implementing a digital gazetteer service and server for the UK academic Higher and Further Education community. The rationale behind the project is that there is currently no unified entry point to assist in geographic searching within the existing academic network, as each information provider/service adopts different geographic coding conventions (some use postcodes, others placenames, some grid references etc.). geoXwalk is designed to make geographic searching transparent by 'crosswalking' these different geographies.

geoXwalk is more than just a simple lookup facility however, as every geographic feature stored in the gazetteer has its detailed geometry stored with it (i.e. a city would be stored as a polygonal footprint (co-ordinate list), a river as a linear footprint etc.). Holding the geometry as an integral attribute of the feature enables complex spatial searching based on relationships between features e.g. is feature A within a distance of feature B?; what features are contained within feature C?; what features does feature D intersect? and so on. Additional tools that assist in the semi-automated creation of geospatial metadata to enhance existing resources have also been developed.
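The three kinds of spatial question listed above can be sketched in a few lines. This is an illustrative toy, not geoXwalk's implementation: the footprints are invented coordinate lists in arbitrary metre units, all function names are hypothetical, and the intersection test compares bounding boxes only, which is a coarse first filter rather than an exact geometric test.

```python
import math


def distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def point_in_polygon(point, polygon):
    """Ray-casting test: is the point inside the polygonal footprint?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bounding_box(footprint):
    """Smallest axis-aligned box (min_x, min_y, max_x, max_y) around a footprint."""
    xs = [p[0] for p in footprint]
    ys = [p[1] for p in footprint]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a, b):
    """Coarse overlap test on two bounding boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Invented example footprints (assumptions, not gazetteer data).
city = [(0, 0), (10, 0), (10, 10), (0, 10)]  # polygonal footprint
river = [(8, -5), (12, 15)]                  # linear footprint
museum = (5, 5)                              # point feature

if __name__ == "__main__":
    print(point_in_polygon(museum, city))                             # contained within?
    print(distance(museum, (8, 9)) <= 5)                              # within a distance?
    print(boxes_intersect(bounding_box(city), bounding_box(river)))   # may intersect?
```

A production gazetteer would normally delegate such queries to a spatial database or geometry library; the sketch only shows why storing the footprint with each feature makes these questions answerable at all.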

geoXwalk obviates the problem of variable geographic naming by coding geographic features based on a persistent and consistent coding convention - national grid references, thus allowing the 'where' to become as important a search dimension as the 'who' and the 'what'.
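To make the idea of a consistent numeric coding concrete, the sketch below converts a lettered Ordnance Survey National Grid reference into a numeric easting and northing in metres. The letter-to-square arithmetic follows the published National Grid lettering scheme; the function name is hypothetical and nothing here is taken from geoXwalk itself. The result is the south-west corner of the square the reference identifies.

```python
def grid_ref_to_easting_northing(grid_ref):
    """Convert e.g. 'TQ 30 80' to (easting, northing) in metres."""
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not letters.isalpha() or "I" in letters:
        raise ValueError("expected two grid letters (the letter I is not used)")
    if len(digits) % 2 != 0 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError("expected an even number of digits, at most ten")

    # Letter positions in a 25-letter alphabet that skips 'I'.
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Index of the 100 km square, counted from the grid's false origin.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)

    # Split the digits in half and pad each half to metre precision.
    half = len(digits) // 2
    east = int(digits[:half].ljust(5, "0")) if half else 0
    north = int(digits[half:].ljust(5, "0")) if half else 0
    return (e100km * 100000 + east, n100km * 100000 + north)


if __name__ == "__main__":
    print(grid_ref_to_easting_northing("TQ 30 80"))
```

Once every feature carries coordinates of this kind, a postcode, a placename and a grid reference for the same place can all be resolved to the same numbers and compared directly.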

Pop Question: - who has used Google Earth/Maps in the last six months?

  1. What range of geographies do your current services use to assist users in locating resources?
  2. How do you currently implement geographic searching within the services you use or run?
  3. How could geoXwalk middleware enable and/or enhance the geographic search capabilities of your services?
  4. What value can we place on enabling resources to be searched in this way and what would the business model be?

IEMSR

The IEMSR (Information Environment Metadata Schema Registry) project is developing a metadata schema registry as a pilot shared service within the JISC Information Environment. http://www.ukoln.ac.uk/projects/iemsr/

Metadata schema registries enable the publication, navigation and sharing of information about metadata. The IEMSR will act as the primary source for authoritative information about metadata schemas recommended by the JISC IE Standards framework.

Metadata within the JISC IE is based largely on two key standards: the Dublin Core Metadata Element Set (DCMES) and the IEEE Learning Object Metadata (LOM) standard. The IEMSR will provide the JISC IE with a single point of referral for recommended schemas. It will allow various initiatives within the JISC IE to publish "application profiles" of these standards in a common registry, making them available to others. This provides a concrete way of encouraging sensible uniformity alongside necessary divergence. It helps avoid unnecessary duplication of effort, and supports sharing of common approaches.

The project has now entered Phase 2, the continuing development of the web interface and desktop software, both key elements in encouraging uptake and use of the registry.

Issues of particular relevance to the current phase include: providing a focused business case; the dissimilar software base surrounding the DC and LOM metadata standards; incentives to add data to the registry; provenance of donated data; reuse of donated data; defining a usable interface (e.g. faceted views on metadata profiles); use cases and user profiles; the possibility of integration with DC/LOM-specific tools such as XML/XSD validation.

Questions:

  1. Do you have any questions regarding the issues above or the project in general?
  2. Who uses what schemas - integration with IESR/others?

IESR

The Information Environment Service Registry (http://iesr.ac.uk/) is one of the building blocks of the Information Environment. It holds information about electronic resources and the technical services through which they can be accessed, as well as details about the organisations which provide the resources and services. The aim is to supply enough technical information so that portal-builders and developers of other applications can use the IESR to provide (for example) cross-searching facilities over those resources that are of interest to the users of the portal/application.

The IESR's data is available through a variety of protocols (http://iesr.ac.uk/use/), including a web interface - although this is definitely not the point of the service, which is intended to be invisible to 'real' users. An online editor for the creation of descriptions was added this month - further information is on the 'Be Included' pages of the site. A variety of technical access methods can be described in the IESR, including OAI-PMH repositories, Z39.50 access to library OPACs and Web Services. Most of the resources currently described are only accessible through web pages, but we anticipate that this will change over the next few years.

Questions:

  1. How do IESR's aims fit with related registry efforts such as ROAR and OpenDOAR?
  2. What is the best way for the various registries to collaborate and inter-operate (while keeping barriers to inclusion low for the contributors of information)?

PARADIGM

The PARADIGM Project, led by Oxford University, addresses the issues and uncertainties of working with the personal papers of private individuals, in order to develop policies and procedures to care for the 'born digital' archives which are increasingly replacing the paper records of the past.

PARADIGM's primary objective is to act as an exemplar project: providing record creators (we are working with politicians' papers) and curating institutions with an introduction to the preservation of digital personal archives based on the project's experiences.

One of the deliverables will be an on-line workbook (www.paradigm.ac.uk/workbook), which will provide guidance on accessioning and ingesting digital private papers into digital repositories and processing these in line with archival and digital preservation requirements. Another outcome will be the creation of guidance notes for creators of digital private papers.

Questions / Challenges of interest to the PARADIGM Project

  1. We need to build on and integrate the work of others rather than re-inventing the wheel.
  2. We have a preference for flexible solutions which are cross-platform and based on open standards.
  3. Creating the atmosphere of trust with private individuals to enable the preservation of sensitive materials over decades.
  4. Developing ingest workflows for complex collections of objects - many repositories deal only with a handful of object types and we need to deal with whatever we find on an individual's computer AND maintain the contextual relationships between objects AND do it as efficiently as possible.

Sherpa DP

The SHERPA DP project is investigating the creation of a collaborative, shared preservation environment framed around the OAIS Reference Model. The project brings together the SHERPA institutional repository systems with the preservation repository established by the Arts and Humanities Data Service to create an environment that addresses the lifecycle of digital information. The project is being led by the Arts & Humanities Data Service with the University of Nottingham as the named project partner. Five project partners (London Leap, The White Rose consortium, University of Edinburgh, University of Nottingham and the University of Glasgow) will provide 15 repositories to serve as a testbed for the service. The project's primary aim is to investigate a disaggregated preservation service that removes the burden of adding a preservation layer to individual institutional repositories, and the need for them to seek to employ scarce preservation management skills and expertise. In the disaggregated model developed for the project, institutional repositories will continue to be responsible for ingest and distribution, while a specialist service, provided by the AHDS, will take responsibility for the long-term management and preservation of research data. The SHERPA DP project is currently investigating the following subjects:

  • An investigation of the OAIS reference model as a method to develop a disaggregated, persistent preservation environment for the SHERPA consortium. This includes the assignment of rights and responsibilities and establishment of a preservation workflow.
  • An exploration of METS as a framework to package metadata created by institutional repositories, as well as preservation metadata for e-prints created by the AHDS.
  • Establish a coordinated set of protocols and software to be implemented as a working preservation service for a group of institutional repositories.
  • An exploration of open source software and tools to add functionality to and extend the storage layer of DSpace, ePrints and Fedora repository systems.
  • Draw together the experience gained into a Digital Preservation User Guide that will complement 'The Preservation Management of Digital Material Handbook' created by Maggie Jones and Neil Beagrie, and act as a practical user guide to implementing this type of preservation environment.

Challenges/questions:

  1. What methods are currently being tested to share data between institutions?
  2. How does our investigation of OAIS, PREMIS and related technologies compare to other projects?
  3. How can preservation concerns be implemented into the workflow of an institutional repository?

STARGATE

The STARGATE project (http://www.cdlr.strath.ac.uk) is exploring the use of static repositories as a means of exposing publisher metadata to OAI-based disclosure, discovery and alerting services within the JISC Information Environment and beyond. The project's primary aim is to examine the use of static repositories to lower the technical barriers to the implementation of OAI-compliant repositories, thereby enabling small publishers of electronic resources to participate more readily in OAI-based disclosure and delivery services. In doing so, it is seeking to improve the retrieval of collections of articles from the same journal or issue, and so also to address the problem that the URL of the ‘published’ version of an article is often less visible in OAI-based services. Static repositories and static repository gateways are a development of the OAI-PMH specification that makes participation in networks of data and service providers even simpler. To create these static repositories the project is capitalising on existing metadata that the publishers have created and made available in some form (for example webpage meta-tags). The project is working with four publishers in the Library and Information Science domain to explore the use of OAI static repositories.
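The reuse of existing webpage meta-tags mentioned above might look something like the following sketch, which collects Dublin Core style meta-tags from an article page into a simple record that could later be serialised into a static repository file. The "DC." naming prefix, the class and function names, and the sample page are assumptions for illustration; this is not the STARGATE project's own tooling, and publishers' pages may label their metadata differently.

```python
from html.parser import HTMLParser

class DublinCoreMetaParser(HTMLParser):
    """Collects <meta name="DC.xxx" content="..."> tags into a dict of lists."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip()
        content = attr.get("content")
        # Assumption: Dublin Core elements are labelled with a "DC." prefix.
        if name.lower().startswith("dc.") and content:
            element = name[3:].lower()
            self.fields.setdefault(element, []).append(content.strip())

def extract_dc(html):
    """Return {element: [values, ...]} for the DC meta-tags found in html."""
    parser = DublinCoreMetaParser()
    parser.feed(html)
    parser.close()
    return parser.fields

# Hypothetical article page, for illustration only.
sample_page = """
<html><head>
  <meta name="DC.title" content="An Example Article">
  <meta name="DC.creator" content="A. Author" />
  <meta name="DC.creator" content="B. Author">
  <meta name="keywords" content="not Dublin Core">
</head><body></body></html>
"""
record = extract_dc(sample_page)
```

Repeated elements such as multiple creators are kept as lists, since simple Dublin Core allows any element to repeat.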

Questions:

  1. What benefits (and other effects) would the greater visibility of publishers’ metadata have for the information environment?
  2. How might metadata records across different repositories be linked by a higher-level service (publisher’s version to author pre-print and vice-versa)?
  3. What other collections might be usefully exposed via the static repository approach?
  4. Who should run a static repository gateway service for publishers?

Information about remaining DRP projects

CDLOR; CLADDIER; Community Eprints; EThOS; GRADE; IRIScotland; IRRA; IRS; OpenDOAR; R4L; RepoMMan; STORe; UKCDR; UNPUPR