Deposit API meeting Warwick Minutes

From DigiRepWiki

Minutes

Minutes from the Deposit API meeting, held at the
Radcliffe Training and Conference Centre, University of Warwick
11th-12th July 2006

Attendees

  1. Chris Gutteridge, Southampton University, eprints.org
  2. Jim Downing, University of Cambridge, Dspace / SPECTRa project
  3. Martin Morrey, Intrallect
  4. Phil Nicholls, technical consultant
  5. David Flanders, Birkbeck College
  6. Rachel Heery, UKOLN, JISC Digital Repositories Support team
  7. Julie Allinson, UKOLN, JISC Digital Repositories Support team

Structure

The meeting covered the following:

  • Scope and context
  • Discussion of scenarios and use cases, minutes of last meeting, strawman, comments from Richard Green
  • Brainstorming on components and levels
  • XML serialisation based on outcomes of brainstorm
  • Next steps and actions

Scope and Context

This meeting follows on from one held in London in February (see Report). At that meeting there was some discussion of what components a Deposit API or protocol would have.

The purpose of this 2-day meeting is to take forward previous discussions and draft a service description (initially as a document, later to be expressed as UML if effort is available), to explore binding options, and to plan implementation.

The effort can be described as 'JISC Repository Deposit Service Description'. In the terminology used within the e-Framework, this work is to scope and develop a Service Expression for a Deposit Service Genre.

Rachel Heery gave an overview of the Interoperability meeting in New York (see the meeting notes and web site), in particular Herbert van de Sompel's ideas on surrogates, provenance and a structured data model for digital objects. Ultimately this work may propose an alternative to the Deposit activity, but its complexity suggests a longer-term goal. This shouldn't stop current work towards a quicker solution; the one might feed into the other.

A lot of interest has been expressed in this work, from the UK and beyond, including from some Australian projects. It was agreed that this interest should be galvanised and other players invited to contribute.

In Scope

It was agreed that the initial focus is on getting new content into repositories. Policies/properties/permissions are in scope, i.e. the properties a depositor needs to know before they can deposit, and the permissions that specify where and what they can deposit.

Out of (current) scope but important

  • Update and Delete
  • Mappings between metadata schemas and packaging formats
  • Identifier solutions (although the requirement for a central JISC-funded registration and resolution service for identifiers was discussed)
  • Relationships between digital objects – this should be captured within the metadata (e.g. see the Eprints Application Profile)
  • Tracking and provenance
  • Authentication

Discussion

Scenarios

Several scenarios/use cases were considered (see All_the_Scenarios_and_Use_Cases_Submitted):

  • Depositing geospatial data in the GRADE repository system
  • Deposit scenario (from Les Carr)
  • Data and metadata capture in the R4L project
  • Data deposit into the AHDS preservation repository
  • Deposit via a desktop client application (from Jorum)

Discussion of these helped to crystallise the requirements expressed in the brainstorm session.

Requirements

  • Support an 'Eazi-Deposit' service. This might be a more or less centralised service which would be able to accept deposits and direct them to an appropriate repository or repositories. It might be implemented by an institution or a third-party service.
  • Support Multiple Deposit. This is, or will be, a fact, particularly in the light of the RCUK statements on OA (some research councils are mandating deposit into their repositories, but authors will also want/need to deposit in their institutional or departmental repository).
  • Support transfer of deposits between intermediate hosts, e.g. from central repository to another repository.
  • Support different types of repositories (School/Dept; Institution; Subject; Funding body; Cross-University, e.g. research project)
  • Support different content/format/packaging types and profiles of these
  • Support different workflows for deposit, e.g.
    • user to multiple repositories via intermediate client
    • user to repository, repository to additional repositories
  • Support user-triggered and machine-triggered deposit
  • Support Status, e.g. deposit to different states of a workflow (Live, Pending etc.)
  • Support Collections and changes in policy and permissions by Collection
  • Support error codes
  • Support non-instantaneous processes, e.g. deposit may be pending authority checks before acceptance.
  • Recommend use of URIs for vocabulary strings, e.g. enumerated error codes
  • Support Validation report and integrity checks
  • Support anonymous deposit
  • Support more complex, authenticated deposit
  • Support acceptance and handling of incomplete records
  • Support rejection of records (reasons for rejection are out of scope)
  • Support machine-readability (in the future)
  • Support human-selected targets for deposit
  • Support different deposit requests
  • Support 2-way licences
  • Recommend vocabulary for describing the various packaging standards
  • XSLTs for deposit/explain would be useful
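
Several of the requirements above (status, error codes expressed as URIs, non-instantaneous processes, rejection, multiple deposit) point towards a structured receipt. The following is a minimal sketch in Python of how such a receipt might be modelled. The class and field names, the "Rejected" state, and the example.org error URI are hypothetical illustrations and were not agreed at the meeting.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class DepositStatus(Enum):
    # "Live" and "Pending" appear in the minutes; "Rejected" is an assumed extra state.
    LIVE = "Live"
    PENDING = "Pending"
    REJECTED = "Rejected"


# Hypothetical error code expressed as a URI; the example.org namespace is a placeholder.
ERROR_UNSUPPORTED_PACKAGING = "http://example.org/deposit/errors/unsupported-packaging"


@dataclass
class DepositReceipt:
    """Hypothetical receipt returned by one repository for one deposit request."""
    repository: str
    status: DepositStatus
    errors: List[str] = field(default_factory=list)  # error codes as URIs

    def is_final(self) -> bool:
        # A pending deposit is still awaiting, e.g., authority checks.
        return self.status is not DepositStatus.PENDING


def summarise(receipts: List[DepositReceipt]) -> Dict[str, List[str]]:
    """Group the outcome of a multiple deposit by status name."""
    summary: Dict[str, List[str]] = {}
    for receipt in receipts:
        summary.setdefault(receipt.status.value, []).append(receipt.repository)
    return summary
```

A client performing a multiple deposit could collect one receipt per target repository and use `summarise` to see which deposits are live, which are still pending, and which were rejected.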


Outstanding issues

  • Local differences between repositories and different repository requirements need consideration.
  • The possibility for loss as packages move around should be considered.
  • Need to explore how Deposit fits with JSR 170 (283), OKI OSID and IDL
  • Need to explore other bindings, e.g. SOAP, XML over HTTP

Brainstorming

For the brainstorm and XML documents we used the convention that mandatory elements are Level 0, and optional elements (plus the mandatory ones) are Level 1.
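
The Level 0 / Level 1 convention can be illustrated with a small check, sketched here in Python. The element names are hypothetical placeholders, since these minutes do not list the elements that were agreed; only the rule that Level 1 is the optional elements plus the mandatory ones comes from the text.

```python
from typing import Dict, List, Set

# Hypothetical element names, used only to illustrate the convention.
LEVEL_0_ELEMENTS: Set[str] = {"target", "package"}   # mandatory
LEVEL_1_OPTIONAL: Set[str] = {"status", "licence"}   # optional
# Level 1 is the optional elements plus the mandatory ones.
LEVEL_1_ELEMENTS: Set[str] = LEVEL_0_ELEMENTS | LEVEL_1_OPTIONAL


def check_request(elements: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the request is acceptable."""
    problems: List[str] = []
    present = set(elements)
    # Every Level 0 element must be supplied.
    for name in sorted(LEVEL_0_ELEMENTS - present):
        problems.append(f"missing mandatory element: {name}")
    # Anything outside Level 1 is not recognised at all.
    for name in sorted(present - LEVEL_1_ELEMENTS):
        problems.append(f"unknown element: {name}")
    return problems
```

Under these assumptions a Level 0 implementation only needs to understand the mandatory set, while a Level 1 implementation also handles the optional elements.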

Components of the deposit explain and receipt were brainstormed into the following document, based on a request-response model:

XML serialisation and binding

Draft XML serialisations for the response:

Next steps and actions

By End July

ACTION: Develop a UML model and translate to WSDL (Phil Nicholls)

ACTION: Review and feed back comments, via wiki/email, on the following specs:

  • Webdav – Jim Downing
  • Atom – Jim Downing
  • SRW Update – Martin Morrey
  • SMTP – Martin Morrey
  • PENSE? AICC – Martin Morrey
  • FTP – Phil Nicholls
  • Fedora API – Richard Green? Other Fedora implementers? (JA/RH to follow up)
  • Flickr – Chris Gutteridge

ACTION: Feed back comments to Neil Jacobs (Rachel Heery); explore potential for a funded development project in the next JISC round of funding (all)

By Mid-August

ACTION: Proof of concept testing (Developers)

  • Martin, Jim and Chris to work on a Level 0 implementation for their respective applications, extending to Level 1 elements if possible. The initial proof of concept implementation will be XML over HTTP for simplicity (note that the SOAP attachment spec is not yet finalised)
  • Rachel/Julie to contact Richard Green for Fedora input
  • Jim Downing to contact Stuart Lewis for Fedora input
  • Additional data and clients to be sourced
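
As an illustration of the XML-over-HTTP binding chosen for the proof of concept, the following Python sketch builds (but does not send) a deposit request using only the standard library. The element names, the endpoint URL and the choice of POST are assumptions made for this example; the minutes do not record the agreed serialisation.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_deposit_request(endpoint: str, target: str, package_url: str) -> urllib.request.Request:
    """Build an HTTP POST carrying a small XML deposit document.

    The element names (depositRequest, target, package) are hypothetical.
    """
    root = ET.Element("depositRequest")
    ET.SubElement(root, "target").text = target
    ET.SubElement(root, "package").text = package_url
    body = ET.tostring(root, encoding="utf-8")  # bytes, suitable as a request body
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Placeholder endpoint and values; nothing is sent over the network here.
request = build_deposit_request(
    "http://repository.example.org/deposit",
    "institutional-collection",
    "http://client.example.org/packages/item-1.zip",
)
```

Sending the request would be a matter of passing `request` to `urllib.request.urlopen`, and the repository's XML response could then be parsed with `ET.fromstring`.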

Mid/Late August

ACTION: Meeting to discuss and write up the protocol, London (Developers)

Late August

ACTION: Develop a demonstrator (Developers)

Early September

ACTION: Dissemination

  • JISC-DRP (Julie / Rachel)
  • CETIS SIG (Julie / Rachel to contact Phil Barker)
  • Via conferences and other events (e.g. Open Scholarship 2006; JISC Open Access event)
  • Plugfest?

Ongoing

ACTION: Engage other players

  • IMS Community (Martin Morrey noted that IMS were receptive to the idea at alt-i-lab; possible SIG)
  • Fedora implementers (Rachel Heery to contact Sandy Payette; Jim Downing to contact Stuart Lewis)
  • e-Framework – Deposit is 'earmarked' as a Service Genre in the next instantiation of the e-Framework; e-Framework people to be kept up to date with respect to Deposit activity.

Definition

It was agreed that 'Deposit API' was not an appropriate name for the work. The aim of the work is to produce a protocol (abstract model and binding) for service description and deposit, principally to repositories, rather than an API. A possible new name is 'JISC Repository Deposit Service Description'.