SWORD meeting 2007-04




SWORD Project meeting 30th April 2007

Monday, 30th April, 10.30am-4pm

Birkbeck College, London, Room 104 in Clore Management Centre (building no. 2) on this map: http://www.bbk.ac.uk/bbk/findingbirkbeck/maps/immediateareawebmap.html


  • Introducing the project, agreeing our scope
  • Workpackage 1 : evaluation of standards; finalising the specification
  • Workpackage 2 : development
  • Workpackage 3 : user testing
  • Workpackage 4,5 : community acceptance, project management
  • Agreeing actions, date of next meeting (late June/early July)



  • Julie Allinson
  • Stuart Lewis
  • Les Carr
  • Martin Morrey
  • Neil Taylor
  • Glen Robson
  • David Flanders
  • Jim Downing


  • Richard Jones
  • Richard Green
  • Chris Gutteridge

Introducing the project

See http://www.ukoln.ac.uk/ukoln/staff/j.allinson/sword-2007-04-30.pdf

Agreeing our scope

Purpose is to support remote deposit by people and machines; also deposit from legacy databases/filestores

Possible new scenarios:

  • save as, appears in repository
  • email to repository
  • post blog entries to a repository.

The focussed scope was agreed (deposit, plus necessary permissions/conditions that relate directly to deposit - see http://www.ukoln.ac.uk/ukoln/staff/j.allinson/sword-2007-04-30.pdf).


Discussion during the first part of the day explored a range of issues regarding the specification and implementation, e.g.


Putting an item in a particular collection and knowing where it has been put is the key requirement; as a minimum, a repository must offer one default collection.

Is detailed identification of collections too specific, and is that out of scope for this group?

Collections and loss of hierarchy is an issue, particularly where a client is offering human users a list of collections for deposit; might be worth investigating http://www.sitemaps.org (can be extended), although this is out of scope for SWORD.

There are standards for describing collections (e.g. ead, dc-collections).

mediated deposit, authentication and trust

Mediated deposit is an important requirement that was missing from earlier discussions, i.e. the facility to deposit (as 'depositor') on behalf of somebody else (as 'owner') into a specified collection, and supply their identity. This applies to both human and machine deposits; for the latter there may also be machine-generated metadata, e.g. identifying the author/owner.

Three levels of access:

  • 0) anonymous user (anyone in the world can deposit candidate items for review)
  • 1) authenticated user (individual can deposit into own accessible collections)
  • 2) superuser (can deposit anything to anywhere they have access to on behalf of a set of users)
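As an illustration only, the three levels above could be modelled as below. This is a minimal sketch, not an agreed design: the names `AccessLevel` and `deposit_decision`, the three return values, and the rule that only a superuser may make a mediated deposit are all assumptions made for the example.

```python
from enum import IntEnum
from typing import Optional


class AccessLevel(IntEnum):
    """The three access levels from the meeting notes (names are hypothetical)."""
    ANONYMOUS = 0      # anyone may submit candidate items for review
    AUTHENTICATED = 1  # may deposit into own accessible collections
    SUPERUSER = 2      # may deposit on behalf of a set of users


def deposit_decision(
    level: AccessLevel,
    collection: str,
    accessible_collections: set,
    on_behalf_of: Optional[str] = None,
    represented_users: frozenset = frozenset(),
) -> str:
    """Return 'review', 'allow' or 'deny' for a proposed deposit."""
    if on_behalf_of is not None:
        # Assumption: mediated deposit is restricted to superusers acting
        # for an owner they are known to represent.
        if level is not AccessLevel.SUPERUSER or on_behalf_of not in represented_users:
            return "deny"
    if level is AccessLevel.ANONYMOUS:
        # Anonymous deposits are only ever candidates for review.
        return "review"
    # Authenticated users and superusers are limited to collections they can access.
    return "allow" if collection in accessible_collections else "deny"
```

For example, `deposit_decision(AccessLevel.SUPERUSER, "theses", {"theses"}, on_behalf_of="alice", represented_users=frozenset({"alice"}))` returns `"allow"`, while the same call at `AccessLevel.AUTHENTICATED` returns `"deny"`.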

It has been assumed that an authentication service is being used for authenticated users (2); for machine users, we need to avoid the assumption that there is a person and a browser. We are aware that this is more complex than that assumption implies!

The APP approach is HTTP basic and HTTPS for secure access.
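A rough sketch of what an HTTP Basic deposit request over HTTPS might look like for a machine client (no person, no browser) follows. It only builds the request and does not send it. The function name, the collection URL in the usage note, and the `X-Example-On-Behalf-Of` header are hypothetical; the meeting did not settle how the owner's identity would be carried.

```python
import base64
import urllib.request
from typing import Optional


def build_deposit_request(
    collection_url: str,
    package: bytes,
    username: str,
    password: str,
    content_type: str = "application/zip",
    on_behalf_of: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST carrying HTTP Basic credentials."""
    # Basic credentials are only base64-encoded, not encrypted, so refuse plain HTTP.
    if not collection_url.lower().startswith("https://"):
        raise ValueError("Basic credentials should only be sent over HTTPS")

    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(collection_url, data=package, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", content_type)
    if on_behalf_of is not None:
        # Hypothetical header for mediated deposit; not defined by APP.
        request.add_header("X-Example-On-Behalf-Of", on_behalf_of)
    return request
```

A client would then pass the result to `urllib.request.urlopen`, e.g. after `build_deposit_request("https://repository.example.org/deposit/theses", data, "depositor", "secret", on_behalf_of="owner")`.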


It was agreed that the only 'right' that needs managing is the right to deposit; this should be easy to expose.


Possible deposit mechanisms: email, ftp, webdav ...

Would we need alternate bindings (of atom? or our 'own' specification?), e.g.

  • smtp binding (cf flickr), e.g. email zip file, with xml body to a repository
  • SOAP (OSID can be used to build a soap binding on top of xml over http; java jackrabbit),
  • REST
  • wsdl (soa important)
  • cgi approach?

It was agreed that what we are trying to do will require co-operation at both the client (deposit) and server (repository) side; it's not magic.

There was some discussion regarding the use of http headers and multiparts and the need to reserve headers; there was also a question over the use of http parameter names vs xml namespaces.

transaction identifiers and status

An optional element in our list of parameters, client-generated, used to provide an audit trail. Agreed that it might have uses, but would certainly not be a required parameter.

HTTP status codes will provide a success or failure identifier, although we may require more than that.

Is it enough to supply a uri for status information? We can't rely on email addresses and people being part of the process. Does this status information need to be machine-readable?

Status levels

  • http request received and returned (this is a given)
  • level 0 - accept/reject
  • level 1 - further information about position in workflow and detailed status information from the repository (out of scope?)

On receipt of a deposit, the minimum a repository must do is unpack the package and check its contents, decide whether it will accept the deposit, and return a status in the receipt.
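The minimum repository behaviour described above (level 0, accept/reject) might be sketched as follows. This assumes the package arrives as a zip file, which was still an open question at the meeting; the `Receipt` fields, the `metadata.xml` member name and the choice of 201/400 status codes are likewise assumptions for illustration.

```python
import io
import zipfile
from dataclasses import dataclass


@dataclass
class Receipt:
    """Hypothetical deposit receipt: an HTTP status plus a level 0 decision."""
    http_status: int
    accepted: bool
    detail: str


def receive_deposit(package: bytes, required_member: str = "metadata.xml") -> Receipt:
    """Unpack a zip package, check its contents, and accept or reject it."""
    try:
        with zipfile.ZipFile(io.BytesIO(package)) as archive:
            corrupt = archive.testzip()  # name of first bad member, or None
            names = archive.namelist()
    except zipfile.BadZipFile:
        return Receipt(400, False, "package is not a readable zip file")

    if corrupt is not None:
        return Receipt(400, False, f"corrupt member: {corrupt}")
    if required_member not in names:
        return Receipt(400, False, f"missing {required_member}")
    # 201 Created is the usual response to a successful POST in APP.
    return Receipt(201, True, f"{len(names)} file(s) unpacked")
```

Level 1 status (position in workflow) is deliberately left out, since the notes mark it as possibly out of scope.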

Is deposit asynchronous?


Do we need to provide a vocabulary for accepted packaging standards?

What about things that aren't packaged, e.g. pdf + metadata - would that be expected to arrive as a zip file?

What about single files where the metadata is embedded, e.g. in PDF, PDF/A or ODF? These aren't yet being used widely.

Other projects of interest for user testing

ROBOT (Stuart Lewis), The Depot (Edina - they are aware of the work) and SOURCE (David Flanders)


Level of implementation, 0 or 1? - 0 initially, moving to 1 later in the project

Reference client

web-client or desktop tool? (java/adobe apollo) - it will probably only be possible to do one of these


Evaluation of standards

The timescale of the project makes it impossible to investigate the identified standards thoroughly, but simple evaluations were made based on the expertise of the assembled group. See:

Actions and Next steps

It was agreed that work needs to move forward quickly to fit in with the project timescales. Agreeing the specification is a priority.

  • Review APP against our revised parameters, identifying
    • what APP does and doesn't support
    • what else APP does, and whether this may be of use to SWORD or whether it adds a layer of complexity beyond what is practical to implement
    • how APP might need to be extended to support additional parameters

ACTION: JA to read through APP and map to parameters, before 9th May.

  • Videoconference 9th May, 2-4pm.

ACTION: SL to organise videoconference.