Talk:CRIG ReST Workshop

From DigiRepWiki

From a set of emails on what the ReST workshop should/could be about:

To begin engagement in the ReST process we have had a pre-discussion going on that should help spark the debate from the start:

Is your objective to discuss the sorts of ReST stuff that we might want to do, or get some actual hacking done, or just to talk about ReST APIs and what that might mean for repositories? I expect that a lot of time would wind up being devoted to what URLs restful repositories ought to be exposing; this will be a nightmare of a conversation, as identifier stuff always is, but useful…

I would hope that we would be able to at least wall off some URI syntax and then get down to work. A success factor for the day would be if we could agree upon a common URI syntax for a single method, e.g. a GET syntax?

In the context of a ReSTful interface, the identifier issue is a done deal, isn't it?

Sounds like famous last words to me :)

I actually don't know, which is why I mentioned it. I guess what I mean is: do we need general forms for all types of resource and service within a repository, can we allow query strings (something like OpenURL templates), and how does it relate to the human-readable web content identifiers?

Regarding enforcement of standard URL structures across all repository systems, I think this would be disastrous. The API should reflect the application's domain model, which will be unique for each application. If you want a standard interface, write an OSID wrapper. Don't conflate service oriented architectures with standardization. They are completely separate concerns.

Agreeing on a URL syntax isn't the only approach, and I'd argue it's the wrong one.

Having a specified URL scheme means that the current web UIs for repos would have to be deprecated or that the ReSTful web services would be separate from the web site, which isn't really the point. Another disadvantage is that it could become difficult to combine web service implementations in a repository (e.g. if the repositories community decide I've got to structure my URLs one way and the e-Science community decide I've got to structure them another). I'd argue we need standard relationships and/or types we can use in RDF or Follow Your Nose links in hypertext, e.g.

<link rel="thumbnail" href="http://example.com/resource/1/thumbnail">

<myResource> <crig:hasThumbnail> <myThumbnail>

(Not that thumbnail is a particularly big deal as these things go.)
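To make the Follow Your Nose idea concrete, here is a rough Python sketch of a client that finds the thumbnail by its link relation instead of assuming a URL layout. The names LinkRelParser and find_links are made up for illustration, and the relative href and base URL are assumptions chosen to match the example.com link above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkRelParser(HTMLParser):
    """Collect href values of <link> elements, keyed by relation name."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel may hold several space-separated relation names
        for rel in (attr.get("rel") or "").split():
            absolute = urljoin(self.base_url, href)
            self.links.setdefault(rel.lower(), []).append(absolute)


def find_links(html, base_url):
    """Return a dict mapping each rel value to a list of absolute URLs."""
    parser = LinkRelParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical repository page: the client only knows the relation name.
page = '<html><head><link rel="thumbnail" href="/resource/1/thumbnail"></head></html>'
links = find_links(page, "http://example.com/resource/1")
print(links["thumbnail"])
```

The client depends only on the agreed relation name, so each repository stays free to lay out its URLs however it likes.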

This is completely compatible with publishing a WADL for the repository features, which is something else we could consider.

It all comes down to "what is a repository". We found ourselves abandoning the Web UI concepts. In fact, what we ended up implementing was a REST interface for the repository data model, and so everything to do with workflows and data validation and licensing and policies just went out of the window. This is both very powerful, and very alarming. Perhaps all this really should be separate from the web site. Perhaps the web site is a vanishingly small part of the 'way in' to a repository.

I definitely agree that your Web Service APIs should be separate from the Website. A website is a human interface. The APIs are software interfaces.

And perhaps that's the point? An abstraction of a repository is a persistent storage environment, and so CRUD is all we need. And there are many applications that we can base around CRUD. But that's not the only abstraction possible for a repository, so we could imagine a whole family of RESTful web services to cater to these different abstractions. A repository is like a blogging system, or a file system, or a publication environment or a dissemination environment.

As I understand it, when designing a REST API you should think of your web application as a set of resources, on each of which you can perform CRUD (Create, Retrieve, Update and Delete) operations. This forces you to map out your domain model by asking yourself "What kind of CRUD operations does my webapp perform?" Some examples are:

CRUD content objects (i.e. upload a file, retrieve that file, replace it with a new file, and/or delete it)

CRUD metadata containers (i.e. create a DC element set, retrieve DC metadata for some content, modify that metadata, and/or delete it)

CRUD items in a workflow (i.e. add an item to a workflow, modify its state within the workflow, remove it from the workflow)

CRUD access control policies, etc.
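As a rough illustration of the content object case, here is a minimal Python sketch of one resource collection with the four CRUD operations mapped onto HTTP verbs. It assumes the common POST/GET/PUT/DELETE mapping; the ResourceCollection class, the /content path and the in-memory storage are all hypothetical and not taken from any repository system.

```python
import itertools


class ResourceCollection:
    """In-memory stand-in for one kind of repository resource."""

    def __init__(self, base_path):
        self.base_path = base_path
        self._items = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        """Return (status, payload) for an HTTP-style request."""
        # Create: POST to the collection, answer with the new resource path
        if method == "POST" and path == self.base_path:
            item_id = str(next(self._ids))
            self._items[item_id] = body
            return 201, f"{self.base_path}/{item_id}"

        prefix = self.base_path + "/"
        if not path.startswith(prefix):
            return 404, None
        item_id = path[len(prefix):]
        if item_id not in self._items:
            return 404, None

        if method == "GET":       # Retrieve
            return 200, self._items[item_id]
        if method == "PUT":       # Update (replace the stored content)
            self._items[item_id] = body
            return 200, body
        if method == "DELETE":    # Delete
            del self._items[item_id]
            return 204, None
        return 405, None


content = ResourceCollection("/content")
status, location = content.handle("POST", "/content", b"first draft")
print(status, location)
print(content.handle("GET", location))
print(content.handle("PUT", location, b"second draft"))
print(content.handle("DELETE", location))
print(content.handle("GET", location))
```

The same shape would apply to metadata containers, workflow items or access policies: one collection per kind of resource, with the verbs doing the rest.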

In theory, if your application is already designed in a modular, decoupled manner, you already have all of these hooks and it's just a matter of defining controllers that expose them at sensible URL endpoints. However, in reality, defining a REST API often helps you refactor your code because it forces you to analyze the system from a perspective that demands proper decoupling. People often talk about DRYing (http://en.wikipedia.org/wiki/Don't_repeat_yourself) their code as part of the process of creating a sensible REST API.
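A small Python sketch of what "defining controllers that expose them at sensible URL endpoints" might look like when the hooks already exist. Everything here is hypothetical: get_metadata and set_metadata stand in for existing domain functions that know nothing about HTTP, and the /items/<id>/metadata pattern is just one possible choice of endpoint.

```python
import re

# Existing domain hooks (hypothetical): plain functions, no HTTP knowledge.
_metadata = {}


def set_metadata(item_id, fields):
    _metadata[item_id] = dict(fields)
    return _metadata[item_id]


def get_metadata(item_id):
    return _metadata.get(item_id)


# The "controller" layer is only a table from (verb, URL pattern) to a hook.
METADATA_URL = re.compile(r"^/items/(?P<item_id>[^/]+)/metadata$")

ROUTES = [
    ("GET", METADATA_URL, lambda item_id, body: get_metadata(item_id)),
    ("PUT", METADATA_URL, lambda item_id, body: set_metadata(item_id, body)),
]


def dispatch(method, path, body=None):
    """Find the first route matching verb and path, then call its hook."""
    for route_method, pattern, hook in ROUTES:
        match = pattern.match(path)
        if match and route_method == method:
            return hook(body=body, **match.groupdict())
    raise LookupError(f"no route for {method} {path}")


dispatch("PUT", "/items/42/metadata", {"title": "An example item"})
print(dispatch("GET", "/items/42/metadata"))
```

If a hook cannot be called this simply, for instance because it expects a web session or a form object, that is usually the sign that the refactoring mentioned above is needed.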

Ideal pub discussion, if you ask me...

Yippee!