UKOLN AHDS Archiving Pebble Blogs at ramble.oucs



1. About This Case Study

This case study describes the experience of migrating a small-scale blogging service once it had been recognised that the resource costs of maintaining the service outweighed the benefits.

2. Background

RAMBLE [1] was a small JISC-funded project hosted at the University of Oxford that linked mobile blogs with online learning environments. To practise what we preached, the project team maintained a project blog with many of the entries written offline and then posted from a handheld device.

The project team hosted its own blog server called Pebble [2], a feature-rich multi-user, multi-contributor blog, written as a Web application in Java and released under an open source licence. Those who have deployed it seem to be impressed. The software was well suited to the project because it supported the private blogs that were needed for personal student reflections, in addition to public blogs.

3. Problems Of Growth

When colleagues in the department (Oxford University Computing Service) heard about Pebble, they also wanted a blog. The project team encouraged this usage, but pointed out that no guarantees of service reliability could be provided: the blogging service was (as is so often the case) provided largely on goodwill. A year or so later, with blog spam escalating at an alarming rate, the project team was obliged to switch off the service.

4. Migration

But what about the blogs themselves? The Pebble Web application underlying the RAMBLE blogs was taken offline at short notice and all the blogs vanished immediately together with comments etc. Although a properly resourced service would not be abruptly terminated, this is a general issue to consider if you are providing hosting arrangements at your institution: if you are not going to maintain a blog server forever, what happens to a blog, say, when a student graduates?

A first reaction might be to develop export facilities so that students can take the content with them. Aside from the issue of standard formats for such data and what students can actually do with them (copy and paste is not really a practical option for more than a few entries), there is the perhaps greater issue of context. Even for the relatively few blogs on ramble.oucs there were subscribers to newsfeeds, trackbacks, and hyperlinks from other sites to permalinked entries. More generally, the blogs had become established in a variety of contexts, including projects, individual work patterns and daily activities.

Fortunately, Pebble's design is amenable to static archiving under the most popular Web servers: for instance, it has clean URLs, not only .html extensions for the permalinks, but also for calendar dates and so on. This was a real boon when it came to creating a usable archive.
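The property described above can be illustrated with a small check. This is only a sketch of the general idea, not part of the original migration: the function name is_statically_servable and the example URLs are hypothetical, and the rule (no query string, path ending in .html or /) is an assumption about what a plain file server can answer from disk.

```python
from urllib.parse import urlsplit


def is_statically_servable(url: str) -> bool:
    """Return True if a plain file server could answer this URL from disk."""
    parts = urlsplit(url)
    if parts.query:
        # The response depends on request parameters, so a single
        # mirrored file cannot stand in for it directly.
        return False
    path = parts.path
    # Directory-style paths can be served through an index file.
    # Otherwise this sketch accepts only .html files, as in the archive
    # described in this case study.
    return path.endswith("/") or path.endswith(".html")


# Hypothetical URLs, for illustration only.
examples = [
    "http://ramble.oucs.ox.ac.uk/blog/someblog/2006/07/22.html",
    "http://ramble.oucs.ox.ac.uk/blog/someblog/",
    "http://example.org/blog/view.jsp?entry=42",
]
for url in examples:
    print(url, is_statically_servable(url))
```

A blog engine whose permalinks rely on query strings would fail this check and would need URL rewriting rules on the Web server before a static copy could replace it.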

A brief summary of the migration steps is given below.

Step 1. Copied the blogs elsewhere temporarily

  1. We installed (deployed) a copy of the same version of Pebble on the author's Windows XP desktop PC, accessed under localhost.
  2. We stopped the Pebble Web application and copied across the Pebble blogs from the original server plus associated data, all of which are contained in the file system, the blog entries being stored as XML files.
  3. We restarted Pebble on the author's machine.
  4. We requested a few final 'farewell' messages from colleagues and posted them on their behalf.
  5. We tidied up the blog display, removing the comments and trackback decorators and some spam.

Step 2. Created the archive

  1. We created a static archive using wget (with options -r -k -l 0)
  2. We used ReplaceEm to do a recursive search and replace on references to localhost:port/path_to_blogs/, pointing them to ramble.oucs.ox.ac.uk/blog/
  3. We created a compressed archive (.tar.gz) of the generated files
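For readers without ReplaceEm (a Windows tool), the search-and-replace and packaging steps could be approximated in Python along the following lines. This is a minimal sketch and not the procedure the project team actually ran: the function names, the TEXT_SUFFIXES set and the directory layout are all assumptions made for illustration.

```python
import tarfile
from pathlib import Path

# Assumption: only these file types contain links that need rewriting.
TEXT_SUFFIXES = {".html", ".xml", ".css"}


def rewrite_base_url(root: Path, old: str, new: str) -> int:
    """Replace `old` with `new` in text files under `root`.

    Returns the number of files that were changed.
    """
    old_b, new_b = old.encode("utf-8"), new.encode("utf-8")
    changed = 0
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        # Work on bytes so that pages in unexpected encodings survive intact.
        data = path.read_bytes()
        if old_b in data:
            path.write_bytes(data.replace(old_b, new_b))
            changed += 1
    return changed


def make_archive(root: Path, archive: Path) -> None:
    """Pack the directory `root` into a gzip-compressed tar file."""
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the directory name, not the full local path.
        tar.add(root, arcname=root.name)
```

A call would take the form rewrite_base_url(Path("mirror"), "localhost:port/path_to_blogs/", "ramble.oucs.ox.ac.uk/blog/"), with the first two arguments replaced by the local values, followed by make_archive(Path("mirror"), Path("mirror.tar.gz")). The archive file should be written outside the directory being packed so that it does not include itself.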

Step 3. Deployed the archive

  1. We had been running Tomcat under Apache, with ramble.oucs as a virtual host. We removed the tie between Apache and Tomcat on the server (specifically, we removed the references to the blog directories in mod_jk2's workers2.properties file).
  2. We created a blogs directory within Apache's htdocs space for the virtual hosting of ramble.oucs.
  3. We copied over and unpacked the .tar.gz file.
  4. We checked the result at <http://ramble.oucs.ox.ac.uk/blog/>.
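The unpacking and checking steps could be supplemented with an automated check for links that still point at the temporary copy. The following is a sketch under stated assumptions rather than what was done on the server: the function names are hypothetical, and searching for the string "localhost" is just one plausible way of spotting references that the search and replace missed.

```python
import tarfile
from pathlib import Path


def unpack(archive: Path, docroot: Path) -> None:
    """Extract a .tar.gz archive into the Web server's document directory."""
    docroot.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Newer Python versions offer extraction filters that reject
        # unsafe member paths; use one when it is available.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(docroot, **kwargs)


def find_stale_references(root: Path, needle: str = "localhost") -> list[Path]:
    """List files under `root` that still mention the temporary host."""
    needle_b = needle.encode("utf-8")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and needle_b in p.read_bytes()
    )
```

An empty list from find_stale_references suggests that the rewritten links no longer refer to the desktop copy, although it does not prove that every link resolves; viewing the pages in a browser remains the final check.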

The results are not perfect and there are probably other viable approaches, but this has been a good outcome: a great deal has been preserved in context, and other interested parties may benefit from our experiences.

5. Lessons Learnt

Based on this case study the following lessons can be learnt:

About This Document

This document was originally published as a blog posting [3] by Paul Trafford. It was subsequently edited by Brian Kelly, UKOLN and published as a QA Focus case study. UKOLN is grateful to Paul Trafford for permission to edit and republish this article.

References

  1. RAMBLE, Oxford University Computing Service,
    <http://ramble.oucs.ox.ac.uk/>
  2. Pebble, Sourceforge,
    <http://pebble.sourceforge.net/>
  3. Archiving Pebble blogs at ramble.oucs, Paul Trafford, 22 July 2006, Connect.educause.edu,
    <http://connect.educause.edu/blog/pault/archiving_pebble_blogs_at_ramble_oucs/2449>

Contact Details

Paul Trafford
Oxford University Computing Service
13 Banbury Road
University of Oxford
Oxford
OX2 6NN
Email: paul.trafford@oucs.ox.ac.uk
URL: <http://users.ox.ac.uk/~pault/>