"A centre of excellence in digital information management, providing advice and services to the library, information and cultural heritage communities."

UKOLN is based at the University of Bath.

Enhanced Tagging for Discovery (EnTag)


1) Intute demonstrator

The Intute demonstrator provides access to the Dewey Decimal Classification (DDC) and offers different options for linking DDC to social tagging. A search system was also designed to illustrate search scenarios. The demonstrator is now available, its main output being the Enhanced Tagger. Please consult the settings document for the hardware and software requirements, and the training document for help with using the demonstrator.

The Intute demonstrator comprises three major interfaces: searching, simple tagging, and enhanced tagging. Once a user logs in, they arrive at the searching interface (Figure 1). The searching interface provides the following features:

1) Main Tag Cloud: a tag cloud in which each tag is linked to the documents it was assigned to. It is an alphabetical list of all tags in the demonstrator, with font sizes that reflect each tag's popularity. The Filter By drop-down menu at the top offers the My Tags option, which shows only the current tagger’s tags. By default everyone’s tags are shown (Everyone’s Tags).
2) Taggers: a cloud of names of taggers linked to documents they indexed.
3) A free-text search box, with an option to limit searching to tags, title and description fields.
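
The Main Tag Cloud behaviour described above (alphabetical order, font size relative to popularity, and a My Tags filter) can be illustrated with a minimal sketch. It is not the demonstrator's code: the (tagger, document, tag) tuple shape, the function names, the linear scaling rule and the font-size range are all assumptions made for illustration.

```python
from collections import Counter


def build_tag_cloud(assignments, min_size=10, max_size=24):
    """Return (tag, font_size) pairs sorted alphabetically.

    `assignments` is an iterable of (tagger, document_id, tag) tuples;
    this shape is hypothetical. Font size is scaled linearly between
    min_size and max_size according to how often a tag was used.
    """
    counts = Counter(tag.lower() for _, _, tag in assignments)
    if not counts:
        return []
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    cloud = []
    for tag in sorted(counts):
        if span == 0:
            size = min_size  # all tags equally popular
        else:
            size = min_size + (counts[tag] - low) * (max_size - min_size) // span
        cloud.append((tag, size))
    return cloud


def filter_by_tagger(assignments, tagger):
    """Keep only one tagger's assignments (the 'My Tags' view)."""
    return [a for a in assignments if a[0] == tagger]


# Made-up sample data.
sample = [
    ("alice", 1, "physics"),
    ("bob", 1, "physics"),
    ("alice", 2, "algebra"),
    ("bob", 3, "physics"),
    ("bob", 3, "geometry"),
]
everyone = build_tag_cloud(sample)                          # Everyone's Tags
my_tags = build_tag_cloud(filter_by_tagger(sample, "alice"))  # My Tags
```

With this sample, "physics" is used three times and receives the largest size in the Everyone's Tags view, while in the filtered view both of the tagger's tags have equal weight.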

The documents found are shown in the Results pane. They are automatically ranked according to the MySQL full-text natural language search. The title, description (from Intute) and existing tags of each document are shown here. Clicking on the title opens the URL in a new window.

Figure 1. Intute demonstrator’s searching interface

Once a document is selected from the search results, clicking on the “Tag” button opens a tagging interface, where the document's title, URL, and description are displayed. Two tagging interfaces are provided: 1) the Simple Tagger, with tagging features common in popular social tagging applications; and 2) the Enhanced Tagger, with additional suggestions from the controlled vocabulary. (In the study, the log-on screen offers the choice between the two interfaces.) Both tagging interfaces provide the following sources from which to select tags (Figure 2):

1) Main Tag Cloud.
2) Taggers: names of taggers linked to the tags they have used, which are listed in the All {Tagger Name}’s Tags pane.
3) My Tags For This Document.

Clicking on a tag places it in the text box. Pressing the Tag Document button adds the tag to the document and lists it in the “My Tags for This Document” pane. A tag can also be typed in directly.

The Enhanced Tagger (Figure 2) additionally provides suggestions from the controlled vocabulary, presented in three panes at the bottom of the screen. The first pane on the left (“Automatically suggested matches, Find appropriate context(s)”) lists DDC classes. When the user clicks the Suggest button, these classes are derived automatically by a string-matching comparison of the DDC vocabulary against the term entered in the text box above the panes. As soon as the user arrives at the enhanced tagging page, initial suggestions are generated automatically by treating the document’s title as if it had been entered as a tag.
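
As a rough illustration of the string-matching step, the sketch below compares a user-entered term (or a document title) against class captions. The source does not specify the matching rule, so the word-level substring test, the minimum word length, the function name and the tiny vocabulary sample are all assumptions; the real demonstrator works on the full DDC vocabulary.

```python
# A tiny, illustrative subset of DDC notations and captions.
DDC_SAMPLE = {
    "500": "Science",
    "510": "Mathematics",
    "512": "Algebra",
    "516": "Geometry",
    "530": "Physics",
    "540": "Chemistry",
}


def suggest_classes(term, vocabulary=DDC_SAMPLE, min_word_length=4):
    """Return (notation, caption) pairs whose caption contains any
    sufficiently long word of `term`, compared case-insensitively.

    Short words are skipped to avoid accidental matches; this rule is
    an assumption, not the demonstrator's documented behaviour.
    """
    words = [w for w in term.lower().split() if len(w) >= min_word_length]
    matches = []
    for notation, caption in sorted(vocabulary.items()):
        caption_lower = caption.lower()
        if any(word in caption_lower for word in words):
            matches.append((notation, caption))
    return matches


# A user-entered term, as submitted with the Suggest button.
from_term = suggest_classes("physic")

# Initial suggestions: the document title is treated like a tag.
from_title = suggest_classes("Introduction to Algebra and Geometry")
```

Here the title yields the classes for Algebra and Geometry, which would populate the first pane before the user has typed anything.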

If the user clicks on one of the listed classes from the first pane, its narrower and broader classes are shown in the second pane (“Explore hierarchy around the selected context”), allowing interactive browsing of the hierarchical context. Simultaneously, in the third pane (“Select/edit relevant tags”) a tag-cloud-like list of DDC captions, DDC relative index terms and LCSH mapped terms is presented as a source of suggestions from which the user may select a tag. Selecting a tag copies it to the text box, where it can be further edited; pressing the Tag Document button adds the tag to the document.
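
The hierarchy browsing in the second pane amounts to looking up the broader and narrower classes of a selected class. The sketch below shows one way this could be modelled, assuming a simple child-to-parent map; the map, the function names and the small sample of DDC notations are illustrative assumptions, not the demonstrator's data model.

```python
# Hypothetical child -> parent map over a small sample of DDC notations
# (500 Science, 510 Mathematics, 512 Algebra, 516 Geometry, 530 Physics).
BROADER = {
    "510": "500",
    "530": "500",
    "512": "510",
    "516": "510",
}


def broader_chain(notation):
    """Return the broader classes of `notation`, nearest first."""
    chain = []
    current = BROADER.get(notation)
    while current is not None:
        chain.append(current)
        current = BROADER.get(current)
    return chain


def narrower(notation):
    """Return the immediate narrower classes of `notation`, sorted."""
    return sorted(child for child, parent in BROADER.items() if parent == notation)


# Context shown when the user selects class 510 in the first pane.
selected = "510"
context = {"broader": broader_chain(selected), "narrower": narrower(selected)}
```

Selecting 510 in this sample gives 500 as the broader class and 512 and 516 as narrower classes, which is the context the user would browse interactively.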

Figure 2. Intute demonstrator’s enhanced tagging interface

2) STFC demonstrator

The STFC demonstrator extends the existing author tagging system of the STFC ePublication Archive along social tagging lines.

The Tagger interface is supplied in conjunction with the ePubs metadata editing tool so that the authors of a publication can enter tags for it. Figure 3 shows a screenshot of a typical tagging screen.

Figure 3. STFC interface.

The screen is divided into four main areas:

1) At the top-centre, the title and abstract of the publication selected for tagging are displayed.
2) At the bottom-centre, a browse interface for the controlled vocabulary is shown. Initially top-level terms are shown. Clicking on a term will show its narrower and related terms. The current path to the top of the hierarchy is always shown as a ‘breadcrumb’ trail along the top of the hierarchy. Terms can be selected as tags by clicking on the “+” symbol to the left of each term. Apart from browsing the controlled vocabulary, one can also search it (the “Search thesauri” link).
3) To the left, a tag cloud is displayed, with tags ranked in descending order of use frequency. Tags can be selected either by clicking on them (if they are free-text terms) or by clicking on the spyglass symbol to their left (if they are from the controlled vocabulary). This enters them into the “Add” term box to the right of the screen, where they can be accepted as tags for the paper. By default the tag cloud shows the terms used by the current author; “Show all” displays all authors’ tags.
4) To the right, the currently selected free-text terms (top) and controlled vocabulary terms (bottom) are shown. They can be deselected by clicking on the “—” sign to their left. In the centre of the panel there is a free-text box where the user can enter free-text terms. Multiple terms can be entered by separating them with commas.
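
Two of the behaviours described above, entering several free-text terms separated by commas and ranking the tag cloud by descending use frequency, can be sketched briefly. The demonstrator itself is a Java and XML application, so this Python sketch only illustrates the logic; the function names, the whitespace and duplicate handling, and the alphabetical tie-breaking are assumptions.

```python
from collections import Counter


def parse_free_text_tags(raw):
    """Split a comma-separated entry into clean, de-duplicated tags.

    Whitespace is normalised, empty entries are dropped, and duplicates
    are detected case-insensitively (all assumptions).
    """
    seen = set()
    tags = []
    for part in raw.split(","):
        tag = " ".join(part.split())  # collapse runs of whitespace
        key = tag.lower()
        if tag and key not in seen:
            seen.add(key)
            tags.append(tag)
    return tags


def rank_tags(tag_uses):
    """Return (tag, count) pairs in descending order of use frequency,
    breaking ties alphabetically."""
    counts = Counter(tag_uses)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up examples.
entered = parse_free_text_tags("grid computing, , Metadata,grid  computing")
ranked = rank_tags(["metadata", "grid", "metadata", "xml", "grid", "metadata"])
```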

Once a suitable selection of tags has been made using the tool, the user can accept them by clicking on the “Confirm” button at the bottom of the screen.

The ACM Computing Classification Scheme was chosen as the controlled vocabulary. Since its main purpose is to classify papers submitted to the various ACM journals, it is widely known and carries authority within the computing community. It was first converted to SKOS and then imported into the tool.

The STFC demonstrator is an Apache Cocoon application that combines Java and XML techniques, with Oracle as the underlying database. It links dynamically to the STFC ePubs institutional repository: once the user performs a search in the repository, a specially adapted edit mode offers an option to enter the Tagger system. When the system is started, the title and abstract of the work are transferred together with any existing free tags.

Currently the demonstrator is behind the STFC firewall and is not publicly available. STFC are considering whether to develop the system further for use within the ePubs production system.