
DESIRE: Development of a European Service for Information on Research and Education

Ranking by Quality within ROADS

Demonstrator

A demonstrator for quality-based ranking in ROADS is available: DESIRE Quality Ranking Demonstrator. You may wish to view the demo in a separate window while reading this page.

The demonstrator cross-searches SOSIG, ranking the results according to ratings held in another ROADS database. For the purposes of this demonstrator, ratings were generated automatically for the resources returned by a search for the term 'cognitive'. The generated ratings were based on the results of the WebSAT metrics tool [WEBSAT] (output as IAFA templates). In practice, human-generated ratings are expected to provide the basis for ranking.

Choice of Ranking Algorithms in ROADS

The standard ROADS distribution uses a simple ranking algorithm based on the frequency with which the search terms appear in the record describing each resource. The suggested way of changing the ranking algorithm is to replace the Rank.pm module with your own. A more flexible approach was required for quality-based ranking since we want to offer a choice of ranking algorithms based on different quality criteria.
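
As a rough illustration of frequency-based ranking, the following Python sketch counts occurrences of the search terms in each record and orders the records by that count. It is not the code in Rank.pm (which is Perl and is not reproduced here); the function names, the record layout, and the sample handles are all assumptions made for the example.

```python
import re


def term_frequency_score(record_text, search_terms):
    """Count case-insensitive occurrences of the search terms in a record."""
    words = re.findall(r"\w+", record_text.lower())
    wanted = {term.lower() for term in search_terms}
    return sum(1 for word in words if word in wanted)


def rank_by_term_frequency(records, search_terms):
    """Return record handles ordered from highest to lowest score."""
    scores = {
        handle: term_frequency_score(text, search_terms)
        for handle, text in records.items()
    }
    return sorted(scores, key=lambda handle: scores[handle], reverse=True)


# Hypothetical records keyed by handle; the text stands in for a template.
records = {
    "H1": "Cognitive science resources",
    "H2": "Cognitive psychology and cognitive neuroscience links",
    "H3": "Social policy archive",
}
print(rank_by_term_frequency(records, ["cognitive"]))
```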

An approach for adding flexible ranking to ROADS is described in 'Offering a Choice of Ranking Algorithms in ROADS'.

With the above technique in place, it was possible to implement a number of ranking algorithms and let the user choose between them on the search page.
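
The mechanism ROADS actually uses is described in the document referred to above. Purely to illustrate the idea of choosing a ranking algorithm by name at search time, here is a minimal Python sketch using a lookup table; the algorithm names, the result fields, and the fallback behaviour are assumptions, not part of ROADS.

```python
def rank_by_title(results):
    """Order results alphabetically by title (case-insensitive)."""
    return sorted(results, key=lambda result: result["title"].lower())


def rank_by_size(results):
    """Order results from smallest to largest document size."""
    return sorted(results, key=lambda result: result["size"])


# Hypothetical registry: the key is the value submitted by the search form.
RANKING_ALGORITHMS = {
    "title": rank_by_title,
    "size": rank_by_size,
}


def rank(results, algorithm_name, default="title"):
    """Apply the named algorithm, falling back to the default if unknown."""
    algorithm = RANKING_ALGORITHMS.get(algorithm_name, RANKING_ALGORITHMS[default])
    return algorithm(results)


results = [
    {"title": "Beta", "size": 300},
    {"title": "alpha", "size": 1200},
]
print([result["title"] for result in rank(results, "size")])
```

Adding a new algorithm then means adding one function and one registry entry, without touching the search code.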

Basic Approach to Quality Ranking

The basic approach to quality ranking assumes that the 'score' for a resource is obtained from the sum of one or more weighted quality attributes. In the simplest case, attribute values are numeric, although a particular ranking method might convert textual values to corresponding numbers.

Since many ranking algorithms, quality-related or otherwise, can be based on the weighting approach this functionality has been separated out and can be used by any ranking algorithm.

The ranking algorithm requires a set of attribute-value pairs and a set of attribute-weighting pairs. For each attribute in the attribute-weighting set the corresponding value is weighted and added to a running total.

This allows simple ranking based on a single attribute (such as the size of the document) as well as more complex ratings, such as an accessibility rating based on a number of criteria.
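
The weighted-sum calculation described above can be sketched in a few lines of Python. The function name is hypothetical, and treating an attribute with no rating as 0 is an assumption; the page does not say how missing values are handled.

```python
def weighted_score(values, weightings, missing=0.0):
    """Sum weight * value over every attribute in the weighting set.

    Attributes with no rating contribute `missing` (an assumption).
    """
    total = 0.0
    for attribute, weight in weightings.items():
        total += weight * float(values.get(attribute, missing))
    return total


# Weightings taken from the accessibility1.wtg example on this page.
weightings = {
    "WebSATAccessibility-IMGNoAltNotLink": 1,
    "WebSATAccessibility-IMGNoAltLink": 2,
    "WebSATAccessibility-APPLETNoAlt": 1,
    "WebSATAccessibility-IMAGEMAPNoTextAnchor": 1,
}

# Made-up ratings for a single resource.
values = {
    "WebSATAccessibility-IMGNoAltNotLink": 3,
    "WebSATAccessibility-IMGNoAltLink": 1,
    "WebSATAccessibility-APPLETNoAlt": 0,
    "WebSATAccessibility-IMAGEMAPNoTextAnchor": 2,
}

print(weighted_score(values, weightings))  # 3*1 + 1*2 + 0*1 + 2*1 = 7.0
```

Ranking on a single attribute is the special case where the weighting set contains one entry.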

The issue then becomes how to obtain the attribute-weighting pairs and how to obtain the attribute-value pairs (the ratings).

Obtaining Attribute-Weighting Pairs

Weightings are stored in a 'ranking' directory within the config directory. Files have the extension '.wtg' and the name of the file is used to identify the weighting. Example accessibility1.wtg:

WebSATAccessibility-IMGNoAltNotLink: 1
WebSATAccessibility-IMGNoAltLink: 2
WebSATAccessibility-APPLETNoAlt: 1
WebSATAccessibility-IMAGEMAPNoTextAnchor: 1
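
A file in this format could be read along the following lines. This is a Python sketch rather than the ROADS implementation; it assumes one "attribute: weight" pair per line, numeric weights, and that blank lines can be ignored. The function names and the example path are hypothetical.

```python
from pathlib import Path


def parse_weightings(text):
    """Parse 'attribute: weight' lines into a dict of float weights.

    A non-blank line without a numeric weight raises ValueError.
    """
    weightings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        attribute, _, weight = line.partition(":")
        weightings[attribute.strip()] = float(weight)
    return weightings


def weighting_name(path):
    """The file name without '.wtg' identifies the weighting."""
    return Path(path).stem


sample = """\
WebSATAccessibility-IMGNoAltNotLink: 1
WebSATAccessibility-IMGNoAltLink: 2
WebSATAccessibility-APPLETNoAlt: 1
WebSATAccessibility-IMAGEMAPNoTextAnchor: 1
"""

print(weighting_name("config/ranking/accessibility1.wtg"))
print(parse_weightings(sample))
```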

Obtaining Ratings

Ratings may be obtained in a number of ways: they may be stored along with resource discovery metadata, they may be stored in an external ROADS database, or they may be stored in an external ratings bureau accessible via HTTP.

The demonstrator includes an example of each of the above. Each has a corresponding ranking algorithm, with its own weighting, that can be selected from the search page.

Quality Metadata with Resource Discovery Metadata

A simple and quick way of making quality attributes available is to store them in the same template as resource discovery metadata. This may be appropriate if resources are assessed for quality as they are catalogued.

It may also be appropriate if resource discovery metadata and quality ratings are harvested from other sources and can be automatically combined.

Problems occur if ratings can be provided by multiple parties. In this case, values for the same attribute from different parties will need to be distinguished.

Quality Metadata in another ROADS database

A more flexible option is to store quality ratings in a separate ROADS database - different ranking algorithms might even get quality ratings from different databases.

Accessing the templates from the external database can be by ROADS handle if the databases are coordinated. If not, URIs can be used if using a version of ROADS that has been extended to allow templates to be retrieved by URI.

Note that a single WHOIS++ query is generated to retrieve ratings for all resources in the result set. This gives a much faster response than the alternative of generating an individual WHOIS++ query for each result.
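
The batching idea can be sketched as follows in Python. The query text is only loosely modelled on WHOIS++ boolean queries and has not been checked against what ROADS actually sends; the function names and the template layout are likewise assumptions.

```python
def build_batch_query(handles):
    """Combine all handles into one boolean query string (illustrative syntax)."""
    return " or ".join(f"handle={handle}" for handle in handles)


def attach_ratings(result_handles, rating_templates):
    """Pair each result with its ratings template, or None if none came back."""
    by_handle = {template["handle"]: template for template in rating_templates}
    return {handle: by_handle.get(handle) for handle in result_handles}


# Hypothetical handles from a search result set.
handles = ["A1", "B2", "C3"]
print(build_batch_query(handles))

# Hypothetical response: the ratings database knows two of the three resources.
templates = [
    {"handle": "A1", "WebSATAccessibility-IMGNoAltLink": 2},
    {"handle": "C3", "WebSATAccessibility-IMGNoAltLink": 0},
]
print(attach_ratings(handles, templates))
```

One round trip then serves the whole result set, and resources without ratings are still present in the mapping so the ranking algorithm can decide how to treat them.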

Quality Metadata accessible as RDF/XML over HTTP

A more general approach would be for the ranking algorithm to request ratings by URI over HTTP from a ratings bureau, which returns them in RDF/XML format.

Note that a gateway can be placed in front of a ROADS installation to allow a ROADS database to act as a ratings bureau. This approach is used within the demonstrator.

An HTTP request is generated for every template in the result set. Obviously it would be quicker to get all ratings in a single HTTP request. This would require the ratings bureau to support such an operation. This functionality has not been implemented for client or server within this deliverable but could be added in the future.
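
To make the request-per-resource pattern concrete, the Python sketch below builds a request URL for one resource and parses an RDF/XML response into attribute-value pairs. The bureau address, the `uri` query parameter, the `q:` ratings vocabulary, and the sample document are all hypothetical; only the RDF namespace is real. No network call is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def ratings_request_url(bureau_url, resource_uri):
    """Build the URL for requesting one resource's ratings (assumed interface)."""
    return f"{bureau_url}?{urlencode({'uri': resource_uri})}"


def parse_ratings(rdf_xml):
    """Map each described URI to a dict of numeric ratings."""
    root = ET.fromstring(rdf_xml)
    ratings = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        uri = description.get(f"{{{RDF_NS}}}about")
        values = {}
        for child in description:
            name = child.tag.split("}", 1)[-1]  # drop the namespace part
            values[name] = float(child.text)
        ratings[uri] = values
    return ratings


SAMPLE_RESPONSE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:q="http://example.org/quality#">
  <rdf:Description rdf:about="http://example.org/resource">
    <q:IMGNoAltLink>2</q:IMGNoAltLink>
    <q:APPLETNoAlt>0</q:APPLETNoAlt>
  </rdf:Description>
</rdf:RDF>"""

print(ratings_request_url("http://bureau.example/ratings", "http://example.org/resource"))
print(parse_ratings(SAMPLE_RESPONSE))
```

Because `parse_ratings` already accepts any number of `rdf:Description` elements, the same parser would serve a bureau that returned ratings for several resources in one response.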

References

[WEBSAT] - WebSAT Quality Metrics Tool - http://zing.ncsl.nist.gov/~webmet/sat/websat-info.html


Maintained by: Tracy Gardner of UKOLN, the UK Office for Library and Information Networking, University of Bath.
Document created: 19-Apr-1999.
Last updated: 15-Jun-1999.
