UKOLN AHDS QA Focus Briefing Documents: Print All



This page is for printing out all of the briefing papers. The briefing papers are given in numerical order. Note that some of the internal links may not work.


Briefing 01

Compliance with HTML Standards


Why Bother?

Compliance with HTML standards is needed for a number of reasons:

Which Standards?

The World Wide Web Consortium (W3C) recommends use of the XHTML 1.0 (or higher) standard. This has the advantage of being an XML application (allowing use of XML tools) and can be rendered by most browsers. However, the authoring tools in use may not yet produce XHTML, in which case HTML 4.0 may be used.

Cascading style sheets (CSS) should be used in conjunction with XHTML/HTML to describe the appearance of Web resources.
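As an illustration, a minimal XHTML 1.0 Strict page with a linked style sheet might look like the following sketch; the file name style.css and the page content are examples only:

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Example Page</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <!-- Appearance is described in an external style sheet -->
    <link rel="stylesheet" type="text/css" href="style.css" />
  </head>
  <body>
    <p>Content is marked up in XHTML; its appearance is controlled by CSS.</p>
  </body>
  </html>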

Approaches To Creating Resources

Web resources may be created in a number of ways. Often HTML authoring tools such as Dreamweaver, FrontPage, etc. are used, although experienced HTML authors may prefer to use a simple editing tool. Another approach is to make use of a Content Management System. An alternative approach is to convert proprietary file formats (e.g. MS Word or PowerPoint). In addition, proprietary formats are sometimes not converted but are stored in their native format.

Monitoring Compliance

A number of approaches may be taken to monitoring compliance with HTML standards. For example you can make use of validation features provided by modern HTML authoring tools, use desktop compliance tools or Web-based compliance tools.

The different types of tools can be used in different ways. Tools which are integrated with an HTML authoring tool should be used by the page author. It is important that the author is trained to use such tools and runs them on a regular basis. It should be noted that it may be difficult to address systematic errors (e.g. all files missing the DOCTYPE declaration) with this approach.

A popular approach is to make use of SSIs (server-side includes) to retrieve common features (such as headers, footers, navigation bars, etc.). This can be useful for storing HTML elements (such as the DOCTYPE declaration) in a manageable form. However this may cause validation problems if the SSI is not processed.
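For example, with server-side includes enabled, shared fragments might be pulled in along the following lines; the include file names here are purely illustrative:

  <!--#include virtual="/includes/doctype.html" -->
  <html>
  <head>
    <title>Example Page</title>
    <!--#include virtual="/includes/metadata.html" -->
  </head>
  <body>
    <!--#include virtual="/includes/header.html" -->
    <p>Page content goes here.</p>
    <!--#include virtual="/includes/footer.html" -->
  </body>
  </html>

Note that if such a page is validated as a static file, without the includes being processed, the validator will report errors such as a missing DOCTYPE; the served output, rather than the source file, should be validated.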

Another approach is to make use of a Content Management System (CMS) or similar server-side technique, such as retrieving resources from a database. In this case it is essential that the template used by the CMS complies with standards.

It may be felt necessary to separate the compliance process from the page authoring. In such cases use of a dedicated HTML checker may be needed. Such tools are often used in batch, to validate multiple files. In many cases voluminous warnings and error messages may be provided. This information may provide indications of systematic errors which should be addressed in workflow processes.

An alternative approach is to use Web-based checking services. An advantage of this approach is that the service may be used in a number of ways: it may be used directly, by entering the URL of the resource to be validated, or live access to the checking service may be provided by linking from a validation icon, as used at <http://www.ukoln.ac.uk/qa-focus/> and shown in Figure 1. (This approach could be combined with use of cookies or other techniques so that the icon is only displayed to an administrator.)

Figure 1: Using icons as link to validation service
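A link of the kind shown in Figure 1 can be implemented with markup along the following lines; the icon file name is illustrative, and the check?uri=referer form asks the W3C MarkUp Validation Service to validate the page from which the link was followed:

  <p>
    <a href="http://validator.w3.org/check?uri=referer">
      <img src="/images/valid-xhtml10.gif"
           alt="Validate this page with the W3C MarkUp Validation Service" />
    </a>
  </p>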

Another approach is to configure your Web server so that users can access the validation service by appending an option to the URL. For further information on this technique see the QA Focus briefing document A URI Interface To Web Testing Tools at <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-59/>. This technique can be deployed with a simple option in your Web server's configuration file.


Briefing 02

Use of Automated Tools For Testing Web Site Accessibility


Accessibility And Web Sites

It is desirable to maximise the accessibility of Web sites in order to ensure that Web resources can be accessed by people with a range of disabilities, who may need to use specialist browsers (such as speaking browsers) or to configure their browser to enhance the usability of Web sites (e.g. by changing font sizes, colours, etc.).

Web sites which are designed to maximise accessibility should also be more usable generally (e.g. for use with PDAs) and are likely to be more easily processed by automated tools.

Accessibility Testing Tools

Although the development of accessible Web sites will be helped by use of appropriate templates and can be managed by Content Management Systems (CMSs), there will still be a need to test the accessibility of Web sites.

Full testing of accessibility will require manual testing, ideally making use of users who have disabilities. The testing should address the usability of a Web site as well as its accessibility.

Manual testing can however be complemented with use of automated accessibility checking tools. This document covers the use of automated accessibility checking tools.

Accessibility Guidelines

The W3C WAI (Web Accessibility Initiative) have developed guidelines on the accessibility of Web resources. Many institutions are making use of the WAI guidelines and will seek to achieve compliance with the guidelines to A, AA or AAA standards. Many testing tools will measure the compliance of resources with these guidelines.

Examples Of Automated Accessibility Checking Tools

The best-known accessibility checking tool was Bobby, which was renamed WebXact: a Web-based tool for reporting on the accessibility of a Web page and its compliance with W3C's WAI guidelines. However this tool is no longer available.

HiSoftware's Cynthia Says provides an alternative accessibility and Web site checking facility - see <http://www.contentquality.com/>.

Note that it can be useful to make use of multiple checking tools. W3C WAI provides a list of accessibility testing tools at <http://www.w3.org/WAI/ER/tools/>.

Typical Errors Flagged By Automated Tools

When you use testing tools, warnings and errors will be provided about the accessibility of your Web site. A summary of the most common messages is given below, followed by a short markup sketch which avoids several of them.

No DOCTYPE
HTML resources must contain a DOCTYPE declaration at the top of the HTML file, which defines the version of HTML used; it is required for compliance with HTML standards. Ideally this will be provided in the HTML template used by authors.
No Character Encoding
HTML resources should declare the character encoding of the document. Ideally this will be provided in the HTML template used by authors.
No ALT Text
The alt attribute is used to provide a textual description of an image. In order to comply with HTML standards an alt attribute must be provided for all images.
Use Relative Sizes And Positioning Rather Than Absolute
Many HTML features can accept relative or absolute size units. In order to ensure that resources can be sized properly on non-standard devices, relative values should be used.
Link Phrase Used More Than Once
If multiple links on a page have the same link text, the links should point to the same resource.
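The following fragment illustrates several of the points above: a character encoding declaration, alternative text for an image, and relative rather than absolute sizing. The file names and values are examples only; the DOCTYPE declaration and <html> element (see Briefing 01) are omitted for brevity:

  <head>
    <title>Project Report</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <style type="text/css">
      /* Relative units allow text and layout to scale with user settings */
      body { font-size: 100%; }
      #content { width: 90%; }
    </style>
  </head>
  <body>
    <div id="content">
      <!-- The alt attribute provides a textual description of the image -->
      <img src="visits-chart.png" alt="Bar chart showing Web site visits per month" />
    </div>
  </body>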

Caveats

As mentioned previously, automated testing tools cannot by themselves confirm that a resource is accessible - manual testing will be required to complement an automated approach. However automated tools can be used to provide an overall picture, to identify areas in which manual testing may be required and to identify problems in templates or in the workflow process for producing HTML resources, as well as areas in which training and education may be needed.

Note that automated tools may sometimes give inaccurate or misleading results. In particular:

Use Of Frames
HTML resources which use frames may be incorrectly analysed by automated tools. You should ensure that the frameset page itself and all individual framed pages are accessible.
Use Of Redirects
HTML resources which use redirects may be incorrectly analysed by automated tools. You should ensure that both the original page and the destination are accessible. Remember that redirects can be implemented in a number of ways, including server configuration options and use of JavaScript, <meta> tags, etc. on a Web page (a minimal <meta> example is given after this list).
Use Of JavaScript
HTML resources which use JavaScript may be incorrectly analysed by automated tools. You should ensure that the source page itself and the output from the JavaScript are accessible.
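As an illustration of the <meta> approach mentioned under Use Of Redirects above, a page which redirects after a short delay might look like the following; the destination URL is hypothetical, and both this page and the destination need to be checked:

  <head>
    <title>This page has moved</title>
    <!-- Redirect to the new location after 5 seconds -->
    <meta http-equiv="refresh" content="5; url=http://www.example.org/new-location/" />
  </head>
  <body>
    <p>This page has moved to
       <a href="http://www.example.org/new-location/">a new location</a>.</p>
  </body>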

Briefing 03

Use Of Proprietary Formats On Web Sites


Use Of Proprietary Formats

Although it is desirable to make use of open standards such as HTML when providing access to resources on Web sites there may be occasions when it is felt necessary to use proprietary formats. For example:

URL Naming Conventions For Access To Proprietary Formats

If it is necessary to provide access to a proprietary file format you should not cite the URL of the proprietary file directly. Instead you should give the URL of a native Web resource, typically an HTML page. The HTML page can provide additional information about the proprietary format, such as the format type, version details, file size, etc. If the resource is made available in an open format at a later date, the HTML page can be updated to provide access to the open format - this would not be possible if the URL of the proprietary file were used.

An example of this approach is illustrated. In this case access to MS PowerPoint slides is provided from an HTML page. The link to the file includes information on the PowerPoint version details.
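Such a page might describe the file along the following lines; the file name, version and size shown are illustrative only:

  <p>
    The slides from the project launch are available as an
    <a href="slides/launch.ppt">MS PowerPoint 97/2000 file</a>
    (24 slides, approximately 300 KB).
  </p>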

Converting Proprietary Formats

Various tools may be available to convert resources from a proprietary format to HTML. Many authoring tools nowadays will enable resources to be exported to HTML format. However the HTML may not comply with HTML standards or use CSS and it may not be possible to control the look-and-feel of the generated resource.

Another approach is to use a specialist conversion tool which may provide greater control over the appearance of the output, ensure compliance with HTML standards, make use of CSS, etc.

If you use a tool to convert a resource to HTML it is advisable to store the generated resource in its own directory in order to be able to manage the master resource and its surrogate separately.

You should also note that some conversion tools can be used dynamically, allowing a proprietary format to be converted to HTML on-the-fly.

MS Word

MS Word files can be saved as HTML from within MS Word itself. However the HTML that is created is of poor quality, often including proprietary or deprecated HTML elements and using CSS in a form which is difficult to reuse.

MS PowerPoint

MS PowerPoint files can be saved as HTML from within MS PowerPoint itself. However the Save As option provides little control over the output. The recommended approach is to use the Save As Web Page option and then to choose the Publish button. You should then ensure that the HTML can be read by all browsers (and not just IE 4.0 or later). You should also ensure that the file has a meaningful title and that the output is stored in its own directory.

Dynamic Conversion

In some circumstances it may be possible to provide a link to an online conversion service. Use of Adobe's online conversion service for converting files from PDF is illustrated.

It should be noted that this approach may result in a loss of quality from the original resource and is dependent on the availability of the remote service. However in certain circumstances it may be useful.


Briefing 04

Mothballing Your Web Site


About This Document

When the funding for a project finishes it is normally expected that the project's Web site will continue to be available in order to ensure that information about the project, the project's findings, reports, deliverables, etc. are still available.

This document provides advice on "mothballing" a project Web site.

Web Site Content

The entry point for the project Web site should make it clear that the project has finished and that there is no guarantee that the Web site will be maintained.

You should seek to ensure that dates on the Web site include the year - avoid content which says, for example, "The next project meeting will be held on 22 May".

You may also find it useful to make use of cascading style sheets (CSS) which could be used to, say, provide a watermark on all resources which indicates that the Web site is no longer being maintained.
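For example, a rule added to a site-wide style sheet, together with a notice in a shared page header, could display a standard banner on every page. The class name, wording and year below are placeholders only:

  <style type="text/css">
    .mothballed {
      background-color: #ffffcc;
      border: 1px solid #999999;
      padding: 0.5em;
      text-align: center;
    }
  </style>
  ...
  <div class="mothballed">
    This Web site is no longer being maintained. The project finished in 2004.
  </div>

If the header is provided by a shared include or template, the notice can be added in one place and will then appear throughout the Web site.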

Technologies

Although software is not subject to physical deterioration through ageing or overuse, software products can nevertheless cease to work over time. Operating system upgrades, upgrades to software libraries, conflicts with newly installed software, etc. can all result in software products used on a project Web site ceasing to work.

It is advisable to adopt a defensive approach to software used on a Web site.

There are a number of areas to be aware of:

Process For Mothballing

We have outlined a number of areas in which a project Web site may degrade in quality once the project Web site has been "mothballed".

In order to minimise the likelihood of this happening and to ensure that problems can be addressed with the minimum of effort it can be useful to adopt a systematic set of procedures when mothballing a Web site.

It can be helpful to run a link checker across your Web site. You should seek to ensure that all internal links (links to resources on your own Web site) work correctly. Ideally links to external resources will also work, but it is recognised that this may be difficult to achieve. It may be useful to provide a link to a report of the link check on your Web site.

It would be helpful to provide documentation on the technical architecture of your Web site, which describes the server software used (including use of any unusual features), use of server-side scripting technologies, content management systems, etc.

It may also be useful to provide a mirror of your Web site by using a mirroring package or off-line browser. This will ensure that there is a static version of your Web site available which is not dependent on server-side technologies.

Contacts

You should give some thought to contact details provided on the Web site. You will probably wish to include details of the project staff, partners, etc. However you may wish to give an indication if staff have left the organisation.

Ideally you will provide contact details which are not tied down to a particular person. This may be needed if, for example, your project Web site has been hacked and the CERT security team need to make contact.

Planning For Mothballing

Ideally you will ensure that your plans for mothballing your Web site are developed when you are preparing to launch your Web site!


Briefing 05

Accessing Your Web Site On A PDA


About This Document

With the growing popularity of mobile devices and with pervasive networking on the horizon, we can expect to see greater use of PDAs (Personal Digital Assistants) for accessing Web resources.

This document describes a method for accessing a Web site on a PDA. In addition this document highlights issues which may make access on a PDA more difficult.

AvantGo

About

AvantGo is a well-known Web-based service which provides access to Web resources on a PDA such as a Palm or Pocket PC.

The AvantGo service is freely available from <http://www.avantgo.com/>.

Once you have registered on the service you can provide access to a number of dedicated AvantGo channels. In addition you can use an AvantGo wizard to provide access to any publicly available Web resources on your PDA.

An example of two Web sites showing the interface on a Palm is illustrated.

Benefits

If you have a PDA you may find it useful to use it to provide access to your Web site, as this will enable you to access resources when you are away from your desktop PC. This may also be useful for your project partners. In addition you may wish to encourage users of your Web site to access it in this way.

Other Benefits

AvantGo uses robot software to access your Web site and process it in a format suitable for viewing on a PDA, which typically has more limited functionality, memory, and viewing area than a desktop PC. The robot software may not process a number of features which may be regarded as standard on desktop browsers, such as frames, JavaScript, cookies, plugins, etc.

The ability to access a simplified version of your Web site can provide a useful mechanism for evaluating the ease with which your Web site can be repurposed and for testing the user interface under non-standard environments.

You should be aware of the following potential problem areas:

Entry Point Not Contained In Project Directory
If the project entry point is not contained in the project's directory, it is likely that the AvantGo robot will attempt to download an entire Web site and not just the project area.
Frames
If your Web site contains frames and you do not use the appropriate option to ensure that the full content can be accessed by user agents which do not support frames (such as the AvantGo robot software), resources on your Web site will not be accessible (see the sketch after this list).
Plugin Technologies
If your Web site contains technologies which require plugins (such as Flash, Java, etc.) you will not be able to access the resources.
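As noted under Frames above, a frameset document should provide equivalent access for user agents which do not render frames, typically via the <noframes> element. A minimal sketch, with illustrative file names, might look like this:

  <frameset cols="20%, 80%">
    <frame src="navigation.html" title="Navigation" />
    <frame src="content.html" title="Content" />
    <noframes>
      <body>
        <!-- Equivalent access for agents which do not render frames -->
        <p><a href="content.html">View the main content</a> or
           <a href="navigation.html">browse the site navigation</a>.</p>
      </body>
    </noframes>
  </frameset>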

Summary

As well as providing enhanced access to your Web site, tools such as AvantGo can assist in testing access to it. If your Web site makes use of open standards and follows best practices it is more likely that it will be usable on a PDA and by other specialist devices.

You should note, however, that use of open standards and best practices will not guarantee that a Web site will be accessible on a PDA.


Briefing 06

404 Error Pages On Web Sites


Importance Of 404 Error Pages

A Web site's 404 error page can be one of the most widely accessed pages on a Web site. The 404 error page can also act as an important navigational tool, helping users to quickly find the resource they were looking for. It is therefore important that 404 error pages provide adequate navigational facilities. In addition, since the page is likely to be accessed by many users, it is desirable that the page has an attractive design which reflects the Web site's look-and-feel.

Types Of 404 Error Pages

Web servers will be configured with a default 404 error page. This default is typically very basic.

In the example shown the 404 page provides no branding, help information, navigational bars, etc.

Figure 1: A Basic 404 Error Message

An example of a richer 404 error page is illustrated. In this example the 404 page is branded with the Web site's colour scheme, contains the Web site's standard navigational facility and provides help information.

Figure 2: A Richer 404 Error Message

Functionality Of 404 Error Pages

It is possible to define a number of types of 404 error pages:

Server Default
The server default 404 message is very basic. It will not carry any branding or navigational features which are relevant to the Web site.
Simple Branding, Navigational Features Or Help Information
The simplest approach to configuring a 404 page is to add some simple branding (such as the name of the Web site) or basic navigation features (link to the home page) or help information (an email address).
Richer Branding, Navigational Features, Help Information Or Additional Features
Some 404 pages will make use of the Web site's visual identity (such as a logo) and will contain a navigational bar which provides access to several areas of the Web site. In addition more complete help information may be provided, as well as additional features such as a search facility (a markup sketch along these lines is given after this list).
Full Branding, Navigational Features, Help Information And Additional Features
A comprehensive 404 page will ensure that all aspects of branding, navigational features, help information and additional features such as a search facility are provided.
As Above Plus Enhanced Functionality
It is possible to provide enhanced functionality for 404 pages such as context sensitive help information or navigational facilities, feedback mechanisms to the page author, etc.
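The body of a simple branded 404 page might contain markup along the following lines; the names and URLs are illustrative, and the way the page is wired up as the error document depends on your Web server's configuration:

  <h1>Page Not Found</h1>
  <p>Sorry - the page you requested could not be found on the
     Example Project Web site.</p>
  <ul>
    <li><a href="/">Home page</a></li>
    <li><a href="/search/">Search this Web site</a></li>
    <li><a href="/contact/">Contact us</a></li>
  </ul>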

Further Information

An article on 404 error pages, based on a survey of 404 pages in UK Universities is available at <http://www.ariadne.ac.uk/issue20/404/>. An update is available at <http://www.ariadne.ac.uk/issue32/web-watch/>.


Briefing 07

Approaches To Link Checking


Why Bother?

There are several reasons why it is important to ensure that links on Web sites work correctly:

However there are resource implications in maintaining link integrity.

Approaches To Link Checking

A number of approaches can be taken to checking broken links.

Note that these approaches are not exclusive: Web site maintainers may choose to make use of several approaches.

Policy Issues

There is a need to implement a policy on link checking. The policy could be that links will not be checked or fixed - this policy might be implemented for a project Web site once the funding has finished. For a small-scale project Web site the policy may be to check links when resources are added or updated or if broken links are brought to the project's attention, but not to check existing resources - this is likely to be an implicit policy for some projects.

For a Web site which has high visibility, or for which the effectiveness of the Web site is a high priority, a pro-active link checking policy will be needed. Such a policy is likely to document the frequency of link checking, and the procedures for fixing broken links. As an example of approaches taken to link checking by a JISC service, see the article about the SOSIG subject gateway [1].

Tools

Experienced Web developers will be familiar with desktop link-checking tools, and many lists of such tools are available [2] [3]. However desktop tools normally need to be used manually. An alternative approach is to use server-based link-checking software which sends email notifications of broken links.

Externally-hosted link-checking tools may also be used. Tools such as LinkValet [4] can be used interactively or in batch. Such tools may provide limited checking for free, with a licence fee for more comprehensive checking.

Another approach is to use a browser interface to tools, possibly using a Bookmarklet [5], although UKOLN's server-based tools approach [6] is more manageable.

Other Issues

It is important to ensure that link checkers check for links other than <a href="..."> and <img src="...">. There is a need to check external JavaScript, CSS, etc. files (linked to via <link> and <script> elements) and to ensure that checks are carried out on personalised interfaces to resources.

It should also be noted that erroneous link error reports may sometimes be produced (e.g. due to misconfigured Web servers).

References


Briefing 08

Search Facilities For Your Web Site


Background

Web sites which contain more than a handful of pages should provide a search facility. This is important for several reasons:

Approaches To Providing Search Facilities

The two main approaches to the provision of search engines on a Web site are to host a search engine locally or to make use of an externally-hosted search engine.

Local Search Engine

The traditional approach is to install search engine software locally. The software may be open source (such as ht://Dig [1]) or licensed software (such as Inktomi [2]). It should be noted that the search engine software does not have to be installed on the same system as the Web server. This means that you are not constrained to using the same operating system environment for your search engine as your Web server.

Because the search engine software can be hosted separately from the main Web server it may be possible to make use of an existing search engine service within the organisation which can be extended to index a new Web site.

Externally-Hosted Search Engines

An alternative approach is to allow a third party to index your Web site. There are a number of companies which provide such services. Some of these services are free: they may be funded by advertising revenue. Such services include Google [3], Atomz [4] and FreeFind [5].

Pros And Cons

Using a locally-installed search engine gives you control over the software. You can control the resources to be indexed and those to be excluded, the indexing frequency, the user interface, etc. However such control may have a price: you may need to have technical expertise in order to install, configure and maintain the software.

Using an externally-hosted search engine can remove the need for technical expertise: installing an externally-hosted search engine typically requires simply completing a Web form and then adding some HTML code to your Web site. However this ease-of-use has its disadvantages: typically you will lose the control over the resources to be indexed, the indexing frequency, the user interfaces, etc. In addition there is the dependency on a third party, and the dangers of a loss of service if the organisation changes its usage conditions, goes out of business, etc.

Trends

Surveys of search facilities used on UK University Web sites have been carried out since 1998 [6]. These provide information not only on the search engine tools used but also allow trends to be spotted.

Since the surveys began the most widely used tool has been ht://Dig - an open source product. In recent years the licensed product Inktomi has shown a growth in usage. Interestingly, use of home-grown software and specialist products has decreased - search engine software appears now to be a commodity product.

Another interesting trend appears to be the provision of two search facilities: a locally-hosted search engine and a remote one - e.g. see the University of Lancaster [7].

References

  1. ht://Dig,
    <http://www.htdig.org/>
  2. Inktomi,
    <http://www.inktomi.com/>
  3. Google,
    <http://www.google.com/>
  4. Atomz,
    <http://www.atomz.com/>
  5. FreeFind,
    <http://www.freefind.com/>
  6. Surveys of Search Engines on UK University Web Sites,
    <http://www.ukoln.ac.uk/web-focus/surveys/uk-he-search-engines/>
  7. University of Lancaster Search Page,
    <http://www.lancs.ac.uk/search.htm>

Briefing 09

Image QA in the Digitisation Workflow


Introduction

Producing an archive of high-quality images with a server full of associated delivery images is not an easy task. The workflow consists of many interwoven stages, each building on the foundations laid before. If, at any stage, image quality is compromised within the workflow, it has been totally lost and can never be redeemed.

It is therefore important that image quality is given paramount consideration at all stages of a project from initial project planning through to exit strategy.

Once the workflow is underway, quality can only be lost and the workflow must be designed to capture the required quality right from the start and then safeguard it.

Image QA

Image QA within a digitisation project's workflow can be considered a 4-stage process.

1 Strategic QA

Strategic QA is undertaken in the initial planning stages of the project, when the best methodology to create and support your images, now and into the future, will be established. This will include:

2 Process QA

Process QA is establishing quality control methods within the image production workflow that support the highest quality of capture and image processing, including:

3 Sign-off QA

Sign-off QA is implementing an audited system to assure that all images and their associated metadata are created to the established quality standard. A QA audit history is made to record all actions undertaken on the image files.

4 On-going QA

On-going QA is implementing a system to safeguard the value and reliability of the images into the future. However good the initial QA, it will be necessary to have a system that can report, check and fix any faults found within the images and associated metadata after the project has finished. This system should include:

QA in the Digitisation Workflow

Much of the final quality of a delivered image will be decided, long before, in the initial 'Strategic' and 'Process' QA stages where the digitisation methodology is planned and equipment sourced. However, once the process and infrastructure are in place it will be the operator who needs to manually evaluate each image within the 'Sign-off' QA stage. This evaluation will have a largely subjective nature and can only be as good as the operator doing it. The project team is the first and last line of defence against any drop in quality. All operators must be encouraged to take pride in their work and be aware of their responsibility for its quality.

It is however impossible for any operator to work at 100% accuracy for 100% of the time, and faults are always present within a productive workflow. What is more important is that the system is able to find the faults accurately before the work moves away from the operator. This will enable the operator to work at full speed without having to worry that they have made a mistake that might not be noticed.

The image digitisation workflow diagram in this document shows one possible answer to this problem.

[Figure: image digitisation workflow diagram]

Acknowledgements:

This document was written by TASI, the Technical Advisory Service For Images.


Briefing 10

Enhancing Web Site Navigation Using The LINK Element


Introduction

This document provides advice on how the HTML <link> element can be used to improve the navigation of Web sites.

The LINK Element

About

The purpose of the HTML <link> element is to specify relationships with other documents. Although not widely used, the <link> element provides a mechanism for improving the navigation of Web sites.

The <link> element should be included in the <head> of HTML documents. The syntax of the element is: <link rel="relation" href="url">. The key relationships which can improve navigation are listed below, followed by a short example of their use.

Table 1: Key Link Relations
Relation   Function
next       Refers to the next document in a linear sequence of documents.
prev       Refers to the previous document in a linear sequence of documents.
home       Refers to the home page or the top of some hierarchy.
first      Refers to the first document in a collection of documents.
contents   Refers to a document serving as a table of contents.
help       Refers to a document offering help.
glossary   Refers to a document providing a glossary of terms that pertain to the current document.
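For example, the second page in a three-part report might declare its relationships as follows; the file names are illustrative only:

  <head>
    <title>Project Report: Part 2</title>
    <link rel="home" href="index.html" />
    <link rel="contents" href="contents.html" />
    <link rel="prev" href="part1.html" />
    <link rel="next" href="part3.html" />
  </head>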

Benefits

Use of the <link> element enables navigation to be provided in a consistent manner as part of the browser navigation area rather than in an arbitrary location in the Web page. This has accessibility benefits. In addition browsers can potentially enhance performance by pre-fetching the next page in a sequence.

Browser Support

A reason why <link> is not widely used has been the lack of browser support. This has changed recently and support is now provided in the latest versions of the Opera and Netscape/Mozilla browsers and by specialist browsers (e.g. iCab and Lynx).

Since the <link> element degrades gracefully, its use will cause no problems for users of old browsers.

An illustration of how the <link> element is implemented in Opera is shown below.

Figure 1: Browser Support For The <link> Element

In Figure 1 a menu of navigational aids is available. The highlighted options (Home, Contents, Previous and Next) are based on the relationships which have been defined in the document. Users can use these navigational options to access the appropriate pages, even though there may be no corresponding links provided in the HTML document.

Information Management Challenges

It is important that the link relationships are provided in a manageable way. It would not be advisable to create link relationships by manually embedding them in HTML pages if the information is liable to change.

It is advisable to spend time defining the key navigational locations, such as the Home page (is it the Web site entry point, or the top of a sub-area of the Web site?). Such relationships may be added to templates or included via SSIs. Server-side scripts are a useful mechanism for exploiting other relationships, such as Next and Previous - for example in search results pages.

Further Information

Additional information is provided at
<http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-10/>.


Briefing 11

What Are Open Standards?


Background

The use of open standards can help provide interoperability and maximise access to resources and services. However this raises two questions: "Why open standards?" and "What are open standards?".

Why Open Standards?

Open standards can provide several benefits:

What Are Open Standards?

The term "open standards" is ambiguous. As described in Wikipedia "There is no single definition and interpretations vary with usage" [1]. The EU's definition is [2]:

Some examples of recognised open standards bodies are given in Table 1.

Table 1: Examples Of Independent Standards Organisations
Standards Body Comments
W3C World Wide Web Consortium (W3C). Responsible for the development of Web standards (recommendations). See <http://www.w3.org/TR/>. Relevant standards include HTML, XML, CSS, SMIL, SVG, etc.
IETF Internet Engineering Task Force (IETF). Responsible for the development of Internet standards (known as IETF RFCs). See list of IETF RFCs at <http://www.ietf.org/rfc.html>. Relevant standards include HTTP, MIME, etc.
ISO International Organization for Standardization (ISO). See <http://www.iso.org/iso/en/stdsdevelopment/whowhenhow/how.html>. Relevant standards areas include character sets, networking, etc.
NISO National Information Standards Organization (NISO). See <http://www.niso.org/>. Relevant standards include Z39.50.
IEEE Institute of Electrical and Electronics Engineers (IEEE). See <http://www.ieee.org/>.
ECMA ECMA International. Association responsible for standardisation of Information and Communication Technology Systems (such as JavaScript). See <http://www.ecma-international.org/>.

Other Types Of Standards

The term proprietary refers to formats which are owned by an organisation, group, etc. The term industry standard is often used to refer to a widely used proprietary standard. For example, the proprietary Microsoft Excel format is sometimes referred to as an industry standard for spreadsheets. To make matters even more confusing, the qualifier is sometimes omitted and MS Excel can be referred to simply as a standard.

To further confuse matters, companies which own proprietary formats may choose to make the specification freely available. Alternatively third parties may reverse engineer the specification and publish the specification. In addition tools which can view or create proprietary formats may be available on multiple platforms or as open source.

In all these cases, although there may appear to be no obvious barriers to use of the proprietary format, such formats should not be classed as open standards as they have not been approved by a neutral standards body. The organisation owning the format may choose to change the format or the usage conditions at any time. File formats in this category include Microsoft Office formats, Macromedia Flash and Java.

References

  1. Open Standard, Wikipedia, <http://en.wikipedia.org/wiki/Open_standard>
  2. Open Standard: European Union definition, Wikipedia, <http://en.wikipedia.org/wiki/Open_standard#European_Union_definition>

Briefing 12

How To Evaluate A Web Site's Accessibility Level


Background

Many Web developers and administrators are conscious of the need to ensure that their Web sites reach as high a level of accessibility as possible. But how do you actually find out if a site has accessibility problems? Certainly, you cannot assume that if no complaints have been received through the site feedback facility (assuming you have one), there are no problems. Many people affected by accessibility problems will just give up and go somewhere else.

So you must be proactive in rooting out any problems as soon as possible. Fortunately there are a number of handy ways to help you get an idea of the level of accessibility of the site which do not require an in-depth understanding of Web design or accessibility issues. It may be impractical to test every page, but try to make sure you check the Home page plus as many high traffic pages as possible.

Get A Disabled Person To Look At The Site

If you have a disability, you have no doubt already discovered whether your site has accessibility problems which affect you. If you know someone with a disability which might prevent them accessing information in the site, then ask them to browse the site, and tell you of any problems. Particularly affected groups include visually impaired people (blind, colour blind, short or long sighted), dyslexic people and people with motor disabilities (who may not be able to use a mouse). If you are in Higher Education your local Access Centre [1] may be able to help.

View The Site Through A Text Browser

Get hold of a text browser such as Lynx [2] and use it to browse your site. Problems you might uncover include those caused by images with no, or misleading, alternative text, confusing navigation systems, reliance on scripting or poor use of frames.

Browse The Site Using A Speech Browser

You can get a free evaluation version of IBM's Homepage Reader [3] or pwWebSpeak [4], speech browsers used by many visually impaired users of the Web. The browsers "speak" the page to you, so shut your eyes and try to comprehend what you are hearing.

Alternatively, try asking a colleague to read you the Web page out loud. Without seeing the page, can you understand what you're hearing?

Look At The Site Under Different Conditions

As suggested by the World Wide Web Consortium (W3C) Web Accessibility Initiative (WAI) [5], you should test your site under various conditions to see if there are any problems including (a) graphics not loaded (b) frames, scripts and style sheets turned off and (c) browsing without using a mouse. Also, try using bookmarklets or favelets to test your Web site under different conditions: further information on accessibility bookmarklets can be found at [6].

Check With Automatic Validation Tools

There are a number of Web-based tools which can provide valuable information on potential accessibility problems, such as Rational Policy Tester Accessibility [7] and the WAVE tool [8]. You should also check whether the underlying HTML of your site validates to accepted standards using the World Wide Web Consortium's MarkUp Validation Service [9], as non-standard HTML can also cause accessibility problems.

Acting on Your Observations

Details of any problems found should be noted: the effect of the problem, which page was affected, plus why you think the problem was caused. You are unlikely to catch all accessibility problems in the site, but the tests described here will give you an indication of whether the site requires immediate attention to raise accessibility. Remember that improving accessibility for specific groups, such as visually impaired people, will often have usability benefits for all users.

Commission an Accessibility Audit

Since it is unlikely you will catch all accessibility problems and the learning curve is steep, it may be advisable to commission an expert accessibility audit. In this way, you can receive a comprehensive audit of the subject site, complete with detailed, prioritised recommendations for upgrading the level of accessibility of the site. Groups which provide such audits include the Digital Media Access Group, based at the University of Dundee, and the RNIB, which also audits Web sites for accessibility to blind users.

Further Information

Additional information is provided at
<http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-12/>.

Acknowledgments

This document was written by David Sloan, DMAG, University of Dundee and originally published by the JISC TechDis service. We are grateful for permission to republish this document.

References

  1. Access Centres,
    <http://www.nfac.org.uk/>
  2. Lynx,
    <http://lynx.isc.org/release/>
  3. Homepage Reader, IBM,
    <http://www-3.ibm.com/able/solution_offerings/hpr.html>
  4. pwWebSpeak,
    <http://www.soundlinks.com/pwgen.htm>
  5. Web Content Accessibility Guidelines, Appendix A, W3C WAI,
    <http://www.w3.org/TR/WAI-WEBCONTENT/>
  6. Bookmarklets: An aid to checking the accessibility of your website, Nicola McIlroy,
    <http://www.dmag.org.uk/resources/design_articles/bookmarklets.asp>
  7. Rational Policy Tester Accessibility,
    <http://www-306.ibm.com/software/awdtools/tester/policy/accessibility/>
  8. WAVE,
    <http://www.wave.webaim.org/>
  9. W3C HTML Validator, W3C,
    <http://validator.w3.org/>

Briefing 13

Software Code Development


About

This document gives high-level advice for people who develop software for use either internally within a project or for use externally as a project deliverable.

Background

Each computer programming language has its own coding conventions. However there are a number of general points that you can follow to ensure that your code is well organised and can be easily understood by others. These guidelines are not in any way mandatory but attempt to formalise code so that reading, reusing and maintaining code is easier. Most coding standards are arbitrary but adopting some level of consistency will help create better software.

The key point to remember is that good QA practice involves deciding on and recording a number of factors with your programming team at the outset of your project. Having such a record will allow you to be consistent.

Documentation

In order for programmers to use your software it is important that you include clear documentation. This will take the form of both internal and external documentation.

Naming Conventions

Naming conventions for files, procedures, variables etc. should be sensible and meaningful and agreed on before the project starts. Use of capitalisation may vary in different programming languages but it is sensible to avoid names that differ only in case or look very similar. Also avoid names that conflict with standard library names.

Code

There are a number of key points to remember when writing your code:

Standards

Standards are "documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics, to ensure that materials, products, processes and services are fit for their purpose" (ISO 1997). The aim of international standards is to encapsulate the most appropriate current practice. The International Organization for Standardization (ISO) [1] is the international federation of national standardisation bodies. The most relevant ISO standard for software code development is ISO 9000-3: 1997 (QA for the development, supply, installation and maintenance of computer software). For other relevant standards also check the Institute of Electrical and Electronics Engineers [2] and the American National Standards Institute [3].

Project QA

At the start of development it may help to ask your team the following questions:

  1. Do you have local guidelines for writing code?
  2. Are your software staff aware of the conventions to be used?
  3. Do you have procedures in place for use when creating local software?

References

  1. The International Organization for Standardization (ISO)
    <http://www.iso.ch/>
  2. Institute of Electrical and Electronics Engineers (IEEE)
    <http://www.ieee.org/>
  3. American National Standards Institute (ANSI)
    <http://www.ansi.org/>

Further information on Software QA at Sticky Minds
<http://www.stickyminds.com/>


Briefing 14

Creating and Testing Web Forms


Background

A Web form is not dissimilar in appearance and use to a paper form. It appears on a Web page and contains special elements called controls along with normal text and markup. These controls can take the form of checkboxes, text boxes, radio buttons, menus, etc. Users generally fill in the form by entering text and selecting menu items and then submit the form for processing. The processing agent could be an email handler or a Web server.

Web forms have a variety of uses and are a way to get specific pieces of information from the user. Web sites with forms have their own specific set of problems and need rigorous testing to ensure that they work.
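A minimal form using the control types mentioned above might look like the following sketch; the submission URL and field names are purely illustrative:

  <form action="http://www.example.org/cgi-bin/feedback" method="post">
    <p>Name: <input type="text" name="name" size="30" /></p>
    <p>Role:
      <select name="role">
        <option value="student">Student</option>
        <option value="staff">Staff</option>
        <option value="other">Other</option>
      </select>
    </p>
    <p><input type="checkbox" name="mailing" value="yes" />
       Add me to the mailing list</p>
    <p><input type="submit" value="Send" /></p>
  </form>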

Designing Forms

Some of the key things to consider when designing your form are:

Mandatory Fields

Making fields compulsory can cause problems. Occasionally a user may feel that the question you have asked is inappropriate in context or they just can't provide the information. You need to decide if the information needed is absolutely necessary. Users will be more likely to give information if you explain why the data that you're asking for is needed. It is acceptable to ask the user if they meant to leave out a piece of information and then accept their answer.

Validation of forms can be carried out either client-side or by processing on the server. Client-side validation requires the use of a scripting language like JavaScript and can be problematic if the user's browser disallows scripting. However server-side validation can be more complicated to set up.

Drop Down Lists

Sometimes the categories you offer in a drop down list do not match the answer that the user wants to give you. Sites from the USA often ask for states, which UK users cannot provide. If you want to use a drop down list make sure that your error messages are helpful rather than negative and allow users to select an 'other' option. If you have given a good selection of categories then you should rarely get users picking this.

Also consider whether the categories you have provided are appropriate. There may be issues over the terms used to refer to particular countries (for example if a land area is disputed). If you have to provide a long drop down list then it might be worth offering the common categories first. You could also try subdividing the categories into two drop-downs, where the selection from the first dynamically creates the options in the second.

Separate Display

You may wish to have the user see a new page or sidebar when filling in a form. A new page may be easier to look at but can be annoying if it is perceived as a diversion or, even worse, an advertisement. It may also be prevented from opening by pop-up blocking features available in newer browsers.

User Errors

Users will often make typing or transcription errors when filling a form in. These errors can occur in any free text fields on the form.

Occasionally users will press the submit or send button either deliberately or inadvertently when only part-way through the form. Make sure that you have an appropriate error message for this and allow users to go back to the unfinished form. Users also often fill in part of a form and then click on the back button. They may be doing this to lose the data they have filled in, to check previous data or because they think they have finished. These activities suggest poor user interface design.

It is important to provide a helpful message on the submission screen explaining that the form has been submitted successfully. You could also replicate the details entered, for users to print out as a hard copy.

Testing Forms

Once you have created your Web form you need to test it thoroughly before release. A number of free software products are available to help you with your testing. Tools such as Roboform [1] can be used to store test data and to fill in your forms with that data automatically.

When testing your form it is worth bearing in mind some problem areas:

References

  1. Roboform,
    <http://www.roboform.com/>
  2. BabelMap,
    <http://www.babelstone.co.uk/Software/BabelMap.html>

Briefing 15

The Purpose Of Your Project Web Site


Background

Before creating a Web site for your project you should give some thought to the purpose of the Web site, including the aims of the Web site, the target audiences, the lifetime, resources available to develop and maintain the Web site and the technical architecture to be used. You should also think about what will happen to the Web site once project funding has finished.

Purposes

Your project Web site could have a number of purposes. For example:

Your Web site could, of course, fulfil more than a single role. Alternatively you may choose to provide more than one Web site.

Why You Need To Think About The Different Purposes

You should have an idea of the purposes of your project Web site before creating it for a number of reasons:

Web Site For Information About The Project

Once funding has been approved for your project you may wish to provide information about the project, often prior to the official launch of the project and before project staff are in post. There is a potential danger that this information will be indexed by search engines or treated as the official project page. You should therefore ensure that the page is updated once an official project Web site is launched so that a link is provided to the official project page. You may also wish to consider stopping search engines from indexing such pages by use of the Standard For Robot Exclusion [1].
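The Standard For Robot Exclusion itself is implemented through a robots.txt file at the root of the Web server. Where you do not control that file, a related page-level convention is the robots <meta> element, which asks well-behaved robots not to index a page. A sketch, with an illustrative title, is shown below:

  <head>
    <title>Project X - holding page</title>
    <!-- Ask well-behaved robots not to index or follow links from this interim page -->
    <meta name="robots" content="noindex, nofollow" />
  </head>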

Web Site For Access To Project Deliverables

Many projects will have an official project Web site. This is likely to provide information about the project such as details of funding, project timescales and deliverables, contact addresses, etc. The Web site may also provide access to project deliverables, or provide links to project deliverables if they are deployed elsewhere or are available from a repository. Usually you will be proactive in ensuring that the official project Web site is easily found. You may wish to submit the project Web site to search engines.

Web Site To Support Communications With Project Partners

Projects with several partners may have a Web site which is used to support communications with project partners. The Web site may provide access to mailing lists, realtime communications, decision-making support, etc. The JISCMail service may be used or commercial equivalents such as YahooGroups. Alternatively this function may be provided by a Web site which also provides a repository for project resources.

Web Site As Repository For Project Resources

Projects with several partners may have a Web site which is used to provide a repository for project resources. The Web site may contain project plans, specifications, minutes of meetings, reports to funders, financial information, etc. The Web site may be part of the main project Web site, may be a separate Web site (possibly hosted by one of the project partners) or may be provided by a third party. You will need to think about the mechanisms for allowing access to authorised users, especially if the Web site contains confidential or sensitive information.

References

  1. robots.txt Robots Exclusion Standard,
    <http://www.robotstxt.org/>

Briefing 16

URI Naming Conventions For Your Project Web Site


Background

Once you have agreed on the purpose(s) of your project Web site(s) [1] you will need to choose a domain name for your Web site and conventions for URIs. It is necessary to do this since this can affect (a) The memorability of the Web site and the ease with which it can be cited; (b) The ease with which resources can be indexed by search engines and (c) The ease with which resources can be managed and repurposed.

Domain Name

You may wish to make use of a separate domain name for your project Web site. If you wish to use a .ac.uk domain name you will need to ask UKERNA. You should first check the UKERNA rules [2]. A separate domain name has advantages (memorability, ease of indexing and repurposing, etc.) but this may not be appropriate, especially for short-term projects. Your organisation may prefer to use an existing Web site domain.

URI Naming Conventions

You should develop a policy for URIs for your Web site which may include:

Issues

Grouping Of Resources

It is strongly recommended that you make use of directories to group related resources. This is particularly important for the project Web site itself and for key areas of the Web site. The entry point for the Web site and key areas should be contained in the directory itself (e.g. use http://www.foo.ac.uk/bar/ to refer to project BAR and not http://www.foo.ac.uk/bar.html), as this allows the bar/ directory to be processed in its entirety, independently of other directories. Without this approach automated tools such as indexing software, and tools for auditing, mirroring, preservation, etc., would have to process other directories as well.

URI Persistency

You should seek to ensure that URIs are persistent. If you reorganise your Web site you are likely to find that internal links may be broken, that external links and bookmarks to your resources are broken, and that citations to resources cease to work. You may wish to provide a policy on the persistency of URIs on your Web site.

File Names and Formats

Ideally the address of a resource (the URI) will be independent of the format of the resource. Using appropriate Web server configuration options it is possible to cite resources in a way which is independent of the format of the resource. This should allow ease of migration to new formats (e.g. HTML to XHTML) and, using a technology known as Transparent Content Negotiation [3], provide access to alternative formats (e.g. HTML or PDF) or even alternative language versions.

File Names and Server-Side Technologies

Ideally URIs will be independent of the technology used to provide access to the resource. If server-side scripting technologies are given in the file extension for URIs (e.g. use of .asp, .jsp, .php, .cfm, etc. extensions) changing the server-side scripting technology would probably require changing URIs. This may also make mirroring and repurposing of resources more difficult.

Static URIs Or Query Strings?

Ideally URIs will be memorable and allow resources to be easily indexed and repurposed. However use of Content Management Systems or databases to store resources often necessitates use of URIs which contain query strings containing input parameters to server-side applications. As described above this can cause problems.

Possible Solutions

You should consider the following approaches which address some of the concerns:

References

  1. The Purpose Of Your Project Web Site
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-15/html/>
  2. UKERNA
    <http://www.ukerna.ac.uk>
  3. Transparent Content Negotiation
    <http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html>

Briefing 17

Performance Indicators For Your Project Web Site


Background

It is desirable to measure usage of your project Web site as this can give an indication of its effectiveness. Measuring how the Web site is being used can also help in identifying the usability of the Web site. Monitoring errors when users access your Web site can also help in identifying problem areas which need to be fixed.

However, as described in this document, usage statistics can be misleading. Care must be taken in interpreting statistics. As well as usage statistics there are a number of other types of performance indicators which can be measured.

It is also important that consistent approaches are taken in measuring performance indicators in order to ensure that valid comparisons can be made with other Web sites.

Web Statistics

Web statistics are produced by the Web server software. The raw data will normally be produced by default - no additional configuration will be needed to produce the server's default set of usage data.

The server log file records information on requests (normally referred to as a "hit") for a resource on the web server. Information included in the server log file includes the name of the resource, the IP address (or domain name) of the user making the request, the name of the browser (more correctly, referred to as the "user agent") issuing the request, the size of the resource, date and time information and whether the request was successful or not (and an error code if it was not). In addition many servers will be configured to store additional information, such as the "referer" (sic) field, the URL of the page the user was viewing before clicking on a link to get to the resource.

Tools

A wide range of Web statistical analysis packages are available to analyse Web server log files [1]. A widely used package in the UK HE sector is WebTrends [2].

An alternative approach to using Web statistical analysis packages is to make use of externally-hosted statistical analysis services [3]. This approach may be worth considering for projects which have limited access to server log files and to Web statistical analysis software.

Configuration Issues

In order to ensure that Web usage figures are consistent it is necessary to ensure that Web servers are configured in a consistent manner, that Web statistical analysis packages process the data consistently and that the project Web site is clearly defined.

You should ensure that (a) the Web server is configured so that appropriate information is recorded and (b) that changes to relevant server options or data processing are documented.

Limitations

You should be aware that the Web usage data does not necessarily give a true indication of usage due to several factors:

Despite these reservations collecting and analysing usage data can provide valuable information.

Other Types Of Indicators

Web usage statistics are not the only type of performance indicator which can be used. You may also wish to consider:

With all of the indicators periodic reporting will allow trends to be detected.

Conclusions

It may be useful to determine a policy on collection and analysis of performance indicators for your Web site prior to its launch.

References

  1. Web server log files, UKOLN,
    <http://www.ukoln.ac.uk/nof/support/help/papers/performance/>
  2. WebTrends,
    <http://www.netiq.com/webtrends/>
  3. Externally-hosted statistical analysis services, Exploit Interactive, Issue 5, April 2000,
    <http://www.exploit-lib.org/issue5/indicators/>

Briefing 18

QA Procedures For The Design Of CAD Data Models


Background

The creation of CAD (Computer Aided Design) models is an often complex and confusing procedure. To reduce long-term manageability and interoperability problems, the designer should establish procedures that will monitor and guide system checks.

Establish CAD Layout Standards

Interoperability problems are often caused by poorly understood or non-existent operating procedures for CAD. It is wise to establish and document your own CAD procedures, or adopt one of the national standards developed by the BSI (British Standards Institution) or NIBS (National Institute of Building Sciences). These may be used to train new members in the house-style of a project, provide essential information when sharing CAD data among different users, or provide background material when depositing the designs with a preservation repository. Particular areas to standardize include:

Procedures on constructing your own CAD standard can be found in the Construct IT guidelines (see references).

Be Consistent With Layers And Naming Conventions

When creating CAD data models, a consistent approach to layer creation and naming conventions is useful. This will avoid confusion and increases the likelihood that the designer will be able to manipulate and search the data model at a later date.

The designer has two options to ensure interoperability:

Ensure Tolerances Are Consistent

When exporting designs between different CAD applications it is common for model relationships to disintegrate, causing entities to appear disconnected or disappear from the design altogether. A common cause is the use of different tolerance levels - a method of placing limits on gaps between geometric entities. The method of calculating tolerance often varies in different applications: some use absolute tolerance levels (e.g. 0.005mm), others work to a tolerance level relative to the model size (e.g. 10^-4 of the size), while others have different tolerances according to the units used. When considering moving a design between different applications it is useful to ensure the tolerance level can be set to the same value and to identify potential problem areas that may be corrupted when the data model is reopened.
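The following Python sketch illustrates the difference between an absolute and a relative tolerance check on the gap between two line endpoints. The co-ordinates and tolerance values are invented for illustration; real CAD packages apply far more sophisticated rules.

import math

def within_tolerance(point_a, point_b, tolerance):
    # True if the gap between two 2D points (in mm) is no larger than the tolerance.
    return math.dist(point_a, point_b) <= tolerance

end_a = (120.0000, 45.0000)
end_b = (120.0300, 45.0000)   # a 0.03 mm gap

print(within_tolerance(end_a, end_b, 0.005))       # absolute tolerance of 0.005 mm: False
print(within_tolerance(end_a, end_b, 500 * 1e-4))  # 10^-4 of a 500 mm model: True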

Check For Illegal Geometry Definitions

Interoperability problems are also caused by differences in how systems identify invalid geometry definitions, such as three-sided degenerate NURBS surfaces. Some systems allow the creation of such entities, others will reject them, while others, knowing they are not permissible, attempt to prevent their creation by generating twisted four-sided surfaces instead.

Further Information


Briefing 19

Making Software Changes to a Web Site


Background

It is desirable to minimise the time a Web site is unavailable. However it may be necessary to bring a Web server down in order to carry out essential maintenance. This document lists some areas to consider if you wish to minimise down time.

Planning

The key to any form of critical path situation is planning. Planning involves being sure about what needs to be done and being clear about how it can be done. Quality Assurance is vital at this stage and final 'quality' checking is often the last act before a site goes live or a new release is launched.

Prior to Down Time

During Down Time

After Down Time

Conclusions

Advance preparation is vital if you want to minimise your site's downtime and avoid confusion when installing new releases.

References

  1. Error Detection on the UKOLN Web site, QA Focus, UKOLN,
    < http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-14/>

Briefing 20

Documenting Digitisation Workflow


Background

Digitisation is a production process. Large numbers of analogue items, such as documents, images, audio and video recordings, are captured and transformed into the digital masters that a project will subsequently work with. Understanding the many variables and tasks in this process - for example the method of capturing digital images in a collection (scanning or digital photography) and the conversion processes performed (resizing, decreasing bit depth, converting file formats, etc.) - is vital if the results are to remain consistent and reliable.

By documenting the workflow of digitisation, a life history can be built up for each digitised item. This information provides an important way of recording decisions, tracking problems, maintaining consistency and giving users confidence in the quality of your work.

What to Record

Workflow documentation should enable us to tell what the current status of an item is, and how it has reached that point. To do this the documentation needs to include important details about each stage in the digitisation process and its outcome.

  1. What action was performed at a specific stage? Identify the action performed. For example, resizing an image.
  2. Why was the action performed? Establish the reason that a change was made. For example, a photograph was resized to meet pre-agreed image standards.
  3. When was the action performed? Indicate the specific date the action was performed. This will enable project development to be tracked through the system.
  4. How was the action performed? Ascertain the method used to perform the action. A description may include the application in use, the machine ID, or the operating system.
  5. Who performed the action? Identify the individual responsible for the action. This enables actions to be tracked and similar problems in related data to be identified.

By recording the answers to these five questions at each stage of the digitisation process, the progress of each item can be tracked, providing a detailed breakdown of its history. This is particularly useful for tracking errors and locating similar problems in other items.
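A minimal sketch of such a record, assuming a simple CSV log and using invented item identifiers and values, might look like the following Python fragment; real projects may prefer a database or an XML schema, as discussed below.

import csv
import datetime

def record_action(logfile, item_id, what, why, how, who):
    # One row per action, answering the what/why/when/how/who questions above.
    when = datetime.date.today().isoformat()
    with open(logfile, "a", newline="") as log:
        csv.writer(log).writerow([item_id, what, why, when, how, who])

record_action("workflow-log.csv", "img-0042",
              "Resized image to 800 x 600 pixels",
              "To meet pre-agreed image standards",
              "Photoshop on workstation DIG-03",
              "A. N. Other")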

The actual digitisation of an item is clearly the key point in the workflow and therefore formal capture metadata (metadata about the actual digitisation of the item) is particularly important.

Where to Record the Information

Where possible, select an existing schema with a binding to XML:

Quality Assurance

To check your XML document for errors, QA techniques should be applied:
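As a minimal illustration of one such check, the following Python fragment tests whether a workflow record is well-formed XML. The file name is an assumption, and validation against a full schema would require an additional library.

import xml.etree.ElementTree as ET

try:
    ET.parse("workflow-record.xml")   # hypothetical file name
    print("workflow-record.xml is well-formed")
except ET.ParseError as error:
    print("XML error:", error)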

Further Information


Briefing 21

QA for GIS Interoperability


Background

Quality assurance is essential to ensure GIS (Geographic Information System) data is accurate and can be manipulated easily. To ensure data is interoperable, the designer should audit the GIS records and check them for incompatibilities and errors.

Ensure Content Is Available In An Appropriate GIS Standard

Interoperability between GIS standards is encouraged, enabling complex data types to be compared in unexpected ways. However, the varying standards can limit the potential uses of the data. Designers are often limited by the formats available in different tools. When possible, it is advisable to use OpenGIS - an open, multi-subject standard constructed by an international standards consortium.

Resolve Differences In The Data Structures

To integrate data from multiple databases, the data must be stored in a compatible field structure. Complementary fields in the source and target databases must be of a compatible type (Integer, Floating Point, Date, a Character field of an appropriate length etc.) to avoid the loss of data during the integration process. Checks should also be made that specific fields that are incompatible with similar products (e.g. dBase memo fields) are exported correctly. Specialist advice should be taken to ensure the memo information is not lost.

Ensure Data Meet The Required Standards

Databases are often created in an ad hoc manner without consideration of later requirements. To improve interoperability the designer should ensure data complies with relevant standards. Examples include the BS7666 standard for British postal addresses and the RCHME Thesauri of Architectural Types, Monument Types, and Building Materials.

Compensate For Different Measurement Systems

The merging of two different data sources is likely to present specific problems. When combining two GIS tables, the designer should consider the possibility that they have been constructed using different projection systems (methods of representing the Earth's three-dimensional form on a two-dimensional plane and locating landmarks by a set of co-ordinates). Projection co-ordinate systems vary across nations and through time: the US has five primary co-ordinate systems in use that differ significantly from each other. The British National Grid removes this confusion by using a single co-ordinate system, but can cause problems when merging contemporary maps with pre-1940 maps that were based upon the Cassini projection. This may produce incompatibilities and unexpected results when plotted, such as boundaries and landmarks moving to different locations, which will need to be rectified before any real benefits can be gained. The designer should understand the projection system used for each layer in order to compensate for inaccuracies.

Ensure Precise Measurements Are Accurate

When recreating real-world objects created by two different people, the designer should note the degree of accuracy. One person may measure to the nearest millimetre, while the other measures to the centimetre. To check this, the designer should answer the following questions:

  1. How many digits are shown after the decimal point (e.g. 2.12 cm)?
  2. Is this figure consistent with the second designer's measurement methods?
  3. Has the value been rounded up or down, or has a third figure been removed?

These subtle differences may influence the resulting model, particularly when designing smaller objects.

Further Information


Briefing 22

Choosing A Suitable Digital Rights Solution


Background

Digital Rights Management (DRM) refers to any method for a software developer to monitor, control, and protect digital content. It was developed primarily as an advanced anti-piracy method to prevent illegal or unauthorised distribution of content. Common examples of DRM include watermarks, licensing, and user registration. It is in use by Microsoft and other businesses to prevent unauthorised copying and use of their software (obviously, the different protection methods do not always work!).

For institutions, DRM can have limited application. Academia actively encourages free dissemination of work, so stringent restrictive measures are unnecessary. However, it can be useful in limiting plagiarism. An institution is able to distribute lecture notes or software without allowing the user to reuse text or images within their own work. Usage of software packages or sites can also be tracked, enabling specific content to be displayed to different users. To achieve these goals different methodologies are available.

Why do I need Digital Rights Management?

As stated above, Digital Rights Management is not appropriate for all organisations. It can introduce additional complexity into the development process, limit use and cause unforeseen problems at a later date. The following questions will assess your needs:

  1. Do you trust your users to use your work without plagiarising or stealing it?

  2. If the answer to question 1 is yes, do you wish to track unauthorised distribution or impose rules to prevent it?

  3. Will you be financially affected if your work is distributed without permission?

  4. Will digital rights restrictions interfere with the project goals and legitimate usage?

  5. In terms of cost and time management, can you afford to implement DRM restrictions?

  6. If the answer to question 5 is yes, can you afford a strong and costly level of protection (restrictive digital rights) or weak protection (supportive) that costs significantly less?

What types of DRM Methodologies Exist?

Digital Rights methodologies can be divided into two types: supportive and restrictive. The first relies upon the user's honesty to register or acquire a license for specific functionality. In contrast, the restrictive method assumes the user is untrustworthy, placing barriers (e.g. encryption and other preventive methods) to thwart casual users who attempt to infringe the rights holder's work.

1) Supportive digital rights

The simplest and most cost effective DRM strategy is supportive digital rights. This requires the user to register before they are allowed access to data, blocking all non-authorised users. This assumes that individuals will be less likely to distribute content if they can be identified as the source of the leak. Web sites are the most common use of this protection method. For example, Athens, the NYTimes and other portals provide registration forms or license agreements that the user must complete before access is allowed. The disadvantage of this protection method is that the individual can easily copy or distribute data once they have it. Supportive digital rights are suited to organisations that want to place restrictions upon who can access specific content, but do not wish to restrict content being used by legitimate users.

2) Restrictive digital rights

Restrictive digital rights are more costly, but place more stringent controls over the user. They operate by checking whether the user is authorised to perform a specific action and, if not, preventing them from doing it. Unlike supportive rights management, restrictive rights ensure that content cannot be used at a later date, even if it has been saved to hard disk. This is achieved by incorporating watermarks and other identification methods into the content.

Restrictive digital rights can be divided into two sub-categories:

Digital rights implementations are costly and time-consuming, making them potentially unattainable for the majority of service providers. For a data archive it is easier to prevent unauthorised access to resources than it is to limit use once users actually possess the information.

Ensuring Interoperability

Digital rights management relies upon recording rights information and storing it in a standard format that others can use to identify copyrighted work. Current digital rights solutions establish a standard metadata schema to identify ownership.

Two options are available to achieve this goal: create a bespoke solution or use an established rights schema. An established rights schema provides a detailed list of identification criteria that can be used to catalogue a collection and establish copyright holders at different stages. Two possible choices for multiple media types are:

Summary

Digital rights management is an important issue that allows an institution to establish intellectual property rights. However, it can be costly for small organisations that simply wish to protect their image collection. The choice of supportive or restrictive digital rights is therefore likely to be influenced by the value of the data in relation to the implementation cost.

Further Information


Briefing 23

Recording Digital Sound


Background

The digitisation of audio can be a complex process. This document contains quality assurance techniques for producing effective audio content, taking into consideration the impact of sample rate, bit-rate and file format.

Sample Rates

Sample rate defines the number of samples that are recorded per second. It is measured in Hertz (cycles per second) or Kilohertz (thousand cycles per second). The following table describes four common benchmarks for audio quality. These offer gradually improving quality, at the expense of file size.

Table 1: Description of the various sample frequencies available
Samples per second Description
8kHz Telephone quality
11kHz At 8 bits, mono produces passable voice at a reasonable size.
22kHz 22k, half of the CD sampling rate. At 8 bits, mono, good for a mix of speech and music.
44.1kHz Standard audio CD sampling rate. A standard for 16-bit linear signed mono and stereo file formats.

The audio quality will improve as the number of samples per second increases. A higher sample rate enables a more accurate reconstruction of a complex sound wave to be created from the digital audio file. To record high quality audio a sample rate of 44.1kHz should be used.

Bit-rate

Bit-rate indicates the amount of audio data being transferred at a given time. The bit-rate can be recorded in two ways - variable or constant. A variable bit-rate creates smaller files by removing inaudible sound. It is therefore suited to Internet distribution in which bandwidth is a consideration. A constant bit-rate, in comparison, records audio data at a set rate irrespective of the content. This produces a replica of an analogue recording, even reproducing potentially unnecessary sounds. As a result, file size is significantly larger than those encoded with variable bit-rates.

Table 2 indicates how a constant bit-rate affects the quality and file size of an audio file.

Table 2: Indication of audio quality expected with different bit-rates
Bit rate (kbps) Quality MB/min
1411 CD quality 10.584
192 Good CD quality 1.440
128 Near CD quality 0.960
112 Near CD quality 0.840
64 FM quality 0.480
32 AM quality 0.240
16 Short-wave quality 0.120
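The CD-quality figure in Table 2 can be reproduced from the sample rate, bit depth and number of channels, as the following Python sketch shows.

# CD quality: 44,100 samples per second, 16 bits per sample, 2 channels.
sample_rate = 44_100
bit_depth = 16
channels = 2

bit_rate_kbps = sample_rate * bit_depth * channels / 1000       # approx. 1411 kbps
mb_per_minute = (bit_rate_kbps * 1000 * 60) / 8 / 1_000_000     # approx. 10.584 MB per minute
print(round(bit_rate_kbps), round(mb_per_minute, 3))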

 

Digital Audio Formats

The majority of audio formats use lossy compression to reduce file size by removing superfluous audio data. Master audio files should ideally be stored in a lossless format to preserve all audio data.

Table 3 Common Digital Audio Formats
Format Compression Streaming support Bit-rate Popularity
MPEG Audio Layer III (MP3) Lossy Yes Variable Common on all platforms
Mp3PRO (MP3) Lossy Yes Variable Limited support
Ogg Vorbis (OGG) Lossy Yes Variable Limited support
RealAudio (RA) Lossy Yes Variable Popular for streaming
Microsoft wave (WAV) Lossless Yes Constant Primarily for Windows
Windows Media (WMA) Lossy Yes Variable Primarily for Windows

Conversion between digital audio formats can be complex. If you are producing audio content for Internet distribution, a lossless-to-lossy (e.g. WAV to MP3) conversion will significantly reduce bandwidth usage. Only lossless-to-lossy conversion is advised. The conversion process of lossy-to-lossy will further degrade audio quality by removing additional data, producing unexpected results.
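A minimal sketch of such a lossless-to-lossy conversion, assuming the ffmpeg command-line tool is installed and using invented file names and an illustrative bit-rate, is shown below.

import subprocess

# Convert a lossless WAV master into a 192 kbps MP3 delivery copy.
subprocess.run(
    ["ffmpeg", "-i", "master.wav", "-b:a", "192k", "delivery.mp3"],
    check=True,
)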

What Is The Best Solution?

Whether digitising analogue recordings or converting digital sound into another format, sample rate, bit rate and format compression will affect the resulting output. Quality assurance processes should compare the technical and subjective quality of the digital audio against the requirements of its intended purpose.

A simple suite of subjective criteria should be developed to check the quality of the digital audio. Specific checks may include the following questions:

Objective technical criteria should also be measured to ensure each digital audio file is of consistent or appropriate quality:

Further Information


Briefing 24

Handling International Text


Background

Digital text is one of the oldest description methods, but remains divided by differing file formats, character encodings and markup schemas. When choosing a digital text format it is necessary to establish the project's needs. Is plain text suitable for the task, or are text markup and formatting required? How will the information be displayed, and where? This document describes these issues and provides some guidelines.

What is the Best Tool for the Job?

Digital text has existed in one form or another since the 1960s. Many computer users take for granted that they can quickly write a letter without restriction or technical considerations. A commercial project, however, requires consideration of long-term needs and goals. To avoid complications at a later date, the developer must ensure the tools in use are the most appropriate for the task and, if not, what can be used in their place. To achieve this three questions should be answered:

  1. How will textual information be viewable for the user?
  2. What problems may I encounter if textual information is stored incorrectly?
  3. How will textual information be organized?

File Formats

It is often assumed that everyone can read text. However, this is not always the case. Digital text imposes restrictions upon the content that can have a significant impact upon the project.

In particular, there are two main issues:

The choice of format will be dependent upon the following factors:

Character Encoding

For universal information access, plain text remains useful. It has the advantage of being simple to interpret and small in file size. However, there are differences in the methods used to encode text characters. The most common variations are ASCII (American Standard Code for Information Interchange) and Unicode.
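The practical difference is easy to demonstrate: the Python sketch below encodes a short string containing accented characters, which UTF-8 (a Unicode encoding) handles but ASCII cannot.

text = "naïve café"

print(text.encode("utf-8"))       # a Unicode encoding copes with accented characters
try:
    text.encode("ascii")
except UnicodeEncodeError as error:
    print("ASCII cannot represent this text:", error)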

Problems

Several problems may be encountered when storing textual information. For text files it is a simple process to convert the file to Unicode. However, for more complex data, such as databases, the conversion process will become more difficult. Problems may include:
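For the simple text-file case mentioned above, a minimal conversion sketch might look like the following. The file names are invented, and the source encoding (assumed here to be Latin-1) must be known or detected before conversion.

# Re-encode a Latin-1 text file as UTF-8.
with open("catalogue-latin1.txt", encoding="latin-1") as source:
    text = source.read()
with open("catalogue-utf8.txt", "w", encoding="utf-8") as target:
    target.write(text)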

Structural Mark-up

Although ASCII and Unicode are useful for storing information, they are only able to describe each character, not how characters should be displayed or organized. Structural mark-up languages enable the designer to dictate how information will appear and to establish a structure for its layout. For example, the user can define a tag to store book author information and the publication date.

The use of structural mark-up can provide many organizational benefits:

The most common markup languages are SGML and XML. Based upon these languages, several schemas have been developed to organize and define data relationships. This allows certain elements to have specific attributes that define their method of use (see the Digital Rights document for more information). To ensure interoperability, XML is advised due to its support for contemporary Internet standards (such as Unicode).
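As a small illustration of structural mark-up, the following Python fragment builds an XML record for a book's author and publication date; the element names are invented rather than taken from any particular schema.

import xml.etree.ElementTree as ET

book = ET.Element("book")
ET.SubElement(book, "author").text = "Smith, Jane"
ET.SubElement(book, "publication_date").text = "2003-06-01"

print(ET.tostring(book, encoding="unicode"))
# <book><author>Smith, Jane</author><publication_date>2003-06-01</publication_date></book>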

Further Information


Briefing 25

Choosing A Suitable Digital Video Format


Background

Digital video can have a dramatic impact upon the user. It can reflect information that is difficult to describe in words alone, and can be used within an interactive learning process. This document contains guidelines to best practice when manipulating video. When considering the recording of digital video, the digitiser should be aware of the influence of file format, bit-depth, bit-rate and frame size upon the quality of the resulting video.

Composition of a Digital Video File

Digital video consists of a series of images played in rapid succession to create the illusion of movement. It is commonly accompanied by an audio track. Unlike graphics and sound that are relatively small in size, video data can be hundreds of megabytes, or even gigabytes, in size.

The visual and audio information are individually stored within a digital 'wrapper': an umbrella structure consisting of the video and audio data, as well as the information needed to play back and resynchronise the data.

What is the Best Solution?

Digital video remains a complex area that combines the problems of audio and graphic data. When choosing to encode video the designer must consider several issues:

  1. Are there any existing procedures to guide the encoding process?
  2. What type of delivery method will be used to distribute the video?
  3. What video quality is acceptable to the user?
  4. What type of problems are likely to be encountered?

Distribution Methods

The distribution method will have a significant influence upon the file format, encoding type and compression used in the project.

Removable media - Video distributed on CD-ROM or DVD is suited to progressive encoding methods that do not conduct extensive error checking. Although file size is not as critical as it is for Internet streaming, it continues to have some influence.

The compression type is dependent upon the needs of the user and the type of removable media:

Name                             Streaming  Progressive  Media                    Compression
Advanced Streaming Format (ASF)  Y          -            -                        Temporal
Audio Video Interleave (AVI)     -          Y            -                        Temporal
MPEG-1                           -          Y            VideoCD                  Temporal
MPEG-2                           -          Y            DVD                      Temporal
QuickTime (QT)                   Y          Y            -                        Temporal
QuickTime Pro                    Y          Y            -                        Temporal
RealMedia (RM)                   Y          Y            -                        Temporal
Windows Media Video (WMV)        Y          Y            -                        Temporal
DivX                             -          Y            Amateur CD distribution  Temporal
MJPEG                            -          Y            -                        Spatial

Table 1: A comparison list of the different file formats, highlighting their intended purpose and compression method.

Video Quality

The provision of video data for an Internet-based audience places specific restrictions upon the content. Quality of the video output is dependent upon three factors:

Screen Size Pixels per frame Bit depth (bits) Frames per second Bandwidth required (megabits per second)
640 x 480 307,200 24 30 221.184
320 x 240 76,800 16 25 30.72
320 x 240 76,800 8 15 9.216
160 x 120 19,200 8 10 1.536
160 x 120 19,200 8 5 0.768

Table 2: Indication of the influence that screen size, bit-depth and frames per second have upon required bandwidth
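The bandwidth figures in Table 2 follow directly from the frame size, bit depth and frame rate; the Python sketch below reproduces the first row.

# 640 x 480 pixels, 24-bit colour, 30 frames per second.
width, height, bit_depth, frames_per_second = 640, 480, 24, 30

pixels_per_frame = width * height                                    # 307,200
megabits_per_second = pixels_per_frame * bit_depth * frames_per_second / 1_000_000
print(pixels_per_frame, megabits_per_second)                          # 307200 221.184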

When creating video, the designer must balance the video quality with the facilities available to the end user. As an example, an 8-bit, 160 x 120 pixel video at 10-15 frames per second is used for the majority of content found on the Internet.

Problems

Video presents numerous problems for the designer caused by the complexity of formats and structure. Problems may include:

Definitions

Temporal Compression - Reduces the amount of data stored over a sequence of frames. Rather than describing every pixel in each frame, temporal compression stores a key frame, followed by descriptive information on changes.

Spatial Compression - Condenses each frame independently by mapping similar pixels within a frame. For example, two shades of red will be merged. This results in a reduction in image quality, but enables the file to be edited in its original form.

Progressive Encoding - Refers to any format where the user is required to download the entire video before they are allowed to watch it.

Internet Streaming - Enables the viewer to watch sections of video without downloading the entire thing, allowing users to evaluate video content after just a few seconds. Quality is significantly lower than with progressive formats due to the compression used.

Further Information


Briefing 26

Intellectual Property Rights


Introduction

Internet IPR is inherently complex, breaking across geographical boundaries and creating situations that are illegal in one country yet not in another, or that contradict existing laws on Intellectual Property. Copyright is a subset of IPR, which applies to all artistic works. It is automatically assigned to the creator of original material, allowing them to control all public usage (copying, adaptation, performance and broadcasting).

Ensuring that your organization complies with Intellectual Property rights requires a detailed understanding of two processes:

  1. Managing copyright on own work.
  2. Establishing ownership of 3rd party copyright.

Managing Copyright on Own Work

Unless indicated, copyright is assigned to the author of an original work. When producing work it is essential to establish who will own the resulting product: the individual or the institution. Objects produced at work or university may belong to the institution, depending upon the contract signed by the author. For example, the copyright for this document belongs to the AHDS, not the author. When approaching the subject, the author should consider several issues:

When producing work as an individual that is intended for later publication, the author should establish ownership rights to indicate how work can be used after initial publication:

Copyright Clearance

Copyright is an automatically assigned right. It is therefore likely that the majority of works in a digital collection will be covered by copyright, unless explicitly stated. The copyright clearance process requires the digitiser to check the copyright status of:

Copyright clearance should be established at the beginning of a project. If clearance is denied after the work has been included in the collection, it will require additional effort to remove it and may result in legal action from the author.

In the event that an author, or authors, cannot be contacted, the project is required to demonstrate that it has taken steps to contact them. Digital preservation projects are particularly difficult in this respect, since the researcher and the copyright owner may be separated by many years. In many cases, most notably the 1986 Domesday Project, it has proven difficult to trace authorship of 1,000+ pieces of work to individuals. In that project, the designers created a method of establishing permission and registering objections by providing contact details that an author could use to identify their work.

Indicating IPR through Metadata

If permission has been granted to reproduce copyright work, the institution is required by law to indicate intellectual property status. Metadata is commonly used for this purpose, storing and distributing IP data for online content. Several metadata bodies provide standardized schemas for copyright information. For example, IP information for a book could be stored in the following format.

<book id="bk112">
  <author>Galos, Mike</author>
  <title>Visual Studio 7: A Comprehensive Guide</title>
  <publish_date>2001-04-16</publish_date>
  <publisher>Addison Press</publisher>
  <copyright>Galos, M. 2001</copyright>
</book>

Access inhibitors can also be set to identify copyright limitations and the methods necessary to overcome them. For example, limiting e-book use to IP addresses within a university environment.

Further Information


Briefing 27

Implementing Quality Assurance For Digitisation


Background

Digitisation often involves working with hundreds or thousands of images, documents, audio clips or other types of source material. Ensuring these objects are consistently digitised and to a standard that ensures they are suitable for their intended purpose can be complex. Rather than being considered as an afterthought, quality assurance should be considered as an integral part of the digitisation process, and used to monitor progress against quality benchmarks.

Quality Assurance Within Your Project

The majority of formal quality assurance standards, such as ISO9001, are intended for large organisations with complex structures. A smaller project will benefit from establishing its own quality assurance procedures, using these standards as a guide. The key is to understand how work is performed and identify key points at which quality checks should be made. A simple quality assurance system can then be implemented that will enable you to monitor the quality of your work, spot problems and ensure the final digitised object is suitable for its intended use.

ISO 9001 identifies three steps in the introduction of a quality assurance system:

  1. Brainstorm: Identify specific processes that should be monitored for quality and develop ways of measuring the quality of these processes. You may want to think about:
    • Project goals: who will use the digitised objects and what function will they serve.
    • Delivery strategy: how will the digitised objects be delivered to the user? (Web site, Intranet, multimedia presentation, CD-ROM).
    • Digitisation: how will data be analysed or created. To ensure consistency throughout the project, all techniques should be standardized.
  2. Education: Ensure that everyone is familiar with the use of the system.
  3. Improve: Monitor your quality assurance system and look for problems that require correction or other ways it may be improved.

Key Requirements For A Quality Assurance System

First and foremost, any system for assuring quality in the digitisation process should be straightforward and not impede the actual digitisation work. Effective quality assurance can be achieved by performing four processes during the digitisation lifecycle:

  1. The key to a successful QA process is to establish a clear and concise work timeline and, using a step-by-step process, document how this can be achieved. This will provide a baseline against which actual work can be checked, promoting consistency, and making it easier to spot when digitisation is not going according to plan.
  2. Compare the digital copy with the physical original to identify changes and ensure accuracy. This may include, but is not limited to, colour comparisons, accuracy of text that has been scanned through OCR software, and reproduction of significant characteristics that give meaning to the digitised data (e.g. italicised text, colours).
  3. Perform regular audit checks to ensure consistency throughout the resource. Qualitative checks can be performed upon the original and modified digital work to ensure that any changes were intentional and processing errors have not been introduced. Subtle differences may appear in a project that takes place over a significant time period or is divided between different people. Technical checks may include spell checkers and the use of a controlled vocabulary to allow only certain specifically designed descriptions to be used. These checks will highlight potential problems at an early stage, ensuring that staff are aware of inconsistencies and can take steps to remove them. In extreme cases this may require the re-digitisation of the source data.
  4. Finally, measures should be taken to establish some form of audit trail that tracks progress on each piece of work. Each stage of work should be 'signed off' by the person responsible, and any unusual circumstances or decisions made should be recorded.

The ISO 9001 system is particularly useful in identifying clear guidelines for quality management.

Summary

Digitisation projects should implement a simple quality assurance system. Implementing internal quality assurance checks within the workflow allows mistakes to be spotted and corrected early-on, and also provides points at which work can be reviewed, and improvements to the digitisation process implemented.

Further Information


Briefing 28

Choosing An Appropriate Raster Image Format


Background

Any image that is to be archived for future use requires specific storage considerations. However, the choice of file formats is wide, each offering advantages and disadvantages that make it better suited to particular environments. When digitising images a standards-based and best practice approach should be taken, using images that are appropriate to the medium in which they are used. For disseminating the work to others, a multi-tier approach is necessary, enabling both a preservation copy and a dissemination copy to be stored. This document discusses the formats available, highlighting the different compression types, advantages and limitations of raster images.

Factors to Consider when Choosing Image Formats

When creating raster-based images for distribution, file size is the primary consideration. As a general rule, storage requirements increase in proportion to the improvement in image quality, and larger files take correspondingly longer to deliver over the network, limiting the amount that can be delivered to the user. For Internet delivery it is advised that designers provide a small image (30-100 KB) that can be accessed quickly by mainstream users, and provide a higher quality copy as a link or on a CD for professional usage.

When digitising the designer must consider three factors:

Distribution Methods

The distribution method will have a significant influence upon the file format, encoding type and compression used in the project.

To summarise, Table 1 shows the appropriateness of different file formats for streaming or progressive recording.

Format Maximum no. of colours Compression Type Suited for Issues
BMP 16,777,216 None General usage. Common on Windows platforms A Windows format rather than an Internet format. Unsupported by most browsers.
GIF87a 256 Lossless High quality images that do not require photographic details File sizes can be quite large, even with compression
GIF89a 256 Lossless Same as GIF87a, animation facilities are also popular See above
JPEG 16,777,216 Lossy High quality photographs delivered in limited bandwidth environment. Degrades image quality and produces wave-like artefacts on image.
PNG-8 256 Lossless Developed to replace GIF. Produces files 10-30% smaller than GIF. File sizes can be large, even with compression.
PNG-24 16,777,216 Lossless Preserves photograph information File sizes larger than JPG.
TIFF 16,777,216 Lossless Used by professionals. Redundant file information provides space for specialist uses (e.g. colorimetry calibration). Suitable for archival material. Unsuitable for Internet-delivery

Table 1: Comparison table of image file formats

Once chosen, the file format will, to a limited extent, dictate the possible file size, bit depth and compression method available to the user.

Compression Type

Compression type is a third important consideration for image delivery. As the name suggests, compression reduces file size by using specific algorithms. Two compression types exist:

Lossy compression is unsuitable for long-term preservation of archival masters. However, its small file sizes make it useful in many archives for displaying lower-resolution images to Internet users.

Bit-depth

Bit-depth refers to the maximum number of colours that can be displayed in an image. The number of colours available will rise when the bit depth is increased. Table 2 describes the relationship between the bit depth and number of colours.

Bit depth 1 4 8 8 16 24 32
Maximum No. of colours 2 16 256 256 65,536 16,777,216 16,777,216

Table 2: A conversion table showing the relationship between bit-depth and maximum number of colours

The reduction of bit-depth will have a significant effect upon image quality. Figure 3 demonstrates the quality loss that will be encountered when saving at a low bit-depth.

24-bit: Original image. 8-bit: Some loss of colour around the edges; suitable for thumbnail images. 4-bit: Major reduction in colours; petals consist almost solely of a single yellow colour. 1-bit: Only basic layout data remains.

Figure 3: Visual comparison of different bit modes

Image Conversion Between Different Formats

Image conversion is possible using a range of applications (Photoshop, Paint Shop Pro, etc.). Lossless-to-lossless conversion (e.g. PNG-8 to GIF89a) can be performed without quality loss. However, lossless-to-lossy (PNG-8 to JPEG) or lossy-to-lossy conversion will result in a quality loss, dependent upon the degree of compression used. For dissemination of high-quality images, a lossy format is recommended to reduce file size. Smaller images can be stored in a lossless format.
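A minimal sketch of a lossless-to-lossy conversion for Web delivery, using the third-party Pillow library (assumed to be installed) and invented file names and quality setting, is shown below.

from PIL import Image

# Open a lossless master and save a lossy JPEG copy for Web delivery.
master = Image.open("master.png")
master.convert("RGB").save("web-copy.jpg", quality=85)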

Further Information


Briefing 29

Choosing A Vector Graphics Format For The Internet


Background

The market for vector graphics has grown considerably, in part, as a result of improved processing and rendering capabilities of modern hardware. Vector-based images consist of multiple objects (lines, ellipses, polygons, and other shapes) constructed through a sequence of commands or mathematical statements to plot lines and shapes in a two-dimensional or three-dimensional space. For Internet usage, this enables graphics to be resized to ever increasing screen resolutions without concern that an image will become 'jaggy' or unrecognisable.

File Formats

Several vector formats exist for use on the Internet. These construct information in the same way yet provide different functionality. The table below provides a breakdown of the main formats.

Name Developer Availability Viewers Uses
Scalable Vector Graphics (SVG) W3C Open standard Internet browser Internet-based graphics
Shockwave/Flash Macromedia Proprietary Flash plugin for browser Video media and multimedia presentation
Vector Markup Language (VML) W3C Open standard MS Office, Internet Explorer, etc. XML-based format.

For Internet delivery of static images, the W3C recommends SVG as a standard open format for vector diagrams. VML is also common, being the XML language exported by Microsoft products. For text-based vector files, such as SVG and VML, the user is recommended to save content in Unicode.

If the vector graphics are to be integrated into a multimedia presentation or animation, Shockwave and Flash offer significant benefits, enabling vector animation to be combined with audio.

Creating Vector Graphics

A major feature of vector graphics is their ability to construct detailed objects that can be resized without quality loss. XML (Extensible Markup Language) syntax, the basis of the SVG and VML languages, is understandable by non-technical users who wish to understand the object being constructed. The example below demonstrates the ability to create shapes using a few commands: the code in Figure 1 draws a red circle with a black outline.

<svg width="8in" height="8in">
<desc>This is a red circle with a black outline</desc>
<g><circle style="fill: red; stroke: black" cx="200" cy="200" r="100"/>
<text x="2in" y="2in">Hello World</text></g>
</svg>

Figure 1: SVG graphics and associated code

XML Conventions

Although XML enables the creation of a diversity of data types it is extremely meticulous regarding syntax usage. To remain consistent throughout multiple documents and avoid future problems, several conventions are recommended:

The use of XML enables a high level of interoperability between formats. When converting for a target audience, the designer has two options:

  1. Vector-to-Raster conversion - Raster conversion should be used for illustrative purposes only. The removal of all co-ordinate data eliminates the ability to edit the files at a later date.
  2. Vector-to-Vector conversion - Vector-to-vector conversion enables data to be converted into different languages. The use of XML enables the user to manually convert between two different formats (e.g. SVG to VML).

At the start of development it may help to ask your team the following questions:

  1. What type of information will the graphics convey? (Still images, animation and sound, etc.)
  2. What type of browser/operating system will be used to access the content? (Older browsers and non Mac/PC browsers have limited or no support for XML-based languages.)

Further Information


Briefing 30

Summary of the QA Focus Methodology


Background

In order to provide value for money and a return on investment for the funders, there is a need for project deliverables not only to be functional in their own right but also to be widely accessible, easily repurposed and capable of deployment in a service environment.

To achieve these aims projects should ensure that their deliverables comply with appropriate standards and best practices. Although it may be easy to require compliance, it may not always be easy to implement appropriate standards and best practices. In order to ensure that best endeavours are made it is recommended that projects should implement quality assurance (QA) procedures.

QA Focus's Methodology

Projects may be concerned that implementation of QA procedures can be time-consuming. The approach recommended by QA Focus is designed to be lightweight and to avoid unnecessary bureaucracy, while still providing a mechanism for implementation of best practices.

The QA Focus methodology is based on the following:

It is felt that use of this methodology should not only be beneficial to the projects themselves, but also help to minimise problems when project deliverables are re-used.

Example: QA For Web Sites

As an example of implementation of this approach the QA policy for standards for the QA Focus Web site is given below.

Area: Web site standards

Standards: The Web site will be based on the XHTML 1.0 and CSS 2.0 standards.

Architecture: The Web site will make use of PHP. XHTML 1.0 templates will be provided for use by authors, who will use simple HTML tools such as HTML-kit. The Web site will provide access to an MS Access database. This will also comply with XHTML 1.0 and CSS 2.0 standards. The Web site will also host MS Word and MS PowerPoint files. These documents will also be available in HTML.

Exceptions: Resources converted from proprietary formats (such as MS Word and PowerPoint) need not necessarily comply with XHTML and CSS standards if doing so would be too time-consuming.

Responsibilities: The QA Focus project manager is responsible for changing this policy and addressing serious deviations from the policy.

Checking: Resources should be validated when they are created or updated, usually using the ,validate tool. When several resources are updated the ,rvalidate tool should be used.

Audit trail: A full audit should be carried out at least quarterly. The findings should be published on the QA Focus Web site, and deviations from the policy documented.

A second example describes the QA policy for link checking of the QA Focus Web site.

Area: Web site: link checking

Best Practice: There should be no internal broken links and links to external resources should work when a page is created. We should seek to fix broken links to external resources.

Exceptions: There may be broken links in historical documents or surveys. In addition, if remote Web sites are updated it may be too time-consuming to update the links.

Change Control: The QA Focus project manager is responsible for changing this policy and addressing serious deviations from the policy.

Checking: When resources are created or updated the resource should be link-checked, usually using the ,checklink tool. When several resources are updated the ,rchecklink tool should be used.

Audit trail: A full audit should be carried out at least quarterly. Initially two tools should be used to spot deficiencies in the link-checking software. The findings should be published on the QA Focus Web site, and deviations from the policy documented.

These two examples illustrate that developing QA policies need not be time-consuming. In addition implementation of these policies need not be time-consuming and can improve the quality of the Web site.


Briefing 31

Matrix for Selection of Standards


Background

JISC and the JISC advisory services provide advice on a wide range of standards and best practices which seek to ensure that project deliverables are platform- and application-independent, accessible, interoperable and suitable for re-purposing.

The standards and best practices which the JISC advisory services recommend have been developed with these aims in mind.

Challenges

Although use of recommended standards and best practices is encouraged, there may be occasions when this is not possible:

Building on existing systems: Projects may be based on development of existing systems, which do not use appropriate standards.
Standards immature: Some standards may be new, and there is a lack of experience in their use. Although some organisations may relish the opportunity to be early adopters of new standards, others may prefer to wait until the benefits of the new standards have been established and many teething problems resolved.
Functionality of the standard: Does the new standard provide functionality which is required for the service to be provided?
Limited support for standards: There may be limited support for the new standards. For example, there may be a limited range of tools for creating resources based on the new standards or for viewing the resources.
Limited expertise: There may be limited expertise for developing services based on new standards or there may be limited assistance to call on in case of problems.
Limited timescales: There may be insufficient time to gain an understanding of new standards and gain experience in use of tools.

In many cases standards will be mature and expertise readily available. The selection of the standards to be deployed can be easily made. What should be done when this isn't the case?

A Matrix Approach

In light of the challenges which may be faced when wishing to make use of recommended standards and best practices it is suggested that projects use a matrix approach to resolving these issues.

Area Your Comments
Standard
How mature is the standard?  
Does the standard provide required functionality?  
Implementation
Are authoring tools which support the standard readily available?  
Are viewing tools which support the standard readily available?  
Organisation
Is the organisation culture suitable for deployment of new standards?  
Are there strategies in place to continue development in case of staffing changes?  

Individual projects will need to formulate their own matrix which covers issues relevant to their particular project, funding, organisation, etc.

Implementation

This matrix approach is not intended to provide a definitive solution to the selection of standards. Rather it is intended as a tool which can assist projects when they go through the process of choosing the standards they intend to use. It is envisaged that projects will document their comments on issues such as those listed above. These comments should inform a discussion within the project team, and possibly with the project's advisory or steering group. Once a decision has been made the rationale for the decision should be documented. This will help to ensure that the reasoning is still available if project team members leave.

For examples of how projects have addressed the selection of standards, see:


Briefing 32

Changing A Project's Web Site Address


Background

A project's Web site address will provide, for many, the best means for finding out about the project, reading about its activities and using the facilities which the project provides. It is therefore highly desirable that a project's Web site address remains stable. However there may be occasions when it is felt necessary to change a project's Web site address. This document provides advice on best practices which should help to minimise problems.

Best Practices For A Project Web Site Address

Ideally the entry point for a project's Web site will be short and memorable. However this ideal is not always achievable: in practice we are likely to find that institutional or UKERNA guidelines on Web addresses preclude this option.

The entry point should be a simple domain name such as <http://www.project.ac.uk/> or a directory such as <http://www.university.ac.uk/depts/library/project/>. Avoid use of a file name such as <http://www.university.ac.uk/depts/library/project/index.html> as this makes the entry point longer and less memorable and can cause problems if the underlying technologies change.

Reasons For Changing

If the address of a project Web site is determined by institutional policies, it is still desirable to avoid changing the address unnecessarily. However there may be reasons why a change to the address is needed.

Implementing Best Practices:
There may be an opportunity to implement best practices for the address which could not be done when the Web site was launched.
Changes In Organisation's Name:
The name of an institution may change e.g. the institution is taken over or merges with another institution.
Changes In Organisational Structure:
The organisational structure may change e.g. departments may merge or change their name.
Changes In Project Partners:
The project partner hosting the Web site may leave the project.
Project Becomes Embedded In Organisation:
The project may become embedded within the host institution and this requires a change in the address.
Project Is Developed With Other Funding Streams:
The project may continue to be developed through additional funding streams and this requires a change in the address.
Project Becomes Obsolete:
The project may be felt to be obsolete.
Technical Changes:
Technological changes may necessitate a change in the address.
Changes In Policies:
Institutional policy changes may necessitate a change in the address.
Changes In Web Site Function:
The project Web site may change its function or additional Web sites may be needed. For example, the main Web site may initially be about the project and a new Web site is to be launched which provides access to the project deliverables.

Advice On Changing Addresses

Projects should consider potential changes to the Web site address before the initial launch and seek to avoid future changes or to minimise their effect. However if this is not possible the following advice is provided:

Monitor Links:
Prior to planning a change, use the www.linkpopularity.com (or equivalent) service to estimate the number of links to your Web site.
Monitor Search Engines:
Examine the number of resources from your Web site which are indexed by popular search engines.

This information will give you an indication of the impact a change to your Web site address may have. If you intend to change the address you should:

Consider Technical Issues:
How will the new Web site be managed? How will resources be migrated?
Consider Migration:
How will the change of address be implemented? How will links to the old address be dealt with? How will you inform users of the change?
Inform Stakeholders:
Seek to inform relevant stakeholders, such as funding bodies, partners and others affected by the change.

Checking Processes

It is advisable to check links prior to the change and afterwards, to ensure that no links are broken during the change. You should seek to ensure that links on your Web site go to the new address.
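A very simple link check can be scripted, as in the Python sketch below; the URLs are illustrative, and in practice the list would be harvested from the Web site or a dedicated link-checking tool would be used.

import urllib.request
import urllib.error

for url in ("http://www.ukoln.ac.uk/qa-focus/", "http://www.example.org/old-page/"):
    try:
        with urllib.request.urlopen(url) as response:
            print(url, response.status)
    except urllib.error.URLError as error:
        print(url, "FAILED:", error)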


Briefing 33

Implementing A Technical Review


Background

When projects submit an initial proposal the project partners will probably have an idea as to the approaches which will be taken in order to provide the project deliverables. During the project's life it may be desirable to review the approaches which were initially envisaged and, if necessary, to make changes. This document describes possible approaches to periodic reviews.

Reasons For A Review

There are a number of reasons why a technical review may be necessary:

Technological issues:
There may be changes with underlying technologies. For example the software which was initially envisaged being used may be found to be inappropriate or alternative software may be felt to provide advantages.
Staffing issues:
There may be staffing changes. For example, key technical staff may leave and be difficult to replace.
Organisational issues:
There may be changes within the organisation which is providing the project.
Changing requirements:
There may be changes in the requirements for the project, following, say, a user needs requirements survey.
Ensure that deliverables comply with standards and best practices:
It may be necessary to ensure that the project has implemented quality assurance processes to ensure that project deliverables comply with appropriate standards and best practices.

A project review may, of course, also address non-technical issues.

Approaches To A Review

Projects may find it useful to allocate some time during the project life span to a technical review of the project.

Review by development team:
The project development team may wish to reflect on the approaches they have taken. They may be encouraged to provide a report to the project manager.
Review by project partners:
The project partners may be involved in the review process.
Review involving third parties:
The project team may wish to invite external bodies to participate in the review.
Comparison with one's peers:
You may choose to compare your deliverables with those of your peers, such as similar projects. This approach is particularly suited to reviewing publicly available deliverables such as Web sites.

When organising a project review you should take care to ensure that the review is handled in a constructive manner.

Outputs From A Review

It is important to note that any improvements or changes which may have been identified during a review need not necessarily be implemented. There may be a temptation to implement best practices when good practices are sufficient, and implementing best practices may take longer than envisaged. The outputs from a review may be:

Better understanding:
The review may have an educational role and allow project partners to gain a better understanding of issues.
Enhanced workflow practices:
Rather than implementing technical changes the review may identify the need for improvements to workflow practices.
Documenting lessons:
The review may provide an opportunity to document limitations of the existing approach. The documentation could be produced for use by project partners, or could be made more widely available (e.g. as a QA Focus Case Study).
Deployed in other areas:
The recommendations may be implemented in other areas in which the project partners are involved.
Implemented within project:
The recommendations may be implemented within the project itself. If this is the case it is important that the change is driven by project needs and not purely on technical grounds. The project manager should normally approve significant changes and other stakeholders may need to be informed.

Conclusions

It can be useful to allocate time for a mid-project review to ensure that project work is proceeding satisfactorily. This can also provide an opportunity to reassess the project's technical architecture.


Briefing 34

Use Of Cascading Style Sheets (CSS)


Background

This document reviews the importance of Cascading Style Sheets (CSS) and highlights the importance of ensuring that use of CSS complies with CSS standards.

Why Use CSS?

Use of CSS is the recommended way of defining how HTML pages are displayed. You should use HTML to define the basic structure (using elements such as <h1>, <p>, <li>, etc.) and CSS to define how these elements should appear (e.g. headings should be in a bold Arial font, paragraphs should be indented, etc.).

This approach has several advantages:

Maintenance:
It is much easier to maintain the appearance of a Web site. If you use a single CSS file, updating this file allows the Web site's look-and-feel to be altered easily; in contrast, use of HTML formatting elements would require every file to be updated to change the appearance.
Functionality:
CSS provides rich functionality, including defining the appearance of HTML pages when they are printed.
Accessibility:
Use of CSS provides much greater accessibility, allowing users with special needs to alter the appearance of a Web page to suit their requirements. CSS also allows Web pages to be more easily rendered by special devices, such as speaking browsers, PDAs, etc.

There are disadvantages to the use of CSS. In particular legacy browsers such as Netscape 4 have difficulty in processing CSS. However, since such legacy browsers are now in a minority, the biggest barrier to deployment of CSS is probably a lack of understanding, or inertia.

Approaches To Use Of CSS

There are a number of ways in which CSS can be deployed:

External CSS Files:
The best way to use CSS is to store the CSS data in an external file and link to this file using the <link> HTML element. This approach allows the CSS definitions to be used by every page on your Web site (a short example is given after this list).
Internal CSS:
You can store CSS within an HTML file by including it using the <style> element within the <head> section at the top of your HTML file. However this approach means the style definitions cannot be applied to other files. This approach is not normally recommended.
Inline CSS:
You can embed your CSS inline with HTML elements: for example <p style="color: red"> uses CSS to specify that text in the current paragraph is red. However this approach means that the style definitions cannot be applied to other paragraphs. This approach is discouraged.
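
The fragment below is a minimal sketch of the recommended external approach (the file name style.css is purely illustrative). The HTML page links to the style sheet:

<link rel="stylesheet" type="text/css" href="style.css" />

and the file style.css defines the appearance of the structural elements, for example:

h1 { font-family: Arial, sans-serif; font-weight: bold; }
p  { margin-left: 2em; }

Updating style.css then changes the appearance of every page which links to it, illustrating the maintenance advantage described above.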

Ensure That You Validate Your CSS

As with HTML, it is important that you validate your CSS to ensure that it complies with appropriate CSS standards. There are a number of approaches you can take:

Within your HTML editor:
Your HTML editing tool may allow you to create CSS. If it does, it may also have a CSS validator.
Within a dedicated CSS editor:
If you use a dedicated CSS editor, the tool may have a validator.
Using an external CSS validator:
You may wish to use an external CSS validator. This could be a tool installed locally or a Web-based tool such as those available at W3C [1] and the Web Design Group [2] .

Note that if you use external CSS files, you should also check that the link to the CSS file works.

Systematic CSS Validation

You should ensure that you have systematic procedures for validating your CSS. If, for example, you make use of internal or inline CSS you will need to validate the CSS whenever you create or edit an HTML file. If, however, you use a small number of external CSS files and never embed CSS in individual HTML files you need only validate your CSS when you create or update one of the external CSS files.

References

  1. CSS Validator, W3C, <http://jigsaw.w3.org/css-validator/>
  2. CSSCheck, WDG, <http://www.htmlhelp.com/tools/csscheck/>

Briefing 35

Deployment Of XHTML 1.0


Background

This document describes the current recommended versions of HTML. The advantages of XHTML 1.0 are given together with potential challenges in deploying XHTML 1.0 so that it follows best practices.

Versions Of HTML

HTML has evolved since it was first created, responding to the need to provide richer functionality, maximise its accessibility and allow it to integrate with other architectural developments. The final version of the HTML language is HTML 4.0. This version is mature and widely supported, with a wide range of authoring tools available and support provided in Web browsers.

However HTML has limitations: HTML resources cannot easily be reused; it is difficult to add new features to the HTML language; and it is difficult to integrate HTML pages with other markup languages (e.g. MathML for including mathematical expressions, SVG for including scalable vector graphics, etc).

XHTML 1.0

XHTML was developed to address these concerns. XHTML is the HTML language reformulated as an application of XML. This means that the many advantages of XML (the ability to reuse resources using the XSLT language, the ability to integrate other XML applications, etc.) are available to authors creating conventional Web pages.

In order to support migration from HTML to a richer XHTML world, XHTML has been designed so that it is backwards compatible with the current Web browsers.
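
To illustrate the differences, a minimal XHTML 1.0 page is sketched below (the Strict DOCTYPE is used and the page content is purely illustrative). Note the XML declaration, the DOCTYPE, the XHTML namespace and the requirement that elements are written in lower case and are properly closed:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>An example XHTML 1.0 page</title>
</head>
<body>
  <p>All elements are closed and written in lower case.</p>
</body>
</html>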

Since XHTML 1.0 provides many advantages and can be accessed by current browsers it would seem that use of XHTML 1.0 is recommended. However there are a number of issues which need to be addressed before deploying XHTML 1.0 for your Web site.

Deployment Issues

Compliance

Although HTML pages should comply with the HTML standard, browsers are expected to be tolerant of errors. Unfortunately this has led to an environment in which many HTML resources are non-compliant. This environment makes it difficult for other applications to repurpose HTML. It also makes rendering of HTML resources more time-consuming than it should be, since browsers have to identify errors and seek to render the resources in a sensible way.

The XML language, by contrast, mandates that XML resources comply with the standard. This has several advantages: XML resources will be clean, enabling them to be more easily reused by other applications; applications will be able to process the resources more rapidly; etc. Since XHTML is an XML application, an XHTML resource must be compliant in order for it to be processed as XML.

XHTML 1.0 And MIME Types

Web browsers identify file formats by checking the resource's MIME type. HTML resources use a text/html MIME type. XHTML resources may use this MIME type; however the resources will not be processed as XML, therefore losing the benefits provided by XML. Use of the application/xhtml+xml MIME type allows resources to be processed as XML. This MIME type is therefore recommended if you wish to exploit XML's potential.

Implementation Issues

You should be aware of implementation issues before deploying XHTML 1.0:

Guaranteeing Compliance:
You must ensure that your resources are compliant. Unlike HTML, non-compliant resources should not be processed by XML tools. This may be difficult to achieve if you do not have appropriate tools and processes.
Browser Rendering:
Although use of the application/xhtml+xml MIME type is recommended to maximise the potential of a more structured XML world, this environment is not tolerant of errors. Use of the text/html MIME type will allow non-compliant XHTML resources to be viewed, but exploiting this feature simply perpetuates the problems of a HTML-based Web.
Resource Management:
It is very important that you give thought to the management of a Web site which uses XHTML. You will need to ensure that you have publishing processes which avoid resources becoming non-compliant. You will also need to think about the approaches to allocating MIME types.

Conclusions

Use of XHTML 1.0 and the application/xhtml+xml MIME type provides a richer, more reusable Web environment. However there are challenges to consider in deploying this approach. Before deploying XHTML you must ensure that you have addressed the implementation difficulties.


Briefing 36

IMS Question And Test Interoperability


Introduction

This document describes an international specification for computer based questions and tests, suitable for those wishing to use computer based assessments in courses.

What Is IMS Question And Test Interoperability?

Computers are increasingly being used to help assess learning, knowledge and understanding. IMS Question and Test Interoperability (QTI) [1] is an international specification for a standard way of sharing such test and assessment data. It is one of a number of such specifications being produced by the IMS Global Learning Consortium to support the sharing of computer based educational material such as assessments, learning objects and learner information.

This new specification is now being implemented within a number of assessment systems and Virtual Learning Environments. Some systems store the data in their own formats but support the export and import of question data in IMS QTI format. Other systems operate directly on IMS QTI format data. Having alternative systems conforming to this standard format means that questions can be shared between institutions that do not use the same testing systems. It also means that banks of questions can be created that will be usable by many departments.

Technical Details

The QTI specification uses XML (Extensible Markup Language) to record the information about assessments. XML is a powerful and flexible markup language that uses 'tags' rather like HTML. The IMS QTI specification was designed to be pedagogy and subject neutral. It supports five different types of user response (item selection, text input, numeric input, xy-position selection and group selection) that can be combined with several different input techniques (radio button, check box, text entry box, mouse xy position dragging or clicking, slider bar and others). It is able to display formatted text, pictures, sound files, video clips and even interactive applications or applets. How any particular question appears on the screen and what the user has to do to answer it may vary between different systems, but the question itself, the knowledge or understanding required to answer it, the marks awarded and the feedback provided should all remain the same.

The specification is relatively new. Version 1.2 was made public in 2002, and a minor upgrade to Version 1.2.1 was made early in 2003, which corrected some errors and ambiguities. The specification is complex, comprising nine separate documents. Various commercial assessment systems (e.g. Questionmark [2], MedWeb, Canvas Learning [3]) have implemented some aspects of IMS QTI compatibility for their assessments. A number of academic systems are also being developed to comply with the specification. These include the TOIA project [4], which will have editing and course management facilities, the SToMP system [5], which was used with students for the first time in 2002, and a Scottish Enterprise system called Oghma which is currently being developed.

Discipline Specific Features

A disadvantage of such a standard system is that particular features required by some disciplines are likely to be missing. For example, engineering and the sciences need to be able to deal with algebraic expressions, the handling of both accuracy and precision of numbers, the use of alternative number bases, the provision of randomised values, and graphical input. Language tests need better textual support such as the presetting of text entry boxes with specific text and more sophisticated text based conditions. Some of these features are being addressed by groups such as the CETIS assessment SIG [6].

What This Means To You

If you are starting or planning to start using computer based tests, then you need to be aware of the advantages of using a standard-compliant system. It is clearly a good idea to choose a system that will allow you to move your assessments to another system at a later time with the minimum of effort or to be able to import assessments authored elsewhere.

A consideration to bear in mind, however, is that at this early stage in the life of the specification there will be a range of legacy differences between various implementations. It will also remain possible with some 'compliant' systems to create non-standard question formats if implementation specific extensions are used. The degree of conformity of any one system is a parameter that is difficult to assess at any time. Tools to assist with this are now beginning to be discussed, but it will be some time before objective measures of conformance will be available. In view of this it is a good idea to keep in touch with those interested in the development of the specification, and the best way within UK HE is probably via the CETIS Assessment Special Interest Group Web site [7].

It is important that the specification should have subject specific input from academics. The needs of different disciplines are not always well known and the lack of specific features can make adoption difficult. Look at the examples on the CETIS Web site and give feedback on areas where your needs are not being met.

References And Further Information

  1. QTI Specification,
    <http://www.imsglobal.org/>
  2. Questionmark,
    <http://www.questionmark.com/>
  3. Canvas Learning Author and Player,
    <http://www.canvaslearning.com/>
  4. TOIA,
    <http://www.toia.ac.uk>
  5. SToMP,
    <http://www.stomp.ac.uk/>
  6. CETIS Assessment Special Interest Group,
    <http://www.cetis.ac.uk/assessment>
  7. CETIS,
    <http://www.cetis.ac.uk/>

The following URLs may also be of interest.

Acknowledgments

This document was originally written by Niall Sclater and Rowin Cross of CETIS and adapted by Dick Bacon, Department of Physics, University of Surrey, consultant to the LTSN Physical Sciences Centre.

The original briefing paper (PDF format) is available on the CETIS Web site. The version available on this Web site was originally published in the LTSN Physical Science News (Centre News issue 10).


Briefing 37

Top 10 Quality Assurance Tips


The Top 10 Tips

1 Document Your Policies

You should ensure that you document policies for your project - remember that it can be difficult to implement quality if there isn't a shared understanding across your project of what you are seeking to achieve. For example, see the QA Focus policies on Web standards and link checking [1] [2].

2 Ensure Your Technical Infrastructure Is Capable Of Implementing Your Policies

You should ensure that your technical infrastructure is capable of implementing your policies. For example, if you wish to make use of XHTML on your Web site you are unlikely to be able to achieve this if you are using Microsoft Word as your authoring tool.

3 Ensure That You Have The Resources Necessary To Implement Your Policies

You should ensure that you have the resources needed to implement your policies. This can include technical expertise, investment in software and hardware, investment in training and staff development, etc.

4 Implement Systematic Checking Procedures To Ensure Your Policies Are Being Implemented

Without systematic checking procedures there is a danger that your policies are not implemented in practice. For example, see the QA Focus checking procedures for Web standards and linking [3] [4].

5 Keep Audit Trails

You should seek to provide audit trails which provide a record of results of your checking procedures. This can help to spot trends which may indicate failures in your procedures (for example, a sudden growth in the numbers of non-compliant HTML resources may be due to deployment of a new authoring tool, or a lack of adequate training for new members of the project team).

6 Learn From Others

Rather than seeking to develop quality assurance policies and procedures from scratch you should seek to learn from others. You may find that the QA Focus case studies [5] provide useful advice which you can learn from.

7 Share Your Experiences

If you are in the position of having deployed effective quality assurance procedures it can be helpful for the wider community if you share your approaches. For example, consider writing a QA Focus case study [6].

8 Seek 'Fitness For Purpose' - Not Perfection

You should seek to implement 'fitness for purpose' which is based on the levels of funding available and the expertise and resources you have available. Note that perfection is not necessarily a useful goal to aim for - indeed, there is a danger that 'seeking the best may drive out the good'.

9 Remember That QA Is For You To Implement

Although the QA Focus Web site provides a wide range of resources which can help you to ensure that your project deliverables are interoperable and widely accessible you should remember that you will need to implement quality assurance within your project.

10 Seek To Deploy QA Procedures More Extensively

Rather than implementing quality assurance only within your project, it can be beneficial if quality assurance is implemented at a higher level, such as within your department or organisation. If you have an interest in more widespread deployment of quality assurance, you should read about the ISO 9000 QA standards [7].

References

  1. Policy on Web Standards, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/web/>
  2. Policy on Linking, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/links/>
  3. Procedures for Web Standards, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/procedures/web/>
  4. Procedures for Linking, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/procedures/links/>
  5. Case Studies, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/>
  6. Contributing To Case Studies, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/#contributing>
  7. Selection and Use of the ISO 9000:2000 family of standards, ISO,
    <http://www.iso.org/iso/en/iso9000-14000/understand/selection_use/selection_use.html>

Briefing 38

From Project To Production Service


Background

Project deliverables are normally expected to be deployed in a service environment. The deliverables could be passed on to an existing JISC service provider. In some cases, however, a project may evolve into a service. This document outlines some of the issues that need to be considered to facilitate such a transition.

If evolving to a service is not relevant to your project, the issues services need to address when deploying your project deliverables may still be of interest.

Hosting

Hosting of your project deliverables is one of the first issues to be considered. A prototype service may be developed on in-house equipment and in an environment which may not be appropriate for long term production. Issues to consider include:

Data Feeds

Your service may require regular updates of the raw data which the service is delivering to users. Issues to consider when moving into a production environment include:

Gateway Links

The JISC supports a range of subject-specific gateway services. Decide which gateway, if any, your service fits into. The subject matter of your service may span more than one area and therefore need to be incorporated in more than one gateway.

Review the RDN [1] and the services within it and see where a description and link to your service may fit. Arrange for your service to be made visible. The more links that are established to your service, the more likely it is to become visible to search engines such as Google and the more successful it is likely to be in terms of awareness and take-up.

Legal Issues

When an experimental or development system is turned into a production service, there are a number of copyright, licensing and other legal issues that need to be carefully considered.

Does your service contain any material that is subject to copyright or IPR legislation? This could include such things as images, artwork, extracts from publications, sound or movie clips and so on. If it does, you will need to get permission before you can 'publish' your site.

Have you considered how accessible your service is to those with special needs or disabilities? There are now legal obligations that need to be taken into account before releasing a new system.

The JISC TechDis service [2] provides information on how to make your Web site conform. Also consult the appropriate QA Focus document on Accessibility Testing [3] and the JISC Legal Information Service [4] for a range of advice on issues such as Data Protection, Freedom of Information, Disability and the Law, Intellectual Property and much else.

Managing Expectations

As soon as you have a reliable release date, publicise the fact on relevant JISCMail and other lists. Keep people informed as to the progress of the new service as launch day approaches.

As soon as delays appear inevitable, let people know, even if a revised date hasn't been fixed. This will help front-line staff, who will have to support your service, decide on their own local information strategy.

Launching the Service

The move of an experimental or development service into a full production service provides a 'hook' for raising its profile. Things to consider include:

Support and Publicity

Consider the kind of support and publicity materials that are appropriate for your service. Examples include:

Think about the target audience for the material. You may want to produce different versions for users from different backgrounds and experience. Consider which items may be worth printing (as opposed to being made available on the Web). For example posters and flyers are useful for distribution at events such as conferences and exhibitions. Review what other JISC services have done and discuss their experiences with them.

You should also seek advice from the JISC's Communications and Marketing Team [5] who maintain a register of key events and are able to help with such things as preparing and issuing press releases.

Service Development

Once your service is in production there will be a requirement to improve or update the service and to fix problems. User feedback on suggested service improvements or errors should be gathered through a contact publicised on the service's Web site.

Presentations and demonstrations provide forums for discussion and constructive criticism. Find out if there is an existing user group who will extend their remit to cover your service.

When changes are identified and implemented, ensure that the change is publicised well in advance. Unless the change is an important bug fix, try to make the changes infrequently, preferably to coincide with term-breaks.

Service Monitoring

Check if your service will come under the remit of the JISC's Monitoring Unit [6]. If it does, you will need to agree a service level definition with them. Typically you will also need to:

References

  1. Resource Discovery Network (RDN),
    <http://www.rdn.ac.uk/>
  2. TechDis,
    <http://www.techdis.ac.uk/>
  3. Accessibility Testing, QA Focus, briefing paper no. 2,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-02/>
  4. JISC Legal Information Service,
    <http://www.jisclegal.ac.uk/>
  5. Communications and Marketing Team, JISC,
    <http://www.jisc.ac.uk/index.cfm?name=people_masterlist#outreach>
  6. JISC Monitoring Unit,
    <http://www.mu.jisc.ac.uk/>

Briefing 39

Planning An End User Service


Background

For some projects, it will be clear from the start that the intention is to transition the project into an end-user service, either hosted by the project itself, or by another host such as a national data centre.

Other projects may have the potential for development into a production service, but without this being a declared aim of the project.

In both cases, it is sensible to think carefully about how the system might fit into a service environment at the planning and design stage, to avoid costly re-engineering and retro-fitting of features later on.

Software Environment

The software regime that may seem most appropriate for an experimental development environment may not be the best choice when running a large-scale end-user service. Issues to think about include:

Consultations

A key factor in the success of any project is careful preparation and planning. If you intend your project to develop into an end-user production service, it is worth spending time and effort in the early stages of the project testing your ideas and designs. It is easier to rewrite a specification document than to re-engineer a software product.

Depending on the nature of the project, some of the following may be worth considering:

Authentication and Authorisation

Controlling access to your service may not be an issue when it is in an experimental or development phase, but will become an important consideration if it is released into service.

Some issues to review include:

Legal Issues

When your project reaches the stage of being turned into a production service with large numbers of users, consideration will need to be given to issues which are less important during the development phase.

It is helpful to be aware of these at an early stage in the planning and design of the project to avoid difficult problems later. Some things you should think about include:

Planning for Maintenance

It is to be expected that a Web-based user service will require maintenance, revision and updating during its lifetime. There may be requests for new features, or for modifications to the way existing facilities work.

Bear in mind that the people doing this work may not be the original project team that created the service. It is important that the end-products are designed and structured in such a way as to allow parts of the system to be modified and updated by others who are less familiar with the system without unexpected consequences.

Therefore, when starting to develop a new system:

References

  1. Athens access management services,
    <http://www.athens.ac.uk/>
  2. Internet 2,
    <http://middleware.internet2.edu/>
  3. TechDis,
    <http://www.techdis.ac.uk/>
  4. Accessibility Testing, QA Focus briefing paper no. 2, UKOLN
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-02/>
  5. JISC Legal Information Service,
    <http://www.jisclegal.ac.uk/>

Briefing 40

Top 10 Tips For Service Deployment


About This Document

This document provides top tips which can help to ensure that project deliverables can be deployed into a service environment with the minimum of difficulties.

The Top 10 Tips

1 Document The Technical Architecture For Your Project

Provide a description of the technical architecture of aspects of your project which are intended for deployment into service. The description will be helpful for the service provider. In addition it can help the funders in gaining an appreciation of the technical approaches being taken by projects across a digital library programme as well as being of value to your project team (especially if staff leave).

2 Document Any Deviations From Use Of Recommended Standards Or Best Practices

If your project has deviated from recommended standards or best practices you should document the deviations and the reasons for them. This will help the service provider to understand the implications of deploying your deliverables.

3 Document Use Of Unusual Or Innovative Aspects Of Your Project

If you are making use of any new standards or unusual technologies you should document this, and explain the reasons for your choice. This could include use of emerging standards (e.g. SVG, SMIL), use of Content Management Systems, etc.

4 Have An Idea Of Where You Envisage Your Project Deliverables Being Deployed

Give some thought to where your deliverables will be deployed. This could be by a JISC Service, within your institution, within other institutions or elsewhere.

5 Seek To Make The Service Provider Aware Of Your Project

You should seek to make contact with the service provider for your deliverables. You should seek to gain an understanding of their requirements (e.g. see [1] [2]). In addition it can help if the service provider is aware of your work and any special requirements associated with your project.

6 Be Aware Of Legal, IPR, etc. Barriers To Service Deployment

The service provider will need to ensure that there are no legal barriers to the deployment of your deliverables. This can include clarifying copyright, IPR and accessibility issues.

7 Ensure Your Have Any Documentation Which Is Necessary To Assist Service Deployment

You should ensure that you provide installation documentation which should list dependencies on other software and cover any security or performance issues. As well as the installation documentation you should also provide user documentation which can help the service provider to support end users.

8 Remember To 'Let Go'

Although it can be helpful if your project team is in a position to provide advice to the service provider after the end of the project, the project team should also be willing to relinquish control over the project if, for example, the service provider needs to make changes to your deliverables.

9 Learn From Others

Learn from the experiences of others. For example, read the case studies which provide various examples of porting systems into a service environment [3] [4].

10 Share Your Experiences

Be willing to share your experiences. For example, consider writing a case study for QA Focus [5].

References

  1. From Project To Production Service, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-38/>
  2. Planning An End User Service, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-39/>
  3. Launching New Database Services: The BIDS Experience, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-27/>
  4. Providing Access To Full Text Journal Articles, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-28/>
  5. Contributing To Case Studies, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/#contributing>

Briefing 41

Introduction To Metadata


What is Metadata?

Metadata is often described as "data about data". The concept of metadata is not new - a Library catalogue contains metadata about the books held in the Library. What is new is the potential that metadata provides in developing rich digital library services.

The term metadata has come to mean structured information that is used by automated processes. This is probably the most useful way to think about metadata [1].

The Classic Metadata Example

The classic example of metadata is the library catalogue. A catalogue record normally contains information about a book (title, format, ISBN, author, etc.). Such information is stored in a structured, standardised form, often using an international standard known as MARC. Use of this international standard allows catalogue records to be shared across organisations.

Why is Metadata So Important?

Although metadata is nothing new, the importance of metadata has grown with the development of the World Wide Web. As is well-known the Web seeks to provide universal access to distributed resources. In order to develop richly functional Web applications which can exploit the Web's global information environment it is becoming increasingly necessary to make use of metadata which describes the resources in some formal standardised manner.

Metadata Standards

In order to allow metadata to be processed in a consistent manner by computer software it is necessary for metadata to be described in a standard way. There are many metadata standards available. However in the Web environment the best known standard is the Dublin Core standard which provides an agreed set of core metadata elements for use in resource discovery.

The Dublin Core standard (formally known as the Dublin Core Metadata Element Set) has defined 15 core elements: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights [2].

The core element set is clearly very basic. A mechanism for extending Dublin Core elements has been developed. This allows what is known as Qualified Dublin Core elements to refine the core elements. For example DC.Date.Created refines the DC.Date element by allowing the date of creation of the resource to be described. DC.Date.Modified can be used to describe the date on which the resource was changed. Without the qualifiers, it would not be possible to tell which date related to which event. Work is in progress in defining a common framework for qualifiers.

Using Metadata

The Dublin Core standard defines a set of core elements. The standard does not specify how these elements should be deployed on the Web. Initially consideration was given to using Dublin Core by embedding it within HTML pages using the <meta> element e.g. <meta name="DC.Creator" content="John Smith">. However this approach has limitations: initially HTML was not rich enough to allow metadata schemes to be included (which could specify, for example, that a list of keywords is taken from the Library Of Congress list); it is not possible to define relationships for metadata elements (which may be needed if, for example, there are multiple creators of a resource); and processing the metadata requires the entire HTML document to be downloaded.
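
For illustration, the sketch below shows how a small amount of Dublin Core metadata, including a qualified date element, might be embedded in the <head> of an HTML page (the values are invented, and the <link> element follows a common convention for identifying the element set in use):

<head>
  <title>An Introduction To Metadata</title>
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <meta name="DC.Title" content="An Introduction To Metadata" />
  <meta name="DC.Creator" content="John Smith" />
  <meta name="DC.Date.Created" content="2003-06-01" />
</head>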

In order to address these concerns a number of alternative approaches for using metadata have been developed. RDF (Resource Description Framework) [3], for example, has been developed by W3C as a framework for describing a wide range of metadata applications. In addition OAI (Open Archives Initiative) [4] is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.

In addition to selecting the appropriate standards use of metadata may also require use of a metadata management system and a metadata repository.

References

  1. Metadata Demystified, NISO,
    <http://www.niso.org/standards/resources/Metadata_Demystified.pdf>
  2. Dublin Core Metadata Element Set, DCMI,
    <http://dublincore.org/documents/dces/>
  3. Resource Description Framework (RDF), W3C,
    <http://www.w3.org/RDF/>
  4. Open Archives Initiative (OAI),
    <http://www.openarchives.org/>
  5. Information Environment Home, JISC,
    <http://www.jisc.ac.uk/index.cfm?name=ie_home>

Briefing 42

Metadata Deployment


Introduction

This document describes the issues you will need to address in order to ensure that you make use of appropriate approaches for the deployment of metadata within your project.

Why Do You Wish To Use Metadata?

The first question you should address is "Why do you wish to use metadata?". You may have heard that metadata is important. You may have heard that metadata will help solve many problems you have with your project. You may have heard that others are using metadata and you don't wish to be left behind. Although all of these points have some validity, they are not sufficient in isolation to justify the time and effort needed in order to deploy metadata effectively.

You should first specify the problem you wish to address using metadata. It may be that you wish to allow resources on your Web site to be found more easily from search engines such as Google. It may be that you wish to improve local searching on your Web site. It may be that you wish to interoperate with other projects and services. Or it may be that you wish to improve the maintenance of resources on your Web site. In all of these cases metadata may have a role to play; however different approaches may be needed to tackle these different problems and, indeed, approaches other than use of metadata may be more effective (for example, Google makes only limited use of metadata so an alternative approach may be needed).

Identifying The Functionality To Be Provided

Once you have clarified the reasons you wish to make use of metadata you should identify the end user functionality you wish to provide. This is needed in order to define the metadata you will need, how it should be represented and how it should be created, managed and deployed.

Choosing The Metadata Standard

You will need to choose the metadata standard which is relevant for your purpose. In many cases this may be self-evident - for example, your project may be funded to develop resources for use in an OAI environment, in which case you will be using the OAI-PMH and its associated metadata formats.

Metadata Modelling

It may be necessary for you to decide how to model your metadata. For example if you wish to use qualified Dublin Core metadata you will have to choose the qualifiers you wish to use. A QA Focus case study illustrates the decision-making process [1].

Metadata Management

It is important that you give thought to the management of the metadata. If you don't you are likely to find that your metadata becomes out-of-date. Since metadata is not normally displayed to end users but processed by software you won't even be able to use visual checking of the metadata. Poor quality metadata is likely to be a major barrier to the deployment of interoperable services.

If, for example, you embed metadata directly into a file, you may find it difficult to maintain the metadata (e.g. the creator changes their name or contact details). A better approach may be use of a database (sometimes referred to as a metadata repository) which provides management capabilities.

Example Of Use Of This Approach

The Exploit Interactive [2] e-journal was developed by UKOLN with EU funding. Metadata was required in order to provide enhanced searching for the end user. The specific functionality required was the ability to search by issue, article type, author and title and by funding body. In addition metadata was needed in order to assist the project manager producing reports, such as the numbers of different types of articles. This functionality helped to identify the qualified Dublin Core elements required.

The MS SiteServer software used to deliver the service provided an indexing and searching capability which could process arbitrary metadata. It was therefore decided to provide Dublin Core metadata stored in <meta> tags in HTML pages. In order to allow the metadata to be more easily converted into other formats (e.g. XHTML) the metadata was held externally and converted to HTML by server-side scripts.

A case study which gives further information (and describes the limitations of the metadata management approach) is available [3].

References

  1. Gathering the Jewels: Creating a Dublin Core Metadata Strategy, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-13/>
  2. Exploit Interactive,
    <http://www.exploit-lib.org/>
  3. Managing And Using Metadata In An E-Journal, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-01/>

Briefing 43

Quality Assurance For Metadata


Introduction

Once you have decided to make use of metadata in your project, you then need to agree on the functionality to be provided, the metadata standards to be used and the architecture for managing and deploying your metadata. However this is not the end of the matter. You will also need to ensure that you have appropriate quality assurance procedures to ensure that your metadata is fit for its purpose.

What Can Go Wrong?

There are a number of ways in which services based on metadata can go wrong, such as:

Incorrect content:
The content of the metadata may be incorrect or out-of-date. There is a danger that metadata is even more likely to be out-of-date than normal content, since normal content is visible whereas metadata is not normally displayed on, say, a Web page. In addition humans can be tolerant of errors, ambiguities, etc. in ways that software tools normally are not.
Inconsistent content:
The metadata content may be inconsistent due to a lack of cataloguing rules and inconsistent approaches if multiple people are involved in creating metadata.
Non-interoperable content:
Even if metadata is consistent within a project, other projects may apply different cataloguing rules. For example the date 01/12/2003 could be interpreted as 1 December or 12 January if projects based in the UK and USA make assumptions about the date format (a simple way of avoiding this ambiguity is shown after this list).
Incorrect format:
The metadata may be stored in a non-valid format. Again, although Web browsers are normally tolerant of HTML errors, formats such as XML insist on compliance with standards.
Errors with metadata management tools:
Metadata creation and management tools could output metadata in invalid formats.
Errors with the workflow process:
Data processed by metadata or other tools could become corrupted through the workflow. As a simple example a MS Windows character such as © could be entered into a database and then output as an invalid character in a XML file.
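
As a simple example of avoiding the date ambiguity described above, an agreed cataloguing rule could require all dates to be recorded in the ISO 8601 format (YYYY-MM-DD), so that 1 December 2003 is always written as:

<dc:date>2003-12-01</dc:date>

(the dc:date element is shown purely for illustration; the same rule can be applied whatever format the metadata is stored in).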

QA For Metadata Content

You should have procedures to ensure that the metadata content is correct when created and is maintained as appropriate. This could involve establishing cataloguing rules and providing mechanisms to ensure that the rules are implemented (possibly in software when the metadata is created). You may also need systematic procedures for periodic checking of the metadata.

QA For Metadata Formats

As metadata which is to be reused by other applications is increasingly being stored in XML it is essential that the format is compliant (otherwise tools will not be able to process the metadata). XML compliance checking can be implemented fairly easily. It will be more difficult to ensure that the metadata makes use of appropriate XML schemas.
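
As a simple sketch of the sort of issue compliance checking will catch, characters which have special meaning in XML must be escaped before the metadata is output. For example an ampersand in a title must be written as &amp;:

<dc:title>Profit &amp; Loss</dc:title>

and characters such as £ are only safe if the character encoding declared in the XML declaration (e.g. <?xml version="1.0" encoding="UTF-8"?>) matches the encoding actually used when the file was created. (The title shown is purely illustrative.)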

QA For Metadata Tools

You should ensure that the output from metadata creation and management tools is compliant with appropriate standards. You should expect that such tools have a rich set of test suites to validate a wide range of environments. You will need to consider such issues if you develop your own metadata management system.

QA For Metadata Workflow

You should ensure that metadata does not become corrupted as it flows through a workflow system.

A Fictitious Nightmare Scenario

A multimedia e-journal project is set up. Dublin Core metadata is used for articles which are published. Unfortunately there are no documented cataloguing rules and, due to a high staff turnover (staff are on short term contracts), there are many inconsistencies in the metadata (John Smith & Smith, J.; University of Bath and Bath University; etc.)

The metadata is managed by a home-grown tool. Unfortunately the author metadata is output in HTML as DC.Author rather than DC.Creator. In addition the tool outputs the metadata in XHTML 1.0 format, which is embedded in HTML 4.0 documents.

The metadata is created by hand and is not checked. This results in a large number of typos and use of characters which are not permitted in XML without further processing (e.g. £, — and &).

Rights metadata for images, which describes which images can be published freely and which are restricted to local use, becomes separated from the images during the workflow process.


Briefing 44

Metadata Harvesting


Background

As the number of available digital resources increases so does the need for quick and accurate resource discovery. In order to allow users to search more effectively many resource discovery services now operate across the resources of multiple distributed content providers. There are two possible ways to do this: either by distributed searching across many metadata databases, or by searching harvested metadata.

Metadata harvesting is the aggregation of metadata records from multiple providers into a single database. Building applications or services that use these aggregated records provides additional views of those resources, assisting in access across sectors and greater exposure of those resources to the wider community.

Open Archives Initiative Protocol for Metadata Harvesting

When metadata harvesting is carried out within the JISC Information Environment the Open Archives Initiative Protocol for Metadata Harvesting (OAI PMH) [1] version 2.0 is recommended. The Open Archives Initiative [2] had its roots in the e-prints community, who were trying to improve access to scholarly resources. The OAI PMH was developed initially by an international technical committee in 1999. It is a lightweight, low-cost protocol that is built on HTTP and XML. The protocol defines six requests, known as verbs:

  1. GetRecord
  2. Identify
  3. ListIdentifiers
  4. ListMetadataFormats
  5. ListRecords
  6. ListSets

In order for metadata to be shared effectively two things need to happen:

  1. Content/data providers need to make metadata records available in a commonly understood form.
  2. Service providers need to obtain these metadata records from the content providers and hold them in a repository.

OAI PMH provides a means of doing the above.
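
In practice these requests are issued as simple HTTP requests against the data provider's base URL. For example (the base URL below is purely illustrative), the following request asks a repository to return all of its records in the simple Dublin Core format:

http://www.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc

The repository responds with an XML document containing the requested metadata records, such as the record shown in the example later in this briefing.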

Record Format

At the lowest level a data provider must support the simple Dublin Core [3] record format ('oai_dc'). This format is defined by the OAI-PMH DC XML schema [4]. Data providers may also provide metadata records in other formats. Within the JISC Information Environment, if the repository is of value to the learning and teaching community, projects should also consider exposing metadata records that conform to the UK Common Metadata Framework [5], in line with the IMS Digital Repositories Specification, using the IEEE LOM XML schemas [6].

OAI-PMH also provides a number of facilities to supply metadata about metadata records: for example, rights and/or provenance information can be provided in the <about> element of the GetRecord response. Collection-level descriptions can also be provided in the <description> element of the Identify response.

Example OAI DC metadata record

The following example is taken from the Library of Congress repository.


<oai_dc:dc>
<dc:title>Empire State Building. [View from], to Central Park</dc:title>
<dc:creator>Gottscho, Samuel H. 1875-1971, photographer.</dc:creator>
<dc:date>1932 Jan. 19</dc:date>
<dc:type>image</dc:type>
<dc:type>two-dimensional nonprojectible graphic</dc:type>
<dc:type>Cityscape photographs.</dc:type>
<dc:type>Acetate negatives.</dc:type>
<dc:identifier>http://hdl.loc.gov/loc.pnp/gsc.5a18067</dc:identifier>
<dc:coverage>United States--New York (State)--New York.</dc:coverage>
<dc:rights>No known restrictions on publication.</dc:rights>
</oai_dc:dc>

Conformance Testing for Basic Functionality

The OAI gives information on tests an OAI repository must successfully complete in order to be entered in the registry. For example:

More information on the tests necessary is available from the OAI Web site [7]. Projects could use the tests listed to create a checklist to measure their repository's conformance.

References

  1. The Open Archives Initiative Protocol for Metadata Harvesting,
    <http://www.openarchives.org/OAI/openarchivesprotocol.html>
  2. Open Archives Initiative,
    <http://www.openarchives.org/>
  3. Dublin Core,
    <http://dublincore.org/>
  4. OAI-PMH DC XML Schema,
    <http://www.openarchives.org/OAI/2.0/oai_dc.xsd>
  5. UK Common Metadata Framework,
    <http://metadata.cetis.ac.uk/guides/>
  6. IMS Digital Repositories Specification,
    <http://www.imsglobal.org/digitalrepositories/>
  7. Registering as a Data Provider,
    <http://www.openarchives.org/data/registerasprovider.html>

Further Information


Briefing 45

Top 10 Tips For Preserving Web Sites


About This Document

This document provides top tips which can help to ensure that project Web sites can be preserved.

The Top 10 Tips

1 Make Use Of Open Standards

You should seek to make use of open standard formats for your Web site. This will help you to avoid lock-in to proprietary formats for which access may not be available in the future.

2 Define The Purpose(s) Of Your Web Site

You should have a clear idea of the purpose(s) of your project Web site, and you should document the purposes. Your Web site could, for example, provide access to project deliverables for end users; could provide information about the project; could be for use by project partners; etc. A policy for preservation will be dependent on the role of the Web site.

3 Have A URI Naming Policy

Before launching your Web site you should develop a URI naming policy. Ideally you should contain the project Web site within its own directory, which will allow the project Web site to be processed (e.g. harvested) separately from other resources on the Web site.

4 Think Carefully Before Having Split Web Sites

The preservation of a Web site which is split across several locations may be difficult to implement.

5 Think About Separating Web Site Functionality

On the other hand it may be desirable to separate the functionality of the Web site, to allow, for example, information resources to be processed independently of other aspects of the Web site. For example, the search functionality of the Web site could have its own sub-domain (e.g. search.foo.ac.uk), which could allow the information resources (under www.foo.ac.uk) to be processed separately.

6 Explore Potential For Exporting Resources From A CMS

You should explore the possibility of exporting resources from a backend database or Content Management System in a form suitable for preservation.

7 Be Aware Of Legal, IPR, etc. Barriers To Preservation

You need to be aware of various legal barriers to preservation. For example, do you own the copyright of resources to be preserved; are there IPR issues to consider; are confidential documents (such as project budgets, minutes of meetings, mailing list archives, etc.) to be preserved; etc.

8 Test Mirroring Of Your Web Site

You should test the mirroring of your project Web site to see if there are technical difficulties which could make preservation difficult. See, for example, the QA Focus document on Accessing Your Web Site On A PDA [1].

9 Provide Documentation

You should provide technical documentation on your Web site which will allow others to preserve your Web site and to understand any potential problem areas. You should also provide documentation on your policy of preservation.

10 Share Your Experiences

Learn from the experiences of others. For example read the case study on Providing Access to an EU-funded Project Web Site after Completion of Funding [2] and the briefing document on Mothballing Web Sites [3].

References


Briefing 46

QA for Web Sites: Useful Pointers


Quality Assurance

Below are some key pointers that can help you enhance the quality assurance procedures used for your Web site.

Useful Pointers

1 Authoring Tools

Are the tools that you use to create your Web site appropriate for their tasks? Do they produce compliant and accessible code? Can the tools be configured to incorporate QA processes such as HTML validation, link checking, spell checking, etc? If not, perhaps you should consider evaluating other authoring tools or alternative approaches to creating and maintaining your content.

2 Tracking Problems

How do you deal with problem reporting? Consider implementing a fault reporting log. Make sure that all defects are reported, that ownership is assigned, details are passed on to the appropriate person, a schedule for fixes is decided upon, progress made is recorded and the resolution of the problem is noted. There could also be a formal signing-off procedure.

3 Use A QA Model

A model such as the QA Focus Timescale Model will help you to plan the QA you will need to implement over the course of your project:

Strategic QA:
Carried out before development takes place. This involves establishing the best methodology for your Web site, the choice of standards, etc.
Workflow QA:
Carried out as formative QA before and during development. This involves establishing and documenting a workflow, processes etc.
Sign-off QA:
Carried out as summative QA once one stage of development has been carried out. This involves establishing an auditing system where everything is reviewed.
On-going QA:
Carried out as summative QA once one stage of development has been carried out. This involves establishing a system to report, check and fix any faults found, etc.
4 Use Automated Testing Tools

There is a variety of tools available, and a number are open source or free to use. These can be used for HTML and CSS validation, link checking, measuring load times, etc.

5 Don't Forget Manual Approaches

Manual approaches to Web site testing can address areas which will not be detected through use of automated tools. You should aim to test key areas of your Web site and ensure that any systematic errors which are found are also addressed in areas of the Web site which are not tested.

6 Use A Benchmarking Approach

A benchmarking approach involves comparisons of the findings for your Web site with your peers. This enables comparisons to be made which can help you identify areas in which you may be successful and also areas in which you may be lagging behind your peers.

7 Rate The Severity Of Problems

You could give a severity rating to problems found in order to decide whether the work should be done now or whether it can wait until the next phase of changes. An example rating system might be:

Level 1:
There is a failure in the infrastructure or functionality essential to the Web site.
Level 2:
The functionality is broken, pages are missing, links are broken, graphics are missing, there are navigation problems, etc.
Level 3:
There are browser compatibility problems, page formatting problems, etc.
Level 4:
There are display issues, for example with the font, or text issues such as grammar.
8 Learn From The Problems You Find

Make sure that you do not just fix problems you find. Recognising why the problems have occurred allows you to improve your publishing processes so that the errors do not reoccur.

Useful URLs

The following resources provide additional advice on quality assurance for Web sites.


Briefing 47

Transcribing Documents


Digitising Text by Transcription

Transcription is a very simple but effective way of digitising small to medium volumes of text. It is particularly appropriate when the documents to be digitised have a complex layout (columns, variable margins, overlaid images etc.) or other features that will make automatic digitisation using OCR (Optical Character Recognition) software difficult. Transcription remains the best way to digitise hand written documents.

Representing the Original Document

All projects planning to transcribe documents should establish a set of transcription guidelines to help ensure that the transcriptions are complete, consistent and correct.

Key issues that transcription guidelines need to cover are:

It is generally good practice not to correct factual errors or mistakes of grammar or spelling in the original.

Avoiding Errors

Double-entry is the best solution: two people separately transcribe the same document and the results are then compared. Two people are unlikely to make the same errors, so this technique should reveal most of them. It is, however, often impractical because of the time and expense involved. Running a grammar and spell checker over the transcribed document is a simpler way of finding many errors (but assumes the original document was spelt and written according to modern usage).
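
The comparison stage of double-entry can itself be automated. The following minimal sketch assumes the two transcribers have saved their versions as the plain text files transcription_a.txt and transcription_b.txt (hypothetical names) and uses Python's difflib module to list the lines on which the versions disagree.

# Compare two independent transcriptions of the same document and report
# the lines on which they disagree. The file names are hypothetical.
import difflib

with open("transcription_a.txt", encoding="utf-8") as f:
    version_a = f.read().splitlines()
with open("transcription_b.txt", encoding="utf-8") as f:
    version_b = f.read().splitlines()

# unified_diff reports only the lines that differ, with a little context
for line in difflib.unified_diff(version_a, version_b,
                                 fromfile="transcriber A",
                                 tofile="transcriber B", lineterm=""):
    print(line)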

Transcribing Structured Documents

Structured documents, such as census returns or similar tabular material may be better transcribed into a spreadsheet package rather than a text editor. When transcribing tables of numbers, a simple but effective check on accuracy is to use a spreadsheet to calculate row and column totals that can be compared with the original table. Transcriber guidelines for this type of document will need to consider issues such as:

It is good practice to record values, such as weights, distances, money and ages, as they are found, but also to include a standardised representation to permit calculations (e.g. 'baby, 6m' should be transcribed verbatim, but an additional entry of 0.5, the age in years, could also be included).
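
The row and column totals check described above can equally be scripted rather than carried out in a spreadsheet. The following sketch assumes the transcribed table has been saved as a simple CSV file of numbers with a header row; the file name numbers.csv is a placeholder.

# Recompute row and column totals for a transcribed table so that they can
# be compared with the totals printed in the original document.
import csv

with open("numbers.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)
    rows = [[float(cell) for cell in row] for row in reader]

for i, row in enumerate(rows, start=1):
    print("row", i, "total:", sum(row))

for name, column in zip(header, zip(*rows)):
    print("column", name, "total:", sum(column))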

Further Information

Many genealogical groups transcribe documents, and provide detailed instructions. Examples include:


Briefing 48

Top 10 Tips For Database Design


About This Document

This document provides 10 tips which can help to ensure that databases can be easily exported and manipulated with the minimum of difficulties.

The Top 10 Tips

1 Develop A Prototype

Significant time can be saved by creating the structure in a simple desktop database (such as Microsoft Access) before finalising the design in one of the enterprise databases. The developer will be able to recognise simple faults and make changes more rapidly than would be possible at a later date.

2 Split database structure into multiple tables

Unlike paper-based structures, databases do not require the storage of all fields in a single table. For large databases it is useful to split essential information into multiple tables. Before creating a database, ensure that the data has been normalised to avoid duplication.
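
As a simple illustration of the idea, the following sketch uses SQLite (via Python's sqlite3 module, purely for convenience) to hold people and their addresses in two related tables rather than one flat table. The table and field names are invented for the example.

# Sketch of a structure split into two related tables instead of one flat
# table, so that a person's details are never duplicated across rows.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id  INTEGER PRIMARY KEY,
    surname    TEXT NOT NULL,
    forename   TEXT NOT NULL
);
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES person(person_id),
    house      TEXT,
    street     TEXT,
    town       TEXT,
    postcode   TEXT
);
""")

# Each person is stored once; any number of addresses can refer to them.
conn.execute("INSERT INTO person VALUES (1, 'Smith', 'Jane')")
conn.execute("INSERT INTO address VALUES (1, 1, '12', 'High Street', 'Bath', 'BA1 1AA')")

for row in conn.execute("""SELECT p.surname, a.town FROM person p
                           JOIN address a ON a.person_id = p.person_id"""):
    print(row)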

3 Use understandable field names

The developer should avoid field names that are not instantly recognisable. Acronyms or internal references will confuse users and future developers who are not completely familiar with the database.

4 Avoid illegal file names

It is considered good practice to avoid exotic characters in file or field names. Exotic characters would include ampersands, percentages, asterisks, brackets and quotation marks. You should also avoid spaces in field and table names.

5 Ensure Consistency

Remain consistent with data entry. If including title (Mr, Miss, etc.) include it for all records. Similarly, if you have established that house number and address belong in different fields, always split them.

6 Avoid blank fields

Blank fields can cause problems when interpreting the data at a later date. Does it mean that you have no information, or you have forgotten to enter the information? If information is unavailable it is better to provide a standard response (e.g. unknown).

7 Use standard descriptors for date and time

Date and time can be easily confused when exporting database fields to a text file. A date that reads '12/04/2003' can have two meanings, referring to April 12th or December 4th, 2003. To avoid ambiguity always enter and store dates with a four-digit year and times of day using the 24-hour clock. The ISO format (yyyy-mm-dd) is useful for absolute clarity, particularly when mixing databases at a later date.
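
The following small Python sketch illustrates the point; the same principle applies whatever environment the data is handled in.

# Storing dates in ISO 8601 (yyyy-mm-dd) form removes the day/month ambiguity.
from datetime import date, datetime

d = date(2003, 4, 12)                        # unambiguously 12 April 2003
print(d.isoformat())                         # -> 2003-04-12

# Parsing an ambiguous string requires knowing the convention that was used;
# here we state explicitly that it is day/month/year.
parsed = datetime.strptime("12/04/2003", "%d/%m/%Y").date()
print(parsed.isoformat())                    # -> 2003-04-12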

8 Use currency fields if appropriate

Currency data types are designed for modern decimal currencies and can cause problems when handling old style currency systems, such as Britain's currency system prior to 1971 that divided currency into pounds, shillings and pence.

9 Avoid proprietary extensions

Care should be taken when using proprietary extensions, as their use will tie your database to a particular software package. Examples of proprietary extensions include the user interface and application-specific commands.

10 Avoid the use of field dividers

Commas, quotation marks and semi-colons are all used as field separators when databases are exported to a plain text file and subsequently re-imported into another database. When entering data into a database you should avoid these characters or choose an alternative character to represent them.
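
Where such characters cannot be avoided in the data itself, an export format that quotes or escapes them will survive the round trip. The following sketch illustrates this with Python's csv module; the sample values are invented.

# Field values containing commas, quotation marks or semi-colons survive a
# round trip when the export format quotes them, as the csv module does.
import csv
import io

rows = [["Smith, Jane", 'said "hello"', "Bath; UK"]]

buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(rows)
exported = buffer.getvalue()
print(exported)   # "Smith, Jane","said ""hello""","Bath; UK"

reimported = list(csv.reader(io.StringIO(exported)))
assert reimported == rows   # nothing was split or corrupted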

Further Information


Briefing 49

Top Tips For Resolving Poor Performance in Database Design


About This Document

This document provides top tips which can help you to identify and resolve poor performance in database design.

The Top Tips

1 Normalise database structure

The majority of database performance issues are caused by un-normalised or partially normalised data. Normalisation is the technique used to simplify the design of a database in a way that removes redundant data and improves the efficiency of the design. It consists of three levels (1st, 2nd and 3rd normal forms) which require the removal of duplicate information, of partial dependencies (where the value in a field depends on only part of the primary key) and of transitive dependencies (where the value in one non-key field depends on the value in another non-key field).

2 Create an index

About 70% of good SQL performance can be attributed to proper and efficient indexes. Indexes are used to provide fast and efficient access paths to data, to enforce uniqueness on column values, to contain primary key values, to cluster data, and to partition tables.

3 Are indexes being used consistently?

Indexes have many benefits, but they also have disadvantages. Each index requires storage space and must be modified each time a row is inserted or deleted, as well as each time a column value in the key is updated. You should ensure that indexes are only used when necessary. In many circumstances it may be more appropriate to modify the structure of an existing index than to create a new one. Use the EXPLAIN statement to find the access path chosen by the optimiser.
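
As a minimal sketch of creating an index and inspecting the access path, the following example uses SQLite via Python's sqlite3 module. SQLite's form of the statement is EXPLAIN QUERY PLAN; other database systems provide EXPLAIN with their own output. The table and index names are invented.

# Create an index and ask the optimiser which access path it will use.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (id INTEGER PRIMARY KEY, county TEXT, year INTEGER)")
conn.execute("CREATE INDEX idx_record_county ON record (county)")

# With the index in place the plan should report a search using
# idx_record_county rather than a scan of the whole table.
for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM record WHERE county = 'Somerset'"):
    print(row)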

4 Check the query

Ensure the query is structured correctly by rechecking the WHERE clause. Are the host variables defined correctly and are the predicates designed to use existing indexes?

5 Avoid unnecessary sorting of data

Unnecessary data sorting can also have a detrimental impact upon processing speed. You should ensure that all sorts (ORDER BY, GROUP BY, UNION, UNION ALL, joins, DISTINCT) only refer to the necessary data.

6 Avoid unnecessary row counts

When developing stored procedures (a series of SQL commands), use the SET NOCOUNT ON option at the start of your procedure. This will prevent the superfluous "row count" messages from being generated, and will improve performance by eliminating wasted network traffic.

7 Check table JOINs

Remove unnecessary JOINs and sub-queries. Would the application be more efficient without the join or sub-query? Are simple or multiple queries more efficient?

8 Check connection delays when connecting to an external database

Many problems can be encountered when connecting to an organisational database from home or anywhere outside the faculty. Many delays are caused by DNS lookup timeout. Check that the database server can resolve the IP address. If the intervening firewall uses NAT, then the IP address will match the firewall's interface closest to the database server. If you are troubleshooting the connection, gather more information using 'tcpdump' and examine the packet timings to determine where the delay is occurring.

9 Think about the database location

Many performance issues are caused by the host application rather than the database itself. When identifying performance issues it is useful to perform an Internet search using application keywords to identify problematic combinations. For example, tests have found that the use of a MS Access database run from a NetWare server can dramatically increase the query time if the database is not stored in the drive root.

10 Export queries in desktop databases if necessary

Though it is theoretically possible to share SQL (Structured Query Language) script files between databases, SQL implementations in desktop databases differ, and this may cause significant delays. In practice code needs to be recreated or altered to account for implementation differences.

Further Information


Briefing 50

Improving Interoperability Between Multiple Databases


About This Document

A relational database is a set of structured data, organised according to a data model. When exporting data from one application to another, it is a simple process to export the data as an ASCII text file that will describe every field within a table. However, many problems can be encountered that will increase the amount of effort and time required to import the data. This paper describes specific quality-based techniques that should be used in the development process to minimise the difficulty encountered at a later date.

Documenting The Database Structure

The key to continued access to a digital resource is documentation. This avoids the problems that arise when an administrator leaves the project and essential knowledge is lost. Before exporting data you should make a note of the table relationships and primary keys. This will allow the data to be recombined using the same structure in an alternative package. You should also identify the specific requirements of each field: for example, the field size, input mask, validation rules, default value, indexing, etc.

Use Appropriate Descriptors

Two problems relating to database organisation can be avoided by the use of appropriate descriptors. The first concerns the importance of table and field names in identifying information: a row of numbers has little meaning until we identify the context, e.g. payroll numbers, lottery numbers, etc. Descriptive names will make it easier to interpret and recombine the data at a later date.

The second issue to consider when choosing field names is the possibility that the data will become corrupt at a later date or will be misinterpreted by the application. This is caused when reserved characters used for distinguishing between fields (commas, semi-colons, tabs, quotation marks, etc.), or system-illegal characters (ampersands, asterisks, hashes, or other mathematical symbols), are used.

It is important to avoid such issues by restricting yourself to the English alphabet or numerical values, and avoiding other symbols.

Ensure Consistency

When handling data from multiple databases it is good practice to standardise the responses so that they can be understood and manipulated more easily. This may involve a simple process of replacing all reference to one value with another (e.g. changing 1,2,3 to Mr, Mrs, Miss). In other circumstances you may need to write a query to split the postcode from the main address field.
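
As a rough sketch of such a split, the following Python fragment separates a UK postcode from the end of a single address field. The pattern is a simplification of the real postcode rules and would need checking against your own data.

# Separate a (simplified) UK postcode from the end of an address field.
import re

POSTCODE = re.compile(r"\s*([A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2})$", re.IGNORECASE)

def split_postcode(address):
    match = POSTCODE.search(address)
    if match:
        return address[:match.start()].rstrip(", "), match.group(1).upper()
    return address, ""   # no postcode found: leave the field intact

print(split_postcode("12 High Street, Bath, BA1 1AA"))
# -> ('12 High Street, Bath', 'BA1 1AA')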

You should also ensure that date and time are referenced correctly. These can be easily confused when exporting database fields to a text file. For example, a date that reads '12/04/2003' can be interpreted as April 12th or December 4th, 2003. To avoid ambiguity always enter and store dates with a four-digit year and times of day using the 24-hour clock. The ISO format (yyyy-mm-dd) is useful for absolute clarity, particularly when mixing databases at a later date.

Proprietary Extensions

Care should be taken when using proprietary extensions, as their use will tie your database to a particular software package. Unlike SQL commands, these application-specific elements cannot be exported to other applications without extensive work to convert or recreate the resource. Examples of proprietary extensions include the user interface and application-specific commands.

Further Information


Briefing 51

Intellectual Rights Clearance On The Internet


Introduction

The Internet contains an assortment of copyrighted work owned by millions of people or organisations throughout the world. It can be a legal minefield for anyone attempting to establish intellectual rights to specific works. In most cases it is extremely difficult to establish the author or owner of a work in order to gain permission for its use.

One way of addressing IPR (Intellectual Property Rights) issues is to describe ownership in as much depth as possible: establishing who is responsible for specific works can help a producer protect themselves from potential legal difficulties.

This document provides guidelines on gaining copyright clearance for using third party works within your own project. It encourages the use of standard practices that will simplify the process and improve the quality of copyright clearance information stored, providing a protection against future legal action.

Copyright Clearance

Copyright is an automatically assigned right. It is therefore likely that the majority of works in a digital collection will be covered by copyright, unless explicitly stated otherwise. The copyright clearance process requires the digitiser to check the copyright status of:

Copyright clearance should be undertaken at the beginning of a project. If clearance is denied after the work has been included in the collection, it will require additional effort to remove it and may result in legal action from the author. Therefore:

Maintain a negotiation log
A log will document all meetings, outlining subjects of discussion, objections and agreements by either party. This will enable the organisation to establish that it has gained copyright clearance and to refer back to a detailed description of the meetings that took place.
Identify who the author is and when it was produced
Current copyright law in the UK indicates the author's lifespan plus 70 years as the limit for copyright. Therefore it is possible that a collection may consist of works that are outside current copyright laws (such as the entire works of Shakespeare, Conan Doyle, etc.). If the author is still alive, they must be contacted to gain permission to use their work.
Establish long-term access rights
Internet content may appear in a site archive for several years after it was published. When reaching agreement with the author, establish any time limits for the use of their work, indicating the length of time that work can be used. If the goal of the project is to enable long-term preservation of work, persuade the individual/s to allow the repository to host work indefinitely and to allow the conversion of it to modern formats when required.

In the event that an author, or authors, cannot be traced, the project is required to demonstrate that it has taken reasonable steps to contact them. Digital preservation projects are particularly difficult in this respect, since many years may separate the researcher from the copyright owner. In many cases, most notably the 1986 Domesday Project, it has proven difficult to trace the authorship of 1,000+ pieces of work to individuals. In that project, the designers created a method of establishing permission and registering objections by providing contact details that an author could use to identify their work.

Indicating IPR through Metadata

If permission has been granted to reproduce copyright work, the institution should take measures to record the intellectual property rights. Metadata is commonly used for this purpose, storing and distributing rights information for online content. Several metadata bodies provide standardised schemas for copyright information. For example, rights information for a book could be stored in the following format:


<book id="bk112">
<author>Galos, Mike</author>
<title>Visual Studio 7: A Comprehensive Guide</title>
      <publish_date>2001-04-16</publish_date>
      <publisher>Addison Press</publisher>
      <copyright>Galos, M. 2001</copyright>
</book>

Access inhibitors can also be set to identify copyright limitations and the methods necessary to overcome them. For example, limiting e-book use to IP addresses within a university environment.

Further Information


Briefing 52

Protecting Copyright On Your Own Work


Introduction

The Internet contains an assortment of copyrighted work owned by millions of people or organisations throughout the world. The ease of publication and availability of text, graphics and video allow anyone to become their own publisher. As a result, modern Web sites contain a jigsaw of copyrighted works produced by multiple authors.

This relaxed attitude to copyright presents a challenge to authors: what measures can they take to protect their own work? More to the point, can copyrighted work be protected in some way?

This document provides guidelines for protecting your own work. It describes methods of establishing authorship, possible licensing models that may meet your needs, and methods of reflecting copyright on the Internet.

Choosing the Correct Licence

The Internet has forced an increasing debate on the role of IPR and copyright. This has resulted in alternatives to traditional intellectual property rights appearing.

To protect your work it is important that the distribution license is considered before you release your work. This can be achieved by answering several questions:

If the answer to these questions is no, you are automatically assigned copyright in your work. However, if the answer is yes, you should seek alternative licence agreements that preserve your right to place your work in the public domain or allow the user to perform certain actions. Popular variants include copyleft, notably the GPL, and Creative Commons - two different licence agreement families that avoid traditional copyright restrictions by establishing permission to distribute content without restriction. More information can be found on these subjects in the QA Focus document 'Choosing Alternative Licences For Digital Content' [1].

Managing Copyright on Own Work

Unless otherwise indicated, copyright is assigned to the author of an original work. When producing work it is essential to establish who will own the resulting product - the individual or the institution. Material produced in the workplace or at university may belong to the institution, depending upon the contract signed by the author. For example, the copyright for this document belongs to the AHDS, not the author. When approaching the subject the author should consider several issues:

Can I establish that I am the author of this work?
At this point the author should provide evidence they produced the work on a specific date. One commonly used method is to post a sealed envelope to yourself or request that a solicitor store evidence within a safe. If ownership is challenged at a later date, the document can be opened in the presence of a solicitor.
Am I using unaccredited copyrighted material produced by others?
Published work that contains unaccredited material infringes upon the intellectual property of others. The results of such a discovery will vary: the unaccredited author may request that they are credited or that a correction is published; they may request that their work is removed; or they may take legal action against the author. To avoid such issues, document all research made during your investigation.

When producing work as an individual that is intended for later publication, the author should establish ownership rights to indicate how work can be used after initial publication:

Ownership after publication
Authors are encouraged to retain as many rights as possible to enable the continued use of articles in hard copy and electronic form.
Ownership in different media
Where publication in a specific form (e.g. hard-copy) is the intention, rights to publish in other forms (e.g. electronic) should, if possible, be retained.

Indicating IPR through Metadata

If permission has been granted to reproduce copyright work, the institution should take measures to record the intellectual property rights. Metadata is commonly used for this purpose, storing and distributing rights information for online content. Several metadata bodies provide standardised schemas for copyright information. For example, rights information for a book could be stored in the following format:


<book id="bk112">
<author>Galos, Mike</author>
<title>Visual Studio 7: A Comprehensive Guide</title>
      <publish_date>2001-04-16</publish_date>
      <publisher>Addison Press</publisher>
      <copyright>Galos, M. 2001</copyright>
</book>

Access inhibitors can also be set to identify copyright limitations and the methods necessary to overcome them. For example, limiting e-book use to IP addresses within a university environment.

References

  1. Choosing Alternative Licences For Digital Content, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-54/>

Further Information


Briefing 53

How To Protect Your Rights With A Licence Agreement


Introduction

The Internet is often promoted as a means of getting information to the widest possible audience at the lowest possible cost. Barriers to the flow of information are not encouraged, and few repositories establish formal agreements with depositing authors.

Although mutual benefit is the primary goal of many collaborative projects, some method of formalizing the relationship between author and distributor is useful. A deposit agreement can be used to define a consensual contract between the depositing author and the repository, clarifying the rights and obligations of both.

The deposit agreement dictates several requirements of both parties:

Licensing Terms

The first aspect of a licence agreement that should be determined is the licensing terms. These indicate the type of distribution permitted. Two types exist:

  1. Exclusive distribution: Exclusive licences impose specific and wide-ranging restrictions upon distribution. They are primarily used by commercial repositories that are restricted by copyright or charging arrangements, or concerned that non-exclusive distribution will devalue the content.
  2. Non-exclusive distribution: Non-exclusive licences, typically found in academically orientated repositories, offer a useful alternative to commercial distribution that encourages the author to submit work voluntarily as a way of gaining wider public exposure. These licences establish the right of the depositor to submit work to other repositories at a later date without legal restriction.

Requirements

To protect the organization from legal threats at a later date the licence agreement requires several issues to be considered during the submission lifetime. In the initial stages the repository should establish content ownership, audience and potential use, migration and distribution rights. In the long-term the repository should consider withdrawal criteria.

Initial Stages of Development

Establish ownership
A licence agreement must first establish who the owners are, and whether they differ from the author. This may help to minimise the repository's legal liability by formally establishing that the depositor holds the necessary copyright to deposit the material and is able to do so without infringement.
Confirm ownership
The licence agreement should clearly indicate that the depositor retains ownership. This is a particularly important inclusion in a deposit agreement, designed to protect the repository from potential legal action taken as a result of the actions of the author. Equally, the deposit agreement can help establish that the author is not legally responsible for ensuring the accuracy of the information they have provided if, for example, it later becomes out-of-date.
Audience and potential use
In some circumstances, particularly exclusive distribution, a licence agreement will need to establish terms permitted by the author relating to potential usage. This may be prompted by concerns that wide dissemination will damage the long-term value of the content. Institutional repositories may wish to clarify with depositors that deposited e-prints will only be used for non-commercial or academic uses.

Mid-term Considerations

Migration Strategy
For repositories such as the AHDS a migration strategy will be particularly important. This enables the repository to migrate the content to a different file format if the original submitted format becomes obsolete.

Long-term Considerations

Withdrawal criteria
The licence agreement should establish the situations under which the author may withdraw their work from the repository and whether the repository can continue to hold relevant metadata records after it is withdrawn.

Licence agreements should be considered an essential part of an e-print repository's operation. They can resolve many of the potential problems that might arise. For the repository, it provides a formal framework that defines what the repository can and cannot do, making it easier to manage the e-print in the long-term while helping to reduce its legal liabilities.

Further Information


Briefing 54

Choosing Alternative Licences For Digital Content


Introduction

Licences are a core part of intellectual property rights management. Licences allow the copyright holder to devolve specific rights to use, store, copy and disseminate work to another party.

Licences are typically restrictive, and acceptable uses of the licensed work are carefully delineated. However, copyright holders may wish to encourage widespread sharing and use of their work. In these situations an alternative licensing model may be appropriate.

Should I Choose an Alternative Licence?

To identify if an alternative licence is appropriate, the following questions should be addressed:

If the answer to these questions is yes, then an alternative licence agreement may be appropriate.

The developer has a number of options when planning to release their work, including creating their own licence or using an existing one. Both options have recognisable benefits. A bespoke licence allows the developer to define their own terms and conditions, while rejecting conditions with which they disagree. However, the creation of a licence can be a long process and may result in the licence containing legal loopholes.

An alternative is to use an existing 'copyleft' licence. Copyleft is an umbrella term that may refer to several similar licences. When choosing a licence, the developer must consider their own needs:

Licensing Digital Works

Many authors argue that traditional copyright restrictions oppose the free distribution of digital works, whether text, graphics or sound, on the Internet. This could be for a variety of reasons: the author wishes to spread their ideas, they wish to attract feedback on their work, etc. For these purposes, traditional copyright and public domain licences are unsuitable.

Creative Commons is a particularly popular licensing model available for all types of creative work. It is therefore common to find it applied to Web sites, scholarship, music, film, photography and literature that are not traditionally covered by similar distribution schemes.

Similar to the GNU General Public Licence, Creative Commons licences allow the author to give the reader specific rights, such as permission to distribute the work and make derivatives, without relinquishing copyright in the original work. Though these freedoms encourage comparisons with the public domain, Creative Commons licences are more restrictive, placing specific provisos upon the work:

  1. The author must be credited.
  2. Any derivative works must meet the licensing criteria established by the author. Derivatives cannot be placed in the public domain without permission.

To encourage authors to use Creative Commons [1] the developers provide an online multiple-choice form to choose the most appropriate licence model. At the time of writing, eleven variations exist that differ according to four different values (attribution, non-commercial, derivative works and share alike). These can be found on the Creative Commons licence agreement page [2]. In addition, a specifically developed metadata set is provided, allowing an individual to easily find and use work.

The Creative Commons licence may be suitable for individuals who wish their work to be seen and used by as many people as possible, but do not want to give away their rights. It may be unsuitable for businesses that wish to charge for access or restrict content in some way.

Dual-Licensing

Copyleft licences [3] such as the Creative Commons may promote free dissemination; however, there is little incentive for businesses that wish to make a profit to use them. One solution is to release your software under a dual licence: one for free open source distribution, the other for proprietary commercial distribution. This model allows a business to take contributions made to the open source version, apply them to its commercial version and sell it at retail price.

References

  1. Creative Commons,
    <http://creativecommons.org/>
  2. Creative Commons Licences,
    <http://creativecommons.org/license/>
  3. What is Copyleft?,
    <http://www.gnu.org/copyleft/copyleft.html>

Briefing 55

Top 10 Tips For Web Sites


The Top 10 Tips

1 Ensure Your Web Site Complies With HTML Standards

You should ensure that your Web site complies with HTML standards. This will involve selecting the standard for your Web site (which currently should be either HTML 4.0 or XHTML 1.0); implementing publishing procedures which will ensure that your Web pages comply with the standard and quality assurance procedures to ensure that your publishing processes work correctly [1] [2].

2 Make Use Of CSS - And Ensure The CSS Is Compliant

You should make use of CSS (Cascading Style Sheets) to define the appearance of your HTML pages. You should seek to avoid use of HTML formatting elements (e.g. avoid spacer GIFs, <font> tags, etc.) - although it is recognised that use of tables for formatting may be necessary in order to address the poor support for CSS-positioning in some Web browsers. You should also ensure that your CSS is compliant with appropriate standards [3].

3 Provide A Search Facility For Your Web Site

You should provide a search facility for your Web site if it contains more than a few pages [4].

4 Ensure Your 404 Error Page Is Tailored

You should aim to ensure that the 404 error page for your Web site is not the default page but has been configured with appropriate branding, advice and links to appropriate resources, such as the search facility [5].

5 Have A URI Naming Policy For Your Web Site

You should ensure that you have a URI naming policy for your Web site [6].

6 Check Your Links - And Have a Link-Checking Policy

You should ensure that you check for broken links on your Web site. You should ensure that links work correctly when pages are created or updated. You should also ensure that you have a link checking policy which defines the frequency for checking links and your policy when broken links are detected [7].

7 Think About Accessibility

You should address the accessibility of your Web site from the initial planning stages. You should ensure that you carry out appropriate accessibility testing and that you have an accessibility policy [8].

8 Think About Usability

You should address the usability of your Web site from the initial planning stages. You should ensure that you carry out appropriate usability testing and that you have a usability policy.

9 Use Multiple Browsers For Checking

You should make use of several browsers for testing the accessibility, usability and functionality of your Web site. You should consider making use of mainstream browsers (Internet Explorer and Firefox) together with more specialist browsers such as Opera.

10 Implement QA Policies For Your Web Site

You should ensure that you have appropriate quality assurance procedures for your Web site [9] [10].

References

  1. Compliance with HTML Standards, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-01/>
  2. Deployment Of XHTML 1.0, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-35/>
  3. Use Of Cascading Style Sheets (CSS), QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-34/>
  4. Search Facilities For Your Web Site, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-08/>
  5. 404 Error Pages On Web Sites, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-06/>
  6. URI Naming Conventions For Your Project Web Site, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-16/>
  7. Approaches To Link Checking, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-07/>
  8. Accessibility Testing, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-02/>
  9. Summary of the QA Focus Methodology, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-30/>
  10. Implementing Your Own QA, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-58/>

Briefing 56

Using Instant Messaging Software


About Instant Messaging

Instant messaging (IM) is growing in popularity as the Internet becomes more widely used in a social context. The popularity of IM in a social context is leading to consideration of its potential for work purposes in providing real time communications with colleagues and co-workers.

Popular IM applications include MSN Messenger, Yahoo Messenger and AOL Messenger [1]. In addition to these dedicated applications a number of Web-based services also provide instant messaging facilities within the Web site, such as YahooGroups [2]. The JISCMail list management service also provides a Web-based instant messaging facility [3].

The Benefits

Instant Messaging software can provide several benefits:

Instant messaging fans appreciate the immediacy of communications it provides, which can be particularly valuable when working on small-scale concrete tasks.

Possible Problems

There is a need to be aware of potential problems which can be encountered when using instant messaging software:

Critics of instant messaging argue that, although IM may have a role to play for social purposes, for professional use email should be preferred.

Policies For Effective Use of Instant Messaging

Instant messaging may prove particularly useful when working with remote workers or if you are involved in project work with remote partners. However in order to make effective use of instant messaging tools there is a need to implement a policy governing its usage which addresses the problem areas described above.

Software:
You will have to select the IM software. Note you may find that users already have an ID for a particular IM application and may be reluctant to change. There are multi-protocol IM tools available, such as gaim [4] and IM+ [5] although you should be aware that these may have limited functionality. In addition to these desktop applications, there are also Web-based tools such as JWChat [6].
Usage:
You will need to define how instant messaging is to be used and how it will complement other communications channels, such as email.
Privacy, security, etc issues:
You will need to define a policy on dealing with interruptions, privacy and security issues.
It is important to note that different IM environments (e.g. Jabber and MSN) work in different ways and this can affect privacy issues.
Records:
You will need to define a policy on recording instant messaging discussions. Note that a number of IM clients have built-in message archiving capabilities.

As an example of a policy on use of instant messaging software see the policy produced for the QA Focus project [7] together with the QA Focus case study [8]. As an example of use of IM in an online meeting see the transcript and the accompanying guidelines at [9].

References

  1. Instant Messenger FAQs, University of Liverpool,
    <http://www.liv.ac.uk/CSD/helpdesk/faqs/instant/>
  2. YahooGroups,
    <http://groups.yahoo.com/>
  3. DISCUSS Discussion Room at JISCMail, JISCMail,
    <http://www.jiscmail.ac.uk/lists/discuss.html>
  4. GAIM,
    <http://gaim.sourceforge.net/>
  5. IM+, Shape Services,
    <http://www.shapeservices.de/eng/im/>
  6. Jabber Web Chat, JWChat,
    <http://jwchat.sourceforge.net/>
  7. Policy on Instant Messaging, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/instant-messaging/>
  8. Implementing A Communications Infrastructure, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-12/>
  9. Approaches To Web Development: Online Discussion, Web Focus, UKOLN,
    <http://www.ukoln.ac.uk/web-focus/events/online/VLS-aug-2001/>

Briefing 57

Accessibility Testing In Web Browsers


About This Document

This document provides advice on configuring popular Web browsers in order to ensure your Web site is widely accessible. The document covers Internet Explorer 7.0, Firefox 3 and Opera 9.6 running on Microsoft Windows.

Disabling JavaScript

Some browsers do not support JavaScript. Some organisations / individuals will disable JavaScript due to security concerns.

Internet Explorer:
Select the Tools menu and the Internet Options option. Select the Security tab, choose the Internet icon and choose the Custom level option. Scroll to the Scripting option and choose the Disable (or Prompt) option.
Firefox:
Select the Tools menu and the Options option. Open the Content tab, unselect the Enable JavaScript option and select OK.
Opera:
Select the File menu and choose the Preferences option. Choose the Multimedia option, disable the JavaScript option and select OK.

Resizing Text

Some individuals will need to resize the display in order to read the information provided.

Internet Explorer:
Select the View menu and choose the Text Size option.
Firefox:
Select the View menu and choose the Zoom option. Choose Zoom In or Zoom Out as required.
Opera:
Select the View menu and choose the Zoom option. Then zoom by a factor of, say, 50% or 150%.

Disabling Images

Some people cannot see images and some may disable images for performance or privacy reasons.

Internet Explorer:
Select the Tools menu and the Internet Options option. Uncheck the Show pictures option.
Firefox:
Select the Tools menu and the Options option. Open the Content tab, uncheck the Load images automatically option and select OK.
Opera:
Select the File menu and choose the Preferences option. Choose the Multimedia option, select the Show images pull-down menu, choose the Show no images option and select OK.

Disabling Popup Windows

Some browsers and assistive technologies may not support pop-up windows. Individuals may disable pop-up windows due to their misuse by some commercial sites.

Internet Explorer:
Select the Tools menu and the Pop-up Blocker option. Ensure that the Pop-up Blocker is turned on.
Firefox:
Select the Tools menu and the Options option. Select the Content tab and tick the Block pop-up windows option.
Opera:
Select the File menu and choose the Preferences option. Choose the Windows option, then in the Pop-ups pull-down menu choose the Refuse pop-ups option and select OK.

Systematic Testing

You should use the procedures in a systematic way: for example as part of a formal testing procedure in which specific tasks are carried out.

Use of Bookmarklets And FireFox Extensions

Bookmarklets are small browser add-ons which can extend the functionality of a browser. Many accessibility bookmarklets are available (known as Firefox extensions for the Firefox browser). It is suggested that such tools are used in accessibility testing. See Interfaces To Web Testing Tools at <http://www.ariadne.ac.uk/issue34/web-focus/>.


Briefing 58

Implementing Your Own QA


About This Document

This document describes how you can implement your own quality assurance policies and procedures to support your development work.

The QA Focus Methodology

The QA Focus methodology aims to ensure that IT development work produces services which are widely accessible and interoperable. It seeks to do this by developing a quality assurance framework which developers can make use of.

As described in the QA Focus briefing document "Summary of the QA Focus Methodology" [1] the QA Focus methodology is based on:

Documented policies on standards and best practices:
If the standards and best practices are not documented it will be difficult to ensure best practices are implemented, especially in light of staff turnover, changing environments, etc.
Documentation of the architecture used:
A description of the architecture is needed to ensure that the architecture used to implement the system is capable of complying with the standards.
Documented exceptions:
There may be occasions when deviations from standards may be allowed. Such deviations should be documented and responsibility for this agreed.
Systematic checking:
It is necessary to document systematic procedures for ensuring compliance with standards.
Audit trails:
It can be helpful to provide audit trails which can help in spotting trends.

Implementing Your Own QA

The QA Focus briefing document "Summary of the QA Focus Methodology" [1] provides examples of implementing QA in the areas of Web standards and link checking. In this document we provide a template which can be used for any relevant aspect of IT development work.

QA Template

The following template can be used for developing your own QA framework.

Area:
The area covered by the QA (e.g. Web, software development, usability, ...)
Standards:
The standards which are relevant to the area and which you intend to make use of.
Best Practices:
The best practices which are relevant to the area and which you intend to make use of.
Architecture:
The architecture you intend to use.
Exceptions:
A summary of the exceptions to best practices and recommended standards and a justification for the exceptions.
Change Control:
A description of the responsibility for changing this QA document and the process for changing the policy.
Checking:
A description of the systematic checking procedures which will ensure that you are complying with the policies you have established.
Audit trail:
A description of the audit trails (if any) which provide a record of your compliance checking, in order to identify any trends.

As can be seen this QA template is simple and straightforward to use. The QA Focus methodology recognises the lack of resources which can hinder the deployment of more comprehensive QA frameworks and so has developed a more light-weight approach.

Examples

Examples of use of this approach can be found on the QA Focus Web site, which includes details of QA policies and procedures in the areas of Web standards [2], linking [3], usage statistics [4] and instant messaging [5].

References

  1. Summary of the QA Focus Methodology, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-30/>
  2. Policy On Web Standards, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/web/>
  3. Policy On Linking, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/links/>
  4. Policy On Usage Statistics, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/statistics/>
  5. Policy On Instant Messaging, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/qa/policies/instant-messaging/>

Briefing 59

A URI Interface To Web Testing Tools


Background

As described in other QA Focus briefing documents [1] [2] it is important to ensure that Web sites comply with standards and best practices in order to ensure that Web sites function correctly, to provide widespread access to resources and to provide interoperability. It is therefore important to check Web resources for compliance with standards such as HTML, CSS, accessibility guidelines, etc.

This document summarises different models for such testing tools and describes a model which is based on providing an interface to testing tools through a Web browser's address bar.

Models For Testing Tools

There are a variety of models for testing tools:

Although a variety of models are available, they all suffer from a lack of integration with the normal Web viewing and publishing process: there is a need to launch a new application or go to a new Web resource in order to perform the checking.

A URI Interface To Testing Tools

A URI interface to testing tools avoids the barrier of having to launch an application or move to a new Web page. With this approach, if you wish to validate a page on your Web site you simply append an argument (such as ,validate) to the URL in the browser's address bar while viewing the page. The page being viewed will then be submitted to an HTML validation service. This approach can be extended to recursive checking: appending ,rvalidate to a URI will validate pages beneath the current page.

This approach is illustrated. Note that this technique can be applied to a wide range of Web-based checking services including:

This approach has been implemented on the QA Focus Web site (and on UKOLN's Web site). For a complete list of tools available append ,tools to any URL on the UKOLN Web site or see [3].

Implementing The URI Interface

This approach is implemented using a simple Web server redirect. This has the advantage of being implemented in a single place and being available for use by all visitors to the Web site.

For example to implement the ,validate URI tool the following line should be added to the Apache configuration file:

RewriteRule /(.*),validate http://validator.w3.org/check?uri=http://www.foo.ac.uk/$1 [R=301]

where www.foo.ac.uk should be replaced by the domain name of your Web server (note that the configuration details should be given in a single line).

This approach can also be implemented on a Microsoft IIS platform, as described at [3].

References

  1. Compliance with HTML Standards, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-01/>
  2. Use Of Cascading Style Sheets (CSS), QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-34/>
  3. Web Site Validation and Auditing Tools, UKOLN,
    <http://www.ukoln.ac.uk/site/tools/>

Briefing 60

Top Tips For Selecting Open Source Software


About this document

Performance and reliability are the principal criteria for selecting software. In most procurement exercises, however, price is also a determining factor when comparing quotes from multiple vendors. Price comparisons do have a role, but usually not in terms of a simple comparison of purchase prices. Rather, price tends to arise when comparing "total cost of ownership" (TCO), which includes both the purchase price and ongoing costs for support (and licence renewal) over the real life span of the product. This document provides tips about selecting open source software.

The Top Tips

Consider The Reputation

Does the software have a good reputation for performance and reliability? Here, word-of-mouth reports from people whose opinion you trust are often key. Some open source software has a very good reputation in the industry, e.g. the Apache Web server, GNU Compiler Collection (GCC), Linux, Samba, etc. You should be comparing "best of breed" open source software against its proprietary peers. Discussing your plans with someone who has experience of using open source software and an awareness of the packages you are proposing to use is vital.

Monitor Ongoing Effort

Is there clear evidence of ongoing effort to develop the open source software you are considering? Has there been recent work to fix bugs and meet user needs? Active projects usually have regularly updated web pages and busy development email lists. They usually encourage the participation of those who use the software in its further development. If everything is quiet on the development front, it might be that work has been suspended or even stopped.

Look At Support For Standards And Interoperability

Choose software which implements open standards. Interoperability with other software is an important way of getting more from your investment. Good software does not reinvent the wheel, or force you to learn new languages or complex data formats.

Is There Support From The User Community?

Does the project have an active support community ready to answer your questions concerning deployment? Look at the project's mailing list archive, if available. If you post a message to the list and receive a reasonably prompt and helpful reply, this may be a sign that there is an active community of users out there ready to help. Good practice suggests that if you wish to avail yourself of such support, you should also be willing to provide support for other members of the community when you are able.

Is Commercial Support Available?

Third party commercial support is available from a diversity of companies, ranging from large corporations such as IBM and Sun Microsystems, to specialist open source organizations such as Red Hat and MySQL, to local firms and independent contractors. Commercial support is most commonly available for more widely used products or from specialist companies who will support any product within their particular specialism.

Check Versions

When was the last stable version of the software released? Virtually no software, proprietary or open source, is completely bug free. If there is an active development community, newly discovered bugs will be fixed and patches or a new version will be released. For enterprise use you need the most recent stable release of the software; be aware that there may have been many more recent releases in the unstable branch of development. There is, of course, always the option of fixing bugs yourself, since the source code of the software will be available to you. But that rather depends on your (or your team's) skill set and time commitments.

Think Carefully About Version 1.0

Open source projects usually follow the "release early and often" motto. While in development they may have very low version numbers. Typically a product needs to reach its 1.0 release prior to being considered for enterprise use. (This is not to say that many pre-"1.0" versions of software are not very good indeed, e.g. Mozilla's 0.8 release of its Firefox browser).

Check The Documentation

Open source software projects may lag behind in their documentation for end users, but they are typically very good with their development documentation. You should be able to trace a clear history of bug fixes, feature changes, etc. This may provide the best insight into whether the product, at its current point in development, is fit for your purposes.

Do You Have The Required Skill Set?

Consider the skill set of yourself and your colleagues. Do you have the appropriate skills to deploy and maintain this software? If not, what training plan will you put in place to match your skills to the task? Remember, this is not simply true for open source software, but also for proprietary software. These training costs should be included when comparing TCOs for different products.

What Licence Is Available?

Arguably, open source software is as much about the license as it is about the development methodology. Read the licence. Well-known licenses such as the General Public License (GPL) and the Lesser General Public License (LGPL) have well defined conditions for your contribution of code to the ongoing development of the software or the incorporation of the code into other packages. If you are not familiar with these licenses or with the one used by the software you are considering, take the time to clarify conditions of use.

What Functionality Does The Software Provide?

Many open source products are generalist and must be specialised before use. Generally speaking, the greater a product's generality, the more effort is required to specialise it. A more narrowly focused product will reduce the effort required to deploy it, but may lack flexibility. An example of the former is the GNU Compiler Collection (GCC); an example of the latter might be the Evolution email client, which works well "out of the box" but is only suitable for the narrow range of tasks for which it was intended.

Further Information

Acknowledgements

This document was written by Randy Metcalfe of OSS Watch. OSS Watch is the open source software advisory service for UK higher and further education. It provides neutral and authoritative guidance on free and open source software, and about related open standards.

The OSS Watch Web site is available at <http://www.oss-watch.ac.uk/>.


Briefing 61

Deployment Of Software Into Service


Background

The start of your project will involve a great deal of planning and work scheduling. If you will be developing software, this is also the best time to consider and plan for its long-term future and viability. Decisions on software development made in the early stages of a project are important as they will often govern the options open to you for deployment beyond the life of the project. Although some choices may be influenced by the current technical environment of your institution, early consideration of a range of deployment issues will allow the possibility of a greater number of hosting options at the end of project, so ensuring continued existence of the software you have developed, and long-term access to it.

Careful choices will also reduce the cost of the work required for deployment, and allow you to minimize the portion of your budget that you have to allocate to the Service Provider.

Choice of Platform

If possible, software should be developed on the same platform that will eventually be used for service delivery. Microsoft Windows and Unix (especially Solaris) servers are the main options.

Porting software developed on one platform to another may not be straightforward, even if the chosen application software is claimed to run on both platforms. Proven technical solutions are preferred - do you have examples where your chosen application software has been used on both platforms?

Development Environment

Software and Licensing Issues

If software licenses are required by the Service Provider, these must be available at a cost within the service budget. Be aware of licensing conditions: a Service Provider may require a commercial license if a charge is to be made for the service, whereas the project may be able to use an educational license. The cost of the various types of license may vary.

Care is also needed when choosing software that is free at the point of use to project staff, such as a University site licence for a commercial database system. Even though the project itself may incur no additional costs, licences could be prohibitively expensive for the Service Provider.

Consider the use of open source software [1] to avoid most licence problems! Good quality open source software can greatly reduce the cost of software development. Developers should be aware, however, that some 'open source' software is poorly written, inadequately documented, and entirely unsupported. Be aware that the costs of ongoing software maintenance, often undertaken by staff outside the original project, should also be factored in.

Best Practice

Good programming practice and documentation is very important. Well-written and structured software with comprehensive documentation will ease transition to a service environment and aid the work of the Service Provider [2]. It is better for a project to recruit a good engineer used to working in a professional development environment, than to recruit purely on the basis of specific technical skills. Also, try to code in languages commonly adopted in your application area: for example Java or Perl for Web programming. You can write Web applications in Fortran, but don't.

If possible a modular architecture is best. It will maximise the options for transfer to a Service Provider and also any future development. For example, if one application were used for a Web user interface and another for a database back end then, provided these communicate using open standards (Z39.50, standard SQL, for example), Web Services might be added to the service at a future date. A service built with a fully integrated single package of components that use proprietary native protocols might have to be scrapped and rebuilt to satisfy even fairly minor new requirements.

Use of Open Standards [3] should ensure portability, but there will still need to be technical structures supporting their use and deployment, whether in a project or service environment. You will need to document all the technical layers that need to be reproduced by the Service Provider in order for your software to run. Open standards can also give flexibility; for example the project and the service provider do not necessarily need to use the same SQL database, provided the standard is followed.
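
As an illustration of this flexibility, the sketch below (in Python, with the file name, section names and driver value invented for the example) keeps all database access behind a single function driven by a configuration file, so the Service Provider can substitute a different SQL database without touching application code.

  # Minimal sketch: isolating database access behind one module so the
  # back-end database can be swapped by the Service Provider.
  # The configuration file name, section and values are illustrative only.
  import configparser
  import importlib

  def get_connection(config_file="service.ini"):
      config = configparser.ConfigParser()
      config.read(config_file)
      driver = importlib.import_module(config["database"]["driver"])  # e.g. "sqlite3"
      return driver.connect(config["database"]["dsn"])

  # The rest of the application only ever sees standard DB-API connections
  # and standard SQL, so changing the database means editing service.ini only.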

Usability

Be aware of your intended user base. Ensure that any user interface developed during the project has been through usability tests and allow time for any feedback to be incorporated into the final design. A well-designed interface will mean fewer support calls for the Service Provider.

When designing your user interface remember that there are legal requirements to fulfil with regard to disability access which the Service Provider will need to be satisfied are met. The JISC TechDis [4] service provides information and advice. You may wish to consider provision of user documentation and training documentation in support of the service, which the Service Provider could use and make available.

Monitoring & Auditing

Comprehensive error reporting should be a feature of the deployed application. This will aid the Service Provider in identifying and solving problems. You should consider building comprehensive error reporting mechanisms into your software from the beginning, along with various mechanisms for escalating the severity of reported errors that threaten the viability of the service. These may range from simply logging errors to a file, through to emailing key personnel.
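
The following Python sketch illustrates one way such an escalation path might look; the addresses and file names are purely illustrative. Routine errors are written to a log file, while critical errors are additionally emailed to key personnel.

  import logging
  from logging.handlers import SMTPHandler

  log = logging.getLogger("service")
  log.setLevel(logging.INFO)

  # Routine errors go to a log file...
  log.addHandler(logging.FileHandler("service.log"))

  # ...while critical errors are escalated by email to key personnel.
  mail = SMTPHandler(mailhost="localhost",
                     fromaddr="service@example.org",
                     toaddrs=["admin@example.org"],
                     subject="Service error")
  mail.setLevel(logging.CRITICAL)
  log.addHandler(mail)

  log.error("Search index unavailable")        # logged to file only
  log.critical("Database connection lost")     # logged to file and emailed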

Services must be monitored. It should be possible to use a simple HTTP request (or equivalent for non-Web interfaces) to test that the service is available and running, without requiring a multi-step process (such as log in, initiate session and run a search).
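
A minimal availability check of this kind might look like the following Python sketch (the URL is an invented example); anything more elaborate than a single request defeats the purpose.

  import urllib.request

  def service_is_up(url="http://service.example.org/ping"):
      """Return True if a single, unauthenticated request succeeds."""
      try:
          with urllib.request.urlopen(url, timeout=10) as response:
              return response.status == 200
      except OSError:
          return False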

Logging is crucial for services, especially subscription services where customers need to monitor usage to assess value for money. Project COUNTER [5] defines best practice in this area. If project staff are still available, the Service Provider will then be able to provide you with logging information and potentially seek your input on future activity and development.

Authentication

Authentication and authorisation should be flexible since requirements are subject to change. Enable the service provider to execute an external script, or at least write their own module or object, rather than embedding the authentication mechanism in the user interface code.
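
One hedged illustration of this approach, in Python, is shown below: the user interface simply hands the decision to an external script (the script path and argument are invented), which the service provider can replace without touching interface code.

  import subprocess

  def is_authorised(username):
      """Delegate the access decision to a script the Service Provider controls."""
      result = subprocess.run(["/usr/local/bin/check_access", username],
                              capture_output=True)
      return result.returncode == 0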

Machine to machine connections

Where the product makes use of external middleware services (an example being OpenURL support), ensure these are fully configurable by the service provider. Configuration files are good, but the ability to add modules or objects for these features is better.
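
A possible arrangement, sketched below in Python with invented file and setting names, keeps the resolver address and an on/off switch in a configuration file read at start-up, so the service provider can point the service at a different resolver without code changes.

  # service.ini -- settings the Service Provider can change without touching
  # application code (section and key names are illustrative):
  #
  #   [openurl]
  #   resolver = http://resolver.example.org/openurl
  #   enabled  = true

  import configparser

  config = configparser.ConfigParser()
  config.read("service.ini")
  resolver = config.get("openurl", "resolver")
  use_openurl = config.getboolean("openurl", "enabled")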

Legal Issues

Although not a technical consideration, it is important and worth emphasising that the Service Provider will require that all copyright and IPR issues be clarified. Where software has been developed, does the institution at which project staff work have any IPR guidelines that must be followed? What provision is needed to allow the Service Provider to make changes to the software? Is a formal agreement between the project institution and the Service Provider needed?

Service Environment

If you have identified where your software could be hosted then make early contact with the Service Provider to discuss costs and any constraints that may arise in deployment.

The Service Provider will need to be confident that your application will be stable, will scale, and will perform acceptably in response to user demand. If this is not the case, the application will eventually become a bottleneck, tying up machine resources unproductively and making the service unresponsive.

You should ensure that the application is stress tested by an appropriate number of users issuing a representative number of service requests. There are also several tools available to stress test an application, but a prerequisite to this step is that the project team should be aware of their intended user base and the anticipated number of users and requests. The project team should also be aware of project and service machine architectures as divergence in architecture will affect the viability of any stress testing metrics generated. The Service Provider will want estimates of memory and processor use scaled by the number of simultaneous users. Performance and scalability will remain unresolved issues unless the project software can be tested in a service environment. If this is the case it is especially important to stick to proven technical solutions. You should discuss stress-testing results with your Service Provider as soon as possible.
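
A very simple load test might be scripted along the following lines (Python, with an invented URL); dedicated stress-testing tools will give richer metrics, but even a sketch like this yields response-time figures to discuss with the Service Provider.

  import time
  import urllib.request
  from concurrent.futures import ThreadPoolExecutor

  URL = "http://service.example.org/search?q=test"   # illustrative only

  def one_request(_):
      start = time.time()
      with urllib.request.urlopen(URL, timeout=30) as response:
          response.read()
      return time.time() - start

  # Simulate 50 simultaneous users issuing 500 requests in total.
  with ThreadPoolExecutor(max_workers=50) as pool:
      timings = list(pool.map(one_request, range(500)))

  print("slowest response: %.2f s" % max(timings))
  print("mean response:    %.2f s" % (sum(timings) / len(timings)))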

Adopting best practices is a good start to ensuring that your application will be stable. The discipline of producing well-written and properly documented code is one safeguard against the generation of bugs within code.

If there are likely to be service updates you will need to consider the procedures involved and detail how the new data will be made available and incorporated into the service. Service Providers will generally wish to store two copies of databases that require updates; one being used for service with the other instance being used for updates and testing. Updated databases will also require frequent backups whilst static data requires only one copy of the database without regular backups. Consider splitting large data sets into separate segments: a portion that is static (for example archive data added prior to 2001) and a smaller portion that is updated (data added since 2001). Also, aim to keep data and application software as separate as possible. Again, this will aid a backup regime in a service environment.

You should anticipate that the Service Provider may need to make changes to your software. This may be due to possible technical conflicts with other services hosted by the Service Provider, or may be due to their implementation policy or house style. Again, early contact with a possible Service Provider will highlight these issues and help avoid potential difficulties. Also consider if project staff will be available for referrals of errors or omissions in functionality. If not, you will need to allow the Service Provider to make changes to your software.

If further development of the software beyond the project is feasible you should agree a development schedule and a timetable for transfer to production, as provision of a continued and stable service will be of prime importance to the Service Provider. Major changes to a user interface will also have implications for support and user documentation. If no continued development is planned, the Service Provider may still wish to introduce bug fixes or new versions of any software you have used. Again, good documentation and well-documented code will ensure that problems are minimised.

You should consider under what circumstances your software should be withdrawn and cease to be made available through a Service Provider. If you would expect to be involved in the decision to withdraw the service then contact with project personnel will need to be maintained, or you will need to provide guidance at time of transfer to service about the possible lifetime of the hosting agreement.

Moving Your Software

Allow time before the end of the project to work with the Service Provider. The availability and expertise of project staff will influence the success of moving to service deployment and ultimately the associated costs.

A complete handover of the software without good contact with the project team and without support may well cause problems and will also take longer. This is particularly true if the application contains technologies unfamiliar to the Service Provider. The project team should be prepared to assist the Service Provider in areas where it has specialist expertise and, if possible, factor in continued access to project personnel beyond the end of the project.

Complete and full documentation detailing the necessary steps for installation and deployment of your software and the service architecture, will aid an optimum transition to hosting by a Service Provider. The Service Provider may not have exactly the same understanding and skill set as the project team itself, and will require explicit instructions. Alternatively, the Service Provider may request help from the project team in identifying a particular aspect of the service architecture that could be replaced with a preferred and known component. Deploying technologies that are unfamiliar to the Service Provider will reduce their responsiveness and effectiveness in handling problems with the application.

Consideration should be given to development of a test bed and test scripts that will allow the Service Provider to confirm correct operation of your software once installed in the service environment.
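
Such a test script need not be elaborate. The Python sketch below (with invented URLs) simply confirms that a handful of representative pages respond with an HTTP 200 status; real test scripts should also check that the responses contain the expected content.

  import urllib.request

  CHECKS = {
      "home page":   "http://service.example.org/",
      "search":      "http://service.example.org/search?q=history",
      "record view": "http://service.example.org/record/1",
  }

  for name, url in CHECKS.items():
      try:
          with urllib.request.urlopen(url, timeout=15) as response:
              status = "OK" if response.status == 200 else "FAIL (%s)" % response.status
      except OSError as exc:
          status = "FAIL (%s)" % exc
      print("%-12s %s" % (name, status))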

Things Not To Worry About

Backup and disaster recovery procedures are the responsibility of the Service Provider; do not waste project time on defining specific procedures for the service (but do, of course, back up your project work for your own benefit).

So You Want To Be Different...

Project activity is by its nature about exploring possibilities to develop new service functionality, and you may choose, or need, to utilise emerging tools and technologies. This approach to development and the software it produces may not fit comfortably with the desire of the Service Provider. A Service Provider wants software of service quality based on known solutions that ensures good use of resources and sustainability in a service environment. It is recognised that these opposing drives may be inevitable and that projects must be allowed to explore new technologies and methods, even at the expense of placing additional demands on Service Providers to resolve the problems of deployment.

If relatively immature technologies are being used it is very important that modular development procedures are used as much as possible. Where software has been developed in a modular fashion it will often be relatively straightforward to replace individual components; for example, to change to a different database application or servlet container. During the development process this means competing technologies, which may be at different stages of maturity, can be benchmarked against each other. At the deployment stage it means the option that provides best 'service quality' can be adopted.

Whatever choice of software environment is made, it is always wise to follow best practice by producing well-written and documented code.

It is worth stressing the benefits of contacting possible Service Providers to explore options at the start of a project: they too may be considering future strategy and it is possible that both your and their plans might benefit from co-operation.

References

  1. Top Tips For Selecting Open Source Software, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-60/>
  2. Software Code Development, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-13/>
  3. What Are Open Standards?, QA Focus, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-11/>
  4. TechDis,
    <http://www.techdis.ac.uk>
  5. Project COUNTER,
    <http://www.projectcounter.org>

Briefing 62

Digitising Data For Preservation


Background

Digitisation is a production process. Large numbers of analogue items, such as documents, images, audio and video recordings, are captured and transformed into the digital masters that a project will subsequently work with. Understanding the many variables and tasks in this process - for example the method of capturing digital images in a collection (scanning or digital photography) and the conversion processes performed (resizing, decreasing bit depth, converting file formats, etc.) - is vital if the results are to remain consistent and reliable. By documenting the workflow of digitisation, a life history can be built up for each digitised item. This information is an important way of recording decisions, tracking problems and helping to maintain consistency and give users confidence in the quality of your work.

What to Record

Workflow documentation should enable us to tell what the current status of an item is, and how it has reached that point. To do this the documentation needs to include important details about each stage in the digitisation process, and its outcome.

By recording the answers to these five questions at each stage of the digitisation process, the progress of each item can be tracked, providing a detailed breakdown of its history. This is particularly useful for tracking errors and locating similar problems in other items. The actual digitisation of an item is clearly the key point in the workflow, and therefore formal capture metadata (metadata about the actual digitisation of the item) is particularly important.

Where to Record the Information

Where possible, select an existing schema with a binding to XML:

Quality Assurance

To check your XML document for errors, QA techniques should be applied:
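
One such check is validation against the schema you have chosen. The Python sketch below assumes the lxml library is available; the schema and document file names are placeholders. It reports each validation error with its line number.

  from lxml import etree   # assumes the lxml library is installed

  schema = etree.XMLSchema(etree.parse("capture-metadata.xsd"))
  document = etree.parse("item-0001.xml")

  if schema.validate(document):
      print("item-0001.xml is valid")
  else:
      for error in schema.error_log:
          print(error)      # line number and message for each fault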

Further Information


Briefing 63

Choosing a Metadata Standard For Resource Discovery


Background

Resource discovery metadata is an essential part of any digital resource. If resources are to be retrieved and understood in the distributed environment of the World Wide Web, they must be described in a consistent, structured manner suitable for processing by computer software. There are now many formal standards. They range from simple to rich formats, from the loosely structured to the highly structured, and from proprietary and emerging standards to international standards.

There is no set decision-making procedure to follow but here are some factors that should normally be considered:

Purpose of metadata: A well-articulated definition of purposes at the outset can act as a benchmark against which to compare standards. Metadata may be for:

Attributes of resource: It is important that you also identify your resource type (e.g. text, image), its domain of origin (e.g. library, archive or museum), subject (e.g. visual arts, history) and the specific features that are essential to an understanding of it. Datasets, digital texts, images and multimedia objects, for instance, clearly have very different attributes. Does your resource have pagination or is it three-dimensional? Was it born digital or does it have a hard-copy source? Which attributes will the user need to know to understand the resource?

Design of standard: Metadata standards have generally been developed in response to the needs of specific resource types, domains or subjects. Therefore, once you know the type, domain and broad subject of your resource, you should be able to draw up a shortlist of likely standards. Here are some of the better-known ones:

The key attributes of your resource can be matched against each standard in turn to find the best fit. Is there a dedicated element for each attribute? Are the categories of information relevant and at a suitable level of detail?

Granularity: At this point it is worth considering whether your metadata should (as is usual) be created at the level of the text, image or other such item, or at collection level. Collection-level description may be provided where item-level metadata is not feasible, or as an additional layer providing an overview of the resource. This could be valuable for large-scale digitisation projects or portals where item-level searching may retrieve an unmanageable number of 'hits'. Digital reproductions may be grouped like their real-world sources - e.g. by subject or provenance - or be assigned to multiple 'virtual collections'. The RSLP Collection Level Description is emerging as the leading format in this area.

Interoperability: It is important, wherever possible, to choose one of the leading standards (such as those listed above) from within your subject community or domain. This should help to make your resource accessible beyond the confines of your own project. Metadata that is in a recognisable common format may be harvested by subject or domain-wide portals and cross-searched with resources from many other institutions. In-house standards may be tailored to your precise needs but are unlikely to be compatible with other standards and should be used only where nothing suitable already exists. If your over-riding need is for interoperability across all domains or subjects, Dublin Core may be the most suitable standard but it may lack the richness required for other purposes. Care should be taken to ensure that in-house standards at least map to Dublin Core or one of the DC Application profiles.

Support: Using a standard that is well supported by a leading institution can also bring cost benefits. Implementation guidance, user guidance, examples, XML/RDF schemas, crosswalks, multi-lingual capacity, and software tools may pre-exist, thus easing the process of development, customisation and update.

Growth: Consider too whether the standard is capable of further development. Are there regular working groups and workshops devoted to the task?

Extensibility: Also, does the standard permit the inclusion of data elements drawn from other schemas and the description of new object types? It may be necessary to 'mix and match' elements from more than one standard.

Reputation: Funding bodies will be familiar with established, international standards - something, perhaps, to remember when applying for digitisation grants.

Ease of use: Be aware that the required level of expertise can vary greatly between standards. AACR2 and MARC 21, for instance, may produce rich bibliographic description but require the learning of rules. The simpler Dublin Core may allow creators to produce their own metadata records with no extensive training.

Existing experience: Have staff at your organisation used the metadata standard before? If so, the implementation time may be reduced.

Summary

There is no single standard that is best for all circumstances. Each is designed to meet a need and has its own strengths and weaknesses. Start by considering the circumstances of the individual digital project and identify the need(s) or purpose(s) that the metadata will need to satisfy. Once that is done, one can evaluate rival metadata schemas and find the best match. A trade-off will normally have to be made between the priorities listed above.

Further Information


Briefing 64

Metadata And Subject Searching


Introduction

Digital collections are only likely to make an impact on the Web if they are presented in such a way that users can retrieve their component parts quickly and easily. This is true even if they have been well selected, digitised to a suitable standard and have appropriate metadata formats. Subject-based access to the collection through searching and/or browsing a tree-like structure can greatly enhance the value of your resource.

Subject Access - Some Options

Subject-based access can be provided in several ways:

Keywords: A simple but crude method is to anticipate the terms that an unguided searcher might intuitively choose and insert them into a keyword field within relevant records. For instance, the text of Ten days that shook the world [1], a classic narrative of the events of 1917, is more likely to be retrieved if the keywords Russian Revolution are added by the cataloguer (based on his/her analysis of the resource and subject knowledge) and if the keyword field is included in the search. In the absence of an agreed vocabulary, however, variant spellings (labor versus labour), and synonyms or near synonyms (Marxist versus Communist) that distort retrieval are likely to proliferate.

Thesauri and subject schemes: Controlled vocabularies, known as thesauri, can prevent inconsistent description and their use is recommended. They define preferred terms and their spelling. If the thesaurus structure is shown on the search interface, users may be guided through broader-term, narrower-term and associated-term relationships to choose the most appropriate keyword with which to search. Take care to choose a vocabulary appropriate to the scope of your resource. A broad and general collection might require a correspondingly universal vocabulary, such as the Library of Congress Subject Headings (LCSH) [2]. A subject-specific vocabulary, such as the Getty Art and Architecture Thesaurus (AAT) [3], may provide a more limited but detailed range of terms appropriate for a tightly focused collection.

Classification schemes: Keywords and thesauri are primarily aids to searching, but browsing can often be a more rewarding approach - particularly for users new to a given subject area. Thesauri are not always structured ideally for browsing, as when related or narrower terms are listed alphabetically rather than by topical proximity. Truly effective browsing requires the use of a subject classification scheme. A classification scheme arranges resources into a hierarchy on the basis of their subject but differs from a thesaurus in using a sophisticated alphanumeric notation to ensure that related subjects will be displayed in close, browsable, proximity. A well-designed classification scheme should present a navigable continuum of topics from one broad subject area to another and in this way guide the user to related items that might otherwise be missed, as in this example from the Dewey Decimal Classification (DDC) [4].

700 Arts, fine and decorative
740 Drawing and decorative arts
745 Decorative arts
745.6 Calligraphy, heraldic design, illumination
745.66 Heraldic design

The notation does not necessarily have to be displayed on screen, however. The subject terms, rather than their respective numbers, may mean more to the user. Another tip is to assign multiple classification numbers to any item that crosses subjects. Digital items can have several 'virtual' locations, unlike a book, which is tied to a single position on a shelf.

Keywords, thesauri and classification can be used in combination or individually.

Choosing a Classification Scheme

The most important consideration when choosing a classification scheme is to select the one that best fits the subject, scope and intended audience of your resource.

Universal classification schemes: These are particularly appropriate where collections and their audiences span continents, subjects and languages. Dewey Decimal Classification (DDC) [5], for instance, is the most widely recognised scheme worldwide, whilst UDC (Universal Decimal Classification) [6] is predominant in Europe and Asia. Well-established schemes of this sort are most likely to have user-friendly online implementation tools.

National or subject-specific schemes: More specific collections are usually best served by schemes tailored to a single country (e.g. the Nederlandse Basisclassificatie, BC [7]), language, or subject (e.g. the National Library of Medicine classification, NLM [8]). If nothing suitable exists, a scheme can be created in-house.

Homegrown schemes: Project-specific schemes can be flexible, easy to change and suited wholly to one's own needs so that there are no empty categories or illogical subject groupings to hinder browsing. However, the development process is costly, time-consuming and requires expert subject-knowledge. Users are sure to be unfamiliar with your categories and, perhaps worst of all, such schemes are unlikely to be interoperable with the broader information world and will hinder wider cross searching. They should be regarded very much as a last resort.

Adapting an existing scheme: A better approach is normally to adapt an existing scheme by rearranging empty classes, raising or lowering branches of the hierarchy, renaming captions, or extending the scheme. Be aware, though, that recurring notation may be found within a scheme at its various hierarchical levels or the scheme might officially be modified over time, both of which can lead to conflict between the official and customised versions. Take care to document your changes to ensure consistency through the lifetime of the project. Some well-known Internet search services (e.g. Yahoo!) [9] have developed their own classifications but there is no guarantee that they will remain stable or even survive into the medium term.

Double classification: It may be worthwhile classifying your resource using a universal scheme for cross-searching and interoperability in the wider information environment and at the same time using a more focused scheme for use within the context of your own Web site. Cost is likely to be an issue that underpins all of these decisions. For instance, the scheme you wish to use may be freely available for use on the Internet or alternatively you may need to pay for a licence.

References

  1. Ten Days That Shook the World, Reed, J. New York: Boni & Liveright, 1922; Bartleby.com, 2000,
    <http://www.bartleby.com/79/>
  2. Library of Congress Authorities,
    <http://authorities.loc.gov/>
  3. Art & Architecture Thesaurus Online,
    <http://www.getty.edu/research/conducting_research/vocabularies/aat/>
  4. Dewey Decimal Classification and Relative Index, Dewey, M. in Joan S. Mitchell et al (ed.), 4 vols, (Dublin, Ohio: OCLC, 2003), Vol. 3, p. 610
  5. Dewey Decimal Classification,
    <http://www.oclc.org/dewey/>
  6. Universal Decimal Classification,
    <http://www.udcc.org/>
  7. Nederlandse Basisclassificatie (Dutch Basic Classification),
    <http://www.kb.nl/dutchess/>
  8. National Library of Medicine,
    <http://wwwcf.nlm.nih.gov/class/>
  9. Yahoo!,
    <http://www.yahoo.com/>

Further Information


Briefing 65

Audio For Low-Bandwidth Environments


Background

Audio quality is surprisingly difficult to predict in a digital environment. Quality and file size can depend upon a range of factors, including vocal type, encoding method and file format. This document provides guidelines on the most effective method of handling audio.

Factors To Consider

When creating content for the Internet it is important to consider the hardware the target audience will be using. Although the number of users with a broadband connection is growing, the majority of Internet users utilise a dial-up connection to access the Internet, limiting them to a theoretical 56kbps (kilobits per second). To cater for these users, it is useful to offer smaller files that can be downloaded faster.

The file size and quality of digital audio is dependent upon three factors:

  1. File format
  2. Encoding method
  3. Type of audio

By understanding how these three factors contribute to the actual file size, it is possible to create digital audio that requires less bandwidth, but provides sufficient quality to be understood.

File Format

File format denotes the structure and capabilities of digital audio. When choosing an audio format for Internet distribution, a lossy format that encodes using a variable bit-rate is recommended. Streaming support is also useful for delivering audio data over a sustained period without the need for an initial download. These formats use mathematical calculations to remove superfluous data and compress it into a smaller file size. Several popular formats exist, many of which are household names. MP3 (MPEG Audio Layer III) is popular for Internet radio and non-commercial use. Larger organisations, such as the BBC, use RealAudio (RA) or Windows Media Audio (WMA), partly because of their digital rights support. Table 1 shows a few of the options that are available.

Format                Compression  Streaming  Bit-rate
MP3                   Lossy        Yes        Variable
Mp3PRO                Lossy        Yes        Variable
Ogg Vorbis            Lossy        Yes        Variable
RealAudio             Lossy        Yes        Variable
Windows Media Audio   Lossy        Yes        Variable

Table 1: File Formats Suitable For Low-Bandwidth Delivery

Once recorded audio is saved in a lossy format, it is wise to listen to the audio data to ensure it is audible and that essential information has been retained.

Finally, it is recommended that a variable bit-rate is used. For speech, this will usually vary between 8 and 32kbps as needed, adjusting the variable rate accordingly if incidental music occurs during a presentation.
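
To gauge what these figures mean in practice, the short calculation below (Python, purely illustrative) estimates the size of a ten-minute spoken-word recording encoded at 32kbps and the time needed to download it over a 56kbps dial-up connection.

  # Rough file size and dial-up download time for a spoken-word recording.
  duration_seconds = 10 * 60          # a ten-minute talk
  bit_rate_kbps    = 32               # upper end of the speech range above

  file_size_kb  = duration_seconds * bit_rate_kbps / 8    # kilobytes
  download_time = duration_seconds * bit_rate_kbps / 56   # seconds at 56kbps

  print("size: about %.0f KB" % file_size_kb)             # roughly 2,400 KB
  print("download: about %.0f seconds" % download_time)   # roughly 5-6 minutes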

Choosing An Appropriate Encoding Method

The audio quality required, in terms of bit-rate, to record audio data is influenced significantly by the type of audio that you wish to record: music or voice.

Assessing Quality Of Audio Data

The creation of audio data for low-bandwidth environments does not necessitate a significant loss in quality. The audio should remain audible in its compressed state. Specific checks may include the following questions:

Further Information


Briefing 66

Producing And Improving The Quality Of Digitised Images


Introduction

To produce high-quality digital images you should follow certain rules to ensure that the image quality is sufficient for the purpose. This document presents guidance on digitising and improving image quality when producing a project Web site.

Choose Suitable Source Material

Quality scans start with quality originals - high-contrast photos and crisp B&W line art will produce the best printed results. Muddy photos and light-coloured line art can be compensated for, but the results will never be as good as with high-quality originals. The use of bad photos, damaged drawings, or tear sheets - pages that have been torn from books, brochures, and magazines - will have a detrimental effect upon the resultant digital copy. If multiple copies of a single image exist, it is advisable to choose the one with the highest quality.

Scan at a Suitable Resolution

It is often difficult to improve scan quality at a later stage. It is therefore wise to scan the source according to consistent, pre-defined specifications. Criteria should be based upon the type of material being scanned and the intended use. Table 1 indicates the minimum quality that projects should choose:

Use               Type      Dots Per Inch (dpi)
Professional      Text      200
                  Graphics  600
Non-professional  Text      150
                  Graphics  300

Table 1: Guidelines To Scanning Source Documents

Since most scans require subsequent processing (e.g. rotating an image to align it correctly) that will degrade image quality, it is advisable to work at a higher resolution and resize the scans later.

Once the image has been scanned and saved in an appropriate file format, measures should be taken to improve the image quality.

Straighten Images

For best results, an image should lie with its sides parallel to the edge of the scanner glass. Although it is possible to straighten images that have been incorrectly digitised, doing so may introduce unnecessary distortion into the digital image.

Sharpen the Image

To reduce the amount of subtle blur (or 'fuzziness') and improve visual quality, processing tools may be used to sharpen, smooth, improve the contrast level or perform gamma correction. Most professional image editing software contains filters that perform this function automatically.
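
As an illustration, the following Python sketch uses the Pillow imaging library (the file names are invented) to apply an unsharp mask and a slight contrast boost; the same operations are available interactively in Photoshop and similar packages.

  from PIL import Image, ImageFilter, ImageEnhance   # assumes Pillow is installed

  image = Image.open("scan-0001.tif")

  # Reduce subtle blur with an unsharp mask, then lift the contrast slightly.
  sharpened = image.filter(ImageFilter.UnsharpMask(radius=2, percent=100))
  adjusted  = ImageEnhance.Contrast(sharpened).enhance(1.1)

  adjusted.save("scan-0001-sharpened.tif")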

Correct Obvious Faults

Scanned images are often affected by many problems. Software tools can be used to remove the most common faults:

Be careful you do not apply the same effect twice. This can create unusual effects that distract the observer when viewing the picture.

Further Information


Briefing 67

Implementing and Improving Structural Markup


Background

Digital text has existed in one form or another since the 1960s. Many computer users take for granted that they can quickly write a letter without restriction or technical considerations. This document provides advice for improving the quality of structural mark-up, emphasising the importance of good documentation, use of recognised standards and providing mappings to these standards.

Why Should I Use Structural Mark-Up?

Although ASCII and Unicode are useful for storing information, they are only able to describe each character, not the way characters should be displayed or organised. Structural mark-up languages enable the designer to dictate how information will appear and to establish a structure for its layout. For example, the user can define a tag to store book author information and publication date.

The use of structural mark-up can provide many organizational benefits:

The most common mark-up languages are SGML and XML. Based upon these languages, several schemas have been developed to organise and define data relationships. This allows certain elements to have specific attributes that define their method of use (see the Digital Rights document for more information). To ensure interoperability, XML is advised due to its support for contemporary Internet standards (such as Unicode).
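
As a simple illustration of structural mark-up generated programmatically, the Python sketch below uses the standard library's ElementTree module to create a small record with author and publication date elements; the element names are invented for the example and do not follow any recognised schema.

  import xml.etree.ElementTree as ET

  book = ET.Element("book")
  ET.SubElement(book, "author").text = "Reed, John"
  ET.SubElement(book, "publicationDate").text = "1922"
  ET.SubElement(book, "title").text = "Ten Days That Shook the World"

  # Write the structured record to disk with an XML declaration.
  ET.ElementTree(book).write("book.xml", encoding="utf-8", xml_declaration=True)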

Improving The Quality Of Structural Mark-Up

For organisations that already utilise structural mark-up the benefits are already apparent. However, some consideration should be made on improving the quality of descriptive data. The key to improving data quality is twofold: utilise recognised standards whenever possible; and establish detailed documentation on all aspects of the schema.

Documentation: Documentation is an important, if often ignored, aspect of software development. Good documentation should establish the purpose of the structural data, provide examples, and identify the source of the data. Good documentation will allow others to understand the XML without ambiguity.

Use recognised standards: Although there are many circumstances where recognised schemas are insufficient for the required task, the designer should investigate relevant standards and attempt to align any bespoke solution with them. In the long term this will have several benefits:

  1. The project can take advantage of existing knowledge in the field, allowing it to cover areas where it has limited or no experience.
  2. Access to content is improved by supporting proven standards, such as SVG.
  3. The time required to map the data to alternative schemas used by other organisations will be reduced significantly.

TEI, Dublin Core and others provide cross-subject metadata elements that can be combined with subject specific languages.

Provide mappings to recognised standards: Through the creation of different mappings the developer will standardise and enhance their approach to schema creation, removing potential ambiguities and other problems that may arise. From an organisational standpoint, mappings will also ease co-operation with partner organisations and broaden the options for using information in new ways.

Follow implementation conventions: In addition to implementing recognised standards, it is important that the developer follows existing conventions when constructing elements. Depending on the circumstances, this may involve the use of an existing data dictionary or an examination of XML naming rules. Controlled languages (for example RDF, SMIL, MathML and SVG) use such conventions to express specific localised knowledge.

Further Information


Briefing 68

Techniques To Assist The Location And Retrieval Of Local Images


Summary

Use of a consistent naming scheme and directory structure, as well as a controlled vocabulary or thesaurus, improves the likelihood that digitised content captured by many people over an extended period will be organised in a consistent manner that avoids ambiguity and allows items to be located quickly.

This QA paper describes techniques to aid the storage and successful location of digital images.

Storing local images

Effective categorization of images stored on a local drive can be equally as important as storing them in an image management system. Digitisation projects that involve the scanning and manipulating of a large number of images will benefit from a consistent approach to file naming and directory structure.

An effective naming convention should identify the categories that will aid the user when finding a specific file. To achieve this, the digitisers should ask themselves:

This can be better described with an example. A digitisation project is capturing photographs taken in wartime Britain. The project has identified location, year and photographer as search criteria for locating images. To organise this information in a consistent manner the project team should establish a directory structure, a common vocabulary and shorthand terms for describing specific locations. Figure 1 outlines a common description framework:

A sample naming convention

Potential Problems

To avoid problems that may occur when the image collection expands or is transferred to a different system, the naming convention should also take account of the possibility that:

Naming conventions will allow the project to avoid the majority of these problems. For example, a placeholder may be chosen if one of the identifiers is unknown (e.g. 'ukn' for unknown location, 9999 for year). Special care should be taken to ensure this placeholder is not easily mistaken for a known location or date. Additional criteria, such as other photo attributes or a numbering system, may also be used to distinguish images taken by the same person, in the same year, at the same location.
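
A naming convention of this kind is easy to enforce with a small script. The Python sketch below (the place and photographer names are invented) builds file names from the location, year, photographer and a sequence number, substituting the agreed placeholders when a value is unknown.

  # Illustrative only: builds names like "london_1941_beaton_0001.tif"
  # from the location / year / photographer criteria described above.
  def image_filename(location, year, photographer, sequence):
      location = (location or "ukn").lower()        # 'ukn' for unknown location
      year = year or 9999                           # 9999 for unknown year
      photographer = (photographer or "unknown").lower()
      return "%s_%04d_%s_%04d.tif" % (location, year, photographer, sequence)

  print(image_filename("London", 1941, "Beaton", 1))
  print(image_filename(None, None, "Beaton", 2))    # placeholders for unknowns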

Identification of Digital Derivatives

Digital derivatives (i.e. images that have been altered in some way and saved under a different name) introduce further complications in how you distinguish the original from the altered version. This will vary according to the type of changes made. On a simple level, you may choose a different file extension or store files in two different directories (original and modified). Alternatively you may append additional criteria to the filename (e.g. _sm for smaller images or thumbnails, _orig and _modif for original and modified versions).

Further Information


Briefing 69

QA In The Construction Of A TEI Header


Background

Since the TEI header is still a relatively recent development, there has been a lack of clear guidelines as to its implementation, with the result that metadata has tended to be poor and sometimes erroneous. The implementation of a standard approach to metadata will improve the quality of data and increase the likelihood of locating relevant information.

Structure of a TEI header

The TEI header has a well-defined structure that may provide information analogous to that of a title page for printed text. The <teiHeader> element contains four major components:

  1. FileDesc: The mandatory <fileDesc> element contains a full bibliographic description of an electronic file.
  2. EncodingDesc: The <encodingDesc> element details the relationship between the electronic text and the source (or sources) from which it was derived. Its use is highly recommended.
  3. ProfileDesc: The <profileDesc> element provides a detailed description of any non-bibliographic aspects of a text. Specifically the languages and sublanguages used, the situation in which it was produced, or the participants and their setting.
  4. RevisionDesc: The <revisionDesc> element provides a change log in which each change made to a text may be recorded. The log may be recorded as a sequence of <change> elements, each of which documents a single change (typically the date, the person responsible and a brief description).

A corpus or collection of texts which share many characteristics may have one header for the corpus and individual headers for each component of the corpus. In this case the type attribute indicates the type of header. For example, <teiHeader type="corpus"> indicates the header for corpus-level information.
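
The overall shape of the header can be seen in the following Python sketch, which uses ElementTree to build an empty skeleton of the four components (with the mandatory children of <fileDesc>); a real header must, of course, be populated and validated against the TEI DTD or schema.

  import xml.etree.ElementTree as ET

  header = ET.Element("teiHeader")
  file_desc = ET.SubElement(header, "fileDesc")     # mandatory component
  ET.SubElement(file_desc, "titleStmt")
  ET.SubElement(file_desc, "publicationStmt")
  ET.SubElement(file_desc, "sourceDesc")
  ET.SubElement(header, "encodingDesc")             # highly recommended
  ET.SubElement(header, "profileDesc")
  ET.SubElement(header, "revisionDesc")

  ET.dump(header)    # prints the empty skeleton for inspection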

Some of the header elements contain running prose that consists of one or more <p>s. Others are grouped:

What Standards Should I Conform To?

The cataloguer should observe the Anglo-American Cataloguing Rules, 2nd ed. (revised), AACR2, and the International Standard Bibliographic Description for Electronic Resources, ISBD(ER), when creating new headers. AACR2 is used in the Source Description of the header, which is primarily concerned with printed material, whereas ISBD(ER) is used more heavily in the rest of the File Description, in which the electronic file is being described.

Further Information


Briefing 70

Establishing a Digital Repository


Background

Digital repositories are often thought of primarily as a computer system, consisting of hardware, software and networks, but they are more than this. Digital repositories are organisations similar in purpose to libraries or archives and, just as it does for these organisations, quality assurance should form an integral part of the work of a digital repository.

Repository Requirements

A digital repository should:

A digital repository intent on long-term retention of its holdings should conform to the Reference Model for an Open Archival Information System (OAIS) [1].

Useful information is available in the QA Focus Briefing papers on "From Project To Production Service" [2] and "Planning An End User Service" [3].

Collections Management Policy and Procedures

Quality assurance can be incorporated into the work of a digital repository through the establishment of formal (but not necessarily complex) policies and procedures.

The CEDARS project suggested that collections management policies should cover: selection, acquisition, organisation, storage, access (user registration and authentication, delivery of master versions), de-selection, and preservation. Policies developed to cover these topics should be subject to internal and external review as part of a formal approval process. Policies should be reviewed at regular intervals.

Policies should be written to conform to the requirements of relevant legislation, notably the Data Protection Act, 1998.

The day-to-day operation of the repository should be connected to its overall policy framework through the development of procedures. Procedures should be:

Digital repositories need to make use of a wide range of standards and best practices for data creation, metadata creation, data storage, transmission and for many other areas. Many of these topics are discussed in more detail in other QA Focus documents. Selection of technical standards should take particular account of the guidance in QA Focus briefing papers on "Matrix for Selection of Standards" [4] and "Top Tips For Selecting Open Source Software" [5].

Rights and Responsibilities

A digital repository should operate within a clear legal framework that establishes the rights and responsibilities of the repository, its depositors and its users. A formal agreement should be established between each depositor and the repository, by way of a signed licence form or other technique (e.g. an unavoidable online licence agreement).

This agreement should limit the liability of the repository (e.g. where a depositor does not have copyright), while granting the repository the right to manage or withdraw content. Otherwise, the depositor's rights should be protected and any limits on the service provided by the repository should be made clear (such as limits on how long data will be stored, and whether migration or other preservation actions will be undertaken).

References

  1. Reference Model for an Open Archival Information System (OAIS), Consultative Committee for Space Data Systems, January 2002
    <http://ssdoo.gsfc.nasa.gov/nost/wwwclassic/documents/pdf/CCSDS-650.0-B-1.pdf>
  2. From Project To Production Service, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-38/>
  3. Planning An End User Service, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-39/>
  4. Matrix for Selection of Standards, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-31/>
  5. Top Tips For Selecting Open Source Software, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-60/>

Further Information


Briefing 71

QA Techniques For The Storage Of Image Metadata


Background

The archiving of digital images requires consideration of the most effective method of storing technical and life-cycle information. Metadata is a common method of describing digital resources; however, the different approaches available may confuse many users.

This paper describes QA techniques for choosing a suitable method of metadata storage that takes into account the need for interoperability and retrieval.

Choosing a Suitable Metadata Association Model

Metadata may be associated with an image in three ways:

Internal Model:
Metadata is stored within the image file itself, either through an existing metadata mapping or attached to the end of an image file in an ad hoc manner. This makes it simple to transfer metadata alongside image data without special requirements or considerations. However, support for metadata structures differs between file formats, and assigning the same metadata record to multiple images causes inefficient duplication in comparison to a single metadata record associated with a group of images.
External Model:
A unique identifier is used to associate external metadata with an image file, e.g. an image may be stored on a local machine while the metadata is stored on a server. This is better suited to a repository and is more efficient when storing duplicate information about a large number of objects. However, broken links may occur if the metadata record is not modified when an image is moved, or vice versa. Intellectual Property data and other information may be lost as a result.
Hybrid Model:
Uses both internally and externally associated metadata. Some metadata (file headers/tags) is stored directly in the image file while additional workflow metadata is stored in an external database. The deliberate design of the external record offers a common application profile across file formats and provides a method of incorporating format-specific metadata into the image file itself. However, it shares the disadvantages of the internal and external models in terms of duplication and broken links.

When considering the storage of image metadata, the designer should consider three questions:

  1. What type of metadata do you wish to store?
  2. Is the file format capable of storing metadata?
  3. What environment is the metadata intended to be stored and used within?

The answer to these questions should guide the choice of the metadata storage model. Some file formats are not designed to store metadata and will require supplementation through the external model; other formats may not store data in sufficient detail for your requirements (e.g. lifecycle data). Alternatively, you may require IP (Intellectual Property) data to be stored internally, which will require a file format that supports these elements.
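
As a simple illustration of the external model, the Python sketch below (identifiers, fields and file names are invented) keeps descriptive records in a store keyed on the image identifier and writes them out as a separate file alongside the images.

  import json

  # External model: the image file carries only its identifier in its name;
  # the descriptive record lives in a separate store keyed on that identifier.
  metadata_store = {
      "img-0001": {"title": "High Street, 1941",
                   "photographer": "unknown",
                   "rights": "Copyright not evaluated"},
  }

  def record_for(image_filename):
      identifier = image_filename.rsplit(".", 1)[0]   # "img-0001.tif" -> "img-0001"
      return metadata_store.get(identifier)

  print(record_for("img-0001.tif"))

  with open("metadata.json", "w") as f:
      json.dump(metadata_store, f, indent=2)          # sidecar metadata file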

Ensuring Interoperability

Metadata is intended for the storage and retrieval of essential information regarding the image. In many circumstances, it is not possible to store internal metadata in a format that may be read by different applications. This may be for a number of reasons:

Before choosing a specific image format, you should ensure the repository software is able to extract metadata and that editing software does not corrupt the data if changes are made at a later date. To increase the likelihood of this, you should take one of the following approaches:

Although this will not guarantee interoperability, these measures will increase the likelihood that it may be achieved.

Structuring Your Image Collection

To organise your image collection into a defined structure, it is advisable to develop a controlled vocabulary. If providing an online resource, it is useful to identify your potential users, the academic discipline from which they originate, and the language they will use to locate images. Many repositories have a well-defined user community (archaeology, physics, sociology) that share a common language and similar goals. In a multi-discipline collection it is much more difficult to predict the terms a user will use to locate images. The US Library of Congress [3], the New Zealand Time Frames [4] and International Press Telecommunications Council (IPTC) [5] provide online examples of how a controlled vocabulary hierarchy may be used to catalogue images.

References


Briefing 72

Using The QA For Web Toolkit


About The QA Focus Toolkits

The QA Focus Toolkits are an online resource which can be used as a checklist to ensure that your project or service has addressed key areas, helping to ensure that your deliverables are fit for their intended purpose, widely accessible and interoperable, and can be easily repurposed.

The QA For Web Toolkit is one of several toolkits which have been developed by the QA Focus project to support JISC's digital library programmes. This toolkit addresses compliance with standards and best practices for Web resources.

Accessing The QA For Web Toolkit

The QA For Web Toolkit is available from <http://www.ukoln.ac.uk/qa-focus/toolkit/>. The toolkit is illustrated in Figure 1:

Figure 1: The QA For Web Toolkit

Coverage

The toolkit addresses the following key areas:

Embedding The Toolkit In Your Work

The toolkit can provide access to a set of online checking services.

You should seek to ensure that systematic checking is embedded within your work. If you simply make occasional use of such tools you may fail to spot significant errors. Ideally you will develop a systematic set of workflow procedures which will ensure that appropriate checks are carried out consistently.

You should also seek to ensure that you implement systematic checks in areas in which automated tools are not appropriate or available.

You may wish to use the results you have found for audit trails of compliance of resources on your Web site.

About The QA For Web Toolkit Resource

The QA For Web Toolkit described in this document provides a single interface to several online checking services hosted elsewhere. The QA Focus project and its host organisations (UKOLN and AHDS) have no control over the remote online checking services. We cannot guarantee that the remote services will continue to be available.

Further Information

Further toolkits are available at <http://www.ukoln.ac.uk/qa-focus/toolkit/>.

Briefing 73

Using The QA For Metadata Toolkit


About The QA Focus Toolkits

The QA For Metadata Toolkit is an online resource which can be used as a checklist to ensure that your project or service has addressed key areas, helping to ensure that the metadata you use in your service is fit for its intended purpose and that your application will be interoperable.

The QA For Metadata Toolkit is one of several toolkits which have been developed by the QA Focus project to support JISC's digital library programmes.

Accessing The QA For Metadata Toolkit

The QA For Metadata Toolkit is available from <http://www.ukoln.ac.uk/qa-focus/toolkit/>. The toolkit is illustrated in Figure 1:

Figure 1: The QA For Metadata Toolkit

Coverage

The toolkit addresses the following key areas:

Embedding The Toolkit In Your Project Activities

The toolkit can provide access to a set of online checking services.

The toolkit can provide a simple checklist for ensuring that your project has addressed key areas in the development and deployment of metadata. As well as providing an aide memoire for projects the toolkit may also be useful in a more formal context. For example the answers could be used in initial scoping work at the early stages of a project or in reports to the project funders. In addition answers to the issues raised may be helpful for other potential users of the metadata or the final service provider of the project deliverables.

About The QA For Metadata Toolkit Resource

The QA For Metadata Toolkit described in this document provides a single interface to several online checking services hosted elsewhere. The QA Focus project and its host organisations (UKOLN and AHDS) have no control over the remote online checking services. We cannot guarantee that the remote services will continue to be available.

Further Information

Further toolkits are available at <http://www.ukoln.ac.uk/qa-focus/toolkit/>.

Briefing 74

Improving The Quality Of Digitised Images


Summary

A digitised image requires careful preparation before it is suitable for distribution. This document describes a workflow for improving the quality of scanned images by correcting faults and avoiding common errors.

Preparing your master image

The sequence in which modifications are made will make a significant contribution to the quality of the final image. Although conformance to a strict sequence is not always necessary, inconsistencies may be introduced if the order varies dramatically between images. The Technical Advisory Service for Images (TASI) recommends the following order:

  1. Does the image require rotation or cropping?
    In many circumstances, the digitiser will not require the entire image. Cropping an image to a specific size, shape or orientation will reduce the time required for the computer to manipulate the image and focus later correction work on the parts of the image that are considered important.
  2. Are shades and colours difficult to distinguish?
    Scanners and digital cameras often group colours into a specific density range. This makes it difficult to differentiate shades of the same colour. Use the Histogram function within Photoshop (or other software) and adjust the levels to make best use of the range of available tones.
  3. Is the colour balance accurate in comparison to the original?
    Some colours may change when digitised, e.g. bright orange may change to pink. Adjust the colour balance by modifying the Red, Green & Blue settings. Decreasing one colour increases its opposite.
  4. Are there faults or artefacts on the image?
    Visual checks should be performed on each image, or a selection of images, to identify faults, such as dust specks or scratches on the image.

Once you are satisfied with the results, the master image should be saved in a lossless image format - RGB Baseline TIFF Rev 6 or PNG are acceptable for this purpose.

Improving image quality

Subsequent improvements by resizing or sharpening the image should be performed on a derivative.

  1. Store work-in-progress images in a lossless format
    Digitisers often get into the habit of making modifications to a derivative image saved in a 'lossy' format, i.e. a format that simplifies detail to reduce file size. This is considered bad practice, will reduce quality and cause compression 'artefacts' to appear over subsequent edits. When repeatedly altering an image it is advisable to save the image in a lossless format (e.g. TIFF, PNG) until the image is ready for dissemination. Once all changes have been made it can be output in a lossy format.
  2. Filter the image
    Digitised images often appear 'noisy' or contain dust and scratches. Professional graphics packages (Photoshop, Paint Shop Pro, etc.) provide filters that can be useful in removing these effects (see the sketch after this list). Common filters include 'Despeckle', which subtly blurs an image to reduce the amount of 'noise', and 'Median', which blends the brightness of pixels and discards pixels that are radically different from adjacent pixels.
  3. Remove distracting effects
    If you digitise printed works, moiré (pronounced more-ray) effects may be a problem. Magazine or newspaper illustrations that print an image as thousands of small coloured dots produce a noticeable repeating pattern when scanned. Blur effects, such as the Gaussian blur, are an effective method of reducing noticeable moiré effects; however, these also reduce image quality. Resizing the image is also an effective strategy that forces the image-processing tool to re-interpolate colours, which will soften the image slightly. Although these effects will degrade the image to an extent, the results are often better than a moiré pattern.

Further Information


Briefing 75

Digitisation Of Still Images Using A Flat-Bed Scanner


Preparing For A Large-Scale Digitisation Project

The key to the development of a successful digitisation project is to separate it into a series of stages. All projects planning to digitise documents should establish a set of guidelines to help ensure that the scanned images are complete, consistent and correct. This process should consider the proposed input and output of the project, and then find a method of moving from the first to the second.

This document provides preparatory guidance to consider when approaching the digitisation of many still images using a flatbed scanner.

Choose Appropriate Scanning Software

Before the digitisation process can begin, the digitiser requires suitable tools to scan and manipulate the image. It is possible to scan a graphic using any image processing software that supports TWAIN (an interface to connect to a scanner, digital camera, or other imaging device from within a software application); however, the software package should be chosen carefully to ensure it is appropriate for the task. Possible criteria for measuring the suitability of image processing software include:

Time may be saved by using a common application such as Adobe Photoshop, Paint Shop Pro or GIMP: for most purposes these offer functionality that is rarely provided by the editing software bundled with the scanner.

Check The Condition Of The Object To Be Scanned

Image distortion and dark shading at page edges are common problems encountered during the digitisation process, particularly when handling spine-bound books. To avoid these and similar issues, the digitiser should ensure that:

  1. The document is uniformly flat against the document table.
  2. The document is not accidentally moved during scanning.
  3. The scanner is on a flat, stable surface.
  4. The edges of the scanner are covered with paper to block external light, which can leak in when the object does not lie completely flat against the scanner.

Scanning large objects that prevent the scanner lid being closed (e.g. a thick book) often causes discolouration or blurred graphics. Removing the spine will allow each page to be scanned individually; however, this is not always an option (e.g. when handling valuable books). In these circumstances you should consider a planetary camera as an alternative scanning method.

Identification Of A Suitable Policy For Digitisation

It is often costly and time-consuming to rescan the image or improve the level of detail in an image at a later stage. Therefore, the digitiser should ensure that a consistent approach to digitisation is taken in the initial stages. This will include the choice of a suitable resolution, file format and filename scheme.

Establish a consistent quality threshold for scanned images

It is difficult to improve low quality scans at a later date. It is therefore important to digitise images at a slightly higher resolution (measured in pixels per inch) and bit depth (24-bit or higher for colour, or 8-bit or higher for grey scale) than required, and rescale the image at a later date.

Choose an appropriate image format

Before scanning the image, the digitiser should consider the file format in which it will be saved. RGB Baseline TIFF Rev 6 is the accepted format for master copies intended for archival and preservation (although PNG is a possible alternative file format). To preserve quality, it is advisable to avoid compression where possible. If compression must be used (e.g. for storing data on CD-ROM), the compression scheme should be noted (PackBits, LZW, Huffman encoding, FAX-CCITT 3 or 4). This will help avoid incompatibilities with certain image processing applications.

Data intended for dissemination should be stored in one of the more common image formats to ensure compatibility with older or limited browsers. JPEG (Joint Photographic Experts Group) is suitable for photographs, realistic scenes and other images with subtle changes in tone; however, its use of 'lossy' compression means that sharp lines or lettering are likely to become blurred. When modifying an image, the digitiser should return to the master TIFF image, make the appropriate changes and resave the result as a JPEG.

Choose an appropriate filename scheme

Digitisation projects will benefit from a consistent approach to file naming and directory structure, allowing images to be organised in a manner that avoids confusion and enables them to be located quickly. An effective naming convention should identify the categories that will help the user find a specific file: for example, the creator, the year of creation, thematic similarities or other notable factors. The digitiser should also consider the possibility that multiple documents will have the same filename, or may lack specific information, and consider methods of resolving these problems. Guidance on this issue can be found in related QA Focus documents.
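
For example, a hypothetical convention of the form creator_year_sequence would give filenames such as smith_1898_0001.tif; zero-padding the sequence number keeps files in order when listed alphabetically and leaves room for the collection to grow.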

Further Information


Briefing 76

Choosing A Suitable Digital Watermark


Summary

Watermarking is a technology that can address a number of problems within a digitisation project. By embedding Intellectual Property data (e.g. the creator, licence model, creation date or other copyright information) within the digital object, the digitiser can demonstrate they are the creator and disseminate this information with every copy, even when the digital object has been uploaded to a third party site. It can also be used to determine if a work has been tampered with or copied.

This paper describes methods for establishing if a project requires watermarking techniques and criteria for choosing the most suitable type.

Purpose Of A Watermark

Before implementing watermarking within your workflow, you should consider its proposed purpose. Are you creating watermarks to indicate your copyright, using it as a method of authentication to establish if the content has been modified, or doing so because everyone else has a watermarking policy? The creation of a watermark requires significant thought and modification to the project workflow that may be unnecessary if you do not have a specific reason for implementing it.

For most projects, digital watermarks are an effective method of identifying the copyright holder. Identification of copyright is encouraged, particularly when the work makes a significant contribution to the field. However, the capabilities of watermarks should not be overstated. It is useful in identifying copyright, but is incapable of preventing use of copyrighted works. The watermark may be ignored or, given sufficient time and effort, removed entirely from the image. If the intent is to restrict content reuse, a watermark may not be the most effective strategy.

Required Attributes Of A Watermark

To assist the choice of a watermark, the project team should identify the required attributes of a watermark by answering two questions:

  1. To whom do I wish to identify my copyright?
  2. What characteristics do I wish the watermark to possess?

The answer to the first question is influenced by the skills and requirements of your target audience. If the copyright information is intended for both non-technical and technical users, a visible watermark is the most appropriate. However, if the copyright information is intended for technical users only, or the target audience is critical of visible watermarks (e.g. artists may criticise the watermark for impairing the original image), an invisible watermark may be the better option.

To answer the second question, the project team should consider the purpose of the watermark. If the intent is to use it as an authentication method (i.e. to establish whether any attempt to modify the content has been made), fragility will be a valued attribute. A fragile watermark is deliberately not robust: even a small change to the content will destroy the embedded information. In contrast, if the aim is to reflect the owner's copyright, a more robust watermark may be preferable. This will ensure that copyright information is not lost if an image is altered (through cropping, skewing, warping, rotation or smoothing, for example).

Choosing A Resilient Watermark

If resilience is a required attribute of a digital watermark, the project team has two options: an invisible or a visible watermark. Each option has different considerations that make it suitable for specific purposes.

Invisible Watermarks
Invisible watermarks operate by embedding copyright information within the image itself. As a rule, watermarks that are less visible are weaker and easier to remove. When choosing a variant it is important to consider the interaction between watermark invisibility and resilience. Some examples are shown in Table 1:

Bit-wise - Makes minor alterations to the spatial relation of an image. Resilience: Weak
Noise insertion - Embeds the watermark within image noise. Resilience: Weak
Masking and filtering - Similar to paper watermarks on a bank note; provides subtle but recognisable evidence of a watermark. Resilience: Strong
Transform domain - Uses dithering, luminance or lossy techniques (similar to JPEG compression) on all or part of an image. Resilience: Strong

Table 1: Indication of resilience for invisible watermarks

'Bit-wise' and 'noise insertion' may be desirable if the purpose is to determine whether the medium has been altered. In contrast, 'transform domain' and 'masking' techniques are highly integrated into the image and are therefore more robust to deliberate or accidental removal (caused by compression, cropping and image processing techniques) in which significant bits are changed. However, they are often noticeable to the naked eye.

Visible Watermarks
A visible watermark is more resilient and can be used to identify copyright immediately, without significant effort by the user. However, visible watermarks are, by design, more intrusive to the media. When creating a visible watermark the project team should consider its placement. Projects funded with public money should be particularly conscious that the copyright notice does not interfere with the purpose of the project. A balance should be reached between making the watermark difficult to remove and keeping the image useful to the user.
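
As an illustration only, the minimal sketch below (using the Python Pillow library; the filenames and copyright text are hypothetical) overlays a semi-transparent copyright notice on a delivery copy of an image:

from PIL import Image, ImageDraw, ImageFont

# Apply the watermark to a delivery copy, never to the master image.
base = Image.open("derivative_0001.png").convert("RGBA")
overlay = Image.new("RGBA", base.size, (255, 255, 255, 0))
draw = ImageDraw.Draw(overlay)
# Draw a semi-transparent notice near the bottom-left corner.
draw.text((10, base.size[1] - 30), "(c) Example Project 2005",
          font=ImageFont.load_default(), fill=(255, 255, 255, 128))
watermarked = Image.alpha_composite(base, overlay)
watermarked.convert("RGB").save("watermarked_0001.jpg", quality=85)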

Each type of watermark is suited to particular situations. If handling a small image collection, it may be feasible (in terms of time and effort) to use both as a redundant protection measure: in the event that one is removed, the second is likely to remain.

Information Stored within the Watermark

If the project is using a watermark to establish its copyright, some thought should be given to the static information you wish to provide. For example:

Some content management systems are also able to generate dynamic watermarks and embed them within the image. This may record the file information (file format, image dimensions, etc.) and details about the download transaction (transaction identifier, download date, etc.). This may be useful for tracking usage, but may annoy the user if the data is visible.

Implementing Watermarks in the Project Workflow

To avoid unnecessary corruption of a watermark by the digitiser/creator themselves, the watermark creation process should be delayed until the final steps of the digitisation workflow. Watermarks can easily be destroyed when the digitiser modifies the image in any way (e.g. through cropping, skewing, adjustment of the RGB settings, or through use of lossy compression). If an image is processed to the degree that the watermark can no longer be recognised, reconstruction of the image properties may be possible through the use of an original image.

Further Information


Briefing 77

An Introduction To RSS And News Feeds


Background

RSS is increasingly being used to provide news services and for syndication of content. This document provides a brief description of RSS news feed technologies which can be used as part of a communications strategy by projects and within institutions. The document summarises the main challenges to be faced when considering deployment of news feeds.

What Are News Feeds?

News feeds are an example of automated syndication. News feed technologies allow information to be automatically provided and updated on Web sites, emailed to users, etc. As the name implies news feeds are normally used to provide news; however the technology can be used to syndicate a wide range of information.

Standards for News Feeds

The BBC ticker [1] is an example of a news feed application. A major limitation with this approach is that the ticker can only be used with information provided by the BBC.

The RSS standard was developed as an open standard for news syndication, allowing applications to display news supplied by any RSS provider.

RSS is a lightweight XML application (see RSS fragment). Ironically the RSS standard proved so popular that it led to two different approaches to its standardisation. So RSS now stands for RDF Site Summary and Really Simple Syndication (in addition to the original phrase Rich Site Summary).

<title>BBC News</title>
<url>http://news.bbc.co.uk/nol/shared/img/bbc_news_120x60.gif</url>
<link>http://news.bbc.co.uk/</link>
<item>
<title>Legal challenge to ban on hunting</title>
<description>The Countryside Alliance prepares a legal challenge to Parliament Act ... </description>
<link>http://news.bbc.co.uk/go/click/rss/0.91/public/-/1/hi/... </link>
</item>

Figure 1: Example Of An RSS File

Despite this confusion, in practice many RSS viewers will display both versions of RSS (and the emerging new standard, Atom).

News Feeds Readers

scrolling RSS ticker

There are a large number of RSS reader software applications available [2], based on several different models. An example of a scrolling RSS ticker is shown above [3]. RSSxpress [4] (illustrated below) is an example of a Web-based reader which embeds an RSS feed in a Web page.

RSSxpress

In addition to these two approaches, RSS readers are available with an email-style approach for the Opera Web browser [5] and Outlook [6] and as extensions for Web browsers [7] [8].

Creating News Feeds

There are several approaches to the creation of RSS news feeds. Software such as RSSxpress can also be used to create and edit RSS files. In addition there are a number of dedicated RSS authoring tools, including standalone applications and browser extensions (see [9]). However a better approach may be to generate RSS and HTML files using a CMS or to transform between RSS and HTML using languages such as XSLT.
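
As a simple illustration of how an RSS file can be generated from structured data rather than edited by hand, the sketch below uses Python's standard library; the feed title and URLs are purely illustrative:

import xml.etree.ElementTree as ET

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "Example Project News"
ET.SubElement(channel, "link").text = "http://www.example.org/news/"
ET.SubElement(channel, "description").text = "News from the example project"

# Each news story becomes an <item> element in the channel.
item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "New briefing paper published"
ET.SubElement(item, "link").text = "http://www.example.org/news/briefing.html"

ET.ElementTree(rss).write("news.rss", encoding="utf-8", xml_declaration=True)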

Issues

Issues which need to be addressed when considering use of RSS include:

Further Information

  1. Desktop Ticker, BBC,
    <http://news.bbc.co.uk/1/hi/help/3223354.stm>
  2. RSS Readers, Weblogs Compendium,
    <http://www.lights.com/weblogs/rss.html>
  3. ENewsBar,
    <http://www.enewsbar.com/>
  4. RSSxpress, UKOLN,
    <http://rssxpress.ukoln.ac.uk/>
  5. RSS Newsfeeds In Opera Mail, Opera
    <http://www.opera.com/products/desktop/m2/rss/>
  6. Read RSS In Outlook, intraVnews,
    <http://www.intravnews.com/>
  7. RSS Extension for Firefox, Sage,
    <http://sage.mozdev.org/>
  8. RSS Reader, Pluck,
    <http://www.pluck.com/product/rssreader.aspx>
  9. Web / Authoring / Languages / XML / RSS, Webreference.com,
    <http://www.webreference.com/authoring/languages/xml/rss/>

Briefing 78

An Introduction To Wikis


Background

Wiki technologies are increasingly being used to support development work across distributed teams. This document aims to give a brief description of Wikis and to summarise the main challenges to be faced when considering the deployment of Wiki technologies.

What is A Wiki?

A Wiki or wiki (pronounced "wicky" or "weekee") is a Web site (or other hypertext document collection) that allows a user to add content. The term Wiki can also refer to the collaborative software used to create such a Web site [1].

The key characteristics of typical Wikis are:

Wikipedia - The Largest Wiki

The Wikipedia is the largest and best-known Wiki - see <http://www.wikipedia.org/>.

Wikipedia

The Wikipedia provides a good example of a community Wiki in which content is provided by contributors around the world.

The Wikipedia appears to have succeeded in providing an environment and culture which has minimised the dangers of misuse. Details of the approaches taken on the Wikipedia are given on the Wikipedia Web site [2].

What Can Wikis Be Used For?

Wikis can be used for a number of purposes:

Wikis - The Pros And Cons

As described in [6], advantages of Wikis may include:

Disadvantages of Wikis include:

Further Information

  1. Wiki, Wikipedia,
    <http://en.wikipedia.org/wiki/Wiki>
  2. Wikimedia principles, Wikimedia,
    <http://meta.wikimedia.org/wiki/Wikimedia_principles>
  3. IT and Society Wiki, Queen's University Belfast
    <http://itsoc.mgt.qub.ac.uk/ITandSociety>
  4. FOAF Wiki, FoafProject,
    <http://rdfweb.org/topic/FoafProject>
  5. Experiences of Using a Wiki for Note-taking at a Workshop, B. Kelly, Ariadne 42, Jan 2005,
    <http://www.ariadne.ac.uk/issue42/web-focus/>
  6. Making the Case for a Wiki, E. Tonkin, Ariadne 42, Jan 2005,
    <http://www.ariadne.ac.uk/issue42/tonkin/>

Briefing 79

An Introduction To Audio And Video Communication Tools


Background

Audio and video applications are being increasingly used to support project working across distributed project teams. This document aims to give a brief description of audio and video tools which can be used to support such collaborative work within our institutions and to summarise the main challenges to be faced when considering their deployment across organisations.

The Potential For Audio And Video Tools

The growth in broadband is leading to renewed interest in audio and video-conferencing systems. In the past such services often required use of specialist hardware and software. However tools are now being developed for home use. This briefing document explores some of the issues concerning use of such technologies within an institution.

An Example Of An Audio Tool

The Skype Internet telephony system [1] is growing in popularity. Skype is popular because it can provide free calls to other Skype users. In addition Skype has potential for use in an academic context:

It should be noted, however, that Skype is a proprietary application and concerns over its use have been raised.

Examples Of Video Tools

Instant Messaging clients such as MSN Messenger [2] also provide audio and video capabilities. Such tools can raise the expectations of student users, who may wish to use them for their own purposes.

It should be noted, however, that there are interoperability problems with such tools (e.g. both users may need to be running the latest version of the MS Windows operating system). In addition the management of user IDs and setting up areas for group discussions may be issues.

An alternative approach is use of software such as VRVS [3], an Access Grid application. This Web-based system provides managed access to virtual rooms, etc. VRVS is intended for use by Grid users and may not be appropriate for certain uses. However it illustrates an alternative approach.

VRVS

Issues

Issues which need to be addressed when considering use of such tools include:

Further Information

  1. Skype,
    <http://www.skype.com/>
  2. MSN Messenger, Microsoft,
    <http://messenger.msn.com/>
  3. Virtual Rooms, Virtual Meetings, A. Powell, Ariadne, issue 41, Oct 2004,
    <http://www.ariadne.ac.uk/issue41/powell/>

Briefing 80

An Introduction To Persistent Identifiers


What are Persistent Identifiers?

An identifier is any label that allows us to find a resource. One of the best-known identifiers is the International Standard Book Number (ISBN), a unique ten-digit number assigned to books and other publications. On the Internet the most widely known identifier is the Uniform Resource Locator (URL), which allows users to find a resource by listing a protocol, domain name and, in many cases, file location.

A persistent identifier is, as the name suggests, an identifier that exists for a very long time. It should at the very least be globally unique and be used as a reference to the resource beyond the resource's lifetime. URLs, although useful, are not very persistent. They only provide a link to the resource's location at the moment in time they are cited; if the resource moves, they no longer apply. The issue of 'linkrot' on the Internet (broken links to resources), along with the need for further interoperability, has led to the search for more persistent identifiers for digital resources.

Principles for Persistent Identification

The International Digital Object Identifier (DOI) Foundation [1] states that there are two principles for persistent identification:

  1. Assign an ID to a resource: Once assigned, the identifier must identify the same resource beyond the lifetime of the resource or identifier.
  2. Assign a resource to an ID: The resource should persistently continue to be the same thing.

Uniform Resource Identifiers

A Uniform Resource Identifier (URI) is the string that is used to identify anything on the Internet. URLs, along with Uniform Resource Names (URNs), are both types of URI. A URN is a name with global scope and does not necessarily imply a location. A URN will include a Namespace Identifier (NID) code and a Namespace Specific String (NSS). The NID specifies the identification system used (e.g. ISBN) and the NSS is a local code that identifies the resource. To find a resource using a URN, a user must use a resolver service.
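
For example, in a URN such as urn:isbn:0-395-36341-1 the NID is 'isbn' and the NSS is the ISBN itself; a resolver service would translate this name into one or more locations from which the book, or information about it, can be obtained.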

Persistent URLs

Persistent URLs (PURLs) [2] have been developed by the Online Computer Library Center (OCLC) as an interim measure for Internet resources until the URN framework is well established. A PURL is functionally a URL but, rather than pointing at a location, it points at a resolution service, which redirects the user to the appropriate URL. If the URL changes it only needs to be amended in the PURL resolution service.

Example: http://purl.oclc.org/OCLC/PURL/summary
This is made up of the protocol (http), the resolver address (http://purl.oclc.org/) and the user-assigned name (OCLC/PURL/summary).

Digital Object Identifiers

The Digital Object Identifier (DOI) system was initiated by the Association of American Publishers in an attempt to assist the publishing community with copyright and electronic commerce. DOIs are described by the International DOI Foundation, which manages them, as persistent, interoperable, actionable identifiers. They are persistent because they identify an object as a first-class entity (not just its location), they are interoperable because they are designed with the future in mind, and they are actionable because they allow a user to locate a resource by resolution using the Handle System. The Handle System, developed by the Corporation for National Research Initiatives (CNRI), includes protocols that enable a distributed computer system to store handles of digital resources and resolve them into a location. DOIs can be assigned by a Registration Agency (RA), which provides services for a specific user community and may charge fees. The main RA for the publishing community is CrossRef [3].

Example: 10.1000/123456
This is made up of the prefix (10.1000), which is the string assigned to an organisation that registers DOIs, and the suffix (123456), which is an alphanumeric string unique to a given prefix and which could be an existing identifier.

Using Persistent Identifiers

While DOIs hold great potential for helping many information communities enhance interoperability, they have yet to reach full maturity. There are still many unresolved issues, such as their resolution (how users use them to retrieve a Web page), registration of the DOI system, the persistence of the International DOI Foundation as an organisation and what exactly their advantages are over handles or PURLs. Until these matters are resolved they will remain little more than a good idea for most communities.

However the concept of persistent identifiers remains essential to a working Internet. While effort is put into finding the best approach, there is much that those creating Web pages can do to ensure that their URIs are persistent. In 1998 Tim Berners-Lee coined the phrase Cool URIs to describe URIs which do not change. His article explains the methods a Webmaster can use to design a URI that will stand the test of time. As Berners-Lee elucidates, "URIs don't change: people change them." [4].

References

  1. International DOI Foundation,
    <http://doi.org/>
  2. PURL,
    <http://purl.org/>
  3. CrossRef,
    <http://www.crossref.org/>
  4. Cool URIs Don't Change, W3C,
    <http://www.w3.org/Provider/Style/URI.html>

Briefing 81

An Introduction To Folksonomies


What is a Folksonomy?

A folksonomy is a decentralised, social approach to creating metadata for digital resources. It is usually created by a group of individuals, typically the resource users, who add natural language tags to online items, such as images, videos, bookmarks and text. These tags are then shared and sometimes refined. Folksonomies can be divided into broad folksonomies, when lots of users tag one object, and narrow folksonomies, when a small number of users tag individual items. This new social approach to creating online metadata has sparked much discussion in the cataloguing world.

Note that despite its name a folksonomy is not a taxonomy. A taxonomy is the process, within subject-based classification, of arranging the terms given in a controlled vocabulary into a hierarchy. Folksonomies move away from the hierarchical approach to an approach more akin to that taken by faceted classification or other flat systems.

The History of Folksonomies

With the rise of the Internet and increased use of digital networks it has become easier both to work in an informal, ad hoc manner and to work as part of a community. In the late 1990s Weblogs (or blogs), a Web application similar to an online diary, became popular and user-centred metadata was first created. In late 2003 delicious, an online bookmark manager, went live. The ability to add tags, using a non-hierarchical keyword categorisation system, was added in early 2004. Tagging was quickly replicated by other social software and in late 2004 the name folksonomy, a portmanteau of 'folk' and 'taxonomy', was coined by Thomas Vander Wal.

Strengths and Weaknesses of Folksonomies

Robin Good is quoted as saying that "a folksonomy represents simultaneously some of the best and worst in the organization of information." There is clearly a lot to be learnt from this new method of classification as long as you remain aware of the strengths and weaknesses.

Strengths

Serendipity
Folksonomies at this point in time are more about browsing than finding and a great deal of useful information can be found in this way.
Cheap and extendable
Folksonomies are created by users. This makes them relatively cheap and highly scalable, unlike more formal methods of adding metadata. Often users find that it is not a case of 'folksonomy or professional classification' but 'folksonomy or nothing'.
Community
The key to folksonomies' success is community and feedback. The metadata creation process is quick and responsive to user needs, and new words can become well used within days. If studied, folksonomies can allow more formal classification systems to emerge and can demonstrate clear desire lines (the paths users will want to follow).

Weaknesses

Imprecision of terms
Folksonomy terms are added by users, which means that they can be ambiguous, overly personalised and imprecise. Some sites only allow single-word metadata, resulting in many compound terms; many tags are used only once, and at present there is little or no synonym control.
Searching
The uncontrolled set of terms created can mean that folksonomies may not support searching as well as services using controlled vocabularies.

The Future for Folksonomies

Over time users of the Internet have come to realise that old methods of categorisation do not sit comfortably in a digital space, where physical constraints no longer apply and there is a huge amount to be organised. Search services like Yahoo's directory, where items are divided into a hierarchy, often seem unwieldy and users appear happier with the Google search box approach. With the rise of communities on the Web there has also come about a feeling that meaning comes best from our common view of the world, rather than a professional's view.

While there is no doubt that professional cataloguing will continue to have a place, both off the Internet and on, there has been recent acceptance that new ways of adding metadata, such as folksonomies, need more exploration, alongside other areas like the Semantic Web. The two models of categorisation (formal and informal) are not mutually exclusive, and further investigation can only help us improve the way we organise and search for information. If nothing else, folksonomies have achieved the once seemingly unachievable task of getting people to talk about metadata!

Further Information

The following additional resources may be useful:

Bookmark Sites

Images, Video and Sound

Other


Briefing 82

An Introduction To Creative Commons


What is a Creative Commons?

Creative Commons (CC) [1] refers to a movement started in 2001 by US lawyer Lawrence Lessig that aims to expand the collection of creative work available for others to build upon and share. The Creative Commons model makes a distinction between the big C (Copyright) meaning All Rights Reserved and CC meaning Some Rights Reserved. It does so by offering copyright holders licences to assign to their work, which will clarify the conditions of use and avoid many of the problems current copyright laws pose when attempting to share information.

What Licences?

There is a series of eleven Creative Commons licences available to download from the Web site. They enable copyright holders to allow display, public performance, reproduction and distribution of their work while assigning specific restrictions. The six main licences combine the four following conditions:

Attribution - Users of your work must credit you.
Non-commercial - Users of your work can make no financial gain from it.
Non-derivative - Only verbatim copies of your work can be used.
Share-alike - Subsequent works have to be made available under the same licence as the original.

The other licences available are the Sampling licence, the Public Domain Dedication, Founders' Copyright, the Music Sharing licence and the CC Zero licence. Creative Commons also recommends two open source software licences for those licensing software: the GNU General Public Licence and the GNU Lesser General Public Licence.

Each licence is expressed in three ways: (1) the legal code; (2) a commons deed explaining what it means in lay person's terms; and (3) a machine-readable description in the form of RDF/XML (Resource Description Framework/Extensible Markup Language) metadata. Copyright holders can embed this metadata in HTML pages.
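
For example, a page released under an Attribution licence typically carries a link of the form <a rel="license" href="http://creativecommons.org/licenses/by/2.0/">, together with the RDF/XML block produced by the Creative Commons licence chooser, so that both people and software can discover the licence terms.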

International Creative Commons

The Creative Commons licences were originally written using an American legal model but, through Creative Commons international (CCi), have since been adapted for use in a number of different jurisdictions. The regional complexities of UK law have meant that two different sets of licences have had to be drafted for use in the UK. Creative Commons works with the Arts and Humanities Research Board Centre for Studies in Intellectual Property and Technology Law at Edinburgh University on the Scotland jurisdiction-specific licences, and with the Information Systems and Innovation Group (ISIG) on the England and Wales jurisdiction-specific licences.

Why Use Creative Commons Licences?

There are many benefits to be had in clarifying the rights status of a work. When dealing with a Creative Commons licensed work, it is clear whether the work can be used without having to contact the author, allowing the work to be exploited more effectively, more quickly and more widely, and increasing its impact. In the past, clarification of IPR has taken a huge amount of time and effort; Creative Commons could therefore save some projects a considerable amount of money and aid their preservation strategies. More recently, because Creative Commons offers its licences in a machine-readable format, search engines can restrict searches to CC licensed resources, allowing users easier access to 'free' materials.

Issues

Although Creative Commons has now been in existence for a while there are still issues to be resolved. For example in the UK academic world the question of who currently holds copyright is a complex one with little commonality across institutions. A study looking at the applicability of Creative Commons licences to public sector organisations in the UK has been carried out [2].

Another key area for consideration is the tension between allowing resources to be freely available and the need for income generation. Although use of a Creative Commons license is principally about allowing resources to be used by all, this does not mean that there has to be no commercial use. One option is dual licensing, which is fairly common in the open source software environment.

References

  1. Creative Commons,
    <http://creativecommons.org/>
  2. Creative Commons Licensing Solutions for the Common Information Environment, Intrallect,
    <http://www.intrallect.com/cie-study/>

Briefing 83

An Introduction To Podcasting


What Is Podcasting?

Podcasting has been described as "a method of publishing files to the internet, often allowing users to subscribe to a feed and receive new files automatically by subscription, usually at no cost." [1].

Podcasting is a relatively new phenomenon, becoming popular in late 2004. Some of the early adopters regard Podcasting as a democratising technology, allowing users to easily create and publish their own radio shows, which can be accessed without the need for a broadcasting infrastructure. From a technical perspective, Podcasting is an application of the RSS 2.0 format [2]. RSS can be used to syndicate Web content, allowing Web resources to be automatically embedded in third party Web sites or processed by dedicated RSS viewers. The same approach is used by Podcasting, allowing audio files (typically in MP3 format) to be automatically processed by third party applications - however rather than embedding the content in Web pages, the audio files are transferred to a computer hard disk or to an MP3 player, such as an iPod.

The strength of Podcasting is the ease of use it provides rather than any radical new functionality. If, for example, you subscribe to a Podcast provided by the BBC, new episodes will appear automatically on your chosen device - you will not have to go to the BBC Web site to see if new files are available and then download them.

Note that providing MP3 files to be downloaded from Web sites is sometimes described as Podcasting, but the term strictly refers to automated distribution using RSS.

What Can Podcasting Be Used For?

There are several potential applications for Podcasting in an educational context:

Possible Problems

Although there is much interest in the potential for Podcasting, there are potential problem areas which will need to be considered:

It would be advisable to seek permission before making recordings or before making them available as Podcasts.

Podcasting Software

Listening To Podcasts

It is advisable to gain experience of Podcasting initially as a recipient, before seeking to create Podcasts. Details of Podcasting software are given at [3] and [4]. Note that support for Podcasts in iTunes v. 5 [5] has helped enhance the popularity of Podcasts. You should note that you do not need a portable MP3 player to listen to Podcasts - however the ability to listen to Podcasts while on the move is one of Podcasting's strengths.

Creating Podcasts

When creating a Podcast you first need to create your MP3 (or similar) audio file. Many recording tools are available, such as the open source Audacity software [6]. You may also wish to make use of audio editing software to edit files, include sound effects, etc.

You will then need to create the RSS file which accompanies your audio file, enabling users to subscribe to your recording and automate the download. An increasing number of Podcasting authoring tools and Web services are being developed [7].
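
The essential addition for a Podcast is the RSS 2.0 <enclosure> element within each <item>, giving the URL, size in bytes and MIME type of the audio file, for example <enclosure url="http://www.example.org/show01.mp3" length="4123456" type="audio/mpeg" /> (the URL and length shown here are purely illustrative); subscribing applications use this element to download new episodes automatically.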

References

  1. Podcasting, Wikipedia,
    <http://en.wikipedia.org/wiki/Podcasting>
  2. RSS 2.0, Wikipedia,
    <http://en.wikipedia.org/wiki/Really_Simple_Syndication>
  3. iPodder Software,
    <http://www.ipodder.org/directory/4/ipodderSoftware>
  4. iTunes - Podcasting,
    <http://www.apple.com/podcasting/>
  5. Podcasting Software (Clients), Podcasting News,
    <http://www.podcastingnews.com/topics/Podcast_Software.html>
  6. Audacity,
    <http://audacity.sourceforge.net/>
  7. Podcasting Software (Publishing), Podcasting News,
    <http://www.podcastingnews.com/topics/Podcasting_Software.html>

Briefing 84

Usage Statistics For Web Sites


About This Document

Information on performance indicators for Web sites has been published elsewhere [1] [2]. This document provides additional information on the specific need for usage statistics for Web sites and provides guidance on ways of ensuring the usage statistics can be comparable across Web sites.

About Usage Statistics For Web Sites

When a user accesses a Web page several resources will normally be downloaded to the user (the HTML file, any embedded images, external style sheet and JavaScript files, etc.). The Web server will keep a record of this, including the names of the files requested and the date and time, together with some information about the user's environment (e.g. type of browser being used).
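
For example, a single request recorded in the widely used 'common log format' might look like: 192.0.2.1 - - [12/Oct/2005:14:32:07 +0100] "GET /index.html HTTP/1.1" 200 5120 - giving the client address, the date and time, the resource requested, the status code and the number of bytes returned (the values shown here are illustrative).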

Web usage analysis software can then be used to provide overall statistics on usage of the Web site. As well as giving an indication of the overall usage of a Web site, information can be provided on the most popular pages, the most popular entry points, etc.

What Can Usage Statistics Be Used For?

Usage statistics can be used to give an indication of the popularity of Web resources. Usage statistics can be useful in identifying successes or failures in dissemination strategies or in the usability of a Web site.

Usage statistics can also be useful to system administrators who may be able to use the information (and associated trends) in capacity planning for server hardware and network bandwidth.

Aggregation of usage statistics across a community can also be useful in profiling the impact of Web services within the community.

Limitations Of Usage Statistics

Although Web site usage statistics can be useful in a number of areas, it is important to be aware of the limitations of usage statistics. Although initially it may seem that such statistics should be objective and unambiguous, in reality this is not the case.

Some of the limitations of usage statistics include:

Recommendations

Although Web site usage statistics cannot be guaranteed to provide a clear and unambiguous summary of Web site usage, this does not mean that the data should not be collected and used. There are parallels with TV viewing figures which are affected by factors such as video recording. Despite such known limitations, this data is collected and used in determining advertising rates.

The following advice may be useful:

Document Your Approaches And Be Consistent

You should ensure that you document the approaches taken (e.g. details of the analysis tool used) and any processing carried out on the data (e.g. removing robot traffic or access from within the organisation). Ideally you will not make changes to the processing but, if you do, you should document them.

Consider Use Of Externally Hosted Usage Services

Traditional analysis packages process server log files. An alternative approach is to make use of an externally-hosted usage analysis service. These services function by providing a small graphical image (which may be invisible) which is embedded in pages on your Web site. Accessing a page causes the graphic and associated JavaScript code, which are hosted by a commercial company, to be retrieved. Since the graphic is configured to be non-cachable, the usage data should be more reliable. In addition the JavaScript code can allow additional data to be provided, such as information about the end user's PC environment.
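
In practice this usually means adding a short HTML fragment supplied by the analysis service - typically an <img> element and a small piece of JavaScript pointing at the service's server - to a shared template or server-side include so that it appears on every page to be tracked.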

References

  1. Performance Indicators For Your Project Web Site, QA Focus briefing document No. 17,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-17/>
  2. Performance Indicators For Web Sites, Exploit Interactive (5), 2000,
    <http://www.exploit-lib.org/issue5/indicators/>

Briefing 85

An Introduction To Web Services


What Are Web Services?

Web services are a class of Web application, published, located and accessed via the Web, that communicates via an XML (eXtensible Markup Language) interface [1]. As they are accessed using Internet protocols, they are available for use in a distributed environment, by applications on other computers.

What's The Innovation?

The idea of Internet-accessible programmatic interfaces, services intended to be used by other software rather than as an end product, is not new. Web services are a development of this idea. The name refers to a set of standards and essential specifications that simplify the creation and use of such service interfaces, thus addressing interoperability issues and promoting ease of use.

Well-specified services are simple to integrate into larger applications, and once published, can be used and reused very effectively and quickly in many different scenarios. They may even be aggregated, grouped together to produce sophisticated functionality.

Example: Google Spellchecker And Search Services

The Google spellchecker service, used by the Google search engine, suggests a replacement for misspelt words. This is a useful standard task; simply hand it a word, and it will respond with a suggested spelling correction if one is available. One might easily imagine using the service in one's own search engine, or in any other scenario in which user input is taken, perhaps in an intelligent "Page not found" error page, that attempts to guess at the correct link. The spellchecker's availability as a Web service simplifies testing and adoption of these ideas.

Furthermore, the use of Web services is not limited to Web-based applications. They may also usefully be integrated into a broad spectrum of other applications, such as desktop software or applets. Effectively transparent to the user, Web service integration permits additional functionality or information to be accessed over the Web. As the user base continues to grow, many development suites focus specifically on enabling the reuse and aggregation of Web services.

What Are The Standards Underlying Web Services?

'Web services' refers to a potentially huge collection of available standards, so only a brief overview is possible here. The exchange of XML data uses a protocol such as SOAP or XML-RPC. Once published, the functionality of the Web service may be documented using one of a number of emerging standards, such as WSDL, the Web Service Description Language.
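
As an illustration of how little client code such a service can require, the sketch below uses Python's standard XML-RPC support to call a hypothetical spelling service; the endpoint and method name are invented for the example, and a real service would publish these details in its documentation or service description:

import xmlrpc.client

# Hypothetical endpoint and method name, for illustration only.
service = xmlrpc.client.ServerProxy("http://www.example.org/xmlrpc")
suggestion = service.spelling.suggest("accessibilty")
print(suggestion)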

WSDL provides a format for description of a Web service interface, including parameters, data types and options, in sufficient detail for a programmer to write a client application for that service. That description may be added to a searchable registry of Web services.

A proposed standard for this purpose is UDDI (Universal Description, Discovery and Integration), described as a large central registry for businesses and services. Web services are often seen as having the potential to 'flatten the playing field', and simplify business-to-business operations between geographically diverse entities.

Using Web Services

Due to the popularity of the architecture, many resources exist to support the development and use of Web services in a variety of languages and environments. The plethora of available standards may pose a problem, in that a variety of protocols and competing standards are available and in simultaneous use. Making that choice depends very much on platform, requirements and technical details.

Although Web services promise many advantages, there are still ongoing discussions regarding the best approaches to the underlying technologies and their scope.

References

  1. The JISC Information Environment and Web Services, A. Powell and E. Lyon, Ariadne, issue 31, April 2002,
    <http://www.ariadne.ac.uk/issue31/information-environments/>
  2. World Wide Web Consortium Technical Reports, W3C,
    <http://www.w3.org/TR/>

Further Information


Briefing 86

Usability and the Web


Background

Usability refers to a quality attribute that assesses how easy user interfaces are to use. The term is also used to refer to a number of techniques and methods for improving usability during the various stages of design and development.

What Does Usability Include?

Usability can be separated into several components [1] such as:

Learnability:
How easy is it to get to grips with an unfamiliar interface?
Efficiency:
How quickly can an experienced user perform a given task?
Memorability:
Once familiar with an interface, how easily is it remembered when returning after a period of not using it?
Errors:
How easy is it to make mistakes, and to recover from them?
Satisfaction:
Is the design enjoyable to use?

These characteristics are all useful metrics, although the importance of each one depends on the expected uses of the interface in question. In some circumstances, such as software designed for a telephone switchboard operator, the time it takes for a skilled user to complete a task is rather more important than learnability or satisfaction. For an occasional Web user, a Web site's designers may wish to focus principally on providing a site that is learnable, supports the user, and is enjoyable to use. Designing a usable site therefore requires a designer to learn about the needs of the site's intended users, and to test that their design meets the criteria mentioned above.

Why Does Usability Matter?

More attention is paid to accessibility than to usability in legislation, perhaps because accessibility is perceived as a clearly defined set of guidelines, whilst usability itself is a large and rather nebulous set of ideas and techniques. However, a Web site can easily pass accessibility certification, and yet have low usability; accessibility is to usability what legible handwriting is to authorship. Interfaces with low usability are often frustrating, causing mistakes to be made, time to be wasted, and perhaps impede the user from successfully reaching their intended goal at all. Web sites with low usability will not attract or retain a large audience, since if a site is perceived as too difficult to use, visitors will simply prefer to take their business elsewhere.

Usability Testing

User testing is traditionally an expensive and complicated business. Fortunately, modern discount ('quick and dirty') methods have changed this, so that it is now possible to quickly test the usability of a web site at any stage in its development. This process, of designing with the user in mind at all times, is known as user-centred design. At the earliest stages, an interface may be tested using paper prototypes or simple mockups of the design. It is advisable to test early and often, to ensure that potential problems with a design are caught early enough to solve cheaply and easily. However, completed Web sites also benefit from usability testing, since many such problems are easily solved.

User testing can be as simple as asking a group of users, chosen as representative of the expected user demographic, to perform several representative tasks using the Web site. This often reveals domain-specific problems, such as vocabulary or language that is not commonly used by that group of users. Sometimes user testing can be difficult or expensive, so discount techniques such as heuristic evaluation [2], where evaluators compare the interface with a list of recommended rules of thumb, may be used. Other discount techniques include cognitive walkthrough in which an evaluator role-plays the part of a user trying to complete a task. These techniques may be applied to functional interfaces, to paper prototypes, or other mockups of the interface.

A common method to help designers is the development of user personas: written profiles of fictitious individuals who are designed to be representative of the site's intended users. These individuals' requirements are then used to inform and guide the design process.

Conclusions

Considering the usability of a web site not only helps users, but also tends to improve the popularity of the site in general. Visitors are likely to get a better impression from usable sites. Quick and simple techniques such as heuristic evaluation can be used to find usability problems; frequent testing of a developing design is ideal, since problems can be found and solved early on. Several methods of usability testing can be used to expose different types of usability problems.

References And Further Information

  1. Usability 101: Introduction to Usability, J. Nielsen,
    <http://hcibib.org/tcuid/chap-4.html>
  2. Heuristic Evaluation, J. Nielsen,
    <http://portal.acm.org/citation.cfm?id=142869>

Briefing 87

Introduction to Cognitive Walkthroughs


Introduction To Cognitive Walkthroughs

The cognitive walkthrough is a method of discount ("quick and dirty") usability testing requiring several expert evaluators. A set of appropriate or characteristic tasks to be completed is compiled. The evaluators then "walk" through each task, noting down problems or difficulties as they go.

Since cognitive walkthroughs are often applied very early in development, the evaluators will often be working with mockups of interfaces such as paper prototypes and role-playing the part of a typical user. This is made much simpler if user personas, detailed descriptions of fictitious users, have been developed, because these simplify the role-playing element of cognitive walkthrough. These are often developed at the beginning of a user-centred design process, because designers often find it much easier to design to the needs of a specific user.

Evaluators are typically experts such as usability specialists, but the same basic technique can also be applied successfully in many different situations.

The Method

Once you have a relatively detailed prototype, paper or otherwise, you are ready to try a cognitive walkthrough.

Start off by listing the tasks that you expect users to be able to perform using your Web site or program. To do this, think about the possible uses of the site; perhaps you are expecting users to be able to book rooms or organise tours, or find out what events your organisation is running in the next month, or find opening times and contact details for your organisation. Write down each of these tasks.

Secondly, separate these tasks into two parts: the user's purpose (their intention) and the goals that they must achieve in order to complete this. Take the example of organising a tour; the user begins with the purpose of finding out what tours are available. In order to achieve this, they look for a link on your Web site leading to a Web page detailing possible tours. Having chosen a tour, they gain a new purpose - organising a tour date - and a new set of goals, such as finding a Web page that lets them book a tour date and filling it out appropriately.

Separating tasks into tiny steps in this way is known as decomposition, and it is mostly helpful because it allows you to see exactly where and when the interface fails to work with the user's expectations. It is important to do this in advance, because otherwise you find yourself evaluating your own trial-and-error exploration of the interface! Following these steps "wearing the users' shoes" by trying out each step on a prototype version of the interface shows you where the user might reach an impasse or a roadblock and have to retrace his or her steps to get back on track. As a result, you will gain a good idea of places where the interface could be made simpler or organised in a more appropriate manner.

To help this process, a Walkthrough Evaluation Sheet is filled in for each step taken. An example is shown below [1]:

  1. Will the users be trying to produce whatever effect the action has?
  2. Will users see the control (button, menu, switch, etc.) for the action?
  3. Once users find the control, will they recognize that it produces the effect they want?
  4. After the action is taken, will users understand the feedback they get, so they can go on to the next action with confidence?

Advantages and Disadvantages

Cognitive walkthroughs are often very good at identifying certain classes of problems with a Web site, especially showing how easy or difficult a system is to learn or explore effectively - how difficult it will be to start using that system without reading the documentation, and how many false moves will be made in the meantime.

The downside is principally that on larger or more complex tasks they can sometimes be time-consuming to perform, so the technique is often used in some altered form. For example, instead of filling out an evaluation sheet at each step, the evaluation can be recorded on video [2]; the evaluator can then verbally explain the actions at each step.

Conclusions

Cognitive walkthroughs are helpful in picking out interface problems at an early stage, and work particularly well together with a user-centred design approach and the development of user personas. However, the approach can sometimes be time-consuming, and since reorganising the interface is often expensive and difficult at later stages in development, the cognitive walkthrough is usually applied early in development.

References

  1. Evaluating the design without users, from Task-Centered User Interface Design,
    <http://hcibib.org/tcuid/chap-4.html>
  2. The Cognitive Jogthrough,
    <http://portal.acm.org/citation.cfm?id=142869>

Briefing 88

Task Analysis and Usability


Background

A key issue in usability is that of understanding users, and a key part of user-centred design is that of describing the tasks that the users expect to be able to accomplish using the software you design [1]. Because of the origins of usability as a discipline, a lot of the terminology used when discussing this issue comes from fields such as task analysis. This briefing paper defines some of these terms and explains the relationship between usability and task analysis.

What Is Task Analysis?

Within the usability and human-computer interaction communities, the term is generally used to describe the study of the way people perform tasks - that is, the way in which a task is currently performed in real-life situations. Task analysis does not describe the optimal or ideal procedure for solving a problem. It simply describes the way in which the problem is currently solved.

Gathering Data For Task Analysis

Since the intent of task analysis is description of an existing system, the ideal starting point is data gathered from direct observation. In some cases, this is carried out in a controlled situation such as a usability laboratory. In others, it is more appropriate to carry out the observation "in the field" - in a real-life context. These may yield very different results!

Observational data can be gathered on the basis of set exercises, combined with the "think-aloud" technique, in which subjects are asked to describe their actions and their reasoning as they work through the exercise. Alternatively, observations can be taken by simply observing subjects in the workplace as they go through a usual day's activities. The advantage of this latter method is principally that the observer influences events as little as possible, but the corresponding disadvantage is that the observations are likely to take longer to conclude.

Unfortunately, there are significant drawbacks of direct observation, principally cost and time constraints. For this reason, task analysis is sometimes carried out using secondary sources such as manuals and guidebooks. This, too, has drawbacks - such sources often provide an idealised or unrealistic description of the task.

A third possibility is conducting interviews - experts, themselves very familiar with a task, can easily answer questions about that task. While this can be a useful way of solving unanswered questions quickly, experts are not always capable of precisely explaining their own actions as they can be too familiar with the problem domain, meaning that they are not aware on a conscious level of the steps involved in the task.

Analysing Observations

There are several methods of analysing observational data, such as knowledge-based analysis, procedural [2] or hierarchical task analysis, goal decomposition (the separation of each goal, or step, into its component elements) and entity-relationship based analysis. Data can also be visualised by charting or display as a network. Some methods are better suited to certain types of task - e.g. highly parallel tasks are difficult to describe using hierarchical task analysis (HTA). On the other hand, this method is easy for non-experts to learn and use. Each answers a slightly different question - for example, HTA describes the knowledge and abilities required to complete a task, while procedural task analysis describes the steps required to complete a task.

A simple procedural task analysis is completed as follows:

  1. Choose the appropriate procedure to complete the task that is being analysed.
  2. Determine and write down each step in that procedure; break down each step as far as possible.
  3. Complete every step of the procedure.
  4. Check that the procedure gave the correct result.

These steps can be charted as a flowchart for a clear and easy to read visual representation.

Conclusions

Task analysis provides a helpful toolkit for understanding everyday processes and for describing how human beings solve problems. It is not appropriate to perform detailed task analysis in every situation, due to cost and complexity concerns. However, the results of a task analysis can usefully inform design or pinpoint usability problems, particularly differences between the system designer's assumptions and the users' "mental models" - their ways of looking at the task to be performed.

References

  1. Task Analysis and Human-Computer Interaction, Crystal & Ellington,
    <http://www.ils.unc.edu/~acrystal/AMCIS04_crystal_ellington_final.pdf>
  2. Procedural Task Analysis,
    <http://classweb.gmu.edu/ndabbagh/Resources/Resources2/procedural_analysis.htm>

Briefing 89

Heuristic Evaluation


Background

Heuristic evaluation is a usability inspection method which enables a product to be assessed in order to identify usability problems - that is, places where the product is not easy to use. It is a discount ("quick and dirty") method, which means that it is cheap and requires relatively little expertise.

What's Involved In Heuristic Evaluation?

In this technique, a number of evaluators are first introduced to the heuristics, then given some tasks to complete and invited to report the problems - where the system fails to comply with the heuristics - either verbally or in some form of written report or checklist. Unlike many forms of usability testing, the evaluators do not have to be representative of the system's expected users (although they can be!), nor do the evaluators have to be experts, as the heuristics can be read and understood in a few minutes. Just three to five evaluators are needed to find the majority of usability problems, so the technique is quite efficient and inexpensive.

The problems found in heuristic evaluation essentially represent subjective opinions about the system. Evaluators will frequently disagree (there are no absolute right or wrong answers) but these opinions are useful input to be considered in interface design.

What Heuristics Should I Use?

There are several sets of possible heuristics available on the Web and elsewhere. This reflects the fact that they are "rules of thumb", designed to pick out as many flaws as possible, and various groups of usability evaluators have found different formalisations to be most useful for their needs, e.g. [1]. Probably the most commonly used is Nielsen's set of ten usability heuristics [2]:

  1. Visibility of system status
  2. Match between the system and the real world
  3. User control and freedom
  4. Consistency and standards
  5. Error prevention
  6. Recognition rather than recall
  7. Flexibility and efficiency of use
  8. Aesthetic and minimalist design
  9. Help users recognise, diagnose and recover from errors
  10. Help and documentation

An excellent resource to help you choose a set of heuristics is the Interactive Heuristic Evaluation Toolkit [3] which offers heuristics tailored to your expected user group, type of device, and class of application.

When Should Heuristic Evaluation Be Carried Out?

As heuristic evaluation is simple and cheap, it is possible to use it to quickly test the usability of a web site at any stage in its development. Waiting until a fully functional prototype Web site exists is not necessary; interface ideas can be sketched out onto paper or mocked up using graphics software or Flash. These mockups can be tested before any actual development takes place.

Most projects will benefit from a user-centred design process, an approach that focuses on supporting every stage of the development process with user-centred activities. It is advisable to test early and often, in order to ensure that potential problems with a design are caught early enough that they can be solved cheaply. Even Web sites that are already live can benefit from usability testing: many problems are easily solved, although some are difficult or expensive to address at a late stage.

Conclusions

If a developing design is tested frequently, most usability problems can be found and solved at an early stage. Heuristic evaluation is a simple and cheap technique that finds the majority of usability problems. An existing Web site or application will often benefit from usability testing, but testing early and often provides the best results. Finally, it is useful to alternate use of heuristic evaluation with use of other methods of usability testing, such as user testing, since the two techniques often reveal different sets of usability problems.

References

  1. Heuristic Evaluation - A System Checklist, Deniese Pierotti, Xerox Corp.
    <http://www.stcsig.org/usability/topics/articles/he-checklist.html>
  2. Heuristic Evaluation, Jakob Nielsen,
    <http://www.useit.com/papers/heuristic/>
  3. Interactive Heuristic Evaluation Toolkit,
    <http://www.id-book.com/catherb/>

Further Information


Briefing 90

Developing User Personas


Background

When designing a Web site or program, the obvious question to ask at once is, "who are my audience?" It seems natural to design with users in mind, and just as natural to wish to build a product that is satisfactory to all one's users - however, experience shows that it is difficult to design something that appeals to everybody [1]. Instead, it is useful to start with a few sample profiles of users, typical examples of the audience to whom the design should appeal, and design to their needs. Not only is it easier for the designer, but the result is usually more appealing to the user community.

Researching A User Persona

The first step in developing a user persona is to learn a little about your users; qualitative research techniques like one-to-one interviews are a good place to start. It's best to talk to several types of users; don't just focus on the single demographic you're expecting to appeal to, but consider other groups as well. Focusing on one demographic to the exclusion of others may mean that others do not feel comfortable with the resulting design, perhaps feeling alienated or confused. The expected result of each interview is a list of behaviour, experience and skills. After a few interviews, you should see some trends emerging; once you feel confident with those, it's time to stop interviewing and start to build personas [2].

Developing A User Persona

Once you have an idea of each type of persona, write down the details for each one. It may help to write a sort of biography, including the following information:

You can even find a photograph or sketch that you feel fits the personality and add it to the persona's description.

Why User Personas?

The intent behind a user persona is to create a shared vocabulary for yourself and your team when discussing design questions and decisions. User personas provide easy-to-remember shorthand for user types and behaviour, and can be used to refer to some complex issues in a simple and generally understood way. Sharing them between management and development teams, perhaps even with funders, also provides a useful avenue for effective communication of technical subjects. Furthermore, it is much easier to design for a persona with whom one can empathise than for a brief, dry description of user demographics.

It is good practice, when making design decisions, to consider each user persona's likely reaction to the result of the decision. Which option would each user persona prefer?

User personas can also feed in to discount usability testing methods such as the cognitive walkthrough, saving time and increasing the effectiveness of the approach.

Finally, the research required to create a user persona is an important first step in beginning a user-centred design process, an approach that focuses on supporting every stage of the development process with user-centred activities, which is strongly recommended in designing for a diverse user group.

Conclusions

User personas are a useful resource with which to begin a design process: they allow the designers to gain an understanding of their users' expectations and needs in a cheap and simple manner, and can be useful when conducting discount usability testing methods. Additionally, they make helpful conversational tools when discussing design decisions.

References

  1. The Inmates are Running the Asylum, Alan Cooper, ISBN: 0672316498
  2. 5 Minute Whitepaper: Which persona are you targeting?,
    <http://newsletter.refinery.com/e_article000334332.cfm?x=b11,0,w>

Further Information


Briefing 91

The e-Framework for Education and Research


The e-Framework for Education and Research

The e-Framework is an initiative by the UK's Joint Information Systems Committee (JISC), Australia's Department of Education, Science and Training (DEST) and partners to produce an evolving and sustainable, open standards based, service oriented technical framework to support the education and research communities.

The e-Framework supports a service oriented approach to developing and delivering education, research and management information systems. Such an approach maximises the flexibility and cost effectiveness with which systems can be deployed, whether in an institutional context, nationally or internationally.

The e-Framework allows the community to document its requirements and processes in a coherent way, and to use these to derive a set of interoperable network services that conform to appropriate open standards. By documenting requirements, processes, services, protocol bindings and standards in the form of 'reference models', members of the community are better able to collaborate on the development of service components that meet their needs (both within the community and with commercial and other international partners). The e-Framework also functions as a strategic planning tool for the e-Framework partners.

The initiative builds on the e-Learning Framework [1] and the JISC Information Environment [2] as well as other service oriented initiatives in the areas of scholarly information, research support and educational administration. A briefing paper that provides an overview of the e-Framework [3] and how the partners intend to use it can be found in the resources section [4].

Guiding Principles For The e-Framework

The e-Framework Partnership intends to operate in accordance with the following guiding principles:

The Adoption of a Service Oriented Approach to System and Process Integration

A service-oriented framework provides significant benefits to stakeholders, including policy makers, managers, institutions, suppliers and developers. It is a business-driven approach to developing ICT infrastructure that encourages innovation by being agile and adaptive.

A service-oriented framework currently provides the best means of addressing systems integration issues within institutions, between institutions and across the domains within education and research.

The definition of services is driven by business requirements and processes. The factoring of the services is a key to the effectiveness of the framework.

A high level 'abstract' service definition should not duplicate or overlap another service. An abstract service definition is a description of a service that is independent of the language or platform that may be used to implement the service.

The e-Framework activities will strive for technical excellence and adoption of co-developed good practices.

The Development, Promotion and Adoption of Open Standards

Open standards are key to achieving integration between systems, institutions and between domains in the education and research communities. Open standards are defined for the e-Framework as those standards that are developed collaboratively through due process, are platform independent, vendor neutral, extensible, reusable, publicly accessible and not encumbered by royalties. In order to achieve impact open standards require international collaboration and consensus.

Community Involvement in the Development of the e-Framework

Collaboration between technical and domain experts, practitioners, developers and vendors will be essential to the evolution and uptake of the e-Framework approach. Capacity and capability will need to be developed.

Open Collaborative Development Activities

In order to support evolution of the e-Framework, results will be publicly available. Engagement with communities of use will be essential in the development of the e-Framework. Sustained international development of the e-Framework cannot be undertaken by a single organisation and collaboration between organisations is required. Where possible and appropriate, Open Intellectual Property licensing approaches (such as open source, Creative Commons, royalty free patent licences) will be adopted.

Flexible and Incremental Deployment of the e-Framework

The e-Framework supports and promotes flexible deployment by institutions and facilitates incremental deployment and change. The e-Framework will accommodate both open source and proprietary implementations. Institutions will decide whether to use open or closed source implementations in deploying the e-Framework.

The e-Framework's founding organisations, DEST and JISC, have devised a temporary model for the management of and engagement with the e-Framework designed to support the incubation and nurture of the e-Framework, throughout the critical early years of development. The governance and stewardship structures will be iteratively refined as part of the e-Framework work plan.

About This Document

This document is a modified version of a document on "The e-Framework for Education and Research" published on the E-Framework Web site at <http://www.e-framework.org/about/> (version last modified on 2005-10-16 11:05 PM).

The document was originally written by Wilbert Kraan, CETIS and has been republished as a QA Focus briefing document. We are grateful to Wilbert for permission to reprint this document.


Briefing 92

An Introduction to Web 2.0


Web 2.0

The term 'Web 2.0' was coined to define an emerging pattern of new uses of the Web and approaches to Web development, rather than a formal upgrade of Web technologies, as the 2.0 version number might appear to signify. The key Web 2.0 concepts include:

It's an attitude, not a technology:
An acknowledgement that Web 2.0 is not primarily about a set of standards or applications, but a new mindset to how the Web can be used.
A network effect:
This describes applications which become more effective as the number of users increases. This effect is well-known in computer networks, with the Internet providing an example of how a network can become more resilient as the number of connected devices grows.
Openness:
The development of more liberal licences (such as Creative Commons copyright licences and open source licences for software) can allow integration of data and reuse of software without encountering legal barriers.
Trust Your Users:
Rather than having to develop complex access regimes, a more liberal approach can be taken which makes it easier for users to make use of services.
Network as a platform:
The Web can now be used to provide access to Web applications, and not just informational resources. This allows users to make use of applications without having to go through the cumbersome exercise of installing software on their local PC.
Always beta:
With Web applications being managed on a small number of central servers, rather than on large numbers of desktop computers, it becomes possible for the applications to be enhanced in an incremental fashion, with no requirement for the user of the application to upgrade their system.
The long tail:
As the numbers of users of the Web grows, this can provide business opportunities for niche markets which previously it may not have been cost-effective to reach.
Small pieces, loosely coupled:
As the technical infrastructure of the Web stabilises, it becomes possible to integrate small applications. This enables services to be developed more rapidly and can avoid the difficulties of developing and maintaining more complex and cumbersome systems.

Web 2.0 Application Areas

The key application areas which embody the Web 2.0 concepts include:

Blogs
A Web site which is commonly used to provide diaries, with entries provided in chronological order. Blogs can be used for a variety of purposes, ranging from reflective learning by students and researchers through to dissemination channels for organisations.
Wikis
A wiki refers to a collaborative Web-based authoring environment. The term wiki comes from a Hawaiian word meaning 'quick', and the name reflects the aim of the original design of wikis: to provide a very simple authoring environment which allows Web content to be created without the need to learn the HTML language or to install and master HTML authoring tools.
Syndicated Content
The RSS and Atom formats have been developed to enable content to be automatically embedded elsewhere. RSS was initially developed to support the reuse of blog content. Its success led to the format being used in other areas, initially for the syndication of news feeds and then for other alerting purposes and general syndication of content. The Atom format was developed as an alternative to RSS. (A minimal example of an RSS feed is given after this list.)
Mashups
A mashup is a service which contains data and services combined from multiple sources. A common example is a Google Maps mashup, which integrates location data with a map provided by the Google Maps service.
Podcasts
A podcast initially referred to syndicated audio content, which can be transferred automatically to portable MP3 players, such as iPods. However the term is sometimes misused to describe a simple audio file.
Social sharing services
Applications which provide sharing of various types of resources such as bookmarks, photographs, etc. Popular examples of social sharing services include del.icio.us and Flickr.
Social networks
Communal spaces which can be used for group discussions and sharing of resources.
Folksonomies and tagging
A bottom-up approach to providing labels for resources, to allow them to be retrieved.
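
To illustrate the syndicated content formats mentioned above, the following is a minimal sketch of an RSS 2.0 feed containing a single item. The element names are those defined by the RSS 2.0 specification; the feed title, URLs and dates are purely illustrative.

  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example news feed</title>
      <link>http://www.example.org/news/</link>
      <description>News items from an example Web site.</description>
      <item>
        <title>New briefing document published</title>
        <link>http://www.example.org/news/item-1.html</link>
        <description>A short summary of the news item.</description>
        <pubDate>Mon, 02 Oct 2006 09:00:00 GMT</pubDate>
      </item>
    </channel>
  </rss>

An RSS viewer, or another Web site, can retrieve this file and display or embed the items it contains.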

Further Information


Briefing 93

An Introduction to AJAX


What Is AJAX?

Asynchronous JavaScript and XML (AJAX) is an umbrella term for a collection of Web development technologies - including (X)HTML, CSS, the Document Object Model (DOM), JavaScript and the XMLHttpRequest object - used to create interactive Web applications. Most of these are W3C standards (the XMLHttpRequest specification is developed by WHATWG [1]).

Since data can be sent and retrieved without requiring the user to reload an entire Web page, small amounts of data can be transferred as and when required. Moreover, page elements can be dynamically refreshed at any level of granularity to reflect this. An AJAX application performs in a similar way to local applications residing on a user's machine, resulting in a user experience that may differ from traditional Web browsing.
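
The following is a minimal sketch, in JavaScript, of the pattern described above. It assumes a browser which provides a native XMLHttpRequest object (older browsers may require a different approach, as discussed under 'Using AJAX' below); the URL latest-news.html and the element id news are illustrative assumptions only.

  // Request a fragment of content asynchronously and use it to refresh
  // one part of the current page, without reloading the whole page.
  var request = new XMLHttpRequest();
  request.open("GET", "latest-news.html", true);   // true = asynchronous
  request.onreadystatechange = function () {
    if (request.readyState === 4 && request.status === 200) {
      document.getElementById("news").innerHTML = request.responseText;
    }
  };
  request.send(null);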

The Origins of AJAX

Recent examples of AJAX usage include Gmail [2], Flickr [3] and 24SevenOffice [4]. It is largely due to these and other prominent sites that AJAX has become popular only relatively recently - the underlying technology has been available for some time. One precursor was dynamic HTML (DHTML), which twinned HTML with CSS and JavaScript but suffered from cross-browser compatibility issues. The major technical barrier was the lack of a common method for asynchronous data exchange; many variations are possible, such as the use of an "iframe" for data storage or JavaScript Object Notation (JSON) for data transmission, but the wide availability of the XMLHttpRequest object has made it a popular solution. AJAX is not a single technology; rather, the term refers to a proposed set of methods using a number of existing technologies. As yet, there is no firm AJAX standard, although the recent establishment of the Open AJAX group [5], supported by major industry figures such as IBM and Google, suggests that one will become available soon.

Using AJAX

AJAX applications can benefit both the user and the developer. Web applications can respond much more quickly to many types of user interaction and avoid repeatedly sending unchanged information across the network. Also, because AJAX technologies are open, they are supported in all JavaScript-enabled browsers, regardless of operating system - however, differences in the implementation of XMLHttpRequest between browsers cause some issues, with some browsers exposing it as an ActiveX object and others providing a native implementation. The upcoming W3C 'Document Object Model (DOM) Level 3 Load and Save Specification' [6] provides a standardised solution, but the current solution has become a de facto standard and is therefore likely to be supported in future browsers.
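
The following sketch shows the commonly used workaround for these implementation differences: try the native object first and fall back to the ActiveX versions provided by older Internet Explorer releases. It is an illustrative example rather than a definitive solution.

  // Create an XMLHttpRequest object in a cross-browser fashion.
  function createRequest() {
    if (window.XMLHttpRequest) {
      return new XMLHttpRequest();                     // native implementation
    }
    if (window.ActiveXObject) {
      try {
        return new ActiveXObject("Msxml2.XMLHTTP");    // newer MSXML releases
      } catch (e) {
        return new ActiveXObject("Microsoft.XMLHTTP"); // older IE releases
      }
    }
    return null;   // no support available (e.g. a very old browser)
  }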

Although the techniques within AJAX are relatively mature, the overall approach is still fairly new and there has been criticism of the usability of its applications; further information on this subject is available in the Ajax and Usability QA Focus briefing document [7]. One of the major causes for concern is that JavaScript needs to be enabled in the browser for AJAX applications to work. This setting is out of the developer's control and statistics show that currently 10% of browsers have JavaScript turned off [8]. This is often for accessibility reasons or to avoid scripted viruses.

Conclusions

The popularity of AJAX is due to the many advantages of the technology, but several pitfalls remain related to the informality of the standard, its disadvantages and limitations, potential usability issues and the idiosyncrasies of various browsers and platforms. However, the level of interest from industry groups and communities means that it is undergoing active and rapid development in all these areas.

References

  1. Web Hypertext Application Technology Working Group,
    <http://www.whatwg.org/>
  2. GMail,
    <http://gmail.google.com/>
  3. Flickr,
    <http://www.flickr.com/>
  4. 24SevenOffice,
    <http://www.24sevenoffice.com/>
  5. The Open AJAX group,
    <http://www.siliconbeat.com/entries/ajax.pdf>
  6. Document Object Model (DOM) Level 3 Load and Save Specification, W3C,
    <http://www.w3.org/TR/DOM-Level-3-LS/>
  7. AJAX and Usability, QA Focus briefing document,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-94/>
  8. W3Schools Browser Statistics,
    <http://www.w3schools.com/browsers/browsers_stats.asp>

Briefing 94

AJAX And Usability Issues


Introducing AJAX

The term Asynchronous JavaScript and XML (AJAX) refers to a method by which a number of technologies can be combined to enable web applications to communicate in an asynchronous manner with services - that is, they can dynamically send and receive information without forcing page reloads, in a manner that is transparent to the user. This allows for a user experience similar to that of using local applications on a desktop PC. The background to AJAX is discussed in more depth in a QA Focus briefing paper [1].

Using AJAX

AJAX applications potentially offer a number of benefits, such as speed and efficiency in terms of bandwidth and time, and consistency in terms of appearance and behaviour across browser platforms. However, there are several potential disadvantages, a number of which are covered in this briefing paper. Some are related to security, such as limitations imposed in response to cross-site scripting security flaws, and the deactivation of JavaScript by many users (although this may also be for accessibility reasons). Others involve implementation issues, design issues or a mismatch with user expectations.

AJAX and Usability

Certain usability issues have been identified as particularly common:

Concept of State

The Web is built around a very specific concept of state; once a page has been downloaded, it is usually expected to remain static. AJAX uses dynamic Web page updates, which means that state transition (the move from one page view to another) is now much more complex, as separate elements may update asynchronously. AJAX applications frequently do not store application state information; this breaks the 'back' button functionality of the browser. Many Web users use the back button as their primary means of navigation and struggle to control the system without it. Supporting undo and redo is one of the key usability rules and is vital in allowing users to recover from errors - that said, it is not always possible or advisable, such as in the case of a completed sale in e-commerce.

AJAX requires developers to explicitly support this functionality in their software, or to use a framework that supports it natively. Various solutions to this problem have been proposed or implemented, such as the use of invisible IFRAME elements to invoke changes that populate the history list used by the browser's back button.

A related issue is that because AJAX allows asynchronous data exchange with the server, it is difficult for users to bookmark a particular state of the application. Solutions to this problem have also started to appear. Some developers use the URL anchor or fragment identifier (the identifier after the hash '#') to keep track of state, and therefore allow users to return to the application in a given state. Some AJAX applications also include specially constructed permalinks.
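
The following is a minimal sketch, in JavaScript, of the fragment identifier technique mentioned above. The state names and the status element are illustrative assumptions; real applications typically use a library or framework to manage this.

  // Record the current application state in the URL fragment, so that it
  // can be bookmarked and, in most browsers, added to the history.
  function saveState(state) {
    window.location.hash = state;   // e.g. http://www.example.org/app#inbox
  }

  // On page load, read the fragment back and restore the saved state.
  function restoreState() {
    var state = window.location.hash.replace("#", "");
    if (state !== "") {
      // Rebuild the relevant part of the page; here we simply report it
      // in a (hypothetical) status element.
      document.getElementById("status").innerHTML = "Current view: " + state;
    }
  }

  window.onload = restoreState;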

The asynchronous nature of AJAX can also confound search engines and Web spiders, which traditionally record only the content of the page. Since these usually disregard JavaScript entirely, an alternative means of access must be provided if it is desirable for the Web page in question to be indexed by search engines. Many AJAX applications, such as email, mapping services or online chat clients, will not benefit from indexing by external search engines; however, as the popularity of AJAX grows, it is likely that more informational sites will begin to apply the technology.

User Expectations

The Web is no longer in its infancy and most users have now become fairly familiar with its conventions. When entering a Web site there are certain expectations of how information will be served up and dealt with. Without explicit visual clues to the contrary, users are unlikely to realise that the content of a page is being modified dynamically. AJAX applications often do not offer visual clues if, for example, a change is being made to the page or content is being preloaded. The usual clues (such as the loading icon) are not always available. Again, solving this problem requires designers to explicitly support this functionality, using traditional user interface conventions wherever possible or alternative clues where necessary.

Response Time

AJAX has the potential to reduce the amount of traffic between the browser and the server, as information can be sent or requested as and when required. However, this ability can easily be misused, such as by polling the server for updates excessively frequently. Since data transfer is asynchronous, a lack of bandwidth should not be perceivable to the user; however, ensuring this is the case requires smart preloading of data.
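
As an illustration of the point about polling, the following JavaScript sketch checks for updates at a fixed, modest interval rather than as often as possible. The interval, URL and element id are illustrative assumptions only.

  var POLL_INTERVAL = 60000;   // poll once a minute, not once a second

  function checkForUpdates() {
    // Assumes native XMLHttpRequest support; see Briefing 93 for browser differences.
    var request = new XMLHttpRequest();
    request.open("GET", "updates.html", true);
    request.onreadystatechange = function () {
      if (request.readyState === 4 && request.status === 200) {
        document.getElementById("updates").innerHTML = request.responseText;
      }
    };
    request.send(null);
  }

  window.setInterval(checkForUpdates, POLL_INTERVAL);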

Design Issues

AJAX makes many techniques available to developers that, previously, were available only by using DHTML or a technology like Flash. There is therefore concern that, as with these previous technologies, designers have access to a plethora of techniques that bring unfamiliar usability or accessibility problems. Gratuitous animation, pop ups, blinking text and other distractions all have accessibility implications and stop the user from fully focussing on the task at hand. When creating AJAX applications developers should fully consider the ramifications from the user's perspective.

Accessibility

Most methods of AJAX implementation rely heavily on features only present in desktop graphical browsers and not in text-only readers [2]. Developers using AJAX technologies in Web applications will find attempting to adhere to WAI accessibility guidelines a challenge. They will need to make sure that alternate options for users on other platforms, or with older browsers and slow Internet connections, are available.

A more detailed explanation of usability and the Web is available as a briefing paper [3].

Conclusions

The concerns surrounding adoption of AJAX are not unfamiliar; many stem from user and developer experience of Flash. Like Flash, the technologies comprising AJAX may be used in many different ways; some are more prone to usability or accessibility issues than others. The establishment of standard frameworks, and the increasing standardisation of the technologies behind AJAX, is likely to improve the situation for the Web developer.

In the meantime, the key point for developers to remember is that, despite the availability of new approaches, good design remains essential and Jakob Nielsen's Ten Usability Heuristics [4] should be kept in mind. AJAX applications need to be rigorously tested to deal with the idiosyncrasies of different browsers, platforms and usability issues. Further, applications should degrade gracefully and offer alternative functionality for those users who do not have JavaScript enabled.

Note that as the use of AJAX increases and more programming libraries become available, many of the issues mentioned in this briefing paper will be resolved. In parallel it is likely that over time browsers will standardise and incorporate better support for new technologies.

References

  1. An Introduction To AJAX, QA Focus briefing document,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-93/>
  2. W3Schools Browser Statistics, W3Schools,
    <http://www.w3schools.com/browsers/browsers_stats.asp>
  3. Usability and the Web, QA Focus briefing document,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-86/>
  4. Ten Usability Heuristics, Useit.com,
    <http://www.useit.com/papers/heuristic/heuristic_list.html>

Briefing 95

Service Registries And UDDI


What are Service Registries?

There is a wealth of services available on the Web. The high-profile examples are the Amazon and Google APIs, which cover services as diverse as ISBN lookups, book cover image retrieval, Web search, spell-checking and geographical mapping, but many other services are available, performing all sorts of tasks from library access to instant price quotes and reviews for shopping sites. Some services perform a large and complex task, others perform a single task, and all speak different 'languages', from SOAP (Simple Object Access Protocol) to REST (Representational State Transfer).

For any given Web-based problem, there is an excellent possibility that a service is available that offers useful functionality. Unfortunately, finding these services is not always easy, but the process is greatly facilitated by using a service registry, designed to allow developers to register their services and for anybody, developer or end-user, to locate useful services. A service registry is an important part of any service-oriented architecture.

Types of Service Registry

Various service registries already exist. The first major standard to appear was UDDI (Universal Description, Discovery and Integration), which was designed mostly with SOAP-based Web services in mind. UDDI was originally designed with the idea that there would be very few service registries needed - like the Yellow Pages, there would be one central source of information, which would be available from a number of places, but each one would offer the same content. Several other types of service registry exist, such as the JISC's IESR (Information Environment Service Registry) project [1], which focuses on improving resource discovery mechanisms for electronic resources, that is, to make it easier to find materials to support teaching, learning and research. This briefing paper focuses on UDDI, although it is important to realise that the original UDDI standard has now been replaced by UDDI v3 [2] and is no longer generally used as part of a centralised approach. Instead, many organisations use corporate UDDI servers that are not publicly accessible.

Why Use Service Registries?

Service registries can be accessed in a number of ways. Most can be accessed via a Web interface, so if one is looking for a service or type of service, one can use a service registry like a typical search engine, entering keywords and reading textual descriptions of each service. However, many registries are designed to permit a second mode of use, where services are described in a machine-readable way. This means that, should one service become unavailable, the systems that were using that service can search the service registry for a second, compatible service that could be used in its place.

Using a UDDI Service Registry

UDDI can be used in two ways, because it is both a service registry and a business registry. One can look up businesses or organisations in the business registry, or search for services by description or keyword - UDDI also supports a formal taxonomy system that can be used for formally classifying services. It is sometimes more effective to begin searching for a service by looking at known providers. When an appropriate service has been found, it can be used at once; the UDDI server provides not only the name and description of services, but also information about where the Web service can be found, what protocol should be used to communicate with it, and details of the functionality that it can provide. Adding new services to a UDDI service registry may be done either using a Web interface designed for administrative access, or through an API (application program interface). Each different type of service registry supports its own method or methods - for example, the IESR service registry provides a Web form through which services can be added, or an XML-based service submission function.

Quality Issues

When adding data to any sort of service registry, it is important to ensure that the data is provided in the form recommended by the registry. Malformed entries will not be searchable, and may confuse other users. Equally, once you have listed a service in a service registry, remember that if the service is moved, shut down or altered, it will also be necessary to update the listing in the registry.

Conclusions

Service registries are an important part of the service-oriented Internet, automating and simplifying resource discovery. However, there is no single standard for service registries; as of today, each group has its own resource discovery strategy. Taking part in a service registry will generally lead to additional exposure for the listed services and resources, but does convey the additional responsibility of ensuring that listings remain up-to-date.

References

  1. JISC Information Environment Service Registry,
    <http://iesr.ac.uk/>
  2. UDDI V3 Specification, Oasis,
    <http://uddi.org/pubs/uddi-v3.00-published-20020719.htm>

Briefing 96

Open Standards For JISC Development Programmes


The Importance of Open Standards

Open standards play an important role in JISC's development programmes in order to help ensure:

Open standards are of particular importance in a JISC development environment, in order to enable developments in one institution to be reused in another, and to support the diversity of the UK's higher and further education environment.

Open Standards and JISC's Development Programmes

JISC funds a range of development programmes. Various procedures and policies may be developed for the different programme calls. This briefing document outlines some of the key principles related to the selection and use of open standards for project teams (a) preparing proposals and (b) developing project deliverables, once a proposal has been accepted.

The key areas for projects are:

Programme-Specific Advice and Procedures

Please note that projects should consult with Programme Managers and appropriate documentation for specific information related to individual programme calls.

Preparing Your Proposal

When preparing a proposal it is important that you are familiar with the standards which are relevant to your proposal and to the particular programme you are involved with. You should ensure that you read relevant resources, such as those documented in the programme call, QA Focus advisory resources [1] and materials from JISC services such as UKOLN [2] and CETIS [3].

Your proposal should demonstrate that you have an understanding of appropriate open standards and that you are able to implement the relevant open standards, or can provide valid reasons if you are not able to do this.

Implementing Your Proposal

If your proposal is accepted you will need to address the issue of the selection and deployment of standards in more detail. For example, you may have a work package on the selection of standards and the technical architecture used to create, manage and use the standards.

It is important that you have appropriate quality assurance processes in place in order to ensure that you are making use of open standards in an appropriate way. Further advice on quality assurance is provided on the QA Focus Web site [4].

Sharing Your Experiences

Since projects may be involved in developing innovative applications and services, making use of standards in innovative ways, making use of emerging standards or prototyping new standards, it is important that feedback is provided on the experiences gained. The feedback may be provided in various ways: in project reports; by producing case studies; by providing feedback on mailing lists, Wikis, etc.

Where possible it is desirable that the feedback is provided in an open way, allowing other projects, programmes and the wider community to benefit from such experiences. JISC Programmes may put in place structures to support such feedback.

References

  1. Briefing documents, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/>
  2. UKOLN,
    <http://www.ukoln.ac.uk/>
  3. CETIS,
    <http://www.cetis.ac.uk/>
  4. Quality Assurance, QA Focus briefing documents,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/#qa>

About This Document

This document was produced to support JISC's development programmes. We are grateful to JISC for their feedback on the document.


Briefing 97

Introduction To OPML


OPML

OPML stands for Outline Processor Markup Language. OPML was originally developed for the Radio UserLand outlining application. However it has been adopted for a range of other applications, in particular providing an exchange format for collections of RSS feeds.

This document describes the OPML specification and provides examples of use of OPML for the exchange of RSS feeds.

The OPML Specification

The OPML specification [1] defines an outline as a hierarchical, ordered list of arbitrary elements. The specification is fairly open, which makes it suitable for many types of list data. The OPML specification is very simple, containing the following elements (a minimal example is given after the list):

<opml version="1.0">
The root element which contains the version attribute and one head and one body element.
<head>
Contains metadata. May include any of these optional elements: title, dateCreated, dateModified, ownerName, ownerEmail, expansionState, vertScrollState, windowTop, windowLeft, windowBottom, windowRight.
<body>
Contains the content of the outline. Must have one or more outline elements.
<outline>
Represents a line in the outline. May contain any number of arbitrary attributes. Common attributes include text and type.
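
The following is a minimal sketch of an OPML file grouping two (fictitious) RSS feeds. The xmlUrl attribute is the one conventionally used by RSS viewers to record the feed location; the titles and URLs are purely illustrative.

  <?xml version="1.0" encoding="UTF-8"?>
  <opml version="1.0">
    <head>
      <title>Example collection of RSS feeds</title>
      <ownerName>Example Owner</ownerName>
    </head>
    <body>
      <outline text="Project news" type="rss"
               xmlUrl="http://www.example.org/news/rss.xml" />
      <outline text="Project blog" type="rss"
               xmlUrl="http://www.example.org/blog/rss.xml" />
    </body>
  </opml>

Importing this single file into an RSS viewer which supports OPML subscribes the user to both feeds in one step.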

Limitations Of OPML

OPML has various shortcomings:

OPML Applications

Import and Export of RSS Files

OPML can be used in a number of application areas. One area of particular interest is in the exchange of RSS files. OPML can be used to group together related RSS feeds. RSS viewers which provide support for OPML can then be used to read in the group, to avoid having to import RSS files individually. Similarly RSS viewers may also provide the ability to export groups of RSS files as a single OPML file.

OPML Viewers

OPML viewers can be used to view and explore OPML files. OPML viewers have similar functionality to RSS viewers, but allow groups of RSS files to be viewed.

The QA Focus Web site makes use of RSS and OPML to provide syndication of the key QA Focus resources [2]. This is illustrated in Figure 1, which shows use of the Grazr inline OPML viewer [3]. This application uses JavaScript to read and display the OPML data.

Other OPML viewers include Optimal OPML [4] and OPML Surfer [5].

Figure 1: Grazr

Risk Assessment

It should be noted that OPML is a relatively new format and only limited experience has been gained in its usage. Organisations which wish to exploit the benefits of OPML should seek to minimise any risks associated with use of the format and develop migration strategies in case richer or more robust alternative formats become available.

Acknowledgments

This briefing document makes use of information published in the OPML section on Wikipedia [6].

References

  1. OPML Specification,
    <http://www.opml.org/spec>
  2. RSS Feeds, QA Focus,
    <http://www.ukoln.ac.uk/qa-focus/rss/#opml>
  3. Grazr,
    <http://www.grazr.com/>
  4. Optimal OPML,
    <http://www.optimalbrowser.com/optimal.php>
  5. OPML Surfer,
    <http://www.kbcafe.com/rss/opmlsurfer.aspx>
  6. OPML, Wikipedia
    <http://en.wikipedia.org/wiki/Outline_Processor_Markup_Language>

Briefing 98

Risk Assessment For Making Use Of Third Party Web 2.0 Services


Background

This briefing document provides advice for Web authors, developers and policy makers who are considering making use of Web 2.0 services which are hosted by external third party services. The document describes an approach to risk assessment and risk management which can allow the benefits of such services to be exploited, whilst minimising the risks and dangers of using such services.

Note that other examples of advice are also available [1] [2].

About Web 2.0 Services

This document covers use of third party Web services which can be used to provide additional functionality or services without requiring software to be installed locally. Such services include:

Advantages and Disadvantages

Advantages of using such services include:

Possible disadvantages of using such services include:

Risk Management and Web 2.0

A number of risks associated with making use of Web 2.0 services are given below, together with an approach to managing the dangers of such risks.

Risk: Loss of service (e.g. company becomes bankrupt, closed down, ...)
Assessment: Implications if the service becomes unavailable; likelihood of service unavailability.
Management: Use for non-mission-critical services; have alternatives readily available; use trusted services.

Risk: Data loss
Assessment: Likelihood of data loss; lack of export capabilities.
Management: Evaluation of the service; non-critical use; testing of export facilities.

Risk: Performance problems
Assessment: Unreliability of the service; slow performance.
Management: Testing; non-critical use.

Risk: Lack of interoperability
Assessment: Likelihood of application lock-in; loss of integration and reuse of data.
Management: Evaluation of integration and export capabilities.

Risk: Format changes
Assessment: New formats may not be stable.
Management: Plan for migration or use on a small scale.

Risk: User issues
Assessment: User views on the services.
Management: Gain feedback.

Note that in addition to risk assessment of Web 2.0 services, there is also a need to assess the risks of failing to provide such services.

Example of a Risk Management Approach

A risk management approach [3] was taken to use of various Web 2.0 services on the Institutional Web Management Workshop 2006 Web site.

Use of established services:
Google and Google Analytics are used to provide searching and usage reports.
Alternatives available:
Web server log files can still be analysed if the hosted usage analysis services become unavailable.
Management of services:
Interfaces to various services were managed to allow them to be easily changed or withdrawn.
User Engagement:
Users are warned of possible dangers and invited to engage in a pilot study.
Learning:
Learning may be regarded as the aim, not provision of long term service.

References

  1. Checklist for assessing third-party IT services, University of Oxford,
    <http://www.oucs.ox.ac.uk/internal/3rdparty/checklist.xml>
  2. Guidelines for Using External Services, University of Edinburgh,
    <https://www.wiki.ed.ac.uk/download/attachments/8716376/GuidelinesForUsingExternalWeb2.0Services-20080801.pdf?version=1>
  3. Risk Assessment, IWMW 2006, UKOLN,
    <http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2006/risk-assessment/>

Briefing 99

Impact Analysis For Web Sites


Background

This briefing document provides advice on approaches to measuring the impact of a service provided by a Web site.

The document describes the strengths and weaknesses of a number of approaches to measuring impact, and provides advice on interpreting the findings.

Traditional Approaches To Impact Analysis

A traditional approach to measuring the impact of a Web site is to report on Web server usage log files [1]. Such data can provide information such as an indication of trends and growth in usage; how visitors arrived at the Web site; how users viewed pages on your Web site and details on the browser technologies used by your visitors.

However, although such information can be useful, it is important to recognise that the underlying data and the data analysis techniques used may be flawed [2]. For example:

It should also be noted that care must be taken when aggregating usage statistics:

So although analysis of Web site usage data may be useful, the findings need to be carefully interpreted.

Other Approaches To Impact Analysis

Although Web site usage analysis may have flaws, there are other approaches which can be used to measure the impact of a Web site; such alternatives can be used to complement Web usage analysis.

Link analysis:
If other Web sites have links to your Web site, this can be an indication of the value placed on your Web site. Services such as LinkPopularity.com [3] can provide such data. Keeping a record of the number of sites linking to you can also help show trends.
Analysis of social bookmarking services:
Services such as del.icio.us [4] allow you to bookmark resources. A useful aspect of the service is the ability to observe others who are bookmarking the same resource. So bookmarking your own Web site will allow you to record the number of people who bookmark your site. This may be a useful indicator if the social bookmarking service you use is popular with your target audience.
User comments:
Comments from your user community can provide a particularly valuable way of measuring impact. Feedback can be obtained in a variety of ways: focus groups, online questionnaires, online guest books, etc.
Analysis of Web sites, mailing lists, blogs, etc.:
Search engines such as Google, Technorati [5], etc. may enable you to find comments about your Web site, and also provide various metrics which may be useful.

Possible disadvantages of using such services include:

Embedding Impact Analysis

In order to maximise the benefits, you may find it useful to develop an Impact Analysis Strategy. This should ensure that you are aware of the strengths and weaknesses of the approaches you plan to use, have mechanisms for gathering information in a consistent and effective manner and that appropriate tools and services are available.

References

  1. Usage Statistics For Web Sites, QA Focus briefing document no. 84, UKOLN,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-84/>
  2. Performance Indicators for Web Sites, B. Kelly, Exploit Interactive, issue 5, April 2000,
    <http://www.exploit-lib.org/issue5/indicators/>
  3. LinkPopularity.com,
    <http://www.linkpopularity.com/>
  4. del.icio.us,
    <http://del.icio.us/>
  5. Technorati,
    <http://www.technorati.com/>

Briefing 100

An Introduction To Microformats


Background

This document provides an introduction to microformats, with a description of what microformats are, the benefits they can provide and examples of their usage. In addition the document discusses some of the limitations of microformats and provides advice on best practices for use of microformats.

What Are Microformats?

"Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging)." [1].

Microformats make use of existing HTML/XHTML markup: typically the <span> and <div> elements and the class attribute are used with agreed class names (such as vevent, dtstart and dtend to define an event and its start and end dates). Applications (including desktop applications, browser tools, harvesters, etc.) can then process this data.
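
As an illustration, the following sketch marks up a fictitious event using the hCalendar class names mentioned above, following the commonly used abbr design pattern in which the machine-readable ISO date is placed in the title attribute. The event details are purely illustrative.

  <p class="vevent">
    <span class="summary">QA Focus workshop</span> takes place from
    <abbr class="dtstart" title="2006-11-14">14 November 2006</abbr> to
    <abbr class="dtend" title="2006-11-15">15 November 2006</abbr>
    at <span class="location">University of Bath</span>.
  </p>

A microformat-aware tool can extract the event from this markup and, for example, add it to the user's calendaring application.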

Examples Of Microformats

Popular examples of microformats include:

An example which illustrates the commercial takeup of the hCalendar microformat is its use with the World Cup 2006 fixture list [4]. This application allows users to choose their preferred football team. The fixtures are marked up using hCalendar and can be easily added to the user's calendaring application.

Limitations Of Microformats

Microformats have been designed to make use of existing standards such as HTML. They have also been designed to be simple to use and exploit. However such simplicity means that microformats have limitations:

Best Practices for Using Microformats

Despite their limitations microformats can provide benefits to the user community. However in order to maximise the benefits and minimise the risks associated with using microformats it is advisable to make use of appropriate best practices. These include:

References

  1. About Microformats, Microformats.org,
    <http://microformats.org/about/>
  2. Tails Export: Overview, Firefox Addons,
    <https://addons.mozilla.org/firefox/2240/>
  3. Google hCalendar,
    <http://greasemonkey.makedatamakesense.com/google_hcalendar/>
  4. World Cup KickOff,
    <http://www.worldcupkickoff.com/>
  5. Risk Assessment For The IWMW 2006 Web Site, UKOLN,
    <http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2006/risk-assessment/#microformats>

Briefing 101

Tangram Model For Web Accessibility


Background

This document describes a user-focussed approach to Web accessibility in which the conventional approach to Web accessibility (based on use of WAI WCAG guidelines) can be applied within a wider context.

Traditional Approach To Web Accessibility

The conventional approach to Web accessibility is normally assumed to be provided by implementation of the Web Content Accessibility Guidelines (WCAG) which have been developed by the Web Accessibility Initiative (WAI).

In fact the WCAG guidelines are part of a set of three guidelines developed by WAI, the other guidelines being the Authoring Tools Accessibility Guidelines (ATAG) and the User Agent Accessibility Guidelines (UAAG). The WAI approach is reliant on full implementation of these three sets of guidelines.

Limitations Of The WAI Approach

Although WAI has been a political success, with an appreciation of the importance of Web accessibility now widely acknowledged, and has provided a useful set of guidelines which can help Web developers produce more accessible Web sites, the WAI model and the individual guidelines have their flaws, as described by Kelly et al [1]:

The Tangram Model

Although the WAI approach has its flaws (which is understandable, as this was an initial attempt to address a very difficult area), it needs to be recognised that the WCAG guidelines are valuable. The challenge is to develop an approach which makes use of useful WCAG guidelines in a way which can be integrated with other areas of best practice (e.g. usability, interoperability, etc.) and provides a richer, more usable and more accessible experience for the target user community.

In the tangram model for Web accessibility (developed by Sloan, Kelly et al [3]) each piece in the tangram (see below left) represents guidelines in areas such as accessibility, usability, interoperability, etc. The challenge for the Web developer is to develop a solution which is 'pleasing' to the target user community (see below right).

Figure 1: The tangram model

The tangram model provides several benefits:

References

  1. Forcing Standardization or Accommodating Diversity? A Framework for Applying the WCAG in the Real World, Kelly, Sloan et al, Proceedings of the 2005 International Cross-Disciplinary Workshop on Web Accessibility (W4A),
    <http://www.ukoln.ac.uk/web-focus/papers/w4a-2005/>
  2. Developing A Holistic Approach For E-Learning Accessibility, Kelly, Phipps and Swift, CJLT 2004, 3(1),
    <http://www.ukoln.ac.uk/web-focus/papers/cjtl-2004/>
  3. Contextual Web Accessibility - Maximizing the Benefit of Accessibility Guidelines, Sloan, et al, Proceedings of the 2006 International Cross-Disciplinary Workshop on Web Accessibility (W4A),
    <http://www.ukoln.ac.uk/web-focus/papers/w4a-2006/>

Briefing 102

Web 2.0: Supporting Library Users


Introduction

Web 2.0 is described by Wikipedia as referring to "a second generation of services available on the web that lets people collaborate and share information online" [1].

Web 2.0 is essentially all about creating richer user experiences through providing interactive tools and services, which sit on top of static web sites. Web sites which utilise aspects of Web 2.0 are often personalisable, dynamically driven, and rich in community tools and sharing functions. The data from underlying systems, such as a Library Management System, can be exposed and shared, usually using XML. Web 2.0 developments are underpinned by open source software and open standards - and they often use widely available components such as RSS, blogging tools and social bookmarking services.

Recently, the term Library 2.0 has also come to prominence. Library 2.0 [2] follows the principles of Web 2.0, in that it promotes the evaluation and adoption of software and tools which were originally created outside of the Library environment. These are overlaid on traditional library services - such as the Library OPAC - in order to create a more dynamic, interactive and personalisable user experience.

Re-inventing the Library OPAC

Web 2.0 technologies can be used to vastly improve the user experience when searching the Library OPAC. Traditionally, many Library OPACs have been designed with librarians rather than users in mind, resulting in interfaces which are not intuitive or attractive to users.

An excellent example of Web 2.0 in action comes from Plymouth State University [3]. The Library OPAC has been redesigned to give it the look and feel of a blog. The home page consists of a new books feed, presented in an engaging and attractive fashion, in contrast to the more usual approach of providing a set of search options. Users can view items, or click on options to 'find more like this', 'comment' or 'get more details'. The site is powered by WPopac, which the developer describes as 'an OPAC 2.0 Testbed' [4]. WPopac is itself based around the WordPress blogging tool [5].

The site contains lots of other useful features, such as a list of the most popular books, recent search terms used by other users, the ability to tag items with user-chosen tags, user comments and book reviews. Many of these features are provided by the use of RSS feeds from the Library Management System itself. The tagging tools are provided by an external social tagging Web site called Technorati [6].

Book reviews and detailed descriptions are also provided from the Amazon Web site, and other Amazon services [7] such as 'search inside' and 'tables of contents' are also integrated into the site.

Another interesting approach to the OPAC comes from Ann Arbor District Library (a US public library), who have created an online 'wall of books' using book jacket images [8]. Each image in the 'wall' links back to an item record in the Library OPAC. This is a novel approach to presenting and promoting library services, using an attractive 'virtual' library display to entice people into the OPAC for further information.

Services Integration

Google have provided a number of solutions for Libraries wishing to encourage their users to access local electronic and print resources when searching in Google Scholar [9]. Libraries can link their collections to Google Scholar to ensure that users are directed to locally-held copies of resources. The OCLC WorldCat database has already been harvested by Google Scholar, so that all the records in this database (a large Union Catalogue) are searchable via Google Scholar.

Google is also offering its 'Library Links' programme which enables libraries using an OpenURL link resolver to include a link from Google Scholar to their local resources as part of the Google Scholar search results. Google Scholar users can personalise their searching by selecting their 'home' library as a preference. In order to set up integration with an OpenURL resolver, libraries simply need to export their holdings from their link resolver database and send this to Google. Once set up, Google will harvest new links from the link resolver database on an ongoing basis.

Library 'Mash-ups'

A mash-up can be described as a web site which uses content from more than one source to create a completely new service. One example of a tool which can be used to support 'mash-ups' is Greasemonkey [10]. Greasemonkey is an extension for the Firefox web browser which enables the user to add scripts to any external web page to change its behaviour. The use of scripts enables the user to easily control and re-work the design of the external page to suit their own specific needs.

The University of Huddersfield have used Greasemonkey to enable local information to be displayed alongside Amazon search results [11]. The information is drawn from their Library OPAC - for example, users can be shown whether a particular title is available in the Library at Huddersfield, and can link back to the Library OPAC to see the full record. If the book is out on loan, a due date can be shown. This approach encourages users back into the Library environment to use Library services rather than making a purchase through Amazon.

A New Approach to Resource Lists

Intute is a free online service managed by JISC for the UK higher education community [12]. It provides access to a searchable catalogue of quality-stamped web resources suitable for use in education. It was previously known as the RDN (Resource Discovery Network). Intute can be used by libraries to create their own local lists of web resources, thus reducing the need for libraries to maintain their own lists of useful Web sites for their users. Libraries can use the MyIntute service to create their own lists of resources drawn from the Intute catalogue. These can be updated on a weekly basis, if required, by creating weekly email alerts which identify new records added to the database which meet a stated set of criteria. The records can then be exported in order to present them on the libraries' own Web site, using local branding and look and feel. Library staff can add new records to the Intute database if they find useful resources which have not already been catalogued in Intute.

Supporting Users

Blogs are increasingly used by libraries as promotional, alerting and marketing tools, providing a useful method of promoting new services, alerting users to changes and offering advice and support.

The Library at the Royal College of Midwives [13] runs a blog to keep users informed about new service developments. Typical postings include information about new books received by the Library, pointers to content from electronic journals, news about new projects and services and other items which might usually be found in a Library newsletter. The advantage of the blog over a newsletter is that it is quick and easy to maintain and update, can therefore be kept more up-to-date, and can be integrated into the other services that the Library offers, through the Library Web site. Users can also choose to receive it as an RSS feed.

At the University of Leeds, Angela Newton runs a blog which is aimed at supporting information literacy developments [14]. Angela's blog is focused mainly at academic and support staff at Leeds who are interested in developing information-literate students. She focuses on a range of topics, such as assessment and plagiarism, academic integrity, use of information in education, research methods and e-learning tools. Angela's blog is one of a number of 'practitioner' blogs at Leeds, and has been well received by the e-learning community within the institution.

Another example is the "Univ of Bath Library Science News" service. In this case a blog service hosted at Blogspot.com provides "Updates for the Faculty of Science from your Subject Librarians: Linda Humphreys (Chemistry, Physics, Pharmacy & Pharmacology, Natural Sciences) Kara Jones (Biology & Biochemistry, Mathematics and Computing Sciences)" [15].

References

  1. Web 2.0, Wikipedia,
    <http://en.wikipedia.org/wiki/Web_2.0>
  2. Library 2.0, Wikipedia,
    <http://en.wikipedia.org/wiki/Library_2.0>
  3. Plymouth State University Library OPAC,
    <http://www.plymouth.edu/library/opac/>
  4. WPopac,
    <http://maisonbisson.com/blog/post/11133/>
  5. WordPress,
    <http://wordpress.org/>
  6. Technorati,
    <http://technorati.com/>
  7. Amazon Web Services,
    <http://www.amazon.com/b/104-7926800-8152762?ie=UTF8&node=3435361>
  8. Ann Arbor District Library wall of books,
    <http://www.superpatron.com/wall-of-books/aadl/aadl-fiction-20060322.html>
  9. Google Scholar Services for Libraries, Google,
    <http://scholar.google.com/scholar/libraries.html>
  10. Greasemonkey,
    <http://greasemonkey.mozdev.org/>
  11. Greasemonkey example,
    <http://library.hud.ac.uk/mediawiki/dughug2006/UsingFreeSoftware.zip>
  12. Intute,
    <http://www.intute.ac.uk/>
  13. Royal College of Midwives Library blog,
    <http://midwifery-info.blogspot.com/>
  14. Angela Newton's Information Literacy blog, University of Leeds,
    <https://elgg.leeds.ac.uk/libajn/weblog/>
  15. Univ of Bath Library Science News, University of Bath,
    <http://bathsciencenews.wordpress.com/>

Briefing 103

Web 2.0: Addressing the Barriers to Implementation in a Library Context


Introduction

Web 2.0 is described by Wikipedia as referring to "a second generation of services available on the web that lets people collaborate and share information online" [1].

Web 2.0 is essentially about creating richer user experiences through providing interactive tools and services. Web sites which utilise aspects of Web 2.0 are often personalisable, dynamically driven, and rich in community tools and sharing functions. The data from underlying systems, such as a Library Management System, can be exposed and shared, usually using XML. Web 2.0 developments are underpinned by open source software and open standards - and they often use widely available components such as RSS, blogging tools and social bookmarking services.

Recently, the term Library 2.0 has also come to prominence [2]. Library 2.0 follows the principles of Web 2.0, in that it promotes the evaluation and adoption of software and tools which were originally created outside of the Library environment. These are overlaid on traditional library services - such as the Library OPAC - in order to create a more dynamic, interactive and personalisable user experience.

Barriers to Implementation

Information professionals often express concern about a number of issues relating to implementation of Web 2.0 within their Library service.

Scalability

A significant concern centres on questions regarding the scalability of the Web 2.0 approach. Information professionals are usually concerned with finding institution-wide approaches to service provision - for example, in relation to reading lists it is important that a Library service is able to receive these from tutors in a timely fashion, that they are supplied in a consistent format and that they are made available in a standard and consistent way - perhaps through a central reading list management service with links to the Library Web site and the VLE. If other approaches start to be used it becomes difficult for these to be managed in a consistent way across the institution. For example, a tutor might decide to use a service such as Librarything [3] to create his or her own online reading list. Information professionals then face a potential headache in finding out about this list, synchronising it with other approaches across the institution, integrating it with other systems on campus such as the Library System, VLE or portal and presenting it in a consistent fashion to students.

Support Issues

Another concern centres on the supportability of Web 2.0 tools and services. Information professionals are often involved with training users to use library tools and services. They use a variety of approaches in order to achieve this, ranging from hands-on training sessions, through paper-based workbooks, to interactive online tutorials. Information professionals are often concerned that users will struggle to use new tools and services. They are keen to develop training strategies to support implementation of new services. With Web 2.0 this can be difficult, as users may be using a whole range of freely available tools and services, many of which the information professional may not themselves be familiar with. For example, if Tutor A is using Blogger [4] with her students, whereas Tutor B is using ELGG [5], information professionals may find themselves faced with expectations that support will be provided for a wide variety of approaches. Students might encounter a different set of tools for each module that they take, and the support landscape quite quickly becomes cluttered. Information professionals can start to feel that they are losing control of the environment in which they are training and supporting users, and may also start to feel uneasy at their own inability to keep up to speed with the rapid changes in technology.

Longevity

Information professionals are also concerned about the longevity of Web 2.0 services. By its very nature, Web 2.0 is a dynamic and rapidly moving environment. Many of the tools currently available have been developed by individuals who are committed to the open source and free software movement and who may not be backed by commercial funding; such individuals may lose interest and move on to other things once an exciting new piece of technology comes along. Some successful tools may end up being bought by commercial players, which might result in their disappearance or incorporation into a commercial charging model. It appears risky to rely on services which may disappear at any time, where no support contract is available and there is no guarantee that bugs will be fixed or any formal process for prioritising developments.

Commercialisation

Where Web 2.0 developments are backed by commercial organisations, this may also cause some concern. For example, Amazon provide many useful Web 2.0 services for enhancing the Library OPAC. However, information professionals may feel uneasy about appearing to be promoting the use of Amazon as a commercial service to their users. This might potentially damage relationships with on-campus bookshops, or leave the Library service open to criticism from users that the Library is encouraging students to purchase essential materials rather than ensuring sufficient copies are provided.

Web 2.0 technologies may also raise anxieties concerning strategy issues. Information professionals might worry that if Google Scholar [6] is really successful this would reduce use of Library services, potentially leading to cancellation of expensive bibliographic and full-text databases and a resulting decline in the perceived value of the Library within the institution. Library strategies for promoting information literacy might potentially be undermined by students' use of social tagging services which bypass traditional controlled vocabularies and keyword searching. The investment in purchase and set-up of tools such as federated search services and OpenURL resolvers might be wasted because users bypass these services in favour of freely available tools which they find easier to use.

Addressing the Barriers

Building On Web 2.0 Services

Information professionals can turn many of the perceived drawbacks of Web 2.0 to their advantage by taking a proactive approach to ensure that their own services are integrated with Web 2.0 tools where appropriate. Google, for example, offers a 'Library Links' programme which enables libraries using an OpenURL link resolver to include a link from Google Scholar to their local resources as part of the Google Scholar search results. Google Scholar users can personalise their searching by selecting their 'home' library as a preference. In order to set up integration with an OpenURL resolver, libraries simply need to export their holdings from their link resolver database and send this to Google. Once set up, Google will harvest new links from the link resolver database on an ongoing basis. By using such tools, the information professional can put the Library back in the picture for users, who can then take advantage of content and services that they might not otherwise have come across because they had by-passed the Library Web site.

Working With Vendors

It is also important to work with Library Systems vendors to encourage them to take an open approach to Web 2.0 integration. Many Library Systems vendors are starting to use Web 2.0 approaches and services in their own systems. For example, the Innovative Interfaces' LMS [7] now provides RSS feed capability. RSS feeds can be surfaced within the OPAC, or can be driven by data from the Library system and surfaced in any other Web site. This provides a useful service for ensuring greater integration between Library content and other systems such as institutional VLEs or portals. Users can also utilise RSS feeds to develop their own preferred services. Information professionals should lobby their LMS partners to support a more open standards approach to service development and integration, and should ensure that they don't get too constrained by working with one particular vendor.

Working With National Services

Use of national services can also help to ensure that a more sustainable approach is taken to Web 2.0 developments. For example Intute [8], a nationally funded service working closely with the higher education community, can provide a range of Web 2.0 services which libraries can utilise in enhancing their own local services, with the advantage that the service is funded on a long-term basis and support can be provided by a dedicated team of staff employed full-time to resolve issues, develop services and fix bugs.

Working With Peers

Information professionals also need to work together to share ideas and experiences, implement developments and learn from each other. There are already lots of good examples of this kind of sharing of ideas and expertise - for example, Dave Pattern's blog at the University of Huddersfield [9] provides a useful resource for those interested in implementing Web 2.0 services in a library context. It is also important that information professionals are willing to work across professions - implementation of Web 2.0 services requires the contribution of IT professionals, e-learning specialists and academics.

Trying It

Finally, information professionals need to be willing to get their hands dirty and take risks. Web 2.0 is often concerned with rapid deployment of services which may still be in a beta state. The key is to get things out quickly to users, then develop what works and learn from things that don't. This requires a willingness to be flexible, a confidence in users' ability to cope with rapid change without requiring a lot of hand-holding and support, and the courage to step back from a particular approach that hasn't worked and move on to other things. Not everything will be successful or enthusiastically taken up by your users, and you need to be prepared to cut your losses if something does not work. Users' views should be actively sought, and they should be involved in the development and testing process wherever possible.

References

  1. Web 2.0, Wikipedia,
    <http://en.wikipedia.org/wiki/Web_2.0>
  2. Library 2.0, Wikipedia,
    <http://en.wikipedia.org/wiki/Library_2.0>
  3. Librarything,
    <http://www.librarything.com/>
  4. Blogger,
    <http://www.blogger.com/start>
  5. ELGG,
    <http://elgg.net/>
  6. Google Scholar Services for Libraries,
    <http://scholar.google.com/scholar/libraries.html>
  7. Innovative Interfaces Inc,
    <http://www.iii.com/>
  8. Intute,
    <http://www.intute.ac.uk/>
  9. Dave Pattern's blog,
    <http://www.daveyp.com/blog/>

Briefing 104

Guide To The Use Of Wikis At Events


About This Document

This document describes how Wiki software can be used to enrich events such as conferences and workshops. The document provides examples of how Wikis can be used; advice on best practices and details of a number of case studies.

Use of Wikis at Events

Many events used to support the development of digital library services nowadays take place in venues in which a WiFi network is available. Wikis (Web-based collaborative authoring tools) are well suited to exploiting such networks in order to enrich such events in a variety of ways.

Examples of how Wikis can be used at events include:

Note-taking at discussion groups:
Wikis are ideal for use by reporters in discussion groups. They are easy to use and ensure that, unlike use of flip charts or desktop applications, the notes are available without any further processing being required.
Social support before & during the event:
Wikis can be used prior to or during events, to enable participants to find people with similar interests (e.g. those travelling from the same area; those interested in particular social events; etc.).

Best Practices

Issues which need to be addressed when planning a Wiki service at an event include:

AUP
It is advisable to develop an Acceptable Use Policy (AUP) covering use of the Wiki and other networked services. The AUP may cover acceptable content, responsibilities and preservation of the data.
Local or Hosted Services
There will be a need to decide whether to make use of a locally-hosted service or choose a third party service. If the latter option is preferred there is a need to decide whether to purchase a licensed service or use a free service (which may have restricted functionality). In all cases there will be a need to establish that the Wiki software provides the functionality desired.
Registration Issues
You will need to decide whether registration is needed in order to edit the Wiki. Registration can help avoid misuse of the Wiki, but registration, especially if it requires manual approval, can act as a barrier to use of the Wiki.
Design Of The Wiki
Prior to the event you should establish the structure and appearance of the Wiki. You will need to provide appropriate navigational aids (e.g. the function of a 'Home' link; links between different areas; etc.).
After The Event
You will need to develop a policy on use of the Wiki after the event. You may, for example, wish to copy contents from the Wiki to another area (especially if the Wiki is provided by a third party service).

Examples Of Usage

A summary of the use of Wikis at several UKOLN events is given below.

Table 1: Use of Wikis at UKOLN Events

Joint UKOLN/UCISA Workshop, Nov 2004: A workshop on "Beyond Email: Strategies For Collaborative Working In The 21st Century" was held in Leeds in Nov 2004. The four discussion groups each used a Wiki provided by the externally-hosted Wikalong service. The Wiki pages were copied to the UKOLN Web site after the event. It was noted that almost 2 years after the event the original pages were still intact (although link spam was subsequently found). See <http://www.ukoln.ac.uk/web-focus/events/workshops/ucisa-wlf-2004-11/>

IWMW 2005 Workshop, Jul 2005: Wikalong pages were set up for the discussion groups at the Institutional Web Management Workshop 2005. As an example of the usage see <http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2005/discussion-groups/south-east/>

Joint UKOLN / CETIS / UCISA Workshop, Feb 2006: A workshop on Initiatives & Innovation: Managing Disruptive Technologies was held at the University of Warwick in Feb 2006. The MediaWiki software was installed locally for use by the participants to report on the discussion groups. See <http://www.ukoln.ac.uk/web-focus/events/workshops/ucisa-ukoln-cetis-2006/wiki/>.

ALT Research Seminar 2006: The licensed Jot Wiki service was used during a day's workshop on Emerging Technologies and the 'Net generation'. The notes kept on the Wiki were used to form the basis of a white paper. See <http://altspring.jot.com/WikiHome>.

IWMW 2006 Workshop, Jun 2006: The MediaWiki software was used by the participants at the Institutional Web Management Workshop 2006 to report on the discussion groups and various sessions. It was also available for use prior to the event. See <http://www.ukoln.ac.uk/interop-focus/community/index/IWMW2006>

 


Briefing 105

Use Of Social Tagging Services At Events


About This Document

This document describes how social tagging services can be used to enrich events such as conferences and workshops. The document provides examples of how social tagging services can be used; advice on best practices and details of a number of case studies.

Use Of Social Tagging Services At Events

Social tagging services, such as social bookmarking services (e.g. del.icio.us) and photo sharing services (e.g. Flickr) would appear to have much to offer events such as conferences and workshops, as such events, by definition, are intended for groups of individuals with shared interests.

Since many events used to support the development of digital library services nowadays take place in venues in which a WiFi network is available, such events are ideal for exploiting the potential of social tagging services.

Examples of how social tagging can be used at events include:

Providing Links To Resources:
Making use of social bookmarking services such as del.icio.us enables resources mentioned in presentations to be more easily accessed (no need for speakers to waste time spelling out long URLs).
Finding Related Resources:
Use of social bookmarking services such as del.icio.us enables resources related to those provided by the speaker to be more easily found.
Contributing Related Resources:
Use of social bookmarking services such as del.icio.us enables participants to provide additional resources related to those provided by the speaker.
Evaluation And Impact Analysis:
Use of a standard tag for an event can enable Blog postings, photographs and other resources related to the event to be quickly found, using services such as Technorati. This can assist evaluation and impact analysis for an event.
Community Building:
Use of photo sharing services such as Flickr can enable participants at an event to share their photographs of the event.

Best Practices

Issues to be addressed when using social networking services at events include:

Selecting The Services
You should inform participants of the recommended services and, ideally, provide an example of the benefits to the participants, especially if they are unfamiliar with them.
Defining Your Tags
You should provide a recommended tag for your event, and possibly a format for use of the tag if more than one tag is to be used. You should try to ensure that your tag is unambiguous and avoids possible clashes with other uses of the tag. Note that use of dates can help to disambiguate tags.
Be Aware Of Possible Dangers
You should be aware of possible dangers and limitations of the services, and inform your participants of such limitations. This may include possible spamming of the services; data protection and privacy issues; long term persistence of the services; etc.

Case Study - IWMW 2006

The Institutional Web Management Workshop 2006, held at the University of Bath on 14-16th June 2006, made extensive use of social bookmarking services:

Standardised Tags
The tag iwmw2006 was recommended for use in social bookmarking services. Tags for plenary talks and parallel sessions had the format iwmw2006-plenary-speaker and iwmw2006-parallel-facilitator.
del.icio.us
The IWMW 2006 Web site was bookmarked on del.icio.us (this allowed the event organisers to observe others who had bookmarked the Web site in order to monitor communities of interest). In addition, speakers and workshop facilitators were invited to use del.icio.us for key resources mentioned in their talks.
Flickr
Participants who took photographs at the event were encouraged to tag the photographs with the tag iwmw2006 if they uploaded their photos to services such as Flickr.
Event Evaluation and Impact Assessment Using Technorati
The impact of the workshop was evaluated by using the Technorati service to find Blog postings which made use of the iwmw2006 tag.
Aggregation Using Suprglu
The various social bookmarking services which used the recommended tag were aggregated by the Suprglu service at the URL <http://iwmw2006.suprglu.com/>.

Briefing 106

Exploiting Networked Applications At Events


About This Document

Increasingly WiFi networks are available in lecture theatres [1]. With greater ownership of laptops, PDAs, etc. we can expect conference delegates to make use of the networks. There is a danger that this could lead to possible misuse (e.g. accessing inappropriate resources; reading email instead of listening; etc.). This document describes ways in which a proactive approach can be taken in order to exploit such networks to enhance learning at events. The information in this document can also be applied to lectures aimed at students.

Design Of PowerPoint Slides

A simple technique when PowerPoint slides are used is to make the slides available on the Web and embed hypertext links in the slides (as illustrated). This allows delegates to follow links which may be of interest.

Figure: Use of PowerPoint

Providing access to PowerPoint slides can also enhance the accessibility of the slides (e.g. visually impaired delegates can zoom in on areas of interest).

Using Bookmarking Tools

Social bookmarking tools such as del.icio.us can be used to record details of resources mentioned in talks [2] and to allow new resources to be added and shared.

Realtime Discussion Facilities

Providing discussion facilities such as instant messaging tools (e.g. MSN Messenger, Jabber or Gabbly) can enable groups in the lecture theatre to discuss topics of interest.

Support For Remote Users

VoIP (Voice over IP) software (such as Skype) and related audio and video-conferencing tools can be used to allow remote speakers to participate in a conference [3] and also to allow delegates to listen to talks without being physically present.

Using Blogs And Wikis

Delegates can make use of Blogs to take notes. This approach is increasingly used at conferences, especially those with a technical focus, such as IWMW 2006 [4]. Note that Blogs are normally used by individuals. In order to allow several Blogs related to the same event to be brought together it is advisable to make use of an agreed tag [5].

Unlike Blogs, Wikis are normally used in a collaborative way. They may therefore be suitable for use by small groups at a conference [6]. An example of this can be seen at the WWW 2006 conference [7].

Challenges

Although WiFi networks can provide benefits there are several challenges to be addressed in order to ensure that the technologies do not act as a barrier to learning.

User Needs
Although such technologies have proved successful at technology-focussed events, the benefits may not apply more widely. There is a need to be appreciative of the event environment and culture. There may also be a need to provide training in use of the technologies.
AUP
An Acceptable Use Policy (AUP) should be provided covering use of the technologies.
Performance Issues, Security, etc.
There is a need to estimate the bandwidth requirements, etc. in order to ensure that the technical infrastructure can support the demands of the event. There will also be a need to address security issues (e.g. use of firewalls; physical security of laptops, etc.).
Equal Opportunities
Not all delegates will possess a networked device, so care should be taken to ensure that delegates without such access are not disenfranchised.

References

  1. Using Networked Technologies To Support Conferences, Kelly, B. et al, EUNIS 2005,
    <http://www.ukoln.ac.uk/web-focus/papers/eunis-2005/paper-1>
  2. Use Of Social Tagging Services At Events, QA Focus briefing document no. 105,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-105/>
  3. Interacting With Users, Remote In Time And Space, Phipps, L. et al, SOLSTICE 2006,
    <http://www.ukoln.ac.uk/web-focus/events/conferences/solstice-2006/>
  4. Workshop Blogs, IWMW 2006,
    <http://www.ukoln.ac.uk/web-focus/events/workshops/webmaster-2006/blogs/>
  5. Use Of Social Tagging Services At Events, QA Focus briefing document no. 105,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-105/>
  6. Guide To The Use Of Wikis At Events, QA Focus briefing document no. 104,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-104/>
  7. WWW2006 Social Wiki, WWW 2006,
    <http://www2006.org/wiki/w/Main_Page>

Briefing 107

Guidelines For Exploiting WiFi Networks At Events


About This Document

Increasingly WiFi networks are available in lecture theatres, conference venues, etc. We are beginning to see various ways in which networked applications are being used to enhance conferences, workshops and lectures [1].

Availability Of The Network

If you are considering making use of a WiFi network to support an event you will need to establish (a) whether a WiFi network is available; (b) the costs, if any, for use of the network; and (c) any limitations on use of the network. Note that even if a WiFi network is available, usage may be restricted (e.g. to academic users; local users; etc.)

Demand From The Participants

There may be a danger in being driven by the technology (just because a WiFi network is available does not necessarily mean that the participants will want to make use of it). Different groups may have differing views on the benefits of such technologies (e.g. IT-focussed events or international events attracting participants from North America may be particularly interested in making use of WiFi networks).

If significant demand for use of the WiFi network is expected you may need to discuss this with local network support staff to ensure that (a) the network has sufficient bandwidth to cope with the expected traffic and (b) other networked services have sufficient capacity (e.g. servers handling logins to the network).

Proactive Or Reactive Approach?

You may choose to provide details of how to access the WiFi network and leave the participants to make use of it as they see fit. Alternatively you may wish to manage the way in which it is used, and provide details of networked applications to support the event, as described in [2], [3] and [4].

Financial And Administrative Issues

If there is a charge for use of the network you will have to decide how this should be paid for. You may choose to let the participants pay for it individually. Alternatively the event organisers may choose to cover the costs.

You will also have to set up a system for managing usernames and passwords for accessing the WiFi network. You may allocate usernames and passwords as participants register or they may have to sign a form before receiving such details.

Support Issues

There will be a need to address the support requirements to ensure that effective use is made of the technologies.

Participants
There may be a need to provide training and to ensure participants are aware of how the networked technologies are being used.
Event Organisers, Speakers, etc.
Event organisers, chairs of sessions, speakers, etc. should also be informed of how the networked technologies may be used and may wish to give comments on whether this is appropriate.
AUP
An Acceptable Use Policy (AUP) should be provided which addresses issues such as privacy, copyright, distraction, policies imposed by others, etc.
Evaluation
It would be advisable to evaluate use of technologies in order to inform planning for future events.

Physical And Security Issues

You will need to address various issues related to the venue and the security of computers. For example, you may need to provide advice on where laptop users should sit (often next to a power supply and possibly away from people who do not wish to be distracted by noise). There will also be health and safety issues to consider. There will also be issues regarding the physical security of computers and the security against viruses, network attacks, etc.

References

  1. Using Networked Technologies To Support Conferences, Kelly, B. et al, EUNIS 2005,
    <http://www.ukoln.ac.uk/web-focus/papers/workshops/eunis-2005/paper-1>
  2. Exploiting Networked Applications At Events, QA Focus briefing document no. 106,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-106/>
  3. Use Of Social Tagging Services At Events, QA Focus briefing document no. 105,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-105/>
  4. Guide To The Use Of Wikis At Events, QA Focus briefing document no. 104,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-104/>

Briefing 108

An Introduction to Secure Web Practice


About This Document

Since the early years of the Web, the process of designing and building a Web site has changed. The availability of pre-packaged software for many common tasks - content management, blogging, forum systems and so forth - has improved. Many third party services are now available. Web frameworks of varying complexity are available in almost any common programming language. Templating systems are commonplace - sites making use of advanced features such as AJAX functionality will often make use of a framework to simplify design. The fact that the design and complexity of the tools has increased does not mean that security is now out of the hands of the site developer. Some frameworks explicitly handle certain security issues, but it is good practice to work with security in mind. This document provides guidelines on some of these issues.

Platform Security Issues

Every component in your web site, including the Web server and underlying frameworks or platforms, may suffer from security flaws. As they are discovered, the developers behind that software package will issue advisories regarding security flaws, as well as software patches, upgrades or workarounds. Ongoing maintenance of a web site involves keeping an eye out for advisories, perhaps on appropriate mailing lists or RSS newsfeeds, and prompt action when such issues are discovered. Remember to plan for this essential ongoing maintenance work in your budget. Note that using popular components is likely to help security - issues are discovered and fixed quickly.

User Authentication

In general, user authentication is a difficult piece of functionality to write from scratch. If your project parameters permit you to use an existing system, or to tie your authentication into an existing mechanism such as PHP's sessioning, consider that option. Ideally, do not store user passwords in plain text, because of the possibility that the passwords could be retrieved and used to compromise users' accounts elsewhere - in practice, users often reuse one password across several systems or sites. Make use of a hashing function, such as an MD5 checksum [2], and store the result instead.
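
A minimal sketch of this approach in Python is shown below, using an illustrative in-memory user table; the function names are invented for the example, and in current practice a salted, stronger algorithm such as SHA-256 would normally be preferred over MD5:

import hashlib

users = {}   # illustrative in-memory store: username -> hex digest of password

def store_password(username, password):
    # Store only the hash of the password, never the plain text.
    # (A salted, stronger hash such as SHA-256 would be preferred today.)
    users[username] = hashlib.md5(password.encode("utf-8")).hexdigest()

def check_password(username, password):
    # Hash the supplied password and compare it with the stored digest.
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return users.get(username) == digest

store_password("alice", "correct horse battery staple")
print(check_password("alice", "wrong password"))                  # False
print(check_password("alice", "correct horse battery staple"))    # True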

User Input

Many security flaws result from the assumption that user input is trustworthy, and that it contains what the programmer intended the user to input. Others result from the use of client-side code, such as JavaScript, to check the user's input before it is posted. The client's browser is not a secure environment - the user can alter browser behaviour. Even if client-side code is used, the check must be run again on the server.

Examples of Common Vulnerabilities

A discussion of common vulnerabilities, in the context of eCommerce applications, can be found in [1].

Conclusions

Make security a priority at all stages in the design and build process. Plan for maintenance in your budget. When specifying an application, page or component that takes user input, write down a list of possible vulnerabilities and ensure that you have addressed these during the design and build phases. Test your Web applications - provide unexpected input, and see if you can find any way to break them. Provide custom error pages - diagnostic information on errors can be useful for malicious users. Speak to a security expert if possible, and ensure that you have taken reasonable precautions. A secure application can only be ensured by an ongoing commitment to that aim.

References

  1. Common Security Vulnerabilities in eCommerce Applications, K. K. Mookhey,
    <http://www.securityfocus.com/infocus/1775>
  2. Consider MD5 checksums for application passwords, Scott Stephens,
    <http://www.builderau.com.au/architect/soa/Consider_MD5_checksums_for_application_passwords/0,39024564,39130325,00.htm>

Briefing 109

Web Security, Services and AJAX


About This Document

Web security is an ongoing concern of many - site developers, browser developers and everyday users. Some of the basics of Web security have been covered in a previous briefing paper [1]. However, the increasing use of Web Services and APIs in various contexts has led to some novel security concerns, as well as the resurgence of several older issues. This document will discuss a few of the security issues brought up by the use of AJAX and 'Web 2.0' service APIs.

Cross-domain Requests

Since Web services are not hosted locally, consuming data from such services involves making requests to services on other domains. Default security settings in JavaScript disallow this: most browsers will not permit such requests, for legitimate security reasons. If developers wish to overcome this restriction, they typically create a proxy for that service - a server-side script running on their local domain that forwards the request and returns the response. Proxies must be secured against unauthorised use. Many services will ask for an API key or password; be careful with this information and do not make it visible in the JavaScript of your page.
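
As a rough illustration, such a proxy might look like the following minimal Python sketch; the remote service URL, API key and query parameter are invented for the example and would need to be replaced with those of the real service:

import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

API_KEY = "secret-key"                      # assumption: key issued by the remote service
REMOTE = "https://api.example.org/search"   # hypothetical remote Web service

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Accept only the query parameter we expect, rather than forwarding blindly.
        params = urllib.parse.parse_qs(urllib.parse.urlparse(self.path).query)
        query = params.get("q", [""])[0][:200]
        # The API key stays on the server; it never appears in page JavaScript.
        url = REMOTE + "?" + urllib.parse.urlencode({"q": query, "key": API_KEY})
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ProxyHandler).serve_forever()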

Multi-layered Security

Do not consider the browser as a secure environment; users can read JavaScript - even if it has been obfuscated - and alter it as it runs on their machine. In fact, they can tailor requests that do not make use of the page at all. Therefore, ensure that all the crucial business logic takes place on the server, rather than the browser. You may wish to perform checks in the browser as well, to avoid unnecessary cross-traffic between client and server or for usability and interface reasons, but even though this means that the logic is duplicated, it is important to perform the checks again on the server when processing input requests.
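
A minimal Python sketch of repeating such checks on the server, with invented field names and rules, might look like this:

import re

def validate_order(form):
    # Never trust client-supplied values: re-check formats and ranges on the
    # server even if identical checks already run in browser JavaScript.
    errors = {}
    email = form.get("email", "")
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        errors["email"] = "invalid email address"
    try:
        quantity = int(form.get("quantity", ""))
        if not 1 <= quantity <= 100:
            errors["quantity"] = "quantity out of range"
    except ValueError:
        errors["quantity"] = "quantity must be a whole number"
    return errors

# Usage: reject the request if validate_order(submitted_form) returns any errors.
print(validate_order({"email": "user@example.org", "quantity": "3"}))    # {}
print(validate_order({"email": "not-an-address", "quantity": "many"}))   # two errors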

According to research by Gartner Inc, 70% of attacks occur via the application layer [2]. Before deploying AJAX, consider the risks and ensure that your project is ready and able to meet the commitment required to complete the task securely. Access control in particular should be considered carefully, as should communications channels - remember that by default XMLHTTPRequest does not encrypt the data it transmits.

Bandwidth and Speed

Several applications of AJAX have been designed to provide almost real-time input or feedback to the user, or for continuous small updates to be made to a page. When putting together an application using AJAX or frequent service calls in general, add up the amount of bandwidth that will be used during a typical interaction or use of the service.
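
A rough back-of-envelope estimate can be made along the following lines (the figures below are purely illustrative, not measurements of any particular service):

# Illustrative estimate: a page that polls every 5 seconds with a 2 KB
# response, for 300 concurrent users.
poll_interval_s = 5
response_kb = 2
users = 300
kb_per_second = users * response_kb / poll_interval_s   # 120 KB/s sustained
print(f"{kb_per_second:.0f} KB/s, ~{kb_per_second * 3600 / 1024:.0f} MB per hour")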

Fallback Plans

If your service is slow or unavailable - or JavaScript is turned off in the browser - the majority of Web applications should still be able to function. The only exception to this is the relatively small subset of applications whose functionality depends directly on input taken from services, such as mash-ups based around maps taken from Google Earth.

Managing Complexity

Because AJAX applications share the business logic between the server and the client, the developers responsible for each element need to work together closely. Developers dealing with technology relatively new to them will need time to produce successive prototypes, since this is a learning process with few authoritative references from which to work.

Conclusions

As with traditional Web applications [3], make security a priority, and plan for maintenance in your budget. When specifying an application, page or component that takes user input, write down a list of possible vulnerabilities and ensure that you have addressed these during the design and build phases. Test your Web applications - both the JavaScript layer and the server-side layer(s). Speak to a security expert if possible, and ensure that you have taken reasonable precautions. A secure application can only be ensured by an ongoing commitment to that aim.

Finally, be conservative with application of novel technologies, and consider a 'feature-freeze' and testing cycle before deployment. New features will also require testing before deployment, so plan to add features only when there is time to audit the result.

References

  1. An Introduction to Secure Web Practice, QA Focus briefing document no. 108,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-108/>
  2. AJAX Security, Stewart Twynham, IT Observer,
    <http://www.it-observer.com/articles/1062/ajax_security/>
  3. AJAX Threats Worry Researchers, Bill Brenner,
    <http://searchsecurity.techtarget.com/originalContent/0,289142,sid14_gci1207759,00.html>

Briefing 110

Publish On Demand


About Publish On Demand

Publish on demand, also known as print on demand or POD, is an increasingly common and accessible alternative (or supplement) to traditional publishing methods. Publish on demand (note that the term Print On Demand is a trademark of Cygnus Business Media, Inc. [1]) became possible due to the development of equipment capable of economically printing very small print runs of materials in book form. This briefing document discusses technical prerequisites and restrictions and provides an overview of the advantages and disadvantages of the methodology.

How does it work?

POD systems make use of digital technologies to enable an all-digital printing system. This reduces initial setup costs, meaning that small print runs can be published at a reasonable unit cost. The technology becomes less cost-effective at large volumes, making POD systems particularly economical for publications with low or limited demand, or where the predictability of market demand is fairly low. The minimal setup costs also mean that publications can be made available on a speculative basis.
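
As a purely illustrative comparison (all figures are invented), the per-copy cost of a conventional run with a fixed setup charge can be set against a flat POD unit cost:

# Invented figures: conventional run has a setup charge plus a low per-copy
# cost; POD has no setup charge but a higher per-copy cost.
setup_cost, unit_cost_offset = 400.0, 1.50
unit_cost_pod = 4.50
for copies in (50, 200, 1000):
    offset = (setup_cost + unit_cost_offset * copies) / copies
    print(f"{copies} copies: conventional {offset:.2f} per copy, POD {unit_cost_pod:.2f} per copy")
# At small volumes POD is cheaper per copy; the conventional run wins as volume grows.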

A 'Long-Tail' Technology

The advent of POD also coincides with recent discussions characterised by the phrase 'the long tail'. This phrase refers to the large number of moderately popular or less popular publications that are interesting to a moderately large segment of the population, but are not bestsellers or chart hits [2]. Such content is often very difficult to get hold of, since demand is too low for it to be marketed effectively according to the traditional economics of large-scale publishing. Small-scale publication - the print-on-demand methodology - reduces this barrier, meaning that content can be digitally stored, browsed and made available for sale on virtual shopfronts, with a physical object such as a book or a CD printed for the buyer.

Digital browsing and distribution also mean that physical limits, such as the size of a shopfront and the amount of space available in the warehouse for storing a large print run of books, become a less significant factor in the decision of whether to publish.

Technical Prerequisites

Popular POD services include Xlibris, Lulu [3] and Blurb. Each has its own process; Blurb, for example, specialises in printing from Web applications such as blogs and wikis, whilst Lulu provides a number of modes of use - essentially, most PDF files with embedded fonts can be printed using the Lulu publishing system. Customised covers and so forth can also be created, although simple and effective defaults are also available. Several book formats are available, ranging from bound A4 to British or American paperback novel formats.

Advantages and Disadvantages

There remains a stigma associated with the use of print-on-demand for certain types of content, particularly fiction. It is associated in many people's minds with the use of small presses or vanity presses, which to many have the reputation of preying on the unwary novelist, demanding large setup fees for small print runs and providing no help in terms of marketing or distributing the work. ISBNs need to be bought separately, and distribution difficulties exist for works without an ISBN assigned to them. However, there are many valid uses of print-on-demand, ranging from publishing of local-interest or specialist books (particularly relevant, therefore, for institutions such as museums) to print publication of conference proceedings.

Conclusions

Affordable small-scale publishing has opened many doors for the smaller organisation or the individual. However, whilst a useful methodology, other publication methodologies should be considered in parallel - in particular, there are many potential difficulties and responsibilities that should be considered in terms of distribution and marketing, which are not handled by POD.

References

  1. Print on Demand, Wikipedia,
    <http://en.wikipedia.org/wiki/Print_on_Demand>
  2. The Long Tail, Chris Anderson,
    <http://www.wired.com/wired/archive/12.10/tail_pr.html>
  3. Lulu basics, Lulu,
    <http://www.lulu.com/>

Briefing 111

Layout Testing with Greeked Pages


Background

Page layout, content and navigation are not always designed at the same time. It is often necessary to work through at least part of these processes separately. As a result, it may not be possible to test layouts with realistic content until a relatively late stage in the design process, meaning that usability problems relating to the layout may not be found at the appropriate time.

Various solutions exist for this problem. One is the possibility of testing early prototype layouts containing 'greeked' text - that is, the 'lorem ipsum' placeholder text commonly used for layout design [1]. A method for testing the recognisability of page elements was discussed in Nielsen's Alertbox back in 1998 [2], though the concept originated with Thomas S. Tullis [3].

Technique

Testing will require several users - around six is helpful without being excessively time-consuming. Ensure that they have not seen or discussed the layouts before the test! First, create a list of elements that should be visible on the layout. Nielsen provides a list of nine standard elements that are likely to be present on all intranet pages - but in your particular case you may wish to alter this list a little to encompass all of the types of element present on your template.

Give each test user a copy of each page - in random sequence, to eliminate any systematic error that might result from carrying the experience with the first page through to the second. Ask the test user to draw labelled blocks around the parts of the page that correspond to the elements you have identified. Depending on circumstances, you may find that encouraging the user to 'think aloud' may provide useful information, but be careful not to 'lead' the user to a preferred solution.

Finally, ask the user to give a simple mark out of ten for 'appeal'. This is not a very scientific measure, but is nonetheless of interest since this allows you to contrast the user's subjective measure of preference against the data that you have gathered (the number of elements correctly identified). Nielsen points out that the less usable page is often given a higher average mark by the user.

Scoring The Test

With the information provided, draw a simple table:

Layout   Correctly Identified Page Elements   Subjective Appeal
1        N% (e.g. 65%)                        # (e.g. 5/10)
2        M% (e.g. 75%)                        # (e.g. 6/10)

This provides you with a basic score. You will probably also find your notes from think-aloud sessions to be very useful in identifying the causes of common misunderstandings and recommending potential solutions.
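
As a minimal illustration of how the 'correctly identified' percentage might be calculated, assuming six test users and nine elements on the template (the counts below are invented):

# Each entry is the number of page elements a test user identified correctly,
# out of the nine elements listed for this template.
elements_on_template = 9
correct_per_user = [7, 6, 8, 5, 7, 6]        # six test users
percent = 100 * sum(correct_per_user) / (elements_on_template * len(correct_per_user))
print(f"Correctly identified page elements: {percent:.0f}%")   # about 72%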

When Should Page Template Evaluation Be Carried Out?

This technique can be applied on example designs, so there is no need to create a prototype Web site; interface ideas can be mocked up using graphics software. These mockups can be tested before any actual development takes place. For this reason, the template testing approach can be helpful when commissioning layout template or graphical design work. Most projects will benefit from a user-centred design process, an approach that focuses on supporting every stage of the development process with user-centred activities, so consider building approaches like this one into your development plans where possible.

Conclusions

If a developing design is tested frequently, most usability problems can be found and solved at an early stage. The testing of prototype page layouts is a simple and cheap technique that can help to tease out problems with page layout and visual elements. Testing early and often can save money by finding these problems when they are still cheap and simple to solve.

It is useful to make use of various methods of usability testing during an iterative design and development cycle, since the various techniques often reveal different sets of usability problems - testing a greeked page template allows us to separate the usability of the layout itself from the usability of the content that will be placed within that layout [2]. It is also important to evaluate issues such as content, navigation mechanisms and page functionality, by means such as heuristic evaluation and the cognitive walkthrough - see QA Focus documents on these subjects [4] [5]. Note that greeked template testing does address several usability heuristics: 'Aesthetic and minimalist design' and 'Consistency and standards' are important factors in creating a layout that scores highly on this test.

Finally, running tests like this one can help you gain a detailed understanding of user reactions to the interface that you are designing or developing.

References

  1. Lorem Ipsum Generator,
    <http://lorem-ipsum.perbang.dk/>
  2. Testing Greeked Page Templates, Jakob Nielsen,
    <http://www.useit.com/alertbox/980517.html>
  3. A method for evaluating Web page design concepts, T.S. Tullis. In ACM Conference on Computer-Human Interaction CHI 98 Summary (Los Angeles, CA, 18-23 April 1998), pp. 323-324.
  4. Introduction To Cognitive Walkthroughs, QA Focus briefing document no. 87,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-87/>
  5. Heuristic Evaluation, QA Focus briefing document no. 89,
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-89/>

Briefing 112

An Introduction To Mashups


What Is A Mashup?

Wikipedia defines a mashup as "a web application that combines data from more than one source into a single integrated tool" [1]. Many popular examples of mashups make use of the Google Map service to provide a location display of data taken from another source.

Technical Concepts

As illustrated in a video clip on "What Is A Mashup?" [2], from a programmer's perspective a mashup is based on making use of APIs (application programming interfaces). In a desktop PC environment, application programmers make use of operating system functions (e.g. drawing a shape on a screen, accessing a file on a hard disk drive, etc.) to provide common functions within the application they are developing. A key characteristic of Web 2.0 is the notion of 'the network as the platform'. APIs provided by Web-based services (such as those from companies such as Google and Yahoo) can similarly be used by programmers to build new services, based on popular functions those companies provide. APIs are available for, for example, the Google Maps service and the del.icio.us social bookmarking service.

Creating Mashups

Many mashups can be created by simply providing data to Web-based services. As an example, the UK Web Focus list of events is available as an RSS feed as well as a plain HTML page [3]. The RSS feed includes simple location data of the form:

<geo:lat>51.752747</geo:lat>
<geo:long>-1.267138</geo:long>

This RSS feed can be fed to mashup services, such as the Acme.com service, to provide a location map of the talks given by UK Web Focus, as illustrated.

Figure 1: Mashup Of Location Of UK Web Focus Events
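
As a simple illustration of how such geo-tagged data might be consumed programmatically, the following self-contained Python sketch extracts the latitude and longitude from an RSS item of the form shown above (the item itself is a made-up example; a real feed would be fetched over HTTP first):

import xml.etree.ElementTree as ET

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"   # W3C geo vocabulary namespace
rss = """<rss><channel><item xmlns:geo="{ns}">
  <title>Example talk, Oxford</title>
  <geo:lat>51.752747</geo:lat>
  <geo:long>-1.267138</geo:long>
</item></channel></rss>""".format(ns=GEO)

root = ET.fromstring(rss)
for item in root.iter("item"):
    lat = item.findtext("{%s}lat" % GEO)
    lon = item.findtext("{%s}long" % GEO)
    print(item.findtext("title"), lat, lon)   # -> Example talk, Oxford 51.752747 -1.267138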

Tools For The Developer

More sophisticated mashups will require programming expertise. The mashup illustrated, which shows the location of UK Universities and data about the Universities [4], is likely to require access to a backend database.

Figure 2: A Google Maps Mashup Showing Location and Data About UK Universities

However, tools are being developed which will allow mashups to be created by people who may not consider themselves to be software developers. Such tools include Yahoo Pipes [5], PopFly [6] and Google Mashup Editor [7].

Allowing Your Service To Be 'Mashed Up'

Paul Walk has commented that "The coolest thing to do with your data will be thought of by someone else" [8]. Mashups provide a good example of this concept: if you provide data which can be reused, you allow others to develop richer services which you may not have the resources or expertise to develop yourself. It can be useful, therefore, both to provide structured data for use by others and to avoid software development where suitable tools already exist. However you will still need to consider issues such as copyright and other legal issues and service sustainability.

References

  1. Mashup (web application hybrid), Wikipedia,
    <http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)>
  2. What is A Mashup?, ZDNet,
    <http://news.zdnet.com/2422-13569_22-152729.html>
  3. Forthcoming Events and Presentations, UK Web Focus, UKOLN,
    <http://www.ukoln.ac.uk/web-focus/events/>
  4. University Locator, University of Northumbria,
    <http://northumbria.ac.uk/browse/unimapper/>
  5. Yahoo Pipes, Yahoo,
    <http://pipes.yahoo.com/pipes/>
  6. Popfly, Microsoft,
    <http://www.popfly.com/>
  7. Google Mashup Editor, Google,
    <http://editor.googlemashups.com/>
  8. The coolest thing to do with your data will be thought of by someone else, Paul Walk, 23 July 2007,
    <http://blog.paulwalk.net/2007/07/23/>