Brian Kelly*, Alastair Dunning+, Marieke Guy* and Lawrie Phipps#
* UKOLN, University of Bath, Claverton Down, Bath, UK
+ AHDS, King's College London, London, UK
# TechDis, The Network Centre, 4 Innovation Close, York Science Park, York, UK
<B.Kelly@ukoln.ac.uk>, <Alastair.Dunning@ahds.ac.uk>, <M.Guy@ukoln.ac.uk>, <Lawrie@techdis.ac.uk>
The importance of open standards for providing access to digital resources is widely acknowledged. Bodies such as the W3C are developing the open standards needed to provide universal access to digital cultural heritage resources. However, despite the widespread acceptance of the importance of open standards, in practice many organisations fail to implement open standards in their provision of access to digital resources. It clearly becomes difficult to mandate use of open standards if it is well-known that compliance is seldom enforced. Rather than abandoning open standards or imposing a stricter regime for ensuring compliance, this paper argues that there is a need to adopt a culture which is supportive of use of open standards but provides flexibility to cater for the difficulties in achieving this.
Keywords: open standards, proprietary formats, policies
The World-Wide Web is widely accepted as the key platform for providing access to digital cultural heritage resources. The Web promises universal access to resources and provides flexibility (including platform- and application-independence) through use of open standards. In practice, however, it can be difficult to achieve this goal. Proprietary formats can be appealing and, as we learnt during the "browser wars", software vendors can state their support for open standards while deploying proprietary extensions which can result in services which fail to be interoperable.
Many digitisation programmes which seek to provide access to digital cultural heritage resources will expect funded projects to comply with a variety of open standards. However if, in practice, projects fail to implement open standards this can undermine the premise that open standards are essential and would appear to threaten the return of application- and platform-specific access to resources.
Although a commitment to Web development based on open standards is desirable, in practice it is likely that there will be occasions when use of proprietary solutions may be needed. But the acceptance of a mixed economy in which open standards and proprietary formats can be used as appropriate can lead to dangers. So should we mandate strict compliance with open standards or should we tolerate a mixed economy? This paper seeks to explore these issues.
We will now review two digital library programmes in more detail and expand on the standards framework and project monitoring and technical support services which seek to ensure that the deliverables from funded projects are interoperable and comply with appropriate standards and best practices. Three of the authors of this paper have been involved in providing the technical support services to these programmes. The experiences gained in providing this support have helped to inform the writing of this paper.
Within the UK the Higher Education community has a culture which is supportive of open standards in its digitisation programmes. An early digital library programme known as eLib ran from 1995 until 2001. A set of guidelines known as the eLib Standard Guidelines (JISC-1) was produced, defining the standards which funded projects were expected to implement.
In 1999 the Joint Information Systems Committee (JISC) established a digital library programme with the intention of improving the applicability of its collections and resources for learning and teaching. Although digital information and data resources had been created in previous JISC-funded programmes, they had so far mainly been used for research and their learning and teaching value had not been widely utilised. The JISC Learning and Teaching Programme (5/99) was aimed at increasing the use of online electronic resources by integrating them into the JISC's Information Environment through deployment into a service environment. Alongside this increased need for usage of resources in new areas was the recognition that if the digital resources created on programmes were to be widely accessible, interoperable, durable and represent value for money, their technical development should be rigorous and based on best practices. To ensure that the project deliverables could be easily deployed into a service environment the JISC expected projects to make use of standards documented in the Standards and Guidelines To Build A National Resource document (JISC-2), which was based on an update of the eLib Standard Guidelines document.
The JISC were aware that although projects funded by the eLib programme were expected to comply with the eLib standards document, in practice compliance was never checked. This may have been appropriate for the eLib programme as, when the programme commenced in 1995, it was not necessarily clear that the Web would turn out to be the killer application for delivery of resources. However there is now an awareness that the Web is the killer application for access to digital resources. There is also a realisation that compliance with standards will be necessary in order for digital resources to be widely interoperable. In response to such needs the JISC funded a new post: QA Focus. The QA Focus post was initially established as a support mechanism solely for the 5/99 programme (although recently its remit has been extended to cover additional programmes). The aim of QA Focus is to ensure that projects comply with standards and recommendations and make use of appropriate best practices by deploying quality assurance procedures.
An initial QA Focus activity was organising focus group meetings which provided feedback on the standards framework. The feedback received included: (a) a lack of awareness of the Standards document; (b) difficulties in seeing how the standards could be applied to projects' particular needs; (c) concerns that the standards would change during the project lifetime; (e) lack of technical expertise and time to implement appropriate standards; (f) concerns that standards may not be sufficiently mature to be used; (g) concerns that the mainstream browsers may not support appropriate standards and (h) concerns that projects were not always starting from scratch but may be building on existing work and in such cases it would be difficult to deploy appropriate standards.
Following the focus group meetings, surveys of project Web sites were carried out in order to gain an understanding of the approaches taken by projects in their provision of project Web sites and to identify examples of best practices and areas in which improvements could be made. The surveys analysed compliance with HTML and CSS standards and with W3C WAI guidelines. The findings showed that few project entry points appeared to comply fully with open standards (QA-Focus-1, 2002). A number of reasons for this have been expressed: (a) the surveys may have analysed Web pages about the project, rather than the actual project Web site; (b) the surveys may have analysed Web pages aimed at project partners rather than end users; (c) the surveys may have been carried out at an early stage of development and (d) the focus of the projects' deliverables may have been on digitisation or software development and not on providing information on a Web site.
It should be noted that such comments appear to indicate that strict compliance with standards is felt to be difficult or that there may be occasions when compliance is not felt to be necessary or would be unnecessarily expensive to implement. These comments appear to show reservations as to the applicability or scope of compliance with open standards.
The NOF-digitise programme (NOF-1) is the second of these case studies. Supported by public funding of about £50 million, the programme forms part of a larger initiative (the New Opportunities Fund or NOF) that distributed funding to education, health and environment projects throughout the United Kingdom, with a focus on providing for those in society who are most disadvantaged. The NOF-digitise element, as the title suggests, was dedicated to funding and supporting universities, local government, museums and other public sector organisations in digitising material from their collections and archives and making this cultural heritage available on the Web.
Emphasis on the need for standards and good practice began early in the lifespan of the programme. This was for two reasons. Firstly, few of the funded projects had much experience of digitisation and a fair degree of education was required to inculcate the importance of standards. Secondly, it was realised that the public funding of a large-scale digitisation programme entailed the creation of material that needed to be preserved and made accessible not just in the present, but for future generations. Therefore the NOF-digitise programme elected to formulate a set of standards based on open standards. In addition a Technical Advisory Service (NOF-2) was established which would be able to offer technical assistance to the projects as they applied these standards.
The standards developed for NOF-digitise projects (NOF-3) were split into five areas: creation, management, collection development, access and re-use. In many cases defining the open standards in these areas was a relatively straightforward matter. Thus those projects that were digitising textual material needed to do so in XML or HTML; those creating digital images had to use formats such as TIFF, GIF, JPEG (JFIF) or PNG.
But almost immediately the difficulty of applying purely open standards became apparent. At the time of the standards' initial creation, there was no suitable open standard for the creation of audio or video files thus the programme had to adopt a more pragmatic outlook, accepting formats such as MPEG4 for video and MP3 for audio. The problem was even more acute when it came to access to data using software such as Macromedia Flash or Adobe PDF. Macromedia Flash's SWF format provides many features attractive to projects. Projects, quite rightly, considered that the ability to create stylish graphics and animations would be an important feature in attracting users to their Web sites, especially younger users. Adobe's PDF format offered presentational advantages, especially of some historical documents, that made it easier to apply than HTML.
But the use of such proprietary formats presented two problems. Firstly, in terms of accessibility: as the NOF-digitise programme was committed to being inclusive in delivering digital material, it had to take account of users who would not be able to access, for example, resources created in Flash or PDF. Secondly, there were preservation issues related to the creation of material in such proprietary formats. To what extent would such material be accessible in five, ten or twenty years? Would there also be the possibility that future users would have to pay for the plug-ins needed to access such materials?
The programme was therefore faced with the problem addressed in this paper: should it enforce strict compliance or cater for a mixed economy? It was decided to adopt a pragmatic approach. The development of resources in such proprietary formats was not forbidden but projects had to ensure that the creation of any part of the resource in a format such as PDF or SWF was accompanied by various safety checks that ensured data would not become locked into solely proprietary formats. The crucial stipulation was that any significant digital resource created in a proprietary format also had to be presented in an open standard. Thus documents created in PDF also had to be available in HTML or XML. Extra functionality could be provided by proprietary formats but the project had to ensure that the core content was accessible in an open standard as well. The same was true for Flash: if a project was using its digitised content to create a resource in SWF, the project had to ensure that the key content used was available to those users without Flash. So if a project was developing a game or animation using the content they had digitised, they had to make sure that the content was available without Flash as well (even if the functionality of the game itself was not replicated in an open standard).
Additionally, users who were unable to access the Flash-based resources would have to be informed, using short notices on the project Web site, of the resources which could not be accessed. This ensured that even if a user could not access a particular part of the Web site they would not be left in the dark as to what was available to other users. This technique was also required for resources existing in other proprietary formats: audio resources, for example, needed to be accompanied by textual transcriptions or descriptions of what the file contained.
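The stipulation that proprietary resources be accompanied by an open-format equivalent lends itself to a simple automated check. The following is a minimal sketch only, not a tool used by the programme: it assumes that equivalent resources share a file name and differ only in extension, and the function name and the two sets of extensions are hypothetical choices made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: these extension sets are illustrative, not a list taken
# from the NOF-digitise standards document.
PROPRIETARY_SUFFIXES = {".pdf", ".swf"}
OPEN_SUFFIXES = {".html", ".xml", ".txt"}


def missing_open_alternatives(paths):
    """Return proprietary files with no same-named file in an open format."""
    # Group files that differ only by extension, e.g. charter.pdf / charter.html
    groups = {}
    for raw in paths:
        path = PurePosixPath(raw)
        groups.setdefault(path.with_suffix(""), []).append(path)

    missing = []
    for members in groups.values():
        has_open = any(m.suffix.lower() in OPEN_SUFFIXES for m in members)
        if not has_open:
            missing.extend(
                str(m) for m in members
                if m.suffix.lower() in PROPRIETARY_SUFFIXES
            )
    return sorted(missing)


files = ["docs/charter.pdf", "docs/charter.html", "games/quiz.swf", "img/map.png"]
print(missing_open_alternatives(files))
```

A check of this kind can only confirm that an alternative file exists; whether the alternative conveys the same core content still needs human review.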
Finally NOF stipulated that projects should devise migration strategies to ensure the existence of their digital material in the medium to long term. The kernel of such migration strategies was to be the continued survey of the possibility of migrating the data currently held in proprietary format to an open standard. So in the case of Flash resources, projects need to review the possibility of transferring to, say, SMIL (W3C-1) or explore the opportunities afforded by the publication of the SWF specification (SWF).
The NOF-digitise programme has been committed to helping develop digital resources in open standards, thus increasing the chances of developing. But it has done so within a framework that has tried to understand and accommodate the advantages provided by proprietary formats.
As we have seen communities which have expressed commitments to use of open standards are currently failing to comply with such standards. We can speculate on a number of reasons for this:
Bad experiences with standards: Organisations may have sought to implement standards in the past and experienced difficulties, which may have been costly. Within the UK Higher Education community those with long memories will remember the edict that the community must strive towards OSI networking protocols through use of Coloured Book software (JNT).
Lack of awareness of standards: There is a danger that although awareness of standards may be widespread amongst certain sectors of the Web development community, other developers may have a focus on Web development applications and not the underlying standards they support.
Difficulties in monitoring compliance: Even in cases in which there is an awareness of the importance of open standards and a commitment to their use we can find that Web sites fail to comply with standards. This may be due to the difficulties in monitoring compliance with standards. Compliance testing services such as W3C's HTML validator (W3C-2) and CSS validator (W3C-2) are not particularly easy to use, requiring a cumbersome manual process which is not cleanly integrated with a publishing process or scalable for validating large numbers of resources.
Limitations of the tools: Many authoring tools fail to comply with open standards. In addition many authoring tools fail to implement best practices, and generate deprecated features such as HTML elements used for formatting rather than using cascading style sheets to define the appearance of resources. Open source advocates argue that there are open source authoring tools which do provide better support for open standards. However replacement of existing tools will inevitably result in hidden costs such as training and support costs.
If it's not broken ....: Developers of the current generation of Web sites may argue that the Web resources are accessible in the current generation of browsers. Some will argue that their Web sites have been tested across a range of browsers and operating system environments; others will point out that their Web sites have been tested under the most popular browsers and this is an adequate testing regime, especially in light of the costs of testing and the diminishing returns gained by testing under the more esoteric environments.
Maturity of standards: Although some organisations may welcome the opportunity to be early adopters of new standards, others may not wish to make use of new standards until they have been adequately tested and a wide range of tools which support the standards are available.
Standards wars: There are occasions when there are competing standards. For example the RSS news feed syndication format exists in two competing versions: one based on XML (RSS-1) and one on RDF/XML (RSS-2).
We have a problem - let's invent a new standard: When a standard is found to have limitations, there seems to be a temptation to use this as an opportunity to develop a new standard. This can happen before the flawed standard has been widely deployed and while it is still being promoted. An example of this is XHTML 2.0. Although XHTML 1.0 provides many advantages, effective deployment is hindered by the requirement of current browsers to attempt to display resources which do not comply with standards. A recent survey has shown that many XHTML 1.0 documents are not compliant (Goer). Such documents may be displayed, but as they are not valid XML documents, they cannot be processed as XML. In an attempt to address this W3C are developing XHTML 2.0 which will not be expected to be backwards compatible. This leaves Web developers uncertain whether to move from HTML to XHTML 1.0 or wait until XHTML 2.0 becomes available. Moving from HTML 4.0 to XHTML 1.0 and then to XHTML 2.0 would appear to be a resource-intensive operation. As Mark Pilgrim put it "Someday, I'll upgrade myself from 'SHOULD NOT chase after bleeding edge technologies that don't solve real world problems' to 'MUST NOT chase after bleeding edge technologies that don't solve real world problems.'" (Pilgrim, 2002).
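The point that a non-compliant XHTML page may display in a browser and yet fail to be processed as XML can be illustrated with a short sketch. It uses Python's standard `xml.etree.ElementTree` parser and tests only well-formedness, which is a weaker condition than validity against the XHTML DTD; the function name and the two sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if an XML parser accepts the markup.

    This checks well-formedness only; it does not validate against a DTD.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# A fragment that closes every element, as XHTML requires
xhtml_page = (
    "<html xmlns='http://www.w3.org/1999/xhtml'>"
    "<head><title>Example</title></head>"
    "<body><p>Line one<br/>Line two</p></body></html>"
)

# The same content with an unclosed <br>, which browsers typically tolerate
tag_soup = "<html><body><p>Line one<br>Line two</p></body></html>"

print(is_well_formed_xml(xhtml_page))
print(is_well_formed_xml(tag_soup))
```

The second string is rejected because the `<br>` element is never closed, which is exactly the kind of error that a forgiving browser hides from the author.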
We know the benefits which use of open standards has to offer. But, as we have seen, many organisations are simply not complying with open standards such as HTML and there are a number of reasons why this is the case. So what should we be doing? Possible strategies include:
Although the New Zealand Government Web Guidelines have formal requirements for compliance with standards the document gives no indication of measures for assessing compliance. Some organisations are developing self-assessment toolkit approaches. In the UK the Government is developing a proforma to be used by local government bodies to document their compliance with appropriate standards (UK) which requires organisations to state their compliance with the Government Interoperability Framework, the Guidelines for UK Government Web sites and with W3C WAI guidelines.
The approaches of providing greater encouragement to comply with open standards or of mandating compliance do not address many of the difficulties which have been outlined previously. Such approaches do not take into account conflicts within standards organisations, the dangers facing earlier adopters, the resource implications in deploying new tools, etc.
It should also be pointed out that we may see developments in the marketplace in response to the needs of the community which open standards seek to address. For example:
Need to define DOCTYPE: HTML standards mandate that compliant HTML documents use a DOCTYPE to define the version of HTML used. In practice, however, Web browsers can render documents which do not have a DOCTYPE. In addition, other tools, such as search engine robots, transformation tools, etc. are capable of processing documents which do not have a DOCTYPE. In light of the vast numbers of documents which do not contain a DOCTYPE one could argue that heuristic approaches can be taken to compensate.
Need to define Character Encoding: HTML standards mandate that compliant HTML documents define the character encoding of characters used. As with the DOCTYPE, vast numbers of documents do not define the character encoding used and one could argue that heuristic approaches can be taken to compensate.
Need to use relative sizing: W3C WAI guidelines require HTML elements to be defined using relative positioning and sizes. This is to enable visually impaired readers to resize resources to an appropriate size. In practice accessibility aids are available which will allow users to resize not only text on a Web page but everything on the computer display.
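The first two requirements above, a DOCTYPE and a declared character encoding, are easy to test for mechanically. The sketch below uses Python's standard `html.parser` module; the class and function names are hypothetical, and it inspects only declarations inside the document itself, not an encoding supplied in an HTTP header.

```python
from html.parser import HTMLParser


class DeclarationAudit(HTMLParser):
    """Record whether a page declares a DOCTYPE and a character encoding."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.declared_charset = None

    def handle_decl(self, decl):
        # Called with the text inside <! ... >, e.g. 'DOCTYPE html PUBLIC ...'
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.declared_charset:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # Short form: <meta charset="...">
            self.declared_charset = attributes["charset"]
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Long form: <meta http-equiv="Content-Type" content="...; charset=...">
            for part in (attributes.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset" and value:
                    self.declared_charset = value.strip()


def audit(markup):
    """Return (has_doctype, declared_charset_or_None) for an HTML string."""
    parser = DeclarationAudit()
    parser.feed(markup)
    parser.close()
    return parser.has_doctype, parser.declared_charset


page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=ISO-8859-1" /></head><body></body></html>'
)
print(audit(page))
print(audit("<html><body><p>No declarations here</p></body></html>"))
```

Because the parser is itself tolerant of malformed markup, the audit runs on pages that would fail formal validation, which is the situation the examples above describe.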
It may be argued that there is a proven difference between real world standards, which for example require an electric plug to be of a particular size and characteristics in order to function correctly, and standards in an IT environment, where it is possible for software to compensate for deviations from standards. One should avoid taking this argument too far: it is not intended to argue that any proprietary format or deviation from a standard can or should be processed correctly. The point being made is that in today's Web environment a great many resources do not comply with standards and yet the services are functional.
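As an illustration of software compensating for a deviation from a standard, the sketch below decodes a page whose character encoding is missing or wrongly declared. The order of fallback encodings is an assumption made for this example and the function name is hypothetical; real browsers use considerably more elaborate detection.

```python
def decode_leniently(raw: bytes, declared=None):
    """Return (text, encoding_used), trying the declared encoding first.

    Assumption: UTF-8 then Windows-1252 is a reasonable fallback order
    for Western European pages; other communities would choose differently.
    """
    candidates = [declared] if declared else []
    candidates += ["utf-8", "windows-1252"]
    for name in candidates:
        try:
            return raw.decode(name), name
        except (LookupError, UnicodeDecodeError):
            # Unknown encoding name, or bytes invalid in this encoding
            continue
    # Last resort: latin-1 maps every byte to a character, so it cannot fail
    return raw.decode("latin-1"), "latin-1"


print(decode_leniently("caf\u00e9".encode("utf-8")))
print(decode_leniently("caf\u00e9".encode("windows-1252")))
print(decode_leniently(b"plain text", declared="no-such-encoding"))
```

The fallback always produces some text, but a wrong guess silently corrupts characters, which is one reason why heuristics are a complement to declared encodings rather than a replacement for them.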
There is a need to address the applicability of mandating strict compliance in 'softer' areas such as accessibility. Widespread access to digitised resources has been important for the two case studies described in Section 2. However the implementation of Web accessibility has led to much discussion initially focussed on the accessibility of proprietary formats. However there is a wider issue which needs to be addressed: whether Web accessibility guidelines are regarded as a formal standard or guidelines which provide sensible suggestions in many cases, but which can be interpreted and applied on a case-by-case basis.
The recognition of discriminatory practices in society has led to a range of initiatives and legislation to prevent disabled people being treated unfairly. As information systems such as the Internet developed, guidelines such as the W3C WAI Content Accessibility Guidelines (W3C-6) were established to help developers ensure that their methods and materials were not excluding disabled people.
However the authors argue that these guidelines should be treated as exactly that: a set of guiding principles, rather than absolute and fixed standards. They are not and cannot be 'hard and fast' rules because of the very nature of the community they wish to serve, which is diverse and sometimes has conflicting needs. For example, dyslexic users of the Web may use a very visual interface and prefer very rich multimedia Web sites, whilst this is of less use to a blind user. This is not to say that the rich multimedia site cannot be made accessible, just that the designers may not be able to please everybody. Although the guidelines have a caveat that if something is not accessible then there should be an alternative, it is often the case that even if the content is dynamically driven and placed in a user's own interface, it is a different experience to that envisaged by the originator.
In addition to the problems of having a set of guidelines that serves such a diverse community, there are the 'subjective' and 'user' checks that are needed to claim adherence to the guidelines. These relate to a range of issues such as plain and simple language and the use of appropriate colour and style, all of which can be perceived differently, not only by developers but also by users. However, without these elements the guidelines become much less effective, a mere technical shadow of the purpose for which they were envisaged: an inclusive user experience.
There is now a trend to cite these guidelines as standards, or at least use them as the basis; in the UK an industry-based group, the Digital Content Forum, has recently put together an 'Industry Action Group' to look at Web accessibility standards in the UK. Their co-chair commented: "Despite the talk, there is currently little genuine understanding of accessibility related issues in the UK Web design community. And worryingly, there is even less practical experience of building sites that meet the highest recognised standards in accessibility. Our first objective is therefore to widen understanding within the industry of the relevant standards that already exist, and then to foster a shared approach to overcoming the technical issues relating to making existing Web-based technologies meet these standards."
Whilst the rhetoric is about ensuring accessibility, there is a worrying sub-text, that of the 'recognised standards' and meeting standards. At best, this approach suggests a misunderstanding of the use of the guidelines; at worst there is a worry that the guidelines are now being taken over from the community for which they were written and rewritten in a form that is more suitable for a standards-driven technical community. Guidelines are in place for a reason: they are a guide only, and recognise that there is a diverse set of needs for users. They are not a standard that can be used as a 'one size fits all', and certainly not a standard that is developed and imposed. If there is to be a standard in this area, then it is essential that the community and not industry in isolation drive it. Furthermore, until there is (or if there is) a standard the use and abuse of the term should be treated with the utmost caution, lest a 'stealth standard' is imposed before we notice.
If a simple commitment to use of open standards is difficult to implement and an abandonment of open standards will lead to difficulties in providing universal access to resources, what should we do? The solution advocated in this paper is based on a developmental approach which recognises the desirability of supporting open standards, but the difficulties in doing so. The approach recognises that developers are constrained by a wide range of factors, such as resources, expertise, timescales and organisational culture. Rather than mandating a single approach for all, it is proposed that digitisation programmes should recognise such complexities, but rather than abandoning a commitment to open standards, provide a developmental culture which is supportive of open standards but does not mandate open standards in all cases. This approach has grown from our experiences in supporting the NOF-digitise and JISC 5/99 programmes.
On reflection it would appear that an approach based on simply advocating use of open standards is not necessarily desirable. It is felt that there are several factors which need to be addressed, which are listed in the following table.
|Ownership||Is the standard owned by a recognised neutral open standards body or by a company?|
|Development process||Is there a community process for development of a proprietary standard?|
|Availability||Has the proprietary standard been published openly or reverse-engineered?|
|Viewers||Are viewers (a) available for free, (b) available as open source and (c) available on multiple platforms?|
|Authoring tools||Are authoring tools (a) available for free, (b) available as open source and (c) available on multiple platforms?|
|Fitness for purposes||Is the standard appropriate for the purpose envisaged?|
|Resource implications||What are the resource implications in making use of the standard?|
|Complexity||How complex is the standard?|
|Interoperability||How interoperable is the standard?|
|Ease of service deployment||How easy will it be to deploy the deliverable in a service environment?|
|Ease of long term preservation||Is the standard suitable for long term preservation?|
|Organisational culture||Is the organisational culture appropriate for use of the standard?|
|Approaches to migration||What approaches can be taken to migrating to more appropriate standards in the future?|
|Approaches to assessing compliance||What approaches can be taken to measuring compliance?|
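One way a project might record its answers to these questions is as a simple structured record per format. The sketch below is illustrative only: the factor names are copied from the table, while the class name, its methods and the sample answer are hypothetical.

```python
from dataclasses import dataclass, field

# Factor names as they appear in the table above
FACTORS = (
    "Ownership",
    "Development process",
    "Availability",
    "Viewers",
    "Authoring tools",
    "Fitness for purposes",
    "Resource implications",
    "Complexity",
    "Interoperability",
    "Ease of service deployment",
    "Ease of long term preservation",
    "Organisational culture",
    "Approaches to migration",
    "Approaches to assessing compliance",
)


@dataclass
class FormatAssessment:
    """A project's documented answers for one candidate format."""

    format_name: str
    notes: dict = field(default_factory=dict)

    def record(self, factor, note):
        # Reject factors that are not part of the matrix
        if factor not in FACTORS:
            raise KeyError(f"unknown factor: {factor}")
        self.notes[factor] = note

    def unanswered(self):
        """Return the factors that still need to be considered."""
        return [f for f in FACTORS if f not in self.notes]


# Hypothetical usage: the format name and answer are placeholders
assessment = FormatAssessment("ExampleFormat")
assessment.record("Ownership", "Owned by a single company")
print(assessment.unanswered())
```

The record deliberately stores free-text notes rather than scores, since the matrix is intended to prompt a documented decision and not to compute one.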
This matrix approach can be supported by appropriate quality assurance (QA) procedures. The QA approach requires provision of documentation on the policies regarding the standards to be implemented, the architecture used to implement the standards, and compliance measures to ensure that policies are correctly implemented, which may include audit trails providing details of compliance. An example of a QA policy is illustrated below.
Policy: The Web site will be based on XHTML 1.0.
Justification: Compliance with appropriate standards should ensure that access to Web resources is maximised and that resources can be repurposed using tools such as XSLT.
Exceptions: Resources which are derived automatically from other formats (such as MS PowerPoint) need not comply with standards. In cases where compliance with this policy is felt to be difficult to implement the policy may be broken. However in such cases the project manager must give agreement and the reasons for the decision must be documented.
Compliance measures: When new resources are added to the Web site or existing resources are updated the ,validate tool will be used to check compliance. A complete compliance survey will be carried out quarterly.
Audit trail: Reports from the monthly audit will be published on the Web site in order to monitor trends.
Figure 1: QA Policy For QA Focus Web Site
This approach has been developed by QA Focus and is documented at (QA-Focus-2, 2003).
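The policy in Figure 1 combines a rule, documented exceptions and an audit trail. A minimal sketch of how these might fit together is given below. It assumes that validation results have already been obtained from some external checker and are supplied as pass/fail values; all class, function and file names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    """A documented, approved departure from the policy."""

    resource: str
    reason: str
    approved_by: str


@dataclass(frozen=True)
class AuditEntry:
    checked_on: date
    resource: str
    outcome: str  # "compliant", "exception" or "non-compliant"


def audit_resources(results, exceptions, checked_on):
    """Build audit trail entries from validation results.

    `results` maps a resource name to True (passed validation) or False.
    A failure is excused only if an exception records both a reason and
    the person who approved it, mirroring the policy's requirements.
    """
    excused = {e.resource for e in exceptions if e.reason and e.approved_by}
    entries = []
    for resource, passed in sorted(results.items()):
        if passed:
            outcome = "compliant"
        elif resource in excused:
            outcome = "exception"
        else:
            outcome = "non-compliant"
        entries.append(AuditEntry(checked_on, resource, outcome))
    return entries


results = {"index.html": True, "slides.html": False, "news.html": False}
exceptions = [
    PolicyException(
        resource="slides.html",
        reason="Derived automatically from presentation software",
        approved_by="Project manager",
    )
]
for entry in audit_resources(results, exceptions, date(2003, 1, 1)):
    print(entry.resource, entry.outcome)
```

Keeping exceptions as explicit records means that a published audit report can distinguish agreed departures from the policy from genuine failures.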
Three of the authors of this paper are involved in providing support services for JISC and NOF-digitise programmes. Our work will include making recommendations for support work in future programmes. Our recommendations are likely to include the following.
The importance of use of open standards is widely recognised within the cultural heritage sector. However in practice many digital cultural heritage Web resources fail to comply with open standards. On consideration of the reasons for this it would appear to be counter productive merely to impose greater pressure on developers to comply with standards. Rather there is a need to ensure that players within the community have an understanding of the importance of open standards but also have some degree of flexibility to provide access to resources which acknowledges the challenges in implementing fully compliant services. This paper provides a model based on the deployment of documented quality assurance processes, self assessment and liaison with funders which encourages a standards-based approach while still allowing the flexibility needed to allow for the complexities of Web development.
Brian Kelly provides the JISC-funded UK Web Focus post, which provides support for the UK Higher and Further Education communities on Web issues. He is also the project manager for QA Focus and the NOF Technical Advisory Service. He is based in UKOLN, a national centre of excellence in digital information management, based at the University of Bath.
Marieke Guy is a member of the QA Focus team which supports JISC's Information Environment programme by ensuring that funded projects comply with standards and recommendations and make use of appropriate best practices. Marieke has previously worked as a NOF-digitise Advisor and co-ordinated technical support and advice services to the NOF national digitisation programme. Marieke is based in UKOLN, a national centre of excellence in digital information management, based at the University of Bath.
Alastair Dunning works at the Arts and Humanities Data Service (AHDS) as Communications Manager, responsible for the AHDS's publications, Web site, events and training workshops. The AHDS aids the discovery, creation and preservation of digital collections in the arts and humanities. Alastair has also been working as a member of the NOF-digitise Technical Advisory Service, advising many projects on standards and good practice in the creation and dissemination of their digital resources.
Lawrie Phipps is a Senior Advisor (Higher Education) to the JISC TechDis service which aims to enhance provision for disabled students and staff in further and higher education through technology. Lawrie's background is in learning technology as a developer and he retains a research interest in virtual fieldwork. Currently he is working on e-learning and accessibility, and how mobile computing can be of benefit to disabled students.