UKOLN AHDS URI Naming Conventions For Your Project Web Site



Background

Once you have agreed on the purpose(s) of your project Web site(s) [1] you will need to choose a domain name for your Web site and conventions for URIs. It is necessary to do this since this can affect (a) The memorability of the Web site and the ease with which it can be cited; (b) The ease with which resources can be indexed by search engines and (c) The ease with which resources can be managed and repurposed.

Domain Name

You may wish to make use of a separate domain name for your project Web site. If you wish to use a .ac.uk domain name you will need to ask UKERNA. You should first check the UKERNA rules [2]. A separate domain name has advantages (memorability, ease of indexing and repurposing, etc) but t his may not be appropriate, especially for short-term projects. Your organisation may prefer to use an existing Web site domain.

URI Naming Conventions

You should develop a policy for URIs for your Web site which may include:

Issues

Grouping Of Resources

It is strongly recommended that you make use of directories to group related resources. This is particularly important for the project Web site itself and for key areas of the Web site. The entry point for the Web site and key areas should be contained in the directory itself: e.g. use http://www.foo.ac.uk/bar/ to refer to project BAR and not http://www.foo.ac.uk/bar.html) as this allows the bar/ directory to be processed in its entirety, independently or other directories. Without this approach automated tools such as indexing software, and tools for auditing, mirroring, preservation, etc. would process other directories.

URI Persistency

You should seek to ensure that URIs are persistent. If you reorganise your Web site you are likely to find that internal links may be broken, that external links and bookmarks to your resources are broken, that citations to resources case to work. Y ou way wish to provide a policy on the persistency of URIs on your Web site.

File Names and Formats

Ideally the address of a resource (the URI) will be independent of the format of the resource. Using appropriate Web server configuration options it is possible to cite resources in a way which is independent of the format of the resource. This should allow easy of migration to new formats (e.g. HTML to XHTML) and, using a technology known as Transparent Content Negotiation [3] provide access to alternative formats (e.g. HTML or PDF) or even alternative language versions.

File Names and Server-Side Technologies

Ideally URIs will be independent of the technology used to provide access to the resource. If server-side scripting technologies are given in the file extension for URIs (e.g. use of .asp, .jsp, .php, .cfm, etc. extensions) changing the server-side scripting technology would probably require changing URIs. This may also make mirroring and repurposing of resources more difficult.

Static URIs Or Query Strings?

Ideally URIs will be memorable and allow resources to be easily indexed and repurposed. However use of Content Management Systems or databases to store resources often necessitates use of URIs which contain query strings containing input parameters to server-side applications. As described above this can cause problems.

Possible Solutions

You should consider the following approaches which address some of the concerns:

References

  1. The Purpose Of Your Project Web Site
    <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-15/html/>
  2. UKERNA
    <http://www.ukerna.ac.uk>
  3. Transparent Content Negotiation
    <http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html>