Technical Advisory Service
2001 - 2004 archive
Frequently Asked Questions
The questions and answers on this page have been asked by nof-digitise applicants. This page will be updated frequently.
1. Where can I get a draft form to use for copyright clearance?
IPR and copyright form a very complex area, and unfortunately there is no "one-size-fits-all" solution to these issues. Every resource or collection of resources may have its own IPR problems that will need to be solved before a digitisation project can go ahead. However, because the issue is so important when working in a networked environment, a number of excellent resources have been produced to guide you through the process of clearing resources for use.
VADS (The Visual Arts Data Service) and TASI (The Technical Advisory Service for Images) have produced a guide to creating digital resources:
2. The technical standards document talks about certain institutions having access to additional resources by signing a licence committing them to non-commercial use (section 3.1.5). Who would be parties to these licences? Would these licences involve an exchange of money and if so between whom? Would the copyright owner be entitled to a fee for reproduction in the same way as with non-digital reproduction?
The principle is that the end user will be provided with access free at the point of use but that two issues have to be covered:
There are, of course, a variety of models, but a proven one involves using contracts, i.e. issuing a licence.
Let us define three parties. The Contributor is the body which owns the IPR in the resource. The Service Provider is the body which stores and makes available the resource. The User Institution is the body which accesses the Service Provider under licence.
The Contributor and the Service Provider could be the same body. If that is not the case, however, a licence is required setting out the conditions under which the Service Provider may make the material available.
There may also be a payment from the Service Provider to the Contributor in respect of the IPR to allow for non-profit, non-proliferation educational use from then on.
The User Institution will be licensed to use the resources by the Service Provider and may pay an annual fee to allow that access. The User Institution agrees in the licence to certain conditions - normally non-profit, non-proliferation use.
The User Institution then allows its user group - students, those visiting a library or museum - to access the resources.
It is expected that, so long as the use is purely for non-profit, non-proliferation educational purposes, the Service Provider would not make a further IPR payment. The licence fee it charges is there merely to sustain the service.
An explanation of NOF's IPR conditions and of issues around Open Source systems follows.
1. The NOF IPR conditions specify (page 4 under Definitions) that 'Material means any documentation or material (including without limitation software and databases) to be provided to the Fund etc..' This is further explained in the guidance letter (13 August 02 under 'Definitions' and under '2.2 Licences' on page 4) ...'in the case of material that is delivered with software or databases specially written for this project (including any adaptation of commercially available software or databases) The Fund would expect that any commercial exploitation would recognise the use of public funds in the generation of the material. Significant commercial exploitation might involve grant repayment.'
2. The IPR conditions give the Fund the right to use the materials developed for the programme but do not provide for any 'transfer' of IPR. This is an important difference. The Fund will not 'own' the IPR to materials created (including software) but through the conditions does have rights over the commercial exploitation of the material, as explained above.
3. If any grant holder is unclear on this point or has any query regarding the terms and conditions of NOF grants, please contact NOF directly, either through your case manager or at this email address: email@example.com.
4. If you are a supplier contracted to a grant-holder please address your queries to the grant-holder who will raise them with NOF where necessary.
5. Please note that neither the IPR conditions nor the Technical Standards conditions require a commitment to open source software, but we welcome that debate on the nof-jiscmail list as it raises awareness of an important issue.
This information is not tendered as advice but does indicate which projects should most likely take note of regulations as they might apply to their situation. It also provides details on sources of information of possible use to projects.
This FAQ is potentially of importance to any project which opts to:
If either of the above applies, then there are requirements which must be met in respect of three possible areas:
1 Information requirements, i.e. the information that must be provided to end-users. These requirements include providing your end users with:
The above requirements will probably apply to you if you sell or advertise goods or services online (i.e. via the Internet, interactive television or mobile telephone).
2 Commercial communications, i.e. essential identifications and explanations that must be provided to end-users, for example if a project markets via email. These requirements include providing your end users with:
Note therefore that any form of electronic communication designed to promote your goods, services or image, such as an e-mail advertising your goods or services, must:
The above requirements will probably apply to you if you promote goods or services through any form of electronic communication (e.g. an e-mail advertising your goods or services).
3 Electronic contracting, i.e. information and explanations about the process of creating a contract electronically with an end-user. These requirements include providing your end users with:
The above requirements will probably apply to you if you enable end users to place orders online.
The requirements contained in the three categories above represent the basic situation. There may be other requirements in addition which can be ascertained from the sources of information given below.
In conclusion the DTI guidance states:
Sources of Information
The Electronic Commerce Directive (00/31/EC) & The Electronic Commerce (EC Directive) Regulations 2002 (SI 2002 No. 2013)
Guidance on Electronic Commerce Regulations
Beginners Guide to the E-Commerce Regulations 2002
Frequently Asked Questions on The Electronic Commerce (EC Directive) Regulations 2002
The address for Northern Ireland is:
Hardware and Software
1. I am trying to find information on whether there is any OCR software that can cope reliably with 17th-19th century printed material, including material in columns. I would also like pointers to information on how existing OCR software would cope with 19th-century newspapers.
Although we do not have very much experience of individual products, most OCR software would still have problems recognising these types of text. Even apart from the likelihood of non-standard typefaces and awkward columns, most OCR software might have problems with background noise (e.g. print bleed-through or foxing) and non-standard characters. It's probably worth testing OCR software on a sample before rejecting it, as the main alternatives would be re-keying the whole text (horribly expensive) or just digital imaging.
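One way to carry out such a test is to key in a short sample by hand and compare it with the OCR output for the same passage. The following Python sketch is an illustration only: the function name and the sample strings are invented for this example, and the score is an approximation based on the standard library's difflib rather than a formal character error rate.

```python
from difflib import SequenceMatcher


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Return the share of reference characters that the OCR output matched.

    Uses difflib's matching blocks, so this approximates character
    accuracy rather than computing a strict edit distance.
    """
    if not reference:
        raise ValueError("reference text must not be empty")
    matcher = SequenceMatcher(None, reference, ocr_output, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


# Hypothetical sample: a hand-keyed line and what an OCR engine might
# return for it if it confused the long s with the letter f.
reference = "The Minister addressed the House upon the subject"
ocr_output = "The Minifter addreffed the Houfe upon the fubject"

print(f"{character_accuracy(reference, ocr_output):.1%}")
```

Running the same comparison over a few pages from different parts of a collection gives a rough basis for deciding whether correcting OCR output would be cheaper than re-keying.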
The AHDS/OTA's guide to Creating and documenting electronic texts is worth a look:
2. What is the situation regarding servers? Supplying video, for example, to many institutions simultaneously places huge demands upon UK Internet infrastructure as well as servers. Should we be looking to host our servers at a high capacity Super Janet node or will other provisions be made?
At this stage the NOF are not going to provide central servers on high speed networks, and the like. It will be down to the project to make arrangements to have their content connected to the Internet at speeds sufficient to deliver it to users in a useful fashion. Thus, a project delivering high bandwidth video will probably need a more robust (and faster) connection to the Net than one delivering small static images. The extra costs of this connection will need to be laid out - and justified - in the business plan.
Connection via one of the bigger SuperJANET nodes is one possibility that projects with HE partners might pursue, provided their use falls within JANET's Acceptable Use Guidelines http://www.ja.net/documents/use.html
The term Content Management System (CMS) is usually used to describe a database which organises and provides access to digital assets, from text and images to digital graphics, animation, sound and video. This type of product is relatively new and there are a few CMS available as off-the-shelf packages. CMS range from very basic databases to sophisticated tailor-made applications and can be used to carry out a wide range of tasks, such as holding digital content, holding information about digital content, publishing online and publishing on-the-fly.
For more information see http://www.ukoln.ac.uk/nof/support/help/papers/cms/
Do I need one for my project? Is a database sufficient?
The CMS provides mechanisms to support asset management, internal and external linking, validation, access control and other functionality. Typically, a CMS is built on an underlying database technology.
Content Management Systems range from very basic databases, to sophisticated tailor-made applications. They facilitate easier tracking of different parts of a Web site, enabling, for example, staff to easily see where changes have been made recently and - perhaps - where they might need to make changes (a 'News' page that hasn't been edited for 6 months?). They also ease the handling of routine updating/modifying of pages, where you want to change a logo or text on every page, for example.
A CMS can also simplify internal workflow processes and can ensure that you are working with a single master copy of each digital asset.
However, there are other approaches which may be usable, such as making use of server-side scripting to manage resources.
Solutions may include:
Use of a dedicated CMS. Note this may be expensive, and there may be costs in learning the system, using it, etc. In addition you should ensure that an 'off-the-shelf' CMS product supports the metadata standards you expect to use.
Use of an open source CMS. This avoids licence costs, but there are still resource issues.
Use of a database. May manage the resources but will it address issues such as workflow?
Use of server-side scripting approaches, such as PHP (Unix) and ASP (NT). These may allow bespoke applications to be developed, and may sit on top of databases.
To summarise, the issue to be aware of is the difficulty of maintaining resources in formats such as HTML. Using flat files and a CMS and/or database is a way of addressing this management issue. Whilst it is not an explicit requirement that projects manage their resources with a CMS and/or a database, if such tools are not used, the project must show how it intends to facilitate good management of its digital assets.
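As an illustration of the plain database option, the Python sketch below uses the standard library's sqlite3 module to record one master copy per asset and to report pages that have not been edited for about six months. The table layout, file paths and dates are all hypothetical, and a real project would need to add metadata, access control and workflow on top of something like this.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: one row per digital asset, pointing at a single
# master copy and recording when it was last edited.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE asset (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        master_path TEXT NOT NULL UNIQUE,  -- exactly one master copy per asset
        last_edited TEXT NOT NULL          -- ISO 8601 date
    )
    """
)

today = date(2003, 6, 1)  # fixed date so the example is reproducible
conn.executemany(
    "INSERT INTO asset (title, master_path, last_edited) VALUES (?, ?, ?)",
    [
        ("News", "masters/news.xml", "2002-10-15"),
        ("Collection guide", "masters/guide.xml", "2003-05-20"),
    ],
)


def stale_assets(conn, today, max_age_days=183):
    """Return titles of assets not edited within max_age_days of today."""
    cutoff = (today - timedelta(days=max_age_days)).isoformat()
    rows = conn.execute(
        "SELECT title FROM asset WHERE last_edited < ? ORDER BY title",
        (cutoff,),
    )
    return [title for (title,) in rows]


print(stale_assets(conn, today))
```

The UNIQUE constraint on master_path is one simple way of enforcing a single master copy; it does nothing for workflow, which is the gap the database option leaves open.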
Advice on selecting scanning equipment can be found in the Digitisation Process information paper available at:
Suitable resolutions for digital master files for various media types are discussed in the HEDS Matrix, and the JIDI Feasibility Study contains a useful table of baseline standards of minimum values of resolutions according to original material type.
A detailed discussion of resolution, binary and bit depth can be found on TASI's Web pages, and a good basic guide to colour capture can also be found on the EPIcentre Web pages.
References refer to :
Also both the HEDS and TASI sites, particularly at:
5. Which is better, SQL Server or an Access database?
MS Access was designed as a database system for small scale office use. It was not designed for use as a database server, although it can be used in this mode for simple use.
For further information we suggest you look at the Microsoft Web site.
Some comments on Access vs SQL Server are available at:
You may also wish to look at:
Although Access may be capable of handling the sorts of query volume you suggest, at least in the short term, you do need to consider scalability (SQL Server scales better), Web site integration (SQL Server *probably* integrates better) and enterprise access to the data (SQL Server will better enable intranet access to the data, etc.).
Data structures are unlikely to be affected by a move to SQL Server.
6. Would NOF recommend the web site to be hosted on its own dedicated server or on a shared server? What are the things I should bear in mind to take such a decision?
The issues to be considered in making this decision relate to performance, security and potential conflicts between software applications:
Performance - with a shared server, projects will need to ensure that the performance of their service is not impacted by the other things the server is doing. Peak access times (of both the project's service and the other services on the shared machine) need to be considered. Projects need to think about the performance of the server itself, as well as available network bandwidth to the machine.
Security - as a general rule, the more services offered by a machine, the harder it is to make that machine secure. Projects should ensure that any machine they use is operated in a secure manner.
Conflicts - on a shared server there are more likely to be software conflicts, e.g. to run package X, package Y needs to be installed, but this conflicts with package Z that is already installed for some other service. There are also issues associated with hosting more than one Web server on a single machine. Typically these are resolved by hosting multiple 'named virtual hosts' (though under some operating systems it is also possible to assign multiple 'virtual' IP addresses to a single network interface, or to install multiple network interfaces). Where 'named virtual hosts' are used it should be noted that the browser must support HTTP 1.1. However, this is not a significant problem. The Apache manual says:
"The main disadvantage [of name-based virtual hosts] is that the client must support this part of the protocol. Almost all browsers do, but there are still tiny numbers of very old browsers in use which do not."
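The dependence on HTTP 1.1 arises because a name-based virtual host is selected from the Host header that the browser sends with each request. The Python sketch below illustrates the idea only; it is not how Apache is implemented, and the host names and document roots are invented for the example. It also ignores IPv6 address literals.

```python
# Hypothetical site names and document roots, for illustration only.
VIRTUAL_HOSTS = {
    "www.project-a.example": "/srv/project-a",
    "www.project-b.example": "/srv/project-b",
}
DEFAULT_ROOT = "/srv/default"


def parse_host_header(raw_request: str):
    """Return the lower-cased host name from the Host header, or None."""
    header_lines = raw_request.split("\r\n")[1:]  # skip the request line
    for line in header_lines:
        if line == "":
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            # Drop an optional port, e.g. "www.project-a.example:8080".
            return host.rsplit(":", 1)[0] if ":" in host else host
    return None


def document_root(raw_request: str) -> str:
    """Choose a document root from the Host header, falling back to a default."""
    host = parse_host_header(raw_request)
    return VIRTUAL_HOSTS.get(host, DEFAULT_ROOT)


request = "GET /index.html HTTP/1.1\r\nHost: www.project-b.example\r\n\r\n"
print(document_root(request))
```

A request from a very old browser that sends no Host header cannot be matched to a named site and falls through to the default, which is the disadvantage the Apache manual describes.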
7. Should the web server be protected by a firewall? (Or is it enough to have a firewall installed on our office network server where we will store all our digital mastercopies?)
As the technical standards state (section 3.1.4):
"Machines should be placed behind a firewall if possible, with access to the Internet only on those ports that are required for the project being delivered."
This applies to all machines used to deliver the project. Projects are strongly encouraged to protect all machines used to deliver material (Web servers and back-end master storage servers) with a firewall.
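A quick check of which ports a machine actually exposes can help confirm that a firewall is doing what the standards ask. The Python sketch below is an assumption-laden illustration: the function names are invented, it tests TCP only, and it should only ever be pointed at machines you are responsible for.

```python
import socket


def open_ports(host: str, ports, timeout: float = 0.5):
    """Return the subset of ports on host that accept a TCP connection."""
    reachable = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                reachable.append(port)
    return reachable


def unexpected_ports(reachable, required):
    """Return reachable ports that the project does not need exposed."""
    return sorted(set(reachable) - set(required))


# Hypothetical result of a check from outside the firewall: the project
# only needs HTTP (80) and HTTPS (443) to be reachable.
print(unexpected_ports([22, 80, 443, 3306], required=[80, 443]))
```

To be meaningful, open_ports needs to be run from a machine outside the firewall, since a check from inside the protected network says little about what the Internet can reach.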
8. Should we require back ups from the web hosting organisation? (We will have all back ups of mastercopies on our own network server in the office.)
Projects will need to be able to recover their Web service in the event of server failure, disk failure, or malicious hacking. Backups therefore need to be taken of all files that need to be restored to recover a service. I would anticipate that, in most cases, this means that projects will need to take backups of more than just master copies.
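As a rough illustration of backing up more than the master copies, the Python sketch below bundles several files and directories (for example Web pages, server configuration and a database dump) into one timestamped archive. The function name and archive naming are assumptions made for this example, and it does not cover off-site storage or testing that a restore actually works.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def back_up(sources, destination_dir, label="webservice"):
    """Write the given files and directories into one timestamped .tar.gz."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive_path = destination_dir / f"{label}-{stamp}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        for source in map(Path, sources):
            if not source.exists():
                raise FileNotFoundError(f"backup source missing: {source}")
            # arcname stores each source under its own name rather than
            # its absolute path; sources must therefore have distinct names.
            archive.add(source, arcname=source.name)
    return archive_path
```

A call might look like back_up(["/var/www/htdocs", "/etc/httpd/conf", "/var/backups/db.sql"], "/mnt/backup"), where all of those paths are placeholders for whatever a particular service needs in order to be rebuilt.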
9. What type of connection and recommended speed should the web server have to the local ethernet (eg 100Mbits/sec)?
It is not possible to give a single answer to this kind of question. The issue is ensuring sufficient bandwidth, given anticipated levels of traffic. Traffic levels will depend on numbers of users and the kind of material being accessed. A project that anticipates 10 concurrent users accessing text-based material will have significantly lower bandwidth requirements than a project anticipating 100 concurrent users accessing streaming video.
Access performance should be 'reasonable' for all resources served by the project, but it is difficult to provide guidance currently on what 'reasonable' means. Available bandwidth at the server end of less than 56 kbit/s for any individual end-user is likely to have an impact on their perception of server performance. Image or video projects will probably want to aim much higher than this. Available bandwidth per user is total bandwidth divided by the total number of concurrent users (but remember to allow for bandwidth being used for other things, e.g. some public library/council networks will have bandwidth reserved for CCTV, administrative computing and so on).
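The calculation described above can be sketched in a few lines of Python. The link speed and reserved figure are invented for illustration, and the function name is an assumption; only the divide-by-users rule and the 56K threshold come from the answer above.

```python
def bandwidth_per_user_kbps(total_kbps, reserved_kbps, concurrent_users):
    """Bandwidth left for each user after reserved traffic is set aside."""
    if concurrent_users <= 0:
        raise ValueError("concurrent_users must be positive")
    available = max(total_kbps - reserved_kbps, 0)
    return available / concurrent_users


# Hypothetical figures: a 2 Mbit/s link with 512 kbit/s reserved for other
# traffic such as CCTV or administrative computing.
TOTAL_KBPS = 2000
RESERVED_KBPS = 512
THRESHOLD_KBPS = 56  # the per-user figure mentioned in the text

for users in (10, 100):
    share = bandwidth_per_user_kbps(TOTAL_KBPS, RESERVED_KBPS, users)
    verdict = "ok" if share >= THRESHOLD_KBPS else "below threshold"
    print(f"{users} users: {share:.1f} kbit/s each ({verdict})")
```

With these assumed figures, 10 concurrent users each get well over the threshold while 100 users fall far below it, which is why the anticipated number of users matters as much as the raw speed of the connection.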
As part of the Technical Evaluation process run by BECTa, the software that each project is using is recorded.
The NOF technical standards and guidelines require that all text is encoded in a way that makes it compatible with Unicode UTF-8. This allows for the simultaneous use of languages that deploy different (e.g. Roman and non-Roman) character sets, including many of the community languages being used by NOF-digi projects. Project managers need to be aware of what hardware / software is required in order to use Unicode. Basic information on Unicode is available from
Windows 2000 and XP currently both support Unicode, whereas earlier versions do not. However, *applications* running on the earlier Windows operating systems can still support Unicode.
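To show what UTF-8 compatibility means in practice, the Python sketch below encodes sample words in several scripts to UTF-8 bytes and back again. The words and the choice of languages are purely illustrative and are not taken from the NOF standards.

```python
# Sample words in several scripts; the selection is illustrative only.
samples = {
    "English": "Library",
    "Welsh": "tŷ",
    "Greek": "Βιβλιοθήκη",
    "Hindi": "पुस्तकालय",
}

for language, text in samples.items():
    encoded = text.encode("utf-8")     # str -> bytes
    decoded = encoded.decode("utf-8")  # bytes -> str
    assert decoded == text             # UTF-8 round-trips without loss
    print(f"{language}: {len(text)} characters, {len(encoded)} bytes")

# A legacy single-byte encoding cannot hold all of these scripts at once.
try:
    samples["Hindi"].encode("latin-1")
except UnicodeEncodeError:
    print("Hindi text cannot be encoded as Latin-1")
```

The byte counts differ from the character counts for the non-ASCII samples because UTF-8 uses between one and four bytes per character, which is what lets one file mix Roman and non-Roman scripts.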
The upcoming Mac OS 10.2 is alleged to have better Unicode support than
previous versions. For information on Unicode and previous Mac operating
Some web browsers are better than others at reading community language scripts, but any browser which claims to support HTML4 should be able to support Unicode. Overall, Mozilla, the open-source browser, is the preferred choice, followed by Netscape Navigator, then Internet Explorer. See http://www.alanwood.net/unicode/browsers.html for more details on how browsers need to be configured to read Unicode.
It is necessary to obtain a Unicode font in order to display the different character sets. For a list of the different Unicode fonts, Alan Wood's site is again a good source of information (see http://www.alanwood.net/unicode/fonts.html). Often a Unicode font comes embedded within particular applications. Many PCs have Microsoft's Arial Unicode font installed along with their copy of Microsoft Office 2000. Those without Office 2000 used to be able to download this font for free, but the font was removed from the Microsoft website in August 2002, leaving no suitable free Unicode font in existence. For a further discussion on this see http://slashdot.org/comments.pl?sid=38224&cid=4092943.
Many developers employing Unicode, however, prefer to use one of two software packages which act as multi-character set text editors. These come with their own rudimentary Unicode fonts. Unipad, available for free, can be downloaded from http://www.unipad.org/, with versions for Windows 95 and above. A trial version of Uniedit, which should run on Windows 3.1 and above, is available from http://www.humancomp.org/uniintro.htm. Both programs cater for built-in keyboards and a wide variety of character sets - although some of these sets require further downloading from the relevant websites.
UKOLN is funded by MLA, the Museums, Libraries and Archives Council, the Joint Information Systems Committee (JISC) of the Higher and Further Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
This page is part of the NOF-digi technical support pages http://www.ukoln.ac.uk/nof/support/.
The Web site is no longer maintained and is hosted as an archive by UKOLN.
Page last updated on Monday, May 09, 2005