PRIDE Requirements and Success Factors
Work Package 2 of Telematics for Libraries project PRIDE (LB 5624)
As a service for the end users, PRIDE will inevitably be concerned with the design of its user interfaces. This section presents user interface designs in the information retrieval area.
The interfaces for search systems are closely related to the kinds of requests they can handle. Two fundamental approaches form the basis of today's user interfaces: query-based interfaces, in which the user types a request, and directory-based interfaces, in which the user browses a subject hierarchy.
The syntax of queries in a query-based interface varies from system to system. Two common alternatives, called simple search and advanced search, are present in most search engines.
Simple search consists of entering one or more keywords, separated by spaces, in the query entry field. Simple search accepts the system's defaults and, in large databases, usually retrieves too many or irrelevant documents. However, in small databases of Web documents and in subject directories, simple search is usually the best approach.
Advanced search has more sophisticated query syntax and is often supplemented by additional controls (pull-down menus, checkboxes, etc.). Types of functionality provided by advanced search include:
The most commonly used search engines today are World Wide Web based systems, which interact with their users by means of Web forms (described in section 22.214.171.124) and search mainly WWW and USENET (network news) documents. The user interfaces of a number of popular search engines are described below as examples.
AltaVista is the largest Internet search engine, searching over 140 million web pages. It is query-oriented. The basic (simple) search can use + and - prefixes to include or exclude terms from the search, and quotes to ensure adjacency. If a word contains at least one capital letter, it is matched case-sensitively; otherwise it is matched case-insensitively. Wildcards (*) are allowed. The advanced search allows AND, OR, AND NOT, NEAR (within 10 words), and parentheses to form and group Boolean expressions, as well as limiting by dates. It has many new features, including search by specific language, rudimentary language translation, and sophisticated techniques for refining searches and ranking results.
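The simple-search conventions described above (+ and - prefixes, quoted phrases, and the capital-letter rule) can be illustrated with a minimal sketch. The function names `parse_simple_query` and `matches` are hypothetical, matching is done by plain substring test, and wildcards are ignored; this shows only the behaviour as described here, not the engine's actual implementation.

```python
import re

# An optional +/- prefix followed by either a quoted phrase or a bare word.
_TERM = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_simple_query(query):
    """Split a query into (prefix, text, case_sensitive) tuples."""
    terms = []
    for prefix, phrase, word in _TERM.findall(query):
        text = phrase or word
        if not text:
            continue
        # A term with at least one capital letter is matched case-sensitively.
        case_sensitive = any(ch.isupper() for ch in text)
        terms.append((prefix, text, case_sensitive))
    return terms


def matches(terms, document):
    """Check the + (must occur) and - (must not occur) constraints.

    Unprefixed terms are optional here; a real engine would presumably
    use them for ranking (an assumption, not stated in the text).
    """
    for prefix, text, case_sensitive in terms:
        haystack = document if case_sensitive else document.lower()
        needle = text if case_sensitive else text.lower()
        found = needle in haystack  # simplification: substring match
        if prefix == "+" and not found:
            return False
        if prefix == "-" and found:
            return False
    return True


terms = parse_simple_query('+river -"joaquin phoenix" Phoenix')
print(terms)
print(matches(terms, "The river runs through Phoenix"))
```

In this sketch the quoted phrase is kept as a single term, so adjacency follows from matching the phrase as one string.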
Infoseek searches 30 million or more well-chosen web pages and offers special searches of newswires, companies, newsgroups, and shareware. Infoseek uses the following conventions:
Consecutive capitalised words (e.g., River Phoenix) must stay consecutive. Use commas to break up capitalised terms that shouldn't be consecutive. Double quote marks around a phrase also force the terms to be consecutive.
Hyphenated words are required to be within one word of each other in either order. Square brackets force the bracketed words to be within 100 words of each other.
A plus (+) or minus (-) immediately before a word forces the word to be included or excluded, respectively (as in AltaVista).
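The adjacency and proximity conventions listed above can be sketched as two small checks over a tokenised document. The helper names are hypothetical, and "within N words" is interpreted here as a difference of at most N token positions, which is an assumption; the exact counting rule is not stated in the text.

```python
def positions(tokens, word):
    """Return every index at which word occurs in tokens."""
    return [i for i, token in enumerate(tokens) if token == word]


def within(tokens, a, b, max_distance):
    """True if a and b occur at most max_distance positions apart, in either order."""
    return any(
        abs(i - j) <= max_distance
        for i in positions(tokens, a)
        for j in positions(tokens, b)
    )


def consecutive(tokens, words):
    """True if words occur next to each other in the given order."""
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


tokens = "the phoenix river flows past river phoenix fans".split()

# Capitalised or quoted phrase: terms must stay consecutive.
print(consecutive(tokens, ["river", "phoenix"]))
# Hyphenated pair: within one word of each other, in either order.
print(within(tokens, "flows", "river", 1))
# Square brackets: within 100 words of each other.
print(within(tokens, "the", "fans", 100))
```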
Excite searches over 50 million web pages, Usenet newsgroups, web site reviews, newspaper and magazine news sources. Excite can search by synonyms, using its built-in thesaurus, as well as by exact words. It supports +, -, AND, OR, AND NOT, and parentheses in Boolean expressions. Excite can use feedback and do a "More Like This"-style follow-on query-by-example search.
Lycos is another large search engine, searching over 50 million web pages as well as gopher and ftp sites. It has some very sophisticated features for controlling proximity and sequencing of search terms. It searches for graphics or sound files as separate choices. The basic search treats all terms as ORed, but gives preference to results with the most hits. The Custom Search allows you to AND all terms, OR all terms, or request documents with at least a selected number of terms. It also allows you to limit or include lower scoring returns.
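The difference between the basic search (all terms ORed, with preference given to results with the most hits) and the Custom Search option requiring at least a selected number of terms can be sketched as follows. The function names are hypothetical, and "hits" is taken to mean the number of distinct query terms found in a document, which is an assumption about the ranking.

```python
def count_hits(terms, document):
    """Count how many distinct query terms occur in the document."""
    words = set(document.lower().split())
    return sum(1 for term in terms if term.lower() in words)


def basic_search(terms, documents):
    """OR all terms; documents matching more terms are listed first."""
    scored = [(count_hits(terms, doc), doc) for doc in documents]
    hits = [(score, doc) for score, doc in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps input order on ties
    return [doc for _, doc in hits]


def custom_search(terms, documents, min_terms):
    """Keep documents containing at least min_terms of the query terms.

    min_terms == len(terms) corresponds to ANDing all terms,
    min_terms == 1 to ORing them.
    """
    return [doc for doc in documents if count_hits(terms, doc) >= min_terms]


docs = ["dogs only", "cats and dogs", "birds"]
print(basic_search(["cats", "dogs"], docs))
print(custom_search(["cats", "dogs"], docs, 2))
```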
DejaNews searches past and present newsgroup articles. It is one of the best ways to find a brand new web site that other engines have not yet indexed, because newsgroup articles themselves include information and point to many resources that might not be found through web searches. It offers both a basic and a power search. The power search lets you control whether any or all terms must appear in the results, and offers various controls over the format of the returned information. It supports full Boolean operators (AND, OR, AND NOT and parentheses). Proximity is supported by placing a caret (^) between terms. Wildcards (?, *) can be used. Braces are used to denote a range; for example, {a,c} denotes the range from a to c.
Yahoo is a directory-based engine. It is the biggest of the subject-organised directories and has been widely imitated. Its directory is carefully populated with cross-links to make navigation easier. The query-based search facility in Yahoo plays an auxiliary role and is limited to the Yahoo directory.
Meta-search engines, which combine searches of a number of the individual engines, have distinct limitations. They generally use only the most basic search level of each of the major engines, and do not handle complex search constructs well. They are improving, though. For instance, Metacrawler (<URL:http://www.metacrawler.com>) offers three levels of term grouping ("any", "all", and "phrase") and a selection of five search domains (Web, newsgroups, files, etc.).
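To illustrate why meta-search engines are confined to basic constructs, the sketch below translates the three term groupings into a simple query string that an underlying engine might accept. The function `to_engine_query` is hypothetical, and the target syntax is assumed to be the + prefix and quoted-phrase conventions described earlier for simple search; it does not reflect how Metacrawler itself builds its queries.

```python
def to_engine_query(terms, grouping):
    """Translate a term grouping into an assumed simple-search query string."""
    if grouping == "any":
        # Unprefixed terms: any of them may occur.
        return " ".join(terms)
    if grouping == "all":
        # A + prefix requires each term to be present.
        return " ".join("+" + term for term in terms)
    if grouping == "phrase":
        # Quotes keep the terms adjacent and in order.
        return '"' + " ".join(terms) + '"'
    raise ValueError("unknown grouping: " + grouping)


for grouping in ("any", "all", "phrase"):
    print(grouping, "->", to_engine_query(["digital", "library"], grouping))
```

Constructs such as NEAR or nested Boolean expressions have no counterpart in this lowest-common-denominator syntax, which is consistent with the limitation noted above.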
1999-01-22, PRIDE Requirements and Success Factors