Search Catalogue Documentation

This document describes how to create and maintain MS SiteServer Search catalogues. We assume readers are generally familiar with the features of SiteServer Search, with creating new catalogue instances using the Microsoft wizards, and with other basic functionality. The information here supplements the Microsoft documentation and focuses on aspects of Search that we feel deserve extra attention.

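In this document we cover getting Search to work with DC metadata, setting up Category Search, and configuring the Category Search JavaScript interface.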

Getting Search to Work with DC Metadata

Dublin Core metadata tags take the form DC.Name: the prefix DC is followed by a dot and then by the name of the tag. Search does not support dots in tag names; in fact, it does not support most punctuation characters. To work around this limitation, carry out the following steps.

NOTE: If the catalogue definition already contains meta tags with "invalid" characters, the following procedure must be carried out before new tags can be entered or old tags modified. The Catalog Schema interface does not allow any operation on the schema while it contains meta tags with "invalid" characters, so the old tags must be deleted first.

  1. Find the file Schema.txt in the directory C:\Microsoft Site Server\Data\Search\Projects\xxxx\Build\, replacing xxxx with the name of the catalogue (e.g. ExploitCatalogue), and open it in a text editor. (Note: we assume that a Custom Schema was used to create the catalog.)
  2. Find the blocks of code containing the tags with "invalid" characters and cut them out of the file. Save the file and exit the editor.

To enter new tags follow these steps:

  1. In the SS3.0 MMC, open the Search branch and find the search catalogue that you wish to modify. In the Catalogue Build Server branch, open the Schema and add the desired tag to the catalog, replacing each full stop with an underscore or some other neutral, acceptable character. Select HTML Meta for Property Set. For Property ID give a unique name, e.g. DC$, where $ is a number. Set both the Retrieve and Index radio buttons to Yes.
  2. Repeat Step 1 for each Dublin Core tag. Save this as a Custom Schema. When done, close the MMC.
  3. Open the file C:\Microsoft Site Server\Data\Search\Projects\xxxx\Build\Schema.txt in a text editor, where xxxx stands for the name of the catalogue. Find the lines containing the tags you have just entered and, in each DC tag, replace the neutral character with a full stop. Save the schema file and exit.
  4. Rebuild the catalog. To search for tag content in the catalog, type a query of the form: @"META_DC.Creator" Jake Brown. You will need to modify the search scripts to accept the extra form argument.
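
The renaming in step 1 and the query form in step 4 can be sketched in JavaScript as below. This is only an illustration: the function names toSchemaSafeName and buildMetaQuery are hypothetical and are not part of Site Server, and the underscore is assumed as the neutral character.

```javascript
// Hypothetical helpers for illustration; not part of Site Server.

// Replace every full stop with an underscore so that the tag name can be
// entered through the MMC Schema interface (step 1).
function toSchemaSafeName(dcTag) {
  return dcTag.replace(/\./g, "_");
}

// Build a query string of the form shown in step 4:
//   @"META_DC.Creator" Jake Brown
// Leading and trailing whitespace in the search text is removed.
function buildMetaQuery(dcTag, text) {
  var trimmed = text.replace(/^\s+|\s+$/g, "");
  return '@"META_' + dcTag + '" ' + trimmed;
}

console.log(toSchemaSafeName("DC.Creator"));
console.log(buildMetaQuery("DC.Creator", "Jake Brown"));
```

A search script could call a helper like buildMetaQuery with the tag name and the extra form argument to produce the query text.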

Setting Up Category Search

Categorisation of documents is used to restrict the scope of a search to a subset of files. In practice this means tagging files with metadata. Category-based searching requires a different configuration from the one described above:

  1. Open the DEFINECOLUMNS.TXT file in the c:\Microsoft Site Server\Data\Search\Config directory.
  2. Modify it to contain the following line for each meta tag you want in the catalog:
    myTag (DBTYPE_WSTR | DBTYPE_BYREF) = d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 "myTag"
    (NOTE: replace myTag with the name of the tag in both places. Only the second occurrence should be in quotes.)
  3. Modify the Schema.txt file in the same directory so that it contains the following block of code for each meta tag you want to catalog:
    <COLUMN
        name=myTag
        description="Some Descriptive Text"
        type="VT_LPWSTR"
        propid="d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1/myTag"
        index="yes"
        retrieve="yes"
    </COLUMN>
    NOTE: replace myTag with the appropriate string in both places, and put a short description of the tag in the "description" attribute.
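
When several tags have to be added, the two entries from steps 2 and 3 can be generated rather than typed by hand. The sketch below is a hypothetical helper, not a Site Server tool: the names META_PROPSET, defineColumnsLine and schemaColumnBlock are invented for illustration, and the output simply reproduces the templates shown in the steps above with the tag name substituted.

```javascript
// Property set GUID copied from steps 2 and 3 above.
var META_PROPSET = "d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1";

// Line for DEFINECOLUMNS.TXT (step 2): the tag name appears twice,
// and only the second occurrence is quoted.
function defineColumnsLine(tag) {
  return tag + " (DBTYPE_WSTR | DBTYPE_BYREF) = " + META_PROPSET + ' "' + tag + '"';
}

// Block for Schema.txt (step 3), laid out exactly as in the template above.
function schemaColumnBlock(tag, description) {
  return [
    "<COLUMN",
    "    name=" + tag,
    '    description="' + description + '"',
    '    type="VT_LPWSTR"',
    '    propid="' + META_PROPSET + "/" + tag + '"',
    '    index="yes"',
    '    retrieve="yes"',
    "</COLUMN>"
  ].join("\n");
}

console.log(defineColumnsLine("myTag"));
console.log(schemaColumnBlock("myTag", "Some Descriptive Text"));
```

The generated text still has to be pasted into the two files by hand; the helper only guards against typing the tag name inconsistently in the two places.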

We have used the Master Schema to facilitate Category Search.


Configuring Category Search JavaScript Interface

The Category Search interface is implemented in JavaScript and is split across two frames. The left frame is used to select a category from a hierarchical list. The right frame is used to enter a text query for searching. When a category is selected in the left frame, it is immediately echoed in the Search Category field in the right frame. The category hierarchy is created from a configuration file that is pulled in by Default.asp in the /cat_search directory using a Server Side Include. To pull in a different configuration file, edit line 52 of Default.asp.

The format of the category configuration file is as follows. Each category and subcategory must have the following form:

categoryTree[index] = ["level", "label", "tag name", "tag content search string"]

where:

The final configuration parameter associated with Category Search is in the file /global_defaults.ssi, which defines the variable holding the name of the catalogue used.
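
To illustrate the categoryTree entry format given earlier, here is a small sketch of a configuration with a sanity check. Everything in it is an assumption made for illustration: the sample labels, tag name and search strings are invented, "level" is assumed to be a nesting depth starting at 1, and the functions isValidEntry and renderTree are hypothetical and are not taken from the actual interface scripts.

```javascript
// Hypothetical sample entries in the form
//   ["level", "label", "tag name", "tag content search string"]
var categoryTree = [];
categoryTree[0] = ["1", "Science", "Category", "science"];
categoryTree[1] = ["2", "Physics", "Category", "science/physics"];
categoryTree[2] = ["2", "Biology", "Category", "science/biology"];
categoryTree[3] = ["1", "Arts", "Category", "arts"];

// An entry is considered well formed if it has exactly four string fields.
function isValidEntry(entry) {
  if (!entry || entry.length !== 4) return false;
  for (var i = 0; i < 4; i++) {
    if (typeof entry[i] !== "string") return false;
  }
  return true;
}

// Render the tree as indented text, ASSUMING "level" is a nesting depth
// that starts at 1 (two spaces of indent per extra level).
function renderTree(tree) {
  var lines = [];
  for (var i = 0; i < tree.length; i++) {
    if (!isValidEntry(tree[i])) throw new Error("Malformed entry at index " + i);
    var depth = parseInt(tree[i][0], 10);
    if (isNaN(depth) || depth < 1) throw new Error("Bad level at index " + i);
    var indent = new Array(depth).join("  ");
    lines.push(indent + tree[i][1]);
  }
  return lines.join("\n");
}

console.log(renderTree(categoryTree));
```

Running a check like this against a new configuration file before including it can catch entries with a missing field.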


Written by Adam Batenin
March 2000