Let me start you off on the right foot, so that you don’t waste time reading this if you don’t have to:
If you do not have much information on your intranet or website, or if that information is not especially important (for example, few people use it, and it does not really matter if they spend a long time looking for something), then we suggest that you use a free search engine such as Google Custom Search, JRank or PicoSearch.
On the other hand, if you manage intranet content for a medical institution, financial services company, law firm or other company with information that is of great importance, then we suggest that you take a few moments and read the following articles, written by Kevin Broccoli (CEO at BIM):
- Indexes: An Old Tool for a New Medium
- Intranet Design Magazine also published the following article, reproduced below:
In addition to the above-mentioned articles, Brian O’Leary (in Book: A Futurist’s Manifesto) had this to say:
BIM recently created a system that combines a search engine interface with a human-created index. Stated simply, the search engine is set to search the index instead of every page of an intranet.
The following outlines the four basic steps that must be performed in order to have accurate information retrieval:
(1) The search engine must be set to search only keywords that are assigned to each Web page. This means that someone has to read through each page of the site, and then decide which words most accurately reflect the contents of that page. The text must be carefully examined for concepts that are implied, even if the word itself is not used within the text.
(2) A custom-designed thesaurus must be developed, based on the terminology used on the web site as well as within the specific industry or profession. The thesaurus lists not only terms that mean the same thing as the chosen term, but also broader and narrower terms for it. An example of this (for a medical web site) would be the phrase “root canal.” A broader term for this phrase would be “dentistry.”
(3) The search engine is set to recognize synonymous terms. If a user types in a certain word, the search retrieves every page tagged with a word that means the same thing.
(4) Then the individual assigning the keywords uses the site thesaurus as a guide in assigning terms to the various pages. She or he assigns broader and narrower terms to the principal keywords as called for.
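The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BIM's actual system: the thesaurus entries, page URLs, the synonym “endodontic therapy,” and the names `ThesaurusEntry`, `preferred_term` and `search` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One preferred term with its synonyms and broader/narrower terms."""
    synonyms: set[str] = field(default_factory=set)
    broader: set[str] = field(default_factory=set)
    narrower: set[str] = field(default_factory=set)


# Step 2: a hypothetical, hand-built thesaurus for a medical site.
THESAURUS = {
    "root canal": ThesaurusEntry(
        synonyms={"endodontic therapy"},
        broader={"dentistry"},
    ),
    "dentistry": ThesaurusEntry(narrower={"root canal"}),
}

# Steps 1 and 4: keywords assigned to each page by a human indexer,
# including the broader term where the thesaurus calls for it.
PAGE_KEYWORDS = {
    "/procedures/root-canal.html": {"root canal", "dentistry"},
    "/departments/dental.html": {"dentistry"},
    "/cardiology/overview.html": {"heart"},
}


def preferred_term(query: str) -> str:
    """Step 3: map a query (or one of its synonyms) to the preferred term."""
    q = query.strip().lower()
    for term, entry in THESAURUS.items():
        if q == term or q in entry.synonyms:
            return term
    return q


def search(query: str) -> list[str]:
    """Search only the assigned keywords, never the full page text."""
    term = preferred_term(query)
    return sorted(url for url, kws in PAGE_KEYWORDS.items() if term in kws)
```

In this sketch, `search("endodontic therapy")` returns the root canal page even though the indexer never tagged it with that phrase, and `search("dentistry")` returns both dental pages because the broader term was assigned to the root canal page in step 4.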
A fifth step can be used to create a super-precise information retrieval system. This step is the creation of sub-categories to go along with each keyword. Let’s illustrate this with our medical web site example:
Using the first four steps, we created a search system which allows visitors to find all of the information dealing with the human heart. However, if the web site is quite large, the search engine might still retrieve 20 or more pages that have to do with the heart. Although each result usually comes with a title and a summary paragraph showing what the page is about, these can be wordy and hard to scan. By creating sub-categories for the word “heart” (such as “heart disease,” “heart surgery,” “parts of the heart,” etc.) and displaying these sub-topics as search results, web site visitors can quickly and easily identify the topics that fit their needs. There is no need for Boolean operators or any special skills on the part of the user.
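As a rough sketch of this fifth step, the index can store a sub-category alongside each keyword, and the results can be grouped by sub-category before they are displayed. The index entries, URLs and the function name `grouped_results` below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical index entries: (keyword, sub-category, page URL).
INDEX_ENTRIES = [
    ("heart", "heart disease", "/cardiology/coronary-artery-disease.html"),
    ("heart", "heart disease", "/cardiology/heart-failure.html"),
    ("heart", "heart surgery", "/surgery/bypass.html"),
    ("heart", "parts of the heart", "/anatomy/valves.html"),
]


def grouped_results(keyword: str) -> dict[str, list[str]]:
    """Group the pages indexed under a keyword by their sub-category."""
    wanted = keyword.strip().lower()
    groups = defaultdict(list)
    for kw, subcategory, url in INDEX_ENTRIES:
        if kw == wanted:
            groups[subcategory].append(url)
    return dict(groups)


# Show the sub-topics (with page counts) instead of a flat list of pages.
for subcategory, urls in grouped_results("heart").items():
    print(f"{subcategory} ({len(urls)})")
```

The visitor sees three short sub-topics rather than a long list of page summaries, and picks the one that fits without typing a more elaborate query.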
By employing the above methods, users can find all of the information that they are looking for without having to wade through hundreds (or even thousands) of irrelevant documents.
Who should assign the keywords?
There is always the temptation to use content writers to assign the keywords. The thought is that they are already writing the page, so why not have them stick some terms in the keyword tags of the HTML page so that it can be used with a search engine later on? This inevitably results in inconsistent keyword tagging, missing terms that searchers might use, and (once again) user frustration.
BIM has experience tagging documents with keywords, having done so for medical and financial institutions.