From `everything' we can select sites based on keywords. The keywords
are part of the user's question.
Although it might seem counterproductive, as soon as the keywords are
entered, the search engine starts producing a list of words
which may be related to the words given by the user.
Relationships to other words can be made in various ways. A list
of synonyms would be great (but a lot of work to produce). Words which are
often found on the same page as the words from the query can also be used.
Combinations of words made by other users in previous requests would be
extremely helpful.
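The second approach, words often found on the same page as the query words, can be sketched as follows. This is only a minimal illustration, not the engine's actual method: the function name `suggest_related`, the whitespace tokenization, and the ranking by number of shared pages are all assumptions made for the example.

```python
from collections import Counter


def suggest_related(query_words, pages, top_n=5):
    """Suggest words that share a page with at least one query word.

    `pages` is an iterable of page texts. The name and the ranking rule
    (number of shared pages, ties broken alphabetically) are hypothetical.
    """
    query = {w.lower() for w in query_words}
    counts = Counter()
    for text in pages:
        words = set(text.lower().split())
        if words & query:
            # Count each co-occurring word once per page.
            counts.update(words - query)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]
```

Calling `suggest_related(["jaguar"], pages)` returns candidate related words, which the user can then accept or reject.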
The user has to group the words from the question and the suggestions
made by the spider into three categories: primary keywords (in the
interface represented by the color green), related words (blue), and
forbidden words (red).
The figure above shows how this works.
- The primary keywords. Initially, these are the words found in the
question from the user.
- The (probably) related words, as suggested by the search engine.
The user can promote them to primary keywords, remove them when they
are not related at all, or demote them to forbidden words.
- The forbidden words. These are useful to exclude pages which use a
word in the wrong sense, or to narrow down a subject.
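The three categories and the moves between them can be modelled in a few lines. The sketch below is illustrative only: the class names `Category` and `WordSelection` are hypothetical, and the rule in `matches` (a page must contain every primary keyword and no forbidden word) is an assumption, since the text does not say how primary keywords are combined.

```python
from enum import Enum


class Category(Enum):
    PRIMARY = "green"    # colors as used in the interface
    RELATED = "blue"
    FORBIDDEN = "red"


class WordSelection:
    """Hypothetical model of the user's grouping of words."""

    def __init__(self, question_words, suggestions):
        # Words from the question start out as primary keywords.
        self.words = {w.lower(): Category.PRIMARY for w in question_words}
        # Suggestions start out as related words.
        for w in suggestions:
            self.words.setdefault(w.lower(), Category.RELATED)

    def promote(self, word):
        if self.words.get(word) is Category.RELATED:
            self.words[word] = Category.PRIMARY

    def demote(self, word):
        if self.words.get(word) is Category.RELATED:
            self.words[word] = Category.FORBIDDEN

    def remove(self, word):
        if self.words.get(word) is Category.RELATED:
            del self.words[word]

    def matches(self, page_text):
        # Assumption: all primary keywords required, no forbidden word allowed.
        tokens = set(page_text.lower().split())
        primary = {w for w, c in self.words.items() if c is Category.PRIMARY}
        forbidden = {w for w, c in self.words.items() if c is Category.FORBIDDEN}
        return primary <= tokens and not (forbidden & tokens)
```

For a question about the animal "jaguar", the user might promote the suggestion "cat" and demote "car", so that pages about the car brand are excluded.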
For each word, the user gets an overview of
- the number of sites which contain the word;
- the number of pages over all sites which contain the word; and
- the number of hits: the word-count over the whole Internet.
These are easy figures to produce, because the spider can count them
when it scans the pages found on the Net.
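A minimal sketch of how a spider could keep these three counts while scanning is given here. The function name `word_statistics`, the input shape (pairs of site name and page text), and the whitespace tokenization are assumptions for illustration.

```python
from collections import Counter, defaultdict


def word_statistics(crawled):
    """Compute per-word site, page, and hit counts.

    `crawled` is an iterable of (site, page_text) pairs; this input
    format is hypothetical.
    """
    hits = Counter()           # total occurrences of each word
    pages = Counter()          # number of pages containing each word
    sites = defaultdict(set)   # sites on which each word occurs
    for site, text in crawled:
        tokens = text.lower().split()
        hits.update(tokens)
        for word in set(tokens):
            pages[word] += 1
            sites[word].add(site)
    return {
        word: {"sites": len(sites[word]), "pages": pages[word], "hits": hits[word]}
        for word in hits
    }
```

Each scanned page is processed once, so the counts can be updated incrementally as the spider visits new pages.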