A Search Interface for my Questions
The Selection Process

Selection on Keywords

  From 'everything' we can select sites based on keywords. The keywords are part of the user's question. Although it might seem counterproductive, as soon as the keywords are entered the search engine starts producing a list of words that may be related to the words given by the user. Relationships to other words can be established in various ways. A list of synonyms would be great (but a lot of work to produce). Words that are often found on the same page as the words from the query can also be used. Combinations of words made by other users in previous requests would be extremely helpful.
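
As a rough illustration of the second idea (words often found on the same page as the query words), here is a minimal sketch. The function name `suggest_related`, the whitespace tokenization, and the tiny in-memory page list are all assumptions made for this example; the real spider would work from its own index.

```python
from collections import Counter

def suggest_related(query_words, pages, limit=5):
    """Rank words that appear on the same page as any query word.

    `pages` is an iterable of page texts (hypothetical input format).
    """
    query = {w.lower() for w in query_words}
    counts = Counter()
    for text in pages:
        words = set(text.lower().split())
        if words & query:
            # Count each co-occurring word once per page.
            counts.update(words - query)
    return [word for word, _ in counts.most_common(limit)]

# Made-up sample pages, for illustration only.
pages = [
    "jaguar car engine",
    "jaguar cat jungle",
    "jaguar car dealer",
    "python snake jungle",
]
print(suggest_related(["jaguar"], pages, limit=1))  # ['car']
```

Counting once per page rather than once per occurrence keeps a single long page from dominating the suggestions; that is a design choice of this sketch, not something the text prescribes.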

The user has to group the words from the question and the suggestions made by the spider into three categories: primary keywords (shown in green in the interface), related words (blue), and forbidden words (red).

The figure above shows how this works.

  • The primary keywords. Initially, these are the words found in the question from the user.
  • The (probably) related words, as suggested by the search engine. The user can promote them to primary keywords, remove them when they are not related at all, or demote them to forbidden words.
  • The forbidden words. Useful for excluding pages that use a word in the wrong sense, or for narrowing down a subject.
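
The three categories and the moves between them can be sketched as a small data structure. The class name `Selection`, its method names, and in particular the matching rule (a page must contain every primary keyword and no forbidden word) are assumptions for illustration; the text only states that forbidden words exclude pages.

```python
from dataclasses import dataclass, field

@dataclass
class Selection:
    primary: set = field(default_factory=set)    # green
    related: set = field(default_factory=set)    # blue
    forbidden: set = field(default_factory=set)  # red

    def promote(self, word):
        """Move a related word to the primary keywords."""
        self.related.discard(word)
        self.primary.add(word)

    def demote(self, word):
        """Move a related word to the forbidden words."""
        self.related.discard(word)
        self.forbidden.add(word)

    def remove(self, word):
        """Drop a suggestion that is not related at all."""
        self.related.discard(word)

    def matches(self, page_words):
        """Assumed rule: all primary words present, no forbidden word present."""
        words = set(page_words)
        return self.primary <= words and not (self.forbidden & words)

# Hypothetical session: the user asks about the animal, not the car.
sel = Selection(primary={"jaguar"}, related={"car", "cat", "jungle"})
sel.promote("cat")
sel.demote("car")
sel.remove("jungle")
print(sel.matches("jaguar cat habitat".split()))  # True
print(sel.matches("jaguar cat car".split()))      # False
```
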
For each word, the user gets an overview of
  • the number of sites which contain the word;
  • the number of pages over all sites which contain the word; and
  • the number of hits: the total number of occurrences of the word over the whole Internet.
These are easy figures to produce, because the spider can count them when it scans the pages found on the Net.
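
A minimal sketch of how these three figures could be accumulated while scanning, assuming the spider hands over each page as a (site, text) pair. The class `WordStats`, its method names, and the whitespace tokenization are hypothetical.

```python
from collections import Counter, defaultdict

class WordStats:
    def __init__(self):
        self.sites = defaultdict(set)  # word -> set of sites containing it
        self.pages = Counter()         # word -> number of pages containing it
        self.hits = Counter()          # word -> total number of occurrences

    def add_page(self, site, text):
        """Update all three figures for one scanned page."""
        words = text.lower().split()
        self.hits.update(words)
        for word in set(words):
            self.pages[word] += 1
            self.sites[word].add(site)

    def overview(self, word):
        """Return (sites, pages, hits) for a word."""
        word = word.lower()
        return len(self.sites.get(word, ())), self.pages[word], self.hits[word]

# Made-up sample data, for illustration only.
stats = WordStats()
stats.add_page("a.example", "jaguar car jaguar")
stats.add_page("a.example", "jaguar dealer")
stats.add_page("b.example", "jaguar cat")
print(stats.overview("jaguar"))  # (2, 3, 4)
```

All three counters are updated in a single pass over each page, which is why the figures come almost for free once the spider is reading the page anyway.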
