Improving the search
Here are ways to improve the user's experience when searching your data.
Include headings.
On first entering a collection of text in Words Close Together format, the user is shown a table of contents. Each line in a table of contents is actually a heading, enclosed by one of the six heading tag pairs allowed by HTML: <H1>...</H1>, <H2>...</H2>, <H3>...</H3>, <H4>...</H4>, <H5>...</H5>, and <H6>...</H6>. Think of the lower numbered headings as parents to the higher numbered "child" headings. <H1> is the most senior heading. Suppose level 1 is, for example, the name of an author. There might be several level 2 headings, each the name of a book by that author. Level 3 headings might be used for chapter headings within those books, and level 4 headings might be sections within chapters. They are arranged in a hierarchy.
Do you have to have a table of contents? No. Do headings and the resulting table of contents improve the search experience? Yes, in several ways:
- The headings are like an expandable roadmap through the collection. Click on the plus sign in front of any heading and you are shown the subheadings. That can be repeated as often as there are yet further subdivisions. Once you are at the least significant heading level, no more plus signs show.
- The more headings you put in the text document, the more the user has control of where to start reading or browsing.
- The words before a heading are treated as totally separate from the words just after it. That is what you want, because a search result that straddles a heading is almost always meaningless.
- Unlike most other search engines, Words Close Together takes headings into account in the search. A heading does, after all, tell something about the text that follows it.
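To check what table of contents your headings will produce before you publish a collection, you can pull the heading hierarchy out of the text yourself. The following is only a rough sketch using Python's standard html.parser module. The names HeadingOutline, outline, and render are made up for this illustration, and nothing here reflects how Words Close Together itself reads headings.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every <H1>..<H6> heading."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, heading text)
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <H2> arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def outline(html_text):
    """Return the headings of html_text in document order."""
    parser = HeadingOutline()
    parser.feed(html_text)
    parser.close()
    return parser.headings


def render(headings):
    """Indent each heading by two spaces per level below level 1."""
    return "\n".join("  " * (level - 1) + text for level, text in headings)
```

Calling render(outline(text)) on a document gives an indented list you can scan for headings that are missing or sit at the wrong level.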
Separate things that don't belong together.
Have you ever carried out a search only to find that the words you want are in the result, but they are totally unrelated to one another? The conventional Internet search engines frequently offer up catalog-style pages near the top of their result lists. The words are there, but they are in separate paragraphs or separate segments of text. That's not very helpful.
You can help your end user by inserting dividing lines between blocks of text that are unrelated. One way is to place an <HR> tag -- a horizontal rule -- between the text portions. Paragraph tags give the same effect, but they are more demanding: you have to ensure that each <P> tag and its corresponding </P> end tag are nested correctly with any other tag pairs present. The <HR> tag has no end tag.
You may not care to check continuous text for whether it needs these dividing lines. But catalog-style pages warrant this extra work. In some cases it can be automated. Example: If you are handling text that is dumped from a database file, it's very helpful to have the dump software automatically inject <HR> tags between separate units.
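As an illustration of that kind of automation, here is a minimal sketch of a dump routine that writes one block of text per database record and places an <HR> tag between blocks. It assumes each record is already available as a Python dictionary of field names and values. The function name dump_with_rules and the "field: value" layout are invented for this example.

```python
from html import escape


def dump_with_rules(records):
    """Render each record as escaped text and separate records with <HR>."""
    blocks = []
    for record in records:
        # One line per field; escape the data so a stray < or & in it
        # cannot be mistaken for markup.
        lines = [
            f"{escape(str(name), quote=False)}: {escape(str(value), quote=False)}"
            for name, value in record.items()
        ]
        blocks.append("\n".join(lines))
    # <HR> has no end tag, so it can simply be dropped between blocks.
    return "\n<HR>\n".join(blocks)
```

Because the rule goes between records rather than after each one, a dump of a single record contains no <HR> tag at all.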
Watch for long dashes disguised as hyphens.
Various word processors and other programs permit you to save files in text format. Some reduce a long dash to a single hyphen. The result is that two words that belong apart wind up as one hyphenated word. If a person searches on the exact spelling of one of the words, the hyphenated copy of it will be missed in the search. Search is improved if you catch instances of false hyphenation. You might substitute four characters (space hyphen hyphen space) for hyphens that started out as long dashes. That way, the result is no longer an unrecognized hyphenated word.
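The substitution is easiest to make before the long dash has been reduced to a hyphen, while it can still be told apart from a real hyphen. The sketch below assumes you have the text in a form where long dashes are still em dash characters (U+2014) or typed runs of two or more hyphens. The name protect_long_dashes is hypothetical. Once a long dash has already become a single hyphen, a pattern like this cannot tell it from a genuine hyphen, and those cases still need a human eye.

```python
import re

# An em dash (U+2014) or a run of two or more hyphens,
# together with any spaces or tabs already around it.
LONG_DASH = re.compile(r"[ \t]*(?:\u2014|-{2,})[ \t]*")


def protect_long_dashes(text):
    """Replace each long dash with space-hyphen-hyphen-space."""
    return LONG_DASH.sub(" -- ", text)
```

Single hyphens are left alone, so genuinely hyphenated words pass through unchanged, and running the function twice gives the same result as running it once.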
A related problem is words that are hyphenated at the end of lines. This occurs particularly in content prepared for presentation in columns, as in newspapers. That brings us to the next tip ...
Be thankful for spell checkers.
Better yet, use a spell checker.
Words that are not spelled correctly may be missed in a search. It is always a good idea to spell check content before including it in searchable text. Proper spell checking also makes your work look much more professional. Use the standard spell checker built into your word processing or other software.
It is true that the wild card feature in Words Close Together search lets you guess at correct spellings ... for example, entering "b*t*f*l*" will enable you to find the word "beautiful". (It will at the same time find "butterfly"!) But there is no assurance the user is comfortable with (or even knows about) the wild card feature. So it is best to have correct spellings as much as possible in the searchable text.
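To see why a wild card pattern can catch more than you intended, you can try one against a word list. This sketch uses Python's standard fnmatch module as a stand-in and assumes that "*" matches any run of characters, including none. That is an assumption made for illustration, not a description of the actual Words Close Together matching rules, and fnmatch also gives special meaning to "?" and "[...]". The function name wildcard_matches is made up.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, words):
    """Return the words matched by pattern, ignoring upper/lower case."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]
```

With the pattern "b*t*f*l*", this matches "beautiful", the misspelling "beutiful", and also "butterfly", which shows both how the wild card rescues a misspelled word and how it pulls in unrelated ones.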
Proximity Search .com | Search technologies by Marpex, Inc.