Digital Publications Indexing SIG, a Special Interest Group of the American Society for Indexing

HTML Indexing -- a few hints on what it’s all about
By L. Pilar Wyman

The two most important aspects of HTML indexing are that 1) it’s still indexing, and 2) you need to incorporate HTML code. That sounds simple and, in many ways, it is. But, as usual, the devil is in the details: any time you deal with technicalities and code, you need to be careful. Otherwise, assuming you are versed in quality indexing, there are no surprises.

In fact, several professional products and tools are available to assist you with the technicalities and code, leaving you free to focus on the most important part: the conceptual analysis, the indexing itself. As with any software product, this becomes truer the more versed you are in the tools themselves.

In brief, the top tools around to assist you with HTML indexing include:

  • HTML/Prep, available from Leverage Technology
  • RoboHelp, made by eHelp
  • HTML Indexer, developed by David Brown
  • HTML Help Workshop, from Microsoft

HTML/Prep works as an add-on to Cindex, a popular stand-alone indexing program, and converts index files to HTML files. You can also convert index files made with other indexing programs, such as SKY Index or Macrex (this requires an initial ASCII conversion step). If you use this program, you don't have to worry much about the HTML coding for entries, as that is provided for you. You do, however, need to figure out and specify exactly how you want your index presented: with letter lists on top, top and bottom, or alongside; with main headings separate for browsing, or not; class and style attributes for heading levels; tips to alert users where they are in the index; frames for certain target names; etc.
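
To make one of those presentation choices concrete, here is a minimal Python sketch of an index page with a letter list on top and a class attribute on the main headings. It is not HTML/Prep's output or method; the function name `build_index_html`, the `letter-list` and `main-heading` class names, and the sample entries are all assumptions made for illustration, and the sorting is a simple case-insensitive stand-in for real index sorting rules.

```python
from html import escape
from itertools import groupby


def build_index_html(entries):
    """Build an HTML fragment from (heading, href) pairs (hypothetical helper).

    Headings are assumed to be non-empty strings.
    """
    # Case-insensitive sort; real indexes follow more elaborate sorting rules.
    ordered = sorted(entries, key=lambda e: e[0].lower())
    # Group the sorted entries by the upper-cased first character.
    groups = [
        (letter, list(items))
        for letter, items in groupby(ordered, key=lambda e: e[0][0].upper())
    ]
    # Letter list on top: one link per letter that actually has entries.
    nav = " | ".join(f'<a href="#letter-{l}">{l}</a>' for l, _ in groups)
    parts = [f'<p class="letter-list">{nav}</p>']
    for letter, items in groups:
        # Each letter group gets an id so the letter list can jump to it.
        parts.append(f'<h2 id="letter-{letter}">{letter}</h2>')
        parts.append("<ul>")
        for heading, href in items:
            parts.append(
                f'  <li class="main-heading">'
                f'<a href="{escape(href, quote=True)}">{escape(heading)}</a></li>'
            )
        parts.append("</ul>")
    return "\n".join(parts)


if __name__ == "__main__":
    sample = [
        ("cross-references", "ch2.html#xref"),
        ("anchors", "ch1.html#anchors"),
        ("Cindex", "ch3.html#cindex"),
    ]
    print(build_index_html(sample))
```

Repeating the letter list at the bottom would only mean appending the same `nav` paragraph again after the last group.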

RoboHelp and HTML Help Workshop create Help-type indexes. You have probably seen this type of Help system on the Web or with a software package on your computer: you click on "Help" and a window pops up into which you can type a search word. In the large window below, an index of sorts appears, with the top-most line being whichever line in the index most closely matches your search word. You can also scroll through the other index lines. With this sort of utility, you essentially write your index, incorporate coding for the hot links, and then format the index so that it appears as a Help screen. It sounds simple, but the coding for the Help screen can be tricky, and, as usual, you need to make sure the coding for the entry links is accurate.
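
The "closest match" behavior of such a Help window can be approximated with a binary search over the sorted index lines. The Python sketch below is only a guess at the general idea, not how RoboHelp or HTML Help Workshop actually work; the function name `closest_index_line` and the sample headings are hypothetical, and the headings are assumed to be sorted case-insensitively already.

```python
from bisect import bisect_left


def closest_index_line(headings, typed):
    """Return the position of the index line to scroll to for the typed text.

    `headings` is assumed to be sorted case-insensitively (an assumption of
    this sketch). The result is the first line that sorts at or after the
    typed text, or the last line if the typed text sorts past the end.
    """
    if not headings:
        raise ValueError("the index has no lines")
    keys = [h.lower() for h in headings]
    pos = bisect_left(keys, typed.lower())
    return min(pos, len(keys) - 1)


if __name__ == "__main__":
    lines = ["anchors", "Cindex", "cross-references", "frames", "Help systems"]
    # Typing "cr" lands on the first line that sorts at or after "cr".
    print(lines[closest_index_line(lines, "cr")])
```

Because the lookup depends entirely on sort order, an index whose lines are sorted differently from what the viewer expects would send users to the wrong place, which is one reason the coding has to be accurate.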

HTML Indexer is designed for embedding index entries directly into HTML files. While you can embed index files via Cindex or other indexing programs, embedding into HTML files can be tricky: you need to modify your entry tags so that they are anchor tags and, very importantly, you need to embed coding into the source document to match those anchor tags. (For example, with HTML/Prep, above, you would need to go back to the core document files to embed your entries.) With HTML Indexer, embedding is done via a user-friendly GUI. (The icon is even a little blue anchor.) Index entries are thereby married to their Web pages upon creation, so any changes made to the Web pages will be reflected directly in the index entries, though you may need to re-compile the index after changing the Web pages to ensure complete accuracy.
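
Whatever tool you use, the requirement that every index link has a matching anchor in the source document can be checked mechanically. The following Python sketch uses the standard library's `html.parser` to list index links whose targets are missing. It is an illustration under stated assumptions, not a feature of HTML Indexer or HTML/Prep: the names `AnchorCollector` and `missing_targets` are hypothetical, links are assumed to have the form `page.html#fragment`, and pages are passed in as a dictionary of file name to HTML source.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor names found in one HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Accept id="..." on any element, or the older <a name="..."> form.
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def missing_targets(index_links, pages):
    """Return the index links whose page or anchor cannot be found.

    index_links: list of strings like "page.html#fragment".
    pages: dict mapping file name to HTML source text.
    """
    missing = []
    for link in index_links:
        page, _, fragment = link.partition("#")
        collector = AnchorCollector()
        collector.feed(pages.get(page, ""))
        if not fragment or fragment not in collector.anchors:
            missing.append(link)
    return missing


if __name__ == "__main__":
    pages = {
        "ch1.html": '<p><a name="cindex"></a>Cindex is ...</p>'
                    '<h2 id="anchors">Anchors</h2>',
    }
    links = ["ch1.html#cindex", "ch1.html#anchors", "ch1.html#frames"]
    print(missing_targets(links, pages))
```

Running a check like this after the Web pages change is one way to catch links that have come adrift from their anchors.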

Anchors are critical for HTML indexing, as they ensure your user can actually find the text source they are looking for. When printed books are indexed and page locators are used, a user needs to scan the page they are sent to in order to find the text source for the entry they are searching. Generally, print pages aren't that large, so this visual scanning is not onerous. However, an HTML document can be quite long, and you may not be able to see it all at once on your computer screen. Having an index entry anchored to the specific text source on the HTML page means that your user is sent directly to the text source itself, wherever in the page it may be -- no visual scanning is necessary to find the text source. This makes for a happy user.
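
In HTML terms, anchoring an entry means two matching pieces of code: a named target placed at the text source, and a link in the index that points to it. The Python sketch below generates such a pair for a given heading. The helper names `slugify` and `anchor_pair`, the way anchor names are derived from headings, and the sample file name are all assumptions for illustration; indexing tools may name and place anchors quite differently.

```python
import re
from html import escape


def slugify(heading):
    """Turn a heading into a simple anchor name (hypothetical naming scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
    return slug or "entry"


def anchor_pair(heading, page):
    """Return (target, link): the anchor for the page and the index link."""
    slug = slugify(heading)
    # Goes into the source document, right at the text being indexed.
    target = f'<a id="{slug}"></a>'
    # Goes into the index; clicking it jumps straight to the target.
    link = f'<a href="{escape(page, quote=True)}#{slug}">{escape(heading)}</a>'
    return target, link


if __name__ == "__main__":
    target, link = anchor_pair("Help systems, indexes for", "help.html")
    print(target)
    print(link)
```

The part after the `#` in the link must match the anchor name in the page exactly; that match is what sends the user to the text source rather than to the top of a long page.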

In sum, the technical aspects of HTML indexing fall into two groups. The first is adding and editing tags for entries, including anchor tags, so that users can click on index entries and be led directly to the source text. The second is formatting the index file for presentation with the HTML document; options include back-of-the-book type index files with clickable alpha group headers or top/bottom headers, Help file-type windows, frames, and other HTML document presentation options. This may sound like a lot, but it's all learnable, and the good news is that, as described above, there are tools to assist the indexer. It does mean that HTML indexing takes more time, from start to finish, than more traditional print indexing: you need to allow for the time required for the technical coding.

However, as with any indexing, the conceptual analysis the indexer provides to the users is most important. Analysis, term selection and topic identification, cross-referencing, double-posting, and all other critical components of quality indexing (bringing together disparate but related information, informing users of preferred terminology or related terms, providing analysis vs. long strings of locators, etc.) can – and should – prevail in HTML indexing. Get your technical details down so you can focus on the content, and your users will always return.

May your indexes always compile.

© 2001 L. Pilar Wyman
Wyman Indexing
Washington, DC
Tel/Fax: (443) 336-5497
