(Republished with permission from the author. Originally published November 17, 1998, in Contentious.com)
Indexes: An Old Tool for a New Medium
"I know it's here but I just can't find it!"
You've probably heard that exclamation in a variety of situations.
Today, however, it seems that people often experience this kind of frustration
when trying to locate specific information within HTML documents. This
is especially true concerning "content-rich" Web sites.
Perhaps you've had the following experience: You visit a Web site hoping to find information about a particular topic. You type a keyword or two into the site's search engine. What do you find? Nothing! The search engine says, "0 results have been found for your search."
So you try once more, this time using a different search term. Now you do get some results, but too many. The search engine now says, "47 documents have been retrieved." That's more than you wanted or expected.
Still, you start looking through those documents one at a time. After several hours spent scanning many pages of text, you discover that only 4 of those 47 documents contain the information you sought.
Exasperated, you wonder why so much information was presented to you, when so very little of it met your needs. You also wonder what might have happened had you not submitted that precise search term into the search engine.
The root of the problem lies in how search engines perform searches. Put
simply, they scan text looking for occurrences of whatever word you typed
into the search box. Then, they list every single document that contains
even the merest mention of the word.
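To make that behavior concrete, here is a minimal sketch of this kind of literal matching. It is an illustration only, not the code of any real search engine; the function name and the three sample pages are hypothetical.

```python
def naive_search(documents, term):
    """Return the title of every document that mentions the term at all."""
    term = term.lower()
    return [title for title, text in documents.items() if term in text.lower()]

# Hypothetical sample pages, invented for this illustration.
documents = {
    "Feeding your puppy": "A balanced diet matters. Good nutrition starts early.",
    "Grooming basics": "Brush weekly. (For nutrition advice, see another page.)",
    "Choosing a kennel": "Visit the kennel before you book.",
}

# A passing mention counts as much as a full discussion.
hits = naive_search(documents, "nutrition")

# A synonym the pages never use finds nothing, even though one page is about feeding.
misses = naive_search(documents, "food")

print(hits)
print(misses)
```

The grooming page is returned for "nutrition" even though it only mentions the word in passing, while "food" returns nothing at all. Those are the two failures described above: too many results, or none.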
What Makes a Good Index
The Internet is a relatively new medium, but you can learn a lot about how to make online content work well from the "parent" of online media: print media.
Most printed reference or nonfiction books offer an index of some kind. An index is not a blind, mechanical catalog of words. Rather, it is created by an indexer.
Indexers are trained to analyze concepts. An indexer will physically read every page of a book and develop a list of page references that lead to information on various topics, individuals, or places covered in that book.
The goal of an index is to direct readers to pertinent information on each topic listed, rather than to passing mentions. This requires the indexer to make many judgment calls; that is, to consider context as well as content.
Indexers also categorize concepts: they break down main subject headings into subtopics, in a hierarchical format. This structure helps readers "narrow" their search.
A well-written index assumes that the reader may not know specific terms used in the text. Therefore, an indexer will use a thesaurus to create index entries that are synonyms of the terms used within the text. This ensures that even if readers don't know the exact words used in a text, they still will be directed to pages that discuss the topic sought.
A well-written index also lists topics that are implied, rather than stated directly in the text. Consider the example of a book about dogs that does not include a section devoted to canine food or nutrition but that does discuss (in various places) the importance of feeding a dog properly, and also what vitamins and minerals are essential to canine health.
It is likely that readers would turn to this book seeking information on
dog food or nutrition, so the book's index should include the terms "nutrition"
and "food," with references to relevant pages.
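The structure described in this section (main headings broken into subtopics, plus synonym entries that point to the heading actually used) can be sketched as a small data structure. This is an illustrative sketch for the dog book example; the page numbers, subheadings, and the `look_up` helper are all hypothetical.

```python
# Hypothetical index for the dog book: each heading maps to subheadings,
# and each subheading maps to the pages where the topic is really discussed.
INDEX = {
    "nutrition": {
        "feeding schedules": [12, 48],
        "vitamins and minerals": [51, 97],
    },
}

# "See" references: terms a reader might try, pointing to the heading used.
SEE = {"food": "nutrition", "diet": "nutrition"}

def look_up(term):
    """Resolve a reader's term to a heading and its subentries, following 'see' references."""
    heading = SEE.get(term.lower(), term.lower())
    return heading, INDEX.get(heading, {})

heading, entries = look_up("Food")
print(heading)
for subheading, page_numbers in entries.items():
    print("  " + subheading, page_numbers)
```

A reader who looks up "food" is sent to "nutrition" and its subtopics, even though the book has no section by either name. The judgment about which pages belong under each entry still has to come from a person; the code only stores the result.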
Web Indexes vs. Book Indexes
Indexes obviously are useful and appropriate for books. However, they also can work well for Web sites. A Web site index offers the same benefits over a search engine that a book index offers over a concordance.
In some respects, the process of creating an index for a Web site is similar to creating an index for a book. For instance, a Web indexer will read through every page in the site, analyze the concepts discussed, and develop an index that lists the topics covered in the text.
One key difference between a book index and a Web index is hypertext.
In a Web index, the references listed can (and should) be live links that take the user directly to the relevant text in the site. Live links make a Web index not merely informative, but functional. Some examples of Web site indexes that utilize live links are:
Ideally, a Web site indexer should know how to modify the HTML code of Web pages, in order to create hyperlinks. Specifically, indexers should know how to create an "anchor" in the Web page where the text referenced in a particular index entry begins (if no anchor already exists at that location), and then make the index entry a live link to that anchor.
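As a rough sketch of those two pieces of markup, the snippet below generates an anchor and an index entry that links to it. The file name `health.html`, the anchor name `vitamins`, and both helper functions are hypothetical. The sketch assumes the `id` attribute for the anchor; older pages often use `<a name="...">` for the same purpose.

```python
from html import escape

def anchor_tag(anchor_id):
    """Markup to place in the page where the referenced text begins."""
    return f'<a id="{escape(anchor_id, quote=True)}"></a>'

def index_entry(label, page_url, anchor_id):
    """Markup for the index entry: a live link to that anchor."""
    href = escape(f"{page_url}#{anchor_id}", quote=True)
    return f'<a href="{href}">{escape(label)}</a>'

# Hypothetical page and anchor names, for illustration only.
print(anchor_tag("vitamins"))
print(index_entry("vitamins and minerals", "health.html", "vitamins"))
```

The first line of output goes into the content page; the second goes into the index page.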
Updating is an important issue for both print and online indexes. However, updating a Web index typically involves incremental maintenance. (Index updates for books are infrequent, major projects.)
Most Web sites evolve constantly, from minor modifications to small sections of text, to the addition or deletion of entire content sections. Also, existing content can be moved to a different page or directory within the site.
In order for a Web index to remain useful, it must keep pace with the site's evolution. Few things are more frustrating to a user than broken or outdated links in a site's own index.
Consequently, there should be regular, frequent communication between the
site's developers and the indexer. Whenever significant content is
modified, moved, added, or deleted, the indexer should be informed. Then,
the indexer should immediately update the index to reflect the current
state of content on the site.
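Part of that upkeep can be checked mechanically. The sketch below is one possible approach, not an established tool: it takes a list of index entries and reports those whose target page or anchor no longer exists. The page contents, entry list, and function names are hypothetical, and the pages are held in memory rather than fetched from a live site.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect every id (and legacy name on <a>) found in a page."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if value and (key == "id" or (tag == "a" and key == "name")):
                self.anchors.add(value)

def broken_entries(index_entries, pages):
    """Return the labels of index entries whose page or anchor is missing."""
    broken = []
    for label, target in index_entries:
        page, _, fragment = target.partition("#")
        if page not in pages:
            broken.append(label)
            continue
        collector = AnchorCollector()
        collector.feed(pages[page])
        if fragment and fragment not in collector.anchors:
            broken.append(label)
    return broken

# Hypothetical site content and index entries, for illustration only.
pages = {"health.html": '<h2 id="vitamins">Vitamins</h2><p>...</p>'}
index_entries = [
    ("vitamins and minerals", "health.html#vitamins"),
    ("feeding schedules", "health.html#feeding"),
    ("kennels", "boarding.html#kennels"),
]

stale = broken_entries(index_entries, pages)
print(stale)
```

A check like this only finds links that no longer resolve. It cannot tell whether a page that still exists still discusses the topic, so it supplements the communication between developers and indexer rather than replacing it.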
Is It an Index or Not?
A quick look around the Web reveals that the term "index" is much misunderstood by Web developers and publishers. In fact, most Web reference tools labeled "site index" are not indexes at all!
Most people know what an index is from having used one in a printed book. Therefore, when a visitor sees a link on your site that says "site index," he or she may click on that link expecting to encounter a real index. However, if that link leads to a different type of guide, it might cause confusion, frustration, or disappointment.
If the guide or reference tool you've created for your Web site is not a true index, it's helpful to your visitors if you call it by its correct name.
The site guides and tools described below are not indexes, but they commonly are mislabeled as such. Examples of sites that have made this mistake also are listed:
Sometimes it can be hard to tell whether a particular site guide is an index or some other kind of tool. For instance, at first glance the "index" of the Association for Health Services Research Web site appears to be a true index. It is ordered alphabetically, and some entries (such as "About AHSR") include subtopics.
However, this page is a sophisticated table of contents, not a true
index. All of its entries directly reflect the site's structure
(how information is divided into sections and pages). The list is not
really broken down by subject. For instance, while this list includes
entries for "Job and Resume Binder Order Form" and "Career
Center," there is no subject-based entry for "Jobs."
Not Every Site Needs an Index
Some types of Web sites would not benefit significantly from an index. For instance:
In contrast, many types of sites would serve their visitors better by offering an index. This is especially true of online magazines or other content-rich sites.
For example, 21st Century Online publishes articles by professionals in various disciplines. Although a reader can simply "drill down" through the current selection of articles on the site, this becomes increasingly difficult as more and more articles are published.
Even Wired (the online counterpart of Wired magazine) does not yet have a site index. However, an index would be especially helpful for finding specific information in this venue's four years' worth of archives.
Working with (or as) an Indexer
If you decide that your Web site needs an index, you then must decide whether to hire someone to create it, or whether to do it yourself.
If your site is very content-rich, you're probably better off hiring a professional indexer. This also could be a good decision for sites that are smaller or less complex, as long as the budget is available.
Remember: the goal of an index is to improve the usability of a Web site. Therefore, considering an indexer as a usability professional could help justify this investment.
However, if your site is not especially large, or if there is no budget to hire an indexer, or if you simply wish to learn a new skill, it is possible to teach yourself enough about the basics of indexing to attempt this project. A few resources that can help you learn how to create an index are:
Indexing also can be a lucrative line of work. Although most available indexing work is for print media (books, etc.), indexes are becoming increasingly common in online and digital media (Web sites, Intranets, CD-ROMs, etc.). For writers, editors, producers, or Web developers, indexing can be one more valuable service to market to your clients.
Whether your site has an index or not, or whether you learn to create indexes or not, learning about indexing can prove valuable to anyone who develops or uses Web sites.
Understanding indexes makes Web developers and publishers consider what their users would want to find, and how those searches could be simplified or aided. Similarly, Web users who understand the value of a good index can encourage Web publishers to add this key usability tool to their sites.
It's even possible that, one day, indexes might be considered as indispensable to informational or content-rich Web sites as they are to printed reference books today.
© 1998-2006 Kevin Broccoli
Kevin Broccoli is president of Broccoli Information Management.