Web Site Indexes - Frequently Asked Questions
A Web site or digital publications index – often called an “A-Z site index” – is a finding aid for a Web site, intranet, or sub-site or digital publication, organized in the same way as a traditional, alphabetical back-of-the-book index. In addition to the alphabetical arrangement of index entries, the following conventions can be found:
In addition, cross-references (See or See also references) may be used, but are not required.
An index can be limited to named entities (such as the names of departments, people, etc.), but such an index, when compared with an index of topical terms, is usually not sufficient for searching a site. On a Web site, a named-entity index is more accurately called a “directory,” like a telephone directory.
An alphabetical list of Web page or digital publication titles, or even edited titles, is not a Web site or digital publication index, just as an alphabetical list of chapter titles does not constitute a book or digital publication index. One of the more obvious reasons is the fact that the wording of page titles can be quite different from a typical index entry. For example, the page title “Work For Us,” while catchy as a heading, is not likely to be quickly found by a user searching for a job in an alphabetical list of Web site or digital publication pages. A true index entry to the “Work For Us” page, on the other hand, could be posted as “careers,” “employment,” or “jobs” – or indeed all three. As well, a simple alphabetical sorting of Web pages would lack variant or cross-reference terms and second-level entries, all of which would be found in a true index.
A site map or digital publication table of contents is a finding aid organized in the same way as a table of contents; it follows the structure of the site or digital publication, section by section, instead of being alphabetical. While providing a helpful overview of a site, therefore, site maps or tables of contents do not always enable users to quickly find a specific topic. Also, a site map or table of contents lists each Web page or digital publication page only once and by its correct name (for example, “Work For Us”), without cross-references or variants to make it more easily found. Finally, site maps or tables of contents tend to include only Web pages or digital publication pages, and perhaps not even all pages, rather than the additional sections within pages. This is usually dictated by spatial factors: the entire site map or table of contents should fit on one screen to be most usable for browsing. An index can be much larger and more detailed, and is not constrained by space; for example, a common convention of indexes is the use of a separate Web or digital publication page for each letter of the alphabet.
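The one-page-per-letter convention can be sketched in a few lines of Python. This is only an illustration: the sample terms, the "#" bucket for entries that do not start with a letter, and the index-a.html file-naming scheme are all assumptions, not a prescribed layout.

```python
from collections import defaultdict


def group_by_letter(terms):
    """Sort terms alphabetically (ignoring case) and bucket them by first letter."""
    groups = defaultdict(list)
    for term in sorted(terms, key=str.casefold):
        if not term:
            continue  # skip empty strings
        first = term[0].upper()
        # Entries that start with a digit or symbol share one bucket (assumption).
        key = first if first.isalpha() else "#"
        groups[key].append(term)
    return dict(groups)


def page_name(letter):
    """Hypothetical file-naming scheme: index-a.html, index-b.html, ..."""
    if letter == "#":
        return "index-other.html"
    return "index-{}.html".format(letter.lower())


# Hypothetical index terms for illustration only.
TERMS = ["jobs", "careers", "Events", "employment", "401(k) plans"]

if __name__ == "__main__":
    for letter, entries in sorted(group_by_letter(TERMS).items()):
        print(page_name(letter), entries)
```

Each bucket would then be written out as its own page, with a row of letter links at the top so that users can jump between pages.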
Instead of pointing to page numbers, as in a traditional index, a Web site or digital publications index is composed of entries that are themselves links to the pages, or named anchors at the heads of sections within the pages, in which information can be found. Cross-references (See and See also), while indicating a preferred term, may link directly to the appropriate Web pages or digital publication pages. If, however, the cross-reference points to a term with multiple subentries, then the cross-reference may link to the referred term within the index.
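As a rough illustration of entries that are themselves links, the following Python sketch renders a small index as an HTML list. The entry data, the URLs, the idx- anchor prefix, and the list markup are all hypothetical. The cross-reference here links to the referred term within the index, which is only one of the two options described above.

```python
from html import escape

# Hypothetical index data: each entry is (term, kind, target).
# kind "page": target is a page URL, optionally with a named anchor.
# kind "see":  target is another term in the same index.
ENTRIES = [
    ("parking", "page", "/visit.html#parking"),
    ("employment", "see", "careers"),
    ("careers", "page", "/work-for-us.html"),
]


def anchor_id(term):
    """Build the id that marks a term's own position in the index (assumed scheme)."""
    return "idx-" + term.lower().replace(" ", "-")


def render_entry(term, kind, target):
    """Render one index entry as an HTML list item."""
    label = escape(term)
    if kind == "page":
        href = escape(target, quote=True)
        return f'<li id="{anchor_id(term)}"><a href="{href}">{label}</a></li>'
    if kind == "see":
        return (
            f'<li id="{anchor_id(term)}">{label}. <i>See</i> '
            f'<a href="#{anchor_id(target)}">{escape(target)}</a></li>'
        )
    raise ValueError(f"unknown entry kind: {kind}")


def render_index(entries):
    """Sort entries alphabetically (ignoring case) and wrap them in a list."""
    ordered = sorted(entries, key=lambda entry: entry[0].casefold())
    items = [render_entry(*entry) for entry in ordered]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(render_index(ENTRIES))
```

To make a cross-reference link directly to the page instead, the "see" branch could look up the target term's URL and use that as the href.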
There are no standard styles for Web site indexes as there are for book or digital publications indexes. When it comes to format, the guiding factor should be the usability of an index. There are, however, some style conventions, which are covered under Best Practices on this site.
Most appropriate for indexes are sites (or parts of sites) that do not change too frequently, that have repeat users, and that have rich and varied content. Sites of organizations, associations, government departments and agencies, libraries, educational and health institutions, and employee intranets, with the varied information and services they provide, are all very suitable for indexes.
Some sites may have a sufficient number of pages, but the content is not suitable for an index, such as online games or short descriptions of directory entries. A directory-type site might require an alphabetical list of names to look up, but this would not be a structured, topical index (as explained above). Even most sites that sell products do not need indexes if most of the pages of the site are a listing of products. Potential customers tend to look up products by category, not by alphabetical names. A large retail site such as Amazon.com is easily navigated through its tab-structure menu and category listings. If, however, the site also describes support services and contains press releases, corporate history, investment information, customer testimonials, tips, and other articles, such varied content would indeed be well-served by an index.
Parts of digital publications that have a large quantity of relatively unchanging content, such as help documentation (for example, eBay’s help topics) or policy handbooks, can also benefit from indexes. Collections of articles, although constantly being added to, should also be indexed.
Web sites or digital publications ranging from 50 to 500 pages are best served by Web site or digital publications indexes, although a Web site or digital publication can have as few as 20 or 25 pages and still benefit from an index. A comparison with back-of-the-book indexes is helpful here: just as books with 700 or so pages still have indexes, so can Web sites or digital publications of this size. Once you go over the 1,000-page range, however, an index may not be practical, since the site will likely have changed before the index is complete. Rather than indexing the entire site or digital publication, individual indexes can be applied to subsections.
A search engine query will not always provide you with the information you're looking for. Compared with the entire Web, the number of pages within a site or digital publication is relatively small, so a simple search engine query might not yield enough or any results, even if there are good pages on the subject. This is most likely to occur because the search phrase you type is worded differently than references to that topic within the page text.
Whole-Web or digital publication search engines usually produce “satisfactory” results in the quality of articles, since the major search engine companies have developed complicated criteria and algorithms for the retrieval and ranking of pages. Off-the-shelf search engines to be used within a site are not so sophisticated. They often retrieve pages that include a mere passing mention of the search term, but do not really focus on the subject at all.
Jared Spool's article "Why On-Site Searching Stinks" further points out the limitations of on-site search utilities.
Site indexes are best done by individuals skilled in indexing who also have basic skills in HTML or in using HTML indexing tools. You can hire a freelance contractor from the Contract Indexer Search database.
Software can automatically extract page titles or headings, retain their page URL links, and sort them alphabetically. But only a human indexer can edit (or, more precisely, rewrite) such a list of titles and headings into a useful and meaningful set of index entries, add variant terms and cross-references, and decide where and how to structure subentries.
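The automatic step described above might look something like the following Python sketch, which uses only the standard library. The page sources and URLs are made up for illustration, and headings are left out for brevity. Note that the output is just a sorted title list, with “Work For Us” filed under W, which is exactly the raw material a human indexer would still need to rewrite.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title with surrounding whitespace removed."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def title_list(pages):
    """pages maps URL -> HTML source; returns (title, url) pairs sorted by title."""
    pairs = [(extract_title(html), url) for url, html in pages.items()]
    return sorted(pairs, key=lambda pair: pair[0].casefold())


# Hypothetical pages for illustration only.
PAGES = {
    "/jobs.html": "<html><head><title>Work For Us</title></head></html>",
    "/about.html": "<html><head><title>About the Agency</title></head></html>",
    "/events.html": "<html><head><title>Events</title></head></html>",
}

if __name__ == "__main__":
    for title, url in title_list(PAGES):
        print(f"{title}  ->  {url}")
```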
Software can also automatically extract metadata, such as keywords, to create an index. But in order to be truly useful, an indexer will have had to create the keywords for each page in a systematic way, using a controlled vocabulary for consistency. Creating a controlled vocabulary is a specialty of some indexers.
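A minimal Python sketch of keyword extraction with a controlled vocabulary follows. It assumes that keywords are stored in a meta element named "keywords" as a comma-separated list. The vocabulary table is a hypothetical example that maps variant keywords to one preferred term, and keywords outside the vocabulary are simply dropped.

```python
from html.parser import HTMLParser


class KeywordParser(HTMLParser):
    """Collects comma-separated values from <meta name="keywords"> elements."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords += [k.strip() for k in content.split(",") if k.strip()]


# Hypothetical controlled vocabulary: variant keyword -> preferred index term.
VOCABULARY = {
    "jobs": "careers",
    "employment": "careers",
    "careers": "careers",
}


def preferred_terms(html, vocabulary):
    """Return the sorted preferred terms for a page, ignoring unknown keywords."""
    parser = KeywordParser()
    parser.feed(html)
    terms = set()
    for keyword in parser.keywords:
        term = vocabulary.get(keyword.casefold())
        if term is not None:
            terms.add(term)
    return sorted(terms)


if __name__ == "__main__":
    page = '<meta name="keywords" content="Jobs, employment, parking">'
    print(preferred_terms(page, VOCABULARY))
```

In practice the keywords that fall outside the vocabulary would be worth logging, since they show where the vocabulary or the page metadata needs an indexer's attention.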
If you choose to create a Web site or digital publication index on your own, some training in indexing is recommended. The American Society for Indexing (ASI) Web site lists courses and workshops, including online courses. ASI and other national indexing societies also offer workshops at their annual conferences and local chapter meetings.
If you have a combination of HTML skills and a background in library and information science or in information architecture, you might be able to pick up indexing well enough to create a Web site or digital publication index after studying a good book on the subject, rather than taking a course. The ASI Web site also has a page on publications of interest. Our page of sample Web and digital publication indexes – including some created by SIG members – provides an opportunity to look at examples from several different fields.
Once you have created your own Web site or digital publication index, you might consider hiring a freelance indexer to review it, edit it, and provide you with valuable feedback before taking it “live.”
An index should be written so that the most frequently changing parts of a site are indexed at the topic level rather than down to their specifics. For example, a page on events should be indexed for the topic of “Events,” but not for the specific events themselves, which would be constantly added and deleted. Even after taking this rule into account, however, provisions must still be made for maintaining the index.
If the index is created by a contracted indexer, an agreement needs to be reached regarding how the index will be maintained. Either the indexer can be retained for future updates, or the indexer can provide written guidelines to the editor or content manager on how to maintain the index for predictable types of content additions or changes. As an alternative, an in-house staff member could be trained in indexing in order to keep the index up to date, especially for a large, frequently changing site.
If an indexer is not maintaining the index, it might be a good idea to contract an indexer to review it every year or so. An indexer who updates an index need not be the same indexer who created the index, but anyone who updates an index needs to be informed of what major content changes have occurred since the last index update. While a Web site or digital publications indexing tool might keep track of content deletions, an indexer needs to index all content additions.
It should be noted here that the importance of the indexer in maintenance goes beyond just keeping the index up to date. As the indexer reviews the index on a regular basis, he or she is also able to catch broken links, HTML problems, typos, and any other issues that affect the Web site or digital publication as a whole, and to report these to the content manager.
Freelance indexers with a background in book indexing are accustomed to being paid per page, such as US$3.50 to $5.00 per page. Digital publications, however, vary greatly in length and in the amount of indexable content (this FAQ Web page, for example, equals four printed pages), so the number of pages is not an ideal way in which to determine cost. Web site or digital publications indexing is therefore charged by the index entry or by the hour. Alternatively, once the exact scope of what is to be indexed has been defined, the indexer may be happy to quote a flat fee for the job. Indexers tend to charge more per hour than copy editors, but less than Web developers or information architects.
In addition to this Web site, the ASI Digital Publications Indexing Google Group, for which SIG membership is required, is a forum for all queries and discussion of issues related to digital publications indexing.
Our Digital Publications Indexing Resources page is a comprehensive listing of courses and workshops, books, software, and other aids.
You may submit additional questions to the .