Evaluating Internet Search Tools: A Librarian's Guide
Doralyn H. Edwards
While there is an abundance of resources available through the Internet, it can be very challenging to select and evaluate information available on the World Wide Web. Librarians can teach others how to search the Internet effectively and evaluate the data found there. Subject guides, search engines, and meta-search engines can be extremely useful tools when tackling the task of finding information available on the Internet.
This article will address features of search tools and how to evaluate them. The characteristics available with search tools are rapidly changing, and searchers should be aware that the most effective and reliable search tool today may not be the best one tomorrow. While much of the literature addresses features of specific search tools at the moment, this article is meant to be a long-term guide for using and evaluating all searching resources. To become more familiar with these search resources, it is helpful to understand what types of tools are available and how they work.
Structure
Many of the available search tools share some common features. Generally, these sites have options for basic and advanced searching. Basic searching tends to include Boolean-style options such as searching for "any" of the words entered (same as using Boolean "or"), "all" of the words entered (same as using Boolean "and"), and the "exact phrase" entered (same as enclosing a phrase in quotation marks). The advanced search option will appeal to many librarians who use Boolean queries with online databases and catalogs. This option usually provides the opportunity to combine these search features by using any combination of the Boolean operators available (usually "and," "or," "not," and "near," as well as quotation marks). Some tools allow for "natural language" queries. A natural language query, such as the one found on LycosPro (http://lycospro.lycos.com), allows the user to pose a question as if asking another human. An example of this type of search would be: How many planets are in our solar system? Many tools allow for truncation, usually represented by an asterisk. For example, laugh* would find laugh, laughter, and laughing. Some search tools also employ a feature using plus and minus (+ and -) signs to include or eliminate terms.
In both basic and advanced searching, every search engine falls back on a preset type of search if the user selects no options. Generally, the search default is for "all the words" (and), "any of the words" (or), or "natural language query." It is important to know what type of search is being done in order to understand the results retrieved. If the default option is not indicated on the initial search screen, this information is usually available on the help pages for these tools.
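The basic search options described above can be illustrated with a short Python sketch. It is only an approximation of how such matching might work, not the implementation used by any particular search tool; the function names (tokenize, term_matches, matches) and the sample sentence are invented for this illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_matches(term, words):
    """Match one query term; a trailing * means truncation (prefix match)."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        return any(word.startswith(stem) for word in words)
    return term in words

def matches(query, text, mode="all"):
    """Return True if text satisfies query under the chosen mode.

    mode="any"    -> Boolean "or" over the query terms
    mode="all"    -> Boolean "and" over the query terms
    mode="phrase" -> the words must appear together, in order
    """
    words = tokenize(text)
    if mode == "phrase":
        phrase = tokenize(query)
        n = len(phrase)
        return any(words[i:i + n] == phrase
                   for i in range(len(words) - n + 1))
    results = [term_matches(term, words) for term in query.split()]
    return any(results) if mode == "any" else all(results)

# Hypothetical page text used only for demonstration.
page = "Laughter filled the room as the Voyager mission left the solar system."

print(matches("laugh*", page))                    # True: laugh* finds "laughter"
print(matches("laugh", page))                     # False: no truncation requested
print(matches("voyager apollo", page, "any"))     # True: one term is enough
print(matches("voyager apollo", page, "all"))     # False: "apollo" is absent
print(matches("solar system", page, "phrase"))    # True: words are adjacent
```

The same two-word query succeeds under "any" and fails under "all," which is why it matters which type of search a tool performs when no option is selected.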
Types
The types of search tools available can be categorized into three groups: subject guides, search engines, and meta-search engines. The best-known subject guides are Yahoo (http://www.yahoo.com), The Argus Clearinghouse (http://www.clearinghouse.net), Magellan (http://www.mckinley.com), and The WWW Virtual Library (http://vlib.stanford.edu/Overview.html). Subject guides tend to be strongest in the area of indexing information by subject category, although they may include a feature which allows you to search across or among the categories. It can be particularly helpful to use a subject guide when looking for a specific site or topic. Since the producers of subject guides have already examined the Web sites in their database and categorized them, the amount of time spent determining the relevancy of sites is reduced.
For example, if you are looking for a specific newspaper, such as the Houston Chronicle, you can probably find it easily under a link to the "News and Media" or "Regional" news subject categories found in these guides. In addition, subject guides tend to index the "front end" of sites, meaning that they index the main page of a site, but not many of the links within it. So, when searching for the Houston Chronicle, you will probably find links in the subject guide to the main page of the Chronicle, and to several sections within the site like sports and business, but not many links to articles from the newspaper itself. These subject guides are not as helpful for finding information that is not the focus of a site. For instance, if you want information on the NASA Voyager space mission, these subject guides will retrieve pages specifically about the Voyager mission. They will not find sites where, for example, an astrophysicist includes "Voyager mission studies" in a list of his or her areas of research.
The Magellan subject guide is an exception to the way subject guides usually search. It does search the full text of sites, but its database is smaller than those of the search engines mentioned in the next section.
HotBot (http://www.hotbot.com), AltaVista (http://www.altavista.digital.com), Excite (http://www.excite.com), Infoseek (http://www.infoseek.com), Lycos (http://www.lycos.com), and WebCrawler (http://www.webcrawler.com) are good examples of search engines. While some of these sites may offer searching by subject, the main focus of these engines is indexing the full text of a large number of Web sites. These tools have the advantage of providing large databases for searching. If you searched for the "Voyager space mission" in a subject guide and did not find all of the information needed, then a search engine can be a good resource to use next.
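The difference between indexing the "front end" of a site and indexing its full text can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the two sites, their example.org addresses, and the field names (summary, full_text) are hypothetical, and real tools build far larger and more elaborate indexes.

```python
import re
from collections import defaultdict

# Hypothetical sites; the URLs and text are invented for illustration.
SITES = {
    "http://example.org/voyager": {
        "summary": "NASA Voyager mission overview",
        "full_text": "The Voyager mission launched two probes in 1977.",
    },
    "http://example.org/~astro": {
        "summary": "Home page of an astrophysicist",
        "full_text": "Areas of research: Voyager mission studies, plasma physics.",
    },
}

def words(text):
    """Return the set of lowercase word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_index(sites, field):
    """Map each word found in the chosen field to the URLs containing it."""
    index = defaultdict(set)
    for url, site in sites.items():
        for word in words(site[field]):
            index[word].add(url)
    return index

def search_all(index, query):
    """Return the URLs that contain every word of the query."""
    hits = [index.get(term, set()) for term in words(query)]
    return set.intersection(*hits) if hits else set()

guide_index = build_index(SITES, "summary")      # subject-guide style
engine_index = build_index(SITES, "full_text")   # search-engine style

print(sorted(search_all(guide_index, "voyager mission")))
print(sorted(search_all(engine_index, "voyager mission")))
```

In this sketch the summary-only index finds just the site devoted to the mission, while the full-text index also finds the astrophysicist's page that merely mentions it.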
These search engines will retrieve large amounts of information, but it will not be categorized by area as subject guides are. So, using search engines generally requires more specific, well-planned queries to retrieve relevant information. While your search on a subject guide for Voyager information may have been voyager and mission, a query on a search engine could develop into (voyager and mission) not "star trek" to narrow the search to pertinent documents.
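The narrowed query above, (voyager and mission) not "star trek", can be expressed as a combination of small tests. The following Python sketch is one possible way to model it; the helper names (term, phrase, both, without) and the sample documents are assumptions made for this example and do not reflect the syntax of any specific search engine.

```python
import re

def tokens(text):
    """Lowercase the text and split it into an ordered list of words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term(word):
    """Test: the document contains this single word."""
    return lambda text: word.lower() in tokens(text)

def phrase(words):
    """Test: the document contains these words together, in order."""
    target = tokens(words)
    def check(text):
        found = tokens(text)
        n = len(target)
        return any(found[i:i + n] == target
                   for i in range(len(found) - n + 1))
    return check

def both(a, b):
    """Boolean "and": both tests must pass."""
    return lambda text: a(text) and b(text)

def without(a, b):
    """Boolean "not": the first test must pass and the second must fail."""
    return lambda text: a(text) and not b(text)

# (voyager and mission) not "star trek"
query = without(both(term("voyager"), term("mission")), phrase("star trek"))

# Hypothetical documents used only for demonstration.
documents = [
    "The Voyager mission reached interstellar space.",
    "Star Trek: Voyager episode guide, including the crew's mission logs.",
    "Voyager probe trajectory data.",
]

for doc in documents:
    print(query(doc), "-", doc)
```

Only the first document passes: the second contains the excluded phrase, and the third lacks the word "mission."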
Many search engines do provide a relevancy rating for each site retrieved, listing the highest-ranked documents first. Generally, relevancy is determined by where the search terms appear in the document: title, URL (Web address), site description, or document text. These rankings may not actually reflect what is relevant to the researcher, but they are an attempt by search engine producers to provide some organization to large retrieval sets.
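A relevancy rating based on where the search terms appear might work roughly as in the Python sketch below. The field weights are arbitrary values chosen for illustration, since search engine producers do not generally publish their formulas, and the pages and example.org addresses are hypothetical.

```python
# Assumed weights: a term in the title counts more than one in the body text.
FIELD_WEIGHTS = {"title": 3.0, "url": 2.0, "description": 1.5, "text": 1.0}

def score(page, terms):
    """Add the field weight once for each search term found in that field."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        content = page.get(field, "").lower()
        total += weight * sum(1 for term in terms if term.lower() in content)
    return total

def rank(pages, terms):
    """Return the pages ordered from highest to lowest score."""
    return sorted(pages, key=lambda page: score(page, terms), reverse=True)

# Hypothetical pages used only for demonstration.
mission_page = {
    "title": "Voyager Mission Home",
    "url": "http://example.org/voyager",
    "description": "Official mission page",
    "text": "The Voyager mission explores the outer solar system.",
}
faculty_page = {
    "title": "Faculty profile",
    "url": "http://example.org/staff",
    "description": "Research interests",
    "text": "Areas of research include Voyager mission studies.",
}

terms = ["voyager", "mission"]
for page in rank([faculty_page, mission_page], terms):
    print(score(page, terms), page["title"])
```

A page with the terms in its title and URL outranks one that mentions them only in its body text, whether or not that ordering matches what the researcher actually needs.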
Meta-search engines such as MetaCrawler (http://www.metacrawler.com), the Internet Sleuth (http://www.isleuth.com), SavvySearch (http://www.cs.colostate.edu/~dreiling/smartform.html), and USE IT! (http://www.he.net/~kamus/useadven.htm) query many search engines and compile the results. These meta-search engines tend to rely on the databases of other search engines. This type of tool is helpful when trying to find very specific information that might not be found in every search engine. Using a meta-search engine saves the user the time of searching many sites individually. Generally, the search options available are limited to simple features or to the features allowed by each individual search engine. So, while you can search by language in AltaVista, you cannot search by language in some of the meta-search engines that access AltaVista because it is not an option in the meta-search engines' interfaces. A meta-search engine query might consist of voyager and mission. The results could then be analyzed to see which of the search engines queried produced the most helpful results, and the user could then continue searching on that engine.
The terms used to describe these Internet search tools are often used interchangeably. For the purpose of the rest of this discussion, the terms Internet search tools and search engines will be used in referring to all three tool types.
When using and evaluating these search tools, there are two tips that can be of great assistance: (1) read any help screens available, and (2) do the same search on multiple search tools and compare the results. While you may end up only using one or two search tools on a regular basis, it is worthwhile to read the documentation about the various search tools to understand the strengths and weaknesses of each.
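The way a meta-search engine compiles results can be sketched as follows. This Python example makes no network requests: the two "engines" are stand-in functions returning invented result lists, and the merging rule (favor addresses returned by more engines, then by best position) is an assumption for illustration rather than the method of any tool named above.

```python
def engine_alpha(query):
    """Stand-in for one search engine; returns a ranked list of URLs."""
    return ["http://example.org/voyager", "http://example.org/nasa"]

def engine_beta(query):
    """Stand-in for a second search engine with a different database."""
    return ["http://example.org/nasa", "http://example.org/~astro"]

def meta_search(query, engines):
    """Send the query to every engine and merge the results.

    Duplicate URLs are combined, and each entry records which engines
    returned it and the best (lowest) position it received.
    """
    merged = {}
    for name, search in engines.items():
        for position, url in enumerate(search(query)):
            entry = merged.setdefault(url, {"engines": [], "best": position})
            entry["engines"].append(name)
            entry["best"] = min(entry["best"], position)
    # URLs found by more engines come first, then those ranked higher.
    return sorted(
        merged.items(),
        key=lambda item: (-len(item[1]["engines"]), item[1]["best"], item[0]),
    )

engines = {"alpha": engine_alpha, "beta": engine_beta}
for url, info in meta_search("voyager and mission", engines):
    print(url, info["engines"])
```

Recording which engine returned each address is what lets the user see which engine produced the most helpful results and continue searching there.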
It is helpful to keep in mind that, like most things available through the Internet, the structure of Internet search tools is constantly changing. So, it is important for librarians, as well as others, to evaluate and re-evaluate these tools for usefulness, currency, and effectiveness for searching the Internet.
Scope
Many Internet search tools offer the option of changing the database searched: the Web, Usenet, or both. Selecting Usenet allows the searcher to explore the newsgroups on the Web for messages and group names containing searched phrases. Information from Usenet may be less reliable or verifiable than Web sites, given that the data is from discussion groups, not published Web sites. Usenet can be beneficial when looking for input or commentary from those with expertise in a specific subject area. Selection of the Web option permits the search for documents available through the Web. The Web may provide more useful information than Usenet, given the number of resources available. Using these options can help narrow a search to relevant documents.
Features
One thing that can really set one search tool apart from another is the availability of unique search features. Such features include the ability to search for audio and video files, to search for pages in a particular language, to search a specific domain or geographic region, or even to search for pages that link to a particular URL or Web site. For instance, Yahoo automatically defaults to searching AltaVista when no matches are found in its own database. HotBot allows the user to determine how far into the directory structure of a site the search should be conducted (e.g., the top page, such as http://www.rice.edu, or a page deeper in the directory, such as http://www.rice.edu/Fondren). Excite, Lycos, and WebCrawler provide guides to, or rankings of, many of the Web sites in their databases. AltaVista allows for searching of pages written in a particular language. Twenty-five languages are currently indexed by this engine. Yahoo provides a kid-friendly search tool called Yahooligans! (http://www.yahooligans.com), which has subject categories and sites that appeal to children. Excite, Infoseek, and MetaCrawler all have free, downloadable software that adds a search bar for that search engine to your Web browser or elsewhere on your desktop. This allows the user to search directly in the engine's database, without having to go to the Web site to do so. Search engine producers are constantly looking for ways to make their search engines better and more appealing to Internet users. It is useful to visit the various search engines regularly to see if new features have been implemented which fulfill a searching need.
A helpful group of search tool feature comparison sites can be found on Yahoo's Comparing Search Engines site at: http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Comparing_Search_Engines/.
Currency
An important factor in determining the usefulness of a search engine is the currency of its database. No search tool indexes all of the information on the Web, but some are better than others at checking the currency of their database. When sites are added to most search tool databases, the information from the site is copied directly into the search tool's database. So, if a site changes the information on one of its pages, moves to a different location, or is removed, this will not be reflected in the information retrieved from search engines, unless the information is constantly maintained in the database. Librarians, as well as other searchers, should be aware of which search tools regularly re-download the information from the sites contained in their database.
A helpful site for checking on the various features and functions of search engines is Search Engine Watch (http://searchenginewatch.com). The site is produced by Danny Sullivan, an Internet and search engine consultant and owner of a consulting firm, Calafia Consulting. The site, geared initially toward Web designers, has become a resource for search engine users and designers alike. Topics such as "Webmaster's Guide to Search Engines," "Search Engine Facts and Fun," and "Search Engine Status Reports" are covered. From this site, anyone can subscribe to a free newsletter, "Search Engine Report." This monthly publication covers new search engine information and changes to the Search Engine Watch site. Although the Web site is free, the author requests that people voluntarily subscribe to help cover costs.
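The currency problem described above can be illustrated with a small Python sketch that compares the copy stored in a search tool's database with the page as it exists now. It is a simplified model only: the stored and live pages are plain dictionaries with hypothetical example.org addresses, no pages are actually downloaded, and the function names (fingerprint, check_currency) are invented for this example.

```python
import hashlib

def fingerprint(content):
    """Return a short, fixed-length digest that changes when the text changes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def check_currency(stored, live_pages):
    """Compare stored fingerprints with the pages as they exist now.

    stored:     URL -> fingerprint recorded when the page was indexed
    live_pages: URL -> current page text, or absent if the page is gone
    """
    report = {}
    for url, old_fingerprint in stored.items():
        current = live_pages.get(url)
        if current is None:
            report[url] = "missing"    # page was moved or removed
        elif fingerprint(current) != old_fingerprint:
            report[url] = "changed"    # database copy is out of date
        else:
            report[url] = "current"
    return report

# Hypothetical database contents recorded at indexing time.
stored = {
    "http://example.org/news": fingerprint("Monday headlines"),
    "http://example.org/about": fingerprint("About this site"),
    "http://example.org/old": fingerprint("Archived page"),
}

# Hypothetical state of the same sites today.
live_pages = {
    "http://example.org/news": "Tuesday headlines",
    "http://example.org/about": "About this site",
}

for url, status in sorted(check_currency(stored, live_pages).items()):
    print(status, url)
```

A search tool that never repeats this kind of check will keep returning the "changed" and "missing" pages as though nothing had happened.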
Librarians, like other Web surfers, may find it difficult to keep up with all of the types of information available on the Web. Becoming familiar with a couple of search tools and their features, and constantly revisiting other search tools for new features make the difference between an Internet "wader" and an Internet "surfer."