Tom's Subject Directory & City Building Fan Site


Search Engine Techniques





Search engines have a variety of ways for you to refine and control your searches. Some of them offer menu systems for this. Others require you to use special commands as part of your query.

Not every power searching command is shown on this page, only the main ones that are most likely to be used. Read the help files at each search engine for more detailed coverage about what they offer.

Searching Using Boolean Operators

Keyword searching refers to a search type in which you enter terms representing the concepts you wish to retrieve. Boolean operators are not used.

Implied Boolean logic refers to a search in which symbols are used to represent Boolean logical operators. In this type of search on the Internet, the absence of a symbol is also significant, as the space between keywords will default to either OR logic or AND logic.

Most major Internet search engines recognize Boolean operators. Four operators can be used when performing a search (NEAR is strictly a proximity operator, but it is usually grouped with the Boolean ones):

  • OR
  • AND (+)
  • NOT (-)
  • NEAR

OR logic is most commonly used to search for synonymous terms or concepts.

With AND logic, the more terms or concepts you combine in a search, the fewer records you will retrieve.

NOT logic excludes records from your search results.

  • Think carefully when you use NOT: if the term you exclude also appears in documents that are relevant to your topic, those documents will not be retrieved.
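OR, AND and NOT behave like set operations on the pages that contain each term. The following is a minimal sketch in Python, not how any particular search engine works internally; the `index` dictionary, its terms and its page IDs are all invented for illustration.

```python
# Hypothetical inverted index: term -> set of page IDs containing that term.
index = {
    "chemical": {1, 2, 3},
    "fertilizer": {2, 3, 4},
    "organic": {3, 5},
}

def pages(term):
    """Return the set of page IDs that contain the term (empty if unknown)."""
    return index.get(term, set())

# OR: pages containing either term (set union), which widens the results.
either = pages("chemical") | pages("fertilizer")

# AND: pages containing both terms (set intersection), which narrows the results.
both = pages("chemical") & pages("fertilizer")

# NOT: pages containing the first term but not the second (set difference).
without = pages("chemical") - pages("organic")

print(sorted(either))   # [1, 2, 3, 4]
print(sorted(both))     # [2, 3]
print(sorted(without))  # [1, 2]
```

Note that page 3 drops out of the NOT result even though it mentions "chemical", which is the risk described in the bullet above.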

Match Any

Sometimes you want pages that contain any of your search terms. For example, you may want to find pages that say either chemical or fertilizer.

At some search engines, you can do a Match Any search by using a menu next to the search box or on the advanced search page.

Keep in mind that when you perform a Match Any search, most search engines will automatically list pages that contain all of your terms first, followed by pages that contain only some of them.
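The ordering described above can be approximated by counting how many query terms each page contains. This is only a rough sketch in Python under simplifying assumptions: real engines use far richer ranking, and the function name `match_any` and the sample pages are hypothetical.

```python
def match_any(query, documents):
    """Return (title, hits) pairs for pages containing at least one query term,
    ordered so that pages matching more terms come first."""
    terms = set(query.lower().split())
    scored = []
    for title, text in documents.items():
        words = set(text.lower().split())
        hits = len(terms & words)  # number of distinct query terms on the page
        if hits > 0:
            scored.append((title, hits))
    # Most matching terms first; ties broken by title for a predictable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

# Invented sample pages, keyed by a made-up title.
documents = {
    "page-a": "chemical fertilizer safety guide",
    "page-b": "organic fertilizer for lawns",
    "page-c": "chemical engineering careers",
    "page-d": "lawn mower reviews",
}

results = match_any("chemical fertilizer", documents)
print(results)  # [('page-a', 2), ('page-b', 1), ('page-c', 1)]
```

Here page-a is listed first because it contains both terms, while page-d is left out because it contains neither.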

Match All

This is a search for pages containing all of your search terms, rather than any of them. For example, to find pages with references to both Clinton and Dole on the same page:

    +clinton +dole

would return pages with both Clinton and Dole on the page and would not return pages that mention only Clinton or only Dole.

Practically all major search engines support the + symbol as a means of doing a Match All search.

Exclude

Most major search engines allow you to exclude documents that contain certain words. This is a helpful way to narrow a search. For example, you may want a page about the philosopher Calvin, not the cartoon character Calvin. By excluding pages that mention Hobbes, the cartoon character's sidekick, you will get better results. The best way to do this is by using the - command, which is supported by practically all major search engines:

    calvin -hobbes
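The + and - commands can be pictured as two filters applied to each page: every required word must be present, and no excluded word may be present. The sketch below, in Python, rests on stated assumptions: words without a prefix are treated as required (engines differ on this default), matching is on whole lowercase words, and the helper names and sample pages are hypothetical.

```python
def parse_query(query):
    """Split a query into required and excluded word sets.
    Assumption: a word with no prefix is treated like a +word."""
    required, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-"):
            word = token[1:]
            if word:
                excluded.add(word)
        else:
            word = token.lstrip("+")
            if word:
                required.add(word)
    return required, excluded

def matches(text, required, excluded):
    """True if the text has every required word and no excluded word."""
    words = set(text.lower().split())
    return required <= words and excluded.isdisjoint(words)

def search(query, pages):
    """Return the sorted names of pages that satisfy the query."""
    required, excluded = parse_query(query)
    return sorted(name for name, text in pages.items()
                  if matches(text, required, excluded))

# Invented sample pages.
pages = {
    "philosophy": "calvin on theology and philosophy",
    "comic": "calvin and hobbes comic strip archive",
    "politics": "clinton and dole debate transcript",
    "clinton-only": "clinton speech on education",
}

print(search("+clinton +dole", pages))  # ['politics']
print(search("calvin -hobbes", pages))  # ['philosophy']
```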

Alternatively, using a philosophy search engine (see Specialty Search Services) would be more likely to return relevant results.

Site Search

One of the most powerful features available is the ability to control which sites are included in or excluded from a search. For example, imagine you wanted to see all the pages from the Mars Exploration web site run by NASA's Jet Propulsion Laboratory. At AltaVista, you could use this command:

     host:mars.jpl.nasa.gov

In response, AltaVista would display all the pages it has indexed from the mars.jpl.nasa.gov domain. More about using the site search command to find web pages from a particular web site is described on the Checking Your URL page.

Now imagine you wanted to find all the pages from the Mars Exploration web site that also mention Venus and Jupiter. You could do that this way:


     host:mars.jpl.nasa.gov venus jupiter

That tells AltaVista to list pages with the words "venus" and "jupiter" that are within the Mars Exploration web site. You can even combine other commands, such as those described on the Search Engine Math page. For instance, look at this example:


     host:mars.jpl.nasa.gov -"mars pathfinder"

Here, we are telling AltaVista to list all pages within the Mars Exploration web site that do not contain the exact phrase "mars pathfinder." Now, imagine you are looking for information about Mars landings but are getting overwhelmed by results from NASA. You can get rid of NASA pages by doing this:

     "mars landings" -host:nasa.gov

In that example, we are looking for the phrase "mars landings" but excluding any pages from sites that end in nasa.gov. That means we will NOT get pages from sites like these:

     mars.jpl.nasa.gov
     spacekids.hq.nasa.gov
     cmex.arc.nasa.gov

We could even decide to see all pages about Mars landings from US educational sites, which end in .edu, like this:

     "mars landings" +host:edu

Finally, imagine you live outside the US and want to see results that are predominantly from your country. Here's how someone in the UK might search for football (soccer) information:

     "football scores" +host:uk

This finds pages that say "football scores" and that are from sites ending in the .uk domain, which is used for UK-based sites.
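The host: examples all rely on the same idea: a page qualifies when its host name ends in the given domain. The Python sketch below shows one way to express that test. It assumes matching on whole dot-separated labels, which may not be exactly how AltaVista implemented the command, and the function name and the example.edu and example.co.uk URLs are placeholders.

```python
from urllib.parse import urlparse

def host_matches(url, suffix):
    """True if the URL's host equals the suffix or ends with '.' + suffix,
    so 'nasa.gov' matches mars.jpl.nasa.gov but not notnasa.gov."""
    host = (urlparse(url).hostname or "").lower()
    suffix = suffix.lower().lstrip(".")
    return host == suffix or host.endswith("." + suffix)

# Sample URLs; the last two use placeholder domains.
urls = [
    "http://mars.jpl.nasa.gov/missions/",
    "http://spacekids.hq.nasa.gov/",
    "http://www.example.edu/mars/",
    "http://www.example.co.uk/football/",
]

# Like -host:nasa.gov: drop every page hosted under nasa.gov.
not_nasa = [u for u in urls if not host_matches(u, "nasa.gov")]

# Like +host:edu and +host:uk: keep only pages under that top-level domain.
edu_only = [u for u in urls if host_matches(u, "edu")]
uk_only = [u for u in urls if host_matches(u, "uk")]

print(not_nasa)  # ['http://www.example.edu/mars/', 'http://www.example.co.uk/football/']
print(edu_only)  # ['http://www.example.edu/mars/']
print(uk_only)   # ['http://www.example.co.uk/football/']
```

Checking for a leading dot before the suffix keeps an unrelated host such as notnasa.gov from being treated as part of nasa.gov.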

Search Engine Specific Issues