How Search Engines Work

How Do Search Engines Find Web Pages?

Most search engines employ automated programs called “spiders” that find web pages and “read” them. Most commonly, a spider reaches a page by following a link from another page. By following links, the spider traverses the web in much the same manner that a real spider traverses its web. The location of each page found is added to the search engine's “index,” along with information about that page.
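The link-following behavior described above can be sketched as a breadth-first traversal. This is only an illustration, not how any particular search engine is implemented: the toy web (TOY_WEB), its example URLs, and the crawl function are all hypothetical, and a real spider would fetch pages over the network and parse their links rather than look them up in a dictionary.

```python
from collections import deque

# Hypothetical toy web: each URL maps to the URLs that page links to.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
    "orphan.example/": ["a.example/"],  # no other page links here
}

def crawl(start, web):
    """Follow links breadth-first and return pages in discovery order."""
    index = []             # stands in for the search engine's index
    seen = {start}         # pages already queued, so none is visited twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        index.append(url)  # "read" the page and record its location
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

found = crawl("a.example/", TOY_WEB)
print(found)
# ['a.example/', 'a.example/about', 'b.example/', 'c.example/']
```

Note that orphan.example/ never appears in the result: because no page links to it, a spider that only follows links has no way to reach it.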

What Information Does a Search Engine Gather?

Exactly what a search engine gathers and how it uses that information (its algorithm) is closely guarded; webmasters infer it from observing search engine results pages (SERPs). A major element is how many times a certain word appears on a page. Search engines are very sophisticated and can give a word more weight if it's at the start of a sentence or is used in a title. The number of times a word or phrase is used, relative to the total number of words on the page, is the page's keyword density. A record is also kept of other pages that link to that page.
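As a rough sketch of the word counting described above, the following computes a keyword density and a title-weighted count. Several things here are assumptions made for illustration: density is taken as occurrences divided by total words (one common convention), the tokenizer is deliberately simple, and TITLE_WEIGHT is an arbitrary value, since real engines do not publish their weights.

```python
import re

TITLE_WEIGHT = 3.0  # assumed value; actual weights are not public

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def count_phrase(text, phrase):
    """Count occurrences of a word or multi-word phrase in the text."""
    words, target = tokenize(text), tokenize(phrase)
    n = len(target)
    if n == 0:
        return 0
    return sum(1 for i in range(len(words) - n + 1)
               if words[i:i + n] == target)

def keyword_density(text, phrase):
    """Occurrences of the phrase divided by total words (assumed convention)."""
    words = tokenize(text)
    return count_phrase(text, phrase) / len(words) if words else 0.0

def weighted_count(title, body, phrase):
    """Count title matches more heavily than body matches."""
    return TITLE_WEIGHT * count_phrase(title, phrase) + count_phrase(body, phrase)

title = "Blue Widgets for Sale"
body = "Our blue widgets are the best blue widgets around."

print(count_phrase(body, "blue widgets"))           # 2
print(round(keyword_density(body, "blue widgets"), 3))
print(weighted_count(title, body, "blue widgets"))  # 5.0
```

The body has nine words and two matches, so its density is two ninths; the single title match counts three times under the assumed weight, giving a weighted count of 5.0.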

How Search Engines Rank Pages

Each spidered page is kept in the search engine's index along with information about the words it contains and any other semantic analysis that the engine performs. Google (the most used search engine) pioneered the use of the number of links to a page to calculate its “PageRank” (a number from 0 to 10). A page's PageRank can be thought of as its “reputation.” Thus, a page that mentions “blue widgets” several times and has numerous links from other web sites containing the words “blue widgets” will be judged more relevant in a search for “blue widgets” than a page with only one mention of blue widgets and no links from other sites.
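To make the idea of link-based “reputation” concrete, here is a minimal sketch of the simplified, publicly described form of the PageRank calculation, run on a hypothetical five-page link graph. It is not Google's production ranking: the page names and the LINKS graph are invented, the damping value of 0.85 is the commonly cited default, and the raw scores are probabilities that sum to 1, not the 0 to 10 number mentioned above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to; every
    linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page receives a small baseline share.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank

# Hypothetical link graph for illustration.
LINKS = {
    "widgets": ["shop"],
    "shop": ["widgets"],
    "blog": ["widgets"],
    "review": ["widgets", "shop"],
    "lonely": [],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:8s} {score:.3f}")
```

In this toy graph the "widgets" page collects the most incoming links and ends up with the highest score, while pages that nothing links to stay near the baseline. A ranking function could then combine such a score with on-page signals such as keyword counts, though how real engines weigh the two is not public.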