How Search Engines Work
How Do Search Engines Find Web Pages?
Most search engines employ automated programs called “spiders” that
find web pages and “read” them. Most commonly, a spider reaches a page
by following a link from another page. By following links, the spider
traverses the web in much the same way that a real spider traverses
its web. The location of each web page found is added to the search
engine's “index,” along with information about that page.
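The link-following behavior described above can be sketched as a
breadth-first walk over a tiny in-memory "web." This is only an
illustration: the TOY_WEB dictionary, its URLs, and the crawl function
are all hypothetical, and a real spider would download and parse pages
rather than look them up in a dictionary.

```python
from collections import deque

# Hypothetical miniature "web": each URL maps to the links found on that page.
# A real spider would fetch the page and extract its links instead.
TOY_WEB = {
    "a.example/home": ["a.example/about", "b.example/widgets"],
    "a.example/about": ["a.example/home"],
    "b.example/widgets": ["c.example/blue-widgets"],
    "c.example/blue-widgets": [],
    "d.example/orphan": ["a.example/home"],  # no page links here
}


def crawl(start_url, web):
    """Follow links breadth-first and return pages in discovery order."""
    index = []                  # locations of the pages found so far
    seen = {start_url}          # avoids visiting the same page twice
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        index.append(url)
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


found = crawl("a.example/home", TOY_WEB)
print(found)
```

Note that "d.example/orphan" never shows up in the result. No other
page links to it, so a spider that only follows links has no way to
reach it.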
What Information Does A Search Engine Gather?
Exactly what a search engine gathers and how it uses it (its
algorithm) is closely guarded information that webmasters infer from
observing “Search Engine Results Pages” (SERPs). A major element is
how many times a certain word appears on the page. Search engines are
very sophisticated and can give a word more weight if it's at the
start of a sentence or is used in a title. How often the word or
phrase is used, relative to the total number of words on the page, is
the keyword density of the page. A record is also kept of other pages
that link to that page.
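As a rough sketch of the counting described above, the following
computes how often a phrase occurs on a page, its density, and a score
that counts title occurrences more heavily. The function name
keyword_stats and the title_weight value of 3.0 are assumptions made
for illustration; real engines do not publish their weights, and this
sketch ignores sentence position entirely.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_stats(title, body, phrase, title_weight=3.0):
    """Count a phrase on a page and derive simple keyword statistics.

    title_weight is a made-up illustrative value, not a known engine setting.
    """
    title_words, body_words = tokenize(title), tokenize(body)
    phrase_words = tokenize(phrase)
    n = len(phrase_words)
    total_words = len(title_words) + len(body_words)
    if n == 0 or total_words == 0:
        return {"occurrences": 0, "density": 0.0, "weighted_score": 0.0}

    def count(words):
        # Slide a window of n words across the text and compare to the phrase.
        return sum(words[i:i + n] == phrase_words
                   for i in range(len(words) - n + 1))

    in_title, in_body = count(title_words), count(body_words)
    occurrences = in_title + in_body
    return {
        "occurrences": occurrences,
        # Density here is occurrences divided by total words on the page.
        "density": occurrences / total_words,
        # Title matches count title_weight times as much as body matches.
        "weighted_score": in_title * title_weight + in_body,
    }


stats = keyword_stats(
    title="Blue Widgets for Sale",
    body="We sell blue widgets. Our blue widgets are cheap.",
    phrase="blue widgets",
)
print(stats)
```

Here the phrase appears once in the title and twice in the body, so
the weighted score is higher than the raw count of three.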
How Search Engines Rank Pages
Each spidered page is kept in the search engine's index along with
information about the words it contains and any other semantic analysis
that the engine performs. Google (the most used search engine)
pioneered the use of the number of links to a page to calculate
its “PageRank” (a number from 0 to 10). The “PageRank” can be
thought of as that page's “reputation.”
Thus, a page with several mentions of the words “blue widgets” on it
and numerous links from other web sites to it containing the words
“blue widgets” will be judged to have higher relevance in a search for
“blue widgets” than a page with only one mention of blue widgets and
no links from other sites.
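The “blue widgets” comparison can be made concrete with a toy scoring
function. Everything here is an assumption for illustration: the
relevance function, the link_weight value, and the two sample pages
are invented, and this is not how Google or any real engine computes
PageRank or relevance. It only shows how on-page mentions and matching
inbound links might be combined into a single number.

```python
def relevance(page, query, link_weight=2.0):
    """Toy relevance: on-page mentions plus weighted matching inbound links.

    link_weight is an arbitrary illustrative value.
    """
    q = query.lower()
    mentions = page["text"].lower().count(q)
    matching_links = sum(q in anchor.lower()
                         for anchor in page["inbound_anchor_texts"])
    return mentions + link_weight * matching_links


# Hypothetical page with several mentions and links from other sites.
page_a = {
    "text": ("Blue widgets in every size. Our blue widgets ship free. "
             "Buy blue widgets today."),
    "inbound_anchor_texts": ["best blue widgets", "cheap blue widgets",
                             "widget shop"],
}

# Hypothetical page with one mention and no inbound links.
page_b = {
    "text": "We also stock blue widgets.",
    "inbound_anchor_texts": [],
}

pages = [("page_a", page_a), ("page_b", page_b)]
ranked = sorted(pages,
                key=lambda item: relevance(item[1], "blue widgets"),
                reverse=True)
for name, page in ranked:
    print(name, relevance(page, "blue widgets"))
```

With these made-up inputs, page_a ranks above page_b because it has
both more mentions of the phrase and inbound links whose text contains
it.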