Showing posts with label PageRank. Show all posts

Monday, July 25, 2011

How Search Engines Use Links

The search engines use links primarily to discover web pages, and to count the links as votes for those web pages. But how do they use this information once they acquire it? Let’s take a look:

Index inclusion

Search engines need to decide which pages to include in their index. Crawling the Web by following links is one way they discover pages (the other is through XML Sitemap files). Discovery alone does not guarantee inclusion, however: the search engines leave out pages that they deem to be of low value, because cluttering their index with those pages would not lead to a good experience for their users. The cumulative link value, or link juice, of a page is a factor in making that decision.

Crawl rate/frequency

Search engine spiders go out and crawl a portion of the Web every day. This is no small task, and it starts with deciding where to begin and where to go. Google has publicly indicated that it starts its crawl in PageRank order. In other words, it crawls PageRank 10 sites first, PageRank 9 sites next, and so on. Higher PageRank sites also get crawled more deeply than other sites. It is likely that other search engines start their crawl with the most important sites first as well. This would make sense, because changes on the most important sites are the ones the search engines want to discover first. In addition, if a very important site links to a new resource for the first time, the search engines tend to place a lot of trust in that link and want to factor the new link (vote) into their algorithms quickly.
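
The crawl ordering described above can be sketched with a priority queue. This is only an illustration of "highest PageRank first", not Google's actual scheduler; the function name `crawl_order`, the seed URLs, and the scores are all hypothetical.

```python
import heapq

def crawl_order(pagerank_by_url):
    """Return URLs in descending PageRank order (ties broken alphabetically)."""
    # heapq is a min-heap, so negate each score to pop the highest first.
    heap = [(-score, url) for url, score in pagerank_by_url.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, url = heapq.heappop(heap)
        order.append(url)
    return order

# Hypothetical seed list with made-up toolbar-style scores (0 to 10).
seeds = {
    "http://example.com/": 9,
    "http://example.org/": 10,
    "http://example.net/": 4,
}
print(crawl_order(seeds))
```

A real crawler would also add newly discovered links to the queue as it goes; the sketch only shows the ordering step.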

Ranking
 
Links play a critical role in ranking. For example, consider two sites where the on-page content is equally relevant to a given topic. Perhaps they are the shopping sites Amazon.com and (the less popular) JoesShoppingSite.com. The search engine needs a way to decide who comes out on top: Amazon or Joe. This is where links come in. Links cast the deciding vote. If more sites, and more important sites, link to Amazon than to Joe, the search engine concludes that Amazon must be more important, so Amazon wins.
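
The tie-breaking idea in the Amazon versus Joe example can be sketched as a two-key sort. The `relevance` and `link_score` fields and their values are invented for illustration; real engines blend many signals rather than using links as a strict tie-breaker.

```python
def rank_pages(pages):
    """Order pages by relevance, using link score only to break ties."""
    # Negate both keys so that higher values sort first.
    return sorted(pages, key=lambda p: (-p["relevance"], -p["link_score"]))

# Hypothetical scores: equal on-page relevance, different link strength.
candidates = [
    {"site": "JoesShoppingSite.com", "relevance": 0.9, "link_score": 0.2},
    {"site": "Amazon.com", "relevance": 0.9, "link_score": 0.95},
]
for page in rank_pages(candidates):
    print(page["site"])
```

With equal relevance, the page with the stronger link score is listed first.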

Thursday, June 30, 2011

Google PageRank

PageRank is an algorithm used by Google that measures a particular page’s importance relative to other pages included in the search engine’s index. It was invented in the late 1990s by Larry Page and Sergey Brin; the patent is assigned to Stanford University and licensed to Google. PageRank implements the concept of link equity as a ranking factor: it treats a link to a page as a vote indicating that page’s importance.

PageRank approximates the likelihood that a user, randomly clicking links throughout the Internet, will arrive at that particular page. A page that is arrived at more often is likely more important and has a higher PageRank. Each page linking to another page increases the PageRank of that other page, and a linking page with a higher PageRank of its own typically passes on a larger increase. You can read a few details about the PageRank algorithm at http://en.wikipedia.org/wiki/PageRank.

To view a site’s PageRank, install the Google toolbar (http://toolbar.google.com/) and enable the PageRank feature, or install the SearchStatus plugin for Firefox (http://www.quirk.biz/searchstatus/). One thing to note, however, is that the PageRank indicated by Google is a cached value, and is usually out of date.
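
The random-clicking description of PageRank given earlier can be approximated with a short power iteration. This is a minimal sketch, not Google's production algorithm: the `damping` value of 0.85 is an assumption taken from the commonly cited published formulation rather than from this post, and the three-page graph is made up.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a graph given as {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal likelihood
    for _ in range(iterations):
        # Every page gets a small baseline from "random jumps".
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical graph: "a" and "b" both link to "c", and "c" links back to "a".
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
for page, score in sorted(pagerank(graph).items()):
    print(page, round(score, 3))
```

In this toy graph, "c" ends up with the highest score because it receives two votes, and "a" outranks "b" because its single vote comes from the high-scoring "c".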

The published PageRank values are updated only a few times per year, and sometimes from outdated information. The displayed PageRank is therefore not a terribly accurate metric. Google itself is likely using a more current value for rankings.

PageRank is just one factor in the collective algorithm Google uses when building search engine results pages (SERPs). It is still possible for a page with a lower PageRank to rank above one with a higher PageRank for a particular query. PageRank is also relevance agnostic, in that it measures overall popularity using links, and not the subject matter surrounding them. Google also evaluates the relevance of links when calculating search rankings, so PageRank should not be the sole focus of a search engine marketer. Building relevant links will naturally contribute to a higher PageRank. Furthermore, building too many irrelevant links solely for the purpose of increasing PageRank may actually hurt the ranking of a site, because Google attempts to detect and devalue irrelevant links that are presumably used to manipulate it.

PageRank is also widely regarded by users as a trust-building factor, because users will tend to perceive sites with a high value as more reputable or authoritative. Indeed, this is what PageRank is designed to indicate. This perception is encouraged by the fact that Google penalizes spam or irrelevant sites (or individual pages) by reducing or zeroing their PageRank.