Quote from Wikipedia: A PageRank results from a mathematical algorithm based on the webgraph, created by all World Wide Web pages as nodes and hyperlinks as edges, taking into consideration authority hubs such as cnn.com or usa.gov. The rank value indicates an importance of a particular page. A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank metric of all pages that link to it (“incoming links”). A page that is linked to by many pages with high PageRank receives a high rank itself.
→ Consider a random surfer who proceeds from node to node in a “random walk”, where the surfer follows one of the outgoing links at random. Nodes that have many incoming links are likely to be visited more often.
→ The crux of the PageRank algorithm is that pages which are visited more often are more important.
→ This is formalized by constructing a Markov chain in which each node is a web page and each hyperlink is a possible transition. The steady-state probability of a node is the long-run probability that the random surfer is on that page. The PageRank is derived from these steady-state probabilities.
→ What if a page has no outlinks? To handle this case, a restart operation is introduced, where the surfer jumps to a randomly chosen page.
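The random-surfer idea above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `pagerank`, the `links` dictionary format, and the `damping`, `tol` and `max_iter` parameters are assumptions made for the example. In particular, the sketch also lets the surfer restart with probability `1 - damping` at every step (the commonly used value 0.85 is assumed here), in addition to the restart at pages without outlinks described above.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate steady-state visit probabilities of a random surfer.

    links: dict mapping each page to the list of pages it links to
           (hypothetical input format chosen for this sketch).
    """
    # Collect every page that appears as a source or a target of a link.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Probability mass currently sitting on pages with no outlinks.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        # Every page receives an equal share of the restart probability,
        # plus an equal share of the mass leaving dangling pages.
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        # Each page splits the rest of its rank evenly among its outlinks.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        # Stop once the distribution no longer changes noticeably.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Tiny made-up graph: D has no outlinks and no incoming links.
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": []}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

Repeatedly applying the update is one simple way to approach the steady state of the Markov chain. The scores always sum to 1, so they can be read directly as the probability of finding the surfer on each page.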
Learn more from: