Project author: zhengy001

Project description:
PageRank algorithm implementation
Language: Java
Repository: git://github.com/zhengy001/PageRank.git
Created: 2019-11-04T04:01:21Z
Project community: https://github.com/zhengy001/PageRank


PageRank

Overview

A PageRank algorithm implementation in Java using Hadoop MapReduce.

Steps

  • Build a relationship model from the transition matrix input (transition.txt)
  • Calculate the weight (transition factor) between pages
    • PageRank1 = Transition × PageRank0
  • Sum the unit weights for each page to get the new rank model
  • Repeat the steps above N times so the ranks converge
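
The steps above can be sketched in plain Java, without Hadoop, to show what one iteration computes. This is a minimal illustration and not the repository's code: the class name `PageRankSketch`, the in-memory link map, and the three-page graph are all assumptions made for the example. It also applies no damping factor, since the steps above do not mention one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one PageRank iteration computed in memory.
public class PageRankSketch {

    /** One iteration: newRank = Transition x rank. */
    static Map<String, Double> iterate(Map<String, List<String>> links,
                                       Map<String, Double> rank) {
        Map<String, Double> next = new HashMap<>();
        // Start every known page at 0 so pages without inbound links still appear.
        for (String page : rank.keySet()) {
            next.put(page, 0.0);
        }
        for (Map.Entry<String, List<String>> entry : links.entrySet()) {
            List<String> targets = entry.getValue();
            if (targets.isEmpty()) {
                continue; // assumption: a page with no outgoing links passes on nothing
            }
            // Unit weight: the source page's rank split evenly over its outgoing links.
            double share = rank.getOrDefault(entry.getKey(), 0.0) / targets.size();
            for (String target : targets) {
                // Sum the unit weights arriving at each target page.
                next.merge(target, share, Double::sum);
            }
        }
        return next;
    }

    /** Repeats the iteration a fixed number of times. */
    static Map<String, Double> run(Map<String, List<String>> links,
                                   Map<String, Double> rank, int times) {
        Map<String, Double> current = rank;
        for (int i = 0; i < times; i++) {
            current = iterate(links, current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Illustrative graph: A links to B and C, B links to C, C links to A.
        Map<String, List<String>> links = new HashMap<>();
        links.put("A", Arrays.asList("B", "C"));
        links.put("B", Arrays.asList("C"));
        links.put("C", Arrays.asList("A"));

        Map<String, Double> rank = new HashMap<>();
        rank.put("A", 1.0 / 3);
        rank.put("B", 1.0 / 3);
        rank.put("C", 1.0 / 3);

        System.out.println(run(links, rank, 5));
    }
}
```

In a MapReduce setting, the per-link `share` would typically be emitted by a mapper and the summation done by a reducer keyed on the target page; the sketch folds both into a single loop for readability.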

How to run

$ hadoop jar pagerank.Driver -trans /transition -rank /pagerank -unit /output -times 5

usage: pagerank.Driver

  • -rank   input rank file directory
  • -times  number of iterations to run
  • -trans  input transition file directory
  • -unit   unit output directory
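
As a rough illustration of how the four options listed above could be read, here is a small self-contained sketch. It is not the actual `pagerank.Driver`: the class name `DriverArgsSketch`, the `parse` helper, and the choice to treat all four options as required are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: reading "-option value" pairs like those in the usage listing.
public class DriverArgsSketch {

    /** Parses "-name value" pairs into a map keyed by the option name without the dash. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-") || i + 1 >= args.length) {
                throw new IllegalArgumentException("expected -option value, got: " + args[i]);
            }
            options.put(args[i].substring(1), args[i + 1]);
        }
        // Assumption: all four options are mandatory.
        for (String required : new String[] {"trans", "rank", "unit", "times"}) {
            if (!options.containsKey(required)) {
                throw new IllegalArgumentException("missing option -" + required);
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Fall back to the values from the example command when no arguments are given.
        String[] effective = args.length > 0 ? args : new String[] {
            "-trans", "/transition", "-rank", "/pagerank", "-unit", "/output", "-times", "5"
        };
        Map<String, String> options = parse(effective);
        int times = Integer.parseInt(options.get("times"));
        System.out.println("transition dir: " + options.get("trans"));
        System.out.println("rank dir:       " + options.get("rank"));
        System.out.println("unit dir:       " + options.get("unit"));
        System.out.println("iterations:     " + times);
    }
}
```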