Project author: zhengy001
Project description:
PageRank algorithm implementation
Language: Java
Project URL: git://github.com/zhengy001/PageRank.git
Overview
A PageRank algorithm implementation using Hadoop MapReduce and Java.
Steps
- Build a relationship model between pages from the transition matrix input (transition.txt)
- Calculate the weight, or transition factor, between pages
- PageRank1 = Transition X PageRank0
- Sum up the unit weights for each page to get the new rank model
- Repeat the steps above N times so that the ranks converge
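
The steps above can be sketched in plain Java, without Hadoop, to show the arithmetic of a single iteration. This is a minimal illustration, not the code in this repository: the class name `PageRankSketch`, its method names, and the in-memory map layout are all hypothetical, and it assumes each page spreads its rank evenly over its outgoing links with no damping factor.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of the steps above (not the repository's Hadoop jobs).
class PageRankSketch {

    // "Map"-like phase: each page sends rank * transition factor to every page it links to.
    // Assumption: the transition factor is 1 / (number of outgoing links).
    static List<Map.Entry<String, Double>> unitWeights(
            Map<String, List<String>> links, Map<String, Double> rank) {
        List<Map.Entry<String, Double>> units = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : links.entrySet()) {
            List<String> targets = e.getValue();
            if (targets.isEmpty()) {
                continue; // a page with no outgoing links contributes nothing in this sketch
            }
            double factor = 1.0 / targets.size();
            double fromRank = rank.getOrDefault(e.getKey(), 0.0);
            for (String to : targets) {
                units.add(new AbstractMap.SimpleEntry<>(to, factor * fromRank));
            }
        }
        return units;
    }

    // "Reduce"-like phase: sum the unit weights per target page to get the new rank.
    static Map<String, Double> sumUnits(
            List<Map.Entry<String, Double>> units, Map<String, Double> previousRank) {
        Map<String, Double> next = new HashMap<>();
        for (String page : previousRank.keySet()) {
            next.put(page, 0.0); // pages with no incoming links end up with rank 0
        }
        for (Map.Entry<String, Double> unit : units) {
            next.merge(unit.getKey(), unit.getValue(), Double::sum);
        }
        return next;
    }

    // Repeat PageRank1 = Transition X PageRank0 a fixed number of times.
    static Map<String, Double> iterate(
            Map<String, List<String>> links, Map<String, Double> rank0, int times) {
        Map<String, Double> rank = new HashMap<>(rank0);
        for (int i = 0; i < times; i++) {
            rank = sumUnits(unitWeights(links, rank), rank);
        }
        return rank;
    }

    public static void main(String[] args) {
        // Made-up three-page graph: A -> B, C; B -> C; C -> A.
        Map<String, List<String>> links = new HashMap<>();
        links.put("A", Arrays.asList("B", "C"));
        links.put("B", Arrays.asList("C"));
        links.put("C", Arrays.asList("A"));

        Map<String, Double> rank0 = new HashMap<>();
        rank0.put("A", 1.0 / 3);
        rank0.put("B", 1.0 / 3);
        rank0.put("C", 1.0 / 3);

        System.out.println(iterate(links, rank0, 5));
    }
}
```

In this sketch `unitWeights` plays the role of the unit output and `sumUnits` the role of the summing step. A page without outgoing links simply drops its rank here, which a fuller implementation would likely handle differently.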
How to run
$ hadoop jar pagerank.Driver -trans /transition -rank /pagerank -unit /output -times 5
usage: pagerank.Driver
- -rank input rank file directory
- -times number of iterations to run
- -trans input transition file directory
- -unit output directory for the unit weights