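Nutch Installation and Testing Guide
Installing Ant
Nutch is installed from source and has to be compiled with Ant, so Ant must be installed first.
Download Ant: get apache-ant-1.9.7-bin.tar.gz from http://mirrors.noc.im/apache//ant/binaries/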
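Installing Ant is straightforward. Extract the archive:
tar -zxvf apache-ant-1.9.7-bin.tar.gz
This produces the extracted directory. Add that directory to your environment variables.
Then run source /etc/profile
Running ant -version should now print the Ant version information.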
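The Ant steps above can be sketched as a small shell snippet. This is only an illustration under stated assumptions: the /opt install prefix, the ANT_HOME variable name, and the helper functions ant_profile_lines and install_ant are hypothetical and do not come from the original notes, which only say to add the extracted directory to the environment variables.

```shell
#!/bin/sh
# Sketch only: /opt is an assumed install prefix, not taken from the original text.
ANT_VERSION="1.9.7"
ANT_ARCHIVE="apache-ant-${ANT_VERSION}-bin.tar.gz"
INSTALL_PREFIX="${INSTALL_PREFIX:-/opt}"

# Print the lines that would be appended to /etc/profile for a given Ant directory.
ant_profile_lines() {
    printf 'export ANT_HOME=%s\n' "$1"
    printf 'export PATH=$PATH:$ANT_HOME/bin\n'
}

# Extract the downloaded archive and show what to add to /etc/profile.
install_ant() {
    if [ ! -f "$ANT_ARCHIVE" ]; then
        echo "missing $ANT_ARCHIVE; download it first" >&2
        return 1
    fi
    tar -zxvf "$ANT_ARCHIVE" -C "$INSTALL_PREFIX" || return 1
    ant_profile_lines "$INSTALL_PREFIX/apache-ant-$ANT_VERSION"
}
```

After appending the printed lines to /etc/profile, run source /etc/profile and confirm the setup with ant -version.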
Installing and configuring Nutch
Configure conf/nutch-site.xml
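<property>
  <name>storage.data.store.class</name>
  <value>org.apache.gora.hbase.store.HBaseStore</value>
</property>
<property>
  <name>http.agent.name</name>
  <value>chenkl</value>
</property>
<property>
  <name>http.accept.language</name>
  <value>ja-jp,en-us,en-gb,en;q=0.7,*;q=0.3</value>
</property>
<property>
  <name>parser.character.encoding.default</name>
  <value>utf-8</value>
</property>
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>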
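As a quick sanity check on the settings listed above, a small shell helper can report which of the five property names are still missing from conf/nutch-site.xml. This is a minimal sketch, not part of Nutch: the function name missing_props is hypothetical, and it assumes each name is written on a single line as <name>...</name> with no extra whitespace inside the tag.

```shell
#!/bin/sh
# Sketch only: missing_props is a hypothetical helper, not a Nutch tool.
# Prints every expected property name that is not found in the given file.
missing_props() {
    file="$1"
    for name in \
        storage.data.store.class \
        http.agent.name \
        http.accept.language \
        parser.character.encoding.default \
        plugin.includes
    do
        # -F matches the tag literally, so the dots are not treated as wildcards.
        grep -qF "<name>$name</name>" "$file" || echo "$name"
    done
}
```

Running missing_props conf/nutch-site.xml from the Nutch source directory prints nothing when all five properties are present.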
Configure the ivy/ivy.xml file
Uncomment
Add configuration