Project author: Re1tReddy

Project description:
Hadoop Programming
Primary language: Java
Project URL: git://github.com/Re1tReddy/Hadoop.git
Created: 2015-05-12T12:08:40Z
Project community: https://github.com/Re1tReddy/Hadoop

Open-source license:

Hadoop

Guys,

Here you will find sample MapReduce programs that process different types of files: Text, PDF, CSV, Log, XML, Doc/Docx, XLS/XLSX, etc.

You can also find a single program that can read any of the file types listed above.
You can also find how to write custom data types and custom Partitioners in MapReduce.
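
To give a feel for custom data types and custom Partitioners, here is a minimal plain-Java sketch. It deliberately does not depend on the Hadoop jars, so it is not the code from this repo. In Hadoop, a custom value type implements `Writable` with `write(DataOutput)` and `readFields(DataInput)` (keys implement `WritableComparable`), and a custom partitioner extends `Partitioner` and overrides `getPartition(key, value, numPartitions)`. The names `CustomTypeSketch`, `WordCountPair` and the first-letter routing rule are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class CustomTypeSketch {

    // Hypothetical value type. In a real job it would declare "implements Writable".
    static class WordCountPair {
        String word = "";
        int count;

        // Keep a no-arg constructor: Hadoop creates Writable instances reflectively.
        WordCountPair() {}

        WordCountPair(String word, int count) {
            this.word = word;
            this.count = count;
        }

        // Serialize the fields in a fixed order.
        void write(DataOutput out) throws IOException {
            out.writeUTF(word);
            out.writeInt(count);
        }

        // Read the fields back in exactly the same order they were written.
        void readFields(DataInput in) throws IOException {
            word = in.readUTF();
            count = in.readInt();
        }
    }

    // Same shape as Partitioner.getPartition, minus the value argument.
    // Assumed rule: route keys by first letter; anything else shares one extra bucket.
    static int getPartition(String key, int numPartitions) {
        if (key.isEmpty()) {
            return 0;
        }
        char c = Character.toLowerCase(key.charAt(0));
        int bucket = (c >= 'a' && c <= 'z') ? c - 'a' : 26;
        return bucket % numPartitions;
    }

    // Writes a pair to a byte buffer and reads it back, as the framework would
    // do when moving records between map and reduce tasks.
    static WordCountPair roundTrip(WordCountPair pair) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            pair.write(new DataOutputStream(buffer));
            WordCountPair copy = new WordCountPair();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WordCountPair copy = roundTrip(new WordCountPair("hadoop", 3));
        System.out.println(copy.word + " " + copy.count);
        System.out.println("partition for 'hadoop' with 4 reducers: " + getPartition("hadoop", 4));
    }
}
```

The one rule that matters for both pieces: `readFields` must mirror `write` field for field, and `getPartition` must always return a value between 0 and `numPartitions - 1`.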

trendfinder Folder:

In the trendfinder folder you will find how to work with multiple Mappers and Reducers.
Here Twitter data is processed based on how often each tweet occurs.
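
As a rough picture of what chaining Mappers and Reducers can look like, here is a plain-Java sketch with two stages. It is not the repo's code and does not use the Hadoop API. It assumes the first job counts how often each tweet occurs and a second job orders the results by that count; the class name `TrendSketch` and the normalization (trim plus lower-case) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TrendSketch {

    // Stage 1 (first mapper/reducer pair): emit (tweet, 1) and sum per tweet.
    static Map<String, Integer> countOccurrences(List<String> tweets) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tweet : tweets) {
            counts.merge(tweet.trim().toLowerCase(), 1, Integer::sum);
        }
        return counts;
    }

    // Stage 2 (second mapper/reducer pair): order by count, highest first.
    // Ties are broken alphabetically so the output is deterministic.
    static List<String> rank(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> ranked = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            ranked.add(entry.getKey() + "\t" + entry.getValue());
        }
        return ranked;
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList("#hadoop", "#java", "#Hadoop", "#spark", "#java", "#hadoop ");
        for (String line : rank(countOccurrences(tweets))) {
            System.out.println(line);
        }
    }
}
```

In a real MapReduce setup the output directory of the first job would be the input directory of the second one.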

logfiles Folder:

It contains a program that counts the number of views per hour for a particular website.
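
The per-hour counting idea can be sketched in plain Java like this. It is only an illustration, not the program in the folder: it assumes the log lines use the Common Log Format timestamp (for example `[12/May/2015:14:03:27 +0000]`), and the class name `HourlyViewCount` is made up. The "map" step turns each line into a date-plus-hour key, and the "reduce" step sums the lines per key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HourlyViewCount {

    // Captures the date and the two-digit hour from a Common Log Format timestamp.
    private static final Pattern TIMESTAMP =
            Pattern.compile("\\[(\\d{2}/\\w{3}/\\d{4}):(\\d{2}):\\d{2}:\\d{2} [+-]\\d{4}\\]");

    // "Map" step: returns a key such as "12/May/2015 14", or null for a malformed line.
    static String hourKey(String logLine) {
        Matcher matcher = TIMESTAMP.matcher(logLine);
        return matcher.find() ? matcher.group(1) + " " + matcher.group(2) : null;
    }

    // "Reduce" step: adds one view per line under its hour key; malformed lines are skipped.
    static Map<String, Integer> countByHour(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String key = hourKey(line);
            if (key != null) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "10.0.0.1 - - [12/May/2015:14:03:27 +0000] \"GET /index.html HTTP/1.1\" 200 512",
                "10.0.0.2 - - [12/May/2015:14:47:02 +0000] \"GET /about.html HTTP/1.1\" 200 204",
                "10.0.0.1 - - [12/May/2015:15:00:10 +0000] \"GET /index.html HTTP/1.1\" 200 512");
        countByHour(sample).forEach((hour, views) -> System.out.println(hour + "\t" + views));
    }
}
```

In a Hadoop job the mapper would emit the hour key with a count of 1, and the reducer would do the summing.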

pdf Folder:

It shows how to write your own custom FileInputFormat and RecordReader classes in Hadoop in order to process PDF files.
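
To show the contract a custom RecordReader has to honour, here is a plain-Java sketch that borrows the method names of Hadoop's `RecordReader` (`nextKeyValue`, `getCurrentKey`, `getCurrentValue`, `getProgress`) without depending on Hadoop. It is not the repo's implementation. It assumes the PDF text has already been extracted into a String by some PDF library, and that each record is one line keyed by its line number; the class name `PdfLineReaderSketch` is made up.

```java
class PdfLineReaderSketch {

    private final String[] lines;
    private int index = -1; // position of the current record; -1 means "before the first record"

    // extractedText stands in for whatever a PDF library returns for one whole file.
    PdfLineReaderSketch(String extractedText) {
        this.lines = extractedText.isEmpty() ? new String[0] : extractedText.split("\\r?\\n");
    }

    // Advances to the next record; returns false once the input is exhausted.
    boolean nextKeyValue() {
        if (index + 1 >= lines.length) {
            return false;
        }
        index++;
        return true;
    }

    // Key of the current record: the 1-based line number.
    long getCurrentKey() {
        return index + 1;
    }

    // Value of the current record: the text of that line.
    String getCurrentValue() {
        return lines[index];
    }

    // Fraction of the input consumed so far, between 0.0 and 1.0.
    float getProgress() {
        return lines.length == 0 ? 1.0f : (index + 1) / (float) lines.length;
    }

    public static void main(String[] args) {
        PdfLineReaderSketch reader = new PdfLineReaderSketch("first line\nsecond line");
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
    }
}
```

A matching custom FileInputFormat would typically hand back such a reader and mark PDF files as non-splittable, since a PDF cannot be cut at arbitrary byte offsets.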

Java to HDFS Connection

In this package you will find how to connect to HDFS from Java, write a file into HDFS, and display the contents of an HDFS directory from a Java program.


You can reach me with any suggestions or questions at: revanthkumar95@gmail.com
Feel free to share any insights or constructive criticism. Cheers!!

Happy hadooping!