项目作者: Lewuathe

项目描述 :
Multiple node cluster on Docker for self development.
高级语言: Shell
项目地址: git://github.com/Lewuathe/docker-hadoop-cluster.git
创建时间: 2015-09-13T14:22:41Z
项目社区:https://github.com/Lewuathe/docker-hadoop-cluster

开源协议:Apache License 2.0

下载


docker-hadoop-cluster Build Status

Multiple node cluster on Docker for self development.
docker-hadoop-cluster is suitable for testing your patch for Hadoop which has multiple nodes.

Build images from your Hadoop source code

Base image of hadoop service. This image includes JDK, hadoop package configurations etc. This image can include your self-build hadoop package.
In order to bind, tar.gz package assumed be put under hadoop-base directory.

  1. $ cd docker-hadoop-cluster
  2. $ cp /path/to/hadoop-3.0.0-alpha3-SNAPSHOT.tar.gz hadoop-base
  3. $ make

Once you build hadoop-base image, you can launch hadoop cluster by using docker-compose.

  1. $ docker-compose up -d

or

  1. $ make run

See http://localhost:9870 for NameNode or http://localhost:8088 for ResourceManager.

  1. $ make down

shutdowns your cluster.

Build images from the latest trunk

docker-hadoop-cluster also uploads the latest image which refers HEAD of trunk. They are deployed on Docker Hub.
If you want to try the trunk (though it can be unstable), docker-compose.yml like below is needed. It will launch 3 slave Hadoop cluster.

  1. version: '2'
  2. services:
  3. master:
  4. image: lewuathe/hadoop-master
  5. ports:
  6. - "9870:9870"
  7. - "8088:8088"
  8. - "19888:19888"
  9. - "8188:8188"
  10. container_name: "master"
  11. slave1:
  12. image: lewuathe/hadoop-slave
  13. container_name: "slave1"
  14. depends_on:
  15. - master
  16. ports:
  17. - "9901:9864"
  18. - "8041:8042"
  19. slave2:
  20. image: lewuathe/hadoop-slave
  21. container_name: "slave2"
  22. depends_on:
  23. - master
  24. ports:
  25. - "9902:9864"
  26. - "8042:8042"
  27. slave3:
  28. image: lewuathe/hadoop-slave
  29. container_name: "slave3"
  30. depends_on:
  31. - master
  32. ports:
  33. - "9903:9864"
  34. - "8043:8042"

Login cluster

  1. $ docker exec -it master bash
  2. bash-4.1# cd /usr/local/hadoop
  3. bash-4.1# bin/hadoop version
  4. Hadoop 3.0.0-SNAPSHOT
  5. Source code repository git://git.apache.org/hadoop.git -r 0c7d3f480548745e9e9ccad1d318371c020c3003
  6. Compiled by lewuathe on 2015-09-13T01:12Z
  7. Compiled with protoc 2.5.0
  8. From source with checksum 9174a352ac823cdfa576f525665e99
  9. This command was run using /usr/local/hadoop-3.0.0-SNAPSHOT/share/hadoop/common/hadoop-common-3.0.0-SNAPSHOT.jar

Deploy on EC2

We can run docker-hadoop-cluster on EC2 instance with ec2/ script.

  1. $ python ec2/ec2.py -k <Key Name> -s <Security Group ID> -n <Subnet ID> launch

This script launch EC2 instance and prepare prerequisites to launch docker-hadoop-cluster.

Docker Hub

Image name Pulls Stars
lewuathe/hadoop-base hadoop-base hadoop-base
lewuathe/hadoop-master hadoop-master hadoop-master
lewuathe/hadoop-slave hadoop-slave hadoop-slave

License

Apache License Version2.0
This images are modified version of sequenceiq/hadoop-docker.