项目作者: Spirals-Team

项目描述 :
Docker containers to build an Hadoop infrastructure and experiment feedback control loops atop of it.
高级语言: Shell
项目地址: git://github.com/Spirals-Team/hadoop-benchmark.git
创建时间: 2015-12-18T15:51:24Z
项目社区:https://github.com/Spirals-Team/hadoop-benchmark

开源协议:Apache License 2.0

下载


Overview

Hadoop-Benchmark is an open-source research acceleration platform for rapid prototyping and evaluation of self-adaptive behaviors in Hadoop clusters.
The main objectives are to allow researchers to

rapidly prototype, i.e., to experiment with self-adaptation in Hadoop clusters without the need to cope with low-level system infrastructure details,

reproduction, i.e., to share complete experiments for others to reproduce them independently, and

repetition, i.e., to experiment with and to compare their work, re-doing the same experiments on the same system using the same evaluation methods.

It uses docker and docker-machine to easily create a multi-node cluster (on a single laptop or in a cloud including Grid5000) and provision Hadoop.
It contains a number of acknowledged benchmarks and one self-adaptive scenario.

The following is the high-level overview of the created cluster and deployed services:
architecture

Requirements

  • docker >= 1.12
  • docker-machine >= 0.8
  • (optional) R >= 3.3.2 with tidyverse and Hmisc for data analysis

Usage

  1. ./cluster.sh
  2. Usage ./cluster.sh [OPTIONS] COMMAND
  3. Options:
  4. -f, --force Use '-f' in docker commands where applicable
  5. -n, --noop Only shows which commands would be executed wihout actually executing them
  6. -q, --quiet Do not print which commands are executed
  7. Commands:
  8. Cluster:
  9. create-cluster
  10. start-cluster
  11. stop-cluster
  12. restart-cluster
  13. destroy-cluster
  14. status-cluster
  15. Hadoop:
  16. start-hadoop
  17. stop-hadoop
  18. restart-hadoop
  19. destroy-hadoop
  20. Misc:
  21. console Enter a bash console in a container connected to the cluster
  22. run-controller CMD Run a command CMD in the controller container
  23. hdfs CMD Run the HDFS CMD command
  24. hdfs-download SRC Download a file from HDFS SRC to current directory
  25. Info:
  26. shell-init Shows information how to initialize current shell to connect to the cluster
  27. Useful to execute like: 'eval $(./cluster.sh shell-init)'
  28. connect-info Shows information how to connect to the cluster

Documentation