Project author: boldkhuu

Project description:
Kafka, Spark Streaming, Spark SQL, JavaScript project
Primary language: Python
Project URL: git://github.com/boldkhuu/BDT-Project.git
Created: 2018-03-17T03:39:06Z
Project community: https://github.com/boldkhuu/BDT-Project


Project - Big Data Technology

Built with: Spark Streaming, Kafka, Spark SQL, JavaScript

Start Hadoop DFS

  1. /usr/local/hadoop/sbin/start-dfs.sh

Start ZooKeeper

  1. /usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties

Start Kafka server

  1. /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
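The Kafka broker connects to ZooKeeper on startup, so it can help to confirm that ZooKeeper is accepting connections before running the Kafka step, and that Kafka is listening before starting the stream jobs. The following is a minimal sketch, assuming the stock config files are unchanged (ZooKeeper on port 2181, Kafka on port 9092). `wait_for_port` is a hypothetical helper and not part of this project.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 2181)` could be called before starting the Kafka server, and `wait_for_port("localhost", 9092)` before the steps that follow. Both port numbers are assumed defaults and should be adjusted if the properties files differ.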

Run the Spark Streaming job

  1. /usr/local/spark/bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.1 spark/sparkStream.py 5
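The `--packages` value is a Maven coordinate of the form `groupId:artifactId:version`. By convention the artifact name ends in the Scala binary version (`2.11` here), and the version (`2.2.1`) is expected to match the installed Spark release. The sketch below splits the coordinate so that it can be compared against the local install. `parse_package` and `SparkPackage` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class SparkPackage(NamedTuple):
    group: str
    artifact: str
    scala_version: str
    version: str


def parse_package(coordinate):
    """Split 'group:artifact_scalaVersion:version' into its parts."""
    group, artifact, version = coordinate.split(":")
    # The Scala binary version follows the last underscore of the artifact name.
    name, _, scala_version = artifact.rpartition("_")
    return SparkPackage(group, name, scala_version, version)


pkg = parse_package("org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.1")
print(pkg.scala_version, pkg.version)  # prints: 2.11 2.2.1
```

Running `/usr/local/spark/bin/spark-submit --version` prints the Spark and Scala versions of the local install, which can be checked against these two values.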

Run Kafka stream

  1. python kafka/twitter.py

Start Node server

  1. node node/server.js

Show the graph

  1. Go to http://localhost:3001
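To confirm that the Node server is serving the page before opening a browser, the check can be scripted. This is a minimal sketch using only the Python standard library. `page_is_up` is a hypothetical helper, and it assumes the server answers a plain GET on the root path with a 2xx status.

```python
import http.client
import urllib.request


def page_is_up(url, timeout=3.0):
    """Return True if a GET request to url succeeds with a 2xx status."""
    # An empty ProxyHandler bypasses any configured proxy, since the target is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False
```

Calling `page_is_up("http://localhost:3001")` returns `False` while the Node server is still starting or has exited.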

Project 2 - Basketball players dataset querying

Run Spark SQL

  1. cd sql
  2. /usr/local/spark/bin/spark-submit reader.py