Project author: minhntm

Project description:
Simple examples to familiarize myself with Spark
Primary language: Java
Repository: git://github.com/minhntm/LearnSpark.git
Created: 2017-05-14T16:29:38Z
Project page: https://github.com/minhntm/LearnSpark

License:



Learn Apache Spark By Myself

Simple examples to familiarize myself with Spark

Both the batch jobs and the streaming jobs run locally because of hardware limitations.
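As a rough illustration of what running locally means, here is a minimal sketch that starts Spark inside a single JVM by setting the master to local[*] (one worker thread per CPU core). The class name LocalSum and the method sumLocally are hypothetical and are not taken from this repository; the sketch assumes Spark 2.x or later on the classpath.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical example class, not part of the repository.
public class LocalSum {

    // Sums the numbers with Spark running in local mode (no cluster needed).
    static int sumLocally(List<Integer> numbers) {
        SparkConf conf = new SparkConf()
                .setAppName("LocalSum")
                .setMaster("local[*]"); // run in this JVM, one thread per core

        // JavaSparkContext is Closeable, so try-with-resources stops Spark at the end.
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            return sc.parallelize(numbers).reduce(Integer::sum);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumLocally(Arrays.asList(1, 2, 3, 4))); // prints 10
    }
}
```

On a real cluster the master would normally be supplied through spark-submit rather than hard-coded, so the same job could then run unchanged.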

Learn how to use Spark's core data abstraction, RDDs. Note, however, that Spark now also offers SQL, Datasets, and DataFrames. These APIs are newer than RDDs and give Spark more information about the structure of both the data and the computation being performed.
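To make the difference concrete, the following sketch counts words twice, once with the RDD API and once with the Dataset API. It is only an illustration under assumptions: the class RddVsDataset and its methods are hypothetical names, and the code assumes Spark 2.x or later (where flatMap functions return an Iterator) running in local mode.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import scala.Tuple2;

// Hypothetical example class, not part of the repository.
public class RddVsDataset {

    // RDD version: the programmer spells out each step on opaque Java objects.
    static Map<String, Integer> countWithRdd(JavaSparkContext sc, List<String> lines) {
        return new HashMap<>(sc.parallelize(lines)
                .flatMap(line -> Arrays.asList(line.split(" ")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey(Integer::sum)
                .collectAsMap());
    }

    // Dataset version: the data has a schema (a single "value" column),
    // and the aggregation is expressed as a relational groupBy/count.
    static Map<String, Long> countWithDataset(SparkSession spark, List<String> lines) {
        Dataset<Row> counts = spark.createDataset(lines, Encoders.STRING())
                .flatMap((FlatMapFunction<String, String>) line ->
                        Arrays.asList(line.split(" ")).iterator(), Encoders.STRING())
                .groupBy("value")
                .count();

        Map<String, Long> result = new HashMap<>();
        for (Row row : counts.collectAsList()) {
            result.put(row.getString(0), row.getLong(1));
        }
        return result;
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("RddVsDataset")
                .master("local[*]")
                .getOrCreate();
        JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());

        List<String> lines = Arrays.asList("spark is fast", "spark is simple");
        System.out.println(countWithRdd(sc, lines));
        System.out.println(countWithDataset(spark, lines));

        spark.stop();
    }
}
```

Both methods return the same counts. In the Dataset version Spark knows the column names and types, which is the extra structural information that lets it plan and optimize the query.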

Spark has introduced Structured Streaming, as an alternative to Spark Streaming, for processing structured data streams with relational queries. I haven't used it, but I think it is a newer and better API.
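For readers curious about what a relational query over a stream might look like, here is a hedged sketch based on the standard socket word-count pattern. It is not code from this repository: the class StructuredWordCount, the method wordCounts, and the localhost:9999 socket source are all assumptions chosen for illustration, and Spark 2.x or later is assumed.

```java
import java.util.Arrays;

import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

// Hypothetical example class, not part of the repository.
public class StructuredWordCount {

    // The query is written once; it accepts a batch or a streaming Dataset
    // that has a single string column.
    static Dataset<Row> wordCounts(Dataset<Row> lines) {
        return lines.as(Encoders.STRING())
                .flatMap((FlatMapFunction<String, String>) line ->
                        Arrays.asList(line.split(" ")).iterator(), Encoders.STRING())
                .groupBy("value")
                .count();
    }

    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("StructuredWordCount")
                .master("local[*]")
                .getOrCreate();

        // Assumed source: text lines from a local socket (e.g. `nc -lk 9999`).
        Dataset<Row> lines = spark.readStream()
                .format("socket")
                .option("host", "localhost")
                .option("port", 9999)
                .load();

        // "complete" mode reprints the full updated counts table after each trigger.
        StreamingQuery query = wordCounts(lines).writeStream()
                .outputMode("complete")
                .format("console")
                .start();

        query.awaitTermination();
    }
}
```

Because wordCounts only uses Dataset operations, the same method can also be called on a static Dataset, which is a convenient way to try the query without a running stream.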