Learning PySpark.pdf


谦逊的毛巾
2024-04-19
Spark Data sets Tungsten Distributed https book Customer .fi NLP
13.1 MB

https://www.iteblog.com
Learning PySpark
Table of Contents
Learning PySpark
Credits
Foreword
About the Authors
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Understanding Spark
What is Apache Spark?
Spark Jobs and APIs
Execution process
Resilient Distributed Dataset
DataFrames
Datasets
Catalyst Optimizer
Project Tungsten
Spark 2.0 architecture
Unifying Datasets and DataFrames
Introducing SparkSession
Tungsten phase 2
Structured Streaming
Continuous applications
Summary
2. Resilient Distributed Datasets
Internal workings of an RDD
Creating RDDs
Schema
Reading from files
Lambda expressions
Global versus local scope
Transformations
The .map(...) transformation
The .fi

