Project author: tezzytezzy

Project description:
Complete Guide to Data Munging
Primary language: Jupyter Notebook
Repository URL: git://github.com/tezzytezzy/us-flight-delay.git
Created: 2019-10-24T02:38:51Z
Project page: https://github.com/tezzytezzy/us-flight-delay


US Flight Delay

Objective

Demonstrate data munging through the following actions.

:one: Distribution assessment by checking and eliminating samples with Not a Number (NaN) fields
:two: Outlier elimination via Tukey Fences and Z-Scores
:three: Categorical value transformation and encoding
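
The three actions above can be sketched on a toy table as follows. This is a minimal illustration, not the notebook's actual code: the column names `carrier` and `delay_min`, the sample values, and the helper functions are all hypothetical, and `pandas.get_dummies` is swapped in for the `category-encoders` package so that the example depends only on pandas and NumPy.

```python
import numpy as np
import pandas as pd


def drop_nan_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the rows that have no NaN in any column."""
    return df.dropna(axis=0, how="any")


def tukey_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """True where a value lies inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return values.between(q1 - k * iqr, q3 + k * iqr)


def zscore_mask(values: pd.Series, threshold: float = 3.0) -> pd.Series:
    """True where |z| <= threshold, using the population standard deviation."""
    z = (values - values.mean()) / values.std(ddof=0)
    return z.abs() <= threshold


# Hypothetical toy data; the real columns of flight_data.csv are assumptions.
flights = pd.DataFrame({
    "carrier": ["AA", "DL", "AA", "UA", "DL", "UA", "AA", "DL"],
    "delay_min": [5.0, 7.0, np.nan, 6.0, 8.0, 4.0, 300.0, 5.0],
})

# 1. Eliminate samples with NaN fields.
clean = drop_nan_rows(flights)

# 2. Eliminate outliers with Tukey fences (the 300-minute delay is dropped).
inliers = clean[tukey_mask(clean["delay_min"])]

# Z-score alternative. With only 7 rows, |z| cannot exceed about 2.45,
# so this toy example uses a threshold of 2 instead of the usual 3.
z_inliers = clean[zscore_mask(clean["delay_min"], threshold=2.0)]

# 3. One-hot encode the categorical column.
encoded = pd.get_dummies(inliers, columns=["carrier"])
print(encoded)
```

On a dataset the size of the real file, the conventional z-score threshold of 3 is usable. Tukey fences are often preferred when the delay distribution is heavily skewed, because quartiles are less affected by extreme values than the mean and standard deviation.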

Installation

  1. pandas 0.25.2 py37he6710b0_0
  2. pyarrow 0.13.0 py37he6710b0_0
  3. category-encoders 2.1.0 pypi_0 pypi

Data Source

flight_data.csv (393 MB)

Reference

Python Machine Learning by Wei-Ming Lee (ISBN: 978-1-119-54563-7)