Project author: vikrantkakad

Project description:
Basic descriptive and predictive analysis of Red wine quality data using Python
Language: Jupyter Notebook
Repository: git://github.com/vikrantkakad/Red-Wine-Quality-Analysis.git
Created: 2018-05-20T22:13:48Z
Project page: https://github.com/vikrantkakad/Red-Wine-Quality-Analysis

License: MIT License



Red-Wine-Quality-Analysis

Basic descriptive and predictive analysis of Red wine quality data using Python.

Welcome, and thank you for opening this project. It contains a Jupyter notebook that introduces novice data scientists to basic data analysis and machine learning concepts such as:

  • Data Extraction
    • Downloading a publicly available dataset
    • Describing the dataset
    • Describing the research question
  • Data Pre-processing
    • Cleaning/removing invalid values from rows
    • Cleaning up columns
    • Removing/filling missing data
    • Creating new columns
    • Modifying existing columns
  • Data Visualization
  • Exploratory Data Analysis
  • Descriptive Analytics
  • Prediction and Model Selection
  • Classification
  • Deriving Conclusion/Insights from the data
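
The pre-processing steps listed above (handling invalid values, filling missing data, creating a new column) can be sketched roughly as follows. This is not the notebook's own code: it is a dependency-free illustration, and the helper names (`load_rows`, `to_float`, `preprocess`), the `is_good` column, the threshold of 7, and the three-row sample are all assumptions made up for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical three-row sample in a semicolon-separated layout. The values
# are made up for illustration and are not taken from the real dataset.
SAMPLE = """alcohol;pH;quality
9.4;3.51;5
;3.20;6
12.8;3.30;7
"""


def load_rows(text):
    """Parse semicolon-separated text into a list of dicts of strings."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def to_float(value):
    """Convert a cell to float; return None for blank or invalid cells."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def preprocess(rows, good_threshold=7):
    """Convert cells to numbers, fill gaps with the column mean, add a label.

    good_threshold is an assumed cut-off chosen only for this example.
    """
    columns = list(rows[0].keys())
    numeric = [{c: to_float(r[c]) for c in columns} for r in rows]

    # Fill missing or invalid cells with the mean of the valid cells.
    for c in columns:
        fill = mean(r[c] for r in numeric if r[c] is not None)
        for r in numeric:
            if r[c] is None:
                r[c] = fill

    # Create a new boolean column derived from an existing one.
    for r in numeric:
        r["is_good"] = r["quality"] >= good_threshold
    return numeric


cleaned = preprocess(load_rows(SAMPLE))
```

Filling with the column mean is only one option; dropping incomplete rows is an equally reasonable choice for a dataset of this size. The sketch assumes every column has at least one valid value.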

Dataset:

Name: Red Wine Quality Data Set

Source: UCI Machine Learning Repository

Input variables:

  • fixed acidity
  • volatile acidity
  • citric acid
  • residual sugar
  • chlorides
  • free sulfur dioxide
  • total sulfur dioxide
  • density
  • pH
  • sulphates
  • alcohol

Output variable: quality (score between 0 and 10)

Data Set Characteristics: Multivariate

Number of Observations: 1599

Number of Attributes/Variables: 12 (11 input variables plus the quality output)

Missing Values: N/A
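
As a quick sanity check, the characteristics above can be verified against a loaded copy of the data. The sketch below is a minimal illustration, not the notebook's code: the `summarize` helper is hypothetical, the semicolon delimiter is an assumption about how the file is stored, and the single demo row is illustrative only.

```python
import csv
import io

# Column names as listed in this README: 11 inputs plus the output variable.
INPUT_COLUMNS = [
    "fixed acidity",
    "volatile acidity",
    "citric acid",
    "residual sugar",
    "chlorides",
    "free sulfur dioxide",
    "total sulfur dioxide",
    "density",
    "pH",
    "sulphates",
    "alcohol",
]
TARGET_COLUMN = "quality"
EXPECTED_COLUMNS = INPUT_COLUMNS + [TARGET_COLUMN]


def summarize(text, delimiter=";"):
    """Check the header and count observations, attributes, and blank cells."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {header}")
    rows = [row for row in reader if row]  # skip empty lines
    missing = sum(1 for row in rows for cell in row if cell.strip() == "")
    return {
        "observations": len(rows),
        "attributes": len(header),
        "missing": missing,
    }


# Illustrative one-row input; pass the full file's text in practice.
demo = ";".join(EXPECTED_COLUMNS) + "\n7.4;0.7;0;1.9;0.076;11;34;0.9978;3.51;0.56;9.4;5\n"
summary = summarize(demo)
```

Run on the full file, the summary would be expected to match the figures listed above: 1599 observations and 12 attributes.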