Basic descriptive and predictive analysis of Red wine quality data using Python
Welcome, and thank you for opening this project. It contains a Jupyter notebook that introduces novice data scientists to basic data analysis and machine learning concepts, such as:
- Downloading a publicly available dataset
- Describing the dataset
- Describing the research question
- Cleaning/removing invalid values from rows
- Cleaning up columns
- Removing/filling missing data
- Creating new columns
- Modifying existing columns
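
To make the steps above more concrete, here is a minimal sketch of how the cleaning steps might look in pandas. It is not the notebook's actual code. The sample rows are made up (they are not taken from the dataset), the download step is skipped, and the `clean` function, the `is_good` column, the quality threshold of 6, and the median fill are all assumptions chosen for illustration.

```python
import pandas as pd

# Tiny made-up sample (NOT real rows from the dataset). It reuses a few of
# the column names from this README and has problems injected on purpose.
raw = pd.DataFrame({
    "fixed acidity": [7.4, 7.8, None, 11.2, 7.9],   # one missing value
    "pH": [3.51, 3.20, 3.26, -1.0, 3.30],           # -1.0 is not a valid pH
    "alcohol": [9.4, 9.8, 9.8, 9.8, 10.6],
    "quality": [5, 6, 5, 6, 5],
})


def clean(df):
    """Hypothetical helper that applies the cleaning steps in order."""
    out = df.copy()
    # Cleaning up columns: use lowercase snake_case names.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Removing invalid values from rows: keep only rows with 0 <= pH <= 14.
    out = out[out["ph"].between(0, 14)].copy()
    # Filling missing data: replace gaps with the column median (an assumption;
    # dropping the rows would be another reasonable choice).
    out["fixed_acidity"] = out["fixed_acidity"].fillna(out["fixed_acidity"].median())
    # Creating a new column: a binary label, with the threshold 6 assumed.
    out["is_good"] = out["quality"] >= 6
    # Modifying an existing column: round alcohol to whole percent.
    out["alcohol"] = out["alcohol"].round(0)
    return out.reset_index(drop=True)


cleaned = clean(raw)
# Describing the dataset: count, mean, std, min, quartiles and max per column.
summary = cleaned.describe()
print(cleaned)
print(summary)
```

Keeping each step as one commented line inside a single function makes it easy to reorder or swap steps while exploring, for example to drop rows with missing values instead of filling them.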
Name: Red Wine Quality Data Set
Source: UCI Machine Learning Repository
Input variables:
- fixed acidity
- volatile acidity
- citric acid
- residual sugar
- chlorides
- free sulfur dioxide
- total sulfur dioxide
- density
- pH
- sulphates
- alcohol
Output variable: quality (score between 0 and 10)
Data Set Characteristics: Multivariate
Number of Observations: 1599
Number of Attributes/Variables: 12
Missing Values: N/A
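
The description above can be turned into a quick sanity check on a loaded table. The following is a sketch only: `check_wine_frame` and `EXPECTED_COLUMNS` are hypothetical names, the two sample rows are made up, and the semicolon delimiter is an assumption about how the file is stored.

```python
import io

import pandas as pd

# Column names as listed in this README (11 inputs plus the output).
EXPECTED_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol", "quality",
]


def check_wine_frame(df, expected_rows=None):
    """Hypothetical helper: return a list of problems, empty if none found."""
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if expected_rows is not None and len(df) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(df)}")
    n_missing = int(df.isna().sum().sum())
    if n_missing:
        problems.append(f"{n_missing} missing values")
    # The README gives the quality score as between 0 and 10.
    if "quality" in df.columns and not df["quality"].dropna().between(0, 10).all():
        problems.append("quality outside 0-10")
    return problems


# Two made-up rows (not real data), semicolon-separated by assumption.
sample_csv = (
    ";".join(EXPECTED_COLUMNS) + "\n"
    "7.0;0.50;0.10;2.0;0.080;10;30;0.9970;3.40;0.60;10.0;5\n"
    "8.0;0.40;0.20;2.5;0.070;15;45;0.9960;3.30;0.65;11.0;6\n"
)
sample = pd.read_csv(io.StringIO(sample_csv), sep=";")
problems = check_wine_frame(sample, expected_rows=2)
print(problems)
```

With a local copy of the real file, the same check could be run as `check_wine_frame(pd.read_csv(path, sep=";"), expected_rows=1599)`, where `path` is wherever the file was saved.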