Project author: Kimonokimo
Project description:
Machine Learning studies at Brandeis University, with my best friends Ran Dou, Tianyi Zhou, Dan Mduduzi, and Siyan Lin.
Primary language: Jupyter Notebook
Project URL: git://github.com/Kimonokimo/Machine-Learning-Projects.git
Machine-Learning-Projects
This repository records all my machine learning practice exercises and projects at Brandeis University.
Regression.Rmd, Classification.Rmd, and Clustering.Rmd contain the code for the Diabetes Diagnosis Machine Learning Project.
Main tasks completed on the Diabetes Diagnosis project:
- Extracted diabetes diagnosis data from the NIH web database, including Glucose level, Body Mass Index, Age, Blood Pressure, Skin Thickness, Insulin, and Diabetes Pedigree Function.
- Cleaned the data through feature engineering and explored the correlation matrix to identify multicollinearity among the features; used the K-means clustering algorithm to identify the inherent structure and patterns in the data.
- Developed predictive models with linear regression, logistic regression, KNN, and decision trees, validated with k-fold cross-validation.
- Compared each model with the baseline linear model by evaluating accuracy through confusion matrices, and improved predictive performance to 90.3% recall.
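
As a rough illustration of the evaluation step, the sketch below shows how accuracy and recall can be read off a binary confusion matrix. It is not taken from the project's Rmd files: the function names are hypothetical, the labels are made up for the example, and it assumes the outcome is coded as 1 for diabetic and 0 for non-diabetic.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, assuming 1 = diabetic."""
    counts = Counter(zip(y_true, y_pred))
    tp = counts[(1, 1)]  # diabetic, predicted diabetic
    fp = counts[(0, 1)]  # non-diabetic, predicted diabetic
    fn = counts[(1, 0)]  # diabetic, predicted non-diabetic
    tn = counts[(0, 0)]  # non-diabetic, predicted non-diabetic
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def recall(tp, fn):
    """Share of actual positives that the model catches."""
    positives = tp + fn
    return tp / positives if positives else 0.0


# Made-up labels for illustration only, not project data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, fp, fn, tn))
print("recall:", recall(tp, fn))
```

Recall is often the more informative number for a diagnostic task, since a missed diabetic case (a false negative) is usually costlier than a false alarm.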