Classification of Pneumonia Chest X-Ray Images.
Author: Andy Peng
This repository documents the analysis for the Pneumonia Image Classification project, written up so that the work is accessible and replicable.
The task is to build a model that, given a patient’s chest X-ray image, accurately predicts whether the patient has pneumonia.
Our dataset comes from Kaggle. It contains three folders: training, validation, and testing. Each folder holds chest X-ray images used to train and evaluate the model.
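As a quick sanity check on the dataset layout described above, the sketch below counts the images in each folder. It assumes a `root/split/class/image` layout; the function name `count_images`, the suffix list, and any folder names passed in are illustrative assumptions, not something taken from the notebook.

```python
from pathlib import Path
from typing import Dict

# Assumption: images are stored as .jpeg/.jpg/.png files.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def count_images(root: str) -> Dict[str, Dict[str, int]]:
    """Return {split: {class_name: image_count}} for a root/split/class/image layout."""
    counts: Dict[str, Dict[str, int]] = {}
    for split_dir in sorted(Path(root).iterdir()):
        if not split_dir.is_dir():
            continue
        # Count only image files inside each class subfolder of this split.
        counts[split_dir.name] = {
            class_dir.name: sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts
```

Calling `count_images("path/to/dataset")` (a placeholder path) returns a nested dictionary, which makes any class imbalance between normal and pneumonia images easy to spot before training.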
Normal Chest X-Ray
Pneumonia Chest X-Ray
First Activation of ModelI
Sixth Activation of ModelI
ROC Curve of the different Models
Model Results
To summarize, our goal is to minimize the number of patients we classify as healthy when they do in fact have pneumonia. We therefore want to minimize false negatives, in other words, to maximize recall. Our recommendation is nonetheless to stick with ModelI. ModelB was slightly better in recall and accuracy, but only by a small margin, while ModelI did better in precision, F1 score, and AUC. We therefore consider ModelI the best model to use for predictions.
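To make the metrics compared above concrete, here is a minimal sketch of how recall, precision, F1, accuracy, and AUC can be computed from labels and predictions. It assumes label 1 means pneumonia and label 0 means normal; the function names `binary_metrics` and `roc_auc` are illustrative and are not the code used in the notebook.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, treating label 1 (pneumonia) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sick patients called healthy
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_negatives": float(fn),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Recall is `tp / (tp + fn)`, so every missed pneumonia case lowers it directly, which is why it is the metric most closely tied to the stated goal. The pairwise AUC above is quadratic in the number of examples, which is fine for a sketch but slow on large test sets.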
There are several things we did not do because of time and budget constraints. For example, we could ask a doctor what they look for in a chest X-ray image when determining whether a patient has pneumonia. We could also use cross-validation or gather more data to further improve our models. (Future work: include an RNN model.)
Please review the narrative of our analysis in our Jupyter notebook, or review our presentation (presentation.pdf).
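As an illustration of the cross-validation idea mentioned above, the following sketch splits sample indices into stratified folds so that each fold keeps roughly the same share of normal and pneumonia images. It was not part of this analysis; the function name `stratified_k_fold` and the defaults `k=5` and `seed=0` are assumptions made for the example.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_k_fold(
    labels: Sequence[int], k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, val_indices) pairs with class ratios roughly preserved."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share of it.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    splits = []
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, val))
    return splits
```

Each model would then be trained on `train` and scored on `val` once per fold, and the per-fold recall averaged, which gives a steadier comparison between models than a single validation folder.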
For any additional questions, please contact andypeng93@gmail.com
The repository is structured as follows:
├── README.md <- The top-level README for reviewers of this project.
├── Image Classification.ipynb <- Narrative documentation of the analysis in a Jupyter notebook.
├── presentation.pdf <- PDF version of the project presentation.
└── Visualizations
    └── images <- Both sourced externally and generated from code.