Project author: raaaouf

Project description:
Breast Cancer Detection with the Decision Tree Algorithm and Bagging
Language: Python
Repository: git://github.com/raaaouf/Decision-Tree-for-Breast-cancer-detection.git
Created: 2020-09-06T07:27:27Z
Project page: https://github.com/raaaouf/Decision-Tree-for-Breast-cancer-detection

License: MIT License


Decision-Tree-for-Breast-cancer-detection

An implementation of the decision tree algorithm, with bagging (bootstrap aggregating) as a variance-reduction technique, to predict breast cancer from routine blood tests.

Data

The dataset contains only the following ten attributes:

  • Age: age of the patient (years)
  • BMI: body mass index (kg/m²)
  • Glucose: glucose concentration in blood (mg/dL)
  • Insulin: insulin concentration in blood (microU/mL)
  • HOMA: Homeostatic Model Assessment of insulin resistance, an index derived from glucose multiplied by insulin
  • Leptin: concentration of leptin, a hormone that regulates energy expenditure (ng/mL)
  • Adiponectin: concentration of adiponectin, a protein that regulates glucose levels (µg/mL)
  • Resistin: concentration of resistin, a protein secreted by adipose tissue (ng/mL)
  • MCP.1: concentration of MCP-1, a protein that recruits monocytes to sites of tissue injury or inflammation (pg/dL)
  • Classification: healthy control (1) or patient (2)
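As a minimal sketch of how a table with these ten columns might be split into features and labels, the snippet below reads CSV text using the column names listed above. The function name `split_features_and_labels`, the CSV layout, and the two sample rows are assumptions made for illustration. So is the remapping of the labels from 1/2 to 0/1, which is suggested only by the `[0, 1]` class list used when plotting the confusion matrix.

```python
import csv
import io

# Column names as listed in the attribute description above.
FEATURES = ["Age", "BMI", "Glucose", "Insulin", "HOMA",
            "Leptin", "Adiponectin", "Resistin", "MCP.1"]
LABEL = "Classification"


def split_features_and_labels(csv_file):
    """Return (X, y): feature rows as floats, labels remapped from {1, 2} to {0, 1}."""
    X, y = [], []
    for row in csv.DictReader(csv_file):
        X.append([float(row[name]) for name in FEATURES])
        # 1 = healthy control -> 0, 2 = patient -> 1 (assumed remapping)
        y.append(int(row[LABEL]) - 1)
    return X, y


# Two made-up rows, for illustration only (not taken from the real dataset).
sample = io.StringIO(
    "Age,BMI,Glucose,Insulin,HOMA,Leptin,Adiponectin,Resistin,MCP.1,Classification\n"
    "50,22.5,90,4.0,0.9,10.0,9.0,8.0,400.0,1\n"
    "60,27.0,100,6.0,1.5,20.0,7.0,12.0,500.0,2\n"
)
X, y = split_features_and_labels(sample)
print(len(X), len(X[0]), y)
```

With a real file, the same function could be called on an open file handle instead of the in-memory sample.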

Data Visualisation

The class distribution plot shows two classes with almost the same number of samples.

[Figure: class distribution of the data]
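The class balance can also be checked numerically rather than read off a plot. The sketch below assumes the labels are available as a plain sequence. The helper name `class_balance` and the label values are hypothetical and do not reproduce the real class counts.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: fraction of samples} for a sequence of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical labels for illustration (1 = healthy control, 2 = patient).
labels = [1, 2, 2, 1, 2, 1, 2, 1, 2, 2]
print(class_balance(labels))
```

Fractions close to 0.5 for each class would indicate a roughly balanced dataset.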

Results with bagging

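    from sklearn.ensemble import BaggingClassifier
    from sklearn.metrics import confusion_matrix
    import matplotlib.pyplot as plt

    # Bagging ensemble; scikit-learn's default base estimator is a decision tree
    bagging_clf = BaggingClassifier()
    bagging_clf.fit(X_train, y_train.ravel())
    y_pred_bag = bagging_clf.predict(X_test)

    # plot_confusion_matrix(cm, classes) is assumed to be a helper defined elsewhere in the project
    bag_cm = confusion_matrix(y_test, y_pred_bag)
    plot_confusion_matrix(bag_cm, [0, 1])
    plt.show()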
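The snippet above relies on `X_train`, `X_test`, `y_train` and `y_test` being defined earlier. A self-contained sketch of the same workflow is given below. It uses synthetic stand-in data from `make_classification` rather than the blood-test data, and the split ratio, number of trees and random seeds are all assumptions, not the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with 9 features (not the real blood-test data).
X, y = make_classification(n_samples=200, n_features=9, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Bagging: each tree is fit on a bootstrap resample of the training set,
# and the trees' predictions are aggregated into one prediction.
bagging_clf = BaggingClassifier(DecisionTreeClassifier(random_state=0),
                                n_estimators=50, random_state=0)
bagging_clf.fit(X_train, y_train)

y_pred_bag = bagging_clf.predict(X_test)
bag_cm = confusion_matrix(y_test, y_pred_bag)
print(bag_cm)  # rows = true class, columns = predicted class
```

The base estimator is passed positionally because the keyword for it was renamed between scikit-learn releases.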

[Figure: confusion matrix of the bagging results]

The confusion matrix shows that, with bagging, the model classified every instance in the test set correctly.
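A confusion matrix with every test instance classified correctly has zeros everywhere off the diagonal. As a small sketch, accuracy can be computed directly from the matrix as the diagonal sum divided by the total count. The helper name and the counts below are hypothetical and are not the project's results.

```python
def accuracy_from_confusion_matrix(cm):
    """Fraction of correct predictions: diagonal sum divided by total count."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Hypothetical counts for illustration (rows = true class, columns = predicted).
perfect_cm = [[12, 0], [0, 14]]
imperfect_cm = [[10, 2], [3, 11]]
print(accuracy_from_confusion_matrix(perfect_cm))
print(accuracy_from_confusion_matrix(imperfect_cm))
```

The function accepts nested lists as well as a NumPy array such as the one returned by `confusion_matrix`.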

License

MIT License