Predicting whether symptomatic patients have COVID-19 based on clinical variables (blood-related variables, urine-related variables, age, etc.)
The main goal is to propose an efficient and transparent/interpretable ML solution for diagnosing suspected COVID-19 cases from clinical variables.
A secondary goal is to eliminate the least important features while preserving the predictive power of the solution.
Classification algorithms utilized in the solution:
Interpretation algorithms:
Original dataset comes from a Kaggle Competition held by Einstein Data4u. It can be found at:
https://www.kaggle.com/dataset/e626783d4672f182e7870b1bbe75fae66bdfb232289da0a61f08c2ceb01cab01/tasks?taskId=645
Specific steps adopted in this work:
Link: https://doi.org/10.48011/asba.v2i1.1590
Abstract:
This work proposes an interpretable machine learning approach to diagnose suspected COVID-19 cases based on clinical variables. Results obtained for the proposed models have F-2 measure superior to 0.80 and accuracy superior to 0.85. Interpretation of the linear model feature importance brought insights about the most relevant features. Shapley Additive Explanations were used in the non-linear models. They were able to show the difference between positive and negative patients as well as offer a global interpretability sense of the models.
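The abstract reports results as F-2 measure and accuracy. As a minimal sketch of how those two metrics are computed for binary labels, the snippet below implements the standard F-beta formula in plain Python. The function names `f_beta` and `accuracy` and the example labels are illustrative assumptions, not code or data from this repository; scikit-learn's `sklearn.metrics.fbeta_score` offers an equivalent library implementation.

```python
def f_beta(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels, where 1 marks the positive class.

    beta > 1 weights recall more heavily than precision; beta=2 gives F-2.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


if __name__ == "__main__":
    # Made-up labels for illustration only (1 = COVID-19 positive).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
    print("F-2:", f_beta(y_true, y_pred, beta=2.0))
    print("F-1:", f_beta(y_true, y_pred, beta=1.0))
    print("accuracy:", accuracy(y_true, y_pred))
```

In this toy example every positive patient is caught at the cost of two false alarms, so F-2 comes out higher than F-1. That behavior is why a recall-weighted metric suits a screening setting where missed positive cases are the costlier error.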
If you enjoy this work, please cite it as:
@article{thimoteo_vellasco_amaral_figueiredo_yokoyama_marques_2020,
  title={Interpretable Machine Learning for COVID-19 Diagnosis Through Clinical Variables},
  DOI={10.48011/asba.v2i1.1590},
  journal={Anais do Congresso Brasileiro de Automática 2020},
  author={Thimoteo, Lucas M. and Vellasco, Marley M. and Amaral, Jorge M. Do and Figueiredo, Karla and Yokoyama, Cátia Lie and Marques, Erito},
  year={2020},
  month={Dec}
}