K-means clustering using MLlib
This code was written in Scala on Apache Spark for a graduate course in Big Data Analytics at the University of Toronto in April 2018. It was originally developed in a Zeppelin notebook on Data Scientist Workbench.
The goal of the project was to explore malnutrition survey data collected by the Demographic and Health Surveys (DHS) Program and to apply machine learning techniques to it.
Malnutrition datasets for all available countries from 2008 to 2018 were downloaded from: https://dhsprogram.com/Data/