Georgia Tech - OMSCS - CS7641 - Machine Learning Repository
https://github.com/ezerilli/CS7641-Machine_Learning
The following steps set up the working environment for CS7641 - Machine Learning
in the OMSCS program. 👨🏻💻📚
Installing the conda environment gives you a ready-to-use setup for running the Python scripts without having to
worry about package names and versions. Alternatively, you can install each of the packages listed in
requirements.yml on your own with pip or conda.
Start by installing Conda for your operating system following the instructions here.
Now install the environment described in requirements.yml:
conda env create -f requirements.yml
To activate the environment run:
conda activate CS7641
Once inside the environment, run a Python file with:
python my_file.py
To deactivate the environment run:
conda deactivate
During the semester I may need to add new packages to the environment. To update it, run:
conda env update -f requirements.yml
This assignment aims to explore 5 Supervised Learning algorithms (k-Nearest Neighbors, Support Vector Machines,
Decision Trees, AdaBoost and Neural Networks), performing model complexity analysis and producing learning curves
while comparing their performance on two interesting datasets: the Wisconsin Diagnostic Breast Cancer (WDBC)
dataset and the Handwritten Digits Image Classification dataset (the famous MNIST).
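As a minimal sketch of the kind of model complexity analysis involved (assuming scikit-learn, which the conda environment presumably provides; the parameter range is illustrative, not the assignment's actual grid), a validation curve for k-NN on WDBC can be produced like this:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import validation_curve
from sklearn.neighbors import KNeighborsClassifier

# WDBC ships with scikit-learn, so no download is needed
X, y = load_breast_cancer(return_X_y=True)

# Model complexity analysis: vary k and record train/validation accuracy
param_range = np.arange(1, 10, 2)
train_scores, val_scores = validation_curve(
    KNeighborsClassifier(), X, y,
    param_name="n_neighbors", param_range=param_range, cv=5)

print(val_scores.mean(axis=1))  # mean cross-validated accuracy per value of k
```

Plotting the mean train and validation scores against `param_range` gives the validation curve; `learning_curve` from the same module works analogously for learning curves.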
The assignment consists of two parts:
experiment 1, producing validation curves, learning curves and performances on the test set, for each of the
algorithms, on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset.
experiment 2, producing validation curves, learning curves and performances on the test set, for each of the
algorithms, on the Handwritten Digits Image Classification (MNIST) dataset.
To run the experiments:
cd Supervised_Learning
python run_experiments.py
Figures will show up progressively. Performing all the experiments and hyperparameter optimizations takes a while,
but all figures have already been saved into the images directory. Theory, results and experiments are discussed in
the report (not provided here due to Georgia Tech’s Honor Code).
This assignment aims to explore some algorithms in Randomized Optimization, namely Randomized Hill Climbing (RHC),
Simulated Annealing (SA), Genetic Algorithms (GA) and Mutual-Information Maximizing Input Clustering (MIMIC), while
comparing their performance on three interesting discrete optimization problems: the Travelling Salesman Problem,
Flip Flop and 4-Peaks. Moreover, RHC, SA and GA will later be compared to Gradient Descent and Backpropagation on a
(nowadays) fundamental optimization problem: training complex Neural Networks.
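To illustrate the idea behind these methods (a pure-Python sketch, not the assignment code; the function names are mine), here is Randomized Hill Climbing on the Flip Flop problem, whose fitness counts consecutive bits that differ:

```python
import random

def flip_flop(state):
    # Fitness: number of consecutive pairs of bits that differ
    return sum(a != b for a, b in zip(state, state[1:]))

def random_hill_climb(n=20, iters=2000, seed=0):
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n)]
    best = flip_flop(state)
    for _ in range(iters):
        # Flip one random bit; keep the neighbor if it is no worse
        i = rng.randrange(n)
        neighbor = state[:]
        neighbor[i] ^= 1
        fitness = flip_flop(neighbor)
        if fitness >= best:
            state, best = neighbor, fitness
    return state, best

state, best = random_hill_climb()
print(best)  # the global optimum is n - 1 (a perfectly alternating string)
```

Simulated Annealing differs only in also accepting worse neighbors with a temperature-dependent probability, which helps escape the plateaus this simple climber can get stuck on.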
The assignment consists of four parts:
experiment 1, producing curves for RHC, SA, GA and MIMIC on the Travelling Salesman Problem.
experiment 2, producing curves for RHC, SA, GA and MIMIC on the Flip Flop problem.
experiment 3, producing curves for RHC, SA, GA and MIMIC on the 4-Peaks problem.
experiment 4, comparing RHC, SA and GA to Gradient Descent and Backpropagation on Neural Network training.
To run the experiments:
cd Randomized_Optimization
python run_experiments.py
Figures will show up progressively. Performing all the experiments and hyperparameter optimizations takes a while,
but all figures have already been saved into the images directory. Theory, results and experiments are discussed in
the report (not provided here due to Georgia Tech’s Honor Code).
This assignment aims to explore some algorithms in Unsupervised Learning, namely Principal Components Analysis (PCA),
Kernel PCA (KPCA), Independent Components Analysis (ICA), Random Projections (RP), k-Means and
Gaussian Mixture Models (GMM), while comparing their performance on two interesting datasets: the
Wisconsin Diagnostic Breast Cancer (WDBC) and the Handwritten Digits Image Classification (the famous MNIST).
Moreover, their contribution to Neural Networks in the supervised setting will be assessed.
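As an illustrative sketch (again assuming scikit-learn; the choice of 2 components and 2 clusters is for illustration only) of chaining dimensionality reduction and clustering, PCA followed by k-Means on WDBC might look like:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X, _ = load_breast_cancer(return_X_y=True)

# Standardize features, project onto the top 2 principal components,
# then cluster the projected points with k-Means
X_std = StandardScaler().fit_transform(X)
X_2d = PCA(n_components=2, random_state=0).fit_transform(X_std)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_2d)

print(X_2d.shape, sorted(set(labels)))
```

The cluster labels found this way (or the reduced features themselves) can then be fed to a neural network to assess the contribution of the unsupervised step, as the assignment describes.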
The assignment consists of two parts:
experiment 1, producing curves for dimensionality reduction, clustering and neural networks with unsupervised techniques
on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset.
experiment 2, producing curves for dimensionality reduction, clustering and neural networks with unsupervised techniques
on the Handwritten Digits Image Classification (MNIST) dataset.
To run the experiments:
cd Unsupervised_Learning
python run_experiments.py
Figures will show up progressively. Performing all the experiments and hyperparameter optimizations takes a while,
but all figures have already been saved into the images directory. Theory, results and experiments are discussed in
the report (not provided here due to Georgia Tech’s Honor Code).
This assignment aims to explore some algorithms in Reinforcement Learning, namely Value Iteration (VI),
Policy Iteration (PI) and Q-Learning, while comparing their performance on two interesting MDPs: the
Frozen Lake environment from OpenAI gym and the Gambler’s Problem from Sutton and Barto.
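As a minimal pure-Python sketch of Value Iteration on the Gambler's Problem (illustrative, not the assignment code; the function name is mine), with head probability 0.4 as in Sutton and Barto:

```python
def gamblers_value_iteration(p_h=0.4, goal=100, theta=1e-9):
    # V[s] = probability of reaching the goal from capital s under the optimal policy
    V = [0.0] * (goal + 1)
    V[goal] = 1.0  # reaching the goal is the only rewarded outcome
    while True:
        delta = 0.0
        for s in range(1, goal):
            # Stakes are limited by current capital and by what is needed to reach the goal
            best = max(p_h * V[s + a] + (1 - p_h) * V[s - a]
                       for a in range(1, min(s, goal - s) + 1))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    return V

V = gamblers_value_iteration()
print(round(V[50], 4))  # betting everything at 50 wins with probability p_h = 0.4
```

Policy Iteration alternates full policy evaluation with greedy improvement instead of sweeping Bellman optimality backups, and Q-Learning estimates the same values model-free from sampled transitions.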
The assignment consists of two parts:
experiment 1, producing curves for VI, PI and Q-Learning on the Frozen Lake environment from OpenAI gym.
experiment 2, producing curves for VI, PI and Q-Learning on the Gambler’s Problem from Sutton and Barto.
To run the experiments:
cd Markov_Decision_Processes
python run_experiments.py
Figures will show up progressively. Performing all the experiments and hyperparameter optimizations takes a while,
but all figures have already been saved into the images directory. Theory, results and experiments are discussed in
the report (not provided here due to Georgia Tech’s Honor Code).