Using regression models (one built from scratch and the other built with the sklearn module) to predict Canada's GDP in the upcoming years.
This project contains two Jupyter notebooks: from-scratch.ipynb and using-sklearn.ipynb.
In from-scratch.ipynb a linear regression model is built from scratch, using numpy for the mathematical operations. The model is trained on the data to predict Canada's GDP, where the input is the year for which the GDP is wanted. Gradient Descent and the Normal Equation (practical here since the size of the data is less than 10,000) are used to find the best parameters, and the model is then evaluated using the test data.
In using-sklearn.ipynb the sklearn module is used, and machine learning techniques such as Cross Validation, Analyzing Learning Curves and Parameter Tuning are used to train the model, which is then evaluated with the test data.
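The sklearn workflow described above (cross validation, learning curve, parameter tuning, then evaluation on held-out test data) can be sketched roughly as follows. This is only an illustration, not the code from using-sklearn.ipynb: the data is a synthetic placeholder rather than the World Bank series, and the pipeline steps, the Ridge estimator, the parameter grid and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import (
    GridSearchCV,
    cross_val_score,
    learning_curve,
    train_test_split,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic placeholder data (NOT the real GDP series): a noisy upward trend.
rng = np.random.default_rng(0)
years = np.arange(1960, 2020, dtype=float).reshape(-1, 1)
gdp = 50.0 + 25.0 * (years.ravel() - 1960) + rng.normal(0.0, 40.0, size=len(years))

# Hold out a test set that is only used for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    years, gdp, test_size=0.2, random_state=0
)

# Hypothetical pipeline: scale the year, expand to polynomial terms, fit Ridge.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(include_bias=False)),
    ("model", Ridge()),
])

# Cross validation: R^2 on 5 folds of the training data.
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring="r2")

# Learning curve: scores as the number of training samples grows.
train_sizes, train_scores, val_scores = learning_curve(
    pipeline, X_train, y_train, cv=5,
    train_sizes=np.linspace(0.3, 1.0, 5), scoring="r2",
)

# Parameter tuning: grid search over polynomial degree and regularisation.
param_grid = {"poly__degree": [1, 2, 3], "model__alpha": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

# Final evaluation of the tuned model on the held-out test data.
test_r2 = search.score(X_test, y_test)
print("mean CV R^2:", cv_scores.mean())
print("best parameters:", search.best_params_)
print("test R^2:", test_r2)
```

Plotting the row-wise means of train_scores and val_scores against train_sizes gives the learning curve; a large gap between the two lines would suggest overfitting.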
Python is used as the programming language.
Numpy is used for mathematical operations and data manipulation.
Pandas is used for analysis and manipulation of data.
Matplotlib and Seaborn are used for data visualisation, which helped in the analysis of the data.
Scikit-learn is used for data preprocessing, creating the machine learning model and evaluating it, thus creating a pipeline.
Pipenv is the virtual environment used for the project.
Jupyter Notebook is used for the entire data science and machine learning life cycle.
The results of from-scratch.ipynb and using-sklearn.ipynb are the same, i.e. the regression model built using the sklearn module and the one built using only numpy give the same results.
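Why the two approaches agree can be seen in a small sketch: for ordinary least squares, Gradient Descent, the Normal Equation and sklearn's LinearRegression all minimise the same squared error, so they arrive at the same parameters. The example below is an assumption-laden illustration, not the notebooks' code: it uses synthetic placeholder data, and the learning rate, iteration count and variable names are chosen only for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic placeholder data (NOT the real GDP series): a noisy upward trend.
rng = np.random.default_rng(1)
years = np.arange(1960, 2020, dtype=float)
gdp = 50.0 + 25.0 * (years - 1960) + rng.normal(0.0, 40.0, size=years.size)

# Standardise the input so that gradient descent converges quickly.
x = (years - years.mean()) / years.std()
X = np.column_stack([np.ones_like(x), x])  # bias column plus one feature

# Normal Equation: solve (X^T X) theta = X^T y without forming an explicit inverse.
theta_normal = np.linalg.solve(X.T @ X, X.T @ gdp)

# Batch Gradient Descent on the mean squared error.
theta_gd = np.zeros(2)
learning_rate = 0.1  # assumed value, suitable for the standardised feature
for _ in range(2000):
    gradient = (2.0 / len(gdp)) * X.T @ (X @ theta_gd - gdp)
    theta_gd -= learning_rate * gradient

# The same model fitted with sklearn.
reg = LinearRegression().fit(x.reshape(-1, 1), gdp)
theta_sklearn = np.array([reg.intercept_, reg.coef_[0]])

print("normal equation :", theta_normal)
print("gradient descent:", theta_gd)
print("sklearn         :", theta_sklearn)
```

With enough iterations the three parameter vectors match to within floating-point tolerance, which can be checked with np.allclose.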
It is highly recommended to use a virtual environment for this project to avoid any dependency-related issues. pipenv is used for this project.
There is a requirements.txt file at 'Predict-GDP-of-Canada'/requirements.txt which lists all the dependencies for this project.
git clone https://github.com/AkashSDas/Predict-GDP-of-Canada
Install pipenv if you don't have it:
pip install pipenv
cd 'Predict-GDP-of-Canada'/venv/
pipenv install
The project's Pipfile must be present in order to replicate the project's virtual environment.
This will install all the dependencies and create a Pipfile.lock (this should not be altered).
pipenv shell
Go to the 'Predict-GDP-of-Canada'/venv/src folder.
cd src/
jupyter notebook
This will open a webpage in the browser; from there you can click on a .ipynb notebook to view it.
The source of the data used here is the World Bank national accounts data, and OECD National Accounts data files.
This project is licensed under the MIT License - see the MIT LICENSE file for details.