predict-gdp-of-canada

About
Technologies Used
Results of the Project
Installation
Data Source
License

About

This project contains two Jupyter notebooks: from-scratch.ipynb and using-sklearn.ipynb.

In from-scratch.ipynb, a linear regression model is built from scratch, using NumPy for the mathematical operations. The model is then trained on the data to predict Canada’s GDP, where the input is the year for which the GDP is wanted.

In from-scratch.ipynb, Gradient Descent and the Normal Equation (since the size of the data is less than 10,000) are used to find the best parameters, and the model is then evaluated on the test data.
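As a rough illustration of the two fitting methods named above, here is a minimal NumPy sketch. It is not the notebook's code: the data is synthetic (a noisy straight line standing in for the real dataset), and the function names, learning rate, and iteration count are assumptions chosen for the example.

```python
import numpy as np

# Synthetic stand-in data (NOT the World Bank dataset): a noisy linear trend.
rng = np.random.default_rng(0)
years = np.arange(1960, 2020, dtype=float)
gdp = 25.0 * (years - 1960) + 40.0 + rng.normal(0.0, 5.0, size=years.shape)

# Standardize the input so gradient descent converges quickly.
x = (years - years.mean()) / years.std()
X = np.column_stack([np.ones_like(x), x])  # design matrix with a bias column
y = gdp


def normal_equation(X, y):
    """Closed-form least squares: solve (X^T X) theta = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def gradient_descent(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on the mean squared error."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(n_iters):
        grad = (2.0 / m) * X.T @ (X @ theta - y)  # gradient of the MSE
        theta -= lr * grad
    return theta


theta_ne = normal_equation(X, y)
theta_gd = gradient_descent(X, y)
print("normal equation :", theta_ne)
print("gradient descent:", theta_gd)
```

On a small dataset like this one, both methods should arrive at essentially the same parameters. The Normal Equation gets there in one step, while Gradient Descent needs a suitable learning rate and scaled inputs.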

In using-sklearn.ipynb, the sklearn module is used, and machine learning techniques such as Cross Validation, Learning Curve analysis, and Parameter Tuning are used to train the model, which is then evaluated on the test data.
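The techniques listed above could look roughly like the following with scikit-learn. This is only a sketch under assumptions: the data is synthetic, and the choice of a `StandardScaler` plus `Ridge` pipeline with an `alpha` grid is hypothetical, made so that there is a parameter to tune. The notebook may use a different estimator and different settings.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import (GridSearchCV, cross_val_score,
                                     learning_curve, train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the World Bank dataset).
rng = np.random.default_rng(0)
years = np.arange(1960, 2020, dtype=float).reshape(-1, 1)
gdp = 25.0 * (years.ravel() - 1960) + 40.0 + rng.normal(0.0, 5.0, size=len(years))

X_train, X_test, y_train, y_test = train_test_split(
    years, gdp, test_size=0.2, random_state=0)

# Hypothetical pipeline: scaling followed by ridge regression.
model = make_pipeline(StandardScaler(), Ridge())

# Cross validation: R^2 on each of 5 folds of the training set.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")

# Learning curve: train/validation scores for growing training-set sizes.
train_sizes, train_scores, val_scores = learning_curve(
    model, X_train, y_train, cv=5,
    train_sizes=np.linspace(0.3, 1.0, 5), scoring="r2")

# Parameter tuning: grid search over the (assumed) regularization strength.
search = GridSearchCV(model, {"ridge__alpha": [0.01, 0.1, 1.0, 10.0]},
                      cv=5, scoring="r2")
search.fit(X_train, y_train)

# Final evaluation on the held-out test data.
test_r2 = search.score(X_test, y_test)
print("CV R^2 per fold:", cv_scores)
print("best params    :", search.best_params_)
print("test R^2       :", test_r2)
```

Keeping the scaler inside the pipeline means it is refit on each training fold, so no information from the validation folds leaks into preprocessing.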

Technologies Used

Python is used as the programming language.

NumPy is used for mathematical operations and data manipulation.

Pandas is used for analysis and manipulation of data.

Matplotlib and Seaborn are used for data visualisation, which helped in the analysis of the data.

Scikit-learn is used for data preprocessing, creating the machine learning model, and evaluating it, thus creating a pipeline.

Pipenv is the virtual environment used for the project. Jupyter Notebook is used for the entire data science and machine learning life cycle.

Results of the Project

The results of from-scratch.ipynb and using-sklearn.ipynb are the same, i.e. the regression model built with the sklearn module and the one built using only NumPy give the same results.

Line Plot

Correlation Matrix

Cross Validation Score

Learning Curve

Fitted Line

Metrics Scores

Actual VS Prediction

Metrics Scores

Installation

It is highly recommended to use a virtual environment for this project to avoid any dependency-related issues.

This project uses pipenv.

The file 'Predict-GDP-of-Canada'/requirements.txt lists all the dependencies for this project.

First, clone the repository


git clone https://github.com/AkashSDas/Predict-GDP-of-Canada

Next, install pipenv if you don’t have it


pip install pipenv

Once installed, access the venv folder inside the project folder


cd 'Predict-GDP-of-Canada'/venv/

Create the virtual environment


pipenv install

The project’s Pipfile must be present in order to replicate the project’s virtual environment.

This will install all the dependencies and create a Pipfile.lock (this should not be altered).

Enable the virtual environment


pipenv shell

The dataset, Jupyter notebook, and model are in the 'Predict-GDP-of-Canada'/venv/src folder.


cd src/

To start/view the jupyter notebook


jupyter notebook

This will open a webpage in the browser. From there, you can click on notebook.ipynb to view it.

Data Source

The source of the data used here is the World Bank national accounts data, and OECD National Accounts data files.

License

This project is licensed under the MIT License - see the MIT LICENSE file for details.
