Project author: SoniSiddharth

Project description:
Linear Regression (Gradient Descent) from scratch
Language: Python
Project URL: git://github.com/SoniSiddharth/ML-Linear-Regression-from-scratch.git


Linear Regression ⭐⭐

Directory Structure 📁

│   collinear_dataset.py
│   compare_time.py
│   contour_plot.gif
│   degreevstheta.py
│   gif1.gif
│   gif2.gif
│   linear_regression_test.py
│   line_plot.gif
│   Makefile
│   metrics.py
│   Normal_regression.py
│   plot_contour.py
│   poly_features_test.py
│   README.md
│   surface_plot.gif
│
├───images
│       q5plot.png
│       q6plot.png
│       q8features.png
│       q8samples.png
│
├───linearRegression
│   │   linearRegression.py
│   │   __init__.py
│   │
│   └───__pycache__
│           linearRegression.cpython-37.pyc
│           __init__.cpython-37.pyc
│
├───preprocessing
│   │   polynomial_features.py
│   │   __init__.py
│   │
│   └───__pycache__
│           polynomial_features.cpython-37.pyc
│           __init__.cpython-37.pyc
│
├───temp_images
└───__pycache__
        metrics.cpython-37.pyc

Instructions to run 🏃

make help

make regression

make polynomial_features

make normal_regression

make poly_theta

make contour

make compare_time

make collinear

Stochastic GD (Batch size = 1) ☝️

  • Learning rate type = constant
    RMSE: 0.9119624181584616
    MAE: 0.7126923090787688

  • Learning rate type = inverse
    RMSE: 0.9049599308106121
    MAE: 0.7098334683036919

Vanilla GD (Batch size = N) ✋

  • Learning rate type = constant
    RMSE: 0.9069295672718122
    MAE: 0.7108301179089876

  • Learning rate type = inverse
    RMSE: 0.9607329070540364
    MAE: 0.7641616657610887

Mini-Batch GD (Batch size between 1 and N; here 5) 🤘

  • Learning rate type = constant
    RMSE: 0.9046502501334435
    MAE: 0.7102161700019564

  • Learning rate type = inverse
    RMSE: 0.9268357442221973
    MAE: 0.7309246821952116
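The three settings above differ only in the batch size and the learning-rate schedule. The snippet below is a minimal sketch of what they mean, assuming an MSE loss and an lr/t decay for the "inverse" schedule; the function name `fit_gd` and its parameters are illustrative and not the API of the `linearRegression` module:

```python
import numpy as np

def fit_gd(X, y, batch_size, lr=0.01, lr_type="constant", n_iter=1000, seed=0):
    """Mini-batch gradient descent for linear regression with MSE loss.

    batch_size=1 corresponds to stochastic GD, batch_size=len(X) to vanilla GD.
    lr_type="inverse" decays the step size as lr/t at iteration t.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    theta = np.zeros(D)
    for t in range(1, n_iter + 1):
        idx = rng.choice(N, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        grad = (2 / batch_size) * Xb.T @ (Xb @ theta - yb)   # gradient of the mean squared error
        step = lr if lr_type == "constant" else lr / t
        theta -= step * grad
    return theta

# toy usage on synthetic data
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = X @ np.array([1.0, 2.0, -3.0]) + rng.normal(scale=0.1, size=100)
print(fit_gd(X, y, batch_size=5, lr_type="constant"))   # mini-batch GD, constant learning rate
```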

Polynomial Feature Transformation 🔰

  • The output for [[1, 2]] is [[1, 1, 2, 1, 2, 4]]

  • The output for [[1, 2, 3]] is [[1, 1, 2, 3, 1, 2, 3, 4, 6, 9]]

  • The outputs match those of sklearn’s PolynomialFeatures fit_transform
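For reference, this degree-2 transformation (with the bias term) can be reproduced in a few lines of NumPy. The function below is only a sketch of the idea, not the code in preprocessing/polynomial_features.py:

```python
from itertools import combinations_with_replacement
import numpy as np

def polynomial_features(X, degree=2):
    """Return all monomials of the input features up to `degree`,
    including the bias term, in the same order as sklearn's PolynomialFeatures."""
    X = np.asarray(X, dtype=float)
    out = []
    for row in X:
        feats = []
        for d in range(degree + 1):
            for combo in combinations_with_replacement(range(X.shape[1]), d):
                feats.append(row[list(combo)].prod() if combo else 1.0)
        out.append(feats)
    return np.array(out)

print(polynomial_features([[1, 2]]))      # [[1. 1. 2. 1. 2. 4.]]
print(polynomial_features([[1, 2, 3]]))   # [[1. 1. 2. 3. 1. 2. 3. 4. 6. 9.]]
```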

Theta vs degree 📈

(plot: norm of theta vs degree of the polynomial)

  • Conclusion - As the degree of the polynomial increases, the norm of theta increases because of overfitting.

L2 Norm of Theta vs Degree of Polynomial for varying Sample size 📈

(plot: L2 norm of theta vs degree of the polynomial for varying sample sizes)

Conclusion

  • As the degree increases, the magnitude of theta increases due to overfitting of the data.
  • At the same degree, as the number of samples increases, the magnitude of theta decreases, because more samples reduce overfitting to some extent (see the sketch below).
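A small synthetic illustration of both points: fit polynomials of increasing degree on different sample sizes and compare the resulting L2 norms of theta. The target function, noise level, and degrees below are arbitrary choices, not the repository's exact experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def theta_norm(n_samples, degree):
    """Fit a polynomial of the given degree by least squares and return ||theta||_2."""
    x = rng.uniform(-1, 1, size=n_samples)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=n_samples)
    X = np.vander(x, degree + 1, increasing=True)      # columns 1, x, x^2, ..., x^degree
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.linalg.norm(theta)

for n in (20, 100, 500):
    print(f"N={n:3d}:", [round(theta_norm(n, d), 1) for d in (1, 3, 5, 7, 9)])
```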

Linear Regression line fit 🔥

(animation: regression line fit)

Linear Regression Surface plot 🔥

(animation: surface plot)

Linear Regression Contour plot 🔥

(animation: contour plot)

Time Complexities ⏳

  • Theoretical time complexity of the normal-equation solution is O(N D^2) + O(D^3)
  • Theoretical time complexity of gradient descent is O((t + N) D^2), where t is the number of iterations
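One way to arrive at the O((t + N) D^2) bound is to precompute X^T X and X^T y once and then pay O(D^2) per iteration. The sketch below times both approaches on synthetic data; the sizes and learning rate are arbitrary:

```python
import time
import numpy as np

rng = np.random.default_rng(0)
N, D = 2000, 400
X = rng.normal(size=(N, D))
y = X @ rng.normal(size=D) + rng.normal(scale=0.1, size=N)

# Normal equation: forming X^T X costs O(N*D^2), solving the DxD system costs O(D^3)
t0 = time.perf_counter()
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)
print(f"normal equation: {time.perf_counter() - t0:.3f}s")

# Gradient descent with A = X^T X and b = X^T y precomputed once (O(N*D^2));
# each of the t iterations then costs O(D^2), giving O((t + N)*D^2) overall
t0 = time.perf_counter()
A, b = X.T @ X, X.T @ y
theta_gd = np.zeros(D)
lr, n_iter = 0.1, 200
for _ in range(n_iter):
    theta_gd -= lr * (2 / N) * (A @ theta_gd - b)
print(f"gradient descent ({n_iter} iterations): {time.perf_counter() - t0:.3f}s")
```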

Time vs Number of Features ⏳📊

(plot: fitting time vs number of features)

When the number of samples is kept constant, the normal-equation solution takes more time because its complexity carries a D^3 factor, whereas gradient descent only has a D^2 factor.

Time vs Number of Samples ⏳📊

(plot: fitting time vs number of samples)

When the number of features is kept constant and the number of samples is varied, the time taken by the normal equation is still higher than that of gradient descent because it is computationally more expensive.

Multicollinearity in Dataset ❗ ❗

  • The gradient descent implementation still works in the presence of multicollinearity.
  • However, as the multiplication factor increases, the RMSE and MAE values shoot up sharply.
  • Multicollinearity reduces the precision of the estimated coefficients.
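A minimal sketch of the kind of dataset collinear_dataset.py presumably builds, with one feature an exact multiple of another; the names, factor, and noise level here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, factor = 200, 5
x1 = rng.normal(size=N)
x2 = factor * x1                              # second feature is a multiple of the first
X = np.column_stack([np.ones(N), x1, x2])
y = 3 + 2 * x1 + rng.normal(scale=0.1, size=N)

# X^T X is (numerically) singular, so the normal equation is ill-posed here
print("cond(X^T X):", np.linalg.cond(X.T @ X))

# Plain full-batch gradient descent still drives the training error down,
# even though the individual coefficients of x1 and x2 are not identifiable
theta = np.zeros(X.shape[1])
lr = 0.01
for _ in range(2000):
    theta -= lr * (2 / N) * (X.T @ (X @ theta - y))
rmse = np.sqrt(np.mean((X @ theta - y) ** 2))
print("gradient descent RMSE:", rmse)
```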