Madison restaurant Yelp rating prediction based on review text
The second project of Spring 2017 Stat 333 is a Kaggle competition in which we are asked to predict Yelp ratings from the text of reviews in the Madison, WI area. Our group ranked first on both the public and private leaderboards 🎉.
Model | Directory Name | Description |
---|---|---|
Deep Learning | ./dl | Use Stanford’s GloVe to vectorize text, and a simple CP-CP-CP neural network |
Linear Regression | ./lr | Use tf-idf text encoding, and lasso, ridge regression and elastic net |
Multiple Linear Regression | ./mrl | Naive simple multiple linear regression with silly variables |
Neural Network | ./nn | Use tf-idf text encoding, and a simple one hidden layer neural network |
Our best model uses ridge regression with tf-idf text encoding. You can check out the self-explanatory Jupyter notebook here.
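As a rough illustration of what tf-idf encoding followed by ridge regression looks like, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the toy reviews, the `alpha` value, the bigram setting, and the helper `predict_stars` are all assumptions made up for this example, as is clipping predictions to the 1-5 star range.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Toy reviews and star ratings, invented purely for illustration.
train_texts = [
    "great food and friendly service",
    "terrible food and rude service",
    "amazing pizza, will come back",
    "cold pizza and slow, rude staff",
    "friendly staff and great coffee",
    "terrible coffee, never again",
]
train_stars = [5, 1, 5, 1, 4, 1]

# tf-idf turns each review into a sparse weighted bag of words (and bigrams);
# ridge regression then fits a linear model with an L2 penalty on the weights.
# alpha=1.0 and ngram_range=(1, 2) are assumed values, not tuned settings.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    Ridge(alpha=1.0),
)
model.fit(train_texts, train_stars)


def predict_stars(texts):
    """Hypothetical helper: predict ratings and clip them to 1-5 stars."""
    return model.predict(texts).clip(1, 5)


predictions = predict_stars([
    "great pizza and friendly staff",
    "rude staff and terrible food",
])
print(predictions)
```

On real data, the regularization strength `alpha` would normally be chosen by cross-validation rather than fixed by hand.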
See our presentation for more information.