Hierarchical Attention Networks

The Yelp data is downloaded from Duyu Tang’s homepage (the same dataset used in Yang’s paper).
Download link: http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp-2015-data.7z

Put the data in a directory named “data/yelp_YEAR/” (where “YEAR” is the year).
Run “yelp-preprocess.ipynb” to preprocess the data. The output format is “label \t\t sentence1 \t sentence2…”.
Then run “word2vec.ipynb” to train a word2vec model on the training set.
Run “HAN.ipynb” to train the model.
Run “case_study.ipynb” to visualize some examples from the validation set, including the attention vectors (sentence-level and word-level) and the prediction results.
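
To make the preprocessed format concrete, here is a minimal sketch of reading one line back into a label and a list of sentences. It is not taken from the notebooks: the function name `parse_line` is hypothetical, and it assumes the label is an integer and that the spaces around the tabs in the format description are only there for readability.

```python
def parse_line(line):
    """Split one preprocessed line into (label, sentences).

    Assumed layout: label, a double tab, then tab-separated sentences.
    """
    # The first double tab separates the label from the review body.
    label, _, body = line.rstrip("\n").partition("\t\t")
    # Single tabs separate sentences; drop empty pieces and stray spaces.
    sentences = [s.strip() for s in body.split("\t") if s.strip()]
    return int(label), sentences


if __name__ == "__main__":
    example = "4\t\tGreat food .\tWill come back .\n"
    label, sentences = parse_line(example)
    print(label, sentences)
```

Keeping sentences as a list preserves the document/sentence/word hierarchy that the model's two attention levels operate on.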

We currently get about 65% accuracy on the yelp2013 test set. Further hyperparameter tuning may improve this.

Hyperparameters we used

epochs	batch size	GRU units	word2vec size	optimizer	learning rate	maximum sentence length
50	32	128	50	Adam	4e-4	200
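
If you want to keep these settings in one place while tuning, something like the following sketch may help. The class name `HanConfig` and its field names are hypothetical and do not come from “HAN.ipynb”; only the values are taken from the table above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HanConfig:
    """Hyperparameters from the table above (field names are illustrative)."""
    epochs: int = 50
    batch_size: int = 32
    gru_units: int = 128
    word2vec_size: int = 50
    optimizer: str = "Adam"
    learning_rate: float = 4e-4
    max_sentence_length: int = 200


config = HanConfig()

# A tuning run can derive a variant without mutating the baseline.
larger = replace(config, gru_units=256)

if __name__ == "__main__":
    print(config)
    print(larger)
```

A frozen dataclass keeps the baseline values fixed, so each experiment has to state explicitly which setting it changes.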

train accuracy	validation accuracy	test accuracy
66.891%	64.659%	64.734%