Sentiment polarity classification on Twitter data for "Halal" keywords
This repository holds the work on sentiment polarity classification of Twitter data gathered by the University of Malaya Halal research group.
This project implements various deep learning architectures using 2 feature sets:
The available models are:
The datasets are private and can be made available upon request.
This project uses:
In short, this project uses Graphics Processing Unit (GPU) based training with TensorFlow as the backend.
This repository contains all the Jupyter Notebook scripts for:
This project aims to implement all 8 deep learning models to perform sentiment polarity classification of the Twitter data.
The outcome of the predictions is 8 sets of:
With all 8 sets of metrics available, the weighted average of each metric is calculated:
Once the weighted sentiment probabilities are available, the sentiment polarity is identified using a set of rules:
This reduces the chance of ties compared with taking only the predicted sentiment labels into account. A tie is still possible, but it is very unlikely to occur.
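The weighting and rule steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: the function names `weighted_average` and `weighted_polarity`, the choice of per-model weights, and the rule "the larger weighted probability wins" are all assumptions made for the example.

```python
from typing import Sequence


def weighted_average(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of one metric across models (hypothetical helper)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_polarity(
    pos_probs: Sequence[float],
    neg_probs: Sequence[float],
    weights: Sequence[float],
) -> str:
    """Assumed rule: the class with the larger weighted probability wins."""
    pos = weighted_average(pos_probs, weights)
    neg = weighted_average(neg_probs, weights)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "tie"  # only when both weighted probabilities are exactly equal


# Two hypothetical models that disagree: a plain label vote would tie 1-1,
# but the weighted probabilities (0.725 vs 0.275) give a clear result.
label = weighted_polarity([0.9, 0.2], [0.1, 0.8], [3.0, 1.0])
print(label)  # positive
```

Because the decision is made on continuous probabilities rather than on discrete labels, two models that disagree no longer cancel each other out, which is why ties become rare.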
The processes, training patterns, results and outputs for each training and prediction session are available inside the notebook. Feel free to explore.
Below are the weighted outcomes of the sentiment.
Number of Weighted Positive Sentiment | Number of Weighted Negative Sentiment |
---|---|
90910 | 14632 |
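
To put the two counts in proportion, the share of each class can be worked out from the table. The variable names below are illustrative only.

```python
# Counts taken from the table above.
weighted_positive = 90910
weighted_negative = 14632

total = weighted_positive + weighted_negative  # 105542 tweets in total
positive_share = weighted_positive / total
negative_share = weighted_negative / total

print(f"positive: {positive_share:.1%}, negative: {negative_share:.1%}")
```

Roughly 86% of the tweets are labelled positive and 14% negative.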
The graph below visualizes the testing set accuracies of both feature sets across all models.
The Word2Vec CNN + LSTM model achieved the highest accuracy, while the Word2Seq LSTM model achieved the lowest.
However, all of the models are equally important in this project, as each has a different implementation and different characteristics.
Thank you.