Given tweets about the NASDAQ top 6 stocks (AAPL, GOOG, GOOGL, TSLA, AMZN, MSFT), is there any relationship between tweet sentiment and stock price?
|- data
  |- company-sentiment
     : sentiment-labeled tweet ids on NASDAQ stocks
  |- raw
    |- company-tweets.csv
    |- company-values.csv
    |- nasdaq-tweets.csv
    |- (sentiment-labeled financial tweets)
  |- regression
     : data for regression of public tweet sentiment and market value
  |- sample
     : data for debugging the data generator
  |- sentiment-classifier
     : data to fine-tune BERTweet for sentiment classification of financial tweets
|- calculate.py
   : calculate public sentiment and stock price difference of NASDAQ stocks for intervals
|- finetune_classifier.py
   : fine-tune the model from `fintweet_sentiment_classifier.py`
|- fintweet_sentiment_classifier.py
   : model copied and slightly modified from Hugging Face Transformers RoBERTa
|- nasdaq_tweet_sentiment_tagger.py
   : tag NASDAQ tweets with sentiment using the fine-tuned classifier
|- preproces_tweets.py
   : drop duplicated and suspicious NASDAQ tweets and sentiment-labeled financial tweets
|- regression.py
   : output the relation between public sentiment and stock price direction
|- utils.py
   : misc functions
preproces_tweets.py
fintweet_sentiment_classifier.py, finetune_classifier.py
nasdaq_tweet_sentiment_tagger.py, calculate.py
regression.py
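The interval calculation described for `calculate.py` could be sketched roughly as follows. This is a minimal illustration and not the repository's code: the daily interval, the numeric sentiment scores (+1 / -1), and the function names `mean_sentiment_by_day`, `price_change_by_day`, and `build_rows` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def mean_sentiment_by_day(tweets):
    """Map each day to the mean sentiment score of its tweets.

    `tweets` is an iterable of (day, score) pairs; the score encoding
    (+1 positive, -1 negative) is an assumption for this sketch.
    """
    totals = defaultdict(lambda: [0.0, 0])  # day -> [sum of scores, count]
    for day, score in tweets:
        totals[day][0] += score
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


def price_change_by_day(closes):
    """Map each trading day to (next close - this close).

    The last trading day has no successor and is therefore omitted.
    """
    days = sorted(closes)
    return {d0: closes[d1] - closes[d0] for d0, d1 in zip(days, days[1:])}


def build_rows(tweets, closes):
    """Join sentiment and price change on days that have both."""
    sentiment = mean_sentiment_by_day(tweets)
    change = price_change_by_day(closes)
    return [(d, sentiment[d], change[d]) for d in sorted(sentiment) if d in change]


# Hypothetical sample data, not taken from the dataset.
sample_tweets = [
    (date(2020, 1, 2), 1),
    (date(2020, 1, 2), -1),
    (date(2020, 1, 2), 1),
    (date(2020, 1, 3), -1),
]
sample_closes = {
    date(2020, 1, 2): 100.0,
    date(2020, 1, 3): 102.0,
    date(2020, 1, 6): 101.0,
}
rows = build_rows(sample_tweets, sample_closes)
```

Each row pairs a day's mean sentiment with the price change to the next trading day, which is the kind of input a regression on price direction would need.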
Stock | Model / Train | Model / Test | Predict always rise / Test | Predict always rise / All |
---|---|---|---|---|
AAPL | 63.5% | 74.2% | 63.7% | 58.8% |
GOOG | 66.4% | 60.9% | 58.6% | 56.5% |
GOOGL | 65.6% | 65.2% | 58.0% | 60.0% |
AMZN | 63.9% | 62.6% | 63.2% | 61.9% |
TSLA | 62.2% | 54.9% | 56.0% | 55.1% |
MSFT | 58.1% | 66.4% | 66.4% | 61.2% |
*All values are the accuracy of predicting whether the price rises or falls.
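To make the comparison in the table concrete, here is a minimal sketch of how a direction accuracy and the "predict always rise" baseline could be computed. The function names and the sample labels are hypothetical and are not taken from `regression.py`.

```python
def accuracy(predicted, actual):
    """Fraction of intervals where the predicted direction matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one interval is required")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def always_rise_accuracy(actual):
    """Accuracy of a baseline that predicts a rise (True) for every interval."""
    return accuracy([True] * len(actual), actual)


# Hypothetical labels: True = price rose, False = price fell.
actual_direction = [True, True, False, True, False]
model_prediction = [True, True, False, True, True]

model_acc = accuracy(model_prediction, actual_direction)
baseline_acc = always_rise_accuracy(actual_direction)
```

A model is only informative when its test accuracy exceeds this baseline on the same intervals; in the table above, for example, the MSFT test accuracy equals the baseline and the TSLA test accuracy falls below it.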
[1]: https://www.kaggle.com/aramacus/bot-hunting-or-how-many-tweets-were-made-by-bots
[2]:
@inproceedings{bertweet,
title = {{BERTweet: A pre-trained language model for English Tweets}},
author = {Dat Quoc Nguyen and Thanh Vu and Anh Tuan Nguyen},
booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
year = {2020},
pages = {9--14}
}