Project author: thefirebanks
Project description:
Contains code for a voting classifier that is part of an ensemble learning model for tweet classification (which also includes an LSTM, a Bayesian model and a proximity model), plus a system for weighted voting.
Language: Python
Project URL: git://github.com/thefirebanks/Ensemble-Learning-for-Tweet-Classification-of-Hate-Speech-and-Offensive-Language.git
Ensemble Learning for Tweet Classification of Hate Speech and Offensive Language - Winter/Spring Project 2018
These programs are part of a project that uses an ensemble learning model to detect offensive language and hate speech in tweets. The ensemble is composed of:
- A Voting classifier
- An LSTM network
- A Bayesian model
- A Proximity model
Link to full project: https://github.com/quinnbp/WT2018
This repository contains:
Voting classifier for hate-speech and offensive language detection in tweets:
TODO:
Weighting system for ensemble learning
- Offers the following options for combining the classifiers' votes:
  - Precision score of the classifiers' confusion matrices
  - CEN score
  - Precision + CEN score
  - Equal voting
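The voting options above can be illustrated with a minimal sketch. This is not the repository's code: the function name `weighted_vote`, the classifier names, and the weight values are hypothetical, and the weights are assumed to be computed beforehand (for example from each classifier's precision). Passing no weights gives equal voting.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Return the label with the largest total weight for one tweet.

    predictions: dict mapping classifier name -> predicted label
    weights: dict mapping classifier name -> non-negative weight;
             None means every classifier gets the same weight (equal voting)
    """
    if weights is None:
        weights = {name: 1.0 for name in predictions}

    # Add up the weight behind each candidate label
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]

    # Iterate labels in sorted order so ties resolve deterministically
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical predictions from the three models for a single tweet
preds = {"lstm": "hate", "bayes": "offensive", "proximity": "offensive"}

# Hypothetical precision-based weights (illustrative values only)
precision_weights = {"lstm": 0.9, "bayes": 0.4, "proximity": 0.3}

print(weighted_vote(preds))                     # equal voting
print(weighted_vote(preds, precision_weights))  # precision-weighted voting
```

With equal voting the two classifiers that agree win; with the illustrative precision weights, the single higher-weighted classifier outvotes them. Because a lower CEN score indicates a better classifier, a CEN-based weight would have to be inverted in some way (for instance `1 - CEN`); that mapping is an assumption here, not something the project description specifies.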
Confusion matrix class
- Creates a confusion matrix from a classifier's output predictions and the set of true labels
- Provides operations such as computing the precision score, saving the matrix as a PDF, counting false positives, and computing the matrix's CEN score
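A rough sketch of such a class follows. It is not the repository's implementation: the class and method names are hypothetical, PDF export is omitted, and it assumes that CEN refers to the confusion entropy measure (0 for a perfect classifier, lower is better) computed with its usual definition. Rows are true labels and columns are predicted labels.

```python
import math


class ConfusionMatrix:
    """Hypothetical stand-in: rows are true labels, columns are predictions."""

    def __init__(self, y_true, y_pred, labels):
        self.labels = list(labels)
        index = {label: i for i, label in enumerate(self.labels)}
        n = len(self.labels)
        self.counts = [[0] * n for _ in range(n)]
        for true_label, predicted_label in zip(y_true, y_pred):
            self.counts[index[true_label]][index[predicted_label]] += 1

    def false_positives(self, label):
        """Samples predicted as `label` whose true label is different."""
        j = self.labels.index(label)
        return sum(self.counts[i][j] for i in range(len(self.labels)) if i != j)

    def precision(self, label):
        """Correct predictions of `label` divided by all predictions of it."""
        j = self.labels.index(label)
        predicted = sum(row[j] for row in self.counts)
        return self.counts[j][j] / predicted if predicted else 0.0

    def cen(self):
        """Confusion entropy, under the assumption stated above."""
        n = len(self.labels)
        total = sum(sum(row) for row in self.counts)
        base = 2 * (n - 1)  # logarithm base used by the measure
        score = 0.0
        for j in range(n):
            # Samples involving class j as either true or predicted label
            involved = sum(self.counts[j][k] + self.counts[k][j] for k in range(n))
            if involved == 0:
                continue
            cen_j = 0.0
            for k in range(n):
                if k == j:
                    continue
                # Misclassifications from j into k and from k into j
                for count in (self.counts[j][k], self.counts[k][j]):
                    if count:
                        p = count / involved
                        cen_j -= p * math.log(p, base)
            score += involved / (2 * total) * cen_j
        return score


# Illustrative data only
labels = ["hate", "offensive", "neither"]
y_true = ["hate", "hate", "offensive", "neither"]
y_pred = ["hate", "offensive", "offensive", "neither"]
cm = ConfusionMatrix(y_true, y_pred, labels)
print(cm.precision("offensive"), cm.false_positives("offensive"), cm.cen())
```

In this toy data one "hate" tweet is misclassified as "offensive", so the "offensive" class has one false positive and a precision of 0.5, while the CEN score is above zero because the matrix is no longer purely diagonal.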
All written by Daniel Firebanks
Inspired by the research of: