Identify and classify toxic online comments
Multi-label-classification is a project for toxic comment classification. This repository provides a module/API built on a fine-tuned BERT model, and it also explores other models for the multi-label problem, using static word embeddings and contextual word representations as input features.
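Download the fine-tuned BERT weights and move them into the `model` directory: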
wget "https://drive.google.com/u/0/uc?id=1PEpcLfhs18NzQKvUYVzcmn-jnnnyXBHz&export=download" -O "bert_classifier.dict"
mv bert_classifier.dict model
Get the prediction
from violation import predict
text = "fuckkkkk u"
output = predict(text)
print(output)
# output : {'toxic': 1.0, 'severe_toxic': 0.0, 'obscene': 1.0, 'threat': 0.0, 'insult': 1.0, 'identity_hate': 0.0}
Get the probability
from violation import predict
text = "fuckkkkk u"
output = predict(text, get_probs = True)
#output: {'toxic': 0.9909837245941162, 'severe_toxic': 0.4319310486316681, 'obscene': 0.9577020406723022, 'threat': 0.08440539240837097, 'insult': 0.884278416633606, 'identity_hate': 0.11709830909967422}
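The hard labels returned by `predict` presumably come from thresholding these probabilities; the 0.5 cutoff below is an assumption, and the actual rule lives inside `violation.predict`:

```python
probs = predict(text, get_probs=True)
# Threshold each per-label probability at 0.5 (assumed cutoff) to get 0/1 labels.
labels = {name: float(p >= 0.5) for name, p in probs.items()}
# -> matches the prediction output shown above for this example
```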
Data resources used to train our models (skip this section if you only want to use the API):
wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove.6B.zip
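For the LSTM models, the GloVe vectors can be loaded into a word-to-vector lookup along these lines (a minimal sketch assuming the 100-dimensional file from the archive above; building the embedding matrix for a specific vocabulary is omitted):

```python
import numpy as np

embeddings = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        word, *vec = line.split()
        embeddings[word] = np.asarray(vec, dtype="float32")  # 100-d GloVe vector

print(len(embeddings), "words loaded")
```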
Modules
- BERT w/ BCEWithLogitsLoss
- Static word representation + LSTM w/ Focal Loss
- Word representation + LSTM w/ multiple output layers (LSTM model)
- Word representation + LSTM w/ a single output layer with multiple output neurons (LSTM model)

Each module can return both per-label predictions and per-label probabilities (see the API examples above).
From the Kaggle Toxic Comment Classification Challenge: "You're challenged to build a multi-headed model that's capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate better than Perspective's current models. You'll be using a dataset of comments from Wikipedia's talk page edits. Improvements to the current model will hopefully help online discussion become more productive and respectful."
We use focal loss as our loss function for Model 4.0. The dataset is extremely imbalanced, and choosing a suitable loss function is one way to tackle that. Focal Loss is an improved version of cross-entropy loss that handles class imbalance by assigning more weight to hard, misclassified examples and down-weighting easy ones; a minimal sketch is given below.
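A minimal Keras sketch of a binary focal loss for multi-label targets (an illustration only, not the exact implementation in this repo; the `gamma` and `alpha` values are assumptions):

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def binary_focal_loss(gamma=2.0, alpha=0.25):
    """Binary focal loss for multi-label targets (sketch)."""
    def loss(y_true, y_pred):
        # Clip probabilities for numerical stability.
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
        # p_t is the predicted probability of the true class for each label.
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        # (1 - p_t)**gamma down-weights easy examples, focusing training on hard ones.
        return -K.mean(alpha_t * K.pow(1.0 - p_t, gamma) * K.log(p_t))
    return loss

# e.g. model.compile(optimizer="adam", loss=binary_focal_loss(),
#                    metrics=["acc", tf.keras.metrics.AUC(name="auc")])
```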
Generate descriptive statistics
Total comment counts for different labels
Count numbers of different categories (Training set)
Count numbers of different categories (Testing set before data restructure)
Word Length distribution
Word cloud
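Label counts and word-length statistics like those above can be computed from the Kaggle training CSV with a few lines of pandas (a sketch; the `train.csv` path is an assumption, while the column names follow the Kaggle dataset):

```python
import pandas as pd

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

train = pd.read_csv("train.csv")   # Kaggle Toxic Comment training data (assumed path)
print(train[labels].sum())         # total comment counts per label
print(train["comment_text"].str.split().str.len().describe())  # word-length distribution
```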
MultiLabel(
(bert): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(30522, 768, padding_idx=0)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
# 12 BertLayers
(11): BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
(dropout): Dropout(p=0.1, inplace=False)
(classifier): Linear(in_features=768, out_features=6, bias=True)
)
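The printout above corresponds to a thin wrapper around Hugging Face's `BertModel`; a minimal PyTorch sketch of such a wrapper (an approximation, not this repo's exact `MultiLabel` code):

```python
import torch.nn as nn
from transformers import BertModel

class MultiLabel(nn.Module):
    """bert-base-uncased encoder + dropout + a 6-way linear head (sketch)."""
    def __init__(self, num_labels=6):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(768, num_labels)

    def forward(self, input_ids, attention_mask=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = outputs.pooler_output                 # [CLS] vector after BertPooler
        return self.classifier(self.dropout(pooled))   # logits for BCEWithLogitsLoss
```

Training would use `nn.BCEWithLogitsLoss` on these logits (as listed in the module list above), and applying a sigmoid at inference time yields the per-label probabilities returned by `get_probs=True`.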
Learning Curve - Acc (Model 4.0)
Learning Curve - Loss (Model 4.0)
Learning Curve - AUC-ROC (Model 4.0)
Using Focal Loss, accuracy reached 0.9934 and AUC reached 0.97 (the AUC of Model 2.0 was 0.955).
(This loss function is meant to mitigate the data imbalance.)
2000/2000 [==============================] - 18s 9ms/step - loss: 0.0493 - acc: 0.9934 - auc: 0.9700
Loss: 0.04926449805498123
Test Accuracy: 0.9933727383613586
Classification report (per-label precision, recall, F1)
>> training set

| label | precision | recall | f1-score | support |
| --- | --- | --- | --- | --- |
| toxic | 0.94 | 0.65 | 0.76 | 15294 |
| severe_toxic | 0.96 | 0.05 | 0.10 | 1595 |
| obscene | 0.94 | 0.60 | 0.73 | 8449 |
| threat | 1.00 | 0.01 | 0.03 | 478 |
| insult | 0.89 | 0.37 | 0.52 | 7877 |
| identity_hate | 0.92 | 0.01 | 0.02 | 1405 |
| micro avg | 0.93 | 0.51 | 0.66 | 35098 |
| macro avg | 0.94 | 0.28 | 0.36 | 35098 |
| weighted avg | 0.93 | 0.51 | 0.63 | 35098 |
| samples avg | 0.06 | 0.04 | 0.05 | 35098 |
>> testing set

| label | precision | recall | f1-score | support |
| --- | --- | --- | --- | --- |
| toxic | 0.65 | 0.72 | 0.68 | 6090 |
| severe_toxic | 0.44 | 0.06 | 0.10 | 367 |
| obscene | 0.83 | 0.53 | 0.65 | 3691 |
| threat | 0.33 | 0.00 | 0.01 | 211 |
| insult | 0.81 | 0.30 | 0.44 | 3427 |
| identity_hate | 0.40 | 0.00 | 0.01 | 712 |
| micro avg | 0.71 | 0.51 | 0.59 | 14498 |
| macro avg | 0.58 | 0.27 | 0.31 | 14498 |
| weighted avg | 0.71 | 0.51 | 0.56 | 14498 |
| samples avg | 0.07 | 0.05 | 0.05 | 14498 |
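Reports in this format are what `sklearn.metrics.classification_report` produces for multi-label targets; a sketch of how they can be generated (the dummy data and the 0.5 threshold are illustrative, not the repo's exact evaluation code):

```python
import numpy as np
from sklearn.metrics import classification_report

label_names = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Illustrative stand-ins: real evaluation would use the model's probabilities on the test set.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 6))
y_prob = rng.random(size=(100, 6))

y_pred = (y_prob >= 0.5).astype(int)   # 0.5 threshold is an assumption
print(classification_report(y_true, y_pred, target_names=label_names, zero_division=0))
```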
Using an LSTM with dropout, accuracy reached only 18.4% (the learning curves indicate overfitting). A sketch of the multi-output design suggested by the log is given below.
2000/2000 [==============================] - 23s 11ms/step - loss: 0.5070 - dense_8_loss: 0.2167 - dense_9_loss: 0.0189 - dense_10_loss: 0.1153 - dense_11_loss: 0.0150 - dense_12_loss: 0.1050 - dense_13_loss: 0.0361 - dense_8_acc: 0.9138 - dense_9_acc: 0.9932 - dense_10_acc: 0.9561 - dense_11_acc: 0.9966 - dense_12_acc: 0.9598 - dense_13_acc: 0.9893
Test Score: 0.5070158839225769
Test Accuracy: 0.21666644513607025
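The per-head losses in the log above (`dense_8` through `dense_13`) suggest a network with six separate sigmoid output layers, one per label. A minimal Keras sketch of that design (hyperparameters, layer sizes, and the vocabulary/embedding setup are assumptions):

```python
from tensorflow.keras import layers, Model

vocab_size, embed_dim, max_len = 20000, 100, 200   # assumed hyperparameters

inputs = layers.Input(shape=(max_len,))
x = layers.Embedding(vocab_size, embed_dim)(inputs)   # could be initialised with GloVe
x = layers.LSTM(64)(x)
x = layers.Dropout(0.5)(x)

# One sigmoid output head per label, mirroring the six dense_* losses in the log.
label_names = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
outputs = [layers.Dense(1, activation="sigmoid", name=name)(x) for name in label_names]

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["acc"])
```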
Using an LSTM with dropout, accuracy reached 99.7% (this figure is over-optimistic because argmax was used for evaluation); AUC reached 0.9559.
2000/2000 [==============================] - 18s 9ms/step - loss: 0.0872 - acc: 0.9965 - auc: 0.9559
Loss: 0.0871838703751564
Test Accuracy: 0.9964519143104553
Classification report (to look into each category):
>> training set

| label | precision | recall | f1-score | support |
| --- | --- | --- | --- | --- |
| toxic | 0.88 | 0.78 | 0.82 | 15294 |
| severe_toxic | 0.58 | 0.37 | 0.45 | 1595 |
| obscene | 0.88 | 0.74 | 0.81 | 8449 |
| threat | 0.25 | 0.00 | 0.00 | 478 |
| insult | 0.77 | 0.64 | 0.70 | 7877 |
| identity_hate | 0.67 | 0.00 | 0.01 | 1405 |
| micro avg | 0.84 | 0.68 | 0.75 | 35098 |
| macro avg | 0.67 | 0.42 | 0.47 | 35098 |
| weighted avg | 0.82 | 0.68 | 0.73 | 35098 |
| samples avg | 0.07 | 0.06 | 0.06 | 35098 |
>> testing set

| label | precision | recall | f1-score | support |
| --- | --- | --- | --- | --- |
| toxic | 0.55 | 0.82 | 0.66 | 6090 |
| severe_toxic | 0.39 | 0.42 | 0.40 | 367 |
| obscene | 0.70 | 0.69 | 0.69 | 3691 |
| threat | 0.00 | 0.00 | 0.00 | 211 |
| insult | 0.62 | 0.58 | 0.60 | 3427 |
| identity_hate | 0.40 | 0.00 | 0.01 | 712 |
| micro avg | 0.59 | 0.67 | 0.63 | 14498 |
| macro avg | 0.44 | 0.42 | 0.39 | 14498 |
| weighted avg | 0.58 | 0.67 | 0.60 | 14498 |
| samples avg | 0.07 | 0.06 | 0.06 | 14498 |