A Model for Natural Language Attack on Text Classification and Inference
This is the source code for the paper: Jin, Di, et al. “Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment.” arXiv preprint arXiv:1907.11932 (2019). If you use the code, please cite the paper:
@article{jin2019bert,
title={Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment},
author={Jin, Di and Jin, Zhijing and Zhou, Joey Tianyi and Szolovits, Peter},
journal={arXiv preprint arXiv:1907.11932},
year={2019}
}
Our seven datasets are available here.
Required packages are listed in the requirements.txt file:
pip install -r requirements.txt
Run the following commands to install the esim package:
cd ESIM
python setup.py install
cd ..
(Optional) Run the following code to pre-compute the cosine similarity scores between word pairs based on the counter-fitting word embeddings.
python comp_cos_sim_mat.py [PATH_TO_COUNTER_FITTING_WORD_EMBEDDINGS]
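The pre-computation step above can be sketched as follows. This is a minimal illustration, not the actual comp_cos_sim_mat.py: it assumes the counter-fitting embeddings are stored in a GloVe-style text file (one word per line, followed by its vector components), and the function names here are our own.

```python
import numpy as np

def load_embeddings(path):
    """Load a GloVe-style text file: word followed by vector components."""
    words, vecs = [], []
    with open(path) as f:
        for line in f:
            parts = line.rstrip().split(' ')
            words.append(parts[0])
            vecs.append([float(x) for x in parts[1:]])
    return words, np.array(vecs)

def cosine_similarity_matrix(embeddings):
    """All pairwise cosine similarities via one matrix product.

    Normalizing each row to unit length first means the dot product of
    any two rows equals their cosine similarity.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T
```

Caching this matrix once avoids recomputing word-pair similarities during every attack run.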
Use the following commands to run the attack. For text classification:
python attack_classification.py
For natural language inference:
python attack_nli.py
Example commands for these two scripts are in run_attack_classification.py and run_attack_nli.py. Here we explain each required argument in detail:
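A runner script in this style typically just assembles a command string and executes it. The sketch below is hypothetical: the flag names and paths are illustrative placeholders, not the confirmed interface of attack_classification.py; consult run_attack_classification.py for the actual arguments.

```python
import os

# Illustrative placeholders only -- check run_attack_classification.py
# for the script's real argument names and expected paths.
command = ('python attack_classification.py '
           '--dataset_path data/yelp '
           '--target_model bert '
           '--counter_fitting_embeddings_path counter-fitted-vectors.txt')

# The runner would then launch the attack, e.g.:
# os.system(command)
print(command)
```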
Two more things to share with you:
In case you want to replicate our experiments by training the target models yourself, we have shared the seven processed datasets here!
In case you want to use our generated adversarial examples on the benchmark data directly, here they are.