# An NMT framework built on Joint Representation

The original PyTorch implementation of *Neural Machine Translation with Joint Representation* (AAAI 2020). It is modified from fairseq-0.6.0.
## Installation

Install the framework and its dependencies first:

```shell
git clone https://github.com/lyy1994/reformer.git
cd reformer
pip install -r requirements.txt
python setup.py build develop
```
The training and decoding scripts assume the following directory structure:
```
|- reformer (code)
|- data
|  |- data-bin
|  |  |- BINARIZED_DATA_FOLDER
|  |- RAW_DATA_FOLDER
|  |  |- train (training set raw text)
|  |  |- valid (validation set raw text)
|  |  |- test (test set raw text)
|- checkpoints
|  |- torch-1.1.0
|  |  |- EXPERIMENT_FOLDER
|- toolkit
|  |- multi-bleu.perl
```
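For convenience, the layout above can be created in one go. The sketch below is an assumption-laden helper, not part of the repository: the `make_layout` function and the hard-coded paths mirror the tree shown here, and the placeholder folder names in CAPS (as well as the raw-text files `train`/`valid`/`test`) still have to be filled in by you.

```python
# Minimal sketch that creates the README's directory layout.
# CAPS folder names are placeholders copied verbatim from the tree above.
import os

LAYOUT = [
    "data/data-bin/BINARIZED_DATA_FOLDER",
    "data/RAW_DATA_FOLDER",
    "checkpoints/torch-1.1.0/EXPERIMENT_FOLDER",
    "toolkit",
]

def make_layout(root="."):
    """Create every directory in LAYOUT under `root` (idempotent)."""
    for path in LAYOUT:
        os.makedirs(os.path.join(root, path), exist_ok=True)

make_layout()
```

The `reformer` code directory itself comes from the `git clone` step above, and `multi-bleu.perl` must be placed under `toolkit/` manually.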
## Training

To train a model, run:

```shell
cd reformer/scripts
sh train.sh
```
## Decoding

To decode from the trained model, run:

```shell
sh decode.sh
```
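Since the toolkit ships `multi-bleu.perl`, a common way to score the decoder output is to split hypotheses and references out of the fairseq-style generation log before scoring. The sketch below is illustrative: the file names (`gen.out`, `hyp.txt`, `ref.txt`) are assumptions, and the first `printf` merely fakes a two-line log so the extraction step is self-contained; in practice the log comes from `decode.sh`.

```shell
# Fake a tiny fairseq-style generation log just to demonstrate extraction.
# In fairseq output, H-* lines carry hypotheses and T-* lines references.
printf 'H-0\t-0.52\tthis is a test\nT-0\tthis is a test\n' > gen.out

grep '^H' gen.out | cut -f3- > hyp.txt   # hypothesis text is the 3rd tab-separated field
grep '^T' gen.out | cut -f2- > ref.txt   # reference text is the 2nd tab-separated field

# Then score (uncomment when toolkit/multi-bleu.perl is in place):
# perl toolkit/multi-bleu.perl ref.txt < hyp.txt
```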
To customize the configuration, modify train.sh for training and decode.sh for decoding.
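As a point of reference when editing train.sh, a fairseq-0.6.0 training invocation typically looks like the following. This is a hedged sketch only: the architecture name and every hyperparameter value below are illustrative placeholders, not the settings used in the paper or in the shipped scripts.

```shell
# Illustrative fairseq-0.6.0 training command; values are placeholders,
# not the paper's configuration.
python train.py ../data/data-bin/BINARIZED_DATA_FOLDER \
    --arch transformer_iwslt_de_en \
    --optimizer adam --lr 0.0005 \
    --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 4096 \
    --save-dir ../checkpoints/torch-1.1.0/EXPERIMENT_FOLDER
```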
The table below summarizes the scripts for reproducing our experiments:

| Dataset | Script |
|---|---|
| IWSLT14 German-English | `iwslt-train.sh` |
| NIST12 Chinese-English | `nist-train.sh` |
## Citation

If you use this code, please cite:

```bibtex
@inproceedings{li2020aaai,
  title     = {Neural Machine Translation with Joint Representation},
  author    = {Yanyang Li and Qiang Wang and Tong Xiao and Tongran Liu and Jingbo Zhu},
  booktitle = {Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence},
  year      = {2020},
}
```