Code for the EMNLP 2019 paper "Span-based Hierarchical Semantic Parsing for Task-Oriented Dialog"
This is the codebase for the paper:

Panupong Pasupat, Sonal Gupta, Karishma Mandyam, Rushin Shah, Mike Lewis, Luke Zettlemoyer.
Span-based Hierarchical Semantic Parsing for Task-Oriented Dialog.
EMNLP 2019.
Pre-trained GloVe embeddings should be placed in `data/glove/`.
Training with the basic model (no edge scores) on the debug data:
```
./main.py train configs/base.yml configs/data/artificial-chain.yml \
  configs/model/embedder-lstm.yml configs/model/span-node.yml
```
This will train the model on the training data and evaluate on the development data. The results are saved to `out/___.exec/`, where `___` is some number. The results contain the final config file, saved models, and predictions.
The metadata file `out/__.exec/__.meta` stores the best accuracy and the epoch at which it was achieved. To dump the metadata, run:

```
./dump-meta.py out/__.exec/__.meta
```
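For instance, a sketch of this command for a hypothetical run directory `out/42.exec/`, assuming the metadata files follow the same numbering as the model checkpoints (adjust the numbers to your own run):

```
./dump-meta.py out/42.exec/14.meta
```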
To dump the predictions of a trained model (say, `out/42.exec/14.model`) on the test data:

```
./main.py test out/42.exec/config.json -l out/42.exec/14
```
To use the real training data (the TOP dataset), change the data config to `configs/data/top.yml`.

To use edge scores, change the model config to `configs/model/span-edge.yml`.
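For example, a sketch of a training command combining both substitutions above (assuming the config files exist under `configs/` as named):

```
# Train on the TOP dataset with edge scores (sketch; config paths taken from the substitutions above).
./main.py train configs/base.yml configs/data/top.yml \
  configs/model/embedder-lstm.yml configs/model/span-edge.yml
```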
To use the BERT model, change both the data config and the embedder config to their BERT counterparts:

- `artificial-chain.yml` → `artificial-chain-bert.yml`
- `top.yml` → `top-bert.yml`
- `embedder-lstm.yml` → `embedder-bert.yml`
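For instance, a sketch of training on the TOP dataset with the BERT embedder, assuming the config files listed above (swap in `span-edge.yml` if you also want edge scores):

```
# Train on TOP with BERT (sketch based on the config substitutions above).
./main.py train configs/base.yml configs/data/top-bert.yml \
  configs/model/embedder-bert.yml configs/model/span-node.yml
```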
The reported training accuracy will be incorrect because the decoder is not run during training. To turn on decoding during training, add `-c '{"model": {"decoder": {"punt_on_training": false}}}'` to the command line.
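For example, a sketch of the debug-data training command with decoding enabled (same configs as above; the single quotes keep the JSON intact in the shell):

```
./main.py train configs/base.yml configs/data/artificial-chain.yml \
  configs/model/embedder-lstm.yml configs/model/span-node.yml \
  -c '{"model": {"decoder": {"punt_on_training": false}}}'
```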