
attention-seeking-in-pytorch

This repo contains implementations of several forms of attention:

Location based attention
Content based dot product
Content based concatenation
Content based general attention
Pointer networks

and finally

No attention
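
To make the differences between the content based variants concrete, here is a minimal sketch of the three scoring functions. It assumes the commonly used formulation from Luong et al. (dot: `h_t · h_s`, general: `h_t · W h_s`, concat: `v · tanh(W [h_t; h_s])`); the models in this repo may differ in detail. The class name `ContentScore`, its arguments, and the tensor shapes are hypothetical and chosen only for illustration. Location based attention and pointer networks are not covered here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContentScore(nn.Module):
    """Illustrative content based attention (hypothetical, not the repo's class)."""

    def __init__(self, hidden_size, method="dot"):
        super().__init__()
        if method not in ("dot", "general", "concat"):
            raise ValueError(f"unknown attention method: {method}")
        self.method = method
        if method == "general":
            # score = h_t^T W h_s
            self.W = nn.Linear(hidden_size, hidden_size, bias=False)
        elif method == "concat":
            # score = v^T tanh(W [h_t; h_s])
            self.W = nn.Linear(2 * hidden_size, hidden_size, bias=False)
            self.v = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, decoder_state, encoder_states):
        # decoder_state:  (batch, hidden)
        # encoder_states: (batch, src_len, hidden)
        query = decoder_state.unsqueeze(2)  # (batch, hidden, 1)
        if self.method == "dot":
            scores = torch.bmm(encoder_states, query).squeeze(2)
        elif self.method == "general":
            scores = torch.bmm(self.W(encoder_states), query).squeeze(2)
        else:  # concat
            src_len = encoder_states.size(1)
            expanded = decoder_state.unsqueeze(1).expand(-1, src_len, -1)
            combined = torch.cat([expanded, encoder_states], dim=2)
            scores = self.v(torch.tanh(self.W(combined))).squeeze(2)

        # Normalize scores over source positions, then average encoder states.
        weights = F.softmax(scores, dim=1)  # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
        return context, weights
```

In this sketch, `weights` is the distribution over input positions that an attention visualization would plot, and `context` is the weighted summary passed on to the decoder. A pointer network would instead use a distribution like `weights` directly as its output over input positions.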

Task to learn

Each of these sequence-to-sequence models is trained to sort a shuffled array of the numbers from 1 to N: the input is a random permutation of 1..N and the target is the same numbers in ascending order. The code to generate this data is here.
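
As a rough illustration of the task, the data could be generated along these lines. This is a sketch under assumptions, not the repo's actual data loader: the function names `make_sorting_example` and `make_batch` and the `seed` parameter are hypothetical.

```python
import random


def make_sorting_example(n, rng=random):
    """Return (source, target): a shuffled copy of 1..n and the sorted list."""
    target = list(range(1, n + 1))
    source = target[:]      # copy so the sorted target is left untouched
    rng.shuffle(source)     # in-place random permutation
    return source, target


def make_batch(batch_size, n, seed=None):
    """Build a list of (source, target) pairs, optionally reproducible via seed."""
    rng = random.Random(seed)
    return [make_sorting_example(n, rng) for _ in range(batch_size)]
```

For example, `make_batch(32, 10, seed=0)` would give 32 pairs in which every source is a permutation of 1..10 and every target is `[1, 2, ..., 10]`. For a pointer network, the target would more naturally be expressed as positions in the source rather than values, which this sketch does not do.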

The attention based models perform considerably better on this task than the no attention model.

Organization of code

All the models and the data loader are defined in code/.

Each model is defined in a separate file. The file containing a model also contains train and test functions which are self-explanatory.
Output logs are stored under training_outputs/
Attention weights can be visualized using the code in the notebook Visualizing attention.