arXiv:1409.3215v3 [cs.CL] 14 Dec 2014
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Google
ilyasu@google.com
Oriol Vinyals
Google
vinyals@google.com
Quoc V. Le
Google
qvl@google.com
Abstract
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
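The following is a minimal sketch, not the authors' implementation, of the encoder-decoder idea the abstract describes: one multilayer LSTM encodes the source sequence into its final hidden and cell states, and a second LSTM decodes the target sequence conditioned on that state. The use of PyTorch, the dimensions, the layer count, and the vocabulary sizes are all illustrative assumptions.

import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512, layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder LSTM: reads the (reversed) source and summarizes it
        # in its final (hidden, cell) state -- the fixed-size vector.
        self.encoder = nn.LSTM(emb_dim, hid_dim, num_layers=layers)
        # Decoder LSTM: initialized with the encoder's final state.
        self.decoder = nn.LSTM(emb_dim, hid_dim, num_layers=layers)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # src, tgt: (seq_len, batch) tensors of token ids.
        # Reversing the source, as the abstract reports, shortens the
        # dependencies between early source and early target words.
        src = torch.flip(src, dims=[0])
        _, state = self.encoder(self.src_emb(src))   # state = (h_n, c_n)
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)                     # (seq_len, batch, tgt_vocab)

# Toy usage with random token ids.
model = Seq2Seq(src_vocab=100, tgt_vocab=100)
src = torch.randint(0, 100, (7, 2))   # 7 source tokens, batch of 2
tgt = torch.randint(0, 100, (5, 2))   # 5 target tokens
logits = model(src, tgt)
print(logits.shape)                   # torch.Size([5, 2, 100])

In this sketch the only channel between encoder and decoder is the final state tuple, which is exactly the fixed-dimensional bottleneck the abstract refers to.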