Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih Koray Kavukcuoglu David Silver Alex Graves Ioannis Antonoglou
Daan Wierstra Martin Riedmiller
DeepMind Technologies
{vlad,koray,david,alex.graves,ioannis,daan,martin.riedmiller} @ deepmind.com
Abstract
We present the first deep learning model to successfully learn control policies di-
rectly from high-dimensional sensory input using reinforcement learning. The
model is a convolutional neural network, trained with a variant of Q-learning,
whose input is raw pixels and whose output is a value function estimating future
rewards. We apply our method to seven Atari 2600 games from the Arcade Learn-
ing Environment, with no adjustment of the architecture or learning algorithm. We
find that it outperforms all previous approaches on six of the games and surpasses
a human expert on three of them.
1 Introduction
Learning to control agents directly from high-dimensional sensory inputs like vision and speech is
one of th
input/model/games/control/learning/high/inforcement/dimensional/sensory/miller/
input/model/games/control/learning/high/inforcement/dimensional/sensory/miller/
-->