Mastering the Game of Go without Human Knowledge
David Silver*, Julian Schrittwieser*, Karen Simonyan*, Ioannis Antonoglou, Aja Huang, Arthur
Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy
Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis.
DeepMind, 5 New Street Square, London EC4A 3TW.
*These authors contributed equally to this work.
A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa,
superhuman proficiency in challenging domains. Recently, AlphaGo became the first program
to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated
positions and selected moves using deep neural networks. These neural networks were trained
by supervised learning from human expert moves, and by reinforcement learning from
self-play. Here, we introduce an algorithm based solely on reinforcement learning, without
human data, guidance, or domain knowledge...

