This is an RL playground for tinkering with, and learning from, Karpathy’s great tutorial on Policy Gradients.
The original code has been adapted to Python 3.6.
Tutorial:
“Deep Reinforcement Learning: Pong from Pixels”
http://karpathy.github.io/2016/05/31/rl
Original code (gist):
“Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels”
https://gist.github.com/karpathy/a4166c7fe253700972fcbc77e4ea32c5
Pong-v0 environment and documentation:
https://gym.openai.com/envs/Pong-v0