Implementations of the Monte Carlo and Temporal-Difference methods from Chapters 5 and 6 of Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto.