This repository contains my notes from the course "Introduction to Reinforcement Learning" taught by David Silver at DeepMind. Implementations of some popular reinforcement learning algorithms are also included.
The lecture notes are primarily based on the course videos, the slides, and the Reinforcement Learning textbook by Sutton and Barto listed below. All algorithms are implemented in Python 3.7.
Online Course: David Silver's Reinforcement Learning Course
Textbook: Reinforcement Learning: An Introduction (2nd Edition)
- Introduction to reinforcement learning
- Markov decision processes and Bellman equations
- Dynamic programming methods for prediction and control
- Model-free prediction: Monte Carlo and temporal-difference prediction
- Model-free control: Sarsa and Q-learning
- Value function approximation methods, deep Q-learning (DQN)
- Policy gradient methods
- Planning and learning
The following algorithms are currently implemented:
- Dynamic Programming (DP)
  - Policy evaluation
  - Policy improvement and policy iteration
  - Value iteration
- Monte Carlo (MC)
  - Incremental every-visit MC policy evaluation
  - On-policy control using epsilon-greedy policies
  - Off-policy control using weighted importance sampling
- Temporal-Difference (TD)
  - Sarsa
  - Sarsa(lambda) using eligibility traces
  - Q-learning
- Function Approximation Methods
  - Semi-gradient Q-learning
- Deep Q-Networks (DQN)
- DQN with Double Q-learning (Double DQN)
- Policy Gradient Methods
  - REINFORCE-with-baseline: Monte-Carlo Policy Gradient
- Dyna-Q algorithm (Planning and Learning)
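
To give a flavour of the tabular methods listed above, here is a minimal Q-learning sketch. It is not taken from this repository's code: the function name `q_learning`, the 1-D chain environment, and all default hyperparameters are assumptions made purely for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    States are 0..n_states-1. Action 0 moves left (staying put at state 0),
    action 1 moves right. Entering the last state yields reward 1 and ends
    the episode; every other transition yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy behaviour policy, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])

            # Deterministic environment step (illustrative, not a real API).
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the greedy value of the
            # next state. The goal row is never updated, so it stays at 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


q_table = q_learning()
# Greedy action per non-terminal state (0 = left, 1 = right).
print([max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)])
```

The update bootstraps from `max(q[next_state])` rather than from the action actually taken next, which is what makes this off-policy and distinguishes it from Sarsa.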