Course Overview

“I always wondered how it would be if a superior species landed on earth and showed us how they play chess. I feel now I know.”

“It doesn’t play like a human, and it doesn’t play like a program. It plays in a third, almost alien, way.”

“In fact, it played so well that it was almost scary…”

The quotes above refer to AlphaZero and AlphaGo, Google DeepMind’s AIs that achieved super-human performance in chess and Go. At the core of these AI systems is an exciting idea that is changing the way we view AI – reinforcement learning.

This is a course on reinforcement learning (RL), the third, often forgotten paradigm of machine learning alongside supervised learning and unsupervised learning, and one I am deeply fascinated by. RL has its roots in behavioural psychology, and the reason I love it so much is that I believe it comes closest to learning tasks the same way humans do. With RL and related fields like apprenticeship learning and meta-learning, I think we are taking massive steps towards solving the problem of AI.

By the end of this course, I hope to give you a solid grounding in RL and, hopefully, get you as interested in it as I am. Over this course, I plan to cover several RL fundamentals and algorithms, building both the intuition and some of the math behind RL. I will also include code samples and implementation details wherever I see fit, but note that if you’re looking for a course that will just teach you how to implement RL algorithms in Python, this may not be the right one.

We’ll start off with a couple of introductory posts and then cover several common RL algorithms, as well as deep RL algorithms. I will try to include code samples (Python 3 and PyTorch) and mathematical proofs whenever possible.

I will add additional references, if any, at the end of each lecture, but most of the content for this course comes from Reinforcement Learning: An Introduction by Sutton and Barto (referred to from this point on as “the RL book”) and this NPTEL course by Prof. Balaraman Ravindran.

I highly recommend some background in probability, and perhaps some machine learning (supervised and unsupervised learning) and optimization, before you get started with this course (not a lot, but some basics would be nice). A basic understanding of neural networks and backpropagation would be useful for the deep RL posts. However, I will try my best to stick to the course title and make this as much “from scratch” as I can. None of the prerequisites are absolutely essential for an intuitive grasp of RL, but they will help you gain a thorough understanding.

I often go back to edit previous posts and I will try to keep this page updated with such changes.

This is the first time I’m doing something like this, so please let me know if there’s anything different you’d like to see or if I’ve made any mistakes.