Go to Course: https://www.coursera.org/learn/dmrol
Model non-associative and associative sequential decision problems with multi-armed bandit problems and Markov decision processes respectively
Implement dynamic programming algorithms to find optimal policies
Implement basic reinforcement learning algorithms using Monte Carlo and temporal difference methods
Decision Making and Utility Theory
Welcome to Decision Making and Reinforcement Learning! During this week, Professor Tony Dear provides an overview of the course. You will also view guidelines to support your learning journey towards modeling sequential decision problems and implementing reinforcement learning algorithms.
Bandit Problems
Welcome to week 2! This week, we will learn about multi-armed bandit problems, a type of optimization problem in which an algorithm balances exploration and exploitation to maximize rewards. Topics include action values and sample-average estimation.
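As a rough sketch of the ideas named above, the following toy example combines sample-average action-value estimation with epsilon-greedy action selection on a k-armed bandit. The arm reward distributions and parameter values are illustrative assumptions, not taken from the course materials.

```python
import random

def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy bandit with incremental sample-average estimates.

    true_means: assumed mean reward of each arm (illustrative only).
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k  # sample-average action-value estimates Q(a)
    n = [0] * k    # number of times each arm has been pulled
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                    # explore: random arm
        else:
            a = max(range(k), key=lambda i: q[i])   # exploit: greedy arm
        reward = rng.gauss(true_means[a], 1.0)      # noisy reward
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]              # incremental average update
        total += reward
    return q, total / steps

# Arm 2 has the highest true mean, so its estimate should end up largest.
q, avg_reward = run_bandit([0.2, 0.5, 1.0])
```

The incremental update `q[a] += (reward - q[a]) / n[a]` is equivalent to recomputing the mean of all rewards seen for that arm, but runs in constant time and memory.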
This course is an introduction to sequential decision making and reinforcement learning. We start with a discussion of utility theory to learn how preferences can be represented and modeled for decision making. We first model simple decision problems as multi-armed bandit problems and discuss several approaches to evaluative feedback. We will then model decision problems as finite Markov decision processes (MDPs), and discuss their solutions via dynamic programming algorithms.
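To illustrate the dynamic programming approach mentioned above, here is a minimal value iteration sketch on a tiny, made-up 2-state MDP. The states, actions, transition probabilities, and rewards are invented for illustration and are not the course's examples.

```python
# Transition model: P[s][a] = list of (probability, next_state, reward).
# This 2-state MDP is an illustrative assumption.
P = {
    0: {"stay": [(1.0, 0, 0.0)],
        "go":   [(0.8, 1, 5.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)],
        "go":   [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # discount factor

def value_iteration(P, gamma, theta=1e-8):
    """Iterate the Bellman optimality backup until values converge,
    then extract a greedy policy."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            break
    policy = {
        s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                       for p, s2, r in P[s][a]))
        for s in P
    }
    return V, policy

V, policy = value_iteration(P, GAMMA)
```

Each sweep applies the Bellman optimality backup to every state; convergence is detected when the largest per-state change falls below the tolerance `theta`.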
A very good, basic introduction to Reinforcement Learning. But the programming assignments need more careful compilation and more attention to detail!
Well-structured course that provides a great introduction to methodologies used in reinforcement learning. I am now eager to experiment more in my own time, to consolidate what I have learned.