Columbia University - Decision Making and Reinforcement Learning 

  • Offered by Coursera
  • Public/Government Institute

Decision Making and Reinforcement Learning at Coursera
Overview

Duration: 47 hours
Total fee: Free
Mode of learning: Online
Credential: Certificate

Highlights

  • Earn a Certificate upon completion
  • Flexible deadlines
  • Coursera Labs

Course details

More about this course
  • This course is an introduction to sequential decision making and reinforcement learning
  • We start with a discussion of utility theory to learn how preferences can be represented and modeled for decision making
  • We first model simple decision problems as multi-armed bandit problems and discuss several approaches to evaluate feedback
  • The course will then model decision problems as finite Markov decision processes (MDPs), and discuss their solutions via dynamic programming algorithms
  • The course will touch on the notion of partial observability in real problems, modeled by partially observable Markov decision processes (POMDPs) and then solved by online planning methods
  • Finally, we introduce the reinforcement learning problem and discuss two paradigms: Monte Carlo methods and temporal difference learning
  • We conclude the course by noting how the two paradigms lie on a spectrum of n-step temporal difference methods
  • An emphasis on algorithms and examples will be a key part of this course
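
As an illustration of the multi-armed bandit setting mentioned above (not taken from the course materials), here is a minimal sketch of ε-greedy action selection with sample-average action-value estimates. The function name run_epsilon_greedy, the Bernoulli reward model, and all parameter values are assumptions made for this example.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli bandit.

    true_means: hypothetical success probability of each arm (assumed for the demo).
    Returns the estimated action values, the pull counts, and the total reward.
    """
    rng = random.Random(seed)          # local generator so runs are reproducible
    n_arms = len(true_means)
    q = [0.0] * n_arms                 # action-value estimates Q(a)
    counts = [0] * n_arms              # number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: q[a])

        # Bernoulli reward: 1 with probability true_means[arm], else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental sample-average update: Q <- Q + (r - Q) / n.
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]
        total_reward += reward

    return q, counts, total_reward


if __name__ == "__main__":
    estimates, pulls, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated values:", estimates)
    print("pull counts:", pulls)
    print("total reward:", total)
```

With a small ε the agent spends most steps on the arm it currently believes is best while still sampling the others occasionally. This is the exploration/exploitation trade-off that bandit methods such as ε-greedy and upper confidence bound address.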

Curriculum

Decision Making and Utility Theory

Introduction to Decision Making and Reinforcement Learning

Course Logistics

1.1 Rational Agents and Utility Theory

1.2 Preferences and Axioms of Utility Theory

1.3 Uncertain and Multi-Attribute Utilities

1.4 Value of Perfect Information

Course Syllabus

About the Instructor

Academic Honesty Policy

Discussion Forum Etiquette

Pre-Course Survey

Week 1 Lesson Materials

Utility Theory

Bandit Problems

2.1 Multi-Armed Bandits and Action Values

2.2 ε-Greedy Action Selection

2.3 Upper Confidence Bound

Week 2 Lesson Materials

Multi-Armed Bandit Problems

Markov Decision Processes

3.1 Markov Decision Process Framework

3.2 Gridworld Example

3.3 Rewards, Utilities, and Discounting

3.4 Policies and Value Functions

3.5 Example: Mini-Gridworld

3.6 Bellman Optimality Equations

Week 3 Lesson Materials

Sequential Decision Problems

Dynamic Programming

4.1 Time-Limited Values

4.2 Value Iteration

4.3 Value Iteration Implementation

4.4 Policy Iteration

4.5 Example: Mini-Gridworld

4.6 Algorithm Complexity

Week 4 Lesson Materials

Markov Decision Processes

Partially Observable Markov Decision Processes

5.1 Partial Observability and POMDP

5.2 Belief States

5.3 Belief Transition Model

5.4 Policies and Value Functions

5.5 Example: Mini-Gridworld

Week 5 Lesson Materials

POMDPs

Monte Carlo Methods

6.1 Monte Carlo Methods

6.2 First-Visit MC Prediction

6.3 State-Action Values

6.4 ε-Greedy On-Policy MC Control

6.5 On and Off-Policy MC Control

6.6 Example: Mini-Gridworld

Week 6 Lesson Materials

Monte Carlo RL

Temporal-Difference Learning

7.1 Temporal Difference Learning

7.2 Temporal Difference Prediction

7.3 Batch Updating

7.4 TD Learning for Control

7.5 SARSA vs Q-Learning

Week 7 Lesson Materials

Temporal Difference Learning

Reinforcement Learning - Generalization

8.1 n-step Temporal Difference Prediction

8.2 n-step SARSA

8.3 Model-Based Methods

8.4 Function Approximation

Week 8 Lesson Materials

Post-Course Survey

Generalization of Tabular Methods
