Chapter 5

Monte Carlo Methods

The characteristics of Monte Carlo Methods is :

  • We do not assume complete knowledge of the environment
    • in fact, we only need experience (sample data)
    • even if we have no prior knowledge of the environment dynamics, we can still attain optimal behavior
  • A model is needed
    • but only the generate sample transitions are mandatory
    • not the complete probability distributions of all possible transition
      • which is require in DP
  • It is a way of solving RL problème based on averaging sample returns
  • We assume we are on episodic tasks
    • it means experience is divided into episodes

To make a parallel to the work done previously, monte-carlo methods behave like the bandit methods but each state is now a bandit problem in which each bandit problems are interrelated.

Monte Carlo Methods use methods we saw in previous chapter (policy evaluation ...)