Chapter 5
Monte Carlo Methods
The characteristics of Monte Carlo methods are:
- We do not assume complete knowledge of the environment
- in fact, we only need experience (sample data)
- even if we have no prior knowledge of the environment dynamics, we can still attain optimal behavior
- When experience is simulated rather than real, a model is needed
- but it only has to generate sample transitions
- not the complete probability distributions of all possible transitions
- which are required in dynamic programming (DP)
- It is a way of solving RL problems based on averaging sample returns
- We assume episodic tasks
- this means experience is divided into episodes
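The two central points above (a model that only samples transitions, and value estimates obtained by averaging sample returns over episodes) can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation. The environment is a hypothetical 5-state random walk chosen for the example, and the names `sample_step`, `generate_episode` and `mc_first_visit` are made up here. The averaging uses only the return that follows the first visit to each state in an episode, which is one possible variant.

```python
import random
from collections import defaultdict


def sample_step(state, rng):
    """Sample model (assumed toy environment): states 1..5, terminals 0 and 6.

    It returns one sampled transition (next_state, reward, done) and never
    exposes the full transition probabilities.
    """
    next_state = state + rng.choice((-1, 1))
    if next_state == 6:
        return next_state, 1.0, True   # right terminal: reward 1
    if next_state == 0:
        return next_state, 0.0, True   # left terminal: reward 0
    return next_state, 0.0, False


def generate_episode(start, rng):
    """Run one episode; return a list of (state, reward received on leaving it)."""
    episode = []
    state, done = start, False
    while not done:
        next_state, reward, done = sample_step(state, rng)
        episode.append((state, reward))
        state = next_state
    return episode


def mc_first_visit(num_episodes, gamma=1.0, seed=0):
    """Estimate state values by averaging sample returns over episodes."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = generate_episode(start=3, rng=rng)
        g = 0.0
        first_return = {}
        # Walk backward so g is the return from each time step onward.
        # The last overwrite for a state corresponds to its earliest visit.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_return[state] = g
        for state, g_first in first_return.items():
            returns_sum[state] += g_first
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}


if __name__ == "__main__":
    for state, value in mc_first_visit(20000).items():
        print(state, round(value, 3))
```

For this particular random walk the true value of state s is s/6, so with enough episodes the estimates should land close to those numbers. No transition probabilities are used anywhere in the estimate, only sampled episodes.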
To draw a parallel with the earlier work, Monte Carlo methods behave like the bandit methods, except that each state is now its own bandit problem and these bandit problems are interrelated.
Monte Carlo methods reuse ideas we saw in the previous chapter (policy evaluation ...)