4.7 Efficiency of Dynamic Programming
- Dynamic Programming may seems not practical for very large problem
- but compared with other methods for solving MDPs, it is quite efficient
- the time DP methods take to find an optimal policy is polynomial (in the number of state and actions)
- DP methods is guaranteed to find an optimal policy in polynomial time
- Linear programming methods can also be used, but became inefficient as the state grows large.
DP is sometimes thought to be of limited applicability because of the fact that the number of states often grows exponentially with the number of state variables. In practice, DP methods can be used with today's computer to solves MDPs with millions of states.
On large state space, asynchronous DP methods are often preferred. Synchronous methods require to compute state which may be irrelevant. Asynchronous methods can help by only apply DP methods on the state you want. You may find good or optimal policies much faster than synchronous methods can.
Bootstraping
- update estimates on the basis of other estimates