3.4 Unified Notation For Episodic and Continuing Tasks

To cover both kinds of reinforcement learning task (episodic and continuing) with a single notation, we would strictly need to refer to an episodic state as $S_{t,i}$, the state at time $t$ of episode $i$. In practice we are almost always considering a particular episode, or stating something that is true for all episodes, so we drop the episode index and write simply $S_t$.

We also need one further convention to unify the two notations. In a continuing task the return is an infinite sum, whereas in an episodic task it is a finite sum. The two can be reconciled by treating episode termination as entry into a special absorbing state that transitions only to itself and generates only rewards of zero.
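The absorbing-state convention can be illustrated with a small Python sketch. The three-step chain, the +1 reward on every ordinary transition, and the names `ABSORBING`, `step` and `rollout` are all assumptions made up for illustration; they are not part of the original text.

```python
# Hypothetical three-step episodic chain: states 0, 1, 2, then termination.
ABSORBING = "absorbing"  # illustrative name for the special terminal state


def step(state):
    """Return (next_state, reward) for one transition of the toy chain."""
    if state == ABSORBING:
        # The absorbing state transitions only to itself with zero reward.
        return ABSORBING, 0.0
    next_state = state + 1 if state < 2 else ABSORBING
    return next_state, 1.0  # assumed: every ordinary transition pays +1


def rollout(start_state, num_steps):
    """Collect the rewards R_1, ..., R_num_steps starting from start_state."""
    rewards = []
    state = start_state
    for _ in range(num_steps):
        state, reward = step(state)
        rewards.append(reward)
    return rewards


rewards = rollout(0, 8)
print(rewards)       # [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(sum(rewards))  # 3.0
```

Because every reward after termination is zero, summing over the first three steps or over an arbitrarily long rollout gives the same total, which is what lets the episodic sum be written as if it were infinite.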

Figure 2: State transition diagram for an episodic task

We can now define the return as:

$$G_t = \sum_{k=t+1}^{T} \gamma^{k-t-1} R_k$$

including the possibility that $T = \infty$ or $\gamma = 1$ (but not both).
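A minimal Python sketch of this return, assuming the usual form G_t = sum over k from t+1 to T of gamma^(k-t-1) * R_k. The function name `discounted_return`, the list layout (`rewards[k - 1]` holds R_k) and the sample rewards are assumptions chosen for illustration. The case T = ∞ is only approximated here by padding a finite episode with zero rewards, as the absorbing state would produce.

```python
def discounted_return(rewards, gamma, t=0):
    """Compute G_t = sum_{k=t+1}^{T} gamma^(k-t-1) * R_k.

    rewards[k - 1] holds R_k, so the list is [R_1, ..., R_T].
    """
    T = len(rewards)
    return sum(gamma ** (k - t - 1) * rewards[k - 1] for k in range(t + 1, T + 1))


episode = [1.0, 1.0, 1.0]        # assumed rewards R_1, R_2, R_3
padded = episode + [0.0] * 100   # zero rewards from the absorbing state

print(discounted_return(episode, gamma=1.0))       # 3.0
print(discounted_return(padded, gamma=1.0))        # 3.0
print(discounted_return(episode, gamma=0.5))       # 1.75
print(discounted_return(episode, gamma=0.5, t=1))  # 1.5
```

With gamma = 1 the finite episode and its zero-padded version give the same return, so one formula serves both the episodic case (finite T, gamma possibly 1) and the continuing case (T = ∞, gamma < 1).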