3.4 Unified Notation For Episodic and Continuing Tasks
To unify the notation for both kinds of reinforcement learning task (episodic and continuing), we would strictly need to refer to an episodic state $S_{t,i}$, the state at time $t$ in episode $i$. In practice we will almost always be considering a particular episode, or stating something that is true for all episodes, so we drop the episode index and simply write $S_t$.
We also need one other convention to unify the two notations. In an episodic task the return is a finite sum, whereas in a continuing task it is an infinite sum. The two cases can be unified by considering episode termination to be the entering of a special absorbing state that transitions only to itself and generates only rewards of zero.
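We can now define the return as:
$$G_t = \sum_{k=t+1}^{T} \gamma^{k-t-1} R_k$$
including the possibility that $T = \infty$ or $\gamma = 1$ (but not both, since an undiscounted infinite sum need not converge).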
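To see why the absorbing-state convention is harmless, here is a minimal sketch in Python. It assumes the rewards of one episode are stored in a list where `rewards[k-1]` holds $R_k$; the helper name `discounted_return` and the example reward values are illustrative, not taken from the text. Padding the episode with zero rewards, as if the agent stayed in the absorbing state, should leave the return unchanged.

```python
def discounted_return(rewards, gamma, t=0):
    """Return G_t = sum_{k=t+1}^{T} gamma^(k-t-1) * R_k.

    Assumption: rewards[k-1] holds R_k, so rewards[t] is R_{t+1}.
    """
    g = 0.0
    # Work backwards using G_k = R_{k+1} + gamma * G_{k+1}.
    for r in reversed(rewards[t:]):
        g = r + gamma * g
    return g


# A hypothetical episode with T = 3 and rewards R_1, R_2, R_3.
episode_rewards = [1.0, 1.0, 1.0]

# Same episode followed by 50 steps in the absorbing state (reward 0).
padded_rewards = episode_rewards + [0.0] * 50

g_episode = discounted_return(episode_rewards, gamma=0.9)
g_padded = discounted_return(padded_rewards, gamma=0.9)

# Undiscounted case (gamma = 1) is fine here because T is finite.
g_undiscounted = discounted_return(episode_rewards, gamma=1.0)

print(g_episode, g_padded, g_undiscounted)
```

The padded and unpadded returns agree because every extra term is a zero reward, so the same formula can be applied whether the sum stops at $T$ or is carried on indefinitely.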