3.1 The Agent-Environment Interface
Some definitions:
- Agent: the learner and decision-maker
- Environment: everything outside the agent that it interacts with
- Action: the way the agent interacts with its environment
- Reward: a special numerical value sent by the environment to the agent, which the agent tries to maximize over time
The agent interacts with its environment at discrete time steps t = 0, 1, 2, ... At each time step, the agent receives some representation of the environment's state and selects an action on that basis.
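Let $S_t \in \mathcal{S}$ be the state at time step $t$, $A_t \in \mathcal{A}(S_t)$ the action selected in that state, and $R_{t+1} \in \mathbb{R}$ the reward received one time step later.
Where:
- $\mathcal{S}$ is the set of all possible states
- $\mathcal{A}(S_t)$ is the set of actions available in state $S_t$
- $R_{t+1}$ is the reward received at time step $t+1$, associated with the previous action $A_t$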
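The interaction described above (observe a state, choose an available action, receive a reward and the next state one step later) can be sketched as a simple loop. This is only an illustrative sketch: the `TwoStateEnv` class, its `actions` and `step` methods, and the reward rule are hypothetical and do not come from the text, and actions are chosen uniformly at random because no policy has been introduced yet.

```python
import random


class TwoStateEnv:
    """Hypothetical toy environment with two states, 0 and 1."""

    def __init__(self):
        self.state = 0  # initial state S_0

    def actions(self, state):
        # A(s): the set of actions available in state s (same in both states here)
        return ["stay", "move"]

    def step(self, action):
        # Apply the action, then return (R_{t+1}, S_{t+1}).
        if action == "move":
            self.state = 1 - self.state
        # Assumed reward rule: 1.0 for ending up in state 1, else 0.0.
        reward = 1.0 if self.state == 1 else 0.0
        return reward, self.state


def run_episode(env, n_steps, rng):
    """Run the agent-environment loop and record (t, S_t, A_t, R_{t+1})."""
    trajectory = []
    state = env.state
    for t in range(n_steps):
        action = rng.choice(env.actions(state))  # pick A_t from A(S_t)
        reward, next_state = env.step(action)    # environment responds
        trajectory.append((t, state, action, reward))
        state = next_state
    return trajectory


trajectory = run_episode(TwoStateEnv(), 5, random.Random(0))
for t, state, action, reward in trajectory:
    print(f"t={t} S_t={state} A_t={action} R_t+1={reward}")
```

Each recorded tuple pairs the state and action at time t with the reward that arrives at time t+1, which is why the reward is described as belonging to the previous action.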
At each time step, the agent implements a mapping from states to probabilities of selecting each possible action. This mapping is called the agent's policy.
- $\pi_t$ is the policy at time step $t$
- $\pi_t(a \mid s)$ is the probability that $A_t = a$ if $S_t = s$
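A policy of this kind can be pictured as a table that assigns, for each state, a probability to every action available in it. The following is a minimal sketch under assumed values: the state names, action names, and probabilities are hypothetical and chosen only for illustration.

```python
import random

# pi[s][a]: probability of selecting action a in state s (hypothetical values).
pi = {
    "s0": {"left": 0.2, "right": 0.8},
    "s1": {"left": 0.5, "right": 0.3, "wait": 0.2},
}


def sample_action(policy, state, rng):
    """Draw an action for the given state according to policy[state]."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# For every state, the action probabilities must sum to 1.
for state, dist in pi.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9

rng = random.Random(0)
action = sample_action(pi, "s0", rng)
print(action)
```

Note that the set of actions may differ between states, which matches the idea that the available actions depend on the current state.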
The framework is abstract and flexible, so it can be applied to many different reinforcement learning problems.