The Exploration-Exploitation Trade-off
The Exploration-Exploitation Trade-off. The ideas of exploration and exploitation are central to designing an expedient reinforcement learning system. The word "expedient" is a terminology adapted from the theory of Learning Automata to mean a system in which the Agent (or Automaton) learns the dynamics of the stochastic Environment.
Read More