IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Date: 2007/04/01 - 2007/04/05, Location: Honolulu, HI

Publication date: 2007-01-01
Pages: 76 - 83
ISBN: 1424407060, 978-1-4244-0706-4
Publisher: IEEE; 345 E 47th St, New York, NY 10017, USA

2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors:

Peeters, Maarten; Verbeeck, Katja; Nowe, Ann

Keywords:

Science & Technology, Technology, Computer Science, Artificial Intelligence, Computer Science

Abstract:

Learning automata are shown to be an excellent tool for creating learning multi-agent systems. Most algorithms used in current automata research expect the environment to end in an explicit end-stage, in which the rewards are given to the learning automata (i.e., Monte Carlo updating). This is, however, infeasible in sequential decision problems with an infinite horizon, where no such end-stage exists. In this paper we propose a new algorithm based on one-step returns that uses bootstrapping to find good equilibrium paths in multi-stage games.
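
The contrast the abstract draws between end-of-episode rewards and one-step bootstrapping can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details are not given in this record. It assumes a standard linear reward-inaction automaton whose feedback lies in [0, 1], and a hypothetical table `value` holding the current estimate for each stage. The names `lri_update`, `one_step_target`, and `value_update`, the discount `gamma`, and the scaling of the target are all assumptions made for illustration.

```python
def lri_update(probs, action, feedback, lr=0.1):
    """Linear reward-inaction update for one automaton.

    feedback must lie in [0, 1]; the chosen action gains probability
    mass and all other actions lose it proportionally.
    """
    return [
        p + lr * feedback * (1.0 - p) if i == action else p - lr * feedback * p
        for i, p in enumerate(probs)
    ]


def one_step_target(reward, next_value, gamma=0.9):
    """One-step return that bootstraps from the next stage's estimate.

    The (1 - gamma) weighting is an assumption that keeps the target
    in [0, 1] whenever reward and next_value are in [0, 1].
    """
    return (1.0 - gamma) * reward + gamma * next_value


def value_update(value, target, lr=0.1):
    """Move the stage's estimate a small step toward the target."""
    return value + lr * (target - value)


# One transition: the automaton at stage "s0" picked action 0,
# received an immediate reward of 1.0, and the play moved to stage "s1".
probs = [0.5, 0.5]
value = {"s0": 0.0, "s1": 0.5}  # hypothetical current estimates

target = one_step_target(reward=1.0, next_value=value["s1"])
probs = lri_update(probs, action=0, feedback=target)
value["s0"] = value_update(value["s0"], target)
```

Because the target uses the estimate of the next stage rather than the final outcome, the update can be applied after every transition, so no explicit end-stage is needed.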