Title: Real-time Bayesian search control for MDPs
Authors: Sanner, Scott
Goetschalckx, Robby
Driessens, Kurt #
Issue Date: 2008
Host Document: ICAPS 2008 workshop on oversubscribed planning & scheduling
Conference: ICAPS edition:18 location:Sydney, Australia date:14-18 September 2008
Abstract: Many oversubscribed planning problems with multiple (potentially jointly unachievable) goals can be cast in the Markov decision process (MDP) framework when appropriate reward functions and objective criteria can be defined. Recent real-time, search-based solutions to MDPs maintain upper and lower bounds on the value function at all stages of execution prior to convergence. While many algorithms already use this information to detect convergence and to prioritize state exploration, we propose a novel Bayesian interpretation of these bounds to evaluate the myopic value of perfect information (VPI) of exactly knowing a state’s value. This VPI analysis permits two general Bayesian search control modifications to existing algorithms: (1) we can use it to dynamically adapt the backup depth in real-time, search-based bounded dynamic programming approaches, and (2) we can use it for real-time policy evaluation with a bounded value function produced by an anytime algorithm. We empirically evaluate both of these modifications to existing algorithms and analyze their performance. While real-time Bayesian search control may reduce the number of backups until convergence for (1), it offers more substantive improvements for (2), making it an attractive real-time policy evaluation approach for MDPs.
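The myopic VPI idea in the abstract can be illustrated with a small sketch: given lower and upper bounds [lo, hi] on a state's value, a Bayesian belief over that interval yields the expected gain from learning the true value exactly, relative to the value of the current incumbent action. This is purely illustrative and not the authors' implementation; the function name, the uniform-belief assumption, and the closed-form integral are assumptions made here for concreteness.

```python
def myopic_vpi(lo, hi, incumbent):
    """Myopic value of perfect information for a state whose true value
    is believed (uniformly, as an illustrative assumption) to lie in
    [lo, hi], measured against the value of the current best (incumbent)
    alternative. Returns the expected improvement from knowing the value."""
    if hi <= incumbent:
        # Even the best case cannot beat the incumbent: no information value.
        return 0.0
    if incumbent <= lo:
        # The state surely beats the incumbent; gain is the mean excess.
        return (lo + hi) / 2.0 - incumbent
    # Incumbent lies strictly inside (lo, hi): integrate the positive part
    # of (v - incumbent) under the uniform density 1/(hi - lo).
    return (hi - incumbent) ** 2 / (2.0 * (hi - lo))
```

Under this reading, states whose bound intervals straddle the incumbent's value carry the most information, which is what motivates using VPI to adapt backup depth and to drive real-time policy evaluation.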
Description: The presentation was based on the material in the uploaded file
Publication status: published
KU Leuven publication type: IMa
Appears in Collections: Informatics Section
# (joint) last author

Files in This Item:
File: BSCMDP.pdf — Description: Main article — Status: Published — Size: 181 Kb — Format: Adobe PDF