Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
(CS 528)
Rethinking State, Action, and Reward in Reinforcement Learning
Satinder Singh
November 7, 2005, 4:15PM
Hewlett (TCSeq) 200
http://graphics.stanford.edu/ba-colloquium/
Abstract
Over the last decade and more, there has been rapid theoretical and
empirical progress in reinforcement learning (RL) using the well-
established formalisms of Markov decision processes (MDPs) and
partially observable MDPs (POMDPs). At the core of these formalisms
are particular formulations of the elemental notions of state,
action, and reward that have served the field of RL so well. In this
talk, I will describe recent progress in rethinking these basic
elements to take the field beyond (PO)MDPs. In particular, I will
briefly describe older work on a flexible notion of action called
options, touch on recent work on intrinsic rather than extrinsic
rewards, and then spend the bulk of my time on recent work on
predictive representations of state. I will conclude by arguing that,
taken together, these advances point the way for RL to address the
many challenges of building an artificial intelligence.
About the Speaker
Satinder Singh is an Associate Professor of Electrical Engineering
and Computer Science at the University of Michigan, Ann Arbor. His
main research interest is in the old-fashioned goal of Artificial
Intelligence, that of building autonomous agents that can learn to
be broadly competent in complex, dynamic, and uncertain environments.
The field of reinforcement learning (RL) has focused on this goal, and
accordingly his deepest contributions are in RL.
Contact: bac-coordinators@cs.stanford.edu