Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
Sharing and Abstraction in Hierarchical Reinforcement Learning
Thomas G. Dietterich
Computer Science Department
Oregon State University
Monday, March 5, 2001, 4:15 PM
TCSEQ 200
http://robotics.stanford.edu/ba-colloquium/
Abstract
Reinforcement learning addresses the problem of learning optimal
policies for sequential decision-making problems involving stochastic
operators and numerical reward functions rather than the more
traditional deterministic operators and logical goal predicates. In
many ways, reinforcement learning research is recapitulating the
development of classical research in planning and problem solving.
After studying the problem of solving "flat" problem spaces,
researchers have recently turned their attention to hierarchical
methods that incorporate subroutines and state abstractions.
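
For concreteness (standard background, not specific to this talk),
such problems are usually formalized as Markov decision processes,
whose optimal value function satisfies the Bellman equation

    V^*(s) = \max_{a}\Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, V^*(s') \Big]

where the transition model P captures the stochastic operators and R
the numerical reward function.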
In this talk, I will provide an overview of the MAXQ approach to
hierarchical reinforcement learning. In this approach, the programmer
designs a task hierarchy much the way a programmer might design a
subroutine hierarchy or a hierarchical task network. A reinforcement
learning algorithm (MAXQ-Q learning) is then applied to learn
implementations of all of the subroutines simultaneously by
interacting, online, with the environment. The key data structure
constructed by MAXQ-Q learning is a hierarchical decomposition of the
value function known as the MAXQ hierarchy. This decomposition
supports three important forms of state abstraction that are
essential to the practical application of the MAXQ hierarchy. The
talk will briefly summarize
theoretical results concerning correctness and convergence of MAXQ and
present experimental studies showing that hierarchical reinforcement
learning can be much more efficient than non-hierarchical methods.
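
The decomposition can be made concrete with a short sketch. The
Python below is illustrative only, not the talk's or the MAXQ paper's
code: it assumes a hypothetical env object with step(action) and
terminated(task, state) methods, and it omits refinements such as
pseudo-rewards. It shows the core identity Q(p, s, a) = V(a, s) +
C(p, s, a) and the recursive MAXQ-Q update.

from collections import defaultdict
import random


class TaskNode:
    """One subtask in a MAXQ task hierarchy."""

    def __init__(self, name, children=(), is_primitive=False):
        self.name = name
        self.children = list(children)  # child subtasks or primitive actions
        self.is_primitive = is_primitive
        self.V = defaultdict(float)     # primitives: V(a, s), expected one-step reward
        self.C = defaultdict(float)     # C(p, s, a): value after a completes, keyed (s, a)

    def Q(self, s, a):
        # The MAXQ decomposition: Q(p, s, a) = V(a, s) + C(p, s, a).
        return a.value(s) + self.C[(s, a)]

    def value(self, s):
        # V(p, s) = max_a Q(p, s, a) for composite tasks.
        if self.is_primitive:
            return self.V[s]
        return max(self.Q(s, a) for a in self.children)


def maxq_q(task, s, env, alpha=0.1, gamma=0.95, eps=0.1):
    """Run `task` from state s, updating V and C in place (MAXQ-Q learning).
    Returns (number of primitive steps taken, resulting state)."""
    if task.is_primitive:
        s2, r = env.step(task.name)            # execute one primitive action
        task.V[s] += alpha * (r - task.V[s])   # one-step expected-reward update
        return 1, s2
    steps = 0
    while not env.terminated(task, s):
        # Epsilon-greedy choice among the child subtasks.
        if random.random() < eps:
            a = random.choice(task.children)
        else:
            a = max(task.children, key=lambda c: task.Q(s, c))
        n, s2 = maxq_q(a, s, env, alpha, gamma, eps)
        # Update the completion function toward the best continuation from
        # s2, discounted by the n primitive steps that executing a consumed.
        target = (gamma ** n) * max(task.Q(s2, c) for c in task.children)
        task.C[(s, a)] += alpha * (target - task.C[(s, a)])
        steps += n
        s = s2
    return steps, s

Because every subtask stores its own V and C tables, each table needs
only the state variables relevant to that subtask, which is what makes
the state abstractions mentioned above possible.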
About the Speaker
Thomas G. Dietterich has been a faculty member in Computer Science at
Oregon State University since 1985 (Professor since 1995). He
received the Ph.D. in computer science from Stanford University in
1984, the M.S. from the University of Illinois in 1979, and the
A.B. (in mathematics) from Oberlin College in 1977. He is the author
of many journal papers in artificial intelligence and machine
learning and the editor (with Jude Shavlik) of the book Readings in
Machine Learning. He is the past Executive Editor of the
journal Machine Learning, and currently serves as an Action
Editor for Neural Computation and the Journal of Machine
Learning Research. He also edits the MIT Press Series on Adaptive
Computation and Machine Learning. Dietterich received an NSF
Presidential Young Investigator Award in 1986; he has served as
Technical Program Chairman for the National Conference on Artificial
Intelligence (1990) and Councillor of the American Association for
Artificial Intelligence (1991-1993). He also worked as a Senior
Scientist for Arris Pharmaceutical Corporation from 1991 to 1993,
developing machine learning methods to support rational drug design.
His current
research has two main directions. The first is the application of
reinforcement learning to problems in search and optimization,
particularly in image processing and protein structure determination.
The second direction is the development of generic tools for learning
from temporal and spatial data.
Contact: bac-coordinators@cs.stanford.edu