Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision

Sharing and Abstraction in Hierarchical Reinforcement Learning

Thomas G. Dietterich
Computer Science Department
Oregon State University

Monday, Mar 5, 2001, 4:15PM


Reinforcement learning addresses the problem of learning optimal policies for sequential decision-making problems involving stochastic operators and numerical reward functions rather than the more traditional deterministic operators and logical goal predicates. In many ways, reinforcement learning research is recapitulating the development of classical research in planning and problem solving. After studying the problem of solving "flat" problem spaces, researchers have recently turned their attention to hierarchical methods that incorporate subroutines and state abstractions. In this talk, I will provide an overview of the MAXQ approach to hierarchical reinforcement learning. In this approach, the programmer designs a task hierarchy in much the same way as one might design a subroutine hierarchy or a hierarchical task network. A reinforcement learning algorithm (MAXQ-Q learning) is then applied to simultaneously learn implementations for each of the subroutines by interacting, online, with the environment. The key data structure constructed by MAXQ-Q learning is a hierarchical decomposition of the value function known as the MAXQ hierarchy. This decomposition supports three important forms of state abstraction that are essential to the practical application of the MAXQ hierarchy. The talk will briefly summarize theoretical results concerning the correctness and convergence of MAXQ and present experimental studies showing that hierarchical reinforcement learning can be much more efficient than non-hierarchical methods.
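
As a rough illustration of what a hierarchical decomposition of the value function can look like, the following Python sketch evaluates a small task hierarchy recursively. It is a minimal sketch under stated assumptions, not the speaker's implementation: the task names, the state label, and the table layout are hypothetical, termination conditions and the learning updates are omitted, and the recursion assumes the commonly cited MAXQ form in which the value of a child task inside a parent is the child's own value plus a learned completion value.

```python
from collections import defaultdict

# Hypothetical task hierarchy (names are illustrative assumptions).
# Composite tasks map to their child tasks; anything not listed is primitive.
HIERARCHY = {
    "root": ["get", "put"],
    "get": ["north", "pickup"],
    "put": ["south", "putdown"],
}

# Tables that a learner would fill in; unseen entries default to 0.0.
V_primitive = defaultdict(float)  # V_primitive[(action, state)]: expected one-step reward
C = defaultdict(float)            # C[(task, state, child)]: completion value


def is_primitive(task):
    """A task is primitive if it has no children in the hierarchy."""
    return task not in HIERARCHY


def q_value(task, state, child):
    """Assumed decomposition: Q(task, s, child) = V(child, s) + C(task, s, child)."""
    return value(child, state) + C[(task, state, child)]


def value(task, state):
    """V(task, s): a table lookup for primitives, a max over children otherwise."""
    if is_primitive(task):
        return V_primitive[(task, state)]
    return max(q_value(task, state, child) for child in HIERARCHY[task])


def greedy_child(task, state):
    """Child of a composite task with the highest decomposed Q value."""
    return max(HIERARCHY[task], key=lambda child: q_value(task, state, child))
```

Because each composite task stores only completion values for its own children, the value of a shared subtask is kept once and reused by every parent that invokes it, which is the kind of sharing the title of the talk refers to.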

About the Speaker

Thomas G. Dietterich has been a faculty member in Computer Science at Oregon State University since 1985 (Professor since 1995). He received his Ph.D. in computer science from Stanford University in 1984, his M.S. from the University of Illinois in 1979, and his A.B. (in mathematics) from Oberlin College in 1977. He is the author of many journal papers in artificial intelligence and machine learning, and co-editor (with Jude Shavlik) of the book Readings in Machine Learning. He is the past Executive Editor of the journal Machine Learning, and currently serves as an Action Editor for Neural Computation and the Journal of Machine Learning Research. He also edits the MIT Press Series on Adaptive Computation and Machine Learning. Dietterich received an NSF Presidential Young Investigator Award in 1986; he has served as Technical Program Chairman for the National Conference on Artificial Intelligence (1990) and as Councillor of the American Association for Artificial Intelligence (1991-1993). He also worked as a Senior Scientist for Arris Pharmaceutical Corporation from 1991 to 1993, developing machine learning methods to support rational drug design. His current research has two main directions. The first is the application of reinforcement learning to problems in search and optimization, particularly in image processing and protein structure determination. The second is the development of generic tools for learning from temporal and spatial data.

