Nonlinear Inverse Reinforcement Learning with Gaussian Processes

Supplementary Materials
Sergey Levine
Stanford University
Zoran Popović
University of Washington
Vladlen Koltun
Stanford University

This webpage provides supplementary materials for the NIPS 2011 paper "Nonlinear Inverse Reinforcement Learning with Gaussian Processes." The following materials are provided:

  • Derivation of likelihood partial derivatives and description of random restart scheme: PDF
  • Full results of comparison between GPIRL and previous IRL algorithms: PDF
  • Complete MATLAB source code, in a modular, extensible framework: ZIP
  • Video of the learned policies and human demonstrations on the "highway" environment: see below

1. Supplementary Video

The video first shows a human expert demonstrating a policy that avoids exceeding speed 2 near police cars. We then show an optimal expert executing the same policy, computed from the reward function described in the paper. This optimal demonstration is shown for reference only and is not provided to the IRL algorithms. Finally, we show the policies learned by the three best-performing IRL algorithms for this task: GPIRL, MaxEnt/Lp, and FIRL. The video is embedded below in Flash and is also available as a DivX AVI file.