Nonlinear Inverse Reinforcement Learning with Gaussian Processes
Supplementary Materials

Sergey Levine, Stanford University
Zoran Popović, University of Washington
Vladlen Koltun, Stanford University
This webpage provides supplementary materials for the NIPS 2011 paper "Nonlinear Inverse Reinforcement Learning with Gaussian Processes." The paper can be viewed here. The following materials are provided:
1. Supplementary Video
The video first shows an example demonstration by a human expert of a policy that avoids exceeding speed 2 near police cars. We then show an optimal expert executing this policy, based on the reward function described in the paper. The optimal demonstration is shown for reference and is not provided to the IRL algorithms. Finally, we show the policies learned by the three best-performing IRL algorithms for this task: GPIRL, MaxEnt/Lp, and FIRL. The video is provided below using Flash, and is available as a DivX avi file here.