A central premise of decision theory is that tasks can be specified as
reward functions to be optimized. The problem of inverse reinforcement
learning (IRL) is inferring such reward functions from sample executions
of the task. In my work, I derive a maximum likelihood estimation
approach (MLIRL) to the IRL problem and show that it quickly and
successfully identifies unknown reward functions under the standard
assumption that reward is a linear function of a known set of features.
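As an informal sketch (the notation below, including the Boltzmann
temperature β, is illustrative rather than the thesis's exact
formulation), MLIRL models the demonstrator with a Boltzmann policy
over the action values induced by the parameterized reward and ascends
the gradient of the log likelihood of the demonstration set D:

\[
R_\theta(s) = \theta^{\top}\phi(s), \qquad
\pi_\theta(a \mid s) = \frac{e^{\beta Q_\theta(s,a)}}{\sum_{a'} e^{\beta Q_\theta(s,a')}}, \qquad
L(\theta) = \sum_{\xi \in D} \sum_{(s,a) \in \xi} \log \pi_\theta(a \mid s).
\]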
I show that MLIRL, in addition to being a highly competitive method for
standard IRL problems, is well structured to support solutions to
important generalizations. I extend MLIRL to the problem of learning
reward functions that are more general functions of the features, and I
show that a modular IRL approach is competitive with existing methods
that cover the same class of functions while also learning rewards in
cases those methods cannot handle. Finally, I pose the problem of
learning a set of reward functions when unlabeled demonstrations of
multiple tasks are mixed together. My approach applies expectation
maximization (EM) to cluster the observed trajectories, with MLIRL
serving as the M-step, and provides an effective first solution to this
challenge.
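Sketched with the same illustrative notation (the cluster priors
\rho_j and responsibilities z_{ij} are not the thesis's exact symbols),
each EM iteration alternates

\[
\text{E-step: } z_{ij} \propto \rho_j \prod_{(s,a)\in\xi_i} \pi_{\theta_j}(a \mid s),
\qquad
\text{M-step: } \rho_j = \frac{1}{N}\sum_{i} z_{ij}, \quad
\theta_j \leftarrow \arg\max_{\theta} \sum_i z_{ij} \sum_{(s,a)\in\xi_i} \log \pi_{\theta}(a \mid s),
\]

where the M-step maximization over \theta is carried out by MLIRL on
the responsibility-weighted trajectories.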