A central premise of decision theory is that tasks can be specified as
reward functions to be optimized. The problem of inverse reinforcement
learning (IRL) is inferring such reward functions from sample executions
of the task. In my work, I derive a maximum likelihood estimation
approach (MLIRL) to the IRL problem and show that it quickly and
successfully identifies unknown reward functions under the standard
assumption that reward is a linear function of a known set of features.
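As an informal sketch (the notation below, including the Boltzmann
temperature β, is illustrative rather than the thesis's exact
formulation), MLIRL models the demonstrator with a Boltzmann policy
over the action values induced by the parameterized reward and ascends
the gradient of the log likelihood of the demonstration set D:

\[
R_\theta(s) = \theta^{\top}\phi(s), \qquad
\pi_\theta(a \mid s) = \frac{e^{\beta Q_\theta(s,a)}}{\sum_{a'} e^{\beta Q_\theta(s,a')}}, \qquad
L(\theta) = \sum_{\xi \in D} \sum_{(s,a) \in \xi} \log \pi_\theta(a \mid s).
\]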
I show that MLIRL, in addition to being a highly competitive method for
standard IRL problems, is well structured to support solutions to
important generalizations. I extend MLIRL to the problem of learning
reward functions that are more general functions of the features, and I
show that a modular IRL approach is competitive with existing methods
that cover the same class of functions while also learning rewards in
cases those methods cannot handle. Finally, I pose the problem of
learning a set of reward functions when unlabeled demonstrations of
multiple tasks are mixed together. My approach applies expectation
maximization (EM) to cluster the observed trajectories, with MLIRL
serving as the M-step, and provides an effective first solution to this
challenge.
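Sketched with the same illustrative notation (the cluster priors
\rho_j and responsibilities z_{ij} are not the thesis's exact symbols),
each EM iteration alternates

\[
\text{E-step: } z_{ij} \propto \rho_j \prod_{(s,a)\in\xi_i} \pi_{\theta_j}(a \mid s),
\qquad
\text{M-step: } \rho_j = \frac{1}{N}\sum_{i} z_{ij}, \quad
\theta_j \leftarrow \arg\max_{\theta} \sum_i z_{ij} \sum_{(s,a)\in\xi_i} \log \pi_{\theta}(a \mid s),
\]

where the M-step maximization over \theta is carried out by MLIRL on
the responsibility-weighted trajectories.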