![]()
Maintained by web@cs.rutgers.edu |
Rutgers University DCIS Colloquium Date: Friday, October 24, 2003 Time: 2:00 PM Location: CoRE Building room 301, Busch Campus, Rutgers University
Abstract: Stochastic, or Markov games are a rich model for multi-agent interactions. They generalize the Markov decsision process model and the repeated game model. At each stage the agent engages in a single shot game whose outcome determines both his reward and the next stage's game. I will describe R-max, a very simple model-based algorithm for learning in zero-sum stochastic games. R-max is the first algorithm to be shown to converge to near-optimal value in polynomial time. Moreover, unlike most other algorithms, it has a built in method for trading exploration and exploitation which provides the first formal justification of some popular reinforcement learning heuristics. Because R-max is a deterministic algorithm, it can also be extended to handle common-interest stochastic games. I will explain the idea behind this extension and show some preliminary experimental results. Finally, I will suggest a new normative criteria for game-learning algorithms clled an Efficient Learning Equilibrium. This is joint work with Moshe Tennenholtz
|