Practical issues in temporal difference learning -> Robin
* Is backgammon a "complex, real world problem"? Compared to what?
* What are structural and temporal credit assignment?
* What does TD stand for? What does lambda do?
* What is self-play? What's an alternative?
* How is learning performance measured in this paper?
* What is feature discovery and why is it important?
* Why should TD-Gammon be expected to play better late in the game than early in the game? What does it actually do?

Generalization in reinforcement learning: Safely approximating the value function -> George
* What is the curse of dimensionality?
* What are rollouts and why are they safer than one-step backups?
* What are the four classes of behavior of smooth value iteration?
* Why would you expect a local function approximator to behave better than a global one?
* Why are multiple rollouts needed for non-deterministic problems?

Generalization in reinforcement learning: Successful examples using sparse coarse coding -> Nishkam
* How does smooth value iteration differ from Sutton's RL approach? How do the function approximators differ in the two papers?
* How are tilings local? How do they generalize?
* Why does performance vary with lambda as a U-shaped curve? How about alpha?

Scaling reinforcement learning toward RoboCup soccer -> Vasil
* Is the RoboCup simulator a "complex, real world problem"? How does it compare to a task with real robots?
* Why do the authors model keepaway as a semi-Markov decision process instead of a standard MDP?
* What is a representational selection? Are there examples in which this is automated?
* How is learning performance measured in the keepaway task?
* Is self-play used for training in this paper?