CPS196 - Fall 1999
RL: Demos
Notes: By Lenore Ramm
Students demoed their robots in the lobby outside the classroom.
Luis mentioned that this is the first time that reinforcement learning
has been tried on the Mindstorms.
Issues:
- One must pray to the rotation sensor gods at regular intervals to
ensure success.
- It might be good to have a decreasing epsilon (exploration rate) or
a "happy" state. Is this an exploration issue?
- Maybe the robot should be rewarded more for being in the center?
- SARSA?
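To make the epsilon and SARSA questions above concrete, here is a minimal sketch in Python. It is not the code that ran on the robots. The linear decay schedule, the default parameter values, and all function names are assumptions chosen for illustration. It shows an exploration rate that shrinks over time, and the one place where SARSA differs from Q-learning: the target uses the action actually taken next rather than the best available one.

```python
import random

def epsilon_at(step, start=0.5, end=0.05, decay_steps=200):
    """Linearly anneal epsilon from start to end (hypothetical schedule)."""
    frac = min(step / decay_steps, 1.0)
    return start + (end - start) * frac

def choose_action(q, state, actions, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

def q_learning_target(q, reward, next_state, actions, gamma):
    """Off-policy target: assumes the best action is taken next."""
    return reward + gamma * max(q.get((next_state, a), 0.0) for a in actions)

def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: uses the action the robot actually takes next."""
    return reward + gamma * q.get((next_state, next_action), 0.0)

def update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha of the way toward target."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
```

With a decaying epsilon the robot explores heavily at first and settles into its learned behavior later. Because SARSA's target includes exploratory moves, its value estimates account for the cost of exploring, which is one reason it might behave differently here.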
Possibilities for what could be done in the future:
- More states for the rotation sensor readings might be helpful, so
that each state does not cover such a broad range.
- Parallel processing of the Q values. Put the Q update loop into
its own thread.
- Maybe move 3 times and then update.
- Turn Kappa down so it's not doing so many iterations.
- Q-learning: only do updates on what you just experienced.
- Dispense with floats.
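Several of the ideas above (finer states, updating only the experience just seen, and dispensing with floats) can be sketched together. This is a Python illustration, not the robots' actual code. The scale factor, the sensor range, the bucket counts, and the names bucket and q_update_fixed are all assumptions. Q values are stored as integers scaled by 256, the learning rate is a right shift, and the discount factor is an integer fraction of the scale.

```python
SCALE = 256  # fixed-point: stored integer = real value * 256 (assumed scale)

def bucket(reading, lo, hi, n_buckets):
    """Map an integer sensor reading in [lo, hi] to a bucket 0..n_buckets-1.

    Raising n_buckets gives each state a narrower range of readings.
    """
    reading = max(lo, min(hi, reading))  # clamp out-of-range readings
    return min((reading - lo) * n_buckets // (hi - lo + 1), n_buckets - 1)

def q_update_fixed(q, state, action, reward, next_state,
                   alpha_shift=2, gamma_num=230):
    """One Q-learning update on the transition just experienced, integers only.

    alpha = 1 / 2**alpha_shift and gamma = gamma_num / SCALE (about 0.9 here).
    q is a list of per-state lists of scaled integer Q values.
    """
    best_next = max(q[next_state])
    target = reward * SCALE + (gamma_num * best_next) // SCALE
    q[state][action] += (target - q[state][action]) >> alpha_shift

# Example: a hypothetical sensor range of -10..10 split into 5 states.
q = [[0, 0] for _ in range(5)]
s = bucket(-3, -10, 10, 5)
s_next = bucket(1, -10, 10, 5)
q_update_fixed(q, s, 0, reward=1, next_state=s_next)
```

Each step touches a single table entry and uses only integer multiply, divide, and shift, which is the kind of saving the last two bullets point at. The cost is precision: small corrections are rounded away by the shift, so the scale factor would need tuning.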
Are we done with reinforcement learning? With rotation sensors?
Photos
The trailorbots: Cyrus' Group, Lenore's Group, Peter's
Group, Beau's Robot, The Whole Family, Mechanical closeup.
The class hard at work, Discussing Q-learning.
Modified: Sat Oct 16 12:58:47 EDT 1999
by Michael Littman, mlittman@cs.duke.edu