Qualifying Exam

Inferring Time-delayed Hidden Causal Relations In Reinforcement Learning

Friday, May 22, 2020, 01:00pm - 02:00pm

Speaker: Junchi Liang

Location: Remote via Webex

Committee: Prof. Abdeslam Boularias, Prof. Sungjin Ahn, Prof. Yongfeng Zhang, Prof. Shubhangi Saraf

Event Type: Qualifying Exam

Abstract: Despite the remarkable progress made by deep RL agents in reaching human-level performance and beyond, they continue to lag behind humans in data efficiency. Model-based RL algorithms arguably require less data than model-free ones, but learning models that are accurate enough for planning remains a challenging problem. The difficulty of learning accurate predictive models can mostly be attributed to the partial observability of the states. In robotics, for example, the Markov condition is seldom satisfied: future states and rewards often depend on the entire history of actions and observations. LSTM and GRU architectures are general-purpose tools for handling partial observability by discovering and remembering pertinent information. They tend, however, to require large amounts of data, and they are not easily interpreted. To address these two issues, we present an approach that combines the merits of general function approximators, such as neural networks, with probabilistic graphical models for representing hidden variables. Given a stream of actions, observations and rewards, a neural network is trained to predict future observations and rewards. Simultaneously, a graphical model of causal relations between observations occurring at different time steps is gradually constructed. The values of the variables in the graph are provided to the neural network as additional inputs along with the observations. The learned predictive model is then used by the agent to select actions based on their predicted future rewards.
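
The last two steps described in the abstract (passing hidden-variable values to the predictor alongside the observations, then choosing actions by predicted reward) might look roughly like the Python sketch below. This is not the speaker's implementation. The names `augment`, `select_action` and `toy_predict_reward` are hypothetical, the learned neural network is replaced by a hand-written stand-in, and the one-step greedy choice is an assumed simplification of whatever planning the actual method uses.

```python
from typing import Callable, List, Sequence


def augment(observation: Sequence[float], hidden_values: Sequence[float]) -> List[float]:
    """Concatenate the observation with the current hidden-variable values."""
    return list(observation) + list(hidden_values)


def select_action(
    predict_reward: Callable[[List[float], int], float],
    observation: Sequence[float],
    hidden_values: Sequence[float],
    actions: Sequence[int],
) -> int:
    """Return the action with the highest predicted reward (greedy, one step)."""
    features = augment(observation, hidden_values)
    return max(actions, key=lambda a: predict_reward(features, a))


def toy_predict_reward(features: List[float], action: int) -> float:
    """Hypothetical stand-in for a learned model.

    The last feature is a hidden flag (for example, "a key was picked up
    at an earlier time step"). Action 1 pays off only when the flag is set;
    action 0 always yields a small reward.
    """
    hidden_flag = features[-1]
    return hidden_flag if action == 1 else 0.1


if __name__ == "__main__":
    obs = [0.0, 0.5]
    # Same observation, different hidden state, different chosen action.
    print(select_action(toy_predict_reward, obs, [1.0], [0, 1]))
    print(select_action(toy_predict_reward, obs, [0.0], [0, 1]))
```

The point of the toy example is that the same observation can lead to different actions once a remembered hidden variable is part of the predictor's input, which is the situation partial observability creates.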

Meeting Link: https://rutgers.webex.com/rutgers/j.php?MTID=me9c98987e2f4c0b4c80115b08aa3e9c4

Meeting Number: 790 552 862

Password: itUnAgF37y5