Data-Efficient Reinforcement and Transfer Learning in Robotics
Time: Fri 2020-12-04 13.00
Location: U1, Brinellvägen 28A, Stockholm (English)
Subject area: Computer Science
Doctoral student: Xi Chen, Robotik, perception och lärande, RPL
Opponent: Ville Kyrki
Supervisor: Patric Jensfelt, Signaler, sensorer och system, Numerisk analys och datalogi, NADA, Robotik, perception och lärande, RPL
In the past few years, deep reinforcement learning (RL) has shown great potential for learning action-selection policies that solve a variety of tasks. Despite its impressive success in games, several challenges remain, such as designing appropriate reward functions, collecting large amounts of interactive data, and dealing with unseen cases, which make it difficult to apply RL algorithms to real-world robotics tasks. Data-efficient learning and rapid adaptation to novel cases are therefore essential for an RL agent to solve real-world problems.
In this thesis, we discuss algorithms that address these challenges by reusing past experience gained while learning other tasks to improve the efficiency of learning new tasks. Instead of learning directly from the target task, which is complicated and sometimes unavailable during training, we propose first learning from relevant tasks that contain valuable information about the target environment, and then reusing the obtained solutions to solve the target task. We follow two approaches to achieve knowledge sharing between tasks. In the first approach, we model the problem as a transfer learning problem and learn to minimize the distance between the representations obtained from the training data and the target data, so that the learned solution can be applied to the target task using a small amount of data from the target environment. In the second approach, we formulate it as a meta-learning problem and obtain a model that is explicitly trained for rapid adaptation using a small amount of data. At test time, when facing a new task, we can learn quickly on top of the trained model in a few iterations.
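To make the second approach concrete, the following is a minimal sketch of meta-learning for rapid adaptation. It is not the method developed in the thesis: it assumes a Reptile-style first-order meta-update as a stand-in, a one-parameter model, and a hypothetical family of toy regression tasks (fit y = a * x for a task-specific slope a). All function names and hyperparameters are illustrative assumptions.

```python
import random


def make_task(rng):
    """Sample a hypothetical task: a slope a for the target y = a * x."""
    return rng.uniform(2.0, 4.0)


def sample_batch(slope, rng, n=10):
    """Draw a small data set (x, y) from one task."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x) for x in xs]


def loss(w, batch):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in batch) / len(batch)


def adapt(w, batch, steps=3, lr=0.1):
    """Adapt to one task with a few gradient steps on a small batch."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad
    return w


def meta_train(rng, iterations=2000, meta_lr=0.1):
    """Reptile-style meta-training (an assumed stand-in technique):
    move the shared initialization toward each task's adapted weight."""
    w = 0.0
    for _ in range(iterations):
        batch = sample_batch(make_task(rng), rng)
        w_task = adapt(w, batch)
        w += meta_lr * (w_task - w)
    return w


rng = random.Random(0)
w_meta = meta_train(rng)

# At test time: a new task, the same small adaptation budget,
# starting either from the meta-learned weight or from scratch.
new_batch = sample_batch(3.5, rng)
loss_meta = loss(adapt(w_meta, new_batch), new_batch)
loss_scratch = loss(adapt(0.0, new_batch), new_batch)
print(f"after 3 steps: meta-init loss={loss_meta:.4f}, scratch loss={loss_scratch:.4f}")
```

In this toy setting the meta-learned initialization ends up near the center of the task family, so the same three gradient steps reach a lower loss on the new task than they do from an uninformed starting point.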
We demonstrate the effectiveness of the proposed frameworks by evaluating the methods on a number of real and simulated robotic tasks, including robot navigation, motion control, and manipulation. We show how these methods can be applied to challenging tasks that involve high-dimensional state/action spaces, limited data, sparse rewards, and a need for diverse skills.