Generalization to New Tasks with Imitation Learning

(T) Babies learn by exploring their environments, manipulating any object that falls into their hands, observing the behaviors of adults, and interacting with them. Animals are good observers too. My dog always knows when I am about to leave the house. Animals can even laugh at a surprising situation.

In machine learning, learning from observations is, simply put, called imitation learning. Imitation learning uses expert demonstrations, often supplied by human experts themselves, as the training data for the system. There are two main categories of imitation learning algorithms: behavioral cloning, which learns the policy from the expert demonstrations in a supervised way, and inverse reinforcement learning, which learns the reward function from the expert demonstrations and then finds the optimal policy that maximizes that reward. Imitation learning is widely used in robotics and autonomous vehicles. The first category is sketched in code below.
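To make behavioral cloning concrete, here is a minimal sketch in PyTorch (my own illustration, not from any paper): the expert demonstrations are treated as a supervised dataset of (state, action) pairs, and the policy is just a network trained to regress the expert's action. All dimensions and the random "demonstration" data are placeholders.

```python
import torch
import torch.nn as nn

# Behavioral cloning: fit a policy pi(a|s) to expert (state, action)
# pairs with plain supervised learning. Dimensions are illustrative.
STATE_DIM, ACTION_DIM = 16, 4

policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, ACTION_DIM),  # predicted continuous action
)

# Stand-in for a buffer of expert demonstrations.
expert_states = torch.randn(1024, STATE_DIM)
expert_actions = torch.randn(1024, ACTION_DIM)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # regression loss against the expert's action

for epoch in range(10):
    pred_actions = policy(expert_states)
    loss = loss_fn(pred_actions, expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Inverse reinforcement learning would instead use those same demonstrations to infer a reward function, and only then train a policy against it.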

A team of Google Robotics researchers showed that “simple imitation learning approaches can be scaled in a way that enables zero-shot generalization to new tasks for vision-based robotic manipulation systems”. Although the training of that robot is not based on any new techniques, the video is quite convincing. What is interesting to note is that the “pre-trained embeddings of natural language” generalize much better than “the videos of humans performing the task”, e.g. telling the robot to “place sponge in tray” rather than showing it a video of how to place the sponge in a tray, if I understood it correctly:

“We study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks, a long-standing challenge in robot learning. We approach the challenge from an imitation learning perspective, aiming to study how scaling and broadening the data collected can facilitate such generalization. To that end, we develop an interactive and flexible imitation learning system that can learn from both demonstrations and interventions and can be conditioned on different forms of information that convey the task, including pre-trained embeddings of natural language or videos of humans performing the task. When scaling data collection on a real robot to more than 100 distinct tasks, we find that this system can perform 24 unseen manipulation tasks with an average success rate of 44%, without any robot demonstrations for those tasks.”
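The conditioning idea in that quote can be sketched as follows. This is my own illustration, not the actual BC-Z architecture: the task is encoded once into a fixed embedding (from a pre-trained language model, or from a video of a human), and that embedding is concatenated with the robot's visual features at every control step. All names and dimensions are made up.

```python
import torch
import torch.nn as nn

# Illustrative task-conditioned policy: visual features and a frozen,
# pre-trained task embedding (from language or a human video) are
# concatenated and mapped to an action. Dimensions are placeholders.
IMG_FEAT_DIM, TASK_EMB_DIM, ACTION_DIM = 512, 768, 7

class ConditionedPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(IMG_FEAT_DIM + TASK_EMB_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, ACTION_DIM),
        )

    def forward(self, img_feat, task_emb):
        # The same network serves every task; only task_emb changes.
        # Zero-shot transfer amounts to feeding in the embedding of an
        # unseen task that lies close to embeddings seen in training.
        return self.head(torch.cat([img_feat, task_emb], dim=-1))

policy = ConditionedPolicy()
img_feat = torch.randn(1, IMG_FEAT_DIM)  # e.g. CNN features of the camera image
task_emb = torch.randn(1, TASK_EMB_DIM)  # e.g. embedding of "place sponge in tray"
action = policy(img_feat, task_emb)      # e.g. a 7-DoF arm command
```

Swapping language conditioning for video conditioning would only change how task_emb is produced; the policy itself stays the same, which is presumably what lets the system accept either form of task specification.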

References:

Note 1: Two good tutorials on imitation learning: an ICML 2018 tutorial and a lecture from UC Berkeley CS Professor Sergey Levine.

Note 2: The picture above is the robot BC-Z.

Copyright © 2005-2022 by Serge-Paul Carrasco. All rights reserved.
Contact Us: asvinsider at gmail dot com