Researchers at Toyota Research Institute (TRI) have used generative AI to teach robots individual tasks, such as the steps involved in making breakfast. Instead of relying on traditional programming, which requires hours of hand-coding, debugging, and error fixing, the researchers gave the robots a sense of touch and connected them to an AI model. With touch, the robots can “feel” what they are doing, which gives them richer information than sight alone and makes complex manipulation tasks easier to carry out.
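As a rough illustration of why touch helps, the sketch below fuses a visual feature vector with a tactile one before handing the result to a policy. This is hypothetical and not TRI's actual architecture; every class name, dimension, and encoder here is an assumption made for the example.

```python
# Illustrative multimodal fusion sketch (hypothetical, not TRI's implementation):
# combining visual and tactile features gives a policy richer state information
# than vision alone.
import torch
import torch.nn as nn

class FusedEncoder(nn.Module):
    """Encode camera and tactile readings into a single observation vector."""
    def __init__(self, vision_dim=64, touch_dim=16, out_dim=128):
        super().__init__()
        self.vision = nn.Linear(vision_dim, 96)  # stand-in for a real image encoder
        self.touch = nn.Linear(touch_dim, 32)    # stand-in for a tactile encoder
        self.head = nn.Linear(96 + 32, out_dim)

    def forward(self, image_feat, touch_feat):
        # Concatenate the two modalities so the downstream policy can "feel"
        # contact in addition to seeing the scene.
        fused = torch.cat([self.vision(image_feat), self.touch(touch_feat)], dim=-1)
        return self.head(torch.relu(fused))
```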
The process involves a “teacher” demonstrating a set of skills, after which the model learns from those demonstrations in the background over a few hours. This approach allows the robots to pick up new behaviors quickly. The researchers are working toward “Large Behavior Models” (LBMs) for robots, which would learn by observing and then generalize that knowledge to perform new skills they were never explicitly taught.
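To make the demonstration-driven workflow concrete, here is a minimal behavior-cloning sketch: a policy is fit to recorded (observation, action) pairs from a teacher. This is a generic illustration under assumed names and dimensions, not TRI's actual training pipeline.

```python
# Minimal behavior-cloning sketch (illustrative only, not TRI's pipeline).
# Demonstrations are (observation, action) pairs; observations could be fused
# vision + touch features as in the encoder sketch above. All dims are assumed.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

OBS_DIM = 128  # assumed size of the fused observation vector
ACT_DIM = 7    # assumed size of the robot action (e.g. joint command)

# A small MLP policy mapping an observation to an action.
policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACT_DIM),
)

def train_from_demonstrations(observations, actions, epochs=10):
    """Fit the policy to teacher demonstrations by regressing their actions."""
    data = DataLoader(TensorDataset(observations, actions), batch_size=64, shuffle=True)
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for obs, act in data:
            opt.zero_grad()
            loss = loss_fn(policy(obs), act)  # imitate the demonstrated action
            loss.backward()
            opt.step()

# Example call with random placeholder data standing in for recorded demos.
demo_obs = torch.randn(1024, OBS_DIM)
demo_act = torch.randn(1024, ACT_DIM)
train_from_demonstrations(demo_obs, demo_act)
```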
So far, the robots have been trained on more than 60 challenging skills, such as pouring liquids, using tools, and manipulating deformable objects. The researchers aim to expand that number to 1,000 by the end of 2024.
Similar research is under way at companies such as Google and Tesla. Google’s robots, powered by models like the Robotic Transformer 2 (RT-2), likewise draw on experience to infer how to perform tasks. The goal is AI-trained robots that can carry out tasks given only minimal instruction, much like giving directions to a person. Challenges remain, however, notably the need for large amounts of training data, as The New York Times highlighted in its coverage of Google’s research.
Sources:
– Wes Davis, The Verge (source article)
– Toyota Research Institute