Northwestern University engineers have developed a new artificial intelligence (AI) algorithm for smart robotics. By helping robots learn complex skills quickly and reliably, the new method could significantly improve the practicality – and safety – of robots for a variety of applications, including self-driving cars, delivery drones, household assistants and automation.
Called Maximum Diffusion Reinforcement Learning (MaxDiff RL), the algorithm’s success lies in its ability to encourage robots to explore their environment as randomly as possible in order to gain a diverse set of experiences. This “designed randomness” improves the quality of data that robots collect about their own environment. And, by using higher quality data, the simulated robots demonstrated faster and more efficient learning, improving their overall reliability and performance.
When tested against other AI platforms, simulated robots using Northwestern’s new algorithm consistently outperformed state-of-the-art models. In fact, the robots learned new tasks and then performed them successfully on the very first attempt, in stark contrast to current AI models, which learn more slowly through trial and error.
The research was published today in the journal Nature Machine Intelligence.
“Other AI frameworks can be somewhat unreliable,” said Thomas Berrueta of Northwestern, who led the study. “Sometimes they will completely succeed at a task, but other times they will completely fail. With our framework, as long as the robot is capable of solving the task at all, then every time you turn it on you can expect it to do exactly what it was asked to do. This makes it easier to interpret the successes and failures of robots, which is crucial in a world increasingly dependent on AI.”
Berrueta is a Presidential Fellow at Northwestern and a Ph.D. candidate in mechanical engineering at the McCormick School of Engineering. Robotics expert Todd Murphey, a professor of mechanical engineering at McCormick and Berrueta’s advisor, is the paper’s senior author. Berrueta and Murphey co-authored the paper with Allison Pinosky, also a Ph.D. candidate in Murphey’s lab.
The disembodied disconnect
To train machine learning algorithms, researchers and developers use massive amounts of data, which humans carefully filter and curate. The AI learns from this training data through trial and error until it achieves optimal results. While this process works well for disembodied systems like ChatGPT and Google Gemini (formerly Bard), it does not work for embodied AI systems like robots, which must collect data on their own, without the luxury of human curators.
“Traditional algorithms are not compatible with robotics in two distinct ways,” Murphey said. “First, disembodied systems can take advantage of a world where physical laws do not apply. Second, individual failures have no consequences. For computer applications, the only thing that matters is that they succeed most of the time. In robotics, a single failure can be catastrophic.”
To address this disconnect, Berrueta, Murphey and Pinosky sought to develop a new algorithm that ensures robots collect high-quality data on the fly. In essence, MaxDiff RL directs robots to move more randomly in order to collect comprehensive, diverse data about their environment. By learning through self-organized, randomized experiences, robots acquire the skills needed to perform useful tasks.
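The intuition can be illustrated with a toy sketch. This is not the authors’ MaxDiff RL implementation (the paper’s method involves maximizing the entropy of the robot’s trajectories), and the names `explore`, `greedy_policy` and `diffuse_policy` are hypothetical; the sketch only shows why diffusion-like random motion yields more diverse experience data than a fixed deterministic policy:

```python
# Toy sketch (NOT the MaxDiff RL algorithm): compare how much of a grid
# world is covered by a deterministic walk vs. a random, diffusion-like walk.
import random
from collections import Counter

def explore(policy, steps=2000, size=10, seed=0):
    """Walk on a size x size grid; return visit counts per cell."""
    rng = random.Random(seed)
    x = y = size // 2                      # start in the middle
    visits = Counter()
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(steps):
        dx, dy = policy(rng, moves)
        x = min(max(x + dx, 0), size - 1)  # clamp to the grid walls
        y = min(max(y + dy, 0), size - 1)
        visits[(x, y)] += 1
    return visits

def greedy_policy(rng, moves):
    return moves[0]                        # always move right: stuck at a wall

def diffuse_policy(rng, moves):
    return rng.choice(moves)               # uniform random, diffusion-like

greedy_visits = explore(greedy_policy)
diffuse_visits = explore(diffuse_policy)
print(len(greedy_visits), len(diffuse_visits))  # distinct states visited
```

The deterministic walk sees only a handful of cells before pinning itself against a wall, while the random walk spreads across most of the grid, giving a learner a far richer picture of its environment.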
Succeeding on the first try
To test the new algorithm, the researchers compared it to current state-of-the-art models. Using computer simulations, the researchers asked simulated robots to perform a series of standard tasks. Overall, robots using MaxDiff RL learned faster than other models. They also performed tasks correctly much more consistently and reliably than others.
Perhaps even more impressive: robots using the MaxDiff RL method often managed to perform a task correctly on a single attempt, even when they started with no prior knowledge of the task.
“Our robots were faster and more agile, able to effectively generalize what they had learned and apply it to new situations,” Berrueta said. “For real-world applications where robots cannot afford infinite time for trial and error, this represents a huge advantage.”
Since MaxDiff RL is a general algorithm, it can be used for various applications. The researchers hope this will resolve fundamental issues holding back the field, paving the way for reliable decision-making in intelligent robotics.
“This doesn’t necessarily have to be used just for robotic vehicles that move,” Pinosky said. “It could also be used for stationary robots, such as a robotic arm in a kitchen that learns to load the dishwasher. As tasks and physical environments become more complex, the role of embodiment becomes even more crucial to consider during the learning process. This is an important step toward real systems that can perform more complicated and interesting tasks.”
Source: Northwestern University
Originally published in The European Times.