Grass lawns and hiking trails are no problem for this robot, which learned to walk on them on the fly thanks to a machine learning algorithm
26 August 2022
A robot dog can learn to walk on unfamiliar and hard-to-master terrain, such as grass, bark and hiking trails, in just 20 minutes, thanks to a machine learning algorithm.
Most autonomous robots have to be carefully programmed by humans or extensively tested in simulated scenarios before they can perform real-world tasks, such as walking up a rocky hill or a slippery slope – and when they encounter unfamiliar environments, they tend to struggle.
Now, Sergey Levine at the University of California, Berkeley, and his colleagues have demonstrated that a robot using a kind of machine learning called deep reinforcement learning can work out how to walk in about 20 minutes in several different environments, such as a grass lawn, a layer of bark, a memory foam mattress and a hiking trail.
The robot uses an algorithm called Q-learning, which doesn’t require a working model of the target terrain. Such machine learning algorithms are usually used in simulations. “We don’t need to understand how the physics of an environment actually works, we just put the robot into an environment and turn it on,” says Levine.
Instead of relying on such a model, the robot receives a reward for each action it performs, depending on how well that action served predefined goals. It repeats this process continuously, comparing each outcome with its previous attempts, until it learns to walk.
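To make the reward-driven loop concrete, here is a minimal sketch of textbook tabular Q-learning on a toy problem. It is not the team's code: the real robot uses deep reinforcement learning with neural networks and continuous motor commands, whereas this example assumes a hypothetical six-cell corridor in which an agent earns a reward of 1 for reaching the last cell. The function names, the corridor and all parameter values are illustrative assumptions.

```python
import random

def train_q_table(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (toy example).

    The agent starts in cell 0 and is rewarded only on reaching the last cell.
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 steps left, action 1 steps right
    q = [[0.0, 0.0] for _ in range(n_states)]  # one value per (state, action)
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                # Occasionally try a random action to keep exploring.
                action = rng.randrange(2)
            else:
                # Otherwise take the best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice(
                    [i for i, v in enumerate(q[state]) if v == best])

            # No model of the environment is used: the agent simply acts
            # and observes where it ends up and what reward it gets.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: nudge the estimate toward the observed
            # reward plus the discounted best estimate for the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Return the highest-valued action index for each state."""
    return [max(range(2), key=lambda i: row[i]) for row in q]

if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table))
```

In this sketch the agent is never told how the corridor works. It only sees the result of each step, which mirrors the "just put the robot into an environment and turn it on" idea at a much smaller scale.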
“In some sense, it’s very similar to how people learn,” says team member Ilya Kostrikov, also at the University of California, Berkeley. “Interact with some environment, receive some utility and basically just think about your past experience and try to understand what could have been improved.”
Although the robot can learn to walk on each new surface it encounters, Levine says the team would need to fine-tune the model’s reward system if the robot is to learn other skills.
Making deep reinforcement learning work in the real world is hard, says Chris Watkins at Royal Holloway, University of London, because of the number of different variables and data that have to interact at the same time.
“I think it’s very impressive,” says Watkins. “I’m honestly a little bit surprised that you can use something as simple as Q-learning to learn skills like walking on different surfaces with so little experience and so quickly in real time.”