
NPMP: DeepMind's Solution For Natural Motion In Humanoid RL Tasks

NPMP uses motion capture data during training as a basis for realistic movement. This project culminates in an impressive 2v2 AI soccer simulation.
Researchers at DeepMind have released a blog post outlining recent advancements in producing more natural humanoid locomotion in reinforcement learning tasks, centered on a technique called neural probabilistic motor primitives, or NPMP for short. The post is accompanied by a paper describing the whole process in much more detail.


Improving humanoid motion in reinforcement learning

DeepMind researchers have long worked on humanoid motion in reinforcement learning tasks, starting with a project on obstacle course traversal back in 2017. Although that project produced a model that could control a humanoid body well enough to complete these locomotion tasks, the movements were notably awkward and inhuman, with wildly flailing arms and a crab-like sideways sprint.

Issues like this arise from the fact that the models driving the humanoid bodies are, in fact, not human. They simply pursue the best solution they can find, even if it looks silly and is only a local optimum rather than a truly efficient gait. Human locomotion is a quirk of nature refined over millions of years of evolution, so an AI trying to imitate it needs some form of manual guidance.

Using NPMP to learn realistic movement

NPMP is a model that learns realistic motor control by first training on motion capture data. Once it has learned how a human moves from that data, it reuses this understanding when given instructions in a reinforcement learning environment, much like a pretraining regimen.

The NPMP model has two important parts to it:
  1. The encoder compresses future trajectory information from the motion capture data into a compact representation of the intended movement, called a motor intention.
  2. The low-level controller produces the next action the agent will take, given its current state and the motor intention.
The encoder is only used during the motion capture training phase, while the low-level controller is reused during the actual reinforcement learning process. This allows the model to leverage what it already knows about natural motion while learning what it needs to do in the new RL environment, as the sketch below illustrates.
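
To make this two-part structure concrete, here is a minimal sketch in PyTorch. This is not DeepMind's code: the network sizes, dimensions, variable names, and the simple imitation-plus-KL training step are all illustrative assumptions, standing in for the distillation of mocap-tracking experts described in the paper.

```python
import torch
import torch.nn as nn

# Illustrative dimensions, not the paper's actual values.
STATE_DIM, ACTION_DIM, LATENT_DIM, FUTURE_DIM = 64, 21, 16, 128

class Encoder(nn.Module):
    """Compresses a window of future mocap frames into a latent motor intention."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FUTURE_DIM, 256), nn.ReLU())
        self.mean = nn.Linear(256, LATENT_DIM)
        self.log_std = nn.Linear(256, LATENT_DIM)

    def forward(self, future_traj):
        h = self.net(future_traj)
        return torch.distributions.Normal(self.mean(h), self.log_std(h).exp())

class LowLevelController(nn.Module):
    """Maps the agent's current state plus a motor intention z to an action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM),
        )

    def forward(self, state, z):
        return self.net(torch.cat([state, z], dim=-1))

# Mocap-training step (illustrative): imitate expert actions that track the
# reference motion, with a KL term keeping the latent space well-behaved.
encoder, controller = Encoder(), LowLevelController()
opt = torch.optim.Adam([*encoder.parameters(), *controller.parameters()], lr=3e-4)

state = torch.randn(32, STATE_DIM)          # stand-ins for a real data batch
future_traj = torch.randn(32, FUTURE_DIM)
expert_action = torch.randn(32, ACTION_DIM)

dist = encoder(future_traj)
z = dist.rsample()                           # reparameterized latent sample
prior = torch.distributions.Normal(torch.zeros_like(z), torch.ones_like(z))
loss = ((controller(state, z) - expert_action) ** 2).mean() \
       + 1e-3 * torch.distributions.kl_divergence(dist, prior).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

After this phase, the encoder and the mocap data are set aside; only the low-level controller's learned motor skills carry over to the downstream task.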

What NPMP was able to help create

Using these new developments, the researchers tasked the model with learning to play soccer (aka football) in a realistic fashion. Though the rules of the simulation are simplified, the environment pits two teams of two AI agents against each other to score goals.
With NPMP motion capture pretraining as the starting point, the AI soccer players then spent the equivalent of a human lifetime in simulated soccer purgatory. Eventually, the players not only learned how to kick the ball toward the other team's goal, but also how to coordinate with their teammates, anticipate the consequences of their actions, and pursue long-term outcomes.
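
For intuition about that reuse step, here is a hedged continuation of the earlier sketch (it reuses the `controller`, `STATE_DIM`, and `LATENT_DIM` defined there). The pretrained low-level controller is frozen, and a new high-level task policy, trained on the task reward, learns to output motor intentions instead of raw joint torques. The interface shown is an assumption for illustration, not the paper's actual training setup.

```python
import torch
import torch.nn as nn

# Continues the earlier sketch: `controller`, STATE_DIM, and LATENT_DIM
# are assumed to come from the mocap-training snippet above.
high_level_policy = nn.Sequential(
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, LATENT_DIM),
)

for p in controller.parameters():
    p.requires_grad_(False)        # freeze the pretrained motor skills

state = torch.randn(1, STATE_DIM)  # stand-in for an observation from the env
z = high_level_policy(state)       # the task policy picks a motor intention
action = controller(state, z)      # the frozen controller turns it into an action
# `action` would be sent to the physics simulation; only the high-level
# policy is updated from the task (soccer) reward.
```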

More details about the soccer project, including cool videos of the AI playing soccer, can be found in the paper here: https://www.science.org/doi/10.1126/scirobotics.abo0235
NPMP has also helped develop agents that carry and throw objects and navigate mazes, all from a first-person visual perspective. It has additionally been applied to real-world robotics, in attempts to produce efficient movement modeled on the energy-conserving gaits real animals use naturally.

