
Guiding Robots With Natural Language

A new paper titled "Correcting Robot Plans with Natural Language Feedback," written by Pratyusha Sharma et al., explores the utility of natural language processing in helping robots perform tasks more effectively.
Created on April 12|Last edited on April 14
In the paper "Correcting Robot Plans with Natural Language Feedback," Pratyusha Sharma et al. explore how natural language processing can guide robots to complete tasks more effectively.
Sometimes, a robot needs additional guidance to complete a task. This guidance can be given in a number of ways, such as manually controlling the robot with a joystick. However, one of the most broadly usable methods might be to control it with spoken commands.
This paper explores how natural language can be used to give corrections to robots, how the robot can interpret those commands, and how it weighs them against its other programmed constraints.

Why use natural language for controlling robots?

Though joysticks and other peripherals control many robots today, they require some amount of training and knowledge to use. Being able to simply dictate instructions broadly increases the number of people who could instruct these robots. It also lets the robot decide how to accomplish a certain task, instead of the operator controlling each individual movement.
After all, imagine a robot that needs to climb a flight of stairs. It's considerably easier to just ask it to meet you on the second floor than to micromanage each individual step. The same goes for a lot of seemingly simple commands.

How a robot interprets these commands

The examples in the paper show a robot arm moving around a space littered with objects. Its simple objective is to move from point A to point B, with a few different layers of constraints guiding its way to the objective.


With an NLP model as one of these constraint layers, additional instructions can be passed to the robot so that it arrives at the objective according to our instructions. Because the instructions are given in language, our commands can be a lot more freeform, as long as the model is able to interpret the words we use. This includes using synonyms and adjectives, describing locations, and more.
The example robot then gauges all the different factors constraining it, such as object avoidance and the instructions we give through speech, to generate a cost map which it uses to determine the best path to the objective.
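To make the cost map idea concrete, here is a minimal sketch in Python. It is not the paper's implementation: the grid world, the hand-written `language_cost` function standing in for an NLP model, and the helper names `build_cost_map` and `cheapest_path` are all assumptions made for illustration. It sums an obstacle penalty and a language-derived cost into one map, then searches for the cheapest path with Dijkstra's algorithm.

```python
import heapq


def build_cost_map(rows, cols, obstacles, language_cost, obstacle_penalty=100.0):
    """Combine a base step cost, an obstacle penalty, and a language cost per cell."""
    cost = [[1.0 for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if (r, c) in obstacles:
                cost[r][c] += obstacle_penalty
            # Hypothetical stand-in for the cost an NLP model would produce.
            cost[r][c] += language_cost(r, c)
    return cost


def cheapest_path(cost, start, goal):
    """Dijkstra on a 4-connected grid; entering a cell costs cost[r][c]."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if dist > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get(nxt, float("inf")):
                    best[nxt] = new_dist
                    parent[nxt] = cell
                    heapq.heappush(frontier, (new_dist, nxt))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


# Toy scene: a can sits at (2, 2); row indices grow downward.
can = (2, 2)
# Assumed encoding of "go below the can": when crossing the can's column,
# any cell that is not below the can gets an extra cost.
below_the_can = lambda r, c: 5.0 if (c == can[1] and r <= can[0]) else 0.0

cost_map = build_cost_map(5, 5, obstacles={can}, language_cost=below_the_can)
path = cheapest_path(cost_map, start=(2, 0), goal=(2, 4))
print(path)
```

In this toy scene, going around the can above or below would cost the same on obstacle avoidance alone. The language term breaks that tie, so the cheapest path crosses the can's column on the side the instruction asked for.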


However, the approach has limitations when an instruction is ambiguous, such as telling the robot to "go below the spam can" when there are two spam cans.


Tags: ML News