CICERO: AI In Diplomacy and Relations

Created on January 13|Last edited on January 13
Released in November 2022, Meta AI's CICERO demonstrates "human-level" performance in the strategy game Diplomacy.


What is it doing?

Diplomacy isn't as simple as chess (not that chess is easy to begin with). It isn't a game built on a constrained set of possible moves against a limited number of opponents. It involves strategic decision making, much like chess and Go, but it also involves natural language: each player has to negotiate and maintain relations with the others in order to succeed. This intersection presents a unique problem. How did Meta AI model both the decision-making aspect and the human-interaction aspect?
The core of CICERO is its Large Language Model (LLM) and its Reinforcement Learning (RL) value model. In a match of Diplomacy, the LLM is in charge of processing the board history along with the dialogue exchanged so far. It then makes an initial guess of what the other players will do. This guess is refined by an iterative planning algorithm, which does a few things. It ensures that CICERO acts rationally and in the best interests of its alliances. It also plays a meta-game: according to Meta AI's CICERO blog, it doesn't just predict other players' actions, it also predicts what other players think CICERO's next move will be. The final action or message CICERO chooses is refined using all of these factors, and the refined prediction is penalized if it strays too far from the original prediction. This feat of diplomacy is nothing short of remarkable, but there's always room to improve and more complex scenarios to consider.
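To make the "penalized if it strays too far from the original prediction" idea concrete, here is a minimal sketch of one standard way to express such a penalty: a KL-regularized policy. Maximizing expected value minus `lam` times the KL divergence from an anchor policy has a known closed form, where each action's probability is proportional to `anchor(a) * exp(value(a) / lam)`. Treat this as an illustration rather than CICERO's actual planner. The function names, the action names, the numbers, and the parameter `lam` are all hypothetical, and the assumption that the penalty is a KL term is mine, not something stated above.

```python
import math


def kl_regularized_policy(anchor, values, lam):
    """Return the policy maximizing E[value] - lam * KL(policy || anchor).

    anchor: dict action -> probability (the "original prediction")
    values: dict action -> estimated value of that action
    lam:    penalty strength; larger means stay closer to the anchor
    """
    # Work in log space so a small lam does not overflow exp().
    logits = {
        a: math.log(p) + values[a] / lam
        for a, p in anchor.items()
        if p > 0
    }
    top = max(logits.values())
    weights = {a: math.exp(l - top) for a, l in logits.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same actions."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in p if p[a] > 0)


# Hypothetical numbers: the anchor thinks betrayal is unlikely,
# while the value estimates say betrayal pays best.
anchor = {"support_ally": 0.6, "hold": 0.3, "attack_ally": 0.1}
values = {"support_ally": 1.0, "hold": 0.5, "attack_ally": 2.0}

cautious = kl_regularized_policy(anchor, values, lam=1000.0)
greedy = kl_regularized_policy(anchor, values, lam=0.01)

for name, policy in [("cautious", cautious), ("greedy", greedy)]:
    best = max(policy, key=policy.get)
    drift = kl_divergence(policy, anchor)
    print(f"{name}: most likely = {best}, KL from anchor = {drift:.3f}")
```

With a large `lam` the refined policy stays almost identical to the anchor, and with a small `lam` it collapses onto the highest-value action. The penalty strength is what trades off playing like the original prediction against playing purely for value.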
CICERO's occasional lapses in dialogue judgment (saying one thing and doing another) are one example of where it falls short. More generally, it is interesting to imagine how a model of this scale would perform in a larger diplomatic role, outside the boundaries of a game. There would be a myriad of new factors to consider, such as cost and time.
Nonetheless, this work demonstrates a crucial step towards more generalized, multi-modal AI capable of not just perceiving and understanding, but also acting and planning. Recent work in multi-modal AI has generated lots of excitement about early steps toward general intelligence, and I can't wait to see what's in store.