Meta researchers have created an artificial intelligence called CICERO whose command of human language allowed it to achieve human-level performance in the strategy game Diplomacy. For those not familiar with it, Diplomacy is a seven-player game that blends elements of Risk, Poker, and the TV show Survivor. It is available as either a board game or online. CICERO played online, where it scored more than twice the average of the human players and ranked in the top ten percent of participants who played more than one game. The Meta researchers have published both a scientific paper on CICERO and a couple of videos about it.
Language Model Training
“To build a controllable dialogue model, we started with a 2.7 billion parameter BART-like language model pre-trained on text from the internet and fine-tuned on over 40,000 human games on webDiplomacy.net. We developed techniques to automatically annotate messages in the training data with corresponding planned moves in the game so that at inference time we can control dialogue generation to discuss specific desired actions for the agent and its conversation partners. For example, if our agent is playing as France, conditioning the dialogue model on a plan involving England supporting France into Burgundy might yield a message to England like, ‘Hi England! Are you willing to support me into Burgundy this turn?’ Controlling generation in this manner allows Cicero to ground its conversations in a set of plans that it develops and revises over time to better negotiate. This helps the agent coordinate with and persuade other players more effectively.”
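To make the idea of conditioning on planned moves more concrete, here is a minimal Python sketch of how a plan and the dialogue history might be flattened into one input string for a dialogue model. Everything in it is an assumption for illustration: the `Intent` class, the `build_conditioning_text` function, the bracketed tag layout, and the order strings are hypothetical and are not taken from CICERO's code or paper.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    """Planned moves for the speaker and the recipient (hypothetical structure)."""
    speaker: str
    recipient: str
    speaker_orders: list[str]
    recipient_orders: list[str]


def build_conditioning_text(intent: Intent, history: list[str]) -> str:
    """Flatten the dialogue history and the plan into one input string.

    The tag layout below is invented for illustration only.
    """
    lines = [f"[history] {msg}" for msg in history]
    lines.append(f"[plan:{intent.speaker}] " + "; ".join(intent.speaker_orders))
    lines.append(f"[plan:{intent.recipient}] " + "; ".join(intent.recipient_orders))
    # The model would be asked to continue the text after this final line.
    lines.append(f"[message] {intent.speaker} -> {intent.recipient}:")
    return "\n".join(lines)


# France plans to move into Burgundy and wants England to support the move.
intent = Intent(
    speaker="FRANCE",
    recipient="ENGLAND",
    speaker_orders=["A PAR - BUR"],
    recipient_orders=["A BEL S A PAR - BUR"],
)
prompt = build_conditioning_text(intent, history=["ENGLAND -> FRANCE: Hello France!"])
print(prompt)
```

A model trained on messages annotated in this way could, in principle, learn to write text that matches the plan it is given, which is the kind of control the quoted passage describes.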
“We further improve dialogue quality using several filtering mechanisms – such as classifiers trained to distinguish between human and model-generated text – that ensure that our dialogue is sensible, consistent with the current game state and previous messages, and strategically sound.”
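The filtering step can be pictured as a chain of checks that each candidate message must pass before it is sent. The Python sketch below assumes simple hand-written rules as stand-ins for the trained classifiers the researchers mention; the function names, the `context` dictionary, and the rules themselves are hypothetical and only illustrate the general pattern.

```python
from typing import Callable, Optional

# A filter receives a candidate message and the game context,
# and returns True if the message may be kept.
Filter = Callable[[str, dict], bool]


def looks_sensible(message: str, context: dict) -> bool:
    """Stand-in for a human-vs-model text classifier: a plain length check."""
    return 3 <= len(message.split()) <= 60


def consistent_with_plan(message: str, context: dict) -> bool:
    """Stand-in for a consistency check: the planned target must be named."""
    return context["target"].lower() in message.lower()


def not_repeated(message: str, context: dict) -> bool:
    """Reject messages that were already sent earlier in the conversation."""
    return message not in context["previous_messages"]


def select_message(
    candidates: list[str], filters: list[Filter], context: dict
) -> Optional[str]:
    """Return the first candidate that passes every filter, or None."""
    for message in candidates:
        if all(check(message, context) for check in filters):
            return message
    return None


FILTERS = [looks_sensible, consistent_with_plan, not_repeated]
context = {"target": "Burgundy", "previous_messages": []}
candidates = [
    "Hi!",
    "Are you willing to support me into Munich this turn?",
    "Hi England! Are you willing to support me into Burgundy this turn?",
]
chosen = select_message(candidates, FILTERS, context)
print(chosen)
```

In this toy run the first candidate is too short and the second names the wrong province, so only the third survives. Returning `None` when nothing passes lets the caller decide whether to generate new candidates or stay silent.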
CICERO combines its natural language skills with strategic reasoning to win the game, but its creators say the same approach could also power AI assistants that help people learn new skills, or non-player characters (NPCs) in games that converse like humans and help players with winning strategies. CICERO’s creators have made it open source, and the code is available on GitHub.
The researchers write: “And by open sourcing CICERO’s code, we have high hopes for others to build responsibly upon our work, for use inside and outside of the metaverse.”