Facebook AI Introduces ‘ReBeL’: An Algorithm That Generalizes The Paradigm Of Self-Play Reinforcement Learning And Search To Imperfect-Information…
Posted: December 14, 2020 at 1:55 am
Most AI systems excel at generating specific responses to a particular problem, and today AI can outperform humans in various fields. But for an AI to handle any task it is presented with, it needs to generalize, learn, and understand new situations as they occur without supplementary guidance. While humans readily recognize chess and poker as both being games in the broadest sense, teaching a single AI to play both is challenging.
Perfect-Information games versus Imperfect-Information games
AI systems have been relatively successful at mastering perfect-information games like chess, where nothing is hidden from either player. Each player can see the entire board and all possible moves at all times. Bots like AlphaZero can even combine reinforcement learning with search (RL+Search) to teach themselves to master these games from scratch.
Unlike perfect-information games and single-agent settings, imperfect-information games pose a critical challenge: an action's value may depend on the probability with which it is chosen. Therefore, the team states, it is crucial to account for the probability that different sequences of actions occurred, not just the sequences of actions themselves.
ReBeL
Facebook has recently introduced Recursive Belief-based Learning (ReBeL), a general RL+Search algorithm that works in all two-player zero-sum games, including imperfect-information games. ReBeL builds on the RL+Search algorithms that have proved successful in perfect-information games. However, unlike past AIs, ReBeL makes decisions by factoring in the probability distribution over the different beliefs each player might have about the game's current state, which is called a public belief state (PBS). For example, ReBeL can assess the chances that its poker opponent thinks it holds a particular hand, such as a pair of aces.
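The idea of keeping a probability distribution over what a player might hold can be illustrated with a simple Bayes update. The sketch below is only a toy illustration and is not taken from the ReBeL code: the function name update_belief, the two hand labels, and all the numbers are hypothetical. It shows how a belief over one player's hidden hand shifts after a public action, assuming the player's policy is known.

```python
from typing import Dict


def update_belief(prior: Dict[str, float],
                  policy: Dict[str, Dict[str, float]],
                  action: str) -> Dict[str, float]:
    """Bayes update of a belief over hidden hands after a public action.

    prior[h]     : probability that the player holds hand h
    policy[h][a] : probability that the player takes action a when holding h
    """
    # Weight each hand by how likely it was to produce the observed action.
    unnormalized = {h: prior[h] * policy[h].get(action, 0.0) for h in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observed action has zero probability under the policy")
    # Renormalize so the posterior sums to 1.
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical toy numbers: two possible hands, equally likely at the start.
prior = {"strong": 0.5, "weak": 0.5}
policy = {
    "strong": {"bet": 0.9, "check": 0.1},
    "weak": {"bet": 0.3, "check": 0.7},
}
posterior = update_belief(prior, policy, "bet")
print(posterior)
```

With these made-up numbers, seeing a bet raises the belief in the strong hand from 0.5 to about 0.75. A full public belief state would track such distributions for both players at once, conditioned on everything that is publicly observed.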
Earlier RL+Search algorithms break down in imperfect-information games like poker, where complete information is not available (for example, players keep their cards secret). These algorithms assign a fixed value to each action regardless of how often the action is chosen. In chess, for instance, a good move is good irrespective of whether it is chosen frequently or rarely. But in games like poker, the more a player bluffs, the less valuable each bluff becomes, because opponents can alter their strategy to call more of those bluffs. This is why the Pluribus poker bot was trained with an approach that uses search during actual gameplay and not before.
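The point about bluffing can be made concrete with a toy calculation. The sketch below is a deliberately simplified, hypothetical model, not ReBeL or real poker: a bettor always bets with a value hand and bets a bluffing hand with probability bluff_freq, and the opponent calls only when calling has positive expected value given that frequency. The function name bluff_value and the default pot and bet sizes are assumptions made for illustration.

```python
def bluff_value(bluff_freq: float,
                value_prob: float = 0.5,
                pot: float = 1.0,
                bet: float = 1.0) -> float:
    """Payoff of a single bluff against an opponent who best-responds.

    bluff_freq : probability of betting when holding a bluffing hand
    value_prob : probability of holding a value hand (always bets), must be > 0
    pot, bet   : size of the pot and of the bet, in arbitrary units
    """
    if value_prob <= 0.0:
        raise ValueError("value_prob must be positive in this toy model")
    p_bet_value = value_prob
    p_bet_bluff = (1.0 - value_prob) * bluff_freq
    # Opponent's belief that a bet is a bluff, by Bayes' rule.
    p_bluff_given_bet = p_bet_bluff / (p_bet_value + p_bet_bluff)
    # Calling wins pot + bet against a bluff and loses the bet against value.
    ev_call = p_bluff_given_bet * (pot + bet) - (1.0 - p_bluff_given_bet) * bet
    opponent_calls = ev_call > 0.0
    # A called bluff loses the bet; an uncalled bluff wins the pot.
    return -bet if opponent_calls else pot


for freq in (0.0, 0.25, 0.75):
    print(freq, bluff_value(freq))
```

In this toy model the same action, a bluff, is profitable when used rarely and costly when used often, because the opponent's best response changes with the bluffing frequency. A chess move has no such dependence on how often it is played.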
ReBeL can treat imperfect-information games much like perfect-information games by accounting for the beliefs of each player. Facebook has developed a modified RL+Search algorithm that ReBeL uses to cope with the higher-dimensional state and action space of imperfect-information games.
Experiments show that ReBeL is effective in large-scale two-player zero-sum imperfect-information games such as Liar's Dice and poker. ReBeL achieves superhuman performance, even defeating a top human professional in the benchmark game of heads-up no-limit Texas Hold'em.
Several earlier systems have achieved the same. However, ReBeL does so using considerably less expert domain knowledge than any previous poker AI. This is a crucial step toward building a generalized AI that can solve complex real-world problems involving hidden information, such as negotiations, fraud detection, and cybersecurity.
Limitations:
ReBeL is the first AI to enable RL+Search in imperfect-information games. However, its current implementation has some limitations.
Nevertheless, ReBeL achieves low exploitability in benchmark games and is a significant step toward creating more general AI algorithms. To promote further research, Facebook has open-sourced the implementation of ReBeL for Liar's Dice.
GitHub (ReBeL for Liar's Dice): https://github.com/facebookresearch/rebel
Source: https://ai.facebook.com/blog/rebel-a-general-game-playing-ai-bot-that-excels-at-poker-and-more
Related Paper: https://arxiv.org/pdf/2007.13544.pdf