|
|
| Line 279: |
Line 279: |
| | || | | || |
| | <youtube>QgCxCeoW5JI</youtube> | | <youtube>QgCxCeoW5JI</youtube> |
| − | <b>HH1 | + | <b>The State of Techniques for Solving Large Imperfect-Information Games, Including Poker |
| − | </b><br>BB1 | + | </b><br>The ability to computationally solve imperfect-information games has a myriad of future applications ranging from auctions, negotiations, and (cyber)security settings to medical domains. A dramatic scalability leap has occurred in the capability to solve such games over the last nine years, fueled in large part by the Annual Computer Poker Competition. I will discuss the key domain-independent techniques that enabled this leap, including automated abstraction techniques and approaches for mitigating the issues that they raise, new equilibrium-finding algorithms, safe opponent exploitation methods, techniques that use qualitative knowledge as an extra input, and endgame solving techniques. I will also include new results on 1) developing the world’s best Heads-Up No-Limit Texas Hold'em poker program, 2) theory that enables abstraction that gives solution quality guarantees, 3) techniques for hot starting equilibrium finding, 4) simultaneous abstraction and equilibrium finding, and 5) theory that improves gradient-based equilibrium finding. I will also cover the Brains vs AI competition that I recently organized where our AI, Claudico, challenged four of the top-10 human pros in Heads-Up No-Limit Texas Hold'em for 80,000 hands. (The talk covers joint work with many co-authors, mostly Noam Brown, Sam Ganzfried, and Christian Kroer.) |
| | |} | | |} |
| | |<!-- M --> | | |<!-- M --> |
| Line 287: |
Line 287: |
| | || | | || |
| | <youtube>McV4a6umbAY</youtube> | | <youtube>McV4a6umbAY</youtube> |
| − | <b>HH2 | + | <b>AI for Imperfect-Information Games: Beating Top Humans in No-Limit Poker |
| − | </b><br>BB2 | + | </b><br>Despite AI successes in perfect-information games, the hidden information and large size of no-limit poker have made the game difficult for AI to tackle. Libratus is an AI that, in a 120,000-hand competition, defeated four top pros in heads-up no-limit Texas hold’em poker, the leading benchmark in imperfect-information game solving. This talk explains why imperfect-information games are fundamentally more difficult than perfect-information games, and the advances in Libratus that overcame those challenges. In particular, this talk describes new methods for real-time planning in imperfect-information games that have theoretical guarantees. Additional research has extended these methods to deeper game trees, enabling the development of the master-level poker AI Modicum, which was constructed using only a 4-core CPU and 16 GB of RAM. These algorithms are domain-independent and can be applied to a variety of strategic interactions involving hidden information. |
| | |} | | |} |
| | |}<!-- B --> | | |}<!-- B --> |
| Line 304: |
Line 304: |
| | || | | || |
| | <youtube>EhvH4jdF-ko</youtube> | | <youtube>EhvH4jdF-ko</youtube> |
| − | <b>HH2 | + | <b>Super-Human AI for Strategic Reasoning |
| − | </b><br>BB2 | + | </b><br>Poker has been a challenge problem in game theory, operations research, and artificial intelligence for decades. As a game of imperfect information, it involves obstacles not present in games like chess and Go, and requires totally different techniques. In 2017, our AI, Libratus, beat a team of four top specialist pros in the main benchmark for imperfect-information game solving, heads-up no-limit Texas hold'em, which has 10^161 decision points. This was the first time an AI had beaten top players in a very large poker game. Libratus is powered by new algorithms in each of its three main modules: 1) computing approximate Nash equilibrium strategies before the event (i.e., computing a blueprint strategy for the entire game), 2) safe nested endgame solving during play (i.e., refining the blueprint strategy on the fly in parts of the game that are reached while preserving guarantees on exploitability), and 3) fixing its own strategy to play even closer to equilibrium based on what holes opponents have tried to identify and exploit. The algorithms are domain-independent and have applicability to video games, strategic pricing, finance, negotiation, business strategy, strategic market segmentation, sports, investment banking, strategic product portfolio optimization, electricity markets, bidding, auction design, acquisition strategy (e.g., for streaming companies to acquire movies), political campaigns, cybersecurity, physical security, military, bot detection, and steering evolution and biological adaptation (such as for medical treatment planning and synthetic biology). The Libratus part of this talk is joint work with my PhD student Noam Brown. |
| | |} | | |} |
| | |}<!-- B --> | | |}<!-- B --> |