Treffer: Reinforcement Learning for Impartial Games and Complex Combinatorial Optimisation Problems

Title:
Reinforcement Learning for Impartial Games and Complex Combinatorial Optimisation Problems
Authors:
Publication Year:
2024
Collection:
Queen Mary University of London: Queen Mary Research Online (QMRO)
Document Type:
other/unknown material
Language:
English
Accession Number:
edsbas.8F2F1BF4
Database:
BASE

Weitere Informationen

AlphaZero-style reinforcement learning algorithms have succeeded in various games, from traditional board games to advanced video games. However, their integration into combinatorial games, particularly impartial games, presents inherent challenges. A pivotal aspect of these challenges stems from the parity function's role in determining the winning strategies and the fact that they do not exploit the underlying structures of games besides the game rules. Within the domain of combinatorial optimisation, the task of finding large Condorcet domains is remarkably complex, marked by numerous local optima and deep-rooted mathematical structures. Notably, the most phenomenal method hinged mainly on Fishburn's alternating scheme, an approach intricately tied to the parity function. We demonstrate the intrinsic complexity of Condorcet domain generation by showcasing that diverse AI learning paradigms, from deep reinforcement learning and genetic algorithms to local search algorithms, tend to get stuck in some of the numerous local optima. Thus, a genuinely novel approach is needed. The main contribution of the thesis is to present a novel heuristic approach inspired by AlphaZero-style reinforcement learning but using significant expert domain knowledge and heavily relying on various databases accessed during the search. The Condorcet Domain Library, a research software written in C++ for high execution speed and providing Python interfaces for fast prototyping, was initially developed to implement the algorithm. It has evolved into a flexible library with various functionalities, underpinning multiple impactful research projects. In collaboration with mathematicians, further computational ideas, combined with our new algorithms and the library, led to new results for Condorcet domains, including the construction of a set-alternating scheme leading to a set of large CDs that were hitherto unknown and the discovery of new CDs with distinct properties, some of which are intriguingly linked to multi-agent systems. ...