Title:
Boosting Active Learning to Optimality: a Tractable Monte-Carlo, Billiard-based Algorithm
Contributors:
Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11), CentraleSupélec, Centre National de la Recherche Scientifique (CNRS); Machine Learning and Optimisation (TAO), Centre Inria de Saclay, Institut National de Recherche en Informatique et en Automatique (Inria); Algorithmic number theory for cryptology (TANC), Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX), École polytechnique (X), Institut Polytechnique de Paris (IP Paris)
Source:
ECML, pp. 302-317
Publisher Information:
CCSD, 2009.
Publication Year:
2009
Collection:
collection:X
collection:EC-PARIS
collection:CNRS
collection:INRIA
collection:UNIV-PSUD
collection:LIX
collection:INRIA-SACLAY
collection:X-DEP-INFO
collection:INRIA_TEST
collection:TESTALAIN1
collection:UMR8623
collection:INRIA2
collection:LRI-AO
collection:TDS-MACS
collection:UNIV-PARIS-SACLAY
collection:UNIV-PSUD-SACLAY
collection:DEPARTEMENT-DE-MATHEMATIQUES
Document Type:
Conference papers
Language:
English
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edshal.inria.00433866v1
Database:
HAL

Further Information

This paper focuses on Active Learning with a limited number of queries; in application domains such as Numerical Engineering, the size of the training set might be limited to a few dozen or hundred examples due to computational constraints. Active Learning under bounded resources is formalized as a finite-horizon Reinforcement Learning problem, where the sampling strategy aims at minimizing the expectation of the generalization error. A tractable approximation of the optimal (intractable) policy is presented: the Bandit-based Active Learner (BAAL) algorithm. Viewing Active Learning as a single-player game, BAAL combines UCT, the tree-structured multi-armed bandit algorithm proposed by Kocsis and Szepesvári (2006), and billiard algorithms. A proof of principle demonstrates the approach's good empirical convergence toward an optimal policy and its ability to incorporate prior Active Learning criteria. Hybridizing BAAL with the Query-by-Committee (QbC) approach is found to improve on both stand-alone BAAL and stand-alone QbC.
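The UCT component mentioned in the abstract grows a search tree by repeatedly descending to the child with the best upper-confidence score (mean reward plus an exploration bonus). The following is a minimal sketch of that selection rule only, not the authors' implementation; the dictionary-based node representation and the exploration constant `c` are illustrative assumptions.

```python
import math

def uct_select(children, c=1.4):
    """Return the child maximizing the UCB score used in UCT:
    average reward plus an exploration term that shrinks as a
    child accumulates visits. Unvisited children are tried first."""
    total_visits = sum(child["visits"] for child in children)

    def score(child):
        if child["visits"] == 0:
            return float("inf")  # force exploration of unvisited nodes
        exploitation = child["reward"] / child["visits"]
        exploration = c * math.sqrt(math.log(total_visits) / child["visits"])
        return exploitation + exploration

    return max(children, key=score)
```

In a full UCT loop this selection step would be followed by a random roll-out (in BAAL's setting, a simulated sequence of queries whose final generalization error supplies the reward) and a back-propagation of that reward along the visited path.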