Result: regAL: Python Package for Active Learning of Regression Problems

Title:

regAL: Python Package for Active Learning of Regression Problems

Authors:

Surzhikova, Elizaveta, orcid:0009-0004-6924-, Proppe, Jonny

Publisher Information:

Zenodo

Publication Year:

2025

Collection:

Zenodo

Subject Terms:

Machine Learning, Active Learning, Python, Data Science

Document Type:

Electronic Resource software

Language:

English

Relation:

arXiv:2410.17917; https://zenodo.org/records/15309124; oai:zenodo.org:15309124; https://doi.org/10.5281/zenodo.15309124

DOI:

10.5281/zenodo.15309124

Availability:

https://doi.org/10.5281/zenodo.15309124
https://zenodo.org/records/15309124

Rights:

Creative Commons Attribution 4.0 International ; cc-by-4.0 ; https://creativecommons.org/licenses/by/4.0/legalcode

Accession Number:

edsbas.C8E1517B

Database:

BASE

Further Information

regAL - Active Learning for Regression

This Python package is a user-friendly introductory active learning toolbox for inexperienced users. Active learning is a type of machine learning in which the model learns the problem scope sample by sample, repeatedly querying the user or an oracle for the next sample (the one most useful to the model). This allows a scope to be learned starting from only a handful of labeled samples, with the steepest possible learning curve.

With this package, active learning can be tested on already labeled datasets (in "benchmark" mode) or performed on an unknown dataset (in "learn" mode). During the experiment, all relevant parameters are recorded and written to an output file for documentation. Additionally, all intermediate models are saved in a models/ folder, which allows the experiment to be aborted and resumed at a later point.

Currently available sample selection methods: Uncertainty Sampling (only if the standard deviation of model predictions is available, e.g. in GPRs); Covariance Sampling; Query by Committee (QBC); Farthest-First Traversal (for comparison); Random Sampling (for comparison).

Benchmark mode

In benchmark mode, active learning can be tested on an already labeled dataset. This can help to evaluate which active learning strategy would be better suited for similar, but not yet explored, scopes. At minimum, this mode requires a numpy array or pandas series with training data and a numpy array or pandas series of labels. At the end of the benchmark procedure, a plot of the model's R^2 or RMSE over the course of the learning process is returned for all chosen sample selection methods.

Learn mode

In learn mode, active learning can be performed on an unlabeled dataset while querying the user or an oracle for sample labels. At the end of the learning procedure, the trained model is returned.
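To illustrate one of the comparison methods named above, the following is a minimal sketch of Farthest-First Traversal over a pool of feature vectors. It is written with the Python standard library only and does not use or reproduce the regAL API; the function name farthest_first_traversal, its parameters, and the toy pool are assumptions made for this example. The method ignores labels and model predictions entirely, which is why it serves as a baseline rather than a true active learning strategy.

```python
import math


def farthest_first_traversal(points, n_select, start=0):
    """Greedily pick n_select indices so that each new point is the one
    farthest from the set of points chosen so far.

    Hypothetical helper for illustration; not part of regAL.
    """
    if not 0 < n_select <= len(points):
        raise ValueError("n_select must be between 1 and len(points)")

    selected = [start]
    chosen = {start}
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]

    while len(selected) < n_select:
        # Only points that have not been selected yet are candidates.
        candidates = [i for i in range(len(points)) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))

    return selected


# Toy pool of 2-D feature vectors (made-up values).
pool = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
order = farthest_first_traversal(pool, 3)
print(order)
```

In a benchmark-style comparison, the returned index order would determine which samples are labeled and added to the training set at each step; the near-duplicate point at (0.1, 0.0) is selected late because it adds little coverage of the feature space.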
