Title:
Q-Learning for Online PID Controller Tuning in Continuous Dynamic Systems: An Interpretable Framework for Exploring Multi-Agent Systems.
Authors:
Ibarra-Pérez, Davor1 (AUTHOR) davor.ibarra@usach.cl, García-Nieto, Sergio1 (AUTHOR), Sanchis Saez, Javier1 (AUTHOR)
Source:
Mathematics (2227-7390). Nov 2025, Vol. 13 Issue 21, p3461. 29p.
Database:
Academic Search Index


This study proposes a discrete multi-agent Q-learning framework for the online tuning of PID controllers in continuous dynamic systems with limited observability. The approach treats the adjustment of each PID gain (k_p, k_i, k_d) as an independent learning process, in which each agent operates within a discrete state space corresponding to its own gain and selects actions from a tripartite space (decrease, maintain, or increase its gain). The agents act simultaneously under fixed decision intervals, which favors their convergence by preserving quasi-stationary conditions in the perceived environment, while a shared cumulative global reward, composed of system parameters, time and control-action penalties, and stability incentives, guides coordinated exploration toward the control objectives. Implemented in Python, the framework was validated on two nonlinear control problems: a water-tank system and an inverted pendulum (cart-pole) system. The agents achieved initial convergence after approximately 300 and 500 episodes, respectively, with overall success rates of 49.6% and 46.2% over 5000 training episodes. The learning process exhibited sustained convergence toward effective PID configurations capable of stabilizing both systems without explicit dynamic models. These findings confirm the feasibility of the proposed low-complexity discrete reinforcement learning approach for online adaptive PID tuning, achieving interpretable and reproducible control policies and providing a basis for future hybrid schemes that unite classical control theory and reinforcement learning agents. [ABSTRACT FROM AUTHOR]
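The scheme described in the abstract, one tabular Q-learning agent per PID gain, a discrete state space over that gain, a tripartite action space (decrease / maintain / increase), and a single shared reward, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the class name `GainAgent`, the number of gain levels, and all hyperparameters (learning rate, discount, epsilon) are assumptions for the sketch, and the plant simulation that would produce the global reward is replaced by a placeholder value.

```python
import random

class GainAgent:
    """One Q-learning agent per PID gain. State = discrete index of its own
    gain; actions = decrease / maintain / increase that gain. All
    hyperparameters below are illustrative assumptions, not the paper's."""
    ACTIONS = (-1, 0, +1)  # tripartite action space

    def __init__(self, n_levels=20, alpha=0.1, gamma=0.99, eps=0.2):
        self.n = n_levels
        self.q = [[0.0] * len(self.ACTIONS) for _ in range(n_levels)]
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.state = n_levels // 2  # start at a mid-range gain level

    def act(self):
        # Epsilon-greedy choice over the three gain adjustments
        if random.random() < self.eps:
            a = random.randrange(len(self.ACTIONS))
        else:
            a = max(range(len(self.ACTIONS)),
                    key=lambda i: self.q[self.state][i])
        # Clamp the new gain index to the discrete state space
        next_state = min(self.n - 1, max(0, self.state + self.ACTIONS[a]))
        return a, next_state

    def update(self, a, next_state, reward):
        # Standard tabular Q-learning update driven by the shared reward
        best_next = max(self.q[next_state])
        td = reward + self.gamma * best_next - self.q[self.state][a]
        self.q[self.state][a] += self.alpha * td
        self.state = next_state

# Three agents act simultaneously, one per gain, at each fixed decision
# interval, all updating on the same cumulative global reward.
agents = {g: GainAgent() for g in ("kp", "ki", "kd")}
for decision_step in range(3):
    moves = {g: ag.act() for g, ag in agents.items()}
    reward = -1.0  # placeholder: would come from the simulated plant response
    for g, ag in agents.items():
        ag.update(*moves[g], reward)
```

Keeping the three agents' decisions synchronized on fixed intervals, as the abstract notes, means each agent sees the others' gains as approximately frozen between its own updates, which is what makes the per-gain environment quasi-stationary.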