A new strategy for incorporating Gaussian process dynamic models into stochastic dynamic programming
3-907144-12-0
This paper proposes a local solution method for stochastic optimal control problems in which the dynamics are represented by a Gaussian process (GP), building on the stochastic dynamic programming (DP) approach. We explore two methods for incorporating GP-based model learning: Fourier-Hermite DP (FHDP) and its recent extension, Fourier-Hermite Probabilistic DP (FHPDP). Compared to other model learning techniques, GP-based model learning explicitly quantifies model uncertainty and mitigates the effects of structural model errors. The Fourier-Hermite methods provide derivative-free versions of differential dynamic programming (DDP): they perform iterative backward-forward sweeps and use sigma-point integration schemes for probabilistic value function approximation. Unlike state-of-the-art GP-based DDP methods, which are deterministic in nature, the Fourier-Hermite methods rest on a probabilistic foundation that makes them well suited to integrating GPs. We therefore leverage GP-based forward uncertainty propagation within the Fourier-Hermite methods to propose sample-efficient data-driven methods, called GP-FHDP and GP-FHPDP, that can be applied to both stochastic and risk-sensitive optimal control problems. Furthermore, our methods can actively adjust exploration based on the uncertainty level, leading to accelerated convergence. The capabilities of the proposed algorithms are demonstrated on a simulated nonlinear vehicle system.
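As a rough illustration of the sigma-point integration and forward uncertainty propagation mentioned in the abstract, the sketch below pushes a Gaussian state distribution through a function using the unscented transform. It is not the paper's algorithm: the unscented rule is only one possible sigma-point scheme, and the names `sigma_points`, `propagate`, the parameter `kappa`, and the toy linear model are all assumptions made for this example. In a GP-based setting the function would presumably be a learned model's prediction rather than the fixed map used here.

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Unscented sigma points and weights for N(mean, cov) (illustrative helper)."""
    n = mean.shape[0]
    # Columns of the Cholesky factor of (n + kappa) * cov are the point offsets.
    chol = np.linalg.cholesky((n + kappa) * cov)
    points = np.vstack([mean, mean + chol.T, mean - chol.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return points, weights

def propagate(f, mean, cov, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov)."""
    points, weights = sigma_points(mean, cov, kappa)
    # Evaluate f at every sigma point; scalars are promoted to length-1 vectors.
    values = np.array([np.atleast_1d(f(p)) for p in points])
    out_mean = weights @ values
    centered = values - out_mean
    out_cov = (weights[:, None] * centered).T @ centered
    return out_mean, out_cov

# Hypothetical toy model: a linear map standing in for a dynamics model.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
b = np.array([0.0, 0.05])
m = np.array([1.0, -0.5])
P = np.array([[0.2, 0.05], [0.05, 0.1]])

# Forward uncertainty propagation through the toy model.
next_mean, next_cov = propagate(lambda x: A @ x + b, m, P)

# Expected value of a quadratic cost under the same state distribution.
expected_cost, _ = propagate(lambda x: x @ x, m, P)
```

For a linear map the propagated moments match the closed forms `A @ m + b` and `A @ P @ A.T`, and for the quadratic cost the sigma-point estimate matches `m @ m + np.trace(P)`, which gives a simple sanity check. No gradients of the function are needed, which is the sense in which such schemes are derivative-free.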