Title:
Explainability of reinforcement learning agents within simulated environments
Publisher Information:
University of Malta
Faculty of Information and Communication Technology, Department of Artificial Intelligence
Publication Year:
2025
Collection:
University of Malta: OAR@UM / L-Università ta' Malta
Document Type:
Dissertation / Thesis (Bachelor's thesis)
Language:
English
Relation:
Camilleri, L. (2025). Explainability of reinforcement learning agents within simulated environments (Bachelor's dissertation); https://www.um.edu.mt/library/oar/handle/123456789/137840
Rights:
info:eu-repo/semantics/restrictedAccess ; The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder.
Accession Number:
edsbas.89AB9C85
Database:
BASE

Further Information

B.Sc. (Hons) ICT (Melit.)

This dissertation explores the use of model-agnostic Explainable Artificial Intelligence (XAI) techniques for interpreting the behaviour of Reinforcement Learning (RL) agents trained in Unity-based environments. As RL systems grow in complexity and are applied in critical domains, the need for interpretable decision-making becomes increasingly important. Traditional Deep Reinforcement Learning (DRL) policies often lack transparency, creating challenges for both developers and end-users in understanding agent behaviour and ensuring reliability. Improving explainability in this context is essential not only for debugging and validation but also for enhancing trust in autonomous systems.

The study involved training agents across three distinct Unity environments using the ML-Agents toolkit, followed by the application of explainability techniques including surrogate models, SHAP feature attribution, and saliency mapping. Observation–action data was collected during inference and analysed using Python-based tooling. Custom logic was used to preprocess inputs, train interpretable models, and extract feature attributions aligned with the agents' input structures. These techniques were tailored to the characteristics of each environment, accounting for differences in action space dimensionality and sensor design. Selected results were integrated into Unity through a seeded post-hoc visualisation interface to enable real-time inspection of decision-making processes using in-engine UI elements.

Results demonstrate that semantic explanations such as SHAP can meaningfully highlight key features driving agent behaviour, and that Unity can serve as a viable platform for embedding visual explanations. Surrogate models offered reliable approximations of discrete agent policies, though they struggled with continuous control due to observation complexity and the noisier nature of regression targets. Decision tree models provided interpretable symbolic representations but were constrained by the high ...
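
As a rough illustration of the surrogate-model and SHAP steps described in the abstract (not the dissertation's actual tooling), the following Python sketch fits a shallow decision-tree surrogate to logged observation–action pairs from a discrete-action agent and aggregates SHAP attributions for the surrogate. The log file name, array layout, and hyperparameters are placeholder assumptions.

```python
# Illustrative sketch only: fit a decision-tree surrogate to logged
# observation-action pairs and compute SHAP attributions for it.
# The log file "agent_rollouts.npz" and its array layout are assumptions.
import numpy as np
import shap
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = np.load("agent_rollouts.npz")           # assumed: observations plus discrete action ids
X, y = data["observations"], data["actions"]   # X: (N, n_features), y: (N,)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A shallow tree keeps the surrogate human-readable while approximating the policy.
surrogate = DecisionTreeClassifier(max_depth=5, random_state=0)
surrogate.fit(X_train, y_train)
print(f"surrogate fidelity (held-out accuracy): {surrogate.score(X_test, y_test):.3f}")

# SHAP attributions for the surrogate's predictions.
explainer = shap.TreeExplainer(surrogate)
shap_values = explainer.shap_values(X_test)

# Depending on the shap version this is a list of per-class arrays or one 3-D
# array; normalise to (classes, samples, features) before aggregating.
sv = np.stack(shap_values) if isinstance(shap_values, list) else np.moveaxis(shap_values, -1, 0)
mean_abs = np.abs(sv).mean(axis=(0, 1))        # global importance per observation feature
for idx in np.argsort(mean_abs)[::-1][:5]:
    print(f"feature {idx}: mean |SHAP| = {mean_abs[idx]:.4f}")
```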
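
The saliency mapping mentioned in the abstract can likewise be sketched as a gradient-based map, under the assumption that the trained policy is available as a PyTorch module mapping an observation vector to action logits; the network shape and inputs below are stand-ins, not the dissertation's pipeline or the ML-Agents export format.

```python
# Illustrative sketch of gradient-based saliency for a single observation,
# assuming the policy is a PyTorch module from observation vector to action logits.
import torch

def saliency_map(policy: torch.nn.Module, obs: torch.Tensor) -> torch.Tensor:
    """Absolute gradient of the greedy action's logit w.r.t. each observation feature."""
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs.unsqueeze(0)).squeeze(0)   # (n_actions,)
    logits[logits.argmax()].backward()             # gradient of the chosen action's score
    return obs.grad.abs()

# Stand-in policy and observation, purely for demonstration.
policy = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4))
scores = saliency_map(policy, torch.randn(8))
print(scores)   # one saliency score per observation feature
```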