XAItest: Evaluating XAI-Based Feature Discovery Methods
Identifying key variables in high-dimensional omics datasets remains a major challenge, particularly when relationships between features are non-linear, multimodal, or involve complex interactions. While classical statistical methods such as t-tests effectively capture mean differences, they often fail to detect more intricate patterns. Recent advances in explainable artificial intelligence (XAI) offer alternatives, but systematic benchmarking of these methods is lacking.

In this study, we present XAItest, a framework designed to evaluate the ability of various machine learning (ML) algorithms and XAI methods to identify relevant features under diverse data configurations. We test six synthetic data scenarios, including variance shifts, bimodal distributions, XOR-like structures, concentric circles, and non-linear regression patterns. Four ML models (Decision Trees, Random Forests, Support Vector Machines, and Multi-Layer Perceptrons) are combined with feature importance metrics such as Gini Decrease, Accuracy Increase, SHAP values, and Olden scores. To interpret feature importance in a statistically meaningful way, we apply p-value transformation methods including mProbes, PIMP, and our novel approach, simThresh, which estimates significance thresholds through iterative simulation.

Our results show that all methods detect simple patterns reliably, while complex structures remain challenging for most algorithms. The simThresh approach emerges as a computationally efficient and effective method for setting statistical thresholds. This work provides a foundation for the evaluation of XAI tools in biomarker discovery and is implemented as a Bioconductor package (DOI: 10.18129/B9.bioc.XAItest).
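
The claim that t-tests capture mean differences but miss patterns such as XOR-like structures can be illustrated with a minimal sketch. It is not taken from the XAItest package: the function names (`make_xor_data`, `welch_t`, `t_by_class`), the sample size, and the uniform sampling range are all assumptions chosen for illustration, and only the Python standard library is used. Each feature alone is independent of the class label, so its t statistic stays near zero, while the product of the two features (the interaction) separates the classes clearly.

```python
import math
import random
import statistics


def make_xor_data(n, seed=0):
    """Hypothetical XOR-like scenario: the label is 1 when the signs of x1 and x2 differ."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.uniform(-1, 1)
        x2 = rng.uniform(-1, 1)
        label = int((x1 > 0) != (x2 > 0))
        rows.append((x1, x2, label))
    return rows


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def t_by_class(values, labels):
    """Split one feature by class label and compare the two groups."""
    class_1 = [v for v, y in zip(values, labels) if y == 1]
    class_0 = [v for v, y in zip(values, labels) if y == 0]
    return welch_t(class_1, class_0)


rows = make_xor_data(2000)
labels = [r[2] for r in rows]

# Marginal tests: each feature on its own carries no mean difference.
t_x1 = t_by_class([r[0] for r in rows], labels)
t_x2 = t_by_class([r[1] for r in rows], labels)

# Interaction term: the product x1 * x2 is negative in class 1, positive in class 0.
t_prod = t_by_class([r[0] * r[1] for r in rows], labels)

print(f"t(x1)      = {t_x1:.2f}")
print(f"t(x2)      = {t_x2:.2f}")
print(f"t(x1 * x2) = {t_prod:.2f}")
```

In this sketch the interaction is exposed by hand through the product term. A model-based importance score is meant to recover such structure without the analyst knowing in advance which interaction to test, which is the setting the framework evaluates.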