Result: Disc-Hub: a python package for benchmarking machine learning strategies in DIA-MS identification

Title:
Disc-Hub: a python package for benchmarking machine learning strategies in DIA-MS identification
Source:
Bioinformatics Advances.
Publisher Information:
Oxford University Press (OUP), 2025.
Publication Year:
2025
Document Type:
Journal Article
Language:
English
ISSN:
2635-0041
DOI:
10.1093/bioadv/vbaf232
Rights:
CC BY
Accession Number:
edsair.doi...........d4babcd4cca118b56e6a27053c6f0c54
Database:
OpenAIRE

Further Information

Motivation: Accurate analysis of data-independent acquisition (DIA) mass spectrometry data relies on machine learning to distinguish target peptides from decoy peptides. Different DIA identification engines adopt distinct binary classifiers and training workflows to accomplish this learning task. However, systematic comparisons of how different machine learning strategies affect identification performance are lacking. This gap hinders the selection of an optimal learning strategy, increases the risk of model underfitting or overfitting, and ultimately undermines the effectiveness and reliability of false discovery rate (FDR) control.

Results: In this study, we benchmarked three training strategies and four classifiers on representative DIA datasets. Among them, K-fold training combined with a multilayer perceptron achieved the best balance between identification depth and FDR control. We have released the datasets and code through the Python package Disc-Hub, enabling rapid selection of optimal machine learning configurations for developing DIA identification algorithms.

Availability and implementation: Disc-Hub is released as open-source software and can be installed from PyPI as a Python module. The source code is available on GitHub at https://github.com/yuyiwen-yiyuwen/Disc_Hub
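
To make the K-fold training and FDR control mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the Disc-Hub API: every function name here is hypothetical, the classifier is a simple mean-difference scorer standing in for a multilayer perceptron, and the q-value calculation is the standard target-decoy estimate (decoys divided by targets above a score threshold), which the abstract does not confirm as the exact procedure used. The point is only that each sample is scored by a model that did not see it during training.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def out_of_fold_scores(X, y, k, fit, seed=0):
    """Score every sample with a model trained on the other k-1 folds.

    `fit(X_train, y_train)` is a hypothetical callback returning a scoring
    function; any classifier (for example an MLP) could be plugged in here.
    """
    scores = [None] * len(X)
    for held_out in kfold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            scores[i] = score(X[i])
    return scores


def mean_difference_fit(X_train, y_train):
    """Stand-in classifier (NOT an MLP): project onto target mean minus decoy mean.

    Assumes both classes (1 = target, 0 = decoy) occur in the training data.
    """
    dim = len(X_train[0])

    def class_mean(label):
        rows = [x for x, lab in zip(X_train, y_train) if lab == label]
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    w = [t - d for t, d in zip(class_mean(1), class_mean(0))]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


def target_decoy_qvalues(scores, is_target):
    """Estimate q-values with the standard target-decoy ratio (ties ignored)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    targets = decoys = 0
    fdr = []
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        fdr.append(decoys / max(targets, 1))
    # q-value: smallest FDR at this score threshold or any more permissive one
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr[rank])
        q[order[rank]] = running_min
    return q
```

In this sketch, identifications would be accepted by keeping targets whose q-value falls below a chosen cutoff such as 0.01. Scoring only held-out samples is what limits the overfitting risk the abstract describes, since no sample's score comes from a model that was trained on it.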