Title:
Ibaqpy: A scalable Python package for baseline quantification in proteomics leveraging SDRF metadata.
Authors:
Zheng P; Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China.
Audain E; Institute of Medical Genetics, University Medicine Oldenburg, Carl von Ossietzky University, Oldenburg, Germany.
Webel H; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark.
Dai C; State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing, China.
Klein J; Program for Bioinformatics, Boston University, Boston, USA.
Hitz MP; Institute of Medical Genetics, University Medicine Oldenburg, Carl von Ossietzky University, Oldenburg, Germany.
Sachsenberg T; Department of Computer Science, Applied Bioinformatics, University of Tübingen, Tübingen, Germany; Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany.
Bai M; Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China. Electronic address: baimz@cqupt.edu.cn.
Perez-Riverol Y; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK. Electronic address: yperez@ebi.ac.uk.
Source:
Journal of proteomics [J Proteomics] 2025 Jun 15; Vol. 317, pp. 105440. Date of Electronic Publication: 2025 Apr 21.
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: Elsevier; Country of Publication: Netherlands; NLM ID: 101475056; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1876-7737 (Electronic); Linking ISSN: 1874-3919; NLM ISO Abbreviation: J Proteomics; Subsets: MEDLINE
Imprint Name(s):
Original Publication: Amsterdam : Elsevier
Contributed Indexing:
Keywords: Big data; Bioinformatics; Data integration; Proteomics; Quantification
Substance Nomenclature:
0 (Proteome)
Entry Date(s):
Date Created: 20250423; Date Completed: 20250528; Latest Revision: 20250528
Update Code:
20250529
DOI:
10.1016/j.jprot.2025.105440
PMID:
40268243
Database:
MEDLINE

Further Information

Intensity-based absolute quantification (iBAQ) is essential in proteomics as it allows for the assessment of a protein's absolute abundance in various samples or conditions. However, the computation of these values for increasingly large-scale and high-throughput experiments, such as those using DIA, TMT, or LFQ workflows, poses significant challenges in scalability and reproducibility. Here, we present ibaqpy (https://github.com/bigbio/ibaqpy), a Python package designed to compute iBAQ values efficiently for experiments of any scale. Ibaqpy leverages the Sample and Data Relationship Format (SDRF) metadata standard to incorporate experimental metadata into the quantification workflow. This allows for automatic normalization and batch correction while accounting for key aspects of the experimental design, such as technical and biological replicates, fractionation strategies, and sample conditions. Designed for large-scale proteomics datasets, ibaqpy can also recompute iBAQ values for existing experiments when an SDRF is available. We showcased ibaqpy's capabilities by reanalyzing 17 public proteomics datasets from ProteomeXchange, covering HeLa cell lines with 4921 samples and 5766 MS runs, quantifying a total of 11,014 proteins. In our reanalysis, ibaqpy is a key component in automating reproducible quantification, reducing manual effort and making quantitative proteomics more accessible while supporting FAIR principles for data reuse.

SIGNIFICANCE: Proteomics studies often rely on intensity-based absolute quantification (iBAQ) to assess protein abundance across various biological conditions. Despite its widespread use, computing iBAQ values at scale remains challenging due to the increasing complexity and volume of proteomics experiments. Existing tools frequently lack metadata integration, limiting their ability to handle experimental design intricacies such as replicates, fractions, and batch effects. Our work introduces ibaqpy, a scalable Python package that leverages the Sample and Data Relationship Format (SDRF) to compute iBAQ values efficiently while incorporating critical experimental metadata. By enabling automated normalization and batch correction, ibaqpy ensures reproducible and comparable quantification across large-scale datasets. We validated the utility of ibaqpy through the reanalysis of 17 public HeLa datasets, comprising over 200 million peptide features and quantifying more than 11,000 proteins across thousands of samples. This comprehensive reanalysis highlights the robustness and scalability of ibaqpy, making it an essential tool for researchers conducting large-scale proteomics experiments. Moreover, by promoting FAIR principles for data reuse and interoperability, ibaqpy offers a transformative approach to baseline protein quantification, supporting reproducible research and data integration within the proteomics community.
(Copyright © 2025 The Authors. Published by Elsevier B.V. All rights reserved.)
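
For readers unfamiliar with the quantity the abstract refers to, the following is a minimal sketch of a textbook iBAQ calculation: the summed peptide intensities of a protein in a sample, divided by the number of theoretically observable tryptic peptides of that protein. It is not ibaqpy's API or implementation. The function names, the 6 to 30 residue length window, the simple trypsin rule (cleave after K or R unless followed by P), and the `run_to_sample` mapping (standing in for the SDRF columns `comment[data file]` and `source name`) are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def count_tryptic_peptides(sequence, min_len=6, max_len=30):
    """Count in-silico tryptic peptides whose length falls in [min_len, max_len].

    Assumed rule: cleave after K or R unless the next residue is P,
    with no missed cleavages.
    """
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return sum(1 for p in peptides if min_len <= len(p) <= max_len)


def compute_ibaq(peptide_rows, protein_sequences, run_to_sample):
    """Return {(protein, sample): iBAQ value}.

    peptide_rows: iterable of (protein_id, data_file, intensity).
    protein_sequences: {protein_id: amino-acid sequence}.
    run_to_sample: {data_file: sample name}, a hypothetical stand-in for
        the SDRF 'comment[data file]' -> 'source name' relationship, so
        that fractions and runs of one sample are summed together.
    """
    summed = defaultdict(float)
    for protein, data_file, intensity in peptide_rows:
        sample = run_to_sample[data_file]
        summed[(protein, sample)] += intensity

    ibaq = {}
    for (protein, sample), total in summed.items():
        n_theoretical = count_tryptic_peptides(protein_sequences[protein])
        if n_theoretical > 0:  # skip proteins with no observable peptides
            ibaq[(protein, sample)] = total / n_theoretical
    return ibaq


if __name__ == "__main__":
    sequences = {"P1": "AAAAAAKCCCCCCRDDK"}  # toy sequence, two peptides >= 6 aa
    runs = {"run_f1.raw": "HeLa_1", "run_f2.raw": "HeLa_1"}  # two fractions
    rows = [("P1", "run_f1.raw", 100.0), ("P1", "run_f2.raw", 300.0)]
    print(compute_ibaq(rows, sequences, runs))
```

In this toy input the two fractions map to the same sample, so their intensities are summed before dividing by the peptide count. Normalization and batch correction, which the abstract describes as metadata-driven steps, are deliberately left out of the sketch.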

Declaration of competing interest: None.