Result: CIA: unveiling cellular identities with cluster-independent annotation in single-cell RNA sequencing data for comprehensive cell type characterization and exploration.

Title:
CIA: unveiling cellular identities with cluster-independent annotation in single-cell RNA sequencing data for comprehensive cell type characterization and exploration.
Authors:
Ferrari I; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.; Department of Biosciences, Università Degli Studi di Milano, Milan, Italy.
Battistella M; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.; Department of Biosciences, Università Degli Studi di Milano, Milan, Italy.
Vincenti F; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.
Gobbini A; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.
Marini F; Institute of Medical Biostatistics, Epidemiology and Informatics, Mainz, Germany.
Notarbartolo S; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.; Infectious Diseases Unit, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy.
Costanza J; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.
Biffo S; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.; Department of Biosciences, Università Degli Studi di Milano, Milan, Italy.
Grifantini R; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.
Abrignani S; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy.; Department of Clinical Sciences and Community Health, Università Degli Studi di Milano, Milan, Italy.
Galeota E; Fondazione Istituto Nazionale Di Genetica Molecolare 'Romeo ed Enrica Invernizzi' (INGM), Milan, Italy. galeota@ingm.org.
Source:
BMC bioinformatics [BMC Bioinformatics] 2025 Dec 17. Date of Electronic Publication: 2025 Dec 17.
Publication Model:
Ahead of Print
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: BioMed Central; Country of Publication: England; NLM ID: 100965194; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1471-2105 (Electronic); Linking ISSN: 14712105; NLM ISO Abbreviation: BMC Bioinformatics; Subsets: MEDLINE
Imprint Name(s):
Original Publication: [London] : BioMed Central, 2000-
Grant Information:
PE00000007 INF-ACT NextGenerationEU, European Union
Contributed Indexing:
Keywords: Cell-type annotation; Classifier; Clustering-free; Functional analysis; Gene signature; Scoring; scRNA-seq
Entry Date(s):
Date Created: 20251217 Latest Revision: 20251217
Update Code:
20251218
DOI:
10.1186/s12859-025-06320-z
PMID:
41408153
Database:
MEDLINE

Further Information

Background: Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of the transcriptional landscape of complex tissues, enabling the discovery of novel cell types and biological functions. However, the identification and classification of cells from scRNA-seq datasets remain significant challenges.
Results: To address this, we developed a new computational tool called CIA (Cluster Independent Annotation), which accurately identifies cell types across different datasets without requiring a fully annotated reference dataset or complex machine learning processes. Based on predefined cell type signatures, CIA provides a highly user-friendly and practical solution to cell-type and functional annotation of single cells. The CIA framework is implemented in both the Python and R programming languages, making it applicable to all main single-cell analysis frameworks, and it is available under the MIT license with its documentation at the following links: Python package: https://pypi.org/project/cia-python/. Python tutorial: https://cia-python.readthedocs.io/en/latest/tutorial/Cluster_Independent_Annotation.html. R package and tutorial: https://github.com/ingmbioinfo/CIA_R.
Conclusions: Our results demonstrate that CIA's classification performance is comparable to that of other state-of-the-art approaches, while CIA requires significantly less running time. Overall, CIA simplifies the process of obtaining reproducible signature-based cell assignments that can be easily interpreted through graphical summaries, providing researchers with a powerful tool to explore the complex transcriptional landscape of single cells.
(© 2025. The Author(s).)
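
Illustrative sketch: the abstract describes assigning each cell a label from predefined cell type signatures, without clustering. The Python example below shows that general idea in its simplest form and is not the CIA implementation or its API. The function names (`signature_scores`, `assign_labels`), the scoring rule (mean expression of signature genes), the `min_margin` threshold, and the toy expression values are all assumptions made for illustration; the marker genes are commonly used examples, not signatures taken from the paper.

```python
import numpy as np

def signature_scores(expr, genes, signatures):
    """Return signature names and a (cells x signatures) score matrix.

    Assumption: the score is the mean expression of the signature genes
    found in `genes`. CIA's actual scoring may differ.
    """
    gene_index = {g: i for i, g in enumerate(genes)}
    names = list(signatures)
    scores = np.zeros((expr.shape[0], len(names)))
    for j, name in enumerate(names):
        cols = [gene_index[g] for g in signatures[name] if g in gene_index]
        if cols:  # signatures with no matching genes keep a score of 0
            scores[:, j] = expr[:, cols].mean(axis=1)
    return names, scores

def assign_labels(names, scores, min_margin=0.0):
    """Label each cell with its top-scoring signature (needs >= 2 signatures).

    Cells whose best and second-best scores differ by `min_margin` or less
    are marked "Unassigned" (a hypothetical rule for ambiguous cells).
    """
    order = np.argsort(scores, axis=1)
    best, second = order[:, -1], order[:, -2]
    rows = np.arange(scores.shape[0])
    margin = scores[rows, best] - scores[rows, second]
    labels = np.array([names[i] for i in best], dtype=object)
    labels[margin <= min_margin] = "Unassigned"
    return labels

# Toy data (made up): 4 cells x 6 genes, normalized expression values.
genes = ["CD3E", "CD3D", "MS4A1", "CD79A", "LYZ", "CD14"]
expr = np.array([
    [5, 4, 0, 0, 0, 1],  # T-cell-like profile
    [0, 0, 6, 5, 1, 0],  # B-cell-like profile
    [0, 1, 0, 0, 7, 6],  # monocyte-like profile
    [1, 1, 1, 1, 1, 1],  # ambiguous profile
], dtype=float)
signatures = {
    "T cell": ["CD3E", "CD3D"],
    "B cell": ["MS4A1", "CD79A"],
    "Monocyte": ["LYZ", "CD14"],
}

names, scores = signature_scores(expr, genes, signatures)
labels = assign_labels(names, scores, min_margin=0.5)
print(list(labels))
```

Because every cell is scored independently, the labels do not depend on any clustering step or on an annotated reference dataset, only on the supplied signatures.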

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.