Result: Optimizing fMRI Data Acquisition for Decoding Natural Speech with Limited Participants

Title:
Optimizing fMRI Data Acquisition for Decoding Natural Speech with Limited Participants
Contributors:
Neuroimagerie cognitive - Psychologie cognitive expérimentale (UNICOG-U992), Service NEUROSPIN (NEUROSPIN), Université Paris-Saclay-Institut des Sciences du Vivant Frédéric JOLIOT (JOLIOT), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Institut des Sciences du Vivant Frédéric JOLIOT (JOLIOT), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Institut National de la Santé et de la Recherche Médicale (INSERM), Laboratoire de sciences cognitives et psycholinguistique (LSCP), Département d'Etudes Cognitives - ENS-PSL (DEC), École normale supérieure - Paris (ENS-PSL), Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-École normale supérieure - Paris (ENS-PSL), Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS), Karavela, Modèles et inférence pour les données de Neuroimagerie (MIND), IFR49 - Neurospin - CEA, Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Centre Inria de l'Université Paris-Saclay, Centre Inria de Saclay, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre Inria de Saclay, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), This work was performed using HPC resources from GENCI-IDRIS (Grant 2024-AD011016055)., ANR-24-CE17-1427,SpeakOut,Une interface cortex-machine pour restaurer la communication parlée(2024), European Project: 101147319,HORIZON-INFRA-2022-SERV-B-01,HORIZON-INFRA-2022-SERV-B-01,EBRAINS 2.0(2024)
Source:
NeurIPS Workshop 2025 : Foundation Models for the Brain and Body., Dec 2025, San Diego (CA), United States
Publisher Information:
CCSD, 2025.
Publication Year:
2025
Collection:
collection:CEA
collection:ENS-PARIS
collection:CNRS
collection:INRIA
collection:EHESS
collection:INRIA-SACLAY
collection:LSCP
collection:DEC
collection:INRIA_TEST
collection:TESTALAIN1
collection:INRIA2
collection:GENCI
collection:CEA-UPSAY
collection:PSL
collection:UNIV-PARIS-SACLAY
collection:JOLIOT
collection:CEA-DRF
collection:NEUROSPIN
collection:ENS-PSL
collection:UNIVERSITE-PARIS-SACLAY
collection:ANR
collection:GS-COMPUTER-SCIENCE
collection:GS-LIFE-SCIENCES-HEALTH
Original Identifier:
HAL: hal-05368522
Document Type:
conferenceObject (Conference papers)
Language:
English
Relation:
https://doi.org/10.48550/arXiv.2505.21304; info:eu-repo/grantAgreement//101147319/EU/EBRAINS 2.0: A Research Infrastructure to Advance Neuroscience and Brain Health/EBRAINS 2.0
Rights:
info:eu-repo/semantics/OpenAccess
URL: http://creativecommons.org/licenses/by/
Accession Number:
edshal.hal.05368522v1
Database:
HAL

Further information

We present a systematic investigation into decoding perceived natural speech from fMRI data in a participant-limited setting. Using a publicly available dataset of eight participants (LeBel et al., 2023), we demonstrate that deep neural networks trained with a contrastive objective can effectively decode unseen natural speech by retrieving the embedding of perceived sentences from fMRI activity. We found that decoding performance directly correlates with the amount of training data available per participant. In this data regime, multi-subject training does not improve decoding accuracy compared to the single-subject approach. Additionally, training on similar or different stimuli across subjects has a negligible effect on decoding accuracy. Finally, we find that our decoders model both syntactic and semantic features, and that stories containing sentences with complex syntax or rich semantic content are more challenging to decode. While our results demonstrate the benefits of having extensive data per participant (deep phenotyping), they suggest that leveraging multi-subject data for natural speech decoding likely requires deeper phenotyping or a substantially larger cohort.
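The retrieval setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the loss, the architecture, or the evaluation metric, so the symmetric InfoNCE loss, the temperature value, the top-k accuracy metric, and every function name below are assumptions chosen for illustration. Here `brain` stands for decoder outputs computed from fMRI windows and `text` for the embeddings of the corresponding perceived sentences, with row i of one matching row i of the other.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(z, axis):
    """Numerically stable log-softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))


def contrastive_loss(brain, text, temperature=0.1):
    """Symmetric InfoNCE loss (assumed form): row i of `brain` pairs with row i of `text`."""
    sim = l2_normalize(brain) @ l2_normalize(text).T / temperature
    idx = np.arange(sim.shape[0])
    loss_brain_to_text = -log_softmax(sim, axis=1)[idx, idx].mean()
    loss_text_to_brain = -log_softmax(sim, axis=0)[idx, idx].mean()
    return 0.5 * (loss_brain_to_text + loss_text_to_brain)


def topk_retrieval_accuracy(brain, text, k=1):
    """Fraction of rows whose true sentence embedding ranks in the top k candidates."""
    sim = l2_normalize(brain) @ l2_normalize(text).T
    true_scores = np.diag(sim)[:, None]
    # Rank of the true candidate = number of candidates scoring strictly higher.
    rank = (sim > true_scores).sum(axis=1)
    return float((rank < k).mean())


# Synthetic stand-in data (hypothetical): 8 sentence embeddings of dimension 16,
# and "decoded" vectors that are noisy copies of them.
rng = np.random.default_rng(0)
text = rng.normal(size=(8, 16))
brain = text + 0.01 * rng.normal(size=(8, 16))

print("loss:", contrastive_loss(brain, text))
print("top-1 accuracy:", topk_retrieval_accuracy(brain, text, k=1))
```

In this framing, decoding unseen speech amounts to ranking a pool of candidate sentence embeddings by similarity to the decoder output, which is why accuracy is reported here as a retrieval score rather than as generated text.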