1

Switchable Deep Beamformer for High-quality and Real-time Passive Acoustic Mapping
Yi Zeng ; Jinwei Li ; Hui Zhu ; et al.
Ultrasound in Medicine & Biology. 51:1901-1914

FOS: Computer and inform... Computer Science - Machi... Sound (cs.SD) Artificial Intelligence... Computer Science - Artif... Audio and Speech Process...
Academic journal
2

Semantic Communication for the Internet of Sounds: Architecture, Design Principles, and Challenges
Liang, Chengsi ; Sun, Yao ; Thomas, Christo Kurisummoottil ; et al.
IEEE Wireless Communications. 32:188-195

Audio and Speech Process... FOS: Electrical engineer... Electrical Engineering a...
Academic journal
3

Sparse wavefield reconstruction and denoising with boostlets
Zea, Elias ; Laudato, Marco ; Andén, Joakim
2025 International Conference on Sampling Theory and Applications (SampTA). :1-5

FOS: Computer and inform... Sound (cs.SD) Beräkningsmatematik wavefields Signalbehandling Fluid Mechanics
Academic journal
4

Analyzing the relationships between pretraining language, phonetic, tonal, and speaker information in self-supervised speech models
Gubian, Michele ; Krehan, Ioana ; Liu, Oli ; et al.

Computer Science - Compu... Electrical Engineering a...
Report
5

FairASR: Fair Audio Contrastive Learning for Automatic Speech Recognition
Kim, Jongsuk ; Yu, Jaemyung ; Kwon, Minchan ; et al.

Electrical Engineering a...
Report
6

Description and Discussion on DCASE 2025 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes
Yasuda, Masahiro ; Nguyen, Binh Thien ; Harada, Noboru ; et al.

Computer Science - Sound Electrical Engineering a...
Report
7

Robust Unsupervised Adaptation of a Speech Recogniser Using Entropy Minimisation and Speaker Codes
van Dalen, Rogier C. ; Zhang, Shucong ; Parcollet, Titouan ; et al.
Interspeech 2025

Electrical Engineering a... Computer Science - Compu... Computer Science - Machi...
Report
8

Joint ASR and Speaker Role Tagging with Serialized Output Training
Xu, Anfeng ; Feng, Tiantian ; Narayanan, Shrikanth

Electrical Engineering a... Computer Science - Sound
Report
9

AC/DC: LLM-based Audio Comprehension via Dialogue Continuation
Fujita, Yusuke ; Mizumoto, Tomoya ; Kojima, Atsushi ; et al.

Electrical Engineering a... Computer Science - Compu... Computer Science - Sound
Report
10

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs
Futami, Hayato ; Tsunoo, Emiru ; Kashiwagi, Yosuke ; et al.

Computer Science - Compu... Computer Science - Sound Electrical Engineering a...
Report
11

RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
Liu, Yisi ; Wang, Chenyang ; Kim, Hanjo ; et al.

Electrical Engineering a... Computer Science - Artif...
Report
12

Fine-Grained control over Music Generation with Activation Steering
Panda, Dipanshu ; Joe, Jayden Koshy ; R, Harshith M ; et al.

Computer Science - Sound Computer Science - Artif... Electrical Engineering a...
Report
13

The 2025 PNPL Competition: Speech Detection and Phoneme Classification in the LibriBrain Dataset
Landau, Gilad ; Özdogan, Miran ; Elvers, Gereon ; et al.
NeurIPS 2025 Competition Track

Computer Science - Machi... Computer Science - Sound Electrical Engineering a...
Report
14

Training-Free Voice Conversion with Factorized Optimal Transport
Lobashev, Alexander ; Yermekova, Assel ; Larchenko, Maria

Computer Science - Sound Computer Science - Compu... Computer Science - Machi... Electrical Engineering a...
Report
15

Recognizing Every Voice: Towards Inclusive ASR for Rural Bhojpuri Women
Joshi, Sakshi ; George, Eldho Ittan ; Javed, Tahir ; et al.

Electrical Engineering a...
Report
16

A Study on Speech Assessment with Visual Cues
Ahmed, Shafique ; Zezario, Ryandhimas E. ; Saleem, Nasir ; et al.

Electrical Engineering a... Computer Science - Sound Electrical Engineering a...
Report
17

You Are What You Say: Exploiting Linguistic Content for VoicePrivacy Attacks
Gaznepoglu, Ünal Ege ; Leschanowsky, Anna ; Aloradi, Ahmad ; et al.

Electrical Engineering a... Computer Science - Compu...
Report
18

OWSM-Biasing: Contextualizing Open Whisper-Style Speech Models for Automatic Speech Recognition with Dynamic Vocabulary
Sudo, Yui ; Fujita, Yusuke ; Kojima, Atsushi ; et al.

Computer Science - Sound Computer Science - Compu... Electrical Engineering a...
Report
19

Ming-Omni: A Unified Multimodal Model for Perception and Generation
AI, Inclusion ; Gong, Biao ; Zou, Cheng ; et al.

Computer Science - Artif... Computer Science - Compu... Computer Science - Compu... Computer Science - Machi... Computer Science - Sound Electrical Engineering a...
Report
20

A Technique for Isolating Lexically-Independent Phonetic Dependencies in Generative CNNs
Šegedin, Bruno Ferenc

Computer Science - Compu... Computer Science - Sound Electrical Engineering a...
Report