Result: Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks

Title:

Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks

Authors:

WÖLLMER, Martin; SCHULLER, Björn

Source:

Neurocomputing (Amsterdam). 132:113-120

Publisher Information:

Amsterdam: Elsevier, 2014.

Publication Year:

2014

Physical Description:

print, 31 ref

Original Material:

INIST-CNRS

Subject Terms:

Cognition, Computer science, Exact sciences and technology, Physics, Fundamental areas of phenomenology (including applications), Acoustics, Acoustic signal processing, Applied sciences, Computer science; control theory; systems, Software, Memory organisation. Data processing, Data processing. List processing. Character string processing, Artificial intelligence, Speech and sound recognition and synthesis. Linguistics, Biological and medical sciences, Fundamental and applied biological sciences. Psychology, Psychology. Psychophysiology, Language, Production and perception of spoken language, Psychology. Psychoanalysis. Psychiatry, Probabilistic approach, Traffic congestion, Context, Conversation, Short term, Selection criterion, Decoding, Memory effect, Pattern extraction, User interface, Natural language, Long term, Modeling, Verbal perception, Phoneme, Phonetics, Verbal production, Long distance propagation, Pattern recognition, Speech recognition, Neural network, Context aware, Spontaneous, Speech processing, Feature extraction, Recurrent neural nets, Bidirectional speech processing, Bottleneck networks, Long Short-Term Memory, Probabilistic feature extraction

Document Type:

Conference Paper

File Description:

text

Language:

English

Author Affiliations:

Institute for Human-Machine Communication, Technische Universität München, Theresienstr. 90, 80333 München, Germany

ISSN:

0925-2312

Access URL:

http://pascal-francis.inist.fr/vibad/index.php?action=search&terms=28282818

Rights:

Copyright 2015 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS

Notes:

Computer science; theoretical automation; systems

Physics: acoustics

Psychology. Ethology

FRANCIS

Accession Number:

edscal.28282818

Database:

PASCAL Archive

Further Information

We introduce a novel context-sensitive feature extraction approach for spontaneous speech recognition. As bidirectional Long Short-Term Memory (BLSTM) networks are known to enable improved phoneme recognition accuracies by incorporating long-range contextual information into speech decoding, we integrate the BLSTM principle into a Tandem front-end for probabilistic feature extraction. Unlike the previously proposed approaches which exploit BLSTM modeling by generating a discrete phoneme prediction feature, our feature extractor merges continuous high-level probabilistic BLSTM features with low-level features. By combining BLSTM modeling and Bottleneck (BN) feature generation, we propose a novel front-end that allows us to produce context-sensitive probabilistic feature vectors of arbitrary size, independent of the network training targets. Evaluations on challenging spontaneous, conversational speech recognition tasks show that this concept prevails over recently published architectures for feature-level context modeling.
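The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain tanh recurrent layer in place of the BLSTM, untrained random weights, and hypothetical names and sizes (`run_rnn`, `tandem_features`, 39-dimensional low-level frames, a bottleneck of 6 units). It only shows how a forward and a backward pass over the utterance can be projected to a bottleneck of freely chosen size and appended to each low-level frame.

```python
import math
import random

def matvec(vec, mat):
    """Multiply a row vector (length n) by a matrix given as n rows of m columns."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def run_rnn(frames, w_in, w_rec):
    """Plain tanh RNN (stand-in for an LSTM layer); one hidden vector per frame."""
    hidden = [0.0] * len(w_rec)
    outputs = []
    for frame in frames:
        from_input = matvec(frame, w_in)
        from_state = matvec(hidden, w_rec)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        outputs.append(hidden)
    return outputs

def tandem_features(frames, fwd, bwd, w_bn):
    """Append bottleneck activations to each low-level frame."""
    forward = run_rnn(frames, *fwd)
    # Process the reversed sequence, then restore the original time order.
    backward = run_rnn(frames[::-1], *bwd)[::-1]
    merged = []
    for frame, f, b in zip(frames, forward, backward):
        bottleneck = matvec(f + b, w_bn)  # size set by w_bn, not by training targets
        merged.append(list(frame) + bottleneck)
    return merged

def random_matrix(rng, rows, cols, scale=0.1):
    """Untrained placeholder weights; a real front-end would use trained ones."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
T, D, H, BN = 20, 39, 8, 6  # frames, low-level dim, hidden units, bottleneck size (all assumed)
frames = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(T)]
fwd = (random_matrix(rng, D, H), random_matrix(rng, H, H))
bwd = (random_matrix(rng, D, H), random_matrix(rng, H, H))
w_bn = random_matrix(rng, 2 * H, BN)

features = tandem_features(frames, fwd, bwd, w_bn)
print(len(features), len(features[0]))  # 20 45
```

Each output vector has `D + BN` entries: the original low-level frame followed by the bottleneck activations. Changing `BN` changes the feature size without touching any output layer, which is the property the abstract refers to as independence from the network training targets.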
