
Robust Automatic Speech Recognition
A Bridge to Practical Applications
- 1st Edition - October 12, 2015
- Imprint: Academic Press
- Authors: Jinyu Li, Li Deng, Reinhold Haeb-Umbach, Yifan Gong
- Language: English
- Hardback ISBN: 978-0-12-802398-3
- eBook ISBN: 978-0-12-802616-8
Robust Automatic Speech Recognition: A Bridge to Practical Applications establishes a solid foundation for automatic speech recognition that is robust against acoustic environmental distortion. It provides a thorough overview of classical and modern noise- and reverberation-robust techniques developed over the past thirty years, with an emphasis on practical methods that have proven successful and are likely to be developed further for future applications.

The strengths and weaknesses of robustness-enhancing speech recognition techniques are carefully analyzed. The book covers noise-robust techniques designed for acoustic models based on both Gaussian mixture models and deep neural networks. In addition, it provides a guide to selecting the best methods for practical applications.

The reader will:
- Gain a unified, deep and systematic understanding of the state-of-the-art technologies for robust speech recognition
- Learn the links and relationships between alternative technologies for robust speech recognition
- Be able to use the technology analysis and categorization detailed in the book to guide future technology development
- Be able to develop new noise-robust methods in the current era of deep learning for acoustic modeling in speech recognition
Key features:
- The first book to provide a comprehensive review of noise- and reverberation-robust speech recognition methods in the era of deep neural networks
- Connects robust speech recognition techniques to machine learning paradigms with rigorous mathematical treatment
- Provides elegant and structural ways to categorize and analyze noise-robust speech recognition techniques
- Written by leading researchers who have been actively working on the subject matter in both industrial and academic organizations for many years
Intended for researchers and engineers in speech processing, in both industry and academia, and for undergraduate and graduate students in signal and speech processing.
- About the Authors
- List of Figures
- List of Tables
- Acronyms
- Notations
- Chapter 1: Introduction
  - Abstract
  - 1.1 Automatic Speech Recognition
  - 1.2 Robustness to Noisy Environments
  - 1.3 Existing Surveys in the Area
  - 1.4 Book Structure Overview
- Chapter 2: Fundamentals of speech recognition
  - Abstract
  - 2.1 Introduction: Components of Speech Recognition
  - 2.2 Gaussian Mixture Models
  - 2.3 Hidden Markov Models and the Variants
  - 2.4 Deep Learning and Deep Neural Networks
  - 2.5 Summary
- Chapter 3: Background of robust speech recognition
  - Abstract
  - 3.1 Standard Evaluation Databases
  - 3.2 Modeling Distortions of Speech in Acoustic Environments
  - 3.3 Impact of Acoustic Distortion on Gaussian Modeling
  - 3.4 Impact of Acoustic Distortion on DNN Modeling
  - 3.5 A General Framework for Robust Speech Recognition
  - 3.6 Categorizing Robust ASR Techniques: An Overview
  - 3.7 Summary
- Chapter 4: Processing in the feature and model domains
  - Abstract
  - 4.1 Feature-Space Approaches
  - 4.2 Model-Space Approaches
  - 4.3 Summary
- Chapter 5: Compensation with prior knowledge
  - Abstract
  - 5.1 Learning from Stereo Data
  - 5.2 Learning from Multi-Environment Data
  - 5.3 Summary
- Chapter 6: Explicit distortion modeling
  - Abstract
  - 6.1 Parallel Model Combination
  - 6.2 Vector Taylor Series
  - 6.3 Sampling-Based Methods
  - 6.4 Acoustic Factorization
  - 6.5 Summary
- Chapter 7: Uncertainty processing
  - Abstract
  - 7.1 Model-Domain Uncertainty
  - 7.2 Feature-Domain Uncertainty
  - 7.3 Joint Uncertainty Decoding
  - 7.4 Missing-Feature Approaches
  - 7.5 Summary
- Chapter 8: Joint model training
  - Abstract
  - 8.1 Speaker Adaptive and Source Normalization Training
  - 8.2 Model Space Noise Adaptive Training
  - 8.3 Joint Training for DNN
  - 8.4 Summary
- Chapter 9: Reverberant speech recognition
  - Abstract
  - 9.1 Introduction
  - 9.2 Acoustic Impulse Response
  - 9.3 A Model of Reverberated Speech in Different Domains
  - 9.4 The Effect of Reverberation on ASR Performance
  - 9.5 Linear Filtering Approaches
  - 9.6 Magnitude or Power Spectrum Enhancement
  - 9.7 Feature Domain Approaches
  - 9.8 Acoustic Model Domain Approaches
  - 9.9 The REVERB Challenge
  - 9.10 To Probe Further
  - 9.11 Summary
- Chapter 10: Multi-channel processing
  - Abstract
  - 10.1 Introduction
  - 10.2 The Acoustic Beamforming Problem
  - 10.3 Fundamentals of Data-Dependent Beamforming
  - 10.4 Multi-Channel Speech Recognition
  - 10.5 To Probe Further
  - 10.6 Summary
- Chapter 11: Summary and future directions
  - Abstract
  - 11.1 Robust Methods in the Era of GMM
  - 11.2 Robust Methods in the Era of DNN
  - 11.3 Multi-Channel Input and Robustness to Reverberation
  - 11.4 Epilogue
- Index

- No. of pages: 306
Jinyu Li
Li Deng
Reinhold Haeb-Umbach