Speech Signal Processing Based on Deep Learning in Complex Acoustic Environments
- 1st Edition - September 4, 2024
- Author: Xiao-Lei Zhang
- Language: English
- Paperback ISBN: 978-0-443-24856-6
- eBook ISBN: 978-0-443-24857-3
Speech Signal Processing Based on Deep Learning in Complex Acoustic Environments provides a detailed discussion of deep learning-based robust speech processing and its applic…
- Provides a comprehensive introduction to the development of deep learning-based robust speech processing
- Covers speech detection, speech enhancement, dereverberation, multi-speaker speech separation, robust speaker verification, and robust speech recognition
- Opens with a historical overview, then covers methods that demonstrate outstanding performance in practical applications
- Cover image
- Title page
- Table of Contents
- Copyright
- Chapter 1: Introduction of the book
- Abstract
- Chapter 2: Fundamentals of deep learning
- Abstract
- 2.1. Supervised learning
- 2.2. Single-layer neural network
- 2.3. Feedforward deep neural network
- 2.4. Recurrent neural networks (RNN)
- 2.5. Convolutional neural networks
- 2.6. Normalization in neural networks
- 2.7. Attention mechanism in neural networks
- 2.8. Generative adversarial networks
- 2.9. Summary of this chapter
- References
- Chapter 3: Voice activity detection
- Abstract
- 3.1. Introduction
- 3.2. Fundamental knowledge
- 3.3. Voice activity detection models
- 3.4. Loss function of voice activity detection models
- 3.5. Acoustic features for voice activity detection
- 3.6. Generalization ability of models
- 3.7. Summary of this chapter
- References
- Chapter 4: Single-channel speech enhancement
- Abstract
- 4.1. Introduction
- 4.2. Fundamental knowledge
- 4.3. Frequency domain speech enhancement
- 4.4. Time domain speech enhancement
- 4.5. Summary of this chapter
- References
- Chapter 5: Multichannel speech enhancement
- Abstract
- 5.1. Introduction
- 5.2. Signal models
- 5.3. Spatial feature extraction
- 5.4. Beamforming methods
- 5.5. Ad-hoc microphone array method
- 5.6. Summary of this chapter
- References
- Chapter 6: Multispeaker speech separation
- Abstract
- 6.1. Introduction
- 6.2. Signal models
- 6.3. Speaker-dependent speech separation methods
- 6.4. Speaker-independent speech separation
- 6.5. Summary of this chapter
- References
- Chapter 7: Speaker recognition
- Abstract
- 7.1. Introduction
- 7.2. Speaker verification
- 7.3. Speaker diarization
- 7.4. Robust speaker verification
- 7.5. Summary of this chapter
- References
- Chapter 8: Speech recognition
- Abstract
- 8.1. Introduction
- 8.2. Speech recognition fundamentals
- 8.3. Evaluation metrics
- 8.4. End-to-end speech recognition
- 8.5. Noise-robust methods for speech recognition
- 8.6. Speaker adaptation
- 8.7. Summary of this chapter
- References
- Index
- No. of pages: 400
- Language: English
- Edition: 1
- Published: September 4, 2024
- Imprint: Elsevier
- Paperback ISBN: 9780443248566
- eBook ISBN: 9780443248573
Xiao-Lei Zhang
Xiao-Lei Zhang received his Ph.D. degree with honors from Tsinghua University, China, in 2012. He was a postdoctoral researcher with the Department of Electronic Engineering at Tsinghua University from 2012 to 2014. He was a visiting scholar at The Ohio State University, USA, from 2013 to 2014 and a postdoctoral researcher with the Department of Computer Science and Engineering, The Ohio State University, from 2014 to 2016. Since 2016, he has been a full professor at Northwestern Polytechnical University, Xi'an, China.
His research interests include speech processing, machine learning, statistical signal processing, and artificial intelligence.