
Deep Learning
From Algorithmic Essence to Industrial Practice
- 1st Edition - July 18, 2025
- Imprint: Elsevier
- Authors: Shuhao Wang, Gang Xu
- Language: English
- Paperback ISBN: 978-0-443-43954-4
- eBook ISBN: 978-0-443-43955-1
Deep Learning: From Algorithmic Essence to Industrial Practice serves as a comprehensive guide, bridging the gap between foundational theories and real-world implementation. The book delves into the mechanisms behind deep learning models, breaking down complex algorithms into digestible explanations. Engineering students, AI enthusiasts, and professionals alike will find this resource invaluable as it navigates the intricacies of training neural networks, optimizing performance, and effectively deploying them in various sectors. Beyond theoretical knowledge, the book emphasizes practical applications, showcasing how deep learning powers advancements in fields like healthcare, finance, and autonomous systems.
It also discusses the challenges of scaling models, the ethical considerations surrounding AI, and the future trajectory of this transformative technology. With its blend of academic rigor and industrial insights, this book equips readers with the tools to innovate and lead in the ever-evolving world of artificial intelligence.
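For a flavor of that hands-on emphasis, the sketch below (illustrative only, not excerpted from the book) shows the optimization loop at the heart of model training, as introduced in Sections 1.2.3 and 1.2.4: fitting a linear model to data by gradient descent on the squared error.

```python
# Illustrative sketch, not from the book: gradient descent on squared error
# for a linear model y = w*x + b (cf. Sections 1.2.3-1.2.4).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=100)  # noisy targets

w, b, lr = 0.0, 0.0, 0.1
for step in range(500):
    err = (w * x + b) - y              # prediction error per sample
    grad_w = 2.0 * np.mean(err * x)    # d(mean squared error)/dw
    grad_b = 2.0 * np.mean(err)        # d(mean squared error)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")  # converges near w=3.0, b=0.5
```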
- Provides in-depth explanations and practical code examples for the latest deep learning architectures, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers (see the illustrative sketch after this list)
- Examines theoretical concepts and the engineering practices required for deploying deep learning models in real-world scenarios
- Includes detailed case studies and applications of deep learning models in various domains such as image classification, object detection, semantic segmentation, etc.
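As a taste of the convolutional architectures listed above, the following minimal PyTorch sketch (illustrative, not an excerpt from the book) builds a small image classifier of the kind developed in Chapters 2 and 9.

```python
# Illustrative sketch, not from the book: a tiny convolutional classifier.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # RGB input
            nn.ReLU(),
            nn.MaxPool2d(2),                             # halve spatial size
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # global average pool
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

model = TinyCNN()
logits = model(torch.randn(4, 3, 32, 32))  # batch of four 32x32 RGB images
print(logits.shape)                        # torch.Size([4, 10])
```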
Senior-level students and researchers interested in computer science, automation, electronics, communications, mathematics, and physics
1: Neural Networks
1.1 Opening the Door to Deep Learning
1.2 Starting from Optimization Problems
1.2.1 Newton and Kepler's Dialogue
1.2.2 Mathematical Models for Fitting and Classification
1.2.3 Optimizing Model Parameters with Training Data
1.2.4 Optimization Methods
1.3 Deep Neural Networks
1.3.1 Feature Extraction
1.3.2 Artificial Neurons and Activation Functions
1.3.3 The Mathematical Essence of Neural Networks
1.4 Regularization Methods
1.4.1 Underfitting and Overfitting
1.4.2 Regularization Techniques
1.4.3 Training Tips
1.5 Model Evaluation
1.5.1 Importance of Evaluation Metrics
1.5.2 Confusion Matrix
1.5.3 Typical Evaluation Metrics
1.6 The Limits of Deep Learning
1.6.1 Development Stages in Various Fields
1.6.2 Tasks Not Suitable for Current Deep Learning Techniques
1.6.3 The Future of Deep Learning
1.7 Exercises
2: Convolutional Neural Networks – Image Classification and Object Detection
2.1 Basic Concepts of Convolution
2.1.1 Definition of Convolution
2.1.2 Essence of Convolution
2.1.3 Important Parameters of Convolution
2.1.4 Pooling Layers
2.2 Convolutional Neural Networks
2.2.1 Typical Convolutional Neural Networks
2.2.2 LeNet
2.2.3 AlexNet
2.2.4 VGGNet
2.2.5 ResNet
2.2.6 Capability Comparison
2.3 Object Detection
2.3.1 R-CNN
2.3.2 Fast R-CNN
2.3.3 Faster R-CNN
2.3.4 YOLO
2.4 Exercises
3: Convolutional Neural Networks – Semantic Segmentation
3.1 Basics of Semantic Segmentation
3.1.1 Application Areas
3.1.2 Fully Convolutional Networks
3.1.3 Full Convolution vs. Dilated Convolution
3.1.4 U-Net
3.1.5 DeepLab v1 and v2
3.1.6 DeepLab v3
3.1.7 Combining Architectures – DeepLab v3+
3.2 Model Visualization
3.2.1 Convolution Kernel Visualization
3.2.2 Feature Map Visualization
3.2.3 Representation Vector Visualization
3.2.4 Occlusion Analysis and Salient Gradient Analysis
3.3 Initial Exploration of Pathological Image Segmentation
3.3.1 Pathology – The “Gold Standard” in Medical Diagnosis
3.3.2 Challenges in Pathological AI
3.3.3 Real Model Training Processes
3.4 Self-Supervised Learning
3.4.1 Method Overview
3.4.2 Self-Supervised Learning Algorithms
3.5 Model Training Processes
3.5.1 Cost Functions
3.5.2 Automatic Learning Rate Adjustment
3.5.3 Model Saving and Loading
3.6 Exercises
4: Recurrent Neural Networks
4.1 Basics of Natural Language Processing
4.1.1 Importance of the Time Dimension
4.1.2 Natural Language Processing
4.1.3 Bag of Words
4.1.4 Word Embeddings
4.2 Recurrent Neural Networks
4.2.1 Patterns in Time Series Data Modeling
4.2.2 Basic Structure of Recurrent Neural Networks
4.2.3 LSTM
4.2.4 GRU
4.3 Fraud Detection Based on Conversations
4.3.1 Fraud Patterns
4.3.2 Technical Challenges
4.3.3 Data Preprocessing
4.3.4 Practicing Recurrent Neural Networks
4.4 Speech Recognition and Evaluation
4.4.1 Feature Extraction
4.4.2 Model Structure
4.4.3 CTC Loss Function
4.5 Exercises
5: Distributed Deep Learning Systems
5.1 Distributed Systems
5.1.1 Challenges and Solutions
5.1.2 Master-Slave Architecture
5.1.3 Hadoop and Spark
5.2 Distributed Deep Learning Systems
5.2.1 CPU vs. GPU
5.2.2 Distributed Deep Learning
5.2.3 Communication – Synchronizing Parameters
5.3 Microservices Architecture
5.3.1 Basic Concepts
5.3.2 Message Queues
5.4 Distributed Inference Systems
5.4.1 Deep Learning Inference Frameworks
5.4.2 Inference System Architecture
5.5 Exercises
6: Frontiers of Deep Learning
6.1 Deep Reinforcement Learning
6.1.1 Reinforcement Learning
6.1.2 Deep Reinforcement Learning
6.1.3 Deep Reinforcement Learning in Nintendo Games
6.2 AlphaGo
6.2.1 Why Go Is So Difficult
6.2.2 AlphaGo System Architecture
6.2.3 AlphaGo Zero
6.3 Generative Adversarial Networks
6.3.1 Overview of Generative Adversarial Networks
6.3.2 Typical Generative Adversarial Networks
6.4 Future Directions
6.5 Exercises
7: Special Lectures
7.1 DenseNet
7.2 Inception
7.3 Xception
7.4 ResNeXt
7.5 Transformer
7.6 Exercises
8: Transformer and Its Companions
8.1 Attention Models
8.1.1 Image Captioning
8.1.2 Language Translation
8.1.3 Various Attention Mechanisms
8.2 Transformer
8.2.1 Self-Attention Mechanism and Transformer
8.2.2 Applications of Transformer in Vision
8.3 Exercises
9: Core Practices
9.1 Image Classification
9.1.1 Overview of the ImageNet Dataset
9.1.2 ImageNet Data Exploration and Preprocessing
9.1.3 Model Training
9.1.4 Model Testing
9.1.5 Model Evaluation
9.1.6 Cat and Dog Battle Dataset
9.1.7 Model Export
9.2 Semantic Segmentation
9.2.1 Introduction to Digital Pathology Slides
9.2.2 Digital Pathology Slide Preprocessing
9.2.3 Handling Sample Imbalance
9.2.4 Model Training
9.2.5 Model Testing
9.2.6 Model Export
9.3 Exercises
10: Deep Learning Inference Systems
10.1 Overall Architecture
10.2 Scheduler Module
10.3 Worker Node Module
10.4 Logging Module
10.5 Exercises
Shuhao Wang
Dr. Shuhao Wang received his Ph.D. from Tsinghua University; he is a fellow at the Institute for Interdisciplinary Information Sciences at Tsinghua University and is currently the co-founder and CTO of ‘Thorough Future.’ He has conducted research on data science and artificial intelligence at Baidu, NovuMind, and JD.com, and he holds over 20 national patents.
Dr. Wang has received several key accolades, including the "30 New Generation Digital Economy Talents" award at the 2019 Wuzhen Internet Summit and the Fall 2022 Asia-Pacific Signal and Information Processing Association Industrial Distinguished Leaders award, and he was named one of Alibaba Cloud's "Seeing New Power" figures of 2022.
Affiliations and expertise
Tsinghua University, China
Gang Xu
Dr. Gang Xu received his Ph.D. from Tsinghua University; he is currently an Assistant Professor at the Institute of Complex Systems Multiscale Research at Fudan University. Dr. Xu’s primary research focuses on the application of artificial intelligence in medical imaging and computational biology.
Affiliations and expertise
Fudan University, PR China