
Deep Learning
From Algorithmic Essence to Industrial Practice
- 1st Edition - July 1, 2025
- Imprint: Elsevier
- Authors: Shuhao Wang, Gang Xu
- Language: English
- Paperback ISBN: 978-0-443-43954-4
- eBook ISBN: 978-0-443-43955-1
Deep Learning: From Algorithmic Essence to Industrial Practice introduces the fundamental theories of deep learning, the engineering practices behind them, and their deployment and application in industry. It gives a detailed, code-driven explanation of classic convolutional neural networks, recurrent neural networks, and transformer networks based on self-attention mechanisms, along with their variants. It then covers the application of these models to tasks such as image classification, object detection, and semantic segmentation, and surveys advances in deep reinforcement learning and generative adversarial networks.
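As a taste of the self-attention mechanism the transformer chapters build on, here is a minimal sketch (illustrative only, not code from the book; it omits the learned query/key/value projections a full transformer layer would use):

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of token vectors.

    For brevity, queries, keys, and values are all X itself; a real
    transformer layer would first apply learned linear projections.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise similarities, scaled
    scores -= scores.max(axis=-1, keepdims=True)     # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ X                               # attention-weighted sum of values

# Usage: a sequence of 4 tokens, each an 8-dimensional embedding
tokens = np.random.randn(4, 8)
print(self_attention(tokens).shape)  # (4, 8)
```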
- Provides in-depth explanations and practical code examples for key deep learning architectures, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers
- Examines the theoretical concepts and engineering practices required to deploy deep learning models in real-world scenarios, including the use of distributed systems for training and deployment (see the sketch after this list)
- Includes detailed case studies and applications of deep learning models in domains such as image classification, object detection, and semantic segmentation; these practical examples help readers apply theoretical knowledge to real-world problems
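To make the distributed-training point concrete, here is a minimal sketch of one synchronous data-parallel update (illustrative only; `sync_sgd_step` and its arguments are hypothetical names, not an API from the book):

```python
import numpy as np

def sync_sgd_step(params, worker_grads, lr=0.01):
    """One synchronous data-parallel SGD step: average the gradients each
    worker computed on its own data shard, then apply a single update.
    This mimics the parameter-synchronization pattern discussed in
    Chapter 5 ("Communication - Synchronizing Parameters").
    """
    avg_grad = np.mean(worker_grads, axis=0)  # gradient averaging across workers
    return params - lr * avg_grad

# Usage: 4 workers, each producing a gradient for a 3-parameter model
params = np.zeros(3)
grads = [np.random.randn(3) for _ in range(4)]
params = sync_sgd_step(params, grads)
```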
Senior-level students and researchers interested in computer science, automation, electronics, communications, mathematics, and physics
1: Neural Networks
1.1 Opening the Door to Deep Learning
1.2 Starting from Optimization Problems
1.2.1 Newton and Kepler's Dialogue
1.2.2 Mathematical Models for Fitting and Classification
1.2.3 Optimizing Model Parameters with Training Data
1.2.4 Optimization Methods
1.3 Deep Neural Networks
1.3.1 Feature Extraction
1.3.2 Artificial Neurons and Activation Functions
1.3.3 The Mathematical Essence of Neural Networks
1.4 Regularization Methods
1.4.1 Underfitting and Overfitting
1.4.2 Regularization Techniques
1.4.3 Training Tips
1.5 Model Evaluation
1.5.1 Importance of Evaluation Metrics
1.5.2 Confusion Matrix
1.5.3 Typical Evaluation Metrics
1.6 The Limits of Deep Learning
1.6.1 Development Stages in Various Fields
1.6.2 Tasks Not Suitable for Current Deep Learning Techniques
1.6.3 The Future of Deep Learning
1.7 Exercises
2: Convolutional Neural Networks – Image Classification and Object Detection
2.1 Basic Concepts of Convolution
2.1.1 Definition of Convolution
2.1.2 Essence of Convolution
2.1.3 Important Parameters of Convolution
2.1.4 Pooling Layers
2.2 Convolutional Neural Networks
2.2.1 Typical Convolutional Neural Networks
2.2.2 LeNet
2.2.3 AlexNet
2.2.4 VGGNet
2.2.5 ResNet
2.2.6 Capability Comparison
2.3 Object Detection
2.3.1 R-CNN
2.3.2 Fast R-CNN
2.3.3 Faster R-CNN
2.3.4 YOLO
2.4 Exercises
3: Convolutional Neural Networks – Semantic Segmentation
3.1 Basics of Semantic Segmentation
3.1.1 Application Areas
3.1.2 Fully Convolutional Networks
3.1.3 Full Convolution vs. Dilated Convolution
3.1.4 U-Net
3.1.5 DeepLab v1 and v2
3.1.6 DeepLab v3
3.1.7 Combining Architectures – DeepLab v3+
3.2 Model Visualization
3.2.1 Convolution Kernel Visualization
3.2.2 Feature Map Visualization
3.2.3 Representation Vector Visualization
3.2.4 Occlusion Analysis and Salient Gradient Analysis
3.3 Initial Exploration of Pathological Image Segmentation
3.3.1 Pathology – The “Gold Standard” in Medical Diagnosis
3.3.2 Challenges in Pathological AI
3.3.3 Real Model Training Processes
3.4 Self-Supervised Learning
3.4.1 Method Overview
3.4.2 Self-Supervised Learning Algorithms
3.5 Model Training Processes
3.5.1 Cost Functions
3.5.2 Automatic Learning Rate Adjustment
3.5.3 Model Saving and Loading
3.6 Exercises
4: Recurrent Neural Networks
4.1 Basics of Natural Language Processing
4.1.1 Importance of the Time Dimension
4.1.2 Natural Language Processing
4.1.3 Bag of Words
4.1.4 Word Embeddings
4.2 Recurrent Neural Networks
4.2.1 Patterns in Time Series Data Modeling
4.2.2 Basic Structure of Recurrent Neural Networks
4.2.3 LSTM
4.2.4 GRU
4.3 Fraud Detection Based on Conversations
4.3.1 Fraud Patterns
4.3.2 Technical Challenges
4.3.3 Data Preprocessing
4.3.4 Practicing Recurrent Neural Networks
4.4 Speech Recognition and Evaluation
4.4.1 Feature Extraction
4.4.2 Model Structure
4.4.3 CTC Loss Function
4.5 Exercises
5: Distributed Deep Learning Systems
5.1 Distributed Systems
5.1.1 Challenges and Solutions
5.1.2 Master-Slave Architecture
5.1.3 Hadoop and Spark
5.2 Distributed Deep Learning Systems
5.2.1 CPU vs. GPU
5.2.2 Distributed Deep Learning
5.2.3 Communication – Synchronizing Parameters
5.3 Microservices Architecture
5.3.1 Basic Concepts
5.3.2 Message Queues
5.4 Distributed Inference Systems
5.4.1 Deep Learning Inference Frameworks
5.4.2 Inference System Architecture
5.5 Exercises
6: Frontiers of Deep Learning
6.1 Deep Reinforcement Learning
6.1.1 Reinforcement Learning
6.1.2 Deep Reinforcement Learning
6.1.3 Deep Reinforcement Learning in Nintendo Games
6.2 AlphaGo
6.2.1 Why Go Is So Difficult
6.2.2 AlphaGo System Architecture
6.2.3 AlphaGo Zero
6.3 Generative Adversarial Networks
6.3.1 Overview of Generative Adversarial Networks
6.3.2 Typical Generative Adversarial Networks
6.4 Future Directions
6.5 Exercises
7: Special Lectures
7.1 DenseNet
7.2 Inception
7.3 Xception
7.4 ResNeXt
7.5 Transformer
7.6 Exercises
8: Transformer and Its Companions
8.1 Attention Models
8.1.1 Image Captioning
8.1.2 Language Translation
8.1.3 Various Attention Mechanisms
8.2 Transformer
8.2.1 Self-Attention Mechanism and Transformer
8.2.2 Applications of Transformer in Vision
8.3 Exercises
9: Core Practices
9.1 Image Classification
9.1.1 Overview of the ImageNet Dataset
9.1.2 ImageNet Data Exploration and Preprocessing
9.1.3 Model Training
9.1.4 Model Testing
9.1.5 Model Evaluation
9.1.6 The Dogs vs. Cats Dataset
9.1.7 Model Export
9.2 Semantic Segmentation
9.2.1 Introduction to Digital Pathology Slides
9.2.2 Digital Pathology Slide Preprocessing
9.2.3 Handling Sample Imbalance
9.2.4 Model Training
9.2.5 Model Testing
9.2.6 Model Export
9.3 Exercises
10: Deep Learning Inference Systems
10.1 Overall Architecture
10.2 Scheduler Module
10.3 Worker Node Module
10.4 Logging Module
10.5 Exercises
- Edition: 1
- Published: July 1, 2025
- Imprint: Elsevier
- No. of pages: 250
- Language: English
- Paperback ISBN: 9780443439544
- eBook ISBN: 9780443439551
Shuhao Wang
Dr. Shuhao Wang received his Ph.D. from Tsinghua University; he is a fellow at the Institute for Interdisciplinary Information Sciences at Tsinghua University and is currently the co-founder and CTO of Thorough Future. He has conducted research on data science and artificial intelligence at Baidu, NovuMind, and JD.com, and he holds over 20 national patents.
Dr. Wang has received several key accolades, including the "30 New Generation Digital Economy Talents" award at the 2019 Wuzhen Internet Summit and the Fall 2022 Asia-Pacific Signal and Information Processing Association Industrial Distinguished Leader award, and he was named one of Alibaba Cloud's "Seeing New Power" figures of 2022.
Affiliations and expertise
Tsinghua University, China
Gang Xu
Dr. Gang Xu received his Ph.D. from Tsinghua University; he is currently an Assistant Professor at the Institute of Complex Systems Multiscale Research at Fudan University. Dr. Xu’s primary research focuses on the application of artificial intelligence in medical imaging and computational biology.
Affiliations and expertise
Fudan University, PR China