Deep Learning in Action: Image and Video Processing for Practical Use
- 1st Edition - March 1, 2025
- Language: English
- Paperback ISBN: 978-0-443-30078-3
- eBook ISBN: 978-0-443-30079-0
Artificial intelligence technology has entered an extraordinary phase of fast development and wide application. Techniques developed in traditional AI research areas, such as computer vision and object recognition, have found many innovative applications in an array of real-world settings. General methodological contributions from AI, such as a variety of recently developed deep learning algorithms, have also been applied to a wide spectrum of fields, including surveillance, real-time processing, IoT devices, and health care systems. State-of-the-art deep learning models are highly efficient and widely applicable. Deep Learning in Action: Image and Video Processing for Practical Use provides a comprehensive and accessible resource for intermediate to advanced readers seeking to harness the power of deep learning in the domains of video and image processing. The book bridges the gap between theoretical concepts and practical implementation by emphasizing lightweight approaches, enabling readers to efficiently apply deep learning techniques to real-world scenarios. It focuses on resource-efficient methods, making it particularly relevant in contexts where computational constraints are a concern.
- Provides step-by-step guidance on implementing deep learning techniques, specifically for video and image processing tasks in real-world scenarios
- Emphasizes lightweight and efficient approaches to deep learning, ensuring that readers learn techniques that are suited to resource-constrained environments
- Covers a wide range of real-world applications, such as object detection, image segmentation, and video classification
- Offers a comprehensive understanding of how deep learning can be leveraged across various domains
- Encourages hands-on practice so that readers can apply the concepts to existing projects
Researchers and graduate students in computer science, electrical engineering, machine learning, artificial intelligence, image and video processing
1 Introduction
1.1 Overview
1.2 The Power of Deep Learning in visual processing
1.2.1 Deep Learning Fundamentals
1.2.2 The Evolution of Visual Processing
1.2.3 Deep Neural Network
1.2.4 Overview of Object Detection, Classification, and Segmentation
1.2.5 Computer Vision Applications
1.2.6 Deep Learning for Embedded System
1.3 Understanding Image and Video Data
1.3.1 Overview of the importance of image and video data in various real-world applications
1.3.2 Differentiating between image and video data
1.3.3 Common image formats and their properties
1.4 Importance of Real-world Applications
1.4.1 Bridging the Gap Between Theory and Practice
1.4.2 Solving Real-world Challenges
1.4.3 Industry Relevance and Innovation
1.5 Book outline
2 Image Analysis for Surveillance: Detecting Fire and Smoke Incidents
2.1 State of the art video-based fire/smoke detection
2.1.1 Experiment setup and results
2.1.2 Testing the R-CNN with Raspberry Pi
2.2 Real-time fire and smoke detection
2.2.1 Methodology and results
2.2.2 Executing the prototype on NVIDIA Jetson Nano
3 Enhancing COVID-19 Safety Measures with AI-Powered Video Analysis
3.1 Related work
3.2 Social distancing using YOLOv2
3.2.1 Social distancing workflow
3.2.2 YOLOv2 architecture
3.2.3 Experiment setup
3.2.4 Euclidean formula for measuring the distance
3.2.5 Results and discussion
3.3 Social distancing with YOLOv4-tiny
3.3.1 The proposed approach
3.3.2 Violation threshold
3.3.3 Bird’s-eye view transformation
3.3.4 Experiment setup and results
3.4 Algorithms implementation on the embedded system
3.4.1 Social distancing on NVIDIA devices
3.4.2 Distributed video infrastructure for social distancing
3.5 Integrated approach for monitoring social distancing, face mask, and facial temperature measurement
3.5.1 Dataset and annotation of face masks
3.5.2 Dataset and annotation of Facial Temperature
3.5.3 Experiment setup and results
3.5.4 Implementation of the integrated algorithms on NVIDIA platforms
4 Deep Learning Approaches for Fingerprint Image Restoration
4.1 Background
4.1.1 Deep learning related work for fingerprint
4.2 Autoencoders
4.3 Feature extraction
4.4 Fingerprint dataset
4.5 Sparse autoencoder for image reconstruction
4.5.1 Sparse autoencoder model
4.5.2 Pre-processing the images
4.5.3 Algorithm description
4.5.4 Experiment set up
4.5.5 Efficiency and Parameter Sensitivity
4.6 Recreating fingerprint images by CNN
4.6.1 Convolutional neural network for image reconstruction
4.6.2 CNN algorithm design
4.6.3 Training and validation
4.7 Experiment results and discussion
4.7.1 Evaluation
4.7.2 Comparative analysis
5 Deep Learning for Classification and Localization of Multiple Abnormalities on Chest X-ray Images
5.1 Overview of Diagnosis on Medical images
5.2 Literature Review
5.3 Computer Vision for Medical Image Processing
5.3.1 CNN for Feature Extraction
5.3.2 Role of Deep Learning in Medical Imaging
5.4 Dataset Description
5.4.1 COVID-19 Radiography Database
5.4.2 SIIM-FISABIO-RSNA COVID-19 Database
5.5 Methods
5.5.1 Multi-Classification of Abnormalities on Chest X-ray Images
5.5.2 Localization of Abnormalities on Chest X-ray Images
5.5.3 Ensembled Models for Enhancing Localization of Abnormalities
6 Real-time Stroke Detection Based on Deep Learning and Federated Learning
6.1 Background
6.1.1 Stroke as a Critical Health Issue
6.1.2 Current Challenges in Stroke Detection
6.1.3 The Need for Real-time Detection
6.2 Federated Learning for Healthcare
6.2.1 Understanding Federated Learning
6.2.2 Privacy and Security Concerns in Federated Learning
6.2.3 Benefits and Challenges
6.3 Real-time stroke detection system
6.3.1 Design and Architecture
6.3.2 Data Pre-processing and Augmentation
6.3.3 Integration of Deep Learning and Federated Learning
6.3.4 Distributed Model Training Setup
6.3.5 Model Training and Optimization
6.4 Implementation on NVIDIA Platforms
6.4.1 Utilizing NVIDIA GPUs for Real-time Inference
6.4.2 NVIDIA DeepStream for Real-time Video Analysis
6.4.3 Performance Metrics and Monitoring
7 Efficient Identification of Bag Breakup in Continuous Airflow via Video Analysis
7.1 Overview of Bag Break Detection
7.1.1 Bag Breakup in Continuous Airflow
7.1.2 Factors Affecting Bag Breakup
7.1.3 Importance of Bag Breakup Detection in Automotive Safety
7.1.4 Challenges in Identifying Bag Breakup
7.2 Methodology
7.2.1 Dataset Collection and Description
7.2.2 Data Preprocessing
7.2.3 The proposed Deep learning Models
7.2.4 Training and Validation
7.3 Experimental Results
7.3.1 Evaluation metrics
7.3.2 Comparative analysis
7.3.3 Error Analysis and Misdetection Cases
7 Conclusions and Recommendations
8 Bibliography
- No. of pages: 250
- Language: English
- Edition: 1
- Published: March 1, 2025
- Imprint: Elsevier
- Paperback ISBN: 9780443300783
- eBook ISBN: 9780443300790