Deep Learning in Action: Image and Video Processing for Practical Use

1st Edition - June 6, 2025
Imprint: Elsevier
Authors: Abdussalam Elhanashi, Sergio Saponara
Language: English
Paperback ISBN: 978-0-443-30078-3
eBook ISBN: 978-0-443-30079-0

Artificial intelligence has entered a phase of rapid development and wide application. Techniques developed in traditional AI research areas, such as computer vision and object recognition, have found many innovative applications in real-world settings. General methodological contributions from AI, including a variety of recently developed deep learning algorithms, have also been applied across a wide spectrum of fields, such as surveillance, real-time processing, IoT devices, and health care systems. State-of-the-art deep learning models are both widely applicable and highly efficient. Deep Learning in Action: Image and Video Processing for Practical Use provides a comprehensive and accessible resource for intermediate to advanced readers seeking to harness the power of deep learning in image and video processing. The book bridges the gap between theoretical concepts and practical implementation by emphasizing lightweight approaches, enabling readers to apply deep learning techniques efficiently to real-world scenarios. Its focus on resource-efficient methods makes it particularly relevant where computational constraints are a concern.

1 Introduction

1.1 Overview

1.2 The Power of Deep Learning in visual processing

1.2.1 Deep Learning Fundamentals

1.2.2 The Evolution of Visual Processing

1.2.3 Deep Neural Network

1.2.4 Overview of Object Detection, Classification, and Segmentation

1.2.5 Computer Vision Applications

1.2.6 Deep Learning for Embedded System

1.3 Understanding Image and Video Data

1.3.1 Overview of the importance of image and video data in various real-world applications

1.3.2 Differentiating between image and video data

1.3.3 Common image formats and their properties

1.4 Importance of Real-world Applications

1.4.1 Bridging the Gap Between Theory and Practice

1.4.2 Solving Real-world Challenges

1.4.3 Industry Relevance and Innovation

1.5 Book outline

2 Image Analysis for Surveillance: Detecting Fire and Smoke Incidents

2.1 State-of-the-art video-based fire/smoke detection

2.1.1 Experiment setup and results

2.1.2 Testing the R-CNN with Raspberry Pi

2.2 Real-time fire and smoke detection

2.2.1 Methodology and results

2.2.2 Executing the prototype on NVIDIA Jetson Nano

3 Enhancing COVID-19 Safety Measures with AI-Powered Video Analysis

3.1 Related work

3.2 Social distancing using YOLOv2

3.2.1 Social distancing workflow

3.2.2 YOLOv2 architecture

3.2.3 Experiment setup

3.2.4 Euclidean formula for measuring the distance

3.2.5 Results and discussion

3.3 Social distancing with YOLOv4-tiny

3.3.1 The proposed approach

3.3.2 Violation threshold

3.3.3 Bird’s-eye view transformation

3.3.4 Experiment setup and results

3.4 Algorithms implementation on the embedded system

3.4.1 Social distancing on NVIDIA devices

3.4.2 Distributed video infrastructure for social distancing

3.5 Integrated approach for monitoring social distancing, face mask, and facial temperature measurement

3.5.1 Dataset and annotation of face masks

3.5.2 Dataset and annotation of Facial Temperature

3.5.3 Experiment setup and results

3.5.4 Implementation of the integrated algorithms on NVIDIA platforms

4 Deep Learning Approaches for Fingerprint Image Restoration

4.1 Background

4.1.1 Deep learning related work for fingerprint

4.2 Autoencoders

4.3 Feature extraction

4.4 Fingerprint dataset

4.5 Sparse autoencoder for image reconstruction

4.5.1 Sparse autoencoder model

4.5.2 Pre-processing the images

4.5.3 Algorithm description

4.5.4 Experiment setup

4.5.5 Efficiency and Parameter Sensitivity

4.6 Recreating fingerprint images by CNN

4.6.1 Convolutional neural network for image reconstruction

4.6.2 CNN algorithm design

4.6.3 Training and validation

4.7 Experiment results and discussion

4.7.1 Evaluation

4.7.2 Comparative analysis

5 Deep Learning for Classification and Localization of Multiple Abnormalities on Chest X-ray Images

5.1 Overview of Diagnosis on Medical images

5.2 Literature Review

5.3 Computer Vision for Medical Image Processing

5.3.1 CNN for Feature Extraction

5.3.2 Role of Deep Learning in Medical Imaging

5.4 Dataset Description

5.4.1 COVID-19 Radiography Database

5.4.2 SIIM-FISABIO-RSNA COVID-19 Database

5.5 Methods

5.5.1 Multi-Classification of Abnormalities on Chest X-ray Images

5.5.2 Localization of Abnormalities on Chest X-ray Images

5.5.3 Ensembled Models for Enhancing Localization of Abnormalities

6 Real-time Stroke Detection Based on Deep Learning and Federated Learning

6.1 Background

6.1.1 Stroke as a Critical Health Issue

6.1.2 Current Challenges in Stroke Detection

6.1.3 The Need for Real-time Detection

6.2 Federated Learning for Healthcare

6.2.1 Understanding Federated Learning

6.2.2 Privacy and Security Concerns in Federated Learning

6.2.3 Benefits and Challenges

6.3 Real-time stroke detection system

6.3.1 Design and Architecture

6.3.2 Data Pre-processing and Augmentation

6.3.3 Integration of Deep Learning and Federated Learning

6.3.4 Distributed Model Training Setup

6.3.5 Model Training and Optimization

6.4 Implementation on NVIDIA Platforms

6.4.1 Utilizing NVIDIA GPUs for Real-time Inference

6.4.2 NVIDIA DeepStream for Real-time Video Analysis

6.4.3 Performance Metrics and Monitoring

7 Efficient Identification of Bag Breakup in Continuous Airflow via Video Analysis

7.1 Overview of Bag Breakup Detection

7.1.1 Bag Breakup in Continuous Airflow

7.1.2 Factors Affecting Bag Breakup

7.1.3 Importance of Bag Breakup Detection in Automotive Safety

7.1.4 Challenges in Identifying Bag Breakup

7.2 Methodology

7.2.1 Dataset Collection and Description

7.2.2 Data Preprocessing

7.2.3 The Proposed Deep Learning Models

7.2.4 Training and Validation

7.3 Experimental Results

7.3.1 Evaluation metrics

7.3.2 Comparative analysis

7.3.3 Error Analysis and Misdetection Cases

7 Conclusions and Recommendations

8 Bibliography

Abdussalam Elhanashi

Dr. Abdussalam Elhanashi is a researcher at the Università di Pisa, Italy, specializing in advanced applications of deep learning and in image and video processing. He holds an M.Sc. in Electronics and Electrical Engineering from the University of Glasgow in Scotland and an MBA from the University of Nicosia in Cyprus. He earned his Ph.D. in Information Engineering from the Università di Pisa, funded by a prestigious merit-based scholarship from the Islamic Development Bank (IsDB) as Libya's top candidate in 2019-2020. Dr. Elhanashi was a Research Fellow at the University of Strathclyde in 2021, where he applied deep learning models to analyse CT scans and X-ray images for medical diagnostics. In 2022, he was a visiting researcher at Hiroshima University in Japan, focusing on advanced video analysis techniques. With over 16 years of industry experience, he has managed engineering projects, conducted system maintenance, and performed root cause analyses to address technical challenges. He authored the first Arabic-language book on artificial intelligence in Libya and has contributed numerous peer-reviewed articles to international conferences and journals. He is a developer at the Society for Imaging Informatics in Medicine (SIIM) in the USA. His work focuses on real-world AI applications, lightweight model development, video surveillance, IoT-based low-cost embedded systems, AI-driven solutions for medical imaging, and efficient coding techniques for image and video processing systems.

Affiliations and expertise

Department of Information Engineering, University of Pisa, Pisa, Italy

Sergio Saponara

Prof. Sergio Saponara, Director of the Department of Information Engineering at the University of Pisa and IEEE Distinguished Lecturer, obtained his Master's degree cum laude and Ph.D. in Electronic Engineering from the University of Pisa. In 2002, he was a Marie Curie Research Fellow at the Inter-university Microelectronics Center (IMEC) in Leuven, Belgium. He was also a post-graduate researcher at the National Research Council. Currently, he is a Full Professor of Electronics at the University of Pisa, where he teaches courses in Vehicular Electronics, Electronic Systems for Robotics, and HW and Embedded Security for the Master's degrees in Vehicle Engineering, Robotics and Automation Engineering, and Cybersecurity. Additionally, he teaches Electronics at the Italian Naval Academy in Livorno. Prof. Saponara was President of the Bachelor's and Master's programs in Electronic Engineering at the University of Pisa. He has co-authored 400 scientific articles indexed in Scopus and 20 patents. He is a Founding Member of the IoT CASS SiG and has been a Program Committee Member for over 100 international IEEE and SPIE conferences. He also serves as the Director of the summer school "Enabling Technologies for the Internet of Things (IoT)" and the specialization course "Automotive Electronics and Powertrain Electrification." He is also the Director of the interUniversity Center for Automotive Research (UCAR).

Affiliations and expertise

Department of Information Engineering, Università di Pisa, Pisa, Italy
