Embedded Systems
ARM Programming and Optimization
- 2nd Edition - October 28, 2023
- Author: Jason D. Bakos
- Language: English
- Paperback ISBN: 978-0-12-822575-2
- eBook ISBN: 978-0-323-90302-8
Embedded Systems: ARM Programming and Optimization, Second Edition combines an exploration of the ARM architecture with an examination of the facilities offered by the Linux operating system to explain how various features of program design can influence processor performance. The book demonstrates methods by which a programmer can optimize program code in a way that improves its performance without changing its behavior. Several applications, including image transformations, fractal generation, image convolution, computer vision tasks, and now machine learning, are used to describe and demonstrate these methods. From these, the reader will gain insight into computer architecture and application design, as well as practical knowledge of software design for modern embedded systems. The second edition has been expanded to include more topics of interest to upper-level undergraduate courses in embedded systems.
- Covers the ARMv6 and ARMv7-A instruction set architectures, as well as three ARM cores: the ARM11 on the Raspberry Pi, the Cortex-A9 on the Xilinx Zynq 7020, and the Cortex-A15 on the NVIDIA Tegra K1
- Describes how to fully leverage the facilities offered by the Linux operating system, including the Linux GCC compiler toolchain and debug tools, performance monitoring support, OpenMP multicore runtime environment, video frame buffer, and video capture capabilities
- Designed to accompany and work with most low-cost Linux/ARM embedded development boards currently available
- Expanded to include coverage of topics such as bus architectures, low-power programming, and sensor interfacing
- Includes practical application areas such as machine learning
Students in an embedded systems design course (roughly 11,000 students per year, according to Navstem). Professional programmers who need to understand embedded development.
- Cover image
- Title page
- Table of Contents
- Copyright
- Preface
- Acknowledgments
- Chapter 1. The Linux/ARM embedded platform
- Abstract
- Chapter Outline
- 1.1 Performance-oriented programming
- 1.2 ARM technology
- 1.3 Brief history of ARM
- 1.4 ARM programming
- 1.5 ARM instruction set architecture
- 1.6 Assembly optimization #1: sorting
- 1.7 Assembly optimization #2: bit manipulation (ARMv6/7)
- 1.8 Assembly optimization #2: bit manipulation (ARMv8)
- 1.9 Code optimization objectives
- 1.10 Runtime profiling with performance counters
- 1.11 Measuring memory bandwidth
- 1.12 Performance results
- 1.13 Performance bounds
- 1.14 Basic ARM instruction set
- 1.15 Chapter wrap-up
- Exercises
- Chapter 2. Multicore and data-level optimization: OpenMP and SIMD
- Abstract
- Chapter Outline
- 2.1 Optimization techniques covered
- 2.2 Amdahl’s law
- 2.3 Test kernel: polynomial evaluation
- 2.4 Using multiple processor cores: OpenMP
- 2.5 Performance bounds
- 2.6 Performance analysis
- 2.7 ARM Cortex-A53/A72: minimizing instructions per flop
- 2.8 ARM Cortex-A53/A72: minimizing cycles per instruction
- 2.9 Software pipelining
- 2.10 ARM11: using inline assembly language
- 2.11 ARM11: single instruction, multiple data
- 2.12 Chapter wrap-up
- Exercises
- Chapter 3. Arithmetic optimization and the Linux framebuffer
- Abstract
- Chapter Outline
- 3.1 Graphics output libraries
- 3.2 Affine image transformations
- 3.3 Bilinear interpolation
- 3.4 Floating-point image transformation
- 3.5 Analysis of floating-point performance
- 3.6 Fixed-point arithmetic
- 3.7 Fixed-point performance
- 3.8 Real-time fractal generation
- 3.9 Chapter wrap-up
- Exercises
- Chapter 4. Memory optimization and video processing
- Abstract
- Chapter Outline
- 4.1 Stencil loops
- 4.2 The 2D filter
- 4.3 Separable filters
- 4.4 Memory access behavior of 2D filters
- 4.5 Loop tiling
- 4.6 Tiling and the stencil halo region
- 4.7 Example 2D filter implementation
- 4.8 Capturing and converting video frames
- 4.9 Capturing frames from a webcam
- 4.10 Applying the 2D tiled filter
- 4.11 Applying the separated 2D tiled filter
- 4.12 Top-level loop
- 4.13 Performance results
- 4.14 Chapter wrap-up
- Exercises
- Chapter 5. Embedded heterogeneous programming with OpenCL
- Abstract
- Chapter Outline
- 5.1 GPU microarchitecture
- 5.2 OpenCL
- 5.3 OpenCL programming model, idioms, and abstractions
- 5.4 Kernel workload distribution
- 5.5 OpenCL implementation of Horner’s method: Device code
- 5.6 Performance results
- 5.7 Chapter wrap-up
- Exercises
- Appendix A. Adding PMU support to Raspbian for the generation 1 Raspberry Pi
- Chapter Outline
- A.1 Download the Linux kernel and cross-compiler tools
- A.2 Kernel modifications
- A.3 Building the kernel
- A.4 Installing the kernel
- Appendix B. NEON intrinsic reference
- Chapter Outline
- B.1 Vector data types
- B.2 Reading and writing vector variables
- B.3 Vector element manipulation
- B.4 Optimizing floating-point code with NEON intrinsics
- B.5 Summary of NEON intrinsics
- Appendix C. OpenCL reference
- Chapter Outline
- C.1 Platform layer
- C.2 Memory types
- C.3 Buffer management
- C.4 Programs and compiling
- C.5 Kernel functions
- C.6 Command queue functions
- C.7 Vector and image data types
- C.8 Attributes
- C.9 Constants
- C.10 Built-in functions
- Index
- No. of pages: 368
- Edition: 2
- Published: October 28, 2023
- Imprint: Morgan Kaufmann
- Paperback ISBN: 9780128225752
- eBook ISBN: 9780323903028
Jason D. Bakos
Jason D. Bakos is a professor of Computer Science and Engineering at the University of South Carolina. He received a BS in Computer Science from Youngstown State University in 1999 and a PhD in Computer Science from the University of Pittsburgh in 2005. Dr. Bakos’s research focuses on mapping data- and compute-intensive codes to high-performance, heterogeneous, reconfigurable, and embedded computer systems. His group works closely with FPGA-based computer manufacturers Convey Computer Corporation, GiDEL, and Annapolis Micro Systems, as well as GPU and DSP manufacturers NVIDIA, Texas Instruments, and Advantech. Dr. Bakos holds two patents, has authored over 30 refereed publications in computer architecture and high-performance computing, was a winner of the ACM/DAC student design contest in 2002 and 2004, and received the US National Science Foundation (NSF) CAREER award in 2009. He currently serves as associate editor for ACM Transactions on Reconfigurable Technology and Systems.
Affiliations and expertise
Professor of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States of America