CUDA Fortran for Scientists and Engineers
Best Practices for Efficient CUDA Fortran Programming
- 2nd Edition - July 11, 2024
- Authors: Gregory Ruetsch, Massimiliano Fatica
- Language: English
- Paperback ISBN: 978-0-443-21977-1
- eBook ISBN: 978-0-443-21976-4
CUDA Fortran for Scientists and Engineers: Best Practices for Efficient CUDA Fortran Programming shows how high-performance application developers can leverage the power of GPUs using Fortran, the familiar language of scientific computing and supercomputer performance benchmarking. The authors presume no prior parallel computing experience, and cover the basics along with best practices for efficient GPU computing using CUDA Fortran. To add CUDA Fortran to existing Fortran codes, they explain how to understand the target GPU architecture, identify computationally intensive parts of the code, and modify the code to manage the data and parallelism and optimize performance – all in Fortran, without having to rewrite in another language.
Each concept is illustrated with actual examples so you can immediately evaluate the performance of your code in comparison.
This second edition provides much-needed updates on how to efficiently program GPUs in CUDA Fortran. It can be used either as a tutorial on GPU programming in CUDA Fortran or as a reference text.
- Presents optimization strategies for current hardware, including Hopper-generation GPUs
- Includes discussions of new language and hardware features, including managed memory, tensor cores, shuffle instructions, and new multi-GPU paradigms
- Offers resources and strategies for porting large codes to GPUs, including language features as well as library use
This book is for scientists and engineers who want to use GPU computing as a tool in their own research fields rather than approach it from a pure computational science perspective. No previous experience with parallel computing is required, only knowledge of Fortran 90. It also suits anyone interested in writing parallel codes in Fortran, from financial applications to climate and weather modeling.
- Cover image
- Title page
- Table of Contents
- Copyright
- Dedication
- Preface to the Second Edition
- Preface to the First Edition
- References
- Acknowledgments
- Part 1: CUDA Fortran programming
- Chapter 1: Introduction
- Abstract
- 1.1. A brief history of GPU computing
- 1.2. Parallel computation
- 1.3. Basic concepts
- 1.4. Determining CUDA hardware features and limits
- 1.5. Error handling
- 1.6. Compiling CUDA Fortran code
- 1.7. CUDA Driver, Toolkit, and compatibility
- Chapter 2: Correctness, accuracy, and debugging
- Abstract
- 2.1. Assessing correctness of results
- 2.2. Debugging
- Chapter 3: Performance measurement and metrics
- Abstract
- 3.1. Measuring execution time
- 3.2. Instruction, bandwidth, and latency bound kernels
- 3.3. Memory bandwidth
- Chapter 4: Synchronization
- Abstract
- 4.1. Synchronization of kernel execution and data transfers
- 4.2. Synchronization of kernel threads on the device
- Chapter 5: Optimization
- Abstract
- 5.1. Transfers between host and device
- 5.2. Device memory
- 5.3. Execution configuration
- 5.4. Instruction optimization
- Chapter 6: Porting tips and techniques
- Abstract
- 6.1. CUF kernels
- 6.2. Conditional inclusion of code
- 6.3. Renaming variables
- 6.4. Minimizing memory footprint for work arrays
- 6.5. Array compaction
- References
- Chapter 7: Interfacing with CUDA C code and CUDA libraries
- Abstract
- 7.1. Calling user-written CUDA C code
- 7.2. cuBLAS
- 7.3. cuSPARSE
- 7.4. cuSOLVER
- 7.5. cuTENSOR
- 7.6. Thrust
- Chapter 8: Multi-GPU programming
- Abstract
- 8.1. CUDA multi-GPU features
- 8.2. Multi-GPU programming with MPI
- References
- Part 2: Case studies
- Chapter 9: Monte Carlo method
- Abstract
- 9.1. CURAND
- 9.2. Computing π with CUF kernels
- 9.3. Computing π with reduction kernels
- 9.4. Accuracy of summation
- 9.5. Option pricing
- References
- Chapter 10: Finite difference method
- Abstract
- 10.1. Nine-point 1D finite difference stencil
- 10.2. 2D Laplace equation
- References
- Chapter 11: Applications of the fast Fourier transform
- Abstract
- 11.1. CUFFT
- 11.2. Spectral derivatives
- 11.3. Convolution
- 11.4. Poisson solver
- References
- Chapter 12: Ray tracing
- Abstract
- 12.1. Generating an image file
- 12.2. Vectors in CUDA Fortran
- 12.3. Rays, a simple camera, and background
- 12.4. Adding a sphere
- 12.5. Surface normals and multiple objects
- 12.6. Antialiasing
- 12.7. Material types
- 12.8. Positionable camera
- 12.9. Defocus blur
- 12.10. Where next?
- 12.11. Triangles
- 12.12. Lights
- 12.13. Textures
- References
- Part 3: Appendices
- Appendix A: System and environment management
- A.1. Environment variables
- A.2. nvidia-smi – System Management Interface
- References
- References
- Index
- No. of pages: 436
- Language: English
- Edition: 2
- Published: July 11, 2024
- Imprint: Morgan Kaufmann
- Paperback ISBN: 9780443219771
- eBook ISBN: 9780443219764
Gregory Ruetsch