Data Parallel C++: Programming Accelerated Systems Using C++ and SYCL, 2nd Edition

  • 6h 25m
  • Ben Ashbaugh, James Brodman, James Reinders, John Pennycook, Michael Kinsner, Xinmin Tian
  • Apress
  • 2023

"This book, now in its second edition, is the premier resource to learn SYCL 2020 and is the ONLY book you need to become part of this community." Erik Lindahl, GROMACS and Stockholm University

Learn how to accelerate C++ programs using data parallelism and SYCL.

This open access book enables C++ programmers to be at the forefront of this exciting and important development that is helping to push computing to new levels. This updated second edition is full of practical advice, detailed explanations, and code examples to illustrate key topics.

SYCL enables access to parallel resources in modern accelerated heterogeneous systems. Now, a single C++ application can use any combination of devices, including GPUs, CPUs, FPGAs, and ASICs, that are suitable to the problems at hand.

This book teaches data-parallel programming using C++ with SYCL and walks through everything needed to program accelerated systems. The book begins by introducing data parallelism and foundational topics for effective use of SYCL. Later chapters cover advanced topics, including error handling, hardware-specific programming, communication and synchronization, and memory model considerations.

All source code for the examples used in this book is freely available on GitHub. The examples are written in modern SYCL and are regularly updated to ensure compatibility with multiple compilers.

What You Will Learn

  • Accelerate C++ programs using data-parallel programming
  • Use SYCL and C++ compilers that support SYCL
  • Write portable code for accelerators that is vendor and device agnostic
  • Optimize code to improve performance for specific accelerators
  • Be poised to benefit as new accelerators appear from many vendors

Who This Book Is For

Programmers new to data-parallel programming, and computer programmers interested in data-parallel programming using C++

About the Authors

James Reinders is an Engineer at Intel Corporation with more than four decades of experience in parallel computing and is an author/co-author/editor of more than 10 technical books related to parallel programming. He has a passion for system optimization and teaching. He has had the great fortune to help make contributions to three of the world’s fastest computers (#1 on the TOP500 list) as well as many other supercomputers and software developer tools.

Ben Ashbaugh is a Software Architect at Intel Corporation, where he has worked for over 20 years developing software drivers and compilers for Intel graphics products. For the past 10 years, he has focused on parallel programming models for general-purpose computation on graphics processors, including SYCL and the DPC++ compiler. He is active in the Khronos SYCL, OpenCL, and SPIR working groups, helping to define industry standards for parallel programming, and he has authored numerous extensions to expose unique Intel GPU features.

James Brodman is a Principal Engineer at Intel Corporation, working on runtimes and compilers for parallel programming, and he is one of the architects of DPC++. He has a PhD in Computer Science from the University of Illinois at Urbana-Champaign.

Michael Kinsner is a Principal Engineer at Intel Corporation, developing parallel programming languages and compilers for a variety of architectures. He contributes extensively to spatial architectures and programming models and is an Intel representative within The Khronos Group where he works on the SYCL and OpenCL industry standards for parallel programming. He has a PhD in Computer Engineering from McMaster University and is passionate about programming models that cross architectures while still enabling performance.

John Pennycook is a Software Enabling and Optimization Architect at Intel Corporation, focused on enabling developers to fully utilize the parallelism available in modern processors. He is experienced in optimizing and parallelizing applications from a range of scientific domains, and previously served as Intel’s representative on the steering committee for the Intel eXtreme Performance User’s Group (IXPUG). He has a PhD in Computer Science from the University of Warwick. His research interests are varied, but a recurring theme is the ability to achieve application “performance portability” across different hardware architectures.

Xinmin Tian is an Intel Fellow and Compiler Architect at Intel Corporation and serves as Intel’s representative on the OpenMP Architecture Review Board (ARB). He has been driving OpenMP offloading, vectorization, and parallelization compiler technologies for Intel architectures. His current focus is on LLVM-based OpenMP offloading, SYCL/DPC++ compiler optimizations for CPUs/GPUs, and tuning HPC/AI application performance. He has a PhD in Computer Science from Tsinghua University, holds 27 US patents, has published over 60 technical papers with more than 1,300 citations of his work, and has co-authored two books that span his expertise.

In this Book

  • Foreword
  • Introduction
  • Where Code Executes
  • Data Management
  • Expressing Parallelism
  • Error Handling
  • Unified Shared Memory
  • Buffers
  • Scheduling Kernels and Data Movement
  • Communication and Synchronization
  • Defining Kernels
  • Vectors and Math Arrays
  • Device Information and Kernel Specialization
  • Practical Tips
  • Common Parallel Patterns
  • Programming for GPUs
  • Programming for CPUs
  • Programming for FPGAs
  • Libraries
  • Memory Model and Atomics
  • Backend Interoperability
  • Migrating CUDA Code
  • Epilogue—Future Direction of SYCL