CUDA and Applications to Task-based Programming

Markus Steinberger, Martin Winter, Michael Kenzel, Bernhard Kerbl

Research output: Chapter in Book/Report/Conference proceedingConference paperpeer-review


Since its inception, the CUDA programming model has been continuously evolving. Because the CUDA toolkit aims to consistently expose cutting-edge capabilities for general-purpose compute jobs to its users, the added features in each new version reflect the rapid changes that we observe in GPU architectures. Over the years, the changes in hardware, growing scope of built-in functions and libraries, as well as an advancing C++ standard compliance have expanded the design choices when coding for CUDA, and significantly altered the directives to achieve peak performance. In this tutorial, we give a thorough introduction to the CUDA toolkit, demonstrate how a contemporary application can benefit from recently introduced features and how they can be applied to task-based GPU scheduling in particular. For instance, we will provide detailed examples of
use cases for independent thread scheduling, cooperative groups, and the CUDA standard library, libcu++, which are certain to become an integral part of clean coding for CUDA in the near future.
Original languageEnglish
Title of host publicationEUROGRAPHICS 2021
Publication statusPublished - 2021

Cite this