Are dynamic memory managers on GPUs slow? - a survey and benchmarks.

Martin Winter, Mathias Parger, Daniel Mlakar, Markus Steinberger

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Dynamic memory management on GPUs is generally understood to be a challenging topic. On current GPUs, hundreds of thousands of threads might concurrently allocate new memory or free previously allocated memory. This leads to problems with thread contention, synchronization overhead and fragmentation. Various approaches have been proposed in the last ten years and we set out to evaluate them on a level playing field on modern hardware to answer the question of whether dynamic memory managers are as slow as commonly thought. In this survey paper, we provide a consistent framework to evaluate all publicly available memory managers in a large set of scenarios. We summarize each approach and thoroughly evaluate allocation performance (thread-based as well as warp-based), and look at performance scaling, fragmentation and real-world performance considering a synthetic workload as well as updating dynamic graphs. We discuss the strengths and weaknesses of each approach and provide guidelines for the respective best usage scenario. We provide a unified interface to integrate any of the tested memory managers into an application and switch between them for benchmarking purposes. Given our results, we can dispel some of the dread associated with dynamic memory managers on the GPU.

Original language: English
Title of host publication: PPoPP 2021 - Proceedings of the 2021 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Publisher: Association for Computing Machinery
Pages: 219-233
Number of pages: 15
ISBN (Electronic): 978-145038294-6
Publication status: Published - 17 Feb 2021
Event: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming: PPoPP 2021 - Virtual, Online, United States
Duration: 27 Feb 2021 → 3 Mar 2021

Publication series

Name: Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

Conference

Conference: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Country/Territory: United States
City: Virtual, Online
Period: 27/02/21 → 3/03/21

Keywords

  • analysis
  • benchmarks
  • bulksemaphore
  • CUDA
  • GPU
  • halloc
  • memory management
  • ouroboros
  • ScatterAlloc
  • survey
  • XMalloc

ASJC Scopus subject areas

  • Software
