Adaptive sparse matrix-matrix multiplication on the GPU

Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review


In the ongoing efforts targeting the vectorization of linear algebra primitives, sparse matrix-matrix multiplication (SpGEMM) has received considerably less attention than sparse matrix-vector multiplication (SpMV). While both are equally important, this disparity can be attributed mainly to the additional formidable challenges raised by SpGEMM.
In this paper, we present a dynamic approach for addressing SpGEMM on the GPU. Our approach works directly on the standard compressed sparse rows (CSR) data format. In comparison to previous SpGEMM implementations, our approach guarantees a homogeneous, load-balanced access pattern to the first input matrix and improves memory access to the second input matrix. It adaptively re-purposes GPU threads during execution and maximizes the time during which the efficient on-chip scratchpad memory can be used. Adhering to a completely deterministic scheduling pattern …
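To make the setting concrete, the following is a minimal CPU sketch of row-by-row (Gustavson-style) SpGEMM operating directly on CSR arrays. It is an illustration of the data format and the basic algorithm only, not the paper's GPU method; the function name and the dict-based sparse accumulator are assumptions for this sketch.

```python
# Minimal CPU sketch of Gustavson-style SpGEMM on the CSR format.
# A CSR matrix is three arrays: row pointers, column indices, values.
# This is illustrative only; the paper's load-balanced GPU approach
# is substantially more involved.
def spgemm_csr(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    c_ptr, c_idx, c_val = [0], [], []
    n_rows = len(a_ptr) - 1
    for i in range(n_rows):
        acc = {}  # sparse accumulator for row i of C
        # For each nonzero A[i, k], scale row k of B and accumulate.
        for nz in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[nz], a_val[nz]
            for nz2 in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[nz2]
                acc[j] = acc.get(j, 0.0) + a_ik * b_val[nz2]
        # Emit row i of C in sorted column order (CSR convention).
        for j in sorted(acc):
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

Note that the size of each output row (and hence the memory needed for C) is unknown until the row is computed; handling this irregularity efficiently on the GPU is one of the challenges the paper addresses.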
Original language: English
Title of host publication: PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming
Place of Publication: New York, NY
Publisher: Association for Computing Machinery
Number of pages: 14
ISBN (Print): 978-1-4503-6225-2
Publication status: Published - 2019
Event: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - Washington, DC, United States
Duration: 16 Feb 2019 – 20 Feb 2019


Conference: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Abbreviated title: PPoPP '19
Country/Territory: United States
City: Washington, DC