DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos

Mathias Parger, Chengcheng Tang, Christopher Twigg, Cem Keskin, Robert Wang, Markus Steinberger

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review


Convolutional neural network inference on video data requires powerful hardware for real-time processing. Given the inherent coherence across consecutive frames, large parts of a video typically change little. By skipping identical image regions and truncating insignificant pixel updates, computational redundancy can in theory be reduced significantly. However, these theoretical savings have been difficult to translate into practice, as sparse updates hamper computational consistency and memory access coherence, which are key for efficiency on real hardware. With DeltaCNN, we present a sparse convolutional neural network framework that enables sparse frame-by-frame updates to accelerate video inference in practice. We provide sparse implementations for all typical CNN layers and propagate sparse feature updates end-to-end, without accumulating errors over time. DeltaCNN is applicable to all convolutional neural networks without retraining. To the best of our knowledge, we are the first to significantly outperform the dense reference, cuDNN, in practical settings, achieving speedups of up to 7x with only marginal differences in accuracy.
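The core observation behind delta-based inference is that convolution is linear in its input: the output for a new frame equals the cached output of the previous frame plus the convolution of the (truncated) frame difference. The following numpy sketch illustrates this idea for a single layer; it is a simplified illustration under that linearity assumption, not the DeltaCNN implementation, and the threshold value is chosen arbitrarily.

```python
import numpy as np

def conv2d(x, k):
    # "Valid" 2D cross-correlation via explicit loops (illustration only).
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3))
frame1 = rng.standard_normal((8, 8))
frame2 = frame1.copy()
frame2[2:4, 2:4] += 0.5              # only a small region changes

# Truncate insignificant pixel updates (hypothetical threshold).
threshold = 1e-3
delta = frame2 - frame1
delta[np.abs(delta) < threshold] = 0.0

cached = conv2d(frame1, kernel)            # dense output of the previous frame
updated = cached + conv2d(delta, kernel)   # process only the sparse delta
dense = conv2d(frame2, kernel)             # dense reference for frame 2

print(np.allclose(updated, dense, atol=1e-6))  # True: convolution is linear
```

In a real network, nonlinearities such as ReLU break this pure linearity, which is why deltas must be propagated consistently through every layer rather than only through the convolutions; handling all typical CNN layers end-to-end without accumulating errors is the contribution the abstract describes.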
Original language: English
Title of host publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publication status: Published - Jun 2022
Event: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition: CVPR 2022 - New Orleans Ernest N. Morial Convention Center, Hybrid event, New Orleans, United States
Duration: 21 Jun 2022 - 24 Sept 2022
Conference number: 2022


Conference: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Abbreviated title: CVPR 2022
Country/Territory: United States
City: Hybrid event, New Orleans