DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos

Mathias Parger, Chengcheng Tang, Christopher Twigg, Cem Keskin, Robert Wang, Markus Steinberger

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Convolutional neural network inference on video data requires powerful hardware for real-time processing. Given the inherent coherence across consecutive frames, large parts of a video typically change little. By skipping identical image regions and truncating insignificant pixel updates, computational redundancy can in theory be reduced significantly. However, these theoretical savings have been difficult to translate into practice, as sparse updates hamper computational consistency and memory access coherence, which are key for efficiency on real hardware. With DeltaCNN, we present a sparse convolutional neural network framework that enables sparse frame-by-frame updates to accelerate video inference in practice. We provide sparse implementations for all typical CNN layers and propagate sparse feature updates end-to-end – without accumulating errors over time. DeltaCNN is applicable to all convolutional neural networks without retraining. To the best of our knowledge, we are the first to significantly outperform the dense reference, cuDNN, in practical settings, achieving speedups of up to 7x with only marginal differences in accuracy.
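To illustrate the idea summarized in the abstract, below is a minimal PyTorch-style sketch of delta-based convolution; it is not the DeltaCNN implementation, and the class name, threshold, and buffering scheme are illustrative assumptions. Because convolution is linear, the output for the current frame can be obtained by convolving only the truncated frame difference and adding it to the buffered previous output, and buffering the truncated input keeps residuals from accumulating as error over time.

import torch
import torch.nn.functional as F

class DeltaConv2d:
    """Hypothetical sketch of a delta-based convolution layer (not the DeltaCNN API)."""

    def __init__(self, weight, bias=None, threshold=0.05):
        self.weight = weight          # (out_ch, in_ch, kH, kW)
        self.bias = bias
        self.threshold = threshold    # truncation threshold for pixel updates
        self.prev_input = None        # buffered (truncated) previous input
        self.prev_output = None       # buffered previous output

    def __call__(self, x):
        if self.prev_input is None:
            # First frame: plain dense convolution.
            y = F.conv2d(x, self.weight, self.bias, padding="same")
            self.prev_input = x
        else:
            delta = x - self.prev_input
            # Truncate insignificant pixel updates; the resulting zeros are
            # what a sparse kernel can skip on real hardware.
            delta = torch.where(delta.abs() > self.threshold, delta,
                                torch.zeros_like(delta))
            # Convolution is linear, so only the delta is convolved
            # (the bias is already contained in the buffered output).
            y = self.prev_output + F.conv2d(delta, self.weight, padding="same")
            # Buffer the truncated input so untruncated residuals remain in
            # the next frame's delta instead of accumulating as error.
            self.prev_input = self.prev_input + delta
        self.prev_output = y
        return y

# Usage sketch: process a video frame by frame, e.g.
#   conv = DeltaConv2d(torch.randn(16, 3, 3, 3))
#   features = [conv(frame) for frame in frames]   # frames: (1, 3, H, W) tensors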
Original language: English
Title of host publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publication status: Published - Jun 2022
Event: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition: CVPR 2022 - New Orleans Ernest N. Morial Convention Center, Hybrid Event, New Orleans, United States
Duration: 21 Jun 2022 – 24 Jun 2022
Conference number: 2022

Conference

Conference: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Abbreviated title: CVPR 2022
Country/Territory: United States
City: New Orleans (Hybrid Event)
Period: 21/06/22 – 24/06/22
