Abstract
We propose Hardware-Aware Deep Subnetworks (HADS) to tackle model adaptation to dynamic resource constraints. In contrast to the state of the art, HADS use structured sparsity constructively by exploiting the permutation invariance of neurons, which allows for hardware-specific optimizations. HADS achieve computational efficiency by skipping sequential computational blocks identified by a novel iterative knapsack optimizer. HADS support conventional deep networks frequently deployed on low-resource edge devices and provide computational benefits even for small and simple networks. We evaluate HADS on six benchmark architectures trained on the GOOGLE SPEECH COMMANDS, FMNIST and CIFAR10 datasets, and test them on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence for HADS' outstanding performance in terms of the submodels' test-set accuracy, and demonstrate an adaptation time in response to dynamic resource constraints of under 40 μs, using a 2-layer fully-connected network on an Arduino Nano 33 BLE Sense.
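The abstract's block-skipping idea can be illustrated in miniature. The sketch below is a hedged, greedy approximation of a knapsack-style selection over computational blocks under a resource budget; the block names, costs (e.g., MAC counts), utilities (estimated accuracy contributions), and the `select_blocks` helper are all illustrative assumptions, not the paper's actual iterative optimizer.

```python
# Hypothetical sketch: greedy knapsack-style selection of computational
# blocks under a resource budget. Costs and utilities are made-up numbers,
# not values from the HADS paper.

def select_blocks(blocks, budget):
    """Pick a subset of blocks whose total cost fits the budget,
    greedily favouring utility per unit cost.

    blocks: list of (name, cost, utility) tuples.
    Returns (chosen block names, total cost spent).
    """
    ranked = sorted(blocks, key=lambda b: b[2] / b[1], reverse=True)
    chosen, spent = [], 0
    for name, cost, utility in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent

if __name__ == "__main__":
    blocks = [("conv1", 4, 0.30), ("conv2", 6, 0.25), ("fc1", 2, 0.20)]
    kept, cost = select_blocks(blocks, budget=8)
    print(kept, cost)  # keeps the blocks with the best utility/cost ratio
```

A real adaptive-inference system would replace the greedy pass with a proper (iterative) knapsack solve and derive utilities from validation accuracy, but the budget-constrained subset selection shown here is the core shape of the problem.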
| Original language | English |
| --- | --- |
| Number of pages | 10 |
| Publication status | Accepted/In press - 7 May 2024 |
| Event | International Conference on Learning Representations: ICLR 2024, Messe Wien Exhibition and Congress Center, Wien, Austria. Duration: 7 May 2024 → 11 Jul 2024. https://iclr.cc/ |
Conference
| Conference | International Conference on Learning Representations |
| --- | --- |
| Abbreviated title | ICLR |
| Country/Territory | Austria |
| City | Wien |
| Period | 7/05/24 → 11/07/24 |
| Internet address | https://iclr.cc/ |