TY - GEN
T1 - Ternary feature masks: zero-forgetting for task-incremental learning
AU - Masana, M.
AU - Tuytelaars, T.
AU - Van De Weijer, J.
PY - 2021/6
Y1 - 2021/6
N2 - We propose an approach to continual learning without any forgetting for the task-aware regime, where the task label is known at inference. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue, and show experimentally, that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task, by more than three orders of magnitude for most networks. Encoding the ternary masks in two bits per feature adds very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing their behavior for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several fine-grained datasets and ImageNet show that our method outperforms the current state-of-the-art while reducing memory overhead compared to weight-based approaches.
AB - We propose an approach to continual learning without any forgetting for the task-aware regime, where the task label is known at inference. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue, and show experimentally, that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task, by more than three orders of magnitude for most networks. Encoding the ternary masks in two bits per feature adds very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing their behavior for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several fine-grained datasets and ImageNet show that our method outperforms the current state-of-the-art while reducing memory overhead compared to weight-based approaches.
UR - http://www.scopus.com/inward/record.url?scp=85113504296&partnerID=8YFLogxK
U2 - 10.1109/CVPRW53098.2021.00396
DO - 10.1109/CVPRW53098.2021.00396
M3 - Conference contribution
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 3565
EP - 3574
BT - Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2021
T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops
Y2 - 19 June 2021 through 25 June 2021
ER -