The Role of Pre-training Data in Transfer Learning

Rahim Entezari*, Mitchell Wortsman, Olga Saukh, M. Moein Shariatnia, Hanie Sedghi, Ludwig Schmidt

*Corresponding author for this work

Publication: Conference contribution - Paper

Abstract

The transfer learning paradigm of model pre-training and subsequent fine-tuning produces high-accuracy models. While most studies recommend scaling the pre-training size to benefit most from transfer learning, a question remains: what data and method should be used for pre-training? We investigate the impact of the pre-training data distribution on few-shot and full fine-tuning performance using 3 pre-training methods (supervised, contrastive language-image, and contrastive image-image), 7 pre-training datasets, and 9 downstream datasets. Through extensive controlled experiments, we find that the choice of pre-training data source is essential for few-shot transfer, but that its role decreases as more data is made available for fine-tuning. Additionally, we explore the role of data curation and examine the trade-offs between label noise and the size of the pre-training dataset. We find that using 2000× more pre-training data from LAION can match the performance of supervised ImageNet pre-training. Furthermore, we investigate the effect of the pre-training method, comparing language-image contrastive with image-image contrastive pre-training, and find that the latter leads to better downstream accuracy.
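
For illustration, the pre-train/fine-tune paradigm the abstract studies can be sketched in a few lines. The snippet below is a minimal, assumed example using torchvision (not the authors' code): it loads a backbone with supervised ImageNet pre-training and fits a new classification head on a small downstream batch, i.e., the simplest linear-probe variant of fine-tuning. NUM_CLASSES and the dummy batch are hypothetical placeholders.

```python
# Minimal sketch of pre-train -> fine-tune transfer learning (linear probe).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of downstream classes

# Backbone pre-trained with supervised learning on ImageNet.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Freeze the pre-trained weights, then swap in a fresh head;
# only the head is trained.
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
model.eval()  # keep the frozen BatchNorm statistics fixed

optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One gradient step on a batch of few-shot downstream data."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch standing in for real downstream data.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
print(fine_tune_step(images, labels))
```
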
Original language: English
Number of pages: 38
Publication status: Published - 27 Feb 2023
Event: 1st Multimodal Representation Learning Workshop: ICLR 2023 - Virtual, Rwanda
Duration: 1 May 2023 - 5 May 2023
https://iclr.cc/virtual/2023/workshop/12836

Workshop

Workshop: 1st Multimodal Representation Learning Workshop
Short title: ICLR 2023
Country/Territory: Rwanda
Location: Virtual
Period: 1/05/23 - 5/05/23
Internet address: https://iclr.cc/virtual/2023/workshop/12836

Fields of Expertise

  • Information, Communication & Computing
