TY - JOUR
T1 - Multi-label learning on low label density sets with few examples
AU - Vergara, Matías
AU - Bustos, Benjamin
AU - Sipiran, Ivan
AU - Schreck, Tobias
AU - Lengauer, Stefan
N1 - Publisher Copyright:
© 2024 Elsevier Ltd
PY - 2025/3/15
Y1 - 2025/3/15
N2 - Multi-label learning has experienced immense growth in recent years due to its many real-life applications, such as the classification of protein functions or musical genres, among others. This has led to the proposal of categories for multi-label classification (MLC) problems that seek to establish guidelines for the different configurations, given either by the quality or quantity of the labels, the number of examples for training, etc. Such is the case for the class of problems known as “Challenging MLC”, those in which the universe of labels incorporates obstacles either in terms of quality (erroneously assigned labels, unseen labels, etc.) or quantity (thousands or millions of labels). Different methods have been developed to address these cases, and yet few efforts have been directed towards the case where, despite having a large label universe, the number of examples is small (of the same order as the labels), thus posing a more complex scenario. In this paper, we examine one important real-world problem case: the labeling of Geometric surface patterns appearing on pottery objects from the Classical era. As we will show, existing methods from the state of the art can provide baseline performance, but cannot yet comprehensively address this and similar application problems. We present an encompassing experimental comparison of state-of-the-art methods, detailing advantages and problems. We contribute a processing pipeline that allows us to achieve effective classifications. Our work addresses the important case when the universe of labels admits a feasible simplification through natural language processing (NLP) techniques and augmentation of visual training data. Based on an in-depth analysis of results, we propose practical guidelines on how to face similar problems, regarding both the selection of techniques and the analysis of results. 
We also identify pressing issues for current research to make multi-labeling more widely applicable and functional.
AB - Multi-label learning has experienced immense growth in recent years due to its many real-life applications, such as the classification of protein functions or musical genres, among others. This has led to the proposal of categories for multi-label classification (MLC) problems that seek to establish guidelines for the different configurations, given either by the quality or quantity of the labels, the number of examples for training, etc. Such is the case for the class of problems known as “Challenging MLC”, those in which the universe of labels incorporates obstacles either in terms of quality (erroneously assigned labels, unseen labels, etc.) or quantity (thousands or millions of labels). Different methods have been developed to address these cases, and yet few efforts have been directed towards the case where, despite having a large label universe, the number of examples is small (of the same order as the labels), thus posing a more complex scenario. In this paper, we examine one important real-world problem case: the labeling of Geometric surface patterns appearing on pottery objects from the Classical era. As we will show, existing methods from the state of the art can provide baseline performance, but cannot yet comprehensively address this and similar application problems. We present an encompassing experimental comparison of state-of-the-art methods, detailing advantages and problems. We contribute a processing pipeline that allows us to achieve effective classifications. Our work addresses the important case when the universe of labels admits a feasible simplification through natural language processing (NLP) techniques and augmentation of visual training data. Based on an in-depth analysis of results, we propose practical guidelines on how to face similar problems, regarding both the selection of techniques and the analysis of results. 
We also identify pressing issues for current research to make multi-labeling more widely applicable and functional.
KW - Deep learning for multi-label learning
KW - Extreme multi-label learning
KW - Pattern recognition
UR - http://www.scopus.com/inward/record.url?scp=85211108205&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.125942
DO - 10.1016/j.eswa.2024.125942
M3 - Article
AN - SCOPUS:85211108205
SN - 0957-4174
VL - 265
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 125942
ER -