TY - GEN
T1 - AMLP-Conv, a 3D Axial Long-range Interaction Multilayer Perceptron for CNNs
AU - Bonheur, Savinien
AU - Pienn, Michael
AU - Olschewski, Horst
AU - Bischof, Horst
AU - Urschler, Martin
N1 - Publisher Copyright: © 2022, Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - While convolutional neural networks (CNNs) have been the backbone of medical image analysis for years, their limited long-range interaction restricts their ability to encode long-distance anatomical relationships. On the other hand, Transformers, the current approach to capturing long-distance relationships, are constrained by their quadratic scaling and their data inefficiency (arising from their lack of inductive biases). In this paper, we introduce the 3D Axial Multilayer Perceptron (AMLP), a long-range interaction module whose complexity scales linearly with spatial dimensions. This module is merged with CNNs to form the AMLP-Conv module, a long-range augmented convolution with strong inductive biases. Once combined with U-Net, our AMLP-Conv module leads to significant improvement, outperforming most Transformer-based U-Nets on the ACDC dataset and reaching a new state-of-the-art result on the Multi-Modal Whole Heart Segmentation (MM-WHS) dataset, with an improvement of almost 1.1% in Dice score over the previous scores on the Computed Tomography (CT) modality.
KW - 3D semantic segmentation
KW - Axial attention
KW - Convolutional neural network
KW - Heart segmentation
KW - MLP
KW - Multi-label
UR - http://www.scopus.com/inward/record.url?scp=85144814854&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-21014-3_34
DO - 10.1007/978-3-031-21014-3_34
M3 - Conference paper
AN - SCOPUS:85144814854
SN - 9783031210136
T3 - Lecture Notes in Computer Science
SP - 328
EP - 337
BT - Machine Learning in Medical Imaging - 13th International Workshop, MLMI 2022, Held in Conjunction with MICCAI 2022, Proceedings
A2 - Lian, Chunfeng
A2 - Cao, Xiaohuan
A2 - Rekik, Islem
A2 - Xu, Xuanang
A2 - Cui, Zhiming
PB - Springer Science and Business Media Deutschland GmbH
T2 - 13th International Workshop on Machine Learning in Medical Imaging, MLMI 2022, held in conjunction with 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
Y2 - 18 September 2022 through 18 September 2022
ER -