TY - GEN
T1 - Matwo-CapsNet: A Multi-label Semantic Segmentation Capsules Network
AU - Bonheur, Savinien
AU - Stern, Darko
AU - Payer, Christian
AU - Pienn, Michael
AU - Olschewski, Horst
AU - Urschler, Martin
PY - 2019
Y1 - 2019
N2 - Despite some design limitations, CNNs have been largely adopted by the computer vision community due to their efficacy and versatility. Introduced by Sabour et al. to circumvent some limitations of CNNs, capsules replace scalars with vectors to encode appearance feature representation, allowing better preservation of spatial relationships between whole objects and their parts. They also introduced the dynamic routing mechanism, which allows the contributions of parts to a whole object to be weighted differently at each inference step. Recently, Hinton et al. have proposed to solely encode pose information to model such part-whole relationships. Additionally, they used a matrix instead of a vector encoding in the capsules framework. In this work, we introduce several improvements to the capsules framework, allowing it to be applied to multi-label semantic segmentation. More specifically, we combine pose and appearance information encoded as matrices into a new type of capsule, i.e. Matwo-Caps. Additionally, we propose a novel routing mechanism, i.e. Dual Routing, which effectively combines these two kinds of information. We evaluate our resulting Matwo-CapsNet on the JSRT chest X-ray dataset by comparing it to SegCaps, a capsule-based network for binary segmentation, as well as to other CNN-based state-of-the-art segmentation methods, where we show that our Matwo-CapsNet achieves competitive results, while requiring only a fraction of the parameters of other previously proposed methods.
AB - Despite some design limitations, CNNs have been largely adopted by the computer vision community due to their efficacy and versatility. Introduced by Sabour et al. to circumvent some limitations of CNNs, capsules replace scalars with vectors to encode appearance feature representation, allowing better preservation of spatial relationships between whole objects and their parts. They also introduced the dynamic routing mechanism, which allows the contributions of parts to a whole object to be weighted differently at each inference step. Recently, Hinton et al. have proposed to solely encode pose information to model such part-whole relationships. Additionally, they used a matrix instead of a vector encoding in the capsules framework. In this work, we introduce several improvements to the capsules framework, allowing it to be applied to multi-label semantic segmentation. More specifically, we combine pose and appearance information encoded as matrices into a new type of capsule, i.e. Matwo-Caps. Additionally, we propose a novel routing mechanism, i.e. Dual Routing, which effectively combines these two kinds of information. We evaluate our resulting Matwo-CapsNet on the JSRT chest X-ray dataset by comparing it to SegCaps, a capsule-based network for binary segmentation, as well as to other CNN-based state-of-the-art segmentation methods, where we show that our Matwo-CapsNet achieves competitive results, while requiring only a fraction of the parameters of other previously proposed methods.
U2 - 10.1007/978-3-030-32254-0_74
DO - 10.1007/978-3-030-32254-0_74
M3 - Conference paper
SN - 978-3-030-32253-3
T3 - Lecture Notes in Computer Science
SP - 664
EP - 672
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2019
PB - Springer
CY - Cham
T2 - 22nd International Conference on Medical Image Computing and Computer-Assisted Intervention
Y2 - 13 October 2019 through 17 October 2019
ER -