Accuracy and Precision of Mandible Segmentation and Its Clinical Implications: Virtual Reality, Desktop Screen and Artificial Intelligence

Lennart Johannes Gruber; Jan Egger; Andrea Bönsch; Joep Kraeima; Max Ulbrich; Vincent van den Bosch; Ila Motmaen; Caroline Wilpert; Mark Ooms; Peter Isfort; Frank Hölzle; Behrus Puladi

doi:10.1016/j.eswa.2023.122275

*Corresponding author for this work: Behrus Puladi

Institut für Maschinelles Sehen und Darstellen (7100)

Publication: Contribution to journal › Article › peer-review

Abstract

Objective: 3D modeling is a major challenge in computer-assisted surgery (CAS). Manual segmentation, the gold standard, is tedious, time-consuming, and particularly challenging for the mandible, while artificial intelligence (AI)-based segmentation is a promising and time-saving alternative. However, little is known about the clinical implications of the various segmentation methods. Method: In this cross-over study, ten mandibles were segmented by five experts in virtual reality (VR) and on a desktop screen (DS), and by five AI models. The exported mandible models were evaluated using metrics, a public reference (PUB_DS), and blinded assessments by two radiologists. Results: Average segmentation-to-volume accuracy (1 = poor, 5 = perfect) was comparable for human segmentation (VR: 4.56; DS: 4.33; PUB_DS: 4.55) and significantly better than for AI-based segmentation (AI: 3.80), while the average segmentation-to-segmentation accuracy revealed that DS (91.4 %/0.37 mm [Dice coefficient/average Hausdorff distance]) was more comparable to PUB_DS than to VR (90.1 %/0.44 mm). The precision of VR (96.8 %/0.14 mm) and DS (96.6 %/0.15 mm) was superior to that of PUB_DS (94.1 %/0.21 mm) and the AI method (89.2 %/0.60 mm). Among the manual segmentation methods, VR was significantly faster than DS and PUB_DS (p = 0.007/< 0.001); the AI method, in contrast, is not time-sensitive because of its possible hardware scalability. Conclusion: Accuracy and precision of mandible segmentation depend primarily on CT quality and anatomical site, which should be considered in clinical applications and in the generation of AI training data, and which could negatively impact CAS. Although current AI models have perfect intra-model reliability, they demonstrate higher inter-model variability and are accompanied by invalid outliers, making human review still necessary.
In summary, the use of VR in manual segmentation showed high accuracy and precision overall while saving time, making it the preferred method over DS due to its good usability.
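
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how a Dice coefficient and an average Hausdorff distance can be computed for two segmentations. It is not the authors' evaluation code. The function names, the representation of a segmentation as a set of voxel coordinates, and the symmetric averaging used for the distance are all assumptions made for illustration; this record does not state which exact definition the paper uses, and distances here are in voxel units rather than millimetres.

```python
import math


def dice_coefficient(a, b):
    """Dice overlap of two voxel sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_nearest_distance(src, dst):
    """Mean distance from each voxel in src to its nearest voxel in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def average_hausdorff_distance(a, b):
    """Symmetric average of the two directed mean nearest-neighbour distances.

    Assumed definition for illustration; other variants exist.
    Both sets must be non-empty.
    """
    return 0.5 * (mean_nearest_distance(a, b) + mean_nearest_distance(b, a))


# Toy example (hypothetical data): a 4x4 slab and the same slab shifted by one voxel in x.
reference = {(x, y, 0) for x in range(4) for y in range(4)}
candidate = {(x, y, 0) for x in range(1, 5) for y in range(4)}

print(dice_coefficient(reference, candidate))            # 0.75
print(average_hausdorff_distance(reference, candidate))  # 0.25
```

The brute-force nearest-neighbour search is quadratic in the number of voxels, so it is only suitable for small examples; real CT-sized masks would need a distance transform or a spatial index.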

Original language	English
Article number	122275
Journal	Expert Systems with Applications
Volume	239
DOIs	https://doi.org/10.1016/j.eswa.2023.122275
Publication status	Published - 1 Apr 2024

ASJC Scopus subject areas

General Engineering
Computer Science Applications
Artificial Intelligence

Access to document

10.1016/j.eswa.2023.122275

Other files and links

Link to publication in Scopus

Cite this

Gruber, L. J., Egger, J., Bönsch, A., Kraeima, J., Ulbrich, M., van den Bosch, V., Motmaen, I., Wilpert, C., Ooms, M., Isfort, P., Hölzle, F., & Puladi, B. (2024). Accuracy and Precision of Mandible Segmentation and Its Clinical Implications: Virtual Reality, Desktop Screen and Artificial Intelligence. Expert Systems with Applications, 239, Article 122275. https://doi.org/10.1016/j.eswa.2023.122275

Gruber, LJ, Egger, J, Bönsch, A, Kraeima, J, Ulbrich, M, van den Bosch, V, Motmaen, I, Wilpert, C, Ooms, M, Isfort, P, Hölzle, F & Puladi, B 2024, 'Accuracy and Precision of Mandible Segmentation and Its Clinical Implications: Virtual Reality, Desktop Screen and Artificial Intelligence', Expert Systems with Applications, vol. 239, 122275. https://doi.org/10.1016/j.eswa.2023.122275

@article{ba8885f8aa2a48d39a2ad5b13eb89d81,
  title = "Accuracy and Precision of Mandible Segmentation and Its Clinical Implications: Virtual Reality, Desktop Screen and Artificial Intelligence",
  abstract = "Objective: 3D modeling is a major challenge in computer-assisted surgery (CAS). Manual segmentation, the gold standard, is tedious, time-consuming, and particularly challenging for the mandible, while artificial intelligence (AI)-based segmentation is a promising and time-saving alternative. However, little is known about the clinical implications of the various segmentation methods. Method: In this cross-over study, ten mandibles were segmented by five experts in virtual reality (VR) and on a desktop screen (DS), and by five AI models. The exported mandible models were evaluated using metrics, a public reference (PUB_DS), and blinded assessments by two radiologists. Results: Average segmentation-to-volume accuracy (1 = poor, 5 = perfect) was comparable for human segmentation (VR: 4.56; DS: 4.33; PUB_DS: 4.55) and significantly better than for AI-based segmentation (AI: 3.80), while the average segmentation-to-segmentation accuracy revealed that DS (91.4 %/0.37 mm [Dice coefficient/average Hausdorff distance]) was more comparable to PUB_DS than to VR (90.1 %/0.44 mm). The precision of VR (96.8 %/0.14 mm) and DS (96.6 %/0.15 mm) was superior to that of PUB_DS (94.1 %/0.21 mm) and the AI method (89.2 %/0.60 mm). Among the manual segmentation methods, VR was significantly faster than DS and PUB_DS (p = 0.007/< 0.001); the AI method, in contrast, is not time-sensitive because of its possible hardware scalability. Conclusion: Accuracy and precision of mandible segmentation depend primarily on CT quality and anatomical site, which should be considered in clinical applications and in the generation of AI training data, and which could negatively impact CAS. Although current AI models have perfect intra-model reliability, they demonstrate higher inter-model variability and are accompanied by invalid outliers, making human review still necessary. In summary, the use of VR in manual segmentation showed high accuracy and precision overall while saving time, making it the preferred method over DS due to its good usability.",
  keywords = "Artificial intelligence, Computer-assisted surgery, Oral and maxillofacial surgery, Segmentation, Virtual reality",
  author = "Gruber, {Lennart Johannes} and Jan Egger and Andrea B{\"o}nsch and Joep Kraeima and Max Ulbrich and {van den Bosch}, Vincent and Ila Motmaen and Caroline Wilpert and Mark Ooms and Peter Isfort and Frank H{\"o}lzle and Behrus Puladi",
  note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Ltd",
  year = "2024",
  month = apr,
  day = "1",
  doi = "10.1016/j.eswa.2023.122275",
  language = "English",
  volume = "239",
  journal = "Expert Systems with Applications",
  issn = "0957-4174",
  publisher = "Elsevier B.V.",
}

TY - JOUR

T1 - Accuracy and Precision of Mandible Segmentation and Its Clinical Implications

T2 - Virtual Reality, Desktop Screen and Artificial Intelligence

AU - Gruber, Lennart Johannes

AU - Egger, Jan

AU - Bönsch, Andrea

AU - Kraeima, Joep

AU - Ulbrich, Max

AU - van den Bosch, Vincent

AU - Motmaen, Ila

AU - Wilpert, Caroline

AU - Ooms, Mark

AU - Isfort, Peter

AU - Hölzle, Frank

AU - Puladi, Behrus

PY - 2024/4/1

Y1 - 2024/4/1

AB - Objective: 3D modeling is a major challenge in computer-assisted surgery (CAS). Manual segmentation, the gold standard, is tedious, time-consuming, and particularly challenging for the mandible, while artificial intelligence (AI)-based segmentation is a promising and time-saving alternative. However, little is known about the clinical implications of the various segmentation methods. Method: In this cross-over study, ten mandibles were segmented by five experts in virtual reality (VR) and on a desktop screen (DS), and by five AI models. The exported mandible models were evaluated using metrics, a public reference (PUB_DS), and blinded assessments by two radiologists. Results: Average segmentation-to-volume accuracy (1 = poor, 5 = perfect) was comparable for human segmentation (VR: 4.56; DS: 4.33; PUB_DS: 4.55) and significantly better than for AI-based segmentation (AI: 3.80), while the average segmentation-to-segmentation accuracy revealed that DS (91.4 %/0.37 mm [Dice coefficient/average Hausdorff distance]) was more comparable to PUB_DS than to VR (90.1 %/0.44 mm). The precision of VR (96.8 %/0.14 mm) and DS (96.6 %/0.15 mm) was superior to that of PUB_DS (94.1 %/0.21 mm) and the AI method (89.2 %/0.60 mm). Among the manual segmentation methods, VR was significantly faster than DS and PUB_DS (p = 0.007/< 0.001); the AI method, in contrast, is not time-sensitive because of its possible hardware scalability. Conclusion: Accuracy and precision of mandible segmentation depend primarily on CT quality and anatomical site, which should be considered in clinical applications and in the generation of AI training data, and which could negatively impact CAS. Although current AI models have perfect intra-model reliability, they demonstrate higher inter-model variability and are accompanied by invalid outliers, making human review still necessary. In summary, the use of VR in manual segmentation showed high accuracy and precision overall while saving time, making it the preferred method over DS due to its good usability.

KW - Artificial intelligence

KW - Computer-assisted surgery

KW - Oral and maxillofacial surgery

KW - Segmentation

KW - Virtual reality

UR - http://www.scopus.com/inward/record.url?scp=85176304182&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2023.122275

DO - 10.1016/j.eswa.2023.122275

M3 - Article

AN - SCOPUS:85176304182

SN - 0957-4174

VL - 239

JO - Expert Systems with Applications

JF - Expert Systems with Applications

M1 - 122275

ER -
