TY - JOUR
T1 - Evaluating a Periapical Lesion Detection CNN on a Clinically Representative CBCT Dataset—A Validation Study
AU - Hadzic, Arnela
AU - Urschler, Martin
AU - Press, Jan Niclas Aaron
AU - Riedl, Regina
AU - Rugani, Petra
AU - Štern, Darko
AU - Kirnbauer, Barbara
N1 - Publisher Copyright: © 2023 by the authors.
PY - 2024/1
Y1 - 2024/1
AB - The aim of this validation study was to comprehensively evaluate the performance and generalization capability of a deep learning-based periapical lesion detection algorithm on a clinically representative cone-beam computed tomography (CBCT) dataset and to test for non-inferiority. The evaluation involved 195 CBCT images of adult upper and lower jaws; sensitivity and specificity were calculated for all teeth, stratified by jaw, and stratified by tooth type. Furthermore, each lesion was assigned a periapical index score based on its size to enable a score-based evaluation. Non-inferiority tests were conducted with proportions of 90% for sensitivity and 82% for specificity. The algorithm achieved an overall sensitivity of 86.7% and a specificity of 84.3%. The non-inferiority test rejected the null hypothesis for specificity but not for sensitivity. However, when lesions with a periapical index score of one (i.e., very small lesions) were excluded, the sensitivity improved to 90.4%. Despite the challenges posed by the dataset, the algorithm demonstrated promising results. Nevertheless, further improvements are needed to enhance the algorithm’s robustness, particularly in detecting very small lesions and in handling the artifacts and outliers commonly encountered in real-world clinical scenarios.
KW - artificial intelligence
KW - convolutional neural network
KW - deep learning
KW - digital imaging/radiology
KW - image segmentation
KW - inflammation
KW - oral diagnosis
KW - periapical lesions
UR - http://www.scopus.com/inward/record.url?scp=85181967454&partnerID=8YFLogxK
U2 - 10.3390/jcm13010197
DO - 10.3390/jcm13010197
M3 - Article
AN - SCOPUS:85181967454
SN - 2077-0383
VL - 13
JO - Journal of Clinical Medicine
JF - Journal of Clinical Medicine
IS - 1
M1 - 197
ER -