TY - JOUR
T1 - GBM-Reservoir
T2 - Brain tumor (Glioblastoma Multiforme) MRI dataset collection with ground truth segmentation masks
AU - Solak, Naida
AU - Ferreira, André
AU - Luijten, Gijs
AU - Puladi, Behrus
AU - Alves, Victor
AU - Egger, Jan
N1 - Publisher Copyright: © 2025
PY - 2025/2
Y1 - 2025/2
N2 - In this article, we present a brain tumor database collection comprising 23,049 samples, with each sample including four different types of MRI brain scans: FLAIR, T1, T1ce, and T2. Additionally, one or two segmentation masks (ground truth) are provided for each sample. The first mask is the raw output of the registration process and is provided for all samples, while the second mask, provided specifically for synthetic samples, is a post-processed version of the first, designed to simplify interpretation and to be better suited to network training. The samples were generated by registering the 438 samples that were available at the time of registration from the original dataset provided by the BraTS 2022 Challenge. Registering each pair of existing brain scans results in two additional scans that retain a similar brain shape while featuring varying tumor locations. Consequently, by registering all possible pairs, a dataset originally consisting of n samples can be expanded to n² samples. The original dataset was collected from different institutions under standard clinical conditions, but with different equipment and imaging protocols. As a result, the image quality is heterogeneous, reflecting the diversity of clinical practices across institutions. This dataset can be used for various tasks, such as developing fully automated segmentation algorithms for new, unseen brain tumor cases, particularly through deep learning-based approaches, since ground truth is provided for each sample.
AB - In this article, we present a brain tumor database collection comprising 23,049 samples, with each sample including four different types of MRI brain scans: FLAIR, T1, T1ce, and T2. Additionally, one or two segmentation masks (ground truth) are provided for each sample. The first mask is the raw output of the registration process and is provided for all samples, while the second mask, provided specifically for synthetic samples, is a post-processed version of the first, designed to simplify interpretation and to be better suited to network training. The samples were generated by registering the 438 samples that were available at the time of registration from the original dataset provided by the BraTS 2022 Challenge. Registering each pair of existing brain scans results in two additional scans that retain a similar brain shape while featuring varying tumor locations. Consequently, by registering all possible pairs, a dataset originally consisting of n samples can be expanded to n² samples. The original dataset was collected from different institutions under standard clinical conditions, but with different equipment and imaging protocols. As a result, the image quality is heterogeneous, reflecting the diversity of clinical practices across institutions. This dataset can be used for various tasks, such as developing fully automated segmentation algorithms for new, unseen brain tumor cases, particularly through deep learning-based approaches, since ground truth is provided for each sample.
KW - Brain tumor segmentation
KW - BraTS
KW - Data augmentation
KW - Deep learning
KW - Registration
UR - http://www.scopus.com/inward/record.url?scp=85215545835&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2025.111287
DO - 10.1016/j.dib.2025.111287
M3 - Article
AN - SCOPUS:85215545835
SN - 2352-3409
VL - 58
JO - Data in Brief
JF - Data in Brief
M1 - 111287
ER -