ESTRO 2024 - Abstract Book
S3015
Physics - Autosegmentation
[5] Huang Y, Bert C, Sommer P, Frey B, Gaipl U, Distel LV, Weissmann T, Uder M, Schmidt MA, Dörfler A, Maier A, Fietkau R, Putz F. Deep learning for brain metastasis detection and segmentation in longitudinal MRI data. Medical Physics. 2022 Sep;49(9):5773-86.
611
Proffered Paper
Diving deep into the uncertainty estimation methods for head and neck tumor auto-segmentation
Jintao Ren 1,2,3, Jonas Teuwen 4, Jasper Nijkamp 1,3, Mathis Rasmussen 1,3,2, Zeno Gouw 4, Jesper G Eriksen 2, Jan-Jakob Sonke 4, Stine S Korreman 1,3,2
1 Aarhus University Hospital, Danish Center for Particle Therapy, Aarhus, Denmark. 2 Aarhus University Hospital, Department of Oncology, Aarhus, Denmark. 3 Aarhus University, Department of Clinical Medicine, Aarhus, Denmark. 4 Netherlands Cancer Institute, Department of Radiation Oncology, Amsterdam, Netherlands
Purpose/Objective:
Deep learning enables auto-segmentation of the primary gross tumor volume (GTV-T) and involved nodal metastases (GTV-N) in patients with head-and-neck cancer (HNC)[1]. However, its clinical application is limited by occasional errors and failures that undermine its reliability[2]. Uncertainty estimation can address this limitation by attaching confidence levels to the model’s predictions[3,4], provided that the estimation technique itself is reliable and consistent with the model’s overall accuracy[5]. Our objective was therefore to systematically evaluate and compare uncertainty estimation methods, including the conventional deterministic approach. We aimed to determine how well each method improves the reliability of auto-segmentation for HNC GTV-T and GTV-N by quantifying segmentation uncertainty and relating it to potential segmentation errors.
Material/Methods:
We collected data from 567 HNC patients with diverse anatomical sites and multi-modality images (CT, PET, and T1-weighted and T2-weighted MRI), along with their clinical GTV-T/N delineations. We randomly split the dataset into a training set of 470 patients and a test set of 97 patients. Models were trained with the nnUNet 3D segmentation pipeline, treating GTV-T and GTV-N as two separate targets. As a reference for comparison, we employed test-time augmentation (TTA), referred to as “Baseline”, which is effective at estimating aleatoric uncertainty (irreducible uncertainty inherent in the data)[6]. To estimate epistemic uncertainty (model uncertainty that can be reduced by introducing more data), we considered the following five methods: 1. Monte Carlo dropout with a rate of 0.2, using 10 samples (referred to as "MC Dropout")[7], 2. An ensemble of five models, each trained independently on a different split of the training set (referred to as "Ensemble")[8], 3. A snapshot ensemble, which took predictions from five checkpoints saved prior to learning rate restarts (referred to as "Snapshot")[9],
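The sampling-based approaches named so far (TTA, MC Dropout, Ensemble, Snapshot) each produce several probability maps per patient, which must then be reduced to one segmentation and one uncertainty map. The sketch below illustrates one common way to do this, assuming voxel-wise foreground probabilities and using binary predictive entropy as the uncertainty measure. The abstract does not state which measure the authors used, so the entropy choice, the 0.5 threshold, and the function name `aggregate_samples` are illustrative assumptions rather than the study's implementation.

```python
import numpy as np


def aggregate_samples(prob_samples, eps=1e-8):
    """Reduce stacked probability maps to a mean map, an entropy map and a mask.

    prob_samples: array of shape (n_samples, *volume_shape) holding the
    foreground probability predicted by each sample (e.g. each dropout pass,
    ensemble member, snapshot, or augmented view).
    NOTE: hypothetical helper; entropy and the 0.5 threshold are assumptions.
    """
    prob_samples = np.asarray(prob_samples, dtype=np.float64)

    # Average over the sample axis to obtain the final foreground probability.
    mean_prob = prob_samples.mean(axis=0)

    # Binary predictive entropy in bits: 0 where all samples agree confidently,
    # 1 where the averaged prediction is maximally ambiguous (p = 0.5).
    p = np.clip(mean_prob, eps, 1.0 - eps)
    entropy = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

    # Threshold the averaged probability to obtain the binary segmentation.
    segmentation = mean_prob >= 0.5
    return mean_prob, entropy, segmentation


# Synthetic stand-in for 10 stochastic predictions on a small 3D volume.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(10, 4, 8, 8))
mean_prob, entropy, segmentation = aggregate_samples(samples)
```

High-entropy voxels could then be compared against voxels where the segmentation disagrees with the clinical delineation, which is one way to relate uncertainty to segmentation error.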