ESTRO 2024 - Abstract Book
Physics - Autosegmentation
patients, 114 scans). Model inference was performed on the data of both centers with both the MR-only and the MR+prior model.
To compare the performance of an externally developed model with that of a locally trained model, we also trained an nnU-Net model (without prior) from scratch on the data from center 1 only. We designate this model as the baseline. We split the data at patient level into training, validation and test sets (53, 6 and 13 patients, respectively) and compared the performance of the baseline model on the test cases with that of the previously mentioned in-house developed MR+prior model on the same cases.
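A patient-level split keeps all scans of one patient in the same subset, so no patient contributes to both training and testing. The following is a minimal sketch of such a split, not the authors' implementation: the function name `split_by_patient`, the scan identifiers, the fixed seed and the assumption of two scans per patient are all hypothetical and chosen only for illustration.

```python
import random


def split_by_patient(scan_to_patient, n_train, n_val, n_test, seed=0):
    """Assign whole patients (not single scans) to train/val/test subsets."""
    patients = sorted(set(scan_to_patient.values()))
    if n_train + n_val + n_test != len(patients):
        raise ValueError("split sizes must add up to the number of patients")

    # Shuffle patients reproducibly, then cut the list into three groups.
    random.Random(seed).shuffle(patients)
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }

    # Every scan follows its patient into that patient's subset.
    split = {name: [] for name in groups}
    for scan, patient in sorted(scan_to_patient.items()):
        for name, members in groups.items():
            if patient in members:
                split[name].append(scan)
    return split


# Hypothetical data: 72 patients with two scans each (scan counts are assumed).
scans = {f"p{p:02d}_s{s}": f"p{p:02d}" for p in range(72) for s in range(2)}
split = split_by_patient(scans, n_train=53, n_val=6, n_test=13)
```

Splitting on patients rather than scans matters when patients have several scans, because scans of the same patient are strongly correlated and would otherwise leak information into the test set.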
The models were compared on Dice, surface Dice with a tolerance of 3 mm, and mean surface distance (MSD). Differences were tested for significance using the Wilcoxon signed-rank test.
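As an illustration of the first of these metrics, the Dice coefficient of a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for binary masks given as flat sequences of 0 and 1. The function name `dice`, the toy masks and the convention that two empty masks score 1.0 are assumptions made for this example, not details taken from the abstract.

```python
def dice(pred, truth):
    """Dice coefficient of two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as foreground in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy example: overlap of 2 voxels, each mask has 3 foreground voxels.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
score = dice(pred, truth)  # 2 * 2 / (3 + 3)
```

Unlike this volumetric overlap, surface Dice and MSD are computed on the contour surfaces, which makes them more sensitive to border errors such as the cranial and caudal extent of a structure.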
Results:
Figure 1 shows the performance of the models in terms of Dice, surface Dice (3 mm) and MSD. The MR+prior model performed significantly better than the MR-only model on all metrics (Figure 1, top row), for both center 1 and center 2 (all p-values < 0.01). In the example shown in Figure 2, the MR+prior model follows the cranial and caudal borders of the ground truth more closely than the MR-only model. This is due to the additional patient-specific input, as these borders are not only anatomically defined but also depend on patient-specific information.
Performance of the MR+prior model was similar to that of the baseline model when evaluated on test data from center 1 (all p-values > 0.05), indicating that adding the prior yields model performance comparable to full retraining from scratch.