ESTRO 2024 - Abstract Book
Physics - Autosegmentation
Additionally, for comparison, we generated contours using a benchmark method, LayerCAM [5]. Each method was assessed with two metrics: the detection F1-score, which indicates how well individual tumors/lesions are detected, and the Dice score, which measures the overlap between the ground truth delineations and the predicted contours of tumors or lesions. We furthermore report the balanced accuracy (BA): a subject is considered predicted positive if the generated delineation mask is non-empty, and the BA is computed with respect to the ground truth binary labels used for training.
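The three metrics described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the evaluation code used in the study: the function names are hypothetical, masks are represented as plain nested lists of 0/1 values, and two empty masks are assumed to give a Dice score of 1. The abstract does not state how a predicted lesion is matched to a ground truth lesion, so the F1 helper takes the detection counts as given.

```python
from itertools import chain


def flatten(mask):
    """Flatten a 2D binary mask (list of rows) into a flat list of booleans."""
    return [bool(v) for v in chain.from_iterable(mask)]


def dice_score(truth, pred):
    """Dice = 2*|A & B| / (|A| + |B|) for two binary masks of equal size."""
    t, p = flatten(truth), flatten(pred)
    if len(t) != len(p):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(a and b for a, b in zip(t, p))
    total = sum(t) + sum(p)
    # Assumption: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def detection_f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN) from lesion-level detection counts.

    How a detection is matched to a ground truth lesion is not specified
    in the abstract; the counts are assumed to be computed elsewhere.
    """
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2.0 * tp / denom


def balanced_accuracy(labels, masks):
    """Mean of sensitivity and specificity at subject level.

    A subject is predicted positive if its generated mask is non-empty.
    """
    preds = [any(flatten(m)) for m in masks]
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    tn = sum(1 for y, p in zip(labels, preds) if not y and not p)
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative subject")
    return 0.5 * (tp / pos + tn / neg)
```

Because BA averages sensitivity and specificity, a method that produces a non-empty mask for every subject scores only 0.5, regardless of how many subjects are truly positive.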
Results:
Table 1 summarizes the performance of LayerCAM and our method, and Figure 1 compares the ground truth with contours generated by our method. Our method matched or exceeded the benchmark in F1-score, Dice score and BA on all datasets, with one exception: the BA on the AutoPET dataset was lower (0.74 vs. 0.80; Table 1). Moreover, Figure 1 shows clear visual agreement between the ground truth and the generated contours.
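| Dataset            | Metric   | LayerCAM | Our method |
|--------------------|----------|----------|------------|
| AutoPET            | F1-score | 0.35     | 0.39       |
| AutoPET            | Dice     | 0.31     | 0.32       |
| AutoPET            | BA       | 0.80     | 0.74       |
| MosMed             | F1-score | 0.39     | 0.50       |
| MosMed             | Dice     | 0.35     | 0.35       |
| MosMed             | BA       | 0.90     | 0.93       |
| Duke breast cancer | F1-score | 0.24     | 0.51       |
| Duke breast cancer | Dice     | -        | -          |
| Duke breast cancer | BA       | 0.58     | 0.79       |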
Table 1: Performance evaluation on different datasets using the benchmark method (LayerCAM) and our method. For the Duke dataset, only bounding box annotations were available; therefore, the Dice score is not reported.