ESTRO 2024 - Abstract Book

Physics - Autosegmentation

For comparison, we also generated contours with a benchmark method, LayerCAM [5]. Each method was assessed with the detection F1-score, which indicates how well individual tumors or lesions are detected, and the Dice score, which measures the overlap between the ground-truth delineations and the predicted contours of tumors or lesions. We also report the balanced accuracy (BA): a subject is predicted positive if its generated delineation mask is non-empty, and the BA is computed against the ground-truth binary labels used for training.
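The three metrics can be illustrated with a minimal sketch. This is not the evaluation code used in the study: the function names, the flat 0/1 mask representation, and the handling of empty masks are assumptions made for illustration. The abstract does not state how a predicted lesion is matched to a ground-truth lesion, so the detection F1-score below simply takes the true-positive, false-positive and false-negative lesion counts as inputs.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    pred_sum = sum(pred)
    truth_sum = sum(truth)
    if pred_sum + truth_sum == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    intersection = sum(p * t for p, t in zip(pred, truth))
    return 2.0 * intersection / (pred_sum + truth_sum)


def detection_f1(tp, fp, fn):
    """F1-score from lesion-level counts (the matching rule is left to the caller)."""
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0


def balanced_accuracy(pred_masks, labels):
    """Mean of sensitivity and specificity at subject level.

    A subject is predicted positive if its mask is non-empty,
    as described in the text above.
    """
    preds = [int(any(mask)) for mask in pred_masks]
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("balanced accuracy needs both positive and negative subjects")
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    return 0.5 * (tp / pos + tn / neg)


if __name__ == "__main__":
    # Toy masks, not data from the study.
    print(dice_score([0, 1, 1, 0], [0, 1, 0, 0]))
    print(detection_f1(tp=3, fp=1, fn=2))
    print(balanced_accuracy([[0, 1], [1, 1], [1, 0], [0, 0]], [1, 1, 0, 0]))
```

In the last toy call, both positive subjects receive a non-empty mask while one of the two negative subjects also does, so sensitivity is 1.0 and specificity is 0.5.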

Results:

The performance of LayerCAM and our method is summarized in Table 1, and ground-truth contours are compared with those generated by our method in Figure 1. Our method matched or exceeded the benchmark in F1-score, Dice score and BA on all datasets, with one exception: the BA on the AutoPET dataset was lower (0.74 vs. 0.80; Table 1). The generated contours are also visually consistent with the ground truth (Figure 1).

Dataset              Metric     LayerCAM   Our method
AutoPET              F1-score   0.35       0.39
AutoPET              Dice       0.31       0.32
AutoPET              BA         0.80       0.74
MosMed               F1-score   0.39       0.50
MosMed               Dice       0.35       0.35
MosMed               BA         0.90       0.93
Duke breast cancer   F1-score   0.24       0.51
Duke breast cancer   Dice       -          -
Duke breast cancer   BA         0.58       0.79

Table 1: Performance of the benchmark method (LayerCAM) and our method on each dataset. For the Duke breast cancer dataset, only bounding-box annotations were available, so the Dice score is not reported.
