ESTRO 2024 - Abstract Book

S3020

Physics - Autosegmentation

Purpose/Objective:

Automated segmentation (AS) of organs-at-risk (OARs) can improve the efficiency and quality of radiotherapy planning, but only if human checking and editing are not onerous or overly time-consuming. Quality assurance (QA) for AS remains a challenge. Recommendations have been made to use independent QA tools rather than internal model probabilities and uncertainty estimates, which are typically poorly calibrated (1) and tend to prioritise epistemic (model) uncertainty over aleatoric (data-driven) uncertainty.

We developed AutoConfidence (ACo), a novel AI-driven QA method that estimates pixel-wise confidence maps on a per-patient basis, enabling robust, efficient review of AS for radiotherapy.

Material/Methods:

Model:

ACo consisted of a generator U-net (G), which produced internal autosegmentations (IAS), and a U-ResNET discriminator (D), which estimated uncertainty given the segmentation and the input image. The two networks were trained adversarially: G produced segmentations, while D estimated the pixel-wise probability (pGS) that the IAS was drawn from the unseen ground-truth distribution, of which the gold-standard examples are (noisy) samples. D therefore estimated the local probability that a segmentation was ‘reasonable’, combining model and data uncertainties into a single confidence map.

Data and training:

Retrospective glioma cases with T1w-Gd MRI and OARs were randomly selected (Leeds East REC: 19/YH/0300, IRAS: 255585): 32 for ACo training and 9 for validation.

Model training continued until testing metrics stabilised. Synthetic errors were applied to the IAS produced by G before D estimated pGS. These errors were intended to mimic real-world failure cases, enabling D to learn from both high- and low-quality AS examples.
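The abstract does not state which synthetic errors were used, so the following is only an illustrative sketch of how a segmentation mask might be corrupted before being passed to D. The three error types (shift, over-segmentation, under-segmentation), the function names and the `max_shift` parameter are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def shift(mask, dy, dx):
    """Translate a 2D boolean mask by (dy, dx) pixels, padding with False."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    src = mask[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

# 4-connected neighbour offsets used for one-pixel growth and shrinkage.
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def grow(mask):
    """Over-segmentation: add a one-pixel rim (4-connected dilation)."""
    out = mask.copy()
    for dy, dx in NEIGHBOURS:
        out |= shift(mask, dy, dx)
    return out

def shrink(mask):
    """Under-segmentation: strip a one-pixel rim (4-connected erosion)."""
    out = mask.copy()
    for dy, dx in NEIGHBOURS:
        out &= shift(mask, dy, dx)
    return out

def add_synthetic_error(mask, rng, max_shift=3):
    """Return a corrupted copy of `mask` using one randomly chosen error type.

    The choice of error types here is hypothetical; the source only says
    that the errors were meant to mimic real-world failure cases.
    """
    kind = rng.choice(["shift", "grow", "shrink"])
    if kind == "shift":
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        return shift(mask, int(dy), int(dx))
    return grow(mask) if kind == "grow" else shrink(mask)
```

In a training loop, a call such as `add_synthetic_error(ias_mask, np.random.default_rng())` would be applied to each boolean IAS mask so that D sees a mix of plausible and implausible segmentations.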

Testing:

ACo performance was evaluated on IAS and on third-party deep-learning segmentations from a commercial system (RayStation, RaySearch Laboratories, Sweden). These external models had previously been developed and validated in-house (2). Two of them, one of high (MRIeMRI) and one of lower (MRIu) quality (2), were used for external validation of ACo. Quantitative assessment against ‘difference to gold-standard’ (d2GS) maps yielded a confusion matrix, from which the Matthews correlation coefficient (MCC), False Positive Rate (FPR) and False Negative Rate (FNR) were derived.

Two corrections were applied to achieve more clinically relevant outputs and performance measures. Intelligent Edge Removal (IER) was used because ACo produces a confidence map with low confidence at OAR boundaries, owing to partial volume effects and small differences (1-2 px) between the autosegmentation and the gold standard. The d2GS and pGS maps were masked out within 3 pixels of the OAR boundaries, and any remaining errors adjacent to the masked region were grown back into it by morphological dilation. This correction was intended to make the confidence maps easier to interpret by isolating low-confidence regions away from the immediate segmentation boundaries.
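The edge-removal step and the confusion-matrix measures described above can be sketched in a few lines. This is one possible reading, not the authors' code: it assumes 2D boolean maps (for example, pGS thresholded into "flagged" pixels, with the threshold left unspecified), a 4-connected structuring element, and that "grown back" means dilation restricted to error pixels that were originally present inside the masked band. It also assumes that a "positive" is a pixel flagged as an error. The function names `edge_removal` and `confusion_metrics` are hypothetical.

```python
import math
import numpy as np

def dilate(mask, iterations=1):
    """Binary dilation with a 4-connected (cross) structuring element."""
    out = np.asarray(mask, dtype=bool)
    for _ in range(iterations):
        p = np.pad(out, 1, mode="constant", constant_values=False)
        out = (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
               | p[1:-1, :-2] | p[1:-1, 2:])
    return out

def erode(mask, iterations=1):
    """Binary erosion, computed as the complement of a dilated complement."""
    return ~dilate(~np.asarray(mask, dtype=bool), iterations)

def edge_removal(error, seg, width=3):
    """Drop errors confined to a band around the boundary of `seg`.

    Errors outside the band are kept, then regrown into the band, but only
    over pixels that were errors in the original map (an assumption).
    """
    error = np.asarray(error, dtype=bool)
    band = dilate(seg, width) & ~erode(seg, width)  # within `width` px of the edge
    grown = error & ~band                           # errors surviving the mask
    while True:
        nxt = grown | (dilate(grown) & error & band)
        if np.array_equal(nxt, grown):
            return grown
        grown = nxt

def confusion_metrics(flagged, true_error):
    """Return (MCC, FPR, FNR) for two boolean maps of equal shape."""
    flagged = np.asarray(flagged, dtype=bool)
    true_error = np.asarray(true_error, dtype=bool)
    tp = int(np.sum(flagged & true_error))
    fp = int(np.sum(flagged & ~true_error))
    fn = int(np.sum(~flagged & true_error))
    tn = int(np.sum(~flagged & ~true_error))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return mcc, fpr, fnr
```

Under these assumptions, `edge_removal` would be applied to both the thresholded pGS map and the d2GS map before calling `confusion_metrics`, so that thin boundary disagreements do not dominate the reported MCC, FPR and FNR. In practice, `scipy.ndimage.binary_dilation` and `binary_erosion` would be the usual choice; the NumPy-only helpers are used here to keep the sketch dependency-light.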