ESTRO 2023 - Abstract Book

S260

Saturday 13 May


ones was 13 of 1131 (1.1%, Table 1). The average PQM score (out of 100) with the re-planned dose distributions was 69.1±15.4 for the manual OARs and 69.2±15.2 for the DL-generated OARs.

Table 1. Number of unmet clinical objectives with dose distributions from plans generated on unedited DL structures.

OAR               | Dose-volume limit | Failures (replan on gold structures) | Failures (replan on unedited DL structures) | p-value
Brainstem_3mm     | Max <25Gy         | 16 | 15 | 0.12
Bone_Mandible     | Max <50Gy         | 34 | 32 | <0.0001
Cavity_Oral       | Mean <32Gy        | 13 | 11 | <0.0001
SMG_contralateral | Mean <39Gy        | 7  | 6  | 0.22
Larynx            | V(55Gy) <32%      | 5  | 4  | 0.45
Larynx            | Mean <51Gy        | 6  | 5  | 0.60
Musc_Constrict_M  | Mean <54Gy        | 7  | 4  | 0.14
Musc_Constrict_I  | V(40Gy) <65%      | 6  | 5  | <0.0001
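The objectives in Table 1 are of three kinds: a maximum-dose limit, a mean-dose limit, and a relative-volume limit V(D). A minimal sketch of how such dose-volume points can be checked against a dose grid and a binary structure mask follows; the function names and arrays are illustrative assumptions, not details from the abstract:

```python
import numpy as np

# Illustrative DVH-style objective checks (Max, Mean, V(D)) on a dose grid
# and a binary structure mask. Names and data are hypothetical.

def max_dose(dose, mask):
    """Maximum dose (Gy) inside the structure."""
    return float(dose[mask.astype(bool)].max())

def mean_dose(dose, mask):
    """Mean dose (Gy) inside the structure."""
    return float(dose[mask.astype(bool)].mean())

def v_dose(dose, mask, level):
    """Percentage of the structure volume receiving >= `level` Gy."""
    d = dose[mask.astype(bool)]
    return 100.0 * float((d >= level).sum()) / d.size

# Toy example: a 4x4 dose plane and a structure covering its left half.
dose = np.array([[10, 20, 30, 40],
                 [10, 20, 30, 40],
                 [10, 20, 30, 40],
                 [10, 20, 30, 40]], dtype=float)
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True  # structure = columns 0 and 1

print(max_dose(dose, mask))      # 20.0
print(mean_dose(dose, mask))     # 15.0
print(v_dose(dose, mask, 15.0))  # 50.0 (only the 20 Gy column qualifies)
```

An objective such as "Cavity_Oral Mean <32Gy" then reduces to a single comparison, e.g. `mean_dose(dose, oral_cavity_mask) < 32.0`.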

Conclusion
The DL-generated HN OARs resulted in treatment plans of quality equivalent to the original ones, as judged by target coverage, PQM scores, and DVH analysis of the replanned dose distributions combined with the original OARs. The DL-based replanned dose distributions projected onto the gold-standard OARs met the clinical objectives in 99% of the 1131 individual dose-volume points.

PD-0326 Alternative DL segmentation approaches for clinical partially-labeled HN data: transformers or Unet?

L. Cubero Gutiérrez 1,2, J. Castelli 2, R. de Crevoisier 2, O. Acosta 2, J. Pascau 1,3

1 Universidad Carlos III de Madrid, Departamento de Bioingeniería, Madrid, Spain; 2 Université Rennes, CLCC Eugène Marquis, Inserm, LTSI - UMR 1099, Rennes, France; 3 Hospital Gregorio Marañón, Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain

Purpose or Objective
Deep learning (DL) has recently demonstrated efficiency and robustness for the automatic segmentation of organs at risk (OARs) in radiotherapy (RT). Head and neck (HN) RT treatment planning could particularly benefit from this tool because of the large number of OARs in the region and the difficulty of manual segmentation, driven by the varying shapes of some organs, anatomical deformations induced by the tumor, and low contrast between tissues. These issues also diminish the efficacy of DL models, which are usually trained and evaluated on single-center curated datasets. This study aimed to compare the performance of two state-of-the-art DL networks for HN OAR segmentation trained on a partially-labeled clinical database, in which each patient has a different subset of OARs manually contoured.

Materials and Methods
The study included 225 partially-labeled CT images from HN cancer patients with locally advanced carcinoma of the oropharynx, a condition that often hampers OAR contouring. These data were used to train a two-step workflow to segment 11 OARs.
First, single-class OAR-specific networks based on 3D U-Net were trained to generate pseudo-contours for the CTs with missing labels, yielding a fully-segmented training image set. Then, a multiclass network was trained with 5-fold cross-validation to segment the 11 OARs simultaneously, exploiting the anatomical relationships between the individual structures. In this step, we compared two state-of-the-art DL algorithms: nnU-Net, a self-configuring fully-convolutional neural network, and SwinUNETR, a model that brings vision transformers with self-attention mechanisms to the delineation task. Both models have shown competitive results in semantic segmentation but, to our knowledge, have never been applied to partially-labeled clinical HN datasets. Both algorithms were evaluated on 44 fully-labeled CT images excluded from training, measuring the Dice Similarity Coefficient (DSC) and Average Surface Distance (ASD).

Results
Figure 1 depicts the evaluation metrics of both DL models on the test set. nnU-Net achieved slightly more accurate results for almost every OAR. Nonetheless, the differences in performance were small, and both networks achieved very accurate results for all OARs except the lips, submandibular glands, and larynx (DSC < 75%). Each fold of nnU-Net trained in 23 hours, whereas SwinUNETR required 40 hours per fold.
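The first step of the workflow, merging per-organ contours (manual where available, pseudo-contours elsewhere) into one multiclass training label map, can be sketched as below. The function name and the overlap rule (higher class id wins) are assumptions for illustration, not details given in the abstract:

```python
import numpy as np

def build_label_map(masks):
    """Combine per-organ binary masks into one multiclass label volume.

    `masks` maps a 1-based class id to a binary array of identical shape;
    voxels covered by no mask stay 0 (background). Overlaps between organs
    are resolved by letting the higher class id win (an assumed rule).
    """
    shape = next(iter(masks.values())).shape
    labels = np.zeros(shape, dtype=np.uint8)
    for class_id in sorted(masks):
        labels[masks[class_id].astype(bool)] = class_id
    return labels

# Toy example with two overlapping 2D "organs".
m1 = np.zeros((4, 4), dtype=np.uint8); m1[0:2, 0:2] = 1
m2 = np.zeros((4, 4), dtype=np.uint8); m2[1:3, 1:3] = 1
labels = build_label_map({1: m1, 2: m2})
print(labels[0, 0], labels[1, 1], labels[3, 3])  # 1 2 0
```

With such fully-populated label maps, a standard multiclass network can be trained without any special handling of missing annotations.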
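The two reported evaluation metrics, DSC and ASD, can be computed from binary masks roughly as follows. This is a minimal sketch using SciPy's Euclidean distance transform; the exact surface definition and voxel-spacing handling used in the study may differ:

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(a, b):
    """Dice Similarity Coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def surface(mask):
    """Surface voxels: mask minus its erosion."""
    m = mask.astype(bool)
    return m & ~binary_erosion(m)

def asd(a, b, spacing=1.0):
    """Average (symmetric) Surface Distance between two binary masks."""
    sa, sb = surface(a), surface(b)
    # Distance from every voxel to the nearest surface voxel of the other mask.
    dist_to_b = distance_transform_edt(~sb, sampling=spacing)
    dist_to_a = distance_transform_edt(~sa, sampling=spacing)
    d_ab = dist_to_b[sa]  # distances from a's surface to b's surface
    d_ba = dist_to_a[sb]
    return (d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba))

# Toy example: identical and shifted 3x3 squares.
a = np.zeros((6, 6), dtype=bool); a[1:4, 1:4] = True
b = np.zeros((6, 6), dtype=bool); b[1:4, 2:5] = True
print(dice(a, a), asd(a, a))   # 1.0 0.0
print(round(dice(a, b), 3))    # 0.667
```

The DSC reported in the abstract is expressed as a percentage, so the 75% threshold corresponds to `dice(...) < 0.75` here.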
