ESTRO 38 Abstract book

S516 ESTRO 38

investigate feature reproducibility. In particular, as our results show, simple phantoms are mostly useful for defining a list of “excluded” features, owing to their poor reproducibility even in the presence of inserts of simple shapes.

PO-0954 A Prediction Model of Acute Esophageal Toxicity in Esophageal Squamous Cell Carcinoma Patients
L. Jiang 1, S. Lu 1, J. Lu 1, W. Hu 1, J. Wang 1, Y. Chen 1, K. Zhao 1
1 Fudan University Shanghai Cancer Center, Department of Radiation Oncology, Shanghai, China

Purpose or Objective
This study sought to establish a multivariable normal tissue complication probability (NTCP) model for Grade ≥ 2 acute esophageal toxicity (AET) after definitive intensity-modulated radio(chemo)therapy in patients with esophageal squamous cell carcinoma (ESCC).

Material and Methods
A cohort of 181 ESCC patients was enrolled in this study. Clinical and dosimetric parameters were analysed. Clinical parameters included age, gender, use of concurrent chemotherapy, T, N, and M stage, and tumor location. Dosimetric parameters of the esophagus included V5 to V65 in steps of 5 Gy, mean esophagus dose, maximum esophagus dose, GTV-L, and PTVp-L. A Spearman’s rank correlation coefficient matrix was calculated. A univariate logistic analysis was performed for each available predictor. The multivariable logistic regression model was built using least absolute shrinkage and selection operator (LASSO) logistic regression for predictor selection. The performance of the model was evaluated by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve and by a calibration plot. External validation of the model was carried out in an independent cohort.

Results
A total of 42 patients (22.8%) developed Grade ≥ 2 AET. In the univariate logistic analysis, age > 65 and the length of the planning target volume of the primary esophageal cancer (PTVp-L) were significantly associated with the risk of Grade ≥ 2 AET (Table 1).
The final model (Table 2) included age > 65 (OR = 0.59), PTVp-L (OR = 1.03), and lower thoracic esophageal cancer (OR = 0.53). The parameters of the multivariable logistic regression model were not correlated with each other. The AUC of the prediction model was 0.6837 (95% CI: 0.5975-0.7699) (Figure 1), and the cross-validation optimism-corrected AUC was 0.64. The model showed moderate calibration (Figure 2). On validation in the external dataset, the prediction model showed moderate calibration and moderate discrimination (AUC 0.5901; 95% CI: 0.467-0.7132) for predicting Grade ≥ 2 AET.
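To make the reported quantities concrete, the following is a minimal sketch and not the authors' code. It shows how the odds ratios of a logistic model combine multiplicatively when two patients are compared, and how an AUC can be computed from predicted scores. The model intercept is not given in the abstract, so only relative odds between two patients are computed; the predictor names, the patient values, and the unit of PTVp-L are assumptions made purely for illustration.

```python
import math

# Odds ratios reported for the final model in the abstract.
# The dictionary keys are hypothetical names chosen for this sketch.
ODDS_RATIOS = {
    "age_over_65": 0.59,     # binary: 1 if age > 65, else 0
    "ptvp_length": 1.03,     # per unit of PTVp-L (unit not stated in the abstract)
    "lower_thoracic": 0.53,  # binary: 1 if lower thoracic tumor, else 0
}


def relative_odds(patient, reference):
    """Odds of Grade >= 2 AET for `patient` divided by the odds for `reference`.

    In a logistic model the log-odds are linear in the predictors, so the
    intercept cancels and each odds ratio is raised to the predictor difference.
    """
    log_ratio = 0.0
    for name, odds_ratio in ODDS_RATIOS.items():
        log_ratio += math.log(odds_ratio) * (patient[name] - reference[name])
    return math.exp(log_ratio)


def auc(scores, labels):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case; ties count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical patients, not taken from the study cohort.
    reference = {"age_over_65": 0, "ptvp_length": 10.0, "lower_thoracic": 0}
    older = {"age_over_65": 1, "ptvp_length": 10.0, "lower_thoracic": 0}
    print(relative_odds(older, reference))
```

Under these assumptions, an odds ratio below 1 (age > 65, lower thoracic location) lowers the odds of Grade ≥ 2 AET relative to the reference patient, while each additional unit of PTVp-L multiplies the odds by 1.03. An AUC of 0.5 corresponds to chance-level discrimination.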

Oncology Maastro Clinic, Radiation Oncology, Maastricht, The Netherlands

Purpose or Objective
To guarantee the generalizability and validity of radiomics-based models, only reproducible features should be used. “Reproducible” features are those that show only marginal differences when imaged with different settings. Only a few expensive phantoms specially designed for radiomics studies are available on the market. In this study, we explored the feasibility of performing radiomics reproducibility studies with a quality assurance phantom commonly used in clinics.

Material and Methods
The dataset (available at https://xnat.bmia.nl/) consists of CT scans of the COPD Gene Phantom II (Phantom Laboratory, Greenwich, NY, USA) acquired in three Dutch medical centers. Acquisition parameters such as slice thickness and convolution kernel were varied from the standard thorax protocols. Textural and statistical first order (FO) features were extracted using Pyrex (https://github.com/zhenweishi/Py-rex) from a spherical region in the insert cavities of the phantom. The relative difference (RD) between feature values on different scanners with different settings was used to evaluate feature reproducibility, with the following thresholds: a) 0 < RD ≤ 10%: high reproducibility; b) 10% < RD ≤ 30%: medium reproducibility; c) RD > 30%: poor reproducibility. Agreement between centers was evaluated using the Spearman rank correlation coefficient (ϱ).

Results
A total of 73 radiomics features were extracted. Slice thickness: around 50% of the features in all the centers (ϱ = 0.8) showed poor reproducibility. Most of the GLSZM (Gray Level Size Zone Matrix) features were poorly reproducible compared to the other textural features. FO features were in general more stable than textural features. Reconstruction kernels: almost all the FO features presented high reproducibility in all the centers (ϱ = 0.75).
Again, textural features were more affected, with GLSZM features the least reproducible in all the centers. Only a small percentage of features (14% for center 1, 14% for center 2, and 30% for center 3) presented high reproducibility across all the perturbations. There is only a small subset (12%, 9/73) of common features with high reproducibility between the centers (Figure 1). However, it is possible to isolate a larger common subset of features presenting poor reproducibility in all three centers. Again, the majority of textural features had overall poor reproducibility compared to FO features.
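The reproducibility classes and the agreement measure described in the methods can be sketched as follows. This is an illustrative sketch and not the authors' implementation: the abstract does not give the exact RD formula, so RD is assumed here to be the absolute difference relative to a reference value, expressed in percent. An RD of exactly 0 is assigned to the high class, and the Spearman coefficient is assumed to be applied to per-feature values (for example RDs) from two centers. All function names are hypothetical.

```python
import math


def relative_difference(value, reference):
    """Relative difference in percent (assumed definition: |value - ref| / |ref|)."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(value - reference) * 100.0 / abs(reference)


def classify(rd_percent):
    """Map an RD (in percent) to the reproducibility classes used in the abstract."""
    if rd_percent <= 10.0:
        return "high"    # 0 < RD <= 10% (RD == 0 is also treated as high here)
    if rd_percent <= 30.0:
        return "medium"  # 10% < RD <= 30%
    return "poor"        # RD > 30%


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

With per-feature RDs for each center, features classified as "high" (or "poor") in every center form the common subsets discussed in the results.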

Conclusion
Radiomics features are strongly affected by acquisition parameters. FO features were more robust than textural features. As Figure 1 shows, the consensus between centers is higher when phantoms are used to identify features that present poor reproducibility. We showed that it is possible to use standard and common quality assurance phantoms to
