ESTRO 2024 - Abstract Book
Physics - Radiomics, functional and biological imaging and outcome prediction
458
Digital Poster
Radiomic feature reproducibility for cervix MRI: impact of number of observers and patient datasets
Rhianna Brown 1,2,3, Amy Walker 3,2,4, Karen Lim 5,4, Shalini Vinod 2,5,4, Viet Do 5,4, Chelsie O'Connor 6, Jaqueline Veera 7, Nira Borok 8, Peter Metcalfe 1, Dean Cutajar 1, Lois Holloway 3,2,4
1 University of Wollongong, Department of Physics, Wollongong, Australia.
2 Ingham Institute for Applied Medical Research, Medical Physics, Liverpool, Australia.
3 Liverpool and Macarthur Cancer Therapy Centre, Department of Medical Physics, Liverpool, Australia.
4 University of New South Wales, South Western Sydney Clinical School, Liverpool, Australia.
5 Liverpool and Macarthur Cancer Therapy Centre, Department of Radiation Oncology, Liverpool, Australia.
6 Chris O'Brien Lifehouse, Department of Radiation Oncology, Camperdown, Australia.
7 Peter MacCallum Cancer Centre, Department of Radiation Oncology, Bendigo, Australia.
8 Liverpool and Macarthur Cancer Therapy Centre, Department of Radiology, Liverpool, Australia.
Purpose/Objective:
Adding radiomic features to clinical features in prediction models has been shown to improve their predictive capability [1], [2]. The radiomic features used in these models must therefore be reproducible under interobserver contour variation. The aim of this work is to assess radiomic feature reproducibility under interobserver variation, considering the trade-off between the number of datasets and the number of observers.
Material/Methods:
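As a minimal sketch of the Dice similarity coefficient (DSC) that the methods in this section use to compare an observer contour with a STAPLE contour, the following assumes both contours have already been rasterised to binary masks on the same voxel grid and flattened to equal-length sequences. The function name and the toy masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 (or False/True) values defined
    on the same voxel grid. This layout is an assumption of the sketch.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")

    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)

    if size_a + size_b == 0:
        # Convention chosen for this sketch: two empty masks agree fully.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Hypothetical toy masks standing in for an observer and a STAPLE contour.
observer_mask = [0, 1, 1, 1, 0, 0]
staple_mask = [0, 0, 1, 1, 1, 0]
print(f"DSC = {dice_coefficient(observer_mask, staple_mask):.3f}")
```

A DSC of 1 indicates identical masks and 0 indicates no overlap. The mean absolute surface distance is not covered here because it needs the contour surfaces and voxel spacing.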
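The reproducibility measure named in this section, ICC(2,1) (two-way random effects, absolute agreement, single rating), can be sketched from the standard two-way ANOVA mean squares. The study computed it in R with the 'psych' package; the pure-Python version below is only an illustration. It assumes one radiomic feature arranged with one row per patient dataset and one column per contour source, and the function names and feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with n rows (subjects) and k columns (raters)."""
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with n >= 2 rows and k >= 2 columns")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1)) # residual mean square

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (ms_rows - ms_error) / denominator


def classify_reproducibility(icc):
    """Apply the thresholds stated in this section."""
    if icc >= 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    return "poor"


# Hypothetical feature values: 4 patient datasets x 3 contour sources.
feature_values = [
    [10.0, 10.5, 10.2],
    [14.1, 13.8, 14.0],
    [8.9, 9.3, 9.0],
    [12.0, 12.4, 12.1],
]
icc = icc_2_1(feature_values)
print(f"ICC(2,1) = {icc:.3f} ({classify_reproducibility(icc)})")
```

Because ICC(2,1) measures absolute agreement, a systematic offset between contour sources lowers the value even when the sources rank patients identically.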
Twenty-four pre-treatment T2W-MRI datasets of patients with cervix cancer were delineated by up to 6 observers (1 radiologist and 5 radiation oncologists). Each observer delineated the gross tumour volume (GTV), bladder, rectum and uterus. 9 MRIs were delineated by 6 observers, 18 by 4 observers and 24 by 3 observers. To allow the contours from different observers to be compared against a common contour, a Simultaneous Truth and Performance Level Estimation (STAPLE) contour was generated in Python using the SimpleITK library [3]. STAPLE contours were generated from 3, 4, 5 and 6 observer contours where available. For example, each dataset with contours from 6 observers had four STAPLE contours generated, so that it contributed data to the 3-, 4- and 5-observer datasets as well as to the 6-observer dataset. The Dice similarity coefficient (DSC) and mean absolute surface distance (MASD) were used to compare the contours from the different observers with the STAPLE contours. Radiomic features were extracted from the observer and STAPLE contours delineated on the MRIs. The open-source Python library PyRadiomics [4] was used for radiomic feature extraction, and the extracted features included shape-based, intensity-based and texture-based features. The reproducibility of each radiomic feature was quantified with an intraclass correlation coefficient (ICC) comparing the feature values extracted from the observer and STAPLE contours. The ICC values were calculated in RStudio with the ‘psych’ package [5], based on a single-rating, absolute-agreement, two-way random-effects model (ICC(2,1)) [6]. A radiomic feature with ICC ≥ 0.90 was classified as having excellent reproducibility, one with 0.75 ≤ ICC < 0.90 as having good reproducibility, and one with ICC < 0.75 as having poor reproducibility [6]. To test the reproducibility of the radiomic features against