ESTRO 2024 - Abstract Book


Physics - Machine learning models and clinical applications


Material/Methods:

Dataset: The internal training dataset comprised binary tumor segmentation masks from 198 patients: 41 ALTs from University Hospital Freiburg (9), Klinikum rechts der Isar - TUM (5), and University Hospital Ulm (27), and 157 high-grade liposarcomas from the University of Washington. The external test dataset included 7 ALT cases from University Hospital Munich - LMU and 86 high-grade liposarcoma cases from Klinikum rechts der Isar - TUM. To account for the class imbalance in our training dataset, we mitigated overfitting and biased learning in the machine learning approach through random oversampling of the minority class. In the deep learning pipeline, we weighted the loss by the inverse class frequency. Consequently, we report the balanced accuracy (bACC) score to ensure a robust evaluation.

ML-Approach: We first derived 14 radiomics shape features from the segmentation masks of each patient using pyradiomics [2]. We then reduced feature dimensionality via principal component analysis to address potential multicollinearity among features, retaining the principal components that cumulatively explained 99% of the variance. For the binary classification task, we compared a Support Vector Classifier (SVC) with a Random Forest Classifier (RFC).

DL-Approach: We first converted each segmentation mask into a 3D surface mesh using the marching cubes algorithm [3], followed by Laplacian surface smoothing, rigid alignment to correct rotational artifacts, and translation to the coordinate origin. In addition to the x, y, and z coordinates, we enriched the per-vertex information with local fast point feature histograms to incorporate geometric properties of each individual point’s neighborhood and global shape information, akin to radiomics shape features. For the downstream binary classification, we compared a neural network composed of three Graph Convolutional Network (GCN) layers [4] with one composed of three GraphSAGE layers [5].
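The two imbalance-handling strategies named under Dataset can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula or the sampling procedure, so the normalization `n_total / (n_classes * n_class)` and the helper names `inverse_frequency_weights` and `random_oversample` are assumptions made for illustration only. The class counts (41 ALT, 157 high-grade) are those of the training dataset.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class loss weights proportional to 1 / class frequency.

    Assumed normalization: n_total / (n_classes * n_class), so that a
    perfectly balanced dataset would give every class a weight of 1.
    """
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return {cls: n_total / (n_classes * n) for cls, n in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples (with replacement)
    until every class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, lab in zip(samples, labels) if lab == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Training-set class counts taken from the Dataset description.
labels = ["ALT"] * 41 + ["high_grade"] * 157
samples = list(range(len(labels)))  # placeholder patient indices

weights = inverse_frequency_weights(labels)
balanced_samples, balanced_labels = random_oversample(samples, labels)

print(weights)
print(Counter(balanced_labels))
```

In a sketch like this, oversampling would be applied to training data only, so that duplicated cases cannot leak into validation or test splits.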

Results:

The table below reports the final performance of the models on the external test set, using only the shape information contained in the tumor segmentation mask. In terms of balanced accuracy, both graph neural networks (GCN and GraphSAGE) performed comparably to or marginally better than the machine learning methods (SVC and RFC). In terms of sensitivity, however, the ML approaches outperformed their DL counterparts, albeit at the expense of reduced specificity.

Method      bACC   AUC    Sensitivity   Specificity
SVC         0.86   0.90   0.86          0.86
RFC         0.79   0.90   0.86          0.71
GCN         0.86   0.90   0.71          1.00
GraphSAGE   0.87   0.87   0.74          1.00
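For a binary task, balanced accuracy is the mean of sensitivity and specificity, so the bACC column can be cross-checked against the other two. The short sketch below does this with the values reported in the table. The helper name `balanced_accuracy` and the tolerance are illustrative assumptions, and small differences are expected because the published figures are rounded to two decimals.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy for a binary task: mean of the two class recalls."""
    return (sensitivity + specificity) / 2


# (bACC, sensitivity, specificity) as reported for the external test set.
reported = {
    "SVC": (0.86, 0.86, 0.86),
    "RFC": (0.79, 0.86, 0.71),
    "GCN": (0.86, 0.71, 1.00),
    "GraphSAGE": (0.87, 0.74, 1.00),
}

# Recompute bACC from the rounded sensitivity and specificity values.
recomputed = {
    method: balanced_accuracy(sens, spec)
    for method, (_, sens, spec) in reported.items()
}

for method, (bacc, _, _) in reported.items():
    diff = abs(recomputed[method] - bacc)
    print(f"{method}: reported {bacc:.2f}, recomputed {recomputed[method]:.3f}, "
          f"difference {diff:.3f}")
```

Because the metric averages the per-class recalls, it is not dominated by the majority class in the external test set (86 high-grade cases versus 7 ALT cases).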

Additionally, we identified the most influential features, represented as principal components, for the top-performing ML method (SVC) using permutation feature importance [6]. Employing Pearson correlation between
