ESTRO 2025 - Abstract Book
S3368
Physics - Machine learning models and clinical applications
724
Poster Discussion
A framework to create, validate and select synthetic datasets for survival prediction in radiation oncology
Andreas Christoforou 1, Simon KB Spohn 2, Alexander Ruehle 3, Nils H Nicolay 3, Ilinca Popp 2, Anca L Grosu 2, Iosif Strouthos 1, Alexander H Thieme 4, Constantinos Zamboglou 1
1 Department of Radiation Oncology, German Oncology Center, Limassol, Cyprus. 2 Department of Radiation Oncology, Medical Center – University of Freiburg, Freiburg, Germany. 3 Department of Radiation Oncology, University of Leipzig, Leipzig, Germany. 4 Department of Biomedical Data Science, Stanford Medical School, Stanford, USA
Purpose/Objective:
Data-driven decision-making in radiation oncology (RO) relies on integrating real-world clinical data (RWCD). Synthetic data (SD), generated through machine learning, offers a way to use such data by mimicking RWCD without compromising privacy. The best methodology for creating SD in radiotherapy (RT) and for assessing its performance remains unknown. This work presents a general framework for generating, evaluating, and selecting high-quality tabular SD for clinical use, focusing on survival datasets in RO.
Material/Methods:
Five retrospective uni- and multi-center datasets of patients undergoing RT in different scenarios were collected (n=1038 recurrent prostate cancer, n=109 primary localised prostate cancer, n=46 metastasised prostate cancer, n=1072 head and neck cancer, n=298 gliomas). SD was generated using four machine-learning models (Copula, Copula GAN, CTGAN, TVAE), with each model producing multiple synthetic datasets. These were evaluated for privacy, clinical behavior, and feature distribution using robust and interpretable metrics. The metric results were combined in a weighted ranking, so that a single synthetic dataset could be selected for each real-world dataset.
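The weighted-ranking selection step can be illustrated with a minimal sketch. The abstract does not specify the metrics, weights, or aggregation rule, so everything below is an assumption made for illustration: the metric names (`privacy`, `clinical`, `distribution`), the scores, the weights, and the choice of a weighted mean of per-metric ranks are all hypothetical and are not the authors' implementation.

```python
from typing import Dict

Scores = Dict[str, Dict[str, float]]  # candidate name -> metric name -> score


def rank_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank candidates on one metric (1 = best, higher score is better).

    Tied candidates share the average of the ranks they span.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over all candidates tied with the one at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank
        i = j + 1
    return ranks


def weighted_ranking(candidates: Scores, weights: Dict[str, float]) -> Dict[str, float]:
    """Combine per-metric ranks into one weighted mean rank per candidate."""
    total_weight = sum(weights.values())
    combined = {name: 0.0 for name in candidates}
    for metric, weight in weights.items():
        per_metric = {name: metrics[metric] for name, metrics in candidates.items()}
        for name, rank in rank_scores(per_metric).items():
            combined[name] += weight * rank / total_weight
    return combined


def select_best(candidates: Scores, weights: Dict[str, float]) -> str:
    """Return the candidate with the lowest (best) weighted mean rank."""
    combined = weighted_ranking(candidates, weights)
    return min(combined, key=combined.get)


# Hypothetical scores in [0, 1] for three synthetic datasets of one RWCD.
candidates = {
    "copula_run1": {"privacy": 0.90, "clinical": 0.70, "distribution": 0.80},
    "ctgan_run1": {"privacy": 0.85, "clinical": 0.80, "distribution": 0.75},
    "tvae_run1": {"privacy": 0.80, "clinical": 0.90, "distribution": 0.85},
}
# Hypothetical weights: clinical behavior counts twice as much as the others.
weights = {"privacy": 1.0, "clinical": 2.0, "distribution": 1.0}

best = select_best(candidates, weights)
print(best)
```

Ranking each metric before weighting keeps metrics on different scales comparable, at the cost of discarding how large the score differences are. A framework could equally weight normalized scores directly.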
Results: