ESTRO 38 Abstract book



come back at a later convenient time to e.g. apply the trained model in practice.

Conclusion We developed an easy-to-use distributed learning dashboard. Statistical information about the available datasets can be requested using algorithm A. Algorithm B gives users the opportunity to search for treatment outcomes of similar patients treated in the past. Users can define clinical similarity themselves by using broader or narrower inclusion criteria. Algorithm C can be triggered and then applied directly to new patients, as in a true rapid learning platform. All algorithms can be triggered as often as needed, to update when new information becomes available. Furthermore, both the algorithms and the connected hospitals are configurable, allowing dashboards for specific collaborations and algorithms.

EP-1914 A method to deal with highly correlated explanatory variables in the development of NTCP models
A. Van Der Schaaf 1, L. Van den Bosch 1, S. Both 1, E. Schuit 2, J.A. Langendijk 1
1 University of Groningen, University Medical Center Groningen, Radiation Oncology, Groningen, The Netherlands; 2 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands

Purpose or Objective Data-driven modelling of patient outcomes is often impeded by high correlation between explanatory variables (EVs), resulting in unstable variable selection and inflated variance. This problem can lead to models that depend strongly on specific correlations between EVs and therefore generalize poorly to other populations. We modified a frequently used model development method (MDM) to deal with this problem and tested the performance and generalizability of the resulting models in realistic simulations.

Material and Methods As reference MDM we used stepwise logistic regression with forward variable selection based on the Bayesian Information Criterion (BIC). To deal with high correlation between EVs we modified this MDM using four steps. First, the EVs were assigned to overarching EV groups that were as large as possible while keeping mutual correlations ≤0.8. Second, for each such group, a prediction model was developed using the reference MDM. Third, models with good performance were selected based on the BIC with a range equivalent to one degree of freedom. Finally, we combined all models with good performance into a single logistic model by averaging their linear predictors.

We simulated random datasets with 10 EVs drawn from a standard Gaussian distribution with an autoregressive correlation structure and a correlation of 0.9 between neighbouring EVs. A binary response variable was added according to a logistic response relation with 5 randomly chosen EVs, with coefficients drawn from a standard uniform distribution. Datasets of varying size, from 100 up to 500+ observations, were generated and used to develop prediction models using the reference and modified MDM. Using the same response relation we generated 3 validation sets, each containing 1000 observations. The first validation set had the same correlation structure of the EVs as the training set, to test the model performance in the same population. The other validation sets had correlations of 0.5 and 0 between EVs, to test model generalizability. Calculated measures included the mean absolute prediction error (MAE), loss of log-likelihood (LoLL), and loss of area under the receiver operating characteristic curve (LoAUC). For these measures we calculated the average relative difference between the modified and reference MDM, and how often each MDM was better than the other.

Results On average, for data set sizes up to 500 observations, the modified MDM performed and generalized better than the reference MDM, but not in every single simulation, as shown in the Table and Figure. For larger data sets the performance of both MDMs stabilized at a constant high level, with only small differences between the two methods, which were on average slightly in favour of the reference method.

Conclusion We modified an existing MDM to deal with high correlation between EVs. On average, the modified models predict and generalize better than the reference models, up to the point where sufficient data is available to reliably estimate all model parameters.
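
As a rough illustration of the simulation set-up and of the final averaging step of the modified MDM described in this abstract, the following Python sketch may help. It is not the authors' code. It assumes that the "autoregressive correlation structure" means a correlation of rho^|i-j| between EVs i and j, and that the response relation has a zero intercept (the abstract does not state one). All function names are hypothetical. The grouping of EVs and the BIC-based forward selection (steps one to three) are not shown.

```python
import math
import random


def simulate_evs(n_obs, n_vars, rho, rng):
    """Draw rows of standard Gaussian EVs with correlation rho**|i-j| (assumed AR(1))."""
    scale = math.sqrt(1.0 - rho * rho)  # keeps every EV at unit variance
    rows = []
    for _ in range(n_obs):
        row = [rng.gauss(0.0, 1.0)]
        for _ in range(n_vars - 1):
            row.append(rho * row[-1] + scale * rng.gauss(0.0, 1.0))
        rows.append(row)
    return rows


def draw_coefficients(n_vars, n_active, rng):
    """Pick n_active EVs at random and give them standard-uniform coefficients."""
    beta = [0.0] * n_vars
    for j in rng.sample(range(n_vars), n_active):
        beta[j] = rng.uniform(0.0, 1.0)
    return beta


def linear_predictor(row, intercept, beta):
    return intercept + sum(b * x for b, x in zip(beta, row))


def simulate_response(rows, beta, rng, intercept=0.0):
    """Binary outcomes from a logistic response relation (zero intercept assumed)."""
    outcomes = []
    for row in rows:
        p = 1.0 / (1.0 + math.exp(-linear_predictor(row, intercept, beta)))
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes


def average_models(models):
    """Average the linear predictors of several logistic models.

    models is a list of (intercept, beta) pairs; beta holds 0.0 for EVs that a
    model did not select. Averaging linear predictors is the same as averaging
    the intercepts and the coefficients, so the result is one logistic model.
    """
    k = len(models)
    n_vars = len(models[0][1])
    intercept = sum(m[0] for m in models) / k
    beta = [sum(m[1][j] for m in models) / k for j in range(n_vars)]
    return intercept, beta


rng = random.Random(0)
beta_true = draw_coefficients(10, 5, rng)

# Training set: 10 EVs, neighbouring correlation 0.9.
train_x = simulate_evs(500, 10, 0.9, rng)
train_y = simulate_response(train_x, beta_true, rng)

# Three validation sets of 1000 observations with correlations 0.9, 0.5 and 0.
validation = {}
for rho in (0.9, 0.5, 0.0):
    val_x = simulate_evs(1000, 10, rho, rng)
    validation[rho] = (val_x, simulate_response(val_x, beta_true, rng))
```

Because the combined model is again a single set of logistic coefficients, it can be applied to the validation sets in the same way as a model from the reference MDM.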
