ESTRO 38 Abstract book

radiotherapy data: dose matrix, dose administered, and time. Survival and patterns of recurrence are essential to render toxicity data interpretable. This requires a department policy commitment, as it will change workflows and even clinical routine (e.g., organisation of patient follow-up). Using common terminology for regions of interest is a prerequisite, as are adequate software and IT support, which are crucial because data incompleteness and inconsistencies are an issue.

SP-0550 Dreams and reality of toxicity data-sharing/farming: quality vs quantity?
A. Dekker 1
1 Department of Radiation Oncology Maastro - GROW School for Oncology and Developmental Biology - Maastricht University Medical Centre+, Maastro, Maastricht, The Netherlands

Abstract text
Big, real world data has opened a new paradigm of modeling. The large quantity and diversity of data becoming available may enable new predictive models of toxicity after radiation therapy that are better in terms of performance and more holistic (i.e. taking into account data elements from a wide variety of domains such as biological, clinical, imaging and treatment factors). But big, real world data is usually of much lower quality than the clinical trial data that has until now been used to derive toxicity prediction models. The hypothesis is that knowing and planning for data quality problems, using data quantity to our advantage and ensuring proper validation strategies will enable big, real world data to start playing a larger role in toxicity prediction models. Typical big, real world data quality problems include unstructured data, incomprehensible data, missing data, incorrect/implausible data, contradicting data, biased data and biased-missing data. Being aware of and identifying these data quality problems is a first step towards addressing them. There are a number of ways to improve and mitigate quality once the problems are known. For example, tools are becoming available to convert unstructured and incomprehensible data into FAIR (Findable, Accessible, Interoperable, Reusable) data; missing data can be imputed; incorrect data can be removed; contradictions can be resolved by a trust hierarchy of sources; and biased data can be transformed to a common, unbiased data domain. Besides improving data quality, careful selection of the modeling approach, taking into account its sensitivity to data quality problems, is also important. In almost all of the above data quality improvement approaches, a high data quantity is an important enabler. Finally, a toxicity prediction model based on low quality data is not necessarily a bad model, nor vice versa (a model built on good data may be bad). It is therefore important to validate the model on locally recorded data to increase confidence and trust in the model. Such acceptance and commissioning of models requires effort from the implementing site to capture its toxicity data and the other data elements required for model validation. This effort makes more, higher quality data available, which can then be fed back into the development of a new iteration of the prediction model. Such a learning system, which implements the model development – validation – development cycle, is expected to lead to improved data quality and quantity and, ultimately, better toxicity prediction models.
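A minimal sketch of the mitigation steps named in the abstract above (flagging implausible values, resolving contradicting sources via a trust hierarchy, imputing missing data), assuming tabular records in pandas; the column names, plausibility ranges and library choices are illustrative assumptions and are not part of the abstract.

```python
# Illustrative sketch only: hypothetical column names, plausibility ranges and
# trust hierarchy; not taken from the abstract.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer


def mitigate_quality_problems(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()

    # Incorrect/implausible data: values outside a plausible range are set to NaN
    # so they are treated as missing instead of silently distorting the model.
    df.loc[~df["mean_rectal_dose_gy"].between(0, 100), "mean_rectal_dose_gy"] = np.nan
    df.loc[~df["age_years"].between(18, 110), "age_years"] = np.nan

    # Contradicting data: resolve via a trust hierarchy of sources, e.g. prefer
    # the treatment planning system value over the value re-typed into the EHR.
    df["prescribed_dose_gy"] = df["dose_tps_gy"].fillna(df["dose_ehr_gy"])

    # Missing data: multivariate imputation of the remaining gaps.
    cols = ["mean_rectal_dose_gy", "age_years", "prescribed_dose_gy"]
    df[cols] = IterativeImputer(random_state=0).fit_transform(df[cols])
    return df
```

The choice of imputation method matters: biased-missing data in particular may need an explicit missingness model rather than the generic multivariate imputer used in this sketch.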

SP-0551 Exploiting large databases to build robust predictive models: validation issues
T. Rancati 1
1 Fondazione IRCCS Istituto Nazionale dei Tumori, Prostate Cancer Program, Milan, Italy

Abstract text
The final purpose of any predictive model in the oncological domain is to provide valid outcome predictions for new patients. Essentially, the data set used to develop a model is not of interest other than to learn for the future. Validation is hence a crucial aspect of the process of predictive modelling. Validation is the process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model. It is a process that accumulates evidence of a model's correctness or accuracy for specific scenarios, with external validation providing a measure of the “generalizability” and “transportability” of the prediction model to populations that are “plausibly related”. “Plausibly related” populations can be defined as cohorts that may be slightly different from the one used for model development, e.g. treated at different hospitals, at different dose levels, with different RT techniques, in different countries or in different time frames. Generalizability and transportability are desired properties from both a scientific and a practical perspective. Quantifying the confidence and predictive accuracy of model calculations provides the decision-maker with the information necessary for making high-consequence decisions. The more often a model is externally validated, and the more diverse these settings are, the more confidence we can gain in using the model for prospective decision-making and in its possible use in interventional trials. Within this frame, the following specific issues will be considered:
1. Detecting signal from noise: the importance of model validation
2. Internal vs external validation
3. External validation: homogeneity vs heterogeneity issues, harmonization issues
4. Using large distributed datasets vs large benchmark datasets
5. Extrapolation of models: defining the model applicability domain
6. Using models: a continuous refinement opportunity
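A minimal sketch of such an external validation on a locally recorded cohort, assuming a published logistic toxicity model with frozen coefficients; it reports discrimination (AUC) and a simple logistic recalibration of the linear predictor, whose slope and intercept indicate how well the model transports to the new population. The function name, coefficient layout and data arrays are hypothetical.

```python
# Illustrative sketch only: hypothetical function and data layout;
# not taken from the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def validate_externally(X_local: np.ndarray, y_local: np.ndarray,
                        coefs: np.ndarray, intercept: float) -> dict:
    # Predictions of the frozen, published model on the local cohort.
    lp = X_local @ coefs + intercept   # linear predictor
    p = 1.0 / (1.0 + np.exp(-lp))      # predicted toxicity probability

    # Discrimination: does the model still rank local patients correctly?
    auc = roc_auc_score(y_local, p)

    # Calibration: refit the outcome on the linear predictor alone; a slope
    # near 1 and intercept near 0 suggest the model transports to this cohort.
    recal = LogisticRegression(C=1e6).fit(lp.reshape(-1, 1), y_local)  # ~unpenalized
    return {"auc": auc,
            "calibration_slope": float(recal.coef_[0, 0]),
            "calibration_intercept": float(recal.intercept_[0])}
```

When cohorts cannot be pooled (issue 4 above), the same discrimination and calibration statistics can in principle be computed locally at each site and only the aggregated results shared.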

SP-0552 Radiogenomics: big data to understand genetic risk factors of toxicity
C.N. Andreassen 1
1 Aarhus University Hospital, Department of Experimental Clinical Oncology, Aarhus C, Denmark

Abstract text
The ability to predict normal tissue radiosensitivity has been a long-sought goal in radiobiology. During the last 15 years, efforts have been made to identify germline genetic alterations that affect the risk of toxicity after radiotherapy. This research is to an increasing extent undertaken by international cooperative research groups. Certain challenges relate to research in radiogenomics: large cohorts are needed to identify genotype-phenotype associations. Furthermore, normal tissue radiosensitivity is a relatively complex phenotype in several respects. It is made up of a number of sub-phenotypes that are not necessarily strongly associated. In addition, the risk of normal tissue toxicity depends on complicated and not entirely understood dose-volume relationships. Some sequence alterations are likely to be specifically associated with certain types of normal tissue toxicity, whereas others are likely to have a general impact on normal tissue toxicity. In order to meet these challenges, ‘big data approaches’ will be needed. These include the development of detailed NTCP models, machine learning algorithms to analyze