ESTRO 2024 - Abstract Book

Artificial intelligence (AI) can support radiotherapy (RT) administration by automation and optimization of workflows. Predictive algorithms could help reduce the time required to measure plans that are at low risk of failure, and channel resources into producing better plans for difficult cases. However, developing machine learning (ML) models that guarantee consistently good performance under all circumstances is utopian. One of the main persistent problems of AI is its data dependency: the limited training data distribution cannot encompass the whole clinical practice. Therefore, robust quality assurance (QA) is needed in both a case-specific and a routine way (cf. Figure 1), preferably in a programmed way that preserves the benefit of automation and verifies the suitability of the AI output. In this session, we focus on the patient-specific QA part of the RT workflow.

Due to the complex nature of intensity-modulated RT (IMRT) and volumetric modulated arc therapy (VMAT) treatment techniques, patient-specific QA measurement procedures are recommended by AAPM TG-119 and TG-218 to identify discrepancies between the dose calculated by a treatment planning system (TPS) and that delivered by the treatment machine. In general, we distinguish 1) pre-treatment QA (i.e., secondary dose calculations and phantom measurements) and 2) in vivo QA. However, such techniques are time-consuming and resource-intensive, and their ability to identify suboptimal plans has been questioned.

AI algorithms have been developed for predicting the pre-treatment IMRT/VMAT QA outcome (i.e., AI4QA). The first approach directly predicts the QA outcome, such as gamma passing rate results or ion chamber dose disagreement, while the second approach focuses on detecting and identifying delivery errors that cannot be discovered with the QA metric. These models have the potential to provide time-efficient, automated virtual QA tools, which would significantly reduce the RT treatment workload. The predicted output could also help to make treatment plans less complex and reduce the probability of future failure. This is advantageous for adaptive RT or for high-priority treatments, where the time available in the patient care path is limited. For in vivo QA, gamma images can be used as input for deep learning models to detect the reason for failure, e.g., mispositioning, …

Nevertheless, QA procedures for these automated tools are not yet well defined (i.e., QA4AI4QA). It is very important to fully understand the limitations of virtual QA, for example data quality, model adaptability, and model limitations. Data quality is by far the most basic and essential requirement for building an accurate prediction model. Not only can incomplete data, such as a small sample size, lead to wrong conclusions, but “true” QA data from detectors, especially for extremely small/large field sizes or large low-dose regions, can also lead to imperfect prediction models due to detector system limitations.

During online QA, similarity checks can be performed that compare the input with the data distribution of the training data. Alternatively, sanity checks can be performed on the model outcome: e.g., an independent, secondary algorithm can be used to benchmark the performance of the clinical (AI) model and flag divergent behavior. A minimal sketch of such a similarity check follows.
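As an illustration of the online similarity check described above, the sketch below flags plans whose features lie far from the training distribution. It assumes the AI4QA model consumes a small vector of plan-complexity features (e.g., total MU, aperture metrics); the feature set, the Mahalanobis-distance test, and the chi-square threshold are illustrative assumptions, not part of any specific published model.

```python
import numpy as np

def fit_training_distribution(X_train):
    """Estimate the mean and (pseudo-)inverse covariance of the
    plan-feature distribution seen during model training."""
    mu = X_train.mean(axis=0)
    cov = np.cov(X_train, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse for numerical stability
    return mu, cov_inv

def similarity_check(x_new, mu, cov_inv, threshold):
    """Flag a new plan whose features lie far from the training
    distribution (squared Mahalanobis distance); for such plans the
    AI-predicted QA result should not be trusted."""
    d2 = float((x_new - mu) @ cov_inv @ (x_new - mu))
    return d2 <= threshold  # True: in-distribution, prediction may be used

# Illustrative example: 3 hypothetical complexity features; the threshold
# is the chi-square 99th percentile for 3 degrees of freedom (11.34).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))  # stand-in for historical plan features
mu, cov_inv = fit_training_distribution(X_train)
new_plan = np.array([0.2, -0.1, 0.5])
if similarity_check(new_plan, mu, cov_inv, threshold=11.34):
    print("in distribution: use AI-predicted QA outcome")
else:
    print("out of distribution: fall back to measurement-based QA")
```

A sanity check on the model outcome could be built analogously: run an independent secondary predictor on the same plan and flag any disagreement beyond a chosen tolerance.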
In offline mode, there are two elements of built-in routine QA when implementing an AI-based patient-specific QA workflow. First, the predictive accuracy of the model should be continuously assessed using the plans that are still measured in the routine QA process because they fell below a defined threshold due to their complexity. In this way, routine QA ensures that there are no deviations from baseline. Secondly, a subset of plans of varying complexity should be set aside as a benchmark dataset; a monthly re-delivery of these plans should be performed to ensure consistency and robustness.
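The sketch below illustrates these two routine QA elements, assuming gamma passing rates (GPR, in %) as the tracked metric; the tolerance values and function names are hypothetical placeholders, not clinical recommendations.

```python
import numpy as np

def check_prediction_baseline(predicted, measured, tol=3.0):
    """Element 1: verify that the model's GPR prediction error on plans
    that are still measured has not drifted from baseline (mean absolute
    error against an assumed tolerance)."""
    mae = float(np.mean(np.abs(np.asarray(predicted) - np.asarray(measured))))
    return mae <= tol, mae

def check_benchmark_redelivery(baseline_gpr, monthly_gpr, tol=2.0):
    """Element 2: compare the monthly re-delivery of the fixed benchmark
    plan set against its baseline GPRs; flag any plan drifting > tol."""
    drift = np.abs(np.asarray(monthly_gpr) - np.asarray(baseline_gpr))
    flagged = np.nonzero(drift > tol)[0]
    return flagged.size == 0, flagged

# Illustrative values only (three plans in each check).
ok1, mae = check_prediction_baseline([97.1, 94.8, 99.0], [96.5, 95.9, 98.2])
ok2, flagged = check_benchmark_redelivery([98.0, 95.5, 92.0], [97.6, 93.0, 91.5])
print(f"prediction MAE {mae:.2f}% -> {'OK' if ok1 else 'investigate'}")
print(f"benchmark plans flagged for review: {flagged.tolist()}")
```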
