ESTRO 2025 - Abstract Book


Physics - Machine learning models and clinical applications



4461

Digital Poster

A deep reinforcement learning approach for adaptive fractionation: Dynamic fraction size optimization for enhancing OAR sparing

Martin Weigand, Simon van Kranen, Jan-Jakob Sonke

Department of Radiation Oncology, The Netherlands Cancer Institute, Amsterdam, Netherlands

Purpose/Objective: In online-adaptive radiotherapy (OART), daily plan adaptations use the same objectives as the initial plan to recreate the treatment intent in the anatomy of the day. This strategy does not distinguish favorable from less favorable anatomies and thus fails to fully capitalize on the potential of OART. Moreover, over the course of therapy it is unclear how to identify the 'favorability' of the daily anatomy and how to respond optimally. This proof-of-concept study investigates a deep reinforcement learning (DRL) approach to adaptive fractionation: dynamically scaling the dose per fraction using geometric and dosimetric navigation signals, while keeping the dose distribution otherwise unchanged.

Material/Methods: A 2D simulation environment was created to model interfractional anatomical and dosimetric variations, introducing progressive levels of complexity (Figure 1a). The target was static, while the OAR exhibited (1) systematic and random displacements, extended to include (2) rotations, (3) inter-patient dose-gradient variability, and (4) a time trend away from the tumor as a surrogate for tumor shrinkage. A DRL algorithm [1,2] was implemented as a decision-making agent capable of learning a policy for dynamic fraction-size optimization (between 1 and 8 Gy) while ensuring a cumulative tumor BED10Gy of 72 Gy (30 x 2 Gy) and minimizing OAR exposure. The agent's observation space included: fraction number, accumulated tumor BED, sparing factor (physical OAR dose relative to tumor dose) [3], tumor-OAR distance, OAR size, and BED sparing factors (calculated with BED instead of physical dose) for the minimum and maximum allowed fraction sizes.
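The episode structure described above (fraction sizes of 1-8 Gy, termination once the cumulative tumor BED10Gy reaches 72 Gy, i.e. the linear-quadratic BED of 30 x 2 Gy) can be sketched as a gym-style environment. This is a minimal toy illustration, not the authors' implementation: the class and variable names are invented, and the sparing factor is sampled randomly rather than derived from a 2D anatomy model.

```python
import numpy as np

def bed(d, n=1, alpha_beta=10.0):
    """Linear-quadratic BED for n fractions of d Gy: n*d*(1 + d/(alpha/beta))."""
    return n * d * (1.0 + d / alpha_beta)

class AdaptiveFractionationEnv:
    """Toy episode loop for dynamic fraction-size selection (illustrative only)."""

    TARGET_BED = bed(2.0, n=30)  # 72 Gy: the BED10Gy of the reference 30 x 2 Gy schedule
    MAX_FRACTIONS = 30           # agents may use 30 or fewer fractions

    def reset(self):
        self.fraction = 0
        self.tumor_bed = 0.0
        return self._observe()

    def _observe(self):
        # Stand-in for the abstract's geometric/dosimetric navigation signals
        sparing_factor = np.random.uniform(0.2, 0.8)  # OAR dose / tumor dose
        return np.array([self.fraction, self.tumor_bed, sparing_factor])

    def step(self, dose):
        dose = float(np.clip(dose, 1.0, 8.0))  # allowed fraction sizes: 1-8 Gy
        self.fraction += 1
        self.tumor_bed += bed(dose)
        obs = self._observe()
        reward = -obs[2] * dose                # penalize OAR physical dose
        done = (self.tumor_bed >= self.TARGET_BED
                or self.fraction >= self.MAX_FRACTIONS)
        return obs, reward, done
```

With the reference action of 2 Gy per fraction, the episode runs exactly 30 fractions and accumulates the target BED of 72 Gy; larger fraction sizes reach the target sooner because BED grows superlinearly in dose per fraction (e.g., a single 8 Gy fraction contributes 14.4 Gy of BED10Gy).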
The agents were allowed to use 30 or fewer fractions. Performance was assessed using the equivalent uniform biologically effective dose to the OAR (EUBED3Gy; a = 10).
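The EUBED metric can be read as a generalized mean (with exponent a) over per-voxel BED values, computed with the OAR's alpha/beta of 3 Gy. A minimal sketch under that assumption, with invented toy voxel doses; the function names are illustrative, not from the study:

```python
import numpy as np

def voxel_bed(dose_per_fraction, alpha_beta=3.0):
    """Per-voxel BED accumulated over fractions.
    dose_per_fraction: array of shape (n_fractions, n_voxels)."""
    d = np.asarray(dose_per_fraction, dtype=float)
    return np.sum(d * (1.0 + d / alpha_beta), axis=0)

def eubed(bed_per_voxel, a=10):
    """Equivalent uniform BED: generalized mean with exponent a.
    A large positive a emphasizes hot spots, as appropriate for serial OARs."""
    b = np.asarray(bed_per_voxel, dtype=float)
    return np.mean(b ** a) ** (1.0 / a)

# Toy OAR with 3 voxels over 2 identical fractions
doses = [[1.0, 2.0, 0.5],
         [1.0, 2.0, 0.5]]
print(eubed(voxel_bed(doses), a=10))  # lies between the mean and max voxel BED,
                                      # pulled strongly toward the hottest voxel
```

With a = 10 the metric is dominated by the highest-BED voxels, so an agent minimizing EUBED3Gy is rewarded for delivering larger fractions on days when the OAR sits in a low-dose region.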
