ESTRO 2025 - Abstract Book
S3024
Physics - Image acquisition and processing
2703
Digital Poster
CBCT imaging dose reduction using transformer models for projection interpolation
Adrian Thummerer 1, Lukas Schmidt 1, Claus Belka 1,2,3, Stefanie Corradini 1, Guillaume Landry 1, Christopher Kurz 1
1 Department of Radiation Oncology, LMU University Hospital, LMU Munich, Munich, Germany.
2 German Cancer Consortium (DKTK), partner site Munich, a partnership between DKFZ and LMU University Hospital Munich, Munich, Germany.
3 Bavarian Cancer Research Center (BZKF), Munich, Germany.
Purpose/Objective: Cone-beam computed tomography (CBCT) is essential for image-guided and adaptive radiotherapy, but the cumulative imaging dose from fractionated treatments raises concerns about secondary cancer risks, particularly in pediatric patients [1]. This study investigates dose reduction through sparse-view CBCT acquisition combined with deep learning-based projection synthesis to compensate for subsampling artifacts and increased noise.
Material/Methods: We adapted a video frame interpolation transformer (VFIT) to synthesize CBCT projections at non-sampled viewing angles [2]. The model incorporates shallow feature embedding, a Transformer-based encoder-decoder network, and a multi-scale frame synthesis network. We evaluated two dose reduction scenarios: 50% (2-fold) and 75% (4-fold) projection subsampling, i.e., simulating acquisitions with 50% or 25% of the original dose. Two models were developed: VFIT2x for interpolating a single missing projection between two adjacent views, and VFIT4x for interpolating one projection across a gap of three missing projections. For 4-fold reduction, VFIT4x and VFIT2x were applied sequentially. The models were trained and validated using 592,776 projections from 2,997 head-and-neck cancer patients, split into training (2,500), validation (247), and testing (250) sets. Performance was assessed through visual comparison and quantified by calculating image similarity metrics (PSNR, SSIM), radial noise power spectra (NPS), and modulation transfer functions (MTF).
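The sequential use of the two models for 4-fold subsampling can be illustrated with a short index-bookkeeping sketch. It is an assumption-laden illustration, not the authors' implementation: it assumes that the 4x model fills the central view of each gap of three and that the 2x model then fills the remaining single-view gaps. The function names (`subsample`, `midpoint_pass`, `schedule_4x`) are hypothetical, no network is involved, and the wrap-around between the last and first view of a full rotation is ignored.

```python
def subsample(n_proj, factor):
    """Indices of the projections kept when acquiring every `factor`-th view."""
    return list(range(0, n_proj, factor))


def midpoint_pass(known):
    """One interpolation pass over sorted known indices.

    Returns (left, right, target) for every adjacent pair of known views
    that has a gap; target is the central missing index.
    """
    tasks = []
    for left, right in zip(known, known[1:]):
        if right - left > 1:
            tasks.append((left, right, (left + right) // 2))
    return tasks


def schedule_4x(n_proj):
    """Two-pass schedule for 4-fold subsampling (assumed ordering)."""
    acquired = subsample(n_proj, 4)
    # Pass 1 (4x model, assumed): central view of each gap of three.
    pass_4x = midpoint_pass(acquired)
    known = sorted(acquired + [target for _, _, target in pass_4x])
    # Pass 2 (2x model, assumed): remaining single missing views.
    pass_2x = midpoint_pass(known)
    return acquired, pass_4x, pass_2x


acquired, pass_4x, pass_2x = schedule_4x(9)
print(acquired)  # [0, 4, 8]
print(pass_4x)   # [(0, 4, 2), (4, 8, 6)]
print(pass_2x)   # [(0, 2, 1), (2, 4, 3), (4, 6, 5), (6, 8, 7)]
```

In this ordering the second pass takes views synthesized in the first pass as input, which is one plausible reason why errors could accumulate for wider angular gaps.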
Results: Synthesized projections from both VFIT models demonstrated high image quality and noticeably reduced noise levels (see Figure 1). Figure 2 presents CBCT images reconstructed from subsampled projections (CBCT_SUB2x and CBCT_SUB4x), from interpolated projections (CBCT_VFIT2x and CBCT_VFIT4x), and the reference CBCT_OR. Compared to the subsampled images, both CBCT_VFIT2x and CBCT_VFIT4x showed substantially reduced streaking artifacts and noise, bringing the image quality close to CBCT_OR. However, CBCT_VFIT2x achieved superior image sharpness compared to CBCT_VFIT4x, indicating that wider angular gaps and sequential model application present challenges for maintaining projection consistency. Quantitative analysis confirmed these visual findings: CBCT_VFIT2x achieved higher image similarity metrics (SSIM: 0.96, PSNR: 43.5 dB) than CBCT_VFIT4x (SSIM: 0.94, PSNR: 36.6 dB), and both improved on the CBCTs reconstructed from the 2-fold (SSIM: 0.92, PSNR: 39.1 dB) and 4-fold (SSIM: 0.81, PSNR: 34.0 dB) subsampled projection sets. The radial NPS, shown in Figure 1, confirmed the observed noise reduction when applying VFIT to CBCT_SUB2x/4x. In contrast to the visual impression, the MTF curves showed no reduction in spatial resolution (1.01 lp/mm).
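For readers unfamiliar with the PSNR values quoted above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(R^2 / MSE), where R is the data range and MSE the mean squared error against the reference image. The abstract does not state which data range, masking, or software was used, so the `data_range` default (dynamic range of the reference) and the function name `psnr` are assumptions made here for illustration only.

```python
import numpy as np


def psnr(reference, test, data_range=None):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if data_range is None:
        # Assumption: use the dynamic range of the reference image.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Toy example: a uniform error of 0.1 on an image with unit range
# gives MSE = 0.01 and therefore a PSNR of about 20 dB.
ref = np.array([[0.0, 1.0], [0.0, 1.0]])
print(round(psnr(ref, ref + 0.1), 3))  # 20.0
```

Because PSNR depends on the chosen data range, values are only comparable when computed with the same convention.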