ESTRO 2025 - Abstract Book
S2427
Physics - Autosegmentation
944
Digital Poster
Determination of metrics by correlation between physician satisfaction and geometric similarity for heart auto-segmentation model
Eun Jeong Heo 1,2, Song Heui Cho 1, Nam Kwon Lee 1, Suk Lee 1, Chul Yong Kim 1
1 Department of Radiation Oncology, College of Medicine, Korea University, Seoul, Korea, Republic of.
2 Department of Medical Physics, Graduate School of Korea University, Sejong, Korea, Republic of

Purpose/Objective: This study evaluated physician satisfaction with a heart auto-segmentation model trained on-site with institutional patient data, using an in-house-developed physician blind test. We also sought to determine comprehensive evaluation metrics by examining the correlation between physician satisfaction and geometric similarity, to support the clinical application of heart auto-segmentation models.

Material/Methods: Retrospective data from 80 breast cancer patients at our institution were obtained. A radiation oncologist with extensive experience contoured the heart, left and right lungs, esophagus, and thyroid according to RTOG guidelines. Fifty patients were selected to assess the on-site trained model's performance with the FCDN model, while a subset of 30 patients was used for clinical validation. Against a reference dataset, we compared the clinical validation results of the on-site trained model with those of a pre-built model. Geometric similarity was assessed using the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), and mean surface distance (MSD), and physician satisfaction was evaluated using both subjective and quantitative methods. Total scores were calculated from the subjective evaluations (4 questions) and the quantitative assessments (borders, chambers, great vessels, and coronary arteries). The physician blind test total was graded in three bands: "Unacceptable with major corrections" (score 0-3), "Acceptable with minor corrections" (score 4-6), and "Acceptable with no corrections" (score 7-8).
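The three geometric similarity metrics named in the methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each structure is given as a set of voxel indices (for DSC) and as a list of surface points in millimetres (for HD95 and MSD). HD95 conventions vary between tools, and the sketch takes the larger of the two directed 95th percentiles. All function names are hypothetical.

```python
import math


def dice(voxels_a, voxels_b):
    """Dice similarity coefficient of two voxel-index sets."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # assumption: two empty structures count as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def _directed_distances(src, dst):
    """For every point in src, the distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def _percentile(values, pct):
    """Percentile with linear interpolation between sorted neighbours."""
    s = sorted(values)
    k = (len(s) - 1) * pct / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def hd95(surface_a, surface_b):
    """95th-percentile Hausdorff distance (max of both directions)."""
    return max(
        _percentile(_directed_distances(surface_a, surface_b), 95),
        _percentile(_directed_distances(surface_b, surface_a), 95),
    )


def mean_surface_distance(surface_a, surface_b):
    """Mean of all nearest-neighbour distances, pooled over both directions."""
    d = _directed_distances(surface_a, surface_b)
    d += _directed_distances(surface_b, surface_a)
    return sum(d) / len(d)


if __name__ == "__main__":
    # Toy structures (made up, not study data): three voxels, shifted by one.
    reference = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
    predicted = [(1, 0, 0), (2, 0, 0), (3, 0, 0)]
    print(dice(reference, predicted))
    print(hd95(reference, predicted))
    print(mean_surface_distance(reference, predicted))
```

The brute-force nearest-neighbour search is quadratic in the number of surface points, so it only suits small illustrative inputs.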
We used Spearman's rank correlation between the geometric similarity metrics and the physician blind test results to determine metrics for clinical acceptability. Differences between the two auto-segmentation models were assessed with a Wilcoxon signed-rank test (significance at p<0.05).

Results: DSC and MSD did not differ significantly between the on-site trained and pre-built models, whereas HD95 did (5.13±1.67 mm vs. 7.10±3.28 mm, p<0.001). In the physician blind test, the on-site trained model was rated “Acceptable with minor corrections” (6.95±0.93), while the pre-built model was rated “Unacceptable with major corrections” (3.07±1.10). Spearman coefficients indicated no correlation between the physician blind test and geometric similarity results (r_s ≤ 0.50).

Conclusion: Comprehensive evaluation with an in-house-developed physician blind test showed improved physician satisfaction with the on-site trained model for heart delineation. The absence of correlation between geometric similarity and physician satisfaction also indicates that a physician blind test is necessary to evaluate clinical acceptability.
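The two statistical procedures used above, Spearman's rank correlation and the Wilcoxon signed-rank test, can be sketched in plain Python as follows. This is an illustrative sketch, not the analysis code of the study. The Wilcoxon p-value uses the large-sample normal approximation without tie or continuity correction, so statistical packages may report slightly different values. Function names and the sample numbers are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r_s, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)  # undefined if either input is constant


def wilcoxon_signed_rank(x, y):
    """Return (W, two-sided p) for paired samples, normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return w, p


if __name__ == "__main__":
    # Made-up per-patient values, for illustration only (not study data).
    hd95_model_a = [4.1, 5.0, 6.2, 3.9, 5.5, 4.8]
    hd95_model_b = [6.0, 7.4, 6.9, 5.1, 9.0, 7.7]
    blind_test_scores = [7, 6, 5, 8, 6, 7]
    print(spearman(hd95_model_a, blind_test_scores))
    print(wilcoxon_signed_rank(hd95_model_a, hd95_model_b))
```

For a cohort as small as the 30 validation patients described here, an exact p-value would normally be preferred over the normal approximation.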
Keywords: Auto-segmentation, Heart delineation, Correlation