ESTRO 2024 - Abstract Book
Physics - Machine learning models and clinical applications
A diverse cohort of 392 head-and-neck (HN) cancer patients was included in this study and divided into training (283), validation (31) and test (78) sets.
Robust IMPT plans with two dose levels (70/54.25 GyE) were created with Erasmus-iCycle [1, 2] using 9 scenarios; these plans served as ground truth (GT) for training and evaluating the DL models. The 9 scenarios were the nominal scenario (no errors), a setup error of +3 or -3 mm in each direction, and a range error of +3% or -3%. Planning therefore yielded 9 scenario-specific dose distributions per patient. We investigated three DL strategies: (1) a separate prediction model for each scenario, all trained independently; (2) a separate model for each scenario, where training for the non-nominal scenarios is initialized from the trained network for the nominal scenario (transfer learning); and (3) a single model trained on the dose distributions for all scenarios, with scenario labels as additional input.
All prediction models were based on the Hierarchical Dense U-Net. The patient images (planning CT, structure sets and dose) were downsampled to a grid size of 256x256x128 with a voxel size of 1.8x1.3x(2-2.5) mm³, i.e., keeping the original slice thickness. The inputs to the models were the planning CT, the binary masks of the CTVs and OARs, and a feature map representing the Euclidean distance between the isocenter voxel and every voxel inside the patient contour (isocentric distance map). For strategy (3), a one-hot vector representing the scenario to be predicted was used as an additional input.
Setup errors were incorporated into the models as shifts in the isocentric distance map, according to the scenario-specific setup error. Range uncertainties were accommodated by upscaling or downscaling the HUs of the planning CT.
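The scenario-dependent inputs described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the zero fill outside the patient contour, the sign convention of the setup shift, and the way the HUs are scaled (here relative to air at -1000 HU) are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

# Nine scenarios: nominal, +/-3 mm setup error along each axis, +/-3% range error.
# Each entry is (setup shift in mm, relative range error).
SCENARIOS = [((0.0, 0.0, 0.0), 0.0)]
for axis in range(3):
    for sign in (+1.0, -1.0):
        shift = [0.0, 0.0, 0.0]
        shift[axis] = sign * 3.0
        SCENARIOS.append((tuple(shift), 0.0))
SCENARIOS += [((0.0, 0.0, 0.0), +0.03), ((0.0, 0.0, 0.0), -0.03)]


def distance_map(shape, spacing_mm, isocenter_mm,
                 setup_shift_mm=(0.0, 0.0, 0.0), body_mask=None):
    """Euclidean distance (mm) from the shifted isocenter to every voxel.

    Assumption: the setup error is modelled by adding the shift to the
    isocenter position; voxels outside the patient contour are set to 0.
    """
    axes = [np.arange(n) * s for n, s in zip(shape, spacing_mm)]
    grid = np.meshgrid(*axes, indexing="ij")
    center = np.asarray(isocenter_mm) + np.asarray(setup_shift_mm)
    dist = np.sqrt(sum((g - c) ** 2 for g, c in zip(grid, center)))
    if body_mask is not None:
        dist = np.where(body_mask, dist, 0.0)
    return dist


def scale_hu(ct_hu, range_error):
    """Scale CT numbers for a range scenario.

    Assumption: scaling is applied to (HU + 1000) so that air stays at -1000.
    """
    return (ct_hu + 1000.0) * (1.0 + range_error) - 1000.0


def one_hot(index, n=len(SCENARIOS)):
    """Scenario label for the single-model strategy (3)."""
    label = np.zeros(n, dtype=np.float32)
    label[index] = 1.0
    return label
```

For a given scenario index `i`, the model inputs would then be built from `distance_map(..., setup_shift_mm=SCENARIOS[i][0])`, `scale_hu(ct, SCENARIOS[i][1])` and, for strategy (3) only, `one_hot(i)`.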
The presented evaluations for the three investigated DL strategies were based on the 78 patients in the test set, which were not used for training.
All experiments were conducted on an NVIDIA A100 Tensor Core 80GB GPU.
Results:
Strategy (3) was the most efficient in terms of calculation time. Training times for strategies (1), (2) and (3) were 51, 24 and 16 hours, respectively. Model loading times prior to prediction were 45 seconds for (1) and (2) and 28.5 seconds for (3). Average prediction times per patient and scenario were 1.18, 1.17 and 1.14 seconds for strategies (1), (2) and (3), respectively. All three strategies resulted in accurate dosimetric predictions, with comparable OAR dose prediction performance. Strategy (3) showed the smallest differences from the Erasmus-iCycle GT CTV coverages (Wilcoxon signed-rank test, all p<0.001); see Figure 1. The scatter plots in Figure 2 show the differences between GT and the predictions of strategy (3) for the individual scenarios of the 78 test patients.
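As a rough worked example, the reported loading and prediction times can be combined into a per-patient total. This is only a sketch under the assumption that the model(s) are loaded once per patient and that the nine scenarios are then predicted one after another; the abstract reports the components, not this total.

```python
N_SCENARIOS = 9

# Reported times in seconds, keyed by strategy.
load_s = {"(1)": 45.0, "(2)": 45.0, "(3)": 28.5}
predict_s = {"(1)": 1.18, "(2)": 1.17, "(3)": 1.14}


def total_time_per_patient(strategy):
    """Loading time plus nine sequential scenario predictions (assumed workflow)."""
    return load_s[strategy] + N_SCENARIOS * predict_s[strategy]


totals = {s: round(total_time_per_patient(s), 2) for s in load_s}
print(totals)
```

Under these assumptions, the totals are about 55.6 s for strategies (1) and (2) and about 38.8 s for strategy (3).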
Conclusion:
Strategy (3), using a single DL model to predict the dose in all separate IMPT robustness scenarios with scenario labels as input, showed close agreement between predicted dose and GT, outperforming the other two investigated prediction strategies. Moreover, strategy (3) was the most efficient in calculation time, with a training time of 16 hours. For individual patients, the model loading time for prediction of the scenario doses of a patient was 28.5