An interactive plan and model evolution method for knowledge‐based pelvic VMAT planning

Abstract Purpose To test whether a RapidPlan DVH estimation model and its training plans can be improved interactively through a closed‐loop evolution process. Methods and materials Eighty‐one manual plans (P0) that were used to configure an initial rectal RapidPlan model (M0) were reoptimized using M0 (closed‐loop), yielding 81 P1 plans. The 75 improved P1 plans (P1+) and the remaining 6 P0 plans were used to configure model M1. The 81 training plans were reoptimized again using M1, producing 23 P2 plans that were superior to both their P0 and P1 forms (P2+). Hence, the knowledge base of model M2 was composed of 6 P0, 52 P1+, and 23 P2+ plans. The models were tested dosimetrically on 30 VMAT validation cases (Pv) that were not used for training, yielding Pv(M0), Pv(M1), and Pv(M2), respectively. The 30 Pv were also optimized using M2_new, which was trained on the library of M2 plus the 30 Pv(M0). Results With comparable target dose coverage, the first closed‐loop reoptimization significantly (P < 0.01) reduced the 81 training plans’ mean dose to the femoral head, urinary bladder, and small bowel by 2.65 Gy/15.63%, 2.06 Gy/8.11%, and 1.47 Gy/6.31%, respectively; the second closed‐loop reoptimization reduced these doses significantly further (P < 0.01), by 0.04 Gy/0.28%, 0.18 Gy/0.77%, and 0.22 Gy/1.01%, respectively. However, the open‐loop VMAT validations displayed more complex and intertwined changes in plan quality: relative to M0, the mean dose to the urinary bladder and small bowel decreased monotonically with M1 (by 0.34 Gy/1.47% and 0.25 Gy/1.13%) and M2 (by 0.36 Gy/1.56% and 0.30 Gy/1.36%), whereas the mean dose to the femoral head increased by 0.81 Gy/6.64% (M1) and 0.91 Gy/7.46% (M2). This overfitting problem was relieved by applying model M2_new. Conclusions The RapidPlan model and its constituent plans can improve each other interactively through a closed‐loop evolution process. Incorporating new patients into the original training library can improve the RapidPlan model and the upcoming plans interactively.


1 | INTRODUCTION
Knowledge-based radiotherapy treatment planning is considered to reduce inter-planner variability in plan quality [1-13] and to expedite the planning process [14-17]. The RapidPlan module in the Eclipse treatment planning system, version 13.5 or later (Varian Medical Systems, Palo Alto, CA), has commercialized the knowledge-based solution [18,19] and has displayed good compatibility across patient orientations, treatment techniques, and systems [20,21]. Well-trained RapidPlan models have outperformed conventional trial-and-error-based manual planning by reducing excess organs-at-risk (OAR) dose with greater consistency [17,20,22-30]. If model performance depends strongly on the library volume [31] and on the average quality of the training plans [17,32], then incorporating the model-improved constituent training plans back into the model (closed-loop) [25] may evolve the model through a cycle of interactive improvement. There have been attempts to iteratively improve a KDE (kernel density estimation)-based DVH prediction model. However, compared with RapidPlan, the KDE algorithm did not consider the division between in-field and out-of-field regions, and the generated point objectives were tested on a limited sample size based on Pinnacle (Philips Radiation Oncology Systems, Fitchburg, WI) [33], whose progressive optimization algorithm (POA) differs from Eclipse's Photon Optimizer (PO). This study aims to evaluate the performance of closed-loop model evolution for rectal cancer patients in the Eclipse RapidPlan V13.6 knowledge-based treatment planning environment.

2 | MATERIALS AND METHODS
Figure 1 gives a schematic workflow of the evolution process and explains the naming abbreviations.

2.A | Initial model configuration
The planning and modeling details can be found in our previous publications [17,20,34]. In summary, 81 clinical VMAT plans (Pc) for preoperative rectal cancer patients were refined manually by experts (best-effort manual plans, P0) to guarantee the initial plan quality and to impose a stricter evaluation criterion on the closed-loop method. Plans were optimized to deliver 50.6 Gy and 41.8 Gy to 95% of the PTV boost and 95% of the PTV, respectively, in 22 fractions [35]. The structure sets, prescriptions, and field geometries extracted from P0 were regressed to form the initial DVH estimation model (M0), which was statistically verified using the Varian Model Analytics tool [36]. Model-generated optimization objectives and priorities were supplemented by additional manual constraints to make the model comply with our clinical protocols. Validations on more than 100 patients demonstrated that the personalized objectives generated by M0 significantly improved plan quality and consistency compared with the clinical plans [17,20].

2.B | Model evolution
As shown in Fig. 1, the 81 constituent P0 plans of M0 were reoptimized using M0 (closed-loop), yielding the training sets of the first iteration (P1). To simplify the scoring of plan quality and to avoid observer-dependent evaluation preferences, especially when the DVH lines cross over, three explicit endpoints were compared: the mean dose to the femoral head, urinary bladder, and small bowel (Dmean_FH, Dmean_UB, and Dmean_SB) [33]. Plans with reduced Dmean_FH, Dmean_UB, and Dmean_SB were defined as improved plans (P1+). The first closed-loop reoptimization using M0 produced 75 P1+ plans, which, together with the 6 P0 plans for which the original was considered better, formed the library of M1.
The second closed-loop reoptimization using M1 yielded 23 P2+ plans of better quality than both their P0 and P1 forms. The new model of each iteration was configured with the best plans from all previous optimizations; hence M2 included 6 P0, 52 P1+, and 23 P2+ plans. To be cost-effective, iterations were terminated when no further, or only clinically negligible, improvement could be achieved. To address the over-fitting problem, the 30 VMAT validation cases reoptimized using M0 (Pv(M0)) were added to the training library of M2, yielding model M2_new. The performance of M2_new was then tested on the 30 validation cases.
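The library-selection rule of the closed-loop evolution can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Plan` class, the `select_library` function, the patient identifiers, and all dose values are hypothetical, and the strict "lower on all three endpoints" comparison is an assumption about how ties were handled.

```python
from dataclasses import dataclass
from typing import Dict, List

# The three mean-dose endpoints used to score plan quality (values in Gy).
ENDPOINTS = ("Dmean_FH", "Dmean_UB", "Dmean_SB")


@dataclass
class Plan:
    label: str               # "P0", "P1", "P2", ...
    dmean: Dict[str, float]  # endpoint name -> mean dose in Gy


def is_improved(candidate: Plan, reference: Plan) -> bool:
    """True if the candidate has a lower mean dose on all three endpoints.

    Assumption: the comparison is strict; tie handling is not specified.
    """
    return all(candidate.dmean[k] < reference.dmean[k] for k in ENDPOINTS)


def select_library(history: Dict[str, List[Plan]]) -> Dict[str, Plan]:
    """Pick one training plan per patient from its ordered forms (P0, P1, ...).

    A later form replaces the retained plan only if it improves on every
    earlier form of the same patient; otherwise the retained plan is kept.
    """
    library = {}
    for patient_id, plans in history.items():
        best = plans[0]
        for i, candidate in enumerate(plans[1:], start=1):
            if all(is_improved(candidate, prev) for prev in plans[:i]):
                best = candidate
        library[patient_id] = best
    return library


def doses(fh: float, ub: float, sb: float) -> Dict[str, float]:
    """Helper that builds an endpoint dictionary from three numbers."""
    return dict(zip(ENDPOINTS, (fh, ub, sb)))


# Hypothetical toy data for three patients (not taken from the study).
history = {
    "A": [Plan("P0", doses(15.0, 25.0, 23.0)),
          Plan("P1", doses(13.0, 23.0, 22.0)),
          Plan("P2", doses(12.9, 22.8, 21.7))],
    "B": [Plan("P0", doses(14.0, 24.0, 22.0)),
          Plan("P1", doses(13.0, 23.5, 21.0)),
          Plan("P2", doses(13.2, 23.4, 20.9))],
    "C": [Plan("P0", doses(12.0, 20.0, 18.0)),
          Plan("P1", doses(12.5, 19.0, 17.0)),
          Plan("P2", doses(11.9, 19.5, 17.5))],
}

library = select_library(history)
labels = {patient_id: plan.label for patient_id, plan in library.items()}
print(labels)
```

In this toy example patient A keeps the P2 form, patient B keeps P1 because its P2 form raises the femoral head dose relative to P1, and patient C keeps P0 because neither reoptimization improves on all earlier forms.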

2.D | Statistical methods
Statistical analysis was performed in SPSS (v21.0, IBM Analytics, Armonk, NY). Normality was tested using the Shapiro-Wilk test. Normally and non-normally distributed data were analyzed with the paired-samples t-test and the Wilcoxon signed-rank test, respectively (two-tailed, significance level 0.05).
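The test-selection logic described above can be sketched with SciPy instead of SPSS. This is an illustrative sketch only: the function name `compare_paired` and the synthetic dose values are hypothetical, and checking normality on the paired differences (rather than on each sample separately) is an assumption, since the text does not specify it.

```python
import numpy as np
from scipy import stats


def compare_paired(before, after, alpha=0.05):
    """Choose and run a two-tailed paired test based on a normality check.

    Assumption: Shapiro-Wilk is applied to the paired differences.
    Returns (test name, p-value, whether p < alpha).
    """
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    _, p_normal = stats.shapiro(after - before)
    if p_normal > alpha:
        # No evidence against normality: paired-samples t-test.
        test_name = "paired t-test"
        _, p_value = stats.ttest_rel(after, before)
    else:
        # Normality rejected: Wilcoxon signed-rank test.
        test_name = "Wilcoxon signed-rank test"
        _, p_value = stats.wilcoxon(after, before)
    return test_name, float(p_value), bool(p_value < alpha)


# Synthetic mean doses (Gy) for 30 hypothetical cases, not study data.
rng = np.random.default_rng(0)
dose_m0 = rng.normal(25.0, 3.0, size=30)
dose_m1 = dose_m0 - rng.normal(0.3, 0.2, size=30)

test_name, p_value, significant = compare_paired(dose_m0, dose_m1)
print(test_name, p_value, significant)
```

Both SciPy tests are two-sided by default, which matches the two-tailed setting stated above.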

3.A | Closed-loop reoptimizations
After the training library was updated with the 75 P1+ plans in the first closed-loop refinement, the 81 plans used to configure model M1 had HI and CI comparable to those of the M0 library (mean difference < 0.03) but consistently lower mean doses to all OARs.

3.B | Validations
Across the 30 VMAT validation cases, knowledge-based reoptimizations using the various models yielded comparable target coverage (mean difference of HI and CI < 0.01), but the impact on the OARs was more complex and intertwined: relative to the results obtained with M0, monotonically increasing mean dose reductions to two OARs were observed using the refined models

4 | DISCUSSION
In the first closed-loop refinement, most (75/81, 92.59%) of the best-effort manual plans (P0) that were used to train the initial RapidPlan

The largely overlapping target DVHs in Fig. 2 echoed the comparable target numerics in Table 3