Improvements in treatment planning calculations motivated by tightening IMRT QA tolerances

Abstract Implementing tighter intensity modulated radiation therapy (IMRT) quality assurance (QA) tolerances initially resulted in high numbers of marginal or failing QA results and motivated a number of improvements to our calculational processes. This work details those improvements and their effect on results. One hundred eighty IMRT plans analyzed previously were collected, and new gamma criteria were applied and compared to the original results. The results were used to obtain an estimate for the number of plans that would require additional dose volume histogram (DVH)‐based analysis and therefore the predicted workload increase. For 2 months and 133 plans, the established criteria were continued while the new criteria were applied and tracked in parallel. Because the number of marginal or failing plans far exceeded the predicted levels, a number of calculational elements were investigated: IMRT modeling parameters, calculation grid size, and couch top modeling. After improvements to these elements, the new criteria were clinically implemented, and the frequency of passing, questionable, and failing plans was measured for the subsequent 15 months and 674 plans. The retrospective analysis of selected IMRT QA results predicted that 75% of plans would pass, while 19% of IMRT QA plans would need DVH‐based analysis and an additional 6% would fail. However, after applying the tighter criteria for 2 months, the distribution of plans was significantly different from prediction, with questionable or failing plans reaching 47%. After investigating and improving several elements of the IMRT calculation processes, the frequency of questionable plans was reduced to 11% and that of failing plans to less than 1%. Tighter IMRT QA tolerances revealed the need to improve several elements of our plan calculations. As a consequence, the accuracy of our plans has improved, and the frequency of marginal or failing IMRT QA results remains within our practical ability to respond.


1 | INTRODUCTION
The clinical relevance of intensity modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT) quality assurance (QA; further referred to as IMRT QA) is often questioned [1][2][3][4] due to the lack of sensitivity of the IMRT QA methods. Recent recommendations 5 [...] Corp., USA) using the gamma criteria 7 of 3%/3 mm, global, and 10% threshold. (The software switch "Apply measurement uncertainty" was on, so the criteria correlate with 4%/3 mm with the uncertainty correction off.) A passing threshold of 92% had been used for many years, and all these plans had passed. The plans were reanalyzed with the new gamma criteria of 3%/2 mm, global, 10% threshold, and uncertainty corrections off, and the plans were binned into the categories of "pass" (>95% of diodes passing), "questionable" (between 90% and 95%), and "failed" (<90%). 6 The results using the new criteria were used to obtain an estimate for the number of plans requiring additional dose volume histogram (DVH)-based analysis using 3DVH software (Sun Nuclear Corp., USA) and therefore the predicted workload increase.
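The gamma analysis and binning described above can be illustrated with a minimal one-dimensional sketch. It is not the SNC Patient implementation: the function names, the normalization to the maximum calculated dose, and the handling of the exact 90% and 95% boundaries are assumptions made for illustration only.

```python
import math


def gamma_pass_rate(meas_pos, meas_dose, calc_pos, calc_dose,
                    dose_tol=0.03, dta_mm=2.0, threshold=0.10):
    """Percent of measured points with gamma <= 1 (1D, global normalization).

    Assumption: the global dose difference and the low-dose threshold are
    both taken relative to the maximum calculated dose.
    """
    norm = max(calc_dose)
    evaluated = 0
    passed = 0
    for xm, dm in zip(meas_pos, meas_dose):
        if dm < threshold * norm:
            continue  # below the 10% threshold: excluded from the analysis
        # Gamma is the minimum combined distance/dose metric over all
        # calculated points.
        gamma = min(
            math.hypot((xc - xm) / dta_mm, (dc - dm) / (dose_tol * norm))
            for xc, dc in zip(calc_pos, calc_dose)
        )
        evaluated += 1
        if gamma <= 1.0:
            passed += 1
    if evaluated == 0:
        raise ValueError("no measured points above the dose threshold")
    return 100.0 * passed / evaluated


def classify(pass_rate):
    """Bin a passing rate (in percent) into the three categories of the text.

    Assumption: rates of exactly 90% or 95% are treated as "questionable".
    """
    if pass_rate > 95.0:
        return "pass"
    if pass_rate >= 90.0:
        return "questionable"
    return "failed"


# Flat hypothetical profile: a uniform 2% offset passes everywhere, while a
# uniform 4% offset fails everywhere, because no distance shift helps.
positions = [float(i) for i in range(11)]
flat = [100.0] * 11
rate_2pct = gamma_pass_rate(positions, [102.0] * 11, positions, flat)
rate_4pct = gamma_pass_rate(positions, [104.0] * 11, positions, flat)
```

In a flat dose region the distance-to-agreement term cannot compensate for a dose offset, so the result depends only on the dose-difference criterion.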
The implementation of the revised IMRT QA analysis was phased. For the first 2 months, 133 IMRT QA measurements were made and analyzed with the original criteria and used for the patient-specific QA documentation while, in parallel, the results with the revised criteria were obtained and tracked. This was done to permit training with the new workflow and to test whether the earlier prediction of the increased workload would be borne out. The distribution of pass, questionable, and fail plans under the new criteria was compared to the predicted distribution using a chi-square test.
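One way such a comparison could be set up is a chi-square goodness-of-fit test of the observed counts against the predicted fractions. The sketch below assumes that form of the test. The predicted fractions (75%, 19%, 6%) and the total of 133 plans come from the text, but the split of the observed counts between "questionable" and "failed" is hypothetical, chosen only so that about 47% of plans are questionable or failing.

```python
import math


def chi_square_gof(observed_counts, expected_fractions):
    """Chi-square goodness-of-fit statistic and the expected counts."""
    total = sum(observed_counts)
    expected = [total * f for f in expected_fractions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))
    return stat, expected


def p_value_two_dof(stat):
    """Upper-tail p-value for 2 degrees of freedom.

    With three categories there are 2 degrees of freedom, for which the
    chi-square survival function has the closed form exp(-x / 2).
    """
    return math.exp(-stat / 2.0)


predicted = [0.75, 0.19, 0.06]   # pass, questionable, fail (from the text)
observed = [70, 45, 18]          # hypothetical split of the 133 plans

stat, expected = chi_square_gof(observed, predicted)
p_value = p_value_two_dof(stat)
significant = p_value < 0.05
```

The closed-form p-value avoids a dependency on a statistics library. For other numbers of categories a general chi-square survival function would be needed.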
The frequency of problematic plans increased once all IMRT QA measurements were included in the new analysis method (see Table 1). DVH analysis demonstrated that over 50% of the problematic plans were measuring cold in the organs at risk (OAR) and hot in the planning target volume (PTV), which indicated that there was a systematic difference between the TPS and the resulting treatment. This motivated a series of investigations and modifications to our processes in order to reduce the effects of systematic calculational artifacts. These are detailed below. Subsequent to these changes, the new criteria were clinically implemented. The distribution of IMRT QA results between "pass," "questionable," and "fail" was analyzed for the next 673 plans over 15 months.

2.B.1 | TPS calculation parameters
It was found that one of the distributed calculation framework (DCF) settings in Eclipse, which sets the angular resolution for conformal and VMAT calculations, was set to 5 degrees. This caused an apparent ripple in the calculated dose, most prominent for small fields and increasing with distance from the isocenter. This setting was changed to "off," which changed the calculation resolution to the resolution of the control points (2 degrees). The effect is shown in Section 3.
The clinical dose calculation grid size is typically set to 2.5 mm in Eclipse. However, SNC Patient interpolates all plans to a 1.0 mm grid for gamma analysis, creating opportunities for misrepresentation of treatment plans, as shown in Fig. 1. All QA plans are now calculated with a dose grid of 1.0 mm to remove any interpolation error.
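The interpolation effect can be illustrated with a toy example. The sketch below samples a hypothetical steep dose edge (a logistic curve, not a measured or TPS-calculated profile) on a 2.5 mm grid, linearly interpolates it to a 1.0 mm grid, and reports the largest deviation from the underlying profile. Linear interpolation is an assumption here; the interpolation scheme used by SNC Patient is not stated in the text.

```python
import bisect
import math


def edge_profile(x_mm, edge_mm=0.6, width_mm=1.5):
    """Hypothetical steep dose edge: relative dose falling from 1 to 0."""
    return 1.0 / (1.0 + math.exp((x_mm - edge_mm) / width_mm))


def make_grid(step_mm, start_mm=-10.0, stop_mm=10.0):
    """Uniform grid from start to stop (inclusive) with the given spacing."""
    n = int(round((stop_mm - start_mm) / step_mm)) + 1
    return [start_mm + step_mm * k for k in range(n)]


def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation on an ascending grid xs."""
    i = bisect.bisect_right(xs, x) - 1
    i = max(0, min(i, len(xs) - 2))  # clamp to a valid interval
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def max_interp_error(coarse_step_mm, fine_step_mm=1.0):
    """Largest |interpolated - true| relative dose on the fine grid."""
    coarse_x = make_grid(coarse_step_mm)
    coarse_y = [edge_profile(x) for x in coarse_x]
    return max(
        abs(linear_interp(x, coarse_x, coarse_y) - edge_profile(x))
        for x in make_grid(fine_step_mm)
    )


error_2p5_mm = max_interp_error(2.5)  # calculated at 2.5 mm, resampled to 1.0 mm
error_1p0_mm = max_interp_error(1.0)  # calculated directly on the 1.0 mm grid
```

For this toy profile the resampled 2.5 mm calculation deviates from the underlying curve in the gradient region, while the 1.0 mm calculation has no interpolation error at the analysis points.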
TABLE 1 Distributions of pass, questionable, and failed plans for the retrospective analysis of H & N and prostate IMRT quality assurance (QA) measurements, the first 2 months of clinical testing prior to introducing the calculational changes, and the subsequent 15 months of IMRT QA measurements. The gamma criteria are 3%/2 mm, 10% threshold with >95% of points passing being a pass, between 90% and 95% being questionable and below 90% being a fail.
[...]ing rates were compared to the passing rates obtained with the initial DLG and transmission parameters. These were altered until agreement degraded, giving a range of possible combinations, and then the parameters were refined until optimal agreement was reached.
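The parameter adjustment described above resembles a coarse-to-fine search over DLG and transmission. The sketch below shows that idea only and is not the procedure actually followed. The scoring function is a made-up stand-in for the measured passing rate, and its peak location (DLG of 1.7 mm, transmission of 0.015) and the search ranges are hypothetical values chosen for illustration.

```python
def coarse_to_fine_search(score, dlg_range, trans_range, steps=5, rounds=3):
    """Maximize score(dlg, transmission) on a grid that shrinks each round.

    Returns a tuple (best_score, best_dlg, best_transmission).
    """
    (d_lo, d_hi), (t_lo, t_hi) = dlg_range, trans_range
    best = None
    for _ in range(rounds):
        d_step = (d_hi - d_lo) / (steps - 1)
        t_step = (t_hi - t_lo) / (steps - 1)
        d_vals = [d_lo + d_step * i for i in range(steps)]
        t_vals = [t_lo + t_step * j for j in range(steps)]
        # Evaluate every combination and keep the best-scoring one.
        best = max((score(d, t), d, t) for d in d_vals for t in t_vals)
        _, d_best, t_best = best
        # Narrow the range to one grid step on either side of the best point.
        d_lo, d_hi = d_best - d_step, d_best + d_step
        t_lo, t_hi = t_best - t_step, t_best + t_step
    return best


def toy_passing_rate(dlg_mm, transmission):
    """Hypothetical stand-in for a measured passing rate (percent)."""
    return 100.0 - 40.0 * (dlg_mm - 1.7) ** 2 - 2.0e5 * (transmission - 0.015) ** 2


best_rate, best_dlg, best_trans = coarse_to_fine_search(
    toy_passing_rate, dlg_range=(1.0, 2.5), trans_range=(0.010, 0.020)
)
```

In practice each evaluation of the score would require recalculating the plans and re-running the gamma analysis, so the number of grid points per round would be kept small.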

2.B.4 | Couch top model values
Due to a discrepancy seen between the TPS and the corresponding ArcCHECK measurement for a 360° open field arc (see Fig. 2 [...]
[...] impact on the measurement results, especially for small field plans, as can be seen in Fig. 7. In this case, the 3%/2 mm passing rate for the plan increased from 90.1% to 96.8% by changing the DCF angular resolution setting. The gamma results for larger field plans also were impacted, although not as drastically (~2% increase) as small fields, as can be seen in Table 2.

Couch modeling and dose grid results
The original couch models demonstrated up to a 4.2% difference between the TPS and the corresponding measurement for select beam directions. After optimizing the surface and interior HU values for the couch in Eclipse, the error between the TPS plan and measurement decreased to less than 0.5%.
[...] much more palatable in a busy clinic.
Even after these improvements, about 14% of our IMRT plans do not cleanly pass IMRT QA. We are still learning from DVH analysis of the questionable and failing plans whether further systematic improvements to our treatment planning processes are warranted or whether we have reached the limit of the changeable parameters available within Eclipse. One known issue is that our beam model is optimized to minimize differences between the Varian Clinac and TrueBeam. In the next year, we will have completely transitioned to the newer machines and will adjust the models accordingly.

| CONCLUSIONS
Tightening IMRT QA tolerances revealed the need to improve several elements of our IMRT and VMAT calculations. As a consequence, the accuracy of our treatment planning has improved, and the frequency of finding marginal or failing IMRT QA results, while much larger than before the tightening, remains within our practical ability to respond.

CONFLICTS OF INTEREST
The authors have no relevant conflicts of interest to disclose.