Comparison of 3D and 2D gamma passing rate criteria for detection sensitivity to IMRT delivery errors

Abstract This study compared three‐dimensional (3D) and two‐dimensional (2D) percentage gamma passing rates (%GPs) for detection sensitivity to IMRT delivery errors and investigated the correlation between the two kinds of %GP. Eleven prostate IMRT cases were selected, and errors in multileaf collimator (MLC) bank sag, MLC leaf traveling, and machine output were simulated by recalculating the dose distributions in patients. 2D doses were extracted from the 3D doses at the isocenter position. The 3D and 2D %GPs with different gamma criteria were then obtained by comparing the recalculated and original doses in specific regions of interest (ROI), such as the whole body, the planning target volume (PTV), the bladder, and the rectum. The sensitivities of the two types of %GP to the simulated errors were compared, and the correlation between the 2D and 3D %GPs for different ROIs was analyzed. For the whole‐body evaluation, both the 2D and 3D %GPs with the 3%/3 mm criterion were above 90% for all tested MLC errors and for MU deviations up to 4%, and the 3D %GP was higher than the 2D %GP. In organ‐specific evaluations, the PTV‐specific 2D and 3D %GP gradients were −4.70% and −5.14% per millimeter of the MLC traveling error, and −17.79% and −20.50% per percentage of MU error, respectively. However, a stricter criterion (2%/1 mm) was needed to detect the tested MLC sag error. The Pearson correlation analysis showed a significant strong correlation (r > 0.8 and P < 0.001) between the 2D and 3D %GPs in the whole‐body and PTV‐specific gamma evaluations. The whole‐body %GP with the 3%/3 mm criterion was inadequate to detect the tested MLC and MU errors, and a stricter criterion may be needed. The PTV‐specific gamma evaluation helped to improve the sensitivity of the error detection, especially with the 3D %GP.


1 | INTRODUCTION
Both the planning and delivery of intensity-modulated radiation therapy (IMRT) are highly complex processes that require a comprehensive quality assurance (QA) procedure for routine IMRT plan verification. 1,2 Currently, IMRT QA is mostly performed by applying a patient-specific treatment plan to a phantom, measuring the two-dimensional (2D) planar dose distribution in the phantom, and comparing the measured and calculated phantom dose distributions. QA measurements are commonly taken with detector arrays consisting of either ion chambers or diodes. However, because of the lack of information regarding correlations between phantom dosimetry and anatomical dose distributions, including the volumetric dose differences between the targets and organs at risk (OARs), radiotherapy practice demands 3D dose verification based on actual patient anatomies. 3 1,4 Previous studies have assessed the merits and limitations of different QA systems in terms of their compatibility with the gamma analysis methods and their capability to detect different IMRT delivery errors. Rangel et al. 5 and Nelms et al. 6 deliberately introduced systematic multileaf collimator (MLC) offsets to the treatment beams and found that the 2D gamma analysis was insufficiently sensitive to detect some types of MLC misplacements and that planar IMRT QA passing rates did not predict clinically relevant patient dose errors.
Pulliam et al. 7 reported from their phantom study that the 3D gamma index produced better agreement than the corresponding 2D analysis with different algorithms. However, the responses of 3D and 2D gamma passing rates (%GPs) to different IMRT delivery errors have not yet been investigated thoroughly for individual structures in patients.
Therefore, this work applied the gradient technique and statistical methods to compare the 3D and 2D %GPs for different individual structures, to analyze the %GP responses to three different types of delivery error, and to investigate possible correlations between the two types of %GP.

2.A | Patient plans
Eleven IMRT plans for prostate treatment were randomly selected from clinical treatment cases. All patient plans were inversely optimized in the treatment planning system (TPS, Eclipse V11.0, Varian Medical Systems, Palo Alto, CA, USA) and calculated with a 2.5 mm × 2.5 mm × 2.5 mm dose grid using the anisotropic analytical algorithm (AAA). The plan isocenters were positioned at the PTV centroids. The plans consisted of eight or nine gantry angles and were delivered with the sliding window (SW) technique using 6

2.B | Introducing delivery errors into treatment plans
An in-house MATLAB program was developed to insert three different types of delivery errors into the clinical plans, following the techniques described by Zhen et al. 8 and Oliver et al. 9 A brief description of the experimental flow is shown in Fig. 1. The three types of errors that were created were MLC bank sag errors, MLC traveling errors, and delivered machine output errors, all of which are common errors that may occur during treatment.

2.B.1 | MLC sag errors
Because IMRT is implemented by beams at varied gantry angles, the MLC bank sag error deserves significant attention. When the MLC control mechanism is relaxed, MLC sag errors may occur as a result of gravity, with the largest sag deviation occurring in MLC bank positions at gantry angles of 90° or 270° and no deviation at 0° or 180°. This type of error varies with the gantry rotation. We used a sinusoidal transform to simulate such a sag error, as reported by Carver et al. 10 The gantry angle α can be extracted from the plan's DICOM RT file for each corresponding control point. "A" is the maximum amplitude of the MLC leaf position change when the beam is horizontal; in this study, "A" was set to 1, 1.5, 2, and 3 mm.
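The sinusoidal sag model described above can be sketched as follows. This is a minimal illustration rather than the in-house MATLAB program: it assumes that the bank shift equals A·sin(gantry angle), and the function name `sag_shift_mm` and the sign convention are hypothetical.

```python
import math

def sag_shift_mm(gantry_deg, amplitude_mm):
    """Assumed sag model: leaf-bank shift = A * sin(gantry angle)."""
    return amplitude_mm * math.sin(math.radians(gantry_deg))

# Shift at a few gantry angles for A = 2 mm (one of the tested amplitudes).
for angle in (0, 45, 90, 180, 270):
    print(angle, round(sag_shift_mm(angle, 2.0), 3))
```

Under this assumption the shift vanishes at 0° and 180° and reaches its maximum magnitude A at 90° and 270°, which matches the behaviour described in the text.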

2.B.2 | MLC leaf traveling errors
The MLCs (30, 31) are the two central MLC leaves in the negative x-direction (using the IEC 1217 coordinate system); B is the leaf positional error, which was set to 1.5, 2, 3, 4, and 5 mm in this work.
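A sketch of how such a leaf traveling error might be applied to one control point is given below. It is an illustration under stated assumptions: the data layout (a list of leaf positions in mm for one bank), the 1-based leaf numbering, the sign of the shift, and the function name `apply_leaf_error` are hypothetical and are not taken from the authors' program.

```python
def apply_leaf_error(bank_positions_mm, error_mm, leaves=(30, 31)):
    """Return a copy of one bank's leaf positions with `error_mm` added
    to the selected leaves (1-based leaf numbers, as in the text)."""
    shifted = list(bank_positions_mm)
    for leaf in leaves:
        shifted[leaf - 1] += error_mm
    return shifted

# Hypothetical 60-leaf bank with every leaf at -10.0 mm; B = 2 mm.
bank = [-10.0] * 60
perturbed = apply_leaf_error(bank, 2.0)
print(perturbed[28:32])  # [-10.0, -8.0, -8.0, -10.0]
```

Repeating the call for every control point of every beam would propagate the error through the whole plan; only the two selected leaves change, and the input list is left untouched.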

2.B.3 | Delivered MU errors
A third common type of error is the accelerator output error. This error can be introduced by changing the number of MUs for each beam in the plan's DICOM RT files.
Here, C is the percentage linac output error, which was set to 3, 3.25, 3.5, 4, and 5% in this study.
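The output error can be illustrated with a short sketch. It assumes that each beam's MU is multiplied by (1 + C/100); that scaling form, the function name `scale_mu`, and the example MU values are assumptions made for illustration.

```python
def scale_mu(beam_mus, error_percent):
    """Scale every beam's monitor units by (1 + C/100)."""
    factor = 1.0 + error_percent / 100.0
    return [mu * factor for mu in beam_mus]

# Hypothetical MUs for three beams, with C = 3%.
print(scale_mu([120.0, 95.5, 143.2], 3.0))
```

Because every beam is scaled by the same factor, the recalculated dose differs from the original by approximately C% throughout the irradiated volume.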
Based on the above assumptions, a virtual "RT delivery with errors," shown in Fig [...]
In clinical practice, the 2D %GP criterion of 3%/3 mm (DD%/DTA mm) has been commonly recommended and routinely applied for IMRT QA. 1 Nevertheless, a 3% dose output tolerance and a 2-mm mechanical accuracy are recommended for daily QA using conventional IMRT machines, and a stricter tolerance is recommended for stereotactic body radiation therapy (SBRT). 13 Therefore, the 3%/2 mm criterion is usually used as a more restrictive %GP criterion, and the gamma criterion of 2%/1 mm has been recommended for detecting MLC shift misalignments in SBRT QA. 14
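To make the DD%/DTA criteria concrete, the following sketch computes a gamma passing rate by brute force for two dose arrays on the same grid. It is not the software used in this study. It assumes global normalisation to the maximum reference dose, an isotropic grid, no sub-voxel interpolation, and a 10% low-dose cutoff; the function name `gamma_pass_rate` and the optional `roi` mask are hypothetical.

```python
import itertools
import numpy as np

def gamma_pass_rate(ref, ev, spacing_mm, dd_percent, dta_mm,
                    cutoff_percent=10.0, roi=None):
    """Percentage of evaluated points with gamma <= 1 (global normalisation).

    ref, ev : dose arrays (2D or 3D) on the same isotropic grid
    roi     : optional boolean mask restricting the evaluation to a structure
    """
    ref = np.asarray(ref, dtype=float)
    ev = np.asarray(ev, dtype=float)
    dd_abs = dd_percent / 100.0 * ref.max()    # DD expressed as absolute dose
    reach = int(np.ceil(dta_mm / spacing_mm))  # farther offsets cannot give gamma <= 1
    padded = np.pad(ev, reach, constant_values=np.inf)
    gamma_sq = np.full(ref.shape, np.inf)
    for offset in itertools.product(range(-reach, reach + 1), repeat=ref.ndim):
        dist_sq = sum((o * spacing_mm) ** 2 for o in offset)
        window = tuple(slice(reach + o, reach + o + n)
                       for o, n in zip(offset, ref.shape))
        dose_sq = (padded[window] - ref) ** 2
        gamma_sq = np.minimum(gamma_sq,
                              dist_sq / dta_mm ** 2 + dose_sq / dd_abs ** 2)
    mask = ref >= cutoff_percent / 100.0 * ref.max()
    if roi is not None:
        mask &= np.asarray(roi, dtype=bool)
    if not mask.any():
        raise ValueError("no points to evaluate")
    return 100.0 * np.count_nonzero(gamma_sq[mask] <= 1.0) / np.count_nonzero(mask)

# Hypothetical uniform dose plane scaled by 2% and by 5%, judged at 3%/3 mm.
ref = np.full((6, 6), 2.0)
print(gamma_pass_rate(ref, ref * 1.02, 2.5, 3.0, 3.0))  # 100.0
print(gamma_pass_rate(ref, ref * 1.05, 2.5, 3.0, 3.0))  # 0.0
```

In a uniform region the distance term cannot compensate for a dose offset, so a scaling error passes while it stays below DD and fails once it exceeds DD. The same function accepts 3D arrays, and a structure mask passed as `roi` gives an ROI-specific passing rate.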

2.C.3 | Correlation analysis between the 2D and 3D %GP
The correlation between the 2D and 3D %GPs was statistically evaluated by analyzing the Pearson correlation coefficient using the SPSS software V18.0 and pair plots. A Pearson's r value greater than 0.8 in conjunction with a P value of less than 0.05 in the significance test was considered to indicate a strong correlation.
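The Pearson coefficient used here can be reproduced outside SPSS with a few lines of code. The sketch below computes r and the associated t statistic (n − 2 degrees of freedom), from which a statistics package would derive the P value. The function names and the paired %GP values are hypothetical and are not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r != 0 with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical paired %GPs (illustrative only).
gp_2d = [99.1, 97.4, 95.0, 90.2, 84.7, 78.3]
gp_3d = [99.5, 98.0, 94.1, 88.9, 82.0, 74.6]
r = pearson_r(gp_2d, gp_3d)
print(round(r, 3), r > 0.8, round(t_statistic(r, len(gp_2d)), 2))
```

With the threshold stated above, a pair of %GP series would be labelled strongly correlated when r exceeds 0.8 and the P value derived from the t statistic is below 0.05.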

3.A | 2D vs 3D %GP analyses
The 2D and 3D %GPs were compared with three different criteria of 2%/1 mm, 3%/2 mm, and 3%/3 mm. The average results of the %GPs for the body, PTV, and OARs for the three types of error are shown in Tables 1-3. The results are summarized as follows.

3.A.1 | MLC bank sag error detection sensitivities
For the MLC bank sag errors evaluated with the 3%/3 mm criterion, the 2D %GP was lower than the 3D %GP in the whole body (P < 0.05); however, no significant difference between the two was [...] Fig. 3(b)]. In contrast, when a stricter criterion of 2%/1 mm was used, the 3D %GP was significantly lower than the 2D %GP (P < 0.05), as shown in Table 1, and had a steeper gradient, as shown in Fig. 4, indicating that the PTV-specific 3D %GPs with the stricter criteria were also more sensitive to the MLC sag error.

3.A.2 | MLC leaf traveling error detection sensitivities
Because the simulated MLC leaf traveling errors were introduced to only the two central leaves, which are unlikely to affect the dose accuracy in the OARs above or below the PTV level, the OAR-specific %GPs were not assessed for this type of error. The results of the %GP evaluation for the whole body and the PTV-specific region are shown in Table 2 and Fig. 5. As shown in Fig. 5, the %GP gradient analysis supported the above results with a larger absolute slope in the PTV-specific %GP than that of the whole body and a steeper decline in the PTV-specific 3D %GP than the PTV-specific 2D %GP.
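The %GP gradient referred to above can be read as the slope of %GP against error magnitude. The sketch below assumes an ordinary least-squares line fit, which is an assumption because the fitting method is not stated in this section. The function name `gp_gradient` and the %GP values are hypothetical; the error levels are the tested traveling errors.

```python
def gp_gradient(error_levels, gp_values):
    """Least-squares slope of %GP versus error magnitude."""
    n = len(error_levels)
    mean_e = sum(error_levels) / n
    mean_g = sum(gp_values) / n
    num = sum((e - mean_e) * (g - mean_g)
              for e, g in zip(error_levels, gp_values))
    den = sum((e - mean_e) ** 2 for e in error_levels)
    return num / den

# Tested leaf traveling errors (mm) with hypothetical PTV-specific %GPs.
errors_mm = [1.5, 2.0, 3.0, 4.0, 5.0]
gp_ptv = [98.0, 96.1, 91.0, 86.4, 81.2]
print(round(gp_gradient(errors_mm, gp_ptv), 2))
```

A more negative slope means that the passing rate falls faster per millimeter of error, which is the sense in which one evaluation is described as more sensitive than another.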

3.A.3 | MU error detection sensitivities
For different levels of MU errors, the PTV-specific %GPs were very sensitive with criteria corresponding to the output error level in both the 2D and 3D evaluations. The %GP decreased considerably when the MU error surpassed the DD criterion of the gamma evaluation, and the 3D %GP declined more abruptly than the 2D %GP in the PTV-specific evaluation. However, in the gamma evaluations for the whole body and the bladder, the %GP decreases were more gradual than that for the PTV, and the average 2D %GP was [...] Table 3 and Fig. 6. For this type of error, the PTV-specific 3D %GP had the largest absolute slope.
As shown in Table 4, the correlations between the 2D and 3D %GPs for the OARs were weaker than those for the whole body and the PTV. For the bladder and rectum, the correlations between the 2D and 3D %GPs at all the tested criteria were weak (r ≤ 0.8).
FIG. 7. Scatter plot of 2D %GP vs 3D %GP for (a) the whole body and (b) the PTV with different errors and gamma criteria (2%/1 mm, 3%/2 mm, 3%/3 mm; 10% threshold cutoff).

4 | DISCUSSION
Another issue addressed in our study is the correlation between the 2D and 3D %GPs. A significant strong correlation was observed between them for the PTV and whole-body gamma evaluations (Pearson r > 0.8 and P < 0.05). Similar results were reported by Wu et al., 21 who observed a significant statistical correlation between 3D and 2D global (body area) %GPs in their investigation of two IMRT QA methods.

5 | CONCLUSIONS
In this work, we investigated the sensitivities of 2D and 3D %GPs in detecting three typical delivery errors, as well as the correlation between the two types of %GP. For the whole-body gamma evaluation, both the 2D and 3D %GPs with the commonly used criterion of 3%/3 mm were inadequate to discover small MLC sag and leaf traveling errors. The PTV-specific 3D %GP evaluations with the stricter criterion were more sensitive in detecting these types of MLC errors. A corresponding dose difference criterion is needed to detect MU errors using %GP, and PTV-specific analysis is more sensitive to this type of error than whole-body assessment.

AVAILABILITY OF DATA AND MATERIALS
All data in this study have been recorded at the Research Data Deposit website (RDD, https://www.researchdata.org.cn) for future reference (number RDDA2017000326) and are available upon request.

CONFLICT OF INTEREST
No conflicts of interest.