On the need for tuning the dosimetric leaf gap for stereotactic treatment plans in the Eclipse treatment planning system

Abstract The dosimetric leaf gap (DLG) and tongue‐and‐groove (T&G) effects are critical aspects in the modeling of multileaf collimators (MLC) in the treatment planning system (TPS). In this study, we investigated the dosimetric impact of limitations associated with the T&G modeling in stereotactic plans and its relationship with the need for tuning the DLG in the Eclipse TPS. Measurements were carried out using Varian TrueBeam STx systems from two different institutions. Test fields presenting MLC patterns with several MLC gap sizes (meanGap) and different amounts of T&G effect (TGi) were first evaluated. Secondly, dynamic conformal arc (DCA) and volumetric modulated arc therapy (VMAT) deliveries of stereotactic cases were analyzed in terms of meanGap and TGi. Two DLG values were used in the TPS: the measured DLG (DLGmeas) and an optimal DLG (DLGopt). Measured and calculated doses were compared according to dose differences and gamma passing rates (GPR) with strict local gamma criteria of 2%/2 mm. The discrepancies were analyzed for DLGmeas and DLGopt, and their relationships with both TGi and meanGap were investigated. DCA arcs involved significantly lower TGi and larger meanGap than VMAT arcs (P < 0.0001). By using DLGmeas in the TPS, the dose discrepancies increased as TGi increased and meanGap decreased for both test fields and clinical plans. Dose discrepancies dramatically increased with the ratio TGi/meanGap. Adjusting the DLG value was then required to achieve acceptable calculations and configuring the TPS with DLGopt led to an excellent agreement with median GPRs (2%/2 mm) > 99% for both institutions. We also showed that DLGopt could be obtained from the results of the test fields. We demonstrated that the need for tuning the DLG is due to the limitations of the T&G modeling in the Eclipse TPS. A set of sweeping gap tests modified to incorporate T&G effects can be used to determine the optimal DLG value.


1 | INTRODUCTION
Stereotactic body radiation therapy (SBRT) and stereotactic radiosurgery (SRS) treatments are particularly valuable modalities for treating relatively small lesions with high delivered doses. Stereotactic treatments generally use different delivery techniques: the most popular are dynamic conformal arc (DCA) and volumetric modulated arc therapy (VMAT). Some SBRT protocols, such as RTOG 0236 1 and 0813 2 required a minimum field size, encouraging multiple static beams or DCA. This requirement may be difficult to fulfill with VMAT as multileaf collimator (MLC) apertures do not strictly follow the projection of the planning target volume (PTV). Thus, VMAT arcs may lead to small MLC gaps. Nevertheless, the use of VMAT in SRS and SBRT is becoming increasingly widespread. 3 Since the target volumes are typically small, so are the radiation field sizes involved. This can be challenging for the accuracy of the treatment planning system (TPS) calculations. 4 Hence, the ICRU 91 report 5 recently recommended rigorous testing of the TPS dose calculation accuracy in stereotactic treatments because lesions can be in proximity to vital sensitive structures.
Dose calculation accuracy is known to be affected by inappropriate handling of simplifications in the TPS algorithms and models. 6-8 For rounded leaf-end MLC systems, the Eclipse TPS requires the user to input two MLC configuration parameters: the MLC transmission ratio and the dosimetric leaf gap (DLG). Some studies 9 have found good agreement between calculated and delivered doses by using the DLG measured with sweeping gap tests 10,11 or the dynamic chair test. 12 However, other authors 13-15 found substantial discrepancies and reported the need to tune the DLG value configured in the Eclipse TPS. Kielar et al. 13 observed discrepancies of around 5% between calculated and measured doses for Varian's high-definition multileaf collimator (HDMLC), which were greatly reduced by increasing the DLG entered into the TPS by more than 1 mm.
Another important characteristic that can affect dose calculation accuracy is the tongue-and-groove (T&G) modeling. Many MLC models have a T&G design, in which the sides of adjacent leaves interlock in order to reduce interleaf transmission. However, this configuration produces underdosage between adjacent leaf pairs in asynchronous MLC movements due to the additional shielding by the tongue of opposing leaf sides during treatment delivery. 16 This underdosage is known as the T&G effect and it can significantly change the dose distribution. 17 In arc treatments, T&G effects are typically smoothed out by the gantry rotation, but they can still reduce average doses by up to 5%-7%. 18 In a recent study 19

2.A | MLC model in Eclipse
Only two parameters of the MLC model in Eclipse are user configurable, namely the DLG and the MLC transmission. The TPS uses a single value for MLC transmission, which is the average radiation transmitted through the leaves. Regarding the leaf tip, Eclipse accounts for the increased transmission through the leaf tip by applying a shift to the leaf-end position which amounts to half the DLG value introduced during configuration. Therefore, doses are calculated with an effective gap larger than the nominal gap by a distance equal to the DLG. The procedure recommended by the vendor 11 to determine the DLG is the sweeping gap test as initially introduced by LoSasso et al. 10 For that purpose, the vendor supplies DICOM files implementing the tests that can be readily imported into the TPS. Concerning the modeling of the T&G, Eclipse extends the leaf projections in the direction perpendicular to the leaf motion by a certain tongue width, which is subtracted from the delivered fluence. Thus, the field size in the direction of leaf movements is enlarged by the DLG, while in the perpendicular direction it is reduced due to the tongue width by 0.625 mm. 11,20 This last value is fixed and unmodifiable by the user.
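The leaf-end shift described above can be illustrated with a short sketch. It only encodes the statement that each leaf end is shifted by half the DLG, so that the effective gap exceeds the nominal gap by the DLG. The function names are hypothetical and the code is not part of Eclipse; the example values (0.3 mm and 1.1 mm) are the DLG settings reported in this study for one institution.

```python
def effective_gap_mm(nominal_gap_mm, dlg_mm):
    """Effective gap after shifting each of the two leaf ends by DLG/2."""
    leaf_end_shift_mm = dlg_mm / 2.0
    return nominal_gap_mm + 2.0 * leaf_end_shift_mm


def relative_gap_change(nominal_gap_mm, dlg_old_mm, dlg_new_mm):
    """Fractional change in effective gap when the configured DLG is changed."""
    old = effective_gap_mm(nominal_gap_mm, dlg_old_mm)
    new = effective_gap_mm(nominal_gap_mm, dlg_new_mm)
    return (new - old) / old


if __name__ == "__main__":
    # Illustrative nominal gaps; the DLG change has more weight for small gaps.
    for gap in (5.0, 10.0, 30.0):
        change = relative_gap_change(gap, 0.3, 1.1)
        print(f"gap {gap:4.1f} mm: effective gap changes by {100 * change:.1f}%")
```

Because the DLG is added to the gap, a given change in DLG represents a larger fraction of the effective aperture for small gaps than for large ones, which is consistent with the gap dependence discussed in this work.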
In this study, two DLG values were assessed: the "measured DLG" (DLG meas ) and the "optimal DLG" (DLG opt ). DLG meas was obtained with the standard sweeping gap test. In contrast, DLG opt was determined by both institutions during the SRS/SBRT commissioning process and was defined as the value producing the best agreement between calculations and measurements for a set of stereotactic clinical plans. To that end, the DLG parameter was increased iteratively until optimal QA results were achieved according to the stereotactic QA program of each institution.
The original sweeping gap test involves uniformly extended leaves for different MLC gaps without exposing any of the leaf sides and consequently without any T&G effect. In a previous work, 19 we designed a set of test fields based on the sweeping gap test that incorporated well-defined amounts of T&G by applying different shifts to the adjacent leaves. These test fields were: (a) the asynchronous sweeping gap (aSG) for sliding window beams and (b) the asynchronous oscillating sweeping gap (aOSG) for VMAT arcs. The detailed characteristics of these test fields are given in Hernandez et al. 19 and similar tests have also been proposed by other investigators. 21,22 For each beam of the aSG and aOSG tests, a tongue-and-groove index (TGi) was defined as the quotient of the distance between adjacent leaf ends "s" and the MLC gap size (meanGap).
Dose measurements were performed with a PTW ion chamber model T31013. This chamber is smaller than the typical Farmer chamber used for measuring the DLG, 23 but its active length (16 mm) still spanned several leaves, providing an estimate of the average impact of the T&G effect. 19 The chamber was positioned at the isocenter, at 10 cm depth, in a water phantom for the aSG test and in a cylindrical phantom for the aOSG test. Chamber readings were corrected for daily output variations.
The expanded measurement uncertainty U 24 was calculated for a confidence interval of one standard deviation. Measured doses D meas were compared to the calculated doses D calc in the sensitive volume of the chamber for both DLG settings, and dose differences were evaluated as (D calc − D meas )/D meas . Dose deviations for both tests were investigated with respect to meanGap, TGi, and the ratio TGi/meanGap. In addition, a DLG value, denoted DLG minΔD , was calculated as the value required to compensate for the dose discrepancies obtained from test fields for a particular meanGap and TGi.
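The quantities used to analyze the test fields can be written out as a minimal sketch. It follows the verbal definitions given in the text: TGi as the quotient of the leaf-end distance s and the gap, and the dose difference as (D calc − D meas)/D meas. The function names and the numerical values in the example are hypothetical and are not measurements from this study.

```python
def tongue_and_groove_index(s_mm, gap_mm):
    """TGi: distance between adjacent leaf ends divided by the MLC gap size."""
    if gap_mm <= 0:
        raise ValueError("gap_mm must be positive")
    return s_mm / gap_mm


def relative_dose_difference(d_calc, d_meas):
    """Dose difference as defined in the text: (D_calc - D_meas) / D_meas."""
    if d_meas == 0:
        raise ValueError("d_meas must be non-zero")
    return (d_calc - d_meas) / d_meas


if __name__ == "__main__":
    # Hypothetical beam: 10 mm gap with adjacent leaves shifted by 5 mm.
    gap_mm, s_mm = 10.0, 5.0
    tgi = tongue_and_groove_index(s_mm, gap_mm)
    ratio = tgi / gap_mm  # TGi/meanGap, in 1/mm
    diff = relative_dose_difference(d_calc=0.96, d_meas=1.00)
    print(f"TGi = {tgi:.2f}, TGi/meanGap = {ratio:.3f} 1/mm, dose diff = {100 * diff:.1f}%")
```

With this sign convention, a calculated dose lower than the measured dose gives a negative difference.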

2.C | Clinical plans
Five SRS brain and five SBRT lung patients were randomly chosen.
The cases included small and large volumes, with PTV volumes ranging from 2 to 40 cc for brain cases (mean volume of 21.5 cc) and from 8.6 to 81 cc for lung cases (mean volume of 26.1 cc). For each patient, a DCA and a VMAT plan were optimized according to our institution's dosimetric clinical guidelines. The two modalities were selected to highlight potential differences between VMAT and DCA deliveries. Hence, a total of 20 plans (10 DCA and 10 VMAT) were generated. VMAT plans used 4 arcs and DCA plans used between 3 and 5 arcs depending on the complexity of the case. The same plans were used in both institutions, and dose calculations were performed with both DLG meas and DLG opt using the same MLC patterns and the same number of MUs.
[...] gamma) and provides the percentage of points that pass the criteria within the volume defined by a given isodose level. All GPRs were calculated with the 2%/2 mm local gamma criteria with a threshold of 10%. Moreover, the 3D gamma in Verisoft was analyzed for higher thresholds (30%, 80%, and 95%). The standard local and global 3%/3 mm gamma criteria were also recorded, although they were deemed inappropriate for stereotactic treatments, which require stricter criteria. Additionally, the qualitative agreement of line profiles and the doses at the isocenter were analyzed. The cal[...] only for s < gap. 19
The relationships between the dose agreement (in terms of GPR and dose differences at the isocenter) and both meanGap and TGi were investigated. To that end, results were reported with respect to: (a) meanGap, (b) TGi, and (c) TGi/meanGap.
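To make the 2%/2 mm local gamma criterion with a 10% threshold concrete, the following is a simplified one-dimensional sketch. It is not the Verisoft implementation, which performs the analysis on measured 2D/3D distributions. The function name is hypothetical, the measured profile is taken as the reference (an assumption), both profiles are assumed to share the same sampling positions, and no interpolation between points is performed.

```python
import math


def local_gamma_passing_rate(positions_mm, d_meas, d_calc,
                             dose_tol=0.02, dta_mm=2.0, threshold=0.10):
    """Percentage of measured points with gamma <= 1, using local dose normalisation.

    Points whose measured dose is below `threshold` times the maximum
    measured dose are excluded from the analysis.
    """
    d_max = max(d_meas)
    if d_max <= 0:
        raise ValueError("measured profile must contain positive doses")
    cutoff = threshold * d_max
    evaluated = passed = 0
    for x_m, dm in zip(positions_mm, d_meas):
        if dm < cutoff:
            continue  # below the low-dose threshold
        evaluated += 1
        # Local criterion: the dose tolerance scales with the local measured dose.
        gamma_sq = min(
            ((x_c - x_m) / dta_mm) ** 2 + ((dc - dm) / (dose_tol * dm)) ** 2
            for x_c, dc in zip(positions_mm, d_calc)
        )
        if math.sqrt(gamma_sq) <= 1.0:
            passed += 1
    if evaluated == 0:
        raise ValueError("no points above the threshold")
    return 100.0 * passed / evaluated


if __name__ == "__main__":
    # Hypothetical flat profile sampled every 1 mm; calculation 1% low everywhere.
    x = [float(i) for i in range(11)]
    measured = [1.0] * 11
    calculated = [0.99] * 11
    print(local_gamma_passing_rate(x, measured, calculated))
```

With local normalisation, a given absolute dose error is penalized more heavily in low-dose regions than with global normalisation, which is why the local 2%/2 mm criterion is described as strict.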

3 | RESULTS
Using the standard sweeping gap tests, DLG meas was found to be 0.3 mm (institution A) and 0.4 mm (institution B), in agreement with the value obtained with the dynamic chair method. The MLC transmission ratio was 1.25%. Both institutions independently determined a DLG opt of 1.1 mm, each using its own set of stereotactic clinical plans. As already mentioned, this DLG opt was obtained during SRS/SBRT commissioning by iteratively varying the DLG parameter in the Eclipse TPS until the best match was reached between measured and calculated dose distributions for the plans considered.

3.A | Test fields (aSG and aOSG tests)
3.A.1 | Analysis of test fields
Figure 1 illustrates the variation in dose difference for the aOSG test as a function of TGi and of the ratio TGi/meanGap for both DLG settings. As shown in Fig. 1(a), the dose differences increased as TGi increased and as the MLC gap decreased. In the absence of T&G, better agreement was found with DLG meas . This was expected, since the DLG was measured with sweeping gaps without T&G. Nevertheless, as the T&G effect increased, the discrepancies with DLG meas increased. Some dose differences exceeded 5% and reached 8% for TGi = 1 and the smallest gap. In the presence of T&G effects, the agreement clearly improved with DLG opt . With TGi = 0.25, the dose differences were reduced from 2-4% to nearly 0%, and with a higher value of TGi = 0.5 they decreased from 3-6% to 1%. It should be noted that the dose differences for DLG opt were less dependent on the gap size than for DLG meas . Indeed, the curves related to DLG opt for gaps of 10, 20, and 30 mm almost overlapped.
Dose differences clearly depended on the ratio TGi/meanGap for DLG meas [Fig. 1(b)], with a strong linear behavior (r² = 0.883). Discrepancies were largest for the largest ratios, which correspond to large T&G and small MLC gaps. Tuning the DLG partially compensated for these discrepancies, which were noticeably reduced by using DLG opt [see Fig. 1(a)]. Similar results were obtained for the aSG and aOSG tests in both institutions. Both tests were also carried out for the 6 MV FFF, 10 MV FFF, and 10 MV WFF energies; the results showed the same behavior as in Fig. 1 and are provided as Supporting Information in Fig. S1. The uncertainty U of the ion chamber measurement was estimated to be less than 0.5%.

3.A.2 | Determination of the DLG minΔD
From the results obtained for test fields with DLG meas [Fig. 1(a)], the value DLG minΔD that minimizes dose discrepancies between measurements and calculations can be calculated. Let us consider a dose difference ΔD between the measured and calculated doses for an asynchronous sweeping gap field with a particular MLC gap size "gap" and a representative "TGi": [...] where D calc (TGi, gap = 30 mm) and D calc (TGi, gap) are the calculated doses for an MLC gap of 30 mm and the representative "gap," respectively, corresponding to a particular "TGi." The DLG minΔD value that minimizes dose discrepancies can then be easily obtained as [...]
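As an illustration only, and not as the expression used in this work, the following sketch shows one plausible way to estimate a compensating DLG from the quantities named above. It assumes that the calculated dose of a sweeping gap field varies linearly with the effective gap (nominal gap plus DLG), so that the dose per millimetre of gap can be estimated from the calculated doses at the representative gap and at 30 mm. The function name, its arguments, and the example numbers are hypothetical.

```python
def dlg_min_delta_d(dlg_meas_mm, gap_mm, d_meas, d_calc_gap, d_calc_30mm,
                    ref_gap_mm=30.0):
    """Estimate the DLG that would cancel the dose discrepancy at a given gap.

    Assumption (not taken from the source): calculated dose is linear in the
    effective gap, so a DLG change of delta mm changes the dose by
    delta * dose_per_mm.
    """
    if gap_mm == ref_gap_mm:
        raise ValueError("gap_mm must differ from the reference gap")
    # Dose per millimetre of gap, from the two calculated doses at the same TGi.
    dose_per_mm = (d_calc_30mm - d_calc_gap) / (ref_gap_mm - gap_mm)
    if dose_per_mm <= 0:
        raise ValueError("calculated dose is expected to increase with gap size")
    # Shift the DLG by the amount needed to raise (or lower) D_calc to D_meas.
    return dlg_meas_mm + (d_meas - d_calc_gap) / dose_per_mm


if __name__ == "__main__":
    # Synthetic check: doses generated as 1.0 * (gap + DLG), arbitrary units.
    d_calc_10 = 10.0 + 0.3   # calculated with DLG = 0.3 mm
    d_calc_30 = 30.0 + 0.3
    d_meas_10 = 10.0 + 1.1   # "measurement" equivalent to DLG = 1.1 mm
    print(round(dlg_min_delta_d(0.3, 10.0, d_meas_10, d_calc_10, d_calc_30), 3))
```

In this synthetic case the estimate recovers the DLG used to generate the "measured" dose, which is only a consistency check of the linear assumption and not a validation against measured data.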

3.B | Clinical plans
All DCA plans exhibited excellent agreement for both DLG meas and DLG opt. GPRs (2%/2 mm) were close to 100% for both institutions [...] Table S1. DLG meas produced excessively low calculated doses and the difference in the dose at the isocenter was reduced from 4% to 1.5% when DLG opt was used [see Fig. 2

3.B.2 | Relationship between plan analysis and dose agreement
The GPRs (2%/2 mm) obtained for DCA and VMAT arcs with the 1000 SRS detector are given in Fig. 5. [...] For VMAT, in contrast, the drop in GPR was evident as TGi/meanGap increased, similarly to the aSG and aOSG tests. Thus, for all VMAT arcs, the larger the TGi/meanGap ratio, the lower the GPR. In particular, for brain cases, VMAT arcs involved high TGi and small meanGap, producing the highest TGi/meanGap ratios and the lowest GPRs.
As previously noted, excellent agreement was obtained in all cases with calculations performed with DLG opt [Fig. 5(d)]. All VMAT arcs presented GPR ≥ 98% regardless of their meanGap and TGi, which means that tuning the DLG effectively compensated for those limitations. A few DCA plans yielded slightly lower GPRs, but they were still in very good agreement (GPR > 97%).
Similar results were obtained for 6 MV FFF, with slightly better GPRs but exactly the same trends, and are provided as Supporting Information in Fig. S2.

4 | DISCUSSION
The use of VMAT in SBRT and SRS treatments is rapidly increasing 27-29 as a result of its dosimetric advantages over DCA. 30 We found that increasing DLG meas in Eclipse by 0.7-0.8 mm greatly reduced discrepancies, producing very good agreement between calculations and measurements (median GPRs > 99%) for VMAT arcs with a stringent local gamma criteria of 2%

FIG. 5. Local GPR for 2%/2 mm obtained with 4D Octavius and 1000 SRS for dynamic conformal arc (DCA) and VMAT arcs for 6 MV WFF. Results obtained with the measured DLG are given as a function of (a) meanGap, (b) TGi and (c) TGi/meanGap. Results with the optimal DLG are shown in (d).

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of the article.