Dosimetric impact and detectability of multi‐leaf collimator positioning errors on Varian Halcyon

Abstract The purpose of this study is to investigate the dosimetric impact of multi‐leaf collimator (MLC) positioning errors on a Varian Halcyon for both random and systematic errors, and to evaluate the effectiveness of portal dosimetry quality assurance in catching clinically significant changes caused by these errors. Both random and systematic errors were purposely added to 11 physician‐approved head and neck volumetric modulated arc therapy (VMAT) treatment plans, yielding a total of 99 unique plans. Plans were then delivered on a preclinical Varian Halcyon linear accelerator and the fluence was captured by an opposed portal dosimeter. When comparing dose–volume histogram (DVH) values of plans with introduced MLC errors to known good plans, clinically significant changes to target structures quickly emerged for plans with systematic errors, while random errors caused less change. For both error types, the magnitude of clinically significant changes increased as error size increased. Portal dosimetry was able to detect all systematic errors, while random errors of ±5 mm or less were unlikely to be detected. Best detection of clinically significant errors, while minimizing false positives, was achieved by following the recommendations of AAPM TG‐218. Furthermore, high‐ to moderate correlation was found between dose DVH metrics for normal tissues surrounding the target and portal dosimetry pass rates. Therefore, it may be concluded that portal dosimetry on the Halcyon is robust enough to detect errors in MLC positioning before they introduce clinically significant changes to VMAT treatment plans.

which have recommended best practices to avoid or minimize the consequences of these errors. 6,7 In particular, some studies have compared dosimetric differences in plans with random MLC leaf errors (where leaves are shifted by some random amount within provided parameters) with differences in plans with systematic MLC leaf errors (where leaves are shifted by identical amounts). Although both error classes have been shown to unfavorably impact treatment plans, systematic errors are reported to be more significant than random errors of the same magnitude. 1,2 However, all such studies to date have used a linear accelerator with a single-layer MLC design, in which the MLC commonly supplements or replaces the upper or lower collimation jaws, although various MLC designs exist. 6,8 These variations in MLC design and geometry introduce potential uncertainty, which may be further compounded by such variables as leaf size and design, the introduction or removal of a field penumbra, or restricted treatment modalities. 2 The recently introduced Halcyon linear accelerator instead uses a jaw-free, dual-layer MLC design (Varian Medical Systems, Palo Alto, CA, USA). Thus, existing knowledge may not be directly translatable to the unique conditions that the Halcyon provides. In particular, the dual-layer MLC design greatly reduces the inter-leaf dose leakage and low-dose spillage that may have contributed to the overall dosimetric changes reported by previous studies. 9-11 The purpose of the current study was to evaluate the dosimetric impact of both random and systematic errors in the leaf positions of Halcyon's dual-layer MLC, to examine the correlation between clinically significant dosimetric changes and QA pass rates, and to estimate our ability to consistently detect these errors with portal dosimetry before the errors become clinically significant. This is the first study of its kind to examine such an impact from dual-layer MLC errors.

2.A | Linear accelerator
A preclinical version of the Halcyon was used for the current study.
The Halcyon does not use moving jaws for beam collimation; instead, initial collimation is performed by fixed primary and secondary collimators and the beam is further shaped by a novel two-layered MLC system (Fig. 1).

2.B | Patient data
The current study used 11 physician-approved head and neck VMAT plans for treatments delivered at Penn Medicine Center at the University of Pennsylvania. Four of these plans contained two treatment arcs, four contained three arcs, and the other three contained four arcs. The median dose per fraction was 2 Gy (range 1.8-2.12 Gy), and the median number of treatment fractions was 30 (range 14-35). All treatment plans for Halcyon include one or more imaging fields. 12 These were unchanged and did not contribute significantly to the current study. All data were obtained with the appropriate institutional review board approvals and data transfer agreements.
All patient plans contained the following physician-contoured clinical structures: brainstem, eyes, lenses, right parotid, spinal cord and expanded spinal cord, and high-risk planning target volume (PTV) structures, of which all but eyes and lenses were selected (Table 1). When applicable, low- and mid-risk PTVs were also selected. Other structures included optic chiasm, optic nerves, left parotid, and submandibular glands. Because these structures were not consistently contoured for all patients, they were not directly considered in the current study.

2.C | Error simulation
In-house software was used to introduce various controlled errors within each treatment plan. Plans were exported in DICOM format from the treatment planning system to a local directory. Pydicom was then used to read and edit leaf position sequences directly and create a new treatment plan associated with the corresponding original, unmodified treatment plan. 13 Four magnitudes of random error were simulated by adding an array containing a random distribution uniformly sampled within ±3, ±5, ±7, or ±10 mm to each control point. These values were selected based on our own preliminary work using the same Halcyon device, which showed that smaller errors did not have a noticeable impact. This has also been reported for random errors up to 2 mm. 2 To avoid potential biases or unintentional duplication, a new random distribution was generated for each treatment arc and each plan, so that identical random distributions were never added to any two arcs or plans. In contrast, identical symmetric shifts of 3, 5, 7, or 10 mm were added to MLC leaf positions to simulate four degrees of systematic error, where all leaves were shifted by equal amounts in the same direction relative to the beam isocenter. Repeating both processes for all 11 original plans provided a total of 99 plan variants: 11 unmodified, 44 with random errors, and 44 with systematic errors. After errors were added, appropriate adjustments were made to avoid overlapping leaves, dynamic leaf gaps, or leaves being shifted outside the collimator boundaries. When errors were added to treatment plans, the average MLC leaf shift and its standard deviation for each iteration were automatically recorded by the in-house software, and these records were later reviewed and validated to ensure that they were within expected values.
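The two error models described above can be illustrated with a minimal sketch. This is not the in-house software: it assumes each control point is simply a list of leaf positions in millimeters, and the function names, the toy positions, and the 140 mm travel limit are hypothetical placeholders. Reading and writing the DICOM plan itself, and the corrections for leaf overlap, are omitted.

```python
import random
import statistics


def add_random_errors(control_points, max_error_mm, rng):
    """Shift each leaf by an independent uniform draw in [-max_error_mm, +max_error_mm]."""
    return [
        [pos + rng.uniform(-max_error_mm, max_error_mm) for pos in leaves]
        for leaves in control_points
    ]


def add_systematic_error(control_points, shift_mm):
    """Shift every leaf by the same amount in the same direction."""
    return [[pos + shift_mm for pos in leaves] for leaves in control_points]


def clamp_to_limits(control_points, limit_mm):
    """Keep leaves inside an assumed symmetric travel limit (placeholder value)."""
    return [
        [max(-limit_mm, min(limit_mm, pos)) for pos in leaves]
        for leaves in control_points
    ]


def shift_summary(original, modified):
    """Mean and standard deviation of the applied leaf shifts, for later review."""
    shifts = [
        m - o
        for leaves_o, leaves_m in zip(original, modified)
        for o, m in zip(leaves_o, leaves_m)
    ]
    return statistics.mean(shifts), statistics.pstdev(shifts)


# Toy plan: two control points with four leaf positions each (mm, hypothetical).
original = [[-20.0, -15.0, 15.0, 20.0], [-18.0, -12.0, 12.0, 18.0]]

# A fresh generator per arc and per plan avoids reusing a random distribution.
rng = random.Random(0)
random_plan = clamp_to_limits(add_random_errors(original, 3.0, rng), 140.0)
systematic_plan = clamp_to_limits(add_systematic_error(original, 3.0), 140.0)

random_mean, random_sd = shift_summary(original, random_plan)
systematic_mean, systematic_sd = shift_summary(original, systematic_plan)
```

For the systematic case the recorded mean shift equals the applied shift and the standard deviation is zero, which is the kind of check the recorded statistics make possible.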
Once all 99 plan variants were generated, they were imported into a research version of Eclipse and the dose distribution was recalculated for all plans. A QA verification plan was then generated for each treatment plan and later delivered on the Halcyon through the Treatment Mode workspace and captured using the Halcyon's opposing portal dosimetry device. Initially, each verification plan was delivered twice, and individual fractions were compared for variations, but after analysis showed no noticeable differences between fractions, the remaining plans were delivered once.

2.D | Portal image detection
The Portal Dosimetry workspace (Varian Medical Systems) was used to perform gamma analysis as described in the literature. 14 For the current study, the low-dose threshold was set to 10% and no region-of-interest was set (vendor default settings). For each of the 11 patients, the predicted fluence of that patient's unmodified plan was taken as the reference to accurately simulate the effect of MLC positioning errors during beam delivery.
Gamma evaluation was performed with a combination of two absolute dose difference criteria, 3% and 2%, as well as three distance-to-agreement criteria, 3, 2, and 1 mm, for a total of four different indices: 3%/3 mm (in common use), 3%/2 mm (proposed by American Association of Physicists in Medicine Task Group 218 [TG-218]), 2%/2 mm, and 2%/1 mm. 15 Because all plans contained at least two treatment arcs, each plan's overall agreement was taken as the mean of the agreement values for each arc, where perfect agreement was 100% and complete disagreement was 0%, within the provided evaluation criteria.
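To make the gamma criteria concrete, the following is a simplified one-dimensional sketch of a global gamma pass rate. It is not the algorithm implemented in the Portal Dosimetry workspace: it assumes a brute-force search over sample points with no interpolation, a dose tolerance expressed as a percentage of the reference maximum, and a toy profile; the function name and parameters are hypothetical.

```python
def gamma_pass_rate(reference, measured, spacing_mm, dose_pct, dta_mm,
                    threshold_pct=10.0):
    """Percentage of reference points with gamma <= 1 (global normalization).

    Points below threshold_pct of the reference maximum are excluded,
    mirroring a low-dose threshold.
    """
    ref_max = max(reference)
    if ref_max <= 0:
        raise ValueError("reference profile must contain positive values")
    dose_tol = dose_pct / 100.0 * ref_max
    cutoff = threshold_pct / 100.0 * ref_max

    evaluated = 0
    passed = 0
    for i, ref_value in enumerate(reference):
        if ref_value < cutoff:
            continue
        # Squared gamma: minimum combined distance and dose deviation.
        gamma_sq = min(
            ((j - i) * spacing_mm / dta_mm) ** 2
            + ((meas_value - ref_value) / dose_tol) ** 2
            for j, meas_value in enumerate(measured)
        )
        evaluated += 1
        if gamma_sq <= 1.0:
            passed += 1
    return 100.0 * passed / evaluated


# Toy profile sampled every 1 mm (arbitrary units, hypothetical values).
reference = [0.0, 0.0, 10.0, 50.0, 100.0, 100.0, 100.0, 50.0, 10.0, 0.0, 0.0]
scaled = [1.5 * value for value in reference]

identical_rate = gamma_pass_rate(reference, reference, 1.0, 3.0, 2.0)
scaled_rate = gamma_pass_rate(reference, scaled, 1.0, 3.0, 2.0)
```

Tightening either the dose difference or the distance-to-agreement value shrinks the acceptance region around each reference point, which is why the four indices respond differently to the same delivered error.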
To determine whether a given plan had passed or failed portal dosimetry QA, two different criteria were examined:
1. If the overall percent agreement was 95% or higher, the plan passed; otherwise, it failed.
2. If the overall percent agreement of a modified plan was no more than 2% below the lowest mean percent agreement of the unmodified plans, the plan passed. For example, if the lowest pixel agreement of any of the 11 unmodified treatment plans was 93% for a 3%/2 mm evaluation, any plan with 91% or higher overall agreement passed. This approach was used to more closely examine the effect of adding MLC errors on the QA results beyond the unmodified ("error-free") plan.
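The two pass/fail criteria, together with the per-plan averaging over arcs, can be sketched as follows. The function names and the agreement values are hypothetical; only the 95% threshold, the 2% margin, and the 93% to 91% example come from the text.

```python
import statistics


def overall_agreement(arc_agreements_pct):
    """Plan-level agreement: mean of the per-arc gamma agreement values."""
    return statistics.mean(arc_agreements_pct)


def passes_fixed(agreement_pct, threshold_pct=95.0):
    """Criterion 1: pass when overall agreement is at least 95%."""
    return agreement_pct >= threshold_pct


def passes_relative(agreement_pct, unmodified_agreements_pct, margin_pct=2.0):
    """Criterion 2: pass when no more than 2% below the worst unmodified plan."""
    floor = min(unmodified_agreements_pct) - margin_pct
    return agreement_pct >= floor


# Hypothetical agreement values for the unmodified plans at one gamma index.
unmodified = [93.0, 97.5, 99.0]

# A modified plan with two arcs (hypothetical per-arc agreement values).
plan_agreement = overall_agreement([90.0, 93.0])

fixed_result = passes_fixed(plan_agreement)
relative_result = passes_relative(plan_agreement, unmodified)
```

In this example the plan averages 91.5%, so it fails the fixed 95% criterion but passes the relative criterion, whose floor is 91%.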

2.E | Dose-volume histogram metric evaluation and normalization
To evaluate the impact of the MLC positioning errors on the dose delivered to the patient, DVH metrics for all normal structures were calculated using Eclipse for all 99 treatment plans, and text files containing these distributions were exported. In-house software was then used to extract dose and volume coverage metrics for the normal tissue structures. Additionally, values for the volume of tissue receiving 98% and 95% of the prescription dose (V98% and V95%) were extracted for high-, mid-, and low-risk PTVs. The dose to normal tissues for each patient was normalized to the dose delivered by that patient's unmodified treatment plan. The volume coverage of normal tissues for each patient was measured in cubic centimeters, and volume coverage for PTVs was measured as a nonnormalized percentage. Notable DVH metrics for the primary normal structures we considered are shown in Table 1, and clinically significant errors were defined as a >5% change to these metrics.
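The normalization and the >5% significance test described above can be sketched as below. This assumes that the 5% change is taken relative to the value in the unmodified plan, which the text does not state explicitly; the function names and dose values are hypothetical.

```python
def normalized_metric(modified_value, unmodified_value):
    """Express a DVH metric relative to the same metric in the unmodified plan."""
    return modified_value / unmodified_value


def is_clinically_significant(modified_value, unmodified_value, tolerance=0.05):
    """Flag a change larger than 5% of the unmodified value (assumed relative)."""
    relative_change = abs(modified_value - unmodified_value) / unmodified_value
    return relative_change > tolerance


# Hypothetical maximum dose (Gy) to one normal structure for a single patient.
unmodified_dose = 45.0
small_error_dose = 46.0   # about 2.2% higher than the unmodified plan
large_error_dose = 47.7   # 6% higher than the unmodified plan

small_ratio = normalized_metric(small_error_dose, unmodified_dose)
small_flag = is_clinically_significant(small_error_dose, unmodified_dose)
large_flag = is_clinically_significant(large_error_dose, unmodified_dose)
```

Normalizing per patient lets metrics from plans with different prescriptions be pooled when pass rates are computed across the cohort.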

2.F | DVH metric correlation to portal dosimetry pass rate
Pearson's correlation coefficients (r-values) and P-values were used to determine the correlation between clinically significant changes to normal structures nearby or surrounding the target and the portal dosimetry pass rate of all plans at all gamma indices. High- and

Systematic errors also had a greater effect than random errors on PTV volume coverage evaluated at 95% and 98% of the prescription dose (Fig. 5). High-risk PTV was more substantially impacted than mid- and low-risk PTVs. Mean volume coverage trended downward as the magnitude of errors increased for both random and systematic errors.

3.B | Clinical impact
Systematic errors rapidly introduced clinically significant changes to the DVH metrics listed in Table 1. Using pass/fail criteria where plans with >5% change to DVH metrics ("clinically significant errors") fail evaluation, most plans with systematic errors of 5 mm or more failed (Fig. 6). Conversely, although random errors also caused some clinically significant changes, the effect was less pronounced. Most plans with random errors did not exhibit a clinically significant change to PTV coverage, with the pass rate plateauing at 91% for errors between 3 and 10 mm. The pass rate for dose metrics decreased fairly linearly for plans with random errors; all 3 mm error plans and 27% of 10 mm error plans passed.
Correlation varied by structure and DVH metric type as well as between random and systematic error. Volume metrics had no meaningful correlation except for those to the PTV in plans with systematic errors, which were moderately correlated with QA pass rates.
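For reference, the Pearson coefficient used for these comparisons can be computed as in the sketch below. The data are hypothetical and the helper name is illustrative. Only r is computed here; a P-value would normally come from a statistics package, for example `scipy.stats.pearsonr`.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: normalized dose metric for one structure versus
# the portal dosimetry agreement (%) of the same plans.
normalized_dose = [1.00, 1.02, 1.05, 1.09, 1.15]
gamma_agreement = [99.0, 97.0, 93.0, 88.0, 80.0]

r_value = pearson_r(normalized_dose, gamma_agreement)
```

A strongly negative r in such a comparison would indicate that plans with larger dose increases to the structure also show lower gamma agreement.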

4 | DISCUSSION
Our results showed that the clinical impact of MLC positioning errors depends on both the type and the magnitude of the error. Systematic errors were more clinically significant than random errors whose maximum value was of similar size ("random errors of equal magnitude") when examining changes to DVH metrics, a phenomenon previously reported in the literature. 1,2 Although dose metrics were affected fairly linearly by both, the percentage of plans with unacceptable changes to PTV coverage remained constant for all random errors (Fig. 5).
The clinical impact of systematic errors was not only consistently greater than that of random errors of the same magnitude but also often greater than that of random errors of larger magnitude. This is

The first column for each index is the passing rate where "passing" is defined as ≥95% mean pixel agreement with the predicted dose. The second column for each index is the passing rate where "passing" is defined as mean pixel agreement no more than 2% below the minimum mean pixel agreement of the unmodified plans.
TABLE 3 Pearson correlation coefficients (r) and P values for the target volume and nearby normal structures.

There are a few limitations associated with the current study.
The research version of Eclipse for the preclinical Halcyon unit did not support delivering treatment plans with fully independent leaf motion; instead, the distal MLC layer was slaved to the proximal layer. Distal leaves were never allowed to protrude into the beam or be closer than 0.1 mm to the beam edge, which was determined solely by the edges of the proximal leaves. Therefore, errors were simulated only in the proximal layer leaf positions, and then distal leaves were adjusted to ensure that they were within the requirements imposed by Eclipse. This constraint has been removed in the clinical versions of Eclipse, which now allow fully independent motion of proximal and distal leaves.
The relatively small patient cohort and restriction to VMAT head and neck treatment plans somewhat limit the range of the study. However, we believe that the dataset of 99 unique plans largely offsets the effect of a small cohort. A previously conducted study for 168 VMAT lung treatment plans provided additional confidence that the Halcyon's portal dosimetry is able to catch errors occurring in various treatment sites before they become clinically significant. 19
Another limitation of this study is that we did not thoroughly examine the impact of MLC errors smaller than 3 mm. This minimum was originally determined based on preliminary experiments with the Halcyon, which showed that errors smaller than 3 mm did not have a noticeable impact. Because systematic error plans always failed portal dosimetry analysis, even for the smallest error considered (3 mm), additional experiments would be needed to determine the impact of errors smaller than 3 mm and whether there is a range of systematic errors that do not significantly affect patient dose. We acknowledge that very large systematic errors, such as the 10 mm error used in the current study, represent a worst-case scenario and are highly unlikely to occur in actual practice, especially considering the Machine Performance Check QA that must always be performed for the Halcyon. 20 These large errors were included to provide contrast with the 10 mm random error and to illustrate the continual deterioration of dosimetric metrics as systematic errors increase.

FIG. 8. Portal dosimetry pass rates for plans at indicated gamma indices, using the two pass/fail criteria described in Methods. Points represent mean pass rate for all plans at the indicated multi-leaf collimator error. Regardless of pass/fail criteria, systematic errors rapidly cause plans to fail gamma evaluation, while failure for plans with random errors depends more on gamma indices and evaluation criteria. (a) Overall pass rate for the first pass/fail criterion (i.e., ≥95% mean pixel agreement with predicted dose to pass). (b) Pass rate for the same plans when using the second pass/fail criterion (i.e., mean pixel agreement no more than 2% below the minimum mean pixel agreement of the unmodified plans to pass).

5 | CONCLUSION
While MLC positioning errors in any treatment plan delivered on the Halcyon are undesirable and should be minimized when possible, the type and magnitude of errors can greatly impact the clinical significance of these errors. Systematic errors consistently introduced more clinically significant changes to dose delivered to normal tissues and to target volume coverage than did random errors of the same magnitude. However, although systematic errors introduced greater dosimetric changes, they were also relatively easy to detect with portal dosimetry. Random errors were less likely to introduce clinically significant changes but were also more difficult to detect.
Correct detection of clinically significant errors with portal dosimetry depends greatly on the choice of evaluation criteria. The 3%/2 mm gamma criterion recommended by TG-218 also proved appropriate for this dual-layer MLC system.