Quantifying the performance of in vivo portal dosimetry in detecting four types of treatment parameter variations

Purpose: To quantify the ability of electronic portal imaging device (EPID) dosimetry used during treatment (in vivo) to detect variations that can occur in the course of patient treatment. Methods: Images of transmitted radiation from in vivo EPID measurements were converted to a 2D planar dose at isocenter and compared to the treatment planning dose using a prototype software system. Using the treatment planning system (TPS), four different types of variability were modeled: overall dose scaling, shifts in the positions of the multileaf collimator (MLC) leaves, shifts in the patient position, and changes in the patient body contour. The gamma pass rate was calculated for the modified and unmodified plans and used to construct a receiver operating characteristic (ROC) curve to assess the detectability of the different parameter variations. The detectability is given by the area under the ROC curve (AUC). The TPS was also used to calculate the impact of the variations on the target dose-volume histogram. Results: Nine intensity modulated radiation therapy plans, comprising 70 separate fields, were measured for four different anatomical sites. Results show that in vivo EPID dosimetry was most sensitive to variations in the machine output, AUC = 0.70−0.94, changes in patient body habitus, AUC = 0.67−0.88, and systematic shifts in the MLC bank positions, AUC = 0.59−0.82. These deviations are expected to have a relatively small clinical impact [planning target volume (PTV) D99 change <7%]. Larger variations have even higher detectability. Displacements in the patient's position and random variations in MLC leaf positions were not readily detectable, AUC < 0.64. The D99 of the PTV changed by up to 57% for the patient position shifts considered here. Conclusions: In vivo EPID dosimetry is able to detect relatively small variations in overall dose, systematic shifts of the MLCs, and changes in the patient habitus. Shifts in the patient's position, which can introduce large changes in the target dose coverage, were not readily detected. © 2015 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution 3.0 Unported License. [http://dx.doi.org/10.1118/1.4935093]


INTRODUCTION
Intensity modulated radiation therapy (IMRT) and volumetric arc therapy (VMAT) are complex treatment techniques that require a patient specific quality assurance (QA) measurement. This QA is most often done prior to treatment using a phantom of simple geometry. 1 The pretreatment approach is able to detect errors prior to the initiation of treatment; however, errors related to the patient or errors that occur during the actual treatment delivery will escape detection, limiting the QA's overall effectiveness. 2,3 Measurements can also be made in vivo (i.e., during the treatment delivery) and can detect errors related to the patient. It is recommended by the International Atomic Energy Agency (IAEA) that in vivo dosimetry be used in standard practice. 4 Point-based in vivo measurements 6,7 only provide dose information for a single point on the surface of the patient. This information is insufficient, particularly for IMRT and VMAT treatments where the delivered fluence is highly modulated, motivating the need for 2D or 3D dose information.
Another device that can be used for dosimetry is the electronic portal imaging device (EPID), which has been used for both pretreatment and in vivo QA. 8 EPID-based in vivo dosimetry can potentially overcome the limitations of point-based in vivo dosimetry since it is able to reconstruct a 2D plane or 3D dose distribution within the patient. EPID-based portal dosimetry also has the practical benefit that it potentially does not require phantom equipment or the placement of a point detector, which can be cumbersome and can introduce inaccuracies. 2,17-21 Other studies have suggested that in vivo EPID is a potentially beneficial QA method, 22 and a recent review of an incident database suggests that in vivo dosimetry performed during the first fraction of treatment has the potential to detect the majority of clinically reported incidents. 23 However, to our knowledge, no study has appeared to date which quantifies the sensitivity of in vivo EPID measurements to potential errors or unintended variations in treatment delivery. The goal of this study therefore is to determine the ability of the EPID to detect variations when used in vivo. We use in vivo patient data to explore variations of four different parameters related to patient and treatment machine characteristics. We employ a receiver operating characteristic (ROC) analysis as an endpoint. This methodology is similar to the work of Carlone et al., 24 which used several plans to perform a ROC analysis for multileaf collimator (MLC) positional errors. We also investigate the correlation between the detectability of these parameter variations and dose difference metrics.

2.A. EPID commissioning and measurements
The EPID used is an amorphous silicon flat panel imager (iView GT, Elekta, Crawley, UK). 26 During patient treatment, the EPID is extracted and positioned beyond the isocenter at a source to detector distance (SDD) of 160 cm. 29 In order to use image data to reconstruct the dose within the patient, the response of the EPID, scatter contributions from within the EPID and patient, and attenuation must all be accounted for. The response of the EPID is calculated by taking an image of the entire EPID panel, with no patient present, to calibrate for variations in pixel intensity and nonflatness of the field. After this correction is applied, the EPID image is related to dose by comparing pixel intensity to ion chamber data taken at the same SDD. Scatter is modeled by scatter kernels, whose parameters are fitted during commissioning using EPID images of radiation transmitted through a solid water phantom and ion chamber readings made in solid water phantoms. Primary radiation, originating directly from the radiation head, and scattered radiation are found for various thicknesses of solid water and field sizes. Primary radiation is back projected to a given plane inside the patient, correcting for scatter using the kernels, for the inverse square law, and for attenuation using data from the planning CT.
The software used here was developed as an in-house system at the Netherlands Cancer Institute (NKI), and a more detailed description of the algorithm is presented in Refs. 17 and 29. It is important to note that the model parameters for dose backprojection are based on commissioning done with solid water and will not account for tissue inhomogeneities accurately. For treatment sites with inhomogeneities present, the "in aqua vivo" correction was used. 18 The resultant reconstructed dose is a 2D plane at isocenter, perpendicular to the beam direction, and is assessed by comparing it to the treatment planning dose. All plans are generated using the treatment planning system (version 9.8, Philips Radiation Oncology Systems, Fitchburg, WI) calculated on a 2 × 2 × 2 mm dose grid. The dose comparison is performed with the gamma method of Low et al. 30 The most common planar gamma criteria of 3%/3 mm DTA (Ref. 31) were used and only dose values above 20% of the maximum were included. For the sample of treatment plans analyzed, the gamma analysis was performed for each field.
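The 3%/3 mm gamma comparison with a 20% low-dose cutoff can be illustrated with a minimal sketch. This is not the NKI software: it assumes a global dose criterion normalized to the planned maximum, an isotropic pixel spacing, and a brute-force search on the pixel grid without interpolation. The function name `gamma_pass_rate` and all parameter names are hypothetical.

```python
import numpy as np

def gamma_pass_rate(planned, measured, spacing_mm,
                    dose_frac=0.03, dta_mm=3.0, cutoff_frac=0.20):
    """Percentage of planned-dose pixels with gamma <= 1 (global criterion).

    Assumptions (not from the source): both planes share one grid with
    isotropic spacing, the dose criterion is dose_frac * max(planned),
    and the search is restricted to grid points (no interpolation).
    """
    planned = np.asarray(planned, dtype=float)
    measured = np.asarray(measured, dtype=float)
    dose_crit = dose_frac * planned.max()
    reach = int(np.ceil(dta_mm / spacing_mm))  # search radius in pixels
    # Pad with +inf so out-of-field neighbours can never produce a match.
    padded = np.pad(measured, reach, mode="constant", constant_values=np.inf)
    rows, cols = planned.shape
    gamma_sq = np.full(planned.shape, np.inf)
    for dy in range(-reach, reach + 1):
        for dx in range(-reach, reach + 1):
            dist_sq = (dy * dy + dx * dx) * spacing_mm ** 2
            if dist_sq > dta_mm ** 2:
                continue  # beyond the DTA radius gamma exceeds 1 anyway
            shifted = padded[reach + dy:reach + dy + rows,
                             reach + dx:reach + dx + cols]
            term = dist_sq / dta_mm ** 2 + ((shifted - planned) / dose_crit) ** 2
            gamma_sq = np.minimum(gamma_sq, term)
    mask = planned > cutoff_frac * planned.max()  # ignore low-dose pixels
    return 100.0 * float(np.mean(gamma_sq[mask] <= 1.0))

# Usage: a uniform plane compared against a copy scaled by 2%.
plane = np.full((20, 20), 100.0)
rate = gamma_pass_rate(plane, 1.02 * plane, spacing_mm=1.0)
```

Restricting the search to the DTA radius is sufficient for a pass/fail decision, because any point farther away already has a distance term above one.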
Separate commissioning was done for each energy and linac used. Commissioning was performed with solid water with thickness ranging from 4 to 32 cm, and the beams delivered were 200 monitor units (MUs) with square field sizes of 2 × 2 to 23 × 23 cm². After commissioning, validation measurements were taken on solid water phantoms. Validation of the gamma pass rates and dose at isocenter was done with various combinations of solid water thickness and field size as well as half beam block fields, off axis fields, sliding window IMRT fields, and conformal and IMRT patient plans. Validations of the in aqua vivo correction were done with an anthropomorphic lung phantom. For all validation tests, the gamma pass rate was greater than 95%. These validations ensure that there are no large biases in the backprojection model which would skew the ROC curves. For a validation pass rate lower than 100%, small biases may still be present; however, they will not be observable because variations in the gamma pass rates introduce noise in the ROC curves.
Data were collected for three energies (6, 10, and 18 MV) on four different treatment machines (Synergy with MLCi2 and Infinity with Agility MLC, Elekta, Crawley, UK). Before in vivo data were acquired, a backprojection model was commissioned for each energy on each machine. In vivo data were acquired for a single treatment fraction for each candidate patient. Candidate treatment plans included IMRT and conformal plans with field sizes less than 26 × 26 cm. Excluded plans were those with wedged fields, which were not commissioned, and those with couch rotations larger than 15°, which could cause a collision with the EPID panel.

2.B. Patient data
In vivo patient data were collected for 32 patients for a total of 131 separate fields under an institutional review board-approved protocol. The full data set was used to determine the agreement between planned and reconstructed doses at isocenter. A subset of plans was used to perform the ROC analysis reported here. IMRT plans were selected for this purpose because the goal was to evaluate the potential impact of in vivo EPID measurement as a tool for patient specific QA. In total, nine IMRT plans were analyzed: four lung treatments, two liver, two spinal, and one head and neck treatment. The IMRT plans consisted of a total of 70 separate fields. All plans employed a step-and-shoot technique.

2.C. Simulation of treatment variability
In order to model variability and possible errors that can occur, modifications were introduced in the treatment plan. For each patient, new plans were generated by adjusting the plan parameters to simulate a variation. For this study, both machine-related and patient-related variations were considered and include the following:
• variations of the machine output, scaling the MU by ±3% and ±6%;
• Gaussian noise added to MLC leaf positions with a standard deviation of 0.5, 1, and 2 mm;
• systematic shifting of all MLC positions by ±1 and ±2 mm;
• patient shifts of 5 and 10 mm in each of the three cardinal directions; and
• expansion and contraction of patient's body contour by 5 and 10 mm to model patient weight changes.
For each error, only a single parameter was adjusted, keeping all others fixed. For errors related to the MLC positions, the positions of closed leaves were not altered. When adding Gaussian noise to the MLC leaf positions, a new random set of positions was produced for each standard deviation used.
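The two MLC variations described above can be sketched as follows. This is an illustration only, not the procedure used in the TPS: the function `perturb_mlc`, the representation of a field as two arrays of leaf-tip positions, and the sign convention (a positive systematic shift opens the gap between banks) are all assumptions. Leaf collisions after perturbation are not handled.

```python
import numpy as np

def perturb_mlc(left, right, shift_mm=0.0, sigma_mm=0.0, rng=None):
    """Return perturbed copies of opposing leaf-tip positions (in mm).

    Hypothetical convention: left[i] <= right[i] along the leaf travel axis,
    and a leaf pair is 'closed' when left[i] == right[i]. Closed pairs are
    left untouched, as stated in the text.
    """
    rng = np.random.default_rng() if rng is None else rng
    new_left = np.asarray(left, dtype=float).copy()
    new_right = np.asarray(right, dtype=float).copy()
    open_pairs = new_right > new_left
    n_open = int(open_pairs.sum())
    # Systematic shift: every open leaf in a bank moves by the same amount.
    new_left[open_pairs] -= shift_mm
    new_right[open_pairs] += shift_mm
    # Random variation: independent Gaussian noise on every open leaf.
    if sigma_mm > 0:
        new_left[open_pairs] += rng.normal(0.0, sigma_mm, n_open)
        new_right[open_pairs] += rng.normal(0.0, sigma_mm, n_open)
    return new_left, new_right

# Usage: a 1 mm opening of both banks; the middle (closed) pair is unchanged.
shifted_left, shifted_right = perturb_mlc([-10.0, 0.0, -5.0],
                                          [10.0, 0.0, 5.0], shift_mm=1.0)
```

Passing a fresh or differently seeded `rng` for each standard deviation mirrors the statement that a new random set of positions was produced for each sigma.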
For the modification of the patient's contour, new contours were created by applying a uniform margin around the original full body contour used for the treatment planning.
In order to assess the effects of the plan modifications, the modified treatment plan dose was exported to the analysis software and the gamma analysis was performed.
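As a bookkeeping aid, the single-parameter scenarios can be enumerated in a few lines. The sketch below assumes that patient shifts and contour changes are applied with both signs, as stated in the Results; the labels and the function `build_scenarios` are hypothetical. Under that assumption the enumeration yields 27 scenarios, matching the number of ROC curves reported in the Results.

```python
from itertools import product

def build_scenarios():
    """List (parameter, value) pairs, one per simulated variation."""
    scenarios = []
    # Machine output: MU scaled by +/-3% and +/-6%.
    scenarios += [("mu_scale_percent", v) for v in (-6, -3, 3, 6)]
    # Random MLC noise: one scenario per standard deviation.
    scenarios += [("mlc_random_sigma_mm", s) for s in (0.5, 1, 2)]
    # Systematic MLC shifts of +/-1 and +/-2 mm.
    scenarios += [("mlc_systematic_shift_mm", v) for v in (-2, -1, 1, 2)]
    # Patient shifts along three axes (signs assumed from the Results).
    scenarios += [(f"patient_shift_{axis}_mm", v)
                  for axis, v in product(("AP", "LR", "SI"), (-10, -5, 5, 10))]
    # Body contour expansion (+) and contraction (-).
    scenarios += [("body_contour_margin_mm", v) for v in (-10, -5, 5, 10)]
    return scenarios

scenarios = build_scenarios()
```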

2.D. Changes in dose metric
For each simulated variation, the resultant DVH of the planning target volume (PTV) was calculated for comparison with the original treatment plan DVH. All DVHs were computed in the treatment planning system. The percent change in PTV D99 for each parameter was found for each patient plan and then averaged. For example, the change in D99 was computed for a 1 mm opening of the MLC banks for each patient and then the average change in D99 was determined.
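The averaging of the D99 change can be sketched as below. In the study the DVHs were computed in the TPS; this illustration instead assumes direct access to the PTV voxel doses as arrays with equal voxel volumes, so that D99 is the 1st percentile of the voxel doses. The function names are hypothetical.

```python
import numpy as np

def d99(ptv_voxel_doses):
    """Dose received by at least 99% of the PTV (equal voxel volumes assumed)."""
    return float(np.percentile(np.asarray(ptv_voxel_doses, dtype=float), 1.0))

def mean_d99_change(original_plans, modified_plans):
    """Mean and standard deviation of the percent D99 change over patients.

    Each argument is a sequence with one array of PTV voxel doses per patient,
    in matching order.
    """
    changes = [100.0 * (d99(mod) - d99(orig)) / d99(orig)
               for orig, mod in zip(original_plans, modified_plans)]
    return float(np.mean(changes)), float(np.std(changes))

# Usage: two synthetic patients whose doses are uniformly scaled by -3%.
originals = [np.linspace(50.0, 60.0, 101), np.linspace(40.0, 45.0, 101)]
modified = [0.97 * dose for dose in originals]
mean_change, sd_change = mean_d99_change(originals, modified)
```

The standard deviation returned here corresponds to the error bars described for the D99 versus AUC figure.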

2.E. ROC methodology
The ROC approach is widely used to quantify the value of diagnostic imaging tests and has also been used in radiation oncology to assess the ability of tests to detect errors. 24,29 In this paper, we employ ROC analysis to assess EPID data collected in vivo. In order to determine the sensitivity and specificity of the EPID system to different variations, the treatment fields were divided into two groups: (1) the unmodified group, consisting of plans used for treatment, and (2) the simulated modified group, consisting of plans with a parameter variation simulated in the treatment planning system (TPS). The gamma pass rate for fields with and without modifications was used to construct a ROC curve. As an example, Fig. 1 shows the distribution of gamma pass rates for unmodified plans and for plans where a shift of −2 mm to the MLC banks is applied. N is the number of plans that have a given gamma pass rate in a bin width of 0.5%. Also shown for illustrative purposes is a threshold of 90%. A detection of a modified plan, a true positive, is defined as a modified field with a gamma pass rate lower than a given threshold. Figure 1 illustrates the number of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) for a given threshold value. These values are used to quantify the sensitivity and specificity in the following manner:
• Sensitivity: the probability that the test is positive for modified plans, TP/(TP + FN).
• Specificity: the probability that the test is negative for unmodified plans, TN/(TN + FP).
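The definitions above can be made concrete for a single threshold. The sketch below is illustrative: the function name and the example pass rates are invented, and a field is flagged as positive when its gamma pass rate falls strictly below the threshold.

```python
import numpy as np

def sensitivity_specificity(unmodified, modified, threshold):
    """Sensitivity and specificity for one gamma pass rate threshold (%)."""
    unmodified = np.asarray(unmodified, dtype=float)
    modified = np.asarray(modified, dtype=float)
    tp = np.sum(modified < threshold)     # modified field flagged
    fn = np.sum(modified >= threshold)    # modified field missed
    fp = np.sum(unmodified < threshold)   # unmodified field flagged
    tn = np.sum(unmodified >= threshold)  # unmodified field accepted
    return float(tp / (tp + fn)), float(tn / (tn + fp))

# Usage with made-up pass rates and the illustrative 90% threshold.
sens, spec = sensitivity_specificity([98, 96, 92, 88], [95, 89, 85, 80], 90.0)
```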
The ROC curve is constructed by plotting the sensitivity vs 1 − specificity while varying the gamma pass rate threshold. The performance of the test is quantified by the area under the curve (AUC), and the ideal gamma pass rate threshold can be found from the point closest to (0.0, 1.0) in the ROC space. A more complete description of ROC analysis can be found in Refs. 32 and 33.
The distribution of gamma pass rates when combining the data from the 70 fields from the nine IMRT patients is used to create the ROC curve.
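A minimal sketch of the threshold sweep, the AUC, and the choice of the optimal threshold is given below. It assumes the pooled per-field gamma pass rates are available as two arrays; the function names are hypothetical, the AUC uses the trapezoidal rule, and this is not the code used in the study.

```python
import numpy as np

def roc_curve(unmodified, modified):
    """Sweep the gamma pass rate threshold; return thresholds, FPR, TPR."""
    unmodified = np.asarray(unmodified, dtype=float)
    modified = np.asarray(modified, dtype=float)
    thresholds = np.unique(np.concatenate([unmodified, modified]))
    # One extra threshold above every value so the curve ends at (1, 1).
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)
    tpr = np.array([np.mean(modified < t) for t in thresholds])    # sensitivity
    fpr = np.array([np.mean(unmodified < t) for t in thresholds])  # 1 - specificity
    return thresholds, fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve (FPR must be non-decreasing)."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def optimal_threshold(thresholds, fpr, tpr):
    """Threshold whose ROC point lies closest to the ideal corner (0, 1)."""
    return float(thresholds[np.argmin(fpr ** 2 + (1.0 - tpr) ** 2)])

# Usage with made-up pass rates for the two groups of fields.
thr, fpr, tpr = roc_curve([96, 97, 98], [80, 85, 90])
auc = area_under_curve(fpr, tpr)
best = optimal_threshold(thr, fpr, tpr)
```

An AUC of 0.5 corresponds to a test that cannot distinguish modified from unmodified fields, and an AUC of 1.0 to perfect separation.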

RESULTS
Comparing the TPS and reconstructed dose at isocenter for the full sample of 32 patients, the mean difference was −2.1% (σ = 3.1%). Pretreatment QA of each IMRT plan showed no bias, suggesting that there is a systematic bias in the backprojection algorithm. Prior to performing the ROC analysis of IMRT plans, the model was recommissioned, applying this difference as a renormalization factor to remove any bias.
The nine patients used for the ROC analysis are listed in Table I.
Figures 2-5 show the ROC curves for the different parameter variations. A curve is calculated for each magnitude and sign of a parameter variation, resulting in 27 curves in total. Figure 2 shows the ROC curves for scaling the monitor units. The overall number of monitor units in the prescription was scaled by ±3% and ±6%. AUC values are shown in Table II and are 0.70-0.77 for the 3% offsets and 0.92-0.94 for the 6% offsets.
Figure 3 shows the ROC curves for repositioning the MLC leaves by adding Gaussian noise with a sigma of 0.5, 1, and 2 mm (left) and applying a systematic shift of ±1 and ±2 mm to all of the MLC leaves (right). The AUC values for random shifts with sigma of 0.5, 1, and 2 mm are [...], respectively (see Table II). For systematic MLC shifts, the AUC is 0.59-0.63 for shifts of 1 mm and 0.79-0.82 for shifts of 2 mm. Figure 4 shows the ROC curves for applying shifts to the patient position in the anterior-posterior direction (left), the lateral right-left direction (center), and the superior-inferior direction (right). Shifts of ±5 and ±10 mm were applied in each direction. The AUC values are 0.54-0.56 for shifts of 5 mm and 0.58-0.61 for shifts of 10 mm.
Figure 5 shows the ROC curves for the expansion and contraction of the patient's body contour. A uniform margin of ±5 and ±10 mm was applied to the original full body contour. The AUC values are 0.67-0.70 for changes of 5 mm and 0.88 for changes of 10 mm.
A summary of the AUC for each ROC curve is shown in Table II. These results indicate that in vivo EPID dosimetry is most sensitive to variations in output and overall body size, but less sensitive to shifts of the patient. MLC leaf offsets are of intermediate detectability, but need to be systematic in nature to be easily detected.
The optimal gamma pass threshold, found as the point of the ROC curve closest to (0, 1.0), was between 94.5% and 96.5% for patient shifts, 95%-95.5% for random MLC shifts, 93%-95% for systematic MLC shifts, 86.5%-95% for MU scaling, and 84.5%-95% for the patient body contour change. For the MU scaling and patient contour change, the larger magnitude variation gives a lower optimal gamma pass threshold.
FIG. 3. ROC curve when adding random Gaussian noise to MLC leaf positions with sigma of 0.5, 1, and 2 mm (left). ROC curve for a systematic shift to MLC banks of ±1 and ±2 mm (right).
FIG. 4. ROC curve for shifts to patient position in the anterior-posterior direction (left). ROC curve for shifts in the lateral direction (center). ROC curve for shifts in superior-inferior direction (right). Shifts of ±5 and ±10 mm were applied in all directions.
Medical Physics, Vol. 42, No. 12, December 2015
To assess the impact of these various deviations, we examine the change in dosimetric parameters. Figure 6 shows the average change in the D99 value for the PTV versus the measured AUC. The change in D99 is the percentage change averaged over all IMRT plans used in the ROC analysis, and the error bar represents the standard deviation of the D99 change over all patients.

DISCUSSION
This study assesses the ability of EPID dosimetry to detect various types of variations when used during treatment (in vivo). Using the ROC methodology, our results show that in vivo EPID dosimetry was most sensitive to variations in overall dose (AUC = 0.70−0.94), changes in patient habitus (0.67-0.88), and systematic shifts in the MLC bank positions (0.59-0.82). Displacements in the patient's position and random variations in MLC leaf positions were not readily detectable (AUC < 0.64). The AUC for each parameter variation is quoted in Table II.
The detectability of a variation depends on whether the variation is systematic or random in nature. A systematic variation of a parameter changes the gamma value for each measurement point and impacts the overall gamma pass rate, whereas random changes in the transmitted radiation affect the gamma value for a limited number of measurement points and have a smaller impact on the gamma pass rate. For the random variations in the MLC leaves, the largest standard deviation considered was 2 mm, so it is not surprising that a test using 3%/3 mm was not sensitive to these errors.
FIG. 5. ROC curve for expanding and contracting the body contour of the patient by 5 and 10 mm.
Changes in the PTV dose metric, D99, caused by parameter variations are shown in Fig. 6. Horizontal lines of ±5% are also shown in the figure, a value which is chosen based on previous studies such as the work of Dische et al., which found that a 5% dose difference can have an effect on the tumor response or morbidity risk. 34 Variations in MU, patient habitus, and systematic shifts in the MLC banks that result in a large AUC have a change in D99 of less than 7%. Changes in D99 that are larger than this will be more detectable. For these types of variations, then, in vivo EPID dosimetry is an effective test. However, for patient shifts, the story is different. From Fig. 6, it can be appreciated that patient position variations can lead to changes in D99 much larger than 5%, but with a small AUC < 0.7. The largest shifts (superior-inferior) resulted in a D99 change of 57%. The conclusion is that patient position changes are not reliably detected by in vivo EPID dosimetry.
This study should be understood in the context of previously published studies. Most studies of EPID-based portal dosimetry have focused on operational aspects of the technique. 17-21 To our knowledge, however, no study to date has evaluated the sensitivity of in vivo portal dosimetry to various types of variations. One previous study 24 did examine the sensitivity of phantom-based IMRT QA tests using the ROC methodology. Though that study shares some similarities with the present report, it considered only one type of error (MLC position) in one disease site and used phantom-based measurements. Also similar is the recent study by Bedford et al. 35 that considered multiple sources of error (MU scaling, MLC calibration, and gantry angle errors). However, that study did not quantify the detectability of the test through ROC or other methodology and it was also phantom based. These previous studies therefore do not address the potential of QA as performed in vivo. The DVH data in the present study are consistent with the study of Nelms et al., 36 which evaluated plan pass rates in phantom-based QA. That study found no correlation between per beam planar gamma pass rates and changes in the DVH values for the clinical target volume (CTV) and organs, which is consistent with Fig. 6, where there is no correlation between the AUC and the dose difference metric D99.
There are several limitations to this study. The ROC curves were generated using patient plans of various different treatment sites to give an overall evaluation of the in vivo EPID dosimetry approach. The sensitivity and specificity in detecting different parameter variations can be dependent on treatment site, especially with respect to variations in the patient's position. In anatomical regions with few imaging features, variations in the patient's position will have little effect on the transmitted radiation. The majority of treatment plans used for this analysis were for soft tissue abdominal sites, and the EPID in vivo system was poor in detecting patient shifts. As a result, it is of interest to expand upon the patient numbers to perform further site specific studies; a treatment site such as head and neck may be more sensitive to patient positioning variations. Also, while this study was conducted for IMRT treatments, it is of interest to see if similar trends occur for VMAT treatments.

CONCLUSION
In vivo EPID dosimetry is sensitive to variations in dose, systematic shifts of the MLCs, and changes in the patient habitus. For these parameters, in vivo EPID dosimetry is an effective test. The variations considered introduced a change in D99 of less than 7%, with the largest changes in D99 having a large AUC. For variations larger than those considered in this study, the detectability will be even larger. For shifts in the patient's position, which can introduce large changes in D99, the in vivo EPID dosimetry system does not provide reliable detection. The sensitivity to patient position variations should be studied for different treatment sites, but the data here suggest that in vivo EPID dosimetry should not be used in isolation but is most effectively used in combination with image guidance.

F. 1 .
Distribution of gamma pass rates for unmodified plans (top panel) and a plan with a −2 mm shift in MLC banks applied (bottom panel).Vertical lines represent a gamma threshold of 90%, shown for display purposes.

F. 6 .
Average change in D 99 of the PTV vs AUC.Dashed horizontal lines at ±5%.Medical Physics, Vol.42, No. 12, December 2015 position changes are not reliably detected by in vivo EPID dosimetry.