Validation of the RayStation Monte Carlo dose calculation algorithm using realistic animal tissue phantoms

Abstract

Purpose: The aim of this study is to validate the RayStation Monte Carlo (MC) dose algorithm using animal tissue neck phantoms and a water breast phantom.

Methods: Three anthropomorphic phantoms were used in a clinical setting to test the RayStation MC dose algorithm. We used two real animal necks that were cut to a workable shape while frozen and then thawed before being CT scanned. We also made a patient breast phantom using a breast prosthesis filled with water and placed on a flat surface. Dose distributions in the animal and breast phantoms were measured using the MatriXX PT device.

Results: The measured doses to the neck and breast phantoms compared exceptionally well with doses calculated by the analytical pencil beam (APB) and MC algorithms. The comparisons between APB and MC dose calculations and MatriXX PT measurements yielded an average depth difference for best gamma agreement of <1 mm for the neck phantoms. For the breast phantom, better average gamma pass rates between measured and calculated dose distributions were observed for the MC algorithm than for the APB algorithm.

Conclusions: The MC dose calculations are more accurate than the APB calculations for the static phantom conditions we evaluated, especially in areas where the beam traverses significant inhomogeneous interfaces.

deposition of dose along straight lines traced from the source to the point of interest in the patient's body. The deficiencies of these algorithms were well understood, but they were the only option in the early days when computers were still very slow. These algorithms were replaced by analytical pencil beam (APB) dose calculation algorithms as described by Petti 7 and Hong. 8 The first releases of the RayStation treatment planning system employed an APB algorithm for PBS that was partly based on the algorithm described by Soukup and Fippel. 9 This APB dose engine divides the beam into many closely spaced mini-beams, called "pencil beams". 10  dose spots in the presence of a range shifter when large patient-to-range shifter air gaps must be used. The latter two issues are straightforward to address during validation using standard water phantom measurements, but validating the dose in tissue, that is, a realistic clinical situation, is much harder.
This MC dose engine can be used for dose calculations for PBS and scattering-based treatment deliveries. PBS delivery allows for inverse treatment planning techniques where the weights of a large number of candidate spots are determined using single-field optimization (SFO) or multifield optimization (MFO) strategies. 12 In RayStation, the Monte Carlo dose engine is not only used for final dose computation of a given spot distribution but may also be used in the optimization of the plan. This means that the MC algorithm can be used either to calculate the final dose from APB-optimized spot distributions, or for both optimization and final dose calculation. The latter is the ideal option, but it can be time-consuming, especially during the initial phases of the planning process. RaySearch Laboratories AB explains: 11 The MC engine can account for range shifters and apertures with arbitrary air gaps. A Class II transport algorithm is used for primary and secondary protons, while heavier secondary particles such as deuterons and alphas are transported only by taking energy loss into account using the Continuous Slowing Down Approximation (CSDA). ["Class II" methods classify interactions into "hard" and "soft" categories depending on energy: interactions causing energy loss above a specified threshold ("hard" interactions) have their delta rays explicitly modeled, whereas less-energetic interactions are summarized by sampling their condensed history (a statistical summary of multiple interactions).] 13 22 The need to validate and transition to RayStation's MC dose algorithm is therefore clear.
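The hard/soft split described above can be illustrated with a toy bookkeeping sketch. This is not RayStation's implementation: the function name, the threshold value, and the energy losses are all hypothetical, and a real Class II engine samples the condensed soft losses statistically rather than listing them one by one.

```python
def split_by_threshold(energy_losses_mev, threshold_mev):
    """Toy Class II bookkeeping (illustrative assumption, not vendor code).

    Losses above the threshold are kept as individual 'hard' events that
    would be modeled explicitly; losses at or below it are lumped into one
    condensed 'soft' total.
    """
    hard = [e for e in energy_losses_mev if e > threshold_mev]
    soft_total = sum(e for e in energy_losses_mev if e <= threshold_mev)
    return hard, soft_total


# Made-up energy losses (MeV) and a made-up 0.1 MeV threshold.
hard_events, soft_sum = split_by_threshold([0.01, 0.5, 0.02, 1.2], 0.1)
print(hard_events, round(soft_sum, 6))
```

The point of the split is cost: only the comparatively rare hard events need explicit tracking, while the many soft events are handled in bulk.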
Here we report on the more sophisticated clinical commissioning of the RayStation MC engine employing clinically realistic scenarios and accurate dose measurements in various anthropomorphic phantoms at multiple depths. We embarked on a series of experiments to validate the MC doses against doses measured in the near-reality phantoms for different geometries. Using animal tissues to validate dose calculations is a common method that has yielded good results, as described by Zheng, 23 Grassberger, 24 and Gurjar, 25 though most of this work was done for passively scattered or uniformly scanned proton beams. Our aim was to develop phantoms that can validate the calculated dose "inside" the phantom and not "on the other side," that is, a transmission-type measurement. In addition to this study, we also verified the accuracy of MC calculations for lung; these findings are being prepared for subsequent publication.
We note that similar work has recently been published regarding the open-source fast MC proton dose algorithm MCsquare. 26,27 In a recent experimental study involving a lateral profile measured in a water tank hosting a large lateral inhomogeneity, MCsquare was compared to an early research version of the RaySearch MC algorithm. It was found that both algorithms fared well in the study, but that RaySearch's MC had a 2.5% better passing rate for a 2%/2 mm gamma criterion. 28 With this experiment, we sought to build on this confidence in the reliability of this dose algorithm.
Specifically, for the safety and improvement of our clinical practice we sought to answer the questions, "Does the current RayStation MC algorithm accurately predict the dose to breast and head-and-neck sites?" and, "What is the magnitude of improvement that MC provides over APB for dose distribution and range accuracy in breast and head-and-neck sites?"

2 | METHODS

2.A | Dose validation phantoms
Beams were delivered using the IBA Universal Nozzle in a gantry treatment room at a proton center. The beams were delivered to the phantoms in a clinical setting, set up in the same manner that a patient is positioned for treatment. The lamb neck and breast phantoms are shown in Fig. 1. The deer neck was similar in appearance and treated in the same manner.

2.A.1 | Head and Neck (H&N) phantom
A lamb neck (Fig. 1) and a deer neck were used for the head-and-neck (H&N) phantoms. The necks were cut while frozen with a radial saw to have a flat surface just past the neck vertebrae. Both necks were then thawed and placed with their flat surfaces on 2-mm water-equivalent thick solid water slabs, which in turn were placed on the MatriXX PT (Ion Beam Applications S.A., Louvain-la-Neuve, Belgium) detector to measure the dose distal to, but in close proximity to, the neck vertebrae. More solid water slabs were inserted between the phantom and the MatriXX PT to make measurements at greater depths beyond the neck-solid water interface for each phantom.

2.A.2 | Breast
The breast phantom (   Care was taken to mark the phantoms for precise treatment planning and positioning in the proton beam. The dose grid resolution was set to 1 mm for all final dose computations, and the MC dose was computed to reach an average statistical uncertainty better than 0.5% for voxels with a dose higher than 50% of the maximum dose.
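The stopping criterion quoted above (average statistical uncertainty better than 0.5% for voxels above 50% of the maximum dose) can be sketched as follows. How RayStation defines and averages the per-voxel uncertainty is not stated in the text, so the simple mean of relative uncertainties used here is an assumption, as are the function name and the sample values.

```python
def mean_relative_uncertainty(dose, sigma, dose_fraction=0.5):
    """Mean of sigma/dose over voxels with dose above a fraction of the maximum.

    dose and sigma are flat sequences of per-voxel dose and absolute
    statistical uncertainty (same units). Averaging scheme is an assumption.
    """
    cutoff = dose_fraction * max(dose)
    rel = [s / d for d, s in zip(dose, sigma) if d > cutoff]
    if not rel:
        raise ValueError("no voxels above the dose cutoff")
    return sum(rel) / len(rel)


# Made-up voxel values: the low-dose voxel (0.5) is excluded from the average.
u = mean_relative_uncertainty([2.0, 1.5, 0.5], [0.008, 0.006, 0.1])
print(f"{100 * u:.2f}% -> criterion met: {u < 0.005}")
```

Restricting the average to the high-dose region keeps noisy low-dose voxels from dominating the figure of merit.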
The dose computation times required to calculate the final dose for each beam in this study using the MC and APB algorithms are listed in Table 1. The computations were done on a computer equipped with an Intel® Xeon CPU E5 v3 with a 2.3 GHz dual processor. All the beams were optimized using the APB algorithm only, since the purpose of the study was to compare the same beams, that is, identical spot distributions and spot doses. Re-optimizing the beams with the MC algorithm would have resulted in slightly different spot distributions and spot doses due to the subtle differences in the algorithms addressed in this work.

2.B.2 | Dosimetry
For the neck phantoms, the dose optimization targets were 1 × 5 × 7 cm³ volumes drawn in the solid water slabs at a depth posterior to the phantom-solid water interface (the green boxes shown in Fig. 2). Two different AP beams, F1 and F3 for the lamb and deer neck phantoms, respectively, were planned to deliver a deliberately nonuniform dose distribution beyond the animal tissue as shown in Fig. 2. We created plans using the APB algorithm for plan optimization. 30,31 The nonuniform dose distributions were accomplished by first overriding the material in the red volumes shown in Fig. 2, panels A and B, to water and optimizing the beam to deliver a uniform dose in the target (green boxes in Fig. 2). The material override was then removed and the dose distributions were recalculated, resulting in a nonuniform distribution due to the presence of the heterogeneities in the neck phantoms.

TABLE 1 Dose calculation times compared for the MC and APB dose algorithms using uniform dose calculation grids of 1 mm and 2 mm, that is, 1 mm³ and 8 mm³ voxels.

The treatment plan for the breast phantom, shown in Fig. 3, followed a similar approach to the neck phantoms except that we drew an irregular target volume crossing over into the solid water to enable calculating and measuring dose beyond the breast prosthesis.

We were primarily interested in the dose adjacent to the patient's breast tissue, that is, the rib dose, and at beam edges where the breast tissue forms a significant roll or other discrete soft tissue-to-air interface or an interface oblique to the beam. Dose was optimized for the red target volume shown in Fig. 1 (bottom right panel) and  The measurement depths (d_m) and solid water slabs used are tabulated in Table 2. The differences in the expected DICOM depth (d_e) and the depth of best gamma agreement (d_γ) for the 2D analyses are also listed.
The water-equivalent thickness ratio of the solid water material was 1.03. The phantom was aligned with the beam using the VeriSuite IGRT system (MedCom, Darmstadt, Germany) employing orthogonal X rays, exactly as is done for patients.
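As a worked example of the water-equivalent thickness ratio quoted above, the sketch below converts physical slab thicknesses to water-equivalent thickness. It assumes the ratio is defined as water-equivalent thickness divided by physical thickness; the function names and slab values are illustrative and are not taken from Table 2.

```python
WET_RATIO = 1.03  # solid water ratio from the text; definition assumed (WET / physical)


def water_equivalent_thickness(physical_mm, ratio=WET_RATIO):
    """Water-equivalent thickness (mm) of a slab of given physical thickness."""
    return physical_mm * ratio


def stack_wet(slabs_mm, ratio=WET_RATIO):
    """Total water-equivalent thickness (mm) of a stack of slabs."""
    return sum(water_equivalent_thickness(t, ratio) for t in slabs_mm)


# Hypothetical stack: one 10 mm and two 5 mm slabs of solid water.
print(f"{stack_wet([10.0, 5.0, 5.0]):.1f} mm water-equivalent")
```

Under this assumption, 20 mm of solid water shifts the measurement plane by about 20.6 mm in water-equivalent depth, which is why slab thickness and measurement depth are tabulated separately.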
The MatriXX PT detector is used daily for patient specific QA measurements and is cross calibrated regularly in the reference TRS398 calibration beam. 33 The TRS398 reference beam is calcu-

2.D | Data analysis
Two-dimensional (2D) and three-dimensional (3D) gamma analyses of absolute doses were performed with the measured doses as reference and the computed doses as comparison. Global gamma was considered, where the 100% level was defined as the maximum of the computed doses. 34 Gamma thresholds of 5% and 10% were used for the 2D and 3D analyses, respectively. This means that only product Compass from IBA Dosimetry and RaySearch. 32 In this gamma analysis, the sparsely measured data points were used as the reference dose with the computed 3D dose as evaluation, which is the converse of the 2D gamma analyses described above.
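The global gamma evaluation described above can be sketched in one dimension as follows. This is a minimal brute-force illustration, not the analysis software used in the study. The 2%/2 mm default criteria, the function name, and the choice to apply the low-dose threshold to the reference (measured) points are assumptions; the normalization to the maximum computed dose follows the text.

```python
import math


def gamma_pass_rate(ref_pos, ref_dose, eval_pos, eval_dose,
                    dose_crit=0.02, dist_crit_mm=2.0, threshold=0.05):
    """Percent of reference points with global gamma <= 1 (1D, brute force).

    ref_*  : measured positions (mm) and doses (the reference)
    eval_* : computed positions (mm) and doses (the evaluated distribution)
    """
    norm = max(eval_dose)          # global gamma: 100% = maximum computed dose
    dose_tol = dose_crit * norm    # absolute dose tolerance
    cutoff = threshold * norm      # ignore low-dose reference points (assumed)
    gammas = []
    for xr, dr in zip(ref_pos, ref_dose):
        if dr < cutoff:
            continue
        # Gamma is the minimum combined dose/distance metric over all evaluated points.
        g = min(math.sqrt(((xe - xr) / dist_crit_mm) ** 2 +
                          ((de - dr) / dose_tol) ** 2)
                for xe, de in zip(eval_pos, eval_dose))
        gammas.append(g)
    if not gammas:
        raise ValueError("no reference points above the dose threshold")
    return 100.0 * sum(g <= 1.0 for g in gammas) / len(gammas)


# Made-up profile: identical measured and computed doses pass everywhere.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
d = [10.0, 50.0, 100.0, 50.0, 10.0]
print(gamma_pass_rate(x, d, x, d))
```

A practical implementation would interpolate the evaluated dose between grid points and extend the search to 2D or 3D; the brute-force search here only shows how the dose and distance criteria combine.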

3.A | Dose calculation times
The dose calculation times required by the MC algorithm (t_MC) are significantly longer than for the APB algorithm (t_APB) and scale approximately as the inverse cube of the dose computation grid size. The ratio t_MC/t_APB is also listed in Table 1, showing that the MC calculation times are on average 5.7 (±1.5 SD) and 11.3 (±1.2 SD) times longer for beams using 2- and 1-mm grid spacings, respectively. It is expected that t_MC will be reduced significantly when GPU-based calculations become available in future releases of the RayStation software.
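The inverse-cube scaling mentioned above follows from the voxel count: halving the grid spacing in all three dimensions gives eight times as many voxels. The short sketch below only works through that arithmetic and the two mean ratios quoted in the text; the function name is illustrative, and no timing values beyond those quoted are assumed.

```python
def voxel_count_ratio(coarse_mm, fine_mm):
    """Factor by which the voxel count grows when refining an isotropic grid."""
    return (coarse_mm / fine_mm) ** 3


# 2 mm -> 1 mm grid: 8 mm^3 voxels become 1 mm^3 voxels.
print(voxel_count_ratio(2.0, 1.0))

# Mean t_MC / t_APB ratios quoted in the text for 2 mm and 1 mm grids.
ratio_2mm, ratio_1mm = 5.7, 11.3
print(round(ratio_1mm / ratio_2mm, 2))
```

Read this way, the relative cost of MC over APB roughly doubles when moving from the 2-mm grid to the 1-mm grid.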

3.B | Gamma analyses
The gamma passing rates for the animal neck phantoms and breast phantom are listed in Tables 3 and 4. The depths of best 2D gamma agreement for the neck phantoms are also listed in Table 3. We report the 3D gamma pass rates at the expected measurement planes.

TABLE 4 3D gamma pass rates for the breast phantom. APB, analytical pencil beam. The depth of best gamma agreement d_γ was found to be the expected DICOM depth d_e in the dose cube; the corresponding depth relative to the solid water surface d_m is also tabulated. Bold pass rate percentages are the best agreement per criterion.

3.C | Neck phantoms
Calculated dose distributions for the F3 lamb neck plan are shown in

3.D | Breast phantom
For the breast plans, only 3D gamma analysis was performed. The passing rates are listed in Table 4.

This is illustrated in Fig. 6, which shows the measured vs. calculated differences beyond the bone-air interface region of the lamb neck phantom.
The 3D gamma passing rates of the neck phantoms in Table 3 tell a different story. occur after the beam traverses discrete density changes as seen in Fig. 5 and Fig. 6. It is important to note that the largest differences are observed for single beams in the head-and-neck region; however, single beams are not often used in the clinical setting. These kinds of interface errors are diminished by using multiple beams traversing the interface at different angles.
We also note the overall difference in dose conformality. MC calculations are more granular, whereas the APB calculations appear smoother. This is because the APB algorithm uses the infinite slab approximation as described earlier, which clearly does not handle the density interfaces correctly. This is most likely the primary reason why the MC calculations struggle to demonstrate superiority over APB calculations at the stricter gamma criteria, and more so for the 2D analyses. In areas where there are no discrete density changes, the calculations agree very well.

| CONCLUSION
In this work we validated the RayStation 6 Monte Carlo and APB dose calculation algorithms for head-and-neck and breast phantoms.
The MC results were systematically better than the APB results when compared in a 3D fashion, although APB was found to be clinically acceptable for the studied cases. We further demonstrated depth-dose discrepancy to be less than 1% for both algorithms. This work also highlighted the spatial limitation of 2D gamma, supporting the use of 3D gamma analyses for evaluating 3D dose distributions. Similar to isodose curves complementing dose-volume histograms, we must pay attention to where the gamma-index criteria are not satisfied in addition to the relative percentage to which they are satisfied.
We recommend implementing the RayStation Monte Carlo algorithm as a direct means to improve accuracy in treatment planning.
Our future work will discuss the influence of air gap, range shifters, and apertures on this algorithm's accuracy. We will also validate this algorithm for targets in lung tissue using a novel phantom allowing dose measurement within a realistic tumor phantom.

ACKNOWLEDGMENTS
We are grateful to RaySearch Americas Inc. and RaySearch Laboratories AB for their clinically useful developments.

CONFLICT OF INTEREST
The authors have no relevant conflict of interest to disclose.