The impact of robustness of deformable image registration on contour propagation and dose accumulation for head and neck adaptive radiotherapy

Abstract Deformable image registration (DIR) is the key process for contour propagation and dose accumulation in adaptive radiation therapy (ART). However, ART currently suffers from a limited understanding of the “robustness” of this process, that is, how much the DIR-propagated contours and the subsequent dose vary with the algorithm itself and with its presetting parameters. The purpose of this research is to evaluate the DIR-induced variations in contour propagation and dose accumulation during ART using the RayStation treatment planning system. Ten head and neck cancer patients were selected for a retrospective study. Contours were delineated by a single radiation oncologist, and new treatment plans were generated on the weekly CT scans for all patients. For each DIR process, four deformation vector fields (DVFs) were generated to propagate contours and accumulate weekly dose, using: (a) ANACONDA with simple presetting parameters, (b) ANACONDA with detailed presetting parameters, (c) MORFEUS with simple presetting parameters, and (d) MORFEUS with detailed presetting parameters. The geometric evaluation considered the DICE coefficient and the Hausdorff distance. The dosimetric evaluation included D95, Dmax, Dmean, Dmin, and the Homogeneity Index. For geometric evaluation, the DICE coefficient variations of the GTV were found to be 0.78 ± 0.11, 0.96 ± 0.02, 0.64 ± 0.15, and 0.91 ± 0.03 for simple ANACONDA, detailed ANACONDA, simple MORFEUS, and detailed MORFEUS, respectively. For dosimetric evaluation, the corresponding Homogeneity Index variations were found to be 0.137 ± 0.115, 0.006 ± 0.032, 0.197 ± 0.096, and 0.006 ± 0.033, respectively. Consistent geometric and dosimetric variations were also found in both large and small organs. Overall, the results demonstrated that contour propagation and dose accumulation in clinical ART were influenced by the DIR algorithm and, to a greater extent, by the presetting parameters. A quality assurance procedure should be established for the proper use of a commercial DIR for adaptive radiation therapy.


1 | INTRODUCTION
Treatment of head and neck (H&N) cancers has been found to benefit from Intensity Modulated Radiation Therapy (IMRT). [1][2][3] IMRT plans are typically based on the anatomy defined by a pretreatment CT image dataset of the patient. The desire to account for potential changes in patient anatomy during the treatment period (due to weight loss and tumor shrinkage) led to the new approach called adaptive radiotherapy (ART). [4][5][6] In ART, the original treatment plans are revised to address the random and systematic patient anatomical variations over the 6-7 weeks of fractionated delivery. For H&N patients, weight loss and volume shrinkage lead to changes in the target and parotid glands, and adaptive re-planning has been shown to be successful in compensating for these geometrical changes. [7][8][9][10] However, off-line ART involving manual adjustment of the OAR and target delineations in H&N cases is a labor-intensive and time-consuming procedure. 11

More recent advances in deformable image registration (DIR) have further improved the efficiency of the ART workflow. DIR plays an important role in efficient adaptive treatment planning. [12][13][14] DIR allows the propagation of contours as well as the corresponding radiation dose from one image to another. 15 The ability to perform contour propagation facilitates an efficient adaptive radiotherapy workflow by avoiding tedious manual delineation. 16 Dose mapping is used in treatment evaluation by accumulating the fractional or weekly dose during the treatment course through daily cone beam CTs (CBCTs) or CT-on-rails. 17 In addition, DIR can similarly be used in 4D dose accumulation to study the interplay effect, and to map the densities from the planning CT to the CBCT in order to compute dose on the daily CBCT. 18,19

As more commercial treatment planning systems (TPSs) begin to integrate a DIR module for clinical adaptive radiotherapy, DIR-associated uncertainty has drawn concern because of its fundamental importance to contour propagation, dose accumulation, autosegmentation, and 4D-CT processing. [20][21][22] DIR algorithms should be well validated during both TPS commissioning and routine quality assurance. A limited number of publications have addressed DIR validation in H&N adaptive radiotherapy, especially for contour propagation and dose accumulation. [23][24][25] In our work, adaptive radiotherapy was studied using a CT-on-rails linac (CTVision, Siemens, Erlangen, Germany) and RayStation 3. between small and large organs for all the presettings and algorithms.

2.A | Patient data
In this study, ten H&N cancer patients were randomly selected. All the patients received off-line adaptive treatment planning with several weekly CTs acquired using the Siemens CT-on-rails during treatment fractions. The CT parameters were set to a 3.0 mm slice thickness and a 1.0 mm in-plane resolution. For each weekly CT, a CIVCO (Orange City, Iowa) H&N board and a thermoplastic mask with a molded pillow were used to immobilize the patient. The target and organs at risk (OARs) contours were re-delineated by the same radiation oncologist (see Table 1).

2.C | Deformable image registration algorithms
In this study, two commercial deformable image registration algorithms integrated in the RayStation TPS were evaluated. One is a hybrid DIR algorithm called ANACONDA, which uses a combination of image intensity information and anatomical information. 27 Its objective is formulated as a non-linear optimization problem that maintains image similarity while using controlling contours to drive the deformation and keep it anatomically reasonable. The other is a novel biomechanical DIR algorithm called MORFEUS, which computes the displacement field by solving a linear elasticity problem using the finite element method. 28 Its objective function is set up from controlling ROIs represented by meshes and leaves out image gray-scale information. Both commercial algorithms were set to the same pre-executed resolution for comparison purposes.

2.D | Contour propagation and Dose accumulation
For each patient, the primary CT and the weekly CTs, all with contours delineated by the radiation oncologist (RO), were selected for the contour propagation comparison. Although deformable image registration is a relatively complex process, its software interface in the commercial TPS is relatively simple. The quality of the DIR output depends on both the commercial DIR algorithm itself and the DIR presetting parameters, which ultimately influence the contour propagation and dose accumulation results. The DIR presetting parameter adopted in RayStation is the controlling ROI specification, which is used as an effective constraint to direct the DIR algorithm to better deforma-

2.E | DIR evaluation metrics

2.E.1 | Geometric
For geometric comparison, the DICE index was adopted to evaluate the spatial overlap between the volumes enclosed by the reference manually delineated contours and the DIR-propagated contours. 26 The DICE index is a coefficient that quantifies the degree of overlap of two volumes as follows:

$$\mathrm{DICE} = \frac{2\,|\mathrm{Volume}_1 \cap \mathrm{Volume}_2|}{|\mathrm{Volume}_1| + |\mathrm{Volume}_2|}$$

where Volume1 and Volume2 represent the volumes of the selected contours acquired by manual delineation and DIR propagation, respectively. 29 The DICE index has a value ranging from 0.0 to 1.0, with 0.0 meaning no overlap and 1.0 meaning complete coincidence.
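The DICE index described above can be illustrated with a short NumPy sketch. It assumes that both contours have already been rasterized to boolean voxel masks on the same grid; the function name `dice_coefficient` and the toy masks are illustrative and are not taken from the study or from RayStation.

```python
import numpy as np


def dice_coefficient(mask_ref, mask_prop):
    """DICE overlap of two boolean voxel masks defined on the same grid."""
    ref = np.asarray(mask_ref, dtype=bool)
    prop = np.asarray(mask_prop, dtype=bool)
    if ref.shape != prop.shape:
        raise ValueError("masks must share the same grid shape")
    total = ref.sum() + prop.sum()
    if total == 0:
        raise ValueError("both masks are empty; DICE is undefined")
    # 2 * |V1 intersect V2| / (|V1| + |V2|), counted in voxels
    return 2.0 * np.logical_and(ref, prop).sum() / total


# Toy example: a 4x4x4 cube and the same cube shifted by one voxel.
reference = np.zeros((10, 10, 10), dtype=bool)
reference[2:6, 2:6, 2:6] = True
propagated = np.zeros_like(reference)
propagated[3:7, 2:6, 2:6] = True

dice_value = dice_coefficient(reference, propagated)
print(dice_value)  # 48 shared voxels out of 64 + 64, i.e. 0.75
```

Counting voxels is equivalent to comparing volumes only when both masks use the same voxel size, which is why the sketch insists on a shared grid.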
Another geometric evaluation index was the Hausdorff distance (HD), which quantifies the maximum of all the nearest-point distances between two contours A and B:

$$\mathrm{HD}(A, B) = \max\left\{ \max_{a \in A} \min_{b \in B} d(a, b),\; \max_{b \in B} \min_{a \in A} d(a, b) \right\}$$

where $\min_{b \in B} d(a, b)$ is the minimum distance from a point a on contour A to the points on contour B, and $\min_{a \in A} d(a, b)$ is defined similarly for a point b on contour B. 26 The DICE index was used to evaluate the overall spatial overlap of two contour volumes, while the Hausdorff distance was used to quantify the extreme shift between two contours.
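The Hausdorff distance described above can be sketched in the same way. This brute-force illustration assumes the standard symmetric definition (the larger of the two directed nearest-point maxima) and assumes each contour is supplied as an array of 3D surface points in cm; the function name and the sample points are hypothetical.

```python
import numpy as np


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two point sets of shape (N, 3)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Farthest that any point of A lies from its nearest neighbour in B.
    a_to_b = pairwise.min(axis=1).max()
    # Farthest that any point of B lies from its nearest neighbour in A.
    b_to_a = pairwise.min(axis=0).max()
    return float(max(a_to_b, b_to_a))


contour_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
contour_b = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])

hd_value = hausdorff_distance(contour_a, contour_b)
print(hd_value)  # 3.0: the point (4, 0, 0) on B is 3 cm from the nearest point on A
```

The pairwise matrix grows as the product of the two point counts, so for densely sampled clinical contours a spatial index such as a k-d tree would be the more practical choice.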

2.E.2 | Dosimetric
FIG. 1. Workflow of the contour propagation and dose accumulation process. (a) Contours were deformably propagated from the primary CT to each weekly CT using the DVFs generated by different DIRs for geometric evaluation; (b) Weekly doses for each patient were deformed using the DVFs generated by different DIRs and accumulated on the primary CT for dosimetric evaluation.

For dose evaluation, all the weekly doses were deformed and accumulated on the primary CT (CT1) using the different DIR presettings, and the accumulated doses were compared with the primary planning dose on CT1. For the GTV, the maximum dose Dmax, minimum dose Dmin, and mean dose Dmean, as well as the dose to 95% of the volume D95, were evaluated. Also, the Homogeneity Index (HI) was adopted to analyze the uniformity of the dose distribution in the target volume, defined as:

$$\mathrm{HI} = \frac{D_5}{D_{95}}$$

where D5 is the dose to 5% of the target volume and D95 is the dose to 95% of the target volume. 30 The ideal HI value is 1, and it increases as the plan becomes less homogeneous. For organs at risk, Dmax and Dmean were used to evaluate the OAR dose. All the accumulated dose variations were calculated relative to the primary planning dose, as shown in the dosimetric evaluation procedure in Fig. 1(b).
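The target dose metrics described above can be sketched as follows. This is an illustration under stated assumptions rather than the TPS implementation: the dose grid is assumed to be in cGy with a boolean ROI mask on the same grid, Dx is taken as the (100 − x)th percentile of the voxel doses inside the ROI, and HI is taken as D5/D95, which is consistent with an ideal value of 1. All function names and the toy dose values are hypothetical.

```python
import numpy as np


def dose_at_volume(roi_doses, volume_percent):
    """D_x: the dose that at least x% of the ROI volume receives."""
    return float(np.percentile(roi_doses, 100.0 - volume_percent))


def target_dose_metrics(dose, mask):
    """Dmax, Dmin, Dmean, D95, D5 and HI for the voxels selected by mask."""
    roi_doses = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    if roi_doses.size == 0:
        raise ValueError("ROI mask selects no voxels")
    d5 = dose_at_volume(roi_doses, 5.0)
    d95 = dose_at_volume(roi_doses, 95.0)
    return {
        "Dmax": float(roi_doses.max()),
        "Dmin": float(roi_doses.min()),
        "Dmean": float(roi_doses.mean()),
        "D95": d95,
        "D5": d5,
        "HI": d5 / d95,  # assumed definition: D5 / D95
    }


# Toy ROI: 101 voxels with doses spread evenly from 6000 to 7000 cGy.
toy_dose = np.linspace(6000.0, 7000.0, 101)
toy_mask = np.ones_like(toy_dose, dtype=bool)

metrics = target_dose_metrics(toy_dose, toy_mask)
print(metrics["D95"], metrics["D5"], metrics["HI"])
```

A variation such as those reported in this study would then be the difference between a metric computed on the accumulated dose and the same metric computed on the primary planning dose.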

3.A | Example patient
For GTV contour propagation, the contours mapped with the detailed presetting showed better consistency with the RO-delineated contour than those mapped with the simple presetting, for both the ANACONDA and MORFEUS DIR algorithms. Figure 2 shows the axial, sagittal, and coronal views for one typical patient.
This observation also held true for GTV dose accumulation. The DVHs obtained with the detailed presetting showed better consistency with the primary plan's DVH than those obtained with the simple presetting, for both the ANACONDA and MORFEUS DIR algorithms, as shown in Fig. 2(B).

4 | DISCUSSION
In this study, ten H&N patients with weekly CTs were used to evaluate the contour propagation and dose accumulation performance of two DIR algorithms integrated in the RayStation ART module. In a previous study that compared commercial DIR algorithm variations, Nie et al. found that the three commercial DIR algorithms adopted by other ART software were able to achieve DICE coefficients above 0.81 in contour propagation. 31 Pukala et al. found that the commercial DIR algorithms had a relatively low average geometric registration error of between 0.5 mm and 3 mm. 32 Both studies also show that, although most of the DIR algorithms could achieve acceptable results for contour propagation, the variations in dose accumulation between different DIR algorithms were more pronounced. This study investigated more parameters to verify the accuracy of the two DIR algorithms used in RayStation. The results show that, under the detailed DIR presetting, the DICE coefficients for both algorithms reached 0.8 or higher, and the variations were less than 0.05. The HD values were also consistent, with a maximum difference of less than 0.1 cm. The mean values of the dose variation statistics were lower than 60 cGy, with the HI showing no significant variation. Under the simple presetting, however, the average geometric variations between the two algorithms expanded by as much as 40% and the average dosimetric variations by nearly 50% relative to the detailed presetting. Our research confirms the conclusions of the previous studies under the detailed presetting but further shows that, when the DIR presetting was simplified for clinical use, the variations between the two DIR algorithms expanded significantly for both the contour propagation and the dose accumulation process.
Deformable image registration is a complex calculation process involving a large number of presetting parameters for research purposes. When it comes to clinical use, however, the presetting parameters will need to be reduced to maintain computational efficiency.
The main DIR presetting parameters offered by RayStation is con-

This study further explored the influence of ROI volume on DIR contour propagation and dose accumulation. A previous study by Kumarasiri et al. showed that, under the same DIR conditions, the average DICE coefficients for contour propagation of large and small organs were 0.82 and 0.59, respectively, and that DIR performed better for large organs than for small organs in the contour propagation process. 26 As shown in Fig. 3(A), the DICE coefficients are higher for large organs than for small organs, and the maximum variation is 0.38 when using the same DIR algorithm with the same presetting. However, the mean value of HD for large organs is slightly higher than that for small organs, with a maximum variation of 0.46 cm. This is because the DICE coefficient represents the overall volume overlap rate in 3D space, whereas the HD value represents the extreme shift. This result confirms and complements the findings of Kumarasiri et al. 26 The dosimetric results in Fig. 5 show that the variations of Dmax and Dmean for large organs are also lower than those for small organs under the same DIR conditions. The maximum dosimetric variations between large and small organs were 81.9 cGy for Dmax and 213.7 cGy for Dmean. This result further confirms that DIR not only performs better for large organs in overall contour propagation, but also performs better for them in dose accumulation.
Overall, the results of this study indicate that RayStation's two DIR algorithms differ in the contour propagation and dose accumulation process for H&N adaptive radiotherapy. The dose accumulation process is more complicated than the contour propagation process and can result in more complex variation. Moreover, when the simple DIR presetting is adopted, the differences between the two algorithms expand significantly. Therefore, it is necessary to validate the DIR process when more than two DIR algorithms are used together clinically to ensure that the DIR process will not produce excessive deviations between the two algorithms, even under relatively sim-

The limitation of this study is that manual delineation was used as the reference contour; thus we can only estimate the influence of the above factors on contour propagation and dose accumulation using different DIRs. It is not possible in this study to further quantify which algorithm is more accurate, due to the lack of ground truth deformation. Using manual delineation as a reference makes it clinically easier to obtain intuitive results, and many reports, such as those of Kumarasiri et al., 26 Gardner et al., 33 and Rigaud et al., 34 have used this method to evaluate different DIRs. However, considering the necessity of DIR QA, we believe that further research is still needed to find a clinically feasible method to obtain ground truth deformation for the DIR process. At present, the most widely studied methods use physical phantoms or computational phantoms to obtain ground truth deformation. The problem with physical phantoms is the workload: it is labor intensive to create a physical phantom to simulate every clinical scenario. In comparison, it is convenient to use a computational phantom instead, but as Nie et al.

5 | CONCLUSION
In this study, we retrospectively evaluated the contour propagation and dose accumulation variations induced by the DIR process for ten H&N adaptive radiation therapy cases using two DIR algorithms integrated in RayStation. The results showed significant variations in the DICE coefficients and the Hausdorff distance between the two algorithms, especially under the simple presetting condition. The dosimetric results lead to the same conclusion. DIR presettings were found to have a more significant influence on the final results than the choice of algorithm, and the detailed presetting showed less variation in contour propagation and dose accumulation than the simple presetting. Compared to large ROIs, small ROIs were more prone to significant variation in both contour propagation and dose accumulation. As more treatment planning systems integrate a DIR module, it is necessary for each organization to establish its own DIR protocols and quality assurance procedures for adaptive radiation therapy.