Quantifying the accuracy of deformable image registration for cone‐beam computed tomography with a physical phantom

Abstract
Purpose: Kilo‐voltage cone‐beam computed tomography (CBCT) is widely used for patient alignment, contour propagation, and adaptive treatment planning in radiation therapy. In this study, we evaluated the accuracy of deformable image registration (DIR) for CBCT under various imaging protocols with different noise and patient dose levels.
Methods: A physical phantom previously developed to facilitate end‐to‐end testing of DIR accuracy was used with Varian Velocity v4.0 software to evaluate the performance of image registration from CT to CT, CBCT to CT, and CBCT to CBCT. The phantom is acrylic and includes several inserts that simulate different tissue shapes and properties. Deformations and anatomic changes were simulated by changing the rotations of both the phantom and the inserts. CT images (from a head and neck protocol) and CBCT images (from pelvis, head, and “Image Gently” protocols) were obtained with different image noise and dose levels. Large inserts were filled with Mobil DTE oil to simulate soft tissue, and small inserts were filled with bone materials. All inserts were contoured before the DIR process to provide a ground truth contour size and shape for comparison. After the DIR process, all deformed contours were compared with the originals using the Dice similarity coefficient (DSC) and the mean distance to agreement (MDA). Both large and small volumes of interest (VOIs) were tested for DIR volume selection, simulating a DIR process that included either the whole patient image volume or the clinical target volumes (CTVs) only (for CTV propagation).
Results: For cross‐modality DIR (CT to CBCT), the DSC was >0.8 and the MDA was <3 mm for the CBCT pelvis and CBCT head protocols. For CBCT to CBCT and CT to CT, the DIR accuracy was improved relative to the cross‐modality tests. For smaller VOIs, the DSC was >0.8 and the MDA was <2 mm for all modalities.
Conclusions: The accuracy of DIR depends on the quality of the CBCT image at different dose and noise levels.


| INTRODUCTION
Image guidance is widely used in radiation therapy. Many modern linear accelerators are equipped with on-board imaging that can acquire kilo-voltage (kV) cone beam computed tomography (CBCT). 1 CBCT is widely used for patient alignment and more recently for adaptive treatment planning. Targets and organs are known to change position and shape during fractionated radiotherapy. 2 Deformable image registration (DIR) software has gained acceptance for managing contour propagation, dose tracking, and related issues over the course of such therapy. Deformable image registration enables users to automatically adjust the treatment planning contours drawn on the initial planning CT scan to account for anatomic changes observed on subsequent CT or CBCT images and to modify the plans as needed. For routine clinical use, the contour propagation process must have acceptable accuracy.
Commercial DIR software programs including MIM Maestro (MIM Software Inc., Cleveland, OH, USA), Velocity (Varian Medical Systems, Palo Alto, CA, USA), and RayStation (RaySearch Laboratories, Stockholm, Sweden) are being used for propagating contours with CBCT images in clinical practice. [3][4][5] The performance of DIR depends on numerous variables such as the type of algorithm, the implementation of that algorithm, and the image modality and quality. 6 The clinical stability of DIR is also influenced by factors such as the method of regularization 6 and user experience. 7 Several methods have been used to evaluate DIR algorithms; the three most common are contour outline comparison, landmark tracking, and simulating deformation with a phantom. 8 Validating and commissioning DIR are complex because systematically documented processes for doing so are lacking. Currently, means of validating the accuracy of deformable registration are being investigated at academic institutions. Until the technology advances to allow production of a standard testable deformable phantom, the most common way to review deformation is visual verification, 9 including overlaying tissue/voxel intensities, viewing the deformable warp map, and displaying the difference map between two registered images. The American Association of Physicists in Medicine (AAPM) recommends that formal image registration quality assurance (QA) programs be implemented at individual facilities. The program should include commissioning image registration and fusion software to ensure the accuracy of the tools used. 6 Understanding the optimization approach used by the user's DIR system is essential to appreciate how it converges, its limitations, and its potential pitfalls. Last year, AAPM Task Group 132 reported 6 a new, downloadable virtual phantom to test DIR accuracy and recommended using either a digital phantom or a physical phantom for DIR tests.
However, the digital phantom does not facilitate end-to-end testing of DIR systems, in particular the selection of optimal imaging parameters for DIR systems. In addition, reports have shown that the validation procedure is more complex for digital phantoms. 10 The quality of images obtained with cone-beam geometry is known to be inferior to that of regular CT images because of the large solid angle receiving scattered radiation. 11 Increased scatter from the patient obstructs the signal, degrading CBCT image quality compared with standard CT and resulting in blurred images and changes in CT numbers. 12 For kV CBCT, up to 2.5 times more of the photons arriving at a detector behind a normal-sized patient body are scattered as compared with fan-beam CT. 12,13 Decreases in soft tissue CT number intensity have been noted from increased beam hardening and truncation of images due to the smaller field of view (FOV). The accuracy of DIR systems for CBCT images under various levels of noise and dose has not been well studied. Some physical phantoms have been developed to assess the accuracy of DIR. 4,10,[14][15][16][17][18][19][20] Despite the existence of many methods to independently validate DIR systems, none has been standardized, and all demand a great deal of time and resources.
We previously presented our work on the development of a physical phantom (Wuphantom, US patent application) that can be used to seamlessly quantify the accuracy of a DIR system. 21 Here we used this physical phantom to evaluate the accuracy of DIR for CBCT with various scanning protocols and levels of image quality.
Our goal in this testing was to evaluate the accuracy of DIR in (a) cross-modality registration (CT vs CBCT), (b) same-modality registration (CBCT vs CBCT and CT vs CT), and (c) these CBCT registrations with different-sized volumes of interest (VOIs).

2.A | Phantom
We previously designed a physical phantom that can be used to test [...] were created in different shapes: circle, oval, and irregular, simulating deformed contours from the original circle (Fig. 1). The oval shape represents commonly deformed contours, and the irregular shapes simulate irregularly deformed contours. For DIR testing, the inserts are rotated to simulate contour changes in both shape and location compared with the reference circle. Each of these large insert cavities was filled with Mobil DTE oil (density 0.95 g/mL) to represent soft tissue. A smaller cavity on the right side of the phantom was filled with a bone plug (CB2 30%) from RMI (Gammex, Inc.), which has a density of 1.33 g/mL, to simulate bone and changes in bone location.
[...] 2.0 mm slice thickness, and 300 mA). The CBCT images were acquired with pelvis, head, and "Image Gently" protocols to represent images with various noise and dose levels, with image quality ranging from best to worst in a typical clinical environment. The Image Gently protocol delivers much less radiation exposure to patients at the cost of increased image noise and is mostly used for patient alignment or for pediatric patients. Image acquisition variables are shown in Table 1. All images were then transferred to a Velocity [...] irregular-shaped inserts, to simulate tissue deformation from a circular shape to another circular shape or a different shape (oval or irregular [...]

2.C.2 | Image quality
CT and CBCT image quality can be quantified in terms of the contrast-to-noise ratio (CNR). 22 A Catphan with a module containing low-contrast cylinders (CTP 604) was used to determine the CNR [...] [Fig. 3(b)] and CBCT pelvis protocol [Fig. 3(c)].
The measurement process was also repeated for images obtained with CBCT head and CBCT Image Gently protocols (not displayed).
The CNR is defined as:

$$\mathrm{CNR} = \frac{\mathrm{ROI}(1\%) - \mathrm{ROI}(bk)}{\mathrm{SD}(bk)}$$

where ROI (1%) represents the mean Hounsfield units (HU) in the ROI of a 15-mm diameter, low-contrast object; ROI (bk) represents the mean HU of the adjacent background; and SD (bk) represents the standard deviation of the background.
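As an illustration only, the CNR measurement can be sketched in a few lines of Python. This sketch takes the CNR to be the difference between the mean HU of the low-contrast object ROI and the mean HU of the adjacent background ROI, divided by the standard deviation of the background, which is our reading of the definitions above. The function names, the boolean-mask interface, the circular ROI helper, and the use of the sample standard deviation (`ddof=1`) are assumptions made for the example; they do not describe the software used in this study.

```python
import numpy as np


def contrast_to_noise_ratio(image_hu, object_mask, background_mask, ddof=1):
    """Return (mean HU of object ROI - mean HU of background ROI) / SD of background.

    image_hu        : 2D array of Hounsfield units for one slice
    object_mask     : boolean array selecting the low-contrast object ROI
    background_mask : boolean array selecting the adjacent background ROI
    ddof            : 1 for the sample SD (assumed here), 0 for the population SD
    """
    image_hu = np.asarray(image_hu, dtype=float)
    object_mask = np.asarray(object_mask, dtype=bool)
    background_mask = np.asarray(background_mask, dtype=bool)

    if not object_mask.any() or background_mask.sum() <= ddof:
        raise ValueError("ROIs contain too few voxels")

    background = image_hu[background_mask]
    background_sd = background.std(ddof=ddof)
    if background_sd == 0:
        raise ValueError("background ROI has zero standard deviation")

    return (image_hu[object_mask].mean() - background.mean()) / background_sd


def disk_mask(shape, center, radius_px):
    """Hypothetical helper: boolean mask of a circular ROI, in pixel units."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius_px ** 2
```

In use, a circular ROI would be placed inside the 15-mm low-contrast object and a second ROI in the adjacent uniform background, with the radius converted from millimetres to pixels using the pixel spacing of the reconstructed image.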

2.C.3 | Accuracy of DIR
Contours can be quantitatively compared by using several metrics.
Two commonly used approaches are the Dice similarity coefficient (DSC) 23 and the mean distance to agreement (MDA) [...]
The DIR for small VOIs was significantly better than that for larger VOIs in all imaging protocols.
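For readers who wish to reproduce this type of contour comparison, the two metrics can be sketched as follows. This is a minimal example under stated assumptions, not the implementation used by the registration software. The DSC is taken as twice the overlap volume divided by the sum of the two volumes. The MDA is taken as the symmetric average of nearest-surface distances between the two contours, which is one common convention (definitions vary between systems). Contours are assumed to be binary voxel masks on a common grid, and all function names are hypothetical.

```python
import numpy as np


def dice_similarity(mask_a, mask_b):
    """DSC = 2 |A and B| / (|A| + |B|) for two boolean masks on the same grid."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    total = mask_a.sum() + mask_b.sum()
    if total == 0:
        raise ValueError("both masks are empty")
    return 2.0 * np.logical_and(mask_a, mask_b).sum() / total


def surface_points(mask, spacing):
    """Physical coordinates of mask voxels that have at least one face neighbour outside."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        # A voxel is interior only if both neighbours along every axis are in the mask.
        interior &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    boundary = mask & ~interior[core]
    return np.argwhere(boundary) * np.asarray(spacing, dtype=float)


def mean_distance_to_agreement(mask_a, mask_b, spacing):
    """Symmetric mean nearest-surface distance, in the units of `spacing` (e.g. mm)."""
    points_a = surface_points(mask_a, spacing)
    points_b = surface_points(mask_b, spacing)
    if len(points_a) == 0 or len(points_b) == 0:
        raise ValueError("each mask needs at least one voxel")
    # Brute-force pairwise distances; adequate for small phantom contours.
    distances = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    return 0.5 * (distances.min(axis=1).mean() + distances.min(axis=0).mean())
```

The brute-force distance matrix keeps the sketch dependency-free; for large clinical contours, a distance transform or a spatial tree would be the more practical choice.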

| DISCUSSION
We quantitatively validated the accuracy of [...] These findings imply that CBCT can be used for adaptive planning, dose tracking, and related applications, but only with selected imaging techniques that provide adequate CNR.
Singhrao et al., in their study of DIR algorithms for evaluating MV and kV imaging of a head and neck phantom, found that the presence of artifacts on CBCT images is problematic for algorithms that focus strongly on image similarity. 16 We also found that [...] noise and dose levels should be undertaken in the future to investigate additional scenarios.
Finally, use of deformable phantoms for multimodality image registration adds complexity, as it requires phantoms to have components that are optimized for MRI, PET, and single-photon emission computed tomography. 6 The Wuphantom was designed to facilitate tests of multimodality image registration. The DIR accuracy for other image modalities, either same-modality or cross-modality (e.g., 4D CT vs CT, MRI vs MRI, MRI vs CT, PET vs PET), has not yet been validated. We plan to continue our studies in these areas in the near future.

| CONCLUSIONS
We quantitatively evaluated the accuracy of DIR for CBCT imaging.
The image quality of CBCT images at specified reference CNRs and dose levels can be correlated with the accuracy of the DIR. The Wuphantom supports the AAPM recommendation that physical phantoms be used for end-to-end testing of DIR systems.

ACKNOWLEDGMENT
We thank Christine Wogan, MS, ELS, of the Division of Radiation Oncology at MD Anderson Cancer Center, for editorial assistance.

CONFLICT OF INTEREST
Authors Wu, Yang, Wisdom, Liu, Zhu and Frank are inventors on a related pending patent. Other authors have nothing to disclose.