MR‐based treatment planning in radiation therapy using a deep learning approach

Abstract

Purpose: To develop and evaluate the feasibility of a deep learning approach for MR‐based treatment planning (deepMTP) in brain tumor radiation therapy.

Methods and materials: A treatment planning pipeline was constructed using a deep learning approach to generate continuously valued pseudo CT images from MR images. A deep convolutional neural network was designed to identify tissue features in volumetric head MR images after training with co‐registered kVCT images. A set of 40 retrospective 3D T1‐weighted head images was used to train the model, which was then evaluated in 10 clinical cases with brain metastases by comparing treatment plans based on deep learning generated pseudo CTs with those based on acquired planning kVCTs. Paired‐sample Wilcoxon signed rank tests were used to compare dosimetric parameters of plans made with pseudo CT images generated by deepMTP to those of the kVCT‐based clinical treatment plans (CTTP).

Results: deepMTP provided accurate pseudo CTs, with Dice coefficients of 0.95 ± 0.01 for air, 0.94 ± 0.02 for soft tissue, and 0.85 ± 0.02 for bone, and a mean absolute error of 75 ± 23 HU relative to acquired kVCTs. The absolute percentage differences of dosimetric parameters between deepMTP and CTTP were 0.24% ± 0.46% for planning target volume (PTV) volume, 1.39% ± 1.31% for maximum dose, and 0.27% ± 0.79% for the PTV receiving 95% of the prescribed dose (V95). Furthermore, no significant difference was found for PTV volume (P = 0.50), maximum dose (P = 0.83), or V95 (P = 0.19) between deepMTP and CTTP.

Conclusions: We have developed an automated approach (deepMTP) that generates a continuously valued pseudo CT from a single high‐resolution 3D MR image and evaluated it in partial brain tumor treatment planning. deepMTP provided dose distributions with no significant difference relative to kVCT‐based standard volumetric modulated arc therapy plans.


| INTRODUCTION
In recent years there have been many efforts to develop magnetic resonance imaging (MRI)-based treatment planning methods that avoid an auxiliary computed tomography (CT) scan for radiation therapy treatment planning. 1 MRI provides superior soft tissue contrast compared to CT, which makes it an excellent imaging modality for delineating accurate boundaries of targeted treatment regions to deliver the most desirable dose distribution. 2,3 In addition, imaging techniques that do not involve ionizing radiation, such as MRI, are desirable for reducing the overall radiation dose to patients.
A key challenge for MRI-based treatment planning is the lack of a direct approach to obtain electron density for dose calculation. Given the importance of an accurate electron density map for accurate dose calculation in treatment planning, the development of novel approaches to generate pseudo CTs or synthetic CTs from MRI is an actively studied topic. [6][7][8][9] State-of-the-art approaches can be roughly classified into two main categories: image intensity-based and atlas-based. 1 The typical intensity-based approach is to utilize individual or combined T1-weighted, T2-weighted, and water/fat-separated MR sequences that estimate tissue compartments with a single acquisition or multiple acquisitions. 1 These images are then further processed either to directly assign Hounsfield unit (HU) values to air, fat, lung, and water compartments 10 or to estimate continuously valued HU for various tissues via an MR signal conversion model. 11 However, because bone cannot be visualized with positive contrast on conventional MR imaging, bone is typically challenging to estimate with these approaches. Specialized MRI acquisitions using an ultrashort echo time (UTE) or zero echo time can be implemented to measure the rapidly decaying MR signal in bone tissue and thereby estimate bone. Unfortunately, most UTE acquisitions are challenging to integrate into clinical workflows as a result of technical difficulties in implementation, require relatively long scan times, and have limited availability across vendor platforms. Additionally, even with advanced UTE acquisitions, bony structure and air often remain difficult to distinguish, introducing errors in the subsequent attenuation calculation. Partial volume effects and signal inhomogeneity due to the RF pulse and receive coil arrays are further confounding factors that influence the accuracy and precision of segmentation-based approaches in MR.
Atlas-based approaches for treatment planning utilize registration and spatial normalization of a population-based CT image template to the acquired MR images to estimate the location and geometry of tissue types. 12,13 A particular advantage of these techniques is that an existing, clinically useful MRI series can be used as the input dataset, eliminating the need for an extra MR acquisition specifically for treatment planning. However, this process is highly dependent upon the accuracy of image registration, and the patient anatomy must be appropriate for the atlas used. Therefore, atlas-based approaches may suffer when applied to subjects with abnormal anatomy, such as missing tissues or the presence of surgical implants.
Deep learning utilizing convolutional neural networks (CNN) has recently been applied to medical imaging, with successful implementations for a wide range of applications. 14 A previous study 16 utilized deep learning to enable accurate generation of pseudo CTs from a single T1-weighted MRI acquisition from a standard clinical brain protocol. In that study, a deep CNN model was designed to classify tissues on MRI images after training with registered kilovoltage CT (kVCT) images. As a result, three discrete labels were assigned to soft tissue, air, and bone in the generated pseudo CTs, which delivered accurate PET/MR attenuation correction with significantly reduced error in reconstructed PET images compared with existing segmentation-based and atlas-based methods. 16 In a recent study from the same group, the deep learning approach was applied in combination with an advanced UTE sequence, achieving reliable and accurate tissue identification for bone in PET/MR attenuation correction in brain imaging. 17 Another recent study demonstrated excellent performance utilizing deep learning generated pseudo CTs for PET/MR attenuation correction in the pelvis. 18 The purpose of this study was to further evaluate the efficacy and efficiency of deep learning generated pseudo CTs for treatment planning in radiation therapy of brain tumor patients. Specifically, we evaluated an MRI-based treatment planning approach, deepMTP, which allows generation of a continuously valued pseudo CT from a single clinical-protocol MRI acquisition using a deep CNN model. To the best of our knowledge, this study is one of the first pilot studies to implement deep learning generated pseudo CTs in the treatment planning workflow and to evaluate the accuracy and robustness of such an approach for dose calculation in clinical cases of radiotherapy for brain metastases.
While other studies have demonstrated techniques for the generation of pseudo CTs with hypothesized applications in radiotherapy treatment planning, none have evaluated their efficacy by constructing clinical radiotherapy treatment plans on the generated pseudo CT images.

2.A | Convolutional neural network architecture
Inspired by the network design in Refs. [16,17], we utilized the deep convolutional encoder-decoder network structure shown in Fig. 1, which is capable of mapping pixel-wise image intensity from MRI to CT at multiple image scales. The basic framework of this type of network was built on structures that perform well in natural image object recognition 19 and MRI segmentation of various tissues. [20][21][22][23][24] The network consisted of an encoder network directly followed by a decoder network. The encoder network uses a set of combined 2D convolution filtering, batch normalization, 25 ReLU activation, 26 and max-pooling operations to extract unique, spatially invariant features from the input images. The decoder network takes the output of the encoder network and combines the extracted image features at multiple resolution scales to generate the targeted high-resolution image output through an image upsampling process. The encoder uses the same 13 convolutional layers as the VGG16 network, 27 originally designed for image recognition and later tested in multiple image segmentation tasks. 16,20 The decoder is applied directly after the encoder network and features a reversed VGG16 structure with the max-pooling layers replaced by corresponding upsampling layers. In Ref. [16], pseudo CT generation was treated as a semantic segmentation problem over multiple tissue classes in MR images. More specifically, a multiclass softmax layer was inserted as the final layer of the decoder network and the model was optimized with a multiclass cross-entropy image loss, yielding a pixel-wise discrete label output matching the input MR image resolution. Different fixed HU values were then assigned to the different tissue compartments based on the discrete labels to create a discrete pseudo CT. In this study, we substituted the softmax layer with a 2D convolutional layer and optimized the image loss using a mean square error cost function.
The pseudo CT generation was treated as a signal regression problem for converting MR contrast to CT contrast. More specifically, instead of outputting tissue classes as in Ref. [16], the CNN in this study directly outputted pseudo CTs, and the pixel-wise HU values of the pseudo CTs were optimized against the real CT values to minimize the contrast difference. As a result, the output is a pseudo CT with continuously valued HU numbers, in contrast to the discrete tissue labels of Ref. [16].

F I G . 1. Schematic illustration of the deepMTP pipeline. The convolutional encoder-decoder (CED) network converts MR images into pseudo CT images. It consists of a combined encoder network (VGG16) and decoder network (reversed VGG16) with multiple symmetrical shortcut connections (SC) between layers. The insertion of SC follows the strategy of the full preactivation deep residual network. The deepMTP process consists of a training phase to optimize the CED network and a planning phase that generates pseudo CTs for new MR data using the trained, fixed CED network. Adapted from Figure 1 of Ref. [16] with permission.
Additionally, as in the popular U-Net, 28 shortcut connections (SC) were added between the encoder and decoder networks to enhance the mapping performance of the encoder-decoder structure. The added SC are advantageous in preventing excess loss of image detail during the max-pooling process of the encoder in deep CNN networks. 29,30 A total of four SC were created between the network layers following the full preactivation method described in the deep residual network configuration, 30 and one additional shortcut connection was generated from the input image directly to the output image. The detailed structure of the proposed network is schematically illustrated in Fig. 1.
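The encoder-decoder regression described above can be sketched in a few lines of code. The following is a minimal two-scale toy model in PyTorch (an assumed framework; the paper does not state its implementation library), not the actual 13-layer VGG16-based network: layer counts, channel widths, and upsampling mode are illustrative only. It shows the three structural ideas in the text: a conv/batch-norm/ReLU/max-pool encoder, a mirrored upsampling decoder with a symmetric shortcut connection, and a final 2D convolution (in place of a softmax) trained with an MSE loss against the co-registered CT, plus the input-to-output shortcut.

```python
import torch
import torch.nn as nn

class TinyCED(nn.Module):
    """Minimal convolutional encoder-decoder (CED) sketch with
    symmetric shortcut connections; illustrative only."""
    def __init__(self):
        super().__init__()
        # Encoder: conv + batch norm + ReLU, then max-pooling
        self.enc1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1),
                                  nn.BatchNorm2d(16), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1),
                                  nn.BatchNorm2d(32), nn.ReLU())
        # Decoder: upsampling replaces max-pooling, mirrored convs
        self.up = nn.Upsample(scale_factor=2, mode='nearest')
        self.dec2 = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1),
                                  nn.BatchNorm2d(16), nn.ReLU())
        # Final 2D convolution replaces the softmax layer of Ref. [16],
        # giving a continuously valued (regression) output.
        self.out = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.enc2(self.pool(f1))
        d2 = self.dec2(self.up(f2)) + f1   # symmetric shortcut connection
        return self.out(d2) + x            # input-to-output shortcut

model = TinyCED()
mr = torch.randn(2, 1, 64, 64)   # a stack of 2D axial MR slices (toy data)
ct = torch.randn(2, 1, 64, 64)   # co-registered target CT slices (toy data)
pred = model(mr)                              # continuously valued pseudo CT
loss = nn.functional.mse_loss(pred, ct)       # MSE cost for HU regression
```

In training, `loss` would be backpropagated to update the network weights; at planning time the trained, fixed model simply maps new MR slices to pseudo CT slices.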

2.B | deepMTP procedure
The proposed deepMTP procedure consisted of two independent phases: a training phase using retrospective MRI and coregistered CT data, and a treatment planning phase in which pseudo CTs are generated using the fixed, trained network. As shown in Fig. 1, in the training phase, 3D CT images were first coregistered to MR images using a combined rigid Euler transformation and nonrigid B-spline transformation with the Elastix image registration tool, 31 following the method described in Ref. [32]. Specifically, a 4-level multiresolution strategy, 32 a histogram-bin-based similarity measurement, and a total of 1500 iterations were used. An optimization metric combining localized mutual information with a bending energy penalty term was used for the nonrigid registration. For each training dataset, MR and coregistered CT images were first offset to positive values and then scaled by pixel intensities of 5000 and 2000 (HU), respectively, to ensure a similar dynamic range. The 3D MR and CT volumes were then input into the encoder network as stacks of 2D axial images. The network weights were initialized using the scheme described in Ref. [33] and updated using the ADAM algorithm. 34
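The offset-and-scale preprocessing step above can be sketched as follows. This is a minimal NumPy illustration; the paper does not specify how the offset to positive values is computed, so shifting each volume by its own minimum is an assumption made here.

```python
import numpy as np

def normalize_for_training(mr, ct):
    """Offset each volume to non-negative values, then scale MR by 5000
    and CT by 2000 (HU) so the two inputs share a similar dynamic range.
    Shifting by the volume minimum is an assumed choice of offset."""
    mr = (mr - mr.min()) / 5000.0
    ct = (ct - ct.min()) / 2000.0
    return mr, ct

# Toy volumes standing in for a 3D MR / coregistered CT pair
mr = np.random.uniform(0, 4000, (4, 64, 64))      # arbitrary MR intensities
ct = np.random.uniform(-1000, 1500, (4, 64, 64))  # HU values incl. air/bone
mr_n, ct_n = normalize_for_training(mr, ct)
```

After this step both arrays are non-negative and of comparable scale, and the volumes can be fed to the encoder network as stacks of 2D axial slices.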

2.D | Evaluation of Pseudo CTs
Evaluation of pseudo CT accuracy was performed on the above-mentioned 10 subjects, who were not involved in the training phase. We used the Dice coefficient, a similarity measure ranging from 0 to 1 that describes the overlap between two labels, to calculate the classification accuracy for soft tissue, bone, and air, comparing the pseudo CT generated by deepMTP with the ground truth (kVCT image). For calculation of the Dice coefficients, the continuously valued pseudo CT and planning kVCT images were discretized by thresholding as follows: bone if HU > 300, air if HU < −400, otherwise soft tissue. Additionally, the mean absolute error (MAE) within the head region was evaluated between pseudo CT and kVCT for each subject to elucidate the overall pixel-wise image error.
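The thresholding, Dice coefficient, and MAE computations described above can be sketched as follows. This is a minimal NumPy illustration on a toy 2 × 2 image pair; the helper function names are ours, not from the paper.

```python
import numpy as np

BONE, AIR, SOFT = 0, 1, 2

def discretize(hu):
    """Threshold a continuously valued CT into three tissue labels:
    bone if HU > 300, air if HU < -400, otherwise soft tissue."""
    return np.where(hu > 300, BONE, np.where(hu < -400, AIR, SOFT))

def dice(a, b, label):
    """Dice coefficient for one tissue label between two label maps:
    2|A∩B| / (|A| + |B|), ranging from 0 (no overlap) to 1."""
    a, b = (a == label), (b == label)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def mae(pseudo_ct, kvct, mask):
    """Mean absolute error (HU) restricted to the head region mask."""
    return np.abs(pseudo_ct - kvct)[mask].mean()

# Toy 2x2 example: one bone, one air, two soft-tissue pixels
kvct = np.array([[-1000.0, 40.0], [500.0, 30.0]])   # "ground truth"
pct  = np.array([[ -950.0, 55.0], [450.0, 20.0]])   # pseudo CT
labels_k, labels_p = discretize(kvct), discretize(pct)
bone_dice = dice(labels_k, labels_p, BONE)  # 1.0: both agree on the bone pixel
head_mask = np.ones_like(kvct, dtype=bool)
error = mae(pct, kvct, head_mask)           # 31.25 HU for this toy pair
```

In the actual evaluation these functions would be applied per subject over the full 3D volumes, with `head_mask` restricted to the head region.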

| RESULTS
For the deepMTP procedure, the training phase required approximately 2.5 hours of computational time on our dataset, whereas generating a single pseudo CT image from the trained model and input MR images required roughly 1 minute.
An example of an acquired 1.5 T MRI, the actual CT, and the pseudo CT for a 47-year-old female patient with a right cerebellar metastasis is demonstrated in Fig. 2. As shown, deepMTP was able to accurately transfer MR image contrast into CT contrast, with clearly identified air, brain soft tissue, and bone highly similar to those of the kVCT images.
Evaluation of the Dice coefficients in the 10 brain metastases cases comparing pseudo CT with kVCT yielded 0.95 ± 0.01 for air, 0.94 ± 0.02 for soft tissue, and 0.85 ± 0.02 for bone, with a mean absolute error of 75 ± 23 HU.

| DISCUSSION

MRI has been increasingly incorporated into the planning and delivery of radiation treatment. 1,[55][56][57] Given that nearly half of all cancer patients receive radiation during their treatment, a substantial number of patients are likely to benefit from improved approaches to integrating MRI into RT planning. The further development of MR-only approaches will help to reduce radiation dose from kVCT, which is particularly advantageous for treatment of pediatric and pregnant patients, where radiation dose reduction is a primary goal. Future applications of MR-only treatment planning will be improved if the MR scan can be performed in the radiation treatment position, which will require the development of more MR-compatible setup equipment and improved capabilities for MR imaging over a large field of view. Furthermore, recent advances in technology have produced therapy devices that combine MR scanners with RT devices (e.g., 60 Co IGRT 58-61 and linear accelerators [62][63][64] ), which further support the use of MRRT and other advanced methods to improve and augment therapy delivery, such as interfraction assessment of therapy response and inter- and intrafraction adaptation of therapy plans. 65,66 These systems also allow real-time imaging during treatment, which may prevent geometric tumor miss and allow a smaller PTV to be used. This capability is particularly advantageous in lung and upper abdominal cancers, where respiratory tumor motion must be taken into account. Therefore, future development of approaches to extend deepMTP to other regions of the body will have additional impact.
F I G . 5. An example of a 74-year-old male patient with a large superior brain tumor shows similar isodose lines in (a) and (b) and nearly identical PTV dose curves in the DVH (c) between deepMTP and CTTP.

| CONCLUSION
We have shown that deep learning approaches applied to MR-based treatment planning in radiation therapy can produce plans comparable to those of CT-based methods. Combined with the improved soft tissue contrast and resolution of MR, the further development and clinical evaluation of such approaches has potential value for providing accurate dose coverage, reducing treatment-unrelated dose, and improving the workflow for MR-only treatment planning. Our study demonstrates that deep learning approaches such as deepMTP, as described herein, will have a substantial impact on future work in treatment planning in the brain and elsewhere in the body.

ACKNOWLEDGMENTS
The authors gratefully acknowledge funding support from NIH grant R01EB026708.

CONFLICT OF INTEREST
All authors have no conflict of interest to disclose.