Image synthesis of monoenergetic CT image in dual‐energy CT using kilovoltage CT with deep convolutional generative adversarial networks

Abstract. Purpose: To synthesize a dual‐energy computed tomography (DECT) image from an equivalent kilovoltage computed tomography (kV‐CT) image using a deep convolutional generative adversarial network. Methods: A total of 18,084 images of 28 patients are divided into training and test datasets. Monoenergetic CT images at 40, 70, and 140 keV and equivalent kV‐CT images at 120 kVp are reconstructed via DECT and are defined as the reference images. An image prediction framework is created to generate monoenergetic computed tomography (CT) images from kV‐CT images. The accuracy of the images generated by the CNN model is determined by evaluating the mean absolute error (MAE), mean square error (MSE), relative root mean square error (RMSE), peak signal‐to‐noise ratio (PSNR), structural similarity index (SSIM), and mutual information (MI) between the synthesized and reference monochromatic CT images. Moreover, the pixel values of the synthetic and reference images are measured and compared using a manually drawn region of interest (ROI). Results: The difference in the monoenergetic CT numbers of the ROIs between the synthetic and reference monoenergetic CT images is within the standard deviation values. The MAE, MSE, RMSE, and SSIM are the smallest for the image conversion of 120 kVp to 140 keV. The PSNR is the smallest and the MI is the largest for the synthetic 70 keV image. Conclusions: The proposed model can act as a suitable alternative to the existing methods for the reconstruction of monoenergetic CT images in DECT from single‐energy CT images.

Dual-energy computed tomography (DECT) is based on the fact that x-ray attenuation in the diagnostic energy range depends primarily on the photoelectric effect and Compton scattering, and that these attenuation phenomena are energy dependent [3]. DECT scans are acquired using two different tube potentials, which can be used to estimate the Compton scattering and photoelectric effect components of the attenuation. Subsequently, this information is used to distinguish between tissues and characterize materials. This technique enables the reconstruction of monoenergetic CT numbers, effective atomic numbers, and virtual noncontrast-enhanced images, as well as urinary stone characterization [4]. The monoenergetic CT image can be reconstructed at an energy level ranging from 40 to 140 keV [5]. Furthermore, it can achieve better soft-tissue contrast for radiotherapy treatment planning and radiation diagnosis by reducing the effect of image artifacts resulting from the presence of metal [6]. The GE Revolution CT scanner with Gemstone Spectral Imaging (GSI) allows for dual-energy kV-CT acquisitions that can be used to generate monoenergetic, iodine contrast-enhanced, calcium-enhanced, and effective atomic number images [7]. The disadvantages of DECT are that it requires a higher radiation dose and is more expensive than conventional multidetector CT.
Image synthesis with deep learning, based on convolutional neural networks (CNNs), is used for image-to-image translation from magnetic resonance (MR) images to CT images and for the synthesis of multicontrast MR images [8]. CNNs can capture and represent high-dimensional input-output relationships. CNNs have been applied to medical image segmentation and computer-aided detection [9]. Florkow et al. designed a two-dimensional (2D) CNN model that generates synthetic CT images from T1-weighted MR images. In the kV-CT images synthesized from MR images, a large difference was observed, with variations of up to 17% in terms of the mean absolute error and variations of up to 28% specifically in bone images. These differences are attributable to the spatial resolution of MR images being poorer than that of CT images [10]. The current study proposes the image synthesis of monoenergetic CT images from kV-CT images, both of which have the same resolution.
Recently, image synthesis methods based on a generative adversarial network (GAN) have been used. A GAN based on the CNN model operates by training two different networks: a generator network to synthesize an image and a discriminator network to distinguish between synthesized and reference images [11]. Herein, the synthesis of monoenergetic CT images at 40, 70, and 140 keV from equivalent kV-CT images using the GAN model is proposed.

2.A | Data acquisition
A total of 18,084 images from 28 patients were analyzed as part of an institutional review board-approved study. DECT images for each patient were acquired using the Revolution DECT scanner (GE Healthcare, Princeton, NJ, USA). DECT acquisitions at 80 and 140 kV tube voltages and an exposure of 560 mA were performed.
The other scanning parameters were a field-of-view of 360 mm, slice thickness of 0.5 mm, and rotation time of 1.0 s. The monoenergetic CT images at 40, 70, and 140 keV and the equivalent kV-CT images were reconstructed using GSI and defined as the reference images.

2.B | Deep learning model
In the current study, a 2D CNN model comprising a GAN was designed. An overview of the GAN network model is depicted in Fig. 1. It includes a generator to estimate the monoenergetic CT image and a discriminator to distinguish the real monoenergetic CT image from the generated one. The generator attempts to produce realistic images that confuse the discriminator. Notably, these two networks are trained simultaneously. Hyperparameter optimization was performed on the training dataset, and the test set settings were adjusted only once for each algorithm. Image red-green-blue (RGB) channels are typically used as inputs to the neural network [12]. The proposed models were implemented using TensorFlow packages (V1.7.0, Python 2.7, CUDA 9.0) on an Ubuntu 16.04 LTS system.
All three models were trained using instance normalization and identical hyperparameters except for the batch size. In instance normalization, the mean and standard deviation are calculated across each channel in each training example and used to normalize that channel. At each iteration, a minibatch of 2D images was randomly selected from the training set. The batch size was limited by the graphics processing unit (GPU) memory. The 2D model was trained for 300 epochs on an 11-GB NVIDIA GeForce GTX 1080 GPU.
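The interplay between the generator and the discriminator can be illustrated with a minimal NumPy sketch. The text does not state which loss functions were used, so the binary cross-entropy objective below is an assumption (the standard GAN formulation), and the function names and discriminator outputs are hypothetical.

```python
import numpy as np

def binary_cross_entropy(prob, label, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a 0/1 label."""
    prob = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    return float(-np.mean(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob)))

def discriminator_loss(d_real, d_fake):
    # The discriminator should output 1 for reference images and 0 for synthetic ones.
    return binary_cross_entropy(d_real, 1.0) + binary_cross_entropy(d_fake, 0.0)

def generator_loss(d_fake):
    # The generator is rewarded when the discriminator labels its output as real.
    return binary_cross_entropy(d_fake, 1.0)

# Hypothetical discriminator outputs: probability that each image is real.
d_real = np.array([0.90, 0.80, 0.95])  # scores for reference monoenergetic images
d_fake = np.array([0.20, 0.10, 0.30])  # scores for generated images

loss_d = discriminator_loss(d_real, d_fake)
loss_g = generator_loss(d_fake)
```

In this toy case the discriminator separates the two groups well, so its loss is low while the generator loss is high. Training the two networks simultaneously means alternating updates that push these two losses in opposite directions.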
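As a rough illustration of instance normalization, the following NumPy sketch normalizes each channel of each training example over its spatial dimensions. The NHWC array layout, the `eps` value, and the toy batch are assumptions for illustration; the learnable scale and offset that frameworks usually attach to this layer are omitted.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Normalize every (example, channel) slice of an NHWC batch over height and width."""
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=(1, 2), keepdims=True)  # one mean per example and channel
    var = x.var(axis=(1, 2), keepdims=True)    # one variance per example and channel
    return (x - mean) / np.sqrt(var + eps)

# Toy minibatch: 4 single-channel 64 x 64 images with CT-like intensity spread.
rng = np.random.default_rng(0)
batch = rng.normal(loc=40.0, scale=200.0, size=(4, 64, 64, 1))
normed = instance_norm(batch)
```

Because the statistics are computed per example rather than per minibatch, the result does not depend on the batch size, which is convenient when the batch size is constrained by GPU memory.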

2.C | Evaluation
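The prediction accuracy of the model for synthetic and real monoenergetic CT images was evaluated using the following five metrics: relative mean absolute error (MAE), relative root mean square error (RMSE), structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), and mutual information (MI). These metrics are defined as follows:
$$\mathrm{MAE} = \frac{1}{n_x n_y} \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} \left| r(i,j) - t(i,j) \right|$$
Here, $r(i,j)$ is the value of pixel $(i,j)$ in the synthetic CT image, $t(i,j)$ is the value of pixel $(i,j)$ in the reference image, and $n_x n_y$ is the total number of pixels. RMSE is defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n_x n_y} \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} \left( r(i,j) - t(i,j) \right)^2}$$
The SSIM combines structure, contrast, and luminance to compute a similarity score between two images.
The SSIM between two images $x$ and $\tilde{y}$ can be computed as in Ref. [13]:
$$\mathrm{SSIM}(x,\tilde{y}) = \frac{(2\mu_x\mu_{\tilde{y}} + C_1)(2\sigma_{x\tilde{y}} + C_2)}{(\mu_x^2 + \mu_{\tilde{y}}^2 + C_1)(\sigma_x^2 + \sigma_{\tilde{y}}^2 + C_2)}$$
$C_1 = (k_1 Q)^2$ and $C_2 = (k_2 Q)^2$ are constants that are used to prevent a zero denominator and to maintain the stability of the formula. $Q$ is the maximum CT value for the synthetic and reference images. The values of $k_1$ and $k_2$ are generally obtained from Ref. [14]. $\sigma_x$ is estimated in the discrete form as follows, where $N = n_x n_y$ is the number of pixels:
$$\sigma_x = \left( \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu_x)^2 \right)^{1/2}$$
The correlation between $x$ and $\tilde{y}$ is measured by $\sigma_{x\tilde{y}}$, which is expressed as follows:
$$\sigma_{x\tilde{y}} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu_x)(\tilde{y}_i - \mu_{\tilde{y}})$$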
Here, $\mu_x$ is the mean intensity, which can be expressed as follows, with $N$ the number of pixels:
$$\mu_x = \frac{1}{N} \sum_{i=1}^{N} x_i$$
The PSNR is calculated as follows:
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{\mathrm{MAX}^2}{\mathrm{MSE}} \right)$$
Here, MAX and MSE are the possible maximum signal intensity and the mean square error (or difference) of the image, respectively. The MI is used as a cross-modality similarity measure [15] and is calculated as follows:
$$\mathrm{MI}(I_r, I_t) = \sum_{m} \sum_{n} p(m,n) \log \frac{p(m,n)}{p(m)\,p(n)}$$
Here, $m$ and $n$ are the intensities in the reference monoenergetic CT image $I_r$ and synthesized monoenergetic CT image $I_t$, respectively. $p(m,n)$ is the joint probability density of $I_r$ and $I_t$, whereas $p(m)$ and $p(n)$ are the marginal densities. Furthermore, $p(m,n)$ can be calculated as follows:
$$p(m,n) = \frac{h(m,n)}{\sum_{m} \sum_{n} h(m,n)}$$
where $h(m,n)$ is the joint histogram of the pixel values in the reference monoenergetic CT image $I_r$ and synthesized monoenergetic CT image $I_t$.
Furthermore, the difference between the synthesized and reference monoenergetic CT numbers in the region of interest (ROI) was evaluated for several slices, from the feet to the chest, in a manually drawn ROI, as depicted in Fig. 4.
FIG. 1. GAN framework. The generator learns to generate monoenergetic CT images of an anatomy similar to the kV-CT images. Meanwhile, the discriminator learns to discriminate between the synthetic and real monoenergetic CT images.
FIG. 2. Generation and testing of the prediction model. Model performance was evaluated via cross-validation.
Evaluation of the predictive model based on the number of samples. The training dataset was reduced to one-half and one-quarter.

3 | RESULTS
[...] images [17]. Future work will evaluate the image quality and lesion detectability arising from the difference in scale resolution between the single-energy CT (SECT) and DECT images.
The monoenergetic CT image at 140 keV indicated the smallest values for MAE, MSE, and RMSE but the largest for PSNR and SSIM.
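For readers who want to reproduce this kind of comparison, the metrics named above could be computed along the following lines. This is a sketch under stated assumptions, not the implementation used in the study: the SSIM is evaluated over a single global window, `k1 = 0.01` and `k2 = 0.03` are commonly used defaults, the histogram uses 64 bins, and the two arrays are random stand-ins for a reference and a synthesized image.

```python
import numpy as np

def mae(ref, syn):
    return float(np.mean(np.abs(syn - ref)))

def mse(ref, syn):
    return float(np.mean((syn - ref) ** 2))

def rmse(ref, syn):
    return float(np.sqrt(mse(ref, syn)))

def psnr(ref, syn, max_value):
    # max_value is the maximum possible signal intensity (MAX in the text).
    return float(10.0 * np.log10(max_value ** 2 / mse(ref, syn)))

def ssim_global(ref, syn, q, k1=0.01, k2=0.03):
    """SSIM over the whole image as one window; q is the maximum CT value."""
    c1, c2 = (k1 * q) ** 2, (k2 * q) ** 2
    mu_x, mu_y = ref.mean(), syn.mean()
    var_x, var_y = ref.var(ddof=1), syn.var(ddof=1)
    cov_xy = np.sum((ref - mu_x) * (syn - mu_y)) / (ref.size - 1)
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

def mutual_information(ref, syn, bins=64):
    """MI (in nats) estimated from the joint histogram of pixel values."""
    hist, _, _ = np.histogram2d(ref.ravel(), syn.ravel(), bins=bins)
    p_mn = hist / hist.sum()               # joint probability p(m, n)
    p_m = p_mn.sum(axis=1, keepdims=True)  # marginal p(m), column vector
    p_n = p_mn.sum(axis=0, keepdims=True)  # marginal p(n), row vector
    nz = p_mn > 0                          # skip empty bins to avoid log(0)
    return float(np.sum(p_mn[nz] * np.log(p_mn[nz] / (p_m @ p_n)[nz])))

# Random stand-ins for a reference image and a slightly noisy synthetic image.
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 2000.0, size=(128, 128))
synthetic = reference + rng.normal(0.0, 20.0, size=reference.shape)
q = float(reference.max())

scores = {
    "MAE": mae(reference, synthetic),
    "RMSE": rmse(reference, synthetic),
    "PSNR": psnr(reference, synthetic, q),
    "SSIM": ssim_global(reference, synthetic, q),
    "MI": mutual_information(reference, synthetic),
}
```

Lower MAE, MSE, and RMSE and higher PSNR, SSIM, and MI all indicate closer agreement with the reference image. Published SSIM values are usually averaged over local sliding windows, so the global variant here will not match them numerically.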
In the monoenergetic CT image at a high energy, the contrast scale of the monoenergetic CT number from low to high density was smaller than that at a lower energy. It is therefore easier to predict the pixel values for small contrast scales at high energies. Notably, the MAE, MSE, RMSE, PSNR, and SSIM values were found to be dependent on the contrast scale of the monoenergetic CT images.
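The dependence of absolute error metrics on the contrast scale can be seen in a toy calculation. The sketch below assumes, purely for illustration, that the prediction error is proportional to the pixel value and identical in relative terms at both energies; the contrast scales of 3000 and 1000 are hypothetical numbers, not measured CT values.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "anatomy" in [0, 1]; multiplying by a contrast scale stretches it to a CT-number range.
pattern = rng.uniform(0.0, 1.0, size=(128, 128))
# The same fractional prediction error is applied in both cases (assumption).
relative_error = rng.normal(0.0, 0.02, size=pattern.shape)

def mae_for_scale(contrast_scale):
    reference = contrast_scale * pattern
    synthetic = reference * (1.0 + relative_error)
    return float(np.mean(np.abs(synthetic - reference)))

mae_wide = mae_for_scale(3000.0)    # hypothetical wide contrast scale (low energy)
mae_narrow = mae_for_scale(1000.0)  # hypothetical narrow contrast scale (high energy)
ratio = mae_wide / mae_narrow       # equals the ratio of the contrast scales, 3.0
```

Under this assumption the MAE scales linearly with the contrast range, so a smaller MAE at high energy does not by itself show that the relative prediction quality is better.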
The MI for the monoenergetic CT image at 70 keV was the largest. The current study demonstrates that highly accurate monoenergetic CT images can be generated from kV-CT images using a GAN [...] therefore, it will be performed in a future study.

| CONCLUSION
Synthetic medical image generation can be a cost-saving approach for developing automated diagnostic technology. An image prediction framework that generates monoenergetic CT images from equivalent kV-CT images was proposed. It is expected that the proposed model can serve as a suitable alternative to the existing methods for the reconstruction of monoenergetic CT images in DECT from SECT images.

INFORMED CONSENT
Informed consent was obtained from all individual participants included in the study.

ETHICAL APPROVAL
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

CONFLICT OF INTEREST
The authors have no relevant conflicts of interest to disclose.

ADVANCES IN KNOWLEDGE
We created a new image prediction model for monoenergetic CT images synthesized from kV-CT images.