Faster RCNN‐based detection of cervical spinal cord injury and disc degeneration

Abstract Magnetic resonance imaging (MRI) can indirectly reflect microscopic changes in lesions of the spinal cord; however, the application of deep learning to MRI for classifying and detecting lesions of cervical spinal cord diseases has not been sufficiently explored. In this study, we implemented a deep neural network to detect lesions caused by cervical diseases on MRI. We retrospectively reviewed the MRI of 1,500 patients irrespective of whether they had cervical diseases. The patients were treated in our hospital from January 2013 to December 2018. We randomly divided the MRI data into three groups: a disc group (800 cases), an injured group (200 cases), and a normal group (500 cases). We designed the relevant parameters and used a faster region-based convolutional neural network (Faster R-CNN), combined with a backbone convolutional feature extractor (ResNet-50 or VGG-16), to detect lesions on MRI. Experimental results showed that the prediction accuracy and speed of Faster R-CNN with ResNet-50 and VGG-16 in detecting and recognizing lesions on cervical spinal cord MRI were satisfactory. The mean average precisions (mAPs) for Faster R-CNN with ResNet-50 and VGG-16 were 88.6% and 72.3%, respectively, and the testing times were 0.22 and 0.24 s/image, respectively. Faster R-CNN can identify and detect lesions on cervical MRI. To some extent, it may aid radiologists and spine surgeons in their diagnoses. The results of our study can provide motivation for future research combining medical imaging and deep learning.

as swelling and asymmetry) based on variations in water molecules by measuring alterations in the intensity of tissue signals. 2 Furthermore, damage caused by cervical disc degenerative disease (DDD) and traumatic spinal cord injury (SCI) can be confirmed by MRI, which is the basis for identifying spinal cord diseases and neurological recovery. 3,4 Diagnosis is therefore a key step in treating and controlling soft-tissue lesions, and MRI can provide a better diagnostic tool when the diagnosis is uncertain. 5,6 Recently, deep learning-based models, especially convolutional neural network (CNN) models, have proven effective in object detection.
Convolutional neural network models have been applied in several medical disciplines, including radiology, 7,8 pathology, 9 dermatology, 10 and ophthalmology. 11 Previous studies focused primarily on brain diseases rather than diseases of the spinal cord because MRI has been successful in diagnosing brain-related illnesses. In addition, spinal cord diseases exhibit more variation in their morphology and signals on sagittal MRI. [12][13][14][15] Only a few studies have investigated spinal cord diseases on MRI using CNN models. Gros et al. used a sequence of two CNNs to segment the spinal cord and/or intramedullary multiple sclerosis lesions on a multi-site clinical dataset, and their segmentation method showed better results than previous CNN models. 16 However, the spinal cord diseases they studied did not have specific locations and usually occurred in multiple areas, such as the brain, cerebellum, and lateral ventricles. In addition, MRI data on cervical diseases are insufficient, which has hindered research on object detection and segmentation. DDD and SCI are the most common diseases of the cervical spine in clinical medicine, and sagittal MRI is increasingly recognized for its contribution to assessing disease severity in patients with SCI and DDD. 17,18 However, the classification and detection of DDD and SCI lesions on MRI images using deep neural networks appear to be limited, as published studies in this regard are lacking in the literature. In addition, the performance of traditional deep learning methods is unsatisfactory, and traditional CNNs have several defects. 8,19,20 To address this, some novel algorithms with powerful object-detection capability have been proposed; an example is Faster R-CNN, 19 which offers advantages in terms of accuracy and detection speed.
Therefore, we investigated the feasibility of using a faster region-based convolutional neural network (Faster R-CNN) 19 combined with pre-trained VGG/ResNet 21 networks (as feature extractors) to identify and detect spinal cord diseases on the MRI dataset used in this study. Experimental results show that this method has good recognition performance.

2.A | Data collection
Patients with cervical diseases were admitted to our hospital between January 2013 and December 2018. Two diseases were used as inclusion criteria: cervical DDD and traumatic SCI, which mainly refer to cervical disc herniation and injury-related changes in the spinal cord signal, respectively. Spinal cord tumors, syringomyelia, motor neuron disease (MND), and peripheral polyneuritis were used as exclusion criteria.
Patients underwent cervical spine MRI performed by radiologists using a surface coil at 1.5 or 3.0 T. The MRI included T1-weighted images (T1WI), T2-weighted images (T2WI), and short tau inversion recovery (STIR) or fat saturation (FS) images; STIR and FS can be regarded as one type. 22,23 In clinical procedures, an MRI examination usually includes three types of images: T1WI, T2WI, and FS (or STIR); the typical changes observed in tissue on MRI are listed in Table 1. Based on the imaging results and disease incidence, a total of 1000 patients were enrolled from the picture archiving and communication systems (PACS) station, including 690 men and 310 women. In addition, data from 500 people diagnosed as negative (without DDD or SCI) were collected to obtain better training results. The patients were divided into three groups: "normal group," "disc group," and "injured group." Finally, all images were desensitized before use (e.g., removal of name, age, examination date, and sex).

2.B | Data preparation
The dataset was randomly split into two parts: 1200 (80%) patients for training and 300 (20%) patients for validation, simulating the proportion of disease incidence in reality. Additionally, 500 MRI images were set aside as a testing set to demonstrate detection performance; the numbers of images in the "normal group," "disc group," and "injured group" were 200, 200, and 100, respectively.
The training and validation sets used bounding boxes to indicate lesion locations, as shown in Fig. 1, whereas the "normal group" without bounding boxes is shown in Fig. 2. Two experienced spine surgeons labeled the bounding boxes using the LabelMe toolbox. Before feeding the dataset into the network, we center-cropped the images to eliminate differences in the raw data generated from the PACS station. Finally, the number of images in the dataset was increased by a factor of 10 through horizontal flipping and contrast enhancement.
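The two augmentation operations named above, horizontal flipping and contrast enhancement, can be sketched as follows. This is a minimal illustration, not the study's exact pipeline: the contrast factors are assumptions, since the parameters that produced the tenfold increase are not reported.

```python
import numpy as np

def augment(image, contrast_factors=(0.8, 1.2)):
    """Return a list of augmented variants of a grayscale MRI slice.

    Illustrative only: horizontal flip plus linear contrast scaling
    around the mean intensity; the factors are assumed values.
    """
    variants = [image]
    # Horizontal flip mirrors the slice left-to-right.
    variants.append(image[:, ::-1])
    # Contrast enhancement: scale pixel deviations around the mean.
    mean = image.mean()
    for f in contrast_factors:
        enhanced = np.clip((image - mean) * f + mean, 0, 255)
        variants.append(enhanced.astype(image.dtype))
    return variants
```

Note that when an image carries a lesion bounding box, a horizontal flip must also mirror the box's x-coordinates so the annotation stays aligned.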

2.C | Overview of the study
In clinical diagnosis, the commonly used weighted images of cervical MRI are T1WI, T2WI, and FS. Doctors need to combine the information from the three different images to comprehensively judge the patient's condition. 5,6 In typical medical classification and detection problems, the three types of images are used as independent inputs; the different weighted images then effectively serve as a form of data augmentation. Other methods stack the three single-channel images along the third dimension and treat the result as a three-channel image. However, both approaches have problems. The former can cause the same patient to be diagnosed with different diseases depending on which weighted image is input, producing confusing results. The latter uses the combination of three single-channel images to simulate a three-channel color image; however, because the pixel-value distribution of such a synthetic image is inconsistent with that of large real-image datasets such as ImageNet, a pre-trained model becomes less efficient, and the network cannot be pre-trained on other medical datasets.

TABLE 1 Characteristics of tissues in magnetic resonance imaging.
Tissue signals shown in MRI
In this dataset, each image is divided into three categories: normal, SCI, and DDD. SCI and DDD signals do not appear on any image at the same time, which means that the classification information of the three images is the same for a patient, although the location of the bounding box obtained by the three images on the network may be slightly different. Therefore, integrating the classification information of the three images becomes a key problem in the design of the network structure.
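Since the three weighted images of one patient share the same class label, their per-image predictions must be integrated into a single patient-level result. One illustrative way to do this, a hypothetical fusion rule rather than the scheme used in the study, is a majority vote with a score-based tie-break:

```python
from collections import Counter

def fuse_patient_prediction(per_image_labels, per_image_scores):
    """Fuse per-sequence (T1WI, T2WI, FS/STIR) class predictions into one
    patient-level label. Majority vote; ties are broken by the highest
    mean detection score per class. Hypothetical fusion rule.
    """
    votes = Counter(per_image_labels)
    best_count = max(votes.values())
    tied = [label for label, c in votes.items() if c == best_count]
    if len(tied) == 1:
        return tied[0]

    # Tie-break: choose the tied class with the highest mean score.
    def mean_score(label):
        scores = [s for l, s in zip(per_image_labels, per_image_scores)
                  if l == label]
        return sum(scores) / len(scores)

    return max(tied, key=mean_score)
```

For example, if two of the three sequences predict "DDD", the patient-level label is "DDD" regardless of the third sequence's output.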
Fig. 1. Lesions on magnetic resonance imaging (MRI) annotated with a bounding box by two spine surgeons. Images a-f show typical T1WI, T2WI, and STIR examples of the "disc group" labeled with the region of interest (ROI); images h-m show examples of the "injured group" marked with the ROI.

The dataset is also naturally imbalanced with respect to the lesion classes: the "disc group" clearly dominates, accounting for 80% of the images in our training dataset.
We used Faster R-CNN as the main structure of the network.
Currently, Faster R-CNN is the most popular two-stage detection network and is used in many medical image detection problems. 22,23 Faster R-CNN mainly comprises a feature extractor, a region proposal network (RPN), RoI pooling, and a classifier. It consists of two modules. The first is the RPN, a fully convolutional network (FCN) that generates object proposals, which are fed into the second module. The second is the Fast R-CNN 24 detector, whose purpose is to refine the proposals; a sketch map of the detection process is shown in Fig. 3. Notably, the RPN and Fast R-CNN share convolutional layers to save computation time. 19 Specifically, the RPN is an FCN that simultaneously predicts object bounds and objectness scores at each position. The detailed parameters are listed in Table 2, and Fig. 5 shows some examples from the study.
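At each feature-map position, the RPN evaluates k reference anchor boxes of multiple scales and aspect ratios and regresses proposals from them. The sketch below generates such anchors in image coordinates; the stride, scales, and ratios are the defaults from the original Faster R-CNN paper and are assumptions here, not necessarily the settings used in this study.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Generate RPN anchor boxes (x1, y1, x2, y2) for every position of a
    feature map. With 3 scales x 3 ratios, there are k = 9 anchors per
    position, as in the original Faster R-CNN configuration.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Anchor centre mapped back to input-image coordinates.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # Fix the anchor area per scale while varying the
                    # aspect ratio h/w = ratio.
                    area = (stride * scale) ** 2
                    w = np.sqrt(area / ratio)
                    h = w * ratio
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)
```

The RPN then scores each anchor for "objectness" and refines its coordinates; only the top-scoring refined proposals are passed to the Fast R-CNN head.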

3 | RESULTS
First, the datasets were desensitized, and the scales were normalized and labeled. Examples of the imaging studies are illustrated in Figs. 1-4. Next, to comprehensively evaluate the proposed detection system, the testing set was also collected from the PACS system to ensure data uniformity; a sketch map of the training and detection process is shown in Fig. 4. Figure 5 illustrates several examples of visualization images (including disc herniation and spinal cord injury) from the testing dataset processed by our model. Figure 6 shows the prediction results of the VGG-16/ResNet-50 models. In the obtained images, in addition to the bounding box, the corresponding damage probability also marks the damaged area of the image. The residual connection of ResNet not only allows the gradient to pass through the shortcut to alleviate gradient vanishing, but also allows the model to learn an identity function, which ensures that the performance of higher layers is at least as good as that of lower layers. 21 The main structure diagram of ResNet is shown in the study published by He et al. 29
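Detections such as those visualized in Figs. 5 and 6 are typically scored against the ground-truth bounding boxes by intersection-over-union (IoU) when computing mAP; a detection counts as a true positive when its IoU with a ground-truth box exceeds a threshold (0.5 is the conventional value, assumed here since the paper does not state its matching criterion). A minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

With matches decided this way per class, precision-recall curves are built from the ranked detections, and mAP is the mean of the per-class average precisions.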

4 | DISCUSSION
Schmidhuber, in his study, 30 ... Third, the number of parameters of a single sub-network does not increase much. Therefore, additional data are not required to prevent overfitting caused by a large number of network parameters.
Our study has some limitations. First, deep learning methods for object detection are updated very quickly, and the method we adopted may lag behind the latest methods. Second, the dataset used in the study is limited to our hospital's imaging system. Although this ensures data uniformity, it also results in a smaller data volume than other databases, as well as poor scalability. A dataset drawn from multiple hospitals would greatly improve the persuasiveness and practicality of the experiment.

5 | CONCLUSION
In this study, we implemented a Faster R-CNN combined with a backbone convolutional feature extractor (ResNet-50 or VGG-16) to detect lesions on cervical MRI images. Experimental results showed that Faster R-CNN improves the possibility of diagnosing lesions from cervical MRI. Indirectly, our study showed that deep learning can help detect cervical diseases on MRI, which is a valuable addition to the field. In the future, we hope that