Impact of image preprocessing methods on reproducibility of radiomic features in multimodal magnetic resonance imaging in glioblastoma

Abstract To investigate the effect of image preprocessing, in respect to intensity inhomogeneity correction and noise filtering, on the robustness and reproducibility of the radiomics features extracted from the Glioblastoma (GBM) tumor in multimodal MR images (mMRI). In this study, for each patient 1461 radiomics features were extracted from GBM subregions (i.e., edema, necrosis, enhancement, and tumor) of mMRI (i.e., FLAIR, T1, T1C, and T2) volumes for five preprocessing combinations (in total 116 880 radiomics features). The robustness and reproducibility of the radiomics features were assessed under four comparisons: (a) Baseline versus modified bias field; (b) Baseline versus modified bias field followed by noise filtering; (c) Baseline versus modified noise, and (d) Baseline versus modified noise followed bias field correction. The concordance correlation coefficient (CCC), dynamic range (DR), and interclass correlation coefficient (ICC) were used as metrics. Shape features and subsequently, local binary pattern (LBP) filtered images were highly stable and reproducible against bias field correction and noise filtering in all measurements. In all MRI modalities, necrosis regions (NC: n ® ~449/1461, 30%) had the highest number of highly robust features, with CCC and DR >= 0.9, in comparison with edema (ED: n ® ~296/1461, 20%), enhanced (EN: n ® ~ 281/1461, 19%) and active‐tumor regions (TM: n ® ~254/1461, 17%). The necrosis regions (NC: n¯ ~ 449/1461, 30%) had a higher number of highly robust features (CCC and DR >= 0.9) than edema (ED: n¯ ~ 296/1461, 20%), enhanced (EN: n¯ ~ 281/1461, 19%) and active‐tumor (TM: n¯ ~ 254/1461, 17%) regions across all modalities. Furthermore, our results identified that the percentage of high reproducible features with ICC >= 0.9 after bias field correction (23.2%), and bias field correction followed by noise filtering (22.4%) were higher in contrast with noise smoothing and also noise smoothing follow by bias correction. These preliminary findings imply that preprocessing sequences can also have a significant impact on the robustness and reproducibility of mMRI‐based radiomics features and identification of generalizable and consistent preprocessing algorithms is a pivotal step before imposing radiomics biomarkers into the clinic for GBM patients.


| INTRODUCTION
Glioblastoma (GBM) is the most aggressive malignant type of brain tumor, commonly occurs (de novo). Regardless of the progressions in treatment, the prognosis of GBM remains poor (median overall survival is 14 months) mainly due to the fact that GBM is remarkably heterogeneous over time and across patients. 1 Nowadays, there is arisen interest to characterize tumor heterogeneous and phenotypes based on the high-throughput quantified features extracted from the clinical standard of care image for providing image-based biomarkers relating to the pathologic, genomic, proteomic, and clinical data, which is wellknown as radiomics. 2 It is expected that radiomics takes an essential role in the current clinical oncology workflow, given that can be acquired noninvasively, and with no extra cost at any time of the treatment procedure. 3 However, the main issue and challenging for the clinical applicability of the radiomics is the reliability and repeatability of the radiomics features 4 across multi-centers. Radiomics features, reliability and reproducibility can be affected by various aspects of radiomics processing (e.g., image acquisition parameters and protocols, image preprocessing algorithms, tumor segmentation, and software used for processing and feature extractions). Major of radiomics studies by concerning a different aspect of radiomics reproducibility and repeatability issue was done in computed tomography (CT) and PET modalities for limited cancer types, 5,6 and a few studies have been reported in MRI. 7 Magnetic resonance imaging (MRI) is generally used for standard clinical care of GBM patients (i.e., Diagnoses, monitor tumor progression, and treatment response assessment). Given that MRI undergoes of various inherent acquisition artifacts and noises such as lack of standard intensity for inter-and intra-scanner variability even for the same protocol, body region, and patient; intensity non-uniformity as a result of reduced radio frequency, coil uniformity, nonlinear fields, gradient field, magnetic field, and etc.; image preprocessing method suchlike intensity normalization, bias field correction, and noise smoothing can facilitate quantitative MRI analysis and make the radiomics results more repeatable and comparable. 8,9 Currently, many attentions of the GBM mMRI-based radiomics studies were drowned to prognosis and prediction model, while they have not used a pre-specific image preprocessing pipeline. For an instant, in a mMRI radiomics study 10 co-registration, resampling (1mm 3 ), and histogram intensity normalization, in other work 11 registration, skull stripping, bias field correction, and intensity normalization, and in Ref. [ 12 ] skull stripping, registration, bias field correction and histogram matching, were implemented in mMRI preprocessing steps.
Also, in Ref. [ 13 ] noise reduction, bias field correction, skull stripping, rigid registration, and intensity normalization, in another study 14 coregistration, noise smoothing, bias field correction, and skull stripping, in Ref. [ 15 ] co-registration, resampling (1 mm 3 ), skull stripping, noise smoothing, and intensity normalization, was used as preprocessing steps in their mMRI GBM radiomics researches. MRI preprocessing can have a considerable impact on the whole radiomics analysis.
In this study, the effect of various MRI preprocessing sequences was investigated on the reliability and reproducibility of the radiomics features extracted from multi-regional GBM tumor (i.e., Edema, Enhancement, Necrosis, and Tumor) of mMRI (FLAIR, T1, T2weighted (T2), and T1C), by focusing on intensity inhomogeneity   correction and noise smoothing. To this end five different preprocessing cohorts were performed, and four comparisons were assessed, including (a) Baseline versus modified bias field, (b) Baseline versus modified bias field followed by noise smoothing, (c) Baseline versus modified noise reduction, and (d) Baseline versus modified noise followed bias field correction. The effect of bias field correction and noise filtering on the reproducibility of each extracted radiomics feature were quantified using concordance correlation coefficient (CCC), dynamic range (DR), and interclass correlation coefficient (ICC). The schematic of our workflow was depicted in Fig. 1. Experimental results have demonstrated that various preprocessing and sequences have affected the reproducibility and robustness of radiomics features, and identify a generalizable preprocessing step is crucially needed for providing clinical radiomics biomarkers. It is expected that our results can support researchers to select more reliable and repeatable MRI preprocessing sequences and radiomics features.

2.A | Data collection
Data were downloaded from the Cancer Genome Atlas Glioblastoma Multiform (TCGA-GBM) 16 series, which is publically available in the Cancer Imaging Archive (TCIA) 17 at Data Portal at [https://wiki.cance rimagingarchive.net/display/Public/TCGA-GBM]. The TCGA-GBM series consisted of the pre-operative and in some cases follow up MR images of 262 GBM patients and provided the pathological, genetic, and Clinical data of patients. These image sets collected from eight research centers with various scanners modalities, manufacturers, and acquisition protocols. All data were without a patient identifier, so not required for approval of the institutional review board.
In this research, 65GBM patients (40 males and 25 females with an average age of 60 yr) were selected from TCGA-GBM series based on the availability of pre-operative mMRI included fluid-attenuated inversion recovery (FLAIR), T1-weighted, T2-weighted, and post-contrast T1-weighted (T1C) image sets and survival time. Further, information regarding the scanners and demographic data of these subsets of patients can be found in Table 1.

2.B | Preprocessing
Preprocessing is a series of the transformation applied to an initial image for improving the image quality and making statistical analysis more repeatable and comparable. However, for the brain MRI, there is not any pre-specified analytic format for preprocessing and depends on the condition may diversify.
In this investigation, co-registration, resampling, skull stripping, and intensity normalization was applied as the baseline preprocessing pipeline, based on the crucial role of these methods in facilitating the radiomics analysis and accurate quantitative comparison across multi-modal MRI volumes. 18 An example of our baseline preprocessing pipelines on mMRI of a GBM patient was shown in Fig. 2.
Co-registration is the process of the mapping between two images to the reference coordinate system. Resampling 19 is commonly performed to standardize the voxel size of the database with a unique voxel resolution (e.g., 1 mm 3 ) and to correct for the differences of the scanner, pixel size, and slice thickness within single or multi-center cohort studies. Skull stripping 20 is accomplished to remove extra cerebral tissue from brain volume and increase speed and precision of subsequent MRI processing and establishing a robust normalized result.
Variation in the brain MRI intensity distribution 21 because of the difference in inter-and intra-scanners sensitivity and acquisition parameters leads to complicated in image quantitative analysis, even for the same protocol, tissue, patient, and scanner. Consequently, intensity normalization is essential for providing the same tissue intensity scale in brain MR images across all observations to facilitate radiomics analysis 9 and accurate quantitative comparison between MRI volumes as is expressed by Eq. (1), g R and g I are reference image and the original image intensity respectively, L R and H R are respectively the low and high reference image intensity range.
Image preprocessing may also apply to correct other important inherent MRI acquisition artifacts such as intensity non-uniformity and Gaussian noise. Intensity non-uniformity, or intensity inhomogeneity, or bias of magnetic field manifests as a low-frequency signal on MRI as a result of various causes such as fluctuation of the magnetic field. 22 This artifact can cause variation in the intensities of the cerebral tissue regions such as gray matter, white matter, or cerebral-spinal fluid in the different location within the brain MRI volume.
Noise is presented as high-frequency, intensity variations in the image. Noise filtering 23 is commonly conducted to increase the signal-to-noise ratio (SNR) while preserving the high-spatial-frequency information of the underlying image.
Various preprocessing pipelines were provided in different volumes to investigate the interplay between image preprocessing sequences, concerning intensity inhomogeneity correction and noise filtering, for mMRI on reproducibility and reliability of radiomics features. All original MRI volumes were converted to NIFTI format by using MRIcron through the dcm2nii and then were subjected to the following preprocessing pipelines.
The workflow of radiomics study. Multimodal magnetic resonance images (FLAIR, T1, T2, and T1C) were subjected to several preprocessing pipelines (A, B, C, D, and E). Glioblastoma (GBM) tumor was segmented. Feature extraction was performed on multi-regional GBM tumor by Pyradiomics. The reproducible radiomics features were selected based on reproducible metrics. FLAIR = fluid-attenuated inversion recovery, T1C = post contrast T1 weighted BCo-registration, resampling, skull stripping, bias field correction, and intensity normalization.
Co-registration was performed by using T1 mapping 24 from SRI24 atlas 25 image as the reference image for matching each mMRI (FLAIR, T1, T2, and T1C) volume. All images were resampled to 1 mm × 1 mm × 1 mm isotropic voxels. Skull stripping was done based on the ITK filter; the reference atlas and its relevant reference brain mask were specified by SRI24 atlas. 25 N4ITK, 28 a well-known intensity inhomogeneity correction, was to correct intensity non-uniformity. The model of image structure in this bias correction method in MR is considered to be the form of Eq. (3), where C is the corrupted image, I is the uncorrupted image, B is bias field, and n is the Gaussian noise.
The image model can become,Ĉ ¼Î þB by taking the logarithm  4), S Ã 0 f g is a modified B-spline estimator, and B n r is the measured remaining bias field at the nth reiteration.

2.C | Tumor segmentation
In radiomics studies, tumor segmentation is also a common source of variation in quantitative image analyzing because of intra-and inter-reader and software variability. To minimize the effects of tumor segmentation on our radiomics reproducibility analyze, the results of previously published work, 15 publically accessible through TCIA, were used for tumor segmentation labels. GLISTR boost, the modified version of GLISTR (GLioma Image SegmenTation and Registration) software, was applied for tumor segmentation based on a hybrid generative-discriminative model.
To segment the brain images into normal and tumor tissue labels, concomitant registration, the atlas of normal brain scans to brain scans with GBM tumors, and segmentation approaches based on an Expectation-Maximization (EM) framework using the tumor growth model were applied (a generative part, i.e., GLISTR).
To provide more reliable and accurate tumor labels, gradient boosting multi-class classification was implemented according to the information of the multi patients (a discriminating part). Tumor segmentation It analyzes high-spatial frequency phenomena localized in space, thus it can effectively extract information derived from localized high-frequency signals. Wavelet transform was calculated per eight decomposition levels applying either a high pass filter or a low pass filter in the original volumes.
Laplacian of Gaussian (LoG) 39 filter is the process of two steps, firstly smooth image by using a Gaussian filter, then applying the Laplacian to find edges (areas of gray level rapid change) as is calculated by Eq. (6), (a, b) is the spatial coordinates of the image pixel, 2. An example of our baseline preprocessing steps on a single slice of multimodal magnetic resonance imaging (MRI) (FLAIR, T1, T2, T1C) glioblastoma patient. FLAIR = fluid-attenuated inversion recovery, T1C = Post contrast T1 weighted and σ is the kernel size of the Gaussian filter. In this study, the LoG filter was implemented by two kernel size values (σ = 1, and 2 mm).
Square and square root image filter are tagged as Gamma modifiers. The square filter is accomplished by taking the square of image intensities (g I ) is given by Eq. (7), and the square root filter by taking the square root of the absolute value of image intensities is presented in Eq. (8), then the results are modified into the original image intensity range.
The radiomics features classes implemented in this study can be found in Table 2. PyRadiomics, as a standard open-source Python package, was employed for implementing a streamlined and reproducible standard tested platform for the radiomics features extraction task.

2.E | Statistical analysis
In this study, the concordance correlation coefficient (CCC) and the The CCC calculated as described with 95% confidence intervals to test the agreement between raters 41 by Eq. (11), μ is a mean value, and σ refers to the standard deviation, related to this work it was between the respective values of the baseline and modified preprocessed images.
Besides, for assaying inter-rater reliability intraclass correlation coefficient (ICC) is an appropriate index, 42 which evaluates both sides of the correlation and agreement among raters, is given by Eq. (12).
The range of DR is between 0 and 1, and features with the DR values closer to one, have more inclination to be considered as robust features against perturbation and hold comparatively reliable information.
In this study, radiomic features with CCC and DR > 0.9 were considered as robust and reproducible features; and the Intraclass correlation coefficient (ICC) with 95% confident intervals, higher than 0.9 and between 0.75-0.9 considered as excellent and good reproducible cohorts, respectively. 44 Statistical analysis was implemented with R software; version 3.5.1, including "EPIR" and "IRR" packages, were used for calculating CCC and ICC respectively.
Shape ( Besides, our results reconfirmed that shape features are less influenced by bias field and noise artifacts. This can be easily interpreted due to shape features are not or less correlated to the gray level intensity distribution. Shape features reproducibility were predominantly reported by multiple radiomics studies, mostly in CT modality 5 and in the limited literature for MR images. 47,48 Furthermore, the stability of LBP-filtered was substantially higher than the other filtered (i.e., wavelet, LoG, square, and square root) volumes. In contrast, square-filtered images were the most sensitive to the intensity inhomogeneity and had the least number of reproducible features.
Local binary pattern (LBP) is a practical texture feature with excellent image descriptor ability, 40 it provides local-level feature extraction to identify small gray level differences. Although there is plausible evidence of the outstanding performance of LBP filter in various aspects, it has been rarely used in radiomic studies. 6 Moreover, the total numbers of reproducible features for bias field correction and bias filed correction followed by noise smoothing, were higher than noise reduction or noise smoothing followed by bias field correction over MRI sequences. This finding is consolidated by previous literature, which indicated that bias field correction preceded noise reduction more improving the MR image quality than noise filtering followed by bias field correction. 49,50 Even though the qualities of our findings are robust, the reproducibility and repeatability of radiomics features were investigated on only five popular preprocessing combinations. Further validation is needed to identify the optimum and more generalizable image preprocessing method for MR image standardization in GBM patients.

| CONCLUSION
In conclusion, four critical issues were deduced from our findings.
Firstly, radiomics features extracted from necrotic regions were the most robust in comparison with other GBM phenotypes (edema, and enhancing active tumor) independent of MRI sequences. Secondly, shape features were the most robust and reproducible in all MRI sequences and GBM phenotypes. Thirdly, the LBP-filtered images were less sensitive to intensity inhomogeneity, and noise artifacts, and had better reproducibility features in comparison with the other image types (i.e., original, Wavelet, LoG, square, square root). Finally, if researchers are interested in modifying intensity inhomogeneity and noise simultaneously, bias field correction followed by noise filtering introduce more stable and reproducible radiomics features than noise filtering followed by bias field correction. For future work, we will explore the added value of these features, along with the prognostic model for GBM patients.

ACKNOWLEDG MENTS
The

CONF ILICT OF INTEREST
There is no conflict of interest declared in this article.