Evaluation of repeatability and reproducibility of radiomic features produced by the fan-beam kV-CT on a novel ring gantry-based PET/CT linear accelerator
The RefleXion X1 is a novel radiotherapy delivery system on a ring gantry equipped with fan-beam kV-CT and PET imaging subsystems. The day-to-day scanning variability of radiomics features must be evaluated before any attempt to utilize radiomics features.
This study aims to characterize the repeatability and reproducibility of radiomic features produced by the RefleXion X1 kV-CT.
Materials and Methods
The Credence Cartridge Radiomics (CCR) phantom, which includes six cartridges of varied materials, was scanned 10 times on the RefleXion X1 kVCT imaging subsystem over a 3-month period using the two most frequently used scanning protocols (BMS and BMF). Fifty-five radiomic features were extracted for each ROI on each CT scan using LifeX software. The coefficient of variation (COV) was computed to evaluate the repeatability. The intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC) were used to evaluate the repeatability and reproducibility of the scanned images, using 0.9 as the threshold. The process was repeated on a GE PET-CT scanner using several built-in protocols for comparison.
On average, 87% of the features on both scan protocols on the RefleXion X1 kVCT imaging subsystem can be considered repeatable, as they met the COV < 10% criterion. On the GE PET-CT, this number was similar at 86%. When the criterion was tightened to COV < 5%, the RefleXion X1 kVCT imaging subsystem showed much better repeatability, with 81% of features on average, whereas the GE PET-CT showed only 73.5% on average. About 91% and 89% of the features had ICC > 0.9 for the BMS and BMF protocols on the RefleXion X1, respectively. In contrast, the percentage of features with ICC > 0.9 on the GE PET-CT ranged from 67% to 82%. The RefleXion X1 kVCT imaging subsystem showed excellent intra-scanner reproducibility between the scanning protocols, much better than that of the GE PET-CT scanner. For the inter-scanner reproducibility, the percentage of features with CCC > 0.9 between the X1 and GE PET-CT scanning protocols ranged from 49% to 80%.
Clinically useful CT radiomic features produced by the RefleXion X1 kVCT imaging subsystem are reproducible and stable over time, demonstrating its utility as a quantitative imaging platform.
Radiomics uses image feature analysis to evaluate an imaging modality or a patient's current status in depth, and it has the potential to predict response rates and clinical outcomes by providing prognostic and predictive imaging biomarkers.1 To do so, the repeatability and reproducibility of the modality and image analysis method are essential.2 One application of radiomics thus far has been evaluating the performance of radiomics features across different imaging modalities.2-22 However, the performance of outcome prediction is affected by many factors. For example, when evaluating the performance of a CT modality, the low-contrast detectability3 differs between manufacturers that use iterative reconstruction rather than a filtered back-projection algorithm. CT radiomics features vary with pixel size, which needs harmonizing.4 Radiomics features also depend on voxel size and gray-level discretization and need normalization.5-7 Radiomics feature consistency is affected by the scanning protocol, and the images might need a compensation approach to express all data in a common space devoid of protocol effects.8, 15, 20 The phantoms used to analyze radiomics features are made of different materials, which has certain effects on the outcome prediction.11, 19, 21 Extensive studies have been done on existing CT scanners from General Electric (GE) Healthcare, Philips Healthcare, Siemens Healthineers, and Toshiba Medical System.11, 13 However, most of those studies were done for diagnostic imaging purposes and CT modalities. In radiation therapy, CT techniques such as kV cone-beam CT (CBCT) and MVCT have been widely used for clinical patient setup alignment. The images can also be used to follow up on the patient's response and even to predict treatment outcomes.
Therefore, the image quality of kVCT in radiotherapy is also important, since kVCT radiomics may play an important role in radiotherapy management in the future. For kVCT radiomics analysis, repeatability and reproducibility are two key factors in ensuring that applications of radiomics are successful. Indeed, when a new modality is introduced into the field, it is essential to show the stability of the imaging modality before widespread clinical implementation. The repeatability and reproducibility of radiomics features from CBCT23, 24 and MVCT25 have been studied. Traverso et al.26 and Pfaehler et al.16 also reviewed the robustness of radiomic features for various imaging modalities.
A new type of modality called Biology-guided Radiation Therapy (BgRT) (RefleXion Medical Inc., Hayward, California, USA) is now being investigated.27 The first model (X1) was installed in several clinics beginning in 2021. The system integrates a PET-CT and a rotational linac to deliver radiation to the tumor target(s). The real-time BgRT function is currently approved by the US Food & Drug Administration. The machine design was described by Oderinde et al.28 The X1 has the potential to change the role of radiotherapy in metastatic cancer management.29, 30 Since the RefleXion X1 is a novel radiotherapy system, its imaging protocols and image quality for radiomics analysis have not yet been extensively studied. The repeatability and reproducibility of kVCT radiomics are critical for the application of radiomics in the clinic. This study aims to assess the repeatability and reproducibility of the RefleXion X1 kVCT imaging subsystem based on the Credence Cartridge Radiomics (CCR) phantom.
2 MATERIALS AND METHODS
2.1 CCR phantom
A CCR phantom was used in this study. It is the same phantom reported on by Ger et al. in 2018.9 The phantom has dimensions of 28 cm × 21 cm × 22 cm and contains six cylindrical cartridges (as shown in Figure 1). Each cartridge is made of a different material and is encased in a build-up region of high-density polystyrene. The materials that make up the CCR phantom were chosen such that the radiomic features mimic tumors from patients with non-small cell lung cancer. Dense cork, shredded rubber, hemp seeds in polyurethane, 50% ABS and 50% acrylic beads, 50% ABS and 50% PVC pieces, and 50% acrylonitrile butadiene styrene, 25% acrylic beads, and 25% polyvinyl chloride were chosen as the materials for cartridges 1-6, respectively. The cartridge materials are summarized in Table 1, and images of each cartridge are shown in Figure 1B.
|Cartridge||Material|
|Cartridge 1||Dense cork|
|Cartridge 2||Shredded rubber|
|Cartridge 3||Hemp seeds in polyurethane|
|Cartridge 4||50% ABS and 50% acrylic beads|
|Cartridge 5||50% ABS and 50% PVC pieces|
|Cartridge 6||50% acrylonitrile butadiene styrene, 25% acrylic beads, and 25% polyvinyl chloride|
2.2 Scanning protocol selection
The CCR phantom was scanned ten times on the RefleXion X1 kVCT imaging subsystem over 3 months using the two most frequently used scanning protocols: BMS (Body/Medium dose/Slow couch) and BMF (Body/Medium dose/Fast couch). Twenty CT scans were therefore acquired on the RefleXion X1 during the 3-month period. The RefleXion X1 kVCT detector has 16 rows × 1.25 mm and a scanning field of view (FOV) of 50 cm. The detector uses GOS scintillator material, and data are acquired at 960 views per rotation. The gantry rotation speed is 60 RPM. The CCR phantom was also scanned 10 times on a GE Optima 560 PET-CT scanner in our clinic, which is an ACR-accredited center for diagnostic imaging. The CT detector of the GE Optima 560 PET-CT scanner uses 24 rows of HiLight Ceramic Matrix II. The bore size is 70 cm and the scanning FOV is 50 cm. Each time, the scan was performed with five different built-in protocols: Head and Neck (HN), SRS Brain (SRS), Upper Extremity (Ext), Breast (Brst), and Abdomen (Abd), for a total of 50 CT scans. All PET-CT scans were acquired during the same period as the RefleXion X1 kVCT scans. The parameters used for each protocol were retrieved from both the DICOM header information and the settings on the scanner, and are listed in Table 2. Note that the information about the convolution kernel and filter was not included in the RefleXion X1 DICOM header.
|Scan protocol||Tube voltage (kV)||Pitch||Tube current (mA)||Voxel size (mm³)||Conv kernel||Filter|
|RefleXion X1 BMS||120||0.222||67||0.98 × 0.98 × 1.25||Shepp-Logan filter|
|RefleXion X1 BMF||120||1.333||400||0.98 × 0.98 × 1.25||Shepp-Logan filter|
|GE head and neck||120||1.35||272||0.98 × 0.98 × 2.5||Standard||Body|
|GE SRS brain||120||1.35||371||0.78 × 0.78 × 1.25||Standard||Body|
|GE upper extremity||120||1.35||128||0.98 × 0.98 × 2.5||Bone||Body|
|GE breast||120||1.35||141||0.98 × 0.98 × 2.5||Standard||Body|
|GE abdomen||120||1.35||145||0.98 × 0.98 × 2.5||Standard||Body|
2.3 Segmentation and radiomics features extraction
After the CCR phantom was scanned, the CT images were transferred to an Eclipse treatment planning station (Ver. 16.0, Varian Medical Systems, Palo Alto, CA, USA). The six cartridges were first manually segmented on one CT. All cartridges were segmented into cylindrical regions of interest (ROIs) with a diameter of 8 cm. The thickness of each ROI was varied to best cover the cartridge: the ROI for cartridge 3 had a thickness of 1.29 cm, the ROI for cartridge 4 had a thickness of 1.52 cm, and all other ROIs had a thickness of 1.76 cm. To maintain consistent segmentation across all CTs, each subsequent CT was registered to the first CT scan and the ROI contours were transferred onto the subsequent CT scans. CTs and structure files were imported into LifeX (https://www.lifexsoft.org/), an Image Biomarker Standardization Initiative (IBSI) compliant radiomics software package, for radiomics feature extraction.
Fifty-five radiomic features were extracted for each ROI on each CT scan on LifeX: 11 Grey Level Run Length matrix features (GLRLM), 11 Grey Level Zone Level Matrix features (GLZLM), 10 Conventional features (CONVENTIONAL), 13 Discretized features (DISCRETIZED), 7 Grey level Co-occurrence matrix features (GLCM), and 3 Neighborhood Grey-Level Dependence matrix features (NGLDM).
2.4 Repeatability analysis of radiomic features
Radiomics features on all six cartridges extracted from repeated scans were analyzed for each scan protocol and scanner. The coefficient of variation (COV)31 was computed to evaluate the repeatability. Different COV cut-off values have been used to evaluate repeatability.31 In this study, we evaluated cut-off values of COV ≤ 5%, COV ≤ 10%, COV ≤ 20%, and COV > 20%.
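As a minimal sketch of the COV analysis described above, the following Python computes the COV of a feature over repeated scans and the fraction of features that fall within a given cut-off. The feature names and values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import fmean, stdev

def coefficient_of_variation(values):
    """COV in percent: 100 * sample SD / |mean| (sample SD is an assumption)."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("COV is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(mean)

def fraction_within(cov_values, threshold_percent):
    """Fraction of features whose COV is at or below the cut-off."""
    return sum(c <= threshold_percent for c in cov_values) / len(cov_values)

# Hypothetical feature values over repeated scans of one cartridge.
repeated_scans = {
    "feature_a": [100.2, 99.8, 101.1, 100.5, 99.4],
    "feature_b": [10.0, 11.0, 9.0, 10.5, 9.5],
    "feature_c": [1.0, 2.0, 3.0],
}

covs = [coefficient_of_variation(v) for v in repeated_scans.values()]
for cutoff in (5, 10, 20):
    print(f"COV <= {cutoff}%: {fraction_within(covs, cutoff):.0%} of features")
```

The cut-offs are treated cumulatively here, so a feature with COV ≤ 5% also counts toward COV ≤ 10%, which matches how the percentages are reported in the results.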
The repeatability of radiomics features for each scanner and scan protocol was also evaluated using the intraclass correlation coefficient (ICC) parameter.32 Generally, features with ICC > 0.9 or ICC > 0.85 are regarded as having good repeatability.34, 35 The percentages of features with ICC > 0.9 and ICC > 0.85 as well as the histogram of ICC values were computed.
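The text does not state which form of the ICC was used. As an illustration only, the sketch below assumes the one-way random-effects form, ICC(1,1), with each row holding one feature measured on the same ROI over repeated scans. The function name and data are hypothetical.

```python
from statistics import fmean

def icc_one_way(table):
    """One-way random-effects ICC(1,1) (an assumed form, not confirmed by the study).

    table: one row per target (e.g. a cartridge ROI); each row holds the
    same feature measured on k repeated scans.
    """
    n = len(table)
    k = len(table[0])
    grand_mean = fmean([v for row in table for v in row])
    row_means = [fmean(row) for row in table]
    # Between-target and within-target mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(table, row_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical feature values: 6 cartridges (rows) x 4 repeated scans (columns).
feature_table = [
    [10.0, 10.1, 9.9, 10.0],
    [20.1, 19.9, 20.0, 20.2],
    [30.0, 30.2, 29.8, 30.1],
    [40.1, 39.9, 40.0, 40.2],
    [50.0, 49.8, 50.1, 50.2],
    [60.2, 59.9, 60.0, 60.1],
]

icc = icc_one_way(feature_table)
print(f"ICC = {icc:.4f}, repeatable at 0.9 threshold: {icc > 0.9}")
```

Other ICC forms (for example, two-way models) partition the variance differently and can give different values for the same data, so the chosen form should be reported alongside the threshold.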
2.5 Reproducibility analysis of radiomic features
For the intra-scanner and inter-scanner reproducibility evaluation of radiomics features, the concordance correlation coefficient (CCC)33 was computed and compared. Since our focus is on the RefleXion X1 kVCT imaging subsystem, we did not evaluate the intra-scanner reproducibility on the GE PET-CT scanner. CCC was computed between the BMS and BMF scans. For inter-scanner reproducibility, the BMF and BMS scans were compared with each of the GE PET-CT scan protocols. A cutoff value of 0.90 was used to evaluate the reproducibility of radiomics features.34, 35
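For reference, Lin's CCC between two paired series x and y is 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²), where the variances and covariance use 1/n normalization. A minimal sketch follows; the paired values are hypothetical, and how the study paired measurements between protocols is not specified here.

```python
from statistics import fmean

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

# Hypothetical values of one feature on six ROIs under two protocols.
protocol_a = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
protocol_b = [10.5, 19.5, 30.5, 39.0, 51.0, 59.5]

ccc = lin_ccc(protocol_a, protocol_b)
print(f"CCC = {ccc:.4f}, reproducible at 0.90 cutoff: {ccc > 0.90}")
```

Unlike the Pearson correlation, the CCC penalizes a systematic offset between the two series: a constant shift leaves the correlation at 1 but lowers the CCC through the (mean_x − mean_y)² term.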
3.1 Repeatability evaluation
The COV of radiomic features was computed for both the RefleXion X1 and the GE PET-CT. On average, 87% of the features on both scan protocols on the RefleXion X1 can be considered repeatable, as they met the COV < 10% criterion. On the GE PET-CT, this number was similar at 86%. When the criterion was tightened to COV < 5%, the RefleXion X1 showed much better repeatability, with 81% of features on average, whereas the GE PET-CT showed only 73.5% on average.
The repeatability of radiomics features can also be characterized by the ICC. The distribution of ICC values under the RefleXion X1 scan protocols and under the GE PET-CT protocols is shown in Figure 2. About 91% and 89% of the features had ICC > 0.9 for the BMS and BMF protocols on the RefleXion X1, respectively. In contrast, the percentage of features with ICC > 0.9 on the GE PET-CT ranged from 67% to 82%, with the SRS protocol having the most stable features and the Ext protocol the fewest. With the ICC > 0.85 criterion, however, the BMF protocol had the most stable features on the RefleXion X1 at 95%. On the GE PET-CT, the Brst protocol had the most stable features at 87%, whereas the HN protocol had the fewest at 71%.
The ICC value versus feature family is visualized as a heatmap in Figure 3. In both RefleXion X1 protocols, the unstable features are mostly located in the GLRLM feature family. On the GE PET-CT, the unstable features fall in the GLZLM as well as the GLRLM feature families.
3.2 Intra-scanner and inter-scanner reproducibility evaluation
The intra-scanner and inter-scanner reproducibility results are shown in Figures 4 and 5. The RefleXion X1 kVCT imaging subsystem showed excellent intra-scanner reproducibility. All but one feature (98%) satisfied CCC > 0.9 for BMS versus BMF (middle column, BMS_BMF, in Figure 4). On the other hand, the GE PET-CT showed better intra-scanner reproducibility between scans using the same slice thickness (HN vs. other) than between scans using different slice thicknesses (SRS vs. other), as illustrated in Figure 5. However, its overall intra-scanner reproducibility was much worse than that of the RefleXion X1.
As expected, the inter-scanner reproducibility was much poorer. As shown in Figure 4, the percentage of features with CCC > 0.9 between the X1 and GE PET-CT protocols ranged from 49% to 80%, with the HN protocol reproducing the X1 scan radiomics the best and the Brst protocol the least. The poor inter-scanner reproducibility is concentrated mainly in the GLRLM and GLZLM feature families.
The RefleXion X1 and its kVCT imaging subsystem are relatively new to the radiation oncology community, and the repeatability and reproducibility of their image features have not been well characterized. In our study, we used the CCR phantom to evaluate the repeatability and reproducibility of CT radiomic features. The repeatability, measured with both COV and ICC, shows that the RefleXion X1 kVCT imaging subsystem is on par with or slightly better than the GE PET-CT. The ICC heatmap also reveals that the majority of radiomic features on the RefleXion X1 are quite repeatable, except for five features in the GLRLM family (LGRE, SRLGE, LRLGE, GLNU, RLNU).
Our study also showed that the radiomic features are quite reproducible between the two scan protocols of the RefleXion X1 kVCT imaging subsystem. All radiomic features have a CCC > 0.9 between the fast and slow scans except for the feature GLRLM_RLNU. It should be noted that the fast and slow scan settings use the same kVp, reconstruction kernel, reconstructed slice thickness, beam collimation, etc. The only differences are the pitch and tube current. However, as shown in Table 2, the pitch and tube current vary in such a way that the effective mAs, which equals mA × rotation time / pitch, remains the same. This means that the CTDIvol and the image noise of the two scan protocols will be similar. Therefore, it is not surprising that most of the radiomics features on the RefleXion X1 are reproducible between the two scan protocols.
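The effective mAs argument can be checked directly from the Table 2 values. The short sketch below assumes a rotation time of 1 s, derived from the stated 60 RPM gantry speed; the function name is illustrative.

```python
def effective_mas(tube_current_ma, rotation_time_s, pitch):
    """Effective mAs = tube current (mA) x rotation time (s) / pitch."""
    return tube_current_ma * rotation_time_s / pitch

# Assumption: 60 RPM gantry rotation corresponds to 1 s per rotation.
rotation_time_s = 60.0 / 60.0

bms = effective_mas(67, rotation_time_s, 0.222)   # BMS: 67 mA, pitch 0.222
bmf = effective_mas(400, rotation_time_s, 1.333)  # BMF: 400 mA, pitch 1.333

relative_difference = abs(bms - bmf) / bmf
print(f"BMS: {bms:.1f} mAs, BMF: {bmf:.1f} mAs, "
      f"difference: {relative_difference:.2%}")
```

Under this assumption both protocols come out at roughly 300 effective mAs, differing by well under 1%, which is consistent with the similar image noise described above.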
Our data show that first-order features (Conventional and Discretized) had better repeatability and reproducibility than higher-order features (GLCM, GLRLM, NGLDM, and GLZLM). This is in line with other studies. As shown in Figures 4 and 5, none of the first-order features has CCC < 0.5, in sharp contrast with the performance of the higher-order features. First-order features that showed lower reproducibility were mostly between scans with differing slice thicknesses. As reported in previous studies, slice thickness has a major impact on radiomic reproducibility. Although it has been reported that preprocessing before radiomic calculations can reduce this effect, it cannot completely cancel it. Therefore, it is critical to ensure that the same slice thickness is used in radiomics studies. Another factor that may have affected the results is that the phantom was designed for chest imaging protocols, so other imaging protocols, such as those for head and neck or extremity, might show different performance.
Our findings suggest that differing slice thickness has a stronger effect on the reproducibility of textural features (higher-order radiomic features). Features extracted from scans using 1.25 mm slice thickness were less reproducible than those calculated on 2.5 mm scans. This is evident from the better intra-scanner reproducibility among protocols using 2.5 mm slice thickness than between the protocol using 1.25 mm slice thickness (SRS protocol) and the other protocols using 2.5 mm slice thickness on the GE PET-CT, shown in Figure 5. Even though the inter-scanner reproducibility between the RefleXion X1 kVCT imaging subsystem and the GE PET-CT protocols is poor overall due to the different reconstruction settings, the protocols using the same slice thickness (1.25 mm) still demonstrate much better reproducibility, as shown in Figure 4. This is likely the result of the partial volume averaging effect, which creates blurring along the slice direction. As textural features are calculated from the inter-relationship between neighboring voxels, more blurring can reduce the contrast between adjacent voxels. This averaging effect can alter these higher-order texture features, making them not reproducible between different slice thicknesses.
While first-order features were demonstrated to be more robust in both repeat-scan reliability and inter-scanner reproducibility, it is the higher-order texture features that capture information about underlying tumor heterogeneity and could be the key to radiomic research. While more research is needed to improve intra- and inter-scanner reproducibility of radiomics features, we find that maintaining the same slice thickness on all scanners in the same radiomics study is the first step towards meaningful radiomics research.
Radiomic analysis demonstrated that the clinically useful CT radiomic features produced by the RefleXion X1 kVCT imaging subsystem are reproducible and stable over time, demonstrating its utility as a quantitative imaging platform. The CT radiomic features produced by the RefleXion X1 kVCT imaging subsystem are quite stable across the scanning protocols, allowing the user to freely choose scan protocols to monitor radiomics features throughout the treatment course.
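The partial volume argument can be illustrated with a toy example. The sketch below is not part of the study's analysis: it takes a hypothetical noisy intensity profile along the slice direction, emulates doubling the slice thickness by averaging consecutive thin slices, and compares the mean absolute difference between neighboring voxels before and after. All names and values are illustrative assumptions.

```python
import random
from statistics import fmean

def adjacent_contrast(profile):
    """Mean absolute difference between neighboring voxels along the slice axis."""
    return fmean([abs(a - b) for a, b in zip(profile, profile[1:])])

def average_pairs(profile):
    """Emulate doubling the slice thickness by averaging consecutive thin slices."""
    return [(profile[i] + profile[i + 1]) / 2
            for i in range(0, len(profile) - 1, 2)]

rng = random.Random(0)  # fixed seed so the toy example is repeatable

# Hypothetical heterogeneous texture sampled at thin (e.g. 1.25 mm) slices.
thin = [rng.gauss(0.0, 50.0) for _ in range(2000)]
# Emulated thick (e.g. 2.5 mm) slices of the same texture.
thick = average_pairs(thin)

contrast_thin = adjacent_contrast(thin)
contrast_thick = adjacent_contrast(thick)
print(f"thin: {contrast_thin:.1f}, thick: {contrast_thick:.1f}")
```

In this toy model the thicker slices show lower contrast between adjacent voxels, which is the kind of change that would shift texture matrices built from neighboring-voxel relationships. Real reconstructions involve slice sensitivity profiles and kernels that this sketch does not model.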
The authors have nothing to report.
CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.
- 1. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012; 48(4): 441-446. 10.1016/j.ejca.2011.11.036
- 2. Measuring computed tomography scanner variability of radiomics features. Invest Radiol. 2015; 50(11): 757-765. 10.1097/RLI.0000000000000180
- 3. Evaluation of low-contrast detectability of iterative reconstruction across multiple institutions, CT scanner manufacturers, and radiation exposure levels. Radiology. 2015; 277(1): 124-133. 10.1148/radiol.2015141260
- 4. Harmonizing the pixel size in retrospective computed tomography radiomics studies. PLoS One. 2017; 12(9): e0178524. 10.1371/journal.pone.0178524. Erratum in: PLoS One. 2018; 13(1): e0191597.
- 5. Intrinsic dependencies of CT radiomic features on voxel size and number of gray levels. Med Phys. 2017; 44(3): 1050-1062. 10.1002/mp.12123
- 6. Influence of gray level discretization on radiomic feature stability for different CT scanners, tube currents and slice thicknesses: a comprehensive phantom study. Acta Oncol. 2017; 56(11): 1544-1553. 10.1080/0284186X.2017.1351624
- 7. Voxel size and gray level normalization of CT radiomic features in lung cancer. Sci Rep. 2018; 8(1): 10545. 10.1038/s41598-018-28895-9
- 8. Validation of a method to compensate multicenter effects affecting CT radiomics. Radiology. 2019; 291(1): 53-59. 10.1148/radiol.2019182023
- 9. Comprehensive investigation on controlling for CT imaging variabilities in radiomics studies. Sci Rep. 2018; 8(1): 13047. 10.1038/s41598-018-31509-z
- 10. Effect of tube current on computed tomography radiomic features. Sci Rep. 2018; 8(1): 2354. 10.1038/s41598-018-20713-6
- 11. Multicenter CT phantoms public dataset for radiomics reproducibility tests. Med Phys. 2019; 46(3): 1512-1518. 10.1002/mp.13385
- 12. Reliability of CT-based texture features: phantom study. J Appl Clin Med Phys. 2019; 20(8): 155-163. 10.1002/acm2.12666
- 13. Physical imaging phantoms for simulation of tumor heterogeneity in PET, CT, and MRI: an overview of existing designs. Med Phys. 2020; 47(4): 2023-2037. 10.1002/mp.14045
- 14. CT texture analysis challenges: influence of acquisition and reconstruction parameters: a comprehensive review. Diagnostics (Basel). 2020; 10(5): 258. 10.3390/diagnostics10050258
- 15. The effect of CT scan parameters on the measurement of CT radiomic features: a lung nodule phantom study. Comput Math Methods Med. 2019; 2019: 8790694. 10.1155/2019/8790694
- 16. A systematic review and quality of reporting checklist for repeatability and reproducibility of radiomic features. Phys Imaging Radiat Oncol. 2021; 20: 69-75. 10.1016/j.phro.2021.10.007
- 17. The application of a workflow integrating the variable reproducibility and harmonizability of radiomic features on a phantom dataset. PLoS One. 2021; 16(5): e0251147. 10.1371/journal.pone.0251147
- 18. The effects of in-plane spatial resolution on CT-based radiomic features' stability with and without ComBat harmonization. Cancers (Basel). 2021; 13(8): 1848. 10.3390/cancers13081848
- 19. The impact of phantom design and material-dependence on repeatability and reproducibility of CT-based radiomics features. Med Phys. 2022; 49(3): 1648-1659. 10.1002/mp.15491
- 20. The impact of the variation of imaging parameters on the robustness of computed tomography radiomic features: a review. Comput Biol Med. 2021; 133: 104400. 10.1016/j.compbiomed.2021.104400
- 21. HeLLePhant: a phantom mimicking non-small cell lung cancer for texture analysis in CT images. Phys Med. 2022; 97: 13-24. 10.1016/j.ejmp.2022.03.010
- 22. Understanding sources of variation to improve the reproducibility of radiomics. Front Oncol. 2021; 11: 633176. 10.3389/fonc.2021.633176
- 23. Can radiomics features be reproducibly measured from CBCT images for patients with non-small cell lung cancer? Med Phys. 2015; 42: 6784-6797. 10.1118/1.4934826
- 24. On the impact of smoothing and noise on robustness of CT and CBCT radiomics features for patients with head and neck cancers. Med Phys. 2017; 44: 1755-1770. 10.1002/mp.12188
- 25. The feasibility study of megavoltage computed tomographic (MVCT) image for texture feature analysis. Front Oncol. 2018; 8: 586. 10.3389/fonc.2018.00586
- 26. Repeatability and reproducibility of radiomic features: a systematic review. Int J Radiat Oncol Biol Phys. 2018; 102(4): 1143-1158. 10.1016/j.ijrobp.2018.05.053
- 27. Emission guided radiation therapy for lung and prostate cancers: a feasibility study on a digital patient. Med Phys. 2012; 39(11): 7140-7152. 10.1118/1.4761951
- 28. The technical design and concept of a PET/CT linac for biology-guided radiotherapy. Clin Transl Radiat Oncol. 2021; 29: 106-112. 10.1016/j.ctro.2021.04.003
- 29. Biology-guided radiotherapy: redefining the role of radiotherapy in metastatic cancer. Br J Radiol. 2021; 94(1117): 20200873. 10.1259/bjr.20200873
- 30. In the future, emission-guided radiation therapy will play a critical role in clinical radiation oncology. Med Phys. 2019; 46(4): 1519-1522. 10.1002/mp.13408
- 31. The impact of image reconstruction settings on 18F-FDG PET radiomic features: multi-scanner phantom and patient studies. Eur Radiol. 2017; 27: 4498-4509. 10.1007/s00330-017-4859-z
- 32. The intraclass correlation coefficient as a measure of reliability. Psychol Rep. 1966; 19: 3-11.
- 33. A concordance correlation coefficient to evaluate reproducibility. Biometrics. 1989; 45: 255-268.
- 34. A Proposal for Strength-of-Agreement Criteria for Lin's Concordance Correlation Coefficient. NIWA client report: HAM2005-062 (2005).
- 35. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016; 15: 155-163.