Variability in CT lung-nodule quantification: Effects of dose reduction and reconstruction methods on density and texture based features

Purpose: To investigate the effects of dose level and reconstruction method on density and texture based features computed from CT lung nodules. Methods: This study had two major components. In the first component, a uniform water phantom was scanned at three dose levels and images were reconstructed using four conventional filtered backprojection (FBP) and four iterative reconstruction (IR) methods for a total of 24 different combinations of acquisition and reconstruction conditions. In the second component, raw projection (sinogram) data were obtained for 33 lung nodules from patients scanned as a part of their clinical practice, where low dose acquisitions were simulated by adding noise to sinograms acquired at clinical dose levels (a total of four dose levels) and reconstructed using one FBP kernel and two IR kernels for a total of 12 conditions. For the water phantom, spherical regions of interest (ROIs) were created at multiple locations within the water phantom on one reference image obtained at a reference condition. For the lung nodule cases, the ROI of each nodule was contoured semiautomatically (with manual editing) from images obtained at a reference condition. All ROIs were applied to their corresponding images reconstructed at different conditions. For 17 of the nodule cases, repeat contours were performed to assess repeatability. Histogram (eight features) and gray level co-occurrence matrix (GLCM) based texture features (34 features) were computed for all ROIs. For the lung nodule cases, the reference condition was selected to be 100% of clinical dose with FBP reconstruction using the B45f kernel; feature values calculated from other conditions were compared to this reference condition. A measure was introduced, which the authors refer to as Q, to assess the stability of features across different conditions, which is defined as the ratio of reproducibility (across conditions) to repeatability (across repeat contours) of each feature. Results: The water phantom results demonstrated substantial variability among feature values calculated across conditions, with the exception of histogram mean. Features calculated from lung nodules demonstrated similar results with histogram mean as the most robust feature (Q ≤ 1), having a mean and standard deviation Q of 0.37 and 0.22, respectively. Surprisingly, histogram standard deviation and variance features were also quite robust. Some GLCM features were also quite robust across conditions, namely, diff. variance, sum variance, sum average, variance, and mean. Except for histogram mean, all features have a Q of larger than one in at least one of the 3% dose level conditions. Conclusions: As expected, the histogram mean is the most robust feature in their study. The effects of acquisition and reconstruction conditions on GLCM features vary widely, though trending toward features involving summation of product between intensities and probabilities being more robust, barring a few exceptions. Overall, care should be taken into account for variation in density and texture features if a variety of dose and reconstruction conditions are used for the quantification of lung nodules in CT, otherwise changes in quantification results may be more reflective of changes due to acquisition and reconstruction conditions than in the nodule itself.


INTRODUCTION
Lung cancer remains the principal cause of cancer related deaths. 1 Quantifying properties of lung nodules imaged on the CT for the purpose of diagnosis, staging, management, and determining treatment response has been a topic of interest for some time. Response evaluation criteria in solid tumors (RECIST), 2 which assesses overall tumor burden via longest diameter of tumors and was originally intended for standardizing and simplifying tumor response criteria, has been accepted as a standardized measure of tumor response, especially in oncologic clinical trials. The NELSON study, 3 which used a management protocol centered on the volume and volume doubling time of lung nodules, demonstrated how volumetric nodule assessment could be used routinely for managing patients with nodules detected through a lung cancer screening program.
There are also efforts that have described moving beyond the commonly used size based measures and into more sophisticated features that quantify the appearance and shape of lung nodules in CT via image processing and machine learning techniques. Aerts et al. 4 showed the effectiveness of how tumor phenotypes can be quantified by applying a large number of quantitative image features, which they referred to as radiomics. El-Baz et al. 5 proposed a spherical harmonics based shape index, which was computed on automatically segmented lung nodules that use an appearance and shape model, where they showed that their proposed measure was able to distinguish between malignant and benign lung nodules with high accuracy. Shen et al. 6 proposed the use of multiscale convolutional neural networks to automatically extract discriminative features to differentiate between malignant and benign nodules. Han et al. 7 compared the performance of 2D and 3D Haralick features in distinguishing between malignant and benign nodules via a support vector machine classifier. Thus, there are a number of other properties of lung nodules that may be extracted from CT image data which are being investigated as being helpful in diagnosis or patient management.
An ongoing concern in applying CT imaging techniques is the radiation dose to the patient. Thus, there have been steps toward using lower radiation dose techniques in CT imaging, both in screening as well as in diagnostic and followup CT examinations. In addition, the manufacturers have been developing many radiation dose reduction techniques such as automatic exposure control (AEC) methods and automatic tube current modulation (TCM), 8 as well as advanced image reconstruction techniques such as iterative reconstructions that seek to reduce the image noise introduced when lower dose scans are used. 9 It is well known that CT acquisition parameters impact appearance of structures and diseases in CT, but what is not clear is the effect of dose reduction methods on quantitative imaging measures derived from CT scans. Young et al. 10 demonstrated that measured nodule volumes with semiautomated segmentation techniques were not sensitive to the effects of dose level and reconstruction kernel. Hunter et al. 11 used 4D CT and 3D CT test-retest scans of non-small cell lung cancer cases to identify a set of nodule features that is robust across different CT machines. Mackin et al. 12 investigated the effects of interscanner variability on radiomics features using a custom-designed phantom and found substantial variability in features derived from CT images when compared to patient cases with non-small cell lung cancer (NSCLC). Fave et al. 13 found similar results when investigating radiomics features for patients imaged on cone beam CT. Except for Young et al., the works above investigated inter or intrascanner variability of various features of lung nodules under similar CT acquisition and reconstruction conditions. There is still a need to investigate the effects of acquisition and reconstruction conditions on nodule density and texture based measures, both of which may be substantially impacted by reduced dose scans reconstructed with advanced image reconstruction techniques. This is similar to the approach described by Nyflot et al. 14 who demonstrated the impact that stochastic effects have on quantitative texture features in PET-CT images.
The purpose of this work is to investigate the effects of dose level and reconstruction method on density and texture based features computed from CT lung nodules. A range of acquisition conditions is obtained both through repeat scanning of a phantom at different dose levels and through simulating reduced dose scans in patient image datasets. Compared to Mackin et al., 12 our analyses are limited to intrascanner variations due to different acquisitions and reconstruction conditions.

MATERIALS AND METHODS
This study consists of two major components. In the first component, an analysis of CT scans of a known uniform object, namely, a water phantom, was performed under a wide variety of acquisition (e.g., dose levels) and reconstruction (conventional filtered back projection or FBP and iterative reconstruction or IR) conditions. Because this object is completely homogeneous and uniform, any variability in the resultant images is therefore known to be due to the acquisition and reconstruction process. In the second component, a similar analysis of lung nodules from patient images was performed under a variety of acquisition and reconstruction conditions described previously in Young et al. 10 To simulate a range of acquisition conditions, the raw projection data were obtained and noise was added using a validated method to simulate different dose levels. The original and simulated reduced-dose scans were reconstructed under several conditions. From the resulting image data, a range of density and texture values was extracted and analyzed to determine the effects of acquisition and reconstruction conditions on each feature value.
This section starts by describing the materials used in our experiments, which consists of CT scans of a water phantom and actual patients with lung nodules. Details on how reduced-dose scans were simulated and reconstructed at different kernels are presented. This is followed by a detailed description on the investigated density and texture features. Finally, the evaluation measures are presented.

2.A. Water phantom scans
Because patient nodules may not be homogenous in composition, we performed an initial set of experiments in homogeneous water phantom over a wide range of acquisition and reconstruction conditions to establish a basis for our investigation. The phantom used was the water section of the QA phantom of our CT scanner (Definition AS, Siemens Healthcare, Forchheim Germany). The water phantom was scanned on the Definition AS using variations of an adult abdomen/pelvis protocol under the following conditions: 120 kV, 0.5 s rotation time, pitch 1.0, collimation of 64 × 0.6, and fixed tube current scans of 225, 100, and 50 effective mAs, which corresponded to CTDIvol (32 cm phantom) of 17.1, 7.6, and 3.8 mGy, respectively. All images were then reconstructed at 1 mm thickness and 1 mm spacing using the eight different reconstruction kernels to represent a range of what is available on the scanner. These included conventional weighted FBP reconstructions of B10f (smoothest), B30f, B45f, and B70f (sharpest) kernels as well as IR (Safire) I26f strength 5 (smoothest), I44f strength 3, I50f strength 3, and I70f strength 1 (sharpest), designated as I26f\5, I44f\3, I50f\3, and I70f\1, respectively, which are somewhat analogous to the conventional reconstructions and used to illustrate a range of reconstruction options available on the scanner. These acquisition and reconstruction conditions are summarized in Table I.

2.B. Patient scans with nodules-Original and simulated reduced-dose images
A total of 33 cases with lung nodules from different patients were used in this study and were identical to the patient dataset used in Young et al. 10 These nodules ranged in size from 7 to 46 mm longest in-plane diameter, with an average of 18 mm. Twenty-five nodules were >10 mm in diameter, and four of these were >30 mm in diameter. All scans were performed on a multidetector CT scanner (Definition Flash, Siemens Healthcare, Forchheim, Germany) using a routine, adultchest protocol: 120 kV, 0.5 s rotation time, 250-285 quality reference mAs, pitch 1, with tube current modulation (TCM) (CareDose 4D, Siemens Healthcare, Forchheim, Germany). This is referred to as the "reference" protocol, which resulted in a CTDIvol of 20.9-23.8 mGy using the standard 32 cm CTDI body phantom.
As described previously, for each case, the original raw projection data were retrieved from the scanner so that simulated reduced dose scans would be created. This method is described in Young et al. 10 and is based on the noise addition methods described by Zabic et al., 15 Massoumzadeh et al., 16 and Yu et al. 17 Using this approach, we generated reduced-T II. Summary of acquisition and reconstruction conditions for the patient scans with lung nodules. Note that 100% dose (approx. 20.9 mGy) and B45f were the reference conditions here, indicated by REF. X indicates used reconstructions.

Dose level
B45ff I44f\3 B50f\3 100% (20.9 mGy) REF X X 25% (5.2 mGy) X X X 10% (2.1 mGy) X X X 3% (0.6 mGy) X X X dose sinograms at 25%, 10%, and 3% of clinical dose. The TCM information is contained in the raw sinogram data. The TCM usually scales linearly with quality reference mAs setting and so dose reduction was modeled simply as a linear T III. List of features included, where I , P I , and P are functions representing the image, normalize histogram, and normalized GLCM matrix of the image, respectively, and Ω is a set containing the coordinates of all voxels in the ROI.

Family
Feature Formula

Information correlation A
Hx y−Hx y1 max(Hx, Hy) scaling of the TCM function by a constant factor. The reduceddose TCM function was used to calculate the photon fluences for each projection in the helical scan trajectory, which was subsequently used as input to the noise-addition model. After creating simulated reduced-dose sinograms, these sinograms were imported back to the scanner and reconstructed with three different reconstruction methods at each dose level: B45f, I44f strength 3 (I44f\3), and I50f strength 3 (I50f\3). These reconstruction methods would all be considered to be medium to medium sharp kernels and therefore represent a much narrower range of reconstruction conditions compared to those used on the water phantom scans. All images were reconstructed at 1 mm slice thickness and 1 mm interval. This is summarized in Table II.

2.C. Region of interest (ROI) and nodule contouring
For the water phantom, five nonoverlapping ROI of 10 mm diameter spheres were manually placed within the water region of the phantom from the reference condition (17.1 mGy, FBP with B45f kernel). For each nodule case, a single nodule was identified and semiautomatically contoured in 3D using the reference acquisition and reconstruction condition (100% clinical dose, FBP with B45f kernel) using an inhouse software that used semiautomatic contouring, which was initiated via an Otsu thresholding 18 to obtain an initial contour, where the user performs a click on a point within the nodule and a drag to a point outside the nodule. Additional regions were then added or erased manually via a paint interface until the desired contour was obtained. 17 randomly selected nodules were identified and contoured again using the same software. The lung nodule contours used in this study are a subset of contours used in Young et al., 10 which were contoured by three lab technologists trained in contouring lung nodules on CT scans.

2.D. Computation of density and texture based features
As mentioned earlier, we are interested in density and texture based features of lung nodules, which quantifies appearance of lung nodules based on CT densities (Hounsfield Units or HU). There is a very large family of features in the literature for quantifying the appearance of an object in an image. In addition, these features are also typically associated with a continuous parameter space, and can be used in combination to form new features. It is therefore not possible to exhaustively test all the image features in the literature nor is it the aim of this paper. Instead, we limit our scope to a subset of features belonging to two well known image feature families, which are histogram based features and gray level co-occurrence matrix (GLCM) based texture features. Histogram based features such as mean, standard deviation, and kurtosis of a given region of interest are some of the most commonly used density features not only in lung nodule quantification but also medical image quantification in general. GLCM based texture features, first introduced in Haralick et al., 19 are one of the most commonly used features F. 1. CT images of the water phantom obtained at eight kernels and three dose levels. The images show the same region of 125 × 125 mm cropped at the middle of the water phantom at window level of 0 and window width of 400. The overlaid red lines in the images are the histogram of the densities within cropped region. Both scales in the X and Y axis were kept constant across all images. The reference condition is indicated by *.
F. 2. Plots of histogram features from a water phantom across different dose/reconstruction conditions, where the points and whiskers indicate the mean and the standard deviation of the feature value, respectively. The y-axes are the mean feature value at each condition and the x-axes are the various dose/reconstruction conditions. The images accompanying each plot are (from top to bottom) the image at the reference condition (B45f, 17.1 mGy) and the image from the condition having a mean feature value closest to the mean feature value at the reference condition. Note that the condition under which the closest match to the reference condition occurs changes with each feature used as the basis for the match.
for quantification of textures in image processing. Table III shows the list of features that were included in this study.
Computation of GLCM for a region of interest involves two parameters: the number of directions used (also referred to as offsets) and the number of quantization levels. In our F. 3. A histogram of how frequent a particular condition has the nearest mean feature value to that of the reference condition (B45f, 17.1 mGy) obtained using the water phantom images. experiments, we based the direction on the 26-connectivity that is typically used in 3D, resulting in a total of 13 offsets (=26/2 due to symmetry). This results in a total of 13 GLCMs computed for a given contour (one for each direction), resulting in a total of 13 values for a given GLCM feature. In practice, it is common to use the mean and range (maximum-minimum) across the different directions of a GLCM feature, 19 which is what was done in this study. The number of quantization levels used is typically application specific and is usually chosen empirically. For this study, two quantization levels were used, which are 25 4 and 32.
All computations were performed in 3D, which involves all voxels inside a given 3D contour. All feature values were computed from the contours obtained at the reference condition to prevent other sources of variations from influencing our analysis, such as intrareader variability and human perception of different reconstruction.

2.E.1. Water phantom
The role of the homogenous water phantom was to illustrate the change in feature values across conditions. The mean and standard deviation of each feature, computed across the five nonoverlapping sphere ROIs, were observed over the range of dose levels and reconstruction conditions. In order to identify conditions that are most similar to the reference condition, comparisons among conditions were performed based on individual feature values, with the nearest mean value to the reference condition being recorded and plotted in the form of a histogram.

2.E.2. Patient scans with nodules
Similar to the previous analysis, the mean and standard deviation of each feature, computed across the different nodule cases, were used to illustrate the change in feature values across conditions.
The large variation of value range between different features makes it difficult to compare the effects of different conditions across the different features via simple measures such as mean squared difference. Therefore, a Q measure was used to quantify the robustness/stability of a feature f at a particular recon r with respect to the reference recon. The Q measure used is the ratio between the standard deviation of reproducibility (across acquisition and reconstruction conditions) and repeatability (across repeat contours of the reference condition), defined as where f r (.) and f 0 (.) is the feature f computed for a given contour at reconstruction r and the reference recon, respectively, S(.) is a function that computes standard deviation, Φ i and Φ ′ i are the contour of a case and its repeated counterpart, respectively, N is the total number of cases, and M is the total number of cases with a repeated contour. Intuitively, a feature f is said to be robust to reconstruction r if Q( f ;r) ≤ 1. On the other hand, the larger Q( f ;r) is from one, the more f is impacted by a change from the reference recon to reconstruction r. Figure 1 shows the cropped CT images of the water phantom at different kernels and dose levels, with the histogram of the densities (HUs) within the cropped region overlaid in the images. This figure demonstrates that both dose level and reconstruction condition have a substantial impact on local variation of HU values. It also shows, as expected, that the position of the peak of histograms did not vary much across reconstructions, indicating a stable mean value. The spread of the histogram however varies across conditions, having a smaller spread for the smoother kernels (lower reconstruction kernel numbers such as B10f) and a larger spread for the sharper kernels (higher numbers such as B70f). In addition, images using the same reconstruction condition, but different dose levels (i.e., images within the same column), illustrate the increased variation that can be observed as dose is reduced.

3.A. Water phantom
For example, the differences can initially be appreciated based on visual comparisons between the smoothest image with the narrowest histogram peak, Fig. 1(e), to images with more variation and wider histograms, such as (d) that is of the same dose level, and (t) or (x) that are at reduced dose levels.
To illustrate the effects of different dose level and reconstruction conditions on density and texture features, Fig. 2 shows the graph of a few selected features-the mean, standard deviation, entropy density, and mean of GLCM mean (at 25 quantization level) features across different conditions, where the point indicates the average feature values and the whiskers indicate the standard deviation across the ROIs. This figure illustrates that some features [e.g., mean, mean of GLCM mean (25)] are very stable across different conditions, while other features (e.g., standard deviation and entropy) vary substantially across conditions. Using the B45f kernel at 100% dose level as the reference condition, the image of the dose/reconstruction condition that has the mean feature value closest to the feature value under the reference condition was shown side by side with the reference image. This side-by-side F. 4. CT images of a nodule case obtained at three kernels and four dose levels (one original and three simulated), at a window level of 40 and window width of 400. The overlaid red lines in the images are the histogram of the densities within cropped region. Both scales in the X and Y axis were kept constant across all images. The reference condition is indicated by *.
F. 5. Plots of histogram features derived from nodules across different dose/reconstruction conditions, where the points and whiskers indicate the mean and the standard deviation of the feature value, respectively. The y-axes are the mean feature value at each condition and the x-axes are the various dose/reconstruction conditions. The images accompanying each plot are (from top to bottom) the image at the reference condition (B45f, 100%) and the image from the condition having a mean feature value closest to the mean feature value at the reference condition. Note that the condition under which the closest match to the reference condition occurs changes with each feature used as the basis for the match.
visual and quantitative comparison is illustrated for each of the four features mentioned earlier in this paragraph. Figure 3 is a histogram of how frequent a particular condition has the nearest mean feature value to the reference condition. The majority of histogram features were observed F. 6. A histogram of how frequent a particular reconstruction has the nearest mean feature value to nearest to B45f at 100% from the nodule cases.
to be closely matched to the I44f\3 at 7.6 mGy condition, which more closely resembles (at least visually) the reference reconstruction. The majority of the GLCM features, however, were most closely matched to B45f at 7.6 mGy despite the difference in dose level with the reference condition.

3.B. Lung nodule cases
A total of three simulated dose levels were generated from the raw sinogram data for each lung nodule case, which are 25%, 10%, and 3% of the original dose. For each raw sinogram datum (including both original and simulated), three reconstructions were obtained using FBP with B45f kernel and IR with I44f\3 and I50f\3, which represents a more limited range of reconstruction conditions compared to those used in the water phantom. This resulted in 12 reconstructions per lung nodule case. Figure 4 shows the CT images of the same nodule case with the different reconstructions. Overlaid on the images were the histogram of the densities of the nodule at the different reconstructions. Similar to the results of the water phantom, this figure demonstrates that both dose level and reconstruction condition have a substantial impact on local variation in HU values. Again, the differences between conditions can initially be appreciated based on visual comparisons between the smoothest image with the narrowest histogram peak, e.g., Fig. 4(b), to images with more variation and wider histograms, such as Fig. 4(j) or 4(l) which are both at simulated reduced-dose conditions. Figure 5 shows the graphs of the mean, standard deviation, entropy densities, and mean of GLCM mean (at 25 quantization level), similar to Fig. 2 conditions. Using the B45f kernel at 100% dose level as reference, the image from the condition with a mean feature value that was closest to the mean feature value obtained at the reference condition was shown side by side with the reference image for several feature values. Because of the inhomogeneity of the nodule cases, the standard deviation of the features per condition is larger as compared to the ones from the water phantom (refer to Fig. 2). The influence of acquisition/reconstruction to histogram entropy and histogram standard deviation is also less prominent in the nodule cases as compared to those observed from the water phantom. There seems to be a consistent relationship between the mean of GLCM mean and the acquisition/reconstruction conditions for the nodule cases, which was not observed in the water phantom. Figure 6 is a histogram of how frequent a particular condition has the nearest mean feature value to the reference condition. The results from the nodule cases show that the condition with the most frequent selection, which is I44f\3 at 10%, is also the condition that appears visually to be the most similar to the reference condition, as observed in Fig. 4. Figure 7 shows the evaluation measure Q for each of the histogram and texture measures with respect to the reference condition. As expected, the histogram mean feature is very robust to different conditions, even for the lung nodules. Similar to the observations in Figs. 5(b) and 5(c), the Q measure confirmed the sensitiveness of histogram entropy to acquisition/reconstruction conditions, and the relatively better robustness of histogram standard deviation.
Tables IV and V list the 30 lowest variation features and the 20 largest variation features across all conditions, respectively, ranked using the number of conditions with Q ≤ 1, and max Q in case of a tie. Histogram mean, variance, and standard deviation were the three features with the lowest variation. For GLCM features, both range and mean of sum variance, diff. variance, sum average and mean, and the range of entropy were among the 30 lowest variation features. Table VI lists the rank of conditions according to the number of features with Q ≤ 1, and max Q in case of a tie, where I50f\3@100% and I44f\3@25% were the conditions with the lowest variation across all features. The three highest variation conditions were the conditions with 3% dose levels, with very few number of features with Q ≤ 1, especially for the I50f\3 reconstruction in which only the histogram mean was robust. T IV. 30 lowest variation features extracted from the lung nodules according to the number of conditions with Q ≤ 1, and max Q in case of a tie.

Rank
Family

DISCUSSION
In this study, we have investigated the behavior of histogram features and GLCM based texture features in both water phantom and nodule cases from actual patients across a variety of dose and reconstruction conditions. To avoid exposing patients to additional radiation, low dose scans were simulated from clinical dose CT raw data for the nodule cases. Due to the large amount of density and texture features available in the literature, and the near infinite possibilities of settings and combinations, exhaustive investigation is impossible. Instead, we limited the scope of our investigation to a subset of well known features that are used in lung nodule quantification related literature, which are histogram based features and GLCM based texture features.
As observed in Fig. 1, the appearance of a homogeneous object differs substantially across different acquisition and reconstruction conditions. From the histogram based features, it was observed that the mean HU value is quite robust to different conditions as expected, primarily because of CT numbers being consistently referenced and calibrated to water across all acquisition and reconstruction conditions. The main T VI. Ranking of conditions according to the number of features with Q ≤ 1, and max Q in case of a tie. The number in square brackets indicates the total number of features in the indicated feature family. impact is on the spread or standard deviation, which is also observed from Figs. 2(a) and 2(b). When comparing the features across different acquisition/reconstruction conditions with respect to a reference condition (100% clinical dose with FBP B45f reconstruction) as shown in Fig. 3, we observed that most histogram based features were nearest to I44f\3 at 7.6 mGy. This condition is also one that resembles the reference condition visually, both appearance and histogram wise as shown in Fig. 1. However, the GLCM based texture features from the reference condition seemed to be nearest to B45f at 7.6 mGy and I50f\3 at 17.1 mGy, which visually looks noisier than the reference condition. We suspect this is due to the quantization step in the computation of GLCM, as in I50f\3 at 3.8 mGy and B45f at 7.6 mGy may appear to be more similar to the reference condition after quantization is applied.
As observed in Fig. 5, and similar to results observed from the water phantom, histogram mean of the nodule cases was quite robust across the different conditions. Surprisingly, unlike the water phantom, histogram standard deviation seems relatively robust. A possible explanation for this is that the variability introduced by the acquisition/reconstruction is masked by either the inhomogeneity in the nodule population or the inhomogeneity of densities within nodules as seen in the histogram shown in Fig. 4. While Fig. 3 showed different behavior between the histogram and GLCM features, the nodule cases showed general agreement of favored conditions (those conditions nearest in terms of mean feature value to the reference condition) between the histogram and GLCM features as observed in Fig. 6, where both types of features favor I44f\3 at 10% and I50f\3 at 100%.
As expected, HU mean is the feature with the lowest variation. This is followed by variance (rank 2) and standard deviation (rank 3). Other than GLCM mean of maximal correlation coefficient at 32 quantization level, all features have favored conditions with a Q measure of less than one, as shown in Tables IV and V. In general, quantization level of 25 and 32 did not have much impact on the way the GLCM features behave. For the features belonging to the GLCM mean family, we did observe that features where their calculations involve summation of product between probability and bin index have lower variation in general, or of the form  i iP(i), such as sum variance, diff. variance, sum average, variance, and mean. With the exception of diff. variance, the same is true for GLCM range family. It is also surprising that while entropy of GLCM range was among the lowest variation features (rank 7 and 8 for quantization level of 25 and 32, respectively), entropy of GLCM mean was not.
Probably because of the limited range of different reconstructions in the nodule cases, effects due to different reconstructions appeared to be minor as compared to different dose levels. Conditions with dose level of 3% were the conditions that most features (with the exception of histogram mean) were sensitive to, with little or no features having Q ≤ 1. It is interesting that I44f\3 at 100% dose level has less features with Q ≤ 1 as compared to 25% and 10% dose levels, indicating that additional noise from the lower dose levels actually helps in countering the smoothing effect of the reconstruction method. Table VI shows the rank of the different conditions according to the number of robust features (Q ≤ 1). While the majority of features are robust (with respect to the reference condition) for high ranking conditions (e.g., I50f\3 at 100%), there are still sensitive features with Q > 3. This implies that if these sensitive features were used to quantify nodules, care must be taken to ensure that the acquisition conditions are the same (e.g., the reference condition in our case), for instance at different time points, or else changes in the quantification results may reflect the difference in acquisition conditions rather than the underlying physiology.
One limitation of this study was that the patient nodule scans were not analyzed under the same conditions as the water phantom, thereby limiting some of the direct comparisons between the water phantom and nodule results. This was in large part due to an equipment change such that the scanner on which the original patient data were acquired (Definition Flash) was removed and no longer available for reconstructions (the raw projection data from the Flash could not be reconstructed on any other nonFlash scanner). This limited the range of conditions under which the patient nodule data could be analyzed. The water phantom was scanned on a similar but not identical multidetector CT (Definition AS) and reconstructions were performed under a wider range of conditions to more completely illustrate the effects of dose and reconstruction on feature calculations for a homogeneous object.

CONCLUSION
We have illustrated the effects of dose and reconstruction (both FBP and IR) methods have on density and texture based features computed from CT scans of a uniform water phantom and nodule cases. Even though acquired under similar acquisition conditions, the effects of the features on a uniform water phantom differ as compared to those from the nodule cases. Though only at a limited number of dose levels and reconstructions, we showed that using lower dose level (more noisy) on smoother reconstruction, or vice versa, may have a balancing effect and results in images and feature values similar to images acquired from other conditions. It was shown that different features were impacted differently by different dose level and reconstruction, and some features are more robust to different conditions than others. Additionally, we observed that most features computed based on summation of the product of probability and intensity values (either in HU or values after quantization) have a tendency to be more robust to different acquisition conditions, barring a few exceptions.
It should be noted that our focus in this study is on reproducibility of features across different acquisition conditions, which may or may not relate to physiology of lung nodules, for instance prognosis of nodule malignancies.
In conclusion, when using densities or texture features for quantification of nodule physiology, care should be taken to take account of the susceptibility of the features used to the physics-based variation, either through careful control of acquisition conditions or the usage of more robust features. If such account is not taken into consideration, the quantification results may reflect more to the acquisition conditions rather than the actual nodule physiology, resulting in inaccurate prognosis or diagnosis.

ACKNOWLEDGMENT
This work was supported by a grant from the NCI's Quantitative Imaging Network (Grant No. U01 CA181156).