Evaluation of AAPM Reports 204 and 220: Estimation of effective diameter, water‐equivalent diameter, and ellipticity ratios for chest, abdomen, pelvis, and head CT scans

Abstract Purpose To confirm AAPM Reports 204/220 and provide data for the future expansion of these reports by: (a) presenting the first large‐scale confirmation of the reports using clinical data, (b) providing the community with size surrogate data for the head region which was not provided in the original reports, and additionally providing the measurements of patient ellipticity ratio for different body regions. Method A total of 884 routine scans were included in our analysis including data from the head, thorax, abdomen, and pelvis for adults and pediatrics. We calculated the ellipticity ratio and all of the size surrogates presented in AAPM Reports 204/220. We correlated the purely geometric‐based metrics with the “gold standard” water‐equivalent diameter (DW). Results Our results and AAPM Reports 204/220 agree within our data's 95% confidence intervals. Outliers to the AAPM reports’ methods were caused by excess gas in the GI tract, exceptionally low BMI, and cranial metaphyseal dysplasia. For the head, we show lower correlation (R2 = 0.812) between effective diameter and DW relative to other body regions. The ellipticity ratio of the shoulder region was the highest at 2.28 ± 0.22 and the head the smallest at 0.85 ± 0.08. The abdomen pelvis, chest, thorax, and abdomen regions all had ellipticity values near 1.5. Conclusion We confirmed AAPM reports 204/220 using clinical data and identified patient conditions causing discrepancies. We presented new size surrogate data for the head region and for the first time presented ellipticity data for all regions. Future automatic exposure control characterization should include ellipticity information.


1 | INTRODUCTION
Dose from computed tomography (CT) has always been a general concern in the medical community. 1,2 This is primarily due to the growing number of CT examinations 3 and the high dose from CT relative to other imaging modalities. 2,4 It is always a challenge for radiologists and medical physicists to establish adequate image quality with the lowest radiation exposure to the patient, in agreement with the ALARA (As Low As Reasonably Achievable) principle. 5 Unfortunately, in CT, the current scanner output dose metrics, such as volume CT dose index (CTDI vol ), do not reflect the dose the patient actually receives. [6][7][8] The CTDI vol only represents the system's radiation output for a very specific set of conditions in a cylindrical acrylic polymethyl methacrylate (PMMA) phantom with diameters of 16 or 32 cm in a contiguous axial or helical examination. 4,7,[9][10][11][12] Ideally, a method would exist to normalize these dose values to make them reflect the dose a patient actually receives.

The American Association of Physicists in Medicine (AAPM) Report 204 12 introduced the concept of a size-specific dose estimate (SSDE). The SSDE is a patient size-corrected estimate of patient dose which uses a surrogate for patient size to scale the scanner-reported CTDI vol . 12 Many previous studies have used and/or evaluated size surrogates to estimate patient size, including body weight, body mass index (BMI), age, cross-sectional diameter, effective diameter, and combinations of these parameters, for individual dose adaptation in adult 13 and pediatric CT scans of the torso and truncated axial images. 8,[24][25][26][27] The size surrogates of AAPM Report 204, however, are based only on patient geometry and do not consider the different attenuation of various tissue types. For example, the lung was considered a caveat 28 because its density is much lower than that of water or PMMA, which makes the attenuation of the patient's chest significantly lower than that of the 32 cm reference CTDI vol phantom.
This limitation was addressed in detail in AAPM Report 220, 29 which recommends the sole use of water-equivalent diameter (D W ), which considers tissue attenuation in addition to patient geometric size, for calculations of SSDE. The use of D W had been proposed before AAPM Report 220. 13,18,30,31 Wang et al. 30 demonstrated that D W is more accurate than the geometric size surrogates for calculating SSDE in thoracic CT, but that D W and the geometric size surrogates both perform and correlate well for the abdomen and pelvis. AAPM Report 220 collected experimental data acquired using cylindrical phantoms and Monte Carlo simulations. The analysis assumed that the limited collection of elliptical phantoms of different sizes and the family of Monte Carlo phantoms together spanned what is seen clinically.
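As a rough illustration of how a size surrogate scales the scanner-reported CTDI vol into an SSDE, the sketch below applies an exponential conversion factor for the 32 cm reference phantom. The coefficients are quoted from memory of AAPM Report 204 and should be treated as an assumption to verify against the report; the function names are hypothetical and are not part of this study.

```python
import math

# Exponential fit coefficients for the 32 cm body phantom, as recalled from
# AAPM Report 204 (assumption: check against the report before any real use).
A_32CM = 3.704369
B_32CM = 0.03671937  # per cm of patient diameter


def conversion_factor_32cm(diameter_cm: float) -> float:
    """Size-dependent factor f(D) = a * exp(-b * D) for the 32 cm phantom.

    diameter_cm is the size surrogate in cm (effective diameter per
    Report 204, or water-equivalent diameter per Report 220).
    """
    return A_32CM * math.exp(-B_32CM * diameter_cm)


def ssde_mgy(ctdi_vol_mgy: float, diameter_cm: float) -> float:
    """SSDE = f(D) * CTDIvol, with CTDIvol referenced to the 32 cm phantom."""
    return conversion_factor_32cm(diameter_cm) * ctdi_vol_mgy


if __name__ == "__main__":
    # Smaller patients receive a larger factor for the same CTDIvol.
    for d in (20.0, 30.0, 40.0):
        print(d, round(ssde_mgy(10.0, d), 2))
```

The factor decreases monotonically with diameter, which is why an underestimated patient size (for example, from ignoring low lung attenuation) changes the reported SSDE.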
Ikuta et al. 25 evaluated D E and D W and found good correlation; however, their method differed from AAPM Report 220 in that they used four separate slices, corresponding to the lung apex, the superior aspect of the aortic arch, the carina, and immediately superior to the diaphragm, without averaging over the thorax and abdomen. The AAPM 204/220 Reports, however, allow the use of the center of the scan range, calling it a "shortcut" relative to averaging a size surrogate over the entire scan range. Leng et al. 32  can also vary the dose angularly about the patient. 6,36,37,[40][41][42][43][44][45][46]   respectively. The data shown in Fig. 1 were collected using the scan parameters listed in Table 1 for the routine adult abdomen pelvis dataset, which used angular dose modulation.
The ellipticity ratio is involved in setting the angular dose modulation value; however, to our knowledge, only one paper in the literature reports ellipticity values. 42 Therefore, in this paper, we report the ratio of LAT to AP for multiple body regions, including the head, for hundreds of patients. We do not report on how this value influences a CT scanner's dose modulation since that is highly vendor dependent and "black box" in nature. However, several papers in our field are actively "reverse engineering" vendors' AEC algorithms for research and clinical purposes. [47][48][49] The ellipticity data we report here can be included in such efforts.
As motivated in the previous paragraphs, the purpose of this paper is to confirm AAPM Reports 204/220 and provide data for the future expansion of these reports by: (a) presenting the first large-scale confirmation of the reports using clinical data, (b) providing the community with size surrogate data for the head region, which was not provided in the original reports, and additionally providing measurements of patient ellipticity ratio for different body regions.

2.A | Experimental data collection
A total of 884 patients were included in our analysis. The patients' data were collected from three different examination types and binned into six different sets. Table 1 lists the datasets and their scan parameters. For the purpose of analysis, we calculated the AP, LAT, and D E for each slice in each dataset and then report the average over all slices for each patient or each subset of patient data as defined in Table 1.
We define the ellipticity ratio as r = LAT/AP. The variable r is calculated for every slice and then averaged over all slices in a given dataset for each patient as described in Table 1. We also report the standard deviation in r and the minimum and maximum r values observed for each dataset shown in Table 1. AAPM 204 uses a second-order fit to relate D E to AP or LAT. The authors of AAPM 204 use a first-order fit to relate D E to AP + LAT. We believe a second-order fit gave a better result for AP or LAT because of the phantoms used in the AAPM study. For a fixed ellipticity ratio, D E should be proportional to AP or LAT. The relationship between D E and LAT (or AP with a simple substitution using r = LAT/AP) is
$$D_E = \sqrt{AP \times LAT} = \sqrt{k}\,LAT$$
where k = 1/r. In other words, for a fixed ellipticity ratio, a first-order fit should be adequate to relate D E to LAT or AP. The AAPM 204 report, however, includes cylindrical phantoms (r = 1) and some elliptical phantoms of a fixed r but varying size. We believe this is why the authors used a second-order fit between D E and AP or LAT: not because the underlying relationship between D E and AP or LAT warranted it, but because the combination of varying r values made their data nonlinear. Therefore, we chose to use a first-order fit of our clinical data since it includes hundreds of patients with varying r values.
TABLE 1 Experimental data collection of human patients for routine adult abdomen and pelvis, adult chest, adult head, and pediatric abdomen pelvis cases (the pediatric data included five different protocols, hence the range in NI, pitch, and slice thickness). † Denotes datasets that are derived from the adult chest dataset scan range. The Noise Index (NI) refers to a vendor-specific automatic exposure control setting. Other vendor-specific reconstruction options were set as follows: "PLUS" mode, recon kernel of "STANDARD" for the body and "SOFT" for the head, and an ASiR level of 40%.
We assumed a given body region in a human would have a distribution of r values with a mean that would be characteristic of that body region. Furthermore, as seen in our results, the second-order fits of AAPM 204 phantom data fall within our confidence intervals.
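The per-slice geometric surrogates discussed above can be sketched as follows. This is a minimal illustration, not the processing pipeline used in this study: it assumes a non-empty binary patient mask per slice (thresholding and segmentation already done), takes AP and LAT as the bounding-box extents of that mask, and uses the AAPM Report 204 definition of effective diameter as the square root of AP times LAT. All function names are hypothetical.

```python
import numpy as np


def slice_size_metrics(mask: np.ndarray, pixel_mm: float):
    """Return (AP, LAT, D_E, r) in mm for one axial patient mask.

    Assumes image rows run along the anterior-posterior direction and
    columns along the lateral direction, and that the mask is non-empty.
    """
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing patient
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing patient
    ap = (rows[-1] - rows[0] + 1) * pixel_mm
    lat = (cols[-1] - cols[0] + 1) * pixel_mm
    d_e = np.sqrt(ap * lat)  # effective diameter
    r = lat / ap             # ellipticity ratio
    return ap, lat, d_e, r


def exam_size_metrics(masks, pixel_mm: float) -> dict:
    """Average the per-slice metrics over all slices of one examination."""
    per_slice = np.array([slice_size_metrics(m, pixel_mm) for m in masks])
    mean = per_slice.mean(axis=0)
    return {
        "AP": mean[0],
        "LAT": mean[1],
        "D_E": mean[2],
        "r": mean[3],
        "r_std": per_slice[:, 3].std(),
    }
```

For a single slice, D_E equals LAT divided by the square root of r, so at a fixed ellipticity ratio the effective diameter is linear in LAT (or AP), which is the argument for a first-order fit.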

2.C | Water-equivalent diameter
Previous studies express the x-ray attenuation of a patient in terms of a water cylinder with a water-equivalent diameter (D W ). 12,13,[30][31][32]35,36,50 In other words, D W represents the diameter of a cylinder of water that has the same total x-ray attenuation as the patient's axial cross section, and it depends on both the cross-sectional area of the patient and the attenuation of the contained tissues. This method of calculating D W was described in AAPM Report 220, and we implemented it here with the equation
$$D_W = 2\sqrt{\left[\frac{\overline{CT(x,y)}_{ROI}}{1000} + 1\right]\frac{A_{ROI}}{\pi}}$$
where $A_{ROI}$ is the area of the ROI. The $\overline{CT(x,y)}_{ROI}$ represents the mean CT number within the recon-  Table 1 and plotted D W as a function of D E for each subset. All plots were fitted using a linear fitting routine (polyfit function from MATLAB, The MathWorks Inc., Natick, MA, USA). We applied a first-order linear fit and linear regression (R 2 ) to all data points combined and calculated 95% confidence intervals for all data points. A 95% confidence interval indicates a 0.95 probability that the interval contains the true population mean. We report the confidence interval in millimeters; this number is the distance from the trend line to the confidence interval, so the range between confidence intervals is double the reported value. We considered points outside this confidence interval to be outliers, and we analyzed each of them to characterize deviations from the correlation shown in the AAPM reports that may be present in the clinic.
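The water-equivalent diameter calculation of AAPM Report 220 referred to above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the code used in this study: the slice is assumed to be in Hounsfield units, the ROI is passed in as a boolean mask (which pixels to include is the caller's choice), and the function name is hypothetical.

```python
import numpy as np


def water_equivalent_diameter(hu_slice: np.ndarray,
                              roi_mask: np.ndarray,
                              pixel_mm: float) -> float:
    """Water-equivalent diameter (mm) of one axial slice.

    Implements D_W = 2 * sqrt((mean_HU / 1000 + 1) * A_ROI / pi), where
    mean_HU is the mean CT number inside the ROI and A_ROI is its area.
    """
    area_mm2 = roi_mask.sum() * pixel_mm ** 2
    mean_hu = hu_slice[roi_mask].mean()
    water_equivalent_area = (mean_hu / 1000.0 + 1.0) * area_mm2
    return 2.0 * np.sqrt(water_equivalent_area / np.pi)


if __name__ == "__main__":
    # Toy example: a 200 mm diameter water disk surrounded by air.
    yy, xx = np.mgrid[0:256, 0:256]
    disk = (yy - 128) ** 2 + (xx - 128) ** 2 <= 100 ** 2
    hu = np.where(disk, 0.0, -1000.0)
    roi = np.ones_like(disk, dtype=bool)  # whole image as the ROI
    print(water_equivalent_diameter(hu, roi, 1.0))
```

Because air at -1000 HU contributes zero water-equivalent area, including surrounding air in the ROI does not change the result, whereas low-density lung inside the patient reduces D W relative to the geometric size.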
Figure 4(b) shows the UW fit of D W as a function of D E together with data points taken from AAPM Report 220 Table 1 for abdomen and Table 2 for thorax; these points fall within our 95% confidence interval.
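The fitting and outlier check described in this section can be approximated with the sketch below. It is written in Python rather than MATLAB, and it makes an assumption the text does not spell out: the constant half-width of the 95% band is taken as 1.96 times the residual standard deviation about the first-order fit. The function names are hypothetical.

```python
import numpy as np


def linear_fit_with_band(x, y, z: float = 1.96):
    """First-order fit of y on x with R^2 and a constant band half-width.

    Returns (slope, intercept, r2, half_width). The half-width is
    z * residual standard deviation (assumption, see lead-in).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    half_width = z * residuals.std(ddof=2)  # two fitted parameters
    return slope, intercept, r2, half_width


def outside_band(x, y, slope, intercept, half_width):
    """Boolean array: True where a point lies outside the band."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.abs(y - (slope * x + intercept)) > half_width
```

The same `outside_band` check can be applied to external reference points (for example, phantom-derived values) to see whether they fall within a band fitted to clinical data.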

4 | DISCUSSION
For all data excluding the head, we show in Fig. 4(a) that our linear fits of D E as a function of (AP + LAT)/2, LAT, and AP compare well to the results of AAPM Report 204. For D E as a function of AP or LAT as shown in Fig. 4(a), we did not observe the same curvature as AAPM Report 204; however, the spread in our data, shown by the 95% confidence interval, could be hiding such behavior.
As reported in AAPM Report 204, the physical phantoms used  Table 2. Only the LAT comparison from the AAPM Report 204 data is outside our 95% confidence interval, for patient LAT dimensions over 400 mm. In Fig. 4(b), all AAPM Report 220 data points lie within our 95% confidence intervals for both the abdomen and thorax AAPM data. Using a large set of patient data, we obtained results that agree with the phantom-based results of AAPM Reports 204 and 220. Our dataset is the largest clinical dataset used for this purpose to date and has allowed us to identify a number of outlier cases not previously reported in the literature. A few outlier cases deviated from our fits and from the correlation shown in the AAPM task group reports. Figure 6 displays the outliers seen in Fig. 3(a). Fig. 6(a) corresponds to the adult chest outlier in Fig. 3(a) which was also 20 mm below the fit line. This is due to a higher ratio of lung space to soft tissue in the thorax relative to other adult chest scans (e.g., compare to Fig. 6(b)) and a lower amount of subcutaneous fat relative to other adult chest patients of the same geometric size. The head outlier case shown in Fig. 6(c) was 23 mm above the fit line for all head scans. This patient presents with cranial metaphyseal dysplasia (excess bone in the head); comparison with a "normal" adult head (Fig. 6(d)) makes it clear that the excess bone is the reason for the higher D W relative to other heads of the same geometric size. We do not show the adult abdomen pelvis outliers that can be seen in Fig. 3(a). We analyzed these cases and noted that they were always below the fit line and corresponded to cases that included more of the thorax region than was typical for a routine abdomen pelvis scan.
Clinically, this is warranted in some cases when: (a) a radiologist requests coverage into the thorax or (b) for patients with lung bases that extend deep within the abdomen or conversely a diaphragm/liver dome that extends deep within the thorax. Therefore, when one scans an abdomen pelvis and includes more of the lungs than is typical for such a scan, D W will decrease.
Comparison of AAPM Report 220 D W as a function of D E calculated our fit for pediatric and adult abdomen pelvis data (blue) with 95% confidence intervals (blue dashed line) and our fit for adult thorax only (green) with 95% confidence intervals (green dashed line). In (b), AAPM Report 220 points for abdomen (red asterisk) and thorax (red plus sign) are plotted over our fits.
Ikuta et al. compared D W to D E for the thorax and abdomen and reported poor (R 2 = 0.51) and good (R 2 = 0.90) correlation in those regions, respectively. 25 Our correlation coefficients are much higher than the Ikuta result. We believe that the source of this difference is sample size. We analyzed on average 110 image slices for each of our chest datasets, whereas Ikuta et al. looked at 50 patients and measured four slices per patient. The four slices corresponded to the lung apex, the superior aspect of the aortic arch, the carina, and immediately superior to the diaphragm. Ikuta et al. reported fitting statistics not on the average of their four measurements per scan, but for each measurement point individually. If we compare D W and D E for each point in our chest dataset individually (not plotted in this paper) and perform no examination averaging, our correlation coefficient drops from 0.937 to 0.589 for the adult chest data. This can be understood by looking at Fig. 3(b): the four measurement points taken by Ikuta et al. span the three different anatomical regions within a routine chest scan, namely the shoulders, thorax, and abdomen. These regions, for the same geometric size, exhibit relatively large differences in D W .
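The effect of examination averaging on the correlation coefficient can be illustrated with a small sketch. It is a toy demonstration, not a reanalysis of our data or of Ikuta et al.: the function names are hypothetical, and for a first-order fit R 2 is taken as the squared Pearson correlation.

```python
import numpy as np


def r_squared(x, y) -> float:
    """R^2 of a first-order fit, i.e., the squared Pearson correlation."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)


def per_slice_and_per_exam_r2(d_e, d_w, exam_id):
    """Return (R^2 over individual slices, R^2 over per-exam averages).

    d_e, d_w : per-slice effective and water-equivalent diameters
    exam_id  : label identifying which examination each slice belongs to
    """
    d_e = np.asarray(d_e, dtype=float)
    d_w = np.asarray(d_w, dtype=float)
    exam_id = np.asarray(exam_id)
    ids = np.unique(exam_id)
    mean_de = np.array([d_e[exam_id == i].mean() for i in ids])
    mean_dw = np.array([d_w[exam_id == i].mean() for i in ids])
    return r_squared(d_e, d_w), r_squared(mean_de, mean_dw)


if __name__ == "__main__":
    # Synthetic exams: each has one "thorax-like" slice (D_W 40 mm below
    # D_E) and one "abdomen-like" slice (D_W equal to D_E).
    sizes = 250.0 + 10.0 * np.arange(10)
    d_e = np.repeat(sizes, 2)
    d_w = d_e + np.tile([-40.0, 0.0], 10)
    exam_id = np.repeat(np.arange(10), 2)
    print(per_slice_and_per_exam_r2(d_e, d_w, exam_id))
```

In this synthetic case the region-dependent offset in D W acts as scatter when slices are treated individually and cancels when slices are averaged per examination, which mirrors the qualitative behavior described above.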
For the chest relative to the abdomen, we expected D W to be much lower because of the thorax (air-filled regions of the lung). 28 We examined a few adult chest patients' scans and noticed that the shoulders and abdomen were included; it is necessary to include them in a routine adult chest procedure to ensure the lung apices and bases are covered. We separated the chest region into subset regions of adult shoulders, adult thorax, and adult abdomen only, shown in Fig. 3  results. The takeaway is that the contributions from all body regions included within an examination must be considered when discussing patient size surrogates. This is especially true since x-ray attenuation changes drastically as one moves from the abdomen to the thorax and from the mid-thorax up into the lung apices (i.e., into the shoulders). 32 At such boundaries between patient body regions, vendors' AEC algorithms are likely to greatly change the tube output.
We were also surprised to notice that the adult shoulder data appeared to have a much lower D W than the abdomen for the same D E , as shown in Fig. 3(b). The adult shoulder data are clinically relevant because this body region corresponds to cervical spine imaging, neck CTA imaging, and shoulder imaging. The shoulders are also included in the scanning of other body regions such as the chest, as shown in the present analysis. One would expect the shoulders to have a higher D W relative to the abdomen for the same D E because of the bony anatomy of the shoulders and arms, although some air-filled regions could also be present due to the lung apices. However, we found that the D W for the shoulders is shifted to the right (i.e., decreased D W value) because the adult shoulders' LAT dimension is relatively larger compared to the adult thorax and adult abdomen only, and
For the head, we related D W to LAT, AP, and (AP + LAT)/2 in Fig. 5 and D W to D E in Fig. 3(a). We found poor correlation between the size surrogates for the head overall. We noted that our image processing steps for obtaining the geometric size-based metrics AP and LAT (from which D E is derived) included the ears and nose.
Therefore, for patients whose ears protrude far from the head, the LAT measurement increases, predicting that the patient is more attenuating than is desirable for the purposes of SSDE calculations. We noticed the same behavior for the nose and the AP length calculation. We also noted that the angle of the head (defined by a line connecting the orbits and ear canal, i.e., the orbital-meatal line) varied from patient to patient and affected the AP and LAT measurements. We confirmed that we were able to remove the head holder and couch from the geometric size measurements of AP and LAT, so size contributions from these non-patient objects were not present in our data. We confirmed this by manually reviewing the thresholded and segmented axial images described in Section 2.B.
The relatively poor correlation (R 2 = 0.81206) for D W vs D E for the adult head scan in Fig. 3(a) was not surprising considering that it was similar to the correlation in the work by McMillan et al. (R 2 = 0.87). In their work, they used the slice above the eyes (i.e., a single slice), which differs from our use of the entire head scan range and could explain their slightly better correlation coefficient.
Another external comparison of our data can be made to that of Anam et al. 35 Table 2 to determine the LAT or AP dimension given an AP or LAT measurement, respectively.
One limitation of our study is that we did not relate our patient size surrogates directly to dose as other studies have done, and although this was not the purpose of this study, it is important to note.
Such a comparison will be highly vendor dependent, as each vendor's AEC implementation will respond differently to the size surrogates presented in this paper and additionally to other influences like patient ellipticity and geometric magnification. We did not remove the couch or head holder when calculating D W . We consider this acceptable because our results, as shown in Fig. 4(b), agree with the AAPM Report 220 results, which gives us confidence that a couch removal strategy is not required. Furthermore, Anam et al. 35 demonstrated that the couch has minimal impact on D W .

5 | CONCLUSION
Following AAPM Reports 204/220 using a clinical dataset containing 884 patients, we made the following specific conclusions:
1. We identified sources of outliers in our data that deviate from the trend lines shown in AAPM Reports 204/220, including: medical conditions causing excess bone formation inside the skull (cranial metaphyseal dysplasia), lack of subcutaneous fat relative to others in the patient population (low BMI), and deviations from typical scan ranges for a particular examination type (e.g., including parts of the thorax in an abdominal pelvis scan).
2. We applied the methodologies of the size surrogates of AAPM Report 204 and AAPM Report 220 to different body regions and age groups including the head. The head has not previously been reported on using the framework of the AAPM Reports 204/220. Our fit lines for D E and D W for the abdomen and chest agreed with the AAPM 204 and 220 within our 95% confidence intervals.
3. For the first time to our knowledge, we report patient ellipticity values derived from clinical scans. We report values for adult chest, adult abdomen pelvis, adult head, pediatric abdomen pelvis, adult shoulder, adult thorax, and adult abdomen body regions. Such a description of patient form/shape will be needed to understand and reverse engineer some CT vendors' "black box" AEC algorithms.

IRB STATEMENT
All data were collected under an IRB-approved protocol in a retrospective manner in which the patient consent was waived.

ACKNOWLEDGMENTS
This work is supported by an equipment grant from GE Healthcare. The authors thank Dominic Crotty, Ph.D. and John Boudry, Ph.D. from GE Healthcare for discussions related to this work. The authors give special thanks to Sebastian Schafer, Ph.D., for translating the references written in German.

CONFLICT OF INTEREST
TPS receives research support, is a consultant, and supplied CT protocols under a licensing agreement to GE Healthcare. TPS is the founder of protocolshare.org. CSB receives research support from GE Healthcare. For CSB there are no disclosures. TPS is on the MAB of iMALOGIX LLC.