Automated phantom analysis for gamma cameras and SPECT: A methodology for use in a clinical setting

Abstract

Purpose: We introduce an automated, quantitative image analysis package for gamma camera and single photon emission computed tomography (SPECT) quality control. Our focus was to produce consistent methods that are feasible in clinical settings and use standard phantoms.

Methods: Four gamma cameras were used to acquire planar images of four-quadrant bar phantoms and projection views of an American College of Radiology (ACR) phantom as part of a standard gamma camera quality control program. Images were sent to QC-Track® (Atirix Medical Systems, Inc., Minneapolis, MN, USA), which automatically placed predetermined regions of interest (ROIs) and performed analysis. For the bar phantom, a standard deviation (SD)-based modulation transfer function (MTF) was calculated for a circular ROI in each quadrant, and the bar widths at various MTF values were reported using linear interpolation as applicable. For the ACR phantom, the contrast-to-noise ratio (CNR) for each sphere, a modulation for each rod sector, and a percent deviation for the uniformity ROIs were calculated. The sphere size corresponding to a CNR of 3 and the rod sizes at various modulations were also reported using linear interpolation. Visual analysis was performed by three medical physicists to evaluate interobserver variability and correlation with the quantitative values.

Results: Analysis of the bar phantom showed predictable differences with changes in matrix size and bar width, and was consistent across similar acquisitions over the course of the study. Analysis of the ACR phantom showed increasing CNR and modulation with increasing sphere and rod diameter, as expected. For both phantoms, quantitative values from linear interpolation correlated well with visual analysis.

Conclusion: Our automated method for quantitative image analysis is consistent and shows increased precision and sensitivity compared to standard visual methods. Thresholds correspond well with visual analysis and previous guidelines for observer visibility (e.g., the Rose criterion), making our framework suitable for routine use in a nuclear medicine department.


1 | INTRODUCTION
Consistent quality control (QC) is key to maintaining image quality in a nuclear medicine department. Routine assessment of a gamma camera's energy peaking, planar intrinsic and extrinsic uniformity, intrinsic and extrinsic spatial resolution, and spatial linearity has been recommended by accrediting bodies such as the American College of Radiology (ACR) 1 and the Intersocietal Accreditation Commission (IAC). 2 For single photon emission computed tomography (SPECT) systems, additional tests using volumetric phantoms with different inserts are recommended to assess tomographic spatial resolution, cold sphere contrast detectability, and uniformity.
Energy peaking and planar uniformity testing for gamma cameras is often performed using manufacturer-supplied software. For the remaining tests, there are a variety of phantoms and methodologies.
For planar spatial resolution and linearity testing, many phantoms, such as slit, orthogonal hole, parallel line equal spacing, and bar, are commercially available. It has been previously noted that four-quadrant bar phantoms are commonly used clinically due to their convenience; 3 they can be used in combination with a sheet or a point source to evaluate both intrinsic and extrinsic spatial resolution, as well as linearity. For evaluation of tomographic acquisitions (i.e., SPECT), the ACR-approved Jaszczak phantom is used at many sites and is required for ACR accreditation. 4 As such, both the four-quadrant bar phantom and the ACR phantom are important tools for a complete clinical nuclear medicine QC program.
In the clinic, QC phantom images are often evaluated visually, subjecting the tests to inter- and intraobserver variability and bias. 5 In some cases, knowledge of pass/fail thresholds and pressure to keep imaging units active for patient imaging may influence the evaluator to pass a test that may be borderline by visual assessment.
Additionally, qualitative analysis of phantom images is insensitive to small degradations in image quality, and evaluation is limited to the discrete levels of assessment dictated by the physical design of the phantom. As a whole, visual analysis lacks precision and reproducibility, but has proven valuable for uniformity and artifact evaluations. 6,7 Conversely, automated, quantitative analysis ensures objectivity and consistency while increasing sensitivity. Historically, the AAPM has described standardized methods for performing QC tests and methods to quantify their results. AAPM Report 9 of the Nuclear Medicine Task Group described methods and standard definitions for quantifying integral and differential uniformity, statistical uniformity index, dead time, sensitivity, relative sensitivity (per collimator), spatial linearity, and the full-width at half maximum (FWHM) of a line source. 6 AAPM Report 22 further described a set of quantitative tests for gamma cameras with rotating heads. 8 These include tests of system alignment, collimator hole angulation, tomographic uniformity and contrast, and attenuation correction. AAPM Report 52 was designed to be a comprehensive performance testing program for SPECT. 9 It provided testing instructions, quantification methods, and acceptable values for physicist tests of SPECT systems, including rotational field uniformity and sensitivity, tomographic uniformity, spatial and contrast resolution, and attenuation correction.
More recently, AAPM Report 177 included valuable updates and described methods suitable for clinical physics acceptance testing and annual evaluation of gamma cameras and SPECT units. 10 It also included descriptions of routine clinical quality control tests and is the most current AAPM report on SPECT/gamma camera QC.
Independently, others have created software and described their own quantitative metrics for gamma camera QC. Hasegawa et al. presented software that analyzes a custom orthogonal hole phantom and volumetric flood source to evaluate spatial resolution, linearity, and uniformity. 11 Hander et al. described a method to measure gamma camera spatial resolution using the mean and standard deviation of ROIs placed on a four-quadrant bar phantom. 12,13 Madsen showed that annular sampling of a tomographic uniformity image provides better statistical separation of images with and without ring artifacts compared to nonsampled methods. 14 More recently, De Nijs et al. presented MATLAB-based software to calculate NEMA NU-1 2007 based quality control metrics. 15 Nelson et al. showed that noise texture analysis could better assess planar uniformity floods compared to pixel-based methods. 7 Hirtl et al. designed and built an ImageJ plugin that automatically places ROIs, performs SPECT contrast measurements for the Jaszczak phantom spheres and rods, and tests uniformity by means of a Hough transform or Student's t test. 16 Nichols explored various image texture metrics to link qualitative statements of phantom sphere and rod visibility to quantifiable parameters, including count quantile metrics, gray-level co-occurrence matrix metrics, image contrast metrics, and count histogram metrics. 17 DiFillipo tested software that evaluates contrast of planar ACR rods images, and further explored how creating receiver operating characteristic (ROC) curves for rods sections in the tomographic reconstructions of the ACR SPECT and PET phantoms can better classify rod visibility. 18 While these methods are certainly viable and have demonstrated great improvements over visual analysis, they are better suited for academic or research centers with ample resources, time, and the knowledge to implement them for QC. Most of these methods are not feasible for routine clinical evaluation and have not seen widespread clinical implementation. Many of the methods presented previously either do not supply software to perform the appropriate calculations, utilize software not commonly found in most clinics (e.g., ImageJ, MATLAB), require significant user input during analysis, demand some level of programming knowledge, or greatly increase the time required to perform QC. As a result, most clinical physicists and technologists choose not to adopt these methods and instead resort to visual analysis due to its efficiency.
The goal of this work was to create an automated QC workflow that is as efficient as visual analysis while adding the objectivity, consistency, and sensitivity of quantitative analysis. Here we propose a method packaged for easy clinical implementation, not only by physicists but also by technologists, so that the robustness of automated analysis can be applied across the QC program.

2.A | QC-Track
QC-Track® (Atirix Medical Systems, Inc., Minneapolis, MN, USA) is commercially available software for diagnostic imaging quality control that allows representative images of phantoms to be used to build standard phantom analysis templates. Quality control data for each imaging unit can then be saved within QC-Track by sending images to the software via an image router or secondary destination.
QC-Track assigns images to the appropriate device and phantom template using information in the DICOM header and requires minimal user input through a web-based user interface to save and report quality control data. All automated image analysis was performed in QC-Track with the help of the clinical nuclear medicine technologists that typically perform QC in the department.
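As an illustration of this routing step, the sketch below shows how images might be assigned to a device and phantom template from DICOM header fields. QC-Track's actual matching logic is proprietary; the header fields, rule table, and template names here are assumptions for illustration only.

```python
from typing import Optional
import pydicom

# Hypothetical (StationName, SeriesDescription keyword) -> template rules;
# the field choices and names are illustrative, not QC-Track's actual logic.
TEMPLATE_RULES = {
    ("SYMBIA_1", "BAR"): "bar_phantom_1",
    ("SYMBIA_1", "ACR"): "acr_phantom",
}

def assign_template(dicom_path: str) -> Optional[str]:
    """Route an incoming image to a phantom template via header fields."""
    ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    station = str(getattr(ds, "StationName", "")).upper()
    series = str(getattr(ds, "SeriesDescription", "")).upper()
    for (unit, keyword), template in TEMPLATE_RULES.items():
        if station == unit and keyword in series:
            return template
    return None  # unmatched images fall back to manual assignment
```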
QC-Track also allows data to be exported as .csv files for import into a researcher's preferred analysis software. In this study, quality control data were exported from QC-Track and analyzed with in-house Python scripts.

2.A.1 | Phantom and worksheet templates
Representative images of each phantom were used to build standard phantom templates. These templates were designed to read DICOM header information to correctly assign images, use image processing techniques to detect the phantom within each image, automatically place ROIs, and perform the appropriate calculations. To limit computational complexity, phantom templates were built using ROIs with a fixed position relative to the center of the phantom, and thus require a consistent phantom orientation during acquisition. Worksheet templates were built to automatically populate QC data from phantom images. We focused our initial efforts on designing templates for the bar and ACR phantoms as most gamma camera and SPECT manufacturers provide on-board software to analyze flood uniformity and center of rotation test results. The phantom and worksheet templates for the ACR phantom were built such that the user can select which transaxial slice they would like to analyze for each section of the phantom.

Bar phantom
For the bar phantom, three phantom templates were designed: one for rectangular phantoms with bar sizes 3.5, 3.0, 2.5, and 2.0 mm ["bar phantom 1" - Fig. 1(a)], one for rectangular phantoms with bar sizes 4.0, 3.5, 3.2, and 2.5 mm ["bar phantom 2" - Fig. 1(b)], and the third for square phantoms with bar sizes 6.4, 4.8, 4.0, and 3.2 mm ["bar phantom 3" - Fig. 1(c)]. All three templates were built to accommodate any square image matrix size (256 × 256, 512 × 512, etc.) by using information available in the DICOM header. All three templates apply a morphological dilation, followed by an erosion and Gaussian blur, to eliminate small objects in the image and reduce noise. A threshold algorithm is used to detect the phantom and find the position of the phantom center (P_x, P_y). For bar phantoms 1 and 2, circular ROIs with 132 mm diameters are placed within each quadrant at locations (P_x ± 134 mm, P_y ± 104 mm). For bar phantom 3, circular ROIs with 132 mm diameters are placed within each quadrant at locations (P_x ± 106 mm, P_y ± 106 mm) due to the difference in phantom shape.
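The following is a minimal sketch of these template steps (dilation, erosion, Gaussian blur, thresholding, center finding, and fixed-offset circular ROIs). The structuring-element sizes, blur sigma, and threshold fraction are illustrative assumptions; only the ROI diameter and center offsets come from the description above.

```python
import numpy as np
from scipy import ndimage

def quadrant_rois(img: np.ndarray, pixel_mm: float,
                  dx_mm: float = 134.0, dy_mm: float = 104.0,
                  roi_diam_mm: float = 132.0) -> dict:
    """Return one circular ROI mask per bar-phantom quadrant."""
    # Dilation followed by erosion (a morphological closing) removes
    # small objects; a Gaussian blur suppresses residual noise.
    closed = ndimage.grey_erosion(ndimage.grey_dilation(img, size=5), size=5)
    smooth = ndimage.gaussian_filter(closed.astype(float), sigma=2.0)
    # Threshold to segment the phantom, then locate its center (P_x, P_y).
    mask = smooth > 0.1 * smooth.max()
    p_y, p_x = ndimage.center_of_mass(mask)        # (row, col) in pixels
    yy, xx = np.mgrid[:img.shape[0], :img.shape[1]]
    r_pix = (roi_diam_mm / 2.0) / pixel_mm
    rois = {}
    # Quadrant labels assume a fixed phantom orientation, as in the text.
    for name, (sx, sy) in {"Q1": (-1, -1), "Q2": (1, -1),
                           "Q3": (-1, 1), "Q4": (1, 1)}.items():
        cx = p_x + sx * dx_mm / pixel_mm
        cy = p_y + sy * dy_mm / pixel_mm
        rois[name] = (xx - cx) ** 2 + (yy - cy) ** 2 <= r_pix ** 2
    return rois
```

For bar phantom 3, the same function would simply be called with dx_mm = dy_mm = 106.0.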

ACR phantom
For the ACR phantom, templates were designed for the rods, spheres, and uniformity sections. Like the bar phantom templates, these apply a morphological dilation followed by an erosion to eliminate small objects in the image. A threshold algorithm is used to detect the phantom, as well as the position of the phantom center (P_x, P_y), for each axial slice. For the spheres section, a circular background ROI with a 69.2 mm diameter is placed at the phantom center (P_x, P_y). Circular ROIs are then placed over each sphere at fixed positions relative to the phantom center.

2.B.2 | ACR SPECT phantom
The ACR SPECT phantom contains sections for cold sphere contrast detectability (cold sphere inserts), cold rod spatial resolution (cold rod inserts), and uniformity.

Spheres
For the spheres section, functions were built within the ACR phantom template to calculate a contrast-to-noise ratio (CNR) between each sphere and the central background ROI. The CNR equation was adapted from the contrast measurement for the spheres section described in both AAPM Report 52 and AAPM Report 177. 9,10
Additionally, a linear interpolation across the CNR values for the three smallest spheres was built within the template to report the lesion size corresponding to a CNR of 3, based on the lower limit of the Rose criterion. 19
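A hedged sketch of the spheres analysis follows: a per-sphere CNR and a linear interpolation to the sphere size at CNR = 3. The exact CNR equation adapted from AAPM Reports 52 and 177 is not reproduced in this excerpt, so the cold-sphere form below (background minus sphere counts, divided by background noise) should be read as an assumption.

```python
import numpy as np

def sphere_cnr(sphere_mean: float, bkg_mean: float, bkg_sd: float) -> float:
    """Assumed cold-sphere CNR: contrast vs. background over background noise."""
    return (bkg_mean - sphere_mean) / bkg_sd

def size_at_cnr(diams_mm, cnrs, target: float = 3.0):
    """Interpolate the sphere diameter corresponding to the target CNR.

    Inputs are the diameters and CNRs of the three smallest spheres, per
    the template description; returns None outside the sampled range.
    """
    d = np.asarray(diams_mm, dtype=float)
    c = np.asarray(cnrs, dtype=float)
    if not (c.min() <= target <= c.max()):
        return None  # no extrapolation outside the measured CNR range
    order = np.argsort(c)  # np.interp requires ascending x values
    return float(np.interp(target, c[order], d[order]))
```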

Rods
Since the line ROIs for the rods section are placed along the outer set of rods, the count profile of the line ROI can be approximated as sinusoidal, and a pixel-based modulation can be calculated across each ROI. For each axial slice containing the rods section of the phantom, the modulation in each rod sector was calculated. An aggregate modulation was also calculated by averaging the results across ten slices within the rods section. A linear interpolation across the modulations of the five largest rod sectors was built to report the hypothetical rod size that would correspond to modulation values of 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, and 0.5. The rod sizes at these modulation values were then used for correlation to a visual analysis and to set a quantitative threshold on the modulation.
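Under the stated sinusoidal-profile approximation, a pixel-based modulation can be written as (max − min)/(max + min); that classic definition is assumed in the sketch below, along with the 10-slice averaging and rod-size interpolation described above.

```python
import numpy as np

def profile_modulation(profile: np.ndarray) -> float:
    """Modulation of an approximately sinusoidal count profile (assumed form)."""
    hi, lo = float(profile.max()), float(profile.min())
    return (hi - lo) / (hi + lo) if (hi + lo) > 0 else 0.0

def rod_size_at_modulation(rod_mm, per_slice_mods, target: float):
    """Interpolate the rod size at a target modulation.

    rod_mm: diameters of the five largest rod sectors.
    per_slice_mods: shape (n_slices, 5) modulations, averaged over slices.
    """
    m = np.asarray(per_slice_mods, dtype=float).mean(axis=0)
    d = np.asarray(rod_mm, dtype=float)
    if not (m.min() <= target <= m.max()):
        return None  # target outside the measured modulation range
    order = np.argsort(m)  # ascending x for np.interp
    return float(np.interp(target, m[order], d[order]))
```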

Uniformity
The template and calculations for the uniformity section were modeled from uniformity tests employed in CT. Manufacturer-supplied uniformity phantoms, as well as the ACR CT phantom, 20 are evaluated by comparing the mean pixel intensity of four peripheral ROIs to a central ROI. Typically, a percent deviation is reported for each peripheral ROI. That same test was replicated here.
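A minimal sketch of this CT-style uniformity test, assuming the same fixed-offset ROI placement used elsewhere in the template:

```python
import numpy as np

def percent_deviation(peripheral_mean: float, central_mean: float) -> float:
    """Percent deviation of a peripheral ROI mean from the central ROI mean."""
    return 100.0 * (peripheral_mean - central_mean) / central_mean

def uniformity_report(central_roi: np.ndarray, peripheral_rois: dict) -> dict:
    """Return {ROI name: percent deviation} for each peripheral ROI."""
    c = float(central_roi.mean())
    return {name: percent_deviation(float(pix.mean()), c)
            for name, pix in peripheral_rois.items()}
```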

2.D | Image acquisition and reconstruction
Images were acquired using four dual-head Siemens Symbia gamma cameras with 3/8" crystal thickness (Siemens Healthcare, Erlangen, Germany), following ACR-recommended quality control procedures. Images using bar phantom 1 were acquired weekly over 18 months for both heads of each gamma camera using the specifications listed in Table 1. Images were acquired extrinsically, per the manufacturer's specification, and analyzed at matrix sizes of 256 × 256 and 512 × 512 for comparison and to establish the robustness of the technique. To assess the performance of the bar phantom computation over a wider range of bar widths, images of bar phantom 2 and bar phantom 3 were acquired for each head of one camera.
For the ACR phantom, images were acquired using the same four gamma cameras over the course of 12 months, and all images were reconstructed on a single Siemens workstation. The phantom was acquired every 3 months for each gamma camera using an ACR-recommended protocol, as specified in Table 2. Tomographic reconstructions were created with filtered back projection. The reconstructed slice thickness was the minimum available for the matrix size and zoom factor used to acquire the images; although this differs from the slice thickness described in the ACR protocol (6-9 mm slices), thinner slices were used to demonstrate the robustness of the algorithm, and we expect similar results with thicker slices.
2.E | Automated quantitative analysis

2.E.1 | Bar phantom

Immediately after acquisition, bar phantom images were sent to the server for analysis. The technologist responsible for QC then accessed the interface from a common workstation, found the appropriate "Bar Phantom" worksheet, and saved the results using the automated analysis. They also manually entered the visually resolvable set of bars on the same worksheet. Thresholds were set within the software to alert the technologist of a failure if the size of the smallest visually resolvable bars they entered was greater than 3.0 mm, per the ACR's recommended "Satisfactory" criterion for extrinsic resolution testing. 4 Over the course of the study, the images were also saved to a separate folder to build a research dataset. This dataset was retrospectively reanalyzed using the same analysis templates and exported for better visualization and easier manipulation of the data.
An MTF curve was created for each imaging unit using the mean calculated MTF at each bar size across all images acquired on that device. A range and standard deviation were calculated for each bar size on each imaging unit to test the reproducibility of the calculated MTF between different bar sizes.
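For reference, a sketch of the SD-based MTF estimate (after Hander et al. 12,13) and the per-unit aggregation is given below. The formula shown, which subtracts the Poisson variance (equal to the ROI mean) before estimating the bar-pattern modulation and applies a π/4 square-wave-to-sine correction, is our reading of the cited method rather than QC-Track's verified implementation.

```python
import numpy as np

def sd_based_mtf(roi_counts: np.ndarray) -> float:
    """Estimate the MTF at a quadrant's bar frequency from ROI statistics."""
    mu = float(roi_counts.mean())
    var = float(roi_counts.var(ddof=1))
    excess = max(var - mu, 0.0)  # variance beyond Poisson counting noise
    return (np.pi / 4.0) * np.sqrt(2.0 * excess) / mu

def unit_mtf_curve(mtfs_by_bar_mm: dict) -> dict:
    """Aggregate repeated measurements into (mean, SD) per bar size."""
    return {bar: (float(np.mean(v)), float(np.std(v, ddof=1)))
            for bar, v in mtfs_by_bar_mm.items()}
```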
2.E.2 | ACR phantom

ACR phantom reconstructions were likewise sent to the server for analysis. The technologist or physicist then accessed the interface from a common workstation, found the appropriate "ACR Phantom" worksheet, and saved the results using the automated analysis. On the same worksheet, they entered the smallest visually resolvable sphere and rod section. They also marked whether the uniformity was adequate and whether they observed any artifacts within the image.
ACR phantom images were also saved to a separate folder, retrospectively reanalyzed, and the data exported for further analysis. For each acquisition, every slice within the spheres section, rods section, and uniformity section was analyzed using the appropriate template.
For each of the sections, a single "best slice" was selected visually and the results shown across varying object size (e.g., CNR by sphere size, modulation by rod size).

2.F | Visual evaluation
A subset of the bar phantom images and ACR phantom reconstructions were evaluated visually by three diagnostic medical physicists with varying levels of experience. The original DICOM images were opened in ImageJ and viewed using 300% magnification. The smallest resolvable bar width, sphere, and rod section were reported. The physicists were also asked to mark the single "best slice" to analyze for the spheres and rods section of the ACR phantom. The visual evaluation was then compared to the quantitative methods to estimate appropriate thresholds corresponding to visual resolvability.
The physicists had no knowledge of the quantitative results of the images they were reading.

3.A | Bar phantom
Representative results from the bar phantom MTF analysis are seen in Fig. 3. The SD-based MTF algorithm was shown to be consistent, irrespective of matrix size [Fig. 3(a)]. For all cameras, there was complete separation in MTF values between the 3.5 and 3.0 mm bars, as well as between the 3.0 and 2.5 mm bars. However, overlap was observed between the 2.5 and 2.0 mm bars. These results are expected given that the MTFs for these bar sizes are below the resolution limit of the system.

3.B.2 | Individual slice analysis and visual comparison
For every ACR phantom acquisition, the visually selected "best slice" was analyzed for the spheres section. As seen in Fig. 6(a), the values of CNR vary across acquisitions but tend to increase with increasing sphere diameter, as expected. The correlation between sphere CNR and physicist visibility showed three categorizations, as with the bar phantom: spheres "visualized by all physicists," spheres "visualized by some physicists," and spheres "visualized by no physicists." The "visualized by some physicists" region corresponds to a sphere CNR of roughly 2.5-4. The results from the visual correlation for the spheres section are also shown. The 10-slice average modulation for each rod size for multiple acquisitions is shown in Fig. 6.

4.A.2 | Robustness to bar spacing
Results from the frequency response of a single imaging unit show that the algorithm works well across different bar widths, allowing the use of any bar phantom that samples at least seven line pair periods. 13 This flexibility lets clinical sites use whichever bar phantom they have available for their gamma cameras, with an ideal phantom sampling at least seven line pair periods in each quadrant.

4.A.3 | Quantitative analysis and visual correlation
The smallest resolvable bar width reported by physicists was dif-

4.B.2 | Phantom rotation
While the effects of ACR phantom rotation were not explicitly studied, the phantom templates expect the phantom to be in a certain orientation during acquisition.

4.C | Limitations
As described earlier, the ROIs for both the ACR and bar phantoms are placed using geometric considerations of the phantom, and phantom images are expected to have a certain orientation. This was done to limit the computational load and increase the efficiency of testing. The template is built to handle translations of the phantom but is likely highly sensitive to rotations. For both phantoms, this forces the phantom to be placed on the scanner in a consistent manner. Furthermore, for the bar phantom, each quadrant of the detector face will only be tested by the same quadrant of the phantom, and the template design also limits x- and y-directional testing of the system's spatial resolution.
Additionally, the uniformity test employed by our template is not optimized for the types of artifacts (e.g., ring or bullseye artifacts) commonly found in SPECT imaging. Other methods for uniformity suggested previously target these specific artifacts. 14

4.D | Routine use and clinical feedback
Our method has been user-friendly enough to be adopted by technologists with minimal additional training, and it has proven robust and efficient in a routine clinical setting. Technologists who regularly use the automated analysis module report that it simplifies analysis and saves time.

4.E | Future work
If ROI misalignment due to phantom rotation is found to be a significant usability issue, we plan to incorporate a coregistration algorithm that aligns the ROIs to each phantom acquisition. Additional functionality is desired to recommend the "best slice" for analysis of the spheres section and to analyze the interior rods of the ACR phantom rods section. We are currently developing templates to automate the analysis of uniformity floods using the NEMA algorithm and to calculate center of rotation misalignment. Additionally, we hope to design a template that allows the bar phantom orientation on the detector to be changed weekly, to better characterize the extrinsic resolution of the entire detector.

5 | CONCLUSION
Here we have presented a method for automated quantitative quality control for gamma cameras that maintains the efficiency of visual analysis while increasing the sensitivity and consistency of the test.
SD-based MTF analysis of the four-quadrant bar phantom shows good statistical separation of the MTF values between bar widths, allowing quantitative thresholds to be set for each imaging unit.
Likewise, CNR analysis of the ACR phantom spheres section and modulation analysis of the rods section provide good quantitative separation of results between different sized objects and correlate well with visual analysis, allowing for a more reliable evaluation of SPECT performance. While the methods described in this paper are similar to other previously proposed methods, our workflow is structured in a way that allows easy clinical implementation and routine use by both technologists and physicists. Practically speaking, our methodology can be used to improve routine clinical nuclear medicine QC programs, as quantitative metrics provide more consistent, sensitive measurements than visual analysis.

ACKNOWLEDGMENTS
The authors wish to thank the nuclear medicine technologists at UVA for uploading images to and using QC-Track as part of their QC routine.

AUTHOR CONTRIBUTIONS
This project was a collaborative effort. T.T., A.P., A.S., C.S., and P.C. all contributed to conceptual formulation of the idea. T.T., A.S., and C.S. carried out the initial algorithm design and implementation, and data analysis. A.P. and P.C. acquired data, coordinated clinical implementation, and interpreted results. T.T. took the lead in writing the manuscript. All authors contributed significantly to manuscript revisions, both before and after peer review. All authors reviewed and approved the final submitted version of the manuscript.