A novel approach to evaluate spatial resolution of MRI clinical images for optimization and standardization of breast screening protocols.

PURPOSE
Stringent quality assurance is required in MRI breast screening to ensure that different scanners and imaging protocols reach similar diagnostic performance. The authors propose a methodology, based on power spectrum analysis (PSA), to evaluate spatial resolution in clinical images. To demonstrate this approach, the authors have retrospectively compared two MRI sequences commonly employed in breast screening.


METHODS
In a novel approach to PSA, spatial frequency response curves (SFRCs) were extracted from the images. The SFRC characterizes spatial resolution by describing the spatial frequency content of an image over a range of frequencies. Verification of the SFRCs was performed on MRI images of Eurospin agarose gel tubes acquired with different resolution settings. SFRCs of volunteer and patient images obtained with two clinical MRI sequences were then compared. The two sequences differed primarily in k-space coverage pattern, which was either radial (RAD) or linear (LIN).


RESULTS
The computed SFRCs were able to demonstrate the differences between RAD and LIN sequences in relatively small groups of subjects. The curves showed a similar pattern of decay in both volunteer and patient images, indicating that the spatial frequency response is mainly determined by the imaging protocol and not by intersubject anatomical differences. The LIN protocol produced images with increased sharpness; this was reflected in the corresponding SFRCs, which showed a higher content of spatial frequencies associated with image details.


CONCLUSIONS
The SFRC can provide an objective assessment of the presence of spatial details in the image and represent a useful quality assurance tool in the evaluation of different breast screening protocols. With a reference image, a comparative analysis of the SFRCs could ensure that equivalent image quality is achieved across different scanners and sites.


INTRODUCTION
Several large multicenter clinical trials have demonstrated the value of dynamic contrast-enhanced MRI (DCE-MRI) in screening women at high risk of developing breast cancer. 1,2 Imaging protocols require high spatial resolution to detect small foci of disease and are designed to acquire a bilateral 3D volume in no more than 60 s, enabling characterization of contrast agent (CA) uptake curves. 3,4 Within those constraints, the compromise between spatial and temporal resolution may lead to differences in image quality between protocols at different centers. 5 Quality assurance (QA) is therefore required to ensure that different scanners, receiver coils, and imaging protocols reach similar diagnostic performance. 3 Several factors can influence the image quality of fat-suppressed T1-weighted spoiled gradient-echo sequences routinely used for breast screening, and some of these (e.g., k-space sampling pattern, truncation of the acquisition matrix, parallel imaging, and spatial variation of noise) are not tested during standard QA procedures. These factors have an impact on image resolution, which rarely achieves the nominal voxel size and varies with position within the acquired volume.
The analysis of the modulation transfer function (MTF) in test objects is the standard QA procedure to evaluate spatial resolution in different directions. However, in MRI the value of MTF analysis, which is based on linear operators, is known to be limited, [6][7][8][9][10] as several nonlinear processes contribute toward image generation during the combination of complex signals from different phased array elements of the receiver coils. Furthermore, MTF analysis is mainly used as a tool to evaluate the general performance of the scanner in fixed testing conditions, and it is not ideal for the assessment of specific clinical protocols.
There is therefore a need for a procedure that can evaluate spatial resolution directly on the clinical images. Power spectrum analysis (PSA) can be used to assess the spatial frequency content in the Fourier transform of an image, thus characterizing the ability to resolve structures of different size. [11][12][13][14][15] Unlike MTF analysis, PSA takes into account only the spatial frequencies present in the images and can therefore be targeted to a particular clinical protocol.
In this work, we propose for the first time the use of the "spatial frequency response curves" as a QA tool to compare different MRI protocols. This methodology is based on PSA techniques already used in the measurement of spatial resolution of scanning electron microscope images, 16,17 with the novel introduction of a variable signal threshold applied to the power spectrum. The proposed methodology was tested on images of Eurospin agarose gel tubes and was then employed to compare two breast screening DCE-MRI protocols which are expected to produce images of different sharpness.

2.A. Data acquisition
This study was undertaken at 1.5T (Philips Intera, Best, Holland) using the manufacturer's standard breast coil and 3D fat-suppressed spoiled gradient-echo sequences (Philips THRIVE). Read-out direction was anterior/posterior (AP) to minimize cardiac motion artifacts. Details of the two sequences used for comparison are provided in Table I. The sequences differed primarily in k-space coverage pattern (radial, denoted "RAD," and linear, denoted "LIN") in the phase-encoding directions, right/left (RL) and foot/head (FH). LIN samples the k-space with a segmented centric-ordered Cartesian scheme, whilst RAD starts at the center of k-space and progresses toward the higher spatial frequencies in radial trajectories. Both sequences produced image volumes of the same voxel size (1.25 × 1.25 × 2 mm, in AP, RL, and FH directions, respectively) and employed the same fat suppression technique (an adiabatic inversion pulse, with inversion time of 90 ms) and the same parallel imaging reconstruction (SENSE factor of 2 in the RL direction). The clinical protocol acquired each volume in approximately 1 min and consisted of one precontrast and eight postcontrast acquisitions. A single dose of DOTAREM (Guerbet, Villepinte, France), adjusted by body weight (0.2 ml/kg), was administered immediately following acquisition of the precontrast volume.
The following datasets were analyzed in this study: 1. Each examination was independently assessed by two breast radiologists, and only the disease-free breast was considered. For these datasets two different volumes were analyzed: the first precontrast and the most enhanced.
Retrospective analysis of patient examinations was carried out with the approval of the Clinical Audit Committee, and volunteer studies were approved by the Ethics Research Committee.

2.B. Image processing
The imaged volume is reconstructed by the scanner as a series of transaxial images, which are reviewed by the reporting radiologists without any further 3D reconstruction. In this study, the evaluation of spatial resolution was therefore limited to the transaxial plane (AP and RL directions, read-out and phase-encoding, respectively). From each subject a subvolume (150 × 150 × 150 voxels), encompassing the entire imaging volume of one side of the breast coil, was extracted. For patient examinations, volumes from separate visits were visually matched with the aid of reference anatomical structures. Power spectrum analysis was performed on the central transaxial image of the subvolume, employing in-house software developed in  (version 8.2, Exelis Visual Information Solutions).

2.C. Power spectrum analysis and spatial frequency response curves
The power spectrum image represents the distribution of spatial frequencies in the k-space [Fig. 2(A)] and was calculated from the Fourier transform of the original image and normalized using the k-space central value. 16,19 When a threshold is applied to the spectrum, the set of k-space points above the threshold forms a 2D distribution [Fig. 2(B)]. The distribution of points can be considered to be a mass distribution. The moments of inertia along the axes describe how the mass is distributed and can be written in tensor form in terms of the point coordinates x and y, summed across all points. The eigenvalues of this tensor give the semiaxes of the ellipse of inertia which approximates the mass distribution. 20,21 The ellipse was then fitted with an elliptical contour function with axes parallel to the image directions AP and RL and center at the axes origin. 22 Each axis dimension is expressed in spatial frequency units (cycles/mm). 23 The ellipse dimensions are expected to decrease as the threshold rises. The plot of the ellipse axis dimensions as a function of threshold, denoted as spatial frequency response curve (SFRC) in this paper, is proposed as a novel method to characterize the spatial frequency content of an image.
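The procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the in-house software used in the study: the function names are hypothetical, the ellipse semiaxes are approximated from the axis-aligned second moments of the thresholded points (for a uniformly filled ellipse, the mean of x² equals a²/4), and the mapping of array rows and columns to the AP and RL directions is an assumption.

```python
import numpy as np

def normalized_power_spectrum(image):
    """Power spectrum with zero frequency at the array centre, scaled so the centre equals 1."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(spectrum) ** 2
    centre = power[image.shape[0] // 2, image.shape[1] // 2]  # k-space central value
    return power / centre

def ellipse_semi_axes(power, threshold, voxel_mm=(1.0, 1.0)):
    """Approximate semiaxes (cycles/mm) of the k-space points lying above the threshold.

    Assumption: an axis-aligned ellipse centred at the origin, estimated from
    second moments rather than from an explicit contour fit.
    """
    rows, cols = power.shape
    f_row = np.fft.fftshift(np.fft.fftfreq(rows, d=voxel_mm[0]))
    f_col = np.fft.fftshift(np.fft.fftfreq(cols, d=voxel_mm[1]))
    r_idx, c_idx = np.nonzero(power > threshold)
    if r_idx.size == 0:
        return 0.0, 0.0
    y, x = f_row[r_idx], f_col[c_idx]
    return 2.0 * np.sqrt(np.mean(y ** 2)), 2.0 * np.sqrt(np.mean(x ** 2))

def sfrc(image, thresholds, voxel_mm=(1.0, 1.0)):
    """Semiaxes along rows and columns for each threshold: one row of output per threshold."""
    power = normalized_power_spectrum(image)
    return np.array([ellipse_semi_axes(power, t, voxel_mm) for t in thresholds])
```

Calling `sfrc(image, thresholds, voxel_mm=(1.25, 1.25))` on a transaxial subimage would return one pair of axis dimensions per threshold, which can then be plotted against the threshold. The sketch assumes an image with nonzero mean, so that the central k-space value is not zero.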

2.C.1. Verification of the SFRC in test objects
Images of the test object acquired with different resolution settings were compared. A reference dataset [Fig. 3(A)] was obtained using the scanner default settings for the adopted sequence: radial pattern, full sampling (i.e., no zero-filling in the k-space), data matrix of 336 × 336 voxels, and nominal resolution of 1.13 × 1.13 mm. A comparative analysis was performed to demonstrate that the proposed method can account for changes related to the following factors:
a. Relative image intensity: From the reference image, two images of the same size were obtained excluding the portion of the image outside the contours in Fig. 3(A).
b. Truncation: An image of the object was acquired [Fig. 3(B)] with a smaller data matrix (208 × 208 voxels) but reconstructed to match the original matrix (336 × 336 voxels). This image has the same voxel size as the reference image but does not contain the highest spatial frequencies.
c. Zero-filling: An image of the object was acquired [Fig. 3(C)] with the previous data matrix (208 × 208 voxels) but reconstructed with a larger matrix (512 × 512 voxels). This image has the same spatial frequencies as the image in Fig. 3(B) but a higher nominal resolution.
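The effect of truncation and zero-filling can be imitated retrospectively on any square image, which may help when checking an SFRC implementation. The sketch below is an assumption-laden stand-in for the scanner acquisition, not the method used in the study: it keeps only the central block of k-space and zero-fills it to the reconstruction matrix. The function name and the intensity scaling are hypothetical.

```python
import numpy as np

def resample_via_kspace(image, acquired, reconstructed):
    """Keep the central acquired x acquired block of k-space, then zero-fill.

    Assumes a square image with acquired <= image size and acquired <= reconstructed.
    """
    n = image.shape[0]
    kspace = np.fft.fftshift(np.fft.fft2(image))
    lo = n // 2 - acquired // 2
    core = kspace[lo:lo + acquired, lo:lo + acquired]      # truncated acquisition
    padded = np.zeros((reconstructed, reconstructed), dtype=complex)
    start = reconstructed // 2 - acquired // 2
    padded[start:start + acquired, start:start + acquired] = core
    scale = (reconstructed / n) ** 2                       # preserve mean intensity
    return np.abs(np.fft.ifft2(np.fft.ifftshift(padded))) * scale

# Settings mirroring the matrix sizes quoted in the text
# truncated = resample_via_kspace(reference, acquired=208, reconstructed=336)
# zero_filled = resample_via_kspace(reference, acquired=208, reconstructed=512)
```

Both outputs contain the same spatial frequencies; only the reconstruction matrix, and hence the nominal voxel size, differs.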
In addition, SFRCs of test object images acquired with RAD and LIN protocols (Table I) were directly compared to evaluate how the two different k-space sampling patterns affect the spatial frequency distribution.
The analysis of the clinical images was performed on a 150 × 150 voxel subimage from the central transaxial slice of the volume. This subimage contained one entire breast, cut at the most anterior position of the pectoral muscle.

2.C.2. Evaluation of noise in volunteer images
All MRI images contain a wide range of spatial frequencies, but the highest frequencies (finest details) may include image noise. It is therefore desirable to locate the spatial frequency beyond which no information is expected to be separated from noise. This will be referred to as "limiting spatial frequency" and was estimated in volunteer images, subtracting two image volumes acquired consecutively with either the RAD or LIN protocols. The subtracted images are expected to contain mostly noise and possibly some artifacts due to motion. Noise adds in quadrature between two images, and therefore the subtracted images were scaled by multiplying by 1/√2. 24 For all volunteer examinations, determination of the threshold associated with noise on the SFRC from the subtracted image allowed the limiting spatial frequency on the corresponding SFRC to be estimated from the unsubtracted image.
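The subtraction step can be illustrated with a short sketch. The function name is hypothetical, and the comment on noise statistics is a simplifying assumption (independent noise of equal standard deviation in the two acquisitions).

```python
import numpy as np

def noise_image(volume_a, volume_b):
    """Difference of two consecutive acquisitions, scaled by 1/sqrt(2).

    If both volumes carry independent noise of standard deviation sigma, the raw
    difference has standard deviation sigma*sqrt(2); the scaling restores sigma.
    """
    a = np.asarray(volume_a, dtype=float)
    b = np.asarray(volume_b, dtype=float)
    return (a - b) / np.sqrt(2.0)
```

Static anatomy cancels in the difference, so the result should contain mostly noise plus any residual motion between the two acquisitions.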

2.C.3. Spatial frequencies in contrast-enhanced images
As the receiver's gain is fixed during the clinical dynamic acquisition, noise levels are not expected to be affected by the contrast injection. Comparison of the SFRCs from the precontrast and the most enhanced postcontrast patient images allowed assessment of the lower spatial frequencies involved in the contrast enhancement.

2.C.4. Comparison of RAD and LIN protocols in clinical data
SFRCs from both volunteer and patient images were compared, in order to describe the variability of the SFRC within different groups of subjects.

3.A. Verification of the SFRC in test objects
Figures 3(A)-3(C) present different test object spatial frequency response curves. The SFRC plots the fitted ellipse axis dimensions (cycles/mm, AP and RL directions) as a function of threshold. The horizontal dashed gray line represents the calculated Nyquist frequency (cycles/mm), i.e., the highest spatial frequency sampled, which is associated with the nominal voxel size. 23 At very low thresholds, the estimated ellipses extend beyond the k-space dimensions and the software may fit axes greater than the Nyquist frequency; this section of the curve should be ignored.
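As a worked example of the Nyquist limit, the standard sampling relation can be applied to the nominal voxel sizes quoted in this paper (1.25 mm in-plane for the clinical sequences and 1.13 mm for the reference test object image). The numerical values below are computed here for illustration and are not quoted from the figures.

```latex
f_N = \frac{1}{2\,\Delta x}, \qquad
\Delta x = 1.25\ \text{mm} \;\Rightarrow\; f_N = 0.40\ \text{cycles/mm}, \qquad
\Delta x = 1.13\ \text{mm} \;\Rightarrow\; f_N \approx 0.44\ \text{cycles/mm}
```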

3.A.1. Relative image intensity
As the amplitude of the spectrum depends on the maximum image intensity, power spectra require normalization to the k-space central value for the threshold to be consistent between different images. The plots in Fig. 3(A) demonstrate that the normalization eliminates the influence of the relative image intensity, by comparing two subimages containing the same objects (and thus the same spatial frequencies) with different image intensities. Prior to normalization, the amplitude of the curves retains a dependence on relative image intensity (dashed line), whilst normalized curves (solid lines) coincide. This demonstrates that the normalization enables the threshold to be consistent between different images, allowing direct comparison of SFRCs.
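The intensity independence follows from the linearity of the Fourier transform: scaling an image by a factor c scales its power spectrum by c², and dividing by the central value removes that factor. A minimal numerical check, with a hypothetical function name, is:

```python
import numpy as np

def power_spectrum(image, normalize=True):
    """Centred power spectrum, optionally divided by its k-space central value."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    if normalize:
        power = power / power[image.shape[0] // 2, image.shape[1] // 2]
    return power

rng = np.random.default_rng(0)
image = rng.random((32, 32)) + 1.0      # synthetic image, not study data
brighter = 3.0 * image                  # same content, higher intensity

# Unnormalized spectra differ by a factor of 9; normalized spectra coincide.
same_after_norm = np.allclose(power_spectrum(image), power_spectrum(brighter))
```

A fixed threshold therefore selects the same set of k-space points in both images once the spectra are normalized.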

3.A.2. Truncation
The image with truncated acquisition [Fig. 3(B)] shows reduced sharpness due to lower content of high spatial frequencies. This is reflected in the associated SFRC (green), which decays more rapidly in the high frequency range, where the image details are represented. The green curve coincides with the reference curve (red) at lower spatial frequencies, associated with the larger structures of the object. This shows that the SFRC is able to characterize differences in spatial frequency content over a range of frequencies. Consequently, the SFRC represents the relative spatial frequency content of an image as a function of a given threshold. When the threshold is consistent between images, greater amplitude of the SFRC indicates a higher content of spatial frequencies. Figure 4 shows the influence of different k-space sampling patterns (either RAD or LIN) on the SFRCs of the test object images [Fig. 1(A)]. The curves indicate that the LIN protocol produces images with increased content of higher spatial frequencies in the AP (read-out) direction, while in the RL direction (phase-encoding) the SFRC from the RAD protocol has higher amplitude over the whole range.

3.B. Evaluation of noise in volunteer images
As subtracted images contain predominantly noise, the associated SFRC decays rapidly until the threshold rises above the noise floor [Fig. 5(A)]. This limiting threshold (defined as the point where the derivative of the curve is 0) identifies the noise level in the spatial frequency space. The limiting spatial frequency can be estimated by applying this threshold to the power spectrum of the original unsubtracted image. Equivalently, the limiting spatial frequency can be determined by identifying the spatial frequency in the unsubtracted SFRC corresponding to the limiting threshold [gray arrows in Fig. 5(A)]. This was evaluated, considering both image directions, in all the volunteer images, leading to a figure of 0.277 ± 0.007 and 0.293 ± 0.016 cycles/mm (average ± standard deviation) for the RAD and LIN protocols, respectively. These two sets of data are significantly different (two-tailed paired t-test, p = 0.030). Figure 5(B) plots SFRCs from the precontrast and the most enhanced postcontrast images of a patient's examination on the same graph, together with the limits evaluated in Sec. 3.B. The associated power spectra were not normalized with the k-space central value, thus introducing a dependence on image contrast in the curves. As noise is not affected by contrast enhancement, the two curves coincide in the range of spatial frequencies associated with noise (low threshold). The spatial frequency at which the enhanced and nonenhanced curves separate was found to be higher for LIN in patient images, in agreement with the previous evaluation. This frequency is associated with the smallest enhancing structure resolved in the image.
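One possible way to locate the limiting threshold numerically is sketched below. The text does not specify how the derivative was evaluated, so the finite-difference estimate, the tolerance `tol`, and the function names are all assumptions; the arrays in the usage lines are synthetic and do not represent study data.

```python
import numpy as np

def limiting_threshold(thresholds, noise_axis, tol=1e-6):
    """First threshold at which the subtracted-image SFRC stops decaying.

    thresholds must be increasing; returns None if the curve never flattens.
    """
    slope = np.gradient(noise_axis, thresholds)
    flat = np.nonzero(np.abs(slope) <= tol)[0]
    return float(thresholds[flat[0]]) if flat.size else None

def limiting_frequency(thresholds, signal_axis, threshold):
    """Read the unsubtracted SFRC at the limiting threshold by linear interpolation."""
    return float(np.interp(threshold, thresholds, signal_axis))

# Synthetic curves for illustration only
thresholds = np.linspace(0.0, 1.0, 11)
noise_axis = np.maximum(0.4 - 2.0 * thresholds, 0.0)   # subtracted image
signal_axis = 0.4 - 0.3 * thresholds                   # unsubtracted image
t_lim = limiting_threshold(thresholds, noise_axis)
f_lim = limiting_frequency(thresholds, signal_axis, t_lim)
```

With real curves, the tolerance would need to be chosen relative to the sampling of the threshold axis and the smoothness of the curve.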

3.C. Spatial frequencies in contrast-enhanced images
In subsequent comparisons between the LIN and RAD protocols, the lowest value from the measured limiting spatial frequencies (0.270 cycles/mm) was adopted as a conservative reference limit to separate the spatial frequencies associated with noise or small motion from the frequencies associated with actual changes in resolution. images, but also in the phase-encoding direction (RL). These observations are confirmed by a visual inspection of the images (Fig. 1): images acquired with the LIN protocol are sharper but also noisier. This is consistent with a higher content of spatial frequencies in the high frequency range. This trend is present in both volunteer and patient data, suggesting that the contribution of the imaging protocol is a stronger determinant of the pattern of frequency response than the differences in anatomical structures between the subjects.

DISCUSSION
The nominal spatial resolution of DCE-MRI breast sequences can degrade in the chain of processes leading to image reconstruction. A change in resolution in the formed image is visually perceived as a change in sharpness. Although differences in image sharpness can be subjectively perceived, they are difficult to quantify. In this work, differences in sharpness were detected by the spatial frequency response curve. The SFRC objectively characterizes sharpness, compensating for the confounding contribution of different image contrast through the normalization of the power spectrum.
Postek, Vladár et al. described a procedure to evaluate image sharpness based on the spatial Fourier transform. [13][14][15] An analogous methodology was used to measure the limiting spatial resolution of scanning electron microscope images. 16,17 In this case, an arbitrary manual threshold was applied to the power spectrum in order to separate signal from noise. The resulting distribution was fitted with an ellipse, and the axis dimensions provided a measure of resolution. Similar techniques have not been applied to the clinical images produced by an MRI scanner before, despite the fact that these are ideally suited, as the image is formed in k-space. In addition, the decay of the ellipse dimensions as a function of increasing threshold [Fig. 1(B)] has not been explored in power spectrum analysis. The novel introduction of a variable threshold enables the creation of the SFRC, which characterizes the relative spatial frequency content of the image over the entire frequency range.
High spatial frequencies immediately below the Nyquist frequency are predominantly associated with noise. At high spatial frequencies, higher amplitude of the SFRC may therefore indicate increased noise rather than a greater content of image details. It is therefore essential to determine the frequency beyond which the signal cannot be separated from noise (limiting spatial frequency). Our measurement of the limiting spatial frequency is based on the evaluation of noise in subtracted images. Furthermore, comparison of patient images precontrast and postcontrast identified the range of spatial frequency associated with the enhancing structures, whose visualization is the clinical objective of the examination. The limiting spatial frequency is related to the smallest enhancing structure that the imaging protocol is able to resolve and can therefore be employed as a quality assurance parameter. At the other end of the spatial frequency range, frequencies lower than 0.1 cycles/mm correspond to structures larger than 5 mm [resolution = 1/(2 × frequency)], 23 and the slow decay of the curve below this point relates to the high signal values within the central core of the power spectrum. This part of the curve, associated with low spatial frequencies, is therefore less informative of differences in image sharpness. Within the frequency range of 0.10-0.27 cycles/mm (size of the structures between 1.85 and 5.00 mm), increased amplitude of the SFRC denotes a higher content of image details.
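The conversion between spatial frequency and structure size used in this paragraph is simple enough to verify directly. The helper below is a hypothetical convenience function applying the stated relation, resolution = 1/(2 × frequency), to the frequencies quoted in this paper.

```python
def structure_size_mm(frequency_cycles_per_mm):
    """Smallest resolvable structure size (mm) for a given spatial frequency (cycles/mm)."""
    return 1.0 / (2.0 * frequency_cycles_per_mm)

# Frequencies quoted in the text
sizes = {f: round(structure_size_mm(f), 2) for f in (0.10, 0.27, 0.277, 0.293)}
# sizes -> {0.1: 5.0, 0.27: 1.85, 0.277: 1.81, 0.293: 1.71}
```

These values agree with the sizes stated for the 0.10-0.27 cycles/mm range and with the rounded limits of 1.8 and 1.7 mm reported for the RAD and LIN protocols.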
In fat-suppressed breast images, edge sharpness and frequency spectrum can be affected by the failure of fat suppression. Efficacy of fat suppression for RAD and LIN protocols, in terms of the presence of both unsuppressed fat and chemical shift artifacts, was assessed in a previous publication by Ledger et al. 25 The incidence of both artifacts was not found to differ significantly between the two k-space sampling patterns. Fat suppression failure depended principally on the flip angle (FA) and increased for higher flip angles. The sequences adopted in this analysis differ only in k-space coverage pattern, and both have the same FA = 18°. Although they are both likely to be similarly affected by the presence of unsuppressed fat, this was minimal in the datasets included in the analysis. Regarding the chemical shift artifact, in the paper by Ledger et al. this was found to be related to the choice of echo time (TE). 25 Both LIN and RAD sequences have high bandwidth (water-fat shift set to 0.4 pixels), similar TEs, with LIN being slightly closer to out-of-phase (TE = 1.97 ms); therefore, the same level of chemical shift artifacts is expected.
In the fat-suppressed spoiled gradient-echo sequences employed in this work, fat suppression pulses precede the acquisition segments; each segment contains a series of spoiled gradient echoes (100 for LIN and 60 for RAD, respectively). Due to the higher number of echoes, the LIN sequence produces a k-space richer in high spatial frequencies, particularly in the read-out direction (AP). This is reflected in the SFRCs from the test object (Fig. 4), the volunteer [Fig. 6(A)], and the patient [Fig. 6(B)] images, which exhibit higher amplitude in the high frequency region. The echoes associated with the k-space center (low spatial frequencies) are acquired at the beginning of each segment, whilst the periphery of the k-space (high spatial frequencies) is filled at the end of the series of echoes. The T2* of normal breast parenchyma was measured by Schmidt et al. to be 25 ± 8 ms. 26 The echoes within each segment are spoiled gradient echoes, and the T2* decay occurring between excitation and data acquisition, due to the short TE, is negligible in the normal breast. 25,26 The protocol with linear k-space sampling (LIN) is therefore expected to have improved resolution in the transaxial plane along the read-out direction (AP). This was depicted by the SFRCs from both test object and clinical breast images. However, in clinical examinations, subject motion introduced additional differences between the two protocols, as this is handled dissimilarly by the two different k-space sampling patterns. The RAD protocol performs a radial sampling pattern, oversampling the k-space center across the entire acquisition and omitting the periphery (high spatial frequencies), and is therefore less sensitive to motion and noise. However, the oversampling has an averaging effect which results in blurrier images, with consequent loss of spatial details (Fig. 1).
Conversely, the LIN protocol samples the k-space linearly and includes the higher spatial frequencies, producing images with increased sharpness and noise. The SFRCs from clinical images (Fig. 6) were able to describe these differences, indicating a higher content of spatial frequencies for LIN both at frequencies >0.27 cycles/mm (noise region) and in the range of 0.10-0.27 cycles/mm (image details). Therefore, subject motion introduced differences between the two protocols which were not detectable with test object studies. This emphasizes the relevance of a method that can evaluate the performance of an imaging protocol directly on the images reviewed by the clinical user. Differences in k-space sampling also affected the limiting spatial frequency: a statistically significant increase was observed for the LIN protocol. Nevertheless, both protocols had the capability to resolve structures <2 mm. The evaluated average limits (0.277 and 0.293 cycles/mm) correspond to structures with sizes of 1.8 and 1.7 mm, for RAD and LIN, respectively.
The different k-space sampling pattern was therefore the main determinant of the content of high spatial frequencies in the images; this resulted in a difference in sharpness which was perceived by the clinical users and objectively detected by the SFRC. Differences between RAD and LIN protocols could be described by the SFRC in a retrospective analysis of small groups of subjects (Fig. 6). As the frequency spectrum is obtained from the final magnitude image, in principle any subportion of the image can be analyzed employing the SFRC methodology, and this would provide information about the specific spatial frequency content of the subimage. In this analysis, we have characterized regions containing one entire breast; this also improved the ability to match images from different examinations. The morphology of the breast has an impact on the shape of the SFRC, as demonstrated by the intragroup variability of the curves from clinical data [Figs. 6(A) and 6(B)]. However, the pattern of decay of the SFRCs from each protocol was consistent across patients and volunteers. This indicates that the proposed methodology provides robust characterization of the frequency response despite (a) the intersubject variability due to anatomical difference in breast size and structure and (b) the intrasubject variability due to different position. Image filtering applied by the manufacturer can create a noise halo around the edge of the breast, which is more evident in the LIN images (Fig. 1). This is due to the fact that the LIN protocol produces noisier images and was depicted by the SFRCs.
The analysis of the SFRCs helped the decision to adopt the LIN protocol in the clinical routine, as an increased content of high spatial frequencies allowed for smaller enhancing structures to be visible, while maintaining an acceptable level of noise and artifacts. We would like to acknowledge that spatial resolution can be considerably affected by the filtering and denoising processes employed by the scanner to form the final image. Analysis of the raw data would be likely to have produced different SFRCs (particularly in the high spatial frequency region), more representative of the original frequency spectrum. However, the raw data are difficult to obtain in standard settings and might not be fully representative of the final image quality. In this regard, this methodology is suitable for assessing the spatial frequency content of the final image as seen by the clinical user, and it is able to quantitatively describe differences in image quality which can be perceived visually.
A primary objective of the quality assurance of MRI protocols for breast screening is the assessment of the ability to resolve image details. The SFRC is able to describe changes in spatial frequency content associated with different imaging protocols. This very general method operates in the Fourier space and can be extended to any digital image and employed to analyze image subportions large enough to contain the structures of interest. The SFRC therefore represents an advanced QA tool: whilst standard QA methods provide an evaluation of the general performance of the scanner in fixed testing conditions, the SFRC can be tailored to a specific clinical imaging protocol and to the structures actually present in the images.
This work introduces the use of power spectrum analysis on MRI images; however, it is subject to some limitations. First, our analysis is restricted to images from one scanner from a single MRI manufacturer. As spectra normalization enables the threshold to be consistent between different images, it is possible to calculate the SFRC from images acquired with different scanners. However, this was not demonstrated in this paper. Second, our analysis compares images from the same sequence after alteration of a specific parameter (k-space sampling pattern), which produced visible differences in the images. As more complex alterations of several sequence parameters can lead to scanning protocols compliant with the breast screening guidelines, a more extensive analysis is required to fully explore the use of SFRC in QA for breast MRI. Furthermore, k-space sampling implementations vary between different MRI sequences and scanners, and therefore our conclusions about the differences between radial and linear coverages are limited to the specific protocol described. Third, within this work we have only considered 2D spectra. Although the extension of the Fourier analysis to the 3D case is possible and will be considered in future work, it was not explored within this paper. However, this would not alter the general concepts that we have introduced with this work.
Even though we acknowledge the relevance of the SFRC in the analysis of clinical images, we also recognize the potential of the SFRC as a QA tool in conjunction with a realistic phantom. The literature provides some examples of breast phantoms which are anthropomorphic to a lesser or greater extent; [27][28][29] together with realistic relaxation properties the ideal phantom should contain a range of bigger and smaller morphological structures to mimic the tissue of interest. A reference image of such phantom would produce a reference SFRC, which could be used in a comparative analysis of images acquired with different protocols or scanners. This would be particularly useful across different MRI centers: differences in images of the same phantom produced with different guideline-compliant protocols could be described by the SFRC. Departure of the SFRCs from a reference curve could be minimized to ensure that protocols perform similarly over the spatial frequency range of interest.
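As an illustration of how a departure from a reference curve might be quantified, the sketch below computes a root-mean-square difference between two SFRCs sampled at the same thresholds, restricted to the part of the reference curve lying in a frequency range of interest. The metric, the function name, and the default limits (taken from the 0.10-0.27 cycles/mm range discussed above) are assumptions made for illustration; no such metric was evaluated in this work.

```python
import numpy as np

def sfrc_departure(reference_axis, test_axis, f_low=0.10, f_high=0.27):
    """RMS difference (cycles/mm) between two SFRCs sampled at identical thresholds.

    Only thresholds where the reference curve lies within [f_low, f_high] are used.
    """
    reference_axis = np.asarray(reference_axis, dtype=float)
    test_axis = np.asarray(test_axis, dtype=float)
    in_range = (reference_axis >= f_low) & (reference_axis <= f_high)
    if not in_range.any():
        raise ValueError("reference curve never falls inside the frequency range")
    diff = test_axis[in_range] - reference_axis[in_range]
    return float(np.sqrt(np.mean(diff ** 2)))
```

A value close to zero would indicate that two protocols perform similarly over the chosen frequency range when imaging the same phantom.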

CONCLUSIONS
In conclusion, we have presented a novel methodology for the assessment of spatial resolution in clinical images, introducing the use of the spatial frequency response curve for the evaluation of the spatial frequency content of the image. The proposed methodology produced consistent results in a retrospective analysis of MR clinical images of different groups of subjects, quantifying differences in sharpness that could be visually perceived by the clinical users, and characterizing the noise level of the images. The SFRC qualifies as a useful QA tool in the evaluation of different breast screening protocols and provides an objective assessment of the presence of spatial details in the image. With a reference image, a comparative analysis of the SFRCs could ensure that equivalent image quality is achieved across different scanners and sites.