A framework for optimization of diffusion-weighted MRI protocols for large field-of-view abdominal-pelvic imaging in multicenter studies.

PURPOSE
To develop methods for optimization of diffusion-weighted MRI (DW-MRI) in the abdomen and pelvis on 1.5 T MR scanners from three manufacturers and assess repeatability of apparent diffusion coefficient (ADC) estimates in a temperature-controlled phantom and abdominal and pelvic organs in healthy volunteers.


METHODS
Geometric distortion, ghosting, fat suppression, and repeatability and homogeneity of ADC estimates were assessed using phantoms and volunteers. Healthy volunteers (ten per scanner) were each scanned twice on the same scanner. One volunteer traveled to all three institutions in order to provide images for qualitative comparison. The common volunteer was excluded from quantitative analysis of the data from scanners 2 and 3 in order to ensure statistical independence, giving n = 10 on scanner 1 and n = 9 on scanners 2 and 3 for quantitative analysis. Repeatability and interscanner variation of ADC estimates in kidneys, liver, spleen, and uterus were assessed using within-patient coefficient of variation (wCV) and Kruskal-Wallis tests, respectively.


RESULTS
The coefficient of variation of ADC estimates in the temperature-controlled phantom was 1%-4% for all scanners. Images of healthy volunteers from all scanners showed homogeneous fat suppression and no marked ghosting or geometric distortion. The wCV of ADC estimates was 2%-4% for kidneys, 3%-7% for liver, 6%-9% for spleen, and 7%-10% for uterus. ADC estimates in kidneys, spleen, and uterus showed no significant difference between scanners but a significant difference was observed in liver (p < 0.05).


CONCLUSIONS
DW-MRI protocols can be optimized using simple phantom measurements to produce good quality images in the abdomen and pelvis at 1.5 T with repeatable quantitative measurements in a multicenter study.


INTRODUCTION
Diffusion-weighted magnetic resonance imaging (DW-MRI) is well established as a qualitative and quantitative imaging technique in oncologic applications in the body. The high contrast between solid tumors and normal tissues, arising from the restricted diffusion of water molecules in many solid tumors compared with many normal tissues, aids detection of disease and monitoring of response to treatment. Moreover, the most common quantitative parameter, the apparent diffusion coefficient (ADC), has been widely investigated to monitor treatment response, for example, response to chemotherapy in hepatic metastases, 1,2 breast cancer, 3,4 advanced ovarian cancer, 5,6 and nonsmall-cell lung cancer, 7 and to chemoradiation therapy in cervical cancer 8 and squamous cell carcinoma of the head and neck. 9,10 Pretreatment estimates of ADC have also been shown to be predictive of response to chemotherapy in pancreatic cancer 11 and to chemoradiation therapy in nonsmall-cell lung cancer. 12 The ADC is estimated by fitting a monoexponential function to the signal (S) measured using DW-MRI at two or more diffusion weightings (b-values), as described by
$$S(b) = S_0 \exp(-b \cdot \mathrm{ADC}),$$
where $S_0$ is the signal in the absence of diffusion weighting.
In the estimation of ADC, the variation in acquisition parameters across scanners from different manufacturers and between models and software versions from the same manufacturer potentially affects quantitation; the effects of some of these variations have been investigated in healthy volunteers. ADC estimates have been shown to depend on the choice of b-values in kidneys 13 and in liver, spleen, and pancreas. 14 The latter study also showed no significant difference in ADC estimates between orthogonal diffusion encoding and three-scan trace weighted encoding. 14
Consensus recommendations for DW-MRI as a cancer biomarker state that protocols should be optimized to maximize signal-to-noise ratio (SNR), minimize artifacts from ghosting and distortion, optimize fat suppression, ensure ADC values can be measured accurately and reproducibly, and, ideally, use parameters which can be replicated on other platforms; 15 this is essential in order to achieve comparable quantitative data. In the first instance, protocol development using specialized phantoms to interrogate the effects of distortion and fat suppression circumvents the ethical and timing constraints of volunteer studies and may be particularly valuable in multicenter projects.
The aims of this study were, therefore, to develop methods for optimization of DW-MRI protocols on multiple platforms using specialist phantoms [polydimethylsiloxane (PDMS), corn oil, and temperature-controlled sucrose] and volunteers, implement the protocols on 1.5 T scanners from different manufacturers, and assess repeatability of the measurements in phantoms and normal tissues in healthy volunteers. A formal comparison between scanners was not a focus of this study.

METHODS
We developed the following phantoms and methods for assessment of geometric distortion and ghosting, fat suppression, and repeatability of ADC estimates.

2.A. Geometric distortion and ghosting
A large cylindrical phantom filled with PDMS was used to assess geometric distortion and ghosting. 16 PDMS was used as it has a very low diffusion coefficient at room temperature and therefore allows comparison of geometric properties of images at different diffusion weightings without loss of signal due to diffusion. Subtraction images were calculated by subtracting the b = 0 s mm −2 image from the corresponding heavily diffusion-weighted image (b = 900-1000 s mm −2) in order to assess geometrical differences between the images, assuming no signal decay with diffusion-weighting in the PDMS phantom (MATLAB 2014a, MathWorks, Inc., Natick, MA). The impact of the choice of diffusion gradient scheme, parallel imaging, and receiver bandwidth on the degree of geometric distortion was assessed. A maximum b-value of 1000 s mm −2 was used in assessments of the diffusion gradient scheme and parallel imaging in order to clearly demonstrate eddy current effects. Assessment of the effects of bandwidth on geometric distortion and ghosting was carried out using a maximum b-value of 900 s mm −2 in order to reflect the final protocol, as the optimal bandwidth may be protocol-dependent. A semiquantitative distortion index (DI) was calculated by taking the ratio of the mean absolute pixel value in the subtraction images to the mean pixel value in the background (BG) noise, estimated using a region of interest (ROI) drawn in the corresponding b = 900 s mm −2 image in a region away from the phantom and ghosts, as described in previous studies. 17 The strength of the ghosting was quantified using a ghost-to-signal ratio, estimated as the ratio of the mean pixel value in a ROI in the ghost region (G) to the mean pixel value in a ROI in the phantom (S) in the b = 0 s mm −2 images.
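The two image-quality metrics defined above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names, the rectangular ROI convention, the choice to average the subtraction image over the whole image, and the toy pixel values are all assumptions made for illustration.

```python
from statistics import mean


def roi_values(image, roi):
    """Pixel values inside a rectangular ROI (row0, row1, col0, col1), end-exclusive."""
    row0, row1, col0, col1 = roi
    return [image[r][c] for r in range(row0, row1) for c in range(col0, col1)]


def distortion_index(high_b, b0, background_roi):
    """Mean absolute value of (high-b image minus b = 0 image), taken here over
    the whole image (an assumption), divided by the mean background value
    measured in the high-b image."""
    abs_diff = [abs(h - z)
                for high_row, b0_row in zip(high_b, b0)
                for h, z in zip(high_row, b0_row)]
    return mean(abs_diff) / mean(roi_values(high_b, background_roi))


def ghost_to_signal(b0, ghost_roi, signal_roi):
    """Mean pixel value in the ghost ROI (G) divided by the mean in the phantom ROI (S)."""
    return mean(roi_values(b0, ghost_roi)) / mean(roi_values(b0, signal_roi))


# Hypothetical 4 x 4 toy images in arbitrary units, not measured data.
b0_image = [[2, 2, 2, 2],
            [2, 100, 100, 10],
            [2, 100, 100, 10],
            [2, 2, 2, 2]]
high_b_image = [[2, 2, 2, 2],
                [2, 100, 90, 10],
                [2, 100, 90, 10],
                [2, 2, 2, 2]]

di = distortion_index(high_b_image, b0_image, background_roi=(0, 1, 0, 4))
gs = ghost_to_signal(b0_image, ghost_roi=(1, 3, 3, 4), signal_roi=(1, 3, 1, 3))
print(f"DI = {di:.3f}, G/S = {100 * gs:.1f}%")
```

Because PDMS is assumed to show no diffusion-related signal decay, any nonzero signal in the subtraction image is attributed to geometric differences between the two images, which is why DI falls to zero for identical inputs.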

2.B. Fat suppression
Fat suppression was assessed using images of healthy volunteers and using a corn oil phantom described previously. 18 The phantom was placed on the couch at 45° to the z-axis in order to present an elliptical cross section, which is of similar shape and size to axial abdominal images obtained in human subjects. Images were acquired axially and reformatted sagittally in order to assess fat suppression along the length of the imaging volume.

2.C. Homogeneity of ADC estimates
Large cylindrical phantoms filled with water (manufacturers' water bottle phantoms) were used to assess variation in ADC estimates across a large field-of-view (FOV), assuming the isocenter of the magnet provides the "true" ADC estimate. Images were acquired axially. Homogeneity of ADC estimates in the z-direction was assessed using ROIs drawn in the center of the image on all slices. Homogeneity of ADC estimates in the right-left and anterior-posterior directions was assessed using line profiles drawn through the center of the central slice. ADC estimates were calculated for each pixel using a least-squares fit to all b-values (trust-region-reflective algorithm, fit to monoexponential curve, MATLAB 2014a, MathWorks, Inc., Natick, MA).
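A pixelwise monoexponential fit of this kind can be sketched in Python as follows. Note that this sketch swaps in a log-linear least-squares fit (linear regression of ln S on b) rather than the nonlinear trust-region-reflective fit named above; the two agree for noise-free data but weight noise differently. The function name and the example values are hypothetical.

```python
import math


def fit_adc(b_values, signals):
    """Fit S(b) = S0 * exp(-b * ADC) by linear regression of ln(S) on b.

    Returns (adc, s0). With b in s/mm^2, ADC is in mm^2/s.
    This is a log-linear substitute for a nonlinear least-squares fit.
    """
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two matched (b, S) pairs")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take logarithms")

    log_signals = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_log = sum(log_signals) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    if sxx == 0:
        raise ValueError("b-values must not all be equal")
    sxy = sum((b - mean_b) * (y - mean_log)
              for b, y in zip(b_values, log_signals))

    slope = sxy / sxx                       # slope of ln(S) against b is -ADC
    s0 = math.exp(mean_log - slope * mean_b)
    return -slope, s0


# Hypothetical noise-free pixel: ADC = 1.5e-3 mm^2/s, S0 = 1000 (arbitrary units).
example_b = [100, 500, 900]
example_s = [1000 * math.exp(-b * 1.5e-3) for b in example_b]
adc, s0 = fit_adc(example_b, example_s)
print(f"ADC = {adc:.2e} mm^2/s, S0 = {s0:.1f}")
```

Applying the function to every pixel of a stack of diffusion-weighted images yields an ADC map from which ROI statistics and line profiles can be taken.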

2.D. Repeatability of ADC estimates in an ice-water phantom
An ice-water phantom (Fig. 1) was used to compare ADC estimates between scanners and to monitor stability of the ADC estimates from each scanner over the period of the volunteer study. The design of the phantom was based on previous studies using ice-water phantoms in DW-MRI (Ref. 19) and has been described in detail elsewhere. 20 The phantom consisted of a Perspex cylinder containing five sucrose solutions in polypropylene tubes (0%-20% sucrose, Sigma-Aldrich) surrounded by a mixture of ice and water. The phantom was allowed to stabilize for 45 min to ensure that samples were at 0 °C before scanning. The protocols used for scanning the ice-water phantom were as used for the healthy volunteers (Table I) using a smaller FOV (320 mm read FOV) with correspondingly smaller pixels (2.5 × 2.5 mm acquired pixel size) and 5 mm slice thickness. No fat suppression was applied in the images of the ice-water phantom, unless a water-selective excitation was inherent to the diffusion-weighted echo-planar imaging (EPI) sequence. ADC estimates were calculated using in-house software (Levenberg-Marquardt algorithm, least squares fit to monoexponential curve using all b-values, ADEPT, Institute of Cancer Research, London, UK). Phantoms were scanned on multiple occasions during the course of the volunteer studies (scanner 1: n = 14, over a period of 22 months; scanner 2: n = 6, over 17 months; scanner 3: n = 8, over 20 months). After inspection for systematic variation in ADC estimates over time, the coefficient of variation (CV = standard deviation/mean) was used to assess repeatability of ADC estimates for each tube on each scanner.
TABLE I. DW-MRI protocols used to scan healthy volunteers on three scanners. Parameters in bold were standardized across the three scanners.
a Diffusion gradient separation (∆) and duration (δ), or related parameters, are quoted using the terminology used by the three manufacturers.
b Receive bandwidth is quoted using the terminology and units used by the three manufacturers.
c Generalized autocalibrating partially parallel acquisition (GRAPPA); autocalibration signal (ACS); sensitivity encoding (SENSE); array spatial sensitivity encoding technique (ASSET).
d Bipolar and double spin-echo (DSE) are implementations of the twice-refocused spin-echo encoding scheme.
e Three-scan trace and gradient overplus use three mutually orthogonal diffusion gradient directions, which are not aligned with the cardinal directions of the scanner; ALL uses three gradient directions aligned with axes of magnet. In all cases, images from three diffusion gradient directions were combined to produce trace images.
Phantoms were scanned at three institutions using scanners from three manufacturers (a different manufacturer at each institution, Table I). The PDMS phantom and corn oil phantom were transported to each institution and the same phantom was therefore imaged on each scanner. For the assessment of homogeneity of ADC estimates, the manufacturers' water bottle phantoms were used; hence, a different phantom was used at each institution. For the assessment of repeatability of ADC estimates, a set of ice-water phantoms were manufactured under identical conditions and tested at the lead site before being issued to each site for repeated measurements.
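The coefficient of variation defined in Sec. 2.D (CV = standard deviation/mean) can be computed for one tube on one scanner as in the short Python sketch below. The use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify it, and the ADC values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(adc_estimates):
    """CV = standard deviation / mean of repeated ADC estimates.

    Uses the sample standard deviation (n - 1 denominator); this choice is an
    assumption, not something stated in the text.
    """
    if len(adc_estimates) < 2:
        raise ValueError("need at least two repeated measurements")
    return stdev(adc_estimates) / mean(adc_estimates)


# Hypothetical repeated ADC estimates for one tube (units of 1e-3 mm^2/s).
repeated_adc = [1.10, 1.12, 1.08, 1.11, 1.09]
cv = coefficient_of_variation(repeated_adc)
print(f"CV = {100 * cv:.2f}%")
```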

2.E. Healthy volunteers
Healthy volunteers were scanned twice on the same scanner, with Institutional Review Board approval and with their written informed consent, on one of the three 1.5 T MR scanners at different institutions using the protocols described in Table I (n = 10 on each scanner). DW images were acquired using three contiguous stations covering the abdomen and pelvis. The couch was moved between stations in order to position the center of each station at the magnet isocenter. The bandwidths, parallel imaging, fat suppression methods, diffusion gradient schemes, and number of slices used in the protocols were determined from the results of the phantom experiments. One volunteer traveled to all three institutions in order to provide images for qualitative comparison. As logistical constraints prevented all ten volunteers from traveling to all three institutions, the remaining nine volunteers were recruited as a separate cohort at each site, with identical inclusion and exclusion criteria at each institution. The common volunteer was excluded from quantitative analysis of the data from scanners 2 and 3 in order to ensure statistical independence, giving n = 10 on scanner 1 and n = 9 on scanners 2 and 3 for quantitative analysis. As this study formed part of the work-up for a multicenter trial in ovarian cancer, all volunteers were female (scanner 1: median age 30 yr, range 26-50 yr; scanner 2: median age 35 yr, range 23-58 yr, for the nine volunteers included in quantitative analysis; scanner 3: median age 31 yr, range 26-50 yr, for the nine volunteers included in quantitative analysis). The median interval between scans was 5 days (range 1-8 days). Volunteers fasted for 4 h before each scan. Premenopausal volunteers were scanned between days 6 and 14 of the menstrual cycle. All DW-MRI scans were carried out in free-breathing as respiratory triggering or navigators were not available on all platforms.
All analyses were carried out at the lead site by a single observer. ROIs were drawn by region growing (kidneys and spleen) or freehand (liver) or by placement of small circular ROIs (uterus) on computed DW images using in-house software. 21 Computed DW-MRI is a postprocessing step that enables the user to generate DW images at arbitrary b-values. The user may, therefore, generate a DW image at a b-value that was not acquired in the imaging sequence in order to provide improved contrast-to-noise for visualization of lesions or tissues. 21 Diffusion weightings for computed DW images were chosen for each organ to give optimal contrast between the organ and surrounding tissues (b = 500 s mm −2 for kidneys; b = 800 s mm −2 for liver; b = 1000 s mm −2 for spleen; b = 1000 s mm −2 for uterus). For each organ, ROIs were drawn on three contiguous slices and were combined to give a volume of interest (VOI) that contained a large number of pixels. ROIs for the kidneys were drawn around the whole area of the left and right kidneys at the level of the renal hilum (median 4014 pixels in VOI, range 3029-5505). ROIs in the liver were drawn around the whole area of the right lobe of the liver, avoiding motion artifacts near the dome of the liver (median 8958 pixels, range 5804-13 740). ROIs for the spleen were drawn on the slices where the largest area of spleen was visible, avoiding partial volume effects with the kidneys (median 3635 pixels, range 1508-7669). ROIs in the uterus were drawn by placing three circles (32 pixels/circle) in the myometrium on each slice (288 pixels in VOI). Figure 2 shows examples of ROIs in kidneys, liver, spleen, and uterus in images of one volunteer on scanner 1. ADC estimates for each pixel in the ROIs were calculated using in-house software (Levenberg-Marquardt algorithm, least squares fit to monoexponential curve, ADEPT, Institute of Cancer Research, London, UK) using b-values 100, 500, and 900 s mm −2 (b = 0 s mm −2 was excluded from the analysis in order to reduce the influence of perfusion). 22-24 In order to reduce the sensitivity to outlier values, the median ADC estimate from all pixels in the VOI was used for analysis of repeatability. The repeatability of ADC estimates for each organ and each scanner was assessed using the method of Bland and Altman. 25 Bland-Altman plots of untransformed data showed a relationship between the differences in the repeated measurements and their means that was improved by using the natural logarithm of the data. 25 The within-patient coefficient of variation (wCV) of the log-transformed data was used to describe the repeatability of the ADC estimates (calculated from Σd², the sum of squared differences between pairs of measurements, and N, the number of volunteers on each scanner). 26 The difference between ADC estimates from three scanners was investigated using a Kruskal-Wallis test for each organ, taking the mean of two visits from each volunteer (MATLAB 2014a, MathWorks, Inc., Natick, MA).
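The wCV expression itself is not reproduced in the text above. The Python sketch below therefore assumes one common formulation that is consistent with the quantities named there (Σd² and N) and with the Bland-Altman treatment of log-transformed data: the within-subject standard deviation on the log scale is taken as sqrt(Σd²/2N), and its antilog minus one is reported as the wCV. The function name and the paired values are hypothetical.

```python
import math


def within_subject_cv(first_visit, second_visit):
    """wCV from paired repeat measurements on the natural-log scale.

    Assumed formulation (not confirmed by the text):
        s_w = sqrt(sum(d^2) / (2 N)),  wCV = exp(s_w) - 1,
    where d is the difference in ln(ADC) between the two visits of one
    volunteer and N is the number of volunteers.
    """
    if len(first_visit) != len(second_visit) or not first_visit:
        raise ValueError("need matched, non-empty lists of paired measurements")
    n = len(first_visit)
    sum_d2 = sum((math.log(a) - math.log(b)) ** 2
                 for a, b in zip(first_visit, second_visit))
    s_w = math.sqrt(sum_d2 / (2 * n))
    return math.exp(s_w) - 1.0


# Hypothetical median ADC estimates (1e-3 mm^2/s) for four volunteers, two visits each.
visit_1 = [1.95, 2.05, 2.10, 1.88]
visit_2 = [2.00, 2.01, 2.02, 1.93]
wcv = within_subject_cv(visit_1, visit_2)
print(f"wCV = {100 * wcv:.1f}%")
```

Working on the log scale makes the measure insensitive to the proportional relationship between the differences and the means that was seen in the untransformed Bland-Altman plots.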

RESULTS
Phantoms were imaged on all three scanners. Assessment of distortion, ghosting, fat suppression, and homogeneity is described using images from one selected scanner in each case to illustrate each result.

3.A. Distortion
Distortion, assessed using the PDMS phantom, is greater in the b = 1000 s mm −2 images than in the b = 0 s mm −2 images, particularly when acquired using the monopolar sequence compared with the DSE sequence (Fig. 3). This is likely due to eddy current effects. Regions where distortion has occurred between the b = 0 and b = 1000 s mm −2 images appear white or black in the subtraction images. Distortion is more marked in the phase-encode (anterior-posterior) direction, indicating that eddy current effects are more marked in this direction. A parallel imaging artifact may also be seen. The signal-to-background ratio, estimated from the mean signal in the phantom (S) and BG regions in the b = 0 s mm −2 images, was higher in the monopolar sequence compared with the DSE sequence (730 and 650, respectively).
When investigating the effects of parallel imaging, there was less distortion evident in the subtraction images using a reduced acquisition (parallel imaging reduction factor R = 2) than in images acquired without the use of parallel imaging (Fig. 4). Again, distortion was greater in the anterior-posterior direction, indicating that eddy current effects are more marked in this direction. A parallel imaging artifact may also be seen in the images with reduced acquisition.
In investigating the effect of bandwidth, images acquired using monopolar diffusion gradients, without the use of parallel imaging, were used to clearly illustrate the distortion, although our results show that use of bipolar diffusion gradients and parallel imaging is advantageous in reduction of geometric distortion (Figs. 3 and 4). Distortion visualized at the anterior and posterior edges of the phantom was more marked at lower bandwidths (Fig. 5). Subtraction images confirmed that distortion was the greatest in the anterior-posterior direction, indicating more marked eddy current effects in this direction. Higher receive bandwidths (lower interecho spacing) resulted in less signal in the subtraction images, indicating a reduction in geometric distortion.
[Figure caption fragment] … were acquired by setting the reduction factor to 1 in order to use the same reconstruction method as the reduced acquisition. All other measurement parameters were consistent between the two acquisitions. Scanner 2; ss-EPI; axial slices; 5 mm slice thickness; FOV 320 × 280 mm; acquired matrix 128; reconstructed matrix 256; PE direction = AP; TE = 84 ms; TR = 8000 ms; receiver bandwidth 111 kHz; double spin-echo; trace images; 1 NSA. Subtraction images show the b = 0 s mm −2 image subtracted from the b = 1000 s mm −2 image.

3.B. Ghosting
Visual inspection shows that the ghosts are stronger at higher bandwidths (Fig. 6). The intensity of the ghost as a percentage of the signal (G/S) was (a) 1.0%, (b) 1.4%, (c) 1.7%, (d) 2.4%, (e) 2.3%, (f) 3.7%, and (g) 8.2%; these results are also shown graphically in Fig. 7. Figure 7 shows DI and ghost-to-signal ratio plotted against receiver bandwidth, measured from the images shown in Figs. 5 and 6. There is a reduction in DI and increase in ghost-to-signal ratio with increasing bandwidth, as shown qualitatively in Figs. 5 and 6. The small increase in DI at 3005 Hz/pixel may be attributable to the large increase in noise and ghosting at the highest bandwidth. Bandwidths in the range 1500-2000 Hz/pixel represent an appropriate tradeoff between geometric distortion and ghosting in this case.

3.D. Homogeneity of ADC estimates
ADC estimates from square ROIs (81 pixels) drawn at the center of each slice of a stack of axial slices through a water bottle phantom showed no noticeable deviation in median ADC estimates between the central slice and end slices (<5%) in a sequence with 26 slices [ Fig. 9(a)]. The effects of increasing the number of slices are shown in Fig. 9(b), which demonstrates a reduction in ADC estimates (∼10%) in the end slices compared with the central slice. Except for edge-effects at the water-plastic-air interfaces, ADC estimates from line profiles (width 9 pixels) drawn through the center of the central slice showed no noticeable deviation in median ADC estimates across the FOV in the right-left and anterior-posterior directions (<5%) for either sequence [Figs. 9(c)-9(f)].

3.E. Repeatability of ADC estimates in temperature-controlled phantom
The repeatability of the ADC estimates from five samples in an ice-water phantom was good, with CVs below 4% for all tubes on all three scanners (Table II). There was good agreement between ADC estimates from three scanners, with all scanners showing <5% deviation from the mean ADC estimate across all scanners for all tubes. None of the scanners showed systematic variations over the course of the study.
Figure 10 shows axial images (b = 100 s mm −2 and b = 900 s mm −2 ) and ADC maps from the same healthy volunteer from three scanners. Images from all three scanners showed good fat suppression, no marked ghosting artifacts, no gross geometric distortion, and no marked parallel imaging artifacts, although the SNR was noticeably lower in the images from scanner 3.

3.G. Repeatability of ADC estimates in healthy volunteers
One volunteer from scanner 2 was excluded from analysis of the uterus due to susceptibility artifacts arising from bowel gas but was included in analysis of other organs. Bland-Altman plots for ADCs measured in kidneys, liver, spleen, and uterus are shown in Fig. 11. The mean difference between pairs of ADC estimates was not significantly different from zero for any of the organs on any of the scanners (single-sample t-test on differences in natural logarithm of ADC, using a p-value <0.05 to indicate significance). The repeatability of ADC estimates was good, with wCV 2%-4% for kidneys, 3%-7% for liver, 6%-9% for spleen, and 7%-10% for uterus (Table III). Kruskal-Wallis tests showed no significant difference in ADC estimates in kidneys, spleen, or uterus between three scanners. There was, however, a significant difference between ADC estimates in the liver from the three scanners.
FIG. 7. Distortion index and ghost-to-signal ratio measured from axial images of a PDMS phantom acquired using receive bandwidths from 751 to 3005 Hz/pixel. All other measurement parameters were consistent between the acquisitions. Scanner 1; ss-EPI; axial slices; 6 mm slice thickness; FOV 380 × 332 mm; acquired matrix 128; reconstructed matrix 256; PE direction = AP; TE = 178 ms; TR = 8000 ms; no parallel imaging; monopolar diffusion gradients; trace images; maximum b-value = 900 s mm −2 ; 1 NSA. The ROI used to estimate background is shown in Fig. 5(d). ROIs used to estimate intensities in ghosts (G) and signal (S) are shown in Fig. 6(h).
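The Kruskal-Wallis comparison of ADC estimates between the three scanners (Sec. 3.G) can be sketched in plain Python as below. This is an illustrative implementation, not the MATLAB routine used in the study: it computes the tie-corrected H statistic and, for exactly three groups, uses the fact that the chi-square survival function with two degrees of freedom is exp(−H/2). The chi-square approximation is rough for small cohorts, and the input values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks with ties given their average rank; also returns tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1        # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    return h / correction


def p_value_three_groups(h):
    """Chi-square approximation for three groups (2 degrees of freedom)."""
    return math.exp(-h / 2.0)


# Hypothetical per-volunteer mean ADCs (mean of two visits) on three scanners.
scanner_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_wallis_h(scanner_groups)
p_val = p_value_three_groups(h_stat)
print(f"H = {h_stat:.2f}, p = {p_val:.3f}")
```

Each group would hold one value per volunteer, the mean of that volunteer's two visits, so that the observations entering the test are independent.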

DISCUSSION
These data demonstrate that measurements using simple phantoms may be used to optimize DW-MRI protocols for abdominal/pelvic imaging using a large FOV on scanners from three manufacturers and that these optimized protocols result in ADC values with low wCV across multiple sites and vendors. The high quality images obtained from such a quality-assured process mean that it is possible to implement a DW-MRI protocol for abdominal and pelvic imaging with a good degree of standardization across scanners from three manufacturers and obtain quantitative data in multicenter trials that can be pooled for analysis.
In development of abdominal/pelvic imaging, it is crucial to use large phantoms in order to assess behavior across the FOV that will be employed in the patient population. Small phantoms may fail to reveal artifacts that are apparent at a larger FOV, for example, as a result of poorer shimming across a large FOV. The phantoms that we developed in this study allow qualitative and quantitative assessment of geometric distortion, ghosting, and fat suppression, as well as repeatability and homogeneity of ADC estimates across a large FOV. In many cases, the images can be assessed on the scanner console, which facilitates rapid protocol development, which is particularly valuable in multicenter studies with limited time available in busy clinical departments. A large PDMS phantom, which has previously been used to assess geometric distortion in single-center studies, 16 allows assessment of the effects of bandwidth, parallel imaging, and choice of diffusion gradient scheme in one imaging session using simple subtraction images. Assessment of ghosting, which has been shown to be scanner dependent in another study using a water bottle phantom, 27 can be combined with the same set of measurements.
Optimization of DW-MRI protocols requires trade-offs to be made between many imaging properties, for example, between geometric distortion and ghosting. B 0 distortion can be reduced by increasing the read bandwidth but this is accompanied by an increase in ghosting at higher bandwidths. Quantitative estimates of distortion and ghosting in images of the PDMS phantom allowed us to find an optimal range of bandwidths where both distortion and ghosting were low. In our experience, the preferred read bandwidth varies between scanners and may vary between protocols on the same scanner. Bandwidths in the range ∼1500-2000 Hz/pixel gave acceptable results in the scanners assessed as part of this study. It is also important to note, however, that ghosts may be difficult to assess using protocols that apply SENSE or ASSET due to suppression of signals in background regions.
Development of protocols for multicenter studies necessitates additional trade-offs between optimization and standardization of protocols on different platforms. Some parameters could not be standardized, notably diffusion gradient scheme, diffusion gradient strengths and timings, and methods of fat suppression and parallel imaging. The twice-refocused spin-echo gradient scheme was used on scanners 1 and 2 as assessment of geometric distortion using a PDMS phantom showed that distortion was reduced by employing twice-refocused spin-echo (called "bipolar" and "DSE" on scanners 1 and 2, respectively) diffusion gradient schemes, compared with monopolar gradients, due to reduction in eddy current effects. Twice-refocused spin-echo schemes are, however, not available on all platforms, for example, scanner 3 where a monopolar scheme was used. It would have been possible to standardize the protocols further by using a monopolar scheme on all scanners but this would result in impaired image quality compared with bipolar or DSE schemes. It is, however, important to note that the minimum TE may be higher using twice-refocused gradients, possibly leading to loss of SNR. Twice-refocused schemes may also result in lower SNR than monopolar gradients even at the same TE, which may be due to imperfect refocusing. Geometric distortion was also reduced by employing parallel imaging, which reduces the length of the echo train, and has the additional advantage of reducing the minimum available TE, thus improving SNR. Parallel imaging was employed in this study, despite the lack of standardization between platforms. It would be possible to standardize the acquisition further by not using parallel imaging but this would result in more distortion, due to longer echo trains, and lower SNR, due to longer TE.
TE was reduced on scanners 1 and 3 by employing diffusion encoding schemes (three-scan trace or gradient overplus) that use three orthogonal diffusion gradients that are not aligned with the cardinal directions of the scanner. These schemes allow reduction in TE compared with three gradient directions aligned with axes of magnet (called orthogonal or ALL on some platforms) due to reduction in ramp times. It would have been possible to standardize the protocols further by using orthogonal gradients or single gradient directions on all scanners but this would have resulted in longer TE, and hence lower SNR, compared with three-scan trace or gradient overplus. Despite the use of gradient overplus, the shortest TE available for this protocol on scanner 3 was still longer than on scanners 1 and 2 (96 ms on scanner 3 compared with 75 and 81 ms on scanners 1 and 2, respectively). The longer TE on scanner 3 is a possible explanation for the lower SNR on this scanner. The timings of the diffusion gradients (∆ and δ) vary between different implementations of diffusion gradient schemes and were not standardized between scanners in this study. Furthermore, the definition of b-values in terms of ∆ and δ is different for double spin-echo and monopolar gradient schemes. 28 The values of ∆ and δ may not be readily available on all platforms and the effects of varying ∆ and δ on the ADC estimates have not been fully explored. In this study, the large differences in ∆ and δ between scanners do not appear to have manifested themselves in the ADC results as only small differences were observed between ADC estimates from the three scanners.
TABLE II. Mean, standard deviation, and coefficient of variation of ADC estimates from repeated measurements of five tubes of sucrose solutions in an ice-water phantom on three scanners. The mean ADC estimates from three scanners were calculated by taking the mean of the three mean ADCs for each tube.
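To illustrate how the b-value depends on ∆ and δ, the sketch below evaluates the standard Stejskal-Tanner expression for an idealized monopolar scheme with rectangular gradient pulses, b = γ²G²δ²(∆ − δ/3). Ramp times are ignored, the twice-refocused case (which has a different expression) is not covered, and the gradient amplitude and timings are hypothetical values, not the settings used on any scanner in this study.

```python
GAMMA = 2.6752218744e8  # proton gyromagnetic ratio in rad s^-1 T^-1


def monopolar_b_value(gradient_t_per_m, duration_s, separation_s):
    """Stejskal-Tanner b-value in s/mm^2 for rectangular monopolar pulses.

    gradient_t_per_m : gradient amplitude G in T/m
    duration_s       : pulse duration (delta) in s
    separation_s     : separation of pulse onsets (Delta) in s
    Ramp times are ignored (an idealization).
    """
    if separation_s < duration_s:
        raise ValueError("Delta must be at least as long as delta")
    b_si = (GAMMA * gradient_t_per_m * duration_s) ** 2 * (separation_s - duration_s / 3.0)
    return b_si * 1e-6  # convert s/m^2 to s/mm^2


# Hypothetical settings: G = 30 mT/m, delta = 20 ms, Delta = 40 ms.
b_example = monopolar_b_value(0.030, 0.020, 0.040)
print(f"b = {b_example:.0f} s/mm^2")
```

Different combinations of G, δ, and ∆ can give the same nominal b-value, which is one reason the timings may differ between platforms even when the prescribed b-values are standardized.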
Previous studies have described the development of a large phantom containing corn oil, which was used in this study to assess fat suppression over a large FOV. 18 Corn oil has similar spectral properties and T 1 relaxation times to subcutaneous abdominal fat observed in vivo and the large size of the phantom allows assessment of fat suppression at the edges of the FOV, for example, where B 0 homogeneity may be poorer. Fat suppression requires extensive optimization on some scanners. The preferred method (or methods, since combinations of methods are possible on some platforms) may depend on the scanner and application. It is not possible to standardize the method of fat suppression across scanners from different manufacturers although spectral methods may be employed on all scanners. Spectral methods were chosen over inversion recovery (IR) in the protocols described here as IR reduces the overall signal and introduces T 1 weighting.
Ice-water phantoms have previously been used to compare ADC estimates between scanners. 19,29 Phantoms containing sucrose solutions have also been used to assess long-term repeatability of ADC estimates. 30 Ice-water provides a simple and inexpensive method to control temperature. Sucrose restricts diffusion of water molecules and can be used to reduce the ADC of water to values comparable to ADCs observed in vivo. Our ADC estimates from samples 1 and 2 (0% sucrose) were in good agreement with ADC estimates in distilled water at ∼0 °C reported in other studies. 19 Inclusion of samples containing 10%-20% sucrose allowed us to compare ADC estimates between the three scanners at the range of ADCs observed in tumors and normal tissues, confirming that there was good agreement between the three scanners across the relevant range of ADCs. The use of a large FOV imposes an additional requirement for good homogeneity of ADC estimates across the FOV in addition to accurate ADC estimates near the magnet isocenter. Homogeneity of ADC estimates is particularly relevant in patients with multiple lesions spread across a large volume where some lesions may be far from the magnet isocenter. By using an ice-water phantom to obtain absolute ADC estimates near the magnet isocenter, in combination with relative measurements across the FOV using a water bottle, we were able to assess the performance of our protocols for quantitative DW-MRI across large imaging volumes. Assessments of homogeneity of ADC estimates using a large water bottle were used to determine the maximum feasible extent of the FOV in the z-direction. Deviation in ADC estimates at the ends of a large FOV may lead to errors in measurements of tumors at the ends of the FOV in the case of very large tumors or multiple lesions. The results described in this study indicate that the ADC may be underestimated by 10% at the ends of a large FOV (>200 mm) in the z-direction.
Inhomogeneity of ADC estimates across a large FOV may be due to nonlinearity of the diffusion-weighting gradients. 31 In this study, 26 slices of 6 mm slice thickness were used, providing a 156 mm FOV in the z-direction, as this was achievable on all scanners in the study. The good homogeneity of ADC estimates in the right-left and anterior-posterior directions confirms the feasibility of using a large in-plane FOV; this test can also be used to detect artifacts, for example, due to inappropriate image filters or susceptibility artifacts.
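The homogeneity assessment described above can be illustrated with a minimal sketch. It assumes ADC has been measured in a uniform water bottle at a series of sorted z-positions, takes the measurement nearest the isocenter as the reference, and reports the contiguous z-range over which the relative deviation stays within a chosen tolerance. The function name, the 5% tolerance, and the example values are hypothetical and are not taken from the study.

```python
def usable_z_extent(z_mm, adc, tolerance=0.05):
    """Return (z_min, z_max) of the contiguous range around the isocenter
    in which ADC stays within `tolerance` (fractional) of the isocenter value.

    z_mm must be sorted in ascending order; z = 0 is the magnet isocenter.
    """
    if len(z_mm) != len(adc) or not z_mm:
        raise ValueError("z_mm and adc must be non-empty and of equal length")

    # Reference: the measurement closest to the isocenter.
    centre = min(range(len(z_mm)), key=lambda i: abs(z_mm[i]))
    ref = adc[centre]

    def within(i):
        return abs(adc[i] / ref - 1.0) <= tolerance

    # Walk outwards in each direction until the tolerance is exceeded.
    lo = centre
    while lo > 0 and within(lo - 1):
        lo -= 1
    hi = centre
    while hi < len(z_mm) - 1 and within(hi + 1):
        hi += 1
    return z_mm[lo], z_mm[hi]


# Hypothetical relative ADC values (normalized to the isocenter), for illustration only.
z_positions = [-120, -90, -60, -30, 0, 30, 60, 90, 120]
relative_adc = [0.88, 0.96, 0.99, 1.00, 1.00, 1.00, 0.99, 0.97, 0.90]
print(usable_z_extent(z_positions, relative_adc))
```

With a result of this kind, the slice coverage of the protocol (here 26 slices × 6 mm = 156 mm) can be checked against the z-range over which ADC estimates remain acceptably uniform.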
The good repeatability of ADC estimates in the ice-water phantom and in healthy volunteers agreed with previous studies, 5,19,32,33 indicating that the protocols described here are robust and suitable for quantitative studies. There were no marked differences in repeatability of ADC estimates between the three scanners, which is an important requirement for comparison of ADC changes in multicenter studies, for example, in assessment of response to treatment.
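As an illustration of the repeatability metric, the sketch below computes a within-subject coefficient of variation from paired test-retest measurements. It uses one common definition, the root mean square of the per-subject CV, where the within-subject variance for two repeats is (a − b)²/2. The exact formula used in this study is not stated in this section, so this choice is an assumption, and the ADC values are invented for illustration.

```python
import math


def within_subject_cv(first, second):
    """wCV (as a fraction) from paired test-retest measurements.

    Uses the root-mean-square of per-subject CVs; for two repeats the
    within-subject variance of a pair (a, b) is (a - b)^2 / 2.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    squared_cvs = []
    for a, b in zip(first, second):
        mean = (a + b) / 2.0
        variance = (a - b) ** 2 / 2.0
        squared_cvs.append(variance / mean ** 2)
    return math.sqrt(sum(squared_cvs) / len(squared_cvs))


# Hypothetical median ADC estimates from two visits (arbitrary units).
visit_1 = [2.00, 2.10, 1.95]
visit_2 = [2.04, 2.06, 1.99]
print(f"wCV = {100 * within_subject_cv(visit_1, visit_2):.1f}%")
```

Other definitions, such as the pooled within-subject standard deviation divided by the overall mean, give similar values when the spread of subject means is small.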
A limitation of this study was the use of different volunteers on each scanner, as logistical constraints prevented ten volunteers from traveling to all three institutions. It was, therefore, only possible to compare ADC estimates from the cohorts, rather than comparing individual measurements. A further limitation is that only 1.5 T scanners were included in this study. However, the phantoms described can also be used at 3 T, where similar optimization is required although the challenges presented are greater. 18 A previous multicenter study, which looked at ADC estimates in abdominal organs in DW-MRI data from healthy volunteers imaged on scanners from three manufacturers at 1.5 and 3 T, showed good agreement between ADC estimates in gall bladder, pancreas, spleen, and kidneys from different scanners at 1.5 T but poor agreement in pancreas and kidneys at 3 T and poor agreement in liver at both field strengths. 34 The DW-MRI protocols used the same sequence (single-shot EPI), FOV, matrix, number of averages, slice thickness, and slice gap in all scanners but the TE, TR, b-values, and methods of parallel imaging were not standardized (although a standard subset of b-values was used for analysis). Another multicenter study, which looked at gray matter and white matter in healthy volunteers, found significant variation in ADC estimates between scanners from the same manufacturer and between scanners from different manufacturers. 35 The FOV, matrix, number of averages, slice thickness, slice gap, number of motion probing gradients, and b-values were kept constant between scanners but again TE, TR, and methods of parallel imaging were not standardized.
FIG. 11. Bland-Altman plots for median ADC estimates in (a) kidneys, (b) liver, (c) spleen, and (d) uterus of ten healthy volunteers on three scanners using protocols described in Table I. Key to symbols: circles, scanner 1; squares, scanner 2; triangles, scanner 3.
In both of these studies, as in our protocols, timing parameters and methods of parallel imaging differed between scanners, demonstrating the difficulty of producing fully standardized DW-MRI protocols across multiple platforms.
TABLE III. Coefficients of variation, median, and range of ADC estimates from abdominal and pelvic organs measured on three scanners.
In our study, the good agreement of ADC estimates in the ice-water phantom between the three scanners and the absence of any significant difference in ADC estimates in the kidneys, spleen, and uterus in healthy volunteers show that it is possible to obtain similar ADC estimates on different scanners despite the inherent differences in the acquisitions. The significant difference in ADC estimates in liver between the three scanners in our study corroborates the results from an earlier study, which also demonstrated poor agreement in ADC estimates in the liver between 1.5 T scanners from different manufacturers. 34 These results suggest that multicenter studies using ADC estimates in normal liver may require tighter protocol matching, including TE and other timing parameters, and higher SNR. The lack of agreement in ADC estimates in the liver may be related to the lower signal in the liver compared with other organs or phantom measurements, owing to the relatively short T2 of the liver compared with other organs measured in this study (T2 ∼ 46 ms at 1.5 T, compared with 85-87 ms for kidney, 79 ms for spleen, and 117 ms for myometrium), 36 leading to greater sensitivity to the noise characteristics of the three scanners. It is important to note, however, that tumors generally have high SNR on diffusion-weighted images and would therefore be less sensitive to noise. Tumors may, therefore, be expected to show good agreement in ADC estimates between scanners, as observed in kidneys, spleen, and uterus. Furthermore, the change in ADC between pre- and post-treatment measurements carried out on the same scanner may be acceptable in multicenter studies if absolute values cannot be compared. This remains to be verified in patient studies.
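The effect of the short T2 of liver on signal can be illustrated with the standard spin-echo decay factor exp(−TE/T2), using the T2 values quoted above. This is a rough sketch only: the echo time is a hypothetical value rather than one taken from the protocols, 86 ms is used as the midpoint of the 85-87 ms range for kidney, and proton density, T1, and diffusion weighting are all ignored.

```python
import math

# T2 values at 1.5 T quoted in the text (ms); 86 ms is the midpoint of the
# 85-87 ms range given for kidney.
T2_MS = {"liver": 46.0, "kidney": 86.0, "spleen": 79.0, "myometrium": 117.0}

ASSUMED_TE_MS = 70.0  # hypothetical echo time, not taken from the study protocols


def t2_signal_fraction(te_ms, t2_ms):
    """Fraction of transverse signal remaining at echo time te_ms: exp(-TE/T2)."""
    if t2_ms <= 0:
        raise ValueError("T2 must be positive")
    return math.exp(-te_ms / t2_ms)


for organ, t2 in T2_MS.items():
    fraction = t2_signal_fraction(ASSUMED_TE_MS, t2)
    print(f"{organ:>10}: {100 * fraction:.0f}% of signal remaining at TE = {ASSUMED_TE_MS:.0f} ms")
```

Under these assumptions the liver retains markedly less signal than the other organs at the same TE, and the gap widens as TE increases, which is consistent with the suggestion that TE should be matched closely when liver ADC is compared between scanners.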

CONCLUSIONS
A range of simple phantoms can be used to optimize protocols for DW-MRI of the abdomen and pelvis on 1.5 T scanners from three manufacturers. The optimized protocols give good quality images across a large FOV in healthy volunteers with excellent repeatability of ADC estimates in abdominal and pelvic organs (wCV = 2%-10% for median ADCs). The methods described here serve as a framework for setting up multicenter DW-MRI studies.