SPARSE: Seed Point Auto‐Generation for Random Walks Segmentation Enhancement in medical inhomogeneous targets delineation of morphological MR and CT images

In medical image processing, robust segmentation of inhomogeneous targets is a challenging problem. Because of the complexity and diversity in medical images, the commonly used semiautomatic segmentation algorithms usually fail in the segmentation of inhomogeneous objects. In this study, we propose a novel algorithm imbedded with a seed point autogeneration for random walks segmentation enhancement, namely SPARSE, for better segmentation of inhomogeneous objects. With a few user‐labeled points, SPARSE is able to generate extended seed points by estimating the probability of each voxel with respect to the labels. The random walks algorithm is then applied upon the extended seed points to achieve improved segmentation result. SPARSE is implemented under the compute unified device architecture (CUDA) programming environment on graphic processing unit (GPU) hardware platform. Quantitative evaluations are performed using clinical homogeneous and inhomogeneous cases. It is found that the SPARSE can greatly decrease the sensitiveness to initial seed points in terms of location and quantity, as well as the freedom of selecting parameters in edge weighting function. The evaluation results of SPARSE also demonstrate substantial improvements in accuracy and robustness to inhomogeneous target segmentation over the original random walks algorithm. PACS number: 87.57.nm

In the past decade, efforts have continued towards developing interactive segmentation methods. One of the popular interactive segmentation methods is random walks (RW), (8) where the users achieve segmentation through estimating the probability that a random walker starting at each voxel to reach the labeled initial seed points (SPs). The RW algorithm has several drawbacks in practice. Firstly, RW relies on the location and quantity of the initial SPs to a large extent. It has been demonstrated that RW is stable to changes of the SPs only when the changes of location and quantity are below 10% and 50%, respectively. (9) Moreover, to obtain satisfactory segmentation results, the quantity of initial SPs should be sufficient to be representative of almost all intensity levels in the target. However, in practice, it is tedious and laborious to place sufficient SPs on the target, especially in a 3D scenario. In this sense, a stable segmentation result may not be guaranteed when insufficient SPs are used, especially for medical inhomogeneous targets, which are common in clinical studies.
Great efforts have been made to improve RW by reducing its reliance on the SPs. Dong et al. (10) proposed a novel SPs selection method composed of a region growing technique and morphological operation to promote the RW segmentation of ventricle in 3D dataset. But, it is limited to cavity or cavity wall extraction. Li et al. (11) extended the RW by incorporating a prior shape to relieve the requirement of the SPs. However, this method works under the assumptions of slight occlusion, similar background, and illumination changes in the pedestrian images, which has limited application in medical images. Moreover, a proper prior shape is usually difficult to obtain because of the diversified shapes of clinic targets. Onoma et al. (12) proposed an improved RW by initializing the SPs automatically via fuzzy-C means to yield better segmentation of lung tumor. However, it fails in the segmentation of lesions with complex shapes and inhomogeneous uptake. Cui et al. (13) described a SPs automatic selection strategy for RW-based segmentation of lung tumor in computed tomography (CT) image by using the positron emission tomography (PET) image as the prior. But the availability of the PET image may restrain its application to segmentation tasks in other image modalities.
In this paper, we propose a novel algorithm imbedded with seed point autogeneration for random walks segmentation enhancement (named SPARSE). A previous study (9) has shown that more SPs generally provide more useful information in the target region and background, and thus more robust segmentation results can be guaranteed. With a limited number of userlabeled initial SPs, we have developed a SPs autogeneration scheme to obtain extended seed points (ESPs) by estimating the probability of each voxel with respect to a certain label. The RW algorithm is then applied upon the ESPs to achieve improved segmentation result. The SPARSE is implemented using compute unified device architecture (CUDA) on a graphic processing unit (GPU) platform to improve computation efficiency. The performance of SPARSE is evaluated using homogeneous and inhomogeneous cases. It is found that the SPARSE is robust to the initial SPs in terms of location, quantity, and the selection of parameter β in edge weighting function. Furthermore, SPARSE improves the segmentation accuracy when compared with the original RW and another state-of-the-art interactive segmentation algorithm -the graph cut (GC) method. (14)

II. MATERIALS AND METHODS
A. Review of the RW RW (8,15) is used to solve the segmentation problem by calculating the probability that a random walker starting at each voxel will first reach one of the labeled SPs. A graph consisting of a pair G = (V,E) with vertices (nodes) v∈V and edges e∈E is first created based on the image I to be segmented. The edge e spanning two vertices, v i and v j , is denoted by e ij . The connectivity of two adjacent vertices on an edge is weighted by ω ij , such as where I i and I j indicate the image intensity at voxel i, j, respectively. β represents the only free parameter.
The desired random walker probabilities x on vertices v∈V can be obtained by solving the following combinatorial Dirichlet problem: (2) where x i and x j are the random walker probabilities on vertices v i , v j , respectively. L is the Laplacian matrix: is the degree of a vertex for all edges e ij incident on v i . The Laplacian matrix is a sparse matrix. It is built according to four-connectivity and six-connectivity for 2D and 3D scenarios, respectively.
The vertices can be partitioned into two sets, V M (labeled vertices) and V U (unlabeled vertices) such that V M ∪ V U = V and V M ∩ V U = ∅. In this way, Eq. (2) can be decomposed into (4) where x M and x U correspond to the probabilities of the labeled and unlabeled voxels, respectively. Differentiating D[x U ] with respect to x U and finding the critical point yields The probabilities x U for those unlabeled voxels can be easily calculated by solving the above sparse, positive definite linear Eq. (5). Denoting the probability assumed at node v i for each label s, by x s i , the final segmentation is obtained by assigning each node v i the label corresponding to maximum probability max s (x s i ). Usually, at least two groups of labels (s ≥ 2) need to be supplied manually (i.e., SPs inside the region to be segmented (named foreground points) and SPs at the background (named background points)).

B. Seed point autogeneration
In the 2D homogenous scenario, only a small number of SPs are needed to be labeled to yield a satisfactory segmentation. For the 3D segmentation task, it would be tedious and sometimes impractical to specify all the SPs on each slice of a 3D volume. Moreover, if the target to be segmented is not homogeneous, but contains highly diversified materials instead, using only a small number of SPs usually fails to provide enough information of the target, and thus satisfactory result cannot be obtained. In this study, we propose a SPs autogeneration scheme to obtain the ESPs for RW by estimating the probability of each voxel with respect to a certain label.
Specifically, let us assume that there are k(k ≥ 2) labels s k , and denote the labeled SPs and the corresponding intensities as S s k , t s k c respectively, where c is the number of the labeled SPs S s k . The intensity distribution probability p(i|s k ) of voxel i in image I with respect to label is thus given by (6) where σ is a tuning parameter that reflects the rigor level of the similarity criteria, and N s k is a normalizing constant for label s k : where m = max max (I i -I j ),∀i, j.
Once p(i|s k ) is available, the probability p(s k |i) of voxel i to the chain of label s k can be simply calculated as (8) with 1 representing the highest probability. In this way, we can obtain the probability maps P s k with respect to label s k for each voxel in image I.
Region growing (16) is then performed on the probability maps P s k with the initially labeled SPs S s k as the SPs and p(s k |i) > P T s k as the growing condition to yield the ESPs E s k , where P T s k is the threshold to filter those points with high similarity to the label s k . For simplicity, we set P T s k = P T for ∀s k in this study. The final segmentation is then completed by performing the RW with E s k for each label s k . The above ESPs generation scheme is summarized in Fig. 1.

C. Implementation
The proposed algorithm is implemented on a platform equipped with an NVIDIA Tesla C1060 card with a total number of 240 1.3 GHz processors, and a 4 GB DDR3 memory shared by all processors. This platform enables parallel processing of the same operations on different CUDA threads simultaneously, which speeds up the entire algorithm. In order to efficiently parallelize RW in the CUDA environment, the data parallel portions of the algorithm are identified and grouped into the following kernels: 1) an edge kernel to build a graph with vertices and edges; 2) a weighting kernel to compute ω ij ; 3) a normalization kernel to normalize ω ij ; and 4) a Laplacian kernel to create the Laplacian sparse matrix. Moreover, we adopt the CUSP (a CUDA-based library for sparse linear algebra and graph computations) (17) to solve the sparse linear Eq. (6). Sparse matrices are stored using the Coordinate (COO) format. The conjugate gradient (CG) method is used as the iterative solver with relative tolerance of 10 -6 and maximum iteration 1000 as the stopping criteria.

D. Evaluation data
The performance of SPARSE is evaluated using a synthetic phantom and four clinical 2D (cases 1-4) and twenty (cases 5-24) 3D CT/ MR images. The synthetic phantom (Fig. 2) contains a ground truth target which is made up of piecewise blocks with different intensity values representing inhomogeneous anatomical structures. The image resolution of the phantom is 256×256, and Gaussian noise with variance of 20 is added to the phantom, yielding a signal-to-noise ratio of roughly 25 in it. For clinical cases, Cases 1-4 are 2D cases including: a fibula CT image of resolution 128×128 (case 1); a lung CT image of resolution 512×512 (case 2); a corpus callosum T2 MR image of resolution 320×320 (case 3); and an abdomen CT image of resolution 512×512 (case 4). Cases 5-24 are 3D cases including: three high-dose-rate (HDR) brachytherapy CT images with tandem and cylinder (T&C) applicator of resolution 256×256×85 (cases 5, 11), 256×256×60 (case 12); five HDR CT images with tandem and ovoid (T&O) applicator of resolution 256×256×55 (case 6), 256×256×75 (case 13), 256×256×60 (case 14), 256×256×58 (case 15), and 256×256×82 (case 16); three T1c MR images of resolution 256×256×100 (cases 7, 17-18); three T2, T1c, and FLAIR MR image series of resolution 160×216×176 (cases 8-10, 19-24). In this study, we classify all the evaluated cases into two categories, namely "homogeneous" and "inhomogeneous", according to the variance of the intensity levels of the segmentation target. Specifically, given the ground truth segmentation, the intensity of the target is first normalized between [0,1], and the standard deviation (SD) is calculated. For those cases with target SD values smaller than a predefined threshold (0.1 is used in this study), they can be regarded as "homogeneous"; otherwise, they are classified as "inhomogeneous". The target SD value for the synthetic phantom is around 0.22, and the target SD values for other clinical cases are listed in Table 1. By using this metric, we classify cases 3, 7 and 17-18 as the homogeneous cases, while the others cases, including the synthetic phantom, as the inhomogeneous cases.

E. Quantifications of segmentation performance
To quantitatively evaluate the segmentation performance, we employ three similarity metrics: the Dice's coefficient (DC), (18,19) the percent error (PE), and the Hausdorff distance (HD). (20) Given the ground truth segmentation region A and the calculated segmentation region B, and their corresponding boundary point sets A , which ranges from 0 to 1, corresponding to the worst and the best segmentation, respectively. The PE is defined as PE=(A ∪ B -A ∩ B)/A, with 0 representing the best segmentation. The HD is defined as and is the L 2 norm on the points of and B → . For all the evaluated cases (expect for cases 7-10 and cases 17-24 where the ground truths are available), the targets are delineated manually by three experienced physicians, and the optimal combination of these three raters is estimated using the SPAPLE algorithm (21) and serves as the surrogate of the ground truth segmentation for each case. All the generated ground truth segmentations are then used for segmentation accuracy studies between the RW, GC, and SPARSE with the above quantitative metrics. Figure 2 illustrates the comparison results of the synthetic phantom using the RW, GC, and SPARSE algorithm. This simple phantom contains an inhomogeneous target made up of nine blocks with different intensities ranging from 25 to 255 ( Fig. 2(a)), and the same SPs for RW, GC, and SPARSE are only seeded in five of them ( Fig. 2(b)). We can see that the RW and GC algorithm both fail in segmenting the entire target blocks (Fig. 2(c) and (d)). The SPARSE, in contrast, yields a successful segmentation (Fig. 2(e)). Figure 3 demonstrates an example segmentation of the fibula using RW, GC, and SPARSE with different seed points (case 1). Figure 3(a) shows that RW, GC, and SPARSE can yield satisfactory segmentation results with sufficient SPs (Fig. 3(a)-2,-3,-4). When one (Fig. 3(b)-1) or two SPs (Fig. 3(c)-1) inside the fibula are removed, keeping the others at their original positions, RW, GC, and SPARSE deteriorate, although SPARSE behaves much better than RW and GC. Figure 4 shows another comparison of segmenting a lung tumor with SPs of different locations and quantities (case 2). Figure 4(a) shows that RW, GC, and SPARSE can yield similar satisfactory results when sufficient SPs are labeled ( Fig. 4(a) The synthetic phantom and three challenging cases (cases 1, 2, and 4) with inhomogeneous targets are used for further assessment of the impact of SPs' quantity and location on segmentation performance. To analyze the sensitivity of SPs' quantity, certain amounts of initial SPs are first labeled on the images, and then gradually reduced by random down-sampling with interval of 5% of the original number. The corresponding segmentation results are tracked on each reduction step and quantitatively measured by three similarity metrics: DC, PE, and HD. Figure 5 shows the comparison results of the segmentation response to SPs reduction, with RW and SPARSE. For all the evaluated cases, SPs reduction of less than 80% are seen to produce only minor changes in the resulting segmentation using both RW and SPARSE, while sharp drops in segmentation quality occur when the reduction is larger than 80%. Perturbations of segmentation performance in SPARSE is slightly larger than RW; however, SPARSE always has superior segmentation results than RW for all three metrics.

B. Influence of initial seed points: quantity and location
To analyze the sensitivity of SPs' location, arbitrary SPs placement is simulated by shifting the labeled SPs with a random direction and amplitude. Specifically, given certain initial labeled SPs, the shift amplitude of each SPs is given by multiplying the minimum distance from its current location to the boundary points on the ground truth segmentation with a random number in the range (0,1). The shift direction is randomly assigned. Perturbing in this way, the initial SPs can only move within a reasonable distance without crossing over (e.g., moving the foreground points into the background). The segmentation results are tracked on each random  SPs location and quantitatively measured by DC, PE, and HD. Figure 6 shows the comparison of segmentation performance relative to the SPs position changes between RW and SPARSE. In RW, a small variation of segmentation performance is observed in the synthetic phantom and in cases 1 and 2, but large perturbation is seen in case 4. In contrast, segmentation is stable for all the evaluated cases via SPARSE. Moreover, SPARSE behaves much better than RW in terms of segmentation accuracy.  , and HD (c) are comparisons of the segmentation response to random SPs location between RW and SPARSE, respectively. Given certain initial labeled SPs, the shift amplitude of each SPs is given by multiplying the minimum distance from its current location to the boundary points on the ground truth segmentation with a random number ranges in (0,1), which is the x-axis. The shift direction is randomly assigned.

C. Influence of parameter β
Since the number of SPs is largely increased via SPs autogeneration, the influence of β is weakened in SPARSE. Figure 7 shows experiments of the segmentation response to different β. It is shown that the segmentation results vary greatly when different β is used in RW, and the optimal β is case-dependent, since no fixed perturbation pattern is observed in all the three cases. In contrast, the segmentations are stable relative to different β used in SPARSE for the synthetic phantom and all the three clinical cases, implying that the segmentation is essentially independent of the selection of β. Similar results are also obtained in other evaluated cases, though only three outputs are depicted in Fig. 7 for the clarity of comparison. Therefore, we empirically set β = 90 for SPARSE, and the optimal ones are used for RW for all the other comparison studies in this study.  Figure 8 presents the segmentation results response to different P T in the synthetic phantom and ten segmentation cases (cases 1-10). We have the following two observations. First, the performance of SPARSE is stable when small P T is used, and degenerates as P T increases to a certain degree in all evaluated cases. The turning points are around 0.8 and 0.95 for 2D and 3D, respectively. Theoretically, larger P T implies stricter growing condition; in other words, fewer points will be included into the ESPs chain. If no new points are grown into ESPs, the SPARSE will degenerate to RW. Secondly, when small P T is used (P T ≤ 0.5), the ESPs might be overgrown, or one point might be assigned to more than one label. Case 5 in Fig. 8 is such a typical case that failed segmentation is obtained when P T = 0.5 is used. Based on these observations from both the phantom and clinical cases, we hence heuristically assume that an appropriate range for P T is [0.6, 0.9] and we empirically set P T = 0.8 for all the evaluated cases in this study.  Figure 9 illustrates the comparison of results of segmenting relatively homogeneous targets. Figure 9(a) shows the segmentation comparisons of the corpus callosum in a T2 MR image (case 3). Since only limited SPs are labeled in this case, oversegmentation is observed in RW and GC (arrow in Fig. 9(a)-2,-3), while the SPARSE yields a much better segmentation result ( Fig. 9(a)-4). Figure 9(b) shows a case of segmenting a glioma in a T1c MR image (case 7). We can see that oversegmentations are sparsely distributed in some voxels outside of the target region in RW (arrows in Fig. 9(b)-2), and slight oversegmentation is also observed in GC (arrows in Fig. 9(b)-3). Comparatively, the SPARSE can generate more accurate results ( Fig. 9(b)-4). Figure 10 illustrates the comparison results of segmenting inhomogeneous targets in CT images. Figure 10(a) is the segmentation of the vertebra (case 4). Undersegmentation is observed in RW and GC (arrow in Fig. 10(a)-2,-3), while SPARSE can yield satisfactory result ( Fig. 10(a)-4). Figures 10(b) and (c) are the segmentations of two different types of applicator in CT images: T&C applicator and T&O applicator, respectively (cases 5, 6). The RW and GC produce severe undersegmentation (Figs. 10 Figure 11 shows another inhomogeneous case of segmenting a glioma in T2, T1c MR images (cases 8, 9) ( Fig. 11(a) and (b)), and glioma and edema in FLAIR MR image (case 10) (Fig. 11(c)

F. Quantitative evaluations
The quantitative comparison results for all the 3D cases (cases 5-24) are listed in Table 2. We can see that SPARSE is superior to RW and GC in terms of segmentation accuracy for all the evaluated cases in all the three metrics. For RW, the mean DC, PE, HD, and their corresponding standard deviation are 0.47 ± 0.19, 0.69 ± 0.18, and 50.74 ± 31.17, respectively. For GC, the mean DC, PE, HD are 0.59 ± 0.17, 0.58 ± 0.17, and 15.62 ± 5.97, respectively, which are slightly better than RW. For SPARSE, in contrast, the DC increase to 0.89 ± 0.03, the PE and HD decrease to 0.23 ± 0.07, and 8.24 ± 3.95, respectively.

G. Computational efficiency
All the experiments in this study were conducted on a GPU platform with an NVIDIA Telsa C1060 card with a total number of 240 processors of 1.3 GHz. It is also equipped with three GB DDR3 memory, shared by all processors. The mean computation time is 0.44 ± 0.10 s for four 2D cases (cases 1-4) and 38.63 ± 9.41 s for twenty 3D cases (cases 5-24), respectively. Table 3 lists all computational time for all the 3D cases. It can be seen that the computational time depends on the complexity of the cases tested.

IV. DISCUSSION & CONCLUSIONS
In this paper, we presented a RW-based segmentation algorithm, SPARSE, which incorporates a novel SPs autogeneration scheme for segmentation of inhomogeneous targets. Evaluations of segmentation tasks in the clinical images reveal that the SPARSE decreases the sensitivity to the initial seed points in terms of location and quantity, as well as the dependency of the free parameter β in edge weighting function. With GPU implementation, the robustness and accuracy of the proposed method has been demonstrated with various tested cases in the segmentation of inhomogeneous objects, especially for 3D cases. One merit of the proposed SPs autogeneration in SPARSE is that the influence of the initial SPs placement to the ultimate segmentation can be reduced. The sensitivities of both SPs quantity and location are evaluated by visual inspection, as well as quantitative analysis. It has been shown that variation of segmentation performance in SPARSE is negligible when the number of SPs is reduced or the locations of SPs are changed randomly, and SPARSE generally produces superior segmentation results than RW when SPs are perturbed. Another gain of the SPs expansion strategy is the reduced workload and labor in manual SPs selection, making the SPARSE algorithm a practical tool for segmentation tasks in clinics. Manual SPs labeling is tedious and usually time-consuming for most of the interactive segmentation algorithms. In SPARSE, however, the user only needs to specify several SPs on one or a few slices on a 3D volume instead of carefully seeding all through the slices to obtain a satisfactory segmentation. This will significantly facilitate the clinical workflow.
In practice, more SPs usually imply more target/background intensity information, which can theoretically maximize the performance of RW. SPARSE can facilitate collecting such intensity information to a feasible extent with only limited user interaction, and it is this fact that more robust segmentation can be expected given the "incremented" information by incrementing the SPs. Although less initial SPs dependence is achieved in SPARSE, to guarantee appropriate SPs autogeneration, the initial SPs still need to be labeled on locations that are representative of different intensity levels in the target/background instead of seeding arbitrarily. Furthermore, SPs autogeneration is in favor of the β selection. Choosing an appropriate β for RW is a nontrivial task since β is essentially case dependent, and the segmentation performance can vary dramatically when different β is used. However, the SPARSE can weaken this dependence with incremental SPs, which has been demonstrated in this study.
In Eqs. (6) and (7), we set the tuning parameter σ = 0.5×D I , where D I is the mean difference of the intensity between the labeled SPs inside and outside the target, to keep the choice of σ relevant to the intensity contrast of the target and the background. This approach is shown to be effective in distinguishing the low contrast targets.
In SPARSE, the threshold P T controlling the SPs growth is empirically chosen based on experiments of ten segmentation tasks. Larger P T usually implies stricter growing condition. However, one should note that P T is not the only factor contributing to the SPs growth. Connectivity of voxels in the probability map is another factor, which is purely relative to the textural characteristics of the image. Therefore, potential overgrowth of SPs is theoretically possible even if small P T is used, in which case, manual intervention (e.g., deleting certain undesired incremented SPs) is then necessary. According to our observation, satisfactory segmentation performances can be obtained when P T ranges in [0.6, 0.9]. We have used a relatively rigorous P T = 0.8 to filter out those points with high similarity to the user-labeled SPs and include them into the SPs chain. For simplicity, we heuristically use the same P T for each label s, though this value might not be optimal for all labels. More sophisticated methods (e.g., adaptive approaches) need to be investigated for a better selection of P T , and we would like to include this work into our future study.