Optimal unified combination rule in application of Dempster-Shafer theory to lung cancer radiotherapy dose response outcome analysis

Yanyan He,1 M. Yousuff Hussaini,2 Yutao U. T. Gong,3a Ying Xiao3
Scientific Computing and Imaging Institute,1 University of Utah, Salt Lake City, UT; Mathematics,2 Florida State University, Tallahassee, FL; Radiation Oncology,3 Thomas Jefferson University, Philadelphia, PA, USA
G-319, Philadelphia, PA 19107, USA; phone: (215) 955 1185; fax: (215) 955 0412; email: physhery@gmail.com


I. INTRODUCTION
Radiotherapy plays an important role in the treatment of lung cancer, but it often leads to complications. Physicians therefore need to estimate the risk of complications from published information (available clinical data) and their own experience. In 2010, the Quantitative Analysis of Normal Tissue Effects in the Clinic (QUANTEC) reviews provided focused summaries of the dose/volume/outcome information for many organs. However, uncertainties, such as the measurement error of total lung volume in radiation therapy practice and inconsistency among the algorithms used at different institutes, were ultimately reflected in the outcomes of the data analysis. QUANTEC suggested that the information in the reviews could be updated and improved with the help of new physical and statistical techniques. (1)(2)(3) The present work introduces a statistical tool from Dempster-Shafer (DS) theory to evaluate dose response. Using the combination procedure in DS theory, data from multiple sources can be fused, and more specific and accurate inference may be achieved. Meta-analysis techniques for data fusion exist, such as the inverse variance weighting method, but they mainly deal with uncertainty arising from randomness, and they require unpublished as well as published results to avoid publication bias. DS theory and its combination rules, on the other hand, can deal with different types of uncertainty and produce reasonable results from the available information. Therefore, in the current work we focus only on the combination rules within DS theory.
We have previously presented a case study of "belief" and "plausibility" in regard to the occurrence of radiation pneumonitis (RP) as an example to demonstrate the application of the theory. (4) In the current work, we discuss an optimal unified rule that provides a combined result (fused information) most similar to the individual sources of information: the lung cancer radiotherapy dose response data from different institutes.

II. MATERIALS AND METHODS
In lung cancer radiotherapy dose response analysis, we are interested in the incidence of pneumonitis among patients who have received radiotherapy for lung cancer. The information we have (see Table 1) is the probability of pneumonitis (in the form of error bars: ± one standard deviation (SD)) at specific mean lung doses (MLD) from four institutions: Memorial Sloan-Kettering Cancer Center (5) (MSKCC), Duke University Medical Center (6) (Duke), MD Anderson Cancer Center (7) (MD Anderson) and the University of Michigan (8) (Michigan). The incidence ranges of pneumonitis are the same as those in the previous study, (4) except that the data from Duke University are corrected as follows: we used the observed incidence of RP with respect to MLD (Table 4 from Duke (6)) to estimate the confidence interval $[p - \sigma,\ p + \sigma]$, where $p$ is the number of observed pneumonitis cases divided by the total number of cases $n$, and $\sigma$ is the estimate of the standard deviation given by $\sigma = \sqrt{p(1 - p)/n}$.
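The interval estimate described above can be sketched in a few lines of Python. This is only an illustration of the formula $\sigma = \sqrt{p(1-p)/n}$; the function name `incidence_interval`, the clipping of the bounds to [0, 1], and the example counts are assumptions made here and are not taken from the Duke table.

```python
import math


def incidence_interval(cases: int, n: int) -> tuple[float, float, float]:
    """Return (p, lower, upper) for the one-SD interval [p - sigma, p + sigma]."""
    if n <= 0 or not 0 <= cases <= n:
        raise ValueError("need n > 0 and 0 <= cases <= n")
    p = cases / n                          # observed incidence
    sigma = math.sqrt(p * (1.0 - p) / n)   # binomial estimate of the SD of p
    # Clipping to [0, 1] is an added assumption so the bounds stay valid incidences.
    return p, max(0.0, p - sigma), min(1.0, p + sigma)


# Hypothetical counts for illustration only (not clinical data).
p, lower, upper = incidence_interval(cases=4, n=25)
print(f"p = {p:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
```

With small patient numbers the interval is wide, which is one source of the uncertainty that the combination rules later have to carry through.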
the incomplete knowledge. Similarly, we construct all the m-functions $m_i^j$ for the $j$th MLD from the $i$th institution (see Appendix B).
Next, we apply a special case of Inagaki's unified rule, (9) namely the optimal unified rule of He and Hussaini, (10) which is based on a distance measure, to fuse the m-functions into a single m-function. The m-function obtained (representing the combined information from the four institutions) is most similar to the individual sources of information. See Appendix A for the details of the combination rules.

III. RESULTS & DISCUSSION
A. The combined results from the optimal unified rule
The degrees of belief and plausibility of pneumonitis corresponding to the different dosages are calculated using the optimal unified rule (see Appendix C for data) and plotted in Fig. 1. The LKB model is used to fit the degrees of belief and plausibility at the four different dosages to obtain two boundary sigmoid curves.
The best estimate of the incidence of RP (at a specific dose) from the four institutions lies between the degrees of belief and plausibility. For example, for a patient receiving MLD = 15 Gy, the minimum incidence of radiation pneumonitis is 0.83%, while the maximum incidence is 12.80%. The gap between these two (the belief-plausibility range) takes into account the uncertainty in the original information and part of the conflict among information sources. It can be interpreted as the uncertainty with which a patient belongs to either group ({RP} or {non-RP}). Figure 1 also shows that the belief-plausibility range widens as MLD increases because, at higher doses, the original data (see Table 1) involve more uncertainty within individual institutions and more conflict/inconsistency among institutions. The two boundary curves provide the belief and plausibility of {RP} continuously with respect to MLD. Figure 1 also shows that radiation pneumonitis essentially always occurs when MLD ≥ 40 Gy, and a conservative estimate of the dose at which RP essentially always occurs is 35 Gy.
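For readers unfamiliar with the LKB (Lyman-Kutcher-Burman) curve, the sketch below evaluates its commonly used probit form with the mean lung dose as the effective dose. The text does not give the functional form or the fitted parameters, so the parameterization (volume parameter equal to 1, so that the effective dose reduces to MLD) and the values of `TD50_GY` and `SLOPE_M` are assumptions for illustration only.

```python
import math


def lkb_ntcp(mld_gy: float, td50_gy: float, m: float) -> float:
    """Probit form of the LKB curve: Phi((MLD - TD50) / (m * TD50))."""
    t = (mld_gy - td50_gy) / (m * td50_gy)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Hypothetical parameters, NOT the fitted values behind Fig. 1.
TD50_GY, SLOPE_M = 30.0, 0.4
doses = (10.0, 20.0, 30.0, 40.0)
curve = [lkb_ntcp(d, TD50_GY, SLOPE_M) for d in doses]
for d, y in zip(doses, curve):
    print(f"MLD = {d:4.1f} Gy -> {y:.3f}")
```

Fitting one such curve to the belief values and another to the plausibility values would give the lower and upper boundary curves, respectively.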

B. Comparison of the results from the optimal unified rule to the results from DS and Yager's Rules
The results from DS and Yager's rules are shown in Fig. 2 and compared to the optimal unified rule. Figure 2 (left) indicates that Dempster's rule of combination produces counterintuitive results because all the original incidence ranges from the institutions (e.g., at MLD = 20 Gy) are outside the range of belief and plausibility curves; the maximum possibility of pneumonitis after combination (plausibility value) is even smaller than all the estimated minimum possibilities of pneumonitis (the lower bounds of the vertical bars) from the institutions. This result obviously is due to the renormalization (i.e., distributing the belief mass committed to the empty set to the focal elements proportionally to their belief masses), which reinforces the proposition (focal element) with a larger degree of belief. Compared to Dempster's rule, the optimal unified rule provides reasonable results. Compared to Yager's rule (see Fig. 2 (right)), the optimal unified rule produces results with smaller belief-plausibility ranges, thereby indicating relatively less uncertainty.

Fig. 1. The degrees of belief (◊) and plausibility (+) obtained from the optimal unified rule. The solid curves are the fits for the degrees of belief (lower curve) and plausibility (upper curve) using the LKB model. The error bars are from the clinical data of MSKCC, (5) Duke University, (6) M.D. Anderson, (7) and the University of Michigan. (8)

Fig. 2. (left) The solid curves are the fits for the degrees of belief (lower curve) and plausibility (upper curve) obtained from the optimal unified rule using the LKB model. The dashed curves are the fits for the degrees of belief (lower curve) and plausibility (upper curve) obtained from Dempster's rule using the LKB model. The error bars are from the clinical data of MSKCC, (5) Duke University, (6) M.D. Anderson, (7) and the University of Michigan. (8) (right) Belief and plausibility ranges; the results from the optimal unified rule are the thicker lines in red, and the ones from Yager's rule are the thinner lines in black.

APPENDICES
Appendix A. The Basics of Dempster-Shafer Theory.
Dempster-Shafer theory introduces measures (belief and plausibility) to model one's intuitive perception of belief/opinion and combination rules to aggregate/fuse data from multiple independent sources. Following are the basic notions of Dempster-Shafer theory.
Let θ be the question of interest (e.g., Does a patient have radiation pneumonitis?) and let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be the collection of all possible answers (e.g., Yes and No). Θ is called the universal set or the frame of discernment. The true answer $\theta_0$ is unknown, and we consider the strength of support from evidence (information) for propositions of the form "the true answer $\theta_0$ is in A," where A is a subset of Θ (A ⊆ Θ). The support of a piece of evidence for proposition A can be represented by a basic belief assignment (BBA, also called the m-function). An m-function assigns a number (called belief mass or simply mass) in [0, 1] to each element of the power set $2^{\Theta}$ (the collection of all the subsets of Θ):
$$ m : 2^{\Theta} \to [0, 1], \qquad \sum_{A \subseteq \Theta} m(A) = 1. $$
We further assume that no belief mass is assigned to the empty set Φ (i.e., m(Φ) = 0). The element A ⊆ Θ is called a focal element if it has a nonzero belief mass (i.e., m(A) ≠ 0), and the union of all the focal elements is called the core of the m-function.
In general, the m-function is not a traditional probability distribution function (pdf), although it is amenable to such an interpretation in a restricted sense. Belief is evidence supporting a proposition and it may be considered as one's weighted opinion regarding a proposition. Plausibility is evidence that does not contradict a proposition. They may be viewed as providing a lower and upper bound, respectively, on the likelihood of a proposition (being true). The gap between these two describes the uncertainty in one's belief in proposition A due to incomplete or partial knowledge.
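As a small illustration of these two measures, the sketch below uses the standard Dempster-Shafer definitions, Bel(A) = Σ m(B) over nonempty B ⊆ A and Pl(A) = Σ m(B) over B with B ∩ A ≠ Φ, on the two-answer frame {RP, non-RP}. The belief masses are hypothetical and chosen only to show the mechanics.

```python
RP = frozenset({"RP"})
NON_RP = frozenset({"non-RP"})
THETA = RP | NON_RP  # frame of discernment


def belief(m: dict, a: frozenset) -> float:
    """Sum of the masses of all nonempty focal elements contained in a."""
    return sum(mass for b, mass in m.items() if b and b <= a)


def plausibility(m: dict, a: frozenset) -> float:
    """Sum of the masses of all focal elements that intersect a."""
    return sum(mass for b, mass in m.items() if b & a)


# Hypothetical m-function (masses sum to 1); not clinical data.
m = {RP: 0.10, NON_RP: 0.70, THETA: 0.20}
bel_rp = belief(m, RP)        # only {RP} supports RP outright
pl_rp = plausibility(m, RP)   # {RP} and Theta do not contradict RP
print(f"Bel(RP) = {bel_rp:.2f}, Pl(RP) = {pl_rp:.2f}")
```

The mass left on Θ is exactly the width of the belief-plausibility gap for {RP} on this two-element frame.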
The m-, belief and plausibility functions in Dempster-Shafer theory are capable of representing partial knowledge or a piece of incomplete information/evidence. In practice, different sources of evidence may provide information about the same question of interest. Assuming independence of evidence sources, Dempster-Shafer theory introduces a combination rule to combine them to obtain a single belief for the purpose of statistical inference.
Suppose two m-functions, $m_1$ with focal elements $A_i$ ($1 \le i \le n_1$) and $m_2$ with focal elements $B_j$ ($1 \le j \le n_2$), are constructed from two distinct bodies of evidence. The conjunctive sum combines $m_1$ and $m_2$, resulting in a single function:
$$ q(C) = \sum_{A_i \cap B_j = C} m_1(A_i)\, m_2(B_j), $$
where C is a nonempty subset of the universal set Θ (C ⊆ Θ, C ≠ Φ). The mass $q(\Phi)$ is obtained from the same sum taken over the pairs with $A_i \cap B_j = \Phi$. It is quite possible that $k = q(\Phi) \ne 0$ because two bodies of evidence may support contradictory answers (we call this conflict), which violates the requirement in the definition of the m-function that m(Φ) be zero.
To make the resulting single function satisfy the definition of an m-function, Dempster's rule of combination simply ignores the conflicting evidence (discards the belief mass committed to the empty set) and inflates each q(C) by the factor $1/(1 - k)$ so that the belief masses of the subsets sum to unity. It is defined as an orthogonal sum:
$$ m_{DS}(C) = (m_1 \oplus m_2)(C) = \frac{q(C)}{1 - k}, \quad C \ne \Phi; \qquad m_{DS}(\Phi) = 0. $$
Instead of ignoring the conflict, an alternative combination rule, Yager's rule, treats the conflict as part of the ignorance. Yager's rule is defined as:
$$ m_{Y}(C) = q(C), \quad C \ne \Phi,\ \Theta; \qquad m_{Y}(\Theta) = q(\Theta) + k; \qquad m_{Y}(\Phi) = 0. $$
It is clear that Yager's rule reassigns the belief mass committed to the empty set to the universal set, introducing more uncertainty (see Appendix D for proof). Here, we consider an optimal unified combination rule that maximizes the similarity among the multiple datasets, thereby reducing the uncertainty range.
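The conjunctive sum, Dempster's rule, and Yager's rule described above can be sketched for two sources on the frame {RP, non-RP}. The function names and the example masses are hypothetical; the code only illustrates how the conflict mass k is either renormalized away (Dempster) or moved to Θ (Yager).

```python
from itertools import product

RP = frozenset({"RP"})
NON_RP = frozenset({"non-RP"})
THETA = RP | NON_RP
EMPTY = frozenset()


def conjunctive_sum(m1: dict, m2: dict) -> dict:
    """q(C): sum of m1(A) * m2(B) over all pairs with A & B == C."""
    q: dict = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        q[a & b] = q.get(a & b, 0.0) + wa * wb
    return q


def dempster(m1: dict, m2: dict) -> dict:
    """Discard the conflict k = q(empty) and rescale by 1 / (1 - k)."""
    q = conjunctive_sum(m1, m2)
    k = q.pop(EMPTY, 0.0)
    if k >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {c: w / (1.0 - k) for c, w in q.items()}


def yager(m1: dict, m2: dict) -> dict:
    """Move the conflict k = q(empty) to the universal set Theta."""
    q = conjunctive_sum(m1, m2)
    k = q.pop(EMPTY, 0.0)
    q[THETA] = q.get(THETA, 0.0) + k
    return q


# Two hypothetical, partly conflicting sources (not clinical data).
m1 = {RP: 0.6, NON_RP: 0.1, THETA: 0.3}
m2 = {RP: 0.1, NON_RP: 0.7, THETA: 0.2}
m_ds, m_y = dempster(m1, m2), yager(m1, m2)
print("Dempster:", {tuple(sorted(c)): round(w, 3) for c, w in m_ds.items()})
print("Yager:   ", {tuple(sorted(c)): round(w, 3) for c, w in m_y.items()})
```

In this example the conflict is k = 0.43, so Yager's rule leaves almost half of the mass on Θ, while Dempster's rule spreads it proportionally over the remaining focal elements.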
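The unified rule produces a single m-function for a fixed β. It is defined as follows:
$$ m_{Uni}(C) = [1 + \beta k]\, q(C), \quad C \ne \Phi,\ \Theta; $$
$$ m_{Uni}(\Theta) = [1 + \beta k]\, q(\Theta) + [1 + \beta k - \beta]\, k; \qquad m_{Uni}(\Phi) = 0, $$
with $0 \le \beta \le 1/(1 - k - q(\Theta))$. Setting β = 0 recovers Yager's rule, and β = 1/(1 − k) recovers Dempster's rule. He and Hussaini define the optimal β such that the dissimilarity between the combined m-function and the individual m-functions is minimized. The dissimilarity is measured by the total distance, defined as the root mean square of the distances between the combined m-function and the individual m-functions $m_i$ ($i = 1, \ldots, n$):
$$ Dis(\beta) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} dis(m_{Uni}, m_i)^2 }, $$
where $dis(m_{Uni}, m_i)$ is Jousselme's distance measure:
$$ dis(m_{Uni}, m_i) = \sqrt{\tfrac{1}{2}\, (\mathbf{m}_{Uni} - \mathbf{m}_i)^{T} D\, (\mathbf{m}_{Uni} - \mathbf{m}_i)}, \qquad D(A, B) = \frac{|A \cap B|}{|A \cup B|}, $$
where $\mathbf{m}$ (of dimension $2^N$, with $N = |\Theta|$) is the vector form of m. The total distance for all four doses can be rewritten as the objective function to be minimized with respect to β.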
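A minimal sketch of the optimization idea follows, under stated assumptions: the unified rule is taken in Inagaki's form, m(C) = (1 + βk) q(C) for C ≠ Φ, Θ and m(Θ) = (1 + βk) q(Θ) + (1 + βk − β) k, with 0 ≤ β ≤ 1/(1 − k − q(Θ)); the distance is Jousselme's distance with D(A, B) = |A ∩ B| / |A ∪ B|; and β is found by a plain grid search, which is a stand-in chosen here and not necessarily the optimizer used by the authors. All names and masses are hypothetical.

```python
import math
from itertools import product

RP = frozenset({"RP"})
NON_RP = frozenset({"non-RP"})
THETA = RP | NON_RP
EMPTY = frozenset()
SUBSETS = (RP, NON_RP, THETA)  # nonempty subsets of the frame


def conjunctive_sum(m1, m2):
    q = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        q[a & b] = q.get(a & b, 0.0) + wa * wb
    return q


def unified(q, beta):
    """Inagaki-type unified rule applied to a conjunctive sum q (assumed form)."""
    k = q.get(EMPTY, 0.0)
    scale = 1.0 + beta * k
    out = {c: scale * q.get(c, 0.0) for c in SUBSETS}
    out[THETA] += (scale - beta) * k
    return out


def jousselme(ma, mb):
    """Jousselme's distance with D(A, B) = |A & B| / |A | B|."""
    diff = [ma.get(s, 0.0) - mb.get(s, 0.0) for s in SUBSETS]
    total = 0.0
    for (i, a), (j, b) in product(enumerate(SUBSETS), repeat=2):
        total += diff[i] * diff[j] * len(a & b) / len(a | b)
    return math.sqrt(max(0.0, 0.5 * total))


def total_distance(beta, sources, q):
    """Root mean square of the distances from the fused m-function to each source."""
    fused = unified(q, beta)
    return math.sqrt(sum(jousselme(fused, m) ** 2 for m in sources) / len(sources))


def optimal_beta(sources, q, steps=1000):
    """Grid search over the admissible range of beta (illustrative stand-in)."""
    room = 1.0 - q.get(EMPTY, 0.0) - q.get(THETA, 0.0)
    if room <= 0.0:
        raise ValueError("no mass on proper subsets: beta range is unbounded")
    grid = [i / steps / room for i in range(steps + 1)]
    return min(grid, key=lambda b: total_distance(b, sources, q))


# Two hypothetical sources (not clinical data).
m1 = {RP: 0.6, NON_RP: 0.1, THETA: 0.3}
m2 = {RP: 0.1, NON_RP: 0.7, THETA: 0.2}
q = conjunctive_sum(m1, m2)
beta_opt = optimal_beta([m1, m2], q)
m_uni = unified(q, beta_opt)
print(f"beta = {beta_opt:.3f}, Dis = {total_distance(beta_opt, [m1, m2], q):.4f}")
```

For more than two sources, the conjunctive sum would be accumulated over all of them before applying the unified rule; the grid resolution `steps` trades accuracy for speed.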

Appendix B. The Constructed m-functions at Four Different MLDs From Different Institutes.
We construct all the m-functions from the data for a given MLD from all four institutions (see Table B.1, where each cell contains the belief masses assigned to {RP}, {non-RP} and Θ, respectively).
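One plausible way to turn an incidence range into the three belief masses listed in each cell is sketched below. The construction itself is not spelled out in this section, so the mapping used here (lower bound to {RP}, one minus the upper bound to {non-RP}, and the interval width to Θ) is an assumption, as are the function name and the example bounds.

```python
def m_from_interval(lower: float, upper: float) -> dict:
    """Assumed mapping from an incidence range [lower, upper] to belief masses."""
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")
    return {
        "RP": lower,             # evidence that definitely supports RP
        "non-RP": 1.0 - upper,   # evidence that definitely supports non-RP
        "Theta": upper - lower,  # mass left uncommitted (incomplete knowledge)
    }


# Hypothetical incidence range, not an entry of Table B.1.
cell = m_from_interval(0.05, 0.20)
print(cell)
```

Under this mapping the belief and plausibility of {RP} coincide with the lower and upper ends of the incidence range.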