Using stable distributions to characterize proton pencil beams

Purpose: To introduce and evaluate the use of stable distributions as a methodology to quantify the behavior of proton pencil beams in a medium.

Methods: The proton pencil beams of a clinically commissioned proton treatment facility are replicated in a Monte Carlo simulation system (FLUKA). For each available energy, the beam is characterized by its dose deposition in a water medium. Using a stable distribution methodology, each beam with a nominal energy E is characterized by the lateral spread at depth z, S(z; α, γ, E), and a total energy deposition I_D(z, E). The parameter α describes the tailedness of the distribution, while γ scales the size of the function. Each beam can then be described completely by functions giving the variation of these parameters with depth.

Results: Quantitatively, the fits of the stable distributions are equivalent to those implemented in some standard treatment planning systems for all but the highest energies (i.e., 230 MeV/u), where the decrease in goodness of fit makes this methodology comparable to a double Gaussian approach. The introduction of restricted linear combinations of stable distributions also resolves that particular case. More importantly, the meta-parameterization (i.e., describing the dose deposition by providing only the fitted parameters) allows interpolation of nonmeasured data. In the case of the clinical commissioning data used in this paper, commissioning only one out of five nominal energies was sufficient to obtain a viable dataset valid for all energies. An additional parameter β makes it possible to describe asymmetric beam profiles as well.

Conclusions: Stable distributions are intrinsically suited to describe proton pencil beams in a medium and provide a tool to quantify the propagation of proton beams in a medium.
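As an illustration of the meta-parameterization described above, the following sketch shows how a lateral dose profile at a given depth could be reconstructed from the fitted quantities, assuming SciPy's levy_stable as the stable density. The parameter functions alpha_of_z, gamma_of_z and id_of_z are hypothetical placeholders standing in for the depth-dependent fits; they are not part of the commissioning data.

```python
import numpy as np
from scipy.stats import levy_stable

def lateral_dose(x, z, alpha_of_z, gamma_of_z, id_of_z):
    """Lateral dose profile at depth z: a symmetric stable density in the
    lateral coordinate x, scaled by the integrated dose at that depth."""
    alpha = alpha_of_z(z)   # tailedness parameter alpha(z)
    gamma = gamma_of_z(z)   # lateral spread (scale) gamma(z)
    i_d = id_of_z(z)        # integrated dose I_D(z, E)
    # beta = 0 (symmetric profile), loc = 0 (beam centred on the axis)
    return i_d * levy_stable.pdf(x, alpha, 0.0, loc=0.0, scale=gamma)

# Purely illustrative parameter curves standing in for the fitted functions.
x = np.linspace(-30.0, 30.0, 301)   # lateral position [mm]
profile = lateral_dose(
    x, z=50.0,
    alpha_of_z=lambda z: 1.6,             # constant tailedness, for illustration
    gamma_of_z=lambda z: 2.0 + 0.02 * z,  # spread growing with depth
    id_of_z=lambda z: 1.0,                # normalised integrated dose
)
```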


Supporting Material I
The data measured at CNAO were kindly provided by Andrea Fontana from the INFN at Pavia, Italy. We performed fits for all available energies. In contrast with the simulation data, the measurement data were sometimes slightly shifted from the central axis. Rather than manually shifting the data, we accounted for this by including an extra location parameter δ in the stable distribution fit.
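A minimal sketch of such a fit, assuming SciPy is used; the profile function, variable names, starting values and bounds are illustrative only, with the off-axis shift entering as the location parameter delta. In practice the synthetic arrays would be replaced by the measured lateral positions and doses.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import levy_stable

def stable_profile(x, amplitude, alpha, gamma, delta):
    """Symmetric stable lateral profile with an off-axis shift delta."""
    return amplitude * levy_stable.pdf(x, alpha, 0.0, loc=delta, scale=gamma)

# Placeholder arrays; in practice these hold the measured positions and doses.
x_mm = np.linspace(-25.0, 25.0, 101)
dose = stable_profile(x_mm, 1.0, 1.7, 3.0, 0.4)   # synthetic profile for the example

p0 = [dose.max(), 1.8, 2.0, 0.0]                  # initial guesses: amplitude, alpha, gamma, delta
bounds = ([0.0, 0.5, 0.1, -5.0], [np.inf, 2.0, 20.0, 5.0])  # keep alpha within the stable range
popt, _ = curve_fit(stable_profile, x_mm, dose, p0=p0, bounds=bounds)
amplitude, alpha, gamma, delta = popt
```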

Supporting Material III
In this appendix we outline the notion of stable distributions, provide some definitions, and show that the characteristic function representation parameterizing the quantities of interest (α and γ) indeed represents all stable distributions. The text is extensively based on the treatise by Uchaikin and Zolotarev and is provided as a synthesis and guideline rather than an original scientific contribution; the original work is much more extensive and dense [1].

Defining stable distributions
We start out by quoting the law of large numbers, which states that the estimated mean of a sample from a random variable tends to the mean of the distribution when enough samples are taken. It is best known in the form proposed by Bernoulli in the 18th century.

Theorem 1 (law of large numbers, Jacob Bernoulli). Let X_1, X_2, ..., X_n be independent, identically distributed random variables taking the values 0 and 1 with P(X_i = 1) = p, and let

\[ \mu_n = \frac{1}{n} \sum_{i=1}^{n} X_i . \]

Then for any ε > 0,

\[ P\left( |\mu_n - p| > \varepsilon \right) \to 0 , \qquad n \to \infty , \]

i.e., the difference between the sample mean and p converges to zero in probability as n → ∞, which provides the reformulation of Bernoulli's law of large numbers:

\[ \mu_n = p + \varepsilon_n , \qquad n \ge 1 , \]

with ε_n converging to zero in probability.
A more sophisticated approach considers the random variables as functions X_i(ω) on the interval [0, 1], where ω is an instantiation of the experiment yielding 0 or 1. In that case the weak law of large numbers can be replaced by a stronger, almost-sure version:

\[ P\left( \lim_{n \to \infty} \mu_n = p \right) = 1 . \]

The distribution of µ_n thus becomes degenerate (a point mass at p) when infinitely many samples are taken, but more importantly, before reaching the degenerate condition the properly normalized sum tends to the normal distribution, which is the classical form of the central limit theorem.
Theorem 2 (central limit theorem, Moivre-Laplace). Let X_1, X_2, ... be independent, identically distributed random variables with mean µ and variance σ² < ∞. Then, as n → ∞,

\[ P\left( \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma \sqrt{n}} < x \right) \to \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt . \]

We also provide the notion of equivalent random variables X and Y, which share the same distribution function (equality in distribution):

\[ X \overset{d}{=} Y \;\Longleftrightarrow\; F_X(x) = F_Y(x) , \qquad (6) \]

as well as the notion of similar random variables, which provides the possibility of introducing a linear transformation of the given random variable:

\[ Y \overset{d}{=} aX + b , \qquad a > 0 . \qquad (7) \]
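As a concrete special case linking Theorems 1 and 2, applying the theorem to the Bernoulli variables of Theorem 1 (so that µ = p and σ² = p(1 − p)) gives the historical de Moivre-Laplace form:

```latex
\[
  P\!\left( \frac{\sum_{i=1}^{n} X_i - np}{\sqrt{n\,p(1-p)}} < x \right)
  \;\longrightarrow\;
  \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2}\, dt ,
  \qquad n \to \infty .
\]
```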
Using expressions (6) and (7), it is clear that for similar random variables X and Y the probabilities on an infinitesimal interval dx must agree, so the densities satisfy

\[ p_Y(x) = \frac{1}{a}\, p_X\!\left( \frac{x - b}{a} \right) . \]

Therefore, the same applies to the distribution functions (the cumulatives of the density functions):

\[ F_Y(x) = F_X\!\left( \frac{x - b}{a} \right) . \]

For example, the standard normal density p_G(x; 0, 1) = (1/√(2π)) exp(−x²/2) provides the following expression for the general normal law:

\[ p_G(x; \mu, \sigma) = \frac{1}{\sigma}\, p_G\!\left( \frac{x - \mu}{\sigma}; 0, 1 \right) . \qquad (10) \]

An interesting property arises when, instead of looking at the variables themselves, we investigate how sums of these variables (summands) behave. If we have two independent centred normal variables Y_1 and Y_2 with standard deviations σ_1 and σ_2, then it is easy to see, using expression (10), that

\[ Y_1 + Y_2 \overset{d}{=} \sqrt{\sigma_1^2 + \sigma_2^2}\; Y , \]

with Y a standard normal variable. By setting σ_1 = σ_2 = 1 and taking an arbitrary number of summands n, we obtain a well-known result:

\[ Y_1 + Y_2 + \dots + Y_n \overset{d}{=} \sqrt{n}\; Y , \]

or, more interestingly, expressed as

\[ Y \overset{d}{=} n^{-1/2} \left( Y_1 + Y_2 + \dots + Y_n \right) . \]

It is the generalization of this property that leads to the notion of stable distributions, by allowing arbitrary scaling and centring constants a and b.
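As a concrete instance of the property being generalized, taking two centred normal variables with standard deviations σ_1 = 3 and σ_2 = 4, the sum rule above gives:

```latex
\[
  Y_1 + Y_2 \,\overset{d}{=}\, \sqrt{3^2 + 4^2}\; Y = 5\, Y .
\]
```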
Definition 1. A random variable Y is stable if and only if, for any arbitrary constants b′ and b″, there exist constants a and b such that

\[ b' Y_1 + b'' Y_2 \overset{d}{=} a Y + b , \]

where Y_1 and Y_2 are independent copies of Y. This then leads to the stable law in the same form as the central limit theorem, as stated in the theorem below.
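Before stating it, note that the Cauchy distribution already gives a concrete non-Gaussian instance of Definition 1: for independent standard Cauchy variables Y_1 and Y_2 and any constants b′, b″ > 0,

```latex
\[
  b' Y_1 + b'' Y_2 \,\overset{d}{=}\, (b' + b'')\, Y ,
\]
% so the definition holds with a = b' + b'' and b = 0; this is the alpha = 1 case.
```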
Theorem 3 (Lévy). Let X_1, X_2, ... be independent, identically distributed random variables, and let there exist constants b_n > 0 and a_n such that

\[ P\left( \frac{X_1 + X_2 + \dots + X_n}{b_n} - a_n < x \right) \to G(x) , \qquad n \to \infty , \qquad (15) \]

for some function G(x) which is not degenerate. Then G(x) follows the stable law.
Here P denotes the probability that the normalized sum is smaller than the value x, and G(x) is the cumulative distribution function of the limiting distribution.
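As a consistency check of Theorem 3, in the finite-variance case the norming constants can be chosen as in the central limit theorem, and the limiting law G is then the normal (α = 2) member of the stable family:

```latex
\[
  b_n = \sigma \sqrt{n}, \qquad a_n = \frac{n\mu}{\sigma\sqrt{n}}
  \quad\Longrightarrow\quad
  P\!\left( \frac{X_1 + \dots + X_n}{b_n} - a_n < x \right)
  = P\!\left( \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} < x \right)
  \;\longrightarrow\; \Phi(x) .
\]
```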

α, β representation
In this section, we show how to move from the expression in (15) to the parameterization we have been using, denoting the type of stable distribution by the single parameter α. Before we move on, we narrow the definition of stable distributions to that of strictly stable distributions by setting a_n = 0, thus:

\[ S_n \equiv X_1 + X_2 + \dots + X_n \overset{d}{=} b_n Y , \qquad (16) \]

defining S_n as the sum of n independent copies X_i of the strictly stable variable Y. By calculating the variance of both sides of (16) we obtain

\[ \mathrm{var}(S_n) = n\, \mathrm{var}(Y) = b_n^2\, \mathrm{var}(Y) . \]

If 0 < var(Y) < ∞, then there is only one possibility,

\[ b_n = \sqrt{n} , \]

which reverts to the result obtained for the normal distribution.
Having introduced summands, we can now use them to extend these properties to the general case of sums of strictly stable random variables.
We now choose to limit ourselves to sums with 2^k terms:

\[ S_{2^k} = (X_1 + X_2) + (X_3 + X_4) + \dots + (X_{2^k - 1} + X_{2^k}) . \qquad (20) \]

Keeping in mind that X_1 + X_2 \overset{d}{=} X_3 + X_4 \overset{d}{=} b_2 Y, which we can generalize to any pair, we can rewrite the expression in (20) to read

\[ S_{2^k} \overset{d}{=} b_2 \left( Y_1 + Y_2 + \dots + Y_{2^{k-1}} \right) \overset{d}{=} b_2^{\,2} \left( Y_1 + \dots + Y_{2^{k-2}} \right) \overset{d}{=} \dots , \]

or, in much shorter notation,

\[ S_{2^k} \overset{d}{=} b_2^{\,k}\, Y . \]

Using the expression (16) defining strictly stable random variables, S_n \overset{d}{=} b_n Y with n = 2^k, we obtain b_n = b_2^{\,k}, transforming into

\[ \ln b_n = \frac{\ln n}{\ln 2} \ln b_2 = \ln n^{(\ln b_2)/\ln 2} , \qquad (24) \]

or

\[ b_n = n^{(\ln b_2)/\ln 2} = n^{1/\alpha_2} , \qquad \alpha_2 \equiv \frac{\ln 2}{\ln b_2} . \]

We can now repeat this process with sums of 3^k, 4^k and, by induction, m^k terms, yielding α_3, α_4, and α_m, bunching respectively 3, 4 and m summands at a time. In general the following expression is valid for arbitrary values of m:

\[ b_{m^k} = m^{k/\alpha_m} , \qquad \alpha_m \equiv \frac{\ln m}{\ln b_m} . \qquad (25) \]
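As a small worked example of this scaling law, take n = 8 = 2³ summands: grouping them pairwise three times gives

```latex
\[
  S_8 \,\overset{d}{=}\, b_2^{\,3}\, Y ,
  \qquad
  b_8 = b_2^{\,3} = b_2^{\,\ln 8 / \ln 2} = 8^{\,(\ln b_2)/\ln 2} = 8^{\,1/\alpha_2} .
\]
```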
If we now set m = 4, we get α_4 = (ln 4)/ln b_4. On the other hand, selecting m = 2 and k = 2 in expression (25) yields b_4 = 4^{1/α_2}. From these last two formulae we conclude that α_2 = α_4. By induction we see that there is a single α for all these stable distributions, and that the scaling factors b_n follow the law

\[ b_n = n^{1/\alpha} . \]

To proceed further we use the characteristic function (c.f.) of a random variable X, f_X(k) = E e^{ikX}. The principal advantage of using the c.f. is that the c.f. of a sum of independent random variables is equal to the product of the c.f.s of the summands:

\[ f_{X_1 + X_2}(k) = f_{X_1}(k)\, f_{X_2}(k) . \]

If we take the logarithm (obtaining the second characteristic, ψ_X(k) = ln f_X(k)), we find that

\[ \psi_{X_1 + X_2}(k) = \psi_{X_1}(k) + \psi_{X_2}(k) . \qquad (30) \]

This is important, as it allows us to assess the summation of a large number of independent random variables without evaluating multiple integrals. For this reason we cite the continuity theorem.

Theorem 5 (continuity theorem). Let f_n(k), n = 1, 2, ..., be a sequence of c.f.s and let F_n(x) be the sequence of corresponding distribution functions. If f_n(k) → f(k) as n → ∞ for all k, and f(k) is continuous at k = 0, then f(k) is the c.f. of a cumulative distribution function F(x), and the sequence F_n(x) weakly converges to F(x), F_n ⇒ F. The inverse is also true: if F_n ⇒ F and F is a distribution function, then f_n(k) → f(k) as n → ∞.

To gain some insight into how to proceed, we can look at two well-known stable distributions: the normal distribution and the Cauchy distribution. The calculation of the characteristic function for these distributions is well known, and they also form part of the stable family; with q(x; α, β) denoting the stable density, we have

\[ \text{normal:} \quad q(x; 2, 0) = \frac{1}{2\sqrt{\pi}}\, e^{-x^2/4} , \qquad \text{Cauchy:} \quad q(x; 1, 0) = \frac{1}{\pi (1 + x^2)} . \]

The characteristic function, g(k; α, β) in this notation, can then be calculated in a straightforward manner and can be found in many textbooks:

\[ \text{normal:} \quad g(k; 2, 0) = e^{-k^2} , \qquad \text{Cauchy:} \quad g(k; 1, 0) = e^{-|k|} . \]

Note that the traditional form of the density function for the normal distribution is slightly different,

\[ p_G(x; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} , \]

i.e., q(x; 2, 0) is a normal density with variance 2 rather than 1.

This observation allows us to generalize to arbitrary symmetric stable distributions. Let Y be such a distribution with an arbitrary parameter α. We recall that

\[ Y_1 + Y_2 + \dots + Y_n \overset{d}{=} n^{1/\alpha}\, Y . \]

Keeping in mind the property elucidated in equation (30), and using the expression for the second characteristic, this becomes

\[ n\, \psi_Y(k) = \psi_Y\!\left( n^{1/\alpha} k \right) . \]

The results obtained earlier when investigating the normal and the Cauchy distributions lead us to propose a solution of the form

\[ \psi_Y(k) = -c\, k^{\mu} , \qquad k > 0 . \]

This satisfies the previous expression with µ = α and an arbitrary complex-valued c, which we choose to be

\[ c = \lambda (1 - i c_1) , \qquad \lambda,\, c_1 \ \text{real numbers} . \qquad (37) \]

To find ψ_Y(k) for negative arguments we use property (3) of the characteristic function, the link between the complex conjugate of the characteristic function and negative arguments, f_Y(−k) = \overline{f_Y(k)}, which yields the unified expression

\[ \psi_Y(k) = \lambda \left( -|k|^{\alpha} + i k\, \omega(k) \right) , \qquad \text{with} \quad \omega(k) = c_1 |k|^{\alpha - 1} . \]

This is a trick to rewrite the imaginary part as k × |k|^{α−1} ∝ |k|^α, in such a way that the expression is explicitly split into a real and an imaginary part by a good choice of the constant c. We have not yet specified what form the function ω(k) takes; we do know that it will depend on the parameter α and provide a measure of the asymmetry of the distribution, should that be present. Later we will attribute that to a parameter β. The constant λ is an arbitrary positive real number and can serve as a scaling factor, which can be renormalised to 1 without loss of generality.
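The identification of the α = 2 and α = 1 members with the (variance-2) normal and the standard Cauchy densities can be checked numerically; the sketch below assumes SciPy's levy_stable, whose symmetric (β = 0) parameterization matches the characteristic function exp(−|k|^α) used here.

```python
import numpy as np
from scipy.stats import levy_stable, norm, cauchy

x = np.linspace(-5.0, 5.0, 11)

# alpha = 2: the stable density with c.f. exp(-k^2) is a normal density with
# variance 2 (sigma = sqrt(2)), not the unit-variance "traditional" normal.
q2 = levy_stable.pdf(x, 2.0, 0.0)
print(np.allclose(q2, norm.pdf(x, scale=np.sqrt(2.0)), atol=1e-5))

# alpha = 1, beta = 0: the stable density with c.f. exp(-|k|) is the standard Cauchy.
q1 = levy_stable.pdf(x, 1.0, 0.0)
print(np.allclose(q1, cauchy.pdf(x), atol=1e-5))
```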
This implies that the full expression of a stable distribution's characteristic function is of the form

\[ g(k; \alpha, \beta) = \exp\!\left( -|k|^{\alpha} + i k\, \omega(k; \alpha, \beta) \right) . \]

Taking into account that the characteristic function of a symmetric stable distribution is real-valued, due to property (3) of the characteristic functions,

\[ \omega(k; \alpha, \beta) = 0 , \qquad (41) \]

and the characteristic function of any symmetric stable distribution becomes

\[ g(k; \alpha, 0, \gamma) = \exp\!\left( -|\gamma k|^{\alpha} \right) , \]

with γ as a scaling factor.
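Inverting this characteristic function shows how γ acts purely as a scale on the density, which is what allows it to serve as the lateral-spread parameter in S(z; α, γ, E):

```latex
\[
  q(x; \alpha, 0, \gamma)
  = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, e^{-|\gamma k|^{\alpha}}\, dk
  = \frac{1}{\gamma}\, q\!\left( \frac{x}{\gamma};\, \alpha,\, 0 \right).
\]
```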
[1] Vladimir V. Uchaikin and Vladimir Zolotarev. Chance and Stability: Stable Distributions and their Applications, chapter Elementary introduction to the theory of stable laws, pages 35-64. De Gruyter, Berlin, Boston, 1999.