Physically constrained voxel‐based penalty adaptation for ultra‐fast IMRT planning

Conventional treatment planning in intensity‐modulated radiation therapy (IMRT) is a trial‐and‐error process that usually involves tedious tweaking of optimization parameters. Here, we present an algorithm that automates part of this process, in particular the adaptation of voxel‐based penalties within normal tissue. Thereby, the proposed algorithm explicitly considers a priori known physical limitations of photon irradiation. The efficacy of the developed algorithm is assessed during treatment planning studies comprising 16 prostate and 5 head and neck cases. We study the eradication of hot spots in the normal tissue, effects on target coverage and target conformity, as well as selected dose volume points for organs at risk. The potential of the proposed method to generate class solutions for the two indications is investigated. Run‐times of the algorithms are reported. Physically constrained voxel‐based penalty adaptation is an adequate means to automatically detect and eradicate hot‐spots during IMRT planning while maintaining target coverage and conformity. Negative effects on organs at risk are comparably small and restricted to lower doses. Using physically constrained voxel‐based penalty adaptation, it was possible to improve the generation of class solutions for both indications. Considering the reported run‐times of less than 20 s, physically constrained voxel‐based penalty adaptation has the potential to reduce the clinical workload during planning and automated treatment plan generation in the long run, facilitating adaptive radiation treatments. PACS number(s): 87.55.de

to the target. Thus, it can merely be approximated by optimizing the beam fluences with cost functions defining a trade-off between coverage of the target and exposure of critical structures.
The clinical quality of an optimized plan strongly depends on the formulation of the cost function and on the selection of the respective optimization parameters. The arguably most common formulation of the cost function F has been the penalized sum of piecewise squared dose deviations to the prescribed dose of the presegmented volumes of interest (VOIs), (2) given by (1) with (2) In the given formulation, the prescribed dose to the respective VOI, V, is replaced by the reference dose constraints d max and d min for which the positivity operator {•}+ ensures that only the appropriate deviations to the dose d i in voxel i contribute to F. The notation i ∈ V restricts the summation to voxels i that belong to VOI V. The squared deviations from Eq. (2) are weighted by corresponding penalty weighting factors p max and p min . For each VOI, the sum of penalized squared deviations is normalized by the corresponding number of voxels n V .
Minimization of a quadratic objective function results in a mathematically optimal result. However, the notion of quadratic deviation in itself and the inability to shape local dose distributions does not necessarily yield a clinically optimal dose distribution since the reference doses and penalty weighting factors are abstract and VOI-based constructs without direct clinical significance. p V min/max and d V min/max are thus indirect tools to shape a dose distribution according to the planner's wishes. Manually finding a suitable set of dose constraints and penalty weighting factors is a tedious and time-consuming process. The automation of this process is often tackled with computationally expensive multicriteria optimization. (3)(4)(5) In clinical practice, however, an automated and fast planning process is desirable to reduce the workload of the treatment personnel and enable novel adaptive treatment strategies.
Ziegenhein et al. (6) developed an ultra-fast optimization software module that is capable of reducing the optimization time for a single treatment plan from several minutes to a few seconds. In this paper, we take advantage of this ultra-fast optimization by incorporating a voxel-based penalty adaptation algorithm in our planning software to enhance and automate the decision making process of choosing adequate penalties p V min/max . Several algorithms that adapt voxel-based parameters like penalty and reference doses or directly manipulate the associated beamlet weights have been proposed since 2000. (5,(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17) Recently Zarepisheh et al. (18) explained the mathematical foundation of these algorithms. For the purpose of this paper, we identify two different classes of adaptation strategies. Algorithms of the first class (5,9,10,13) focus on the decision-making problem and thus aim for an optimization of the importance factors, subject to additional objectives than the actual objective function used during inverse planning. The second class of adaptation algorithms (7,8,11,12) aims for a fast generation of an acceptable plan by adapting the penalties of a previous nonacceptable plan heuristically. This possibly enables class solutions, which is a set of optimization parameters suitable for comparable cases of the same indication, and online adaptive treatment planning, or provides the planner with more interactive possibilities and freedom.
In this study we present ΦWA, a novel voxel-based penalty adaptation strategy of the second class. The acronym stands for 'Physical Weight Assignment' and is an extension of the previously proposed 'Dynamic Importance Weight Assignment' (DIWA). (7) Like DIWA, ΦWA uses a heuristic approach to calculate voxel-based penalty distributions based on the resulting dose distribution of a conventionally optimized treatment plan. ΦWA features a reproducible adaptation strategy needing a few iterations. It extends DIWA to explicitly account for the physical characteristics of photon beams within the patient. We evaluate the adaptation strategy with a planning study on 16 prostate and 5 head and neck cancer patients.
The remainder of this paper is organized as follows: Section II introduces the mathematical and physical concepts used by ΦWA, the computational implementation and the layout of the treatment planning study utilized to evaluate the efficacy of the developed methodology. The results are presented in Section III. Sections IV and V discuss and conclude this paper.

A.1 Initial optimization
The planning process of ΦWA begins with a conventional optimization of the objective function F (Eq. (1)) with respect to the beam fluences. These are represented by beamlets j with a corresponding weight w j , which generate the dose d i through a precalculated dose influence matrix D ij such that d i = Σ j D ij w j . The solution w* to the optimization problem is then given by (3) where n b is the number of beamlets and denotes the real positive orthant.

A.2 Modification of the objective function
The optimization is followed by the calculation of the new voxel-based penalty scale. The distribution of this voxel-based penalty scale ϕ i will be used for a reoptimization which generates an adapted plan based on a modified objective function (4) In general, the process of adaptation and reoptimization with can be repeated. However, few repetitions are desired to guarantee short run-times.

A.3 Heuristics for penalty adaptation
The underlying idea of ΦWA is to increase the penalty within the healthy tissue by the penalty scale ϕ i O such that the local objective function F i (d i ) at a voxel i with dose d i of the last optimization matches its hypothetical value for a desired dose threshold d T . We label this objectivebased strategy by (5) where α is an additional scaling power. Hence, our method comprises two independent parameters d T and α to define the threshold and the relative strength of the applied scaling. By default, the parameter α equals one.
With the new calculated penalty scale distribution ϕ i O , high-dose areas within the healthy tissue of the previous optimization get a locally higher penalty factor used for the reoptimization of F. d T defines the threshold above which dose regions are considered as undesirably overdosed.
This elevation of penalties does, however, interfere with the target irradiation by decreasing coverage and homogeneity, which can be seen in the dose-volume histograms (DVHs) of previous publications (8,11) and in the Results section below. The reason for this lies in physical properties like penumbra, aperture size and beam set-up, which do not allow for infinitely steep dose gradients on the one hand, and an exact geometrical match of the fluence projection on the target shape on the other.
In medium and high dose regions the falloff of a photon beam in the lateral direction δ (i.e., the beam penumbra d P ) can be approximated reasonably well with an error function: (19,20) where d pres represents the prescribed dose to the target, and σ P is the penumbra width. The subtrahend 1.16309 results from a coverage condition 0.95d pres = d P (δ=0), meaning that the target should at least receive 95% of its prescribed dose.
Please note that σ P depends on the depth and field size. For our study, we considered a constant σ P = 3.2 mm as suggested by van Herk et al., (19) which is also consistent with measurements. (21) Figure 1 illustrates the issue of compromised target coverage by solving the inverse planning problem for an error function beam penumbra in a one-dimensional case. Apparently, naive scaling of the penalties close to the target volume yields an incentive for the optimizer to reduce the dose outside of the target volume by compromising coverage inside. We conclude that there is a penumbra region where an additional adaptation of the penalties must be prohibited by the algorithm, which we represent by an allowed penumbra distance δ P from the target by inverting Eq. (6):  Beam penumbra (Eq. (6)) at the target border (dashed red) using constant and VOI-based penalties of p NT max = 1 in the normal tissue and p T min = 500 (dotted blue) for the target and reoptimized beam penumbra (solid red) using adapted penalties (solid blue) according to Eq. (5). The dose threshold is shown in solid green. The increased penalty in the penumbra area pushed the penumbra profile towards the target and thus reduces coverage.
Furthermore, due to the finite beamlet size and the finite number of beams, the penumbra of a beam does not always "fit" to the geometry at the target border and is in general superimposed by the projected photon depth doses and penumbras of the remaining beams. To give the optimizer enough freedom within these areas, we incorporated an additional region Δδ around the prohibited penumbra region in which the adaptation is gradually released to full strength.
This suppression of the penalty adaptation is realized by extending Eq. (5) with a subtrahend containing a term S i (d i ,δ i ) decreasing with the distance δ i of voxel i to the target: (8) With the condition 0 ≤ S i ≤ 1, the new subtrahend decreases the penalty scale to 1 (meaning no scaling) as we approach full suppression at small distances (S i = 1), and has no effect if we have no suppression at large distances (S i = 0).
We propose a suppression factor that decreases quadratically with distance δ i where we can quantify the extent of the suppression region by a maximal distance Δδ: S ′ is, however, only applicable for d i < d pres , since otherwise evaluation of Eq. (7) is not possible due to the restricted domain of the inverted complementary error function. Since doses d i ≥ d pres within the healthy tissue appear infrequently, we chose to use an empirically determined correction factor for this domain that continues the course of the error function qualitatively and artificially shortens the allowed region Δδ. For doses 0.95d pres < d i < d pres , where δ P (d i ) would be negative, we set δ P = 0. The full suppression factor S i (d i ,δ i ) then reads (10) For high doses D i where Eq. (10) would yield S i < 0, we apply the above introduced condition that 0 ≤ S i ≤ 1 and consequently set S i = 0.
The final S(d,δ) and its dependence on dose and distance is visualized in Fig. 2. This physical suppression with S i is the central part of ΦWA.

B. Implementation
ΦWA has been developed for the research treatment planning software module μKonRad (6) and can additionally be accessed through the research treatment planning system DynaPlan (22) via a graphical user interface (GUI).

B.1 Optimization module
μKonRad utilizes a highly speed-optimized L-BFGS implementation. (6,23) The incorporation of a penalty scale ϕ i into μKonRad implies only one additional floating point operation within the objective function evaluation during the optimization. Consequently, our algorithm does not yield a substantial increase of the run-time of the optimization module. Consumption of memory however, is increased due to the added penalty scale cube and necessary calculation of additional distance transforms.
To calculate the distance to the target needed for Eqs. (9) and (10), a 3D extension of the Chamfer distance algorithm (CDA) (24,25) was implemented. It is applied to the target borders with a pseudo-Euclidean distance transform that assigns the distances <3,4,5> to the adjacent voxels (nearest to most far), which Borgefors (25) identified as the best integer approximation compared to the true Euclidean distance.
Since the physically allowed dose levels depend on the prescribed dose to the target, we group the targets by their prescribed dose. To save memory, we do not calculate the distance transform for each VOI, but only for the respective group of targets which are needed for the current penalty adaptation. The algorithm then chooses the adequate δ P to use in Eqs. (9) and (10) by selecting the smallest difference δ i -δ P out of all prescription groups.

B.2 Graphical user interface
Kamerling and colleagues (22) developed a new interactive treatment planning software which also serves as front-end for μKonRad with graphical user interface. This enables evaluation and manipulation of the parameters like α, d T , and Δδ not only before a planning process, but also in-between single optimization and penalty adaptation runs. It is furthermore possible to access the three-dimensional penalty scale cube, providing the possibility to interactively manipulate ϕ i (see Section IV).

C. Planning study
A planning study was performed with 16 prostate cancer patients and 5 more complex head and neck cases. The base data have been provided by UMC Utrecht, and the dose influence matrices D ij have been calculated with a magnetic field of 1.5 T for adaptive planning with an MR-LINAC. (26) In the prostate (head and neck) cases, setups of seven coplanar 6 MV beams at 0°, 50°, 100° (80°), 155°, 200° (205°), 260° (280°), and 310° have been used, with bixel sizes of 5 mm. More details on the patient data can be found in Appendix A.
In this study we only applied ΦWA to the body contour. Local penalty adaptation within organs at risk (OARs) is not considered. The reason for this decision was, on the one hand, to measure independently how OARs and targets are affected by changes of the local penalties within the body contour. On the other hand, the adaptation parameters can be transferred more easily from one treatment site to another, since OAR evaluation criteria may vary widely.
For the prostate patients, a detailed parameter study has been performed to analyze the behavior of plan quality depending on the scaling power α, the desired dose threshold d T , and the allowed distance Δδ for the respective scaling strategies. The plan quality was evaluated through common quality indicators (QI), including the conformity index (CI) (27) and target coverage (CO) (11) where V T is the target volume, V 95 the overall volume that receives at least 95% of the prescribed target dose, and V T,95 is the corresponding subvolume within the target. Furthermore, dosevolume indicators as proposed by QUANTEC, (28) as well as DVHs and the visual inspection of the dose distribution, are evaluated.
Additionally, the adaptation procedures were applied to the prostate and head and neck cases with a predefined class solution optimization constraint set p V min/max , d V min/max which was the same for all cases of a treatment site, respectively. In this way, we evaluated the capability of the strategies to adapt plans without a plan-specific, fine-tuned constraint set.

A.1 Reoptimization with ϕ O
We illustrate the dose-dependent heuristic penalty adaptation ϕ O with an example of a dose distribution in Fig. 3. It was efficiently cleared of hot-spots in the normal tissue, but the penalty adaptation algorithm also selects a very high ϕ i for voxels in the immediate target neighborhood, thus reducing target coverage CO from 95% to 89% and conformity CI from 0.91 to 0.89. Additionally, in the adapted plan, smoother fluence maps can be observed.
The loss of coverage observed in Fig. 1 and in an example case in Fig. 3 is substantiated statistically by the parameter study. For the preparation we planned every patient with an individual constraint set (p V min/max , d V min/max ) to provide a comparable starting situation for all cases. The primary goal of the parameter study was to determine negative effects on the plan quality and to learn which parameter combinations are feasible to minimize these effects. Figure 4 shows the mean absolute change of coverage (CO) and the conformity index (CI) for penalty adaptation with ϕ i O (Eq. (5)). The plots show that certain sets of combinations produce comparable effects on the respective quality indicator and the existence of an isocurve beyond which the indicators start to drop rapidly. Also we can derive that a decrease in conformity can mainly be attributed to the loss of target coverage, as explained in the Materials & Methods section A.3 and illustrated in Fig. 3.
As the DVH in Fig. 3 shows, low-dose regions within the OARs are slightly increased, while the exposure to high doses is decreased by the penalty adaptation. We could also observe this behavior within the parameter study.

A.2 Physical adaptation
The dependence of ΔCI and ΔCO on the allowed distance Δδ is shown in Fig. 5 for different parameter settings. A stable behavior of ΔCI and ΔCO can be observed for Δδ ≥ 1 cm. We also want to point out the increase in conformity for some regions. Figure 6 shows the dose before and after application of ϕ S . ΦWA does actually decrease the overall impact on plan quality compared to adaptation with ϕ O , since our physical constraining only suppresses the naive adaptation in the respective regions, while still efficiently eliminating undesired high-dose regions. Furthermore, the improved smoothness of the fluence maps due to adaptation is at least preserved or even increased. Choosing a higher Δδ does decrease the  impact of ΦWA further, but a too large tolerance region Δδ will lead to almost no effect when adapting plans with hot-spots near the target. Thus, Δδ should be chosen to be only large enough to allow the necessary dose falloff near the target. From the stability analysis of the QIs in the parameter study, we suggest a value of around Δδ approximately equal to 10 mm to 15 mm.

B.1 Prostate cases
Based on the parameter study, we decided that for the prostate cases a parameter set of α = 1.0, d T = 35 Gy, and Δδ = 12 mm provides a reoptimized plan with small enough negative effects on the evaluated quality indicators. This combination of α and d T is chosen since it lies in a region in Fig. 4 that indicates impact on quality indicators while still providing sufficient stability. While we applied individual penalties p V min/max and reference doses d V min/max for the parameter study in section A above, we now evaluate the potential of ΦWA to generate class solutions for different indications. Therefore we apply the same set of VOI-dependent parameters, as shown in Appendix B (Table B.1.), for every patient case. Figure 7 shows the changes in CI and CO for all prostate cases while comparing ϕ O with ϕ S .
We can again observe that ΦWA efficiently reduces the impact on coverage and subsequently conformity. Since ΔCI > ΔCO for all cases (and sometimes even ΔCI > 0), we can derive a decrease of high-dose regions outside the PTV, indicating successful eliminations of hot-spots analogous to the dose distributions shown in Figs. 3 and 6.

B.2 Head and neck cases
For the five head and neck cancer patients, we again used an empirically determined classsolution constraint set for the initial optimization (see Appendix B, Table B.2.). The head and neck cases represent more complex planning scenarios. While the planning problem for the prostate case is mainly restricted to rectum, bladder, and target volumes, the anatomy in the head and neck region involves more clinical relevant structures. Multiple target volumes with different prescriptions additionally complicate the situation and all of the initial plans showed undesired hot spots within the healthy tissue. For adaptation with ϕ S , we again use α = 1.0, d T = 35 Gy, Δδ = 12 mm to undermine the aforementioned transferability of the method in between treatment sites.
The head and neck cases differ in the quantity of irradiated target volumes and their prescribed dose; thus a quantification and interpretation of target conformity through the CI is more difficult, since multiple target structures contribute to the high-dose tissue and are thus influencing the CI of a single target. Additionally, several target volumes are delineated. Thus, we only present the results by example dose distributions and the corresponding DVHs of the first case in Fig. 8. Complete QI changes can be found in Appendix B, Table B.3. Table 1 shows the progression of the mean squared deviation of the penalty scale distributions of successive adaptation and reoptimization procedures.

B.3 Convergence characteristics and run-times
Within two runs, the mean squared deviation decreases by two orders of magnitude indicating stability of the scaling distribution after run two. With the run-times presented in Table 2, we are able to produce an adapted plan within a few seconds.

IV. DISCUSSION
The previous sections documented the implementation and evaluation of a physically constrained voxel-based weight adaptation algorithm ΦWA. The algorithm works stepwise by reoptimizing previously calculated dose distributions with new penalty distributions. Our penalty adaptation is based on an approach that scales the original penalties based on the objective function value, and therefore the dose of the respective voxel. We found that the adaptation of penalties within the healthy tissue by this method decreases the target coverage (and thus conformity) systematically since it also increases the local penalty near the target volumes. ΦWA, however, recognizes these areas as necessary to maintain target coverage and suppresses the originally calculated penalty scale ϕ i O for that voxel. The suppression strength is based on an approximation of the beam penumbra and on geometrical considerations represented by the allowed dose region Δδ. Within a parameter study we found that Δδ approximately equal to 12 mm yields good target coverage preservation for photon beams, which can be related to the bixel size of 5 mm 2 plus the finite number of beams and beamlets, limiting the geometrical freedom to sculpt dose to the target. Also the effect on OAR dose was found to be lower in general for the physically constrained strategy ϕ i S compared to the naive approach ϕ O . This can be attributed to the suppressing nature of ϕ S over ϕ O .
We want to point out that our physical and geometrical considerations are only one part of the full picture. More physical constraints and additional a priori knowledge could be considered; for example, necessary entrance dose due to the photon depth-dose curve could be integrated.
We applied ΦWA to generically optimized prostate and head and neck cases to evaluate the feasibility of the adaptation for class solution planning. For both treatment sites, we achieved preservation of the target coverage and elimination of undesired hot-spots within the normal tissue. We did not alter the parameter set of α, d T , and Δδ, suggesting a good transferability of the adaptation strategy within treatment sites with dose prescriptions of similar levels. We could, however, observe that the effect on coverage becomes larger with increasing plan complexity, as CO can drop more than one or two percentage points for few individual targets of the head and neck cases. Also the effects on OAR dose are not significantly smaller for ϕ S over ϕ O (compare Fig. 8(d)). Still, we want to point out that, generally, the lower dose volumes increase, while the exposure to higher doses is slightly reduced.
Although we did not consider sequencing of the fluence maps into discrete multicollimator leaf segments in our study, the gained fluence maps showed qualitatively increased smoothness, which would facilitate the approximation of the fluence with a discrete number of apertures.
The results after repeated ΦWA adaptation become stable after only one or two adaptations and reoptimizations, with already the first reoptimization giving a practicable result. This leads to total run-times of below 10 s for the prostate patients. Given the larger base data of the head and neck patients (compare Appendix A, Table A.1.) the run-time increases, but stays below Table 2. Run-times in seconds of the ΦWA planning sequence with one reoptimization for a prostate and a head and neck case averaged over 20 respective optimizations with standard deviation, executed on a Windows 7 machine with 32GB of RAM and an Intel Core i7-2600 (3.4 GHz). We evaluated the individual durations of the initial optimization (init. w/ϕ), the calculation of the penalty scale distribution (ϕ calc.), the on-the-fly calculation of the distance transform (metric calc.), and the reoptimization with ϕ (reopt.). The last column is the sum of these and thus represents the total run-time of one plan calculation and adaptation. To monitor the performance loss due to the implementation of the additional penalty scale cube, the initial optimization was additionally performed without the ΦWA module enabled (init. w/o ϕ). The run-times can be related to the size of the base dataset via Appendix A 30 s. From a clinical perspective, this high degree of automatization, combined with short runtimes, helps to reduce the workload per patient and objectifies planning. In our study, we restricted the adaptation strategy solely on the body contour, excluding OARs and target volumes, to demonstrate the minimization of negative effects primarily on the targets by our physical adaptation. Developing physical and geometrical constraints for other VOI types is also possible. This is especially important for adapting plans generated with class solution constraint sets, since these are not optimal themselves, which can be seen in Appendix B, Tables B.3. and B.4. While our work aimed at efficient and fast reshaping of dose subject to elimination of undesired local artifacts while minimizing the effect on significant quality indicators, purposeful increase of coverage and simultaneous decrease of OAR exposure is of interest for enabling class solutions for the clinic. On the other hand, this could also be achieved by optimizing the class solution constraint sets on VOI-basis beforehand.
Also, our adaptation strategy can be transferred to the optimization of other objectives, for example in direct aperture optimization or the optimization of dose-volume measures in OARs or targets, if they are implemented as penalized soft-constraints. Since dose-volume measures are often implemented as penalized dose deviations as well, which are connected to a volume criterion via step functions, (29) applying a comparable adaptation routine to the respective penalties in the affected voxels could be the next step in exploiting this method to achieve a similar effect on suppressing high-dose regions in OARs or targets.
The fast and three-dimensional GUI of DynaPlan enables the arbitrary manipulation of the penalty scale cube. This offers an extension of ΦWA with an intuitive interactive penalty adaptation method. For example, three-dimensional brushes could be used to "paint" the respective adapted penalty scale distribution. First usage with Gaussian-like brushes showed, however, that the free manipulation of the dose cube requires a lot more experience and intuition than working with globally calculated penalty distributions. The interactive method can, however, be used to fine-tune penalty distributions fast and intuitively which were originally calculated with ΦWA.

V. CONCLUSIONS
Voxel-based penalty adaptation allows for exploration of a larger solution set than planning with VOI-based objectives. The vast amount of possibilities to design a voxel-based penalty distribution requires an automated generation. We presented an algorithm to constrain the penalty adaptation to the a priori known physical and geometrical limits of photon irradiation. Our algorithm effectively limits the negative effects of voxel-based penalty adaptation regarding target coverage while still eliminating undesired high-dose regions. We demonstrated that our algorithm can be used to improve class solutions for individual patients. Combined with ultra-fast fluence optimization, this is a promising planning concept for adaptive online radiation therapy.

ACKNOWLEDGMENTS
This work was supported by the Cancer Research UK Programme Grant C33589/A19727.

COPYRIGHT
This work is licensed under a Creative Commons Attribution 3.0 Unported License.