On the complexity of helical tomotherapy treatment plans

Abstract
Purpose: Multiple metrics are proposed to characterize and compare the complexity of helical tomotherapy (HT) plans created for different treatment sites.
Methods: A cohort of 208 HT plans from head and neck (105), prostate (51) and brain (52) tumor sites was considered. For each plan, 14 complexity metrics were calculated. Those metrics evaluate the percentage of leaves with small opening times or with opening times approaching the projection duration, the percentage of closed leaves, the amount of tongue-and-groove effect, and the overall modulation of the planned sinogram. To enable data visualization, an approach based on principal component analysis was followed to reduce the dataset dimensionality. This allowed the calculation of a global plan complexity score. The correlation between plan complexity and pretreatment verification results was investigated using Spearman's rank correlation coefficients.
Results: According to the global score, the most complex plans were the head and neck tumor cases, followed by the prostate cases and the brain lesions irradiated with a stereotactic technique. For almost all individual metrics, head and neck plans were confirmed to be the plans with the highest complexity. Nevertheless, prostate cases had the highest percentage of leaves with an opening time approaching the projection duration, whereas the stereotactic brain plans had the highest percentage of closed leaves per projection. Significant correlations between some of the metrics and the pretreatment verification results were identified for the stereotactic brain group.
Conclusions: The proposed metrics and the global score proved useful to characterize and quantify the complexity of HT plans of different treatment sites. The reported inter- and intra-group differences may be valuable to guide the planning process, with the aim of reducing uncertainties and harmonizing planning strategies.

gantry ring. Both the rotation and translation speeds are constant throughout the treatment. Modulation of beam intensity is accomplished by a pneumatically driven binary multileaf collimator (MLC).
An arc-shaped detector array, mounted opposite to the linac, records the exit radiation signal, which can be used for patient positioning verification, plan deliverability evaluation or machine quality assurance (QA). 1 For treatment planning, parameters like the field width, the pitch and the initial modulation factor are manually set, while the open time of each MLC leaf per projection (51 projections per gantry rotation) is determined during the optimization phase. As many authors have reported, 2-5 a suboptimal choice of these parameters can compromise plan quality and deliverability, as well as increase treatment time. Therefore, several optimal values and planning approaches have been suggested.
For instance, Kissick et al. 4 proposed a rule to choose the pitch values and minimize the longitudinal ripple effect (thread effect) characteristic of HT plans. Shimizu et al. 6 presented a method to derive an initial modulation factor and a site-specific upper limit for this parameter to reduce the delivery time without compromising plan quality. Westerly et al., 2 using a subset of plans with unexpectedly poor pretreatment QA results, found that these plans had a high percentage of small leaf open times (LOT), the mean LOT being <100 ms. After replanning, the mean LOT became higher than 100 ms and the deviations between calculated and measured dose fell within ±3%. This may be explained by inaccuracies in the modeling of the MLC leaf latency in the treatment planning system (TPS), whose impact is higher for short leaf open times. MLC leaf latency and tongue-and-groove/penumbra effects have indeed been pointed out as factors that can affect plan deliverability. 2,7
More comprehensive studies for different treatment sites, including a wider set of TPS reported parameters, such as the couch travel, couch speed, number of gantry rotations, gantry period and treatment time, have been carried out. 8,9 Bresciani et al., 8 using 384 HT plans of multiple treatment sites, found no strong correlations between some of these factors and the results of pretreatment QA verification. Binny et al. 9 used multiple statistical process control methods on a set of head and neck (28), pelvic (19) and brain (23) plans to define lower and upper limits for planning parameters, like the modulation factor, gantry period, and couch speed, based on acceptable pretreatment QA results. The established ranges were specific to each treatment site and contributed to improving the treatment efficiency at their institution.
Given the numerous degrees of freedom existing in HT, plans created for the same site may have different degrees of complexity, which may not be fully characterized by the TPS reported parameters. The evaluation of radiotherapy plan complexity has been widely researched. Multiple metrics have been proposed for static-gantry IMRT and volumetric modulated arc therapy. 10-15 Complexity analysis has been shown to play a role in treatment plan characterization and comparison, contributing to adapting and improving the planning, optimization and QA processes. To date, a comprehensive evaluation of the complexity of helical tomotherapy plans, through the definition of new metrics and the extension of some existing ones, is lacking in the literature. Thus, this study aims to quantify, evaluate and compare the complexity of HT plans created for various treatment sites by calculating several metrics. These metrics include some commonly evaluated parameters and novel indices that assess different aspects of the HT plans which may directly or indirectly contribute to increased uncertainties in dose calculation and delivery. The potential effect of complexity on plan deliverability was also investigated.

2.A | Treatment plans and deliverability evaluation
A total of 208 plans from patients who underwent helical IMRT treatments at our institution were retrospectively analyzed. The considered treatment sites included head and neck (105), prostate (51) and brain tumor cases (52). The head and neck plans were generated with a simultaneous integrated boost, for two or three dose levels. The prescription dose per fraction to the high-risk planning target volume (PTV) was 2 or 2.12 Gy. In prostate tumor cases, only plans aiming to irradiate the prostate and seminal vesicles or the involved fossa, with a dose per fraction ranging from 2 to 2.5 Gy, were selected. Metastatic brain tumors were irradiated with stereotactic radiosurgery, with prescription doses varying between 19 and 22 Gy in a single fraction.
All plans were created in the Tomotherapy treatment planning system v.5.1.1.6 (Accuray Inc., Sunnyvale, CA, USA) to be delivered by a Tomotherapy HD unit (Accuray Inc., Sunnyvale, CA, USA). A field width of 2.5 cm in dynamic jaw mode 16 was used for head and neck and prostate cases and 1 cm for stereotactic brain plans. The initial modulation factor was set according to the planner's preferences and the adopted pitch values were based on published guidelines. 4,5
To evaluate plan deliverability, that is, the agreement between planned and measured dose, pretreatment QA verification results were retrospectively collected. All plans had been recalculated in the Tomotherapy phantom (Cheese phantom) and delivered with the couch out of the bore. Dosimetry Check software v.5.5 (LifeLine Software Inc., Austin, TX, USA) was used to reconstruct the measured dose distribution from the acquired sinogram. 17,18 Three-dimensional global gamma analysis was performed with 3% of maximum dose/3 mm distance-to-agreement criteria and a 10% dose threshold (TH) for head and neck and prostate plans, and with 3%/2 mm, 10% TH for stereotactic brain plans. The passing rate acceptance limit was 95%. For the purpose of this work, more stringent criteria were also adopted, namely 3%/2 mm, 10% TH for head and neck and prostate plans and 2%/2 mm, 10% TH for stereotactic brain cases. For stereotactic brain plans, a Gafchromic EBT3 film (Ashland Inc., Covington, Kentucky, USA) was also used to assess the dose distribution in a coronal plane of the Cheese phantom. Films were scanned in an Epson Expression 10000 XL flatbed scanner (Seiko Epson Corporation, Japan) and in-house software was used for film processing, applying triple-channel dosimetry. 19 Global gamma analysis was performed with a 3%/2 mm criterion, in a dedicated Tomotherapy station. The passing rate acceptance limit was again 95%.
Point dose measurements were also performed using an Exradin A1SL chamber (Standard Imaging, Middleton, WI, USA) placed in the same phantom at the center of the emulated brain lesion. A difference between the planned and measured dose within ±3% was considered acceptable.

2.B | Complexity metrics
[...] adapted from reference 12, is calculated for each leaf that opens at least once during the treatment as: [...] where $t_{max}$ is the maximum LOT for that leaf across all control points. [...] is computed for a given control point by summing the LOT differences in two directions: [...] here $N_l$ is the total number of MLC leaves (64) [...] where the total number of projections is equal to $N_{CP} - 1$ and $f = 0.01 : 0.01 : 2$. 10 The total $Z(f)$ represents the spectrum of such changes in the entire sinogram and it is given by: [...] The modulation index (MI) corresponds to the area under the spectrum: [...] The larger the value of MI, the higher the plan modulation. 10
To assess the amount of tongue-and-groove effect in HT, two indices were defined. Due to the tongue-and-groove/penumbra blur effects, the primary fluence under a given MLC leaf varies according to the state of its neighbors. Such differences are taken into account in the TPS during the end-of-planning process, through a leaf-by-leaf correction. 2,20 The presented indices quantify the number of times that those corrections need to be applied, and eventually their accuracy.
The closed leaf score (CLS), adapted from reference 13, is computed per control point as the percentage of closed leaves among all MLC leaves (64). The CLS can vary between 0 and 100%, being 100% when all leaves are closed during the treatment. This index is partially related to the target volume. However, when the number of closed leaves per control point (CP) is high, a plan can be considered more complex due to the possible significant impact of mechanical errors and dose calculation uncertainties.
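As an illustration only, the closed leaf score can be sketched in a few lines of Python, together with its in-field variant (CLS_in), which restricts the count to the treatment area described in the next paragraph. The sinogram layout (a list of projections, each holding 64 leaf open times in ms), the averaging over control points, the skipping of fully closed projections for CLS_in, and the function name `closed_leaf_scores` are all assumptions made for this sketch and are not taken from the study.

```python
from statistics import mean

N_LEAVES = 64  # number of binary MLC leaves, as stated in the text


def closed_leaf_scores(sinogram):
    """Return (CLS, CLS_in) in percent, averaged over control points.

    sinogram: list of projections; each projection is a list of 64 leaf
    open times (ms), where 0 means the leaf stays closed.
    Assumption: plan-level values are plain means over control points.
    """
    cls_values, cls_in_values = [], []
    for projection in sinogram:
        if len(projection) != N_LEAVES:
            raise ValueError("each projection must hold 64 leaf open times")
        open_idx = [i for i, lot in enumerate(projection) if lot > 0]
        # CLS: closed leaves relative to all 64 leaves
        cls_values.append(100.0 * (N_LEAVES - len(open_idx)) / N_LEAVES)
        if open_idx:
            # Treatment area: from the leftmost to the rightmost open leaf
            width = open_idx[-1] - open_idx[0] + 1
            cls_in_values.append(100.0 * (width - len(open_idx)) / width)
        # Assumption: projections without any open leaf do not define a
        # treatment area and are skipped for CLS_in.
    cls = mean(cls_values)
    cls_in = mean(cls_in_values) if cls_in_values else 0.0
    return cls, cls_in
```

With this layout, a projection in which only leaves 30, 31 and 33 open contributes 61 closed leaves out of 64 to the CLS and one closed leaf out of four to the CLS_in.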
The percentage of closed leaves within the so-called treatment area, defined by the rightmost and leftmost open leaves in a given control point, was also calculated (CLS_in). It gives an indication of the complexity of the irradiation pattern, due to the target volume irregularity and/or its proximity to critical structures. Thus, the higher the CLS_in, the greater the plan complexity.
In this study, all 14 complexity metrics were included in the PCA, which corresponds to a data matrix X with 208 plans (rows) and 14 metrics (columns). Those metrics had different units of measurement and numerical ranges, which affects the variance.

2.C | Statistical analysis
To ensure that all variables would contribute equally to the analysis, data were standardized before performing PCA, such that all metrics had a mean of 0 and a variance of 1. 25 The PCA output consisted of 14 principal components (PCs). To determine the number of PCs to keep for data representation, a cut-off of 70% of the total variance explained was adopted. 21 The scree plot for the HT complexity data is given in Figure S1.
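The standardization step and the 70% cut-off rule described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names `standardize` and `components_to_keep` are hypothetical, the sample standard deviation is an assumption (the text does not say which estimator was used), and the decomposition into principal components itself is left to a PCA routine of the reader's choice.

```python
from statistics import mean, stdev


def standardize(rows):
    """Z-score each column of a plans-by-metrics table (list of rows).

    Assumption: the sample standard deviation is used, so each column
    ends up with mean 0 and sample variance 1.
    """
    stats = []
    for column in zip(*rows):
        mu, sigma = mean(column), stdev(column)
        if sigma == 0:
            raise ValueError("a metric with zero variance cannot be standardized")
        stats.append((mu, sigma))
    return [[(x - mu) / sigma for x, (mu, sigma) in zip(row, stats)]
            for row in rows]


def components_to_keep(explained_pct, cutoff_pct=70.0):
    """Smallest number of leading PCs whose cumulative explained
    variance (in percent) exceeds the cut-off."""
    cumulative = 0.0
    for count, pct in enumerate(explained_pct, start=1):
        cumulative += pct
        if cumulative > cutoff_pct:
            return count
    raise ValueError("the cut-off is never exceeded by the given components")


# With the percentages reported in the Results for the first two PCs:
print(components_to_keep([65.2, 11.3]))  # -> 2
```

Only the two leading percentages are reported in the text, which is enough here because the cut-off is already exceeded after the second component.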
Still using PCA, after modifying some metrics such that all increased with increasing complexity, the methodology proposed by the authors in a previous work 25 was followed to compute a global plan complexity score (PCS). This score aims at characterizing and comparing the treatment plans through a single indicator and it is calculated as the weighted mean of the selected principal components:
$$\mathrm{PCS} = \frac{1}{v} \sum_{l=1}^{L} v_l \, \mathrm{PC}_l$$
where $L$ is the minimum number of PCs corresponding to a cumulative percentage of the total variance explained higher than 70%, $v$ is the total variance explained by the retained PCs and $v_l$ the percentage of variance explained by $\mathrm{PC}_l$.
The absolute value of the PCS may not be easy to interpret, as explained in Santos et al. 25 Therefore, a normalized version of this score, nPCS, was calculated for a given plan $i$ within the set of plans as:
$$\mathrm{nPCS}_i = \frac{\mathrm{PCS}_i - \min \mathrm{PCS}}{\max \mathrm{PCS} - \min \mathrm{PCS}}$$
nPCS is 0 for the plan with the minimum PCS (min PCS) and 1 for the plan with the maximum PCS (max PCS). The higher the value of nPCS, the greater the plan complexity for the set of plans considered in the study.
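A short sketch of how the two scores could be computed from already available PC scores is given below. It assumes that the PCS is the variance-weighted mean of the retained PC scores and that the nPCS is a min-max rescaling within the set of plans, which is how the definitions above read; the function names and the example scores are hypothetical, and the metrics are assumed to have been oriented beforehand so that larger values mean higher complexity.

```python
def plan_complexity_scores(pc_scores, explained_pct):
    """Variance-weighted mean of the retained PC scores, one value per plan.

    pc_scores: one row per plan with the scores on the L retained PCs.
    explained_pct: percentage of variance explained by each retained PC.
    """
    total = sum(explained_pct)  # v: variance explained by the retained PCs
    scores = []
    for row in pc_scores:
        if len(row) != len(explained_pct):
            raise ValueError("each plan needs one score per retained PC")
        scores.append(sum(v * s for v, s in zip(explained_pct, row)) / total)
    return scores


def normalize_scores(pcs):
    """Min-max rescaling: 0 for the least and 1 for the most complex plan."""
    lowest, highest = min(pcs), max(pcs)
    if highest == lowest:
        raise ValueError("all plans have the same PCS; nPCS is undefined")
    return [(p - lowest) / (highest - lowest) for p in pcs]


# Hypothetical PC scores for three plans, weighted with the percentages
# reported for the first two PCs (65.2% and 11.3%).
pcs = plan_complexity_scores([[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]], [65.2, 11.3])
print(normalize_scores(pcs))
```

Because the rescaling uses the minimum and maximum of the analyzed set, nPCS values from two different cohorts are not directly comparable.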

3.A | Treatment plans
Some of the TPS reported parameters for the considered groups of plans (head and neck, prostate and stereotactic brain) are summarized in Table 1. It can be seen that stereotactic brain plans have the longest gantry period and the highest number of gantry rotations, as well as the smallest pitch (0.100 for all plans), couch speed and couch travel. This is expected due to the high dose delivered in a single fraction (19-22 Gy) and the small target volume (5.9 ± 5.1 cc, on average). Head and neck cases, on the other hand, present the highest pitch, the shortest gantry period and the highest couch speed. The couch travel gives an indication of the craniocaudal extension of the treatment region, being higher for the head and neck plans.

3.B | Complexity metrics
Plans from various treatment sites were included in this study to assess the differences in complexity between them, based on the analysis of the planned sinogram. Figure 1 displays a representative example of a planned sinogram for each group. A summary of the complexity metrics computed for the three groups of HT plans is presented in Table 2.
From the computed metrics, PCA was performed to reduce the dataset dimensionality. The first two principal components explained 65.2% and 11.3% of the total variance, respectively (76.5% together), which is above the predefined cut-off (70%). Therefore, the resulting two-dimensional representation of the data can be considered a good approximation of the original scatter plot in 14 dimensions.
To summarize the information provided in the biplot and compare the global complexity inter- and intra-group of plans, the normalized plan complexity score (nPCS) was calculated.

3.C | Correlation between the complexity metrics and pretreatment QA results
For the head and neck and prostate groups, no significant correlations with the pretreatment QA results were identified, neither when considering each metric individually nor the nPCS (Table S1). For the stereotactic brain group, the obtained Spearman's rank correlation coefficients and corresponding p-values are presented in Table 3. Some moderate and strong correlations were obtained for this group of plans. The correlations tended to be stronger when more stringent criteria were adopted for the 3D global gamma analysis. Nevertheless, ionization chamber results were not related to any of the computed metrics. Plans with a higher total treatment time per Gy (TT/Gy) were significantly associated with poorer verification results.
The correlation between some of the TPS reported plan parameters, namely the pitch, the gantry period, the number of gantry rotations, the couch travel and the couch speed, and the pretreatment QA results was also investigated (Table S2). Once again, no significant dependencies were identified for the head and neck and prostate groups. For the stereotactic brain plans, however, all parameters except the pitch (0.100 for all plans) were significantly associated with the film gamma passing rates ($|r_s| > 0.4$, P < 0.05). The correlation was negative with the number of gantry rotations, couch speed, and couch travel and positive with the gantry period.
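For readers less familiar with the statistic used in this section, Spearman's rank correlation coefficient can be sketched as the Pearson correlation of the ranks, with tied values receiving their average rank. The sketch below is illustrative only: the function names are hypothetical, the example values are invented rather than taken from the study, and no p-value is computed.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's r_s computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("r_s is undefined when one variable is constant")
    return cov / (sx * sy)


# Invented example: a complexity metric against gamma passing rates (%)
metric = [1.0, 2.0, 3.0, 4.0, 5.0]
passing_rate = [99.5, 99.1, 98.7, 98.0, 97.2]
print(spearman_rs(metric, passing_rate))
```

A negative coefficient in such a comparison means that plans scoring higher on the metric tend to have lower passing rates.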

TABLE 3 Spearman's correlation coefficients, $r_s$, and corresponding p-values (within brackets) between 3D gamma passing rates with various criteria, ionization chamber percent difference (IC %diff), film results and the complexity metrics/nPCS for the stereotactic brain plans. Correlations were considered statistically significant for P < 0.05. Values in bold correspond to significant moderate or strong correlations.

4 | DISCUSSION
In this study, the complexity of a set of helical tomotherapy plans from head and neck, prostate and brain treatment sites was characterized and compared.
To summarize the information given by all complexity metrics, a global plan complexity score (nPCS) was calculated, following the methodology proposed by Santos et al. 25 in the context of a national IMRT audit. 28 The nPCS combines the multiple metrics into a single numerical score, allowing for the comparison of the relative complexity of the entire set of plans. Based on the nPCS values, the head and neck plans were confirmed to be the most complex, followed by the prostate and the stereotactic brain ones. A higher complexity variability was observed for the prostate cases, which presented the largest range of nPCS values. This can be explained, in part, by the inclusion of patients with femoral prostheses, demanding an adaptation of the typical planning strategy to reduce dose calculation uncertainties.
The number of planners was also higher for the prostate group (6) than for the head and neck (3) and stereotactic brain (2) groups, and the adopted planning strategies depend largely on the planners' skills and approaches. 29 Yet, the reported differences in plan complexity intra- and inter-group may also be due to factors such as variations in patient anatomy, PTV shape and/or volume, and in the dose constraints, which may differ from one clinician to another. Plan complexity is also partially determined by the optimization algorithm, which works like a black box. In HT, only the initial modulation factor, pitch and field width are set, and the actual modulation factor, number of projections and leaf open times result from the optimization process. 9
Plan deliverability was weakly correlated with both the computed complexity metrics and the usual plan parameters for the head and neck and prostate groups. Regarding the stereotactic brain plans, some moderate dependencies were identified. Stereotactic brain plans with a higher total treatment time per Gy (TT/Gy) and/or with a higher percentage of leaves with opening times approaching the projection duration (%LOT > pT-20 ms) were significantly associated with a poorer agreement (although within tolerance) between planned and measured dose. Plan parameters like the couch speed and the number of gantry rotations were also inversely correlated with the pretreatment QA results. These findings may be useful to establish clinical guidelines for planning of stereotactic brain cases at our institution. Accordingly, planners should carefully evaluate the TPS LOT distribution during the planning phase and aim to achieve a %LOT > pT-20 ms lower than the reported mean.
Also, new indices were defined that evaluate different aspects of the plans that may directly or indirectly contribute to increased uncertainties in dose calculation and delivery. These indicators proved effective in quantifying the complexity of the plans.
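As a hedged illustration of how a planner might inspect the LOT distribution mentioned above, the sketch below computes the percentage of leaf openings shorter than a threshold and the percentage of leaf openings longer than the projection time (pT) minus 20 ms. Several points are assumptions made for this example: the percentages are taken over leaf openings with LOT > 0, the 100 ms default for short openings is borrowed from the Westerly et al. observation quoted in the introduction rather than from the metric definitions of this study, and the function name `lot_percentages` is hypothetical.

```python
def lot_percentages(sinogram, projection_time_ms, short_ms=100.0, margin_ms=20.0):
    """Return (%LOT < short_ms, %LOT > pT - margin_ms) over all leaf openings.

    sinogram: list of projections, each a list of leaf open times in ms;
    entries equal to 0 (closed leaves) are ignored.
    """
    open_lots = [lot for projection in sinogram for lot in projection if lot > 0]
    if not open_lots:
        raise ValueError("the sinogram contains no leaf opening")
    n_open = len(open_lots)
    n_short = sum(1 for lot in open_lots if lot < short_ms)
    n_long = sum(1 for lot in open_lots if lot > projection_time_ms - margin_ms)
    return 100.0 * n_short / n_open, 100.0 * n_long / n_open


# Toy sinogram with four leaves per projection and pT = 300 ms
pct_short, pct_long = lot_percentages([[0, 50, 150, 290], [0, 0, 285, 120]], 300.0)
print(pct_short, pct_long)
```

In the toy example, one of the five openings is shorter than 100 ms and two exceed 280 ms, so the two percentages are 20% and 40%.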
The reported differences inter- and intra-group suggest that it may be appropriate to define site-specific recommendations to guide the planning and the QA processes. The values of both planning parameters and complexity metrics may be generally adopted as reference levels at our institution, as the pretreatment QA results of the plans included in this study were all clinically acceptable. Comparing a new plan with the reference ones may be useful to flag plans with a complexity higher than usual, which would be subject to a more rigorous QA.

5 | CONCLUSIONS
In this study, the complexity of HT plans from different treatment sites was characterized and compared through the calculation of a set of metrics that evaluate multiple features of the planned sinograms. A statistical approach based on principal component analysis was followed to simplify data interpretation, making it possible to explore the correlations among the proposed indices and to quantify the differences in complexity between the studied groups of plans.
Generally, head and neck plans were found to be the most complex for almost all metrics, which was confirmed by the computed global plan complexity score. The prostate plans had the highest complexity variability, which can be a result of a wider range of planning approaches.
The presented characterization of the differences inter- and intra-group of treatment sites may be useful to guide the treatment planning and the QA processes, eventually reducing uncertainties and harmonizing local planning strategies.

CONFLICT OF INTEREST
None.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Table S1. Spearman's correlation coefficients, $r_s$, and corresponding p-values (within brackets) between 3D gamma passing rates with various criteria and the complexity metrics/nPCS for the head and neck and prostate plans.
Table S2. Spearman's correlation coefficients, $r_s$, and corresponding p-values (within brackets) between the TPS reported parameters and the pretreatment QA results for the head and neck, prostate and SRS plans. Values in bold correspond to significant moderate or strong correlations.