Visualization of 4D multimodal imaging data and its applications in radiotherapy planning

Abstract
Purpose: To explore the benefit of using 4D multimodal visualization and interaction techniques for defined radiotherapy planning tasks over a treatment planning system used in clinical routine (C-TPS) without dedicated 4D visualization.
Methods: We developed a 4D visualization system (4D-VS) with dedicated rendering and fusion of 4D multimodal imaging data based on a list of requirements developed in collaboration with radiation oncologists. We conducted a user evaluation in which the benefits of our approach were evaluated in comparison to C-TPS for three specific tasks: assessment of internal target volume (ITV) delineation, classification of tumor location as peripheral or central, and assessment of dose distribution. For all three tasks, we presented test cases for which we measured correctness, certainty, and consistency, followed by an additional survey regarding specific visualization features.
Results: Lower quality of the test ITVs (ground truth quality was available) was more likely to be detected using 4D-VS. ITV ratings were more consistent in 4D-VS, and the classification of tumor location had a higher accuracy. Overall evaluation of the survey indicates that 4D-VS provides better spatial comprehensibility and simplifies the tasks that were performed during testing.
Conclusions: The use of 4D-VS improved the assessment of ITV delineations and the classification of tumor location. The visualization features of 4D-VS were identified as helpful for the assessment of dose distribution during user testing.

associated with the delineation of the target volume [1], and the traditional way to deal with these types of uncertainties is to extend delineations by an appropriate margin. For moving targets, one commonly applied strategy is to generate an internal target volume (ITV) [2] from different time bins of 4D imaging data, for instance 4D-CT. However, it remains challenging to navigate, visualize, and interpret these 4D imaging data efficiently [3]. Because physicians have limited time and lack tools for handling 4D data efficiently, the time effort is often reduced by using only the two extreme phases for target delineation [4]. Neglecting large parts of the movement-correlated data introduces another source of uncertainty and might lead to inaccurate target volume delineation. Furthermore, co-registered functional imaging (e.g., 4D-PET) is increasingly employed in target volume delineation, and the problem is aggravated when these additional imaging data are to be used in the planning process.
To the best of our knowledge, visualization that makes efficient use of 4D multimodal imaging data is not sufficiently implemented in currently available systems. Due to this unmet need, we developed a 4D multimodal visualization system (4D-VS) that features fusion of 3D/4D multimodal image information, delineations of the tumor and organs at risk (OARs), as well as dose distribution data. Strong emphasis was placed on interaction, allowing users to change time bins and to clip volume information, segmentations, and iso-dose surfaces. In this article, we present the visualization system and its evaluation with respect to specific radiotherapy planning tasks. The rendering framework is based on a revised and extended list of requirements, which was presented in Ref. [5].

1.A | Clinical requirements and tasks
Based on discussions with radiation oncologists, we developed a list of requirements to support radiotherapy planning tasks which incorporate 4D multimodality imaging. This includes visualization features which should be available early in the radiotherapy workflow when target and OARs are delineated, and in a later phase after the dose calculation was performed. Including 4D imaging data should make it especially suitable for cases with moving targets, for instance lung tumors, to ensure high accuracy delineations and coverage over the breathing cycle. Our visualization system is based on the following requirements:
1. Visualization and fusion of 4D (3D + t) multimodal data sets with support for changing time bins and data sets easily.
2. Joint visualization of segmentation data, such as ITV and OAR, and multimodal data sets.
3. Joint visualization of dose information (iso-dose surfaces) and multimodal data/segmentation data.
4. Clipping and/or masking (using segmentation data) in the volume visualization.
5. Support of mixed resolution data sets without resampling and no preprocessing for volume fusion.
6. Interactive modification of parameters for clipping and visual appearance (e.g. fusion parameters).
7. Support for navigation from the volume visualization to the slicewise views.
Our tasks are motivated by patients who are scheduled for and/or treated by stereotactic body radiation therapy (SBRT). Task T.1, although not specific to SBRT, is very important when using SBRT due to the high doses involved. It will usually be performed simultaneously with the actual delineation task of target volumes. However, if the target is delineated using two extreme phases only, quality assessment for the remaining time bins is an equally relevant task.
The classification of tumor location (T.2) is relevant for deciding whether the patient should be treated with SBRT or receive conventional treatment. The assessment of dose distribution (T.3) is also not specific to SBRT, but due to the high doses involved, visualization techniques other than dose volume histograms (DVH) can be of interest in complicated cases where the target is spatially close to an OAR.

1.B | Related work
Visualization of multimodality data sets and the use of segmentation information for volume masking were presented in Ref. [6]. Rendering multiple arbitrarily overlapping multiresolution volumes was covered in Ref. [7], and advanced support for clipping the volume visualization using mesh data was presented in Ref. [8]. Specific work on PET/CT visualization with advanced functionality for fusion and clipping can be found in Refs. [9] and [10].
There have been efforts to bring visualization approaches like the aforementioned to frameworks such as the Visualization Toolkit (VTK) [11]. However, VTK still lacks multivolume rendering as reported in the visualization literature, and extensions for multivolume visualization, for instance Ref. [12], have not found their way into the framework yet. Research platforms such as 3D Slicer [13] and the Medical Imaging Interaction Toolkit (MITK) [14], which are tailored to medical applications, often use VTK as the basis for visualization. They offer solutions for more specific clinical applications or workflows, but they also target data processing aspects and do not necessarily aim to improve the visualization. For example, SlicerRT [15] is an extension to 3D Slicer with radiotherapy-specific functionality, but it is more focused on data processing. There is still a gap between what can be found in the visualization literature and what has made its way into commercial products. To the best of our knowledge, none of the aforementioned products supports advanced visualization in 3D/4D as intended by our visualization system.

2 | METHODS AND MATERIALS
The main focus of our visualization system is to improve radiotherapy planning-related tasks by including multimodal volume visualization in an easy-to-use way. It is based on a multimodal rendering framework developed in-house. Further interaction features are implemented alongside the user interface within the MITK [18] platform. Parameters which should be interactively modifiable by the user (requirement 6) have dedicated user interface elements implemented as MITK plugins. We refer to this system as 4D-VS. A video illustrating the main features (explained in the following) is available in the supporting information.

2.A | Multimodal data description
The rendering framework supports different types of data sources and the fusion thereof: imaging data, delineations, and dose distribution. This is represented by the three blocks in Fig. 1. Representative data for one patient, as used in 4D-VS for the three clinical tasks T.1-T.3, can be found in Table 1.

2.B | Multimodal rendering core
All volume visualizations take advantage of GPU acceleration and are, for the main part, implemented using CUDA [19]. Each type of data source is handled in a slightly different way, and the data sources are finally fused into a single visualization (see Fig. 1). The rendering framework organizes data sets in a unified coordinate system in GPU memory, which takes into account mixed spatial resolutions and transformations between data sets in all rendering algorithms (requirement 5). 4D-VS uses direct volume rendering and fusion [20] of mixed resolution volume data sets for volume visualization (requirement 1). The rendering is based on a GPU-accelerated ray-casting [21] algorithm, which uses the different data sources described above at discrete sample points during the evaluation of the volume rendering integral [20].
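The discrete evaluation of the volume rendering integral along a single ray can be sketched as follows. This is a minimal CPU-side Python illustration, not the CUDA implementation of 4D-VS; the function names `composite_ray` and `fuse_samples`, the linear blending rule, and the early-termination cutoff are assumptions made for illustration only.

```python
def fuse_samples(sample_a, sample_b, weight):
    """Hypothetical fusion rule: linear blend of two (intensity, opacity)
    samples taken at the same position in two data sets (e.g. CT and PET)."""
    (c_a, a_a), (c_b, a_b) = sample_a, sample_b
    intensity = (1.0 - weight) * c_a + weight * c_b
    opacity = (1.0 - weight) * a_a + weight * a_b
    return intensity, opacity


def composite_ray(samples, opacity_cutoff=0.99):
    """Front-to-back alpha compositing of (intensity, opacity) samples
    ordered from the viewer into the volume."""
    color = 0.0
    alpha = 0.0
    for intensity, opacity in samples:
        # Each sample contributes only through the remaining transparency.
        color += (1.0 - alpha) * opacity * intensity
        alpha += (1.0 - alpha) * opacity
        if alpha >= opacity_cutoff:
            # Early ray termination: later samples are (almost) invisible.
            break
    return color, alpha
```

In a GPU ray caster, one thread would typically run such a loop per pixel, with the samples obtained by interpolating each data set in the unified coordinate system.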

2.D | Visualization and fusion of delineations
Jointly visualizing delineations and image information (requirement 2) is implemented by visualizing binary volumes using iso-surface ren-

2.F | Volume masking using delineation information
Binary volumes can further be used for volume masking (similar to clip objects [24]), which partly implements requirement 4. The binary volume defines a ROI and can be used to enable or disable certain parts of the volume (an approach similar to Ref. [6]). In Fig. 4

2.G | Volume clipping and user interaction
To complete requirements 4 and 6, we implemented interactive clipping of volumes, which can be seen as a user-defined, global ROI. In

2.H | Volume intersection highlighting
The idea behind requirement 8 is that the classification of tumor location (see task T.2) can be determined by the distances of the target to the bronchial tree and mediastinum (see Ref. [25]). Binary volumes for the bronchial tree and mediastinum were determined automatically with the approach of Ref. [26] and expanded with the margins defined in Ref. [25].
We use these margin volumes as additional information; however, since they are defined automatically, visual assessment is still required. We include these margin volumes in a separate rendering mode which highlights the intersection volume of the ITV with either one of the margin volumes [see Fig. 3(c)] for task T.2. Furthermore, if the target is delineated using two extreme phases only, quality assessment for the remaining time bins is a task equally relevant to the delineation itself, and is also relevant when using, for instance, an automated 4D segmentation algorithm. All three tasks have in common that they use a visual approach for verification.
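The intersection of the ITV with the margin volumes, which the highlighting mode described above makes visible, can be illustrated with a small sketch. Binary volumes are represented here as sets of voxel index triples, which is an assumption for brevity and not the data layout of 4D-VS; the function name `highlight_intersections` and the example voxels are hypothetical.

```python
def highlight_intersections(itv, margin_volumes):
    """Return, for each named margin volume, the ITV voxels that also lie
    inside that margin volume (the region that would be highlighted)."""
    return {name: itv & voxels for name, voxels in margin_volumes.items()}


# Illustrative voxel sets (z, y, x); not taken from patient data.
itv = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
margins = {
    "bronchial_tree": {(0, 0, 1), (0, 0, 2)},
    "mediastinum": {(5, 5, 5)},
}
overlap = highlight_intersections(itv, margins)
```

An empty intersection for both margin volumes means there is nothing to highlight; any overlap still has to be assessed visually, since the margin volumes are generated automatically.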

3.A | Patient data and ground truth
Eighteen patient cases with malignant pulmonary lesions scheduled for SBRT were selected for testing of tasks T.1 and T.2. To reduce observer bias, we divided them into two groups (one group for 4D-VS and one for C-TPS), each consisting of five central and four peripheral cases. As image information, we provided a full-body CT and a 4D-PET/CT (see Table 1).
For task T.1, we presented two ITVs for each patient, which results in 18 separate test cases per patient group and system, respectively. The conversion from DICOM-RTSS to binary volumes was done by rasterization of each planar contour with the slice resolution of the planning CT (see Table 1). The axial resolution of the binary volumes was the same as the axial distance between planar contours, since they were generated on the planning CT. Afterwards, we reduced the size of each binary volume by keeping only the minimal part of the volume that contains the actual segmentation information, in order to reduce memory consumption.
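Reducing a binary volume to the minimal part that contains the segmentation can be sketched as a bounding-box crop. The nested-list layout `volume[z][y][x]`, the function name `crop_to_bounding_box`, and the returned offset are assumptions for illustration; the actual implementation is not described at this level of detail.

```python
def crop_to_bounding_box(volume):
    """Crop a binary volume (nested lists indexed [z][y][x]) to the smallest
    axis-aligned box containing all non-zero voxels.

    Returns the cropped volume and the (z, y, x) offset of the box in the
    original volume, which is needed to place it back in patient space."""
    coords = [
        (z, y, x)
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        return [], (0, 0, 0)
    zs, ys, xs = zip(*coords)
    z0, z1 = min(zs), max(zs)
    y0, y1 = min(ys), max(ys)
    x0, x1 = min(xs), max(xs)
    cropped = [
        [row[x0:x1 + 1] for row in plane[y0:y1 + 1]]
        for plane in volume[z0:z1 + 1]
    ]
    return cropped, (z0, y0, x0)
```

Keeping the offset alongside the cropped data allows the renderer to position the small volume correctly without storing the empty surroundings.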
The quality of ITV 1 and ITV 2 was determined by calculating the
For classification of tumor location (task T.2), the same data sets were used; however, we additionally provided margin volumes for the bronchial tree and mediastinum, which were determined automatically with the approach of Ref. [26] (see above), for use with volume intersection highlighting. A ground truth for tumor location was determined by an experienced radiation oncologist different from the test users.
Classification was done according to the rules stated in Ref. [25] using distance measuring tools.
For task T.3, we selected only patients who were actually treated with SBRT, which left eight of the 18 patients; the others had only been considered for SBRT.
Data sets were again split up to reduce bias (four patients each for 4D-VS and C-TPS). As image information, we provided the planning CT and all relevant delineations (see Table 1) of the target (the planning ITV was used) and OARs. For all SBRT plans, the 3D dose distribution was calculated with Oncentra MasterPlan.

3.B | User evaluation
Two experienced radiation oncologists (denoted as U 1 and U 2 ) performed the three tasks as described in the introduction. They were asked to give a quality rating for the ITV delineation in task T.1 and for the dose distribution in task T.3. The scale of the rating was from "1" (excellent) to "5" (poor), where a rating of "3" was defined as acceptable. Additionally, they were asked whether they are certain about their decision for the current task. This is summarized in Table 2.
All tasks were performed with C-TPS and 4D-VS. After all tasks were performed, users were asked to answer survey questions (see Table 3) for each of the systems. The survey also had a general remarks section for free comments.

TABLE 2 Task description summary and quality scale.

Task description Quality scale
T.1 Assess the quality of ITV 1/ITV 2 and give a rating. Indicate certainty.
T.3 Assess the quality of the dose distribution and give a rating. Indicate certainty.

4 | RESULTS
The average quality rating of ITVs over all test cases is shown in Table 5 (see supporting information for results for individual cases).
Both the combined and the per-user averages were calculated, together with their standard deviations (SD). The consensus of the rating between the users was measured by calculating a conformity index (CI), which was defined as the average of the difference in the rating between U 1 and U 2, so that a lower CI indicates better agreement. The CI indicating the consistency was calculated per ITV and system over all cases, and is listed for ITV 1 and ITV 2 in Table 5.
The ratings are more consistent between users with 4D-VS than with C-TPS. Using 4D-VS leads to lower ratings and a lower acceptance rate for ITV 1 compared to using C-TPS. The automatically generated ITV 2 received low ratings in both systems; however, the acceptance rate was even lower in 4D-VS. The level of certainty was slightly higher in C-TPS.
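As a worked illustration of the CI, the following sketch averages the per-case rating differences between the two users. Taking the absolute value of each difference is an assumption (the text only states "average of the difference"), and the function name `conformity_index` is hypothetical.

```python
def conformity_index(ratings_u1, ratings_u2):
    """Mean absolute difference between two users' ratings of the same cases.

    A value of 0 means identical ratings; larger values mean more disagreement."""
    if len(ratings_u1) != len(ratings_u2) or not ratings_u1:
        raise ValueError("both users must rate the same non-empty set of cases")
    differences = [abs(a - b) for a, b in zip(ratings_u1, ratings_u2)]
    return sum(differences) / len(differences)
```

For example, ratings of (1, 2, 3) and (1, 3, 5) on the five-point scale give differences of 0, 1 and 2 and therefore a CI of 1.0.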
We defined a rating of 3 (acceptable) as the rejection threshold for ITVs, and calculated the resulting minimum, maximum, average and standard deviations (SD) of DC and HD measurements for accepted and rejected ITVs (see Table 6).
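The grouping behind such a summary can be sketched as follows. The sketch assumes that a rating of 3 or better (numerically 3 or lower) counts as accepted, and it uses the sample standard deviation; neither detail is stated explicitly in the text. The function name `summarize_by_acceptance` is hypothetical, and any input values are purely illustrative, not study data.

```python
from statistics import mean, stdev


def summarize_by_acceptance(cases, threshold=3):
    """Split (rating, measurement) pairs into accepted and rejected groups and
    report min, max, mean and SD of the measurement for each non-empty group."""
    groups = {"accepted": [], "rejected": []}
    for rating, value in cases:
        # Assumption: ratings up to and including the threshold are accepted.
        key = "accepted" if rating <= threshold else "rejected"
        groups[key].append(value)
    summary = {}
    for key, values in groups.items():
        if values:
            summary[key] = {
                "min": min(values),
                "max": max(values),
                "mean": mean(values),
                # Sample SD needs at least two values.
                "sd": stdev(values) if len(values) > 1 else 0.0,
            }
    return summary
```

The same function could be applied separately to each measurement (for instance DC and each HD variant) and to each system.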
Using 4D-VS, all patients were classified correctly, and users indicated that they are certain about their decision in all but one case.
Using C-TPS, one patient was misclassified, and for all test cases users indicated that they are certain about their decision. The average quality rating of dose distributions and the corresponding certainty rates are shown in Table 7.
The overall questions and answers are listed in Table 3. The average rating for tempo-spatial comprehensibility of 4D-VS was 2.
The feature completeness for ITV assessment and classification of tumor localization was indicated as present in both systems, however not for dose distribution assessment (Q6) in 4D-VS.
TABLE 3 Survey questions with answers. Answers given as (U 1/U 2) or "-" if not applicable.

5 | DISCUSSION
In this work, we presented 4D-VS, a 4D multimodal rendering framework with additional navigation and interaction features, for use in radiotherapy planning. 4D-VS was applied to three specific tasks, which were also performed using the standard tool C-TPS to investigate possible benefits. With 4D-VS, lower quality ITVs were more likely to be detected. Ratings were more consistent for both ITVs and dose distribution. Furthermore, the classification of tumor location had a higher accuracy using 4D-VS.
For task T.1 (quality rating of ITVs), the planning ITV was chosen as ground truth for all DC and HD measurements due to its high quality guaranteed by institutional standards. The quality of individual ITVs used in our study was measured by the DC and HD with the planning ITV (see supporting information for measurements for each data set). They had varying quality depending on their generating source, which was either an algorithm or a majority vote (see above).
The average DC value (see Table 6) for accepted ITVs was 0.81 (± 0.07 SD) using 4D-VS and 0.73 (± 0.13 SD) using C-TPS. The average HD values (see Table 6) for average, maximum and 95% for  Fig. 7 showing the ITV depicted in Fig. 6(b). In C-TPS, volume rendering is limited to a single data set, and therefore it is not possible to fuse information of PET and CT (only slice-wise, see Table 4). The contours are only rendered at the correct spatial depth if no transparency is applied. In Fig. 7(a), all contours are opaque, and in Fig. 7(b) the heart is set partially transparent, whereas the rest is unchanged. The heart will now be  Fig. 7(b)] and not at its correct spatial position as in Fig. 7(a). It is possible to define ROIs in C-TPS.
However, they are only applied to volume information, and therefore it is not possible to "cut open" contours as it is in 4D-VS (compare Fig. 6). A comparison of available features can be found in Table 4.
Evaluation of the survey indicates 4D-VS provides better spatial comprehensibility (Q1-Q3 in Table 3) and simplifies the ITV assessment. The users indicated in the survey that the ITV assessment is much faster using 4D-VS than using C-TPS.
For task T.2 (classification of tumor location), the differences between the two systems were less prominent when comparing the quantitative results. All tumors were classified correctly using 4D-VS, and only one patient (out of nine) was misclassified with C-TPS.
Although the intersection highlighting was indicated as helpful for making a decision (Q5), the quantitative comparison does not show a significant improvement. In Fig. 3(c), we give an example of how 4D-VS was used to investigate overlapping regions.
For task T.3 (quality rating of dose distribution), there is no straightforward way to define a ground truth. Therefore, we can only quantitatively compare whether the ratings are below or above acceptance, and measure the CI. We observed that the average ratings of the dose distribution are slightly higher and have a slightly higher CI between users (more disagreement) using 4D-VS than using C-TPS. This could suggest that the additional features presented new information that is not available in the other system, which led to more disagreement. There is no clear evidence that the certainty improved; we only observed that U 1 was more certain when using C-TPS, and U 2 when using 4D-VS. Figure 7 shows example volume visualizations with iso-dose surfaces and contours as available in C-TPS. Iso-dose surfaces can be visualized as meshes or as solid surfaces. The ROI is only applied to the volume information.

FIG. 7. Using C-TPS for tasks T.1 and T.2: Contours are visualized together with the planning CT. Clipping is applied; however, only the CT is affected. The ITV is depicted in green, the heart in red and the esophagus in blue. In (b), the heart is made slightly transparent. When compared to (a), the volume covering the heart is not shown correctly anymore. C-TPS does not preserve the depth information of the heart when it is made transparent. Using C-TPS for task T.3: Green is the 37.5 Gy iso-dose. The planning target volume is depicted in violet, and the ITV in yellow. In (c), the iso-dose surface is visualized as a mesh, and as a solid surface in (d).
data included only 3D calculated dose distributions derived from routine 3D RT-planning. Those were combined with 4D image information, and thus the judgment does not include 4D accumulated doses but only gives a rough idea of the relation of the target to the location of the dose distribution.
Even though the users were unfamiliar with 4D-VS, they established their own workflow for T.1-T.3 after a short introduction. The good spatial overview and the additional use of clipping for defining ROIs were remarked as very helpful. It was also remarked that additional training might increase the quality and could further reduce the time effort.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

SUPPORTING INFORMATION
Additional Supporting Information may be found online in the supporting information tab for this article.
Data S1. Supplementary document with detailed statistics and additional plots.
Video S1. Supplementary video demonstrating features of 4D-VS.