The role of medical physicists and the AAPM in the development of treatment planning and optimization.

Developments in radiotherapy treatment planning and optimization by medical physicists and the American Association of Physicists in Medicine are reviewed, with emphasis on recent work in optimization. It is shown that medical physicists have played a vital role in the creation of innovative treatment planning techniques throughout the past century, most significantly since the advent of computerized tomography for three-dimensional (3D) imaging and high-powered computers capable of 3D planning and optimization. Some early advances in 3D planning made by physicists include development of novel planning algorithms, beam's-eye-view, virtual simulation, dose-volume histogram analysis tools, and bioeffect modeling. Most of the recent developments have been driven by the need to develop treatment planning for conformal radiotherapy, especially intensity modulated radiation therapy. These advances include inverse planning, handling the effects of motion and uncertainty, biological planning, and multicriteria optimization.


I. INTRODUCTION
Throughout the history of radiation therapy, the development of new treatment planning techniques has been one of the major accomplishments of physicists. This has been especially true during the past 10 years, when radiation oncology physicists have created numerous methods to optimize treatment plans, driven primarily by the need to devise inverse planning algorithms for intensity modulated radiotherapy (IMRT). In this article we review developments in treatment planning and optimization and demonstrate the enormous contributions to this field made by medical physicists and the American Association of Physicists in Medicine (AAPM).

II. TREATMENT PLANNING
Prior to the 1920s, "treatment planning" simply meant that the physician selected the x-ray unit (or brachytherapy sources) to be used for a given patient and the "dose" to be delivered, and the "medical physicist" determined the time required for each treatment. The terms "dose" and "medical physicist" are used somewhat loosely here, since there was no universally accepted dose unit and no medical physics profession at the time. It was not until the 1920s, when x-ray beams of sufficient energy to treat deep-seated lesions and brachytherapy sources of sufficient strength became available, and a unit of "x-ray intensity" was defined,1 that physicists began to play an important role in radiotherapy, with such pioneers as Failla, Glasser, Mayneord, Parker, Quimby, Taylor, Trout, Voltz, Weatherwax, and Wideröe.2 These physicists developed depth-dose tables, measured isodose curves, devised opposing-beam techniques to spare superficial tissues, and developed dose tables and "rules" for brachytherapy. The term "treatment planning" began to signify the addition of isodose curves to produce the desired high-dose region in the patient, all done manually. Although some attempt was made to correlate these dose distributions with internal anatomy, this was done only in two dimensions (2D) and rather crudely, owing to the primitive imaging available at the time.
It was not until the 1950s that the application of computers to treatment planning began, making it feasible to compute dose distributions in multiple planes and to correlate them with internal anatomy. Computerized treatment planning was first reported by Tsien in 1955 (Ref. 3), and some of the early software development was done at Memorial Hospital in New York under the guidance of Laughlin, using a mainframe IBM 360 computer with multiple simultaneous terminals.4,5 Memorial ran a treatment planning service whereby physicists at other hospitals could access the computer remotely by telephone. This was the first widely available computerized treatment planning system and preceded other commercial products. Also during this period, Bentley and his colleagues at the Royal Marsden Hospital in the UK,6,7 and Cox and his collaborators at Washington University, St. Louis, developed treatment planning software, and this work provided the stimulus for the development of the first two commercial treatment planning systems in the US: the RAD 8 by the Digital Equipment Corporation in Boston (with 8 K of memory!), and the PC-12 by the Artronix Corporation in St. Louis. The latter was derived from the programmed console developed at Washington University (Cox, Powers, and Holmes), in collaboration with physicists from Temple University (Tsien and Wright), the University of Maryland (Robinson), the Ontario Cancer Institute (Cunningham and Milan), M. D. Anderson Hospital (Karzmark and Shalek), the National Cancer Institute (Johnson, Glenn, and Faw), and visiting physicists from the UK (Bentley and Clifton).8,9 Also of interest during this period was the software developed by Cunningham and colleagues10 at the University of Toronto, which was made available free of charge to medical physicists worldwide for use on their in-house computers.
This eventually led to the development of the TP-11 computerized treatment planning system of Atomic Energy of Canada, which later became the Theraplan system. All of the above were developed initially as two-dimensional (2D) systems. Of special significance during the 1960s, when only a few institutions had computerized treatment planning systems, was a series of atlases of isodose distributions published by the International Atomic Energy Agency. These proved invaluable to physicists planning treatments without the aid of computers. The first was an atlas of single-beam isodose distributions compiled by Webster and Tsien11 in 1965. This was followed by atlases of isodose charts for multiple fields, compiled by Cohen and Martin12 in 1966, for moving fields, compiled by Tsien et al.13 in 1967, and for radium brachytherapy, compiled by Stovall et al.14 in 1972. In the mid-to-late 1960s, Sterling15 and van de Geijn16 began to write the first three-dimensional (3D) treatment planning programs. These preceded the development of computed tomography (CT), which considerably limited the clinical application of their programs. It was not until CT units became prevalent in the mid-1970s that clinically practical in-house 3D treatment planning systems began to emerge in a few academic institutions.
Some of the major advances made with these in-house 3D systems included the development of:
• "beam's-eye-view" (BEV) by Reinstein,17 McShan,18 and colleagues,
• BEV using digitally reconstructed radiographs by Goitein, Abrams, and colleagues,19
• further 3D planning tools29 by Fraass,30 Purdy,31 Mohan,32 and colleagues, to name but a few,
as well as the development of 3D treatment planning algorithms to speed up the acquisition of data and/or the calculation of dose distributions, for photons by such physicists as Boyer and Mok,33 Mackie et al.,34 and Mohan et al.,35 and for electrons by Hogstrom et al.36 and Jette et al.37 Also, many physicists have developed applications of Monte Carlo methods to treatment planning, with much of the pioneering work done by Mackie et al.38 During this period, van de Geijn and Sherouse distributed 3D planning software which, like Cunningham's programs, was provided free to physicists for use on their in-house computers. These were the EXTDOS (Ref. 39) and GRATIS (Ref. 40) programs, respectively. The latter was developed as a component of the treatment planning collaborative working group research contracts supported by the U.S. National Cancer Institute (NCI). Readers are referred to Sec. III B 1 later in this review for some of the significant advances made as a result of this NCI initiative, and to the May 1991 Special Issue of the International Journal of Radiation Oncology, Biology, Physics, edited by Smith and Purdy,41 for a report of the results of this working group.
Also in the 1980s, medical physicists in the Nordic region collaborated on the development of 3D treatment planning as part of the Nordic R&D Program on Computer Aided Radiotherapy,42 which ultimately led to the production of the Helax treatment planning system. Numerous commercial 3D treatment planning systems soon appeared, and it was during this period that physicists began to develop optimization methods for treatment planning. Today, the term treatment planning is significantly more far-reaching and complex than it was just 20 years ago. It now includes not simply deciding how to combine beams of radiation to maximize the dose to the tumor and minimize doses to surrounding normal tissues, but also how to account for intra- and interfractional motion, applications of image guidance, and physical and biological optimization. Developments in the optimization of dose distributions and of delivery techniques to achieve these distributions in clinical practice, and methods to "score" rival treatment plans and drive inverse-planning algorithms, constitute most of the remainder of this review.

III. OPTIMIZATION
Shortly after the development of the first computer programs for (2D) radiation treatment planning, medical physicists began to envisage the use of computers for treatment plan optimization.43-46 It must be mentioned that the term "optimization" means different things to different people. Medical physicists and clinicians sometimes interpret optimization relatively loosely as the (iterative) improvement of radiation treatment plans. Mathematicians, who have also been involved with radiation treatment plan optimization, use a more rigorous and more ambitious definition: For them, optimization means striving for nothing less than the best possible solution to a (treatment planning) problem, possibly within certain error bounds.
In either case, an optimization problem is defined by an objective function, which, in our case, measures the quality of a treatment plan, and by a number of constraints. One then also needs an algorithm to find a solution. In radiation treatment planning, defining the objective function and the constraints is at least as difficult a task as developing an algorithm to find the solution. The difficulty of defining a "good" objective function, sometimes called a score function or cost function, was recognized even in the early days of treatment plan optimization.43 It is related to the fact that radiation therapy is not an exact science: Treatment goals may vary from one hospital to the next, and sometimes among clinicians in the same hospital. More importantly, individual clinicians often find it difficult to formulate a complete, unique optimization representation of treatment planning, even though they are capable of ranking individually prepared plans. This has been expressed as "I know it when I see it."47 The important question of how medical physicists define treatment objectives and constraints will be discussed in Sec. III B, where we present efforts to rationalize the choice of the objective function and constraints. First, however, we will try to shed some light on the close bonds between treatment plan optimization and IMRT.
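As an illustration of this structure, the following toy sketch (all numbers invented; a real system would obtain the dose-influence matrix from a dose calculation engine) defines a plan as a vector of beamlet weights and scores it with a simple quadratic objective:

```python
import numpy as np

# A plan is a vector of beamlet weights x >= 0 (the constraint).
# Dose is assumed linear in the weights: d = D @ x, where D is a
# hypothetical dose-influence matrix (one row per voxel).
D = np.array([[1.0, 0.2],    # target voxel
              [0.3, 0.1]])   # organ-at-risk (OAR) voxel
prescription = 10.0          # desired target dose
oar_weight = 0.5             # relative importance of OAR sparing

def objective(x):
    """Lower is better: penalize target deviation and any OAR dose."""
    d = D @ np.asarray(x, dtype=float)
    return (d[0] - prescription) ** 2 + oar_weight * d[1] ** 2
```

Everything contentious in treatment planning hides in the choice of the prescription, the weight, and the functional form; the algorithm that minimizes such an objective is, by comparison, the easy part.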

III.A. Plan optimization and IMRT
Even though the first optimization approaches were introduced into radiation treatment planning as early as the 1960s, they were never used much in clinical 2D or 3D planning for conventional treatments. This changed profoundly with the development of IMRT. Whereas practically no commercial 2D or 3D treatment planning system employed optimization techniques, virtually all commercial IMRT planning systems are based on optimization. In the journal Medical Physics, only 13 optimization-related articles were published in the pre-IMRT era (between 1974 and 1990), but 479 optimization articles appeared in the IMRT era (between 1991 and 2007) (source: ISI Web of Knowledge; topic: optimization). A high percentage of the more recent articles were on IMRT optimization, even though optimization techniques have, in recent years, also been used more frequently than before in other therapy modalities (brachytherapy) and in imaging.
What is different about IMRT, and why does IMRT rely more on optimization than other radiation therapy modalities? This question has been addressed in many reviews48-50 and was discussed extensively at the IMRT Summer School of the AAPM in 2003 (see specifically the chapter by Langer51 therein). A simple answer is that IMRT has many more degrees of freedom than conventional radiation therapy techniques, even when compared with advanced 3D conformal radiotherapy. When optimizing intensity maps, there are on the order of 1000 adjustable beamlet intensities (these variables are called decision variables in optimization jargon) per treatment plan. The higher number of degrees of freedom gives IMRT much greater flexibility in shaping spatial radiation dose distributions, and it especially provides the ability to produce convex-concave dose distributions that "wrap around" critical structures such as the spinal cord, the parotid glands (for head and neck tumors), and the rectum (in the case of prostate treatments). However, it is simply too big a task for a treatment planner to adjust all these variables manually.
One can, therefore, say that medical physicists have used computerized optimization in IMRT as a tool to handle the large number of variables. Today in the clinic, IMRT optimization is routinely used as a "meta-optimization" or "inner loop optimization" procedure. Treatment planners do not really expect planning systems to devise truly optimal treatment plans. Instead, they want the systems to create plans that closely match certain prescription and tolerance dose levels, which are based on experience. The most suitable treatment plan is then still found in a "human iteration loop," in which the planner explores the unavoidable tradeoffs between applying high doses of radiation to the tumor target volume and sparing the surrounding healthy tissues. This is typically done by readjusting the weights or importance of the various tissue structures involved, and by tweaking the prescription and tolerance dose levels. This trial-and-error approach is not so different from the conventional treatment planning process. IMRT treatment planning is still a judgment call.
Medical physicists have devised methods to objectify IMRT treatment planning and have developed approaches toward truly optimal IMRT treatment planning, without the human iteration loop. Some of these methods will be discussed in Sec. III B, although only very few of them have found their way into the clinic. For the past 20 years, most medical physicists have focused on the development and streamlining of the current two-stage IMRT approach (computerized "inner" optimization plus human outer iteration loop).

III.A.1. Optimization of IMRT intensity maps
IMRT was invented and theoretically advanced in Europe. However, its translation into the clinic and the first clinical treatments proceeded in the USA.52,53 The medical physicist who has undoubtedly made the biggest contribution to the invention and early promotion of the idea of IMRT is Brahme. He and his collaborators realized that conforming a radiation dose distribution to an abstract donut-shaped target volume requires beams of radiation with nonuniform, continuously modulated intensities.54 More importantly, he realized that modulating the intensity of radiation beams could improve dose distributions in many clinically relevant cases with concave target volumes.
A central component of an IMRT system is the algorithm that calculates the intensities of the beams that will generate the desired dose distribution. To solve this inverse problem, analytical methods were initially developed.54-56 Because the analytical techniques were restricted to geometrically simple model cases, numerical methods were subsequently pursued. These included deconvolution approaches, which deconvolved a rotational dose kernel from the desired dose distribution, either through division by the transfer function in Fourier space, or by use of iterative techniques.57-59 The inverse problem of IMRT has no exact solution. It is physically impossible to deliver a prescribed radiation dose uniformly to a tumor target volume and completely spare the surrounding healthy tissues. If such a solution were to exist, it would require unphysical beams of negative intensity that subtract radiation from the patient. Medical physicists have therefore formulated the IMRT planning problem as an optimization problem, as discussed above. They have spent considerable time and effort, and written many papers too numerous to cite here, on the development and refinement of iterative algorithms to solve this optimization problem (see Ref. 60 for a review of some of the early developments). Most of these algorithms are based on the techniques of simulated annealing61 or variations of the gradient descent technique.62 Mathematicians have also been involved in the formulation and solution of the IMRT planning problem,63 but not until recently has there been a more concerted effort to join forces between medical physics and optimization experts from operations research.47
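A minimal sketch of the gradient-descent flavor of such iterative algorithms, with the nonnegativity constraint that rules out the unphysical negative intensities mentioned above (random toy data, not any published author's formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
n_vox, n_beam = 20, 8
D = rng.uniform(0.0, 1.0, (n_vox, n_beam))  # toy dose-influence matrix
d_presc = np.full(n_vox, 10.0)              # prescribed dose per voxel

def objective(x):
    return np.sum((D @ x - d_presc) ** 2)

# Projected gradient descent: after each step, clip negative beamlet
# intensities to zero, since negative fluence cannot be delivered.
x = np.zeros(n_beam)
step = 1e-3
for _ in range(2000):
    grad = 2.0 * D.T @ (D @ x - d_presc)
    x = np.maximum(x - step * grad, 0.0)
```

Because the beamlets cannot "subtract" dose, the residual objective is generally nonzero even at convergence, which is the practical face of the inverse problem having no exact solution.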

III.A.2. Optimizing IMRT delivery
Medical physicists have devised many practical methods to modulate the intensity (or rather fluence) of radiation therapy beams, so as to make IMRT possible in the clinic. Today the most common method is the use of the multileaf collimator (MLC), which was originally developed for variable field shaping, not for intensity modulation. Convery and Rosenbloom64 published a breakthrough article that paved the way for this approach. They showed how a unidirectional "sweep" motion of the MLC leaves could produce arbitrary intensity maps in a relatively efficient way. Both "step and shoot"65,66 and "dynamic"67,68 IMRT delivery techniques of this type have been developed and are in clinical use today. Many refinements of these techniques have subsequently been proposed and implemented by medical physicists.
The "standard model" of IMRT delivery with an MLC is a two-stage optimization approach. The first stage is the optimization of intensity maps that will nearly produce the prescribed dose in the tumor target volume and keep the dose in the surrounding healthy tissues within tolerance. In the second stage, the set of MLC leaf settings (the "segments") is optimized such that the resulting leaf sequence reproduces the required intensity map as closely as possible. This "divide and conquer" strategy yields good dose distributions, but it can lead to an undesirably high number of MLC segments. In response, medical physicists developed the direct aperture optimization technique, which optimizes the MLC leaf settings directly in a single stage.69,70 One difficulty in the standard IMRT approach is the choice of beam angles. Optimized IMRT beam angle selection is not a simple problem and, in spite of many attempts, it has not yet been satisfactorily solved. In clinical practice, beam angles are often selected manually based on some geometric reasoning, or they are evenly distributed.
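The leaf-sequencing stage can be illustrated with a simplified version of the unidirectional sweep idea (a sketch of the general principle, not Convery and Rosenbloom's exact algorithm): for a discrete 1D intensity profile, leaf-opening events are placed where the intensity rises and leaf-closing events where it falls, and pairing them in sorted order yields unit-weight apertures whose sum reproduces the profile.

```python
def sweep_segments(profile):
    """Decompose a 1D integer intensity profile into unit-weight MLC
    apertures [left, right). The number of apertures (monitor units)
    equals the sum of the positive intensity increments."""
    p = [0] + list(profile) + [0]        # pad so edges count as increments
    opens, closes = [], []
    for i in range(1, len(p)):
        diff = p[i] - p[i - 1]
        if diff > 0:
            opens += [i - 1] * diff      # left leaf uncovers this position
        elif diff < 0:
            closes += [i - 1] * (-diff)  # right leaf covers from here on
    return list(zip(sorted(opens), sorted(closes)))

def render(segments, n):
    """Re-deliver the segments to check that they add up to the profile."""
    out = [0] * n
    for left, right in segments:
        for i in range(left, right):
            out[i] += 1
    return out
```

For example, `sweep_segments([1, 3, 2, 0, 1])` yields four apertures that render back exactly to the original profile; minimizing the number of apertures, rather than just reproducing the map, is where the real sequencing and direct aperture optimization literature begins.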
The question of where to place the incoming IMRT beams can be avoided altogether by using a rotational IMRT approach. Yu71 put forward the idea of intensity modulated arc therapy, in which the gantry rotates continuously during IMRT delivery. Several gantry rotations or arcs are performed to achieve the desired degree of intensity modulation at each angle. Medical physicists later discovered that even a single gantry rotation with dynamically varying MLC segments can be sufficient. One may argue whether or not this is still IMRT, but what matters more is that the results look promising and the treatment can be delivered in less time with this method. To the best of our knowledge, this idea of single-arc IMRT was first proposed in 2001 by Boyer of Stanford University, and was first published in 2005.72 See also Refs. 73 and 74.

III.A.3. IMRT delivery with dedicated equipment and nonstandard types of radiation
Medical physicists have been involved with the development of treatment machines and hardware specifically designed and optimized for IMRT. In fact, in the early days of IMRT the general belief was that IMRT could be delivered only with dedicated hardware. The most important development of this type, and the one with the highest clinical impact, has been Tomotherapy, which means slice-based therapy. Tomotherapy was invented by Mackie and colleagues and was first published in 1993.75 However, as early as 1992, at a World Health Organization conference in Geneva, Carol presented the first Tomotherapy prototype,52 which was based on the concepts developed by Mackie et al. The heart of this device was Carol's temporally modulated collimator system, called MIMiC, which was an add-on to existing linear accelerators and could deliver two parallel intensity-modulated fan beams. Extended target volumes could be treated by moving the couch. Also presented by Carol was the associated innovative treatment planning system (Peacock), which utilized the finite pencil beam algorithm of Bourland.76 With this machine the very first IMRT treatment was delivered to a patient in 1994. Other unique IMRT delivery devices that were envisaged but not used clinically include the (indirectly) magnetically scanned photon beam77 and the "shuttling MLC."78 The idea of IMRT has also recently been applied to other types of radiation, especially to proton therapy, where it has been termed IMPT (intensity modulated proton therapy).79,80 Lomax et al.81 also demonstrated that the added flexibility of proton therapy can be used to better protect intensity modulated treatment plans against delivery uncertainties.

III.A.4. Optimization including motion and uncertainty
Various kinds of uncertainty in the treatment planning and delivery chain, including motion, have traditionally been accounted for by appropriate safety margins. The use of margins was formalized by the ICRU, which led to the concept of the planning target volume (PTV). The PTV concept does guarantee target coverage as long as the uncertainties and motion do not exceed a certain threshold, but it may unnecessarily overdose healthy tissues. Medical physicists have therefore put a lot of effort, especially in recent years, into reducing PTV margins through image guided and adaptive treatment regimes. Furthermore, medical physicists have recently investigated the direct inclusion of uncertainty or motion in the optimization of an IMRT treatment plan. Here, motion or uncertainty is described by a mathematical model, which is incorporated into the formulation of the IMRT optimization problem and is automatically accounted for in the design of the intensity maps. Once this approach finds its way into the clinic, the manual definition of the PTV margin will become obsolete.
Although motion can sometimes be considered as uncertainty in the patient's geometry, a distinction has to be made between predictable motion and uncertainty. If motion is predictable over the course of radiotherapy, the dose delivered to the patient can be calculated deterministically, even though its calculation is computationally more demanding. Consequently, the inclusion of predictable motion in IMRT planning does not represent a change of paradigm with respect to the formulation of the optimization problem, which is typically based on an objective function that depends on the dose distribution. The known delivered dose distribution incorporating the motion can be included in the objective function, which can then be optimized in the standard way. Applications of this approach include regular breathing motion82 and interfractional random errors assuming a large number of fractions.83-85 Calculation of the dose distribution in the presence of motion has often been approximated as a convolution of a static dose cloud with a probability distribution that describes the possible locations of the volumes of interest.86 Including predictable motion in IMRT treatment planning can potentially improve the sparing of adjacent healthy tissues compared to a safety margin approach. This can be understood as follows: The typical effect of motion is a "blurring" of the dose, leading to a dose "shoulder," which in turn potentially underdoses the periphery of the tumor target volume. Adding margins improves the target coverage by moving the shoulder region further outside, but it also pushes the high-dose region further into the surrounding healthy tissues. Instead of adding margins that extend into the normal tissues, IMRT motion optimization tends to add "horns" to the intensity profiles to compensate for the dose shoulder effect. This also affects the surrounding normal structures to some degree, but less than margins do.
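The convolution approximation of motion blurring is easy to sketch in one dimension (all numbers invented for illustration):

```python
import numpy as np

# Static dose profile along the motion direction: a flat 10 Gy across
# the target, zero outside (hypothetical values).
static_dose = np.array([0, 0, 10, 10, 10, 10, 10, 0, 0], dtype=float)

# Probability distribution of target position offsets (-1, 0, +1 voxel),
# e.g. from a breathing or setup-error model; the weights sum to one.
pdf = np.array([0.25, 0.5, 0.25])

# Expected (blurred) dose = static dose cloud convolved with the
# position probability distribution: the field edges develop the dose
# "shoulder" discussed in the text, while the central plateau is intact.
blurred = np.convolve(static_dose, pdf, mode="same")
```

Note that the convolution redistributes dose but conserves it: the integral of the blurred profile equals that of the static one, which is why compensating "horns" at the field edges, rather than wider margins, can restore target coverage.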
In contrast to predictable motion, the term uncertainty implies that the dose distribution that will be delivered to the patient is unknown at the time of treatment planning. For example, all types of systematic error represent uncertainty that cannot be approximated as predictable motion. In recent years, different approaches have been proposed to address this issue. Common to all of them is that the delivered dose depends on a set of uncertain parameters that may vary within some interval. For example, an uncertain parameter could simply be the location of the target in a fraction ͑setup error͒, or the relative amount of time the tumor remains at its exhale position ͑respiratory motion͒.
In the probabilistic approach, a probability distribution is assigned to the uncertain parameters. Consequently, the delivered dose and the value of the objective function become random variables. Uncertainties are then accounted for by optimizing the expected value of the objective function. As an alternative, robust optimization techniques, adopted from the field of operations research, have been applied to IMRT optimization. These methods can be used to optimize treatment plans that are robust with respect to a worst-case scenario. For example: Optimize a treatment plan subject to the constraint that every tumor voxel receives the prescribed dose for every realization of the uncertain parameters.
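The two formulations differ mainly in how the per-scenario objective values are aggregated, as the following sketch shows (scenario dose matrices and probabilities are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_vox, n_beam = 10, 6
# Each uncertainty scenario (e.g. a setup shift) has its own toy
# dose-influence matrix; the plan x is shared across all scenarios.
scenarios = [rng.uniform(0.0, 1.0, (n_vox, n_beam)) for _ in range(3)]
probs = np.array([0.5, 0.3, 0.2])   # scenario probabilities
d_presc = np.full(n_vox, 10.0)

def per_scenario(x):
    return np.array([np.sum((D @ x - d_presc) ** 2) for D in scenarios])

def expected_objective(x):    # probabilistic approach
    return float(probs @ per_scenario(x))

def worst_case_objective(x):  # robust (worst-case) approach
    return float(per_scenario(x).max())
```

Either surrogate can then be handed to the same iterative machinery used for nominal IMRT optimization; the choice between them is a modeling decision, not an algorithmic one.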
The probabilistic approach has been demonstrated for random and systematic setup errors in idealized geometries,87,88 and has been applied to interfractional motion of the prostate89,90 and to variations in the breathing pattern. An application of robust optimization to uncertainty in respiratory motion patterns can be found in Ref. 91. A more recent application of both probabilistic and robust techniques considers range uncertainty in intensity modulated proton therapy.92,93 The work of Chu et al.94 and Olafsson and Wright95 combines elements of the probabilistic approach (i.e., assuming a probability distribution for uncertain parameters) with robust programming techniques.
Robust and probabilistic approaches are very general and can be adapted to cope with a large variety of types of uncertainty. However, this generality may come at increased computational cost. It is therefore desirable to find simplifications for specific applications. One example is the coverage probability approach,96,97 which handles systematic organ displacement.

III.A.5. Other recent developments: Multicriteria IMRT optimization
A general difficulty with current IMRT optimization techniques is that they characterize a treatment plan by a scalar objective function, i.e., a single score. This does not reflect the clinical decision-making process, which requires tradeoffs among several objectives and constraints. As a consequence, current IMRT planning systems may yield plans that are mathematically optimal (maximal score) but not clinically acceptable.
As discussed above, the creation of a suitable treatment plan often requires the physician and treatment planner to run through human iteration loops, during which the treatment planner iteratively adjusts optimization parameters (constraints and so-called "weight factors") to steer the plan in the direction intended by the physician. Because the weight factors have no clinical meaning and are defined on an arbitrary scale, it requires experience on the part of the treatment planner to adjust the weights (the input parameters) in order to achieve the desired output. This plan-tweaking phase in the human iteration loop between physician and planner can be very time consuming.
The idea of multicriteria optimization in IMRT is to control the output parameters directly. Instead of a single score, one defines several objective functions: For example, one for each critical structure and one for the target coverage. Each of these represents an output parameter. A central notion in multicriteria optimization is Pareto optimality. A Pareto optimal plan is one in which one cannot improve one objective ͑e.g., the sparing of a critical structure͒, without making at least one other objective worse. The set of all Pareto optimal plans is called the Pareto surface. One can think of the Pareto surface as a high dimensional tradeoff curve. In two dimensions it is a traditional tradeoff curve, say, between target coverage and sparing of a nearby critical structure. Multicriteria IMRT optimization thus extends the tradeoff discussion to higher dimensions, i.e., several critical structures and target volumes, which is more clinically meaningful.
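Pareto optimality is straightforward to express in code. The sketch below filters a set of hypothetical candidate plans, each summarized by two objective values to be minimized, down to the nondominated ones:

```python
def pareto_optimal(plans):
    """Return the plans not dominated by any other plan.

    Each plan is a tuple of objective values, all to be minimized,
    e.g. (target coverage error, OAR dose). Plan a dominates plan b
    if a is no worse in every objective and strictly better in one."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [p for p in plans if not any(dominates(q, p) for q in plans)]

# Hypothetical candidates: (target coverage error, rectum mean dose)
plans = [(1.0, 8.0), (2.0, 5.0), (3.0, 3.0), (2.5, 6.0), (4.0, 3.5)]
front = pareto_optimal(plans)  # the 2D tradeoff curve of the text
```

Here the front retains only the plans on the tradeoff curve; the dominated candidates are exactly those no clinician should ever be asked to choose, since a strictly better alternative exists.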
Multicriteria optimization was first proposed for IMRT optimization by Yu.98 Various methods to find suitable plans on the Pareto surface have since been proposed. They include interactive "navigation" on the Pareto surface,99,100 which could be done directly by a physician and would eliminate the step of "bargaining" (the human iteration loop) between the physician, the treatment planner, and the IMRT optimization system. Another approach uses a prioritization of the various objectives, working on the most important objective first and then proceeding step by step toward the less important ones. This approach is known as preemptive optimization or lexicographic ordering in operations research.101,102

III.B. Modeling treatment objectives
The goal of (radical) radiation treatment is to eradicate tumor tissue while limiting the side effects of radiation to clinically acceptable levels. Translating this seemingly straightforward goal into technically and physically feasible radiation treatment delivery has proven to be a formidable task. Although it is not difficult for an experienced treatment planner to tell whether a dose distribution is good or bad vis-à-vis stated treatment objectives, a complete mathematical description of a clinically good plan is very difficult. Defining an optimal dose distribution is even more challenging.
The simplest, but still clinically useful, form of the objective function is no objective function at all. That is, the optimization problem is reduced to finding any feasible solution, i.e., a solution that satisfies all constraints. The quality of the resulting plan is then completely determined by the constraints. If they are too tight (say, the constraints on the maximum and minimum target dose), there may not be a feasible solution at all. If they are too loose, the feasible solution picked by the algorithm may represent an unsatisfactory dose distribution. An experienced planner can nevertheless generate a very good plan even with this relatively simple optimization system.
Until recently, the mathematical form of the optimization problem was determined and constrained by the available optimization algorithms. In the early days, the only reliable optimization methods were those based on the Simplex algorithm, which could be applied only to linear or quadratic optimization problems.45,103 The advantage of the Simplex algorithm is that the (mathematically) optimal solution is found in a limited (and reasonably small) number of iterations, and the solution is guaranteed to be the best possible. Several popular forms of the objective function can be cast as linear optimization problems, including minimax problems (e.g., maximizing the minimum target dose or minimizing the maximum normal tissue dose).
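As an illustration, the minimax form mentioned above, maximizing the minimum target-voxel dose subject to a hard maximum-dose constraint on an organ at risk, can be written as a linear program by introducing an auxiliary variable t (toy numbers, solved here with SciPy's linprog rather than a hand-rolled Simplex):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical dose per unit beamlet weight (rows: 2 target voxels).
D_target = np.array([[1.0, 0.4],
                     [0.5, 1.0]])
D_oar = np.array([[0.3, 0.3]])   # one organ-at-risk voxel
oar_max = 3.0                    # hard OAR dose constraint

# Variables: [x1, x2, t]; minimizing -t maximizes t = min target dose.
c = np.array([0.0, 0.0, -1.0])
A_ub = np.vstack([
    np.hstack([-D_target, np.ones((2, 1))]),  # t <= dose in each target voxel
    np.hstack([D_oar, np.zeros((1, 1))]),     # OAR dose <= oar_max
])
b_ub = np.array([0.0, 0.0, oar_max])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
min_target_dose = -res.fun  # global optimum of the linear program
```

Because the problem is linear, the solver's answer carries exactly the guarantee described in the text: it is provably the best achievable value, not merely a good local solution.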

III.B.1. Early developments
It is important to emphasize that significant developments in the optimization of treatment planning (including biological modeling and beam direction optimization) began as soon as computers became available for treatment planning in the 1960s. A few examples of this early work are described below.
Although the task was not easy, several important elements of a good dose distribution were defined in mathematical terms suitable for computerized optimization in the late 1960s. 43,44,103,104 They can be summarized as follows: (1) the dose gradient across the target volume should be minimized, (2) the ratio of the integral target dose to the integral body dose should be maximized, (3) the integral body dose should be minimized, (4) the shape of the high dose region should conform to the shape of the target volume, (5) the integral dose to normal structures should be minimized, and (6) the maximum dose region should be within the target volume. For example, the objective function of Hope et al. 104 was defined as a weighted sum of these six criteria, where each criterion was described using a power-law formula (KC)^n, in which K is a normalization constant, C is the value of the function describing the criterion, and the power n describes its relative importance.
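A composite objective of this power-law form can be sketched directly from the description above. The criteria values, normalization constants, and powers below are hypothetical placeholders, not the values used by Hope et al.

```python
# Hope-style composite objective: each criterion value C is scaled by a
# normalization constant K and raised to a power n expressing its
# relative importance; the terms are then summed (lower is better).
def composite_score(criteria):
    """criteria: list of (C, K, n) tuples -> sum of (K*C)**n terms."""
    return sum((K * C) ** n for C, K, n in criteria)

# Hypothetical example with two of the six criteria, e.g. dose gradient
# across the target and integral body dose (illustrative numbers only):
plan_a = [(0.05, 10.0, 2), (0.30, 2.0, 1)]
plan_b = [(0.20, 10.0, 2), (0.25, 2.0, 1)]
# composite_score(plan_a) = 0.25 + 0.6 = 0.85, so plan A is preferred.
```

Raising a criterion to a higher power n makes the objective increasingly intolerant of large violations of that criterion, which is how relative importance was encoded.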
The problem of finding optimal beam directions has also been tackled since the early days of computerized planning. Bahr et al. 44 used a linear programming algorithm to select a subset of promising beam directions from a set of 72 predefined beams equally spaced around the patient's body. The objective function maximized the dose gradient just outside the target volume. For the selected subset of beams the optimization procedure was repeated to find the optimal weights and wedge angles. A similar two-step optimization approach was used by Van Laarse and Strackee. 105 A set of "optimal" beam directions was found using a simpler objective function, and then the beam weights and wedge angles were optimized using a more refined beam model and objective function.
One of the first optimization methods that explicitly used biological modeling of tissue response to radiation was developed by Cooper. 106 The objective function combined a radiobiological model of normal tissue damage developed by Worthley 107 with a dosimetric function describing the dose gradient across the target volume. The objective function was nonlinear, forcing the use of a nonlinear optimization algorithm. A fairly sophisticated biological model for dose distribution optimization was proposed by Graffman. 108 The objective function was a combination of a Poisson-based function describing the expected number of surviving clonogens and a sigmoid function describing the tolerance of normal tissues. Optimization of dose distribution accounting for fractionation effects was investigated by Mistry and DeGinder. 109 The authors used the concept of a relative radiation effect ratio, defined for each voxel as the ratio of partial tolerance, using equations developed by Orton and Ellis, 110 to the nominal standard dose introduced by Ellis. 111 A valuable consequence of accounting for fractionation effects was a demonstration of the importance of delivering all prescribed fields daily. The authors estimated that the detriment in biological effectiveness for a box technique could be as high as 10% if only two of the four fields were delivered daily. Another interesting approach to optimization combining biological models and dosimetric metrics was introduced by Dritschilo et al. 112 and further developed by Wolbarst et al. 113 The objective function was based on the concept of a complication probability factor (CPF) that described only the normal structures. For a given target dose the optimal plan minimized the CPF, which was defined as the sum of the products P(D)V(D), where P(D) was the probability of complication for a given dose D, V(D) was the volume of a normal structure that received dose D, and the sum was taken over all quantized dose levels.
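The CPF definition above translates almost verbatim into code: sum P(D)V(D) over the quantized dose levels of a normal structure. The toy complication-probability model and the volume histogram below are hypothetical illustrations only.

```python
# Complication probability factor: CPF = sum over quantized dose levels
# of P(D) * V(D), where V(D) is the volume fraction of the normal
# structure receiving dose level D.
def cpf(dose_levels, prob_of_complication, volume_at_dose):
    return sum(prob_of_complication(D) * volume_at_dose[D]
               for D in dose_levels)

# Hypothetical toy model: complication probability rises linearly from
# 0 at 40 Gy to 1 at 70 Gy (a real model would use a sigmoid).
def p_toy(D):
    return max(0.0, min(1.0, (D - 40.0) / 30.0))

# Hypothetical quantized dose-volume data for one normal structure.
volumes = {20: 0.5, 45: 0.3, 60: 0.2}   # dose level (Gy) -> volume fraction
score = cpf(volumes.keys(), p_toy, volumes)
```

A plan-comparison loop would simply recompute this score for each candidate plan's dose-volume data and pick the plan with the smallest CPF.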
Significant progress and contributions to our understanding (and modeling) of volume effects for both normal and tumor tissues were initiated in 1981 by the NCI by issuing a request for proposals for the evaluation of treatment planning for particle beam radiotherapy. Particularly influential contributions from the NCI contract (completed in 1986) were developments of algorithms for estimating normal tissue complication probability (NTCP) by Lyman at the Lawrence Berkeley Laboratory and for estimating tumor control probability (TCP) by Goitein at the Massachusetts General Hospital. Under this NCI contract a systematic listing of normal tissue tolerances was developed and a method was outlined to carry out optimization of three-dimensional dose distributions using the TCP and NTCP models. 114 Some of the ideas developed under the particle beam therapy contract have been further refined by the NCI funded Collaborative Working Group which was formed to perform an "Evaluation of High Energy Photon External Beam Treatment Planning." The group included investigators from the University of Pennsylvania and Fox Chase Cancer Center in Philadelphia, Memorial Sloan-Kettering Cancer Center in New York, Mallinckrodt Institute of Radiology in St. Louis, and Massachusetts General Hospital in Boston. The report was published in 1991 as a separate issue of the International Journal of Radiation Oncology, Biology, Physics. 41 This report included one of the most cited papers in the history of radiation oncology, this being a review of (sparse) clinical data on tolerance of normal tissue performed by a group chaired by Emami.
One of the consequences of these two NCI-sponsored collaborative developments was increased interest in applying models of TCP and NTCP to three-dimensional radiotherapy planning. Despite the scarcity of evidence-based data for modeling normal and tumor tissue response to inhomogeneous dose distributions, several groups have demonstrated the usefulness of TCP and NTCP models for treatment plan evaluation and optimization.
One could argue that not much progress has been made in the area of defining treatment objectives for plan optimization since the early days of computerized treatment planning. A critical observer might even suggest that the advent of IMRT in the 1980s temporarily dampened progress. The potential of intensity modulation and dose painting was often highlighted by showing fancifully shaped dose distributions (e.g., painting the words "TOMO" or "IMRT"). Some even declared that radiation treatment planning was a solved "inverse problem" because an inverse planning algorithm can give us any dose distribution we wish. The choice of objective function seemed to be solved as well, since practically all early IMRT investigations used some form of least-squares dose metric to find the optimal dose distribution. Fortunately, when those early IMRT algorithms were applied to actual clinical problems, the limitations of the one-step "inverse solution" became apparent. Clearly, optimization of radiation treatment planning is a multifaceted problem that requires incorporation of dosimetric, biological, clinical, and technical considerations. The planning objectives should also incorporate uncertainties in our measurements, calculations, and models, as well as uncertainties in our knowledge.
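The least-squares dose metric that dominated early IMRT work is, in its minimal form, just the sum of squared deviations between computed and prescribed voxel doses. The voxel values below are hypothetical.

```python
# Minimal least-squares dose metric: sum over voxels of the squared
# difference between the computed and the prescribed dose.
def least_squares_metric(computed, prescribed):
    return sum((d - p) ** 2 for d, p in zip(computed, prescribed))

# Hypothetical voxel doses: two target voxels prescribed 70 Gy and one
# normal-tissue voxel "prescribed" 0 Gy.
prescription = [70.0, 70.0, 0.0]
plan = [68.0, 71.0, 12.0]
score = least_squares_metric(plan, prescription)   # 4 + 1 + 144 = 149
```

Note how the physically unachievable zero-dose prescription outside the target dominates the score (144 of 149), illustrating why a metric of this form can steer the optimizer away from clinically sensible trade-offs.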

III.B.2. Some current issues
Despite tremendous advances in cancer biology and imaging, techniques for defining target volume(s) remain an art form. Even experts differ in their approaches to target definition, as has been clearly demonstrated by several studies; see, for example, Hong et al. 115 The consequence of this uncertainty is that a plan that is good for one person's target definition is likely to be bad for someone else's. Specifically, an effort to achieve a very conformal dose distribution for one observer's target can create serious and clinically unacceptable underdosing of another observer's target. The role of physicists is somewhat restricted, as they neither define target volumes nor set the dose levels for the target or critical normal structures. One might even argue that the clinician sets the treatment objectives that are "the best" for a given patient and, therefore, the role of the planner is to create a dose distribution that is "as close as possible" to the plan prescribed by the clinician. However, following this strategy is not as straightforward as it seems. First, a prescription of a desired dose distribution is never so detailed as to leave no room for the active planning process (i.e., trying various beam setups and weights, manually or with computer assistance). Second, there are various ways to quantify the goal of being as close as possible to the prescription. For example, it is easy to demonstrate that the often-used least-squares distance metric can lead to obviously wrong results (although it can be quite useful in other circumstances).
The mathematical underpinnings of IMRT were taken directly from image reconstruction algorithms (mostly back-projection) used in computed tomography; that is when the term "inverse planning" was coined. There is, however, an important difference between image reconstruction in CT and inverse planning in radiation therapy (RT) using the same mathematical algorithm. In CT, the objects being reconstructed exist because they are real anatomical structures. In RT, the corresponding objects (i.e., the volumetric dose distributions for volumes of interest prescribed by a planner) may not exist (i.e., they may be physically impossible to achieve). In the early days of IMRT the wonders of intensity modulation were often investigated using a C-shaped target volume surrounded by a critical tissue. The "ideal" dose distribution would be a uniform prescribed dose within the target volume and zero dose outside. This ideal dose distribution was then "inverted" to obtain the intensity profiles. Two important practical issues were immediately discovered. First, "solving" the inverse problem in RT requires some intensities to be negative, which is physically impossible. Second, if the intensities are constrained to be non-negative, the resultant dose distribution, although it might be judged conformal (i.e., nicely shaped to follow the target contour), is not particularly good. Specifically, setting the dose outside the target volume to zero results in underdosing the periphery of the target volume and, generally, spreading much dose (albeit conformally) outside the target volume. On one hand, the reason for this disappointing result is the use of an objective that is physically impossible (i.e., an infinitely steep dose gradient). On the other hand, the reason is that "closeness to the prescription" is scored by an objective function that minimizes the distance between the prescribed and the actual dose distributions, while the quality of the resultant dose distribution is ultimately judged using clinical experience.
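The negative-intensity problem can be demonstrated on the smallest possible example: two beams, two voxels, and a desired dose of full prescription in the target voxel and zero next to it. The 2x2 dose-influence matrix below is hypothetical.

```python
# Exact "inversion" of a desired dose can demand negative beam weights;
# clipping them to zero spoils the result. Hypothetical 2x2 example.
A = [[1.0, 0.5],   # dose to the target voxel per unit weight of beams 1, 2
     [0.5, 1.0]]   # dose to the adjacent normal voxel
desired = [70.0, 0.0]   # full target dose, zero dose next door

def solve_2x2(m, b):
    """Cramer's rule for a 2x2 linear system m @ x = b."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (b[0] * m[1][1] - m[0][1] * b[1]) / det
    x1 = (m[0][0] * b[1] - b[0] * m[1][0]) / det
    return [x0, x1]

weights = solve_2x2(A, desired)           # second weight comes out negative
clipped = [max(0.0, w) for w in weights]  # enforce physical non-negativity
dose = [sum(a * w for a, w in zip(row, clipped)) for row in A]
```

The exact inverse requires a beam weight of about -46.7; after clipping, the adjacent voxel still receives substantial dose and the target dose overshoots the prescription, mirroring the behavior described in the text.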
Our understanding of the response of tumors and of normal organs and tissues to radiation treatment remains rather limited. Specifically, it is still not possible to translate our ever-increasing knowledge of the underlying biological mechanisms into a prescription of the optimal radiation treatment plan (i.e., the necessary dose levels for the target(s) and the optimal volumetric and temporal dose distribution). Nevertheless, based on over 100 years of experience with radiation therapy, it is not difficult to create a mathematical program that generates clinically good dose distributions using pragmatic approaches and phenomenological measures of response to radiation. Designing treatment plans that are optimal in a clinically meaningful sense is more difficult and may never be achieved.

IV. ACTIVITIES OF THE AAPM
Since the early 1970s the AAPM has contributed extensively to the education of its members in the fields of treatment planning and optimization. Numerous Summer Schools and conferences have been devoted to these topics, as have several Task Group reports. Of special interest is the series of International Conferences on Dose, Time, and Fractionation in Radiation Oncology initiated by a Task Group of the Biological Effects Committee, starting in 1974 and all held in Madison, WI. The following is a brief review of some of the important topics presented at these meetings and in these reports.

IV.A. First International Conference on Dose, Time, and Fractionation (1974)
The hot topic at the time and the main reason for convening this conference 116 was the NSD concept of Ellis. This was beginning to be used for patient dose calculations around the world but it was evident that the NSD equation was being misapplied and misunderstood by many, thus putting patient treatments at risk. Much of the conference was devoted to educating attendees about the proper application of the NSD equation. In addition, several enhancements to the model were presented including work to account for the volume effect.

IV.B. Second International Conference: Optimization of Cancer Radiotherapy (1984)
As suggested by the title, this conference 117 was dedicated to presentation of research into optimization. The theory and practice of a wide range of fractionation schemes were discussed and newer dose, time, and fractionation models to design them were presented. In the proceedings there is a review of time, dose, and volume models (by Orton), and a paper on time dose fractionation factors for tumors (by Supe). Several previously unpublished physical and biological optimization concepts were introduced at this conference, including new volume effect models (by Herbert, and Schultheiss and Orton), and physical dose distribution optimization (by Chin and Kijewski). There were several papers on biological treatment planning and optimization (by Wolbarst, Schultheiss, and Orton; Wigg and Nicholls; and Shalev) and, of special significance, one of the first proposals for intensity-modulated radiotherapy (by Miller).

IV.C. Third International Conference (1988)
This conference 118 was devoted to the presentation of analytical models used for the prediction of response to fractionated radiotherapy and their physical and biological rationales. Several new dose, time, and fractionation models were presented, discussed, and critiqued, and some new optimization algorithms were introduced. In the proceedings, papers of interest include a critique of the linear-quadratic model (by Herbert), modeling time-dose response of tumors and normal tissues (by van de Geijn), application of the complication probability factor model for evaluation of 3D plans (by Kutcher and Burman), optimization by minimizing regret (by Viggars, Shalev, and Hahn), optimization of dose and dose distribution (by Feldman), and predictions and optimization (by Schultheiss).

IV.D. Fourth International Conference (1992)
This conference 119 was dedicated mainly to presentation of methods for the determination of radiosensitivity and repopulation rates of cancer cells and how these could be incorporated into mathematical models of response. Predictive assays of radiation response and methods to determine cancer cell kinetics and hypoxia were presented. Of special interest for treatment planning were papers on cell kinetic models (by Cohen), a critique of the linear-quadratic (L-Q) model (by Herbert), and use of the L-Q model for lung damage (by Haston and Van Dyk).

IV.E. Fifth International Conference: Volume and Kinetics in Tumor Control and Normal Tissue Complications (1997)
Much of this conference 120 was devoted to applications of the linear-quadratic model for fractionated teletherapy and brachytherapy, and presentation of new physical and biological treatment planning optimization models. Of special interest in the proceedings are a paper on volume effects (by Schultheiss), papers on tumor control probability for nonuniform dose distributions (by Deasy and Niemierko), presentations regarding analysis of the effects of nonuniform dose distributions, and some of the earliest papers on the equivalent uniform dose (EUD) concept (by Niemierko) and inverse treatment planning.

IV.F. Sixth International Conference (2001)
Many of the early papers on optimization of IMRT and estimation of NTCP and TCP were presented at this conference. 121 Applications of NTCP, TCP, and EUD models to the optimal selection of fractionation schemes were discussed. Of special importance to treatment planning in the proceedings are papers on biological comparisons of different treatment techniques for prostate cancer (by Orton and Luxton), optimization of IMRT plans using biologically equivalent uniform dose (by Mohan, Wu, Niemierko, and Schmidt-Ulrich), volume effect implications in IMRT (by Wigg), and the impact of cold spots on TCP for IMRT (by Tomé and Fowler).

IV.G. Seventh International Conference: Physical, Chemical and Biological Targeting (2006)
Of special interest at this conference 122 were papers on models for NTCP and TCP estimation (by Niemierko, van der Kogel, Deasy, Blanco, and El Naqa), and optimization of the temporal pattern of IMRT delivery (by Altman, Deasy, Chmura, and Roeske).

IV.H. AAPM Summer School: Advances in Radiation Therapy Treatment Planning (1982)
This Summer School and subsequent publication 123 presented the frontiers of treatment planning at that time. The monograph contains a review of time-dose models (by Orton), three presentations on inhomogeneity corrections (by Leavitt, Hogstrom, and Cunningham), a presentation on treatment planning for heavy charged particle radiotherapy (by Chen and Goitein), and several presentations concerning CT for planning: on implementation (by Hogstrom), on accuracy requirements (by McCullough), and on the impact and use of CT (by Goitein).

IV.I. AAPM Summer School: Radiation Oncology Physics (1986)
At this Summer School 124 there were presentations on treatment planning computers, including 3D algorithms, planning system QA, and verification of treatment plans. Of special interest are the reviews of treatment planning computers (by Purdy), computer algorithms for photon beams (by Cunningham), 3D planning algorithms (by Siddon), CT simulation (by Galvin et al.), and inhomogeneity corrections (by Cunningham).

IV.J. AAPM Summer School: Advances in Radiation Oncology Physics: Dosimetry, Treatment Planning, and Brachytherapy (1990)
A significant proportion of this Summer School 125 involved treatment planning, including an update on time-dose relationships (by Orton), presentations on 3D treatment planning (by Fraass), electron-beam calculation algorithms (by Hogstrom, Starkschall, and Shui), and brachytherapy treatment planning (by Williamson and Gillin). Of special interest are some of the earliest papers on quantitative plan evaluation, including DVH reduction algorithms, TCP and NTCP calculations (by Kutcher), optimization methods (by Altschuler, Censor, and Powlis), and applications of artificial intelligence (by Kalet).

IV.K. AAPM Summer School: Teletherapy: Present and Future (1996)
About half the presentations at this Summer School 126 were concerned with various aspects of treatment planning. In the proceedings there are reports on 2D versus 3D conformal planning (by Starkschall), experience with 3D planning (by Sailer et al.), quality assurance for 3D treatment planning, segmentation and visualization for treatment planning (by Fraass), and planning algorithms for photons (by Mackie et al.) and electrons (by Hogstrom and Steadham). There is also a review of early developments in treatment evaluation, including DVH analysis, equivalent uniform dose, biological models of tissue response, and scoring plans for optimization (by Niemierko).

IV.M. AAPM Summer School: Intensity-Modulated Radiation Therapy: The State of the Art (2003)
As suggested by the title, this Summer School 128 was entirely devoted to IMRT. Treatment planning presentations included talks on inverse planning (by Censor, Shepard, Earl, Yu, and Xiao), physical optimization (by Bortfeld, Palta, Kim, Li, and Liu), clinical implementation (by Ezzell), and biological indices for plan evaluation and optimization (by Yorke). Other presentations covered imaging for IMRT (by Pelizzari), plan validation (by Xing, Yang, Li, Y. Chen, Luxton, Z. Chen, Song, and Boyer), quality assurance (by Sharpe and Ezzell), and Monte Carlo applications (by Siebers and Mohan).

IV.N. Other AAPM Task Group and Committee Reports
Numerous other reports concerning treatment planning and optimization have been prepared by AAPM Task Groups and Committees. For example, in 1993 a Task Group of the Biological Effects Committee published a report entitled "Quality Assessment and Improvement of Dose Response Models: Some Effects of Study Weaknesses on Study Findings." 129 This report, prepared primarily by Herbert, included a comprehensive review and critique of time/dose relationships employed in treatment planning. In 1995, a Task Group of the Radiation Therapy Committee published a report entitled "Radiation Treatment Planning Dosimetry Verification." 130 This report, prepared by AAPM members Miller, Bloch, Cunningham, Curran, Ibbott, Jones, Jucius, Leavitt, Mohan, and van de Geijn, included software and data that could be used to verify the accuracy of photon beam treatment planning systems. In 1998 another Task Group of the Radiation Therapy Committee published "Quality Assurance for Clinical Radiotherapy Treatment Planning." 131 This was a comprehensive set of QA guidelines prepared by AAPM members Fraass, Doppke, Hunt, Kutcher, Starkschall, Stern, and Van Dyk, for all aspects of treatment planning, including 3D dose calculation algorithms and DVH analysis. Finally, in 2003 the IMRT Subcommittee of the Radiation Therapy Committee published the "Guidance Document on Delivery, Treatment Planning, and Clinical Implementation of IMRT." 132 This report, written by AAPM members Ezzell, Galvin, Low, Palta, Rosen, Sharpe, Xia, Xiao, Xing, and Yu, gave a thorough overview of treatment planning for IMRT and reviewed differences in dose calculations, beam modeling, and planning algorithms between IMRT and conventional treatment planning. Applications of inverse planning were presented as well as methods for commissioning IMRT planning systems for dosimetric accuracy.

V. SUMMARY AND CONCLUSIONS
The involvement of medical physicists in radiotherapy treatment planning and, indeed, the profession of medical physics, began in the 1920s when x-ray beams of sufficient energy to treat deep-seated lesions first became available. However, it was not until the widespread availability of linear accelerators and computers in the 1960s that treatment planning, as we know it today, began to develop, and not until the advent of CT in the 1970s and high-powered computers in the 1980s, that 3D planning was able to evolve. Development of new methods of treatment planning has always been a major component of radiation oncology physics research and, with the arrival of IMRT in the 1990s, the role of the physicist and the AAPM intensified considerably. Without question, radiation oncology has benefited immensely from the active participation of medical physicists and the AAPM in the development and implementation of innovative methods of treatment planning and, especially, optimization.

a) Electronic mail: ortonc@comcast.net
1. International Congress of Radiology International X-Ray Unit Committee, "International x-ray unit of intensity," Br. J. Radiol. 1, 363-364 (1928).