AAPM Medical Physics Practice Guideline 8.a.: Linear accelerator performance tests

Abstract

Purpose: The purpose of this guideline is to provide a list of critical performance tests in order to assist the Qualified Medical Physicist (QMP) in establishing and maintaining a safe and effective quality assurance (QA) program. The performance tests on a linear accelerator (linac) should be selected to fit the clinical patterns of use of the accelerator, and care should be given to perform tests that are relevant to detecting errors related to the specific use of the accelerator.

Methods: A risk assessment was performed on tests from current task group reports on linac QA to highlight those tests that are most effective at maintaining safety and quality for the patient. Recommendations are made on the acquisition of reference or baseline data, the establishment of machine isocenter on a routine basis, basing performance tests on clinical use of the linac, working with vendors to establish QA tests, and performing tests after maintenance.

Results: The recommended tests proposed in this guideline were chosen based on the results of the risk analysis and the consensus of the guideline's committee. The tests are grouped by class of test (e.g., dosimetry, mechanical) and the clinical parameter tested. Implementation notes are included for each test so that the QMP can understand the overall goal of each test.

Conclusion: This guideline will assist the QMP in developing a comprehensive QA program for linacs in the external beam radiation therapy setting. The committee sought to prioritize tests by their implication for quality and patient safety. The QMP is ultimately responsible for implementing appropriate tests. In the spirit of the report of American Association of Physicists in Medicine Task Group 100, individual institutions are encouraged to analyze the risks involved in their own clinical practice and determine which performance tests are relevant in their own radiotherapy clinics.


| INTRODUCTION
A comprehensive quality management program in a radiotherapy clinic utilizing external beam radiation therapy will include performance testing of a linear accelerator (linac). The linac must be tested routinely to ensure that current performance parameters have not deviated from baseline clinical parameters acquired at the time of acceptance of the machine. More importantly, it must be validated that the beam models in the treatment planning system (TPS) are still appropriate for the linac in its current operating state.
The technology and control systems within a linac are rapidly evolving and new features emerge frequently to assist the user in accurately and efficiently treating patients. The specific choice and use of technology on a linac will depend on the types of diseases treated, the clinical workload, and workflow. The performance tests on a linac should be selected to fit the clinical patterns of use of the accelerator and care should be given to perform tests which are relevant to detecting errors related to the specific use of the accelerator. Committee members of this guideline reviewed the current protocols for performance tests on a linac. A risk assessment was performed on currently recommended tests in order to identify those tests which will enable the greatest detection of errors, the delivery of high-quality radiation therapy and reflect the characteristics of modern technology.

| GOALS AND RATIONALE
This report describes dosimetry, mechanical, and safety tests for C-arm type linacs only. Specialized systems such as CyberKnife® or TomoTherapy® are not considered here. The scope of this guideline does not include tests for on-board imaging equipment. Imaging tests are essential in a linac QA program, and they are addressed in previous reports. [3][4][5] Implementation notes are included for each recommended test so that the QMP can understand the overall goal of each test.
However, this guideline is not intended to be a "how to" document. Suggestions will be made on what types of devices are helpful and suitable for measurement, but the choice of measurement equipment and technique is ultimately the responsibility of the QMP.

| INTENDED USERS
The intended users of this report are QMPs who conduct linac performance tests or who design a QA program for linacs and seek to understand the critical tests needed to detect errors and ensure safe, high-quality external beam radiation therapy delivery.
Administrators, manufacturers of linacs, personnel representing accrediting bodies and state regulators are also encouraged to use this guideline as a reference in understanding an institution's use of equipment and necessary tests chosen by the QMP to maintain the equipment.
The American Association of Physicists in Medicine (AAPM) is a nonprofit professional society whose primary purposes are to advance the science, education, and professional practice of medical physics. The AAPM has more than 8000 members and is the principal organization of medical physicists in the United States.
The AAPM will periodically define new practice guidelines for medical physics practice to help advance the science of medical physics and to improve the quality of service to patients throughout the United States. Existing medical physics practice guidelines will be reviewed for revision or renewal, as appropriate, on their fifth anniversary or sooner.
Each medical physics practice guideline represents a policy statement by the AAPM, has undergone a thorough consensus process in which it has been subjected to extensive review, and requires the approval of the Professional Council. The medical physics practice guidelines recognize that the safe and effective use of diagnostic and therapeutic radiology requires specific training, skills, and techniques, as described in each document. Reproduction or modification of the published practice guidelines and technical standards by those entities not providing these services is not authorized.
The following terms are used in the AAPM practice guidelines:
• Must and Must Not: used to indicate that adherence to the recommendation is considered necessary to conform to this practice guideline.
• Should and Should Not: used to indicate a prudent practice to which exceptions may occasionally be made in appropriate circumstances.

A risk assessment was performed on tests from current task group reports on linac QA following the failure mode and effects analysis (FMEA) approach. [8][9][10] Reviewed tests were primarily from the report of AAPM Task Group 142, "Task Group 142 report: Quality assurance of medical accelerators". 3 The goal of the risk-based analysis was to highlight those tests on a linac that are most effective at maintaining safety and quality for the patient per the report of AAPM Task Group 100, "Application of Risk Analysis Methods to Radiation Therapy Quality Management". 8 Each test (or each clinical parameter being tested) was considered a potential failure mode on a linac and was scored for Occurrence (O), Severity (S), and lack of Detectability (D) of a failure.
Each committee member submitted risk assessment scores for O, S, and D. Each committee member also engaged colleagues, such that a total of 25 practicing medical physicists participated in the risk assessment as scoring participants. The range of years of experience among the scoring participants was 5-37 yr, with a median of 20 yr.
The scoring participants also have experience in different types of institutions: university/academic, private/community hospital, government, and medical physics consulting groups from different parts of the country. In doing so, the scoring represents the perspective from various patient populations, technologies, age of equipment, types of treatments (i.e., 3D conformal, IMRT, SRS), and diversity of treatments. A scoring table was derived from published tables in order to have a common understanding of the definition and range of O, S, and D. 8,9 Scoring participants assigned occurrence scores to performance tests using their experience of failure rates for the clinical parameter in question. For example, scoring participants considered how often the optical distance indicator (ODI) test has fallen out of tolerance in their experience.
Scoring participants assigned a severity score to each performance test. In order to assign a severity score, scoring participants assumed that the clinical parameter in question was not being tested at the recommended frequency and was out of tolerance.
We then considered the severity of harm to a patient if the patient were treated with an out-of-tolerance clinical parameter (e.g., the ODI is off by more than the tolerance value, and therefore the patient's source-to-surface distance (SSD) could be off by the same amount). To assign a detectability score, scoring participants considered how likely it is to detect that the clinical parameter is out of tolerance if it were not tested at the recommended frequency (e.g., how likely it is to detect that the ODI is out of tolerance if this parameter were not tested daily).
We determined the average score for O, S, and D from each scoring participant and used this to determine an average risk priority number (RPN) value (RPN = O × S × D) for each performance test that was scored.
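The averaging and RPN calculation described above can be sketched as follows. The participant scores below are illustrative values, not data from this guideline's risk assessment.

```python
from statistics import mean

def average_rpn(scores):
    """Average O, S, and D across scoring participants for one performance
    test, then form the risk priority number RPN = O * S * D.

    `scores` is a list of (O, S, D) tuples, one per scoring participant.
    """
    avg_o = mean(s[0] for s in scores)
    avg_s = mean(s[1] for s in scores)
    avg_d = mean(s[2] for s in scores)
    return avg_o * avg_s * avg_d

# Hypothetical scores from three participants for one performance test:
# average O = 3, S = 8, D = 5, so RPN = 120
rpn = average_rpn([(2, 8, 5), (3, 7, 4), (4, 9, 6)])
```

A higher RPN flags a failure mode that is more frequent, more severe, or harder to detect, which is how the committee ranked candidate tests.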

6.B | Risk assessment scores
The average RPN scores from the 25 scoring participants are presented in Appendix I. The scores are sorted by test frequency and highest RPN score. The RPN scores were also normalized by the highest score for a particular testing frequency (e.g., daily, weekly, etc.) and are presented as relative RPN scores. Table 1 shows the practice guideline's ranking of daily and monthly TG142 tests compared to O'Daniel's FMEA analysis of TG142. 11 The method of scoring differs between the two works; O'Daniel chose not to include detectability, stating that if a test is not performed, the assumption is that the failure cannot be detected. To determine occurrence, actual data from three linacs over a period of 3 yr were analyzed, yielding a minimum detectable occurrence rate of 0.04%. Severity rankings were determined by modeling errors in the treatment planning system.
The RPN scores are presented from each work for comparison.
For commonly scored tests, the rank order of daily and monthly tests are similar between this work and O'Daniel's results. The highest ranking tests were the same in both works for daily and monthly performance tests (output constancy, laser localization).
Differences in ranking order exist in the mid-level and lower ranking tests.
6.C | Relative risk compared to other clinical processes

Failures in hardware and software systems on a linac can happen, and the QMP must design a QA program that includes tests designed to detect failures. However, hardware and software system functions on a linac represent just one portion of the extensive process map that comprises the external beam treatment paradigm. 10 The relative risks of hardware and software errors are lower than risks due to human process-related errors, lack of standardized procedures, and inadequate training of staff. 12 While we must be diligent to ensure that risks of hardware and software errors are kept low and contribute minimally to the overall goal of delivering dose to the target with a high degree of accuracy, 13,14 the linac performance testing portion of our QA programs should be efficient so that time and resources can be dedicated to other areas where FMEA indicates that higher-scoring errors can occur.

| MINIMUM REQUIRED RESOURCES AND EQUIPMENT
The authors do not recommend a specific tool or technique to perform each test; rather, we provide guidance on methods to achieve the goal of the test. The test procedure and equipment utilized must be capable of both accurate measurements as well as measuring to the level of the stated criteria or test tolerance. It is assumed that the most basic tools are available to the QMP.
There exists a wide variety of equipment and software tools to aid the QMP in performing, analyzing, and interpreting measurements accurately and efficiently. They can be costly, but actually represent a small percentage of the revenue generated by a single linear accelerator over its lifetime. The budget for a new linac and annual operating budgets should include the cost of such measurement equipment and software.
Administrators and department managers should understand the cost-benefit of purchasing these tools and the time savings that they provide the QMP. It has been shown that some quality control measures are more effective than others. 15

8.A | Reference data

One approach is to compare measurement results to data collected at the time of commissioning, as TG142 and the report of AAPM Task Group 106 suggest. 3,16 In this approach, the beam data collected at the time of commissioning are used as the reference data. If the QMP chooses this approach, it is their responsibility to ensure that the commissioning data agree with the TPS model on an annual basis, as recommended in TG106. This extra step is required so that there is always a link between routinely measured data and TPS data. Regardless of the approach chosen, the overall goal is to ensure that during clinical usage the delivered and calculated doses agree within 5%, including the uncertainty associated with absolute calibration.
A water tank is typically used for beam measurements at commissioning and annual testing. For more routine measurements, such as profile constancy on a daily or monthly basis, it is easier to use a device other than a water tank. For example, a secondary measurement system may be used for monthly measurements and a tertiary system may be used for daily measurements. In this case, it is necessary to create a reference dataset that has been appropriately verified by TPS data or compared to an absolute standard. An effective approach for creating a routine reference dataset (or creating a baseline) is outlined in the process below:
• Perform annual beam measurements.
• Compare the results of the annual measurements to TPS data, to commissioning data that are verified by TPS data, or to absolute standards (TG51 calibration standards).
• Ensure results are within acceptable tolerance and resolve any differences.
• Once the annual beam measurements are verified, make measurements with the routine device/method (secondary and tertiary measurement systems). Ideally, this occurs in the same measurement session on the same day. The data acquired from this measurement are now the reference dataset, which effectively becomes the baseline for comparison for routine measurements.
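The verification step in the process above reduces to a tolerance comparison before a routine device's reading is adopted as the new baseline. The sketch below illustrates this; the function name, the readings, and the 2% tolerance are illustrative assumptions, not values prescribed by this guideline.

```python
def within_tolerance(measured, reference, tolerance_pct):
    """Return True if `measured` is within +/- tolerance_pct of `reference`."""
    return abs(measured - reference) <= abs(reference) * tolerance_pct / 100.0

# Hypothetical: check an annual water-tank off-axis factor against the TPS
# value at the same point before trusting the routine device's baseline.
annual_reading = 0.985   # illustrative off-axis factor from the water tank
tps_value = 0.990        # illustrative TPS value at the same point

if within_tolerance(annual_reading, tps_value, tolerance_pct=2.0):
    # Only now take the reading with the daily/monthly device, ideally in
    # the same measurement session; that reading becomes the new baseline.
    routine_baseline = 101.3  # illustrative reading from the routine device
```

The key design point is ordering: the annual measurement is verified against TPS data or an absolute standard first, so the routine baseline always traces back to the primary calibration.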
It is the responsibility of the QMP to ensure that all reference datasets are appropriately used and verified against absolute standards (i.e., the TPS) on at least an annual basis.

8.B | Isocenter
One critical piece of reference data that is not in the TPS is the location of the radiation isocenter. It is also common to use C-arm linacs to treat patients with a stereotactic radiosurgery (SRS) or a hypofractionated [stereotactic body radiation therapy (SBRT)] treatment regimen. 17,18 In this clinical setting, the QMP should refer to protocols specifically designed for performance tests in a stereotactic setting in order to achieve a higher degree of accuracy than that needed for regularly fractionated patients. 19,20 In addition, the QMP may choose to do additional testing (i.e., Winston-Lutz test) on the day of the treatment for stereotactic/hypofractionated treatments to ensure that the mechanical alignment of the radiation isocenter is appropriate for such patients.

8.E | Performance tests after maintenance
There are some tests that should be performed after general or specific maintenance on an accelerator to ensure that clinical parameters have not changed, either intentionally or inadvertently.

Tolerance: all tolerances are listed as "within X% or within X mm," meaning that the measured value should be within ±X% or ±X mm of the standard or baseline. When a tolerance is listed as a percent change from a value (e.g., 2% of PDD), it indicates a relative change from the original value.

D | Dosimetry tests

D1 | Photon and electron output constancy
Photon and electron beam output measurements had the highest RPN scores in the risk assessment. Therefore, it is recommended that output be measured daily, monthly, and annually.
• Daily and monthly output checks should be performed on all clinically used beams, and should fall within 3% and 2% of that system's baseline values, respectively. Daily checks may be restricted to the beams in clinical use for that day, at the discretion of the QMP. Readings outside these tolerances should be reported to the QMP to resolve the discrepancies and determine the appropriate course of action.
• Annually, output measurements must be performed in accordance with TG51 (or successor): in water with equipment calibrated by an accredited secondary standards laboratory within the previous 2 yr. Output for each beam must be within 1% of dose calculated via TG51 formalism. It is also recommended that the absolute calibration be externally validated.
• Once the beams are calibrated per TG51, secondary (monthly, if applicable) and tertiary (daily) measurement systems should then be irradiated to establish or confirm baseline output readings that are tied to the primary calibration (refer to section 8.A. of this report). The QMP may use a secondary measurement system (i.e., solid water based) for monthly output checks or use a waterbased system as done for annual calibration. The QMP must decide on the details of secondary and tertiary measurement systems; their fundamental attribute should be reproducibility.
The concept of acquiring or confirming annual baselines of secondary and tertiary measurement systems is described in detail in section 8.A of this report and shall also be applied to the checks that follow: beam profile checks (D2) and beam energy checks (D3 and D4).
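The daily and monthly constancy checks above can be sketched as a deviation-from-baseline test. The readings and beam name are hypothetical; the 3% (daily) and 2% (monthly) tolerances are the ones recommended in D1.

```python
def output_deviation_pct(reading, baseline):
    """Percent deviation of a constancy reading from its baseline value."""
    return (reading - baseline) / baseline * 100.0

def check_output(reading, baseline, tolerance_pct):
    """Return (passes, deviation); out-of-tolerance results go to the QMP."""
    deviation = output_deviation_pct(reading, baseline)
    return abs(deviation) <= tolerance_pct, deviation

# Hypothetical daily (3%) and monthly (2%) checks on a 6 MV beam whose
# constancy-device baseline reading is 100.0
ok_daily, dev_daily = check_output(reading=101.5, baseline=100.0, tolerance_pct=3.0)
ok_monthly, dev_monthly = check_output(reading=102.5, baseline=100.0, tolerance_pct=2.0)
print(ok_daily, ok_monthly)  # True False -> the monthly reading needs QMP review
```

Because the baseline is re-established against the TG51 calibration annually, a failing check here indicates a change relative to the primary calibration, not merely drift of the constancy device.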

D2 | Photon and electron beam profile constancy
• Most devices designed for daily output measurements also measure off-axis constancy at one or more points in the radial and transverse directions.
• On a monthly basis, the QMP shall review the daily off-axis measurements or measure beam profile shape with another device or method.
• Annual measurements of the beam profile must agree with off-axis reference data.

D3 | Photon beam energy
• Annual measurements of photon beam energy may be point measurements or a full depth dose curve in water. At a minimum, the QMP must verify the PDD10x value used in TG51 calculations. Alternate measurements could be done to abide by any successive calibration protocol.
• Changes in OAFs have recently been shown to also be an indicator of photon energy change. 22

At a minimum, the light to radiation field congruence should be verified after service to the mirror, field light bulb, or any work on the treatment head that may inadvertently affect the bulb or any component of the optical system.

M5 | Leaf position accuracy
Positional accuracy of all leaves (and backup jaws, if applicable) should be checked monthly. It is the responsibility of the QMP to understand the MLC positioning system and decide which test is appropriate. The test should be performed at different gantry angles to detect any gravity-induced positional errors. An acceptable test includes a Picket Fence-type test. 27,28 Other tests that are tailored to the design of Elekta and Siemens MLC systems also exist (Hancock for Elekta and the Diamond jig system for Siemens). Leaves should move to prescribed positions to within 1 mm for clinically relevant positions.
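Whichever test the QMP selects, the analysis ultimately reduces to comparing measured leaf positions against prescribed positions at the 1 mm tolerance. The sketch below assumes leaf positions have already been extracted from the test image (the image analysis itself is device-specific and is not shown); the positions are illustrative.

```python
def leaf_position_errors(prescribed_mm, measured_mm, tolerance_mm=1.0):
    """Return indices of leaves whose position error exceeds tolerance_mm.

    prescribed_mm, measured_mm: per-leaf positions in mm, same ordering.
    """
    return [i for i, (p, m) in enumerate(zip(prescribed_mm, measured_mm))
            if abs(m - p) > tolerance_mm]

# Hypothetical positions (mm) for four leaves at one picket; leaf index 2
# is 1.2 mm from its prescribed position and should be flagged.
prescribed = [0.0, 0.0, 0.0, 0.0]
measured = [0.3, -0.4, 1.2, 0.8]
print(leaf_position_errors(prescribed, measured))  # [2]
```

Running the same comparison at several gantry angles, as recommended above, is what exposes gravity-induced errors that a single-angle test would miss.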

M6 | Gantry and collimator angle indicators
Test gantry and collimator angle readouts monthly at cardinal angles.
If the imaging system uses a separate gantry encoder, it should be checked as well.

M7 | Physical graticule
The port film graticule and digital graticules are used for different purposes.

M13 | Accessory latches/interfaces (all slots)
Annually, verify that any accessory that mounts to the linac head latches properly and will not be dislodged or move in a way that will clinically affect the dose distribution position as the gantry rotates.
This test is included to verify accessories that may not be included in M11, M12, or W2 (e.g., the block tray).

S6 | Safety procedures
The QMP should use knowledge and experience to determine a set of safety tests and the frequency that is necessary.

CONFLICT OF INTEREST
There are no conflicts of interest.