Introduction

The validity of multi-institutional clinical trials is predicated on the premise that the selection of patients and their treatments will be uniform at all participating institutions. This assumption requires a concise definition of the population to be studied, the treatment regimens to be followed, and the methods used for evaluating the results [1]. Quality assurance (QA) programs attempt to document the validity of these assumptions and to quantify the extent of any variations. High-standard QA programs also improve the quality of practice, which is known as a flow-on effect. Because study results are meant to be applied and trial outcomes introduced into practice, a QA evaluation requires consideration of clinical validity and flexibility with regard to reasonable standards of care.

With the development of multi-modality studies, particularly for radiation therapy (RT), RT planning and delivery procedures have changed dramatically. As a result, assessments of the appropriateness of therapies delivered in each institution have become more complex. After the introduction of 3-dimensional (3-D) treatment planning in the 1980s, the improved technology for RT procedures has gradually spread to general practice from the mid-1990s up to today. During the transition period from conventional 2-dimensional (2-D) to 3-D RT planning, the first proactive QA programs for the Japan Clinical Oncology Group (JCOG) started in 2002.

JCOG 0202, a multi-center phase III trial, compared two types of consolidation chemotherapy after concurrent chemoradiotherapy for limited-disease small cell lung cancer. JCOG 0202 demonstrated excellent compliance, as high as 92% [2]. The next trial, JCOG 0303 for esophageal cancer, also implemented an ongoing RT quality assurance (RTQA) program. This study evaluates protocol compliance in JCOG 0303. In addition, drawing on our involvement in the JCOG RTQA process, we discuss the current conditions and problems of QA for multi-institutional trials, as well as perspectives for future clinical trials.

Materials and methods

Study design and RT requirements

JCOG 0303 was a multi-center phase II/III trial that compared two types of chemotherapy administered concomitantly with radiotherapy for locally advanced (T4 and/or unresectable metastatic lymph nodes) thoracic esophageal cancer (Fig. 1). The primary endpoint of this study was overall survival, and the secondary endpoints included the proportion of complete responses and the toxicity profile of each treatment. JCOG 0303 was carried out according to the principles set out in the Declaration of Helsinki 1964 and all subsequent revisions; informed consent was obtained, and the relevant institutional review boards approved the study.

Fig. 1 Outline for JCOG 0303. PS performance status, CDDP cisplatin, 5-FU 5-fluorouracil

Patients were randomized to receive either low-dose cisplatin/5-fluorouracil (5-FU) (6 weeks of cisplatin 4 mg/m² plus 5-FU 200 mg/m² on days 1–5) or standard-dose cisplatin/5-FU (cisplatin 70 mg/m² on days 1 and 29 plus 5-FU 700 mg/m² for days 1–4, and 29–32). Both regimens included concurrent RT.

In line with current practice for advanced esophageal cancer, RT requirements included a total dose of 60 Gy in 30 fractions and an overall treatment period of 40–63 days [3–5]. For treatment planning, both conventional 2-D X-ray simulation and 3-D computed tomography (CT) simulation were allowed. Gross tumor volume (GTV) was defined as the volume of the primary tumor demonstrated by CT and/or endoscopy, together with metastatic lymph nodes that measured ≥1 cm in the long axis. For this trial, the clinical target volume (CTV) for the primary tumor was created by adding a 2-cm cranio-caudal margin to account for subclinical extension. No CTV margin was added for metastatic lymph nodes, and the CTV did not include elective regional lymph nodes. A planning target volume (PTV) was defined by adding margins at the discretion of radiation oncologists (typically 0.5–1 cm for lateral margins and 1–2 cm for cranio-caudal margins, depending on respiratory motion and patient fixation). A dose of 60 Gy was prescribed at the center of the PTV. Tissue heterogeneity correction was not used for monitor unit calculation, for two reasons: if heterogeneity correction had been required while different calculation algorithms were allowed, the inter-institutional variation in delivered dose would have been significant; and the convolution–superposition algorithm was not available in some participating institutions at the beginning of this trial.
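The dose and schedule requirements can be illustrated with a short calculation. The following is only a sketch: the function names are hypothetical, and counting both the first and the last treatment day in the overall treatment period is an assumption, since the counting convention is not stated in the text.

```python
from datetime import date

# Protocol values taken from the text above.
PRESCRIBED_TOTAL_GY = 60.0
PRESCRIBED_FRACTIONS = 30
MIN_DAYS, MAX_DAYS = 40, 63  # allowed overall treatment period


def dose_per_fraction_gy(total_gy=PRESCRIBED_TOTAL_GY,
                         fractions=PRESCRIBED_FRACTIONS):
    """Dose delivered per fraction, in Gy."""
    return total_gy / fractions


def overall_treatment_days(first_day, last_day):
    """Overall treatment period in calendar days.

    Assumption: both the first and the last treatment day are counted.
    """
    return (last_day - first_day).days + 1


def within_treatment_window(first_day, last_day):
    """True if the overall treatment period lies within 40-63 days."""
    return MIN_DAYS <= overall_treatment_days(first_day, last_day) <= MAX_DAYS
```

Under these assumptions, a course of five fractions per week that starts on a Monday and runs without interruption (for example 7 March to 15 April 2005) ends on day 40, which coincides with the lower limit of the allowed period.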

Dose constraints were defined with regard to maximum point doses to the spinal cord and the digestive organs. The dose to the spinal cord was kept at ≤44 Gy. The doses to the gastric antrum, small intestine, and colon were kept at <50, <40, and <45 Gy, respectively.
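As an illustration, the maximum point-dose limits above can be written as a simple lookup. This is a minimal sketch rather than the review procedure used in the trial: the organ keys and the function name are hypothetical, and exceeding a limit here does not by itself determine the QA score, which depends on the criteria in Table 1.

```python
# Maximum point-dose limits in Gy, as stated in the text.
# Each entry is (limit, inclusive): the spinal cord limit is "<= 44 Gy",
# while the digestive organ limits are strict "<" limits.
MAX_POINT_DOSE_GY = {
    "spinal_cord": (44.0, True),
    "gastric_antrum": (50.0, False),
    "small_intestine": (40.0, False),
    "colon": (45.0, False),
}


def exceeded_constraints(max_doses_gy):
    """Return a sorted list of organs whose maximum point dose breaks its limit.

    max_doses_gy maps an organ key (hypothetical naming) to its maximum
    point dose in Gy. Organs that are not listed are not checked.
    """
    exceeded = []
    for organ, dose in max_doses_gy.items():
        limit, inclusive = MAX_POINT_DOSE_GY[organ]
        within = dose <= limit if inclusive else dose < limit
        if not within:
            exceeded.append(organ)
    return sorted(exceeded)
```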

If a tumor was located in the middle or lower thoracic esophagus, treatment using 3–4 ports was recommended to reduce the possible risk of heart toxicity. For the treatment of tumors in the upper thoracic esophagus and supraclavicular lymph node metastases, the number of ports used was at the discretion of each institution.

Quality assurance review

For the initial QA review, copies of pre-treatment diagnostic X-rays and CTs, simulation and verification films, worksheets for monitor unit calculations for the prescribed doses, and RT charts were sent to the QA review center within 7 days after beginning RT. Information on the total RT course was required to be sent within 30 days after completing RT. These documents were to be submitted for all accrued patients. They were collected during patient accrual and after the completion of accrual to provide for a final compliance assessment. The criteria for QA assessment were defined before the start of this trial, but they were not described in the protocol. Immediately after the initial records were available, the radiation oncology principal investigator (S.I.) sent each institution a letter reporting whether they had complied with the treatment protocol and an inquiry regarding QA documentation, when necessary (Fig. 2). Progress remarks and problems were reported at periodic meetings for investigators.

Fig. 2 Flow chart for QA review. After the QA review, feedback was given to the institutions. Treatment planning was modified when possible

To assess RT protocol compliance, the following parameters were reviewed: dose and field border placement (adequacy of margins for GTV), doses to organs at risk, overall treatment time, and dose calculations without heterogeneity corrections. Each item was scored as per protocol (PP), deviation acceptable (DA), violation unacceptable (VU), or incomplete/not evaluable. “Protocol compliance” included both PP and DA.

Individual cases were reviewed both by an independent radiation oncologist (N.S.) and the radiation oncology principal investigator (S.I.) using the same criteria. For GTV coverage, VU was defined as the distance from the field edge of the blocks or multi-leaf collimators to the periphery of GTV <1 cm or >2.5 cm laterally and <2 cm or >6 cm cranio-caudally. For the dose at the reference point, a dose <54 or >66 Gy was judged as VU. If the margins for GTV were insufficient in order to avoid an overdose to the organs at risk, this was regarded as DA. However, if GTV was shielded for any reason, it was regarded as VU. If heterogeneity correction was considered for dose calculation and the dose difference exceeded 10%, it was judged as VU. Other criteria for the QA assessment are listed in Table 1.

Table 1 Criteria for QA scores
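The numeric VU thresholds for GTV coverage and reference-point dose described above can be sketched as two small functions. The function and parameter names are hypothetical. Because the boundary between PP and DA is defined in Table 1 and not in the text, plans that do not meet a VU threshold are labeled only as "PP or DA". In the trial, the final score was a judgment made by the reviewers, so this sketch shows the stated thresholds and not the complete review.

```python
def score_gtv_coverage(lateral_cm, craniocaudal_cm,
                       reduced_to_spare_oar=False, gtv_shielded=False):
    """Score GTV coverage from the field-edge-to-GTV distances (cm).

    VU thresholds from the text: lateral < 1 or > 2.5 cm,
    cranio-caudal < 2 or > 6 cm.
    """
    if gtv_shielded:
        # A shielded GTV was regarded as VU for any reason.
        return "VU"
    too_large = lateral_cm > 2.5 or craniocaudal_cm > 6.0
    too_small = lateral_cm < 1.0 or craniocaudal_cm < 2.0
    if too_large:
        return "VU"
    if too_small:
        # Insufficient margins chosen to avoid overdosing an organ at risk
        # were regarded as DA.
        return "DA" if reduced_to_spare_oar else "VU"
    return "PP or DA"


def score_reference_dose(dose_gy):
    """Score the dose at the reference point (Gy): < 54 or > 66 Gy is VU."""
    if dose_gy < 54.0 or dose_gy > 66.0:
        return "VU"
    return "PP or DA"
```

The 54 and 66 Gy limits correspond to the prescribed 60 Gy minus and plus 10%.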

Details of each assessment were analyzed. Among the 106 fully evaluable cases, the incidence of VU was compared according to the number of patients enrolled by each institution.

Results

A total of 142 cases were accrued from April 2004 to September 2009. After excluding 36 cases, 106 (75%) were fully evaluable (Table 2). Partially evaluable cases were included for the evaluation of each item.

Table 2 Numbers of evaluable cases and QA scores

Among 132 patients who were evaluable for the treatment planning methods, conventional 2-D X-ray simulations were performed for 9 (7%) patients and 123 (93%) had 3-D CT simulations. Of 31 participating institutions, 22 institutions had introduced 3-D CT simulations, 3 used only 2-D X-ray simulations, and 6 used both. Two opposing ports were used for 61 (46%) patients. Three ports, 4 ports, and 5 or more ports were used for 27 (21%), 40 (30%), and 4 (3%) patients, respectively.

Overall RT compliance (PP + DA) was 96% (102 of 106 fully evaluable cases). Details for the QA scores are listed in Table 3. There were 4 VU cases: 3 in GTV coverage, with insufficient margins for GTV (although 1 of these resulted from avoiding an excessive dose to the spinal cord), and 1 in organs at risk, due to an excessive dose to the gastric antrum. No VU cases were found for the overall treatment period, dose to the spinal cord, total dose, or dose calculations.

Table 3 Breakdown of QA scores

A miscellaneous variation, other than the pre-defined criteria for the QA assessment, was found for 4 cases; although CTV was not intended to include regional lymph nodes in the protocol, elective nodal irradiation was performed for these 4 cases (3 cases to the supraclavicular region and 1 case to the paraesophageal region).

Institutions in the highest quarter of enrollment each recruited 7 or more patients (mean = 11, range = 7–18) and together accounted for 68 patients. Centers that enrolled fewer than 7 patients (mean = 2, range = 1–5) recruited a total of 38 patients, of which 4 cases (11%) were judged as VU, whereas all of the cases from centers that enrolled 7 patients or more were compliant (Table 4).

Table 4 Numbers of VU cases based on the numbers enrolled among 106 fully evaluable cases
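The proportions reported in this section can be reproduced from the stated counts. The sketch below only recomputes figures already given in the text; the variable and function names are hypothetical.

```python
def percent(part, whole):
    """Percentage of part in whole, rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts stated in the Results.
accrued = 142
excluded = 36
fully_evaluable = accrued - excluded   # 106 cases
vu_total = 4
compliant = fully_evaluable - vu_total  # 102 cases scored PP or DA

# Fully evaluable cases and VU cases by institutional enrollment.
by_enrollment = {
    "7 or more patients": {"cases": 68, "vu": 0},
    "fewer than 7 patients": {"cases": 38, "vu": 4},
}

evaluable_pct = percent(fully_evaluable, accrued)
compliance_pct = percent(compliant, fully_evaluable)
vu_pct_by_enrollment = {
    group: percent(counts["vu"], counts["cases"])
    for group, counts in by_enrollment.items()
}
```

The two enrollment groups sum to the 106 fully evaluable cases, and all 4 VU cases fall in the group of centers that enrolled fewer than 7 patients.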

Discussion

An overall compliance of 96% was sufficient to provide reliable results for the current study. After the initial case reviews, a substantial amount of feedback was exchanged through QA assessment reports between the radiation oncology principal investigator and investigators at participating institutions, and this was effective in improving understanding of the protocol specifications and in preventing unacceptable violations. In this trial, unacceptable violations were too few to show the effect of feedback, but such an effect was observed in JCOG 0202 [2], in which protocol violations and deviations were seen more frequently in the earlier period of the trial. In the previous esophageal trial, JCOG 9708, RT quality was not optimal [6]. JCOG 9708 was conducted to evaluate the efficacy and toxicity of chemoradiotherapy with 5-FU plus cisplatin for patients with Stage I esophageal squamous cell carcinoma. According to a retrospective RTQA review after the closure of this trial, the overall protocol compliance was 70%. After this review, the QA assessment reports were sent to participating institutions, most of which overlapped with those in JCOG 0303. As the influence of clinical trial experience over the years has been recognized in RTOG studies [7], the good RTQA compliance in JCOG 0303 also appeared to be attributable to the JCOG 9708 experience. Furthermore, as the importance of pre-trial QA programs has been well recognized [8–13], JCOG will also implement a dry-run as a pre-trial credentialing program.

Impact of RT quality on treatment outcome

The Trans-Tasman Radiation Oncology Group (TROG) conducted a large international phase III trial to evaluate whether adding tirapazamine (TPZ), a hypoxic cytotoxin, to standard cisplatin-based chemoradiotherapy provided any benefit for locally advanced head and neck cancer [14]. Although this trial failed to demonstrate any benefit for TPZ, the investigators reported a planned secondary analysis that assessed the impact of the quality of RT planning and delivery on outcomes, which might have provided some explanation for the negative overall trial results [15]. They found a 20% absolute difference in 2-year overall survival between patients who had protocol-compliant plans and those whose plans had a predicted major adverse impact on tumor control (70 vs. 50%, respectively). This was twice the hypothesized survival benefit of TPZ used in the trial design.

They also showed that centers that treated only a few patients were the major source of RT quality problems. While many reports have shown that failure to adhere to the treatment protocol degraded the outcomes of clinical trials [7, 16–22], this analysis was the first to quantify the penalty associated with poor RT, and it demonstrated that RT quality had a more substantial impact on outcomes than any additional effect of the new agent. In our study, VU cases were likewise confined to institutions that enrolled few patients. The overall outcomes may also have been influenced by poor-quality RT, even though the absolute number of VU cases was small. As pointed out by the TROG trial, it is desirable to limit a trial’s participation to those sites that can contribute a significant number of patients.

Relationships between deviation, eligibility criteria, and protocol

Although the first step in minimizing variations in clinical trials is the use of a detailed trial protocol, it is sometimes impossible to define a single, uniformly acceptable technique for the treatment of advanced esophageal cancers. For this reason, a certain margin of tolerance is usually included to cover individual variations, so that variations due to clinically valid judgments can be identified.

The significance of elective nodal irradiation for locally advanced esophageal cancer, especially for those with T4 and/or unresectable metastatic lymph nodes, has not yet been clarified [3, 23, 24]. In the current JCOG 0303 trial, the protocol specified that such subclinical areas were not to be included as CTV. However, 4 cases received elective nodal irradiation, none of which appeared to have a predicted impact on tumor control or toxicity. They were still acceptable when assessed by the criterion of reasonable standards of care and, therefore, were judged as DA cases.

We found that most of the DA cases were due to insufficient margins for GTV caused by avoiding overdoses to organs at risk. Such conditions are often experienced due to the anatomy of esophageal cancer. The esophagus is located in contact with vertebrae that embrace the spinal cord. Esophageal cancer often grows to be a bulky mass lying across the anterior walls of the vertebrae, or it frequently metastasizes to the lymph nodes along the right recurrent nerve. Therefore, an off-cord boost is often difficult to create for delivering an adequate dose to the PTV while avoiding an overdose to the spinal cord. In fact, in the current trial, there was one VU case for GTV that was due to avoiding an excessive dose to the spinal cord. This may be more a matter of the eligibility criteria for this trial than of protocol compliance. As a result, during a QA assessment, it can sometimes be difficult to distinguish a VU case from a DA case. Effects of these variations on outcomes are to be assessed with the final results.

Suboptimal proportion of evaluable cases

In the current study, there was a substantial number of cases that were excluded (n = 36; 25% of all cases), while the overall compliance was excellent when the subjects were limited to fully evaluable cases. Among the 36 excluded cases, the data were insufficient or only partially evaluable for 25 cases, 8 cases went off protocol, and 3 cases were ineligible. Improvements of evaluability are another challenge for RTQA, not only for trial outcomes, but also for trial cost effectiveness. Although the support of cooperative group trials is costly due to the involvement of various professionals, improvement of evaluability would make up for the cost by decreasing the exclusion loss from the analysis [1].

Frequency of 3-D CT simulation and credentialing

In early clinical trials, data acquisition was non-uniform and inconsistent, and radiation dose calculations varied significantly. Improvements in the QA procedures have increased treatment uniformity of the study, which has helped to validate the study conclusions. Recently, protocols have been developed with increasing complexity. Especially for RT, current studies have introduced CT-based treatment planning, enabling precise target definitions and dose deliveries. The use of advanced treatment modalities in clinical trials requiring volumetric digital data submission is one of the great challenges in RTQA [25].

Previously, the Radiation Therapy Oncology Group (RTOG) 9415, a randomized phase III trial that compared high-dose radiotherapy with standard doses for esophageal cancer, recommended the use of CT simulation, although it was not mandatory. Dose prescription was conventionally specified at an isocenter. Starting with the next esophageal trial, E0113, a randomized phase II study of two paclitaxel-based chemoradiotherapy regimens, all participating institutions had to utilize 3-D CT planning. Furthermore, RTOG 0436, a phase III trial evaluating the addition of cetuximab to paclitaxel, cisplatin, and radiation for patients with esophageal cancer, required a facility questionnaire for each institution, as well as a dry-run QA test, in order to prove that the institution was eligible to enter patients into the study.

In the current JCOG 0303 trial, a majority of the participating institutions had introduced 3-D CT simulations; however, in patients with 2-D X-ray simulation, precise 3-D volumetric dose evaluation was not available. Today, CT-based 3-D planning is standard and it will be mandatory in coming JCOG trials. In 2004, the JCOG RT group implemented a pre-trial credentialing program for a phase II trial of stereotactic body RT for early stage non-small cell lung cancer (JCOG 0403). The next trial for intensity-modulated RT for nasopharyngeal cancer will require a dry-run test for all participating centers. As we move to multimodal image-based definitions of target volumes for protocols, timely interactions between study investigators and QA centers through protocol development will become more and more important in future trials.

In conclusion, the results of the RTQA assessment for JCOG 0303 were sufficient to provide scientifically reliable results. Further improvements will be needed for institutions with low accrual rates. A dry-run and credentialing program are being implemented in JCOG trials to further improve RT quality.