Introduction

Neck dissection is the most frequently performed operation in head and neck surgery. Since its proposal by Crile [1], neck dissection has become the most popular and established surgical procedure in this field. The detailed maneuvers of neck dissection might therefore be assumed to be almost uniform worldwide. In Japan, however, this assumption proved to be false.

The first evidence was shown in video presentations at annual conferences in Japan. One of the authors (M.S.) was surprised by an unimaginable diversity of neck dissection procedures presented in those videos. Every leading hospital in Japan appeared to perform neck dissections using various techniques for different indications. Apparently, other doctors shared the same impression. Many doctors thought that such diversity could lead to the failure of establishing uniformity or comparability of nonradical neck dissections in Japan. This diversity might also account for the large difference in treatment results among leading Japanese hospitals. These doctors thus thought that an urgent intervention was necessary to ensure the uniformity and quality of neck dissections in Japan.

What could have led to this diversity of neck dissection procedures? Three reasons were considered. The first reason was the immense and intrinsic diversity of nonradical neck dissections. Radical neck dissection, which was established and popularized by Martin et al. [2], entailed poor postoperative quality of life [3]. To improve postoperative function while maintaining the excellent outcomes of radical neck dissection, many researchers attempted to develop new dissection procedures, resulting in the establishment of diverse nonradical neck dissections [4–6]. Nonradical neck dissections have now become the core of neck dissection procedures. Because these procedures were developed by different surgeons under different conditions, each operation has its own background, indications, and contraindications. This situation could easily mislead a surgeon regarding the procedure of choice for a particular patient.

The second reason was a paucity of medical evidence concerning details of neck dissection procedures. For example, few studies have compared overall survival rates between patients whose cervical spinal nerves were severed and those whose nerves were preserved. This lack of evidence left surgeons free to perform whichever nonradical neck dissection procedure they preferred.

The last reason was that there were few exchanges of doctors and surgical techniques among leading Japanese hospitals, implying limited training environments for young doctors. Young Japanese doctors learned surgical techniques only from their senior doctors and did not have a chance to observe operations conducted in other hospitals. If the surgical techniques of their senior doctors were not very common, those of young doctors would also be uncommon.

The authors thought that the last reason was the main cause of diversity in neck dissection procedures and concluded that exchanges of surgical techniques among hospitals had to be enhanced.

In 2002, one of the authors (M.S.) organized the Japan Neck Dissection Study Group (JNDSG), with the support of a governmental grant, to standardize the details of neck dissection procedures so that the same neck dissection would be performed in every hospital when the primary site and TNM stage of the disease were the same. Because the governmental grant was renewed annually, JNDSG faced close scrutiny by medical authorities at the end of every fiscal year. Now in its eighth year, JNDSG continues its efforts to standardize the details of neck dissection procedures in Japan.

Patients and methods

To achieve standardization, JNDSG invited 22 leading Japanese hospitals (Table 1) to participate in the study.

Table 1 List of 22 participating hospitals

To enhance exchanges of surgical techniques among hospitals, JNDSG directed surgeons from participating hospitals to directly observe neck dissections conducted in other hospitals. However, one problem with this directive was that there were so many observing surgeons that their points of view could be very different.

To standardize the observation criteria, JNDSG created a specialized form [7] consisting of 79 questions regarding details of neck dissection (Table 2). Each observing surgeon was required to fill out this form during the surgical observation.

Table 2 Observation form consisting of 79 questions

The Japan Neck Dissection Study Group also created a protocol to obtain official permission from participating hospitals and to protect the personal rights of observed patients according to the principles set out in the Declaration of Helsinki 1964 and all subsequent revisions. Eligible subjects were previously untreated patients with head and neck cancer who underwent neck dissection as part of their treatment and provided written informed consent. Patients with recurrence of head and neck cancer were excluded. The planned sample size was 235 patients. The subjects were divided into two groups: the first 93 patients were classified as the “first stage” and the following 142 patients as the “second stage.” The endpoint of the first stage was the difference among participating hospitals regarding details of neck dissection, whereas that of the second stage was the 2-year neck control rate. The subjects were followed up for neck control and prognosis every 6 months for 2 years. The study period was 5 years (3 years for enrollment and 2 years for follow-up). JNDSG submitted the protocol to the Institutional Review Boards of all participating hospitals and obtained approval.

Data obtained from completed observation forms and follow-up were analyzed. Of the 235 enrolled patients, 14 patients whose planned observation was cancelled, 12 ineligible patients, and 3 duplicate records arising from operations that were also observed by a second observer were excluded. Data of the remaining 206 patients were analyzed using the SAS System Release 9.1.3 Service Pack 4 for Windows (SAS Institute Japan, Ltd).

To clarify the differences among participating hospitals regarding details of neck dissection, we used several categorical data analysis procedures, such as the chi-square, Fisher’s exact, and Cochran–Mantel–Haenszel tests. In these analyses, the explanatory variable was “hospital,” a categorical variable with 22 levels. The response variables were the 50 details of neck dissection listed in section C of the observation form (see Table 2). Because the explanatory variable had a large number of levels relative to the limited number of observations, other statistical procedures, such as logistic regression, yielded no consistent results. The most acceptable results were obtained with the Cochran–Mantel–Haenszel test, which enabled adjustment for the possible effects of confounding factors. In categorical analyses between the explanatory variables listed in sections A and B of the observation form and the 50 details of neck dissection, 4 explanatory variables [i.e., “primary site,” “hospital,” “N-stage,” and “side (ipsilateral/contralateral)”] were most closely related to the response variables. The three variables other than “hospital” (i.e., “primary site,” “N-stage,” and “side”) were therefore treated as confounding factors. The intensity of difference among participating hospitals regarding a particular detail of neck dissection was defined as follows: the difference was “confirmed” when the result of the Cochran–Mantel–Haenszel test was significant (P < 0.05) with all 3 confounding factors included; it was “strongly suspected” when the result was significant with only 1 or 2 confounding factors included; otherwise, the difference was “denied.” The same analyses were also performed separately for the first-stage and second-stage patients.
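The three-level definition of the intensity of difference can be restated as a small decision rule. The following Python sketch is only an illustration and is not the SAS code used in the study: the function name, the dictionary layout, and the example P values are hypothetical, and the sketch assumes that “significant with only 1 or 2 confounding factors” means that the fully adjusted test was not significant but at least one test adjusted for a subset of 1 or 2 factors was.

```python
from itertools import combinations

# The three confounding factors named in the text.
CONFOUNDERS = ("primary site", "N-stage", "side")
ALPHA = 0.05  # significance threshold stated in the text


def classify_difference(p_values):
    """Classify the intensity of difference for one detail of neck dissection.

    p_values: dict mapping a frozenset of adjusted confounders to the
    Cochran-Mantel-Haenszel P value obtained with that adjustment
    (hypothetical layout, assumed for this sketch).
    """
    full = frozenset(CONFOUNDERS)
    if p_values[full] < ALPHA:
        return "confirmed"
    # Assumption: any significant result with 1 or 2 confounders qualifies.
    for size in (1, 2):
        for subset in combinations(CONFOUNDERS, size):
            if p_values.get(frozenset(subset), 1.0) < ALPHA:
                return "strongly suspected"
    return "denied"


if __name__ == "__main__":
    # Hypothetical P values, not taken from the study.
    example = {
        frozenset(CONFOUNDERS): 0.12,
        frozenset({"primary site"}): 0.03,
        frozenset({"N-stage", "side"}): 0.20,
    }
    print(classify_difference(example))
```

Adjustments that were not run are treated as non-significant in this sketch, so a missing entry can never raise the classification.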

Moreover, the 2-year neck control and overall survival rates of the second-stage patients were compared with those of the control. The control consisted of 904 patients with previously untreated head and neck cancer who underwent neck dissection in participating hospitals in 2003. Because this study had an educational impact on surgeons from participating hospitals, patients who underwent neck dissection during this study could not be selected as controls. Patients who underwent neck dissection just before the start of this study had to be accepted as the second-best solution.

Neck control and overall survival rates were calculated using the Kaplan–Meier (product-limit) method. Survival curves were compared using the log-rank test. A P value <0.05 was considered significant.
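For readers unfamiliar with the product-limit estimate, the following minimal Python sketch shows how a 2-year rate is obtained from follow-up times with censoring. It is an illustration only: the study itself used SAS, the function names are hypothetical, and the five follow-up records are invented toy data rather than patient data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time of each patient (e.g., in years)
    events: 1 if the event (e.g., neck recurrence or death) was observed,
            0 if the patient was censored at that time
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


def survival_at(curve, t):
    """Read the estimated survival at time t from the step function."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
        else:
            break
    return s


if __name__ == "__main__":
    # Toy data (years, event flag); not data from the study.
    toy_times = [0.5, 1.0, 1.0, 1.5, 2.0]
    toy_events = [1, 0, 1, 1, 0]
    km_curve = kaplan_meier(toy_times, toy_events)
    print(km_curve)
    print(survival_at(km_curve, 2.0))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of event-free patients.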

Results

Patient enrollment

Patient enrollment started on February 18, 2004, and was completed on November 22, 2006. Only 2.76 years were necessary for the enrollment of the planned 235 patients. Figure 1 shows the number of patients enrolled by month.

Fig. 1

Patient enrollment by month: observation conducted (black bars, eligible; white bars, ineligible); gray bars, observation cancelled

The reasons for the cancellation of the planned observations in 14 patients were as follows.

In 8 patients, the reasons lay with the observed hospital, where the operation was cancelled because of fever (n = 3), absence of metastasis in the sentinel lymph nodes (n = 1), detection of pulmonary metastasis (n = 1), leukopenia (n = 1), hypothyroidism (n = 1), or the patient’s request (n = 1).

In 6 patients, the reasons lay with the observer: a sudden change in the condition of an observer’s own patient (n = 3), the observer’s illness (n = 1), a manpower shortage caused by a doctor’s sudden illness (n = 1), and a flight cancelled because of a typhoon (n = 1). Because of these unpredictable and unavoidable reasons, only 221 observations were carried out.

Another problem was the erroneous enrollment of 12 ineligible patients. The reasons for ineligibility were recurrent cancer (n = 9), unknown primary site (n = 2), and primary site other than head and neck (n = 1). Because JNDSG repeatedly warned against these violations, ineligible patients were not found during the latter half of the study.

The number of eligible patients was 209. Table 3 shows the patient enrollment by region of Japan. Although 49.3% of the observations were carried out within the same region, 50.7% were between different regions, indicating that this study enhanced exchanges among the regions.

Table 3 Patient enrollment by region of Japan

In 3 patients, the same operation was observed by two doctors. Because the observation by each doctor had been counted as 1 patient, these 3 operations were counted twice; after the duplicates were removed, the actual number of eligible patients was 206.
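The patient accounting described in this section can be checked with simple arithmetic. The short Python sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts taken from the text; variable names are illustrative only.
enrolled = 235
cancelled_by_observed_hospital = 8
cancelled_by_observer = 6
ineligible = 12
duplicate_observations = 3  # operations observed by two doctors

cancelled = cancelled_by_observed_hospital + cancelled_by_observer
observed = enrolled - cancelled               # observations carried out
eligible = observed - ineligible              # eligible observations
analyzed = eligible - duplicate_observations  # patients analyzed

print(cancelled, observed, eligible, analyzed)  # prints: 14 221 209 206
```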

Background factors

Several background factors of the 206 analyzed patients are given in Table 4. Patients with a wide range of primary sites were enrolled. Unilateral neck dissection was performed in 105 patients, and bilateral neck dissection in 101 patients, making a total of 307 operated sides. Of these 307 operated sides, only 272 were observed.

Table 4 Background factors of analyzed patients (n = 206)

Difference among participating hospitals regarding details of neck dissection

According to the criteria indicated in “Patients and methods,” the difference was “confirmed” in the following 13 details: inferior resection limit of the inferior deep cervical nodes, lymph nodes around the thoracic duct, sternocleidomastoid muscle, fascia of sternocleidomastoid, digastric muscle, omohyoid muscle, external jugular vein, sternocleidomastoid branch of the spinal accessory nerve, communicating branches between the cervical spinal nerves and spinal accessory nerve, cervical spinal nerves, ansa cervicalis, great auricular nerve, and tail of the parotid gland.

The difference was “strongly suspected” in the following 7 details: superficial plane of dissection, deep plane of dissection, lymph nodes between the cervical spinal nerves and prevertebral layer of the deep cervical fascia, occipital artery, facial artery, internal jugular vein, and common facial vein.

Changes in the intensity of difference among participating hospitals

When the same analyses were performed with the first-stage or second-stage patients only, there were some details where the intensity of difference among participating hospitals changed as the study proceeded from the first to the second stage. The intensity decreased in 11 details (Table 5) but increased in 6 details (Table 6).

Table 5 Eleven details for which the intensity of difference among the hospitals decreased
Table 6 Six details for which the intensity of difference among the hospitals increased

An interesting question is, what actual changes occurred in each detail when the study proceeded from the first to the second stage? The answer was very complicated because the analyzed data were derived from various hospitals, primary sites, and TNM stages. Figure 2 shows an example. In this particular detail, the changes as a whole resulted in more tissue preservation; this was true in most details whether the intensity decreased or not. Although the actual changes in each detail were very complicated, it was confirmed that the extent of resection of neck lymph nodes and nonlymphatic structures became smaller.

Fig. 2

Inferior resection limit of the inferior deep cervical nodes. When the study proceeded from the first to the second stage, the intensity of difference among the hospitals regarding this particular detail decreased from “strongly suspected” to “denied.” At the same time, the change in the intensity resulted in more tissue preservation. White bars, higher than the venous angle; black bars, right above the venous angle

Neck control and overall survival rates

The follow-up of this study, which was completed on July 27, 2009, required 2.69 years and exceeded the planned 2-year follow-up period because the enrolled and control patients were reexamined to obtain precise information on prognosis. This information was used to calculate the 2-year overall survival rates, which were not included in the original plan. The follow-up results are summarized in Table 7.

Table 7 Follow-up of second-stage patients and control

The 2-year neck control rate of the second-stage patients was 77.7% [95% confidence interval (CI), 68.7–84.4%] whereas that of the control was 77.1% (95% CI, 74.0–79.9%) (Fig. 3). There was no significant difference between the two curves.

Fig. 3

Neck control curves (log-rank test, P = 0.7676): black line, second stage (n = 132); gray line, control (n = 904)

The 2-year overall survival rate of the second-stage patients was 74.7% (95% CI, 66.1–81.4%) whereas that of the control was 71.6% (95% CI, 68.5–74.4%) (Fig. 4). Although the overall survival rate of the second-stage patients was higher than that of the control, the difference was not significant.

Fig. 4

Overall survival curves (log-rank test, P = 0.6902): black line, second stage (n = 132); gray line, control (n = 904)

Discussion

The intensity of difference among participating hospitals decreased in 11 details of neck dissection during the study. Every time the results of interim analyses were available, JNDSG warned the hospitals about the details where the difference was “confirmed” or “strongly suspected.” These efforts to achieve standardization could be the reason for the decreased intensity. In contrast, the intensity increased in 6 details. Despite this increase, more tissue was preserved in most of these 6 details. It appeared that some of the participating hospitals started to preserve the structures concerned during the study while the other hospitals continued their previous maneuvers, resulting in the increased intensity of difference.

Because the intensity decreased in more details than it increased, it was concluded that the difference among the hospitals decreased overall and that this study contributed to some extent to standardization.

Statistical analyses showed no improvement in the neck control or overall survival rate with the standardization. However, they also showed that treatment results did not decline when surgeons were directed to follow several unfamiliar rules during surgery.

For the 20 details of neck dissection where difference among participating hospitals was “confirmed” or “strongly suspected,” the establishment of standard maneuvers was mandatory to achieve standardization. A big hurdle regarding this matter was the paucity of medical evidence. If medical evidence were available concerning a particular detail of neck dissection, the standard maneuver for that detail could be determined easily and unanimously. There was, however, almost no evidence concerning details of neck dissection. Thus, JNDSG had to adopt another strategy by deciding to make a manual based on the discussion among the participating hospitals regarding the optimal procedures for each detail. If problems are encountered with the proposed procedures, the manual must be revised. This strategy is much easier and faster than waiting for the establishment of medical evidence.

The manual, “Standard Surgical Maneuvers for Each Detail of Neck Dissection,” has been revised annually and the fourth unpublished edition is presently available. Although the manual is still considered as a draft, JNDSG plans to publish it on the web in the near future.

In the authors’ view, the most noteworthy achievement of this study was the highly efficient reeducation of surgeons from participating hospitals. Although this is difficult to report scientifically, there were several indications of it. At first the surgeons were reluctant to participate. The directive to observe already familiar operations in other hospitals and occasionally accept unfamiliar surgical maneuvers against their will was not a very pleasant experience. However, about 3 months after enrollment started, the surgeons suddenly became very cooperative. One of the authors (M.S.) was surprised because every surgeon opined, “I did not know there could be an operation so different from mine.” This study certainly enhanced exchanges of surgeons and surgical techniques among participating hospitals, increasing the enthusiasm of every participating surgeon. The authors believe that, because of each surgeon’s eagerness, the enrollment proceeded smoothly and finished earlier than originally planned.

One significant limitation of this study was the validity of the method employed. This attempt to standardize complicated operational maneuvers among a large number of hospitals is very rare. There is as yet no specific methodology for the standardization of a surgical procedure. Under the conditions of this study, direct observation by a surgeon of operations performed in other hospitals was very effective in achieving standardization. The flaw of this method is a lack of strict objectivity or reproducibility because the observation is carried out by only one person. To address this point, observation was made by two doctors in three patients. However, this trial was unsuccessful because a large portion of the important neck structures was located deeply and could not be observed by two persons at a time. Another possible solution would be the utilization of videos or photographs. Because there is no established method to standardize a surgical procedure, several candidate methods must be attempted and evaluated.

Although the standardization of neck dissection was successful to some extent, JNDSG currently considers it insufficient. To achieve complete standardization, JNDSG commenced another prospective study.

With categorical analyses, the following three explanatory variables other than “hospital” were the most closely related to details of neck dissection: “primary site,” “N-stage,” and “side (ipsilateral/contralateral).” In the new study, JNDSG limited the primary site to the hypopharynx and supraglottis. It established “Recommendations for the extent of neck lymph node resection” based on the discussions about guidelines in the group. The “Recommendations” were categorized according to the N-stage and side of operation. JNDSG also developed “Recommendations for surgical maneuvers” based on the above-mentioned manual concerning four key details of neck dissection. In this study, instead of direct observation, photographs of the operative field are taken to demonstrate the exact extent of resection and surgical maneuvers employed for the four key details. JNDSG intends to enhance the standardization in this manner. The enrollment for this new study started on June 1, 2009 and is ongoing.

The authors realize that many surgeons may ask why the standardization of neck dissections is so important, especially when no improvement in treatment results was shown with the standardization in this study. “To pave the way to establish medical evidence” is the unanimous answer of the authors.

To scientifically demonstrate the efficacy of a treatment method, establishment of medical evidence in favor of the method is indispensable. To establish the superiority of one type of nonradical neck dissection in a particular type of patient, we must conduct a randomized trial in which the treatment results of patients with a certain handling of a particular detail of neck dissection are compared with those of patients with another handling of that detail. To ensure the validity of such a study, the other details of neck dissection must be the same for all the enrolled patients. This last condition can be satisfied only through efforts to achieve standardization. The authors consider that the JNDSG study was the first step to enable highly productive prospective studies in the future.

Another point of view is that standardization means defining and securing a minimum standard for a surgical procedure. If standardization is successful, common rules are established for details of the procedure and every performance of the procedure is conducted at least at the standard of the common rules; this is very important for a basic and common surgical procedure, such as neck dissection. The authors think that the reliability of surgical procedures achieved through standardization will help surgery recover its central role in head and neck oncology.