Introduction

Clinical registries and their data, which reflect real-world clinical practice and outcomes, are now being used to support healthcare quality improvement initiatives worldwide [1, 2]. The Japanese National Clinical Database (NCD), a registry involving a data collection scheme associated with surgical board certification, has provided several risk models for operative mortality and morbidities of commonly performed surgical procedures [3]. The NCD also provides a benchmark for reporting national and facility-specific risk-adjusted performance metrics to participating hospitals [4,5,6,7].

The usefulness of registry data is highly dependent on the quality of the data. However, it is difficult to assess the quality of registry data. Indeed, there are very few published reports describing the quality of registry data, although notable examples include those from the Society of Thoracic Surgeons National Database [8, 9], the American College of Surgeons National Surgical Quality Improvement Program [10], the European Association for Cardio-Thoracic Surgery database [11], and the Japanese Congenital Cardiovascular Surgery Database [12]. Compared with clinical trials, the quality of data in clinical registries is not well monitored or evaluated, so the reliability of evidence from clinical registries is unclear. For example, a previous study revealed that patients with more severe disease were selectively not included in a registry [13].

The aim of this study was to assess the quality of data registered in the NCD in 2011. We evaluated the data in terms of the registration coverage and the accuracy of eight data components commonly recorded in surgical registries.

Methods

Data sources

The NCD was established as part of a project to create a nationwide database that would allow researchers to assess surgical outcomes, with the goal of improving the quality of care (http://www.ncd.or.jp/). The NCD began collecting data in January 2011. It covers various specialties, including cardiac surgery, vascular surgery, gastroenterological surgery, pediatric surgery, breast cancer, and respiratory surgery. The participating hospitals are required to register all major surgical procedures. Clinical departments at each hospital participate in the NCD. A total of 1,165,790 surgical cases were submitted to the NCD in 2011 by 4313 participating departments at 3007 hospitals. The NCD covers approximately 50% of Japanese hospitals and approximately 80% of those that perform surgeries, according to a survey of medical institutions conducted by the Ministry of Health, Labor, and Welfare (MHLW) [14].

Information is collected using data collection forms that contain 14 sections spanning patient characteristics and operative information as basic variables recorded for all cases. Each subspecialty collects up to 500 additional data components, including patient characteristics, preoperative risk, surgical information, postoperative complications, and outcomes. The variables recorded for cardiac surgery and gastroenterological surgery are nearly identical to those recorded in the Society of Thoracic Surgeons National Database (http://www.sts.org/national-database) and the American College of Surgeons National Surgical Quality Improvement Program (https://www.facs.org/quality-programs/acs-nsqip). Data are submitted electronically and are automatically checked for logic, format, and range to reduce error due to data entry. Surgeons and data managers are responsible for registering the data in the NCD, and chief physicians are responsible for approving the data, confirming their integrity.

The Institutional Review Board of the Japan Surgical Society has approved data harvesting by the NCD [http://www.ncd.or.jp/about/ethical_considerations.html (in Japanese)], and the approval included opt-out clauses and processes for ensuring patient informed consent. Data registration at each hospital was approved by either the Institutional Review Committee or the hospital director.

Comparing NCD data with local government report data

To assess whether the data registration was complete, we compared the NCD data with regional government report data from eight regional health and welfare bureaus. Reporting to these bureaus is mandatory for hospitals operating under Japanese universal health coverage and is linked to facility reimbursement category certification [15]. All authorized medical institutions in Japan report the number of specified surgical procedures performed under health insurance. The data are audited by the health and welfare bureaus via site visits. Twenty-five types of surgical procedures were recorded, including cerebral, ophthalmic, ear, lung, nasal, orthopedic, cardiac, and gastroenterological surgery. After assessing the comparability of the procedural definitions used in the regional government report data and NCD data, we selected lung surgeries and esophageal surgeries as target procedures (see supplemental table for the list of specific procedure types included). We compared the list of hospitals included in the regional government report data with the list of those included in the NCD and confirmed the names and addresses of the corresponding hospitals.

Verifying the NCD data using on-site source data

Of the 4313 participating sites, 21 were randomly selected for on-site data verification, and 19 of these sites agreed to our request. Verifications were conducted between August 2012 and March 2013. To confirm that data registered for all surgical cases were complete, we compared the operation logs at each facility with data registered in the NCD for 2829 cases (0.24% of all surgical cases registered in the NCD in 2011). The operation logs included those managed by operating room staff or medical department staff and those derived from the electronic medical records. We assessed whether the cases registered in the NCD could be matched with surgical logs using the date of surgery and NCD registration code.
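The matching step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually used by the auditors: the record layout (a tuple of surgery date and registration code) and the function name `match_registrations` are hypothetical.

```python
from collections import Counter


def match_registrations(ncd_records, log_records):
    """Classify NCD records against one facility's operation log.

    Each record is a hypothetical (surgery_date, registration_code) tuple.
    Returns counts of valid, duplicated, and unmatched NCD records, plus
    the log entries that have no corresponding NCD registration.
    """
    log_keys = set(log_records)
    ncd_counts = Counter(ncd_records)

    valid = duplicated = not_in_log = 0
    for key, n in ncd_counts.items():
        if key in log_keys:
            valid += 1           # the first copy is a valid, independent procedure
            duplicated += n - 1  # any further copies are duplicate registrations
        else:
            not_in_log += n      # registered in the NCD but absent from the log

    missing_from_ncd = sorted(log_keys - set(ncd_counts))
    return {
        "valid": valid,
        "duplicated": duplicated,
        "not_in_log": not_in_log,
        "missing_from_ncd": missing_from_ncd,
    }
```

Keying on the pair of surgery date and registration code mirrors the two fields named in the text; a real audit would also need rules for near matches, which this sketch does not attempt.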

To evaluate the accuracy of the registered data, up to 40 cases were randomly selected at each hospital for further data component verification. A total of 616 cases were selected (0.05% of all surgical cases in the NCD). The accuracy assessment included the following eight data components: the patient’s date of birth, gender, admission date, whether or not an emergency ambulance was used, date of surgery, name of surgeon, date of discharge, and the patient’s status at discharge. We chose these variables because of their importance to the registry, the ability to objectively assess their correctness, and the standardized formats of recording these variables at hospitals [16]. Hospital records (operation records, admission and discharge summaries, and nursing records) were used as source documents. If the data registered with the NCD were the same as the source documents, the items were deemed to be “concordant”.
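The per-hospital selection of up to 40 cases can be illustrated with a short Python sketch. The function name `sample_cases`, the `seed` parameter, and the use of case identifiers are assumptions made for the example; the text does not describe how the random selection was actually implemented.

```python
import random


def sample_cases(case_ids, max_cases=40, seed=None):
    """Randomly select up to max_cases case identifiers for one hospital.

    If the hospital has max_cases or fewer cases, all of them are returned.
    """
    cases = list(case_ids)
    if len(cases) <= max_cases:
        return cases
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return rng.sample(cases, max_cases)  # sampling without replacement
```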

The present study was conducted by a team of non-clinicians that was commissioned by the NCD. Data were verified by staff who had general medical knowledge and standardized audit training. The main components of training included sessions on (1) overview of the NCD and its purposes/activities, (2) aims and purposes of the audit activities, (3) use of the NCD web-based case registration systems, (4) learning the definitions of the target data components, and (5) processes for data verification and recording of the results. The auditors were only allowed access to the patients’ medical records for verification purposes under conditions of maintaining patient confidentiality.

Statistical analyses

Comparing the NCD data to regional government report data

We compared the numbers of cases of the two surgical procedure groups (lung and esophagus) registered in the NCD relative to the regional government report data among two types of hospital groups: (1) all hospitals with either the regional government report data or NCD data, and (2) hospitals with at least one record of the procedure in both the regional government report and NCD. We also calculated the differences in the number of cases between the NCD data and the regional government report data at each hospital.
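The two coverage measures and the hospital-level differences can be sketched in Python as below. This is an illustrative example only: the dictionary inputs keyed by a hospital identifier and the function name `coverage_summary` are assumptions, and coverage is taken to be the number of NCD cases divided by the number of government-reported cases.

```python
def coverage_summary(gov_counts, ncd_counts):
    """Compute registration coverage for one procedure group.

    gov_counts / ncd_counts: dict mapping a hypothetical hospital ID to the
    number of cases in the government report and in the NCD, respectively.
    """
    # Group 1: hospitals appearing in either data source
    all_hospitals = set(gov_counts) | set(ncd_counts)
    # Group 2: hospitals with at least one case in both data sources
    both = {
        h for h in all_hospitals
        if gov_counts.get(h, 0) > 0 and ncd_counts.get(h, 0) > 0
    }

    def ratio(hospitals):
        gov_total = sum(gov_counts.get(h, 0) for h in hospitals)
        ncd_total = sum(ncd_counts.get(h, 0) for h in hospitals)
        return ncd_total / gov_total if gov_total else float("nan")

    # Hospital-level difference: NCD count minus government-report count
    differences = {h: ncd_counts[h] - gov_counts[h] for h in both}
    return {
        "coverage_all": ratio(all_hospitals),
        "coverage_both": ratio(both),
        "differences": differences,
    }
```

For instance, with toy counts `{"A": 10, "B": 5, "C": 4}` for the government report and `{"A": 9, "B": 5, "D": 2}` for the NCD, the first measure is 16/19 and the second, restricted to hospitals A and B, is 14/15.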

Verifying the NCD data using on-site source data

The completeness of case registration at 19 sites was assessed using on-site surgical logs. We estimated the proportion of confirmed registered cases among all surgical cases at each facility and determined the number of duplicate registrations in the NCD. We assessed the accuracy of data entry by estimating the proportion of concordant entries for each of the eight data items listed above. If the original data sources could not be identified using our standardized process of data verification, the data were excluded from the calculations. To assess whether the accuracy of the data differed between academic (university) hospitals and other facilities, we conducted an additional assessment stratified by this factor. All analyses were performed using the SAS software program, version 9.3 (SAS Institute, Cary, NC, USA).
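The concordance calculation, including the exclusion of items whose source could not be identified, can be illustrated in Python. This sketch is not the SAS code used in the study; the function name `concordance` and the convention of marking an unidentified source value as `None` are assumptions made for the example.

```python
def concordance(pairs):
    """Proportion of registered values that match the source document.

    pairs: list of (registered_value, source_value) for one data item.
    A source_value of None means the original source could not be
    identified, so that pair is excluded from the calculation.
    Returns None when no pair is verifiable.
    """
    verifiable = [(reg, src) for reg, src in pairs if src is not None]
    if not verifiable:
        return None
    matches = sum(1 for reg, src in verifiable if reg == src)
    return matches / len(verifiable)
```

Applying the function separately to university and non-university hospitals would give the stratified comparison described above.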

Results

Comparing the NCD data to regional government report data

We first assessed the registration coverage of NCD data relative to the regional government report data. The numbers of lung surgeries and esophageal surgeries reported to the regional health and welfare bureaus and in the NCD are listed in Table 1. For lung surgeries, 48,716 cases were reported to the regional health and welfare bureaus, and 46,143 cases were registered in the NCD from a total of 1288 hospitals, yielding a coverage of 94.7% (46,143/48,716). In addition, 1010 hospitals with at least 1 procedure report in both the regional government report data and NCD data reported a total of 47,226 cases to the regional health and welfare bureaus and 45,648 cases to the NCD, yielding a coverage of 96.7% (45,648/47,226).

Table 1 A comparison between the number of cases in the NCD and regional government report

Regarding esophageal surgeries, 8399 cases were reported to the regional health and welfare bureaus, and 7494 cases were registered in the NCD from a total of 1087 hospitals, yielding a coverage of 89.2% (7494/8399). In addition, among 826 hospitals with reported cases in both databases, 8024 cases were reported in the regional report, and 7237 cases were registered in NCD, leading to a coverage of 90.2% (7237/8024).

Among the hospitals with reported cases in both databases, 21.9% of hospitals for lung surgery and 44.3% of hospitals for esophageal surgery had exact matching numbers of reports in the 2 databases, and 42.3% of hospitals for lung surgery and 70.7% of hospitals for esophageal surgery had numbers within ± 1 (Table 2).

Table 2 Differences in the reported number of cases between the NCD and the regional government report data at each hospital among facilities identified in both databases with at least one procedure recorded in each

Verifying the NCD data against hospital source data

We assessed the registration coverage and data accuracy of the NCD data against the hospitals’ surgery logs (Fig. 1). After assessing the completeness of registration, we excluded 1 of the 19 sites because its source data were not adequately prepared or provided. A total of 2829 cases registered in the NCD were subjected to verification, and 2783 (98.4%) cases were confirmed to be valid, independent procedures. Twenty-six cases (0.9%) were falsely duplicated due to human error, and 20 cases (0.7%) could not be found in the surgical logs. A few cases listed in the surgical logs were not registered in the NCD.

Fig. 1 Flow diagram for selecting patients for on-site data verification. The hospital-level registration coverage and accuracy of the NCD data were verified against hospital source data

Of the 616 cases included in the assessment of data accuracy, seven were excluded because the patients’ charts were being used for patient care at the time of the audit. Therefore, 609 cases underwent verification, with a median of 39 cases per site (range 8–40 cases per site). The overall concordance of the eight variables was 97.8%, and all values for the individual variables were greater than 95% (Table 3). The discharge mortality status showed the greatest concordance at 99.4%. In general, there were no meaningful differences in these accuracy measures between academic and non-academic institutions (Table 4).

Table 3 Results of the data verification analysis
Table 4 Data verification results for university/non-university hospitals

Discussion

Our study assessed the quality of the NCD data in 2011 in terms of its registration coverage and the accuracy of data entry. The registration coverage was high, at around 90–95%, for both procedures investigated (lung surgeries and esophageal surgeries) compared with the regional government report data. The eight data components that were adjudicated showed high concordance (≥ 95.0%) with the source data.

Many registries, especially ones that are not nationwide, do not report their coverage, making it difficult to compare the registration coverage with those of other registries. For cancer registries, registration coverage above 90% is considered to represent “silver-grade” coverage according to the North American Association of Central Cancer Registries [17]. A population-based US cancer registry reported a coverage of 80% or above [18], a Danish acute myeloid leukemia registry reported 99% coverage [19], and a Danish arthroplasty registry reported 70% coverage [20]. Two Swedish registries on cardiac surgery and esophagus/stomach cancer reported coverages of 99% [21] and 96% [22], respectively. A low registration coverage may lead to selection bias. Therefore, the high registration coverage of the NCD supports the representativeness of the analyses based on the registry. It is easier to identify cases that are eligible for surgical registries than those eligible for registries in other settings (e.g., cancer). In the case of the NCD, we believe that requiring case registration for surgical certification was a major factor helping to ensure the complete recording of all cases at each facility. For the few cases found in the surgical logs that were omitted from registration, the main reasons were human error and specific conditions, such as emergency surgeries during the night, surgeries with a low surgical difficulty, or surgeries conducted by visiting surgeons.

Basic variables, such as patient characteristics, discharge status after surgery, and discharge dates, were accurately recorded in the NCD, consistent with adjudication studies of other registries [8,9,10,11,12]. Some errors in data entry are expected in any registry. However, systematic and frequent errors may introduce bias. Accurate data entry is, therefore, essential for clinical registries and for their evaluation.

This is the first report to compare the coverage of the NCD with regional government report data, and the validity of our findings depends on the quality of the report used as the gold standard. In Japan’s healthcare system, all facilities are required to report all of their surgical procedures performed under universal healthcare insurance coverage to the regional health and welfare bureaus; therefore, in theory, such information should include 100% of major surgical procedures performed in Japan [23]. However, there remains a possibility of overreporting, since the reported numbers for particular procedures determine whether or not the facilities receive certification for specialty care. As the two surgical procedures that we referenced in the study are not used for such facility certification, we believe the risk of overreporting is minimal. Future studies may confirm our findings using different data sources, such as administrative claims data, as the gold standard. In our study, on-site data identification and a comparison against the surgical logs complemented our assessment of the registration coverage of cases at these facilities. Such on-site comparisons are also important for elucidating how duplication and omission can occur in these nationwide registries. The sites included in the on-site data verification were randomly selected to avoid the bias that could arise from selecting facilities with certain characteristics. However, due to time and cost constraints, we could only conduct on-site data verification at a small proportion of all sites involved in NCD activity.

A few limitations of our study should be noted. First, we were unable to compare all of the types of surgical procedures included in the regional government report data with the NCD data to assess registration coverage because the definitions of many procedures differed between the two databases. Furthermore, while the definitions of lung surgeries and esophageal surgeries were quite similar, small differences remained, as shown in the supplemental table. Second, not all hospitals in Japan participate in the NCD. Therefore, some facilities submitted data to the regional office but not to the NCD. In addition, some facilities submitted data to the NCD but not to the regional office. We tried to identify and link as many facilities in the two databases as possible, but some remained unlinkable. A further, more granular investigation of the characteristics of these facilities, with additional information collection, may clarify the reasons for these discrepancies. Third, it was not possible to compare the NCD data and regional government report data at the individual case level because the government report does not provide the granularity necessary for such an analysis. Finally, due to time and cost constraints, on-site data adjudication was limited to a small number of variables. Therefore, we were unable to discuss the accuracies of some of the important variables in the database, especially the procedure type and postoperative complications. Data validation activities are now underway in many NCD registries, including the Japan Cardiovascular Surgery database, gastroenterological surgery database, breast cancer registry, vascular surgery registry, coronary intervention registry, and respiratory surgery database. Target data components for these validation studies conducted by the governing/database committees of each registry include specialty-specific data components, such as comorbidities and postoperative complications (e.g., reports by Tomotaki et al. [12] and Takahashi et al. [24]). While the activity is time-consuming and labor-intensive, it builds the basis for scientific research conducted using these databases and is, therefore, one of the most important activities for the governing/database committees.

Conclusion

This study showed high registration coverage of data in the NCD 2011 for the specified lung and esophageal surgeries. In addition, the NCD data showed high accuracy for the registration of basic variables, including patient characteristics and the discharge mortality status. Future studies should evaluate the accuracies of other important variables, such as postoperative complications and procedure types.