The National Cancer Database (NCDB) is a hospital-based cancer registry that has been collecting data on cancer patients in the United States since 1989.1 The NCDB is a joint project of the American College of Surgeons Commission on Cancer (CoC) and the American Cancer Society.2 Since 1998, hospitals accredited by the CoC have been required to submit all their cases with a new diagnosis to the NCDB. Currently, more than 1500 hospitals submit data to the NCDB, which in 2018 included 37 million records for 31 million unique patients with a new diagnosis in 1985–2015.

The NCDB includes diagnostic, staging, treatment, and outcomes information. Two prior studies assessed case completeness and found that the NCDB captured approximately 69% and 67%, respectively, of all U.S. cancer cases compared with national cancer registry data.3,4 The first study used projected estimates of cancer incidence in 2005 to compare coverage, whereas the second study used 2004–2006 United States Cancer Statistics (USCS) population-based comparison data. The USCS data include population-based cancer incidence data from the Centers for Disease Control and Prevention (CDC) National Program of Cancer Registries (NPCR) and the National Cancer Institute Surveillance, Epidemiology, and End Results (SEER) Program. Together, these registries cover 100% of the U.S. population.5 This study aimed to update the NCDB comparisons using USCS data from 2012 to 2014.

Methods

Incident cancer cases in the NCDB in 2012–2014 were compared with the number of cancer cases in the USCS data for the 2012–2014 diagnosis years. Comparisons were made by primary site, age, sex, race/ethnicity, and the patient’s state of diagnosis. Patients living in the 50 U.S. states and the District of Columbia at the time of diagnosis were included in the analyses. The study excluded NCDB patients with a diagnosis and/or treatment in Veterans Administration (VA) hospitals. Although the USCS data include some patient data from VA hospitals, the exact number of VA patients in the USCS is unknown. A survey administered in December 2014 showed 18 NPCR states reporting that they received no data from VA hospitals and 8 states reporting that some but not all VA hospitals were reporting data to their cancer registry (Reda Wilson, personal communication, 8 January 2018).

The USCS cancer data exclude in situ cases except for urinary bladder and include separate primary-site categories for invasive and in situ breast cancer. In addition, a separate brain and cranial nerve data file is available, which includes invasive and benign tumors. The other primary-site categories in the USCS data use the SEER site recode (ICD-O-3) groups,6 which are also used by the NCDB. The coverage metric is the number of cases in the NCDB divided by the number of cases in the USCS, expressed as a percentage.
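The coverage metric described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's SAS code: the function name `coverage_by_group`, the group labels, and the counts are all hypothetical placeholders and are not figures from the study.

```python
def coverage_by_group(ncdb_counts, uscs_counts):
    """Return NCDB case coverage (%) per group: 100 * NCDB cases / USCS cases."""
    coverage = {}
    for group, uscs_n in uscs_counts.items():
        if uscs_n <= 0:
            # Coverage is undefined without a population-based denominator.
            continue
        ncdb_n = ncdb_counts.get(group, 0)  # groups absent from the NCDB count as 0
        coverage[group] = round(100.0 * ncdb_n / uscs_n, 1)
    return coverage


# Placeholder counts for illustration only (hypothetical, not study data).
ncdb_counts = {"site_a": 800, "site_b": 260}
uscs_counts = {"site_a": 1000, "site_b": 500}
print(coverage_by_group(ncdb_counts, uscs_counts))
# {'site_a': 80.0, 'site_b': 52.0}
```

The same function applies unchanged to any of the comparison groupings (primary site, age, sex, race/ethnicity, or state), provided the NCDB and USCS counts use identical group definitions.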

The number of CoC hospitals providing data in 2012–2014 was also assessed and compared with the number of acute-care short-term hospitals in the United States in 2014 from the Area Health Resources Files.7 Short-term hospitals are general or specialized hospitals in which the majority of patients stay fewer than 30 days. Long-term facilities as well as psychiatric and specialty hospitals were excluded. All analyses were performed using SAS software (version 9.4; SAS Institute, Cary, NC, USA).

Results

In 2012–2014, the NCDB captured 72.5% of the cancer cases in the United States (Table 1), which is slightly higher than the 67.4% reported for 2004–20064 and the 68.6% reported for 2005.3 Case coverage was slightly higher in 2014 (73.2%) than in 2012 (71.6%) (Table 2). Among the top 10 major cancer sites, the highest case coverage (80%) was found for breast cancer, and the lowest for melanoma of the skin (52%) and prostate (58%). Colon, bladder, and kidney and renal pelvis cancers had relatively high coverage of 71%, 70%, and 78%, respectively, whereas lung and bronchus had slightly lower coverage (65%). For malignant brain and other nervous system tumors, coverage was 86%, but for benign/in situ brain and other nervous system tumors, coverage was only 58%.

Table 1 Case coverage for National Cancer Data Base (NCDB) by cancer site and sex in 2012–2014
Table 2 Case counts and coverage for United States Cancer Statistics (USCS) and National Cancer Data Base (NCDB) by diagnosis year, race/ethnicity, and age for all cancer sites in 2012–2014

Comparing coverage by sex, females had slightly higher coverage (77%) than males (68%). This is partially explained by the high coverage for breast cancer compared with the lower coverage for prostate cancer. For most primary sites, the coverage rates for males and females were similar. Relatively high coverage was also found for most female gynecologic sites, such as ovary (83%), corpus and uterus NOS (82%), and cervix (83%).

Age group comparisons showed the lowest coverage (60%) for the 85 and older age group, with the highest coverage for those 20–64 years of age (77%) (Table 2). The pediatric age group (0–19 years) had 68% coverage, and the adolescent and young adult (AYA) group (15–39 years) had 79% case coverage.

Race and ethnicity comparisons showed that coverage was similar for whites (73%) and blacks (73%), intermediate for Asian/Pacific Islanders (68%), and lowest for Hispanics (55%) and American Indian/Alaskan Natives (AIAN) (41%).

Differences in coverage by the patient’s state at diagnosis are influenced by the number of CoC-accredited hospitals in the state or surrounding states. The state of Wyoming, which has only one CoC-accredited facility, had 51% coverage, whereas North Dakota, which has six CoC-accredited facilities, had the highest coverage (98%) (Table 3). Three states and the District of Columbia had more than 90% coverage, and 14 states had 80% to 89% coverage. The lowest coverage occurred in Nevada (41%) and Arizona (32%).

Table 3 Case coverage by patient state for all cancer sites in 2012–2014

The states were also grouped by census division. The East North Central and New England divisions had the highest coverage, at 85% and 81%, respectively (Table 4). The lowest coverage was found in the Mountain and Pacific divisions, at 55% and 60%, respectively. Census division case coverage for the top 10 cancers mirrored that for all cancers, with New England and East North Central having the highest coverage and the Mountain and Pacific divisions having the lowest. Prostate cancer is one example, with case coverage of 73% in the East North Central division and 74% in the New England division, compared with 43% in the Mountain division and 47% in the Pacific division.

Table 4 Case coverage by patient census division and primary site: all sites, and top 10 primary sites

After VA hospitals were excluded, there were 1475 CoC-accredited hospitals in 2012–2014 compared with 5927 acute-care hospitals in the United States in 2014, so that CoC-accredited hospitals represented 25% of acute-care hospitals. The distribution of hospital coverage and case coverage is shown in Fig. 1. In general, higher hospital coverage was accompanied by higher case coverage. The states in the Northeast had the highest hospital and case coverage, whereas Arkansas, Arizona, and Nevada had low hospital coverage and low case coverage.
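As a quick arithmetic check on the hospital share reported above, the two counts from the text can be divided directly. The function name `hospital_coverage_pct` is a hypothetical helper introduced for illustration only.

```python
def hospital_coverage_pct(coc_hospitals, acute_care_hospitals):
    """Percentage of acute-care hospitals that are CoC-accredited."""
    if acute_care_hospitals <= 0:
        raise ValueError("acute_care_hospitals must be positive")
    return 100.0 * coc_hospitals / acute_care_hospitals


# Counts taken from the text: 1475 CoC-accredited hospitals (VA excluded)
# and 5927 acute-care short-term hospitals in 2014.
print(f"{hospital_coverage_pct(1475, 5927):.1f}%")  # 24.9%
```

Rounded to a whole number, this matches the 25% stated in the text.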

Fig. 1 Percentage of National Cancer Data Base (NCDB) case coverage in 2012–2014 by percentage of NCDB short-term acute-care hospital coverage by state in 2014

Discussion

Overall coverage of cancer cases in the NCDB has remained relatively stable at 72%, a slight increase over the 67% found in 2004–2006.4 Case coverage also increased slightly between 2012 and 2014, as did the number of CoC-accredited facilities (excluding VA facilities), which increased from 1455 to 1475, representing approximately 25% of acute-care facilities. Most primary sites have seen slight increases in coverage since 2004–20064 or have remained about the same. The primary sites at which cancer is often diagnosed and/or treated in an outpatient setting, such as prostate and melanoma of the skin, had the lowest case coverage. Coverage was slightly higher for females than for males, which is partially explained by the high coverage for breast cancer and low coverage for prostate cancer. In addition, some USCS states include data from VA hospitals, which are not included in the NCDB data.

Coverage increased in most states compared with 2004–2006.4 Since 2004–2006, 25 states had increases in coverage of 5% or more, and 7 of these states (New York, Wisconsin, Michigan, Nebraska, West Virginia, Louisiana, and Wyoming) had increases of 10% or more. These increases may reflect facilities newly accredited in these states or surrounding states since 2006.

The states in the Northeast had the highest coverage of both hospitals and cases, and the West and Southwest had both lower case coverage and lower hospital coverage. Data by U.S. census divisions show similar findings, with the highest coverage in the New England (82%) and East North Central (85%) census divisions and the lowest coverage in the Mountain (55%) and Pacific divisions (60%). Similar coverage patterns by major cancer sites also were found by census division. These patterns may also be due to the higher population density in the East, where distance to a CoC facility may be shorter than for patients in lower-density areas in the West and Southwest. Although 93% of the patients received diagnosis, treatment, or both at a facility in their state of residence when their cancer was diagnosed, there was variation by state of residence. The states with a small number of CoC facilities were more likely to have a higher percentage of patients with treatment in a different state from their state of residence (data not shown).

Comparisons by race for all cancer sites showed that whites and blacks have a similar coverage rate (73%). Asians and Pacific Islanders had slightly lower coverage (68%) than whites and blacks. The lowest coverage (41%) was found for American Indians and Alaskan Natives. In 2010, the states with the highest American Indian populations in the United States were California, Oklahoma, Arizona, and Texas,8 which had relatively low coverage rates in the NCDB. In addition, the SEER and NPCR cancer registries are able to link their data to the Indian Health Service administrative records to help identify the American Indian and Alaskan Native population.9

Hispanic coverage in the NCDB was only 56%. The lower coverage rates for Hispanics, Asians, and Pacific Islanders may be partially because the North American Association of Central Cancer Registries (NAACCR) uses an algorithm to identify Hispanics as well as Asians and Pacific Islanders, and this algorithm is not available to hospitals.9 In addition, California, Texas, Arizona, and New Mexico have the highest percentage of Hispanics in the United States,10 and these states also have lower case coverage in the NCDB.

Lower coverage rates for the oldest group may be partially due to the fact that the state registries also identify a small percentage of cases based on death certificates alone.11 Although no more than 5% of cases in the registry are allowed to be based on death certificate alone, these may be more likely to occur for elderly patients. The NCDB data do not include death certificate-alone cases, and this may have a slight impact on the coverage of the oldest cancer patients.

The fact that the NCDB case coverage has remained relatively stable during the past decade is relevant to users of the NCDB Participant User Files (PUFs). These files are a de-identified subset of the NCDB data available to researchers at CoC facilities.12 However, users of PUFs should be aware of the differences in case coverage by primary site, geography, age, race, and Hispanic origin documented in this report when analyzing data for any of these subgroups.