Introduction

Appalachia is a 13-state, 200,000 square mile region that follows the spine of the Appalachian Mountains from southern New York through northern Mississippi (Fig. 1). The region was formally defined by federal legislation in 1965, with additional counties added in subsequent years. Appalachia has long been recognized as geographically isolated, economically depressed and possessing a distinct cultural identity. The Appalachian Regional Commission (ARC) was formed during the 1960s’ War on Poverty to address overlapping economic, social and health problems. ARC health initiatives have focused on construction of new health facilities and reducing the region’s high infant mortality rate [1]. Appalachian residents often intuitively perceive higher cancer mortality rates in their region relative to the rest of the country [2], but widespread recognition of the legitimacy of geographic place-based disparities has been slow to gain national traction [3]. Historically, reliable information outlining the health status of people within Appalachia has been scarce because it is a large region spanning all or part of 13 states [4].

Fig. 1 The Appalachian region of the United States, 2007 Appalachian Regional Commission definition

A limited body of research has documented the cancer mortality burden in Appalachia. Studies in recent decades have identified elevated Appalachian mortality rates for lung cancer [5–9], colorectal cancer [5, 9], female breast cancer [7], cervical cancer [5, 8–10] and all cancers combined [5–9]. However, the data upon which these studies were based are now dated. In addition, most study designs are descriptive, and compare Appalachian cancer mortality rates to those for the entire United States. While these comparisons may be valuable for initial identification of geographic differences, comparing cancer mortality of Appalachian versus non-Appalachian counties within states could better account for variations in relevant risk factors distorted by broader geographic comparisons.

Our objective is to present cancer mortality statistics that compare Appalachia with the nation, with additional, primary focus on comparisons of the Appalachian and non-Appalachian regions within the 13 states containing Appalachia. Supplemental statistical analysis explains part of the variations in cancer mortality rates. This approach may clarify more effective cancer control options for health practitioners, public health officials, policy makers and citizens of Appalachia by identifying how certain factors differentially influence cancer mortality in Appalachian and non-Appalachian populations in the region. For clarity, the term “Appalachia” refers to the 410 county multi-state region designated in 2007 by the ARC. The term “non-Appalachia” refers to all other counties within the 13 states.

Factors Associated with Cancer Mortality

Much of the literature about Appalachian health hypothesizes that regional differences in socioeconomic conditions contribute to Appalachia’s poor health [5, 7]. There is limited research focused on identifying associations between specific indicators and cancer mortality. Community-level factors have an influence on mortality independent of individual characteristics [3]. Moreover, ecological variables such as income, education, rurality and percent minority population have been associated with cancer rates in numerous national studies, but analysis to test the validity of these findings in sub-regions and for specific cancer types is limited [11].

Income

The Appalachian poverty rate exceeds the national rate by 14% [12]. The association between low economic status and poor health is such a pillar of public health practice and research [13] that poverty has been referred to as a “carcinogen” [14]. Low income areas are characterized by a lack of health care resources, information and knowledge that can facilitate healthy lifestyles [15]. Low income populations have poorer cancer survival rates than middle and higher-income populations [16].

Education

Evidence strongly links educational attainment with health outcomes [17], and historical data show low high school graduation rates in Appalachia, ranging from 87 to 95% of the national average in recent decades [18]. National research identifies an inverse relationship between education level and cancer mortality rates, with variations in the strength of this relationship depending on tumor site. For example, the variation in lung cancer mortality by education level is larger than the variation observed for other cancer sites [19]. By 2000, lung cancer death rates were twice as high among low education white men and women and black men, compared to the more educated within each of these groups [20]. Risk factor prevalence also varies according to education level—the difference in the prevalence of smoking by educational achievement has widened in the United States over recent decades [19].

Rurality

Thirty-four percent of Appalachian counties are considered rural by our definition (described later). Past research identifies higher mortality rates in rural Appalachia compared to the United States for all cancers combined, lung cancer and cervical cancer [8, 9]. Nationally, and within Appalachia, rural residence is associated with higher poverty rates than non-rural residence [7]. Rural residents have fewer physician visits, lower use of standard preventive care, lower percentage of health insurance coverage and less access to clinical trials [8, 21]. Rural cancer patients are less likely to receive state-of-the-art care because of limited geographic access to specialized diagnostic and treatment services [22].

Minority Population

The relative size and percentage of the minority population in Appalachia varies, with historically low representation in rural counties and in Central and Northern Appalachia. Nationally, African Americans and Native Americans have lower all-cancer survival rates than whites [14]. African Americans have the highest death rate from all cancers combined, lung cancer, colorectal cancer, female breast cancer and cervical cancer of all racial/ethnic groups in the United States. The cancer death rate is 40% higher for African Americans than whites for males, and 20% higher for African Americans than whites for females [14].

Are cancer mortality rates in Appalachia higher than national rates? Within the 13 states that contain Appalachian counties, are cancer mortality rates higher in Appalachia compared to non-Appalachia? Subsequently, how much of the hypothesized variations in these rates are explained by select ecological variables? Analysis of associated factors and designs of interventions to address cancer mortality differences requires consideration of the unique profile of a region to improve the likelihood of effectiveness [23]. Appalachia’s length and breadth, and its complex community and regional dynamics make generalizations from national comparisons alone problematic [24, 25]. Our analysis is designed to identify differences between Appalachia and the nation as well as between Appalachia and non-Appalachia within the 13 state region, with respect to socioeconomic/demographic status and site-specific cancer mortality rates. Up-to-date, relevant data could contribute to more effective cancer control measures in the Appalachian region.

Methods

We conduct two levels of analysis. First, a descriptive analysis of cancer mortality rates compares each state’s Appalachian counties with (1) the nation and (2) in-state non-Appalachian counties. Second, further statistical analysis is conducted to examine associations of multiple ecological variables with Appalachian and non-Appalachian cancer rates. County-level data are used because cancer mortality rates and complementary socioeconomic and demographic data are uniformly available [26]. Multi-year data are aggregated to limit potential volatility of single year county rates. Table 1 describes variables in the dataset compiled on each county and independent city (n = 1,100) in the 13 states that include Appalachia.

Table 1 Operationalization and data sources for county-level dataset

Lung/bronchus, colorectal, female breast and cervical cancers, as well as a grouping of all cancers combined, are included in descriptive analysis because Appalachian differences have been proposed in the literature [5–10]. We use SEER*Stat 7.0.4 software to calculate site-specific, age-adjusted cancer mortality rates for Appalachia and non-Appalachia in each state, with significance at α = 0.05. National vital statistics provide coverage of nearly all deaths within the region [19]. National Center for Health Statistics data include age at death, sex, county of residence, underlying cause of death and contributing cause(s) of death for each decedent in the United States by year. Deaths are coded according to International Classification of Disease, 10th Revision (ICD-10) standards, and only underlying causes of death are used to calculate mortality rates [28].
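To illustrate what an age-adjusted rate represents, the following is a minimal sketch of direct age standardization, the conventional approach to age adjustment. It is not SEER*Stat's implementation, which is not described here. The function name, the three age bands, the counts and the standard-population weights are all hypothetical and chosen only for illustration.

```python
def age_adjusted_rate(deaths, populations, std_weights, per=100_000):
    """Directly age-standardized rate.

    Each age band's crude rate (deaths / population) is weighted by that
    band's share of a standard population, then summed.
    """
    if not (len(deaths) == len(populations) == len(std_weights)):
        raise ValueError("inputs need one entry per age band")
    total_weight = sum(std_weights)
    rate = 0.0
    for d, n, w in zip(deaths, populations, std_weights):
        rate += (d / n) * (w / total_weight)
    return rate * per


# Hypothetical region with three age bands (young, middle, old).
deaths = [10, 120, 900]
populations = [50_000, 40_000, 30_000]
# Illustrative weights only; NOT the actual U.S. standard population.
std_weights = [0.5, 0.3, 0.2]

crude = sum(deaths) / sum(populations) * 100_000
adjusted = age_adjusted_rate(deaths, populations, std_weights)
print(f"crude rate:        {crude:.1f} per 100,000")    # 858.3
print(f"age-adjusted rate: {adjusted:.1f} per 100,000")  # 700.0
```

In this made-up example the region is older than the standard population, so the adjusted rate is lower than the crude rate. Adjustment of this kind keeps age structure from masquerading as a regional difference, which matters because Appalachian counties have older residents than non-Appalachian counties.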

Following descriptive analysis, we calculate bivariate correlations for associations between cancer mortality rates and selected, unmodified independent variables. This step and subsequent analysis are restricted to cancer sites that show consistent Appalachian/non-Appalachian differences in descriptive-level analysis. For Appalachia and non-Appalachia, we determine whether variable pairings have significant and meaningful correlations. We subsequently develop explanatory models using multiple linear regression to discern independent contributions of predictor variables to cancer mortality rates.
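The bivariate step can be sketched as a Pearson correlation computed separately within each stratum. The code below is an illustrative sketch only. The county values are invented, and the function name `pearson_r` is a hypothetical helper rather than part of any software used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Invented county-level values for one stratum (not study data).
median_income_thousands = [30, 35, 40, 45, 50]
all_cancer_rate = [222, 210, 205, 196, 190]  # deaths per 100,000

r = pearson_r(median_income_thousands, all_cancer_rate)
print(f"r = {r:.3f}")  # negative: higher income pairs with lower mortality
```

Running the same function on the Appalachian and non-Appalachian subsets separately yields the kind of stratum-by-stratum comparison reported in the correlation tables.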

Results

Regional Characteristics

The percent Appalachian population across the 13 states ranges from 5% (New York) to 100% (West Virginia). Pennsylvania (n = 5,741,255) has the largest Appalachian population; Maryland (n = 246,387) has the smallest. Those residing in Appalachian counties constitute 24% of the total population of these 13 states. Compared with non-Appalachian counties, Appalachian counties are characterized by: (1) older residents, (2) lower household incomes, (3) lower high school graduation rates, (4) smaller minority populations and (5) smaller population sizes (Table 2).

Table 2 Results of independent samples t test comparing Appalachian and non-Appalachian counties in 13 states

Descriptive Epidemiology

Mortality rates in the 13 states that include Appalachia exceed rates in the rest of the country (other 37 states and Washington, DC) for all cancers combined (7% higher), lung/bronchus cancer (13%), colorectal cancer (8%), female breast cancer (7%) and cervical cancer (8%). Each of these differences is significant (P < 0.05). Unless otherwise noted, statistical significance for rate differences in descriptive analysis is set at α = 0.05. Further, within these 13 states, mortality rates for Appalachian counties exceed those of non-Appalachian counties for all cancers combined (5%), lung/bronchus cancer (13%), colorectal cancer (1%) and cervical cancer (4%). Conversely, female breast cancer mortality rates are 4% higher in non-Appalachia than in Appalachia. Each of these differences is significant with the exception of cervical cancer mortality.

Mortality rates for all cancers combined exceed national rates in each of the 13 Appalachian regions (Table 3). Within the 13 states, Appalachian counties in Kentucky (10% higher), New York (11%), Ohio (5%) and Virginia (8%) have significantly elevated all-cancer mortality rates compared to non-Appalachian counties. While West Virginia has no in-state, non-Appalachian comparison group, it has the second highest all-cancer mortality rate of the 13 state Appalachian regions, and it has a higher mortality rate than all non-Appalachian regions except Kentucky. Alabama (3% higher), Mississippi (3%) and North Carolina (3%) have significantly higher all-cancer mortality rates in non-Appalachian counties.

Table 3 Site-specific cancer mortality rates per 100,000 by state and Appalachian status, 2003–2007

Lung cancer mortality rates exceed national rates in each of the 13 Appalachian regions (Table 3). Appalachian counties in Alabama (3% higher), Georgia (9%), Kentucky (17%), Maryland (10%), New York (22%), Ohio (15%), South Carolina (6%), Tennessee (3%) and Virginia (20%) have significantly higher lung cancer mortality rates than their non-Appalachian counterparts. West Virginia’s mortality rate is third highest among Appalachian regions, and is 23% higher than the rate for all non-Appalachian counties combined. No non-Appalachian regions have significantly higher lung cancer mortality rates than their corresponding in-state Appalachian region.

Colorectal cancer mortality rates in 9 of 13 Appalachian regions exceed national rates (Table 3). No consistent in-state differences emerge from analysis of colorectal cancer mortality rates. Ohio is the only state where Appalachia has a significantly elevated (9%) mortality rate. Alabama is the only state where non-Appalachia has a significantly elevated (8%) mortality rate. Appalachian Kentucky has the highest colorectal cancer mortality rate, while Appalachian North Carolina has the lowest.

Female breast cancer mortality rates in 9 of 13 Appalachian regions exceed national rates (Table 3). No state Appalachian region has significantly higher female breast cancer mortality than its non-Appalachian counterpart. Non-Appalachian counties in Georgia (10% higher), Ohio (9%) and Tennessee (13%) have significantly elevated female breast cancer mortality rates compared to their Appalachian counterparts. Appalachian Virginia has the highest female breast cancer mortality rate; Appalachian Georgia has the lowest.

Cervical cancer mortality rates in 7 of 13 Appalachian regions exceed national rates (Table 3). One non-Appalachian region (Georgia) has significantly higher cervical cancer mortality than its Appalachian counterpart, while one Appalachian region (Ohio) has significantly higher cervical cancer mortality than in-state non-Appalachian counties. Non-Appalachian Mississippi has the highest cervical cancer mortality; Appalachian Virginia has the lowest.

Correlation Analyses

Consistent mortality rate differences between Appalachian and non-Appalachian counties in the 13 states are identified for all cancers combined and lung/bronchus cancer. The rest of our analysis focuses on these two groupings.

For Appalachia and non-Appalachia, unadjusted associations between mortality from all cancers and median household income, high school graduation rate and percent white population are analyzed by calculating Pearson’s coefficients (Table 4). All correlations are significant at P < 0.01 (two-tailed). In both strata, income and high school graduation rates are inversely associated with the all-cancer mortality rate. The absolute value of income’s negative correlation with all-cancer mortality is slightly larger in Appalachia than in non-Appalachia. The absolute value of high school graduation’s inverse association with all-cancer mortality is much larger in Appalachia than non-Appalachia. Percent white population is positively correlated with the all-cancer mortality rate for both groupings.

Table 4 Pearson correlation coefficients for independent variables and cancer mortality rates

The same associations are analyzed for lung/bronchus cancer (Table 4). All correlations are significant at P < 0.01. In both regions, income and graduation rates are inversely associated with lung cancer mortality rates. Correlations between income and lung cancer mortality are nearly identical in Appalachia and non-Appalachia. The inverse association between the graduation rate and the lung cancer mortality rate is larger in Appalachia than non-Appalachia. The difference in the size of this negative correlation is larger for lung cancer than for all-cancer. Percent white population is correlated with higher lung cancer mortality rates in both regions, and the correlation coefficient is larger for non-Appalachia.

Linear Regression Models

Multicollinearity represents a threat to the reliability of regression coefficients. Checks for multicollinearity in each of the least squares regression models yield variance inflation factors within the acceptable range, indicating that coefficient variances are not unduly increased as a result of collinearity between independent variables. Tests for effect modification between the dichotomous Appalachian/non-Appalachian and rural/non-rural independent variables and the cancer mortality outcome variable in the two non-stratified models yield non-significant p-values in initial models; these interaction terms are removed from the final version of each model.

Model 1, an explanatory model of all-cancer mortality, is generated by regressing mortality rates of all counties in the 13 states on selected independent variables (Table 5). We examine the unique contributions of selected independent variables while controlling for effects of other variables. Median household income has a significant negative association with all-cancer mortality. The association between percent high school graduation and all-cancer mortality is not significant. Percent white population is positively associated with the all-cancer mortality rate. Thus, controlling for other variables, as the percent minority population increases, all-cancer mortality decreases. Rural residence is not independently associated with all-cancer mortality rates. Living in the Appalachian region is independently associated with lower all-cancer mortality rates. These variables account for about 26% of the variance in all-cancer mortality rates.
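For readers unfamiliar with population-weighted least squares, the sketch below shows the standard estimator, which solves the weighted normal equations (XᵀWX)b = XᵀWy with county population as the weight. This is an assumption about the general technique, not a description of the software or exact specification used for the models reported here. All function names and county values are hypothetical, and the outcome is constructed to be exactly linear so that the fit recovers the known coefficients up to floating-point rounding.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (perfect collinearity)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_least_squares(rows, y, weights):
    """Return [intercept, b1, b2, ...] minimizing sum(w * residual**2)."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtwx = [[sum(w * d[i] * d[j] for d, w in zip(design, weights))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * d[i] * yi for d, yi, w in zip(design, y, weights))
            for i in range(k)]
    return solve_linear_system(xtwx, xtwy)


# Invented counties: [median income in $1,000s, Appalachian indicator (0/1)].
predictors = [[30, 1], [35, 0], [40, 1], [45, 0], [50, 1], [55, 0]]
population = [12_000, 250_000, 8_000, 90_000, 30_000, 600_000]
# Outcome built as 300 - 2*income + 5*appalachian, so the truth is known.
rate = [300 - 2 * inc + 5 * app for inc, app in predictors]

coefs = weighted_least_squares(predictors, rate, population)
print([round(c, 3) for c in coefs])
```

With real data the residuals are nonzero, and the weights then pull the fitted coefficients toward the pattern seen in the most populous counties, which is the trade-off discussed under Strengths and Limitations.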

Table 5 Least squares regression analysis, weighted by population, dependent variables all-cancer and lung/bronchus cancer mortality rates per 100,000 population

Model 2 examines the contributions of independent variables to lung/bronchus cancer mortality (Table 5). Median household income has a significant negative association with lung cancer mortality. The association between percent high school graduation and lung cancer mortality is negative, and unlike in Model 1, is significant. The percent white population is positively associated with lung cancer rates. Controlling for other variables, living in Appalachia is associated with lower lung cancer mortality rates. These variables account for about 34% of the variance in lung cancer mortality rates, 8 percentage points more than in Model 1.

In Model 3 we stratify the data by Appalachia and non-Appalachia and generate a regression model for all-cancer mortality (Table 6). Median household income is negatively associated with all-cancer mortality rates in Appalachia and non-Appalachia. The two strata have differences in the direction of association between the high school education variable and all-cancer mortality. In Appalachia, there is a significant negative association between percent high school graduates and all-cancer mortality; in non-Appalachia, there is a significant positive association. There is no significant association between percent white population and all-cancer mortality in Appalachia; in non-Appalachia there is a positive association. Rural residence is not independently associated with all-cancer rates in either region. These variables explain more of the variance in Appalachia (Adj R2 = 0.296) than non-Appalachia (Adj R2 = 0.238).

Table 6 Least squares regression analysis, weighted by population, stratified by Appalachian/non-Appalachian status, dependent variables all-cancer and lung/bronchus cancer mortality rates per 100,000 population

In Model 4 we again stratify by Appalachia and non-Appalachia for lung/bronchus cancer mortality. Median household income is significantly associated with reduced lung cancer mortality rates in non-Appalachia, but not in Appalachia (Table 6). The negative association between high school graduation rate and lung cancer mortality in Appalachia is stronger than the corresponding association for all cancers combined. This association is again in the opposite direction (positive) for non-Appalachia. Percent white population is positively associated with lung cancer mortality in Appalachia and non-Appalachia. Rural residence is not independently associated with lung cancer mortality rates in either stratum. The variance explained in each stratum is about 35%.

Discussion

This study reports on cancer mortality in the Appalachian region of the United States. Analysis of associations between socioeconomic/demographic variables and cancer mortality rates provides new insight into why certain cancers may disproportionately affect residents of Appalachia.

Descriptive Epidemiology

We confirm a pattern of higher cancer mortality rates in the 13 states containing Appalachia compared to the rest of the United States. We also identify differences between Appalachian and non-Appalachian counties within the 13 states, but these are not consistent for all cancer types. Although Appalachia has a unique identity, it is not homogeneous, so in-region variations across and within states with regard to socioeconomic status and cancer outcomes should be expected. Previous research undertaken to identify regional disparities compared Appalachian rates to those for the entire United States [5, 7–10]; our within-state approach contributes an additional and potentially more sensitive analysis to identify cancer mortality disparities.

Regional and intra-state comparisons identify lung cancer as a major problem in Appalachia. The states with the six highest Appalachian lung cancer mortality rates form a contiguous geographic chain from northeast Mississippi in the south to southern Ohio in the north, and include eastern Tennessee, western Virginia, eastern Kentucky and all of West Virginia. Notably, each state Appalachian region with significantly elevated all-cancer mortality (Kentucky, New York, Ohio and Virginia) also has significantly elevated lung cancer mortality.

Linear Regression Models

The most valuable information derived from Model 1 may be the lack of a significant association between the education indicator and all-cancer mortality when controlling for other variables, and the absence of an independent rural effect. The education variable is significantly associated with lower lung cancer mortality in Model 2. High school education levels may be a better predictor for lung cancer mortality than for mortality from all cancers. This finding does not rule out the importance of education or rurality to all-cancer mortality rates. Rather, it suggests that additional unmeasured variables associated with these may be more strongly associated with mortality. The strength of the negative association between median household income and mortality rates for all cancers combined is more than twice that for lung cancer. If this relationship has causal elements, increases in household income within the 13 states could result in a larger relative reduction of mortality rates for all cancers combined than for lung/bronchus cancer alone.

In Model 3, the reasons behind the differential association between education and all-cancer mortality between strata are likely complex, but previous unpublished research on a subset (Tennessee, Kentucky, Virginia) of this population observed an association with similar dynamics [6]. This research did not identify a reversal in the direction of this association, but did find a weaker, non-significant negative association for non-Appalachia as opposed to a significant negative association for Appalachia. Within Appalachia, opinions on the perceived value of continuing formal education versus addressing immediate personal or family financial needs through work may differ from opinions held outside the region. Schooling offers hope for positive change to rural and impoverished areas, but research suggests that both parents and educators in rural regions may stress the importance of physical labor over careers that require furthering one’s education [29]. Within an Appalachian school-age population, low high school completion rates have been associated with the perception that educational attainment is not linked to economic circumstance [29]. Alternatively, unmeasured characteristics more prevalent in large urban populations in non-Appalachia (e.g. New York City, Philadelphia, Baltimore, Memphis, Atlanta) could also influence the dynamics of this distinction.

The differential association between percent white population and all-cancer mortality is also notable. National research found positive associations between percent African American population and all-cause mortality independent of income, education, physician supply and region [3, 30]. Our data quantify only the size of the total minority population by county. The minority population of Appalachia is predominantly African American, although Hispanics constitute a growing proportion [31]. Our findings suggest a small protective effect in non-Appalachia of residing in counties with a large percent minority population, but this relationship is likely confounded by other factors associated with the variable of interest. Urban areas with a higher percent minority population may have better overall access to preventive care, diagnostic services and treatment, and benefit from the existence of culturally competent and directed minority personal and public health programs. The Appalachian region, largely rural and with a relatively small minority population, lacks many of these targeted initiatives. Another possibility is that minorities in areas with a higher percent white population have elevated exposure to risk factors not accounted for in these models.

Model 4 results are partially consistent with associations observed in previous research [6]. Within Appalachia, we observe a strong negative association between percent high school graduates and lung cancer mortality rates. The beta coefficient for percent high school graduates in Appalachia is approximately negative one. Assuming long-term consistency of this relationship, within Appalachia, each 1% improvement in the high school graduation rate could result in one fewer lung cancer death annually per 100,000 residents. Given the region’s total population, this reduction could be interpreted as approximately 240 fewer lung/bronchus cancer deaths per year. The association between changing education levels and health outcomes would have a long latency period [30], but an emphasis on improving high school graduation rates in Appalachia could result in a meaningful reduction in lung cancer mortality. Local, state and national officials have varying levels of influence on education policy, but our findings indicate an opportunity for education stakeholders at every level to help reduce the burden of lung cancer mortality in Appalachia.
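The arithmetic behind the estimate of roughly 240 deaths can be laid out explicitly. In the sketch below, the coefficient of about negative one comes from the text above, while the Appalachian population of 24 million is an assumption back-calculated from the 240 figure rather than a value reported in this section. The function name is hypothetical.

```python
def deaths_averted_per_year(beta, points_gained, population):
    """Translate a regression coefficient into annual deaths averted.

    beta is the change in deaths per 100,000 residents for each one-point
    rise in the high school graduation rate.
    """
    rate_change = beta * points_gained  # change in deaths per 100,000
    return -rate_change * population / 100_000


BETA_GRADUATION = -1.0            # "approximately negative one" (from the text)
ASSUMED_POPULATION = 24_000_000   # assumption implied by the 240 figure

print(deaths_averted_per_year(BETA_GRADUATION, 1.0, ASSUMED_POPULATION))  # 240.0
```

The calculation is linear, so it inherits every caveat of the underlying model: it presumes the ecological association is causal, stable over time and uniform across the region.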

In models 3 and 4, the negative relationship between median household income and mortality rates is stronger for all cancers combined than for lung/bronchus cancer alone. In both models, the beta coefficient is also stronger for non-Appalachia than for Appalachia, indicating that the relative importance of income to these cancer outcomes may be greater in the non-Appalachian portions of these 13 states. Focused inquiry into these interesting distinctions may yield more refined conclusions about the regional importance of household income to cancer outcomes.

Strengths and Limitations

Associations at the aggregate level may not exist at the individual level. We use county-level cancer mortality rates and socioeconomic data, and can’t infer individual-level effects of predictor variables on cancer outcomes. However, it is likely that variations in indicators related to socioeconomic status have individual-level implications [30]. Cross-sectional data don’t account for latency periods between risk factors and cancer mortality outcomes, and it is impossible to determine the direction of causality without longitudinal measures. The independent and dependent variables do not all represent identical time periods, but every effort was made to acquire current data from corresponding multi-year periods.

An important methodological issue is the decision whether to weight data by population size. Weighting could underestimate small county effects and overestimate effects of large counties. However, weighting may allow us to make better inferences about individual-level effects from population data [32]. Risk factors affect individuals, so the number of people contributing to a population parameter is relevant.

Our approach does not include behavioral indicators quantifying physical inactivity and utilization of appropriate cancer screening because these data are not consistently available. Also, our study lacks an indicator quantifying cigarette smoking at the county level. Active smoking is associated with approximately 90% of lung cancer cases [33, 34]; the additional 10% are attributed to environmental factors [35]. There is a high prevalence of tobacco use in communities within Appalachia [36–38], and at least one study cites significantly elevated smoking rates in Appalachia compared to the rest of the nation [39]. We recognize that lung cancer has a profound impact on overall cancer mortality in Appalachia, but comprehensive county-level smoking prevalence data for the region are not available. While our focus on socioeconomic and demographic variables is limited in scope, it allows for acquisition of comparable data for every geographic unit. Results of this study may serve as a baseline for more comprehensive work incorporating behavioral, structural and environmental indicators.

Conclusion

Cancer represents only one condition for which the Appalachian region suffers a disproportionate burden of disease. Halverson’s 2004 report identified higher Appalachian mortality rates for heart disease, stroke, diabetes and motor vehicle accidents [7]. Greater national attention is being devoted to data-driven identification of place-based geographic disparities [3, 23, 26, 30]. The methods used in this analysis could inform those interested in investigating place-based geographic disparities for other historically underserved regions of the country (e.g. the Mississippi Delta and the United States/Mexico border).