Introduction

The United States is a nation of immigrants and cultural diversity. Between 2000 and 2010, the immigrant population increased from 10.4% to 12.5% of the total U.S. population. Immigrants have contributed in many ways to the U.S. economy, including generating business revenue of as much as $10 billion each year and paying at least $80,000 more per capita in taxes than they are expected to use in government services over their lifetimes.Footnote 1 Peri (2010) concluded that immigrants have also helped expand the productive capacity of the nation’s economy by stimulating investment and promoting specialization. In addition, U.S.-born workers and immigrants tend to follow different occupational tracks. Among the more-educated workforce, U.S.-born workers are likely to work as managers, teachers and nurses, whereas immigrants are likely to be engineers, scientists and doctors. Young immigrants with low education tend to take labor-intensive construction jobs, allowing the construction industry to expand and increasing the demand for construction supervisors, managers and designers. This complementary job specialization typically pushes U.S.-born workers toward better-paying jobs, enhances the efficiency of production, and creates jobs. The overall trend of job creation and specialization can generally be explained by the principle of comparative advantage, first introduced by David Ricardo (1819).

Despite these positive contributions by immigrants to the U.S. economy, counterarguments are often raised. For example, Goldman et al. (2006) claimed that immigrants have not contributed a share of health care costs that is in proportion to their population share. Immigrants also tend to use selected health services, such as emergency room visits, more heavily than other services, causing a disproportionate financial burden on the U.S. health care system. However, Mohanty et al. (2005) analyzed the 1998 Medical Expenditure Survey data and found that immigrants spent 55% less on health care than U.S.-born persons on a per capita basis.

The crucial question is why immigrants spend less on health care. One possibility is that they are healthier than U.S. natives. However, this conclusion seems to contradict the existing literature on health production functions. Numerous studies have examined the marginal contribution of environmental, socioeconomic, behavioral, and medical inputs to health.Footnote 2 The general finding is that socioeconomic status and lifestyle are the most important factors that influence health: lower socioeconomic status is associated with poorer health.

Hispanic immigrants generally have less education and higher poverty rates, and tend to work in some of the lowest-paid and most dangerous occupations.Footnote 3 Based on these general characteristics, Hispanics would be expected to have poorer health. However, in a review of the health status of southwestern Hispanics, Markides and Coreil (1986) concluded that the health status of Hispanics in the Southwest is similar to or, surprisingly, better than that of non-Hispanics in the United States, even though they have low socioeconomic status. This study thus presents a Hispanic health “paradox”. Using national survey data, Sorlie et al. (1993) found lower mortality rates for Hispanics relative to non-Hispanic Whites. Using 1986–1990 data from the National Health Interview Survey, Liao et al. (1998) found similar results. Many other studies have also documented this paradox.Footnote 4

A number of explanations for the Hispanic health paradox have been proposed in the literature. For example, both Markides and Coreil (1986) and Scribner (1994, 1996) suggested that the lower mortality is the result of more favorable health behaviors, genetic factors, and greater family support. However, Vega and Amaro (1994), Scribner (1994) and Clark and Hofsess (1998) found that these positive health behaviors decline with acculturation. As Hispanic immigrants gradually adopt the attitudes, customs and behaviors of the culture of the United States, their alcohol consumption, smoking and other risky behaviors increase, as do mortality rates.

On the other hand, Sorlie et al. (1993) and Shai and Rosenwaike (1987) postulate that the lower mortality is caused not by genetic factors but by self-selection into migration. The healthy migrant hypothesis suggests that only the healthiest and strongest Hispanics migrate, bringing superior health with them. Several studies provide evidence to support this hypothesis. For example, the international data used by Marmot et al. (1984) indicated that immigrants have lower mortality rates than the residents of their countries of origin. U.S. data used by Stephen et al. (1994) showed that foreign-born persons have lower mortality rates than U.S.-born individuals.

A second, related hypothesis is the salmon bias hypothesis, which suggests that immigrants tend to return to their birthplaces to retire or to die. Because foreign deaths are not recorded in U.S. vital statistics, the Hispanic mortality rate is artificially lowered. Reichert and Massey (1979) and Gasso and Rosenzweig (1982) estimated emigration rates and did find that a large percentage of foreign-born Hispanics return home. Abraido-Lanza et al. (1999) tested the salmon bias hypothesis using National Longitudinal Mortality Study data; their results provided evidence against it. Franzini et al. (2001) reviewed the published evidence regarding the Hispanic health paradox.Footnote 5

Our study adds to this literature in several ways. First, the earlier studies reviewed in this paper examined the paradox over shorter periods. For example, using the National Health Interview Survey, Liao et al. (1998) covered only 1986–1990 and Hummer et al. (2000) only 1986–1995. Likewise, Singh and Siahpush (2002) used the National Longitudinal Mortality Study but included only the period 1979–1989. This paper uses the Health and Retirement Study (HRS) longitudinal panel dataset in its entirety, from 1992 to 2012, allowing us to examine the Hispanic health paradox over a much longer period than is usual in the literature. Second, the HRS allows us to examine the impact of immigration on health among older Hispanic immigrants rather than just prime working-age individuals.

Most importantly, earlier studies on the paradox tend to use mortality or morbidity to measure the health status of immigrants (e.g., Markides and Coreil 1986; Scribner 1994, 1996). Rather than these gross measures of health, our paper uses two separate measures that more accurately capture a respondent’s health. First, we use a self-reported health measure included in the HRS. As evidenced in the literature, self-reported health is subjective and contains some biases (Currie and Madrian 1999; Anderson and Burkhauser 1985; Dwyer and Mitchell 1999; Datta Gupta and Larsen 2010; Dowd and Todd 2011). To correct for these biases, we also use the self-reported health variable to create a latent health index. Our use of both measures is one of the main contributions of our paper; the main advantage of the latent health index is that it is an improved measure of health that makes our tests of the Hispanic health paradox more accurate. With the rapid growth in the number of Hispanics in the United States, this paper helps fill gaps in our understanding of the health status of this group. Thus, the current research focuses on testing for the existence of the Hispanic health paradox using a new data set with superior measures of health status. We do not attempt to test the underlying reasons for the existence of the paradox, primarily due to limitations of the data set.

In contrast to earlier research (e.g., Sorlie et al. 1993), which found that unlike younger Hispanics, older Hispanics no longer exhibited better health, in our sample of older individuals we find strong evidence of the Hispanic health paradox. In fact, we find that while non-immigrant Hispanics in the sample generally do have relatively poorer health than White natives, even after controlling for other relevant explanatory variables, the generally older Hispanic immigrants are found to have much better health than the majority White population.

Measuring health

One of the innovations in our testing of the Hispanic health paradox is the use of more precise measures of health than previous research on the paradox, which primarily used mortality or morbidity measures (e.g., Markides and Coreil 1986; Scribner 1994, 1996). Our paper relies upon a longitudinal panel dataset for Americans (1992–2012) from the Health and Retirement Study (HRS) conducted by the Institute for Social Research at the University of Michigan.Footnote 6 The HRS has a variety of health measures. These include a subjective general measure of an individual’s self-reported health and relatively more objective measures of health based on functional limitations or medical diagnoses of chronic illnesses.

Both types of measures of health have some limitations. For example, although self-reported health has been widely used in several studies, it has limitations because the measure tends to have random measurement error.Footnote 7 In the presence of such measurement error, the regression estimates are likely to be biased.

For example, a systematic bias has been found in samples of older individuals (Anderson and Burkhauser 1985; Dwyer and Mitchell 1999; Datta Gupta and Larsen 2010). Older individuals often tend to report poorer health than they actually have because ill health is used as a socially acceptable excuse for withdrawal from the labor force rather than a description of the actual reason; this is referred to as justification bias in the literature. Additionally, there could be reporting differences (heterogeneity) in self-reported health based on a variety of factors, such as socioeconomic status, education and race/ethnicity. Dowd and Todd (2011), in their study of HRS respondents over 50 years of age, suggest that Hispanics are more optimistic in rating their health than Whites. Thus, we must be careful about relying solely on self-reported health to quantify health disparities in the United States. Using self-reported health that contains such problems will lead to inaccurate estimates of the magnitude of health inequalities.

Using relatively objective health measures may resolve the issue of justification bias and remains an alternative to using self-reported health. However, these objective measures are either self-reported or assessed by the interviewer, which implies that they are not superior indicators of an individual’s health (Bound 1991). Another option used by many researchers is to choose relatively more objective self-reports about the presence of a disease condition. However, problems exist with these measures as well due to problems with reporting accuracy, susceptibility to individual rationalization and lack of comparability across individuals. For example, a study by Baker et al. (2004) matches individuals’ self-reports of disease conditions to their actual medical records and finds considerable error in these so-called “objective” self-reports. Moreover, the reporting error may be systematically associated with labor force participation status and hence also a source of “justification bias”.Footnote 8

We use the self-reported measure of individual health in our estimates below rather than the other, supposedly more objective, measures of health. However, in order to mitigate the potential biases associated with self-reported health discussed above, we also define a latent health stock variable. Following Bound (1991), as implemented in Bound and Burkhauser (1999), we estimate a model of self-reported health as a function of relatively more objective measures of health (self-reported measures of functional limitations and medically diagnosed chronic conditions) to create a latent health stock.Footnote 9 We then use the predicted value of the latent health stock as one of the outcome variables in our regressions (see Table 6).

We adopt the approach of Jones et al. (2010) and include only the relatively more objective health indicators and some health related behavior as explanatory variables in the latent health stock regressions. Table 1 contains the physical and mental health indicators included as explanatory variables in our model.

Table 1 Health variables included in constructing the latent health index

We use an ordered probit model to estimate self-reported health, where the ordered measure of self-reported health (1 = excellent, 2 = very good, 3 = good, 4 = fair and 5 = poor) is regressed on our relatively more objective physical and mental health explanatory variables and health-related behavior. The predicted value of the outcome is the latent health stock variable used in the health production regressions in the main body of the paper. A higher value of the latent health stock indicates poorer health. Table 2 presents the marginal effects of the objective health measures for the five different responses (cut points) for self-reported health. All objective measures have a statistically significant impact on an individual’s self-report of health, but each measure weighs differently across the five response categories.

Table 2 Ordered probit regression of self-reported health: marginal effects

In Table 2, negative marginal effects indicate that the variable reduces the probability that respondents report the given health status, while positive marginal effects indicate that the variable increases that probability. For example, in column (1) the following factors decrease the likelihood of a respondent self-reporting “excellent” health: mobility difficulties, functional limitations, the number of diagnosed chronic conditions, depression and a regular habit of smoking. In contrast, higher cognitive scores (ability) and a regular habit of physical exercise make an individual more likely to report excellent health. Column (5) provides the opposite example: objective health problems such as mobility difficulties and functional limitations have positive marginal effects, increasing the likelihood of self-reporting “poor” health, while cognitive ability and physical exercise lower that likelihood. The marginal effects in the remaining columns have similar interpretations.
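The sign pattern described for columns (1) and (5) follows mechanically from the ordered probit structure. The sketch below illustrates this with made-up numbers: the cut points, the linear index, and the coefficient `beta_chronic` are hypothetical and are not estimates from Table 2. It only shows that a regressor with a positive coefficient (poorer health) lowers the probability of the "excellent" category and raises the probability of the "poor" category, and that the marginal effects sum to zero across the five categories.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution for the probit link

def category_probs(index, cuts):
    """P(y = k) for each ordered category, given the linear index x'b and cut points."""
    cdf = [0.0] + [STD_NORMAL.cdf(c - index) for c in cuts] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) + 1)]

def marginal_effects(index, cuts, beta):
    """dP(y = k)/dx for a continuous regressor x with coefficient beta."""
    pdf = [0.0] + [STD_NORMAL.pdf(c - index) for c in cuts] + [0.0]
    return [(pdf[k] - pdf[k + 1]) * beta for k in range(len(cuts) + 1)]

# Hypothetical values for illustration only (not taken from the paper).
cuts = [-1.0, 0.0, 1.0, 2.0]  # four cut points separate the five categories
index = 0.3                   # linear index evaluated at some covariate values
beta_chronic = 0.25           # positive: the variable pushes toward poorer health

labels = ["excellent", "very good", "good", "fair", "poor"]
probs = category_probs(index, cuts)
effects = marginal_effects(index, cuts, beta_chronic)
for label, p, me in zip(labels, probs, effects):
    print(f"{label:>9}: P = {p:.3f}, marginal effect = {me:+.4f}")
```

For a 0/1 regressor, the analogous quantity would be the difference in category probabilities with the indicator switched on and off rather than a derivative.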

Note that the predicted values from the ordered probit estimates presented in Table 2 become our latent health index, which is used as one of the measures of health in our empirical estimates presented below. We discuss the means of both the latent health index and our self-reported health measure along with our empirical model estimating health in the following section.
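As a rough illustration of how such an index could be formed, the sketch below assumes that the "predicted value" is the linear index from the ordered probit, that is, the sum of each health indicator multiplied by its estimated coefficient. That reading, the variable names, and all coefficient values are assumptions made here for illustration; only the signs follow the text (health problems raise the index, cognition and exercise lower it).

```python
def latent_health_index(respondent, coefs):
    """Linear index x'b; higher values correspond to poorer predicted health."""
    return sum(coefs[name] * respondent[name] for name in coefs)

# Hypothetical coefficients: signs follow the text, magnitudes are made up.
coefs = {
    "mobility_difficulties": 0.20,
    "functional_limitations": 0.15,
    "chronic_conditions": 0.25,
    "depression": 0.30,
    "smoker": 0.10,
    "cognitive_score": -0.02,
    "exercise": -0.20,
}

# Two made-up respondents described by the relatively objective indicators.
healthy = {"mobility_difficulties": 0, "functional_limitations": 0,
           "chronic_conditions": 1, "depression": 0, "smoker": 0,
           "cognitive_score": 25, "exercise": 1}
frail = {"mobility_difficulties": 3, "functional_limitations": 2,
         "chronic_conditions": 4, "depression": 1, "smoker": 1,
         "cognitive_score": 15, "exercise": 0}

index_healthy = latent_health_index(healthy, coefs)
index_frail = latent_health_index(frail, coefs)
print(index_healthy, index_frail)
```

Because the index depends only on the relatively objective indicators, two respondents with identical indicators receive the same latent health value even if their raw self-reports differ.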

Data and empirical model

As noted above, the analysis presented in this paper exploits a longitudinal panel dataset for Americans (1992–2012) from the Health and Retirement Study (HRS) conducted by the Institute for Social Research at the University of Michigan. The HRS is an ongoing longitudinal survey, which began in 1992, and is conducted in biennial waves. Prior to 1998, the main HRS cohort included individuals born between 1931 and 1941, and another distinct cohort, the Study of Assets and Health Dynamics among the Oldest Old (AHEAD), included individuals born before 1924. Since 1998, the data for these two cohorts is collected jointly, and the sample frame has been expanded to include cohorts born between 1924 and 1930 and those born between 1942 and 1947. The HRS is administered for the specific purpose of studying life-cycle changes in health and economic resources. This data is especially suited for our research because it has detailed information on various subjective and objective health outcomes for individuals. Moreover, it also includes a wide range of demographic and family related information crucial for our analysis (given in Table 3). The overall sample of 11 waves consists of 31,746 individuals accounting for 163,810 person wave observations.

Table 3 Variable definitions

The main purpose of the paper is to estimate whether the Hispanic health paradox (that Hispanics, especially recent immigrants, tend to have better health than other U.S. residents, controlling for relevant variables) remains present in the elderly population.

Table 3 presents variable definitions from the HRS data set we use, while Table 4 presents variable means for the full sample and four subsamples of Hispanics and Non-Hispanics. As noted above, we use two measures of health in our health production function models: (1) self-reported poor health, which equals 1 if respondents report poor health and 0 otherwise, and (2) latent health. Both of these variables are constructed from the same underlying self-reported health variable, which lies on a 5-point Likert scale (1 = excellent health, 2 = very good health, 3 = good health, 4 = fair health, and 5 = poor health). The Latent Health variable contains predicted values from the ordered probit regressing the underlying 5-point Likert scale variable upon a variety of relatively objective measures of the respondent’s health, presented in Section II above.
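The binary outcome can be sketched as a simple recoding of the 5-point scale. The snippet assumes that "poor health" refers only to the lowest category (5) and not to "fair" as well; the responses listed are made up.

```python
SRH_SCALE = {1: "excellent", 2: "very good", 3: "good", 4: "fair", 5: "poor"}

def poor_health(srh):
    """Return 1 if the self-report is 'poor' (category 5), else 0.

    Assumption: only category 5 counts as poor health.
    """
    if srh not in SRH_SCALE:
        raise ValueError(f"self-reported health must be 1-5, got {srh!r}")
    return 1 if srh == 5 else 0

responses = [1, 3, 5, 4, 2, 5, 3, 3]  # made-up person-wave observations
indicators = [poor_health(r) for r in responses]
share_poor = sum(indicators) / len(indicators)
print(indicators, share_poor)
```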

Table 4 Variable means

Our basic health production function regression is based upon Eq. 1:

$$ {\text{Health }} = \,\upbeta_{0} + \,\upbeta_{1} {\text{Demo }} + \,\upbeta_{2} {\text{Family }} + \,\upbeta_{3} {\text{Job }} + \,\upbeta_{4} {\text{Health}}\,{\text{Care}} + \,\upvarepsilon $$
(1)

where Demo refers to respondents’ demographic variables, Family refers to family structure variables, Job refers to work history variables, and Health Care refers to health care utilization variables (i.e., health care inputs).
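To make the group comparisons discussed later explicit, the ethnicity and immigration indicators inside Demo can be written out separately. The form below is a sketch under assumptions that Eq. 1 does not state: Hispanic and Immigrant are taken to be 0/1 indicators that enter together with their product, native Whites are the omitted group, and the symbols γ1, γ2, γ3, X and δ are introduced here only for illustration, with X collecting all remaining controls.

$$ \text{Health} = \beta_0 + \gamma_1\,\text{Hispanic} + \gamma_2\,\text{Immigrant} + \gamma_3\,(\text{Hispanic} \times \text{Immigrant}) + X\delta + \varepsilon $$

Because both health measures are coded so that higher values mean poorer health, a negative γ3 in this parameterization would indicate that Hispanic immigrants are healthier than their ethnicity and immigrant status alone would predict.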

Table 4 illustrates that Hispanics in the data set tend to have poorer health than the remainder of the respondents in the survey. For example, 12% of Hispanics self-report poor health as opposed to only 7.2% of the non-Hispanic sample. Likewise, with respect to the Latent Health stock variable, where lower numbers represent better health, Hispanics average 0.52 while non-Hispanics average 0.33. Note that more than half of Hispanics in the sample are immigrants whereas only 6% of non-Hispanics are immigrants. It is important to note that even though Hispanics overall have worse health, Hispanic Immigrants report better health than Hispanic Non-Immigrants; this is true for both the Poor Health variable and the Latent Health variable.Footnote 10

Compared to Non-Hispanics, Hispanics in the sample tend to be younger (average age of 64 as opposed to 67), are less likely to be married, live in larger households, and have more children. With respect to job characteristics, Hispanics tend to be in occupations with more physical requirements and less stress, and to have lower job tenure. They also have much lower levels of education (on average having finished only 9th grade) and 44% lower household income than non-Hispanics.

Even more relevant to the current paper, Hispanics are less likely to have health insurance and, as a result, also tend to consume health care inputs at lower rates than Non-Hispanics; this is true for all four measures of health care inputs included in Table 4: dentist visits, doctors’ office visits, prescription drug usage, and hospital visits. Likewise, Hispanics have lower out-of-pocket medical expenditures. Unsurprisingly, Hispanic Immigrants are even less likely to have health insurance than Hispanic Non-Immigrants and tend to utilize health care at even lower rates, with even lower out-of-pocket medical expenditures. In general, these differences between Hispanics and Non-Hispanics in the explanatory variables of our health production function regressions are consistent with the expectation that Hispanics have poorer health. In fact, given that Hispanics tend to have lower levels of the inputs that tend to produce better health, it is not terribly surprising that they also tend to have poorer health.

The real question, though, is whether the Hispanic health paradox is still present in our sample of older survey respondents. That is, do Hispanic immigrants tend to have better health once we control for these health care inputs? The data in Table 4 indicate that Hispanic immigrants do tend to have better health than Hispanic natives. That yields some evidence of the Hispanic health paradox, but to test more completely whether it exists for older Hispanics in the U.S. we must also see whether that advantage remains for Hispanic immigrants after controlling for other relevant health inputs.

Empirical results

Tables 5 and 6 present regression estimates of the health production function presented above in Eq. 1. Table 5 contains probit regression results where the dependent variable is Poor Health and shows the marginal effects, evaluated at the explanatory variable means, for three different models. All three models include occupation dummy variables as explanatory variables, but for simplicity these results are not presented in Table 5. We begin with a simpler regression equation in Model 1, which includes all of the basic elements first presented in Eq. 1: demographic variables, family variables, job characteristics, and health care inputs. Model 2 adds more detail about the individual’s family, mostly focusing on variables relating to the mother and father but including one measure of spousal health. Finally, Model 3 adds both year and Census Division fixed effects as additional controls.

Table 5 Probit regressions of poor health
Table 6 OLS regressions of latent health

In general, the results presented in Table 5 are remarkably consistent across all three models. Recall that the dependent variable in Table 5 is poor health; as a result, a positive marginal effect means that the variable makes health worse, while a negative marginal effect indicates that the variable improves health. For example, both higher age and higher education levels tend to improve health, although at a declining rate for age. Both women and married individuals tend to have better health (consistent with the literature, e.g., Tseng and Olsen 2016). In addition, as the length of the marriage increases, so does the positive impact on health. In contrast, as the number of residents in the household increases, controlling for marital status, health tends to worsen.

Notice that all four of the job-related variables tend to be associated with better health (again, lower probabilities that the individual reports poor health); this is true of the physical nature of the job, the level of stress on the job, job tenure, and household income. Unsurprisingly, lack of health insurance is associated with lower levels of health. More dentist visits are associated with better health, while use of prescription drugs, more hospital visits and more doctor visits are associated with lower levels of health. Poor health of the individual’s spouse is also associated with poor health of the individual. Of the variables associated with the individual’s parents, only parents’ education levels are statistically significant; higher levels of parents’ education positively impact health.

More importantly, notice first that the impact of being Hispanic upon health, after controlling for all other variables, is statistically insignificant in all three models. Being an immigrant alone also has no statistically significant impact on health in any of the three models. However, in all three models Hispanics who are also immigrants report better health. Not only are these results statistically significant, but they are also large in magnitude, varying from 37.9% to 41.2% depending upon the model. That is, at the variable means and controlling for other relevant variables, Hispanic Immigrants are approximately 40% less likely to report poor health. That this result is present and statistically significant in all three models, from the simplest to the most complex, strongly supports the Hispanic health paradox for our sample of older U.S. residents.
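For readers less familiar with probit marginal effects, the sketch below shows how the effect of a 0/1 regressor can be evaluated at the means and expressed either as a change in probability or relative to the baseline probability. The index value and coefficient are hypothetical, chosen only so that the baseline probability of poor health is roughly 8%; they are not the estimates behind Table 5, and the sketch does not establish which of the two scales the reported figures use.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution for the probit link

def dummy_effect(index_dummy_off, beta_dummy):
    """Effect on P(poor health) of switching a 0/1 regressor from 0 to 1.

    index_dummy_off is x'b with the other regressors at their means and the
    indicator set to 0. Returns (change in probability, baseline probability).
    """
    p0 = STD_NORMAL.cdf(index_dummy_off)
    p1 = STD_NORMAL.cdf(index_dummy_off + beta_dummy)
    return p1 - p0, p0

# Hypothetical values for illustration only.
index_dummy_off = -1.4  # implies roughly an 8% baseline probability
beta_hisp_imm = -0.25   # negative: lower probability of reporting poor health

change, baseline = dummy_effect(index_dummy_off, beta_hisp_imm)
relative_change = change / baseline
print(f"baseline probability: {baseline:.3f}")
print(f"change in probability: {change:+.3f}")
print(f"relative change: {relative_change:+.1%}")
```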

Table 6 provides additional estimates of the health production function from Eq. 1, now using latent health as the dependent variable. The latent health variable measures predicted values from an ordered probit procedure regressing the underlying 5 point Likert scale variable (1 = excellent health, 2 = very good health, 3 = good health, 4 = fair health, and 5 = poor health) upon a variety of objective measures of the respondent’s health (see Section II above for more details). Lower numbers here represent better health; that is, the results in Table 6 are interpreted the same as those in Table 5—positive estimated coefficients indicate the variable has a negative impact on health while negative estimated coefficients imply a positive impact on health.

Table 6 also contains three different models representing the same regression equations: a basic regression that includes demographic, job-related, family-related, and health care input variables (Model 1); Model 2, which adds additional family variables including the education and age of the individual’s parents, whether the parents are living, and spousal health; and Model 3, which adds wave (year) and census division fixed effects as additional controls for unobserved heterogeneity.

In general, the results from Table 6 confirm those presented in Table 5. As age and education increase, health improves, although at a decreasing rate. Women tend to have better health than men, as do married individuals. All four job characteristics (physical jobs, stressful jobs, tenure, and income) again tend to improve health. The health inputs all have the same impacts in Table 6 as in Table 5: more dental visits improve health, while more doctor visits, more hospital visits and greater use of prescription drugs are associated with poorer health. These results are consistent with most dental visits being preventative while most other health care usage arises from health shocks.

The dependent variable here is the predicted value from regressing self-reported health on objective measures of health. That is, Latent Health is an attempt to remove the biases in the self-reported health measure used in Table 5 in order to make the measure of health more accurate and objective. We would expect, therefore, to see some impact of the reduction in measurement bias in the health status variable. In fact, we find two main differences between the results in Table 6 and those in Table 5.

First, in Table 6 we tend to see a larger impact of the additional extended family variables on an individual’s health. For example, in Table 5 only parents’ education levels and spousal health had a statistically significant impact on self-reported health; higher levels of parents’ education were found to improve health, while poor spousal health was associated with poor individual health. These results are still present in Table 6, but the other variables associated with parents, previously statistically insignificant, are now mostly significant. For example, an individual’s health is better if his or her parents are living, and it is also better the older the parents are. Thus, the general lack of an impact of parents’ health on their children’s health found in Table 5 appears to be primarily a result of the biases inherent in the self-reported health measure modeled in Table 5’s probit regressions. Using latent health to reduce bias in Table 6, we find that parents’ characteristics have a generally positive and robust impact on their children’s health.

Second, and more relevant to the main purpose of the paper, in both Tables 5 and 6 the comparison racial group is Whites. In Table 5, none of the remaining racial groups (Hispanics, Blacks, or Other, which would include Asians among others) had statistically different health from Whites. Only Hispanic Immigrants had health different from Whites, controlling for other characteristics, and, consistent with the Hispanic health paradox, Hispanic Immigrants’ health was better than that of the comparison White group.

However, in Table 6 after correcting for biases in the self-reported health variable, all of the minority racial groups—Hispanics, Blacks, and Others—have poorer health than Whites and these differences are statistically significant. Similar to the impact of parents on their children’s health, the lack of a racial impact in Table 5 is an apparent result of the biases present in the self-reported health variable Poor Health.

Although the impact of being an Immigrant alone was statistically insignificant in Table 5, in Table 6 immigrants in general have approximately 10% better health than the comparison group, Whites. More importantly, even though Hispanics in general are found to have significantly worse health than Whites in Table 6, Hispanic Immigrants are again found to have better health overall than Whites or any other immigrants, controlling for other relevant variables. This is true in all three models in Table 6, where in general the positive impact on health of being a Hispanic Immigrant is approximately twice as large as the negative impact of simply being Hispanic.Footnote 11 Hence, even after correcting for the biases inherent in the self-reported health measure used in Table 5, we still find strong evidence of the Hispanic health paradox in our sample of older individuals.
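The arithmetic behind this comparison can be sketched as follows. The coefficients are hypothetical and are not the Table 6 estimates; only their signs and the "about twice as large" ratio follow the text. The sketch also assumes that the Hispanic Immigrant term enters as an interaction alongside the separate Hispanic and Immigrant indicators, with native Whites as the reference group.

```python
# Hypothetical latent-health coefficients (higher = poorer health).
# Not the paper's estimates; signs and rough ratio follow the text.
coef = {"hispanic": 0.05, "immigrant": -0.03, "hispanic_immigrant": -0.10}

def gap_vs_native_white(hispanic, immigrant):
    """Predicted latent-health gap relative to native Whites, other controls equal."""
    return (coef["hispanic"] * hispanic
            + coef["immigrant"] * immigrant
            + coef["hispanic_immigrant"] * hispanic * immigrant)

groups = {
    "native White": (0, 0),
    "Hispanic non-immigrant": (1, 0),
    "non-Hispanic immigrant": (0, 1),
    "Hispanic immigrant": (1, 1),
}
gaps = {name: gap_vs_native_white(h, i) for name, (h, i) in groups.items()}
for name, gap in gaps.items():
    print(f"{name:>24}: {gap:+.2f}")
```

With these illustrative values, Hispanic non-immigrants have a positive gap (poorer health than native Whites), while Hispanic immigrants have the most negative gap of any group, which is the pattern the paragraph above describes.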

Conclusions

Even though Hispanics, especially recent immigrants, tend to be of lower socioeconomic class, with lower levels of human capital such as education and job tenure, it has commonly been found that they have better health than would be predicted for such individuals. In fact, even though recent Hispanic immigrants have even lower levels of such variables, again indicating lower socioeconomic status, they have still been found to have better health than expected (e.g., Markides and Coreil 1986; Sorlie et al. 1993; Liao et al. 1998). Earlier researchers postulate that such a finding results from factors inherent to the individual such as more favorable health behaviors, genetic factors, or improved family support (Markides and Coreil 1986; Scribner 1994, 1996). Some have found that as Hispanics acculturate into their new country, these positive health behaviors decline (e.g., Vega and Amaro 1994; Scribner 1994; Clark and Hofsess 1998) and their health begins to decline as well. Alternatively, others hypothesize that the Hispanic health paradox is actually a result of self-selection into immigration, with healthier individuals more likely to immigrate (e.g., Marmot et al. 1984; Stephen et al. 1994) or with older immigrants tending to return to their country of origin before retiring or dying (Reichert and Massey 1979; Gasso and Rosenzweig 1982).

We use a sample of only older individuals from the Health and Retirement Study (HRS), whose average age in our sample is more than 66 years. Both the longitudinal nature of the data set and the fact that we use only older individuals mean that we have a more complete test of the Hispanic health paradox than earlier research. More importantly, rather than measuring health by mortality or morbidity, our research uses two more accurate measures of health: a self-reported health measure and a latent health index constructed from the self-reported health measure (see Section II). Using these better measures of health allows us to test the Hispanic health paradox more accurately.

Controlling for other relevant characteristics and using our more accurate measures of health, we find that Hispanic non-immigrants do have worse health than native Whites, while Hispanic immigrants, who make up more than half of all Hispanics in our sample, have much better health than other racial groups in the sample, including Whites. Our results thus lend important evidence in support of the Hispanic health paradox. That is, our results tend to suggest that Hispanics enter the U.S. with healthier behaviors, genetics, or more family support. Interestingly, our sample shows evidence of the latter, with Hispanics tending to reside in families with both more children and more adults than non-Hispanics, 35% and 25% more, respectively (see Table 4). Of course, we cannot directly test hypotheses about individuals who have returned to their country of origin, as our data only track individuals currently in the U.S.