Introduction

Smoking continues to be one of the leading causes of preventable mortality and a major contributor to various types of cancer and to chronic pulmonary and cardiovascular diseases. This epidemic is one of the biggest public health threats the world is facing, killing over 8 million people a year; see Jha and Peto (2014). Moreover, approximately 80% of the 1.1 billion smokers live in low- and middle-income countries, where the burden of tobacco-related illness and death is the heaviest. About half the world’s male smokers live in three Asian countries: China, India, and Indonesia. Mishra et al. (2012) reported that, with a 42.4% male prevalence rate of tobacco use in India, around 1 million deaths a year would be attributable to smoking by the mid-2010s. Until very recently, India was the second largest consumer of tobacco in the world, behind only China. With 20% of the world’s population, China produces and consumes about 30% of the world’s cigarettes, and suffers over a million deaths a year from tobacco use; see Jha (2019) and Yang et al. (2019).

Ever since a causal relationship between cigarette smoking and coronary heart disease was reported at Mayo Clinic, the smoking prevalence rate in the US has steadily declined, from 40% in 1965 to 13.7% in 2018, owing to a variety of aggressive anti-smoking measures spearheaded by the courts and by government at different levels. Even so, morbidity and ill health due to cigarette smoking and exposure to secondhand smoke account for over 480,000 deaths per year.Footnote 1 Furthermore, direct health care expenditure and lost productivity due to smoking add up to approximately $290 billion a year.Footnote 2 Not surprisingly, academic researchers and public health officials are paying increasing attention to finding ways to reduce smoking and induce smokers to quit altogether.

Most policy makers consider cigarette taxes an effective way to reduce tobacco consumption, and many studies have estimated the tax elasticity of smoking and corroborated its significance. Additionally, health risks as perceived by smokers have a substantial effect on smoking. Research has shown that people who possess correct health information tend to change their addictive habits, the so-called ‘preventive’ motive for quitting to avoid being sick in the future. In addition, many smokers quit after being diagnosed with smoking-related diseases, the ‘curative’ motive for quitting. To the extent that smoke-free laws in public places deter smoking and influence public perception, these laws can also have a direct effect on smoking through the preventive channel. However, most previous studies have focused primarily on one aspect of the mechanism, without considering the effects of cigarette prices, health shocks and smoke-free laws simultaneously. In this paper, we try to disentangle the separate effects of these three factors on smoking reduction under one framework, using longitudinal data from the US Health and Retirement Study (HRS).Footnote 3 In our preferred estimation, we propose a two-stage hurdle model to estimate the determinants of smoking participation and conditional demand simultaneously. Our hope is that the lessons learnt from the US experience can be applied in other countries.

Section 2 reviews the rich existing literature on the impact of health shocks, cigarette prices and smoke-free laws on smoking and on modeling selectivity in nonrandom samples. Section 3 discusses our empirical strategies, followed by descriptions of data in Sect. 4. Section 5 is devoted to the interpretation of results in the context of models of smoking participation and consumption. The paper concludes in Sect. 6.

The Existing Literature on Elderly Smoking

The 1990 US Surgeon General Report was the first comprehensive statement on the health benefits of smoking cessation; it concluded that smoking cessation improves immediate and long-term health and increases longevity, even for those who already suffer from smoking-related illness. For instance, smoking cessation by persons with diagnosed heart problems markedly reduces the risk of heart attack and cardiovascular death. In many studies, this reduction in risk has been estimated to be more than 50%. As for mortality, Taylor et al. (2002) found that individuals who quit enjoy prolonged lives relative to those who continue to smoke. Although the gain in longevity is largest when smokers quit at younger ages, it remains substantial at older ages too. Ostbye and Taylor (2004) further found that smoking cessation leads to increases not only in years of life but also in years of healthy life (YHL) by reducing smoking-related illness.

Based on the classic health demand model of Grossman (1972) and the rational addiction model of Becker and Murphy (1988), there is now a rich empirical literature on the determinants of initiation and cessation of smoking. Given the obvious public policy implications, most of these studies have focused on the price elasticity of participation and conditional demand for cigarettes among youth. However, studying the smoking behavior of older adults is also important because cigarette taxes interact with health problems that are associated with continued smoking and ageing in general. The deterioration of health in the elderly due to smoking has serious implications for ever-burgeoning public health insurance programs such as Medicaid and Medicare in the US.

In one of the earliest studies on elderly smoking behavior, Lewit and Coate (1982) found little evidence that taxes reduce the smoking participation or consumption of adults; their estimated price elasticities for participation and conditional demand were not significant at conventional significance levels. Other studies, using varied data sources, found no systematic evidence that higher prices reduce smoking prevalence among older adults. For example, Evans and Farrelly (1998) and Farrelly et al. (2001) defined an older adult as an individual at least 40 years old and found little evidence that such individuals change their behavior in response to cigarette prices. Adda and Cornaglia (2013) found no significant tax elasticity for the age group 45 and older using data from the National Health and Nutrition Examination Survey (NHANES) during 1988–2006. In another recent study, Maclean et al. (2016) estimated the tax elasticity in the elderly population using the Health and Retirement Study 1992–2008, and found that participation is not responsive to a tax increase. However, using cross-sectional data from the Behavioral Risk Factor Surveillance System (BRFSS) over 2000–2005, DeCicca and McLeod (2008) estimated the price elasticity of participation to be around −0.3 for the age group 45–59 and −0.2 for the group 45–64, both significantly different from zero.

Unlike the elasticity of participation, the responsiveness of conditional consumption to cigarette prices has been estimated more consistently across age groups. For example, Evans and Farrelly (1998) found a significant price elasticity of conditional demand of −0.498 at the intensive margin. Maclean et al. (2016) also found that smokers respond modestly to a tax increase, with elasticities in the range −0.03 to −0.04. Nesson (2017) obtained elasticities of around −0.15 using NHANES data.

A number of studies have estimated the price effect for adult smoking without focusing on older people specifically. Tauras (2006), using a sample of individuals with an average age of 46 years, found a small price elasticity of participation of −0.126 and a price elasticity of demand of −0.07. The estimates of DeCicca et al. (2008) indicate that higher taxes have no overall effect on young adults’ cessation and participation decisions when anti-smoking sentiment is controlled for. Liu (2010) found a significant participation elasticity of −0.159 for the age group 45–64, but the effect was not statistically significant for the age group 65 and older. Pesko et al. (2016) considered the influence of intra-state price variation on estimates of the price elasticity for the 18 and older age group, and showed that using local price variation could increase the price elasticity of consumption substantially in absolute value, from −0.06 to −0.25.

Health problems, especially smoking-related diseases, are important for older smokers in deciding whether to continue smoking and how much to smoke. Indeed, studies have found that smokers adjust their smoking behavior based on revealed health risk and information. Although Jones (1994) found that a doctor’s advice cannot effectively increase the probability of quitting, Lahiri and Song (2000) and Khwaja et al. (2006) provided evidence that individuals quit smoking in response to major health shocks. Furthermore, Arcidiacono et al. (2007) showed that models of forward-looking behavior explain the pattern of heavy smoking better than a myopic model, which sheds some light on preventive quitting. Sundmacher (2012), using German data, investigated the effect of health shocks on smoking and obesity, and confirmed that a contemporaneous health shock has a significant positive impact on the probability of quitting.

Selectivity can potentially jeopardize elasticity estimates if unobserved factors in the sequential decisions of participation and consumption are correlated. Lahiri and Song (2000) formulated such a sequential self-selection model in a cross-sectional context to study the effect of smoking on cancer incidence. To model selectivity in count data models, Ophem (2000) suggested an estimable model by transforming the underlying processes to the bivariate normal distribution, in which the two regimes are characterized by potentially two different data generating processes. Min and Agresti (2005) proposed a two-part hurdle model with correlated random effects, which comprehensively handles the zero observations and the positive counts, and takes into account the correlation between measurements for the same individual at different time periods. In our context, this two-regime framework allows for relapses after quitting, and the self-selection underlying the sequential decisions.

Empirical Approaches

We employ two approaches to estimate participation and consumption decisions. Section 3.1 uses a two-part model consisting of a Probit model with random effects for the smoking participation decision and a zero-truncated count model with random effects for the conditional demand. Section 3.2 proposes a two-part hurdle model with correlated random effects to analyze these two decisions simultaneously, and addresses the selectivity problem affecting both. This is a prime innovation in our paper. Most studies in the literature use state dummies to control for unobserved effects without recognizing the possibility of unobserved individual effects. Additionally, apart from losing many degrees of freedom in a short panel like ours, the state dummies absorb all inter-state variation in taxes and anti-smoking legislation, making it difficult to capture their independent role in smoking cessation.Footnote 4 However, the assumption that the individual effect is independent of observed regressors is restrictive and unrealistic. We follow an intermediate solution by specifying a control function for the unobserved individual effects in terms of the individual observed means; the random error in this specification is assumed uncorrelated with observed regressors.Footnote 5 Thus, in the models described below, we add the averages of statistically significant individual-level regressors, such as income and exercise habit, to both the participation and demand equations. As explained in Chatterji et al. (2014), with a nested error structure where each individual nests within a specific state, controlling for individual effects will control for unobserved state effects, but not the other way round.
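As a rough illustration of the control-function device described above, the sketch below attaches each respondent's time average of selected regressors to every person-wave record, so that the averages can enter both equations as extra covariates. The function name, the field names and the toy records are hypothetical and are not taken from the HRS files.

```python
from collections import defaultdict


def add_individual_means(rows, id_key, variables):
    """Return copies of rows with '<var>_mean' fields holding each person's time average."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row[id_key]] += 1
        for var in variables:
            sums[row[id_key]][var] += row[var]
    out = []
    for row in rows:
        new_row = dict(row)
        for var in variables:
            new_row[var + "_mean"] = sums[row[id_key]][var] / counts[row[id_key]]
        out.append(new_row)
    return out


# Hypothetical person-wave records (values are made up for illustration).
panel = [
    {"id": 1, "wave": 1, "income": 30.0, "exercise": 1},
    {"id": 1, "wave": 2, "income": 50.0, "exercise": 0},
    {"id": 2, "wave": 1, "income": 20.0, "exercise": 1},
]
augmented = add_individual_means(panel, "id", ["income", "exercise"])
```

The augmented records keep the time-varying values and add their person-level means, which is the form in which such averages would typically be passed to a random-effects estimator.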

Two-Part Model of Probit and Zero-Truncated Negative Binomial Regression

We model smoking participation and conditional consumption decisions separately as outcomes of a smoker’s utility maximization using a random utility model. Heckman (1979) first postulated this two-part model based on the fundamental premise that the baseline and induced risk factors of cigarette smoking are not always the same for all individuals. After perceiving the risk from health shocks, the retail price and the local anti-smoking environment, smokers determine their behavior based on updated subjective beliefs concerning possible health deterioration and monetary cost.

In each period in our observation window, the individual chooses between two alternatives: (1) continue to smoke or (2) quit. Let \( S_{it} \) be the binary variable that indicates the response of individual \( i \) at time \( t \) to the question “Do you smoke cigarettes now?”. If the answer is yes, then \( S_{it} = 1 \); otherwise \( S_{it} = 0 \). With the latent variable \( S_{it}^{*} \), we can construct the following Probit model for the participation decision:

$$ S_{it} = \begin{cases} 1 & \text{if } S_{it}^{*} > 0 \\ 0 & \text{if } S_{it}^{*} \le 0 \end{cases} $$
(1)

where

$$ S_{it}^{*} = x_{it}^{T} \beta_{X} + P_{it} \beta_{p} + H_{it}^{T} \beta_{H} + \mu_{i} + u_{it} $$
(2)

and \( x_{it} \) is a \( K_{1} \times 1 \) vector of individual characteristics; \( P_{it} \) is the retail price of cigarettes per pack for individual \( i \) at time \( t \); \( H_{it} \) is a \( K_{2} \times 1 \) vector of health measures (including health shocks) for individual \( i \) at time \( t \). \( \beta_{X} \), \( \beta_{p} \) and \( \beta_{H} \) are \( K_{1} \times 1 \), \( 1 \times 1 \) and \( K_{2} \times 1 \) parameter vectors to be estimated.

The random effect \( \mu_{i} \) is used to address the correlation across time with the following standard assumption

$$ \mu_{i} \sim IIN\left(0,\sigma_{\mu }^{2} \right) $$

which is independent of \( u_{it} \) and \( (x_{it} ,H_{it} ) \).

We estimate this panel data model by maximizing the random-effects log-likelihood. The conditional probability of smoking, \( S_{it} = 1 \), is

$$ \begin{aligned} \Pr (S_{it} & = 1|x_{it} ,P_{it} ,H_{it} ,\mu_{i} ) \\ & = \Pr (u_{it} > - (x_{it}^{T} \beta_{X} + P_{it} \beta_{p} + H_{it}^{T} \beta_{H} + \mu_{i} )|x_{it} ,P_{it} ,H_{it} ,\mu_{i} ) \\ & = \varPhi (x_{it}^{T} \beta_{X} + P_{it} \beta_{p} + H_{it}^{T} \beta_{H} + \mu_{i} ) \\ \end{aligned} $$

Then the panel-level marginal likelihood \( L_{i} \) is given by

$$ L_{i} (B,\sigma_{12}^{s} ,\sigma_{\mu } ) = \int_{ - \infty }^{\infty } \prod\limits_{t = 1}^{T} \left[ \Pr (S_{it} = 1|x_{it} ,P_{it} ,H_{it} ,\mu_{i} )^{S_{it}} \Pr (S_{it} = 0|x_{it} ,P_{it} ,H_{it} ,\mu_{i} )^{(1 - S_{it} )} \right] f(\mu_{i} )\,d\mu_{i} $$

where \( f( \cdot ) \) is the density function of \( \mu_{i} \) defined above and \( B = (\beta_{X}^{T} ,\beta_{p} ,\beta_{H}^{T} )^{T} \).

The integral is approximated by adaptive Gaussian quadrature with 20 quadrature points. The likelihood function is then written as

$$ L(B,\sigma_{12}^{s} ,\sigma_{\mu } ) = \prod\limits_{i = 1}^{N} {L_{i} (B,\sigma_{12}^{s} ,\sigma_{\mu } )} $$
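The panel-level likelihood above can be sketched numerically as follows. This is only an illustration under simplifying assumptions: it uses a fixed 5-point Gauss-Hermite rule rather than the 20-point adaptive rule used in estimation, it takes the linear index (without the random effect) as given, and the function name and example inputs are hypothetical.

```python
import math
from statistics import NormalDist

# 5-point Gauss-Hermite nodes and weights for the weight function exp(-a^2).
# (Illustrative choice; the estimation in the text uses 20 adaptive points.)
GH_NODES = [-2.020182870456086, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.020182870456086]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]


def panel_likelihood(s, index, sigma_mu):
    """Approximate L_i: the integral over mu_i ~ N(0, sigma_mu^2) of prod_t Pr(S_it | mu_i).

    s      -- list of 0/1 smoking indicators S_it for one respondent
    index  -- list of linear indices x'b_X + P*b_p + H'b_H, one per period
    """
    phi = NormalDist().cdf
    total = 0.0
    for a, w in zip(GH_NODES, GH_WEIGHTS):
        mu = math.sqrt(2.0) * sigma_mu * a  # change of variables mu = sqrt(2)*sigma*a
        prod = 1.0
        for s_it, v in zip(s, index):
            p = phi(v + mu)
            prod *= p if s_it == 1 else 1.0 - p
        total += w * prod
    return total / math.sqrt(math.pi)


# Hypothetical respondent who smokes in the first two periods and then quits.
L_i = panel_likelihood([1, 1, 0], [0.2, 0.1, -0.4], sigma_mu=1.5)
```

The full-sample likelihood is the product of such terms over respondents; in practice one would sum their logarithms and pass the result to a numerical optimizer.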

A follow-up question, “how many cigarettes do you smoke per day?”, is asked of daily smokers, with responses recorded as positive integers. In specifying the conditional demand function, we employ a (truncated) count model because the number of cigarettes is discrete.Footnote 6 The Poisson and negative binomial distributions are commonly used to model count data, but the latter is preferred because it allows for richer heterogeneity and over-dispersion.

Specifically, \( y_{it} \) here is the number of cigarettes smoked per day by individuals who self-report as smokers. The conditional probability function is then

$$ P(y_{it} |V_{it} ,\mu_{i} ,Smoking = Yes) = \frac{{g(y_{it} ;\lambda_{it} (\mu_{i} ))}}{{\left[ {1 - g(0;\lambda_{it} (\mu_{i} ))} \right]}} $$

where \( \mu_{i} \), the usual random effect, is included to allow for correlation across time; \( V_{it} \) is the covariates vector that includes \( \left\{ {x_{it} ,H_{it} ,P_{it} } \right\} \). The conditional mean of dependent variable \( \lambda_{it} \left( {\mu_{i} } \right) \) is parameterized as

$$ \lambda_{it} (\mu_{i} ) = \exp (x_{it}^{T} \zeta_{x} + P_{it} \zeta_{p} + H_{it}^{T} \zeta_{H} + \mu_{i} ) \, $$

with the negative binomial probability density function as

$$ g(y_{it} ;\lambda_{it} (\mu_{i} )) = \frac{{\varGamma \left( {\alpha^{ - 1} + y_{it} } \right)}}{{\varGamma \left( {\alpha^{ - 1} } \right)\varGamma \left( {y_{it} + 1} \right)}}\left( {\frac{{\alpha^{ - 1} }}{{\alpha^{ - 1} + \lambda_{it} (\mu_{i} )}}} \right)^{ \alpha^{ - 1} } \left( {\frac{{\lambda_{it} (\mu_{i} )}}{{\alpha^{ - 1} + \lambda_{it} (\mu_{i} )}}} \right)^{{y_{it} }} \, . $$

The model is estimated by maximum likelihood, with the random effect integrated out separately for each individual. The likelihood function is

$$ L(\zeta_{x} ,\zeta_{p} ,\zeta_{H} ,\sigma_{\mu }^{2} ,\alpha ) = \prod\limits_{i = 1}^{N} \int \prod\limits_{t = 1}^{T} \frac{g(y_{it} ;\lambda_{it} (\mu_{i} ))}{1 - g(0;\lambda_{it} (\mu_{i} ))} f(\mu_{i} )\,d\mu_{i} . $$

As with the Probit part, the integral is approximated by adaptive Gaussian quadrature with 20 quadrature points.
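For concreteness, the zero-truncated negative binomial probability used above can be written in a few lines. This is a minimal sketch that treats the conditional mean and the over-dispersion parameter as given numbers; the function names and the example values are ours and purely illustrative.

```python
import math


def nb_pmf(y, lam, alpha):
    """Negative binomial g(y; lam) with mean lam and over-dispersion alpha."""
    r = 1.0 / alpha
    log_g = (math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + lam))
             + y * math.log(lam / (r + lam)))
    return math.exp(log_g)


def ztnb_pmf(y, lam, alpha):
    """Zero-truncated probability g(y) / (1 - g(0)) for y = 1, 2, ..."""
    if y < 1:
        return 0.0
    return nb_pmf(y, lam, alpha) / (1.0 - nb_pmf(0, lam, alpha))


# Hypothetical smoker with lam = 16 and alpha = 0.5 (values chosen for illustration).
prob_one_pack = ztnb_pmf(20, 16.0, 0.5)
```

Dividing by one minus the probability of zero rescales the distribution so that it sums to one over the positive integers, which is what makes it suitable for respondents who have already reported being smokers.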

Hurdle Model with Correlated Random Effects

Our preferred specification consists of a hurdle model with correlated Probit and negative binomial regressions, cf. Min and Agresti (2005). The demand for cigarettes can now be analyzed jointly with the participation decision by considering the smoker’s previous smoking trajectory and degree of addiction. Nesson (2017) has emphasized the need to control for previous smoking history in determining cessation. However, owing to extreme state dependence, it was problematic to include lagged smoking intensity in our participation equation; see also Maclean et al. (2016). In our model, the propensity to quit utilizes the information on smoking intensity through the correlated (unobserved) random effects. To the best of our knowledge, no study has analyzed smoking participation and cigarette consumption jointly while being cognizant of self-selection in the two switching regimes.

The hurdle model allows for two correlated data generating processes, one for the binary variable and one for the positive count variable, in one framework. A Probit model describes the data generating process (DGP) for the dichotomous variable indicating whether the individual quits, and a zero-truncated negative binomial distribution gives the probability of each possible count for smokers. Selectivity prevails because the observations are distributed across the two regimes by an endogenous self-selection process. Ophem (2000) showed that endogenous selectivity can be modeled in the two regimes of a switching-count model by allowing the error terms from the two components to follow a bivariate normal distribution. This simplifies our framework to a hurdle model with correlated random effects in the two regimes. Thus, the first regime is formulated as a Probit with random effect \( \mu_{1i} \), and a zero-truncated negative binomial distribution with random effect \( \mu_{2i} \) is used to model the data generating process of the positive values.

The (conditional) hurdle model is then formulated as

$$ P(y_{it} |V_{it} ,\mu_{1i} ,\mu_{2i} ) = \begin{cases} 1 - P_{it} (\mu_{1i} ) & \text{if } y_{it} = 0 \\ P_{it} (\mu_{1i} )\dfrac{g(y_{it} ;\lambda_{it} (\mu_{2i} ))}{1 - g(0;\lambda_{it} (\mu_{2i} ))} & \text{if } y_{it} > 0 \end{cases} $$

where

$$ V_{it} = \left\{ {x_{it} ,H_{it} ,P_{it} } \right\} $$
$$ \text{probit}\,P_{it} (\mu_{1i} ) = x_{it}^{T} \beta_{X} + P_{it} \beta_{p} + H_{it}^{T} \beta_{H} + \mu_{1i} $$
(3)
$$ g(y_{it} ;\lambda_{it} (\mu_{2i} )) = \frac{{\varGamma \left( {\alpha^{ - 1} + y_{it} } \right)}}{{\varGamma \left( {\alpha^{ - 1} } \right)\varGamma \left( {y_{it} + 1} \right)}}\left( {\frac{{\alpha^{ - 1} }}{{\alpha^{ - 1} + \lambda_{it} (\mu_{2i} )}}} \right)^{ \alpha^{ - 1} } \left( {\frac{{\lambda_{it} (\mu_{2i} )}}{{\alpha^{ - 1} + \lambda_{it} (\mu_{2i} )}}} \right)^{{y_{it} }} $$
(4)

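\( x_{it} \), \( P_{it} \), and \( H_{it} \) are defined in Sect. 3.1, with corresponding coefficient vectors \( \beta_{X} \), \( \beta_{p} \), and \( \beta_{H} \) in the first regime. Note that \( P_{it} (\mu_{1i} ) \), written with an argument, denotes the participation probability, whereas \( P_{it} \) without an argument is the cigarette price. The term \( \mu_{1i} \) is the random effect in the first regime.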

To parameterize the mean and/or variance in the second regime, we assume

$$ \lambda_{it} (\mu_{2i} ) = \exp (x_{it}^{T} \zeta_{x} + P_{it} \zeta_{p} + H_{it}^{T} \zeta_{H} + \mu_{2i} ) $$
(5)

where \( \mu_{2i} \) is the random effect in the second regime.

The two regimes of the model are tied together by assuming that the random effects are jointly normal and correlated:

$$ (\mu_{1i} ,\mu_{2i} )\sim \text{Normal}\left( \left[ \begin{array}{c} 0 \\ 0 \end{array} \right],\ \left[ \begin{array}{cc} \sigma_{1}^{2} & \sigma_{12} \\ \sigma_{21} & \sigma_{2}^{2} \end{array} \right] \right) $$
(6)

Let B be the vector of all parameters to be estimated: \( (\beta_{X}^{T} ,\beta_{p} ,\beta_{H}^{T} ,\zeta_{x}^{T} ,\zeta_{p} ,\zeta_{H}^{T} ,\sigma_{1}^{2} ,\sigma_{2}^{2} ,\sigma_{12} ,\alpha ) \). Then the marginal likelihood function for this hurdle model can be written as:

$$ L({\rm B}) = \prod\limits_{i = 1}^{N} {L_{i} ({\rm B})} $$

where

$$ L_{i} ({\rm B}) = \int \prod\limits_{t = 1}^{T} \left[ 1 - P_{it} (\mu_{1i} ) \right]^{I_{y_{it} = 0}} \left\{ P_{it} (\mu_{1i} )\frac{g(y_{it} ;\lambda_{it} (\mu_{2i} ))}{1 - g(0;\lambda_{it} (\mu_{2i} ))} \right\}^{I_{y_{it} > 0}} \phi (\mu_{1i} ,\mu_{2i} )\,d(\mu_{1i} ,\mu_{2i} ) $$

and \( \phi (\mu_{1i} , \mu_{2i} ) \) is the joint density function of \( (\mu_{1i} , \mu_{2i} ) \) defined in (6).

The NLMIXED procedure in SAS allows us to estimate the model as a nonlinear mixed model by maximizing an approximation to the likelihood integrated over the random effects. Corresponding code for joint estimation should also be available in Stata. Different integral approximations and optimization techniques are available for this procedure; after some experimentation, we use adaptive Gaussian quadrature for the integration and a quasi-Newton algorithm for the likelihood maximization.
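To make the bivariate integration concrete, one standard way to handle correlated normal random effects is a Cholesky change of variables followed by a product Gauss-Hermite rule. The following is a sketch in our own notation and is not necessarily what NLMIXED does internally: \( h_{i} (\mu_{1i} ,\mu_{2i} ) \) stands for the product over \( t \) inside the integral defining \( L_{i} ({\rm B}) \), and \( (a_{k} ,w_{k} ) \), \( k = 1, \ldots ,K \), are Gauss-Hermite nodes and weights; both symbols are introduced here for illustration only. With \( z_{1i} \) and \( z_{2i} \) independent standard normal variables, set

$$ \mu_{1i} = \sigma_{1} z_{1i} ,\qquad \mu_{2i} = \frac{\sigma_{12} }{\sigma_{1} }z_{1i} + \sqrt {\sigma_{2}^{2} - \frac{\sigma_{12}^{2} }{\sigma_{1}^{2} }} \,z_{2i} , $$

which reproduces the variances \( \sigma_{1}^{2} \), \( \sigma_{2}^{2} \) and the covariance \( \sigma_{12} \) in (6). The individual likelihood contribution can then be approximated by

$$ L_{i} ({\rm B}) \approx \frac{1}{\pi }\sum\limits_{k = 1}^{K} \sum\limits_{l = 1}^{K} w_{k} w_{l} \,h_{i} \left( \sigma_{1} \sqrt 2 a_{k} ,\ \frac{\sigma_{12} }{\sigma_{1} }\sqrt 2 a_{k} + \sqrt {\sigma_{2}^{2} - \frac{\sigma_{12}^{2} }{\sigma_{1}^{2} }} \,\sqrt 2 a_{l} \right) . $$

The adaptive version additionally recenters and rescales the nodes for each individual, which typically improves accuracy for a given \( K \).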

Data and Variables

The Health and Retirement Study (HRS) is a longitudinal household survey for the study of retirement and health among the elderly in the United States. The study interviews approximately 22,000 Americans over age 50 and their spouses every 2 years on domains such as health care, housing, assets, pensions, employment and disability. Furthermore, the RAND Center for the Study of Aging, with funding and support from the National Institute on Aging (NIA) and the Social Security Administration (SSA), created the RAND (L version) HRS data files. The RAND HRS we use is a concise subset of the HRS covering ten waves (1992, 1993/1994, 1995/1996, 1998, 2000, 2002, 2004, 2006, 2008, and 2010). This study uses the main body of the RAND version matched with sensitive administrative information on state and race/ethnicity from different waves of HRS. We use basically the same data as Maclean et al. (2016), but with subtle differences in model specification.

Many of the health measures, such as newly diagnosed heart disease, are obtained by differencing two consecutive responses to the interview question, ‘Have you ever been told by a doctor that you have heart disease?’ Therefore, the effective number of waves in our panel is nine. For this reason, and because of missing data and natural attrition, our final sample contains 98,941 observations on 18,177 respondents. Table 1 describes all variables, including demographics, health, and smoking-relevant variables at both the individual and state levels.
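The differencing step, and the resulting loss of one wave, can be illustrated with a short sketch. The function name and the example respondent are hypothetical, and the sketch ignores missing responses, which the actual data construction must handle.

```python
def new_diagnosis_flags(ever_told):
    """Flag waves in which an 'ever told by a doctor' indicator switches from 0 to 1.

    ever_told -- list of 0/1 answers for one respondent, ordered by wave.
    Returns one flag per wave after the first, so T waves yield T - 1 flags.
    """
    return [int(prev == 0 and curr == 1)
            for prev, curr in zip(ever_told, ever_told[1:])]


# Hypothetical respondent observed in ten waves who first reports heart disease in wave 5.
answers = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
flags = new_diagnosis_flags(answers)
```

Because each flag compares a wave with the one before it, ten waves of answers produce nine usable observations, and only the wave of first report is flagged as a new diagnosis.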

Table 1 Description of variables

Cigarette Price and Smoke-Free Laws

We merge information on cigarette prices from the different editions of the Tax Burden on Tobacco (Orzechowski and Walker 2012), which report taxes and retail prices by state and year, with administrative data on the state identifiers of respondents. We found our results to be very similar whether we use price or tax. Here we choose the price (including federal and state taxes) because this is what smokers face in their decision-making process. In the spirit of Pesko et al. (2016), to maximize the price variation across states and time, we deflate the variable using regional consumer price indexes, even though using the conventional national CPI did not make much difference to our elasticity estimates. Since HRS collects information over the whole year, while the Tax Burden on Tobacco reports the retail price of cigarettes as of November every year, we use the previous year’s retail price in our models. Figure 1 displays the cigarette price per pack between 1991 and 2011 by state, adjusted for regional inflation. Of the two dimensions of variation in the price data, the cross-sectional variation between states accounts for 29.28% of the total variation in our panel; most of the variation in prices thus comes from within-state variation over time; see Fig. 2.
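One common way to split panel variation into between-state and within-state components is a sum-of-squares decomposition around the grand mean. The sketch below shows that calculation on made-up numbers; the text does not spell out the exact formula behind the 29.28% figure, so the decomposition, the function name and the toy prices are all assumptions.

```python
from collections import defaultdict


def between_share(observations):
    """Share of the total sum of squares that lies between groups.

    observations -- iterable of (group, value) pairs, e.g. (state, real price).
    """
    obs = list(observations)
    grand_mean = sum(v for _, v in obs) / len(obs)
    groups = defaultdict(list)
    for g, v in obs:
        groups[g].append(v)
    total_ss = sum((v - grand_mean) ** 2 for _, v in obs)
    between_ss = sum(len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
                     for vals in groups.values())
    return between_ss / total_ss


# Made-up real prices for two states over two years (not the paper's data).
toy = [("A", 2.0), ("A", 4.0), ("B", 3.0), ("B", 5.0)]
share = between_share(toy)
```

One minus this share is the within-group component, which in a state-by-year panel corresponds to variation over time within states.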

Fig. 1 Cigarette price by state from 1991 to 2011. Source: The Tax Burden on Tobacco (2012)

Fig. 2 Spatial variation of cigarette price by state, 2011

Both models also include controls for state-level smoke-free air laws and sentiment. Following the literature and after extensive experimentation, we used venue-specific smoking ban data from the ImpacTeen project (1991–2008) and the Centers for Disease Control and Prevention (2009–2011), and then matched these data to the HRS data set by state and year. To keep the measure consistent over time, we defined a dichotomous variable for a smoking ban, equal to one if there is an effective state-level smoking ban in one of the locations including restaurants, private worksites and bars. The sentiment measure comes from the TUS-CPS (Tobacco Use Supplement to the Current Population Survey)Footnote 7, from which we use the percentage of respondents who think smoking should not be allowed anywhere in bars as a state-level proxy for anti-smoking sentiment. Since smoke-free laws in public spaces may change perceptions of and attitudes towards smoking, the sentiment measure may pick up an additional effect of smoke-free legislation on quitting and on the intensity of smoking; see Carton et al. (2016).

Health Shocks and Other Health Measures

We use two types of health information. One type captures health shocks (HS) that an individual experienced recently (within the last 2 years). This is similar to the health shocks defined in Smith et al. (2001) and Sundmacher (2012). Specifically, if the individual stayed in a hospital overnight, visited doctors an unexpected number of times, or lowered his or her self-rated health substantially, then a health shock (HS) is considered to have occurred. By inspecting the relevant histograms, we defined more than five doctor visits a year, and a lowering of self-rated health by two or more categories, as a health shock. In this way we hope to distinguish sudden unexpected changes in health status from expected trends due to aging and associated morbidities. In addition, if heart disease, cancer, or high blood pressure is newly diagnosed during the last 2 years and the respondent has no previous history of that disease, then a health shock is also identified as having occurred. For example, if the respondent reported in wave 5 a heart disease that occurred after the wave 4 survey, and reported no history of heart disease before wave 4, then this is recorded as a health shock. In summary, the first type of health information (HS) is defined by new and substantive changes in health status.
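The criteria above can be summarized as a simple rule. The sketch below combines them into a single indicator purely for illustration; whether the estimation enters the criteria jointly or as separate regressors is not specified in this passage, and the function name, argument names and the coding of self-rated health (larger numbers meaning worse health) are assumptions.

```python
def health_shock(hospital_overnight, doctor_visits_per_year,
                 srh_prev, srh_curr, new_diagnosis):
    """Return True if any of the health-shock criteria described above is met.

    srh_prev, srh_curr -- self-rated health in the previous and current wave,
                          coded so that larger numbers mean worse health
                          (an assumed coding, e.g. 1 = excellent to 5 = poor).
    new_diagnosis      -- True for a first-time diagnosis of heart disease,
                          cancer, or high blood pressure since the last wave.
    """
    many_visits = doctor_visits_per_year > 5      # more than five visits a year
    srh_drop = (srh_curr - srh_prev) >= 2         # worsened by two or more categories
    return bool(hospital_overnight or many_visits or srh_drop or new_diagnosis)


# Hypothetical respondent whose self-rated health moved from 1 to 3 between waves.
shock = health_shock(False, 2, 1, 3, False)
```

Keeping each criterion as its own named condition makes it straightforward to return them separately if they are to be used as distinct regressors.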

The other type of health information is health conditions (HC) that the individual has ever had, which include previously diagnosed heart disease, cancer, and hypertension. We control for both HS and HC since it is reasonable to assume that people may react differently to newly diagnosed diseases and to pre-existing morbidities. Most studies attempting to estimate price elasticity do not use such extensive controls for health shocks and morbidity.

Summary Statistics

The smoking prevalence in the US has been steadily declining, especially after the release of The Surgeon General Report (1964).

Figure 3 shows this declining smoking rate for Whites, Blacks and Hispanics over 1992–2010. While smoking prevalence among whites dropped from 24% in 1992 to 11% in 2010, that among blacks dropped from 32% to 17% over the same period. Average cigarette consumption per capita per year decreased from approximately 100 packs in 1991 to 50 packs in 2011. If we look at the smoking intensity of smokers in our sample, we find that the distribution of the number of cigarettes smoked per day shifted steadily to the left, which indicates that not only the prevalence but also the intensity fell (see Fig. 4). Concurrently, (deflated) total tax collection, including federal and state taxes, started to increase after 1991, and the increase accelerated after 2008.

Fig. 3 Smoking prevalence in the US by race from 1992 to 2010. Source: The Tax Burden on Tobacco (2012) and authors’ calculation using HRS

Fig. 4 Distribution of number of cigarettes smoked per day for smokers only

Figure 5 shows the age distribution of smokers and the number of years since quitting at the beginning of our sample. Consistent with other studies, we find that former smokers, on average, quit smoking in their 40s. Table 2 compares the demographic characteristics of current smokers in the full sample, whereas Table 3 compares the sample of ever-smoked individuals with never-smokers.

Fig. 5 Distribution of age and years since quitting for quitters. Source: Authors’ calculation using HRS survey in 1994

Table 2 Summary statistics—by observation
Table 3 Summary statistics—by individual

Table 2 shows that current smokers are significantly younger, have better self-assessed health and have fewer years of schooling, which may suggest self-selection. At the same time, current smokers are significantly more likely to be unmarried, working, divorced or widowed. When we compare the sample of ever-smoked individuals with non-smokers, the two groups do not differ significantly in age, but non-smokers have more years of education, are substantially less likely to be male, and are more likely to be Protestant. As for the prevalence of health shocks (HS), we observe fewer health shocks for current smokers. This phenomenon is possibly caused by a (curative) behavioral change in former smokers following a health shock, and could be explained by Table 3, in which former smokers are seen to have a significantly higher prevalence of health shocks, including hospitalizations and newly diagnosed heart disease and cancer, compared with current smokers and non-smokers.

Empirical Results

Main Estimates: Price Elasticity and Effect of Health Risks

We summarize our baseline estimates for the two models in Tables 4, 5, 6 and 7. The left panel reports estimates of the participation equation, where the dependent variable equals 1 if the individual is a daily smoker, and the right panel reports estimates of the cigarette consumption equation. In both panels, we present the price-responsiveness of smoking in terms of the price elasticity of demand. The four columns report, respectively, the Probit model with random effects, the Probit part of the hurdle model with correlated random effects, the zero-truncated negative binomial regression with random effects, and the count part of the hurdle model. With a similar structure, Tables 5 and 6 show the estimated effects of a number of other health conditions and shocks on the two smoking decisions.

Table 4 Estimated effect of cigarette price on smoking
Table 5 Estimated effect of health shocks on smoking
Table 6 Estimated effect of health shocks on smoking
Table 7 Estimates of other parameters

As shown in Table 4, we find evidence that a higher cigarette price reduces daily smoking prevalence in our second specification. In particular, our estimate implies that a $1 increase in cigarette price will reduce the probability of smoking to about 13.0% on average. Relative to a base of 16.6%, this translates to a 21.7% reduction in probability and a participation elasticity of − 0.569, which is large relative to previous studies (reviewed in Sect. 2). Interestingly, as in Maclean et al. (2016), this elasticity is insignificant in the uncorrelated effects model. The correlation between the random effects of the two equations is 0.71 and highly significant. This high correlation suggests that more addicted smokers tend not to quit, and that former heavy smokers are more likely to relapse. Moreover, given the formula for the marginal effect of price (see Table 4, footnote 4) and the value of z evaluated at the sample means (− 0.97), the strong positive correlation implies that more addicted smokers are more likely to quit in response to increases in cigarette prices, supporting the economic model of rational addiction, cf. Maclean et al. (2016). Given the substantial heaping of the number of cigarettes smoked at 20, heavy smokers in our sample are probably those smoking 20 or more cigarettes a day (see Fig. 4). Under appropriate incentives provided by increasing taxes, coupled with looming health shocks for the elderly, these highly addicted smokers exercise enough self-control to quit. Furthermore, in the (conditional) cigarette demand part, we find consistent and significant evidence that a higher cigarette price reduces the amount smoked. Specifically, a $1 increase in cigarette price reduces the average number of cigarettes consumed per day by nearly 1.03 in the hurdle model and 0.85 in the ZTNB model, which are about 6.1% and 5.1% of the average intensive margin, and imply price elasticities of − 0.159 and − 0.132, respectively. Our estimates are very similar to those in Nesson (2017) and Maclean et al. (2016). Since consumers bear the majority of the tax incidence when demand is inelastic, our estimates suggest that most of the tax burden will be borne by elderly smokers when tobacco taxes rise.Footnote 8
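
The arithmetic linking the marginal effects to the elasticities quoted above can be retraced with a short sketch. The inputs are the figures reported in the text. The mean cigarette price is not stated in this section, so the code backs out the value implied by the reported participation elasticity; that implied price is an inference for illustration, not a statistic reported in the paper.

```python
# Figures quoted in the text for the correlated hurdle model.
base_prob = 0.166          # average probability of daily smoking
new_prob = 0.130           # probability after a $1 price increase
elasticity_part = -0.569   # reported participation elasticity

# Relative change in participation for a $1 price increase.
rel_change = (new_prob - base_prob) / base_prob

# Point elasticity = (dP / dprice) * (price / P). With a $1 price change
# this equals rel_change * price, so the mean price implied by the
# reported elasticity is (ASSUMPTION: inferred, not reported):
implied_mean_price = elasticity_part / rel_change

# Conditional demand: a $1 increase lowers consumption by about 6.1% of
# the mean in the hurdle model. Evaluated at the same implied price:
rel_change_demand = -0.061
elasticity_demand = rel_change_demand * implied_mean_price

print(round(rel_change, 3))
print(round(implied_mean_price, 2))
print(round(elasticity_demand, 3))
```

Under these assumptions the implied mean price falls between the 1994 and 2010 prices used in the decomposition section, and the conditional-demand elasticity it produces is close to the hurdle-model figure quoted above, which suggests the reported numbers are mutually consistent.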

Estimates of the effect of health shocks are uniformly significant in the participation equations across models (Tables 5, 6). Health shocks defined as hospitalizations and a high frequency of doctor visits reduce the probability of smoking by 0.08 and 0.06 on average, respectively, according to the hurdle model’s estimates. A downgrade in self-assessed health, which is a proxy for subjective health shocks, also has a significant effect on smoking participation in both models. However, unlike in participation, only hospital stays produce a significant reduction in cigarette consumption: one hospitalization reduces average conditional demand by about 7.7%, or 1.3 cigarettes.

For specifically diagnosed diseases (Table 6), the results follow a pattern similar to that of general health shocks. Newly diagnosed heart disease, cancer and hypertension change the participation probability by − 0.124, − 0.103 and − 0.071, respectively, which implies declines of 74.52%, 62.1% and 43.1% relative to the average probability of smoking. The effects are similar, but slightly smaller, for heart disease and cancers diagnosed at least 2 years before, suggesting an effect that decays over time. Previously diagnosed hypertension also shows a significant impact on participation. If we compare the immediate effects of these three types of health shocks, heart disease and cancer show a stronger effect on quitting, whereas high blood pressure shows a smaller but more persistent impact. These findings provide sound evidence of a curative motive for quitting among the elderly, and underscore the need to include these health shock variables when specifying equations for cigarette consumption at both the intensive and extensive margins. The difference between the effects of newly diagnosed and previously diagnosed diseases reflects the additional incentive to quit based on new health information or, in other words, the decaying effect of an old illness.

The effect of health shocks on conditional consumption is very different. We did not find much evidence of reduced consumption after health shocks. Only recently diagnosed heart disease and hypertension have significant coefficients in the conditional demand for cigarettes, implying that once a smoker decides to continue smoking even after a serious health shock, including cancer, the number of cigarettes smoked does not tend to decrease. Interestingly, hospital stays induce less smoking, which is possibly caused by smoking bans in hospitals and the resultant incapacitation from smoking. Qualitatively, this result is remarkably similar to that of the count regression in Wang and Heitjan (2008).

Consistent with recent studies on the general population of adults, we find little evidence that state-level smoke-free laws and anti-smoking sentiment reduce smoking prevalence for the elderly (Bitler et al. 2010; Jones et al. 2015; Carton et al. 2016; Maclean et al. 2016). However, we find some evidence that state-level anti-smoking sentiment is associated with lower average smoking intensity (Table 7). As evidenced in Nesson (2017), allowing for endogeneity in anti-smoking sentiment in our model would possibly wipe out this effect too. Note that in their carefully designed study, Carton et al. (2016) did not find any effect of state-level smoke-free policies on the smoking behavior of the elderly.

From the perspective of model selection, the estimates of conditional demand from the two models are similar in both significance and size of coefficients. The main difference between the uncorrelated two-part model and the correlated hurdle model lies in the participation equation, as the probability of continuing to smoke depends on the extent of addiction and smoking intensity, as emphasized by Nesson (2017). As a result, our correlated hurdle model yields a large and statistically significant price elasticity of participation (− 0.569, as reported above).

Decomposition of Various Factors on Cigarette Consumption

DeCicca and Kenkel (2015) have cautioned against the possibility of over-estimation of the price elasticity of total consumption in the literature. Based on a meta-analysis of 86 studies, they reported a mean price elasticity of − 0.48. According to them, this is inconsistent with the fact that during 1995–2010 the actual average number of cigarettes smoked decreased by around 25%, while cigarette prices increased by more than 100%. This underscores the importance of correctly specifying the model that generates the price elasticities.

With this in mind, we follow a novel approach to decompose the separate contributions of our three broad groups of factors, viz., cigarette prices, health problems, and state-level smoking bans (including time dummies) in explaining the observed drop in smoking participation and cigarette consumption over our sample period.

To make our results comparable to other studies on smoking behavior, we predicted the counterfactual decline in smoking prevalence, conditional consumption and (unconditional) consumption due to each of the three groups of factors separately, based on our estimated elasticities. The average number of cigarettes per capita (or total consumption) is the product of the probability of smoking and the conditional consumption of cigarettes per day. Let \( {\text{Pr }}\left( \cdot \right) \), \( \lambda \left( \cdot \right) \) and \( {\text{h }}\left( \cdot \right) \) denote the probability of smoking, conditional consumption and total consumption, respectively. Then the estimated change in these three outcome variables caused by variable j alone between time t−1 and t can be calculated as

$$ \Delta \widehat{\Pr }\left( \cdot \right)_{jt} = \Pr \left( {G_{t - 1} + \Delta x_{jt} \hat{\beta }_{j} } \right) - \Pr \left( {G_{t - 1} } \right) $$
$$ \Delta \hat{\lambda }\left( \cdot \right)_{jt} = Exp\left( {D_{t - 1} + \Delta x_{jt} \hat{\beta }_{j} } \right) - Exp\left( {D_{t - 1} } \right) $$
$$ \Delta \hat{h}\left( \cdot \right)_{jt} = \Pr \left( {G_{t - 1} + \Delta x_{jt} \hat{\beta }_{j} } \right)Exp\left( {D_{t - 1} + \Delta x_{jt} \hat{\beta }_{j} } \right) - \Pr \left( {G_{t - 1} } \right)Exp\left( {D_{t - 1} } \right) $$
$$ {\text{for}}\;{\text{j}} = 1, \ldots ,{\text{J}}\;{\text{and}}\;{\text{t}} = 1, \ldots ,{\text{T}} $$

where \( G_{t - 1} = \varPhi^{ - 1} \left( {P_{t - 1} } \right) \), \( {\text{D}}_{t - 1} = { \ln }\left( {\lambda_{t - 1} } \right) \), \( P_{t - 1} \) is the sample mean of the probability of smoking at time t−1, and \( \lambda_{t - 1} \) is the sample mean of conditional consumption at time t−1. Hence, the relative shares of predicted decline in the probability of smoking, conditional consumption and total consumption due to each of the exogenous variables j are calculated as

$$ {\text{s}}_{j}^{p} = \mathop \sum \limits_{t = 1}^{T} \Delta \widehat{Pr}\left( \cdot \right)_{jt} /\mathop \sum \limits_{j = 1}^{J} \mathop \sum \limits_{t = 1}^{T} \Delta \widehat{Pr}\left( \cdot \right)_{jt} $$
$$ {\text{s}}_{j}^{\lambda } = \mathop \sum \limits_{t = 1}^{T} \Delta \hat{\lambda }\left( \cdot \right)_{jt} /\mathop \sum \limits_{j = 1}^{J} \mathop \sum \limits_{t = 1}^{T} \Delta \hat{\lambda }\left( \cdot \right)_{jt} $$
$$ {\text{s}}_{j}^{h} = \mathop \sum \limits_{t = 1}^{T} \Delta \hat{h}\left( \cdot \right)_{jt} /\mathop \sum \limits_{j = 1}^{J} \mathop \sum \limits_{t = 1}^{T} \Delta \hat{h}\left( \cdot \right)_{jt} $$
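
The decomposition above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the paper. It assumes that Pr(·) is the standard normal CDF (as implied by the definition of G), that each variable j has one coefficient in the participation index and a separate one in the log conditional-demand index (the formulas write a single coefficient symbol for both), and all numerical inputs in the example are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: Phi and its inverse


def decompose_step(p_prev, lam_prev, dx, beta_part, beta_cons):
    """Changes in prevalence, conditional and total consumption caused
    by variable j alone between t-1 and t.

    p_prev    : sample mean probability of smoking at t-1
    lam_prev  : sample mean conditional consumption at t-1
    dx        : change in variable j between t-1 and t
    beta_part : coefficient of j in the participation index (assumed)
    beta_cons : coefficient of j in the log-demand index (assumed)
    """
    g_prev = _STD_NORMAL.inv_cdf(p_prev)   # G_{t-1} = Phi^{-1}(P_{t-1})
    d_prev = log(lam_prev)                 # D_{t-1} = ln(lambda_{t-1})
    p_new = _STD_NORMAL.cdf(g_prev + dx * beta_part)
    lam_new = exp(d_prev + dx * beta_cons)
    return (p_new - p_prev,
            lam_new - lam_prev,
            p_new * lam_new - p_prev * lam_prev)


def shares(changes_by_factor):
    """Relative share of each factor in the total predicted change.

    changes_by_factor maps a factor name to its list of per-period
    changes (t = 1, ..., T).
    """
    totals = {j: sum(v) for j, v in changes_by_factor.items()}
    grand_total = sum(totals.values())
    return {j: total / grand_total for j, total in totals.items()}


# Hypothetical inputs, for illustration only.
d_prob, d_lam, d_total = decompose_step(
    p_prev=0.236, lam_prev=19.77, dx=0.25,
    beta_part=-0.10, beta_cons=-0.06)
example_shares = shares({"price": [-0.8, -0.6], "health": [-0.5, -0.1]})
```

Anchoring each step at the previous period's sample means, as the formulas do, keeps the counterfactual for each factor separate, so the shares are a normalization of the summed single-factor changes rather than an exact additive split of the observed decline.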

During 1994–2010, the overall smoking prevalence dropped from 23.60 to 12.03% in our sample, i.e., by 11.57 percentage points. The average cigarette price per pack increased from 1.71 to 3.62 (in 1992 dollars) during the same period, which implies a reduction of 6.8 percentage points in prevalence, using the methodology described above. In other words, about 59% of the observed reduction in the prevalence rate can be attributed to increased prices. The state-level smoking bans and anti-smoking sentiment variables show hardly any effect on individual-level participation behavior. The time dummies, which can possibly capture some of the unobserved effects of anti-smoking legislation, were statistically insignificant in this regression. On the other hand, the coefficients for the health variables are substantial both statistically and economically. Thus, we attribute the remaining share, nearly 40% of the predicted decline in smoking prevalence, to the occurrence of health shocks and health morbidities in our sample population.

Conditional on smoking, the number of cigarettes smoked per day decreased from 19.77 to 13.85 over 1994–2010. As the price of cigarettes per pack increased from 1.71 to 3.62 (in 1992 dollars), the price elasticity of demand of − 0.159 predicts a reduction in consumption of 1.99 cigarettes per day, using the methodology described above. This is about 34% of the total decrease in smoking intensity. Health problems, unlike in the participation decision, were found to have barely any effect on conditional demand. On the other hand, we found that a decline of 3.68 cigarettes in conditional consumption, i.e., 62% of the total decline, is due to anti-smoking sentiment and aggregate trends. We attribute at least part of the unobserved changes in the anti-smoking environment in society, not captured by our two anti-smoking variables, to the time dummies included in the model.

In our sample, unconditional consumption decreased from 4.67 to 1.67 cigarettes per capita during 1994–2010. Cumulatively, increased cigarette prices explained 47% of the total drop, the higher prevalence of health shocks and morbidities explained 49%, and only 4% of the drop is explained by the variables representing more restrictive smoking bans and anti-smoking sentiment in our model (see Fig. 6).
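
As a quick consistency check on the figures quoted in this section, the sketch below recomputes unconditional consumption as the product of prevalence and conditional consumption, and the shares of the observed declines attributed to each factor. It uses only numbers stated in the text; the variable names are our own.

```python
# Sample means quoted in the text for 1994 and 2010.
prevalence = {1994: 0.2360, 2010: 0.1203}
cigs_if_smoking = {1994: 19.77, 2010: 13.85}

# Unconditional consumption = prevalence x conditional consumption.
unconditional = {yr: prevalence[yr] * cigs_if_smoking[yr]
                 for yr in prevalence}

# Observed declines over 1994-2010.
drop_prev_pp = (prevalence[1994] - prevalence[2010]) * 100  # pct. points
drop_cigs = cigs_if_smoking[1994] - cigs_if_smoking[2010]   # cigarettes/day

# Predicted declines quoted in the text, as shares of observed declines.
price_share_prev = 6.8 / drop_prev_pp    # price effect on prevalence
price_share_cigs = 1.99 / drop_cigs      # price effect on intensity
trend_share_cigs = 3.68 / drop_cigs      # sentiment and aggregate trends

for yr, value in unconditional.items():
    print(yr, round(value, 2))
print(round(price_share_prev, 2),
      round(price_share_cigs, 2),
      round(trend_share_cigs, 2))
```

The products reproduce the 4.67 and 1.67 cigarettes per capita reported above, and the ratios reproduce the 59%, 34% and 62% shares.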

Fig. 6

Source: The Tax Burden on Tobacco (2012) and authors’ calculation

Decomposition results for unconditional demand for cigarettes.

To summarize the decomposition results on the two sequential smoking decisions, we find that higher prices helped to reduce both participation and the number of cigarettes smoked quite considerably in this elderly population. Health problems, including both new health shocks and older health problems, strongly induce smokers to quit, but do not reduce smoking intensity. More restrictive smoke-free air laws and anti-smoking sentiment, including aggregate trends, seem to affect conditional consumption very significantly, but not smoking participation. The sum of the individual effects reconciles well with the observed dynamics in smoking prevalence and the amount of cigarette consumption.

Other Covariates

We have included a large number of controls in our regressions. We find that individuals with spouses are less likely to smoke, and smoke fewer cigarettes if they do. However, we also find that a smoking spouse increases not only the probability of smoking but also the intensity of cigarette consumption by the partner. Table 7 shows that, compared with females, males have a higher probability of smoking and consume more cigarettes conditional on smoking. Using Whites as the reference, African-Americans do not have a significantly different probability of smoking and, if they smoke, they smoke less. Hispanics show a lower propensity to smoke and a lower demand for cigarettes conditional on smoking. Consistent with the finding of Khwaja et al. (2006), income has no statistically significant effect either on the probability of smoking or on the level of consumption, after controlling for individual effects.

With the hurdle model, our estimates also suggest that both the probability and the intensity of smoking are non-linear functions of age. Given the age range in our sample, the probability of participation decreases and smokers tend to smoke less as people age. One interesting point to note is that the predicted probability of participation by age is slightly concave with a peak at age 36, which confirms the findings of previous studies that smokers usually quit around their 40s. The predicted conditional demand as a function of age is also concave, but with a peak at age 50.Footnote 9 In our hurdle model specification, years of schooling and its quadratic term are included. Similar to the effect of aging, higher education reduces the probability of smoking and the conditional consumption.
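
As a brief worked illustration of how such a peak is located, suppose (an assumption on our part, since the functional form is not restated here) that age enters each index as a quadratic, \( \beta_{1} \, age + \beta_{2} \, age^{2} \) with \( \beta_{2} < 0 \). Holding the other covariates fixed, the index, and hence the predicted outcome, is maximized where its derivative with respect to age is zero:

$$ \beta_{1} + 2\beta_{2} \, age^{*} = 0 \quad \Rightarrow \quad age^{*} = - \frac{{\beta_{1} }}{{2\beta_{2} }} $$

When this turning point lies at or below the youngest ages in the sample, the predicted outcome declines with age over the whole observed range, which matches the pattern described above.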

Conclusions

This study analyzes the effect of cigarette prices/taxes, health shocks and other factors including smoking bans on the smoking participation and consumption behavior among older adults. Specifically, participation and cigarettes demand are estimated both separately and jointly by using two-part and hurdle models with correlated random effects.

We found evidence that higher cigarette prices reduce both smoking prevalence and intensity even for older adults. The effects are significant and substantial. Specifically, our estimated participation elasticity is large relative to the general consensus for this population, which has ranged from − 21 to − 56%. Although smaller than the elasticity associated with smoking participation, our results present a consistently significant and precisely estimated price elasticity of demand at the intensive margin, ranging from − 0.13 to − 0.16. After controlling for health shocks and smoking bans, we find that higher cigarette prices explained 59% of the drop in smoking participation and 34% of the drop in smoking intensity over our sample period.

When we compare the effect of prices to that of health shocks, we find that the latter has a much stronger effect on cessation than on demand conditional on smoking. Individual health shocks have marginal effects on the probability of quitting of anywhere from 20 to 53%, whereas heart diseases, cancers and hypertension reduce the probability of participation by 41 to 75%, depending on which disease is diagnosed. In contrast, health shocks show little impact on smoking intensity for smokers who choose to continue smoking even after being diagnosed with smoking-related health problems; only hospitalizations seem to reduce their smoking intensity.

Unconditional consumption, which is the product of the probability of continuing to smoke and the number of cigarettes smoked per day conditional on smoking, decreased from 4.67 to 1.67 per capita during 1994–2010. Our decomposition analysis shows that, cumulatively over our sample period, increased cigarette prices and health shocks explain 47% and 49% of this decline, respectively, and only 4% is attributable to more restrictive smoking bans and anti-smoking sentiment at the state level. The empirical characterization of the smoking behavior of the elderly at both the intensive and extensive margins, and their strong response to higher prices established in this paper, underscore the continued role of cigarette excise taxes in curbing smoking, particularly in low- and middle-income countries.