Introduction

That tobacco causes disease is a long-established fact. In 2012, globally 12% of all deaths among adults aged 30 years and over were attributed to smoking [1]. Tobacco kills around 6 million people each year. More than 5 million of those deaths are the result of direct tobacco use while more than 600,000 are the result of non-smokers being exposed to second-hand smoke [2]. The list of health conditions for which there exists scientific evidence showing a causal effect is likely to continue to grow, with the latest report of the Surgeon General [3] adding diabetes mellitus, rheumatoid arthritis, colorectal cancer as well as general inflammation and impairment of the immune system to the “classical” group of smoking related ailments such as lung cancer, chronic obstructive pulmonary disease (COPD), myocardial infarction, coronary disease, or stroke. Economic evaluations of tobacco control policies typically account for the loss of quality of life associated with suffering these diseases by means of health-related quality of life (HRQoL) indices that permit the calculation of some outcome measure of life years adjusted by quality. Among such measures, the quality adjusted life year (QALY) [4,5,6] assigns a value of one to 1 year of life lived in full health, and zero to death. A relevant research question, with important implications for policy, is whether smoking affects HRQoL over and above its effect on the likelihood of contracting disease. As Vogl et al. [7] have argued, smoking may induce changes in utility in individuals who are otherwise equal to non-smokers in terms of biological, clinical and social characteristics. Such changes need to be duly accounted for in cost-effectiveness, cost-utility and general return on investment metrics for tobacco control policies.
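The QALY weighting described above can be illustrated with a minimal sketch. The function name and the example trajectory are hypothetical and are not taken from any study cited here; the sketch only assumes the standard convention of weighting each period of life by its utility, with 1 for full health and 0 for death.

```python
def qalys(periods):
    """Sum utility-weighted durations; each period is (years, utility weight)."""
    return sum(years * utility for years, utility in periods)

# Hypothetical trajectory: 10 years in full health, then 5 years at a weight of 0.7.
total = qalys([(10, 1.0), (5, 0.7)])
print(total)
```

Under this convention, an intervention's health gain is the difference in total QALYs between the trajectories with and without it.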

The main aim of this paper is to find out whether the smoking status of the general Spanish population is associated with systematic variations in HRQoL as measured by the EQ-5D-5L valuation questionnaire instrument [8, 9], once biological and clinical conditions are controlled for.

The EQ-5D-5L questionnaire is a descriptive system of health-related quality of life assessing five dimensions (mobility, self-care, usual activities, pain/discomfort, anxiety/depression). For each of these dimensions respondents can report five levels of severity (no problems/slight problems/moderate problems/severe problems/extreme problems). The resulting 3125 (5^5) potential states are mapped into a one-dimensional index, known as the EQ-5D-5L “score” or “tariff”, usually ranging between unity (representing the best possible outcome of “no problems” in all five dimensions) and zero (worst possible outcome) [10].
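As a quick check on the size of the descriptive system, five dimensions with five levels each give 5^5 = 3125 distinct profiles. The sketch below enumerates them using the five-digit coding commonly used for EQ-5D profiles (for example, "11111" for no problems in any dimension). The function and variable names are ours, and no value set is applied here.

```python
from itertools import product

# Dimension names as listed in the text.
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = (1, 2, 3, 4, 5)  # 1 = no problems ... 5 = extreme problems

def all_health_states():
    """Return every EQ-5D-5L profile as a five-digit string such as '11111'."""
    return ["".join(str(level) for level in profile)
            for profile in product(LEVELS, repeat=len(DIMENSIONS))]

states = all_health_states()
print(len(states))            # number of distinct profiles
print(states[0], states[-1])  # best and worst profiles
```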

The focus on this HRQoL instrument relates to its widespread use in economic evaluations of tobacco control policies [11,12,13,14]. While there are previous studies using Spanish data [15,16,17,18], these use the Short Form 36 (SF-36) instrument on a small sample or do not control for comorbidities. To our knowledge, there is only one cost-utility analysis relating to smoking and quality of life for the Spanish population [18]. Moreover, while these and other international studies document differences in HRQoL by smoking status, they do not control exhaustively for clinical conditions potentially correlated with tobacco consumption, so it is difficult to attribute an independent effect on HRQoL to smoking. In contrast, our use of the Spanish National Health Survey, consisting of a sample of 20,956 individuals and reporting a wide array of clinical conditions, offers the possibility of testing for the existence of such independent effects with some degree of confidence.

Data and methods

Our data source is the latest National Health Survey release, that of the year 2011–12 [20]. The Spanish National Health Survey (ENS) 2011–2012 is a cross-sectional survey of the non-institutionalized Spanish population containing information on lifestyles, health and socioeconomic characteristics of individuals, with separate adult (16+) and children samples. The analysis in this paper is based on the adult sample, which contains 20,956 individuals, representing a population of 38.6 million. The ENS 2011–12 is the first ENS that contains information on the EQ-5D-5L self-report questionnaire [8, 9].

Table 1 presents a comprehensive set of descriptive statistics, including the sample size and equivalent number of individuals in the population broken down by gender and 5-year age brackets. In addition to Table 1, Fig. 1 presents smoking status broken down by age (in bands of 5 years) and gender, and shows the striking differences in smoking patterns between men and women in Spain. For cohorts up to 50 years of age, the fraction of never smokers drops gradually down to 35% for males and 45% for females. For older cohorts, though, the fraction of women who have never smoked increases sharply and reaches nearly 100% among those above 75. In contrast, less than 45% of men above 50 report never to have smoked. These patterns reflect the gender time lag in the spread of the smoking epidemic in Spain, whereby smoking was rare among women before the 1970s. Nonetheless, the proportion of current smokers is greater among men in all cohorts, even if the difference is small among those below 20 years (25% males and 23% females). The proportion of former smokers is ever greater for older cohorts among males. Indeed, for male cohorts above 50 former smokers outnumber never smokers or current smokers. Among cohorts of women above 50, both current smokers and former smokers are rare, again reflecting the fact that smoking uptake was infrequent in these population subgroups.

Table 1 Descriptive statistics
Fig. 1 Smoking status by age and gender

Figures 2 and 3 present, respectively, the average EQ-5D-5L score broken down by age, smoking status and gender and the proportion of individuals reporting the maximum score (i.e. reporting “no problems” in all five health dimensions) using the same breakdown. As expected, both the average score and the proportion of cases reporting no problems decline with age for both men and women. However, irrespective of smoking status, men report both higher average scores and higher proportions of maximum scores at all ages.

Fig. 2 EQ-5D-5L score by smoking status, by age and gender

Fig. 3 Fraction of population reporting maximum EQ-5D-5L score by smoking status, by age and gender

Concerning the relationship between smoking status and the EQ-5D-5L score conditional on age and gender, and focusing first on males, Figs. 2 and 3 suggest that among male cohorts up to 60 years of age, current smokers tend to report lower scores than either never smokers or former smokers. From 60 years onwards, it is former smokers who appear to report lower scores than the other two groups. As for females, current smokers also tend to report lower scores up to 50 years of age. From that age onwards no clear pattern is discernible from these figures.

Table 1 shows rates of exposure to second-hand smoke, which tend to be higher among the younger cohorts. Additionally, Table 1 presents rates of reports of diagnosed health conditions associated with tobacco consumption broken down by age and gender. These rates show a clear association with age for both men and women. Reports of infarction or other heart disease diagnoses range between 1 and 26%, those of COPD between 0 and 6%, and those of tumours between 0 and 8%. Table 1 also presents rates of reports of pain from any of these causes: migraine, back pain, arthritis or recent injuries, which range between 12 and 54% for males and 15 and 78% for females. Likewise, it includes rates of reports of diagnoses of any disease from the following list: hypertension, varicose veins, allergy, diabetes, stomach ulcer, urinary incontinence, high cholesterol, cataracts, skin problems, constipation, liver cirrhosis, hemorrhoids, osteoporosis, thyroid problems, menopausal problems (for women) and prostate problems (for men). These rates range between 20 and 83% for men and 22 and 92% for women.

Our statistical analysis hinges on the specification of models that aim to explain the variation in the EQ-5D-5L score as a function of biological and clinical characteristics, and lifestyles. These models need to account for the high proportion of responses reporting either the maximum possible value for the EQ-5D-5L score, or very close to it. Similarly, the differences between males and females discussed above call for separate analyses for each gender. Among the various statistical alternatives suggested in the literature, we opt for the two-part model (TPM) [21]. The first part of the model estimates the probability of reporting the maximum score (i.e. no health problems in any of the 5 domains) by means of a probit model. The second part explains the expectation of the score given that some health problems have been reported by means of a generalized linear model (GLM) with a logarithmic link and gamma disturbances. The TPM has been shown to produce good results in terms of predictive power in comparison with other models in the context of the EQ-5D-5L [21]. Also, the TPM is readily interpretable. As mentioned above, the first part serves to predict the probability of reporting no health problems [which below will be referred to as P (no health problems reported)] and the second part serves to predict the expected value for the score conditional on reporting one or more health problems, denoted as E (score| some health problems reported) below.
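In symbols, the two parts described above can be sketched as follows. The notation is ours rather than that of the cited literature: x denotes the vector of explanatory variables, β and γ the coefficient vectors of the first and second parts, and Φ the standard normal cumulative distribution function.

$$P\left( {\text{no health problems reported}} \mid x \right) = \Phi \left( {x'\beta } \right),\qquad E\left( {{\text{score}} \mid {\text{some health problems reported}},x} \right) = \exp \left( {x'\gamma } \right).$$

The first expression is the probit link and the second is the mean implied by a logarithmic link. The gamma assumption concerns the distribution of the disturbances around that mean, and this sketch does not specify how the score is scaled before the second part is fitted.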

With regard to the explanatory variables, we use six different specifications or models. Our baseline specification, Model 0, contains indicators for smoking status distinguishing between current smokers, former smokers and never smokers, a quadratic polynomial in age and controls for marital status, levels of alcohol consumption, physical activity, body mass index and an indicator for exposure to second-hand smoke. Models 1–4 add alternative sets of explanatory variables to the baseline specification. Model 1 includes indicators for medical diagnoses of each of the following conditions: heart infarction, malignant tumour, chronic obstructive pulmonary disease (COPD), stroke, other heart diseases and asthma (five of the classical tobacco related diseases). Model 2 includes indicators for each of the following mental disorders: depression, anxiety or other mental problems. Model 3 includes indicators for each of the following pain conditions: migraine, back pain, arthritis and recent injuries. Model 4 contains indicators for each of the following other medical diagnoses: hypertension, varicose veins, allergy, diabetes, stomach ulcer, urinary incontinence, high cholesterol, cataracts, skin problems, constipation, liver cirrhosis, hemorrhoids, osteoporosis, thyroid problems, menopausal problems (for women), prostate problems (for men). Finally, the full specification, Model 5, adds all the indicators used in Models 1, 2, 3 and 4 to the baseline specification.

The rationale behind these specifications was the necessity to test whether any systematic association of smoking status with HRQoL is robust to the inclusion of different sets of clinical conditions. In the case of specifications 2 and 3, which add, respectively, mental problems and pain, the test is particularly demanding, in the sense that two of the EQ-5D-5L domains are precisely mental problems and pain. Of course, in the context of a cross-section of non-experimental observational data we cannot rule out that such effects, if they exist, are due to correlated unobservables. In order to explore this possibility we carry out a robustness check consisting of expanding specification 5 with controls for social class and degree of perceived social capital.

From the two parts of the TPM it is possible to retrieve the predictions for the unconditional expectation of the score, simply as:

$$E\left( {\text{score}} \right) = P\left( {\text{no health problems reported}} \right) \times {\text{value of maximum score}} + \left( {1 - P\left( {\text{no health problems reported}} \right)} \right) \times E\left( {{\text{score}}\,|\,{\text{some health problems reported}}} \right).$$
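The combination of the two parts can be sketched in a few lines. The function name and the input values are hypothetical and serve only to show the arithmetic; the default maximum score of 1 follows the description of the tariff as usually ranging up to unity.

```python
def unconditional_expected_score(p_no_problems, conditional_score, max_score=1.0):
    """Combine the two TPM parts into the unconditional expectation E(score).

    p_no_problems     -- first-part prediction, P(no health problems reported)
    conditional_score -- second-part prediction, E(score | some health problems reported)
    max_score         -- value of the maximum score (assumed to be 1 here)
    """
    if not 0.0 <= p_no_problems <= 1.0:
        raise ValueError("p_no_problems must be a probability in [0, 1]")
    return p_no_problems * max_score + (1.0 - p_no_problems) * conditional_score

# Hypothetical inputs, not estimates from this paper.
example = unconditional_expected_score(p_no_problems=0.6, conditional_score=0.8)
print(round(example, 4))
```

A profile with a 60% chance of reporting no problems and a conditional expected score of 0.8 therefore has an unconditional expected score between 0.8 and 1, weighted by those probabilities.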

These unconditional expectations, and their conditional (on reporting some health problem) counterparts, i.e. E (score|some health problems reported), may be used to produce estimates for the EQ-5D-5L based HRQoL index of prototypical profiles of individuals by gender, smoking status and age to use in cost-utility analysis of tobacco policies.

Results

Table 2 presents the estimates for the marginal effects of smoking status on HRQoL within the TPM for the models described above, along with the Akaike information criterion (AIC) measure of goodness of fit. The top panel corresponds to part 1 of the TPM, that is, the probability of reporting no health problems in any of the EQ-5D-5L domains, while the bottom panel corresponds to the second part of the TPM, i.e. the model for expected value of the score conditional on reporting some health problem. The omitted category within the smoking status set of dummy variables is “never smoker”.

Table 2 Marginal effect estimates for smoking status in two-part models

For the first part of the TPM, note that the best specification in terms of the AIC statistic is the one containing the full set of explanatory variables (Model 5), both for males and females. Among Models 1–4, which add alternative sets of covariates to the baseline specification in Model 0, the one including pain conditions (Model 3) results in the best improvement in goodness of fit with respect to the baseline specification, followed by the model including mental disorders (Model 2). This is not surprising since mental disorders and pain are two of the dimensions along which the EQ-5D-5L is measured. The inclusion of tobacco related diseases (Model 1) improves the AIC with respect to the baseline specification by smaller margins.
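For readers less familiar with the criterion, the comparison above relies on the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood; among models fitted to the same data, a lower value is preferred. The sketch below uses hypothetical log-likelihoods and parameter counts, not values from Table 2.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits on the same sample (illustrative numbers only).
candidates = {
    "baseline": aic(log_likelihood=-1050.0, n_params=12),
    "full": aic(log_likelihood=-980.0, n_params=40),
}
best = min(candidates, key=candidates.get)  # lowest AIC is preferred
print(best, candidates[best])
```

The penalty term 2k means that a richer specification is preferred only if its gain in log-likelihood outweighs the additional parameters.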

The marginal effect of current smoking on the probability of reporting some health problem among both males and females ranges between 4% and 2%, the latter estimate corresponding to the best performing model (Model 5), which in the case of women verges on statistical insignificance (p value = 0.109). As for the marginal effect of former smoking, it ranges between 5% in the baseline specification and statistical insignificance for both men and women in Model 5.

For the second part of the TPM, and in the case of males, Models 1–4 yield no clear improvements in the AIC with respect to the baseline specification. Although Model 5 yields a better AIC, the marginal effects of the smoking status variables are not significant. For females, the baseline specification yields similar AIC statistics to the rest of the specifications, with a significant but small (about −0.02 EQ-5D-5L score points) marginal effect for being a former smoker. These results are robust to the inclusion of controls for social class and degree of perceived social support.

Table 3 presents estimates for the expected EQ-5D-5L score for a set of representative profiles broken down by age, gender and smoking status. These estimates are defined as the unconditional expectation of the score over the relevant population group, and they have been calculated with the two parts of Model 5. Note that, within age and gender categories, there are no stark differences in the expected EQ-5D-5L score by smoking status.

Table 3 Estimates for the unconditional expectation of the EQ-5D-5L score by age, gender and smoking status, with bootstrapped standard errors

Finally, Table 4 presents estimates for the change in the score associated with suffering a tobacco related disease. They are defined as the difference between the unconditional expectation of the HRQoL score over the population of individuals who do not suffer any of the diseases and the expectation of the HRQoL score conditional on suffering the corresponding disease and reporting health problems, for the same population. Note that for some diseases this change is very substantial. For instance, the drop in the tariff reaches about 0.35 score points in the case of stroke.

Table 4 Decrement in EQ-5D-5L score associated with reporting a diagnosis of a smoking related disease, with bootstrapped standard errors

Discussion

The conjunction of results presented above suggests a series of stylized facts about the relationship between smoking and HRQoL as measured by the EQ-5D-5L. First, even the most comprehensive specifications in terms of clinical, biological and lifestyle conditions detect an independent effect of smoking on HRQoL in comparison to otherwise equal never smokers. This effect operates through a larger probability of reporting some health problem, but not through current smokers reporting a lower score than otherwise equal never smokers who also report health problems along any of the EQ-5D-5L dimensions.

We find that being a former smoker also seems to affect the probability of reporting health problems, but its effect is not statistically significant once the full set of available reported clinical diagnoses is included. This suggests that the former smoker status is a proxy for clinical diagnoses. In the case of women, though, we find that being a former smoker has a small and significant negative effect on the expected EQ-5D-5L score among those who report a health problem. This gender effect is probably a result of the differences in the evolution of the smoking epidemic in Spain, where for male former smokers the average period since quitting is longer than for female former smokers (for instance, the proportion of male former smokers who quit more than 10 years before the date of the survey is 56% while the corresponding figure for females is 42.4%) [20].

Nonetheless, the effects of smoking on HRQoL are very small in magnitude once clinical conditions are comprehensively controlled for. For instance, currently smoking women in the 45–54 age band are expected to have an EQ-5D-5L score of 0.89 compared to a score of 0.92 for women in the same age band who have never smoked or 0.91 for former smokers.

In contrast, the substantive damaging effect of smoking operates through the reduction in HRQoL associated with suffering a smoking related disease. For instance, having a stroke reduces the EQ-5D-5L score by a margin more than 10 times larger than the difference between current and never smokers mentioned above. For those that suffer a heart infarction, other heart diseases, COPD or a tumour the margin is about 5 times larger, and for asthma the difference is about 3 times larger.
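The order of magnitude for stroke can be checked with the figures quoted in the text: a drop of about 0.35 score points for stroke, and expected scores of 0.92 and 0.89 for never-smoking and currently smoking women aged 45–54. The sketch below only reproduces that arithmetic; the variable names are ours and the rounded inputs mean the ratio is approximate.

```python
# Rounded figures quoted in the text (women aged 45-54 for the smoking gap).
score_never_smoker = 0.92
score_current_smoker = 0.89
stroke_decrement = 0.35  # approximate drop in the score associated with stroke

smoking_gap = score_never_smoker - score_current_smoker
ratio = stroke_decrement / smoking_gap

print(f"smoking gap: {smoking_gap:.2f}")
print(f"stroke decrement relative to smoking gap: {ratio:.1f}x")
```

With these rounded inputs the stroke decrement is roughly 11 to 12 times the gap between current and never smokers, consistent with the "more than 10 times" statement.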

Conclusion

We have estimated econometric models for the EQ-5D-5L score in the Spanish population as a function of smoking status plus a wide range of clinical indicators with a view to separating the effect of smoking status from the effect of concomitant diseases potentially triggered by tobacco consumption. The results that we have discussed above are limited by the fact that the observational nature of the data does not afford a study design able to retrieve causal effects. Moreover, they are based on a particular representation of HRQoL, the EQ-5D-5L, and could well be different using other health instruments. In this respect, as the authors of the EQ-5D-5L index have warned, the value set obtained might be subject to revision due to changes in the EuroQol protocol [22]. These shortcomings call for further research with more sophisticated datasets and alternative health instruments.

Notwithstanding these caveats, there are two stark implications from our results for research on the cost-effectiveness, the cost-utility and the return on investment in general of tobacco control policies. First, attributing substantive HRQoL gains to quitting smoking, in addition to accounting for the concomitant HRQoL gain derived from a smaller likelihood of contracting tobacco related diseases, might lead to an overestimation of the benefits of tobacco control policies. Second, the relatively large drops in HRQoL associated with being diagnosed with diseases that might be causally linked to tobacco suggest that they should not be omitted from the economic evaluations of tobacco control policies. For instance, a diagnosis of either arthritis or diabetes, two diseases causally associated with smoking according to the latest report from the Surgeon General [3] but nonetheless typically omitted in economic evaluations of tobacco policy, is associated with a reduction of about 0.15 in HRQoL as measured by the EQ-5D-5L score. This effect is about 5 times larger than the difference between smoking currently and not having smoked ever for women in the 45–54 age band. New economic evaluation research in the area of tobacco should consider the inclusion of such diseases.