There is much interest in understanding the role of subjective beliefs in explaining individual behavior. Such research is considered to be an important stepping stone towards using data on subjective beliefs in empirical analysis of economic decisions rather than relying on assumptions about individual expectations, e.g., Manski (2004). Yet there is a lot that is not understood about the nature and formation of subjective beliefs.

In the absence of data on subjective beliefs, economists have made a variety of assumptions about individual expectations in empirical research. Rust and Phelan (1997), in estimating their model, assume “rational expectations,” i.e., individuals’ subjective probability measures coincide on average with objectively estimable population probability measures. In particular, subjective beliefs of each individual i at time t are replaced by an objective probability measure that is estimated from population-level data as a function of observable individual characteristics and behaviors. Subjective beliefs have been analyzed in a number of contexts, such as earnings expectations and job search (Lancaster and Chesher 1983); Social Security expectations and retirement savings (Dominitz et al. 2002); consumption expectations and consumption in retirement (Hurd and Rohwedder 2003); mortality expectations, consumption, and bequests (Gan et al. 2004); job loss expectations and consumption (Stephens 2004); and retirement and mortality expectations, savings, and retirement (Van der Klaauw and Wolpin 2005). Snowberg and Wolfers (2010) studied the tendency of gamblers to over-bet on long shots and under-bet on favorites in horse races.

Some scholars have maintained that rather than having rational expectations, people have biased subjective beliefs and that such biased beliefs, specifically underestimates of the adverse consequences of their choices, lead people to engage in behaviors that are harmful to themselves and others (Weinstein 1980; Hansen et al. 1991; Dejoy 1992; Weinstein and Klein 1995).Footnote 1 This phenomenon is referred to as “optimism bias.”Footnote 2 A case in point may be heavy and binge drinking, which can lead to chronic diseases, fights, other forms of disruptive behavior (e.g., Gmel et al. 2012), and driving while intoxicated (DWI). Heavy and binge drinking substantially raises the probability of DWI (Sloan et al. 1995).

This study tests whether persons who engage in heavy or binge drinking are overly optimistic about the probabilities of adverse consequences from these activities. We use data from a survey conducted for this study in eight geographically dispersed U.S. cities to evaluate the relationship between subjective beliefs and drinking behaviors. We gauge accuracy of beliefs in two ways. The first is to compare subjective beliefs with objective probabilities and other values obtained from various secondary sources. The second is to determine whether subjective beliefs elicited at one interview are systematically related to realizations of the same outcomes reported by survey respondents a year later. Although the second approach is conceptually preferable because it matches subjective and objective probabilities for the same individuals and thus reflects peculiarities of our sample, objective probabilities for several study outcomes we would have liked to analyze are too low to permit a within sample before (subjective belief) versus after (realized outcome) comparison.Footnote 3 We assess accuracy of beliefs about a wide range of possible outcomes of heavy and binge drinking (e.g., reduced longevity or onset of liver disease). We also assess accuracy of beliefs about the link between alcohol use and passing the legal threshold for intoxication, the probability of getting into an accident conditional on alcohol consumption, the share of drivers on the road who have had too much to drink, and the legal consequences of DWI. These legal consequences range from the probability of being stopped to fines and jail terms conditional on a conviction. For the within sample comparisons, we analyze subjective probabilities that four outcomes would occur during the following year: drinking and driving at all; being arrested for DWI; being cited for driving 15 miles per hour or more above the speed limit; and being in a DWI related motor vehicle accident.

Our empirical analysis leads to these conclusions. First, the comparisons of subjective beliefs with objective data reveal that persons are more often pessimistic than optimistic about the adverse outcomes we studied. Binge and heavy binge drinkers, if anything, tend to be more pessimistic about the adverse outcomes from excessive drinking than “other drinkers,” persons who consume alcohol but not in excess according to our study’s criteria for “heavy” or “binge” drinking. Second, three of the four within sample comparisons show that more individuals overestimate the probability of outcomes a year later than underestimate them. But third, we find evidence that binge and heavy binge drinkers think that their driving ability and ability to tolerate alcohol are better than average. This Lake Wobegon effect often underlies claims of optimism bias.Footnote 4 Yet such overconfidence does not translate systematically to more optimistic subjective beliefs of binge and heavy drinkers. We conclude that the evidence overall implies that optimism bias does not explain why some adults consume large amounts of alcohol.

Section 1 describes our data. In Section 2, we compare accuracy of subjective beliefs from our survey with corresponding objective values obtained from secondary sources. In Section 3, we compare accuracy of subjective beliefs with data on realized outcomes reported by the same survey respondents a year later. Section 4 presents results on the Lake Wobegon effect and on focal responses to questions eliciting subjective beliefs. Section 5 describes our results in the context of previous research, reconciles our findings on accuracy of subjective beliefs with those on the Lake Wobegon effect, and discusses implications of our findings.

1 Data

Data on subjective beliefs come from a survey conducted for purposes of this research while data on corresponding objective probabilities come from various secondary sources described below in sections in which we present specifications and results for specific outcomes.

1.1 Survey of alcohol and driving (SAD)

Battelle Memorial Institute conducted a three-wave survey of drinkers and drivers in eight cities in four states during 2009–2012: Raleigh and Hickory, North Carolina (NC); Philadelphia and Wilkes-Barre, Pennsylvania (PA); Seattle and Yakima, Washington (WA); and Milwaukee and La Crosse, Wisconsin (WI). The cities were selected to yield a broad geographic spread and to include both large and small cities. Fatalities from DWI-related accidents are highest in WA and lowest in NC (Mothers Against Drunk Driving 2012). Since the study focuses on DWI, the SAD excluded persons who reported during the screener interview no alcohol consumption or no driving in the past month. Respondents had to be at least age 18. The participant recruitment process was designed to oversample persons who consumed large amounts of alcohol and were prone to DWI to allow us to study the decision-making processes and behaviors of such individuals in detail.

This survey, the Survey of Alcohol and Driving (SAD), collected detailed information on drinking, drinking and driving behaviors, risk perceptions, addiction, use of substances other than alcohol, knowledge of statutes and judicial practices with regard to DWI, personal attributes and attitudes, demographic characteristics, and income. When possible, questionnaire design was guided by questions asked in prior surveys, albeit not all in the same instrument. The first of the three waves was conducted using Computer Assisted Telephone Interviews (CATI). The other two waves were conducted by Computer Assisted Self-Administered Interviews (CASI). Wave 2 (CASI-1) contained questions on subjective beliefs. Wave 3 (CASI-2) asked about realizations of the beliefs elicited a year earlier in CASI-1. This study is based on data from all three waves. The CATI, CASI-1, and CASI-2 included 1,634, 1,359, and 1,187 individuals, respectively; the declines across waves reflect sample attrition.
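The extent of attrition implied by these wave sizes can be checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and it assumes that respondents in each later wave are a subset of the CATI sample.

```python
# Wave sample sizes as reported in the text.
wave_sizes = {"CATI": 1634, "CASI-1": 1359, "CASI-2": 1187}


def retention_rates(sizes, baseline="CATI"):
    """Share of the baseline wave still present in each wave.

    Assumes (not stated in the text) that later-wave respondents
    are a subset of the baseline sample.
    """
    base = sizes[baseline]
    return {wave: n / base for wave, n in sizes.items()}


rates = retention_rates(wave_sizes)
for wave, rate in rates.items():
    print(f"{wave}: {rate:.1%} of CATI respondents")
```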

1.2 Survey of attorneys

During 2009–2010, we surveyed 62 attorneys in the eight cities in which the SAD was conducted. The questionnaire elicited information on a variety of topics, including the share of drunk drivers on the road, arrest resolution process, and characteristics of the attorneys’ clientele. We use these data for the analysis of the accuracy of the probability of being jailed conditional on being convicted of DWI.

1.3 State data on DWI arrests

We obtain information on individual arrests from each of the four study states for 2009. These data contain detailed information on the arrest resolution process and outcomes. We aggregate these data to the level of individual SAD cities from which we derive objective probabilities.

2 Comparisons of subjective beliefs and objective probabilities from other sources

2.1 Accuracy of beliefs about longevity

2.1.1 Overview

To assess accuracy of beliefs about longevity, we compare subjective survival probabilities elicited by the SAD with objective probabilities of survival based on our empirical analysis of survival using data from the Health and Retirement Study (HRS) for 1996–2008. The HRS is a national longitudinal survey of persons who were 51–61 in 1992 and their spouses/partners, who could be of any age. HRS participants were interviewed every other year.Footnote 5 Using the parameter estimates from the analysis of survival with HRS data and data from the SAD, we compute objective probabilities of survival to each age. We focus on the probability of survival to age 60 for respondents who were under age 36 when beliefs were elicited by the SAD and the probability of survival to age 75 for respondents who were 36 or older. Past research has indicated that subjective survival expectations elicited by the HRS match objective life table values (Smith et al. 2001; Hurd and McGarry 2002; Hudomiet and Willis 2012).

2.1.2 Objective probabilities of survival

We estimate conditional hazard rates for mortality using HRS. The mean age of HRS participants was 58.8 years in 1996 (SD = 5.4), with ages ranging from 27 to 86.Footnote 6 During 1996–2008, 16.2% died (of the 9,497 persons in the sample alive in 1996).

We estimate the relationship between mortality and observables likely to affect it using a hazard function for persons of all baseline ages, which allows for unobserved heterogeneity and assumes a Weibull distribution. The hazard function at year t for individual i in the HRS sample with observable explanatory variables \( X_i \) is given by:

$$ \lambda \left(t;{X}_i,\theta, \mu \left|{\eta}_i\right.\right)={\lambda}_0(t) \exp \left({X}_i^{\prime}\theta \right){\eta}_i, $$
(1)

where \( \lambda_0(t) \) is the baseline hazard for the Weibull distribution with shape parameter \( \mu \) and \( \exp \left(X_i^{\prime}\theta \right) \) is the proportional hazard term with parameters \( \theta \). The time-invariant, individual-specific multiplicative term \( \eta_i \) captures unobserved heterogeneity and is distributed gamma with mean 1 and variance \( \sigma \) (e.g., Lancaster 1979). Time to failure is measured in years from the interview date to the date of death. All survivors at the 2008 HRS interview are treated as censored. Parameters are estimated by maximum likelihood.

The objective probability of living to a given age is

$$ {\widehat{O}}_i\left(t;{X}_i,\widehat{\theta},\widehat{\mu}\left|\widehat{\eta}\right.\right)= \exp \left\{-{\widehat{\eta}}_i\left[ \exp \left({X}_i^{\prime}\widehat{\theta}\right)\right]{t}^{\widehat{\mu}}\right\} $$
(2a)

and,

$$ {\widehat{O}}_i\left(t;{X}_i,\widehat{\theta},\widehat{\mu}\right)={E}_{\eta}\left[{\widehat{O}}_i\left(t;{X}_i,\widehat{\theta},\widehat{\mu}\left|\widehat{\eta}\right.\right)\right]={\left[1+\widehat{\sigma}\left( \exp \left({X}_i^{\prime}\widehat{\theta}\right){t}^{\widehat{\mu}}\right)\right]}^{-1/\widehat{\sigma}} $$
(2b)

where \( \widehat{\theta} \), \( \widehat{\mu} \), and \( \widehat{\sigma} \) are estimated from Eq. (1), and t is the number of years to age 60, to age 75, or to the respondent's life expectancy.
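The two survival expressions can be sketched numerically as follows. This is an illustrative sketch rather than the estimation code used in the study: the function names and parameter values are hypothetical, and it assumes the Weibull baseline is parameterized so that the cumulative baseline hazard is \( t^{\mu} \), as Eq. (2a) implies. With frailty distributed gamma with mean 1 and variance \( \sigma \), the expectation of \( \exp(-\eta \Lambda) \) is \( (1+\sigma \Lambda)^{-1/\sigma} \), which is a probability between 0 and 1.

```python
import math


def survival_given_frailty(t, xb, mu, eta):
    """Eq. (2a): probability of surviving t more years, conditional on frailty eta.

    xb is the linear index X_i' theta; mu is the Weibull shape parameter.
    """
    cum_hazard = math.exp(xb) * t ** mu
    return math.exp(-eta * cum_hazard)


def survival_marginal(t, xb, mu, sigma):
    """Eq. (2b): survival with gamma frailty (mean 1, variance sigma) integrated out."""
    cum_hazard = math.exp(xb) * t ** mu
    if sigma == 0:
        # No unobserved heterogeneity: reduces to the plain Weibull survivor function.
        return math.exp(-cum_hazard)
    return (1.0 + sigma * cum_hazard) ** (-1.0 / sigma)


# Made-up parameter values for illustration only (not estimates from Table 1).
xb, mu, sigma = -6.0, 1.5, 0.5
p = survival_marginal(24, xb, mu, sigma)  # e.g., a 36-year-old surviving to age 60
print(f"Illustrative survival probability: {p:.3f}")
```

As the frailty variance shrinks toward zero, the marginal probability converges to the conditional probability evaluated at \( \eta = 1 \); for positive variance it is at least as large, by Jensen's inequality.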

We obtain baseline data from the 1996 HRS and follow persons through the 2008 HRS. The \( X_i \) include explanatory variables for drinker type, alcohol addiction, health, demographic characteristics, and cognitive status. We exclude non-drinkers from the analysis sample since they were excluded from the SAD.

Drinker type consists of four groups. Heavy drinkers consumed 14+ drinks per week for men under age 65, or 7+ drinks per week for women and for men over age 65. Binge drinkers consumed 5+ drinks for men and 4+ drinks for women on an occasion during the 3 months before the 1996 interview, but were not heavy drinkers. Heavy binge drinkers satisfy the criteria for both binge and heavy drinkers. The omitted reference group is other drinkers.
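The classification rule can be written compactly. The sketch below is illustrative: the function and argument names are hypothetical, and because the text does not say how men aged exactly 65 are treated, the code assumes they fall under the lower (7+) threshold.

```python
def drinker_type(drinks_per_week, max_drinks_per_occasion, female, age):
    """Assign one of four drinker types using the thresholds described in the text.

    max_drinks_per_occasion refers to the 3 months before the interview.
    Assumption: men aged exactly 65 use the 7+ weekly threshold.
    """
    heavy_cutoff = 7 if (female or age >= 65) else 14
    binge_cutoff = 4 if female else 5
    heavy = drinks_per_week >= heavy_cutoff
    binge = max_drinks_per_occasion >= binge_cutoff
    if heavy and binge:
        return "heavy binge"
    if heavy:
        return "heavy"
    if binge:
        return "binge"
    return "other"


print(drinker_type(drinks_per_week=15, max_drinks_per_occasion=6, female=False, age=40))
```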

We measure level of addiction to alcohol using the CAGE, a widely used screening test for alcohol dependence consisting of four questions: Have you felt you should Cut down on your drinking? Have people Annoyed you by criticizing your drinking? Have you ever felt Guilty about your drinking? and Have you ever had to drink first thing in the morning to steady your nerves or get rid of a hangover (Eye-opener)? We define binary variables for counts of one, two, and three or four affirmative responses, with a binary variable for a count of zero omitted.
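The coding of the CAGE responses into binary variables can be sketched as follows. The function and variable names are hypothetical and chosen only for illustration.

```python
def cage_dummies(cut_down, annoyed, guilty, eye_opener):
    """Map the four yes/no CAGE answers to the binary variables described in the text.

    A count of zero affirmative answers is the omitted category (all dummies 0).
    """
    count = sum(bool(answer) for answer in (cut_down, annoyed, guilty, eye_opener))
    return {
        "cage_1": int(count == 1),
        "cage_2": int(count == 2),
        "cage_3_or_4": int(count >= 3),
    }


print(cage_dummies(cut_down=True, annoyed=False, guilty=True, eye_opener=False))
```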

Demographic variables are female gender, age, Black race, Hispanic ethnicity, other race, currently married, and educational attainment (in years). Self-rated health is represented by a set of binary variables for very good, good, fair, and poor health, with excellent health omitted.

We include several measures for cognition, all based on questions from the HRS (see Appendix Table 10). The first measure (“recall”) combines an exercise in counting backwards to assess attention and processing speed, an object naming test to assess language, and recall of the date and of the names of the Vice President of the United States and the Governor of the state in which the respondent resided. The answers to the questions are summed to form a score with a range of 0 to 8. Second, to measure working memory, we include a serial 7 subtraction test based on a sequence of 5 questions, starting with 100 minus 7, with the next question based on the respondent’s answer to the first question minus 7, and so on. The maximum (best) score on this variable is 5. Third, we include a measure of the respondent’s numeracy. The numeracy questions assessed whether or not the respondent was able to make percentage calculations. The explanatory variable for numeracy is a count of the number of correct answers to the three questions about numeracy. The fourth cognition measure is a self-report of memory. Respondents were asked, “How would you rate your memory at the present time?” Response categories were excellent, very good, good, fair, poor. We combine responses to form a single binary variable for very good or good memory and a single binary variable for fair or poor memory, with excellent memory omitted.

The validity and reliability of the HRS cognition measures have been established (Ofstedal et al. 2005); several papers have assessed strengths and weaknesses of the HRS cognition measures (Plassman et al. 1994; Lachman and Spiro 2002; Crimmins et al. 2011).

Research has demonstrated that quantitative responses of individuals lower in numeracy are more sensitive to how questions are framed (Peters et al. 2006; Dickert et al. 2011); using more sophisticated questions than the SAD used, Frederick (2005) found that numeracy was systematically related to differences in individual preferences, e.g., risk tolerance. Numeracy has been linked to better performance on tasks requiring relatively intensive number processing or numeric assessment (Del Missier et al. 2010, 2012) and lower sensitivity to question framing effects (Peters et al. 2006, 2011). Individuals rely on memory in making choices when such information is readily retrievable from memory (Bettman et al. 1998). Poor memory may relate to less accurate assessments about the advantages and disadvantages of particular choices.

2.1.3 Hazard results for obtaining objective probabilities

Drinker type has no effect on survival (Table 1, hazard results). Nor does level of addiction to alcohol, cognitive status, race/ethnicity, or educational attainment (with health included). Decreases in health increase the probability of death monotonically from excellent (omitted) to poor health. Persons who were older at baseline have poorer survival prospects.

Table 1 Objective probability of mortality

2.1.4 Subjective versus objective probabilities of survival

To compare the subjective beliefs with corresponding objective probabilities of survival obtained from the above analysis of HRS data, we compute the difference between the subjective and objective probabilities of survival, the former from the SAD and the latter predicted from our analysis of the HRS.Footnote 7

Respondents who were aged under 36 were asked, “On a scale from 0 to 10, with 0 being ‘not at all likely’ and 10 being ‘very likely,’ what is the chance you will be alive at age 60?” We convert the 0–10 scale for the questions about subjective beliefs about living to age 60 to probabilities. The same question was asked of persons aged 36+ except age 60 was changed to age 75. The mean subjective probability of living to 60 of persons under 36 (29.0% of the sample) is 0.87 and for persons 36+, the mean subjective probability of living to 75 is 0.84.

The dependent variable is the difference between the subjective and objective probabilities of living to a particular age. Positive values imply optimism, and negative values imply pessimism. We regress this difference on the explanatory variables using ordinary least squares (OLS).
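The construction of the dependent variable can be illustrated with a short sketch. It assumes that the 0–10 response is converted to a probability by dividing by 10, which the text implies but does not state explicitly; function names and the example values are hypothetical.

```python
def scale_to_probability(response, scale_max=10):
    """Convert a 0-to-scale_max survey response to a probability.

    Assumption: a linear conversion (response / scale_max).
    """
    if not 0 <= response <= scale_max:
        raise ValueError("response outside the survey scale")
    return response / scale_max


def survival_belief_gap(response, objective_probability):
    """Subjective minus objective survival probability (positive = optimistic)."""
    return scale_to_probability(response) - objective_probability


def classify_gap(gap):
    """Label a survival-belief gap using the sign convention in the text."""
    if gap > 0:
        return "optimistic"
    if gap < 0:
        return "pessimistic"
    return "accurate"


# Made-up example: a respondent answers 9 while the predicted probability is 0.80.
gap = survival_belief_gap(9, 0.80)
print(classify_gap(gap))
```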

Young “other drinkers” were pessimistic about living to age 60, but older other drinkers were optimistic about living to age 75 (Table 2, Panel A, intercept, cols. 1 and 2). Young binge drinkers tended to be more optimistic about longevity than other young drinkers, but only slightly optimistic in absolute terms. There are no statistical differences from other drinkers by drinker type in the older group.

Table 2 Accuracy of responses to probability of survival

Results for drinker type are similar in the full specification (Panel B). Persons who tended to be optimistic were older, highly addicted to alcohol, and in poorer health. Persons with poorer recall and memory tended to be more pessimistic about their longevity.

Thus, our results on longevity are mixed. Persons under 36 tended to underestimate the probability of living to 60 but those 36+ overestimated the probability of living to 75. These results lend no support to the notion that heavy and binge drinkers are overly optimistic about their longevity prospects relative to other drinkers.

2.2 Accuracy of beliefs about harms of drinking

In this sub-section, we report findings from analysis of subjective beliefs versus objective data on several negative outcomes of heavy and binge drinking.

2.2.1 Liver diseases

Liver diseases, notably alcoholic hepatitis (AH) and cirrhosis, are important consequences of heavy drinking. For the subjective probability of getting liver disease, the SAD asked, “On a scale of 0 to 10, where 0 is 0% probability and 10 is 100% probability, what is the chance that long term heavy drinking will lead to liver disease?” On average, respondents estimated the probability to be 0.80. Based on the literature, we assume that the objective probability of acquiring AH or cirrhosis conditional on heavy drinking ranges from 0.1 to 0.5 (Naveau et al. 1997; McCullough et al. 2011; Mann et al. 2003).Footnote 8 If a respondent gave a probability in this range, we consider the response to be accurate. For persons giving responses outside this range, we compute the difference between the subjective response and the objective value.

Including covariates for drinker type categories only, other drinkers overestimated the probability of getting alcohol-related liver disease by 0.33 (Table 3, Panel A, col. 1), i.e., were pessimistic on average about this adverse outcome. These coefficients are robust to inclusion of additional explanatory variables. Heavy binge drinkers also overestimated this probability, but they were somewhat more accurate in their assessments. Accounting for the other covariates, heavy binge drinkers are still more accurate than other drinkers are (Panel B, col. 1).Footnote 9

Table 3 Accuracy of responses to liver disease and intoxication questions

2.2.2 Consumption needed for intoxication

The SAD asked, “Try to estimate the number of one and one-quarter ounce shots of liquor that you would have to drink to bring you over the legal limit.” On average, respondents estimated that it would take 3.0 shots (SD = 1.6) to exceed the legal limit for blood alcohol content (BAC). Estimates of the objective number of shots needed to reach a BAC of 0.08, the level at which a driver becomes subject to a DWI arrest, are calculated by gender and body weight.Footnote 10 For men, the mean number of shots needed to reach a BAC of 0.08 at the sample mean of weight is four shots; for women at mean weight, the number of shots is three. Thus, people seem to be fairly accurate in judging how many drinks are needed to reach an illegal BAC for driving.

Among drinker types, other drinkers, who are presumably less familiar with the intoxicating effects of drinking from personal experience, underestimated the number of shots needed to become legally intoxicated by almost a drink, i.e., were pessimistic (−0.87, SE = 0.079, Table 3, Panel A, col. 2). Heavy binge and binge drinkers were more accurate than other drinkers, implying learning by doing (Table 3, Panel A), but they were also pessimistic about this outcome on average. Heavy drinkers do not differ statistically from other drinkers, a result that is robust to changes in specification (Panel B).

Table 4 Accuracy of responses to fraction of drunk drivers on road

2.3 Accuracy of beliefs about drunk drivers on the road

Individuals may avoid being involved in an alcohol related accident by not driving at times during which many drivers are intoxicated. For the subjective probability, SAD asked, “On average on a weekend evening, what percent of drivers on the road have had too much to drink?” We interpret “too much to drink” as a minimum BAC of 0.08. The mean subjective percent is 23.2 (SD = 17.9), which is above objective estimates from other sources.

There is a lack of consensus in the literature about the percent of drunk drivers on the road on an average weekend evening. Lacey et al. (2009) conducted a survey of drivers in 300 locations in 48 states during four two-hour periods. The authors calculated the percent drunk drivers at 2.2 for 2007. Estimates from other sources are considerably higher. Levitt and Porter (2001), using a novel approach, inferred the percent of drunk drivers between 8 pm and 5 am from national Fatal Accident Reporting System (FARS) data. Their estimates ranged from 13.6 to 29.6%. Even if the incidence of drunk driving has declined over time (see e.g., Lacey et al. 2009), these estimates, nevertheless, imply substantial amounts of drunk driving.

We apply the Levitt and Porter (LP) method to the eight cities included in the SAD, using FARS data for 2009; LP computed national rather than city-specific estimates. We limit analysis to fatalities resulting from motor vehicle accidents occurring between 8 pm and 5 am. Although the SAD question referred to weekends, to have a sufficient number of observations for the smaller cities, we include data on 8 pm to 5 am accidents on weekdays as well as weekends. We rely on the police officer’s evaluation of whether or not a driver had been drinking, as reported by the FARS. Estimated parameters that maximize the log likelihood are presented in the Appendix (Table 11). The ratio of drinking to sober drivers on the road varies plausibly across cities, from 0.14 in Philadelphia and 0.15 in Seattle to 0.40 in Milwaukee and 0.57 in La Crosse. La Crosse is a college town in a state with comparatively high alcohol consumption levels.

All drinker type groups are accurate on average in assessing the objective shares of drunk drivers on the road (Table 4, Panel A). Results for drinker type from the full specification are similar to those from the limited specification. The positive intercept in the full specification implies that middle-aged, college educated, non-Hispanic white men with excellent cognition overestimate shares of drunk drivers on the road.Footnote 11 Better recall, serial subtraction scores, and numeracy are associated with more accurate estimates of actual drunk driver shares (Panel B).

2.4 Accuracy of beliefs about legal consequences of driving under the influence

State legislatures have established penalties for drinking and driving with the intent of deterring such behavior. The SAD asked a series of questions about (1) the probability of conviction for DWI, given that the person has been pulled over and has had too much to drink; (2) conditional on being convicted of DWI, the probability that the person would receive a fine; (3) the amount of the fine, given that the person has been fined; (4) conditional on a DWI conviction, the probability of receiving some jail time; and (5) conditional on some jail time, the amount of jail time the person could expect to receive.

The SAD did not ask about the probability of arrest if a person drove after having had too much to drink. For this reason, we do not compare subjective and objective probabilities of arrest conditional on having had too much to drink. Unless otherwise indicated, the objective probabilities come from state arrest data.

2.4.1 Conviction

Using state arrest data for the objective benchmark, other drinkers are very accurate in estimating the probability of conviction conditional on arrest. The coefficient is 0.017 with a standard error of 0.017 (Table 5, Panel A, intercept, col. 1). Binge and heavy binge drinkers overestimated the probability of conviction conditional on being arrested, i.e., were too pessimistic about this outcome although they frequently admitted to drinking and driving.

Table 5 Accuracy of responses to legal consequences of DWI: conviction and fines

In the full specification, results for binge and heavy binge drinkers appear fairly robust to the addition of several covariates (Panel B, cols. 1, 2). We add covariates for the respondent’s self-report of having been arrested for DWI in the past 3 years and binary variables for the city in which the respondent resided (the latter results not shown). The parameter estimate on the prior arrest covariate is 0.14. The positive coefficient does not necessarily imply that persons reporting a previous DWI arrest were relatively pessimistic about the probability of conviction if arrested since the actual conviction rates are citywide, and we do not account for prior arrest records of offenders.

2.4.2 Fines

We also obtain data on the probability of being fined from state arrest records. Other drinkers overestimate the probability of being fined by 0.23 (Table 5, Panel A, col. 2); other drinkers do not differ from heavy, binge, and heavy binge drinkers in their estimates of the probability of being fined.

In the full specification (Panel B), parameter estimates on the binary variables for binge and heavy binge drinkers are positive and statistically significant, but similar in magnitude to the parameter estimates in the limited specification and not very large. The coefficient on prior DWI arrest is not statistically significant.

We obtain information on fine amounts for DWI from state statutes.Footnote 12 We consider an answer correct if it is within the minimum to maximum range of statutory guidelines in the respondent’s state. In such cases, the dependent variable is set to 0. If the answer is incorrect, we measure the difference between the respondent estimate and the relevant lower or upper bound of the minimum to maximum range.
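This range-based accuracy measure can be sketched as a small function. The sketch is illustrative: the function name and the example statutory range are hypothetical, and the sign convention (respondent estimate minus the nearest bound, so that overestimates are positive) is an assumption chosen to match how overestimates are reported in the results.

```python
def range_based_error(estimate, lower, upper):
    """Error of an estimate relative to a [lower, upper] range of correct values.

    Returns 0 when the estimate lies inside the range; otherwise returns the
    signed distance to the nearest bound (positive = overestimate).
    """
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    if estimate < lower:
        return estimate - lower
    if estimate > upper:
        return estimate - upper
    return 0.0


# Made-up statutory range of $250 to $1,000, for illustration only.
for guess in (100, 500, 1200):
    print(guess, range_based_error(guess, 250, 1000))
```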

Considering estimated fines within the guidelines as correct, other drinkers overestimated the fine by $223 on average (Panel A, intercept, col. 3). We find no statistical differences between heavy, binge, and heavy binge drinkers and other drinkers in accuracy of magnitudes of fines. Adding covariates does not materially affect the parameter estimates for drinker type (Panel B).

2.4.3 Jail

We obtain estimates of objective probabilities of jail conditional on conviction from two alternative sources: state arrest data and our survey of attorneys. A deficiency of the arrest data is that they record sentences rather than time actually served. Some persons are sentenced to jail, but the sentence is immediately converted to probation or community service. The data from attorneys, like subjective beliefs, are more likely to reflect jail time actually served.

Overall, using the state arrest data for the objective probabilities, other drinkers substantially underestimated the probability of jail conditional on a conviction for DWI (Table 6, Panel A, intercept, col. 1), i.e., were too optimistic. The coefficient is −0.33. Binge and heavy binge drinkers also underestimated the probability of jail, but were more accurate on average than other drinkers were. However, the differences between binge and heavy binge drinkers and other drinkers disappear when we add additional covariates (Panel B). Using objective data from attorneys rather than from arrest data (col. 2), other drinkers were quite accurate in their assessments of probabilities of jail conditional on a conviction, and there are no statistical differences between the other drinker types and other drinkers.

Table 6 Accuracy of responses to legal consequences of DWI: jail

As for fines, data on jail sentences come from state statutes. As with fine amounts, other drinkers were pessimistic about jail term lengths (Panel A, col. 3), conditional on being convicted, and there are no statistical differences between the other drinker types and other drinkers. Respondents may have been more accurate than they appear to the extent that jail terms in the statutes are not enforced.

2.5 Overall optimism

Using responses to the above items, we create an optimism index, a count of all responses for which the respondent was optimistic. Since respondents sometimes did not answer all 10 items included in the index, we include a covariate for the number of items for which we have data from the respondent. Since the dependent variable is a count variable, we use ordered logit analysis. The mean fraction of optimistic responses is 0.22.
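The index construction can be sketched as follows. This is an illustrative sketch with hypothetical names: it assumes each item has already been coded as a signed gap oriented so that a positive value means the respondent was optimistic (for example, underestimating the probability of an adverse outcome), with `None` marking an unanswered item.

```python
def optimism_index(oriented_gaps):
    """Return (number of optimistic responses, number of items answered).

    oriented_gaps: one entry per index item; positive = optimistic,
    zero or negative = not optimistic, None = item not answered.
    """
    answered = [gap for gap in oriented_gaps if gap is not None]
    n_optimistic = sum(1 for gap in answered if gap > 0)
    return n_optimistic, len(answered)


# Made-up responses for a respondent who answered 8 of the 10 items.
example = [0.10, -0.20, None, 0.0, 0.30, -0.05, None, -0.40, -0.10, 0.0]
count, n_answered = optimism_index(example)
print(count, n_answered)
```

The second element of the returned pair corresponds to the covariate for the number of items answered.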

In the limited specification (Table 7, Panel A), binge and especially heavy binge drinkers are overall less often optimistic than their other drinker counterparts. Statistical significance for binge and heavy binge drinker is lost in the full specification, but the odds ratios, albeit slightly higher than in the limited specification, are not very different. Alcohol-addicted individuals are less likely to be optimistic than persons without this addiction.

Table 7 Optimism index (ordered logit)

3 Within sample comparisons of subjective beliefs and objective probabilities

The second wave of the SAD (CASI-1) contained several questions about the probabilities of outcomes expected to occur during the following year, asked on a scale of 0–100. The third wave (CASI-2) was fielded a year later and asked about realizations of the same outcomes. Unfortunately for our analysis, only nine CASI-2 respondents reported having been arrested for DWI in the past year, all of them heavy binge or binge drinkers (Fig. 1). Yet subjective probabilities of being arrested for DWI in the next year rise monotonically from other drinkers to heavy, binge, and heavy binge drinkers. All drinker types were too pessimistic about the probability of a DWI arrest in the following year.

Fig. 1 Subjective beliefs from CASI-1 versus outcomes realization from CASI-2

Among the other subjective beliefs elicited in the second wave are probabilities of any drinking and driving, a citation for driving over 15 miles per hour above the speed limit, and a motor vehicle accident. A minority of respondents who experienced an accident were charged with a driving violation (13.6%).

Respondents were too pessimistic about the probabilities of being cited for speeding and of having an accident, and too optimistic about not drinking and driving during the following year. Subjective beliefs for being cited and having an accident do not exhibit the same patterns by drinker types as DWI arrests. However, the pattern for drinking and driving by drinker type is similar to that for DWI arrests. The three outcomes about which respondents were too pessimistic reflect randomly occurring events beyond respondents’ control. By contrast, drinking and driving is fully under the individual’s own control, and respondents were too optimistic about this outcome.
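The comparison behind these statements can be illustrated with a short sketch. It assumes that pessimism about an adverse event means the mean subjective probability exceeds the realized frequency; the function name `calibration_gap` and the data are hypothetical and are not drawn from the SAD.

```python
from typing import Sequence


def calibration_gap(beliefs_pct: Sequence[float], outcomes: Sequence[bool]) -> float:
    """Mean subjective probability (0-100 scale) minus realized frequency.

    For an adverse event, a positive gap indicates pessimism
    (the event was expected more often than it occurred) and a
    negative gap indicates optimism.
    """
    if len(beliefs_pct) != len(outcomes) or not beliefs_pct:
        raise ValueError("need one belief and one outcome per respondent")
    mean_belief = sum(beliefs_pct) / len(beliefs_pct) / 100.0
    realized = sum(outcomes) / len(outcomes)
    return mean_belief - realized


# Hypothetical data: five respondents, none of whom experienced the event.
beliefs = [10, 20, 0, 5, 15]
realized = [False, False, False, False, False]
print(round(calibration_gap(beliefs, realized), 2))  # 0.1
```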

We use logit analysis of a binary variable for whether or not the person actually drank and drove, was cited for driving more than 15 miles per hour over the speed limit, or had an accident during the previous year as reported at the third survey wave. We add a covariate for subjective beliefs for each of three outcomes as reported in CASI-1.

The odds ratios on the covariates for the subjective probabilities of drinking and driving and of being cited for speeding are 27.3 and 3.08, respectively, and are statistically significant (Table 8, Panel A, cols. 1, 2). The odds ratio for the subjective probability of an accident is 2.43, but this result is not statistically significant. Overall, these results imply that respondents’ subjective beliefs help predict their future outcomes.Footnote 13 Odds ratios for drinker type in the drinking and driving analysis rise monotonically from other drinker to heavy binge drinker. Heavy binge drinkers were also more likely to have been cited for speeding, even after accounting for subjective beliefs about this outcome.
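To make the magnitude of an odds ratio concrete, the sketch below converts a baseline probability to odds, multiplies by an odds ratio, and converts back to a probability. The baseline of 0.10 is a hypothetical value chosen only for illustration, and applying the reported accident odds ratio of 2.43 to a one-unit change in the covariate is an assumption about how the covariate is scaled.

```python
def shift_probability(baseline_p: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    if not 0.0 < baseline_p < 1.0:
        raise ValueError("baseline_p must lie strictly between 0 and 1")
    odds = baseline_p / (1.0 - baseline_p)   # probability -> odds
    new_odds = odds * odds_ratio             # logit coefficients act multiplicatively on odds
    return new_odds / (1.0 + new_odds)       # odds -> probability


# Hypothetical baseline probability of 0.10 with an odds ratio of 2.43.
print(round(shift_probability(0.10, 2.43), 3))  # 0.213
```

Because the odds ratio acts on odds rather than on probabilities, the same ratio implies a smaller absolute change when the baseline probability is close to 0 or 1.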

Table 8 Ability of past subjective beliefs to predict future outcomes (logit)

Results in the full specification for drinker type and subjective beliefs are similar, although adding covariates reduces the magnitudes of the odds ratios, and the odds ratio for heavy binge drinker loses significance in the analysis of speeding citations. The odds ratios on “History” of prior DWI arrests, speeding citations, and accidents are all above 2.0 and statistically significant. Even after accounting for subjective beliefs, there is considerable information content in histories of prior citations, arrests, and accidents, which is information that motor vehicle insurers use in setting premiums.

Overall, these results imply that individuals have fairly accurate beliefs about future events, particularly those under their control. They are sometimes too optimistic, but this is not a general pattern.

4 Other findings

Two other findings are relevant for interpreting our key results. First, people may think that probabilities of both good and adverse outcomes apply to others and not to themselves. SAD asked respondents to assess their driving skills relative to others, “How would you rate yourself as a driver relative to other drivers?” Response options were: much better; better; about the same; worse; and much worse. The odds that binge and heavy binge drinkers view their driving ability as relatively favorable are substantially higher than those of the omitted reference group, other drinkers (Table 9).

Table 9 Self-rated ability (ordered logit)

CASI-1 also asked, “Compared to the average driver, do you think that you can safely handle much more alcohol, somewhat more, about the same, somewhat less, or much less alcohol than the average driver?” As with perceived driving ability, we find that binge and heavy binge drinkers are more optimistic about their ability to handle alcohol. Thus, juxtaposed against our main findings, which do not support optimism bias in the context of alcohol consumption, is the same kind of finding that has led scholars to hypothesize that optimism bias underlies decisions about harmful choices.Footnote 14

Although binge and heavy binge drinkers tend to think that they are more capable than others, evidence from our analysis of data from the SAD suggests that this belief does not generally translate to subjective beliefs about specific outcomes of high levels of drinking.Footnote 15 A criticism of this line of questioning is that the results may be due to question framing. In particular, questions comparing the individual’s subjective probability of an adverse outcome to the same individual’s assessment of the probability faced by the “average” person may yield biased results. First, average is not defined and individuals are likely to have different reference groups. Also, people may be reluctant to state that they are “below average” or “more vulnerable than average” (Viscusi 2002), especially if they often engage in the activity to which the question refers.

Second, some studies infer from focal responses (0, 50, 100%) that people do not have well-formed subjective beliefs about important personal outcomes conditional on their personal choices.Footnote 16 A preponderance of “50%” responses may be particularly indicative of lack of a firm subjective belief. Figure 2 shows the frequency of responses to questions in the SAD phrased in the second person, and Fig. 3 shows the frequency distributions for questions phrased in the third person. As seen in Fig. 3, there are indications of focal responses for the probability of being convicted conditional on an arrest and for the probability of jail conditional on conviction, but little or no indication of focal responses for the other outcomes.
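One simple way to screen for focal responses is to compute the share of answers that fall exactly on 0, 50, or 100. The sketch below is illustrative only: the function name `focal_shares` and the sample answers are hypothetical, and responses are assumed to be recorded as integers on the 0–100 scale.

```python
from collections import Counter
from typing import Dict, Iterable

FOCAL_VALUES = (0, 50, 100)


def focal_shares(responses: Iterable[int]) -> Dict[int, float]:
    """Return the share of responses at each focal value (0, 50, 100)."""
    answers = list(responses)
    if not answers:
        raise ValueError("no responses supplied")
    counts = Counter(answers)
    return {value: counts[value] / len(answers) for value in FOCAL_VALUES}


# Hypothetical answers to one probability question.
sample = [0, 10, 50, 50, 25, 50, 100, 5, 50, 80]
print(focal_shares(sample))  # {0: 0.1, 50: 0.4, 100: 0.1}
```

A large share at 50 relative to neighboring values such as 40 or 60 is the pattern usually read as a sign that respondents lack a firm belief.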

Fig. 2 The distribution of subjective beliefs: questions phrased in second person

Fig. 3 The distribution of subjective beliefs: questions phrased in third person

5 Discussion and conclusions

The motivation for this study was to determine whether people engage in risky drinking behavior because they underestimate the adverse consequences from their actions. Overall, our results indicate that optimism bias does not explain why people engage in heavy, binge, and heavy binge drinking.Footnote 17

As an alternative to the optimism bias hypothesis, persons who drink frequently and consume large amounts of alcohol daily could be more familiar with the risks of such behaviors. Even advocates of optimism bias have indicated that such bias should decrease with personal experience (Weinstein 1987). In our study, there is some evidence in support of such learning by doing from our analysis of the probability of getting liver disease from long-term drinking, the number of drinks required to reach a BAC of 0.08, and the probability of jail conditional on conviction, but not for the other study outcomes.

Underlying measurement of accuracy of beliefs is the notion or premise that the objective probability is known. For one of our study measures, there is no consensus among experts about what the underlying objective values are. There is disagreement in the literature and among persons who have gained practical experience—such as the attorneys and police we surveyed in the study citiesFootnote 18—about the percent of drunk drivers on the road on weekend evenings. Even if the city-specific estimates we use are rough, as they undoubtedly are, it is noteworthy that the subjective beliefs of the respondents to SAD do not differ systematically from the objective estimates.

Dionne et al. (2007) used survey data on drinking and driving behaviors, knowledge of regulations, attitudes toward drinking and driving, and personal characteristics to assess the accuracy of subjective beliefs about the risk of impaired driving. Unlike our sample, half of the persons in their sample had been convicted of impaired driving. Their main result relevant to our findings is that no variable measuring drinking behaviors had much influence on risk perceptions. There is some evidence in the Dionne et al. study that individuals who “do not drink” or “do not drink an hour before driving” are more likely to overestimate the risk of having an accident occasioning a police report while drinking and driving. Similarly, individuals who “do not drink” are more likely to overestimate the risk of having an accident causing bodily injuries or death while drinking and driving. Dionne et al. did not analyze these explanatory variables for heavy, binge, and heavy binge drinkers, and the SAD excluded non-drinkers since the survey’s main focus was on determinants of drinking and driving.

If drivers on the whole are aware of, or even pessimistic about, the adverse consequences of excessive alcohol use, why do they engage in such behaviors? One possibility is that people do not intend to engage in risky behaviors, but do so because they lack self-control (Gul and Pesendorfer 2001). This is a possible explanation of our finding that SAD respondents’ subjective probabilities of drinking and driving in the following year were lower than the actual probabilities.

Several indicators of self-control were elicited by the SAD, but they are not analyzed in this study. The SAD asked a series of 12 questions to elicit estimates of the respondent’s impulsivity and self-control. Binge and heavy binge drinkers exhibited higher levels of impulsivity/lack of self-control than other drinkers did.

A second possibility is that individuals make rational choices with regard to their alcohol consumption. On some level, the benefits of alcohol consumption behaviors, such as socializing in connection with drinking, may outweigh the adverse consequences. The SAD got at this by asking, “How important is it for your social life to be able to enjoy a few drinks with your friends?” Binge and heavy binge drinkers were significantly more likely to state that drinking was quite important or very important to their social lives. Heavy drinkers attached higher importance to drinking for their social lives than other drinkers did, but the difference between heavy and other drinkers was not quite significant at conventional levels.

Third, the cost of the adverse consequences may be less for persons who consume a lot of alcoholic beverages. For example, the SAD asked about the costs of a DWI arrest to the respondent’s personal life. Heavy and heavy binge drinkers were less likely to state that the cost of a DWI arrest was high.

Our study has several strengths. First, it is based on data from eight geographically diverse cities with different cultures and attitudes towards drinking. Second, the SAD oversampled heavy and binge drinkers in order to examine the details of their behaviors. Third, given that the SAD was conducted in multiple waves, we are able to measure outcomes for the same individuals for whom we have past subjective beliefs about the probabilities that these same outcomes would occur. Fourth, we compare subjective beliefs to objective data from a variety of secondary sources. Fifth, we analyze subjective beliefs about specific outcomes and overconfidence in the same study. Sixth, although our focus was on accuracy of risk perceptions among persons by drinker type, we also consider a multitude of factors involved in high levels of alcohol use. Seventh, our conclusions are based on analysis of risk perceptions about a variety of issues as they pertain to consequences of heavy and binge drinking.

We also acknowledge some weaknesses in our study. First, rejecting the notion that optimism bias explains high levels of alcohol use is not equivalent to rejecting optimism bias in all decision-making contexts or accepting rationality as universally applicable. At a minimum, our conclusion about optimism bias applies to binge and heavy binge drinking, which imposes important negative consequences on drinkers and others. On the general applicability of the rationality assumption, as McFadden (1999) stated in a review, “Rationality for Economists?”, even by the late 1990s there was a large body of economic literature and even more evidence from other disciplines questioning the validity of the rationality assumption, irrespective of the details of how rationality is defined in operational terms. Scholars have acknowledged that there is heterogeneity among individuals, and the mix of behavioral types may be critical to market outcomes (Fehr-Duda et al. 2010). Even if risk perceptions are accurate, there are other potential forms of irrationality, for example, whether persons consider the utility of all likely states of the world pertinent to a particular decision, and time and risk dimensions of decisions (Zeckhauser and Viscusi 2008; Frederick and Loewenstein 2008), and the extent to which they rely on heuristics (Katsikopoulos and Gigerenzer 2008).

Second, and more specifically, the objective probabilities we used could vary among individuals with particular attributes in ways we are unable to measure. For example, the objective probabilities of conviction, fines, and jail are for each city. There is likely to be variation in the objective probabilities according to personal attributes, which the arrest data from the four states did not allow us to measure. Third, for a minority of the outcomes we analyze in this study (survival and liver disease), the SAD measured subjective probabilities crudely, i.e., on a scale of 0–10 rather than on a scale of 0–100. The 0–10 scale converted to probabilities is coarser than one would ideally like. Thus, a 2 can indicate a probability ranging from 0.15 to 0.24. Viscusi and Hakes (2003) questioned the validity of such scales as measures of probabilities. They found that the 0–10 scale used to elicit subjective probabilities does not satisfy all properties associated with probabilities. But Manski and Molinari (2010), who analyzed data from the HRS from which our survival questions are drawn, found that a substantial fraction of persons answered probability questions in multiples of ten. This implies that many people may not be able to state probabilities to more than one significant figure.
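The coarseness of the 0–10 scale can be made explicit with a small sketch. It assumes that a respondent reports the score nearest to ten times his or her underlying probability, so each score stands for a half-open interval of width 0.10 (narrower at the endpoints); the function name `score_to_interval` is hypothetical.

```python
from typing import Tuple


def score_to_interval(score: int) -> Tuple[float, float]:
    """Map a 0-10 score to the probability interval [lower, upper) it may represent.

    Assumes rounding to the nearest score, so the upper bound is exclusive.
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    lower = max(0.0, (score - 0.5) / 10)
    upper = min(1.0, (score + 0.5) / 10)
    return lower, upper


print(score_to_interval(2))  # (0.15, 0.25)
```

With the upper bound excluded, a score of 2 covers probabilities from 0.15 up to, but not including, 0.25, which matches the 0.15 to 0.24 range cited above when probabilities are expressed to two decimal places.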

In conclusion, based on our findings, optimism bias is not likely to be an important cause of heavy and binge drinking. Focusing on other explanations of such risky behavior is warranted. Risk perceptions are important to study because they underlie decision making. This study adds to the evidence base which implies it is appropriate to move away from optimism bias as a likely causal mechanism underlying potentially harmful personal choices.