1 Introduction

Econometric analysis of discrete choice has made considerable use of random utility models (RUMs) to interpret the observed choice behavior (McFadden 1974, 1981). Much empirical research concerns choice problems in which persons act with partial knowledge of the utilities of the feasible actions. Economists have used random expected utility models to analyze such choice problems. A common practice has been to specify fully the expectations that persons hold, in which case choice analysis reduces to inference on preferences alone.

Unfortunately, the expectations assumptions made in empirical research often have little foundation, diminishing the credibility of the findings. Consider, for example, the analysis of travel mode choice for the journey between home and work, one of the earliest applications of random utility models and still an important subject of empirical research (e.g., Warner 1962; Domencich and McFadden 1975). The canonical mode-choice model supposes that, each day, a worker chooses between two alternatives, travel by automobile and public transit. The utility of each mode depends on its travel cost and travel time.

Empirical researchers have commonly used models of traffic flow on transportation networks to predict the travel times that particular workers would experience by each mode. Researchers have also assumed that these predicted travel times agree with the travel times that workers perceive when they make their mode choices. The accuracy with which researcher-predicted travel times measure travel time expectations is questionable. Transportation network models cannot precisely emulate the circumstances of individual travelers. Moreover, workers typically are uncertain how long the journey will take by each mode. Travel times may vary from day to day due to unforeseen variation in traffic volume and the possibility of accidents.

RUMs with incorrectly predicted expectations of travel times or other attributes of alternatives are misspecified. The generic result for discrete choice analysis is inconsistent parameter estimation, the specifics depending on the case. To enhance the credibility of econometric analysis, I have recommended survey measurement of the expectations that decision makers hold (Manski 2004). A small but growing body of empirical research proceeds in this manner, measuring the probabilistic expectations of sampled decision makers and combining choice and expectations data to estimate RUMs. See Lochner (2007), Delavande (2008), van der Klaauw and Wolpin (2008), van der Klaauw (2012), Zafar (2011), Wiswall and Zafar (2015), and Giustinelli (2016).

What can one do in the absence of expectations data? In this case, one can still study how inference depends on the expectations assumptions imposed. Manski (2010) considered inference when one specifies a set of expectations that decision makers may plausibly hold. I first posed the idea in abstraction and then specialized to binary response with linear utilities, where the analysis is straightforward. I mainly assumed that decision makers possess unique subjective probability distributions on the states of nature and make choices that maximize expected utility. I briefly considered the possibility that persons place only partial probabilistic structure on the states of nature and make choices in some manner that uses the available structure.

I referred to the models of choice behavior developed in Manski (2010) as random utility models with bounded ambiguity (RUMBAs). Ambiguity may be observational, in that the researcher does not observe the expectations that decision makers hold. Ambiguity may be behavioral in the sense of Ellsberg (1961); that is, the persons under study may not possess complete probabilistic expectations. The adjective “bounded” refers to the fact that meaningful inference on the population distribution of preferences is possible only if the researcher possesses sufficient a priori knowledge of the expectations that decision makers hold.

This paper revises and updates the presentation of Manski (2010). Sections 2 and 3 consider observational and behavioral ambiguity, respectively.

2 Observational ambiguity

2.1 Generalities

Let J be a population of decision makers, each of whom chooses an action from a finite choice set C. The standard RUM assumes that person j associates utilities \((u_\mathrm{jc} ,c\in C)\) with the feasible actions and chooses one that maximizes utility. The inferential problem is to learn the distribution of preferences from observation of the choices and covariates of a random sample of decision makers. Let X denote the feasible values of the observable covariates. Assume that the distribution of preferences is continuous and has the form \(F_\theta \), where \(\theta \) belongs to a specified parameter space \(\Theta \). Then the equations

$$\begin{aligned} P\left( {c|x} \right) =F_\theta (u_c \ge u_d ,d\in C|x), \quad \left( {c,x} \right) \in C\times X \end{aligned}$$
(1)

relate the choice probabilities P(c|x) to the distribution of preferences \(F_\theta \). The identification region for \(\theta \) is the set of parameter values that satisfy equations (1). The usual practice is to restrict the parameter space enough to point identify \(\theta \) and, hence, fully reveal the distribution of preferences. However, Manski (2007) considers a broad class of problems in which the distribution of preferences is partially identified and Manski (2014) shows how the analysis may be used to study labor supply under alternative income tax schedules.

Most empirical research today concerns choice problems in which persons act with partial knowledge of the utilities of the feasible actions. Economists use random expected utility models to analyze such choice problems. Let \(\Gamma \) be a specified set of states of nature and, for \(\gamma \in \Gamma \), let \(u_{{\mathrm{jc}\gamma }}\) be the utility of action c to person j in state of nature \(\gamma \). Suppose that, at the time of decision making, person j does not know what state of nature will be realized. Researchers routinely assume that person j places a subjective probability distribution on \(\Gamma \), say \(Q_\mathrm{j}\), and chooses an action that maximizes expected utility. They assume that the joint distribution of preferences and expectations is continuous and has the form \(G_\theta \), where \(\theta \) belongs to a specified parameter space \(\Theta \). Then the equations

$$\begin{aligned} P\left( {c|x} \right) =G_\theta \left( \int u_{c \gamma } \mathrm{d}Q \ge \int u_{\mathrm{d} \gamma } \mathrm{d}Q,d\in C|x \right) , \quad \left( {c,x} \right) \in C\times X \end{aligned}$$
(2)

relate \(P(c|x)\) to \(G_{\theta }\). A common practice is to specify fully the expectations that persons hold, in which case the task of choice analysis reduces to inference on preferences alone.

The idea developed in Manski (2010) is to specify a set of expectations that decision makers may plausibly hold, rather than assume that they hold particular expectations. Empirical research specifying a set of plausible expectations typically cannot point identify the population distribution of preferences, but it can yield more credible partial-identification findings. It can also make plain the extent to which conventional point estimates rest on untenable expectations assumptions. See Barseghyan et al. (2016) for another work of this type.

I assume that the researcher and decision makers agree on the set \(\Gamma \) of feasible states of nature. Let \(\Pi _\mathrm{j}\) denote the set of subjective distributions on \(\Gamma \) that the researcher deems plausible for person j to hold. I assume that the researcher is correct in thinking that the set \(\Pi _\mathrm{j}\) contains person j’s expectations. I place no cross-person restrictions on expectations; hence, the members of the population can collectively hold any expectations in the Cartesian-product set \(\{\times _{\mathrm{j} {\in J}} \Pi _\mathrm{j}\}\).

Suppose that person j chooses action c. This action maximizes expected utility under the expectations \(Q_\mathrm{j}\). The researcher knows that \(Q_\mathrm{j} \in \Pi _\mathrm{j} \). Hence,

$$\begin{aligned} \mathrm{j\,chooses\,c}\Rightarrow \exists \pi \in \Pi _\mathrm{j} \;\,\mathrm{s.t.} \int u_{\mathrm{jc} \gamma } \mathrm{d} \pi \ge \int u_{\mathrm{jd} \gamma } \mathrm{d}\pi , \;\forall d\in C. \end{aligned}$$
(3a)

Suppose that person j does not choose action c. Then there exists another action that yields expected utility at least as large as that of c under \(Q_\mathrm{j}\). Hence,

$$\begin{aligned} \mathrm{j\,does\,not\,choose\,c}\Rightarrow \exists \pi \in \Pi _\mathrm{j} \;\mathrm{and}\; d\ne c \;\,\mathrm{s.t.} \int u_{\mathrm{jd} \gamma } \mathrm{d} \pi \ge \int u_{\mathrm{jc} \gamma } \mathrm{d}\pi . \end{aligned}$$
(3b)

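This logical relationship is equivalent to its contrapositive

$$\begin{aligned} \int u_{\mathrm{jc} \gamma } \mathrm{d}\pi >\int u_{\mathrm{jd} \gamma } \mathrm{d}\pi ,\;\forall (d,\pi )\in C\times \Pi _\mathrm{j} \;\,\mathrm{s.t.}\; d\ne c\Rightarrow \mathrm{j\,chooses\,c}. \end{aligned}$$
(3b′)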

Aggregating across the population, (3a) and (3b’) imply these inequalities relating choice probabilities to the distribution of preferences and expectations:

$$\begin{aligned}&G_\theta \left[ \int u_{c \gamma } \mathrm{d} \pi >\int u_{d \gamma } \mathrm{d} \pi ,\;\forall (d,\pi )\in C \times \Pi \;\,\mathrm{s.t.}\; d \ne c|x \right] \le P \left( {c|x}\right) \nonumber \\&\; \le G_{\theta } \left[ \exists \pi \in \Pi \;\,\mathrm{s.t.} \int u_{c \gamma } \mathrm{d} \pi \ge \int u_{d \gamma } \mathrm{d} \pi , \;\forall d\in C|x \right] , \quad \left( {c,x} \right) \in C\times X.\qquad \end{aligned}$$
(4)

These inequalities provide the basis for inference on \(\theta \). The identification region is the set of parameter values that satisfy (4).

2.2 Binary choice with linear utilities

Inequalities (4) describe the inferential problem in generality, but they are too abstract to communicate much. Hence, I now consider the special case of binary choice with linear utilities.

Assume that each member of the population must choose between two actions, labeled 0 and 1. The utility of action c to person j is

$$\begin{aligned} u_{{\mathrm{jc}} \gamma } =z_{\mathrm{jc}} \beta +\alpha y_{{\mathrm{jc}} \gamma } +\varepsilon _{\mathrm{jc}} . \end{aligned}$$
(5)

Here, \(z_{\mathrm{jc}}\) is a K-vector, \(y_{{\mathrm{jc}} \gamma }\) and \(\varepsilon _{\mathrm{jc}}\) are scalar, and \((\beta ,\alpha )\) are corresponding parameters. The absence of a \(\gamma \)-subscript on \((z_{\mathrm{jc}} ,\varepsilon _{\mathrm{jc}})\) indicates that the person knows these quantities at the time of decision making. The presence of a \(\gamma \)-subscript on \(y_{{\mathrm{jc}} \gamma }\) indicates that this quantity depends on the unknown state of nature. For example, when modeling the choice of travel mode, the decision maker may know the travel cost (z) but not the travel time (y) of each alternative.

Assume that the person chooses an action that maximizes expected utility. The linearity of utility in y implies that expected utility varies only with the mean of y, and not with its entire distribution. Specifically, let \(z_\mathrm{j} \equiv z_{\mathrm{j}1} -z_{\mathrm{j}0}\), \(y_{\mathrm{j} \gamma } \equiv y_{{\mathrm{j}1}\gamma } -y_{{\mathrm{j}0}\gamma }\), \(v_\mathrm{j} \equiv \int y_{\mathrm{j} \gamma } \mathrm{d}Q_\mathrm{j}\), and \(\varepsilon _\mathrm{j} \equiv \varepsilon _{\mathrm{j}1} -\varepsilon _{\mathrm{j}0}\). Let \(c_\mathrm{j}\) be the action chosen by person j. Then the decision rule is

$$\begin{aligned} c_\mathrm{j} =1[z_\mathrm{j} \beta +\alpha v_\mathrm{j} +\varepsilon _\mathrm{j} >0]. \end{aligned}$$
(6)
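
To spell out the step behind (6), using only (5) and the definitions just given: because \(z_{\mathrm{jc}}\) and \(\varepsilon _{\mathrm{jc}}\) do not vary with \(\gamma \), the difference in expected utility between actions 1 and 0 is

$$\begin{aligned} \int u_{\mathrm{j}1\gamma } \mathrm{d}Q_\mathrm{j} -\int u_{\mathrm{j}0\gamma } \mathrm{d}Q_\mathrm{j}&=(z_{\mathrm{j}1} -z_{\mathrm{j}0})\beta +\alpha \int (y_{\mathrm{j}1\gamma } -y_{\mathrm{j}0\gamma })\mathrm{d}Q_\mathrm{j} +(\varepsilon _{\mathrm{j}1} -\varepsilon _{\mathrm{j}0}) \\&=z_\mathrm{j} \beta +\alpha v_\mathrm{j} +\varepsilon _\mathrm{j} . \end{aligned}$$

Action 1 is chosen when this difference is positive. Under the maintained assumption that the distribution of preferences and expectations is continuous, ties occur with probability zero.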

Suppose that a researcher draws a random sample of N members of the population. For each sample member \(j=1,\ldots ,N\), the researcher observes \(({c_\mathrm{j} ,z_{\mathrm{j}}})\) but not \((v_\mathrm{j} ,\varepsilon _{\mathrm{j}} )\). This would be a standard problem of binary choice analysis if the researcher were to observe \(v_\mathrm{j}\). In the absence of data on expectations, the prevailing practice has been to assume that \(v_{\mathrm{j}}\) takes some particular value and proceed with standard binary choice analysis. Here, I assume instead that \(Q_\mathrm{j} \in \Pi _\mathrm{j}.\) I also assume that the sign of \(\alpha \) is known; for convenience, the discussion below assumes that \(\alpha \ge 0.\)

Let \(v_{\mathrm{j}0} \equiv \mathrm{inf}(\int y_{\mathrm{j} \gamma } \mathrm{d}\pi ,\pi \in \Pi _\mathrm{j} )\) and \(v_{\mathrm{j}1} \equiv \mathrm{sup}(\int y_{\mathrm{j} \gamma } \mathrm{d}\pi ,\pi \in \Pi _\mathrm{j} )\). Given (5), (6), and \(\alpha \ge 0\), the logical relationships in (3) reduce to

$$\begin{aligned} \mathrm{j\,chooses}\,1\Rightarrow z_\mathrm{j} \beta +\alpha v_{\mathrm{j}1} +\varepsilon _\mathrm{j} \ge 0, \end{aligned}$$
(7a)
$$\begin{aligned} z_\mathrm{j} \beta +\alpha v_{\mathrm{j}0} +\varepsilon _\mathrm{j} >0\Rightarrow \mathrm{j \,chooses}\,1. \end{aligned}$$
(7b)

Aggregating across the population, (7a) and (7b) yield these inequalities relating choice probabilities to the distribution of preferences:

$$\begin{aligned} G_\theta [z\beta +\alpha v_0 +\varepsilon >0|x]\le P\left( {1|x} \right) \le G_\theta [z\beta +\alpha v_{1} +\varepsilon \ge 0|x], \quad x\in X. \end{aligned}$$
(8)

The observed covariates x include \((z_{0},z_{1},\Pi )\). The parameter \(\theta \) includes \((\beta , \alpha )\) plus the parameters needed to describe \(P(\varepsilon |x)\), the distribution of \(\varepsilon \) conditional on x. The identification region is the set of parameter values that satisfy (8).
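
A small simulation may help make (7a), (7b), and (8) concrete. The sketch below is illustrative only: the function name `simulate_bounds`, the standard normal draws for z and \(\varepsilon \), and the uniform ranges used for \((v_0 ,v_1)\) are assumptions made here, not part of the model. It draws an unobserved expectation v inside each person's range and reports the lower bound, the share choosing action 1, and the upper bound, pooled over x rather than conditional on it.

```python
import random


def simulate_bounds(beta, alpha, n=10_000, seed=0):
    """Return (lower, share choosing 1, upper) for a simulated population.

    Assumes alpha >= 0, scalar z, and hypothetical distributions for
    z, eps, and the expectation ranges (v0, v1).
    """
    rng = random.Random(seed)
    lower = chosen = upper = 0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        v0 = rng.uniform(-2.0, 1.0)          # lower end of the plausible range
        v1 = v0 + rng.uniform(0.0, 1.0)      # upper end of the plausible range
        v = rng.uniform(v0, v1)              # true expectation, unobserved
        eps = rng.gauss(0.0, 1.0)
        chosen += int(z * beta + alpha * v + eps > 0)    # decision rule (6)
        lower += int(z * beta + alpha * v0 + eps > 0)    # premise of (7b)
        upper += int(z * beta + alpha * v1 + eps >= 0)   # conclusion of (7a)
    return lower / n, chosen / n, upper / n


lo, p1, hi = simulate_bounds(beta=0.5, alpha=1.0)
print(lo, p1, hi)
```

Because \(\alpha \ge 0\) and \(v_0 \le v\le v_1 \) for every simulated person, the ordering `lo <= p1 <= hi` holds in every sample, not just on average.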

The size and shape of the identification region depend on the ranges \(\{({v_{\mathrm{j}0},v_{\mathrm{j}1}}), j \in J\}\) within which the expectations \((v_\mathrm{j} ,j\in J)\) are known to lie. Manski (2010) showed that the identification region is generically unbounded when expectations almost always have indeterminate sign: that is, when \(v_{\mathrm{j}0}<0<v_{\mathrm{j}1} \) for almost all \(j\in J.\) This finding should serve as a warning to empirical researchers studying binary choice under uncertainty. Meaningful inference is possible only if the researcher sometimes knows whether v is positive or negative.

2.3 Monotone-index models

Empirical research on binary choice often assumes that \(\varepsilon \) is statistically independent of x with a specified strictly increasing distribution function, such as the standard normal or logistic distribution. Let F be the assumed distribution function for \(-\varepsilon \). Then (8) becomes

$$\begin{aligned} F(z\beta +\alpha v_{0} )\le P\left( {1|x} \right) \le F(z\beta +\alpha v_{1} ), \quad x\in X, \end{aligned}$$
(9)

where \(\theta =(\beta ,\alpha )\). The identification region for \((\beta ,\alpha )\) is the set of parameter values that satisfy inequalities (9) or, equivalently, the linear inequalities

$$\begin{aligned} z\beta +\alpha v_{0} \le F^{-1}\left[ {P\left( {1|x} \right) } \right] \le z\beta +\alpha v_{1}, \quad x\in X. \end{aligned}$$
(9′)

This is a monotone-index model with interval regressor data of the form studied in Manski and Tamer (2002, Section 4). Their Corollary to Proposition 4 shows the following:

  (a) The identification region for \((\beta ,\alpha )\) is convex.

  (b) Let \(k\in ( {1,\ldots ,K}).\) Let \(P\left( {z_{k} |z_{-k} ,v_{0} ,v_{1} } \right) \) have unbounded support, a.e. \(( {z_{-k} ,v_{0}, v_{1} } ).\) Then \(\beta _{k} \) is point identified.

  (c) For each value of v, let there exist no proper linear subspace of \(R^{K}\) having probability one under \(P({z|v}).\) Let \(P({v_{0} = v_1})>0.\) Then \((\beta ,\alpha )\) is point identified.

When the parameter space is compact and some other regularity conditions hold, their modified minimum distance (MMD) method provides a consistent estimate of the identification region.
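
As a rough illustration of the linear inequalities (9′), the sketch below checks whether a candidate \((\beta ,\alpha )\) belongs to the identification region when the choice probabilities are treated as known. It is not the MMD estimator. The logistic choice of F, the function names, and the representation of each covariate cell as a tuple `(z, v0, v1, p1)` are all assumptions made for the example.

```python
import math


def logit(p):
    """Inverse of the logistic distribution function (assumed form of F)."""
    return math.log(p / (1.0 - p))


def in_identification_region(beta, alpha, cells, tol=1e-9):
    """Check inequalities (9') at every covariate cell.

    beta  : sequence of K coefficients
    alpha : scalar coefficient on the expectation, required to be >= 0
    cells : iterable of (z, v0, v1, p1), with z a sequence of length K,
            v0 <= v1 the expectation bounds, and p1 = P(1|x)
    """
    if alpha < 0:
        return False
    for z, v0, v1, p1 in cells:
        index = sum(z_k * b_k for z_k, b_k in zip(z, beta))
        s = logit(p1)
        if not (index + alpha * v0 - tol <= s <= index + alpha * v1 + tol):
            return False
    return True


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical population with beta = 1, alpha = 2, and true v = 0.5 in both cells.
cells = [
    ((1.0,), 0.0, 1.0, logistic(1.0 + 2.0 * 0.5)),
    ((0.0,), 0.25, 0.75, logistic(0.0 + 2.0 * 0.5)),
]
print(in_identification_region((1.0,), 2.0, cells))   # the data-generating value
print(in_identification_region((1.0,), 0.5, cells))   # a value that violates (9')
```

Since the constraints are linear in \((\beta ,\alpha )\), the set of values passing this check is an intersection of half-spaces, which is consistent with the convexity result in (a).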

Manski and Tamer (2002) also study the case in which the researcher knows only that some quantile of \(P(\varepsilon {\vert }x)\) is constant on X. The identification region remains convex, but is larger than the one obtained with a monotone-index model. This region may be estimated using a modified maximum score method.

Consideration of a simple monotone-index model illustrates the problem that occurs when expectations almost always have indeterminate sign. Let z have one component, this being an action-specific constant; hence, \(\beta \) is scalar. Let the population contain M distinct values for the range \(({v_0 ,v_1 }),\) say \(({v_{m0} ,v_{m1}}),m=1,\ldots ,M.\) Let \(s_{m} \equiv F^{-1}[{P({1|m})}]\). Then the linear inequalities (9′) reduce to

$$\begin{aligned} \beta +\alpha v_{m0} \le s_m \le \beta +\alpha v_{m1}, \quad m=1,\ldots ,M. \end{aligned}$$
(10)

When expectations have indeterminate sign, the identification region for \(\beta \) is the entire real line. To see this, consider any conjectured value for \(\beta \). Holding \(\beta \) fixed at this value, it follows from (10) that \(\alpha \) is feasible if it solves these inequalities for all values of m:

$$\begin{aligned}&(s_m -\beta )/v_{m0} \le \alpha \le (s_m -\beta )/v_{m1} \quad \mathrm{if} \quad v_{m1} \le 0,\end{aligned}$$
(11a)
$$\begin{aligned}&\mathrm{max}\big \{(s_m -\beta )/v_{m0} ,(s_m -\beta ) /v_{m1}\big \} \le \alpha \quad \quad \mathrm{if} \quad v_{m0}<0<v_{m1},\end{aligned}$$
(11b)
$$\begin{aligned}&(s_m-\beta )/v_{m1} \le \alpha \le (s_{m} -\beta )/v_{m0} \quad \mathrm{if} \quad v_{m0} \ge 0. \end{aligned}$$
(11c)

The conjectured value of \(\beta \) is feasible if and only if these inequalities have a solution. When expectations have indeterminate sign, (11b) provides the operative inequalities for all values of m. In this case, all \(\alpha \ge \mathrm{max}_{m} \mathrm{max}\{(s_m -\beta )/v_{m0} ,(s_m -\beta )/v_{m1} \}\) solve (11). Hence, the conjectured value of \(\beta \) is feasible, along with all such \(\alpha \).
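
The argument can be traced numerically. The sketch below computes, for a conjectured \(\beta \), the set of \(\alpha \ge 0\) that satisfy (10). Rather than branching on the three cases in (11), it reads the two constraints \(\alpha v_{m0} \le s_m -\beta \) and \(\alpha v_{m1} \ge s_m -\beta \) directly off (10), which yields the same set. The function name `alpha_interval` and the numerical values of \((v_{m0} ,v_{m1} ,s_m)\) are hypothetical.

```python
import math


def alpha_interval(beta, cells):
    """Return (lo, hi) such that alpha in [lo, hi] satisfies (10), or None.

    cells : iterable of (v0, v1, s) with v0 <= v1; alpha >= 0 is imposed.
    """
    lo, hi = 0.0, math.inf
    for v0, v1, s in cells:
        r = s - beta
        # Constraint alpha * v0 <= r
        if v0 > 0:
            hi = min(hi, r / v0)
        elif v0 < 0:
            lo = max(lo, r / v0)
        elif r < 0:
            return None
        # Constraint alpha * v1 >= r
        if v1 > 0:
            lo = max(lo, r / v1)
        elif v1 < 0:
            hi = min(hi, r / v1)
        elif r > 0:
            return None
    return (lo, hi) if lo <= hi else None


# Every range straddles zero: each conjectured beta is feasible.
indeterminate = [(-1.0, 2.0, 0.3), (-0.5, 1.0, -0.2)]
for beta in (-100.0, 0.0, 100.0):
    print(beta, alpha_interval(beta, indeterminate))

# One range with known positive sign: some conjectured beta are ruled out.
determinate = [(1.0, 2.0, 1.0)]
print(alpha_interval(0.0, determinate), alpha_interval(5.0, determinate))
```

In the first population the upper end of the interval is always infinite, matching the conclusion that all sufficiently large \(\alpha \) solve (11). In the second, a single cell with \(v_{m0} >0\) is enough to exclude some values of \(\beta \).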

3 Behavioral ambiguity

In Sect. 2, I supposed that decision makers have complete subjective probability distributions on the states of nature, which they use to maximize expected utility. The inferential problem was observational, in that the researcher did not know what expectations people have. In this section, I suppose that persons want to maximize expected utility, but do not have unique subjective distributions on the states of nature. Instead, person j has a set \(\Pi _\mathrm{j} \) of such distributions, as in Gilboa and Schmeidler (1989), Walley (1991), and much other research on choice under ambiguity.

Decision theorists have suggested various criteria for choice under ambiguity, such as the maximin- and minimax-regret expected utility rules. There is no consensus on how persons with incomplete probabilistic expectations should or do behave, but decision theorists do largely agree that persons should not choose actions that are strictly dominated. In the present setting, this means that person j should not choose an action if there exists another one that yields higher expected utility under all distributions in \(\Pi _\mathrm{j}\). Moreover, if there exists an action that outperforms all other actions under all distributions in \(\Pi _\mathrm{j}\), person j should choose this action.

These dominance conditions yield two logical relationships between choices and decision rules:

$$\begin{aligned}&\mathrm{j\,chooses\,c} \Rightarrow \not \exists \, d \in C \;\,\mathrm{s.t.} \int u_{\mathrm{jd} \gamma } \mathrm{d} \pi > \int u_{{\mathrm{jc}} \gamma } \mathrm{d}\pi , \;\forall \pi \in \Pi _\mathrm{j}. \end{aligned}$$
(12a)
$$\begin{aligned}&\int u_{{\mathrm{jc}} \gamma } \mathrm{d}\pi >\int u_{\mathrm{jd} \gamma } \mathrm{d}\pi ,\;\forall (d,\pi )\in C\times \Pi _\mathrm{j} \;\,\mathrm{s.t.}\; d\ne c\Rightarrow \mathrm{j\,chooses\,c}. \end{aligned}$$
(12b)

Aggregating across the population, (12a) and (12b) imply these inequalities relating choice probabilities to the distribution of preferences and expectations:

$$\begin{aligned}&G_\theta \left[ \int u_{c \gamma } \mathrm{d}\pi >\int u_{d \gamma } \mathrm{d}\pi ,\;\forall (d,\pi )\in C\times \Pi \;\,\mathrm{s.t.}\; d\ne c|x \right] \le P\left( {c|x} \right) \nonumber \\&\; \le G_\theta \left[ \not \exists \, d \in C \;\,\mathrm{s.t.} \int u_{d \gamma } \mathrm{d}\pi >\int u_{c \gamma } \mathrm{d} \pi , \; \forall \pi \in \Pi |x \right] , \quad ({c,x})\in C\times X.\qquad \end{aligned}$$
(13)

The identification region is the set of parameter values that satisfy (13).

Observe that relationship (12b) is the same as the one (3b’) that holds with observational ambiguity. Relationship (12a) is generally weaker than the one (3a) that holds with observational ambiguity. Whereas (3a) requires that the chosen action be optimal for some feasible value of \(\pi \), (12a) only requires that the chosen action not be strictly dominated. However, (12a) and (3a) are equivalent when the choice set C contains two actions. Hence, inequalities (13) and (4) coincide in this case.
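
The difference between (3a) and (12a) can be seen in a small finite example. The sketch below is illustrative only: the function names, the two-state payoffs, and the three-element set of distributions are hypothetical, and a finite list stands in for \(\Pi _\mathrm{j}\).

```python
def expected_utility(u, pi):
    """Expected utility of state-contingent payoffs u under distribution pi."""
    return sum(u_g * p_g for u_g, p_g in zip(u, pi))


def optimal_for_some(c, utilities, Pi):
    """Condition (3a): c maximizes expected utility under at least one pi."""
    return any(
        all(expected_utility(utilities[c], pi) >= expected_utility(utilities[d], pi)
            for d in utilities)
        for pi in Pi
    )


def undominated(c, utilities, Pi):
    """Condition (12a): no other action beats c under every pi."""
    return not any(
        all(expected_utility(utilities[d], pi) > expected_utility(utilities[c], pi)
            for pi in Pi)
        for d in utilities if d != c
    )


# Hypothetical payoffs in two states of nature and a finite set of distributions.
Pi = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
three = {"a": (3.0, 0.0), "b": (0.0, 3.0), "c": (1.0, 1.0)}
two = {"a": (3.0, 0.0), "b": (0.0, 3.0)}

c_passes_12a = undominated("c", three, Pi)
c_passes_3a = optimal_for_some("c", three, Pi)
binary_agree = all(
    undominated(k, two, Pi) == optimal_for_some(k, two, Pi) for k in two
)
print(c_passes_12a, c_passes_3a, binary_agree)
```

With three actions, action c is never optimal under any listed distribution, yet neither a nor b beats it under all of them, so (12a) admits c while (3a) does not. With only a and b in the choice set, the two checks agree.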

The above discussion assumes that the researcher knows the set \(\Pi _\mathrm{j}\) of distributions held by each person j. It may be that a researcher does not know \(\Pi _\mathrm{j},\) but can specify a larger set \(\Pi _\mathrm{j}^{\prime }\supset \Pi _\mathrm{j}\) that contains \(\Pi _\mathrm{j}\). Then the researcher may base inference on \(\Pi _\mathrm{j}^{\prime }\), which expresses both behavioral and observational ambiguity.

4 Conclusion

This paper offers a second-best approach to econometric analysis of choice under uncertainty or behavioral ambiguity. A better approach is to measure the expectations that decision makers hold and to analyze choice behavior with these data in hand. However, expectations data often are not available. In their absence, empirical researchers typically make strong assumptions about expectations. It is important to understand how these assumptions drive findings. Moreover, researchers should want to learn what inferences are possible using weaker and more credible assumptions. The ideas and methods introduced here serve these purposes.

The analysis in this paper provides simple practical guidance for estimation of RUMs with two alternatives and linear utilities. However, the paper provides only an abstract framework for study of more complex settings with multiple alternatives and/or utilities that are nonlinear in attributes with uncertain values. Future work should aim to make estimation of such models tractable. I also see much scope for empirical research to move away from the expected utility model and study the behavior of persons who make decisions under ambiguity.