1 Introduction

The representation of probabilistic models of choice behavior by random utility functions has a long history. One of the early pioneers was Thurstone (1927), who proposed a Probit-type model based on normally distributed utilities. In contrast, the theory of Luce (1959) was derived from his Choice Axiom (equivalent to IIA) without reference to an underlying random utility interpretation. Subsequently, Holman and Marley (see Luce and Suppes 1965, p. 338, footnote 7) showed that the Luce model can also be interpreted as a random utility model derived from extreme value distributed random utilities. McFadden (1973), Yellott (1977), and Strauss (1979) have investigated the following identification problem related to the Luce model: whether there are distributions of the utilities other than the extreme value distributions that yield the Luce model. It turns out that if the utility functions are additively (or multiplicatively) separable into a deterministic part and a random part, the answer is negative, provided the utilities are independent. The most general results in this direction have been obtained by Yellott (1977) and Strauss (1979). Strauss (1979) has also obtained some results for the case where the random parts of the utility function are not necessarily independent across alternatives.

Related works include Falmagne (1978), Strauss (1979), Colonius (1984), Monderer (1992), Barberà and Pattanaik (1986), and Fiorini (2004), who have discussed necessary and sufficient conditions for a system of choice probabilities to be consistent with a random utility representation. Dagsvik (1994, 1995) showed that any random utility model can be approximated arbitrarily closely by Generalized Extreme Value models.

In this paper, we consider another extension: we maintain the assumption that the utilities are independent across alternatives but abandon the assumption of separability. Under these assumptions, and with an infinite universal set of alternatives, it turns out that the most general random utility representation of the Luce model is a utility function that is an arbitrary strictly increasing transformation of a separable utility function (additive or multiplicative) with an extreme value random component.

2 The Luce model and non-separable random utility representations

Let \(S\) denote the set of integers and consider a family of random utility models with utilities \(U_j , j\in S\), with the following properties. The utilities \(U_j \) and \(U_k \) are independent for \(j\ne k\). To each alternative \(j\) there is associated a positive scale \(w_j \) such that \(P(U_j \le u)=F_{w_j } (u),\, u>0\), where \(F_w (u)\) is a c.d.f. defined on \((0,\infty )\) for any given \(w\) belonging to some set. The scales \(\{w_j ,j\in S\}\) represent the deterministic parts of the preference representation. In empirical applications, they will typically be specified as parametric functions of individual characteristics and alternative-specific attributes.

Let \(C\) be a finite subset of \(S\). Then the random utility model is a Luce model whenever

$$\begin{aligned} P_C (j)\equiv P\left( U_j =\mathop {\max }\limits _{k\in C} U_k\right) =\frac{w_j }{\sum \limits _{k\in C} {w_k } }. \end{aligned}$$
(1)

The special case where \(F_{w_j } (u)=\exp (-w_j /u)\) corresponds to the multiplicative random utility representation, \(U_j =w_j \varepsilon _j \) where \(\varepsilon _j\) has type I extreme value c.d.f. \(\exp (-1/u)\). The multiplicative representation is equivalent to the additive representation \(\tilde{U}_j =v_j +\eta _j\) where \(\tilde{U}_j =\log U_j , \quad v_j =\log w_j \) and \(\eta _j =\log \varepsilon _j \). It follows readily that \(\eta _j\) has type III extreme value c.d.f. \(\exp (-\exp (-u))\). It is well known that the latter specification implies (1), see for example McFadden (1973).\(^{1}\)
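The claim that the multiplicative representation implies (1) can be illustrated by simulation. The following is a minimal sketch, not part of the original argument: the function names, the weights, the seed, and the number of draws are all illustrative assumptions. It draws \(\varepsilon _j\) by inverse-transform sampling from the c.d.f. \(\exp (-1/u)\) and compares the simulated choice frequencies with \(w_j /\sum _k w_k\).

```python
import math
import random


def draw_eps(rng):
    """Draw eps with c.d.f. exp(-1/u), u > 0, by inverse-transform sampling."""
    v = rng.random()
    while v == 0.0:  # guard against log(0); random() never returns 1.0
        v = rng.random()
    return -1.0 / math.log(v)


def simulate_choice_frequencies(weights, n_draws=100_000, seed=0):
    """Estimate P_C(j) when U_j = w_j * eps_j with independent eps_j."""
    rng = random.Random(seed)
    counts = [0] * len(weights)
    for _ in range(n_draws):
        utilities = [w * draw_eps(rng) for w in weights]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]


# Illustrative (assumed) scales w_j for a choice set with three alternatives.
weights = [1.0, 2.0, 3.0]
estimated = simulate_choice_frequencies(weights)
luce = [w / sum(weights) for w in weights]

for w, est, exact in zip(weights, estimated, luce):
    print(f"w={w}: simulated={est:.4f}, Luce formula={exact:.4f}")
```

The simulated frequencies should agree with the Luce probabilities up to Monte Carlo error, which shrinks as the number of draws grows.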

Theorem 1

Assume a random utility model with independent utilities \(U_j ,j\in S\), where \(P(U_j \le u)=F_{w_j } (u)\) for each given \(w_j \in A\), where \(A\) is a set containing at least two positive real numbers. Furthermore, assume that \(F_w (u)\) is strictly monotone and continuously differentiable in \(u\in (0,\infty )\). Then (1) holds for any selection \(\{w_j \in A,j\in S\}\) if and only if \(U_j \) has the same distribution as \(H(w_j \varepsilon _j )\), where \(H\) is an arbitrary strictly increasing mapping from \((0,\infty )\) to some suitable set and \(\varepsilon _j ,j\in S\), are independent extreme value distributed random variables with c.d.f. \(\exp (-1/u), u>0\).

Proof

Consider first the “if” part. The utility representation \(\{H(w_j \varepsilon _j )\}\) is equivalent to the multiplicative representation \(\{w_j \varepsilon _j \}\) because \(H\) is strictly increasing and therefore does not alter the ranking of the alternatives. Moreover, the latter is equivalent to the additive representation \(\{\log w_j +\log \varepsilon _j \}\). If \(\varepsilon _j \) has c.d.f. \(\exp (-1/u)\), it follows readily that \(\log \varepsilon _j \) has c.d.f. \(\exp (-\exp (-u))\). Consequently, the Luce choice model follows from well-known results, see for example McFadden (1973).
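For readers who prefer a direct verification, the “if” part can also be checked by computing the choice probability for the multiplicative representation, with \(F_w (u)=\exp (-w/u)\). The following is a sketch in which the shorthand \(W_C =\sum _{k\in C} w_k\) and the substitution \(t=1/u\) are introduced here purely for convenience:

$$\begin{aligned} P_C (j)&=\int _0^\infty F'_{w_j }(u)\prod _{k\in C\setminus \{j\}} F_{w_k }(u)\,du =\int _0^\infty \frac{w_j }{u^{2}}\exp \left( -\frac{W_C }{u}\right) du \\&= w_j \int _0^\infty \exp (-W_C \,t)\,dt =\frac{w_j }{W_C }. \end{aligned}$$

Since a strictly increasing \(H\) leaves the ranking of the utilities unchanged, the same probabilities are obtained for \(\{H(w_j \varepsilon _j )\}\).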

Consider next the “only if” part. Let \(C=\{1,2,\ldots ,m+1\}\) where \(m\) is any positive integer. The corresponding probability of selecting alternative \(j\) can then be expressed as

$$\begin{aligned} P_C (j)=P\left( U_j =\max _{k\in C} U_k \right) =\int _{R_+ } F'_{w_j } (u)\prod _{k=1,\,k\ne j}^{m+1} F_{w_k } (u)\,du =\frac{w_j }{\sum _{k=1}^{m+1} w_k }. \end{aligned}$$
(2)

With no loss of generality assume that \(1\in A\). Consider the special case with \(w_1 =w\) and \(w_k =1\), for \(k = 2, 3,\ldots , m+1\). Then the choice probability \(P_C (1)\) reduces to

$$\begin{aligned} P_C (1)=\frac{w}{w+m}=\int _{R_+ } {{F}{^\prime }_w } (u)F_1 (u)^{m}du. \end{aligned}$$
(3)

Since \(F_1 (u)\) is strictly increasing and continuously differentiable, it is invertible and its inverse is also continuously differentiable. By the change of variable \(y=F_1 (u)\), \(dy=F'_1 (u)du\), the integral in (3) transforms to

$$\begin{aligned} \frac{w}{w+m}=\int \limits _0^1 {{\psi }'_w (y)y^{m}dy,} \end{aligned}$$
(4)

where \(\psi _w (y)=F_w (F_1 ^{-1}(y))\), which for each given \(w\) is a c.d.f. on [0,1]. Equation (4) must hold for all \(m = 1, 2,\ldots \); for \(m=0\) it holds trivially because \(\psi _w\) is a c.d.f. on [0,1]. Equation (4) thus specifies all the moments of \(\psi _w\), which corresponds to Hausdorff’s moment problem (see Feller 1971 vol. II, pp. 224–225). Specifically, Hausdorff proved that a distribution on [0,1] is uniquely determined by its moments, so \(\psi _w\) is uniquely determined provided (4) holds for every non-negative integer \(m\). Note next that \(\psi _w (y)=y^{w}\) is a solution to (4). Hence, \(\psi _w (y)=y^{w}\) is the only possible solution.\(^{2}\) Thus, one must have \(F_w (F_1^{-1} (y))=y^{w}\), which yields \(F_w =F_1 ^{w}\).
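That \(\psi _w (y)=y^{w}\) satisfies (4) follows from \(\int _0^1 wy^{w-1}y^{m}dy=w/(w+m)\). As a quick numerical sanity check, the sketch below approximates the integral with a midpoint rule. The function name, the grid size, and the particular values of \(w\) and \(m\) are illustrative assumptions; values \(w\ge 1\) are used only to keep the integrand bounded for the quadrature.

```python
def moment_integral(w, m, n=20_000):
    """Midpoint-rule approximation of the integral of w*y**(w-1) * y**m over [0, 1]."""
    h = 1.0 / n
    return h * sum(w * ((i + 0.5) * h) ** (w - 1 + m) for i in range(n))


# Compare the numerical integral with the closed form w / (w + m).
checks = {
    (w, m): (moment_integral(w, m), w / (w + m))
    for w in (1.5, 2.0, 3.5)
    for m in (1, 2, 5)
}

for (w, m), (numeric, exact) in checks.items():
    print(f"w={w}, m={m}: integral={numeric:.6f}, w/(w+m)={exact:.6f}")
```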

Next, define \(H^{-1}(x)=-1/\log F_1 (x)\). Since \(0<F_1 (u)<1\) for \(u\in (0,\infty )\), it follows that \(H^{-1}(x)\) is positive. Furthermore, it is easily verified that \(H^{-1}(x)\) is strictly increasing, which implies that \(H(u)\) is also strictly increasing. Thus, with \(\varepsilon _j \) distributed according to the c.d.f. \(\exp (-1/u), \, u>0\), we obtain that

$$\begin{aligned} P\left( H(w_j \varepsilon _j )\le u\right)&= P\left( \varepsilon _j \le H^{-1}(u)/w_j \right) =\exp \left( -w_j /H^{-1}(u)\right) \\&= \left( \exp \left( -1/H^{-1}(u)\right) \right) ^{w_j }=F_1 ^{w_j }(u)=F_{w_j } (u) \end{aligned}$$

which shows that \(U_j \) has the same distribution as \(H(w_j \varepsilon _j )\). This completes the proof of Theorem 1. \(\square \)
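To make the construction of \(H\) concrete, consider a hypothetical baseline c.d.f. \(F_1 (u)=\exp (-1/u^{2})\), chosen here only for illustration and not taken from the paper. Then \(H^{-1}(x)=-1/\log F_1 (x)=x^{2}\), so \(H(u)=\sqrt{u}\). The sketch below, whose function names, parameter values, and seed are likewise assumptions, simulates \(H(w\varepsilon )\) and compares its empirical c.d.f. at a point \(u\) with \(F_1 (u)^{w}\).

```python
import math
import random


def F1(u):
    """Hypothetical baseline c.d.f., chosen only for illustration."""
    return math.exp(-1.0 / u**2)


def H(x):
    """Inverse of H^{-1}(x) = -1/log F1(x) = x**2 for this choice of F1."""
    return math.sqrt(x)


def empirical_cdf(w, u, n_draws=100_000, seed=1):
    """Estimate P(H(w * eps) <= u) where eps has c.d.f. exp(-1/u)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        v = rng.random()
        while v == 0.0:  # guard against log(0)
            v = rng.random()
        eps = -1.0 / math.log(v)  # inverse-transform draw of eps
        if H(w * eps) <= u:
            hits += 1
    return hits / n_draws


w, u = 2.5, 1.2  # illustrative scale and evaluation point
estimate = empirical_cdf(w, u)
target = F1(u) ** w
print(f"empirical={estimate:.4f}, F_1(u)^w={target:.4f}")
```

The two numbers should agree up to Monte Carlo error, in line with \(F_w =F_1 ^{w}\).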