Introduction

Standard discrete choice models represent the decision making process as an interaction between measured attributes of the alternatives (and possibly of the decision maker) and estimated sensitivities of the decision maker. This simplified approach has been heavily criticised by behavioural scientists as it often neglects important idiosyncratic aspects of behaviour and cannot deal with apparently irrational decisions (see for example Gärling 1998). Meanwhile, researchers have increasingly recognised that decision makers differ significantly from one another, and the treatment of differences in sensitivities (and hence choices) across individual decision makers is one of the main areas of interest in choice modelling. While these differences can often be directly linked to socio-demographic characteristics such as age and income, a case has repeatedly been made that underlying attitudes and perceptions may be equally important predictors for these differences, notwithstanding that these attitudes and perceptions may once again be explained by socio-demographic characteristics.

The main issue facing analysts in this context is that while socio-demographic characteristics are directly measurable, the same does not apply to underlying perceptions and attitudes, which are unobserved in the same way that respondent specific sensitivities are not known. In other words, these latent variables are factors that cannot be observed directly; rather, they can at best be inferred from other variables called indicators (Golob 2001; Choo and Mokhtarian 2004). Here, psychometric indicators (typically on a Likert scale) such as responses to survey questions about attitudes, perceptions or decision-making protocols are used as manifestations of the underlying latent attitudes.

We specifically define attitudes and perceptions as follows. Attitudes reflect latent variables corresponding to the characteristics of the decision-maker and reflect individuals’ needs, values, tastes, and capabilities. Attitudes are formed over time and are affected by experience and external factors including socio-economic characteristics (Walker and Ben-Akiva 2002). Perceptions measure the individual’s cognitive capacity to represent and evaluate the levels of the attributes of different alternatives. Perceptions are relevant because the choice process depends on how attribute levels are perceived by the individual beliefs of a specific consumer (Bolduc and Daziano 2008).

The focus of this article is specifically on the incorporation of individuals’ attitudes in discrete choice models. Latent attitudes may play as much of a role in shaping choice as the attributes of the alternatives (Ashok et al. 2002). Therefore, extending choice models to include latent attitudes can lead to a better understanding of the choice processes. Moreover, it is expected that these enhanced models could provide greater explanatory power (Bolduc et al. 2005; Temme et al. 2008).

Early efforts used Structural Equation Models for jointly modelling choices and attitudes. An excellent review of this work is given by Golob (2003). However, at the time of that paper there remained severe software limitations, which implied that both choice variables and attitudinal indicators could be modelled only by linear regression techniques (e.g. as in Golob et al. (1997)). This approach, which is in any case limited to binary choice, must be considered methodologically unsatisfactory.

The use of attitudes in discrete choice models, in particular, is not new, and a number of different approaches have been used in past work. The most direct approach relies on using choice models with indicators. In this case, indicators of the underlying latent variable are treated as error-free explanatory predictors of choice (see Fig. 1a). In other words, rather than correctly treating indicators as functions of underlying attitudes, they are treated as direct measures of the attitudes. This approach has several disadvantages. First, strong agreement with an attitudinal statement does not necessarily translate into a causal relationship with choice. Additionally, indicators are highly dependent on the phrasing of the survey, and furthermore they are not available for forecasting. Incorporating the indicators of latent variables as explanatory variables also ignores the fact that latent variables contain measurement error, and can thus lead to inconsistent estimates (Ashok et al. 2002). Finally, indicators are arguably correlated with the error of the choice model, i.e. there are unobserved effects that influence both a respondent’s choice and his/her responses to indicator questions. This creates a risk of endogeneity bias.

Fig. 1 Incorporating latent variables in discrete choice models using: (a) indicators entered directly into the choice model, (b) factor analysis, and (c) choice model with latent attributes (taken from Ben-Akiva et al. (1999))

An alternative is a sequential estimation approach using factor analysis or structural equation modelling (SEM) for the latent variable component and discrete choice models for the choice component of the model. Factor analysis can be either confirmatory (CFA) or confirmatory with covariates—that is a Multiple Indicator Multiple Cause (MIMIC) model. The factor analysis approach involves analysis of the interrelationships between attitudinal indicators and a statistical procedure that transforms the correlated indicators into a smaller group of uncorrelated (latent) variables called principal components or factors. This procedure requires a single measurement equation. On the other hand, SEM involves two parts: a measurement model and a structural model. SEMs capture three relationships: the relationship among factors (latent variables), the relationship among observed variables and the relationship between factors and observed variables that are not factor indicators. As a next step, the latent variables are entered in the utility equations (see Fig. 1b) of the choice models. The latent variables contain measurement error, and in order to obtain consistent estimates, the choice probability must be integrated over the distribution of latent variables, where the distribution of the factors is obtained from the factor analysis model. This method recognises that both the choice and the response to the indicator questions are driven by the same underlying latent variable. The key disadvantage of this approach is that the latent estimates are inefficient, i.e. they are derived from the attitudinal information only and do not take account of actual choices that the respondent has made (see for example Morikawa et al. (2002)).

Past work has also made use of internal market analysis, in which both latent attributes of the alternatives and consumer preferences are inferred from preference or choice data. In this restrictive approach (Fig. 1c), the observed choices are the only indicators used, and therefore the latent attributes are alternative specific and do not vary among individuals in a market segment (see for example: Elrod (1988); Elrod and Keane (1995)).

With a view to improving on the above methods, recent research efforts have led to the formulation of a combined model structure offering a general treatment of the inclusion of latent variables in discrete choice models. In particular, this model framework is comprised of two components: a discrete choice model and a latent variable model (Fig. 2). In the remainder of this article, we will make use of the name coined for this structure by Bolduc et al. (2005), who refer to it as the Integrated Choice and Latent Variable (ICLV) model, although this term postdates some of the earlier developments. Before proceeding to a more detailed discussion of the ICLV structure, Table 1 provides a summary of previous efforts to incorporate latent variables in discrete choice models.

Fig. 2 The integrated latent variable and discrete choice-modelling framework (Sources: Walker and Ben-Akiva 2002; Bolduc et al. 2005)

Table 1 Summary of application-studies employing the ICLV models

The ICLV structure can add to the realism of the model because it explicitly describes how perceptions and attitudes affect choices, as well as using information on observed choices to inform the estimation of the latent attitudinal variables (as opposed to simply using the latent variables as input into the choice model). In the discrete choice model component, alternatives’ utilities may depend on both observed and latent explanatory variables of the options and decision makers. At the same time, these latent variables help explain the responses to observed indicators (that represent manifestations of the latent constructs), while possibly also being functions of explanatory variables (Johansson et al. 2006). In terms of modelling, the latent variables are viewed as structural variables which are related to other variables through a structural latent variable model framework (Bolduc et al. 2005). The latent-variable part of the model captures the relationships between latent variables and MIMIC-type models simultaneously, in which observed exogenous variables influence the latent variables (Temme et al. 2008).

The structural latent variable model formulation incorporates a sub-model that uses the latent variables as explanatory variables in a model in which the dependent variables are answers to questions of a survey (the indicators). The complete model is composed of a group of structural equations (structural model) and a group of measurement relationships (measurement model). The structural model describes the latent variables in terms of observable exogenous variables as well as specifying the utility functions on the basis of observable exogenous variables and the latent variables. The measurement model links latent variables to the indicators. Estimation of the parameters in the full system can be done sequentially (see Ashok et al. 2002; Johansson et al. 2006; Temme et al. 2008) or jointly, i.e. full information (see Bolduc et al. 2005; Morikawa et al. 2002; Walker and Ben-Akiva 2002). Sequential estimation provides consistency while joint (simultaneous) estimation adds efficiency (Bolduc et al. 2005).

Despite their inherent appeal, latent attitude models have thus far only been used rather rarely in applied transport research (and elsewhere). One possible reason for this is the way in which the theoretical work has been spread across numerous disciplines. The first aim of the present article is thus to provide a comprehensive overview of the methodological framework. Next, this article makes a methodological extension to previous work on ICLV models by Ben-Akiva et al. (1999) and Bolduc et al. (2005) by incorporating ordered-logit choice models for the measurement equations of the attitudinal variables. Seemingly unlike much other latent variable choice modelling work, we also explicitly account for the repeated choice nature of the (stated preference) data. As an additional contribution, we present some evidence from a comparison of two commonly used normalisations of ICLV models. In line with a small but growing subset of other studies, we use simultaneous rather than sequential estimation. The empirical application of the models is also novel, looking at the use of attitudinal variables in the context of a stated choice survey on UK rail passengers’ trade-offs across privacy, liberty and security.

The remainder of this article is organised as follows. The following section presents the methodological framework used in the present work, including the extension to an ordered model for the attitudinal responses. We then present the choice context used for the empirical example, with model specification and estimation results being discussed next. Finally, we present the conclusions of the work.

Methodology

Outline of the model

The situation we seek to model is one in which we observe stated or actual choices by surveying respondents who also record responses to attitudinal questions. We hypothesise that both choice and attitudinal responses are influenced by latent variables and we seek to model the choice and attitudinal responses together to give more insight into the processes that motivate respondents’ behaviour. Three sets of relationships therefore have to be defined, as follows. We note that in the following specification we have not used an index for the respondent as it is not necessary for the present discussion. However, it should be understood that all of the variables, except the parameters to be estimated, are in principle specific to respondents.

Choice among the set J of alternatives is modelled by assuming travellers maximise utility, which we assume to be linear in parameters:

$$ k = \arg \max \left( {(U_{j} = aX_{j} + cY_{j} Z + \nu_{j} )|j \in J} \right) $$
(1)

here, k refers to the chosen alternative, \( X_{j} \) is a vector of M attributes of alternative j, while Z is a vector of L latent variables. The vector a measures the impact of the attributes in \( X_{j} \) on the utility of alternative j. The impact of the latent variables on the utility of alternative j is controlled by \( Y_{j} \). Here, \( Y_{j} \)(N, L) is a matrix of variables indicating whether a given coefficient in the vector c applies to a given latent variable in the utility function for alternative j. The entries in the matrix \( Y_{j} \) may be dummies or data values for socio-economic or alternative attributes or combinations of these, and we have N different interactions in c. As an example, if the latent attitude p is to be interacted with the sensitivity to a given attribute, with this interaction captured in the qth element in c, then \( Y_{j,q,p} \) would be given by the value of that attribute for alternative j. If, on the other hand, the qth element was to capture the absolute impact of the pth latent variable on the utility of alternative r (i.e. on its alternative specific constant), then \( Y_{r,q,p} \) would be equal to 1, and \( Y_{j,q,p} \) would be equal to 0 for all j ≠ r. Finally, \( \nu_{j} \) is a random component of the utility function. The scale of U is fixed by the distributional assumptions made for ν, which are discussed below. The coefficients a and c require estimation, together with any parameters needed to define the distribution of ν.
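As a rough illustration of how the matrix \( Y_{j} \) routes the coefficients in c to the latent variables, the following sketch evaluates the systematic part of Eq. 1 for two alternatives. All numerical values, the attribute names (cost, time) and the function name `utility` are hypothetical and are not taken from the study.

```python
def utility(a, X_j, c, Y_j, Z):
    """Systematic part of U_j in Eq. 1: a.X_j + c.(Y_j Z), without the error nu_j."""
    attr_part = sum(a_m * x_m for a_m, x_m in zip(a, X_j))
    # Y_j has N rows (one per coefficient in c) and L columns (one per latent variable).
    YZ = [sum(y_nl * z_l for y_nl, z_l in zip(row, Z)) for row in Y_j]
    latent_part = sum(c_n * yz_n for c_n, yz_n in zip(c, YZ))
    return attr_part + latent_part

# Hypothetical example: M = 2 attributes (cost, time), L = 1 latent variable, N = 2.
a = [-0.10, -0.05]
c = [0.02, 0.50]       # c[0]: interaction with cost; c[1]: shift in the constant of alt 1
Z = [1.5]

X_1 = [4.0, 30.0]
Y_1 = [[4.0], [1.0]]   # row 0 holds the cost of alt 1; row 1 is the dummy (1 for alt 1)
X_2 = [3.0, 40.0]
Y_2 = [[3.0], [0.0]]   # row 0 holds the cost of alt 2; row 1 is 0 for all other alternatives

V_1 = utility(a, X_1, c, Y_1, Z)
V_2 = utility(a, X_2, c, Y_2, Z)
```

In this sketch the first row of each \( Y_{j} \) carries an attribute value (an interaction), while the second row carries a dummy (an alternative specific effect), mirroring the two cases described in the text.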

Attitudinal responses are modelled by a series of relationships known as the ‘measurement’ equations, which the literature generally assumes to be linear:

$$ y_{s} = \delta_{s} + d_{s} Z + \varepsilon_{s} $$
(2)

here, \( y_{s} \) gives the observed response to the sth attitudinal indicator (out of S). The impact of the latent variables on the value of the indicator is given by the estimated vector of parameters \( d_{s} \) (specific to a given indicator), which may contain zero values when some latent variables are deemed (or found) not to have any impact on a given indicator. The reason for making d specific to a given indicator s is that while a and c in Eq. 1 will have some elements shared across alternatives, the impacts of the latent variables in the measurement equations will almost surely be different across indicators. Finally, \( \varepsilon_{s} \) gives the random component of the attitudinal response. Each of these equations will require a constant \( \delta_{s} \), because y is measured on an arbitrary scale (e.g. 1–5); alternatively, the mean value of each \( y_{s} \) may be subtracted from the nominal values, so that the mean does not have to be estimated with the other parameters.

Latent variables are assumed to be determined by a series of ‘structural’ relationships, also assumed to be linear:

$$ Z_{l} = bW_{l} + \psi_{l} $$
(3)

here, W(L, Q) are socio-economic variables relating to the latent variables, where it is necessary to specify sufficient unit values in W so that there is effectively a constant in the equation for each Z; this avoids Z being determined by the arbitrary measurement of W. The impact of the elements in the vector \( W_{l} \) on the latent variable \( Z_{l} \) is estimated by the vector b, while ψ(L) is the error in the latent variable equation.
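The structural relationship in Eq. 3 can be sketched as follows. The layout of W (unit values providing a constant for each latent variable, followed by an age effect per latent variable), the standard deviations `h` and all numbers are assumptions made purely for illustration.

```python
import random

def draw_latent(b, W, h, rng):
    """Draw Z_l = b.W_l + psi_l for each latent variable l (Eq. 3).

    psi_l is taken here as h_l times an independent standard normal draw.
    """
    Z = []
    for W_l, h_l in zip(W, h):
        mean_l = sum(b_q * w_q for b_q, w_q in zip(b, W_l))
        Z.append(mean_l + h_l * rng.gauss(0.0, 1.0))
    return Z

# Hypothetical setup: L = 2 latent variables, Q = 4 parameters in b.
b = [0.5, -0.2, 0.01, 0.03]    # two constants, then one age effect per latent variable
age = 40.0
W = [
    [1.0, 0.0, age, 0.0],      # unit value selects the constant for Z_1
    [0.0, 1.0, 0.0, age],      # unit value selects the constant for Z_2
]

rng = random.Random(1)
Z_mean = draw_latent(b, W, h=[0.0, 0.0], rng=rng)   # no noise: structural means only
Z_draw = draw_latent(b, W, h=[1.0, 1.0], rng=rng)   # one random draw of Z
```

With `h` set to zero the function returns only the deterministic part bW, which makes it easy to check how the unit values in W act as constants.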

The use of this model entails the estimation of a number of vectors of parameters, namely:

  • a(M), giving the impact of measured attributes on utilities;

  • b(Q), giving the impact of socio-demographics on latent variables;

  • c(N), giving the impact of latent variables on utilities, where the N rows allow for example for different interactions with different attributes, as well as alternative specific impacts; and

  • \( d_{s} \)(L), giving the impact of latent variables on the indicators, with a different d for each indicator.

One final but important point needs discussing, namely the normalisation of the scale for the measurement equations (i.e. Eq. 2). Two normalisations have been discussed in the literature. In the approach taken by Ben-Akiva et al. (1999), the scale of Z is fixed by constraints on the elements in \( d_{s} \). Specifically, combining \( d_{s} \), with s = 1,…,S, into a matrix d(S,L), the impact of each of the latent variables is normalised for one of the attitudinal indicators, i.e. one of the non-zero values in each of the L columns is normalised. The variance of ψ then needs to be estimated. In the approach taken by Bolduc et al. (2005), the variance of ψ is normalised to 1, and all entries in d are estimated. In either case the scale of ε, i.e. the standard deviation of the error in the measurement equations, needs to be estimated. In theory, the two normalisations are equivalent, but to our knowledge, this has not been shown in practice. We thus consider both of these normalisations in the initial stages of the modelling.
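The theoretical equivalence of the two normalisations can be illustrated numerically for a single latent variable. The sketch below assumes that Z = bW + hυ with υ standard normal, and that moving from the first normalisation (one element of d fixed to 1, h free) to the second (h fixed to 1, all d free) amounts to multiplying d by h and dividing b by h; the parameter values are hypothetical.

```python
def latent_value(b, W, h, upsilon):
    """Z = b.W + h * upsilon for a single latent variable."""
    return sum(b_q * w_q for b_q, w_q in zip(b, W)) + h * upsilon

def indicator_terms(d, b, W, h, upsilon):
    """Latent-variable contribution d_s * Z to each indicator s."""
    Z = latent_value(b, W, h, upsilon)
    return [d_s * Z for d_s in d]

W = [1.0, 35.0]                        # unit value (constant) and a hypothetical covariate

# Normalisation A: first element of d fixed to 1, standard deviation h of psi free.
d_A, b_A, h_A = [1.0, 0.7, -0.4], [0.3, 0.02], 2.0

# Normalisation B: standard deviation of psi fixed to 1, all elements of d free.
d_B = [d_s * h_A for d_s in d_A]       # d scaled up by h_A
b_B = [b_q / h_A for b_q in b_A]       # b scaled down by h_A
h_B = 1.0

# For any draw of upsilon, both parameterisations imply the same d_s * Z.
max_gap = max(
    abs(t_A - t_B)
    for u in (-1.5, 0.0, 0.8)
    for t_A, t_B in zip(indicator_terms(d_A, b_A, W, h_A, u),
                        indicator_terms(d_B, b_B, W, h_B, u))
)
```

The same rescaling would have to be applied to c in the utility functions; whether the two normalisations also behave identically in estimation is an empirical matter that this sketch does not address.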

Assumptions

The objective is to estimate the vectors of parameters a, b, c, d as well as the parameters of the distributions of the random components ν, ɛ, ψ. Since we have required constants in the equations, it is reasonable to assume that these random components have mean zero (or a standard mean value). This means that we are concerned only with the covariance matrices of the random components.

We therefore have to introduce three further parameters of the model to be estimated:

Ψ: the covariance matrix of ψ;

E: the covariance matrix of ε; and

Ξ: the covariance matrix of ξ,

where ξ is an additional error term which will be defined formally below. We propose to estimate these three parameters along with a, b, c, d by maximum likelihood.

Further, it is reasonable to assume (at least in the first instance) that ν, ɛ, ψ are mutually independent.

Assumption: ν, ɛ, ψ are mutually independent.

The three linear equations in the previous section represent three basic assumptions on which the modelling is based. Generally, we are relatively happy with the assumptions of linearity relating to utility U and the latent variables Z. The same cannot be said for the attitudinal indicators. Indeed, the attitudinal responses y will usually be collected on a scale, for example from 1 to 5, and linear regression is not a correct way to model such responses, although it is common even in advanced literature (e.g. Ben-Akiva et al. 1999; Bolduc et al. 2005). For that representation, we would assume that ε has a multivariate normal distribution. This is reconsidered in the final part of this section, where we discuss the use of ordered choice models to represent the attitudinal responses.

The error ψ in the structural equation for the latent variables can most conveniently be defined to have a multivariate normal distribution with covariance matrix Ψ. As discussed above, for the Bolduc et al. (2005) normalisation, Ψ is defined to have unit variance, because this defines the scale of Z, but for the Ben-Akiva et al. (1999, 2005) normalisation, the diagonal elements of Ψ will be estimated. Again, we have not used off-diagonal elements in this matrix for the current article, but the notation leaves the possibility open.

It can clearly be seen already that the presence of the random component in the latent variables (see Eq. 3) will lead to random variations in sensitivities across respondents when latent attitudes are interacted with measured attributes in the utility functions (Eq. 1). The model thus falls into the Mixed Logit family of structures. However, it should be noted that such random variations can also be introduced independently of the latent variables by changing the variation of ν to incorporate additional randomness net of the latent variables, i.e.

$$ \nu_{j} = \xi_{j} + \eta_{j} $$
(4)

where η is i.i.d. type I extreme value (Gumbel) and ξ has some other distribution, for example multivariate normal. In this way, the model net of the latent variables Z is a mixed logit structure, as in the recent work by Yañez et al. (2010), which is however based on sequential estimation. This can clearly also be exploited to allow for correlations between alternatives (by allowing some elements in ξ to be shared by some alternatives). Similarly, it would however also be possible to specify the underlying choice model to be a Nested Logit or other advanced nesting structure.

In the previous discussion, we have suggested that most often the random variables can be considered to be independent, i.e. there are no off-diagonal elements. This feature simplifies the analysis considerably. Bolduc et al. refer to these matrices as “nuisance parameters”. While this is a specific technical term, it understates the importance of the parameters, which are quite interesting from the point of view of understanding and predicting behaviour.

A convenient notation is to define \x to be an n × n matrix whose off-diagonal elements are zero and whose diagonal elements are given by the vector x of dimension n.

Assumptions: ψ, ɛ, ξ are distributed multivariate normal and η is i.i.d. type I extreme value (Gumbel).

  • Ψ, E, Ξ are diagonal matrices; this leads to:

  • \( \Psi = \backslash h^{2}, \)

  • \( E = \backslash g^{2} \) and

  • \( \Xi = \backslash f^{2}, \)

where f, g, h are vectors of standard errors (to be estimated).

If we assume that the choices are independent of each other, then there are no further complications. Indeed, if we have a single choice per respondent, the choice probability for given values of Z and ξ can be expressed as,

$$ p(k|Z,\xi ) = \frac{{\exp \left( {aX_{k} + cY_{k} Z + \xi_{k} } \right)}}{{\sum\nolimits_{j} {\exp \left( {aX_{j} + cY_{j} Z + \xi_{j} } \right)} }}, $$
(5)

owing to the type I extreme value (Gumbel) assumption made for η. However, with repeated observations from each individual, such as in Stated Choice experiments, the probability for the sequence of choices \( k = \{ k_{1}, \ldots, k_{T} \} \), conditional on Z and ξ, is given by:

$$ p(k|Z,\xi ) = \prod\limits_{t} {\frac{{\exp \left( {aX_{kt} + cY_{kt} Z_{t} + \xi_{kt} } \right)}}{{\sum\nolimits_{j} {\exp \left( {aX_{jt} + cY_{jt} Z_{t} + \xi_{jt} } \right)} }}} , $$
(6)

where the added subscript t is for choice tasks.
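The conditional probabilities in Eqs. 5 and 6 can be sketched as below, taking as input the utilities net of the Gumbel term (i.e. already including the contributions of Z and ξ). The function names are illustrative only.

```python
import math

def logit_prob(V, k):
    """Eq. 5: probability of alternative k given utilities V net of the Gumbel term."""
    v_max = max(V)                               # subtract the maximum for numerical stability
    exp_v = [math.exp(v - v_max) for v in V]
    return exp_v[k] / sum(exp_v)

def sequence_prob(V_by_task, chosen):
    """Eq. 6: product over choice tasks t of the logit probability of the chosen alternative."""
    p = 1.0
    for V_t, k_t in zip(V_by_task, chosen):
        p *= logit_prob(V_t, k_t)
    return p

# Three tasks with two equally attractive alternatives each: every choice has probability 0.5.
p_seq = sequence_prob([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], chosen=[0, 1, 0])
```

Setting the number of tasks to one recovers the single-choice case of Eq. 5.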

In many models the values of ξ and Z will not vary between the choice occasions t for an individual and in those cases the notation could be simplified accordingly. To simplify the notation for this article we shall write the utility for alternative j in choice task t net of the type I extreme value term as \( V_{jt} = aX_{jt} + cY_{jt} Z + \xi_{j} \), where we thus assume that \( \xi_{j} \) and Z stay constant across choice tasks. The unconditional choice probability for either single or repeated choices can now be written as:

$$ P(k) = \int\limits_{Z} {\int\limits_{\xi } {p\left( {k|V_{jt} = aX_{jt} + cY_{jt} Z + \xi_{j} \;\forall j,t} \right)dF_{\xi } (\xi )dF_{Z} (Z)} } $$
(7)

where \( F_{\xi} \), \( F_{Z} \) are the distributions of ξ, Z respectively and with the understanding that either a single choice or a choice sequence can be represented by p (i.e. T is possibly equal to 1). This is a Mixed Logit structure with the additional role for the latent variable Z. With repeated choice data such as used in this article, we use Eq. 6 inside Eq. 7; the integration is carried out at the level of a sequence of choices (rather than individual choices). The correlation may be induced by the formulation of ξ but also, and specifically to the latent variable model, correlation is induced by Z, both in its deterministic and its random components, with the same value for Z applying to all choices for a given respondent.

Maximum likelihood estimation

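The equation for the attitudinal indicators was given above (Eq. 2) as a linear regression; omitting the constant \( \delta_{s} \), i.e. working with mean-centred indicators as discussed there, it reads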

$$ y_{s} = d_{s} Z + \varepsilon_{s} $$
(8)

Since \( \varepsilon_{s} \) is distributed normally with mean zero and standard error \( g_{s} \), the likelihood of the observation of \( y_{s} \), conditional on a value of Z, is proportional to

$$ P(y_{s} |Z) = \frac{1}{{g_{s} }}n\left( {\frac{{y_{s} - d_{s} Z}}{{g_{s} }}} \right) $$
(9)

where n represents the standard normal (0, 1) frequency function:

$$ n\left( x \right) = \frac{1}{{\sqrt {2\pi } }}\exp \left( { - \frac{{x^{2} }}{2}} \right) $$
(10)

Further, the likelihood of the sequence of values \( y = \{ y_{1}, \ldots, y_{S} \} \) is given by the integral over the distribution of Z of the products of the likelihoods of the separate \( y_{s} \) values

$$ P\left( y \right) = \int\limits_{Z} {\prod\limits_{s} {\frac{1}{{g_{s} }}} } n\left( {\frac{{y_{s} - d_{s} Z}}{{g_{s} }}} \right)dF_{Z} (Z) $$
(11)
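For a given value of Z, the integrand of Eq. 11 (the product of the terms in Eq. 9) can be evaluated as in the following sketch. The indicators are assumed to be mean-centred, so that no constant appears, and the function names are illustrative.

```python
import math

def n(x):
    """Standard normal frequency function (Eq. 10)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def indicator_likelihood(y, d, Z, g):
    """Product over indicators s of (1/g_s) n((y_s - d_s.Z) / g_s), conditional on Z."""
    like = 1.0
    for y_s, d_s, g_s in zip(y, d, g):
        mean_s = sum(d_sl * z_l for d_sl, z_l in zip(d_s, Z))   # d_s is a vector of length L
        like *= n((y_s - mean_s) / g_s) / g_s
    return like

# Hypothetical example: S = 2 indicators, L = 1 latent variable.
like = indicator_likelihood(y=[0.5, -1.0], d=[[1.0], [-2.0]], Z=[0.5], g=[1.0, 2.0])
```

In the example both residuals are zero, so the result is simply n(0) multiplied by n(0)/2.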

The key step in developing the estimation procedure is that the likelihood of jointly observing choice k and indicator y is given by the product of the likelihoods of each observation, i.e. the product of the different choices, as well as the responses to the attitudinal questions. Because of the assumptions we have made about independence, we can write

$$ P\left( {k,y} \right) = \int\limits_{Z} {\int\limits_{\xi } {p\left( {k|V_{jt} = aX_{jt} + cY_{jt} Z + \xi_{j} \quad \forall j,t} \right)\prod\limits_{s} {\frac{1}{{g_{s} }}n\left( {\frac{{y_{s} - d_{s} Z}}{{g_{s} }}} \right)} dF_{\xi } (\xi )dF_{Z} (Z)} } $$
(12)

Here, \( p\left( {k|V_{jt} = aX_{jt} + cY_{jt} Z + \xi_{j} \quad \forall j,t} \right) \) refers to a sequence of choices. Each choice made by an individual in the sequence k is influenced by the same set of latent variables Z, thus inducing a correlation between those choices. This is equivalent to the standard mixed logit approach of allowing coefficient values (i.e. in effect random variations around the fixed values in a) to vary across respondents but stay constant across choices for the same respondent. Such random heterogeneity not linked to latent variables is also possible within this more general model (accommodated in \( \xi_{j} \)), but we have not used this possibility in the current study.

The above notation can be extended to take account of the structure of \( Z = bW + \psi \) to give

$$ P\left( {k,y} \right) = \int\limits_{\psi } {\int\limits_{\xi } {p\left( {k|V_{jt} = aX_{jt} + cY_{jt} (bW + \psi ) + \xi_{j} \quad \forall j,t} \right)\prod\limits_{s} {\frac{1}{{g_{s} }}n\left( {\frac{{y_{s} - d_{s} (bW + \psi )}}{{g_{s} }}} \right)dF_{\xi } (\xi )dF_{\psi } (\psi )} } } $$
(13)

If the matrices Ψ, Ξ had off-diagonal elements, then a Cholesky transformation would be necessary to set up a sampling scheme to estimate the model, as described by Bolduc et al. (2005). However, for the present article the matrices have been assumed to be diagonal, with standard errors h for Ψ and f for Ξ. Then we can write

$$ P\left( {k,y} \right) = \int\limits_{\tau } {\int\limits_{\upsilon } {p\left( {k|V_{jt} = aX_{jt} + cY_{jt} (bW + h\upsilon ) + f_{j} \tau_{j} } \right)\prod\limits_{s} {\frac{1}{{g_{s} }}n\left( {\frac{{y_{s} - d_{s} (bW + h\upsilon )}}{{g_{s} }}} \right)dN(\upsilon )dN(\tau )} } } $$
(14)

where \( N(z) = \int_{ - \infty }^{z} {n(x)dx} \) is the cumulative standard normal distribution and the integration is now over independent standard normal variables υ, τ. We have to estimate a, b, c, d, f, g, h.

This integration can be made by setting up a simulation \( \tilde{P} \) of the likelihood in the usual way:

$$ \tilde{P}\left( {k,y} \right) = \frac{1}{R}\sum\limits_{r} {p\left( {k|V_{jt} = aX_{jt} + cY_{jt} (bW + h\upsilon_{r} ) + f_{j} \tau_{jr} } \right)\prod\limits_{s} {\frac{1}{{g_{s} }}n\left( {\frac{{y_{s} - d_{s} (bW + h\upsilon_{r} )}}{{g_{s} }}} \right)} } $$
(15)

where R draws, indexed by r, are made of υ, τ from independent standard normal distributions. Note that at each draw, all of the components of υ, τ are drawn. Maximisation of the simulated \( \tilde{P} \) then gives consistent estimates of the parameters a, b, c, d, f, g, h as required.
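A minimal sketch of the simulated likelihood in Eq. 15 for one respondent is given below. It uses pseudo-random draws and plain Python lists; the data layout (`X[t][j]` as the attribute vector and `Y[t][j]` as the N × L matrix for alternative j in task t, `W[l]` as the covariates for latent variable l) and the function name are assumptions for illustration, not the estimation code used in the study.

```python
import math
import random

def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def simulated_likelihood(a, b, c, d, f, g, h, X, Y, W, chosen, y, R=200, seed=0):
    """Average over R draws of (choice sequence probability) x (indicator likelihood)."""
    rng = random.Random(seed)
    J = len(X[0])                                   # number of alternatives
    total = 0.0
    for _ in range(R):
        ups = [rng.gauss(0.0, 1.0) for _ in h]      # one draw per latent variable
        tau = [rng.gauss(0.0, 1.0) for _ in range(J)]
        Z = [dot(b, W_l) + h_l * u for W_l, h_l, u in zip(W, h, ups)]
        value = 1.0
        # Choice part: the same Z and tau apply to every task of the respondent.
        for X_t, Y_t, k_t in zip(X, Y, chosen):
            V = [dot(a, X_t[j]) + dot(c, [dot(row, Z) for row in Y_t[j]]) + f[j] * tau[j]
                 for j in range(J)]
            v_max = max(V)
            exp_v = [math.exp(v - v_max) for v in V]
            value *= exp_v[k_t] / sum(exp_v)
        # Measurement part: continuous (normal) indicators, mean-centred.
        for y_s, d_s, g_s in zip(y, d, g):
            value *= normal_pdf((y_s - dot(d_s, Z)) / g_s) / g_s
        total += value
    return total / R

# Hypothetical data: one task, two alternatives, one attribute, one latent variable.
example = simulated_likelihood(
    a=[-1.0], b=[0.5], c=[0.3], d=[[2.0]], f=[0.0, 0.0], g=[1.0], h=[1.0],
    X=[[[1.0], [2.0]]], Y=[[[[1.0]], [[0.0]]]], W=[[1.0]], chosen=[0], y=[1.0])
```

In an estimation run, the logarithm of this quantity would be summed over respondents and maximised over a, b, c, d, f, g, h; the number and type of draws used here are arbitrary choices.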

Attitudinal responses as ordered choices

A more sophisticated approach to the representation of the attitudinal variables is to treat the responses as ordered choices. Recall that we supposed in the presentation above that the attitudes of respondents could be modelled as random variables as in Eq. 16, which repeats Eq. 8,

$$ y_{s} = d_{s} Z + \varepsilon_{s} $$
(16)

To apply ordered choices we treat the attitudes as latent variables x and model the probability that the attitude x lies within a particular range to give the observed response y:

$$ x_{s} = d_{s} Z + \varepsilon_{s} $$
(17)

$$ \Pr \left\{ {y_{s} = j|Z} \right\} = \int\limits_{{\mu_{{j - 1},s} }}^{{\mu_{j,s} }} {\frac{1}{{g_{s} }}\varphi \left( {\frac{{x_{s} - d_{s} Z}}{{g_{s} }}} \right)} dx_{s} = \Phi \left( {\frac{{\mu_{j,s} - d_{s} Z}}{{g_{s} }}} \right) - \Phi \left( {\frac{{\mu_{{j - 1},s} - d_{s} Z}}{{g_{s} }}} \right) $$
(18)

where response j is given to question s if \( \mu_{j - 1,s} < x_{s} \le \mu_{j,s} \); φ is the normalised frequency function for ε and Φ is its cumulative form. For consistency with Eq. 16 we might use the normal distribution in this role, but to reduce difficulty in evaluating the function (e.g. to avoid excessive random sampling) it is effective to use the logistic distribution, which has a closed cumulative form. Here, we acknowledge that more complex specifications of ordered choice models exist than the one used here (Greene and Hensher 2010); we have selected a simple model that incorporates the main effects while not unnecessarily increasing model complexity.

Because we are no longer measuring attitudes on a fixed linear scale, but expressing them as falling in arbitrary intervals on an undefined scale, we need to fix the (multiplicative) scale of x and this can most naturally be done by taking a standard variance for ɛ, i.e. eliminating g in Eq. 18.

In estimating the μ values we may note that we have to estimate one fewer value than we have possible responses. That is, if the attitudinal responses are on a five-point scale, we can take \( \mu_{0} = -\infty \), \( \mu_{5} = \infty \) and estimate the four intermediate values. Clearly we need to impose the constraint that \( \mu_{j} \ge \mu_{j - 1} \). Moreover, we need to fix the (additive) scale of μ against x, which can be done either by omitting constants from the equation for x or by including constants and setting (e.g.) \( \mu_{1} = 0 \).

The likelihood of the series of attitudinal responses can then be written

$$ \Pr \left\{ {y|Z} \right\} = \prod\limits_{s} {\left( {\Phi \left( {\mu_{{y_{s} }} - d_{s} Z} \right) - \Phi \left( {\mu_{{y_{s} - 1}} - d_{s} Z} \right)} \right)} , $$
(19)

where \( y_{s} \) gives the value observed for the sth indicator.
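The ordered specification in Eqs. 18 and 19 can be sketched as follows, taking Φ to be the standard logistic cumulative distribution (so that g is eliminated) and allowing a separate set of thresholds for each indicator. Threshold values and function names are hypothetical.

```python
import math

def logistic_cdf(x):
    """Standard logistic CDF, written to avoid overflow and to handle +/- infinity."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def ordered_probs(mu, dZ):
    """Pr{y = j | Z} for j = 1..len(mu)+1, given increasing thresholds mu and d_s.Z."""
    cuts = [-math.inf] + list(mu) + [math.inf]      # mu_0 = -inf and mu_J = +inf
    return [logistic_cdf(cuts[j] - dZ) - logistic_cdf(cuts[j - 1] - dZ)
            for j in range(1, len(cuts))]

def ordered_likelihood(y, mu, d, Z):
    """Eq. 19: product over indicators s of the probability of the observed response y_s."""
    like = 1.0
    for y_s, mu_s, d_s in zip(y, mu, d):
        dZ = sum(d_sl * z_l for d_sl, z_l in zip(d_s, Z))
        like *= ordered_probs(mu_s, dZ)[y_s - 1]    # responses are coded 1..J
    return like

# Five-point scale: four estimated thresholds, here with hypothetical values.
probs = ordered_probs([-1.0, 0.0, 1.0, 2.0], dZ=0.0)
```

The five probabilities sum to one by construction, and a larger value of \( d_{s} Z \) shifts probability mass towards the higher response categories.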

By replacing Eq. 11 by Eq. 19, we get a new version of Eq. 14, namely:

$$\begin{aligned} P\left( {k,y} \right) & = \int\limits_{\tau } {\int\limits_{\upsilon } {p\left( {k|V_{{jt}} = aX_{{jt}} + cY_{{jt}} (bW + h\upsilon ) + f_{j} \tau _{j} } \right)} } \\ & \quad \prod\limits_{s} {\left( {\Phi \left( {\mu _{{y_{s} }} - d_{s} (bW + h\upsilon )} \right) - \Phi \left( {\mu _{{y_{s} - 1}} - d_{s} (bW + h\upsilon )} \right)} \right)dN(\upsilon )dN(\tau )} \end{aligned}$$
(20)

Here, we have replaced the continuous specification for the indicator by an ordered specification, and the ordered response model for the indicators is clearly still estimated jointly with the choice model, as can be seen from Eq. 20. Note that now we estimate the parameters a, b, c, d, f, h, μ. In this specification, we now combine a discrete model for choices with an ordered model for indicators; this has some similarities to work looking at jointly modelling discrete and ordered choices (e.g. Bhat and Guo (2007)), but in our case, the ordered component relates to the attitudinal indicators, and there is also the additional latent variable component.

Case-study of rail travel in the UK

Stated choice experimental design

The data for the models described in this article come from a stated choice survey conducted to examine trade-offs between policies influencing privacy and liberty in return for security improvements (for details see Potoglou et al. (2010)). The rationale for using stated choice methods to collect data on individuals’ trade-offs between policies influencing privacy, liberty and security is the absence of data describing such trade-offs and choices from the real world. In particular, the aim of the study is to examine individuals’ willingness to trade privacy or liberty against security improvements, and to quantify these trade-offs in terms of willingness-to-pay (WTP) for a particular security improvement. The research objective of the study, therefore, was to examine whether security improvements concerning rail travel would be acceptable to individuals and to examine what factors are likely to influence individuals’ decisions when privacy, liberty and security may be in conflict. Stated choice methods were judged to have the potential to provide useful insights in answering such questions.

The alternative attributes and their levels for the choice experiments were defined through in-depth interviews with data protection officials (Hosein, 2008, National Identity Register, National DNA Databank, Data protection law, personal communication) and security officials (Clarke 2007; Clarke, 2008, Benefits and disbenefits of security initiatives, personal communication), press articles (BBC 2006) and literature review research (Cozens et al. 2002; UK Dept. for Transport 2006, 2008; Srinivasan et al. 2006). The trade-offs between alternatives involved three main categories of relevant attributes: security improvements in terms of surveillance equipment and presence of security personnel and security checks; potential benefits such as increased likelihood that a terrorist plot may be disrupted and how things may be handled in case an incident occurs, and travel related characteristics such as waiting time to pass through security and additional cost to cover security improvements. The complete list of attributes and levels used in the choice experiment is shown in Table 2.

Table 2 Attributes and levels of the rail travel scenarios

The SC experiment was set in the context of choosing between three options describing situations that the respondent may experience when travelling on the UK national rail network. Specifically, respondents were asked to “Imagine that you are making a journey using public transport, such as on the national railway system. We would like you then to consider three ways in which you might make this journey. These are described by different levels of security or privacy”. As shown in Fig. 3, an additional fourth option in the scenario allowed respondents to opt out of choosing one of the first three alternatives, stating, “I would choose not to use the rail system under any of these conditions”. Each alternative differed in terms of security measures, potential benefits from improved security, and travel related characteristics.

Fig. 3 A choice scenario example

The large number of attributes and levels meant that a full factorial design was clearly not appropriate, while a D-efficient design was judged to be inapplicable in the absence of reliable prior estimates for model coefficients. For these reasons, we settled on a design that is nearly (although not fully) orthogonal in its nature, and which excluded a number of unrealistic combinations. As an example, security checks could not be performed using “Metal detector—X-ray” if the waiting time for the alternative was less than 4 min. Second, to allow for realistic representation of choice scenarios, when uniformed military presence was postulated, then other security improvements (i.e. advanced Closed Circuit Television (CCTV) cameras that enable real-time face recognition) and tighter security checks (i.e. more than 2 checks in 1,000 travellers) also had to be in place. Overall, we attempted to control for extreme cases, so that none of the choice scenarios would seem unrealistic or dominant compared to the other two options. We settled on an overall design of 120 rows, which was divided into 15 blocks, with each respondent facing eight choice tasks.

Background questions

In addition to the stated choice scenarios, data were also collected on the social and economic characteristics of the respondents (e.g. age, gender, employment status, income, frequency of travel by rail, etc.) and their media preferences including newspapers and news channels.

Respondents were also asked a series of questions about their attitudes towards privacy known as the ‘Distrust Index’ developed by Dr. Alan Westin (Kumaraguru and Cranor 2005; Louis Harris & Associates and Westin 1994). The specific attitude questions and the response distributions from our survey are shown in Table 3. Respondents were asked to choose amongst the five levels of agreement, described in text. For the purposes of the later analysis, we used a value of 5 for those levels that would equate to the lowest level of distrust, and a value of 1 for those levels that would equate to the highest distrust. The values of 5 would thus equate to strong agreement with the first two statements, and strong disagreement with the final two statements.

Table 3 Distrust index questions and responses
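The coding rule for the four distrust statements can be written out as a short Python sketch. It assumes that the raw survey responses are stored as 1 (strongly disagree) to 5 (strongly agree); that raw coding is an assumption made for illustration, as is the function name `code_distrust`.

```python
RAW_MAX = 5
# Zero-based positions of the final two statements, which are reverse-coded
# so that 5 always corresponds to the lowest level of distrust.
REVERSED = {2, 3}


def code_distrust(raw):
    """Map raw agreement scores (1..5) for the four statements to analysis values."""
    return [(RAW_MAX + 1 - r) if i in REVERSED else r for i, r in enumerate(raw)]


# Strong agreement with statements 1-2 and strong disagreement with 3-4
# gives the lowest distrust on every indicator.
lowest_distrust = code_distrust([5, 5, 1, 1])
```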

Respondents were also asked to indicate their responses to the Privacy Concern Index through a series of questions about their attitudes towards privacy, security and liberty (also defined by Westin in Kumaraguru and Cranor 2005). These questions are shown in Table 4. For the purposes of the later analysis, a value of 1 was used for the statements that the Kumaraguru and Cranor (2005) work would explain as low concern, and a value of 5 for those statements that would explain high concern.

Table 4 Attitudinal questions of concern and responses

In the sample, 95.8% of the respondents rated the statement “protecting the privacy of my personal information” as somewhat or very important. Also, 96.3% agreed that “taking action against important security risks” was somewhat or very important. Interestingly, a notably lower percentage of respondents (85.7%) agreed that “defending current liberties and human rights” was somewhat or very important.

Survey implementation and data

After earlier pilot work, the stated choice experiment was conducted through a nation-wide panel of Internet users between 17 and 19 September 2008. A final sample of 2,058 respondents was obtained, with descriptive statistics of the sample being reported in Table 5. After some additional data cleaning, the estimation sample consisted of 1,961 respondents.

Table 5 Sample characteristics (sample size: 2,058)

The sample represents the general population well in terms of gender and age. As expected with Internet surveys, however, the proportion of individuals with a high level of education in the sample is higher than the proportion in the 2001 UK Census (www.statistics.gov.uk/census2001). The sample also over-represents retired individuals (28% vs. 13.4%) and under-represents students, compared to the 2001 UK Census. Clearly, because of the use of the Internet as the data collection mode and differences in the socio-economic profiles of our sample compared to the 2001 UK Census, there could be no claim that the collected sample is statistically representative of the UK population. So, while the sample generally represents the population across key measurable dimensions (e.g. gender and age), the results should be used with some caution.

Model specification and estimation results

In this section we specify the models that we used to analyse the data described above, and report results. We start by discussing a base model without the latent variables. Then, after confirming that the alternative normalisations are equivalent, we investigate the impact of the use of ordered models for the attitudinal indicators. In these initial tests, the latent variables are only interacted with the constant on the no-travel alternative. In the final part of our analysis, we interact the latent variables with another variable in the choice model. All models were coded and estimated in Ox (Doornik 2001). The overall model statistics are summarised in Table 6. Table 7 shows the estimation results for the choice model component of the different models, Table 8 reports the results for the structural equation models for latent attitudes and Table 9 the results for the measurements model for latent attitudes.

Table 6 Overall model statistics
Table 7 Estimation results for choice model components
Table 8 Estimation results for structural equation model for latent attitudes
Table 9 Estimation results for measurement model for latent attitudes

Base model

This section discusses the results for the base model, i.e. a multinomial logit (MNL) model without latent attitudinal variables.

The price difference to cover security costs and the time required to pass through security are included as linear terms in the utilities of the three alternatives. The parameter estimates for these two attributes are in line with a priori expectations (i.e. negative) and imply that respondents prefer alternatives with lower costs and shorter times to pass through security.

The attribute levels of the type of camera were coded as categorical variables with the level “No Cameras” set as the base (zero) level in the utility equations. As shown in Table 7, respondents were more likely to choose rail travel options that involved some type of surveillance system involving either standard or advanced CCTV cameras that enable real-time face recognition. The highest valuation among the three levels was placed on advanced CCTV cameras.

Participants were also in favour of some type of security check when compared to the base level situation in which there were no security checks. Here, results indicate that respondents placed the highest value on the attribute level “metal-detector and X-ray for all”. This would imply the highest level of security for all travellers (including the respondent). The method of checking is possibly also seen as less intrusive than a pat down.

Preferences for improvements in security reassurance are also reflected in the positive valuation for the presence of specialised security personnel. Compared to the base-level situation in which only rail staff are present at the rail station, respondents preferred options where British Transport Police, armed police and even uniformed military are present. However, the value placed on a situation in which uniformed military are present is substantially smaller than situations involving British Transport police and armed police, possibly reflecting a general aversion to armed police in Britain, where their presence is much more limited than in most other countries.

Unsurprisingly, respondents were more likely to choose alternatives in which the authorities are more effective in disrupting known terrorist plots. The estimated coefficients of the number of known terrorist plots disrupted are the result of a piecewise-linear specification with two points of inflection at 2–3 plots (coded as 2.5 in the data) and 10 plots every 10 years. The results show that while there is additional utility for each disrupted plot, this marginal utility decreases as the number of disrupted plots increases. Indeed, the first and second prevented plot contribute 0.3096 units in utility each, while from the third plot onwards, this is reduced to 0.0696 per plot, and reduced further to 0.0199 per plot from the tenth plot upwards.
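The piecewise-linear contribution described above can be sketched in Python using the three reported slopes and the two breakpoints as coded in the data (2.5 and 10). The function name and the treatment of values between the survey levels are assumptions for illustration only.

```python
def plots_utility(n, slopes=(0.3096, 0.0696, 0.0199), knots=(2.5, 10.0)):
    """Utility contribution of n disrupted plots per 10 years (piecewise linear)."""
    s1, s2, s3 = slopes
    k1, k2 = knots
    return (
        s1 * min(n, k1)                       # steep segment up to the first breakpoint
        + s2 * max(0.0, min(n, k2) - k1)      # flatter segment between the breakpoints
        + s3 * max(0.0, n - k2)               # flattest segment beyond the second breakpoint
    )


u_two = plots_utility(2)     # two plots: 2 x 0.3096
u_ten = plots_utility(10)    # reaches the second breakpoint
u_twenty = plots_utility(20)
```

The decreasing slopes give the diminishing marginal utility noted in the text: each additional plot beyond the tenth adds 0.0199, against 0.3096 for each of the first two.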

We found no difference among the first three levels of the visibility of response to a security incident. On the other hand, respondents were less likely to choose situations in which an incident would cause some or a lot of disruption and chaos.

Finally, the utility of the fourth alternative (i.e. “not travel by rail”) is given by a constant. In the base model, this obtains a positive value, which would imply an underlying preference for this opt-out alternative when taking account of all other attributes. However, here, we need to take into account the fact that the base levels chosen for the various estimated factors were often the least desirable levels (e.g. no cameras, no checks and only rail staff). Once more desirable levels apply, the “not travel by rail” alternative decreases in relative attractiveness.

Latent variable models

In the latent variable models, a latent variable called ‘Distrust’ was used to explain the values for the four distrust index questions (see Table 3), and a latent variable called ‘Concern’ (for privacy, security and liberty), was used to explain the value for the three attitudinal indicator questions shown in Table 4.

Two socio-demographic characteristics, namely age (linear) and gender (male) are used as explanatory variables for each of these latent variables. No other socio-demographic effects were found to be significant, and the linear specification for age was used for simplicity, but also because it gave reasonable results. We explicitly examine three modelling issues: (i) the impact of different normalisation strategies, which we investigated using continuous attribute equations in the measurement model; (ii) the impact of the assumption of an ordered logit model for the attitudinal measurement models; and (iii) the impact of interactions between latent variables and service attributes.

In all model tests the latent attitude model and the choice model are estimated simultaneously resulting in consistent and efficient estimates. The panel nature of the data is also taken into account in all models.

Normalisation

A tricky aspect of the ICLV model specification is the normalisation of the attitudinal models. We tested two normalisation strategies, one set out by Ben-Akiva et al. (1999) and one set out by Bolduc et al. (2005), referred to hereafter as the ‘Ben-Akiva normalisation’ and the ‘Bolduc normalisation’. The detailed specification of each model is shown below, where, for the sake of simplicity, we have dropped the subscript for choice tasks.

Ben-Akiva normalisation, continuous (normal) attitudinal measurement model

Structural Models (cf. Eqs. 3 and 1):

$$ Z_{l} = bW_{l} + \psi_{l} ,\quad l = { 1},{ 2}, \, \psi \sim {\text{N}}\left( {0,\sum\limits_{\psi } {\text{diagonal}} } \right)\quad \left\{ { 2 {\text{ equations}}} \right\} $$
(21)
$$ U_{j} = aX_{j} + cY_{j} Z + \nu_{j} ,\quad \nu \sim {\text{N}}\left( {0,{ 1}} \right)\quad \left\{ { 4 {\text{ equations}}} \right\} $$
(22)

Measurement Models (cf. Eq. 2):

$$ y_{s} = d_{s} Z + \varepsilon_{s} ,\quad {\text{s }} = { 1}, \ldots , 7, \, \varepsilon \sim {\text{N}}\left( {0,\sum\limits_{\varepsilon } {\text{diagonal}} } \right)\quad \left\{ { 7 {\text{ equations}}} \right\} $$
(23)

Here, the “Distrust” latent variable was used for four indicators, and the “Concern” latent variable was used for three indicators. In each of the two groups, one of the interaction parameters d was fixed to one for normalisation.

Bolduc normalisation, continuous (normal) attitudinal measurement model

Structural Models (cf. Eqs. 3 and 1):

$$ Z_{l} = bW_{l} + \psi_{l} ,\quad l = { 1},{ 2}, \, \psi \sim {\text{N}}\left( {0,\sum\limits_{\psi } {\text{diagonal}} } \right)\quad \left\{ { 2 {\text{ equations}}} \right\} $$
(24)

where the variance of ψ is normalised to 1, i.e. two constraints.

$$ U_{j} = aX_{j} + cY_{j} Z + \nu_{j} ,\quad \nu \sim {\text{N}}\left( {0,{ 1}} \right)\quad \left\{ { 4 {\text{ equations}}} \right\} $$
(25)

Measurement Models (cf. Eq. 2):

$$ y_{s} = d_{s} Z + \varepsilon_{s} ,\quad s = 1, \ldots , 7, \, \varepsilon \sim {\text{N}}\left( {0,\sum\limits_{\varepsilon } {\text{diagonal}} } \right)\quad \left\{ { 7 {\text{ equations}}} \right\} $$
(26)

The underlying utility specification used in these two models is the same as in the base model, with the difference that the two latent variables are incorporated as interaction effects on the constant for the ‘no travel by rail’ alternative. In other words, the utility for alternative 4 is now given by:

$$ {\text{V}}_{{ 4,{\text{n}}}} = \delta_{ 4} + \Updelta_{ 1} {\text{Z}}_{{ 1,{\text{n}}}} + \Updelta_{ 2} {\text{Z}}_{{ 2,{\text{n}}}} $$
(27)

where \( Z_{1,n} \) and \( Z_{2,n} \) give the respondent-specific values for the two latent variables, \( \delta_{4} \) is the alternative specific constant for the no travel option, and \( \Delta_{1} \) and \( \Delta_{2} \) are interaction effects, showing the shift in the utility of the no-travel alternative as a function of the two latent variables.

The attitudinal measurement model is a continuous linear model assuming a normal distribution of the latent variable, in line with Eqs. 8–11.

The results in Table 6 present both the simulated log-likelihood for the complete joint model, i.e. Eq. 15, and the simulated log-likelihood for the discrete choice model (DCM) component only, i.e. computing only \( \frac{1}{R}\sum\nolimits_{r} {p\left( {k|V_{jt} = aX_{jt} + cY_{jt} (bW + h\upsilon_{r} ) + f_{j} \tau_{jr} } \right)} \) on the basis of the final parameter estimates from the joint estimation. As shown in Tables 6 and 7, we obtained exactly the same likelihood and either exactly the same coefficient values or effectively the same values, allowing for the different scaling, with these different normalisation strategies and therefore conclude that they are equivalent. In subsequent models we use the Ben-Akiva normalisation.

From the results in Table 6 we see that the log-likelihood for the choice component of the model is substantially improved with the inclusion of the attitudinal components. Indeed, we note an increase in log-likelihood of 2,941.8 units, at the cost of two additional parameters, which is of course highly significant at any level of confidence. The relative size of the coefficients (Table 7) associated with explanatory variables is broadly similar between the base model and the models with the attitudinal components (focussing on coefficients which are significant at the 95% level). This is not entirely unexpected given that the latent variables were only interacted with the constant for the “no travel by rail” option. For this constant, we note major differences. Indeed, with the base levels for all terms in the utility specifications remaining unchanged, we observe a change to a negative mean value for the constant for this fourth alternative.

The impacts of the latent variables on the “no travel by rail” constant are highly significant, but are best understood in conjunction with the results for the measurement model in Table 9. Here, the latent variable concern has a positive correlation with the privacy, liberty and human rights indicators, but a negative correlation with the security indicator. Perhaps this is because security measures are captured explicitly in the choice model. Alternatively, it may be that concern for privacy and liberty outweighs the concern for security, leading to a low rating for the security indicator. On balance, these results thus allow us to interpret this latent variable as capturing increasing concern, as a result of positive valuations for privacy and liberty. Turning back to the structural equations, we note a positive effect for the latent variable on the constant for the fourth alternative. As the latent variable “concern” increases, respondents are more likely to choose the “would not travel by rail” option, i.e. increasing concern leads to increased refusal to choose any of the rail options.

Table 10 Analysis of choice probabilities for example scenario

A different picture emerges for the second latent variable, “Distrust”. Here, we see that an increased value for the latent variable is positively correlated with all four indicators. Now remember that for the “government can be trusted” and “business helps us more” indicators, a high value would equate to strong agreement, i.e. a low level of distrust. For the “technology has gotten out of control” and “voting has no impact” indicators, a high value would equate to strong disagreement, i.e. once again a low level of distrust. Increases in this latent variable thus capture reduced rather than increased distrust, and we will hereafter refer to it as the “reduced distrust” variable. This also explains the negative value for the interaction between this latent variable and the “no travel by rail” constant: reduced distrust leads to reduced rates for choosing not to travel by rail.

The difference in the scale of the interaction terms (i.e. the impact of the latent variables in the utility functions) is a direct result of the different normalisations, and it can be seen that multiplication of the interaction terms from the Ben-Akiva normalisation by the estimated standard deviations from the structural equation model gives the results for the interaction terms in the Bolduc normalisation.
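This rescaling can be made explicit with a short derivation. It is a sketch in notation introduced here for illustration: superscripts BA and B denote the Ben-Akiva and Bolduc normalisations, \( \sigma_{l} \) is the standard deviation of \( \psi_{l} \) estimated under the Ben-Akiva normalisation, and \( \tilde{\psi}_{l} \) is a standard normal term.

$$ Z_{l}^{BA} = b^{BA} W_{l} + \sigma_{l} \tilde{\psi}_{l} ,\quad \tilde{\psi}_{l} \sim {\text{N}}\left( {0,1} \right),\quad Z_{l}^{B} = \frac{{Z_{l}^{BA} }}{{\sigma_{l} }} = \frac{{b^{BA} }}{{\sigma_{l} }}W_{l} + \tilde{\psi}_{l} $$

$$ \Delta_{l}^{BA} Z_{l}^{BA} = \left( {\sigma_{l} \Delta_{l}^{BA} } \right)Z_{l}^{B} ,\quad d_{s}^{BA} Z_{l}^{BA} = \left( {\sigma_{l} d_{s}^{BA} } \right)Z_{l}^{B} $$

so that \( \Delta_{l}^{B} = \sigma_{l} \Delta_{l}^{BA} \), \( d_{s}^{B} = \sigma_{l} d_{s}^{BA} \) and \( b^{B} = b^{BA} /\sigma_{l} \). Every product entering the utilities and the measurement equations is unchanged, which is why the two normalisations yield the same likelihood. In particular, the loading fixed to one under the Ben-Akiva normalisation corresponds to a loading of \( \sigma_{l} \) under the Bolduc normalisation.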

In terms of parameterisation in the latent attitudinal model (cf. Table 8), we found that age and gender were both statistically significant in the structural model. Older people were less likely to be concerned about privacy, liberty and security. The estimate for the impact on the reduced distrust latent variable is also negative, meaning that older respondents are less likely to trust the government, business and technology. Also, men were more likely to be concerned about privacy, security and liberty whereas we found no influence of gender on distrust.

Ordered logit attitudinal measurement model

A more sophisticated and realistic approach to the representation of the attitudinal variables is to treat the responses as ordered choices. This necessitates the following changes.

Structural Models (cf. Eqs. 3 and 1):

$$ Z_{l} = bW_{l} + \psi_{l} ,\quad l = 1, 2, \, \psi \sim {\text{N}}\left( {0,\sum\limits_{\psi } {\text{diagonal}} } \right)\quad \left\{ { 2 {\text{ equations}}} \right\} $$
(28)
$$ U_{j} = aX_{j} + cY_{j} Z + \nu_{j} ,\quad \nu \sim {\text{N}}\left( {0,{ 1}} \right)\quad \left\{ { 4 {\text{ equations}}} \right\} $$
(29)

Measurement Models (cf. Eqs. 17 and 18):

$$ x_{s} = d_{s} Z + \varepsilon_{s} $$
(30)
$$ \Pr \left\{ {y_{s} = j|Z} \right\} = \int\limits_{{\mu_{j - 1} }}^{{\mu_{j} }} {\varphi \left( {\frac{{x_{s} - d_{s} Z}}{{g_{s} }}} \right)} dx_{s} = \Upphi \left( {\frac{{\mu_{j} - d_{s} Z}}{{g_{s} }}} \right) - \Upphi \left( {\frac{{\mu_{j - 1} - d_{s} Z}}{{g_{s} }}} \right),\quad s = 1, \ldots , 7, \, \varepsilon \sim {\text{N}}\left( {0,\sum\limits_{\varepsilon } {\text{diagonal}} } \right)\quad \left\{ { 7 {\text{ equations}}} \right\} $$
(31)

where the same normalisation as before is used, i.e. fixing one d to one in the four measurement equations for distrust, and fixing one d to one in the three measurement equations for concern.

It is not reasonable to compare the total likelihood for the joint models, as the processes described are not comparable; the ordered model explains the result of a discrete process of selecting an attitudinal indicator, while the continuous model represents the result of a process assumed to yield a continuously varying indicator. However, we can conclude that the latent variables given by the ordered choice model are qualitatively better than those given by the continuous assumption, with higher fit (by 5.2 units) for the DCM only component in this new model.

To obtain further understanding of the impact of the ordered choice approach, a model was run that estimated ordered choice of attitudinal indicators and the latent variables, without the stated choice model. In other words, this means the maximisation of the following function rather than Eq. 20.

$$ P\left( y \right) = \int\limits_{\upsilon } {\prod\limits_{s} {\left( {\Upphi \left( {\mu_{{y_{s} }} - d_{s} Z} \right) - \Upphi \left( {\mu_{{y_{s} - 1}} - d_{s} Z} \right)} \right)dN(\upsilon )} } $$
(32)

This model showed that explanation of those two aspects of the overall model (i.e. measurement model plus latent attitude model) was better when the stated choice component was omitted. This is a natural result, indicating that the stated choices contribute to the definition of the latent variable, but in doing so reduce the quality of explanation that the latent variable gives to the seven indicator variables. This is in contrast with the previous result where we see that the latent variables contribute substantially to explaining the stated choices. This result can be understood by remembering that the joint estimation means that the model needs to find estimates for the latent variable that help explain both the choices and the responses to attitudinal questions. It is thus natural that this reduces the ability of the latent attitudes to explain the responses to the attitudinal questions (compared to a model estimated without the choice data), while the base model for the choice data does not incorporate the latent variable.

Looking next at the coefficient estimates, we can see that the main parameters in the choice model remain unaffected. The scale of the impact of the latent variable on the constant for the “no travel by rail” alternative changes substantially, but the signs remain as before, meaning that the latent variables can still be interpreted as “increased concern” and “reduced distrust”. The reduced value for the “increased concern” latent variable is offset by increased standard deviation for the actual latent variable (from 0.1 to 1.77). However, we only observe a small drop in the standard deviation for the “reduced distrust” latent variable (from 0.35 to 0.31) to offset the increased value for the interaction term.

In terms of the measurement model, the results remain similar to those from the continuous model, with the exception of the security indicator, where the effect of the “increased concern” latent variable is now positive, but not statistically significant. In terms of the estimates for the thresholds of the ordered model, we see some asymmetry and differences in scale, justifying the move away from a continuous specification.

The biggest difference between the models however arises when looking at the structural equations in the latent attitude model. Here, the influence of age and gender on concern is no longer significant. Older respondents still show higher distrust (negative impact on reduced distrust variable), where the same now applies to male respondents. Overall, these findings are in line with the recognition by Ben-Akiva et al. (1999) that it can be difficult to find good causal variables for the latent variables.

Interacting latent variables and security interventions

In the last test we examined how the latent variables might interact with the attributes incorporated in the SC experiments, rather than just the constant on the no travel option. After extensive testing, it emerged that the valuation of the type of security check, specifically the use of metal detectors and X-rays for all, was influenced by attitudes for concern for privacy, security and liberty, so this interaction was incorporated in the simultaneous model structure; no other significant interactions were identified. In particular, let \( X_{j,n,t} \) be equal to 1 if the “Metal detector/X-ray for all” level applies for the “Type of security check” attribute for alternative j for respondent n in choice task t. In the base model, the contribution of this attribute to the utility would then be given by \( \beta X_{j,n,t} \), while, in this advanced specification, it will be given by \( (\beta + \beta_{1} Z_{2,n} )X_{j,n,t} \), where \( Z_{2,n} \) gives the latent concern variable for respondent n. The ordered logit attitudinal models were used.

The results (cf. Table 6) show a small but significant increase in model fit for both the overall model (2.5 units at the cost of one parameter, giving a \( \chi^{2} \) p-value of 0.025) as well as the discrete choice component on its own (2.1 units at the cost of one parameter, giving a \( \chi^{2} \) p-value of 0.04). We observe that persons with high concern place a lower value on the introduction of metal detectors or X-ray check for rail travel. This is completely in line with intuition. Respondents who are more concerned about privacy, security and liberty will be less likely to agree with the notion that every traveller should be checked. We also see a reduction in the variance of the “increased concern” latent variable. Any remaining model parameters remain largely unaffected by this change.
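The two p-values quoted above can be reproduced with a likelihood ratio test, in which twice the gain in log-likelihood is compared against a chi-squared distribution with one degree of freedom. The Python sketch below uses the identity that, for one degree of freedom, the upper tail probability of a statistic x equals erfc(sqrt(x/2)); the function name is illustrative.

```python
import math


def lr_test_pvalue_1df(ll_gain):
    """p-value of a likelihood ratio test with one additional parameter."""
    stat = 2.0 * ll_gain                       # LR statistic
    return math.erfc(math.sqrt(stat / 2.0))    # chi-squared(1) upper tail


p_joint = lr_test_pvalue_1df(2.5)   # overall model: about 0.025
p_dcm = lr_test_pvalue_1df(2.1)     # choice component only: about 0.04
```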

Comparison of models

As a further illustration of the role of the latent variables in the various models, we now conduct an analysis showing their impact on choice probabilities and WTP indicators.

In simple closed form discrete choice models such as Multinomial, Nested, or Cross-Nested Logit, a given set of values for the explanatory attributes gives rise to point values for the probabilities for the different alternatives. The situation is different in the presence of modelled random taste heterogeneity or the inclusion of latent variables. Here, point values are only obtained conditional on given values for these random components. However, the latent nature of these terms means that the probabilities are integrated over these additional random components and thus follow a random distribution across respondents even for a fixed choice task.

To illustrate the differences across models, we look at the example of the single choice scenario illustrated in Fig. 3. Specifically, we take our sample population of respondents, and compute the probabilities for the four alternatives from this scenario. The results are shown in Table 10, giving the mean, coefficient of variation, minimum and maximum. For the MNL model, we clearly have a single point probability for each of the four alternatives, where alternatives 1 and 3 obtain higher probabilities than alternatives 2 and 4. In the remaining three models, the impact of the latent variables is taken into account. For each respondent, age and gender were used to compute distributed values for the two latent variables, and these were then used in interaction with the constant for the no travel alternative in the second and third models. In the fourth model, the concern variable was in addition interacted with the sensitivity to the highest level of security checks.

The effect of the latent variables in the second and third models is clear to see. The interaction between the latent variable and the constant for the fourth alternative means that the probability for that alternative varies between 0 and 1, with a mean probability that is slightly higher than the MNL point value and a coefficient of variation of almost 2. The reason for this variation is that respondents with high concern and high distrust are more likely to choose the no travel option, with the opposite applying for low concern and low distrust. The impact is very similar in the second and third models. The changes in the probability for the fourth alternative are then clearly also reflected in the probability for the first three alternatives, which are now each bounded between 0 and an upper bound where these three upper bounds sum to a value of 1 (applying in the case where the probability for alternative 4 is zero).
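The bounding behaviour described above follows directly from the logit form and can be checked with a small Python simulation. All utilities and coefficients below are hypothetical values chosen for illustration, not estimates from Table 7, and a single standard normal latent variable stands in for the two latent variables of the models.

```python
import math
import random
import statistics


def mnl_probs(utils):
    """Logit probabilities for a list of utilities."""
    m = max(utils)
    expu = [math.exp(u - m) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]


V_RAIL = [0.9, 0.4, 1.1]        # hypothetical utilities of the three rail options
DELTA4, DELTA_Z = -0.5, 2.0     # hypothetical constant and latent-variable effect

rng = random.Random(1)
z_draws = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# The latent variable only shifts the utility of the fourth (no travel) option.
probs = [mnl_probs(V_RAIL + [DELTA4 + DELTA_Z * z]) for z in z_draws]

p4 = [p[3] for p in probs]
cv4 = statistics.pstdev(p4) / statistics.mean(p4)   # coefficient of variation

# Limiting case in which the probability of alternative 4 goes to zero.
upper_bounds = mnl_probs(V_RAIL)
```

Because the latent variable affects only the fourth utility, the ratios between the probabilities of the three rail options stay fixed, and each is capped by its value in `upper_bounds`, which sum to one.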

The impact in the fourth model of the additional interaction between the concern variable and the sensitivity to the highest level of security checks (which applies for alternative 3) is less substantial. We see a small increase in the variation in the probability for alternative 3, although the impact on the range is more noticeable. This is the result of respondents with increased or decreased concern being more or less sensitive to the highest level of security checks. With latent variables now affecting alternatives 1, 2 and 3 in different ways, the summation of the maxima to 1 no longer applies.

Table 11 shows corresponding results for the WTP measures obtained from the individual model estimates. Here, point values are obtained for all WTP measures with the exception of the WTP for the highest level of security checks in the final model, where the associated coefficient was interacted with the latent variable “concern”, leading to a distribution of the associated WTP measures across the sample population. As would be expected, the interaction between the latent variables and the constant for the fourth alternative only leads to small changes in the WTP measures; here, the main impact is on choice probabilities (and hence would be most visible in forecasting). On the other hand, the interaction between the latent variable “concern” and the coefficient for “Metal detector/X-ray for all” leads to heterogeneity in the associated WTP measure, with a coefficient of variation of 0.18 in the sample population.
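The contrast between point values and distributed WTP measures can be illustrated with a short sketch. It assumes the usual ratio form for WTP in a linear-in-parameters utility (attribute coefficient divided by the negative of the cost coefficient) and a coefficient for the security check that is linear in the latent variable. The coefficient values, the standard normal draws standing in for "concern", and the names `wtp` and `wtp_distribution` are hypothetical and are not taken from the estimated models.

```python
import random
import statistics

# Hypothetical coefficients for illustration only, NOT the paper's estimates.
BETA_COST = -0.05     # cost coefficient (utility per £)
BETA_CHECK = 0.60     # base coefficient for the highest level of security checks
TAU_CONCERN = -0.10   # interaction of that coefficient with the latent variable "concern"


def wtp(beta_attribute, beta_cost=BETA_COST):
    """Willingness to pay in £: attribute coefficient over the negative cost coefficient."""
    return -beta_attribute / beta_cost


def wtp_distribution(concern_values):
    """WTP per individual when the attribute coefficient varies with the latent variable."""
    return [wtp(BETA_CHECK + TAU_CONCERN * c) for c in concern_values]


rng = random.Random(7)
# Assumption: latent "concern" drawn as standard normal rather than from age and gender.
concern = [rng.gauss(0.0, 1.0) for _ in range(2000)]

point_wtp = wtp(BETA_CHECK)            # single value when there is no interaction
values = wtp_distribution(concern)     # one value per individual with the interaction
cv = statistics.pstdev(values) / abs(statistics.fmean(values))
print(round(point_wtp, 2), round(cv, 3))
```

Without the interaction the WTP is a single number for the whole sample; with it, the spread of the latent variable carries through directly to the WTP, and the coefficient of variation depends on the size of the interaction term relative to the base coefficient.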

Table 11 Analysis of monetary valuations (in £)

Summary and conclusions

Our empirical work has shown the applicability of a latent variable framework to real-world transport modelling work. Specifically, the estimates show the strong impact of two latent variables: one to do with concern for privacy, liberty and security; the other with distrust of business, government and technology. These variables were significant, not only as explanatory factors for the answers to attitudinal questions put to respondents as part of the survey, but also for the respondents' propensity to choose the opt-out alternative in the survey. Additionally, the latent variable related to concern shows a significant impact on the sensitivity to an introduction of universal metal detector checks. In other words, individuals concerned about their privacy would be less in favour of this type of security check than the rest of the sample.

The modelling work in our article also has a number of novel components that are of interest given the growing use of latent variable models. Firstly, seemingly unlike many other studies in this area, we explicitly recognise the repeated choice nature of the data. Secondly, we compare the two normalisations employed in the literature on our data, finding them to be equivalent. Thirdly, attitudinal responses have been modelled using ordered choice methods rather than assuming a continuous attitudinal response, which is more consistent with how they are measured. In line with only a small subset of other studies in the area, the entire model (choice, latent variable and attitudinal response components) has been estimated simultaneously.

While the models using ordered choice or continuous attitudinal response cannot be compared directly, ordered choice is intuitively the preferable approach, and latent variables estimated using ordered choice also contribute to an improved explanation of the stated choices. We conclude that this approach is superior to the general assumption of a continuous attitudinal response.
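To make the ordered choice treatment of attitudinal responses concrete, the following is a minimal sketch of an ordered logit measurement equation for a single five-point Likert indicator. The loading `zeta`, the threshold values and the function names are hypothetical choices for illustration; they are not the specification or the estimates used in this study.

```python
import math


def logistic_cdf(x):
    """Logistic cumulative distribution function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def ordered_logit_probs(latent, zeta, thresholds):
    """Probability of each response level given the latent variable.

    P(level k) = F(tau_k - zeta * latent) - F(tau_{k-1} - zeta * latent),
    with the outer thresholds at minus and plus infinity.
    """
    cdfs = [0.0] + [logistic_cdf(t - zeta * latent) for t in thresholds] + [1.0]
    return [cdfs[k + 1] - cdfs[k] for k in range(len(thresholds) + 1)]


# Hypothetical values: four increasing thresholds give five Likert levels.
thresholds = [-2.0, -0.5, 0.5, 2.0]
zeta = 1.2

low = ordered_logit_probs(-1.0, zeta, thresholds)   # individual with a low latent value
high = ordered_logit_probs(1.0, zeta, thresholds)   # individual with a high latent value
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

With a positive loading, a higher value of the latent variable moves probability mass towards the upper response levels while the response itself remains discrete, which is the property that a continuous measurement equation does not respect.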

The advantages of the latent variable framework over deterministic attitude incorporation are clear: the model is not affected by endogeneity bias, and the choice model component along with the latent variable model can be used directly for forecasting without the requirement for attitudinal indicators (i.e. the measurement model would be dropped in application). In other words, the application of this model (i.e. in forecasting) does not require the collection or simulation of attitudinal measures, which is a substantial improvement on approaches that use attitudinal measures directly in the models of stated choice. The latent variables in this model are forecast directly from observed objective variables (socio-demographic characteristics), with variance around their mean values, so that they can be used in model application without collecting further attitudinal data.

In conclusion, and in line with a number of other papers, we find that the use of latent attitude models leads to an improved understanding of stated choice and can be applied reliably in practical studies. We also highlight the advantages of using an ordered logit model for the response to the attitudinal questions. Tests should be made with other data sets to confirm the wider applicability of the method.