Introduction

In the planning and commission of crimes offenders need to make many choices. Perhaps the most fundamental of these is where to commit crime. The discrete spatial choice framework (Bernasco and Nieuwbeerta 2005) attempts to model location choice by comparing the characteristics of neighborhoods where offenders have committed crimes with those neighborhoods where they have chosen not to commit crimes. This technique circumvents key weaknesses associated with previous approaches applied to the study of location choice, providing analytical means to assess the impact of both area and offender-area interaction level characteristics on location choice. Importantly, a range of recent studies (e.g. Bernasco and Block 2009; Bernasco et al. 2013; Bernasco 2010a; Baudains et al. 2013) have demonstrated that the discrete spatial choice framework provides an elegant means to conceptualize and test hypotheses about offender target selection strategies. To illustrate, Clare et al. (2009) demonstrate how the presence of barriers and connectors influences burglars’ location choices while controlling for other well established explanations of offender mobility.

However, to date the statistical model implemented in all discrete spatial choice studies of offender mobility (the conditional logit) operates under a number of, as yet, untested assumptions that warrant further investigation.

  • First, conditional logit models can only estimate systematic variation in spatial preferences.Footnote 1 This means that area level attributes influence all offenders in the same way and that the only difference between offenders comes from their observed characteristics. It is unlikely that this assumption can be justified. Ethnographic studies of residential burglars clearly suggest that target characteristics such as perceived affluence influence assessments of target suitability of different offenders in different ways (Bennett and Wright 1984; Rengert and Wasilchick 2000).

  • Second, the conditional logit model assumes that observed choices are independent, and it does not account for the fact that offences are nested within offenders (i.e. some offenders commit multiple offences). Recent research has demonstrated that not accounting for such nesting of observations can lead to significantly different conclusions about how individual offenders operate (Smith et al. 2009; Townsley and Sidebottom 2010). It is true that robust standard errors can be computed for conditional logit models to counter this problem, but this simply inflates the estimated standard errors thereby only adjusting the observed significance level of relationships.

  • Third, the conditional logit model operates under the independence of irrelevant alternatives (IIA) assumption. This states that when evaluating preferences, adding or removing alternatives from the choice set will not alter the relative odds associated with any two existing alternatives. It is difficult to maintain this assumption when alternatives are similar.Footnote 2

The primary objective of this paper is to investigate the first of these assumptions: whether location choice preference variation is systematic, i.e. whether it is only a function of measured offender characteristics. Analyzing recorded crime data from Brisbane, Australia, location choice preferences are estimated using the conventional conditional logit and a mixed logit model. The mixed logit, a member of the logit family of statistical models, avoids many of the issues associated with the conditional logit model. It estimates random variation in spatial preferences by allowing the importance of choice criteria to vary at the level of decision makers. Specifying person-specific effects, random effects in multilevel modelling terms, is often a good proxy for aggregate unobserved characteristics, and naturally accounts for the nested nature of data. While not the main motivation of the current study, the mixed logit also relaxes the IIA assumption, which permits more realistic forecasts of preferences when the properties of choices change.

Literature Review

Offender mobility is an important and long standing subject of criminological enquiry. To date, this endeavor has largely been conducted through journey to crime studies (for reviews of this literature see Rengert et al. 1999; Rossmo 2000; Groff and McEwen 2006). A widely accepted, and certainly the best known, finding from such studies is the observation that offenders do not tend to travel very far to commit crime, although different crime types are associated with varied travel distance. Some studies claim a buffer region exists nearby offender residences that reflects an avoidance strategy (Rossmo 2000).

In the last ten years a new method has been introduced that has recast offender mobility research. The discrete spatial choice approach combines three different methodologies (offender-, target- and mobility-based) into a single statistical model (Bernasco and Nieuwbeerta 2005). Each of these have advantages but also drawbacks that limit the insights gained from any single study. By combining the attributes of individual offenders, target locations and their spatial relations, the discrete spatial choice approach allows inter-relationships to be estimated concurrently. In doing so, discrete spatial choice offers considerable advantages over conventional analytic strategies that are only capable of examining a single dimension of interest. For instance, in journey to crime studies the only variable of interest is the distance travelled, the origin or destination of journeys are rarely accounted for yet are crucially important to understanding the distance travelled. To illustrate, travelling a given distance in an urban area will likely offer offenders considerably more potential targets relative to those encountered over the same travel distance in a suburban area.

The first study to apply the discrete spatial choice approach to study crime location choice was undertaken by Bernasco and Nieuwbeerta (2005) who examined the spatial preferences of burglars in The Hague, the Netherlands. They specified a behavioral rule comprising choice criteria that both criminological theory and empirical study suggested were important factors in determining crime location choice. Findings of this study indicated that burglars preferred areas close to home, close to the city center, had low guardianship levels and accessible properties. In addition, area level affluence had a positive but statistically non-significant relationship with burglar preference. This study has been highly influential and has instigated a series of replications using the behavioral rule as a foundation to explore new research questions.

  • Bernasco (2006) compared the spatial preferences of burglars operating by themselves with burglars operating in groups. He found preferences were generally consistent across the two burglar types, albeit with some minor differences in degree.

  • Clare et al. (2009) demonstrated how physical barriers (rivers and highways) and connectors (public transport) influence burglar location choices. While not the first study to establish these observations (see for example Brantingham and Brantingham 2003; Beavon et al. 1994; Johnson and Bowers 2010), it was the first using a comprehensive model of offender mobility where competing explanations could be controlled for.

  • Several studies have explored the influence of residential history on location choice for offenders (Bernasco and Kooistra 2010; Bernasco 2010b), finding that areas which contained previous homes did tend to be favored over other (similar) areas not lived in.

  • While several studies employing the discrete spatial choice approach have focused on residential burglary, other crimes have been explored as well, including street robbery (Bernasco and Block 2009; Bernasco et al. 2013), rioting (Baudains et al. 2013) and theft from vehicles (Johnson and Summers 2015).

  • Townsley et al. (2015) replicated the Bernasco and Nieuwbeerta (2005) study in two additional cities (Birmingham, UK and Brisbane, Australia) and found consistent preferences across all three sites, but with some variation observed.

To date all published studies using the discrete spatial choice framework have used the conditional logit statistical model to estimate crime location choice preferences. While this model is quite straightforward to estimate, it has a number of assumptions that may not reflect the reality of crime location choice. The most substantive of these is that the conditional logit can estimate aggregate preference relationships. When choices are nested within decision makers, it is highly unlikely that aggregate relationships will provide adequate depictions of crime location choice at the individual level. As Goldstein (1995, 202) notes, “repeated measures data constitutes a very good example of a situation in which a two level model is really essential, because most of the variation typically is at the higher level” (emphasis added).

To assess the impact of this assumption, in this paper we apply a mixed logit model. The mixed logit model is a more general statistical model that allows preferences to be estimated for each decision maker. It has only recently been a viable model for empirical investigations due to advances in simulation research. It is currently regarded as the “state-of-the-art” model in discrete choice modelling (Hensher and Greene 2003).

Theoretical Framework

Two assertions concerning crime location choice provide the organizing “scaffolding” around which the discrete spatial choice approach is applied. The first sees offenders as optimal foragers (Johnson and Bowers 2004), who seek to satisfy their needs but expend minimal effort in doing so. This is consistent with the rational choice approach (Cornish and Clarke 1986, 2008) that depicts offenders assessing the relative “rewards”, “risks” and “effort” associated with exploiting particular criminal opportunities. The second depicts crime location choice as a hierarchical, spatially structured decision process (Bennett and Wright 1984), such that offenders complete a sequence of nested decisions to determine a suitable location for crime. This process encompasses target evaluation and selection at increasingly fine levels of spatial granularity (neighborhood \(\rightarrow\) street \(\rightarrow\) house for residential burglary, say) until a suitable target is identified.

This combination of hierarchical and utility maximizing decision-making necessitates operationalizing two constructs: the set of alternatives and a behavioral rule. The set of alternatives, or choice set, is a set of mutually exclusive and finite alternatives the decision maker selects from. In discrete spatial choice situations, each alternative within the choice set is usually defined as an administrative unit such as a neighborhood or a census block. The behavioral rule is a mathematical expression that relates the characteristics of the alternatives (possibly informed by decision maker characteristics) to the representative utility of each alternative. Assuming that decision makers wish to maximize their utility, the probability of selecting an alternative can be estimated using a statistical model.

The behavioral rule for burglar location choice outlined by Bernasco and Nieuwbeerta (2005), which has been used as the foundational base in a series of crime location choice studies, stipulated that offenders were likely to select neighborhoods that (1) appear to contain valuable goods, (2) require little effort to travel to, reach and enter, and (3) where the risks of detection are low. As mentioned previously, the statistical model used to estimate relationships in the extant literature has assumed these relationships apply to all offenders (or there is systematic variation in preferences). Yet this belief runs counter to considerable empirical evidence.

Preference Variation in Offender Location Choice

While previous studies of location choice applying the discrete spatial choice approach have assumed a consistency in offender preferences, a number of other studies strongly suggest offenders are influenced to varying degrees by a range of choice characteristics.

  • Differences in rewards Ethnographic studies suggest offenders are attracted to more affluent areas (Repetto 1974; Bennett and Wright 1984; Cromwell et al. 1991). However, such studies are subject to selection biases towards highly-prolific career criminals, who may not be representative of the offending population. Reviews of the literature generate mixed findings regarding the relationship between area affluence and burglary rates (Cohen and Cantor 1981).

  • Differences in effort Empirical research examining the interaction between age of offender and distances travelled to commit crime consistently shows that younger offenders prefer targets closer to their home than further away (Baldwin and Bottoms 1976; Repetto 1974; Phillips et al. 1980), although a large recent study (Andresen et al. 2014) finds distance to crime to be a quadratic (inverse U shaped) function of age. Conversely, research has also shown that the location of some offenders’ homes dictates that they must travel some specific distance to find suitable targets—in this case one would hypothesize that the spatial location of these individuals dictates that proximity is likely a less important criterion (Smith et al. 2009; Townsley and Sidebottom 2010). Furthermore, offenders, like all members of the population, operate under temporal constraints (Ratcliffe 2006) in their activities, criminal or otherwise. It is not difficult to imagine that age and area of residence might interact to generate a wide range of time “budgets” for offending in the active offender population.

  • Differences in risk The interaction of offender ethnicity, area ethnic heterogeneity and detection prospects dictate that social barriers (including affluence) likely affect some offenders more than others (Rengert and Wasilchick 1990; Summers et al. 2010).

It is important to point out that prior work does exist that explores the consistency of offender preferences, albeit in a limited manner. Bernasco and Nieuwbeerta (2005), Bernasco and Block (2009), Bernasco (2010a), Clare et al. (2009) and Johnson and Summers (2015) include offender covariates (typically age, gender and/or ethnicity) in their behavioral rule. While this allows more sophisticated hypotheses to be tested, such as whether adults differ from juveniles in their preferences for proximal neighborhoods, they still only estimate systematic preference variation for groups of offenders. As a result, only a small number of offender characteristics can be explored before substantial reductions in statistical power result.

Hypotheses

Following Bernasco and Nieuwbeerta (2005) and Townsley et al. (2015), in this study we explore the influence of six theoretically inspired choice criteria, each related to either the perceived risks, rewards or efforts associated with committing a burglary in a particular area. Importantly, while some of these criteria have previously been shown to have no statistical significant effect on location choice, such findings may reflect a lack of consistency across offender preferences. That is, that aggregate level estimates may not be representative of individual level preferences.

In this section, six hypotheses are articulated to focus on the veracity of preference variation in characteristics influencing location choice. For each decision criterion we (1) reiterate the theoretical grounds for why a characteristic is thought to influence location choice; and (2) specify a hypothesis relating to the consistency of influence throughout the offending population.

Choice Criterion 1: Affluence (Reward)

Previous studies have shown that burglars rate wealthy households over poor ones because of likely higher returns (Bennett and Wright 1984; Cromwell et al. 1991; Nee and Meenaghan 2006). Furthermore, research shows that a range of both area level and target specific cues provide offenders with the means to assess the relative affluence of particular neighborhoods and/or targets (Rengert and Wasilchick 1990).

Estimates of the influence of neighborhood affluence on location choice in the Netherlands, United Kingdom and Australia previously showed mixed results, such that affluence conferred no significant effect on location choice in the UK and Australia, while in the Netherlands offenders were in fact negatively influenced by increases in neighborhood affluence (Townsley et al. 2015). Assessing potential preference variations in the influence of this characteristic we hypothesize that:

  • H1 The influence of neighborhood affluence on location choice is consistent across offenders.

Choice Criterion 2: Social Control (Risk)

Offenders prefer to operate in areas where they perceive a reduced risk of apprehension (Smith and Jarjoura 1989). Social disorganization theory proposes that such areas are likely those with low levels of social cohesion where residents are less likely to identify strangers and/or be prepared to intervene in criminal acts they observe (Sampson and Wooldredge 1987; Sampson et al. 1997; Shaw and McKay 1969). Recently, Wikström et al. (2012) demonstrated that collective efficacy was a highly significant predictor of counts of (all) crime for a longitudinal cohort of young people. In contrast, Townsley et al. (2015) found no significant impact on location choice of neighborhood social cohesion on burglars in the Netherlands, UK and Australia. Assessing potential preference variations in the influence of this characteristic we hypothesize that:

  • H2 The influence of neighborhood social cohesion on location choice is consistent across offenders.

Choice Criterion 3: Target Vulnerability (Effort)

Our third criterion, and first measure of the effort required in undertaking a burglary, relates to target vulnerability. Single-family dwellings, which typically offer multiple street level entry-points, are likely both easier and less risky to physically access than flats, apartments, and attached dwellings.

Previously studying burglars from the Netherlands, UK and Australia, Townsley et al. (2015) found that across all three study regions increases in the proportion of single family dwellings in an area consistently positively influenced location choice. Again, assessing potential preference variations in the influence of this characteristic we hypothesize that:

  • H3 The influence of target vulnerability on location choice is consistent across offenders.

Choice Criterion 4: Target Proximity (Effort)

Our second measure of effort relates to the distance offenders must travel to reach a suitable burglary target. A wealth of research has demonstrated that in choosing where to offend offenders operate under limited mobility (Rengert et al. 1999; Snook 2004; Smith et al. 2009; Townsley and Sidebottom 2010).

Consensus suggests that such observations concerning offender mobility are the result of two interconnected mechanisms. First, the distance decay explanation (Ratcliffe 2006) states that offenders are subject to a range of time constraints and as a result tend to minimize the distance traveled to, and time involved in reaching crime locations. The second explanation relates to offenders’ preference to operate in familiar areas, where they have local knowledge of the environment and are less likely to be identified by residents as outsiders. Drawing on principles of human geography, crime pattern theory (Brantingham and Brantingham 2008) proposes that offenders develop spatially-referenced cognitive maps of offending opportunities known as awareness spaces. Such awareness spaces are located around those locations (nodes) which are frequently visited, such as the home, a place of employment, the homes of peers etc. As the majority of a person’s nodes tend to be spatially clustered, awareness spaces also exhibit spatial clustering. This means that familiar areas, as reflected in awareness spaces, are more likely to be located near an offender’s home neighborhood.

Previous studies of burglar location choice (Bernasco and Nieuwbeerta 2005; Townsley et al. 2015), have consistently shown that proximity of an area to an offender’s home residence has a significantly positive influence on the likelihood it is selected for burglary. Here, assessing potential preference variations in the influence of this characteristic we hypothesize that:

  • H4 The influence of target proximity on location choice is consistent across offenders.

Choice Criterion 5: Offender Awareness (Effort)

Our third measure of effort extends the assertions of crime pattern theory regarding the familiarity of particular locations to estimate the influence that a location being close to the city center has on its likelihood of being selected for burglary. Crime pattern theory proposes that as a result of the services and facilities located at the city center it is typically a common node in the awareness spaces of both offenders and non-offenders. As a result, the city center is likely also better known and more visited by offenders than other areas within a given urban locality.

A recent cross national study (Townsley et al. 2015) showed mixed results regarding the influences of target area proximity to the city center on burglary location choice, such that in the UK and The Netherlands no significant effect on location choice was observed, while in Australia offenders were more likely to select target areas closer to the city center (Townsley et al. 2015). Assessing potential preference variations in the influence of this characteristic we hypothesize that:

  • H5 The influence of target proximity to the city center on location choice is consistent across offenders.

Choice Criterion 6: Target Availability (Control)

As in Bernasco and Nieuwbeerta (2005) and Townsley et al. (2015) our sixth and final criterion relates to the availability of potential targets in a given area, and simply operationalized as a control variable. In this case asserting that areas with more residential dwellings offer greater numbers of opportunities for residential burglary.

  • H6 The influence of number of targets on location choice is consistent across offenders.

Data

Data describing 2844 detected residential burglaries committed by 873 offenders over 2006–2009 calendar years in the Brisbane Local Government Area were collected and analyzed. These offences represent 22 % of all recorded burglaries during this period, which is similar to clearance rates observed in the UK, but relatively high compared to the US (Weisel 2002). The offending rate of these offenders was highly skewed. While the mean number of burglaries was 3.26 (SD = 6.59), the most prolific offender was detected for 87 offences.Footnote 3 This level of nesting suggests controlling for repeated measures directly is warranted.

Examining the location choices of these offenders, a choice set of 158 Statistical Local Areas (SLAs) were defined as possible offending areas. For each SLA the following characteristics, each relating to a specified behavioral rule, were sourced from the Australian Bureau of Statistics (ABS) 2006 Census and included in the estimated model. Table 1 summarizes these area characteristics, which are now briefly described:

  • Mean housing repayment A measure of affluence. Described in the 2006 census as ordinal variables (interval bands such as $200–$249/week, $250–$300/week,...), weighted averages were used to compute an average housing payment or house price index. This variable is transformed into deciles.

  • Proportion of single-family dwellings A metric of target accessibility. Measured as the proportion of residential households in an area which are not classified as apartments. This variable is scaled such that a one unit increase in the variable relates to a 10 % increase in the proportion of single family dwellings.

  • Residential mobility A measure of social cohesion. This was measured as the sum of incoming and outgoing population proportions in the last census (i.e. \(\frac{{\text{incoming residents}}}{{\text{neighborhood population}}} +\frac{{\text{outgoing residents}}}{{\text{neighborhood population}}}\)). To illustrate, if all residents of a neighborhood left (100 % outgoing) and were replaced by entirely new residents (100 % incoming) we would denote a residential mobility of 2. This variable is multiplied by a factor of 10 so that a one unit increase relates to a 10 % increase in the summed proportions of residential turnover.

  • Proximity to city center The Euclidian distance in kilometers from the centroid of the neighborhood in question to the centroid of Brisbane central business district.

  • Number of households A count of the number of residential dwellings, and thus targets for burglary, within a neighborhood.

Whereas the previous characteristics only vary between SLAs (e.g. the residential mobility of an SLA is the same for every burglar), the final variable in the model, proximity, varies across both SLAs and offenders: the proximity of a particular SLA is different for different offenders.

  • Proximity the Euclidean distance in kilometers between the centroid of the neighborhood in question and the centroid of the home neighborhood of the offender making the choice.

For both proximity variables distances measures are multiplied by negative one in order that increases in proximity (i.e. where an alternative is closer) produce a positive coefficient. Observations recording a zero distance (where offending occurs within the neighborhood an offender resides within) are transformed using the Ghosh correction (Ghosh 1951).

Table 1 Descriptive statistics of census variables for alternatives

We now describe the mixed logit model implemented here to assess variation in the influence of these alternative characteristics over offenders, and in turn test our previously specified hypotheses.

Method

The Mixed Logit

A mixed logit model was estimated to test whether differences in behavioral preferences exist between individual burglars. This statistical model is a more general version of the conditional logit model, which all previous criminological studies have used. In practical terms, the conditional logit model estimates a single coefficient to summarize the relationship between an independent variable and the explanatory variable for all decision makers. The mixed logit estimates this relationship for each decision maker. Our formulation closely follows that given in Train (2009). The conditional logit is formally specified as:

$$U_{ij}=\beta x_{ij}+\epsilon _{ij}$$

where

  • \(U_{ij}\) the utility of neighborhood j for the burglar i;

  • \(x_{ij}\) are observed variables that relate to the neighborhood j and burglar i;

  • \(\beta\) is a vector of coefficients that are fixed over burglars and neighborhoods; and

  • \(\epsilon _{ij}\) is a random error term that is independent and identically distributed extreme value.

In the conditional logit model the probability of burglar i choosing neighborhood j is:

$${\text{Prob}}(Y_i = j) = \frac{e^{\beta x_{ij}}}{\sum_i e^{\beta x_{ij}}}$$

Like any regression-type analysis, the objective of the conditional logit is to estimate a set of coefficients that provide accurate predictions of the dependent variable. In our case, six coefficients are estimated, one for each of the hypotheses previously outlined. The mixed logit estimates a set of coefficients for each decision maker. Mixed logit models are particularly appropriate for panel data because the estimation takes account of observations nested within decision makers (i.e. there is correlation in the choices made by the same offender). The mixed logit is formally specified as:

$$U_{ij}=\beta _i x_{ij}+\epsilon _{ij}$$

where

  • \(\beta _i\) is a vector of coefficients for decision maker i representing their varying preferences. These coefficients vary over decision makers in the population with density \(f(\beta |\theta )\) where \(\theta\) represent the parameters of f (optionally, one or more \(\beta _i\) can be constrained to be constant across decision makers); and

  • all other terms have the same meaning as the conditional model specification.

In the mixed logit model the choice probability conditional on \(\beta _i\) is:

$$L_{ij}(\beta _i)=\frac{e^{\beta _i x_{ij}}}{\sum_k e^{\beta _i x_{ik}}}$$

To obtain the unconditional choice probability, we integrate \(L_{ij}(\beta _i)\) for all values of \(\beta _i\):

$${\text{Prob}}(Y_i = j) = \int L_{ij}(\beta _i) f(\beta |\theta ))d\beta$$

The mixed logit is so named because it is the result of a series of logit probabilities with f as a mixing distribution. The most common distributions in the literature are the normal, log-normal, triangular and the uniform (Hensher and Greene 2003). The conditional logit is a special case of the mixed logit where the mixing distribution \(f(\beta )=1\) where \(\beta\) takes on some empirically derived value. The utility probabilities cannot be computed directly because the integral does not allow a closed form solution.Footnote 4 Instead, mixed logit models are estimated through simulation methods.

Estimation Method

There are two common estimation methods used to compute mixed logit models: maximum simulated likelihood and hierarchical Bayes. Models using maximum simulated likelihood estimation are not feasible when the model includes a very large choice set, which is the case for our data, because Halton sequencesFootnote 5 become correlated in higher dimensions (Bhat 2003; Hensher and Greene 2003). The mixed logit was estimated using hierarchical Bayes, so named because a hierarchy of parameters is estimated. For our purposes we closely followed the approach outlined by Train (2009). The first layer corresponds to the individual-level parameters (\(\beta _i\) reflects the preferences of offender i). The second layer captures the population-level parameters (or hyper-parameters) that depict the \(\beta _i\) distribution. Here, the \(\beta _i{\text{s}}\) are distributed with mean b and variance W (the mixing distribution f).

In Bayesian estimation, the researcher specifies a prior probability distribution reflecting their knowledge (or uncertainty) of the phenomenon. This can be based on intuition, experience, and the existing literature. For the individual-level parameters (\(\beta _i\)) in the absence of theoretical or empirical justification, all area level characteristics were estimated as random effects and were hypothesized to be normally distributed. This allows the model maximum flexibility in estimating relationships. The population-level parameter b had a flat prior, so called as it describes a distribution with no peak and extremely long tails. This means there is no presumption about the underlying relationship; the data is allowed to “speak for itself”. The prior for W was an inverted Wishart with six degrees of freedom and an identity scale matrix, reflecting the number of independent variables included in the model. Once \(\beta _i\), b and W have been estimated their distributions are termed posteriors.

The logic behind Bayesian estimation is that we seek to estimate the joint distribution of \(\beta _i\), b and W. However, it is difficult to estimate this directly but straightforward to estimate the conditional distribution of each parameter, given the values of the other two (Gelfand and Smith 1990). Hierarchical Bayes involves defining three conditional functions (conditional posterior distributions). Each is a function of one parameter, conditioned by the other two. \(K(b|W,\beta _i)\) would estimate the distribution of b taking into account the current values of W and \(\beta _i\), for instance. At each iteration the three conditional posteriors are estimated using the most recent values of the two associated conditioning parameters. This process is repeated many times until the joint posterior distribution displays stationarity (i.e. the mean and variance of the distribution do not change or follow any trend).

For completeness, the model can be expressed as follows:

  • \(U_{ijt}=\beta _i x_{ijt}+\epsilon _{ijt}\) (where t refers to the \(t\hbox {th}\) decision made by individual i),

  • \(\epsilon _{ijt}\) is a random error term that is independent and identically distributed extreme value,

  • \(\beta _i \sim N(b,W)\).

The observed choice is \(y_{it} = j\) if and only if \(U_{ijt} > U_{ist} \forall j \ne s\). The priors for the model are:

$$K(b,W) = k(b)k(W),$$

where

  • k(b) is \(N(b_0,S_0)\) with extremely large variance,

  • k(W) is IW(6, I).

The conditional posteriors are:

$$\begin{aligned}&K(\beta _i|b,W,y_i) \propto \prod _t \frac{e^{\beta _i x_{iy_it}}}{\sum\nolimits_k e^{\beta _i x_{ikt}}}\theta (\beta _n|b,W) \quad \forall i\\&K(b|W,\beta _i\,\forall i)\hbox { is } N(\bar{\beta },W/N), \quad \bar{\beta } = \sum_i \beta _n/N\\&K(W|b,\beta _i\,\forall i)\hbox { is IW}\,\left( K+N,\frac{KI+N\bar{S}}{K+N}\right) ,\quad \hbox {where}\quad \bar{S} = \sum _i(\beta _i-b)(\beta _i-b)'/N \end{aligned}$$

300,000 iterations were used for estimation, with a burn in period of 100,000 iterations. Several sets of starting values were tested, but none appeared to have an effect on the number of draws needed to converge. An equivalent conditional logit model was also fitted to provide a baseline to assess the performance of the mixed logit. All analyses were conducted in the statistical programming language R using the RSGHB package (Dumont et al. 2013) to estimate both models.

Results

Figure 1 contains the point estimates and 95 % credibility intervals at the population level for both models. The mixed logit estimates for the two proximity variables are much higher coefficients than the conditional logit estimates. Affluence is statistically significantly different from the null under the mixed logit model, which is not case for the conditional model. Residential mobility effects even show different signs; higher values were seen as more attractive under the conditional model (all things being equal), but the mixed logit implies the reverse of this—neighborhoods with low turnover rates were more frequently chosen by burglars than neighborhoods with high turnover rates, again all things being equal. The models provide similar estimates for single families and number of households.

Fig. 1
figure 1

Comparison of population-level estimates from conditional and mixed logit models

There are number of goodness of fit statistics that suggest the mixed logit provides a superior fit to the data, and they generally convey the same information. For our evaluation we used the root likelihood, computed as the geometric mean of the predicted alternative probabilities (Frischknecht et al. 2013). A perfect model would have a root likelihood statistic of 1 and the null model for these data has a root likelihood statistic of 0.006 (= 1/number of alternatives). The root likelihood for the conditional model was 0.028 and for the mixed logit it was 0.151, a more than a fivefold increase in fit. However, the root likelihood is an aggregate statistic of model fit and does not provide insight in the individual-level estimates generated by the mixed logit model.

To explore further, the distributions of individual-level estimates are plotted for each of the independent variables in Fig. 2. This graph depicts not only where the bulk of offenders’ preferences lie, but also the degree to which they vary. In addition, the fraction of offenders with preferences counter to our expectations can be determined (recall that all hypotheses are expressed so that theory predicts positive values).

Fig. 2
figure 2

Distribution of mixed logit individual-level estimates and population-level 95 % confidence interval from the conditional logit model

To consider the conditional model to be sufficient to describe the contribution of each covariate, we examined the underlying distributions at the individual level. One striking feature of Fig. 2 is that every variable has a fraction of offenders with negative co-efficient estimates. Both proximity variables, number of households and percentage single family dwellings have a minority of individuals with preferences counter to theory (10, 26, 2 and 36 % respectively). Residential mobility and affluence had a majority of offenders with preferences counter to theory (66 and 71 % respectively).

Additionally, the sample skewness scores imply non-symmetric distributions, providing further evidence against using the conditional logit model. Bulmer (2012) suggests sample skewness scores between \(-\)0.25 and 0.25 can be considered approximately symmetric, scores within \(-\)1 to \(-\)0.5 or 0.5 to 1 are moderately skewed and less \(-\)1 or greater than 1 considered as highly skewed. Four of the six variables display moderate skewness and another (proximity to city center) is very close to this threshold.

Preference Variability Scrutinized

Mixed logit models provide unique coefficient estimates (\(\beta _i\)) for every individual decision maker for every choice criterion (variable) that is included as a random effect in the equation. The variation in \(\beta _i\) between decision makers reflects heterogeneity among decision maker preferences. This heterogeneity can be the object of subsequent analyses for two different but related purposes. The first is to explore potential causes of coefficient variability across decision makers by assessing whether it is related to their observed characteristics. The second purpose is to assess how much of the coefficient variability across decision makers can be explained by variation in other observed characteristics.

The rest of our analysis focuses on a single choice criterion: proximity to offenders’ home. The journey to crime literature describes what factors are associated with the distances travelled by offenders (Bernasco 2010b; Townsley and Sidebottom 2010), so it seems an intuitive candidate for exploration. Proximity also has a number of advantages methodologically. It had the largest effect size observed in the mixed logit model estimates, it had the largest change between the conditional and mixed logit population estimates, and displayed considerable variation between offenders.

There are a number of ways the journey to crime might differ between individuals, but we focus on two here. First, it seems possible that where offenders live could play a role in their preference for travel. We explore this by mapping the mean proximity preference of burglars per residential area and assessing the amount of spatial autocorrelation between these areas.

Second, there is considerable empirical evidence that journey to crime is related to offenders’ age (Levine and Lee 2013; Wiles and Costello 2000). To explore the extent to which burglars’ preferences for nearby target areas are age-dependent we use segmented regression analysis of burglars’ proximity preference on their age.

Mapping Proximity Preference Variability

Our approach to exploring the geographical variation in model estimates is conceptually similar to geographically weighted regression (GWR) analysis (Brunsdon et al. 1996; Fotheringham et al. 2003). GWR is a technique used to explore spatial variation in effect sizes by estimating spatially weighted regression coefficients and plotting their values on a map. Geographic patterns may suggest causes of variation. For example, spatial concentrations of either large or small regression coefficients could suggest the presence of spillover effects. In contrast to GWR analysis, where a spatially weighted kernel is used to estimate a regression coefficient for every observed location in the data sample, we explore variation in the \(\beta _i\) of proximity by mapping the average \(\beta _i\) of the 141 SLAs where at least one burglar resided [there are 16 target areas that did not have any burglars living in them; the number of burglars in the other SLAs range from a single burglar (in 20 SLAs) to as many as 55 (in 1 SLA) with an average of just over six burglars per target area].

Before turning to the map it should be noted that all individual \(\beta _i\) values were assigned to the residential area in which the burglars resided in 2006, also for those offenders who in later years still committed burglaries after they had already moved to another area. The overall variation in \(\beta _i\) coefficients appears only partially related to the spatial nesting of burglars in SLAs. The intraclass correlation coefficient (ICC), which describes how strongly the \(\beta _i\) of the burglars in the same origin area resemble each other with respect to their proximity preferences and which can vary between 0 and 1, equals .07 (i.e. the correlation between two randomly drawn burglars living in the same, randomly drawn origin area, is .07). This implies that 7 % of the variation in burglars’ proximity preference estimates can be attributed to their homes being nested in origin areas. The significance of this ICC value was assessed by performing a deviance test of a null model with SLA nesting against a model without SLA nesting and dividing the resulting P value by 2 (see Snijders and Bosker (1999), p. 90). The results showed that burglar proximity preferences are not randomly distributed \((\chi ^2= 21.24; df=1; P<.001)\), but do somewhat cluster at SLA level.

The map of average \(\beta _i\) coefficients per SLA is displayed in Fig. 3. The variation in \(\beta _i\) is displayed in four shades of grey. The burglars that live in SLAs colored in black have relatively strong proximity preferences, more than one standard deviation above the mean. They do not seem to cluster nor do they signal any other meaningful pattern. SLAs colored in the next two shades of grey (from zero to average, and from average to one standard deviation above the mean) are also scattered across Brisbane, but they both seem to form small clusters: most are adjacent to one or two similarly shaded SLAs. This could however also be caused by the fact that there are simply many more SLAs within these categories and as such it does not indicate spatial clustering. There are only two SLAs with an average \(\beta _i\) below zero, i.e. where the average offender actually prefers distant to nearby target areas.

Fig. 3
figure 3

Average proximity preference coefficient value for offender origin areas

To quantify the spatial clustering of \(\beta _i\), a Moran’s I statistic was calculated, using the exact average burglars’ \(\beta _i\) values per SLA and inverse distance weighting. The resulting Moran’s I was 0, which suggests that the average \(\beta _i\) values per SLA do not cluster spatially. It seems that burglars’ proximity preferences are randomly distributed across Brisbane. As emphasized elsewhere in this article, it remains to be seen whether these findings also apply at smaller levels of spatial aggregation than SLAs.

The lack of spatial autocorrelation suggests that there is little to be gained from further exploring whether individual proximity preferences vary with where in the city of Brisbane the burglars live. At least, we find no evidence that burglars living in neighboring areas have similar proximity preferences. Nevertheless, it could well be that physical and social barriers might explain some of the differences in proximity preferences. For example, the Brisbane River cuts across the city and could function as a barrier, which would explain why burglars living in some parts of the city face more difficulty traveling to other parts of the city that are geographically close but relatively hard to reach because they are on the other side of the river (cf. Baudains et al. 2013; Clare et al. 2009).

Proximity Preference Variability and Age

In previous research it was hypothesized that juvenile offenders under the legal driving age would have stronger preferences for targets nearby than adult offenders (Bernasco and Nieuwbeerta 2005; Townsley et al. 2015). Using a conditional logit model for burglary location choice, Townsley et al. (2015) show that juveniles in Brisbane are indeed more strongly influenced by proximity than adult offenders. In a recent multilevel residence-to-crime study, Ackerman and Rossmo (2015) depict the expected curvilinear effect of age on distance traveled for white male residential burglars living in Dallas, Texas. They show that the distance lengthens during teenage years and peaks at the age of 26 after which it starts declining again. The same result was found by Andresen et al. (2014).

The mixed logit model allows us to explore how the \(\beta _i\) values of proximity vary with the age of the burglars. Because we study burglary data from the period 2006–2009 and some burglars committed burglaries in multiple years, we used the age at which they first appeared in our data. Figure 4 depicts how proximity preferences vary with the age of the burglars. We employed a segmented linear regression model (Wagner et al. 2002) with a single threshold to model the relationship between the \(\beta _i\) values of proximity and offender age, because we expected a sudden change around the legal driving age. By varying the threshold age in the range from 15 to 30, we explored what threshold age leads to the best fitting model. The fitted regression line with the threshold at age 19 shows the best fitting model. The results are in the expected direction with juvenile burglars showing a stronger preference for nearby targets than adults. In fact, the preference for nearby targets already starts declining during the teenage years. After the age of 19 we find no statistically significant positive slope. Although previous studies also estimated proximity effects conditional on offender age, an advantage of the mixed logit model used here is that it allows us to assess how much of the proximity preference variation age actually explains. Figure 4 shows that offender age explains only a small part of the proximity preference differences (adj. \(R^2=.03\)). This suggests that more theorizing is needed about why certain offenders have stronger preferences for nearby target than others.

Fig. 4
figure 4

Proximity preference estimates as a function of offender age

Discussion

This study investigated whether location choice preferences were the same for all offenders in a sample of residential burglars in Brisbane, Australia. To address this question we contrasted a mixed logit model, previously unused in studies of offender location choice, with the conventional conditional logit model. The latter model assumes all offenders with the same observed characteristics have the same preferences, while the former allows for estimating and testing preference variation between offenders who have the same observed characteristics. Findings of these analyses demonstrate considerable preference variation between offenders. Comparing mixed and conditional logit model variants, we observe a stronger influence of both proximity of a neighborhood to an offender’s home and the city center if preferences are allowed to vary at the individual level. Moreover, residential mobility is considered attractive assuming systematic location preferences but unattractive when individual preferences are estimated. This shift highlights the extent of preference variation that exists between offenders. In estimating preference variation between individuals, a sizable fraction of offenders appear to have preferences that run counter to theory; the majority of burglars are not attracted to areas with increased levels of affluence and residential mobility. The differences observed between the conditional and mixed logit models call into question the findings of studies employing statistical models that assume preference variation is systematic; and demonstrate strong support in favor of the analytical approach applied here.

To understand what factors might explain observed location preference variation, two approaches were taken. The first assessed the impact of offender home location, and the second the effect of offender age on target preferences. Surprisingly, we find little support for hypotheses that link offender residence location to spatial preference for nearby targets, and only a weak relationship is found between offender age and preference for nearby targets.

Primarily, these results highlight that our current understanding of offending choice outcomes is limited. For all choice criteria examined, some offenders appear to be attracted to neighborhoods that exhibit a particular characteristic, whereas others are repelled from neighborhoods with the same characteristics. Moreover, our ability to understand these differences is currently limited. This is difficult to resolve, especially given that each hypothesized choice criterion is supported by both theory and empirical studies.

As discussed in this paper, the mixed logit model is a highly flexible generalization of the conditional logit model. It allows the researcher to correctly specify repeated choice by the same decision-maker, and to test the statistical significance of preference heterogeneity. Is the mixed logit model an adequate replacement of the conditional logit model in future studies of offenders’ target selection? We think it is, although in specific circumstances there may be reasons to stick to the conditional logit. A case in point is the intractability of discrete choice models with large numbers of alternatives. The technique typically used to solve the intractability of the estimation problem, ‘sampling from alternatives’ (McFadden 1978), depends on the IIA property and therefore cannot be used in the mixed logit case (Guevara and Ben-Akiva 2013).

Like any other study, this research is subject to number of weaknesses that should be considered when interpreting its findings. As with all studies of offender mobility that use recorded crime data, revealed preferences reflect those of cleared residential burglaries only. In the absence of more nuanced data, we make the necessary assumption that all offending trips originate from an offender’s home location. In addition, we are limited in the data used for neighborhood characteristics. Our measures of affluence and social cohesion are derived from a secondary data source and so may not capture the theoretical constructs as intended. Using median housing repayment as a proxy of neighborhood affluence is defensible, especially as we partition areas into deciles. Social cohesion was measured using a metric of residential turnover, which has been used in many other studies (e.g. Baudains et al. 2013; Johnson and Summers 2015; Sampson 1991; Sampson et al. 1997; Wikström et al. 2012).

Finally, while the mixed logit variation of the discrete spatial choice approach relaxes a number of assumptions inherent in all discrete spatial choice studies, we remain encumbered by several others which likely warrant further empirical investigation. These include: (1) that the choice process used to locate suitable targets is hierarchical in structure; (2) that the geographies selected to represent choices are both an adequate reflection of offender choice perceptions and are not compromised by excessive internal heterogeneity; and (3) no significant changes over time in either the behavioral rule applied by offenders in selecting targets, or the choices selected in targeting particular locations take place. With respect to the first and second, it may be that modelling preference variability is more feasible at smaller units of analysis than the neighborhoods used here. For instance, Weisburd et al. (2012) demonstrate the tremendous variability in crime levels using street segments over neighborhoods.

Overall, this study strongly demonstrates the need for additional theorizing and modelling of actor-criterion interactions with respect to location choice amongst offenders. There are clear theoretical and applied consequences associated with better understanding why certain types of offenders have different offending preferences relative to others. In pursuing such insight we propose a number of avenues of future research.

First, studies should be undertaken to assess the impact of other trip origin characteristics in explaining location choice. While here we have explored the impact of home location on target proximity, drawing on theory, we might also look to examine if, for instance, offenders that originate from more affluent neighborhoods are more or less attracted to equally affluent neighborhoods to offend within. Or how the location of nodes (e.g. school, work or peers) and the related intervening travel paths influences offender preferences.

Second, we suggest that research should be undertaken to examine preference variation between different types of offenders. To illustrate, the spatial preferences of purposive prolific offenders are likely to be different from those who offend less frequently. In terms of applied outcomes, identifying general trends within those subsets of offenders who are responsible for disproportionate levels of victimization confers significant advantages in developing effective crime reduction strategies. A better understanding of the proclivity for nearby target areas amongst prolific offenders may inform analytical approaches such as geographic profiling and crime-linkage analysis, and subsequently investigative practices that seek to identify potential suspects for undetected crimes.

Third, future studies should attempt to not only assess but also explain preference variability between offenders. In particular, and extending prior work on the relevance of offenders’ residential histories (e.g. Bernasco 2010b), future studies could include more measures of offender activity spaces. Spatial knowledge and experience are likely to be important in determining where offenders decide to perpetrate, and are also likely to vary widely between offenders.

Fourth, the observed disparity between aggregate and individual level estimates of the impact of residential mobility on location choice is an interesting finding that, in the latter case, challenges previous research in the field of social disorganization. As such, it warrants future investigation. While this observed shift in preferences may highlight considerable variation at the individual level, it may also reflect limitations in examining individual level preferences with respect to relatively large target choice areas, where internal heterogeneity may compromise the validity of some area level indicators. Considering this, and the difficulties associated with applying the mixed logit approach to study large numbers of alternatives, additional studies incorporating more fine grained choice sets drawn from smaller study areas might better expose the relationship between constructs proposed by social disorganization and offender preferences.

Fifth, further research should explore changes in offenders’ preferences over time. Multiple offences perpetrated by the same offender may not necessarily be based on the same preferences. Offenders, for instance, may learn from prior crimes and adjust their preferences to maximize utility in light of previous experience. Furthermore, research shows that offenders are typically generalists (McGloin et al. 2009), committing multiple types of crime. As such, preferences for one type of crime may inform preferences for another. Given appropriate data, further research might investigate the degree to which preference variation exists within individuals across multiple crime types.

Finally, there is still considerable work to be done in incorporating more nuanced representations of the environmental backcloth in which offenders operate. Location choice, criminal or otherwise, is a multiple constraints problem. While offenders may seek to minimize effort and risk but also maximize rewards, identifying an optimal choice is challenging because there is no single combination that satisfies all minima and maxima. Greater rewards typically require corresponding levels of risk.