Introduction

Bicycle commuting has been shown to be an effective way to reduce congestion and improve public health (de Geus et al. 2007; Lawlor et al. 2003). Dill and Carr (2003) and Nelson and Allen (1997) demonstrated that higher bicycle commuting rates were associated with higher levels of bicycle infrastructure (e.g., bicycle lanes and bicycle paths). Two national polls reported that 50 % of respondents supported requirements that streets include bicycle lanes or paths, even if it meant less space for cars and trucks (League of Illinois Bicyclists 2003). Although there has been a significant increase in funding for building new facilities for bicyclists, policymakers and city planners still need to know where to allocate resources towards bicycle facilities, because road and sidewalk space is often limited.

Given limited resources, it is probably impossible to add bicycle lanes or bicycle paths along every street. Thus, investments should be made where the new facility will be used. In most situations, bicyclists can either choose to ride on the street close to motor vehicles or ride on the sidewalk, where pedestrians and other factors reduce their speed. Sometimes, however, even though a street is equipped with a facility like a bicycle lane, some bicyclists continue to ride on the sidewalk, as in Davis, California and Ottawa and Toronto, Canada (HCM 2010; Aultman-Hall and Adams 1998). This study seeks to identify factors that explain a bicyclist’s choice between the available facility types: off-street (sidewalk and bicycle path) or on-street (bicycle lane and roadway). The results could help facility planners make the best bicycle facility investment decision for a given situation.

Literature review

To build a bicycle-friendly community with limited funds, city engineers need to determine where best to invest those funds in bicycle facilities. It would be helpful to be able to predict how bicyclists would react to the introduction of a particular kind of bicycle facility, based on different environments and user populations.

For the purposes of our study, we consider two types of bicycle facilities: on-street and off-street. On-street facilities include bicycle lanes and roadways. A bicycle lane is defined “as a portion of roadway that has been designated by striping, signing, and/or pavement markings for the preferential and exclusive use of bicyclists” (Highway capacity manual 1994). Off-street bicycle facilities include bicycle paths and sidewalks. A bicycle path is defined as “a bikeway physically separated from motorized traffic by an open space or barrier” (Highway capacity manual 1994), either shared with pedestrians or limited to exclusive bicycle use. A sidewalk is a “path designed for pedestrians, usually paved and alongside a street”.

The most direct impact of improved bicycle facilities is an increase in the number of persons using bicycles. According to a stated preference study conducted by the Federal Highway Administration (1992), people indicated that having a bicycle path or bicycle lane would encourage them to bicycle more. Based on analyses in 43 large cities, Dill and Carr (2003) built several regression models using percentage of workers commuting by bicycle as the dependent variable, and the number of bicycle lanes per square mile and various socio-economic measures as independent variables. They found a positive association between bicycle facilities and bicycle commuting. Nelson and Allen (1997) indicated that each additional mile of bikeway per 100,000 residents would increase bicycle commuting by 0.069 % while holding other factors constant. Krizek et al. (2009) found that areas near new bicycle facilities showed a considerably larger increase in bicycle mode share than areas farther away. After a comprehensive review, Heinen et al. (2010) concluded that the presence of bicycle infrastructure might not only result in more cycling, but higher cycling frequency could also stimulate the construction of bicycle infrastructure. However, Parkin et al. (2008) found that the provision of infrastructure alone appears insufficient to engender higher levels of cycling. Buehler and Pucher (2012) collected length data for bicycle lanes and paths in the 90 largest USA cities. They concluded that cities with a greater supply of bicycle paths and lanes have significantly higher bicycle commute rates, even when controlling for land use, climate, socioeconomic factors, gasoline prices, public transport supply, and cycling safety. Their conclusions, based on city-level data, are in accordance with previous studies. Moreover, through estimated elasticities, they indicated that off-street paths and on-street lanes have a similar positive association with bicycle commute rates in USA cities.

Various comparative studies have been conducted on the safety and operational impact of bicycle facilities. One minute spent in a shared curb lane is 4.1 times as onerous as a minute spent on a designated bicycle lane (Hunt and Abraham 2007). Adding a bicycle lane would increase the perceived safety of a bicyclist, thus improving the level-of-service of a segment (Harkey et al. 1998). Based on the videotapes of almost 4,600 bicyclists from 48 sites in Santa Barbara, Gainesville, and Austin, Hunter et al. (1999) found that wrong-way riding was much more prevalent on sidewalks at wide curb lane sites than at bicycle lane sites. Moreover, bicyclists on a sidewalk or bicycle path would still incur, on average, 1.8 times as much risk as those on the roadway because of blind conflicts at intersections (Wachtel and Lewiston 1994). This finding is consistent with the work by Meuleners et al. (2007).

Several studies have indicated that bicyclists’ preferences for bicycle facilities differ according to their purpose (e.g., recreation vs. commuting), riding skill (e.g., experienced vs. inexperienced) (Antonakos 1994; Taylor and Mahmassani 1996; Harvey et al. 2008; Stinson and Bhat 2005; Hunt and Abraham 2007), and sex (Krizek 2005). In general, bicyclists would prefer a bicycle path to a bicycle lane or roadway (Stinson and Bhat 2005), but in most cases, there is insufficient space in which to build a bicycle path. Based on 167 respondents’ perceptions, Tilahun et al. (2007) found that users would pay the highest price for a designated bicycle lane, followed by the absence of parking on the street and by taking a bicycle lane off-road. More determinants for commuting by bicycle and a bicycle facility preference analysis can be found in comprehensive reviews by Heinen et al. (2010) and Pucher et al. (2010).

In a study by Ford et al. (2011), serial data were collected and two binary logit models were estimated to identify the critical segments where bicyclists prefer off-street to on-street components, and vice versa. However, it was found that bicyclists’ preferences regarding bicycle facilities are greatly affected by path-level decisions. We need a better understanding of segment-level choice analysis, which could help city engineers decide the location of new facilities. This paper contributes to current knowledge by adding a disaggregate specification, to connect commuting bicyclist preferences with variations in bicycle facility supply and segment characteristics.

Data

Intercept surveys were conducted at bicycle racks throughout the campus of Purdue University, West Lafayette, Indiana USA during the fall semesters of years 2006–2008 to obtain revealed preference cross-section choice data. The university-oriented database was created because (a) it was easy to conduct the interviews, (b) the university is the major destination in the area for bicycle commuters of various ages, and (c) there is a clear need for a systematic way to plan improvements for bicycling in the street network adjacent to campus. The respondents were bicyclists who began their trips off-campus and had just arrived at bicycle racks. The interview questions were:

  1. When you bicycled to campus just now, where did that trip start? Give an address, living group name, or intersection of streets nearest the start.

  2. What streets did you use to reach your destination on campus? List the streets or show on the map provided, if that is easier.

  3. If, at any point during that trip, you rode on the sidewalk, indicate where that happened. Again, create a list or use a map, as preferred. Idea: Mark links where sidewalks were used with a “W”.

The survey is described in more detail by Ford et al. (2011). If a bicycle lane or the street was chosen, it is regarded as on-street choice. If a bicycle path or a sidewalk was chosen, it is treated as off-street choice. This survey provides us with information on the series of on-street/off-street choices made by each bicyclist. In most cases, something about the bicycling environment changed during the trip. Residential streets fed into arterials, a bicycle lane began or ended, etc. Any such change offered an opportunity—or a reason—for the bicyclist to change his/her cross-section choice. Even if a bicyclist using a sidewalk continued to use the sidewalk as some observable factor changed, there may be information to be learned from that case.

A total of 931 ‘observations’ were collected from 178 bicyclists. Each observation is associated with a segment of the bicyclist’s trip. At the beginning of a segment, the bicyclist would decide whether to use an on-street facility or an off-street facility, based on roadway characteristics, personal perceptions of safety, and other factors listed in Table 1. Each outcome comprised an observation. In our database, a “segment” is defined as that part of the route between two intersections at which a bicyclist would face a new choice of cross-section.

Table 1 Variables available for modeling

It would also be useful to investigate the choice between sidewalk and bicycle path when both of these off-street options are available. However, in our database, only 14 cases provide both off-street options. Because such a small sample could not reveal any significant distinction between these two choices, we decided to aggregate them as a single off-street choice. For on-street choices, we assume that a bicyclist who chooses to use an on-street facility will use the bicycle lane where one exists. Observations with only one option (i.e., a street without bicycle lane, bicycle path, or sidewalk) are not included in the model.

Data on segment characteristics, including 85th percentile traffic speeds, curb lane widths, and average daily traffic, were obtained from the City of West Lafayette and the Indiana Department of Transportation’s Annual Average Daily Traffic maps. Other segment characteristics data came from field visits. Weather data were added from online weather records for the days and times of the interviews. The bicycle compatibility index (BCI), originally developed by Harkey et al. (1998), was incorporated into our study. This index is associated with bicycle level of service on a roadway, which could be used to quantify a bicyclist’s perception of safety (or risk) along a segment of roadway. Klobucar and Fricker (2007) have applied BCI in assessing the level of service offered to bicyclists in West Lafayette’s street network. BCI values typically range from 1 to 5. The higher the value, the more dangerous a bicyclist perceives the on-street facility to be. Road pavement surface conditions of West Lafayette were evaluated by applying the Pavement Surface Evaluation and Rating (PASER) system (Walker 2002), which rates pavement surface condition from 1 (failed) to 10 (excellent) in terms of surface distress.

Bicyclist and network characteristics are summarized in Table 2. Among the bicyclists interviewed, 45 were female and 133 were male, with ages ranging from 18 to 30. All of them are commuters who are familiar with conditions on their routes to campus.

Table 2 Sample statistics summary

Methodology

Because cross-section choice behavior is an individual choice, a logit model is suitable for our analysis. In the standard binary logit model, only the utility difference of two choices matters. Therefore, the utility difference could be captured by a constant term, covariates that determine the outcome for observations, and a Gumbel distributed disturbance term:

$$ y_{in}^{ * } = \alpha +{\varvec{\upbeta }}^{\rm T} {\mathbf{X}}_{in} + \varepsilon_{in} $$
(1)
$$ Y_{{in}} = \left\{ {\begin{array}{*{20}c} {1,\quad y_{{in}}^{ * } > 0} \\ {0,\quad y_{{in}}^{ * } \le 0} \\ \end{array} } \right. $$
(2)

The subscript i denotes an individual bicyclist, n denotes the number of choices she/he had made along the chosen route, and y* is a latent variable that represents the utility difference between two alternatives. If y* is greater than zero, an on-street facility will be selected. Otherwise, an off-street facility will be selected. X is the vector of variables determining the discrete choice for observation n made by bicyclist i, β is the vector of estimated parameters, and ε is the random disturbance. The probability \( P_{n} \left( {Y_{in} = 1|x_{in} ,\varvec{\upbeta}} \right) \) that an on-street facility is chosen in the nth observation is represented in Eq. 3 as (Ben-Akiva and Lerman 1985):

$$ P_{n} \left( {Y_{in} = 1|x_{in} ,{\varvec{\upbeta}}} \right) = P\left( {y_{in}^{ * } > 0} \right) $$
(3)
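As a brief worked step, Eq. 3 has a familiar closed form under the standard binary logit assumption (stated here as an assumption of this note: the disturbance in Eq. 1 is the difference of two Gumbel terms and is therefore logistically distributed):

$$ P_{n} \left( {Y_{in} = 1|x_{in} ,{\varvec{\upbeta}}} \right) = \frac{{\exp \left( {\alpha + {\varvec{\upbeta }}^{\rm T} {\mathbf{X}}_{in} } \right)}}{{1 + \exp \left( {\alpha + {\varvec{\upbeta }}^{\rm T} {\mathbf{X}}_{in} } \right)}} $$

For example, a systematic utility difference of zero gives a probability of 0.5 for either facility type, and larger positive values push the probability of an on-street choice toward one.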

Because each respondent in our database generates multiple observations, these observations will likely share unobserved effects (unobserved characteristics relating to each respondent). These shared unobserved effects violate the assumption of independent disturbances made in Eq. 1 and, if not accounted for, can result in erroneous parameter estimates (Washington et al. 2011). A random effects approach can address this problem by adding individual random effect terms \( \phi_{i} \) (assumed to be normally distributed with mean 0 and variance \( \sigma^{2} \)).

$$ y_{in}^{ * } = \alpha + {\varvec{\upbeta}}_{i}^{\rm T} x_{in} + \varepsilon_{in} + \phi_{i} $$
(4)

Before we admit the random effects of unbalanced panel data generated by the bicyclists, we need to realize that there are also many other sources of unobserved heterogeneity. It is likely that different people behave differently given the same conditions, such as two people making different choices on the same segment. The same road may have several segments that may share unobserved effects. People may behave differently during a longer trip than during a shorter trip, etc. Unobserved heterogeneity invalidates the assumption that the effect of any individual explanatory variable is the same for everyone. This problem may be addressed by a more generalized random parameters approach, which assumes that the estimated parameters vary across the population. It allows individuals to have heterogeneous responses with respect to changes in an explanatory variable. To some extent, a random effects model is a special case of a random parameters model, by allowing the “constant” term to be random for each bicyclist. Based on the groups established by the random effects model, we estimate parameters for each bicyclist and the probabilities of on-street facility being chosen with a mixed model \( P_{n}^{r} \left( {y_{i} = 1|x_{in} } \right) \) are defined as

$$ P_{n}^{r} \left( {y_{i} = 1|x_{{in}} } \right) = \int\limits_{x} {P_{n} \left( {y_{i} = 1|x_{{in}} } \right)} f\left( {\varvec{\upbeta}|\phi} \right)d\varvec{\upbeta} $$
(5)

where \( f\left( {\varvec{\upbeta}|\phi} \right) \) is the density function of \( \varvec{\upbeta} \) with φ referring to a vector of parameters of that density function (mean and variance). The \( \varvec{{\upbeta}} \) now contains a common mean plus a randomly distributed term (e.g., a normally distributed term with mean zero and variance \( \sigma^{2} \)), representing each bicyclist’s deviation from the mean. \( P_{n}^{r} \left( {y_{i} = 1|x_{in} } \right) \) is approximated by drawing values of \( \varvec{{\upbeta}} \) from \( f(\varvec{{\upbeta}}|\phi ) \), given values of φ, and using these drawn values to estimate the random effects logit probability \( P_{n} \left( {Y_{in} = 1|x_{in} ,\varvec{{\upbeta}}} \right) \) (Washington et al. 2011). To estimate this model, a log-likelihood transformation is conducted to produce a LL function

$$ LL = \sum\limits_{n = 1}^{N} {\left[ {\delta_{in} \ln P_{n}^{r} \left( {y_{i} = 1|x_{in} } \right)} \right]} $$
(6)

where \( \delta_{in} \) is 1 if observation n is the choice of an on-street facility, and 0 otherwise. Eq. 6 is estimated by a simulation-based maximum likelihood method using either random draws or Halton draws (Halton 1960). More discussion of this estimation technique can be found in Bhat (2001, 2003) and Train (1999).
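The simulation step behind Eq. 5 can be sketched in a few lines. The code below is a minimal illustration rather than the estimation routine used in the study: it assumes independent normally distributed parameters, treats a single observation in isolation (a panel estimator would reuse the same draw for all observations of one bicyclist), and uses hypothetical function names. The two-parameter usage line borrows the constant and bicycle lane estimates quoted later in the text purely as sample inputs.

```python
import math
from statistics import NormalDist


def halton(index, base):
    """Return element number `index` (index >= 1) of the Halton sequence for a prime base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def logit_probability(utility_difference):
    """Binary logit probability that the on-street facility is chosen."""
    return 1.0 / (1.0 + math.exp(-utility_difference))


def simulated_probability(x, means, std_devs, n_draws=500):
    """Approximate the mixed logit probability by averaging over Halton draws.

    x, means and std_devs are equal-length lists (at most eight entries here);
    each parameter is assumed independently normal, which is a simplification.
    """
    primes = [2, 3, 5, 7, 11, 13, 17, 19]  # one Halton base per random parameter
    std_normal = NormalDist()
    total = 0.0
    for r in range(1, n_draws + 1):
        utility = 0.0
        for k, (x_k, mean, sd) in enumerate(zip(x, means, std_devs)):
            z = std_normal.inv_cdf(halton(r, primes[k]))  # uniform draw -> standard normal
            utility += (mean + sd * z) * x_k
        total += logit_probability(utility)
    return total / n_draws


# Illustrative call: a constant (x = 1) and a bicycle lane indicator set to 1.
p = simulated_probability([1.0, 1.0], [5.992, 1.392], [1.754, 5.092])
print(f"simulated on-street probability: {p:.3f}")
```

Because Halton points cover the unit interval more evenly than pseudo-random numbers, the average typically stabilizes with fewer draws, which is the efficiency argument attributed to Bhat in the text.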

In addition to parameter estimation, we calculated marginal effects for indicator variables and elasticity for continuous variables. There are two ways to calculate marginal effects: marginal effects at the means (MEMS) and average marginal effects (AME) (Bartus 2005). To calculate MEMS for an indicator variable, one needs to hold all the other variables at their means, or simply take the derivative with respect to the binary variable as if it were continuous, which provides an approximation that is often surprisingly accurate (Greene 2002). This method works well for linear models; however, in the logit model, AME is superior due to the nonlinearity of the outcome probabilities with respect to each variable and because MEMS suffers aggregation bias (Washington et al. 2011). Instead of computing only one marginal effect while setting other variables at their means, AME computes the marginal effect for each observation while other variables stay fixed and then takes the average. For an indicator variable \( x_{k} \), its AME would be computed as:

$$ \frac{{\Upsigma_{N} \left[ {P_{n} \left( {Y_{n} |x_{k} = 1} \right) - P_{n} \left( {Y_{n} |x_{k} = 0} \right)} \right]}}{N} $$
(7)

where N is the number of observations. Marginal effects for binary indicator variables are interpreted as a change in the outcome probability given this variable changing from zero to one. For a continuous variable, marginal effects are the outcome probability change given one unit change in that variable. However, it is sometimes difficult to define a meaningful “one unit” change in a continuous variable, so elasticity is adopted instead. Elasticity values are interpreted as the percent change in the outcome probability caused by a 1 % change in the targeted variable, which is shown as

$$ E_{{x_{k} }}^{P(n)} = \frac{\partial P(n)}{{\partial x_{k} }} \times \frac{{x_{k} }}{P(n)} $$
(8)

If the absolute value of the computed elasticity is greater than one, the variable is said to be elastic, meaning that a 1 % change in this variable will bring more than a 1 % change in the probability of an on-street facility being chosen. If the absolute value is less than one, the variable is said to be inelastic: a 1 % change in this variable would bring less than a 1 % change in the outcome probability.
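Eqs. 7 and 8 can be illustrated with a small sketch. It assumes a plain binary logit with fixed coefficients (not the random parameters model estimated here), for which the derivative in Eq. 8 is \( \beta_{k} P(1 - P) \), so each observation's elasticity reduces to \( \beta_{k} x_{k} (1 - P) \). The data, coefficients, and function names are hypothetical, and averaging the elasticity over observations is a choice made for this example.

```python
import math


def logit_probability(v):
    """Binary logit probability for utility difference v."""
    return 1.0 / (1.0 + math.exp(-v))


def utility(row, alpha, beta):
    """Linear utility difference alpha + beta' x for one observation."""
    return alpha + sum(b * x for b, x in zip(beta, row))


def average_marginal_effect(rows, alpha, beta, k):
    """Eq. 7: mean over observations of P(x_k = 1) - P(x_k = 0)."""
    total = 0.0
    for row in rows:
        on = row[:k] + [1.0] + row[k + 1:]
        off = row[:k] + [0.0] + row[k + 1:]
        total += (logit_probability(utility(on, alpha, beta))
                  - logit_probability(utility(off, alpha, beta)))
    return total / len(rows)


def average_elasticity(rows, alpha, beta, k):
    """Eq. 8 for a linear-utility logit: elasticity = beta_k * x_k * (1 - P)."""
    total = 0.0
    for row in rows:
        p = logit_probability(utility(row, alpha, beta))
        total += beta[k] * row[k] * (1.0 - p)
    return total / len(rows)


# Hypothetical data: column 0 is an indicator, column 1 a continuous variable.
rows = [[1.0, 2.5], [0.0, 3.0], [1.0, 1.5], [0.0, 4.0]]
alpha, beta = 1.0, [0.8, -0.4]

ame = average_marginal_effect(rows, alpha, beta, k=0)
elasticity = average_elasticity(rows, alpha, beta, k=1)
print(f"AME of indicator: {ame:.3f}; elasticity of continuous variable: {elasticity:.3f}")
```

Note that the AME toggles the indicator for every observation regardless of its observed value, which is what distinguishes it from evaluating a single effect at the sample means.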

Model estimation

Best model specification

Halton draws and random draws were both tested as part of a simulation-based maximum likelihood method, and 8,500 Halton draws were found to produce stable results. Random draws would produce results similar to Halton draws, but with a much larger number of simulations. This result is consistent with Bhat’s (2003) suggestion that using Halton draws is more efficient than random draws. Therefore, only the Halton draws’ results are presented. Likelihood ratio tests were conducted to determine whether the traditional random effects or more general random parameters model provided the best statistical fit. It was found that the mixed logit model provided a statistically superior fit in that the null hypothesis that the traditional random effects and the mixed models were statistically the same could be rejected with over 99.99 % confidence.
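The likelihood ratio comparison described above can be sketched as follows. The log-likelihood values are hypothetical placeholders, since none are quoted in this section, and the count of seven extra parameters is an assumption of this example (one additional standard deviation for each random parameter other than the constant). The chi-square tail probability is computed with the standard library only.

```python
import math


def chi2_survival(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # closed form for df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(ll_restricted, ll_unrestricted, extra_parameters):
    """Return the LR statistic and its p-value under the null of no difference."""
    statistic = -2.0 * (ll_restricted - ll_unrestricted)
    return statistic, chi2_survival(statistic, extra_parameters)


# Hypothetical log-likelihoods for the random effects and mixed logit models.
stat, p_value = likelihood_ratio_test(-420.0, -395.0, extra_parameters=7)
print(f"LR statistic = {stat:.1f}, p-value = {p_value:.2e}")
```

A p-value below 0.0001 would correspond to rejecting the null hypothesis with over 99.99 % confidence, the threshold cited in the text.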

To save space, only the results of the mixed logit model with parameters varying across respondents using Halton draws are presented in Table 3. In the mixed logit model, the t-stat for a parameter’s mean is not of interest. Instead, we care more about the standard deviation of the estimated parameters for each variable. If the standard deviation is significant, then the random parameter approach is warranted to help capture the unobserved heterogeneity.

Table 3 Estimation results (all random parameters are normally distributed)

Turning to the estimation results in Table 3, the parameters that were found to be random (varying across respondents) were the constant term, the bicycle lane indicator, the major arterial indicator, the minor arterial indicator, BCI, effective sidewalk width, the high average daily traffic indicator, and the good pavement condition indicator. All of these parameters were found to be normally distributed with their standard deviations significantly different from zero (log-normal, uniform, Weibull, exponential, and triangle distribution were also tried but provided inferior statistical fits relative to the normal distribution).

Interpretation of estimation results

The mixed logit model estimates provide information about how user characteristics and segment characteristics are associated with previously-defined cross-section choices and help capture unobserved heterogeneity. A positive parameter sign in Table 3 indicates that the corresponding factor makes a bicyclist more likely to use an on-street facility. Unlike the randomly distributed parameters listed above, the parameter for the “signalized intersection at the beginning of segment” indicator variable is fixed across the population, because the estimated standard deviation of this parameter distribution was not significantly different from zero at the 95 % confidence level. Marginal effects for indicator variables and the elasticity for continuous variables are displayed in Table 4. The percentages of the random parameter distributions that are above and below zero, given the estimated mean and standard deviation of the random parameters, are displayed in Table 5.

Table 4 Average marginal effects and elasticity
Table 5 Random parameters percent of distribution

A constant term is estimated for each bicyclist, to pick up the mean unobserved effects in the error term. Without it, the error term may not have a mean of zero, which violates the regression model assumption. The constant term in Table 3 is assumed to be normally distributed with a mean of 5.992 and a standard deviation of 1.754. Its sign is positive for nearly all bicyclists. This result suggests that, in our case study environment, nearly all bicyclists have an initial preference to ride on the street. This is because, under ideal conditions, such as not being threatened by high-volume or fast-moving vehicle traffic and having good curb lane pavement condition, bicyclists riding on the street can comfortably maintain higher speeds. However, as the bicycling environment departs from the ideal, a bicyclist may adjust her/his facility type choice correspondingly.

The bicycle lane indicator produces random parameters with a mean of 1.392 and a standard deviation of 5.092. Its positive mean value indicates that the introduction of a bicycle lane would encourage bicyclists to use it. The bicycle path variable has a fixed negative parameter, which suggests that adding a bicycle path would decrease, on average, by 0.29 the probability that an on-street facility would be selected. These two results are intuitive and in line with other researchers’ results (Krizek et al. 2009; Sener et al. 2009), because a bicycle lane or a bicycle path would provide bicyclists with a specified right-of-way. Based on a link-level analysis of bicyclist behavior, our results suggest that adding a bicycle lane would be viewed favorably by 61 % of bicyclists, increasing by 0.30 the probability that an on-street facility is chosen. The bicycle lane and bicycle path variables are demonstrated to have similar strength in terms of their AME—+0.30 and −0.29, respectively. However, when we look at individual choices (to be discussed in Sect. 5.3), the bicycle path is found to be more attractive. Although building bicycle lanes has been found to increase bicycle ridership (Krizek et al. 2009), being able to estimate the extent to which bicyclists would use a new bicycle lane or bicycle path can guide bicycle infrastructure investment at specific locations. Adding a bicycle lane without considering other factors like roadway characteristics and safety concerns may not be a good strategy. The models developed in this study permit the incorporation of multiple factors.

The estimated parameter for major arterial is normally distributed with a mean of −2.717 and a standard deviation of 6.270. This suggests that 67 % of bicyclists are more likely to choose the off-street facility on a major arterial. Similarly, the estimated parameter for minor arterial is normally distributed with a mean of 0.609 and a standard deviation of 8.118, which suggests that 53 % of bicyclists are more likely to use an on-street facility on a minor arterial. Riding on a major arterial decreases the probability that an on-street facility is chosen by 0.11. Meanwhile, riding on a minor arterial decreases the probability that an on-street facility is chosen by only 0.05. Higher class roads have higher vehicle speeds and higher traffic volumes, making a bicyclist feel less safe. This outcome suggests that an on-street bicycle facility investment along a higher functional class segment may not be suitable; rather, a bike path should be considered, if space permits.

The “BCI” variable yields a random parameter with a mean of −0.325 and a standard deviation of 1.031. This outcome suggests that 62 % of bicyclists are more likely to use an off-street facility when they feel less safe riding on the street. It is intuitive that most bicyclists avoid vehicle threats by switching to an off-street facility. However, it should be noted that 38 % of bicyclists are less sensitive to a dangerous riding environment. A 1 % increase in this variable would cause a 1.25 % decrease in the probability of on-street facility usage. This value suggests that BCI is elastic and that improving bicyclists’ perceived on-street safety level would keep many of them off the sidewalk.

The effective sidewalk width is significant at the 95 % confidence level and produces random parameters with a mean of −0.300 and a standard deviation of 0.923. This indicates that 63 % of bicyclists would be attracted by a wider sidewalk. The elasticity value shows that this variable is elastic; a 1 % increase in this variable value would increase by 1.93 % the probability that an off-street facility is chosen. However, wider sidewalks tend to exist where there are more pedestrians to serve. If the objective of the city planner is to keep bicyclists off the sidewalk, an attractive alternative for bicyclists should be provided. A bicycle path could be provided to avoid pedestrian and bicyclist conflicts if space permits.

A threshold value of 8,600 vpd for the “average daily traffic” indicator variable was chosen after several trials. A street with ADT greater than 8,600 produces a normally distributed parameter with a mean of −1.832 and a standard deviation of 6.881. This suggests that high ADT has a negative impact for 61 % of bicyclists and decreases by 0.07 the probability that an on-street facility is used. The threshold may vary between communities.

A bad “curb lane pavement condition” negatively influences a bicyclist’s decision to ride on the street. According to the PASER system (Walker 2002), if the pavement condition rating is greater than 6, only minor patching or routine maintenance is needed, which is good pavement surface performance. In our case, a good “pavement condition” indicator yields random parameters with a mean of 0.893 and a standard deviation of 5.224. This suggests that 57 % of bicyclists are more likely to use an on-street facility if the pavement condition is good or better. Improving pavement condition on the street can increase the likelihood of bicyclists leaving the sidewalk by 0.03.
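The shares of bicyclists quoted for the random parameters follow directly from the normal distribution: the fraction of the parameter distribution above zero is \( \Phi(\text{mean}/\text{sd}) \). The short sketch below recomputes a few of these shares from the means and standard deviations quoted in the text; the function name is hypothetical, and small differences from the published percentages can arise from rounding of the reported estimates.

```python
from statistics import NormalDist


def share_above_zero(mean, std_dev):
    """Fraction of a normal(mean, std_dev) parameter distribution that is positive."""
    return 1.0 - NormalDist(mean, std_dev).cdf(0.0)


# (mean, standard deviation) pairs as quoted in the text for Table 3.
parameters = {
    "bicycle lane": (1.392, 5.092),
    "major arterial": (-2.717, 6.270),
    "BCI": (-0.325, 1.031),
    "effective sidewalk width": (-0.300, 0.923),
}

for name, (mean, sd) in parameters.items():
    above = share_above_zero(mean, sd)
    print(f"{name}: {100 * above:.1f} % above zero, {100 * (1 - above):.1f} % below zero")
```

A positive share corresponds to bicyclists for whom the variable raises the likelihood of an on-street choice, and the complementary share to those for whom it lowers it.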

The “one-way against traffic” indicator variable has a negative parameter. This suggests that bicyclists would be more likely to use an off-street facility when they are using a one way street and riding against vehicle traffic. The probability that an off-street facility will be chosen by a bicyclist is 0.05 higher when traveling against the direction of vehicles. This result suggests that “contra-flow” bicycle movements are best accommodated with an off-street facility such as a bicycle path, if space is available. Providing an official contra-flow bicycle lane against a one-way street’s vehicle direction requires specific signs and markings (NACTO 2012), which are not present in our case, for example, along one-way northbound University Street.

The “segment length” variable yields a negative parameter, indicating that bicyclists are more likely to choose an off-street facility on a longer segment. This may be because a longer segment length is associated with higher exposure to vehicle traffic. The segment length variable is inelastic; a 1 % increase in segment length leads to only a 0.26 % decrease in the probability of an on-street facility being selected.

The “signalized intersection at end of segment” indicator variable produces a fixed negative parameter, indicating that bicyclists would prefer to use an off-street facility when they will confront a signalized intersection at the end of the segment (the probability of choosing an on-street facility is 0.03 lower). A traffic signal tends to be located at a busy intersection. The most common manifestation of the model results is the use of crosswalks and pedestrian signals by many bicyclists.

The “signalized intersection at beginning of segment” indicator variable yields a fixed negative parameter, indicating that facing a signalized intersection would decrease by 0.03 the likelihood that an on-street facility is chosen. Because traffic signals tend to be located at busy intersections, bicyclists tend to use crosswalks and pedestrian signals there, and the sidewalks they lead to.

Application of model results

From a city engineer’s perspective, we might want to measure the impact of adding a new bicycle path or bicycle lane along a segment, so that one can predict the probability that each bicycle facility within a cross-section will be chosen. Therefore, prediction analysis based on the model specification would provide more insight into the relative strength of adding a bicycle path or lane and give guidance on making bicycle facility improvements. To illustrate the potential usefulness of the model, the impacts of different facility improvement scenarios and strategies are considered below.

We examine the predicted probabilities with respect to “bicycle lane” and “bicycle path”, while other variables stay constant. The probability of an on-street facility being selected is predicted for each observation. To avoid losing any information for each observation, the sum of the predicted probabilities over all observations is used as the predicted number of on-street choices. In the sample database, the on-street facility was chosen 525 times and the off-street facility was chosen 406 times. According to the mixed logit model prediction, an on-street facility is estimated to be selected 559 times. Compared to the observed cases, the model somewhat overestimates the use of on-street facilities. However, the prediction error is within 5 % [(559 − 525)/931 = 3.6 %]. Under five different scenarios, the corresponding indicator variable is set to one or zero for all observations while other variables stay unchanged. The results are shown in Table 6.

Table 6 Model prediction with respect to different scenarios

If a bicycle lane were added to every segment, while holding other factors unchanged, bicyclists would switch to the bicycle lane in 607 − 559 = 48 cases. If all bicycle lanes were removed, on-street facilities would lose 232 users, which is 24.9 % of the 931 total observations. Similarly, if all segments were equipped with a bicycle path, while holding other variables constant, bicyclists would switch to off-street facilities in 268 cases. In the scenario in which all segments have both a bicycle lane and a bicycle path, while holding other factors constant, bicyclists would switch to off-street facilities in 207 cases. If we evaluated only the AME of the two variables, we would conclude that the two facilities are equally attractive. However, when we look at individuals’ predicted responses to both bicycle facilities, we learn that a bicycle path is more attractive than a bicycle lane. This result is in line with Stinson and Bhat (2005). The seemingly different results drawn from the AME and the prediction analysis can be explained by the large standard deviation of the parameter estimates. Different bicyclists perceive safety differently and have a large variance in facility preferences. This result also reveals the preferences of each bicyclist and demonstrates the strength of using a random parameters model, which can help capture unobserved heterogeneity. Furthermore, this aggregate prediction procedure puts into better perspective the potential effect of investments in certain bicycle facilities.
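The scenario arithmetic in this paragraph can be checked with a short Python sketch. It uses only the counts stated in the text and assumes that each stated count is a change relative to the 559 predicted on-street choices under existing conditions; the variable and function names are illustrative.

```python
TOTAL_OBS = 931           # on-street (525) plus off-street (406) observations
BASELINE_ON_STREET = 559  # model-predicted on-street choices, existing conditions

# Change in predicted on-street choices under each scenario, as stated in the text.
# Negative values are switches to off-street facilities (assumed interpretation).
scenario_change = {
    "bicycle lane on every segment": 607 - BASELINE_ON_STREET,
    "all bicycle lanes removed": -232,
    "bicycle path on every segment": -268,
    "lane and path on every segment": -207,
}


def share_of_sample(change, total=TOTAL_OBS):
    """Express a change in predicted choices as a percentage of all observations."""
    return 100.0 * change / total


for name, change in scenario_change.items():
    print(f"{name}: {change:+d} on-street choices "
          f"({share_of_sample(change):+.1f} % of the sample)")
```

Expressing every scenario on the same base of 931 observations makes the contrast visible: the gain from adding lanes everywhere is much smaller in magnitude than the shift produced by adding paths everywhere.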

Conclusion

This paper examines bicyclists’ choice of bicycle facility types within the street right-of-way cross-section, with the intent of developing a way to guide public investment decisions. The mixed logit model established in this paper can capture bicyclist preferences and unobserved heterogeneity. In general, bicycle facility planners would prefer to keep bicyclists off the sidewalk, perhaps by providing more bicycle lanes. Some of our study’s findings were expected:

  • Under ideal conditions, bicyclists prefer riding on the street.

  • A bicycle path is more attractive than a bicycle lane.

  • Bicyclists tend to use off-street facilities along higher functional class roads or streets with high ADT.

  • If a bicyclist’s feeling of safety (as measured by the BCI) improves, the use of on-street facilities increases.

  • Poor road surface conditions drive bicyclists to off-street facilities.

Other findings were less obvious:

  • In many situations, bicyclists use sidewalks, despite potential conflicts with pedestrians.

  • Adding a bicycle lane may not be the best use of limited bicycle facility funds.

The mixed logit model estimates provide information about how user characteristics and segment characteristics are associated with previously defined cross-section choices along a bicyclist’s route and help capture unobserved heterogeneity (indicated by the large standard deviation of the parameter estimates). In addition to parameter estimation, we calculated average marginal effects for indicator variables and elasticities for continuous variables. The results from such models, of course, may vary as the database varies, but the mixed logit model allows us to make full use of a database that includes multiple observations for each bicyclist during his or her trip. Given the large standard deviation of the parameter estimates, a prediction analysis was conducted to test different bicycle facility improvement scenarios. A bicycle path was found to be more attractive than a bicycle lane. Using the model results to test potential scenarios can help guide investments in bicycle facilities.