Introduction

People are increasingly using cell phones, tablets and laptop computers, mp3 players, and other mobile electronic devices while traveling to optimize the use of their travel time. Accordingly, wireless internet connectivity has been gradually introduced on many public transportation services (as well as airlines) to improve their attractiveness and attract new riders. However, even though onboard connectivity is considered by many users to be an important amenity, the specific impact of such connectivity on ridership is not necessarily easy to estimate.

On November 28, 2011, the rail service provider Amtrak California launched free Wi-Fi service on the California Capitol Corridor (CC). The 170-mile CC provides intercity rail service to eight Northern California counties (including feeder bus routes operated by California Thruway Services) and is the major public transportation option available on the transportation corridor between Sacramento (the State capital) and the San Francisco Bay Area. The CC currently serves 17 stationsFootnote 1 (Fig. 1). On all trains serving this corridor, passengers can connect to the web for business, personal or entertainment purposes,Footnote 2 accessing the free wireless internet connection from their own devices. The Capitol Corridor Joint Powers Authority (CCJPA) pays about $405,000 a year to provide the Wi-Fi service.

Fig. 1 Route of Capitol Corridor. Source http://www.capitolcorridor.org/route_and_schedules/, accessed May 13, 2013

The CCJPA and the University of California, Davis (UC Davis) launched a joint research study to investigate whether the introduction of Wi-Fi has influenced ridership on the CC, as part of a broader research program on the impact of multitasking while traveling conducted at UC Davis. This paper discusses the changes in passenger ridership associated with the introduction of Wi-Fi services, through the analysis of survey data collected on CC trains in March 2012.

After briefly reviewing previous related research in “Literature review” section, we describe the survey contents, data collection, and data preparation in “Data collection and preparation” section. “Weighting the sample” section discusses the need for and calculation of the weights for this dataset, and presents selected characteristics of the unweighted and weighted sample. Additional descriptive statistics are analyzed in “Descriptive analysis (weighted sample)” section, focusing on relationships between potential explanatory variables and expected 2012 trip frequency. In “2012 frequency model” section, we present a weighted regression model for the self-reported expected number of CC trips in 2012, and we estimate the change in the expected number of trips due to the availability of Wi-Fi. The last section provides a summary of the study and explores directions for future research.

Literature review

A number of studies have addressed the interactions between Information and Communication Technology (ICT) and travel behavior (e.g. Salomon 1986; Mokhtarian 2009; Choo and Mokhtarian 2005), and found that ICT has a more complex impact on travel behavior than had been originally assumed by many. One reason for this is that today ICT is not only used at home or at certain locations to substitute for making a trip, but mobile devices provide increased opportunities for performing activities “anywhere”, in particular during journeys (Aguilera and Guillot 2010). This can enable more effective use of otherwise “wasted” travel time (Lyons and Urry 2005; Schwieterman et al. 2009; Gripsrud and Hjorthol 2012; Lyons et al. 2007; Frei and Mahmassani 2011), which reduces the disutility of travel and thereby encourages greater mobility (Kenyon and Lyons 2007).

ICT, of course, can refer to a wide array of devices, which may or may not be connected to the internet at any given moment. An important subset of the literature on the role of ICT during travel has focused on accessing the internet while traveling—more specifically, on the impact of onboard wireless internet access, or Wi-Fi. Zhang et al. (2006) explored the impact of Wireless Internet Service (WIS) for Dutch business travelers on trains. The results showed that with WIS respondents perceived their travel time to become more useful and pleasant, and the quality and efficiency of their work during business trips to increase. Leonard (2007) described the benefits of providing Wi-Fi internet access for passengers and suggested that passengers’ travel experience could be thereby enhanced. Banerjee and Kanafani (2008) investigated the value of wireless internet connection on trains and indicated that the combination of work with travel increases the perceived utility of the trip and thus reduces the valuation of travel time savings, i.e. the amount travelers would be willing to pay to reduce their travel time. Connolly et al. (2009) found that males and passengers with longer trips derive a greater benefit from Wi-Fi internet access on public transit.

Thus, there is ample evidence that transit passengers value Wi-Fi access. But does Wi-Fi affect ridership? In particular, does it primarily increase the trip frequency of current riders, or does it also contribute to attracting new riders to transit services? Some studies indicate that wireless internet may have an important impact on the frequency of using transit. Fischer and Schwieterman (2011) administered a survey to bus riders waiting at curbside boarding locations in six Eastern and Midwestern cities in the U.S. The results showed that 52.1 % of passengers in the Eastern region and 43.1 % in the Midwest considered the availability of wireless internet an important factor when making their travel decisions. Another survey was conducted on three California intercity rail services in July 2005, to evaluate the willingness to use and to pay for an onboard internet connection. Of the 1,092 respondents who returned valid surveys, 606 were already traveling with Wi-Fi equipped devices. More than a third (36.1 %) of these equipped passengers declared that they would increase the number of trips by train if the service were introduced (Kanafani et al. 2006). A third survey was conducted on the Alameda County (AC) Transit Transbay service a few months after Wi-Fi had been introduced to that service. Out of 725 respondents, 46 % had used the Wi-Fi service and 41 % of those reported that they had increased their use of the Transbay service because of the availability of Wi-Fi. Among Wi-Fi users, 39 % said that Wi-Fi was a major factor that affected their decision to start using the Transbay service (Twichell et al. 2008). This indicates that Wi-Fi may also help create new riders.

The present study adds to the literature demonstrating an impact of Wi-Fi on transit ridership. Its contribution lies in the design of a survey that did not initially highlight the importance of Wi-Fi to the study (which would potentially bias responses), the development of a model estimating the impact of Wi-Fi after controlling for other factors, the segmentation of the model to account for substantially different impacts based on different market segments, and the development of sample weights to account for the bias inherent to the onboard sampling approach.

Data collection and preparation

The 4-page survey that was distributed in this study includes three main parts:

  • Part A collects information about the use of the CC train service, including trip frequency during 2011 (the previous year), and details about the actual trip during which the survey was distributed (including boarding/disembarkation stations, boarding time and trip purpose, presence of companions, and transportation alternatives for the current trip). Wi-Fi was deliberately not mentioned in this part, to reduce nonresponse and response biases: had we stressed that the main purpose of the study was to evaluate the impacts of Wi-Fi, those for whom Wi-Fi had little impact would be less inclined to complete the survey, and those who completed the survey would be more influenced to attribute an effect to Wi-Fi.

  • Part B explores riders’ experience and opinions regarding free Wi-Fi on board the CC: familiarity with the service, whether they accessed the service or not, and their evaluation of the service. This section also asks for passengers’ expected frequency of trips on the CC during 2012, the key dependent variable of this study.

  • Part C obtains socio-demographic data, including gender, age, occupation, employment, education, auto ownership, household size, and income.

Paper copies of the survey were distributed to riders on the CC on the three working days (Tuesday–Thursday) of March 6–8, 2012. We focused on weekdays because regular commuters and business travelers are an important component of the CC ridership. About 70 % of the train runs during these three working days were covered by the data collection. This allowed representing most operating times, including peak and off-peak periods, and both directions. More details on the data collection for this study and a copy of the survey, together with tabulated responses, can be found in Mokhtarian et al. (2013).

A proper response rate cannot be calculated, since we do not know how many passengers declined to complete the survey and how many of these were frequent travelers who had already completed it on a previous trip. However, we estimate that about 40 % of the passengers offered the survey declined to answer it, for any of several reasons including having completed it during a previous trip. In all, 1,627 completed surveys were collected. Fifty-one surveys were excluded as either ineligible (respondent under age 18), too incomplete, or apparently frivolous or having major inconsistencies, for a final dataset of 1,576 valid cases.

CC trip frequencies in both 2011 and 2012 were measured with six response categories in the survey. Reporting frequencies only in terms of categories was obviously easier for the respondent; however, we transformed the categorical frequency responses to a continuous scale, as is frequently done in similar cases (e.g. Bhat 1994). This was due to both (1) the need for a specific number of 2012 trips, which was used in the calculation of weights (as explained in “Weighting the sample” section), and (2) the unsatisfactory results from the estimation of models using ordered categories. Specifically, we assumed that the “Not at all” category (“not counting the present trip” in 2012), represents zero trips in 2011 and one (the current) trip in 2012. With respect to the category “Less than once a month”, for 2011 the respondent was given a blank in which to write the exact number of trips. We used that exact number as the response for 2011.Footnote 3 If they responded “Less than once a month” for both years, we used the same number for 2012 that they reported for 2011. The numbers of trips in other categories (“1–3 times a month”, “1–2 times a week”, “3–4 times a week”, and “5 or more times a week”) were initially obtained by taking the midpoint of each category. For example, “1–3 times a month” represents 24 trips a year. We assumed that respondents use CC for about 48 weeks per year (to allow for vacations, holidays, and personal leave), so that the last three categories respectively correspond to 72, 168 and 240 trips a year.
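The midpoint conversion described above can be sketched in Python as follows. This is only an illustration of the arithmetic: the dictionary keys and the function name `annual_trips` are hypothetical, and the open-ended top category is treated as exactly five trips a week, which is what the 240-trip figure in the text implies.

```python
# Annualize categorical trip-frequency responses using category midpoints.
WEEKS_PER_YEAR = 48    # stated in the text: allows for vacations, holidays, leave
MONTHS_PER_YEAR = 12

# (low, high, periods per year) for each category; the keys are hypothetical labels.
CATEGORY_RANGES = {
    "1-3 times a month": (1, 3, MONTHS_PER_YEAR),
    "1-2 times a week": (1, 2, WEEKS_PER_YEAR),
    "3-4 times a week": (3, 4, WEEKS_PER_YEAR),
    # Open-ended category: assumed to mean 5 trips a week, matching the text's 240.
    "5 or more times a week": (5, 5, WEEKS_PER_YEAR),
}


def annual_trips(category: str) -> float:
    """Return the midpoint of the category, scaled to trips per year."""
    low, high, periods = CATEGORY_RANGES[category]
    return (low + high) / 2 * periods


for name in CATEGORY_RANGES:
    print(name, annual_trips(name))
```

The “Not at all” and “Less than once a month” categories are not covered here, because the text assigns them from the current trip and from the respondent's written-in count rather than from a midpoint.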

Although using the midpoint of each category is reasonable in the absence of any other information, in some cases we had other information that required an adjustment of the value assigned to a given individual’s response. For example, the survey asked, “Compared to your frequency in 2011, which of the following factors (if any) are changing the frequency with which you expect to use CC in 2012? Please check all that apply”. The responses offered were, “job location change”, “home location change”, “change in preferences”, “change in auto ownership”, “free Wi-Fi”, and “other: ___” as well as “I do not expect my frequency to change”. In 248 cases, respondents checked one or more reasons for a frequency “change” but reported the same trip frequency category for 2012 as for 2011. We presumed that in those cases the frequency changed within category, even if we did not know by how much. However, for all reasons except Wi-Fi, we cannot ascertain the direction for this change, and therefore, unless Wi-Fi was also checked, we left 2011 and 2012 frequencies unchanged for those cases.

On the other hand, if Wi-Fi was given as a reason, we can infer the direction of change from a separate question in the survey. Among the 111 (unweighted) respondents who indicated Wi-Fi as a reason for their frequency change, 49 reported the same trip frequency category for 2012 as for 2011, but 47 of those 49 respondents indicated that without free Wi-Fi they would use CC less often than their reported expected frequency in 2012—and therefore are considered to be increasing their frequency due to Wi-Fi. Only two respondents stated that they would use CC more without Wi-Fi. Thus, there are 49 change-within-category cases for which the direction of change could be determined.Footnote 4 To represent a within-category change, we arbitrarily took the difference between the 25th percentile and the 75th percentile of the frequency range for that category, with the sign of the difference reflecting an increase or decrease.

Weighting the sample

The need for weighting and calculation of weights

We distributed surveys to the CC passengers in most operating times and both directions on three working days. Using this approach yielded a somewhatFootnote 5 random sample of person-trips, but a far less random sample of passengers. To see this, imagine the list of all passengers riding CC during 2012. In our 3-day sampling period, we are much more likely to intercept higher-frequency riders on that comprehensive list than lower-frequency riders. Many lower-frequency riders will only be taking the train outside that 3-day period, and therefore could not be sampled at all, whereas a sizable proportion of the higher-frequency riders will be on the train sometime during that period. Accordingly, the resulting sample over-represents higher-frequency passengers and under-represents lower-frequency passengers.

However, when we evaluate the impact of Wi-Fi on ridership, we would like to base the analysis on a random sample of passengers rather than person-trips, especially since we would like to evaluate the impact on various segments of the population of riders. To produce a representative sample of passengers, we needed to weight the sample to mimic the conditions of a simple random sample, in which each member of the population has an equal probability of being sampled. Thus, we needed to deflate the weight of higher-frequency passengers and inflate the weight of lower-frequency passengers. Since we conducted our sampling in 2012, the weights should be calculated based on respondents’ indicated 2012 trip frequency. As there are 35 missing responses for this question, only 1,541 cases are used in the calculation of weights.

To obtain the weights for each category \(j\), we calculated the probability \(P_{j}\) that a respondent in that frequency category was sampled. As a slight simplification, we assumed we covered all passengers traveling on the 3 days we sampled. Therefore, the probability that we surveyed passengers in the higher two categories (“3–4 times a week” and “5 or more times a week”) is 1 since they took the train at least once within those three days. \(P_{j}\) for the “1–2 times a week” frequency category is calculated by taking the average of the respective probabilities of sampling passengers traveling once a week and twice a week. Passengers traveling on the CC only once a weekFootnote 6 have a 60 % chance (\(P_{j} = 0.6\)) of being sampled during the 3 weekdays of data collection. For those who travel twice a week, there are nine out of tenFootnote 7 equally-likely situations in which they would have been intercepted. Accordingly, for the category “1–2 times a week,” \(P_{j} = (0.6 + 0.9)/2 = 0.75\).
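The 0.6 and 0.9 figures can be reproduced by enumerating the equally likely combinations of travel days within a five-day work week. The sketch below assumes that a rider travels on distinct weekdays and that every combination of days is equally likely; the names `SURVEY_DAYS` and `prob_intercepted` are illustrative only.

```python
from itertools import combinations

WEEKDAYS = range(5)        # Monday=0 ... Friday=4
SURVEY_DAYS = {1, 2, 3}    # Tuesday-Thursday, the three data collection days


def prob_intercepted(travel_days_per_week: int) -> float:
    """Share of equally likely day combinations that include a survey day."""
    combos = list(combinations(WEEKDAYS, travel_days_per_week))
    hits = sum(1 for days in combos if SURVEY_DAYS & set(days))
    return hits / len(combos)


p_once = prob_intercepted(1)              # 3 of the 5 single days are survey days
p_twice = prob_intercepted(2)             # only the Monday-Friday pair is missed
p_category = (p_once + p_twice) / 2       # average used for "1-2 times a week"
print(p_once, p_twice, p_category)
```

With three or more travel days a week, at least one of them must fall on Tuesday through Thursday, which is why the two highest categories are assigned a probability of 1.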

For the higher frequencies of once a week or more, we assumed this represented a regular schedule with fairly uniform spacing of trips. For the lower-frequency categories “Less than once a month” and “1–3 times a month” we allowed for the possibility that rides could be clustered, resulting in multiple trips by such a passenger during the survey week. In that case, we have

$$P_{j} = \sum\limits_{i = 0}^{5} P_{|i} P_{i(j)} \tag{1}$$

where \(P_{|i}\) = Pr[being sampled \(|\) a passenger rode \(i\) times in the survey week], and \(P_{i(j)}\) = Pr[a passenger in category \(j\) rode \(i\) times in the survey week], \(i = 0, 1, \ldots, 5\).

\(P_{i(j)}\) is determined using the hypergeometric distribution. We assume 48 weeks a year, so there are 240 workdays a year:

$$P_{i(j)} = \frac{\binom{5}{i}\binom{235}{F_{j} - i}}{\binom{240}{F_{j}}},\quad i = 0,1, \ldots ,5, \tag{2}$$

where \(F_{j}\) = the average number of 2012 trips taken by each respondent in category \(j\). Since \(F_{j}\) for the “Less than once a month” category is non-integer (= 4.86), we compute \(P_{i(j)}\) for the adjacent two integers \(F_{j} = 4\)Footnote 8 and \(F_{j} = 5\) first, and then take the weighted average of the two outcomes. Based on the previous discussion, \(P_{|i}\), the probability of being sampled given that a passenger rode \(i\) times in the survey week, is 0 for \(i = 0\), 0.6 for \(i = 1\), 0.9 for \(i = 2\), and 1.0 for \(i \ge 3\).
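Equations (1) and (2) can be evaluated with a few lines of Python. The sketch below is a minimal illustration, not the computation actually used for Table 1: the function names are hypothetical, and the treatment of a non-integer \(F_{j}\) assumes that the “weighted average” is a linear interpolation between the two adjacent integers.

```python
from math import comb

WORKDAYS_PER_YEAR = 240   # 48 weeks x 5 workdays, as assumed in the text
SURVEY_WEEK_DAYS = 5

# P_{|i}: probability of being sampled given i rides in the survey week (from the text).
P_SAMPLED_GIVEN_I = {0: 0.0, 1: 0.6, 2: 0.9, 3: 1.0, 4: 1.0, 5: 1.0}


def p_rides_in_week(i: int, f: int) -> float:
    """Eq. (2): hypergeometric probability of i rides in the survey week, given f rides a year."""
    return (comb(SURVEY_WEEK_DAYS, i)
            * comb(WORKDAYS_PER_YEAR - SURVEY_WEEK_DAYS, f - i)
            / comb(WORKDAYS_PER_YEAR, f))


def p_sampled(f: int) -> float:
    """Eq. (1) for an integer annual frequency f (i cannot exceed f)."""
    return sum(P_SAMPLED_GIVEN_I[i] * p_rides_in_week(i, f)
               for i in range(min(f, SURVEY_WEEK_DAYS) + 1))


def p_sampled_noninteger(f: float) -> float:
    """Assumed reading of the 'weighted average': interpolate between adjacent integers."""
    lower = int(f)
    frac = f - lower
    return (1 - frac) * p_sampled(lower) + frac * p_sampled(lower + 1)


print(p_sampled_noninteger(4.86))   # F_j reported for "Less than once a month"
```

The sum in `p_sampled` stops at `min(f, 5)` because a rider with fewer than five trips a year cannot ride more often than that in any one week.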

To calculate the weights (\(W_{j}\)), we first computed \(I_{j}\) as the inverse of \(P_{j}\) for each category, then we normalized \(I_{j}\) so that the sum of weighted cases, \(\sum\nolimits_{j} {W_{j} T_{j}^{12} }\), would equal the total number of original cases in the sample, \(\sum\nolimits_{j} {T_{j}^{12} }\). The weight for each category \(j\) can be expressed as:

$$W_{j} = I_{j} \frac{\sum\nolimits_{j} T_{j}^{12}}{\sum\nolimits_{j} T_{j}^{12} I_{j}}, \tag{3}$$

where \(T_{j}^{12}\) = the (unweighted) number of cases in the 2012 frequency category \(j\); and \(I_{j} = 1/P_{j}\).
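The normalization in Eq. (3) can be illustrated with a short Python sketch. The sampling probabilities and case counts below are made-up values chosen only to show the mechanics; they are not the figures in Table 1, and the function name `normalized_weights` is hypothetical.

```python
def normalized_weights(p: dict[str, float], counts: dict[str, int]) -> dict[str, float]:
    """Eq. (3): inverse-probability weights rescaled to preserve the sample size."""
    inverse = {j: 1.0 / p[j] for j in p}                       # I_j = 1 / P_j
    total_cases = sum(counts.values())                         # sum of T_j
    total_inflated = sum(counts[j] * inverse[j] for j in counts)
    scale = total_cases / total_inflated
    return {j: inverse[j] * scale for j in p}


# Hypothetical inputs for illustration only (not the values in Table 1).
p = {"low": 0.05, "mid": 0.75, "high": 1.0}
counts = {"low": 400, "mid": 300, "high": 300}

weights = normalized_weights(p, counts)
weighted_total = sum(weights[j] * counts[j] for j in counts)
print(weights, weighted_total)
```

By construction the weighted number of cases equals the unweighted total, while categories with a lower sampling probability receive a larger weight.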

After calculating the weights for the six categories applying the method above, however, we found that the cases in the lowest frequency category (“Not at all”, which was initially assigned \(F_{j} = 1\), representing the current trip, and had a sampling probability \(P_{j} = 1/240\)) were consequently weighted too heavily and caused nonsensical results. Such outcomes are common when developing weights, and a standard remedy is to combine two (or more) categories (Little 1993; Brick and Kalton 1996). Therefore, we collapsed the lowest two categories and calculated a new average \(F_{j}\) for this combined category. The final weights for each category are listed in Table 1, together with the corresponding unweighted and weighted sample sizes. Comparing the two distributions shows that, for example, only 22 % of all weekday trips are made by one-time and less than once a month riders, but 78 % of all weekday passengers fall into that category. At the other frequency extreme, it can be said that 27 % of the trips are made by only 5 % of the riders.Footnote 9

Table 1 Calculation of weights for each 2012 frequency category (N = 1,541)

Weighted sample characteristics

Table 2 presents the weighted sample statistics for some selected characteristics. Males comprise 53.4 % of the respondents who reported their gender. Nearly half of the respondents are in professional/technical occupations and one-third are salaried workers. More than a third of the respondents have graduate degree(s), and nearly a quarter have an annual household income higher than $125,000. These traits might be partially a result of some response biases in completing the survey, but they are also consistent with the expected population of users of this service.Footnote 10 Nearly one-third of the respondents traveled for social/entertainment/recreation purposes. The “average” characteristics of a respondent are: male, around 43 years old, college graduate, in a household with 2.8 members, 1.4 cars, having an annual household income of $75,000–$99,999, and with an average trip frequency in 2011 of about two trips a month.

Table 2 Selected characteristics of the sample

The corresponding characteristics for the sample without weighting are also shown in Table 2. Many sociodemographic features are similar, but after weighting, the sample has higher shares of (signifying that lower-frequency riders are more likely to be) women; people younger than 25 or older than 64; students; self-employed or retired/not working; less well-educated; lower-income; and owning fewer vehicles. The weighted sample also has a far lower share of commute trips: 65 % of all (weekday) trips on CC are commute trips, but only 23 % of all passengers (from the imaginary list of all weekday passengers riding CC in 2012) commute on CC. A final notable difference is that the 2011 average number of trips is far higher in the sample before weighting, which is also not surprising in view of the strong (0.837) correlation between the 2011 and 2012 trip frequencies.

All subsequent analyses are conducted on the weighted sample unless otherwise noted.

Descriptive analysis (weighted sample)

2012 frequency against conventional predictors of train ridership

Table 3 shows that CC trip frequencies appear to be relatively stable for continuing riders, especially at the extremes of the frequency distribution: more than 78 % of passengers who traveled less than once a month or 3 or more times a week in 2011 remained in the same category in 2012. Not surprisingly, commuters reported higher expected trip frequencies in 2012 than those traveling for any other purpose, with 48.4 % of them planning to travel once a week or more during 2012. Salaried, hourly wage, or contract workers expect to ride CC more frequently than others.

Table 3 Cross-tabulation of expected trip frequency in 2012 with conventional factors

Wi-Fi variables and other reasons for frequency changes

The survey offered six reasons for a frequency change as well as an “other” option. Respondents were invited to “check all that apply”: 9.8 % of the passengers chose “job location change” as one reason; the other reasons are “home location change” (8.9 %), “change in preferences” (6.3 %), “change in auto ownership” (2.6 %), “free Wi-Fi” (8.7 %), and “other: ___” (10.6 %). Nearly three-fifths (59.2 %) of the passengers indicated that “I do not expect my frequency to change”. Wi-Fi was deliberately placed late in the list to minimize any bias toward conforming to the presumably desired answer.Footnote 11 Table 4 cross-tabulates the role of free Wi-Fi against expected 2012 trip frequency, disaggregated by 2011 frequency segment.

Table 4 Impact of Wi-Fi for different traveler segments (weighted sample; column percentages, by segment)

Among the 133 passengers (8.7 %) who gave free Wi-Fi as a reason for a trip frequency change, 110 (7.1 % of the weighted sample of 1,541 cases) reported increasing their CC ridership frequency due to Wi-Fi.Footnote 12 This is considerably lower than the roughly 20 % of all respondents (36.1 % of the 606 equipped passengers, out of 1,092 valid surveys) who expected to ride more in the prospective survey of Kanafani et al. (2006). Wi-Fi appears to have the largest positive impact on the expected 2012 frequency for new riders: for that segment, 72.5 % of those who reported Wi-Fi as influencing their 2012 frequency expect to use CC for more than just the current trip in 2012, compared with fewer than 60 % of the new riders who did not see Wi-Fi the same way.

2012 frequency model

Model comparison and selection

We tested several model specifications for 2012 expected trip frequency. We discarded the initial specification, an ordinal logit model, because of its low predictive power and poorer-than-expected goodness of fit (one reason being that it could not account for changes within frequency category). Ultimately, a multiple linear regression model on a transformation of the categorical dependent variable appeared to fit the data best. To estimate the model, we initially created a continuous dependent variable by using the midpoint of each frequency category, except where we used the 25th or 75th percentile of the range to account for a frequency change within category, as discussed in “Data collection and preparation” section. We experimented with a natural log and other nonlinear transformations of the dependent variable, but eventually chose the raw frequency as most appropriate.

We first estimated a multiple regression model based on the pooled data. However, given the differences among the segments shown in Table 4, we also estimated best models for each segment separately. Finally, we combined those three specifications into a single model so that a composite \(R^{2}\) measure could be computed. Including a full set of segment-specific variables in the model allows coefficients to differ across segments. Discarding insignificant variables led to the final segmented model shown in Table 5.

Table 5 Weighted segmented model of 2012 expected CC trip frequency (N = 1,448)

The adjusted \(R^{2}\) of the segmented model (0.791) is higher than that for the pooled model (0.780, model not shown), confirming its superior fit. Even more importantly, the dramatic differences in estimated coefficients across segments provide strong post hoc justification of the segmentation strategy.

Interpretation and discussion of the model coefficients

Seven variables plus the constant appear in the final preferred model (each with up to three segment-specific versions): five “conventional” variables and two reasons for changing frequency, including Wi-Fi. As already evident from the descriptive analysis, Trips in 2011 (also transformed from frequency categories to continuous) has a strong impact on 2012 trip frequency; we now see that this result remains after controlling for other explanatory factors. For both lower-frequency and higher-frequency riders, the more trips passengers took in 2011, the more trips they expect to take in 2012. For these riders, all other coefficients represent effects of additional variables after controlling for 2011 trip frequency.

Commuting is the only statistically significant trip purpose (for both new riders and lower-frequency riders) in the segmented model: commuters use the train more often than those traveling for personal/social and entertainment purposes. Commuting is not significant for higher-frequency riders, probably due to the significant correlation (0.32) between the Commuting and Salaried variables.Footnote 13 We also tested the inclusion of a dummy variable indicating whether the passengers boarded during the peak hour. It turned out to be insignificant when the Commuting variable was present (the 0.37 correlation between these variables is significant, and about 83 % of the passengers boarding during the peak hour were commuters).

Station-to-station distance is statistically significant only for new riders, with the negative sign meaning the longer the distance the fewer the trips. This is the result of an interaction with trip purpose: the average distance of new riders traveling for social/entertainment purposes is longer than that of commuters, but their expected trip frequency is much lower.Footnote 14

The indicators for Salaried and Hourly wage workers are statistically significant and positive for higher-frequency riders: both types of workers tend to use the service more frequently than self-employed higher-frequency riders (self-employed workers might have a stronger need to rely on their own vehicles for their mobility). Moreover, salaried higher-frequency riders were far more likely to be commuting (69.7 %) than their self-employed counterparts (3.4 %), so the Salaried variable reflects the expected relationship that commuters use the CC more frequently than those traveling for other purposes. Income is also typically positively associated with trip generation, and salaried higher-frequency riders have higher average annual incomes ($96,436) than self-employed higher-frequency riders ($82,087).

Dummy variables were allowed in the model to check the relative importance of passengers’ reasons for changing frequency [for example, 35 % of the 122 passengersFootnote 15 giving Wi-Fi as a reason also gave (an)other reason(s)]. Only two variables indicating reasons for changing the expected frequency in 2012 are ultimately significant: “Job location change”, and “Free Wi-Fi”. Job location change has a positive impact on 2012 trips for new riders but a negative impact for the higher-frequency riders.Footnote 16 The Free Wi-Fi variable exhibits moderately-to-strongly significant and positive coefficients across all three segments, indicating that the service plays a substantive role in increasing trip frequency for both new and continuing riders.

Estimation of the impact of free Wi-Fi

To estimate the impact of free Wi-Fi, we first use the model from Table 5 to calculate the predicted value of the dependent variable for each case. Then, we turn all values for the Wi-Fi variables to 0 to switch off their effect, and recalculate how many trips would have been made without the influence of Wi-Fi. The difference between those two numbers is an approximate measure of the impact of Wi-Fi on the estimated number of trips in 2012. Results of the calculation are shown in Table 6.
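The switch-off calculation described above can be sketched as follows. This is an illustration under stated assumptions: the coefficients in `COEFS` are hypothetical placeholders rather than the Table 5 estimates, the model is reduced to a constant, 2011 trips, and a single Wi-Fi dummy, and the percentage change is taken relative to the no-Wi-Fi total.

```python
# Hypothetical coefficients for illustration; NOT the estimates reported in Table 5.
COEFS = {"const": 2.0, "trips_2011": 0.9, "wifi": 5.0}


def predict(case: dict, coefs: dict = COEFS) -> float:
    """Predicted 2012 trips from a simplified linear model."""
    return (coefs["const"]
            + coefs["trips_2011"] * case["trips_2011"]
            + coefs["wifi"] * case["wifi"])


def wifi_impact(cases: list, coefs: dict = COEFS) -> tuple:
    """Weighted predicted trips with Wi-Fi, with the Wi-Fi dummy set to 0, and the relative change."""
    with_wifi = sum(c["weight"] * predict(c, coefs) for c in cases)
    without_wifi = sum(c["weight"] * predict({**c, "wifi": 0}, coefs) for c in cases)
    return with_wifi, without_wifi, (with_wifi - without_wifi) / without_wifi


# Two made-up cases: one citing Wi-Fi as a reason, one not.
cases = [
    {"trips_2011": 10, "wifi": 1, "weight": 1.0},
    {"trips_2011": 20, "wifi": 0, "weight": 2.0},
]
print(wifi_impact(cases))
```

In the segmented model the same logic would apply with segment-specific Wi-Fi coefficients, so that the difference can be reported for each segment as well as for the pooled sample.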

Table 6 Impact of free Wi-Fi on (projected) 2012 ridership

For the sample as a whole, the sum of the estimated trips considering the effects of Wi-Fi is 38,620, with 37,596 estimated trips if Wi-Fi had no influence. The difference (increase) in the estimated number of total trips for 2012 due to Wi-Fi amounts to about 2.7 %. However, the effect of free Wi-Fi differs among the various segments. Overall, new riders are the most influenced by the availability of free Wi-Fi, and this suggests that the service had a useful role in attracting new riders to CC. The higher-frequency continuing riders are the least influenced, probably because they travel more frequently as a baseline condition. The absolute number of new trips they contribute is large relative to the size of their group and to the degree of impact of Wi-Fi. However, the “rate of impact” of Wi-Fi on lower-frequency riders is higher because they have more “room” to adjust their travel frequency, whereas many or most higher-frequency riders may already have “maxed out” the frequency that is possible or desirable for them.Footnote 17
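As a check on the arithmetic, the overall percentage follows directly from the two totals reported above, taking the no-Wi-Fi total as the base (an assumption consistent with the wording used in the Conclusions):

$$\frac{38{,}620 - 37{,}596}{37{,}596} = \frac{1{,}024}{37{,}596} \approx 0.027$$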

It is reasonable to question the sensitivity of the results to the decision to use the category midpoints for the two frequency variables. To assess that sensitivity, we repeated the analysis using random numbers uniformly distributed over each frequency interval. A total of 30 sets of random numbers were generated and used for re-estimating the model. This approach provided 30 additional sets of percentage increases for the pooled sample and the three segments. The mean values of the percentage increase in ridership obtained from the 30 random extractions (to be compared to the results in Table 6) respectively are 2.70, 8.53, 6.15 and 1.05 %, which are very similar to the results calculated using the mid-point for each frequency category. The respective ranges of the results obtained are 2.53–2.86, 7.88–9.33, 5.00–7.30, and 0.96–1.13 %. Thus, using the midpoint of each frequency category does not have a substantial impact on the results.
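The random-draw step of this sensitivity check could be sketched as below. The interval bounds are assumptions obtained by annualizing the category limits with 12 months and 48 weeks; the open-ended “5 or more times a week” category is omitted because its upper bound is not stated, and the function names are illustrative.

```python
import random

# Assumed annual-trip intervals for the bounded categories (12 months, 48 weeks a year).
INTERVALS = {
    "1-3 times a month": (12, 36),
    "1-2 times a week": (48, 96),
    "3-4 times a week": (144, 192),
}


def draw_frequencies(categories: list, rng: random.Random) -> list:
    """Replace each categorical response with a uniform draw from its interval."""
    return [rng.uniform(*INTERVALS[c]) for c in categories]


def replicate(categories: list, n_sets: int = 30, seed: int = 0) -> list:
    """Generate n_sets alternative continuous codings of the same responses."""
    rng = random.Random(seed)
    return [draw_frequencies(categories, rng) for _ in range(n_sets)]


responses = ["1-3 times a month", "3-4 times a week", "1-2 times a week"]
draws = replicate(responses)
print(len(draws), draws[0])
```

Each of the 30 sets would then be used to re-estimate the model and recompute the percentage increases, giving the means and ranges quoted above.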

It is also of interest to explicitly compare these results with those from the unweighted sample, using a similar frequency model and computation method for that case. The details for these calculations are available in the Appendix of Mokhtarian et al. (2013). Although the impacts for lower-frequency riders and higher-frequency riders are roughly similar, the estimated overall impact of Wi-Fi in the weighted sample is twice as high as in the unweighted sample. Further, the impact of Wi-Fi on new riders becomes significant after weighting the data. These results demonstrate how different the outcomes can be when sampling on a trip basis rather than on a passenger basis.

Conclusions

This study analyzes the impact of free Wi-Fi on the expected train ridership for the California CC using data collected through an onboard passenger survey. We analyze the impact of Wi-Fi on three main segments in the sample, based on respondents’ 2011 trip frequencies: new riders; lower-frequency continuing riders (riding in 2011, but less than once a week); and higher-frequency continuing riders (riding once a week or more in 2011). A linear regression model based on these three segments was built to better understand the impact of selected variables on the expected number of trips in 2012. Seven significant variables (each with up to three segment-specific versions) appear in the model: five conventional variables (trip frequency in 2011, commuting purpose, distance between stations, and two variables identifying the employment category); and two variables indicating reasons for changing trip frequency (the availability of free Wi-Fi as well as job location changes). Past trip frequency is the most important predictor of future frequency: the more frequently respondents used CC in 2011, the more frequently they expect to do so in 2012.

The impact of free Wi-Fi on 2012 trip frequency is statistically significant and positive across all three segments. Using the estimated parameters from the model, the number of trips the sample expects to make in 2012 is 2.7 % higher than it would have been if free Wi-Fi had no impact. Although this effect is limited in magnitude, it still constitutes an example of how ICT can facilitate and generate travel. In this case, however, the generation may be mode-specific: we do not know how much of this travel is drawn from other modes as opposed to constituting entirely new trips.

The effect of Wi-Fi is strongest among new riders, who are estimated to make 8.6 % more trips in 2012 than if Wi-Fi were not available. The corresponding increases are 6.2 % and 1.0 % for lower- and higher-frequency continuing riders, respectively. This is a reasonable result, as it is more likely that higher-frequency riders have already maximized their use of the CC service as far as is practical, while lower-frequency riders might still have more room to increase trip frequency.

By its nature, the onboard sampling approach overrepresented riders with higher trip frequencies in 2012, and thus it was necessary to weight the sample to more appropriately represent the population of passengers rather than person-trips. The overall estimated impact of Wi-Fi is more than twice as large in the weighted sample as in the unweighted sample. These results point to the importance of considering the sampling bias inherent in onboard surveys of any kind, and to the need to weight the sample to minimize that bias whenever the passenger rather than the trip is the desired sampling unit.

This study has several key limitations. Because it relies on an onboard survey, it cannot account for passengers who discontinued riding after 2011. It also relies on self-reported frequencies, both retrospective (2011) and prospective (2012), which are subject to memory and optimism biases. Finally, no information was collected on passengers riding during weekends. Weekend ridership is an important topic for future investigation, because public transportation generally has unused capacity during off-peak periods and on weekends. If free Wi-Fi is able to increase ridership during these times of low demand, this would represent a very important outcome (at zero additional cost) for public transit agencies.

In future extensions of the current research, it would be desirable to replicate this survey in subsequent years to further investigate the dynamics of ridership changes. It would be even more desirable to conduct a true panel survey, in which a sample of riders is recruited and followed over time, whether or not they continue to use the service. Such a study would permit a comparison of the actual frequency of use of the CC during a given year with the expected trip frequencies reported ex ante, as well as investigations into travel behavior adjustments in reaction to other changes in service, fares and amenities. It would also be of interest to compare different methods of estimating the impact of Wi-Fi, for example a time series analysis of aggregate CC ridership data that could isolate the influence of Wi-Fi from background trends in ridership attributable to higher gasoline prices and/or improvements in service.