1 Introduction

According to the National Highway Traffic Safety Administration (NHTSA)’s report for the year 2015 (NHTSA 2017), 5376 pedestrians were killed and approximately 70,000 pedestrians were injured in traffic crashes in the United States. On average, a pedestrian is killed every 1.6 h and injured every 7.5 min (NHTSA 2017). Pedestrian fatalities occurred more often in urban areas (76%) than in rural areas. Of all pedestrian crashes, 18% occurred at intersections (NHTSA 2017). About 13% and 20% of total traffic fatalities involved pedestrians in North Carolina and the city of Charlotte, respectively. In the year 2015, Charlotte ranked 11th for pedestrian crash frequency among the North Carolina cities with a population of 10,000 or more (NC DMV 2017). These statistics indicate the need to address pedestrian safety issues at the city, state, and national levels.
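As a rough check, the quoted average rates follow from dividing the length of a year by the annual counts. The calculation below assumes a 365-day year and the rounded injury figure of 70,000.

```latex
\frac{365 \times 24\ \text{h}}{5376} \approx 1.63\ \text{h per fatality},
\qquad
\frac{365 \times 24 \times 60\ \text{min}}{70{,}000} \approx 7.5\ \text{min per injury}
```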

Public transportation ridership in the United States has grown by more than 20% in the last decade, reaching its highest levels since the year 1957 (FTA 2017). Several urban areas have planned, designed, and constructed mass rapid transit systems (MRTS), light rail transit (LRT) systems, streetcar, monorail or bus rapid transit (BRT) systems to cater to the growing demand for public transportation, reduce traffic congestion, and make the world more sustainable. As an example, the LYNX Blue Line is the region’s first LRT service operating from Interstate-485 at South Boulevard station to uptown 7th Street station in the city of Charlotte, North Carolina. It is 9.6 miles (15,449 m) long with 15 stations and began operating in November 2007. The LYNX Blue Line LRT was recently extended to the University of North Carolina at Charlotte main campus.

Residents within a proximal area have a higher tendency to walk or cycle to access the LRT stations and use the system (Luan et al. 2020). More than 80% of safety incidents associated with LRT or streetcar systems are auto-pedestrian conflicts (Currie and Reynolds 2010). The majority of LRT and streetcar collisions occur in mixed right-of-way or near at-grade intersection crossings (Ziedan and Brakewood 2020). In addition, public transportation ridership depends on the safety of its users, who are mostly pedestrians within a certain distance (say, 0.25 miles or 402 m) of the transit station. Therefore, there is a need to address pedestrian safety issues that are likely to arise at at-grade intersections and midblock locations near transit (say, LRT) stations.

Transit agencies at both the state and federal levels have been working to improve the safety of their users through engineering, education, and enforcement programs (Nabors et al. 2008). To proactively improve pedestrian safety at intersections near LRT stations, it is important to understand the influence of various predictor variables on pedestrian crashes and estimate the number of pedestrian crashes as a function of critical predictor variables. The focus of this paper is, therefore, two-fold: (1) to research and identify critical predictor variables such as road network and land-use characteristics that influence pedestrian safety at intersections near LRT stations, and (2) to examine the change in pedestrian crash patterns at these intersections before and after the LRT is in operation to serve the users. The reported pedestrian crashes considered for analysis and modeling in this research involve a pedestrian and a motorized vehicle. Other types of pedestrian crashes (as discussed in Ceccato and Willems 2019), such as pedestrian-pedestrian crashes, pedestrian-bicycle crashes, and self-inflicted pedestrian crashes, were neither observed nor considered in this research.

Srirangam and Pulugurtha (2018) analyzed pedestrian crash data at intersections near LRT stations and developed a pedestrian crash estimation model using road network characteristics. They neither considered land-use characteristics nor examined the correlation between the dependent and predictor variables. It is important to consider and research the influence of land-use characteristics on pedestrian crashes as they play a prominent role in defining travel characteristics and safety in an area. Additionally, land-use characteristics are good surrogates for pedestrian counts or activity data, which are not generally available for analysis and modeling. Therefore, this paper extends the previous efforts and explores the influence of road network and land-use characteristics on pedestrian crashes at intersections near LRT stations.

The remainder of the paper includes five sections. They include a review of the past literature, the method adopted for the research, results from the analysis and modeling, a discussion of the results, and conclusions.

2 Literature review

An overview of past research on (1) walking distance and pedestrian activity, (2) pedestrian safety at intersections, and (3) safety associated with the LRT and urban road users is presented in this section. These subsections are followed by the limitations of past research and contribution of this research effort.

2.1 Walking distance and pedestrian activity

Loutzenheiser (1997) stated that walking links practically every trip taken, yet it is not used as a primary mode of transportation. The findings from their research indicate that density, the number of parking spaces at the station, income, and education of the residents are key factors in the decision to walk.

Canepa (1992) challenged the standard to plan transit-oriented developments (TODs) in the United States using a 0.5-mile (804 m) radius as a reliable limit for pedestrian walkability from and to an LRT station. They stated that several variables could have a significant influence on walking patterns and ultimately on the TOD radius. Contrarily, O’Sullivan and Morrall (1996) reported that the average walking distance to suburban LRT stations was 0.4 miles (644 m) while it was 0.2 miles (322 m) to Central Business District (CBD) stations.

According to the Pedestrian Safety Guide for Transit Agencies (Nabors et al. 2008; FHWA 2013), most people are willing to walk for 5–10 min, or approximately 0.25 miles to 0.5 miles to a bus stop. To encourage bus usage, safe and convenient pedestrian facilities should be provided within 0.25 miles to 0.5 miles of bus stops and other transit stations.

Pulugurtha and Repaka (2008) developed a model to assess pedestrian activity at signalized intersections in the city of Charlotte, North Carolina. Pedestrian counts were used as the dependent variable while the road network characteristics, land-use characteristics, demographic characteristics, and the number of bus stops were used as the predictor variables. They found that the number of bus stops within the vicinity of an intersection increased pedestrian activity at signalized intersections. The findings from their study indicate that 0.25-mile to 0.5-mile buffers can be used as proximal areas to capture land-use and demographic characteristics to estimate pedestrian activity at an intersection in a city like Charlotte, North Carolina.

Schneider et al. (2009) emphasized the importance of pedestrian count data to have reliable estimates of pedestrian activity. They developed a methodology for counting and modeling pedestrians crossing counts in Alameda County, California using road network characteristics, land-use characteristics, and socio-economic characteristics. They observed that the total population within a 0.5-mile radius, the number of jobs and commercial retail properties within a 0.25-mile radius, and the presence of a regional transit station within a 0.1-mile (~ 161-m) radius influence pedestrian crossing counts at an intersection.

Rodríguez et al. (2009) used segment-level primary data collected for 338 street segments near 71 BRT stations in Bogotá, Colombia. They conducted factor and regression analyses to identify two dimensions of the built environment that were associated with higher levels of pedestrian activity around the BRT stations. Built environment characteristics considered in their research include (1) pedestrian-friendly amenities comprised of wider and higher quality sidewalks, (2) the presence of amenities such as benches, garbage cans, and bike paths, (3) connectivity comprised of higher levels of road density, (4) three- and four-way intersections, and (5) density. They observed greater pedestrian activity on segments with a higher development intensity, with a larger mix of land uses, and with more crossing aids.

Overall, the findings from past studies indicate that road network, land use, demographic, and socio-economic characteristics influence pedestrian activity at intersections, and 0.25 mile to 0.5 mile buffers can be used as proximal areas to assess pedestrian counts or activity.

2.2 Pedestrian safety at intersections

A few researchers worked on identifying factors influencing pedestrian safety and/or developed models to assess pedestrian safety at intersections. Schneider et al. (2010) examined the association between intersection characteristics and pedestrian crash risk in Alameda County, California. Thirty variables were considered to develop a negative binomial regression model. Their findings indicate that a significantly higher number of pedestrian crashes occurred at intersections with more right-turn-only lanes, more non-residential driveways within 50 feet (15 m), more commercial properties within 0.1 miles, and a greater percentage of residents within 0.25 miles who were younger than 18 years.

Miranda-Moreno et al. (2010) developed a two-equation model to investigate the influence of the built environment on pedestrian activity and crash frequency at signalized intersections in Montreal, Canada. Their results indicate that the built environment within the proximity of an intersection has a strong association with the pedestrian activity but a small direct influence on the crash frequency. Zahabi et al. (2011) estimated the potential influence of speed limit, built environment, and other factors on the severity of pedestrian and cyclist injuries when involved in a collision with motor vehicles. Their results indicate that darkness, vehicle movement, whether a crash occurred at an intersection, vehicle type, and land-use characteristics have an influence on the injury severity of pedestrians.

Pulugurtha and Sambhara (2011) developed generalized linear pedestrian crash estimation models for signalized intersections, using data for the city of Charlotte, North Carolina. The average number of pedestrian crashes per year within 200 feet (61 m) of each selected intersection was considered as the dependent variable. The road network characteristics, land-use characteristics, socio-economic characteristics, demographic characteristics, and the number of bus stops were considered as the predictor variables. Their results indicate an increase in the number of pedestrian crashes at intersections with an increase in the number of bus stops within 0.25 miles radius and an increase in the number of approaches.

Ukkusuri et al. (2012) examined the role of the built environment on pedestrian crash frequency. A model was developed for New York City. The data used in their research include road network, land-use, demographic, transit supply, and travel characteristics, and two different crash frequency outcomes (total and fatal-only crash counts). Their results indicate that a greater number of schools and transit stops, which are determinants of pedestrian activity, were more likely to result in a greater number of pedestrian crashes. Further, the likelihood of a pedestrian-vehicle crash was observed to increase with the number of lanes and the road width.

Zegeer and Bushell (2012) provided insights into the magnitude of the pedestrian crash problem around the world and presented lessons learned from several countries, particularly in Europe and the United States, for improving pedestrian safety. Beginning with pedestrian safety statistics at the global, regional, and national levels, they addressed potential countermeasures and strategies for improving pedestrian safety from an international perspective.

Overall, the findings from past studies indicate that road network, land-use, demographic, and socio-economic characteristics influence pedestrian safety at intersections.

2.3 Safety associated with the LRT and urban road users

A few authors focused on safety associated with the LRT system and urban road users. Novales et al. (2014) developed a design tool kit to improve the safety of LRT urban insertion. Design solutions such as intersections with hook turns to relocate opposing turns were found to have a better safety performance than conventional intersections without hook turns (Currie and Reynolds 2011). The success of such designs depends on driver comprehension and behavior while traversing through unconventional intersections. The safety problems arising due to driver comprehension or behavior could be addressed and benefits improved by designing suitable facilities with warning signs to alert road users (motorists, pedestrians, and cyclists) of specific risk conditions, limiting or temporarily prohibiting turning movements, reducing conflict situations, and building connected environments (Craig et al. 2016; Guerrieri 2018). While some of these solutions are more applicable to manage regular traffic along with trams, a few may result in negative consequences (for example, affect cross-street traffic).

Mathew and Pulugurtha (2020) assessed the effect of the LRT system on the road network within its vicinity using travel time measures. They compared travel time measures for scenarios like networks without LRT, a testing phase of LRT, first month of LRT operation, third month of LRT operation, sixth month of LRT operation, and ninth month of LRT operation. Their findings indicate that increased green times and better coordination may improve travel time reliability on the at-grade LRT corridor while the alternate routes and cross-streets may be negatively affected by the LRT system.

The differences in design and operation of at-grade crossings for the LRT system warrant the development of separate crash prediction and hazard index models (Fischhaber and Janson 2012). Developing and using such LRT-specific crash prediction models may yield better estimates compared to the United States Department of Transportation recommended safety performance functions (Fischhaber and Janson 2015).

2.4 Limitations of past research

Overall, a research gap was observed pertaining to the predictor variables that influence pedestrian safety at intersections near LRT stations. It is also hypothesized that the number of pedestrian crashes could be higher at intersections near an LRT station when compared to intersections that are relatively far away from an LRT station. This could be due to a higher level of pedestrian activity within the vicinity of an LRT station. Additionally, the number of pedestrian crashes at an intersection near an LRT station could be higher when the LRT is serving users than during the period before LRT construction. Therefore, this research intends to bridge the gap by developing pedestrian crash estimation models to study the influence of road network characteristics, distance from the LRT station, and land-use characteristics on the number of pedestrian crashes at an intersection near an LRT station and draw inferences on pedestrian safety before and after the construction of the LRT system.

3 Research method

As the focus of this research is on pedestrian safety at intersections near LRT stations, the LYNX Blue Line LRT corridor in the city of Charlotte, North Carolina was considered as the study area. The pedestrian crash data from the year 2000 to the year 2016 as well as available road network data and land-use data from the year 2012 to the year 2016 were gathered from the city of Charlotte Department of Transportation (CDoT) and used in this research. Geographic Information Systems (GIS) software was used to extract geospatial data like the number of pedestrian crashes and other data elements. Statistical analysis was then conducted to develop pedestrian crash estimation models. The research method involving the following steps is discussed next.

  1. Identifying study intersections
  2. Identifying predictor variables contributing to pedestrian crashes at intersections near LRT stations
  3. Conducting geospatial analysis to extract data
  4. Examining the correlation and identifying critical predictor variables for modeling
  5. Developing and validating pedestrian crash estimation models
  6. Comparing pedestrian crash data before and after LRT is in operation to serve the users

3.1 Identifying study intersections

Figure 1 shows the fifteen LRT stations overlaid on the road network. The recently extended LYNX Blue Line section was not considered due to insufficient after-period crash data.

Fig. 1 LYNX Blue Line LRT stations

The intersection data were obtained in a geospatial format from the CDoT. For every selected LRT station, a buffer with a radius of 0.25 miles (walking distance) was generated and overlaid on the layer that contains all the intersections in Charlotte, North Carolina. This proximal area was considered as most bus and LRT users do not prefer to walk for more than 0.25 miles in the study area (see also Sect. 2).

Of the intersections within a 0.25-mile radius of each selected LRT station, 70 intersections were selected such that they are geographically distributed throughout the LYNX Blue Line LRT corridor. This was done to select intersections in uptown and urban areas for analysis and modeling, as road network, land-use, and socio-economic/demographic characteristics (and, hence, pedestrian and vehicular activity levels) depend on the area type.

3.2 Identifying predictor variables contributing to pedestrian crashes at intersections near LRT stations

The presence or absence of a pedestrian signal, stop-and-go signal, flashing signal, railroad flasher and sidewalk, an increase in the total number of approaches, the total number of lanes, the speed limit, and the total number of approaches with a median at the intersection, an increase in the annual average daily traffic (AADT) of the major road of the intersection, an increase in the actual distance from the nearest LRT station, and an increase in the number of bus stops at the intersection may decrease or increase pedestrian safety at the intersections within the vicinity of LRT stations. Therefore, data pertaining to these road network characteristics were extracted at each selected intersection from databases developed and maintained by CDoT and verified using Google Earth aerial photographs. The presence of the sidewalk was not considered for analysis and model development as all the selected intersections had sidewalks within their vicinity.

The inclusion of pedestrian count data at the selected intersections could enhance the analysis and model development. However, this data is not readily available for many urban areas. The land-use characteristics, socio-economic characteristics, and demographic characteristics could be considered as surrogate data. It is assumed that the pedestrian count and the number of pedestrian crashes at an intersection would increase with the population or residential areas (single-family or multi-family or mixed use) within the vicinity of the intersection. Additionally, some land-use types may influence the pedestrian count and the number of pedestrian crashes at an intersection more when compared to other land-use types.

Studies conducted in the past using data for the study area have indicated that socio-economic and demographic characteristics are associated with land-use characteristics (Pulugurtha et al. 2013). Further, socio-economic and demographic datasets are not updated frequently. For example, census data are collected once every 10 years while traffic analysis zone level planning variables data are updated once every 5 years. In contrast, land-use data are updated regularly and are readily available from local agencies like CDoT. Therefore, the land-use data were preferred over the demographic/socio-economic data and considered in this research as a surrogate for pedestrian count data. They were gathered in a geospatial format from CDoT.

3.3 Conducting a geospatial analysis to extract data

The geospatial analysis was performed to identify and extract the total number of pedestrian crashes, the number of bus stops, and land-use characteristics within the vicinity of each selected intersection.

3.3.1 Extracting the number of pedestrian crashes

All the pedestrian crashes during the before- and after-periods were obtained and considered in the analysis. The before-period pedestrian crash data used in this research was from the year 2000 to the year 2004 (5 years). This was to avoid biased pedestrian crash data, as the LYNX Blue Line LRT was under construction from the year 2005 to early 2007 (operational since November 2007).

The Highway Safety Manual (HSM) suggests using 3–5 years of data to improve the reliability of analysis using crash data (AASHTO 2010). Considering data for more than 5 years may yield misguided results as other transportation projects and site developments within the vicinity of the LYNX Blue Line LRT corridor could influence pedestrian activity over time and, hence, the number of pedestrian crashes. Therefore, the after-period pedestrian crash data used in this research was from the year 2012 to the year 2016 (5 years). The after-period pedestrian crash data was used to develop pedestrian crash estimation models and to compare with the before-period pedestrian crashes.

The crash data considered in this research does not include safety/security incidents on LRT trains or crashes involving LRT trains and motor vehicles only (no pedestrians). These are outside the scope of this paper as the focus is only on pedestrian crashes at an intersection near an LRT station. Therefore, the implementation of the CATS See Say app or other security measures on LRT trains is not expected to have an influence on findings from this research.

The pedestrian crashes within a 0.25-mile radius of each LRT station were assumed to be influenced by the presence of the LRT station. This is a reasonable assumption as the region within the walkable distance (0.25 miles) is where pedestrian activity is expected to be higher due to the LRT system in operation to serve the users and, thereby, where the chances of LRT-influenced pedestrian crashes are greater.

To obtain the sum of pedestrian crashes per intersection for the before- and after-periods, buffers with a radius of 200 feet were generated around each selected intersection and spatially joined with the 5-year before and after pedestrian crash datasets. The 200-foot radius was considered reasonable to capture only those pedestrian crashes that occurred at the intersection and are intersection-related. Pedestrian crashes in overlapping buffers were not double counted but assigned to the closest intersection. The output layers are the total number of pedestrian crashes per intersection for the before- and after-periods separately.
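The buffer-and-join step described above can be illustrated with a minimal sketch. It assumes planar coordinates in feet and uses hypothetical intersection names and crash locations; it is not the GIS workflow used in this research, only an illustration of assigning each crash to the closest intersection within 200 feet so that crashes in overlapping buffers are counted once.

```python
import math

BUFFER_FT = 200.0  # buffer radius stated in the text


def count_crashes(intersections, crashes, radius=BUFFER_FT):
    """Count crashes per intersection.

    Each crash is assigned to the closest intersection within `radius`,
    so a crash falling in two overlapping buffers is counted only once.
    Coordinates are assumed to be planar (x, y) pairs in feet.
    """
    counts = {name: 0 for name in intersections}  # keeps zero-crash intersections
    for cx, cy in crashes:
        best_name, best_dist = None, radius
        for name, (ix, iy) in intersections.items():
            d = math.hypot(cx - ix, cy - iy)
            if d <= best_dist:
                best_name, best_dist = name, d
        if best_name is not None:
            counts[best_name] += 1
    return counts


# Hypothetical data: two intersections 300 ft apart, so their buffers overlap
intersections = {"A": (0.0, 0.0), "B": (300.0, 0.0)}
crashes = [(50.0, 0.0), (160.0, 0.0), (290.0, 10.0), (900.0, 900.0)]
counts = count_crashes(intersections, crashes)
print(counts)
```

In this example the crash at (160, 0) lies within both buffers and is assigned to intersection B, which is closer, while the crash at (900, 900) lies outside every buffer and is not counted.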

A few intersections had zero crashes. These intersections were considered for analysis to minimize any possible bias towards low or high pedestrian crash intersections.

3.3.2 Extracting the number of bus stops

The number of bus stops was extracted from the geospatial transit data obtained from the CDoT by overlaying the 200-foot buffer generated around each selected intersection on the bus stop layer of Charlotte, North Carolina. This resulted in the number of bus stops within 200 feet of each selected intersection.

3.3.3 Extracting the land-use characteristics data

The land-use characteristics were extracted by generating 0.25-mile radius buffers around each selected study intersection and overlaying these buffers on the geospatial land-use data of the city of Charlotte, North Carolina.

3.4 Examining the correlation and identifying critical predictor variables for modeling

Twenty-four predictor variables pertaining to road network and land-use characteristics were considered in this research. Table 1 shows the list of predictor variables considered for analysis and model development.

Table 1 Predictor variables

SPSS®, a statistical analysis software, was first used to examine the correlation between the total number of pedestrian crashes during the after-period (dependent variable) and predictor variables such as road network and land-use characteristics and then develop pedestrian crash estimation models. Omitting correlated variables is preferred if the primary purpose of developing a model is to identify significant predictor variables and have precise estimates (Tay 2017). Therefore, the correlations were also examined between the considered predictor variables by computing the Pearson correlation coefficient matrix.

The Pearson correlation coefficient lies between − 1 and + 1 with a corresponding significance value. The significance value considered was 0.05 (at a 95% confidence level). Significance values greater than 0.05 indicate that the variables are not correlated to each other (statistically insignificant correlation).
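For reference, the coefficient and its usual significance test can be written as follows. The notation is introduced here for illustration (x and y are two variables observed at n intersections), and the t-test shown is the standard two-tailed test for a bivariate correlation, which is assumed rather than confirmed to be the one reported by the software.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
t = r \sqrt{\frac{n - 2}{1 - r^2}} \quad \text{with } n - 2 \text{ degrees of freedom}
```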

Priority was given to those predictor variables that are significantly correlated to the dependent variable (total number of pedestrian crashes during the after-period). However, all such predictor variables were not selected as some of them may be highly correlated with other predictor variables. Pearson correlation coefficients were examined within the considered predictor variables and those that are not correlated to each other were identified and used to develop the pedestrian crash estimation models.

Based on anecdotal evidence and findings from the literature review, one or two critical predictor variables were selected first to complement with other predictor variables to develop the pedestrian crash estimation models. The critical predictor variables are the speed limit and AADT, mixed-use area, office and retail areas, and single-family and multi-family areas. These predictor variables are positively or negatively associated with the pedestrian activity and safety at an intersection. The number of bus stops within the vicinity of an intersection, though not correlated with other predictor variables, was considered as a critical predictor variable and included in all the combinations.
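The screening logic described above can be sketched as follows: starting from one or two critical ("seed") predictor variables, keep only those candidates whose correlation with every seed is statistically insignificant. The function name, variable names, and significance values are hypothetical and are not taken from Table 3.

```python
ALPHA = 0.05  # significance level used for the correlation screening


def uncorrelated_with(seeds, candidates, pvalues, alpha=ALPHA):
    """Return candidates whose correlation with every seed is not significant.

    `pvalues` maps an unordered pair of variable names (a frozenset) to the
    significance value of their Pearson correlation coefficient.
    """
    keep = []
    for cand in candidates:
        if cand in seeds:
            continue
        if all(pvalues[frozenset((cand, seed))] > alpha for seed in seeds):
            keep.append(cand)
    return keep


# Hypothetical significance values for illustration only
pvalues = {
    frozenset(("mixed_use", "aadt")): 0.40,
    frozenset(("mixed_use", "bus_stops")): 0.22,
    frozenset(("mixed_use", "speed_limit")): 0.01,
}
selected = uncorrelated_with(
    ["mixed_use"], ["aadt", "bus_stops", "speed_limit"], pvalues
)
print(selected)
```

With these made-up values, the speed limit is dropped because it is significantly correlated with the seed variable, and the remaining candidates would be carried forward with the seed into model development.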

3.5 Developing and validating pedestrian crash estimation models

Linear and non-linear regression models were tested for applicability. Multiple models were developed based on combinations of the selected predictor variables.

During model development, the significance value considered was 0.1. Predictor variables whose significance value was greater than 0.1 were eliminated when developing the model, as they could not significantly explain the number of pedestrian crashes at a 90% confidence level. The predictor variable with the highest significance value was removed first to re-generate the model. This step was repeated until each final pedestrian crash estimation model had only those predictor variables which had a significance value less than 0.1.
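The elimination procedure described above amounts to a backward elimination loop. The sketch below is illustrative only: `fit` stands in for whatever routine fits the model and reports significance values, and `toy_fit` with its numbers is a made-up placeholder rather than output from this research.

```python
def backward_eliminate(predictors, fit, alpha=0.1):
    """Repeatedly drop the least significant predictor and refit.

    `fit` is a caller-supplied function (hypothetical here) that takes a list
    of predictor names and returns {name: significance_value} for that model.
    Predictors with a significance value greater than `alpha` are removed one
    at a time, starting with the highest value.
    """
    current = list(predictors)
    while current:
        pvals = fit(current)
        worst = max(current, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:
            break  # every remaining predictor meets the threshold
        current.remove(worst)
    return current


def toy_fit(names):
    # Placeholder for a real model fit; values are invented and change
    # when a variable is dropped, as they would after refitting.
    table = {
        3: {"aadt": 0.03, "bus_stops": 0.08, "lanes": 0.45},
        2: {"aadt": 0.02, "bus_stops": 0.06},
    }
    return {name: table[len(names)][name] for name in names}


final = backward_eliminate(["aadt", "bus_stops", "lanes"], toy_fit)
print(final)
```

Removing one variable per iteration matters because the significance values of the remaining predictors can shift after each refit.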

The goodness of fit was based on the quasi-likelihood under the independence model criterion (QIC) and the corrected quasi-likelihood under the independence model criterion (QICC). The criteria for a good model are: the lower the QIC and QICC values, the better is the goodness-of-fit. Also, the difference between QIC and QICC should be as low as possible.

The models were validated using data for the same study period from 16 intersections that have characteristics similar to the selected intersections but were not considered for model development. The root mean squared error (RMSE) and mean absolute deviation (MAD) were computed to evaluate and validate the quality of the developed pedestrian crash estimation models.
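Both validation measures compare observed and estimated crash counts at the validation intersections. A minimal sketch with made-up numbers (not data from this research) is given below.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mad(observed, predicted):
    """Mean of the absolute differences between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Hypothetical observed and estimated crash counts at four intersections
observed = [0, 2, 1, 4]
predicted = [1.0, 2.0, 3.0, 2.0]
print(rmse(observed, predicted), mad(observed, predicted))
```

Because the RMSE squares each error before averaging, it penalizes large individual errors more heavily than the MAD, so reporting both gives a fuller picture of model quality.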

3.6 Comparing pedestrian crash data before and after LRT is in operation to serve the users

Pedestrian crash data for the periods before and after the start of LRT service were used to compare and examine the change in the pedestrian crash pattern. Intersections were also grouped based on the number of pedestrian crashes during both periods and analyzed separately.
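One simple way to organize such a comparison is sketched below. The grouping into increased, decreased, and unchanged intersections and the overall percent change are assumptions made for illustration; the intersection names and counts are hypothetical.

```python
def compare_periods(before, after):
    """Group intersections by change in crash count and compute percent change.

    `before` and `after` map intersection names to 5-year crash counts.
    Returns (groups, percent_change); percent_change is None if there were
    no crashes in the before-period.
    """
    groups = {"increased": [], "decreased": [], "unchanged": []}
    for name in before:
        diff = after[name] - before[name]
        if diff > 0:
            groups["increased"].append(name)
        elif diff < 0:
            groups["decreased"].append(name)
        else:
            groups["unchanged"].append(name)
    total_before = sum(before.values())
    total_after = sum(after[name] for name in before)
    if total_before == 0:
        return groups, None
    return groups, 100.0 * (total_after - total_before) / total_before


# Hypothetical counts for three intersections
before = {"A": 2, "B": 0, "C": 1}
after = {"A": 1, "B": 3, "C": 1}
groups, pct_change = compare_periods(before, after)
print(groups, pct_change)
```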

4 Results

The results from model development, validation, and before-after analysis are discussed in this section.

4.1 Critical predictor variables and pedestrian crash estimation models

To develop the pedestrian crash estimation models, the pedestrian crash data used were from the year 2012 to the year 2016 while the road network and land-use characteristics used were for the year 2014 (middle year) for each selected intersection. Table 2 summarizes the minimum, average, maximum, and standard deviation of the dependent and predictor variables considered in this research. The observed minimum number of pedestrian crashes is zero, while the observed maximum number of pedestrian crashes is eight at an intersection. The average number of pedestrian crashes and standard deviation are 1.8 and 1.9, respectively. Of the selected intersections, 53 had a pedestrian signal, 15 had a stop-and-go signal, 52 had a flashing signal without a stop sign, and 3 had a railroad flasher. The speed limit varied between 25 mph (40.2 kmph) and 45 mph (72.4 kmph) while the AADT varied between 3500 and 99,000 (less than 40,000 at most of the intersections).

Table 2 Descriptive statistics

Table 3 summarizes the computed Pearson correlation coefficient matrix. The Pearson correlation coefficients were examined to identify critical predictor variables as well as those that would help minimize multicollinearity when developing pedestrian crash estimation models.

Table 3 Pearson correlation coefficient matrix

The presence of a pedestrian signal, the presence of a stop-and-go signal, the presence of a flashing signal without a stop sign, the speed limit, the number of bus stops, office area, mixed-use area, transportation area, open space/recreational area, and vacant area are significantly associated with the number of pedestrian crashes at an intersection near an LRT station (dependent variable). With the exception of the presence of a stop-and-go signal, the speed limit, transportation area, and vacant area, all other aforementioned predictor variables have a positive Pearson correlation coefficient when associated with the number of pedestrian crashes.

The speed limit was observed to be significantly correlated with the presence of a flashing signal without a stop sign, the number of lanes, the number of approaches with a median, and distance to the nearest LRT station. It was also observed to be significantly correlated with all considered land-use characteristics except multi-family and utility areas. Therefore, the presence of stop-and-go signals, speed limit, AADT, the number of bus stops, multi-family area, and utility area were explored to develop pedestrian crash estimation model 1.

The mixed-use area was observed to be significantly correlated with all predictor variables except the presence of a railroad flasher, the number of lanes, AADT, the number of bus stops, and a multi-family area. Therefore, these predictor variables along with the mixed-use area were explored to develop pedestrian crash estimation model 2.

Office and retail areas were not significantly correlated with each other. They were also not correlated with the presence of railroad flashers, the number of lanes, the number of bus stops, and industrial areas. Therefore, these predictor variables were explored to develop pedestrian crash estimation model 3.

Single-family and multi-family areas were not correlated with each other. They were also not correlated with the presence of a railroad flasher, the number of lanes, AADT, distance to the nearest LRT station, the number of bus stops, and industrial areas. Therefore, these predictor variables were explored to develop pedestrian crash estimation model 4. Utility and vacant areas were not considered in this case as they were significantly correlated with the presence of a railroad flasher and the distance to the nearest LRT station.

Combinations of predictor variables considered to develop pedestrian crash estimation models 1–4 are summarized in Table 4.

Table 4 Combinations of predictor variables for pedestrian crash estimation model development

A linear regression analysis of the dependent and predictor variables showed that the model goodness-of-fit was poor. Non-linear relationships based on Poisson log-link distribution, negative binomial log-link distribution, and zero-inflated negative binomial log-link distribution were then tested. The negative binomial log-link distribution had the best goodness-of-fit statistics and was considered for modeling, as the distribution can reasonably accommodate over-dispersed data, i.e., the variance was greater than the mean (variance = 1.84 and mean = 3.44). Additionally, the output from the Vuong test, which compares the zero-inflated negative binomial log-link distribution model with the negative binomial log-link distribution model, was insignificant, indicating that the negative binomial log-link distribution is the preferred model distribution for the data used in this research.
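The over-dispersion check that motivates a negative binomial distribution over a Poisson distribution can be sketched as follows. The crash counts are hypothetical, and the variance function shown is the commonly used quadratic (NB2) form, which is assumed here rather than confirmed as the parameterization used in this research.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    A Poisson model assumes a ratio near 1; values well above 1
    suggest over-dispersion.
    """
    return variance(counts) / mean(counts)


def nb2_variance(mu, alpha):
    """Variance of a negative binomial (NB2) count with mean mu and
    dispersion parameter alpha: mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


# Hypothetical crash counts at ten intersections (not the study data).
counts = [0, 0, 1, 0, 2, 7, 0, 1, 12, 0]
ratio = dispersion_ratio(counts)
print(f"mean={mean(counts):.2f}, variance={variance(counts):.2f}, ratio={ratio:.2f}")
```

With alpha equal to zero the NB2 variance reduces to the Poisson variance, which is why the negative binomial distribution is often treated as a generalization for over-dispersed counts.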

Table 5 summarizes statistically significant predictor variables and related statistical parameters (coefficient, standard error, Wald Chi-Square, and significance value) for negative binomial log-link distribution-based pedestrian crash estimation models 1–4. The significance values of the predictor variables listed in Table 5 are less than or equal to 0.1 while the Wald Chi-Square values are greater than 1. The standard errors are generally low.

Table 5 Summary of pedestrian crash estimation model statistical parameters

The QIC and QICC for pedestrian crash estimation models 1–4 are also summarized in Table 5. While they are comparable, the difference between QIC and QICC is lowest for the pedestrian crash estimation model 2 followed by the pedestrian crash estimation model 3.

The AADT, the number of bus stops, multi-family area, mixed-use area, and office area have a statistically significant influence with a positive coefficient on the number of pedestrian crashes at intersections near LRT stations. The presence of a stop-and-go signal, the presence of a railroad flasher, the speed limit, single-family area, and industrial area have a statistically significant influence with a negative coefficient on the number of pedestrian crashes at intersections near LRT stations. The negative coefficient for the speed limit in pedestrian crash estimation model 1 is counter-intuitive and could be attributed to over-fitting of the data. As the difference between QIC and QICC is the highest for pedestrian crash estimation model 1, it was not considered a good model and was excluded from model validation.
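The meaning of positive and negative coefficients under a log link can be illustrated with a short sketch. The intercept and coefficients below are hypothetical values chosen only to show the direction of each effect; they are not the values reported in Table 5.

```python
import math


def expected_crashes(intercept, coefficients, values):
    """Expected count under a log link: exp(intercept + sum(coef * value))."""
    linear = intercept + sum(coefficients[k] * values[k] for k in coefficients)
    return math.exp(linear)


# Hypothetical coefficients (not Table 5 values).
coefficients = {"bus_stops": 0.20, "railroad_flasher": -0.50}
intercept = 0.10

base = expected_crashes(
    intercept, coefficients, {"bus_stops": 2, "railroad_flasher": 0}
)
more_stops = expected_crashes(
    intercept, coefficients, {"bus_stops": 3, "railroad_flasher": 0}
)
with_flasher = expected_crashes(
    intercept, coefficients, {"bus_stops": 2, "railroad_flasher": 1}
)

# A one-unit increase in a predictor multiplies the expected count
# by exp(coefficient), holding the other predictors fixed.
print(more_stops / base)    # equals exp(0.20), i.e., an increase
print(with_flasher / base)  # equals exp(-0.50), i.e., a decrease
```

Under this form, a positive coefficient raises the expected number of crashes multiplicatively, and a negative coefficient lowers it.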

The results from model validation are summarized in Table 6. The actual number of pedestrian crashes is the number of pedestrian crashes at each intersection during the years 2012 to 2016. The estimated number of pedestrian crashes is the number estimated by each model at each intersection during the same period. Data were extracted for these 16 intersections using the same “Research Method” as outlined in Sect. 3. The computed RMSE and MAD, also shown in Table 6, are reasonably low, indicating that the developed pedestrian crash estimation models have reasonable predictive ability. They are lowest for pedestrian crash estimation model 2, followed by pedestrian crash estimation model 4.
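The two validation measures can be computed as in the sketch below. MAD is taken here to be the mean absolute deviation between the actual and estimated counts, which is an assumption about the definition used; the counts are hypothetical and are not the Table 6 values.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between actual and estimated counts."""
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def mad(actual, estimated):
    """Mean absolute deviation between actual and estimated counts."""
    absolute = [abs(a - e) for a, e in zip(actual, estimated)]
    return sum(absolute) / len(absolute)


# Hypothetical counts for four validation intersections (not the study data).
actual = [2, 0, 5, 1]
estimated = [1.5, 0.5, 4.0, 2.0]

rmse_value = rmse(actual, estimated)
mad_value = mad(actual, estimated)
print(f"RMSE={rmse_value:.3f}, MAD={mad_value:.3f}")
```

RMSE penalizes large individual errors more heavily than MAD, so reporting both gives a fuller picture of how each model performs on the validation intersections.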

Table 6 Model validation results

4.2 Before-after analysis

Figure 2 shows the number of pedestrian crashes at the selected intersections within 0.25 miles of each LRT station for the before- and after-periods (including intersections with zero pedestrian crashes during the before-period). The results indicate similar trends of pedestrian crashes except at a few LRT stations. On average, the total number of pedestrian crashes increased by 4.6 times during the after-period when compared to the before-period. The number of pedestrian crashes at LRT stations closer to uptown Charlotte (downtown area) was higher than at those away from uptown Charlotte. This can be attributed to the higher pedestrian activity in the uptown area compared to other areas, which in turn may contribute to an increased number of pedestrian crashes within the vicinity of LRT stations.

Fig. 2 Comparison of pedestrian crashes: all intersections

The number of intersections that had pedestrian crashes during both periods was 20 (out of 70). An analysis was conducted considering pedestrian crash data for these intersections with prior pedestrian crash history. Figure 3 shows the trends in the number of pedestrian crashes by LRT station. On average, the number of pedestrian crashes during the after-period increased by 2.7 times compared to the before-period at these intersections near LRT stations. The pedestrian crash trend was similar except for a few stations such as East–West Blvd. station and Stonewall station. Even though these results are not an exact measure of changes in pedestrian safety at the LYNX Blue Line LRT stations, they provide insights into the crash patterns during both periods.

Fig. 3 Comparison of pedestrian crashes: intersections with prior pedestrian crash history
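The before-after comparison described in this section can be sketched as follows. The sketch assumes the change is expressed as the ratio of total after-period crashes to total before-period crashes, which may differ from the averaging used in this research; the record layout, the function name, and the counts are hypothetical.

```python
def after_to_before_ratio(records, require_prior_history=False):
    """Ratio of total after-period crashes to total before-period crashes.

    When require_prior_history is True, only intersections with crashes
    in both periods are kept.
    """
    if require_prior_history:
        records = [r for r in records if r["before"] > 0 and r["after"] > 0]
    before = sum(r["before"] for r in records)
    after = sum(r["after"] for r in records)
    if before == 0:
        raise ValueError("no before-period crashes; ratio is undefined")
    return after / before


# Hypothetical counts for four intersections (not the study data).
records = [
    {"id": "A", "before": 1, "after": 3},
    {"id": "B", "before": 0, "after": 2},
    {"id": "C", "before": 2, "after": 4},
    {"id": "D", "before": 0, "after": 0},
]

all_ratio = after_to_before_ratio(records)
prior_ratio = after_to_before_ratio(records, require_prior_history=True)
print(all_ratio, prior_ratio)
```

Restricting the comparison to intersections with crashes in both periods removes the influence of locations that had no recorded crashes before the LRT service began, which is why the two ratios differ.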

5 Discussion of the results

Pedestrian crash estimation models were developed to understand the relationship between pedestrian crashes at intersections within the vicinity of LRT stations and predictor variables such as road network characteristics and land-use characteristics. The predictor variable data for 70 selected intersections within 0.25 miles of an LRT station in the city of Charlotte were gathered and analyzed. The presence of a pedestrian signal, the presence of a stop-and-go signal, the presence of a flashing signal without a stop sign, the speed limit, the number of bus stops, office area, mixed-use area, transportation area, open space/recreational area, and vacant area are significantly associated with the number of pedestrian crashes at an intersection near an LRT station.

Four combinations of predictor variables, selected based on correlations to minimize collinearity, were explored to develop pedestrian crash estimation models for the number of pedestrian crashes at an intersection near an LRT station. The pedestrian crash estimation models indicate that the number of pedestrian crashes at an intersection near an LRT station decreases with the presence of a stop-and-go signal and/or railroad flasher. It also decreases with an increase in single-family or industrial areas, as was also observed by Pulugurtha and Sambhara (2011). This could be due to lower pedestrian activity levels or speed limits in these areas.

An increase in the speed limit was observed to reduce the number of pedestrian crashes at an intersection near an LRT station in one of the four models. This is counter-intuitive and could be attributed to over-fitting of the data.

The number of pedestrian crashes was observed to increase with an increase in the number of bus stops and the presence of an LRT station at an intersection (within 200 feet). This finding is consistent with those of Pulugurtha and Sambhara (2011) and Ukkusuri et al. (2012). Bus stops and LRT stations are typically provided along high-speed, high-traffic-volume, multi-lane roads with commercial, residential, and office land use. Pedestrian activity is high at these intersections due to the presence of bus stops and LRT stations, thereby increasing exposure and the number of pedestrian crashes.

Zahabi et al. (2011) observed that land uses influence the injury severity of pedestrians. In this research, the number of pedestrian crashes at an intersection near an LRT station was observed to increase with multi-family, mixed-use, and office areas. This could be because these land uses are generally planned and built along high-speed corridors, thereby increasing pedestrian exposure and crash involvement. These land uses also serve LRT-related commute trips.

The change in the crash pattern before and after the LRT system began operation was also examined. The total number of pedestrian crashes at the selected intersections near LRT stations during the after-period increased on average by 4.6 times compared to the before-period. When only intersections with prior pedestrian crash history are considered, it increased on average by 2.7 times.

6 Conclusions

It can be concluded from this research that pedestrians are at a higher risk of getting involved in crashes at intersections near LRT stations. This can be attributed to the increase in pedestrian activity because of LRT service, intermodal connections (bus stop to LRT station and vice versa), or changes in the land-use characteristics within the vicinity of LRT stations.

The number of pedestrian crashes at an intersection near an LRT station increases with the number of bus stops, mixed-use area, and office area but decreases with the presence of a railroad flasher, single-family residential area, and industrial area. The findings from this research can be proactively used by practitioners to estimate the number of pedestrian crashes due to road network and/or land-use changes and to identify remedial solutions that minimize conflicts involving pedestrians and enhance their safety at intersections near LRT stations. The provision of pedestrian-friendly and safe facilities at intersections near LRT stations could indirectly influence transit ridership.

While the developed pedestrian crash estimation models can be calibrated and used for urban areas like Charlotte, North Carolina, it is recommended to adapt the research method, develop region-specific pedestrian crash estimation models, and apply them to enhance pedestrian safety at intersections near LRT stations in other towns/cities.

Collecting land-use data over time and capturing geospatial data within the vicinity of LRT stations would help better understand the relationship and possible impact of the LRT stations on the number of pedestrian crashes at nearby intersections. The pedestrian crash data do not indicate whether the pedestrian is a bus/LRT system user. Capturing this information when preparing crash reports and including it in the analysis will help better assess the influence on LRT system users at intersections. Further, the research could be expanded by examining pedestrian safety at intersections near MRTS, streetcar, monorail, and BRT systems. These merit further investigation.