1 Introduction

Active modes of travel such as walking are being encouraged in cities to reduce environmental impact and to improve the well-being of the population. However, the physical vulnerability of pedestrians exposes them to severe consequences when they are involved in traffic collisions. Pedestrians who jaywalk or engage in distracting activities, such as using their cellphones while crossing a road facility, are at risk of being exposed to unsafe and conflicting situations. In such instances, identifying violations and distraction can provide a reliable surrogate road safety measure whenever road collisions are attributable to non-conforming behavior. Similarly, pedestrian evasive actions, mainly manifested in variations of walking behavior, can provide useful measures of traffic interactions. Sudden changes of direction or walking speed, or even stopping, are characteristic actions that pedestrians adopt as strategies to avoid collision. It is therefore suggested that behavior-based traffic conflict indicators can be considered as an alternative for assessing traffic safety in less organized driving cultures with a high road-user mix. However, comprehensive engineering programs can be hindered by limitations in collision data quality as well as gaps in research related to pedestrian behavior and data collection. The analysis of walking gait behavior is an active research area in health science, where different methodologies are developed to understand how the walking mechanism changes under varying conditions. The main goal of this paper is to expand the application of computer vision to the detection and analysis of pedestrian behavior and safety. Three applications are discussed.

The first application examines the possibility of automatically detecting distracted pedestrians on crosswalks using their gait parameters. The methodology draws on recent findings in health science concerning the relationship between walking gait behavior and cognitive abilities. Walking speed and gait variability are shown to be affected by the complexity of tasks (e.g., texting) that are performed during walking. Experiments are performed on a video data set from Surrey, British Columbia. The second application addresses automated pedestrian data collection and the analysis of conformance behavior. Two types of violations are considered for analysis: spatial and temporal violations. Spatial violations occur when a pedestrian crosses in non-designated crossing regions. Temporal violations occur when a pedestrian crosses an intersection during an improper traffic signal phase. In this analysis, the automated violation detection is performed using pattern matching. The analysis is applied to an intersection with a perceived high rate of traffic conflicts in the Downtown Eastside of Vancouver, British Columbia. The third application addresses the problem of understanding and detecting pedestrian evasive actions during safety-critical situations. There is increasing evidence that conventional traffic conflict indicators such as Time-to-Collision (TTC) and Post-Encroachment-Time (PET) lack the ability to describe conflict severity in many traffic environments and may need to be combined with other indicators for safety diagnosis [1, 2]. In this work, a novel method based on time series analysis of the pedestrian walking profile is used to identify pedestrian evasive actions. The analysis is applied to a data set from Shanghai, China.

This research can benefit applications in several transportation-related fields, such as pedestrian facility planning, pedestrian simulation models, and road safety programs.

2 Computer Vision Methodology

The pedestrian analysis methodology uses a video analysis system to automatically detect, classify, and track road users and interpret their movements. The positional analysis of road users requires accurate estimation of the camera parameters. Once the camera is calibrated, it is possible to recover real-world coordinates of points in the video sequence that lie on a reference surface with a known model (the pavement surface). The accuracy of the developed method was examined in [3] and was found to be adequate for positioning slow-moving road users such as pedestrians. The tracking system relies on feature-based tracking [4], where distinctive points are tracked on moving objects (Fig. 1.a). The subsequent step is to group points that move at similar speeds and satisfy other motion constraints into the same coherent object (Fig. 1.b) [4]. Road users are then classified into vehicles and pedestrians (Fig. 1.c). The tracking and classification accuracy was presented in [5, 6] and is considered satisfactory.
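
The recovery of real-world coordinates for image points on the pavement plane can be illustrated with a planar homography. The following is a minimal sketch, assuming that calibration has already produced a 3x3 homography matrix; the function name and the matrix values are hypothetical and are not taken from [3].

```python
def image_to_world(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates using a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w


# Hypothetical homography (assumption): 0.05 m per pixel plus a fixed offset.
H_example = [
    [0.05, 0.0, -10.0],
    [0.0, 0.05, -5.0],
    [0.0, 0.0, 1.0],
]

# World position (in meters) of the pixel at column 400, row 300.
world_xy = image_to_world(H_example, 400, 300)
```

Applying such a mapping to every tracked feature position would yield trajectories in meters, from which speeds can be derived.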

Fig. 1. Illustrations for the computer vision based analysis

3 Application 1: Automated Distraction Detection Based on Pedestrian Gait Analysis

This case study uses traffic videos from a major four-legged intersection in Surrey, British Columbia. The intersection is captured from the camera position shown in Fig. 2. In this experiment, 50 pedestrians who are looking at or typing on their phones are selected, along with 98 pedestrians who are not distracted. The selection was performed by a traffic expert through a careful review to avoid errors in the manual ground-truth labeling. The state of each pedestrian was described in detail by the manual reviewer so that it could later be compared to the automatically classified state. Figure 2.c illustrates trajectories of distracted pedestrians using their phones while crossing the intersection. For ground truth, the distraction state of each pedestrian is labeled manually based on the observer's judgment.

Fig. 2. Intersection in Surrey, BC

Walking speed is estimated by placing screen lines around the region of interest and measuring the time it takes for the pedestrian trajectory to cross this region. The walking mechanism can be explained through the spatio-temporal gait parameters (step length and step frequency [7, 8]), which can be extracted from the pedestrian trajectories. Step frequency is defined as the number of times a foot touches the ground per unit of time. The distance covered between two consecutive foot contacts is defined as the step length. Each pedestrian step is observed to introduce a periodic fluctuation in the speed profile, which enables the measurement of gait parameters such as step frequency and step length. The dominant periodicities in the speed profile are detected using power spectral density (PSD) estimation of the speed profile signal [9].
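
The extraction of step frequency from the dominant periodicity of the speed profile can be sketched as follows. This is an illustrative example only: it uses a plain periodogram as the PSD estimate, restricts the search to an assumed band of 1 to 3 Hz, and is run on a synthetic speed profile. None of these choices is taken from [9]. Step length is then obtained from the relation speed = step length x step frequency.

```python
import cmath
import math


def periodogram(signal, fs):
    """Return (frequencies, power) of the mean-removed signal via a plain DFT."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


def gait_parameters(speed, fs, band=(1.0, 3.0)):
    """Estimate step frequency (Hz) and step length (m) from a speed profile (m/s).

    The search band is an assumption made for this sketch.
    """
    freqs, power = periodogram(speed, fs)
    candidates = [(p, f) for f, p in zip(freqs, power) if band[0] <= f <= band[1]]
    _, step_freq = max(candidates)  # frequency with the highest power in the band
    mean_speed = sum(speed) / len(speed)
    return step_freq, mean_speed / step_freq


# Synthetic profile (assumption): 1.4 m/s mean speed with a 2 Hz step-induced
# fluctuation, sampled at 30 frames per second for 5 seconds.
fs = 30.0
speed = [1.4 + 0.1 * math.sin(2 * math.pi * 2.0 * t / fs) for t in range(150)]
step_freq, step_length = gait_parameters(speed, fs)
```

On real trajectories the speed profile is noisy, so a smoothed or averaged PSD estimate may be preferable to the raw periodogram used here.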

In addition, other gait variables are measured, including the walk ratio, which represents the relationship between the amplitude and frequency of the rhythmic leg movements when walking. Deviations from the normal walk ratio during free walking may reveal a degree of abnormal walking patterns [10]. Other features include the acceleration root mean square (RMS, A_rms), which measures the dispersion of the normalized acceleration profile. The RMS value indicates the degree of gait variability: a higher RMS indicates a higher degree of variability and a lower degree of stability [11]. Acceleration autocorrelation (AAC) measures stride similarity and regularity by examining the similarity of the acceleration profile shape across strides. A higher AAC value indicates a greater degree of gait stability [11].
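
The gait variables described above can be computed from a speed profile along the following lines. This is a sketch under stated assumptions: the walk ratio is taken as step length divided by step frequency, acceleration is obtained by finite differences, and the mean-removal normalization and the choice of lag are assumptions rather than the exact definitions of [10, 11].

```python
import math


def acceleration(speed, fs):
    """Finite-difference acceleration (m/s^2) from a speed profile sampled at fs Hz."""
    return [(b - a) * fs for a, b in zip(speed, speed[1:])]


def rms(values):
    """Root mean square of the mean-removed values (dispersion of the profile)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def autocorrelation(values, lag):
    """Normalized autocorrelation coefficient at a given lag (in samples)."""
    mean = sum(values) / len(values)
    x = [v - mean for v in values]
    denom = sum(v * v for v in x)
    num = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    return num / denom


def walk_ratio(step_length, step_freq):
    """Step length divided by step frequency (assumed definition for this sketch)."""
    return step_length / step_freq


# Synthetic profile (assumption): 2 Hz step fluctuation sampled at 30 fps,
# so one step period spans 15 samples.
fs = 30.0
speed = [1.4 + 0.1 * math.sin(2 * math.pi * 2.0 * t / fs) for t in range(150)]
acc = acceleration(speed, fs)
a_rms = rms(acc)
aac = autocorrelation(acc, lag=15)
ratio = walk_ratio(0.7, 2.0)
```

For a perfectly regular gait such as this synthetic one, the autocorrelation at a lag of one step period stays close to its maximum; irregular walking would lower it.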

Results Summary. The distraction state is estimated using a k-nearest neighbors (kNN) classification procedure. kNN assigns a class (state) to a pedestrian based on a majority vote of its k nearest neighbors, obtained from a training dataset of labeled pedestrians. The pedestrian feature data is divided into two subsets: a training subset and a validation subset for performance evaluation. The maximum correct classification rate (CCR) is around 80 % for the different combinations of relevant features. The classification performance is also evaluated by means of receiver operating characteristic (ROC) curves, which quantify the trade-off between the detection rate (the percentage of positive examples correctly classified) and the false positive rate (the percentage of negative examples incorrectly classified as positive). Figure 3 compares the performance of the different classification runs with different feature selections. Figure 3.a shows a true positive rate (non-distracted pedestrians classified as non-distracted) of 87 % at a false positive rate (distracted pedestrians classified as non-distracted) of around 0 %. The kNN classification achieved a 90 % correct detection rate at a false positive rate of less than 20 %. An ROC plot is also shown for the distraction classification, with a true positive rate (distracted pedestrians classified as distracted) of 80 % at a false positive rate (non-distracted pedestrians classified as distracted) of around 10 %. The kNN method achieved a 100 % correct detection rate at a false positive rate of less than 15 %. This result shows the trade-off that is inherently involved in the classification process. The selection of the "best" classifier then depends on the target data collection application. For example, if a practitioner is interested in gathering information about the rate of distracted pedestrians, then the choice of classifier will rely on the analysis provided in Fig. 3.b.
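
The majority-vote classification and the two rates that define one point on an ROC plot can be sketched as follows. The feature vectors (walking speed in m/s, acceleration RMS), the labels, and the value of k are hypothetical illustration values, not data from the Surrey experiment.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query (Euclidean)."""
    ranked = sorted(zip(train_X, train_y), key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def rates(y_true, y_pred, positive):
    """True positive rate and false positive rate for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    pos = sum(t == positive for t in y_true)
    neg = len(y_true) - pos
    return tp / pos, fp / neg


# Hypothetical training subset: (walking speed, acceleration RMS) per pedestrian.
train_X = [(1.5, 0.3), (1.4, 0.35), (1.45, 0.32), (1.0, 0.8), (1.1, 0.75), (0.95, 0.85)]
train_y = ["normal", "normal", "normal", "distracted", "distracted", "distracted"]

# Hypothetical validation subset with manually assigned ground truth.
val_X = [(1.42, 0.33), (1.05, 0.78), (1.35, 0.4), (1.0, 0.9)]
val_y = ["normal", "distracted", "normal", "distracted"]

pred = [knn_predict(train_X, train_y, x) for x in val_X]
tpr, fpr = rates(val_y, pred, positive="distracted")
```

A full ROC curve would be traced by repeating the rate computation while varying a decision parameter, for instance the fraction of neighbor votes required to label a pedestrian as distracted.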

Fig. 3. ROC plot evaluation

Fig. 4. Location characteristics

4 Application 2: Automated Analysis of Pedestrian Non-conforming Behavior at Urban Crossings

In this application, spatial and temporal crossing violations are automatically identified. The site used in this study is a busy signalized intersection in the Downtown Eastside of Vancouver, located at East Hastings and Main Streets, with a mix of business and residential activities. The intersection layout and the lengths of the crosswalk legs are shown in Fig. 4. The intersection was selected because of a perceived high rate of conflicts between vehicles and pedestrians [12, 13]. Some safety countermeasures have already been implemented at the intersection, such as reducing the speed limit from 50 km/h (the normal posted speed within Vancouver) to 30 km/h [12].

Violation Detection Procedure: Violation detection starts with identifying a set of movement prototypes that represent what are considered normal crossing movements. Subsequently, each pedestrian trajectory is compared with the normal crossing prototypes. Any significant disagreement between the two sequences of positions is interpreted as evidence that the given trajectory represents the movement of a non-conforming pedestrian. The longest common sub-sequence (LCSS) algorithm is adopted for spatial violation detection. More specifically, the classification decision relies on an LCSS similarity measure between the movement prototypes and the trajectories [14].
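
The prototype comparison can be illustrated with a basic LCSS similarity between two sequences of positions. The matching distance eps, the decision threshold, and the example trajectories below are assumptions chosen for illustration; they are not the parameters used in [14].

```python
import math


def lcss_similarity(traj_a, traj_b, eps=1.0):
    """LCSS length normalized by the shorter trajectory.

    Two points are considered matching when they lie within eps meters.
    """
    n, m = len(traj_a), len(traj_b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if math.dist(traj_a[i - 1], traj_b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m] / min(n, m)


def is_spatial_violation(trajectory, prototypes, eps=1.0, min_similarity=0.8):
    """Flag a trajectory whose best prototype match falls below the threshold."""
    best = max(lcss_similarity(trajectory, p, eps) for p in prototypes)
    return best < min_similarity


# Hypothetical prototype: a straight crossing along the crosswalk (meters).
prototype = [(0.0, float(y)) for y in range(11)]

# A pedestrian walking close to the prototype, and one crossing 5 m away from it.
conforming = [(0.3, y + 0.2) for y in range(11)]
mid_block = [(5.0, float(y)) for y in range(11)]

conforming_flag = is_spatial_violation(conforming, [prototype])
mid_block_flag = is_spatial_violation(mid_block, [prototype])
```

Because matching tolerates a distance of eps, a pedestrian who crosses just outside the crosswalk can still obtain a high similarity score, which is the sensitivity that the choice of eps controls.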

The basic idea of the procedure for detecting temporal violations is to identify pedestrians traversing an intersection segment during an improper signal phase. This is performed by automatically recording the temporal and spatial information of each pedestrian and comparing this information against the provided traffic signal cycles and specified screen lines. The first step of the procedure is to draw the boundaries of the intersection segment. The violation detection is then implemented in two consecutive steps. First, the trajectories of the pedestrians crossing the region of interest, at any given time, are identified. This is achieved by intersecting the trajectory coordinates with the intersection segment. The next step is to identify the time period during which each pedestrian trajectory was inside this segment. This period is then compared against the corresponding signal timing phase. If the time period intersects with a phase during which the pedestrian is prohibited from crossing, the pedestrian is labeled as a violator; otherwise, the pedestrian is labeled as a non-violator.
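
The two steps of the temporal check, finding when a trajectory occupies the segment and testing that period against the prohibited phases, can be sketched as follows. The axis-aligned segment boundary, the track format (time, x, y), and the signal timings are simplifying assumptions for illustration.

```python
def inside_segment(point, x_range, y_range):
    """True if a ground-plane point lies inside an axis-aligned segment box."""
    x, y = point
    return x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]


def occupancy_interval(track, x_range, y_range):
    """First and last timestamps at which the track is inside the segment, or None."""
    times = [t for t, x, y in track if inside_segment((x, y), x_range, y_range)]
    return (min(times), max(times)) if times else None


def is_temporal_violation(track, x_range, y_range, prohibited_phases):
    """Flag the track if its occupancy interval overlaps any prohibited phase."""
    interval = occupancy_interval(track, x_range, y_range)
    if interval is None:
        return False
    start, end = interval
    return any(start < p_end and p_start < end for p_start, p_end in prohibited_phases)


# Hypothetical crosswalk segment (meters) and prohibited phases (seconds).
x_range, y_range = (-2.0, 2.0), (0.0, 10.0)
prohibited = [(30, 60), (90, 120)]

# Track A is inside the segment from t = 10 s to t = 20 s (permitted phase).
track_a = [(t, 0.0, float(t - 10)) for t in range(8, 23)]
# Track B is inside the segment from t = 25 s to t = 35 s (overlaps a prohibited phase).
track_b = [(t, 0.0, float(t - 25)) for t in range(23, 38)]

flag_a = is_temporal_violation(track_a, x_range, y_range, prohibited)
flag_b = is_temporal_violation(track_b, x_range, y_range, prohibited)
```

In practice the segment would be an arbitrary polygon drawn on the scene, and the check depends on the trajectory timestamps being synchronized with the signal timing.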

Results Summary. First, the violation classification procedure is applied to estimate the pedestrian compliance rate at the intersection. Pedestrians in the scene are tracked and classified according to the methodology developed in the paper. In the selected 45 min video segment, a total of 376 pedestrians were tracked. For validation purposes, pedestrians in the scene were manually identified and classified to be used as ground truth. Figure 5(a) and (b) show, respectively, the trajectories of the normal and spatially violating pedestrians. A high level of crossing violations by pedestrians was detected in the mid-block region (25.5 percent of pedestrians cross illegally). As expected, the westbound approach has an increased number of violations. The performance of the violation detection is shown using the confusion matrix below (Table 1). At a false detection rate of 9.31 percent (non-violating pedestrians classified as violating), a correct detection rate of 84.5 percent for true violators can be achieved. The main factor affecting the correct detection rate is pedestrians moving very close to the crosswalk. Those pedestrians were labeled in the ground truth annotation as violating pedestrians. However, their proximity to the crosswalk resulted in a high prototype matching score, and they were therefore classified as non-violating.

Fig. 5. Pedestrian trajectories

Table 1. Confusion matrix

Additionally, a manual review of the video revealed that, of the 450 pedestrians in the scene, 108 were considered violators. Owing to the high definition of this video, the majority of the spatially violating pedestrians in the scene were tracked (97, as mentioned earlier). Out of the 108 spatially violating pedestrians, only 11 were missed. This is due to missed detection or over-grouping. Over-grouping occurs when several pedestrians share a common trajectory. This can occur when a group of pedestrians is crossing at a fairly close distance with similar walking speeds.

There were 11 temporally violating pedestrians. Of those, 5 were also spatially violating, which shows that a large portion of temporal violators tend to cross in non-designated areas. This is likely due to the tendency of pedestrians to minimize their travel distance. The automated temporal violation methodology detected all the violations correctly, with no false detection of non-violators. It is useful to note that the temporal violation accuracy depends on the precision of the camera calibration as well as on the provided signal timing. See Fig. 5.d for the spatial distribution of the violating pedestrian trajectories.

5 Application 3: Examining Pedestrian Evasive Actions as a Potential Indicator for Traffic Conflicts

This study addresses the problem of understanding and detecting pedestrian evasive actions during safety-critical situations. The analysis is demonstrated on a data set collected at a busy, congested intersection in the city of Shanghai, China. The intersection has a high mix of different road-users (vehicles, motorcycles, bicycles and pedestrians). The layout of the intersection camera field of view is shown in Fig. 6. Generally, the traffic exhibited disorganised road-user behaviour and many conflicts resulting from risky actions and lack of compliance with regulations.

Fig. 6. The layout of the intersection camera field of view

Once the road-user trajectories are extracted, possible conflicts between road-users are determined by evaluating whether any of their future positions coincide spatially and/or temporally with each other. Time-to-Collision (TTC) and Post-Encroachment-Time (PET) conflict indicators are evaluated as described in [14].
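
For orientation, the two proximity indicators can be illustrated in their simplest form. The sketch below assumes constant-velocity extrapolation and a circular collision radius for TTC, and takes PET as the time between the first road-user leaving the conflict area and the second one arriving. These are simplifying assumptions and not necessarily the procedure of [14]; all positions, velocities, and times are hypothetical.

```python
import math


def time_to_collision(pos_a, vel_a, pos_b, vel_b, radius=1.0):
    """TTC in seconds under constant-velocity extrapolation, or None if no collision.

    A collision is assumed when the two positions come within `radius` meters.
    """
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]
    a = vx * vx + vy * vy
    b = 2 * (rx * vx + ry * vy)
    c = rx * rx + ry * ry - radius * radius
    if c <= 0:
        return 0.0  # already within the collision radius
    if a == 0:
        return None  # no relative motion
    disc = b * b - 4 * a * c
    if disc < 0 or b >= 0:
        return None  # paths never come within radius, or the users are separating
    return (-b - math.sqrt(disc)) / (2 * a)


def post_encroachment_time(leave_time_first, arrive_time_second):
    """Time between the first user leaving the conflict area and the second arriving."""
    return arrive_time_second - leave_time_first


# Hypothetical head-on geometry: pedestrian at 1 m/s, vehicle at 9 m/s, 21 m apart.
ttc = time_to_collision((0.0, 0.0), (1.0, 0.0), (21.0, 0.0), (-9.0, 0.0))
# The same two users moving apart never collide.
ttc_separating = time_to_collision((0.0, 0.0), (-1.0, 0.0), (21.0, 0.0), (1.0, 0.0))
pet = post_encroachment_time(12.4, 13.9)
```

Both indicators describe proximity in time only, which is why they are compared against a behaviour-based measure in this application.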

Evasive actions performed by pedestrians are reflected in changes of the pedestrian's speed or direction. However, these changes are sometimes not apparent in the signal of the pedestrian movement. Ordinal time-series analysis provides a complexity measure that can detect qualitative changes in the underlying dynamics of a time series. The ordinal analysis finds a proper abstraction of the pedestrian walking profile that prunes redundant information while retaining qualitative properties relevant to the evasive action analysis. A pedestrian trajectory with varying dynamics will therefore have a varying complexity. The dynamical complexity is measured by Permutation Entropy (PE), which is the basic Shannon entropy applied with the ordinal patterns as the symbolic words. The formal procedure of the ordinal analysis and permutation entropy is summarized in [15]. This application examines the use of PE to identify behavioural changes in the pedestrian walking pattern at the onset of potential conflicts.
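
Permutation entropy can be computed by counting the ordinal patterns of short windows of the series and taking the Shannon entropy of their relative frequencies. The sketch below is a generic illustration; the pattern order, the sliding-window length, and the synthetic speed signal are assumptions and not the settings of [15].

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns of length `order` in the series."""
    patterns = Counter()
    span = (order - 1) * delay
    for i in range(len(series) - span):
        window = [series[i + j * delay] for j in range(order)]
        # The ordinal pattern is the ranking of the values inside the window.
        pattern = tuple(sorted(range(order), key=lambda j: window[j]))
        patterns[pattern] += 1
    total = sum(patterns.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in patterns.values())
    if normalize:
        entropy /= math.log2(math.factorial(order))
    return entropy


def pe_profile(series, window=30, order=3):
    """Permutation entropy evaluated over a sliding window along the series."""
    return [
        permutation_entropy(series[i:i + window], order)
        for i in range(len(series) - window + 1)
    ]


# Synthetic speed signal (assumption): cyclic walking followed by steady acceleration.
walking = [1.4 + 0.1 * math.sin(2 * math.pi * t / 15) for t in range(60)]
running = [1.4 + 0.05 * t for t in range(60)]
profile = pe_profile(walking + running)
pe_drop = profile[0] - profile[-1]
```

In this synthetic case the cyclic walking phase produces several distinct ordinal patterns, while the monotonic acceleration phase produces only one, so the PE profile falls to zero once the window lies entirely in the second phase.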

Figure 7 illustrates two different pedestrian situations involving vehicles and their corresponding PE profiles. The first is a pedestrian who walks normally and then starts running in response to a conflict with a turning vehicle, as shown in Fig. 7a. The speed of this pedestrian shows a sudden change as a result of the evasive action. This change is reflected in the PE profile as a sharp drop. In contrast, the second example (Fig. 7b) shows a steadily moving pedestrian who maintains the same walking step pattern despite being in a conflict. In this case, the cyclic speed signal maintains a constant pattern and accordingly there is no change in the PE profile. The PE drop obtained from different pedestrian conflicts can be a good indicator of the evasive action behaviour of pedestrians. The next step is to analyse the PE drop obtained from different conflicts in relation to the evasive action exerted by the pedestrian.

Fig. 7. Pedestrians involved in conflicts and the corresponding speed and permutation entropy profiles: (a) pedestrian starts running to avoid conflict; (b) pedestrian moving steadily in conflict.

Analysis and Results. To investigate the validity of using PE profiles to identify pedestrian evasive actions, PE is compared to traditional measures of traffic conflicts (e.g. TTC and PET). A random sample of 60 pedestrian-involved conflicts is manually annotated by two expert traffic safety observers. The pedestrian conflicts in the two evasive action groups are compared in terms of severity. As expected, the breakdown of conflicts showed that the evasive action group had higher-severity conflicts than the no-evasive action group. In the results of both experts, most highly severe conflicts showed evasive actions in the pedestrian movement. This underlines the importance of detecting evasive actions to identify severe pedestrian conflicts.

The conflict indicators are calculated for the conflicts in the two groups (evasive action/no evasive action), and Fig. 8 shows the means of the different indicators for the two conflict groups. The mean of the PE drop is significantly higher for the evasive action group, while for TTC and PET the mean values are relatively similar. This result is affirmed by the p-value calculated from the analysis of variance (ANOVA) test performed on the difference between the means of the groups with and without evasive action. The difference in the PE case is significant at the 95 % confidence level, unlike the differences for TTC and PET. These results show that the PE drop value can differentiate between evasive action and non-evasive action conflicts, a capability lacking in conventional proximity measures.
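
The difference-in-means test can be illustrated by computing the one-way ANOVA F statistic for two groups. The PE drop values below are invented purely for illustration and are not results from the Shanghai data set.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical PE drop values (assumption) for the two conflict groups.
pe_drop_evasive = [0.5, 0.6, 0.55, 0.65]
pe_drop_no_evasive = [0.1, 0.15, 0.05, 0.1]

f_stat = one_way_anova_f([pe_drop_evasive, pe_drop_no_evasive])
```

The p-value follows from the F distribution with the corresponding degrees of freedom; a statistics library routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value directly.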

Fig. 8. Difference in indicator means for the evasive action/no evasive action groups and the p-value obtained from the difference in means test between the two groups

The potential of the newly developed indicator was validated on a set of pedestrian interactions reviewed and ranked by safety experts. The results showed that the PE-based indicator can identify pedestrian conflicts that involve sudden evasive action. Variations in permutation entropy are shown to be a more suitable measure of the extent of the evasive action than TTC and PET in less-organized traffic environments with a high road-user mix.

6 Summary

Recent advances in computer vision have encouraged the analysis of surrogate safety measures such as conflict and violation detection. This paper demonstrated the latest developments at locations considered to be high risk for pedestrian crossing. The paper showed that it is possible to identify hazardous situations such as distraction and jaywalking. It also demonstrated a technique to identify pedestrian evasive actions performed to avoid collision with vehicular traffic. The results on the studied video data sets showed that the proposed approach is promising. The amount of manual intervention needed to collect data on pedestrians in transportation facilities can be significantly reduced by deploying the proposed approach. Such an improvement in data collection can have a significant impact on the study of pedestrian walking behavior and crossing decisions, and could potentially lead to better microscopic pedestrian modeling. The use of computer vision techniques for measuring gait parameters has several advantages, such as capturing the natural movement of pedestrians and minimizing the risk of disturbing the behavior of the observed subjects.

Expanding on this work would involve investigating the relationship between violations and other traffic factors such as wait time and the design characteristics of the intersection. Other directions would involve studying the effect of violations on safety [16]. This can be made possible by defining severity profiles as a safety measure and by developing relationships between violations and other safety conflict indicators. Some challenges remain to be addressed, including scalability and the efficient management of computing resources. One area of potential research is to assess the relationship between tracking accuracy and pedestrian density. Finally, more experimental results at different intersections are desirable to obtain a robust estimate of the practicality of the approach.