1 Introduction

Traffic congestion arises when the load on urban roads exceeds the designed capacity of the traffic system; it is especially common during commuting peaks and in poor weather. It is a bane of urban life, particularly in megacities, where it significantly increases travel costs for residents (Ke et al. 2020), causes more traffic accidents (Retallack et al. 2019), and makes traffic management extremely difficult (Praveen et al. 2021).

With large numbers of data acquisition devices densely distributed across the road network, it is possible to assess traffic characteristics using high-volume, real-time, and high-accuracy data collected from multiple autonomous sources (Wu et al. 2014). In the case of traffic congestion, these data include GPS data, map application data, data from large numbers of sensors, and so on. Big data offers advantages over conventional data sources in terms of volume, velocity, variety, and veracity (Yaqoob et al. 2016), and effective analysis of such data can reveal insights into smart cities (Chauhan et al. 2016). It is therefore important to analyze traffic congestion and its characteristics with big data technology for better traffic control and management.

Traffic congestion patterns refer to the daily curves of the traffic congestion index, distinguished by their curve characteristics (Zhao and Hu 2019). Understanding urban traffic congestion patterns and their spatial–temporal evolution supports accurate prediction of traffic conditions and gives urban residents the information they need to optimize their daily travel decisions. From the macro perspective of urban management, it provides a basis for road construction and city planning (Torkjazi et al. 2018). Understanding how the urban traffic situation evolves also helps to judge and forecast the level and direction of regional economic development (Li et al. 2019).

The analytical framework of this study is shown in Fig. 1. First, for data preprocessing, a linear interpolation method is used to fill in missing values and a 2-sigma rule is used to identify and modify outliers. Second, an improved weighted K-means clustering method is proposed to identify the traffic congestion patterns; it applies a weighting operation to the daily traffic congestion index data before the clustering process is carried out. Finally, the spatial–temporal variations of traffic congestion patterns arising from differences in space, differences in time, and the automobile license plate restriction are analyzed.

Fig. 1 The analytical framework of this work

The main contributions of this research are summarized as follows. First, by modifying the typical K-means clustering method, a novel clustering method for time series data is proposed to identify the traffic congestion patterns. Second, paired t-tests on real-life traffic congestion index data from Beijing reveal that the traffic congestion patterns are both spatially dependent (the number and shape of traffic congestion patterns differ significantly across regions) and temporally dependent (both the variation of dates and the automobile license plate restriction affect the traffic congestion patterns). This work strengthens the understanding of urban traffic congestion patterns and their spatial–temporal characteristics, which is helpful for accurate prediction of traffic conditions and for precise decisions in traffic operations management.

The remainder of this work is organized as follows. Section 2 reviews the related research. Section 3 describes the traffic congestion index data in Beijing. Section 4 introduces the data preprocessing process. Section 5 presents the methodologies used in this research. Section 6 shows the identification results for traffic congestion patterns and their spatial–temporal characteristics. Section 7 concludes the paper with a brief summary and gives some potential directions for future research.

2 Literature review

In recent years, scholars have paid increasing attention to traffic congestion forecasting from the perspectives of traffic flow (Angayarkanni et al. 2021), traffic velocity (Jiang et al. 2021), delay time (Shelke et al. 2019), traffic cost (Tian et al. 2010), traffic congestion index (Wang et al. 2018), and so on. Here the traffic congestion index is a comprehensive, integrated indicator, defined as a conceptual value that synthetically reflects traffic conditions; a higher traffic congestion index corresponds to heavier traffic congestion. Su et al. (2019) treated the total number of vehicles in the system as varying over time and proposed a dynamic stochastic differential model, based on Markov chain theory, to describe traffic flow. Using traffic flow data from the I-80 Freeway Dataset of the NGSIM program, they showed that the proposed approach provided more accurate predictions of traffic flow. Sanchez-Cambronero et al. (2017) used the plate scanning technique to propose an algorithm that minimizes the required number of registering devices and determines their locations, in order to identify candidate vehicles for computing and predicting the travel times of a given set of routes (or sub-routes). Wang et al. (2017) pointed out that PageRank values can act as signals for predicting upcoming traffic congestion, and observed this behavior experimentally using one month of trajectory data from 12,000 taxis in Beijing.

In addition, the existing literature has carried out a large number of analyses of traffic congestion characteristics. For example, by computing urban traffic evolution on a temporal complex network with PageRank, Wang et al. (2017) found that the congestion degree of a local region is affected not only by the traffic states of its neighboring regions but also by those of the whole network. ShirMohammadi et al. (2020) analyzed the traffic density, congestion index, and peak hours for the main network of Hamedan communication routes based on collected speed performance data, and simulated the relationship between traffic velocity and congestion index using a neural network and a genetic algorithm. Kan et al. (2019) proposed a traffic feature analysis and classification approach to detect traffic congestion at the turn level from taxis’ GPS trajectories. A case study in Wuhan supported the feasibility of this approach and showed that it can sense traffic congestion at a lower cost than other approaches. Chen et al. (2021a, b) proposed a new categorization criterion that defines five levels of traffic conditions based on speed performance index values, and applied it in a case study of the daily curves of speed performance index data in Beijing. The curves were found to vary significantly in shape across days. Other research has detected traffic congestion characteristics from other perspectives, with results likewise showing significant differences across days (Kim et al. 2019).

Traffic congestion patterns refer to the data curves of traffic congestion index in one day with different curve characteristics (Zhao and Hu 2019). The existing literature mainly concerns traffic congestion forecasting and the analysis of traffic congestion characteristics, while analysis of traffic congestion patterns is very rare. Wen et al. (2014) selected eight evaluation indices of traffic congestion in the morning and evening peak hours, and then proposed a hierarchical clustering analysis method to classify the pattern characteristics of traffic congestion. The results revealed that weekdays included Normal Weekdays, Key Congested Weekdays, and Most Congested Weekdays. Sun et al. (2019) adopted a hierarchical clustering algorithm to study the congestion patterns in Qingdao based on traffic performance index (TPI) data. The results showed that there were three categories of traffic congestion pattern: Workdays, Latter half of vacation (October 4th–8th), and Weekends and the beginning of vacation (October 1st–3rd). Based on macro traffic congestion index data in Beijing, Zhao and Hu (2019) applied K-means cluster analysis and revealed two typical traffic congestion patterns on weekdays, i.e., weekday mode A and weekday mode B. The former often appeared on Mondays and was characterized by distinct morning and evening peaks with similar congestion durations, while the latter often appeared on Fridays and was characterized by an evening peak whose height and duration were significantly greater than those of the morning peak.

The above research offers useful guidance for urban traffic management at the strategic level, but does not answer the following questions at the operational level: (1) Is the traffic congestion pattern spatially dependent, and if so, should spatially differentiated traffic congestion management be carried out? (2) Is the traffic congestion pattern temporally dependent, and if so, should temporally differentiated traffic congestion management be carried out? The motivation of this research is to answer these questions using micro traffic congestion index data, which can provide more valuable information for traffic management, planning, and policy-making. A comparative analysis of the existing literature and this research is shown in Table 1.

Table 1 Comparative analysis of related works

3 Traffic congestion index data

The traffic congestion index is a conceptual value that synthetically reflects road traffic conditions (Zhao et al. 2019), and it has been widely studied in the literature as an indicator of the urban traffic situation (Wen et al. 2014; Sun et al. 2019; ShirMohammadi et al. 2020). As the capital of China, Beijing is a typical megacity, with 21.89 million permanent residents and 6.57 million motor vehicles by the end of 2020. The road network of Beijing is a ring road system radiating through the urban districts. Since 2006, Beijing has used the traffic congestion index as the core evaluation indicator of traffic conditions, and publishes the real-time traffic congestion index to the public through the Internet and apps [Footnote 1]. As shown in Fig. 2, the traffic congestion index \(R\) ranges from 0 to 10 and traffic conditions are divided into five grades: no congestion \(\left( {0 \le R < 2} \right)\), less congestion \(\left( {2 \le R < 4} \right)\), congestion \(\left( {4 \le R < 6} \right)\), medium congestion \(\left( {6 \le R < 8} \right)\), and serious congestion \(\left( {8 \le R \le 10} \right)\). The higher the traffic congestion index, the heavier the traffic congestion (Wang et al. 2017). For example, the traffic congestion index was 6.6 at time segment [10:00, 10:05) on May 26, 2021, which corresponds to medium congestion. In this study, we collect the traffic congestion index data from January 1, 2017 to December 31, 2017, comprising over 670,000 records that cover six urban districts (Dongcheng, Xicheng, Chaoyang, Haidian, Fengtai, and Shijingshan). The sampling step is five minutes, i.e., the system records one data point every five minutes. As a result, the whole day (0:00–24:00) is partitioned into 288 time segments of 5 min each. An example of the recorded data is shown in Table 2.

Fig. 2 Real-time traffic congestion index in Beijing

Table 2 An example of traffic congestion index data in Beijing
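The five-grade scale and the 5-minute segmentation described in this section can be written down compactly. The following Python sketch is illustrative only: the names `congestion_grade` and `segment_index`, and the 0-based numbering of time segments, are our assumptions and are not part of the published system.

```python
# Upper bounds (exclusive) of the first four grades; the fifth grade closes at 10.
GRADES = [
    (2.0, "no congestion"),
    (4.0, "less congestion"),
    (6.0, "congestion"),
    (8.0, "medium congestion"),
]

SEGMENT_MINUTES = 5
SEGMENTS_PER_DAY = 24 * 60 // SEGMENT_MINUTES  # 288 segments of 5 min


def congestion_grade(r: float) -> str:
    """Map a congestion index R in [0, 10] to one of the five grades."""
    if not 0.0 <= r <= 10.0:
        raise ValueError("congestion index must lie in [0, 10]")
    for upper, label in GRADES:
        if r < upper:
            return label
    return "serious congestion"  # 8 <= R <= 10


def segment_index(hour: int, minute: int) -> int:
    """Return the 0-based index of the 5-min segment containing hour:minute."""
    return (hour * 60 + minute) // SEGMENT_MINUTES
```

Under these assumptions, `congestion_grade(6.6)` falls in the medium congestion band, in line with the example in the text, and the segment [10:00, 10:05) receives index 120.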

4 Data preprocessing

Owing to mechanical failure or human error, missing values and outliers are inevitable in the raw traffic congestion index data, so data preprocessing is necessary before the data analysis. For the Beijing traffic congestion index data, about 2.13% of values are missing, and more than 80% of these occur as single missing values. An example of a missing value is shown in Table 3, in which the sample with ID 61,094,687 has the congestion index value “Nan”, indicating that the congestion index at time segment [7:50, 7:55) is missing. An outlier is an observation whose value lies far from the other observations in a dataset. In Table 4, the traffic congestion index between 3:45 a.m. and 3:50 a.m. in Shijingshan district is generally less than 1.5 from January 9, 2017 to January 18, 2017, but it suddenly rises to 9.8 on January 13, 2017, which can be considered an outlier. Missing values and outliers in the time series may distort the shape of traffic congestion patterns; therefore, missing values should be filled in and outliers modified first.

Table 3 An example of missing value
Table 4 An example of outlier

In the literature, a great number of methods have been developed for filling in missing values. Among them, the linear interpolation method (Lu et al. 2003) is typically used for short runs of missing values, while the empirical orthogonal function (Beckers et al. 2003), the Gamma distribution function (Simolo et al. 2009), and the autoregressive model (Kim et al. 2015) are more appropriate for long runs of missing values. Since only single values or short runs are missing in the traffic congestion index data, the linear interpolation method is adopted here; it has been widely used in the preprocessing of transportation data (Degen et al. 2007; Zhao et al. 2019; Sun et al. 2021). Assume that missing values are detected at successive time segments \(i = 1,2, \cdots ,I\), while \(x_{0}\) is the recorded congestion index at time segment \(i = 0\) and \(x_{I + 1}\) is the recorded congestion index at time segment \(i = I + 1\). The linear interpolation method approximates the missing values \(x_{i}\) as follows

$$x_{i} = x_{0} + \frac{i}{I + 1} \times \left( x_{I + 1} - x_{0} \right), \quad \forall \, i = 1,2, \cdots ,I.$$
(1)

Taking Table 3 for example, there is only one missing value detected at time segment [7:50, 7:55). In this case, we have \(I = 1\) and \(x_{1} = \left( {x_{0} + x_{2} } \right)/2\). The filled congestion index at time segment [7:50, 7:55) should be \(\left( {7.7 + 8.1} \right)/2 = 7.9.\)
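Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the authors' implementation: missing values are encoded as `float("nan")`, the function name `fill_missing_linear` is hypothetical, and gaps touching the start or end of the series are left unfilled because Eq. (1) needs an observed value on both sides.

```python
import math
from typing import List


def fill_missing_linear(series: List[float]) -> List[float]:
    """Fill interior runs of NaN by linear interpolation, following Eq. (1)."""
    filled = list(series)
    n = len(filled)
    t = 0
    while t < n:
        if not math.isnan(filled[t]):
            t += 1
            continue
        start = t                      # first missing position of the run
        while t < n and math.isnan(filled[t]):
            t += 1
        end = t                        # first observed position after the run
        if start == 0 or end == n:
            continue                   # no neighbour on one side: leave as NaN
        x0, x_next = filled[start - 1], filled[end]
        gap = end - start              # I in Eq. (1)
        for i in range(1, gap + 1):
            filled[start - 1 + i] = x0 + i / (gap + 1) * (x_next - x0)
    return filled


# The single-gap case from Table 3: neighbours 7.7 and 8.1.
example = fill_missing_linear([7.7, float("nan"), 8.1])
```

For the single missing value discussed above, the filled entry is the midpoint of its two neighbours, which agrees with the hand calculation up to floating-point rounding.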

The detection and modification of outliers in time series are key steps in data preprocessing. Existing methods mainly include the 2-sigma rule (Li et al. 2015), the 3-sigma rule (Klos et al. 2015), maximum likelihood estimation (Lee et al. 2006), the Bayesian method (Kruschke et al. 2012), and the multilevel model (Shi et al. 2008). To detect outliers sensitively and modify them conveniently, the 2-sigma rule is employed to handle the traffic congestion index data before feeding them into the clustering analysis algorithm; this rule has been widely used in transportation data analysis and has achieved good performance (Li et al. 2015). Denote the daily time series of the traffic congestion index as an M-dimensional vector, where M is the number of sampling points each day. For example, if the sampling step is 5 min, then there are 12 sampling points each hour and 288 sampling points each day, so that M = 288. Let N denote the number of observation days and \(x_{n}^{m}\) the observed congestion index at sampling time segment m on the nth day. Then the traffic congestion index can be written as

$$X_{n} = \left( {x_{n}^{1} ,x_{n}^{2} , \cdots ,x_{n}^{M} } \right), \, \forall \, n = 1,2, \cdots ,N.$$
(2)

The intraday trend \(\overline{X}\) among these observation days, which represents the average value of daily traffic congestion index, can be formulated as

$$\overline{X} = \left( \overline{x}^{1}, \overline{x}^{2}, \cdots, \overline{x}^{M} \right) = \left( \frac{1}{N}\sum\limits_{n = 1}^{N} x_{n}^{1}, \frac{1}{N}\sum\limits_{n = 1}^{N} x_{n}^{2}, \cdots, \frac{1}{N}\sum\limits_{n = 1}^{N} x_{n}^{M} \right).$$
(3)

The residual fluctuations of the nth day are

$$r_{n} = X_{n} - \overline{X} = \left( r_{n}^{1}, r_{n}^{2}, \cdots, r_{n}^{M} \right), \quad \forall \, n = 1,2, \cdots ,N.$$
(4)

Finally, the sample standard deviation \(\sigma^{m}\) is calculated as the root mean square of the residuals \(r_{1}^{m} ,r_{2}^{m} , \cdots ,r_{N}^{m}\), for \(m = 1,2, \cdots ,M\). A point \(x_{n}^{m}\) is defined as an outlier if the absolute residual \(\left| {r_{n}^{m} } \right|\) is greater than twice the sample standard deviation \(\sigma^{m}\). In this case, the observed value \(x_{n}^{m}\) is modified to \(\overline{x}^{m} - 2\sigma^{m}\) or \(\overline{x}^{m} + 2\sigma^{m}\), whichever bound it exceeds. Otherwise, it is regarded as a regular point and its value remains unchanged. The outlier detection and modification procedure is as follows:

$$x_{n}^{m} = \begin{cases} \overline{x}^{m} + 2\sigma^{m}, & \text{if } r_{n}^{m} > 2\sigma^{m}, \\ x_{n}^{m}, & \text{if } - 2\sigma^{m} \le r_{n}^{m} \le 2\sigma^{m}, \\ \overline{x}^{m} - 2\sigma^{m}, & \text{if } r_{n}^{m} < - 2\sigma^{m}, \end{cases} \quad \forall \, n = 1,2, \cdots ,N, \; m = 1,2, \cdots ,M.$$
(5)
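The procedure in Eqs. (2) to (5) can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `clip_two_sigma` is hypothetical, the data are stored as a list of daily vectors, and the standard deviation is assumed to use the divisor N (the text does not fix the divisor at this point).

```python
import math
from typing import List


def clip_two_sigma(days: List[List[float]]) -> List[List[float]]:
    """Replace points farther than 2 sigma from the intraday trend by the bound."""
    n_days = len(days)
    n_segments = len(days[0])
    cleaned = [list(day) for day in days]
    for m in range(n_segments):
        column = [days[n][m] for n in range(n_days)]
        mean = sum(column) / n_days                        # Eq. (3), segment m
        # Assumption: divisor N for the standard deviation of the residuals.
        sigma = math.sqrt(sum((x - mean) ** 2 for x in column) / n_days)
        for n in range(n_days):
            residual = column[n] - mean                    # Eq. (4)
            if residual > 2 * sigma:                       # Eq. (5), upper case
                cleaned[n][m] = mean + 2 * sigma
            elif residual < -2 * sigma:                    # Eq. (5), lower case
                cleaned[n][m] = mean - 2 * sigma
    return cleaned


# Hypothetical single-segment data: nine quiet days and one spike of 9.8.
raw = [[1.0] for _ in range(10)]
raw[4] = [9.8]
cleaned = clip_two_sigma(raw)
```

Note that the mean and standard deviation are computed from the raw column, including the outlier itself, so the spike is pulled back to the 2-sigma bound rather than to the level of the other days.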

5 Research methodologies

Traffic congestion patterns refer to the data curves of traffic congestion index in one day with different curve characteristics (Zhao and Hu 2019). In this section, an improved weighted K-means clustering method is proposed to identify traffic congestion patterns, and the paired t-test method is introduced to analyze the temporal and spatial dependence.

5.1 Improved weighted K-means clustering method

Time series data is multi-dimensional, dynamic and temporal-dependent. Although time series data is composed of multiple data samples connected by time points, it can also be expressed as a single object to be clustered in the form of column vector. Assume that \(D = \left\{ {X_{1} ,X_{2} , \cdots ,X_{N} } \right\}\) is a set of time series data where \(X_{n}\) represents a column vector. The target of time series clustering is to divide the given set into K different types of clusters represented as \(C = \left\{ {C_{1} ,C_{2} , \cdots ,C_{K} } \right\}\) in an unsupervised way, where \(C_{k}\) is defined as the kth cluster and \(D = \bigcup\nolimits_{k = 1}^{K} {C_{k} }\).

The K-means clustering method uses an iterative process to partition a collection of sampling points into subsets known as clusters (Li et al. 2012; Yang et al. 2018; Xu et al. 2020; Chen et al. 2021a, b). Assuming that there are K clusters in the sample dataset, the target of K-means clustering is to minimize the total deviation

$$\sum\limits_{k = 1}^{K} {\sum\limits_{{X_{n} \in C_{k} }}^{{}} {\sum\limits_{m = 1}^{M} {\left( {x_{n}^{m} - u_{k}^{m} } \right)^{2} } } } ,$$
(6)

where \(X_{n} = \left( {x_{n}^{1} ,x_{n}^{2} , \cdots ,x_{n}^{M} } \right)\) represents an M-dimension sample, and \(U_{k} = \left( {u_{k}^{1} ,u_{k}^{2} , \cdots ,u_{k}^{M} } \right)\) is an M-dimension vector representing the center of cluster k, which is calculated as

$$u_{k}^{m} = \frac{1}{{|C_{k} |}}\sum\limits_{{X_{n} \in C_{k} }} {x_{n}^{m} } , \, \forall \, m = 1,2, \cdots ,M,k = 1,2, \cdots ,K.$$
(7)

A cluster contains the cluster center and the data samples assigned to it. Each time a data sample is allocated, the cluster center is recalculated from the objects currently in the cluster. This process is repeated until the termination condition is satisfied, for example when the cluster centers remain unchanged or the sum of squared errors reaches a local minimum. Owing to its good performance and computing efficiency, the K-means clustering algorithm has been widely used in transportation data analysis (Zhao et al. 2019; Sun et al. 2021).

Based on the preprocessed time series data of the traffic congestion index, an improved weighted K-means clustering method is proposed to identify traffic congestion patterns by assigning different weights to the M sampling time segments. Specifically, sampling time segments with higher dispersion in the daily congestion index are assigned greater weights to strengthen their role in the clustering process. Conversely, sampling time segments with lower dispersion are assigned smaller weights to weaken their influence. Here the Coefficient of Variation (Arachchige et al. 2020) is used to measure the degree of dispersion, that is,

$$CV_{m} = \frac{\sigma_{m}}{\overline{x}^{m}}, \, \forall \, m = 1,2, \cdots ,M,$$
(8)
$$\overline{x}^{m} = \frac{1}{N}\sum\limits_{n = 1}^{N} {x_{n}^{m} } , \, \forall \, m = 1,2, \cdots ,M,$$
(9)
$$\sigma_{m} = \sqrt {\frac{1}{N}\sum\limits_{n = 1}^{N} {\left( {x_{n}^{m} - \overline{x}^{m} } \right)^{2} } } , \, \forall \, m = 1,2, \cdots ,M,$$
(10)

where \(CV_{m}\) represents the degree of dispersion at sampling time segment m, \(\overline{x}^{m}\) represents the sample mean of \(x_{n}^{m}\) at sampling time segment m, and \(\sigma_{m}\) represents the sample standard deviation of \(x_{n}^{m}\) at sampling time segment m. The coefficient of variation is an appropriate choice of weight because it accounts for both the stability and the volatility of the time series data.
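The weights in Eqs. (8) to (10) can be computed per time segment as in the following sketch. The function name `coefficient_of_variation` is hypothetical, and raising an error for a zero mean is our own choice, since the text does not discuss that case.

```python
import math
from typing import List


def coefficient_of_variation(days: List[List[float]]) -> List[float]:
    """Return CV_m = sigma_m / mean_m for every sampling time segment m."""
    n_days = len(days)
    weights = []
    for column in zip(*days):                      # one column per segment m
        mean = sum(column) / n_days                # Eq. (9)
        if mean == 0:
            # Assumption: a zero mean is treated as an error, not handled.
            raise ValueError("CV is undefined for a zero-mean segment")
        sigma = math.sqrt(sum((x - mean) ** 2 for x in column) / n_days)  # Eq. (10)
        weights.append(sigma / mean)               # Eq. (8)
    return weights


# Hypothetical data: two days, two segments; only the first segment varies.
cv = coefficient_of_variation([[1.0, 2.0], [3.0, 2.0]])
```

A segment whose value is identical on every day receives a weight of zero, so it cannot influence the clustering at all; segments that vary strongly relative to their mean dominate.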

Based on the value of \(CV_{m}\), a weighted K-means clustering method is proposed to partition the time series data of traffic congestion index. The objective is to minimize the total weighted deviations to the cluster centers

$$\sum\limits_{k = 1}^{K} {\sum\limits_{{X_{n} \in C_{k} }}^{{}} {\sum\limits_{m = 1}^{M} {\left( {CV_{m} x_{n}^{m} - u_{k}^{m} } \right)^{2} } } } ,$$
(11)

where \(X_{n} = \left( {x_{n}^{1} ,x_{n}^{2} , \cdots ,x_{n}^{M} } \right)\) represents the time series data with M sampling points in a day, and \(U_{k} = \left( {u_{k}^{1} ,u_{k}^{2} , \cdots ,u_{k}^{M} } \right)\) represents the weighted center of cluster k, which is defined as

$$u_{k}^{m} = \frac{1}{|C_{k}|}\sum\limits_{X_{n} \in C_{k}} CV_{m} x_{n}^{m}, \, \forall \, m = 1,2, \cdots ,M.$$
(12)

To determine the best number of clusters, i.e., the value of K, the Silhouette Coefficient (Rousseeuw 1987) is used to evaluate the clustering performance associated with each value of K, and the value that maximizes the clustering performance is selected. First, for each sample \(X_{n}\), its Silhouette Coefficient is defined as

$$s\left( {X_{n} } \right) = \frac{{b_{n} - a_{n} }}{{\max \left\{ {a_{n} ,b_{n} } \right\}}},$$
(13)

where \(a_{n}\) represents the average Euclidean distance between sample \(X_{n}\) and all the other samples in its cluster, and \(b_{n}\) represents the average Euclidean distance between sample \(X_{n}\) and all samples in its nearest cluster. Note that the Silhouette Coefficient is defined only when there are at least two clusters, i.e., \(K \ge 2\). Second, the Silhouette Coefficient for the whole dataset is defined as the mean Silhouette Coefficient over all samples, that is,

$$S = \frac{{s\left( {X_{1} } \right) + s\left( {X_{2} } \right) + \cdots + s\left( {X_{N} } \right)}}{N},$$
(14)

which takes values in [−1, 1]. The closer the value is to 1, the better the clustering results.
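Eqs. (13) and (14) translate directly into code. The sketch below is illustrative: the name `silhouette_coefficient` is hypothetical, and assigning a score of 0 to a sample that is alone in its cluster (or when both distances are zero) is a common convention that we assume here, since the text does not cover those cases.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_coefficient(samples: List[Sequence[float]], labels: List[int]) -> float:
    """Mean Silhouette Coefficient over all samples, Eqs. (13) and (14)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("the Silhouette Coefficient requires K >= 2")
    scores = []
    for i, x in enumerate(samples):
        own = [euclidean(x, y) for j, y in enumerate(samples)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)          # assumption: singleton cluster scores 0
            continue
        a = sum(own) / len(own)         # mean distance within own cluster
        b = min(                        # mean distance to the nearest other cluster
            sum(euclidean(x, y) for j, y in enumerate(samples) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical one-dimensional data with two well-separated groups.
S = silhouette_coefficient([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])
```

To select K as described in the text, one would run the clustering for each candidate K, evaluate this score on the resulting labels, and keep the K with the largest value.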

Based on the above description, the general procedure for such a weighted K-means clustering method is summarized in Algorithm 1.

Algorithm 1 The weighted K-means clustering method
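Because the algorithm listing is not reproduced here, the following Python sketch shows one way the weighted objective in Eqs. (11) and (12) could be minimized. It is not the authors' Algorithm 1: it assumes a batch (Lloyd-style) update in which all samples are reassigned before the centers are recomputed, random initialization from the data with a fixed seed, and hypothetical names `cv_weights` and `weighted_kmeans`.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]


def cv_weights(days: List[Vector]) -> Vector:
    """Coefficient of variation per time segment, Eqs. (8) to (10)."""
    n = len(days)
    weights = []
    for column in zip(*days):
        mean = sum(column) / n
        if mean == 0:
            raise ValueError("CV is undefined for a zero-mean segment")
        sigma = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
        weights.append(sigma / mean)
    return weights


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(days: List[Vector], k: int, seed: int = 0,
                    max_iter: int = 100) -> Tuple[List[int], List[Vector]]:
    """Cluster CV-scaled daily curves; returns (labels, weighted centers)."""
    weights = cv_weights(days)
    # Scale every coordinate by CV_m, as inside the bracket of Eq. (11).
    scaled = [[w * x for w, x in zip(weights, day)] for day in days]
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(scaled, k)]   # assumed initialization
    labels = [-1] * len(scaled)
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in scaled]
        if new_labels == labels:                          # assignments are stable
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(scaled, labels) if lab == j]
            if members:                                   # keep old center if empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]  # Eq. (12)
    return labels, centers


# Hypothetical toy data: two low days and two high days, two segments each.
labels, centers = weighted_kmeans([[1.0, 1.0], [1.2, 1.1], [5.0, 6.0], [5.2, 6.1]], k=2)
```

Scaling the data once and then running ordinary K-means on the scaled vectors is equivalent to minimizing Eq. (11) with the centers of Eq. (12), which is why no weight appears inside the distance function.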

5.2 Paired t-test method

The paired t-test is used to test whether the mean difference between two sets of paired sample data is zero. It can also be applied to observations of the same event under different conditions, in order to evaluate the influence of the conditions on the event (Konietschke et al. 2014). The test is based on the differences between the two values of each pair, denoted as \(\left\{ {d_{1} ,d_{2} , \cdots ,d_{L} } \right\}\), and the test statistic t is calculated as

$$t = \frac{\sum\nolimits_{l = 1}^{L} d_{l}}{\sqrt{\dfrac{L\left( \sum\nolimits_{l = 1}^{L} d_{l}^{2} \right) - \left( \sum\nolimits_{l = 1}^{L} d_{l} \right)^{2}}{L - 1}}},$$
(15)

where L is the number of observations in a set of sample data.

If the two-tailed P value corresponding to the test statistic t with L − 1 degrees of freedom is less than the chosen significance level (e.g., 0.10, 0.05, or 0.01), the difference between the two sets of sample data is significant (Hong et al. 2017).
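The statistic in Eq. (15) can be computed with the standard library alone, as in the sketch below. The function name `paired_t_statistic` and the sample values are hypothetical. The P value is deliberately not computed, because it requires the cumulative distribution of Student's t with L − 1 degrees of freedom, which a statistical table or a statistics package would supply.

```python
import math
from typing import Sequence, Tuple


def paired_t_statistic(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for paired samples, following Eq. (15)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    L = len(diffs)
    if L < 2:
        raise ValueError("at least two pairs are needed")
    total = sum(diffs)
    total_sq = sum(d * d for d in diffs)
    variance_term = (L * total_sq - total ** 2) / (L - 1)
    if variance_term <= 0:
        raise ValueError("differences have zero variance; t is undefined")
    return total / math.sqrt(variance_term), L - 1


# Hypothetical paired observations whose differences are 1, 2, 3, 2.
t_value, dof = paired_t_statistic([2.0, 4.0, 6.0, 6.0], [1.0, 2.0, 3.0, 4.0])
```

The same value follows from the more familiar form, mean difference divided by its standard error, which is a quick way to check an implementation.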

6 Case studies

In this section, case studies are presented in detail. First, traffic congestion patterns in different districts of Beijing are identified using the proposed weighted K-means clustering method. Second, the temporal dependence of traffic congestion patterns is examined using the paired t-test method. Finally, the spatial dependence of traffic congestion patterns is tested by analyzing the indicator values of different traffic congestion patterns in different districts.

6.1 Identification of traffic congestion patterns

Based on the weighted K-means clustering method, this subsection identifies the traffic congestion patterns in different districts of Beijing. The Silhouette Coefficients for different numbers of clusters are shown in Table 5. They imply that the traffic congestion patterns are spatially dependent: the districts closer to the downtown (i.e., Chaoyang, Dongcheng, and Xicheng) have three categories of congestion patterns, while the districts farther from the downtown (i.e., Haidian, Fengtai, and Shijingshan) have two categories of congestion patterns.

Table 5 The optimal number of clusters at different districts

First, take the clustering results in Haidian district as an example. In Fig. 3a, there are two representative traffic congestion patterns on weekdays. The first pattern is less congested and often appears in the first half or middle of the working week (i.e., Monday, Tuesday, and Wednesday), while the second pattern is more congested and generally appears in the second half of the working week (i.e., Thursday and Friday). For simplicity, they are hereinafter named the MTW pattern and the TF pattern, respectively. The main difference between them occurs from 6:00 to 23:00, and they are relatively consistent from 0:00 to 6:00. The trends of the two curves are basically the same, with morning peaks around 8:00 and evening peaks around 18:00, but the TF pattern takes clearly higher values than the MTW pattern. The characteristics of traffic congestion patterns in Fengtai district and Shijingshan district, shown in Fig. 3c, e, are similar to those in Haidian district.

Fig. 3 The traffic congestion patterns at different districts

In contrast, there are three traffic congestion patterns on weekdays in Chaoyang district. In Fig. 3b, in addition to the MTW pattern and the TF pattern, there is a holiday pattern, hereinafter named the H pattern. Compared with the MTW and TF patterns, the H pattern is the least congested, and it often appears on working days within 3 days before or after holidays (e.g., the Spring Festival, the Mid-Autumn Festival, the National Day, the Dragon Boat Festival). Again, the trends of these three patterns are basically the same, but their values differ significantly. The characteristics of traffic congestion patterns in Dongcheng district and Xicheng district, shown in Fig. 3d, f, are similar to those in Chaoyang district.

Zhao and Hu (2019) revealed two typical traffic congestion patterns on weekdays in Beijing by applying K-means cluster analysis, i.e., weekday mode A and weekday mode B. The former often appeared on Mondays, while the latter often appeared on Fridays. As shown in Table 5, when the number of congestion pattern clusters is two in Chaoyang, the corresponding Silhouette Coefficient is 0.32, which is not its optimal value. The Silhouette Coefficient reaches its optimal value of 0.34 when the number of congestion pattern clusters is three. Similar clustering results occur in Dongcheng and Xicheng. Therefore, it is more reasonable to divide the traffic congestion patterns of Chaoyang, Xicheng, and Dongcheng into three categories rather than two.

6.2 Temporal dependence of traffic congestion patterns

Now the temporal dependence of traffic congestion patterns can be examined. As described in Sect. 6.1, the clustering results of traffic congestion patterns are consistent with the variation of dates, which is also how each pattern is named. Take Haidian district as an example (see Table 6 and Fig. 4a). The MTW pattern includes 42 Mondays, 32 Tuesdays, 34 Wednesdays, 10 Thursdays, and 2 Fridays; the total proportion of Mondays, Tuesdays, and Wednesdays is 90%, and the total proportion of Thursdays and Fridays is only 10%. The TF pattern includes 4 Mondays, 16 Tuesdays, 16 Wednesdays, 40 Thursdays, and 48 Fridays; the proportion of Thursdays and Fridays is 71%, and the proportion of Mondays, Tuesdays, and Wednesdays is 29%.

Table 6 Statistical results of congestion patterns across weekdays at different districts
Fig. 4 Proportions of Mon., Tue., Wed., Thur., and Fri. across different congestion patterns

Similarly, in Chaoyang district (see Table 6 and Fig. 4b), the MTW pattern includes 38 Mondays, 38 Tuesdays, 27 Wednesdays, 16 Thursdays, and 7 Fridays; the total proportion of Mondays, Tuesdays, and Wednesdays is 81%, and the total proportion of Thursdays and Fridays is only 19%. The TF pattern includes 0 Mondays, 3 Tuesdays, 12 Wednesdays, 25 Thursdays, and 39 Fridays; the total proportion of Thursdays and Fridays is 81%, and the total proportion of Mondays, Tuesdays, and Wednesdays is only 19%. The H pattern is evenly distributed from Monday to Friday, but its most significant distribution characteristic is that it includes 32 working days within 3 days before or after holidays, which account for 82% of the total number of days in the H pattern.

Based on the above analysis, it is concluded that the variation of dates greatly affects the congestion patterns but cannot completely explain them. It can therefore be inferred that other factors also affect the congestion patterns.

The automobile license plate restriction (ALPR) sets out rules that restrict automobile travel on particular dates. For example, driving can be restricted for private cars based on vehicle license plate numbers. In detail, vehicles with license numbers ending in 0 or 5 are prohibited from driving on Mondays; those ending in 1 or 6 on Tuesdays; those ending in 2 or 7 on Wednesdays; those ending in 3 or 8 on Thursdays; and those ending in 4 or 9 on Fridays. Generally speaking, the ALPR rules are updated quarterly and there are no driving restrictions on weekends. In China, ALPR is commonly implemented as a measure to reduce traffic congestion in megacities, e.g., Beijing, Tianjin, Guangzhou, and Chengdu.

The influence of ALPR on the traffic congestion patterns is analyzed in this subsection. The statistical results on congestion patterns across no-drive days in all six districts are shown in Table 7. They indicate that, in the TF pattern, days on which license numbers ending in 4 and 9 are restricted account for the highest proportion. The number of such days increases from the MTW pattern to the TF pattern in Haidian, Fengtai, and Shijingshan, and increases from the H pattern to the MTW pattern and then to the TF pattern in Chaoyang, Dongcheng, and Xicheng.

Table 7 Statistical results of congestion patterns across no-drive days

For each district q under restriction scenario p, the congestion degree is defined as the sum, over all congestion patterns, of the mean congestion index of each pattern multiplied by the proportion of days belonging to that pattern. The higher the congestion degree, the more serious the congestion under that restriction scenario. If \(w_{pq}\) denotes the congestion degree in district q under restriction scenario p, then we have

$$w_{pq} = \sum\limits_{k = 1}^{K} {e_{pq}^{k} \lambda_{pq}^{k} } ,$$
(16)

where \(e_{pq}^{k}\) represents the mean congestion index of the kth congestion pattern, and \(\lambda_{pq}^{k}\) represents the proportion of days of the kth congestion pattern, with \(\lambda_{pq}^{1} + \lambda_{pq}^{2} + \cdots + \lambda_{pq}^{K} = 1\). In Table 8, the last row gives the average congestion degree over the six districts for each restriction scenario. Restriction scenario (4, 9) results in the highest congestion degree, while scenarios (1, 6) and (3, 8) lead to the lowest. This is attributed to number preferences in China: 4 is regarded as a taboo number, so comparatively few license numbers end in 4 and fewer vehicles are kept off the road under scenario (4, 9). Conversely, 6 and 8 are regarded as lucky numbers, so a large number of vehicles have license numbers ending in 6 or 8 and more vehicles are kept off the road under scenarios (1, 6) and (3, 8).
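To make Eq. (16) concrete, the following is a minimal sketch of the congestion degree for one district and one restriction scenario. The function name `congestion_degree` and the numerical values in the usage note are illustrative assumptions and are not taken from Table 8.

```python
def congestion_degree(mean_index, day_share, tol=1e-9):
    """Weighted sum of Eq. (16) for one district q and one scenario p.

    mean_index[k] is the mean congestion index e^k of the kth pattern;
    day_share[k] is the proportion of days lambda^k in that pattern.
    """
    if len(mean_index) != len(day_share):
        raise ValueError("one mean index and one share are needed per pattern")
    if abs(sum(day_share) - 1.0) > tol:
        raise ValueError("the proportions of days must sum to 1")
    return sum(e * lam for e, lam in zip(mean_index, day_share))


# Hypothetical values for K = 3 patterns (e.g. H, MTW, TF); not from the paper.
example = congestion_degree([3.0, 4.0, 5.0], [0.2, 0.5, 0.3])
```

With these made-up inputs, the result is 0.2 × 3.0 + 0.5 × 4.0 + 0.3 × 5.0, that is, approximately 4.1. Shifting days from a low-index pattern to a high-index pattern raises the congestion degree, which is how the measure reflects the redistribution of days reported in Table 7.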

Table 8 The congestion degree at all six districts

A paired t-test is used to evaluate the differences in congestion degree across the six districts under different restriction scenarios. The test statistic for the paired t-test between scenario \(p\) and scenario \(p^{\prime}\) is calculated as

$$t_{{pp^{\prime}}} = \frac{{\sum\nolimits_{q = 1}^{6} {\left| {w_{pq} - w_{{p^{\prime}q}} } \right|} }}{{\sqrt {1.2 \times \left( {\sum\nolimits_{q = 1}^{6} {\left| {w_{pq} - w_{{p^{\prime}q}} } \right|}^{2} } \right) - 0.2 \times \left( {\sum\nolimits_{q = 1}^{6} {\left| {w_{pq} - w_{{p^{\prime}q}} } \right|} } \right)^{2} } }},$$
(17)

where \(w_{pq}\) denotes the congestion degree at district q with restriction scenario p, and \(w_{p^{\prime}q}\) denotes the congestion degree at district q with restriction scenario \(p^{\prime}\).
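The statistic in Eq. (17) can be sketched as follows. The constants 1.2 and 0.2 equal n/(n − 1) and 1/(n − 1) for n = 6 districts, so the expression is the textbook paired t statistic (mean difference divided by its standard error) applied to the differences. Note that Eq. (17) is written with absolute differences, whereas the textbook form uses signed differences; the two coincide when every difference is non-negative. The function names and the sample values are illustrative assumptions.

```python
import math


def paired_t_eq17(w_p, w_p_prime):
    """Statistic of Eq. (17), as printed, for n = 6 districts."""
    if len(w_p) != 6 or len(w_p_prime) != 6:
        raise ValueError("Eq. (17) is written for exactly six districts")
    d = [abs(a - b) for a, b in zip(w_p, w_p_prime)]
    s1 = sum(d)                   # sum of absolute differences
    s2 = sum(x * x for x in d)    # sum of squared differences
    radicand = 1.2 * s2 - 0.2 * s1 ** 2
    if radicand <= 0:
        raise ValueError("differences are constant; the statistic is undefined")
    return s1 / math.sqrt(radicand)


def paired_t_textbook(x, y):
    """Textbook paired t statistic with signed differences, for comparison."""
    n = len(x)
    d = [a - b for a, b in zip(x, y)]
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)


# Hypothetical congestion degrees for six districts; not from Table 8.
w_a = [5.0, 5.5, 6.0, 4.8, 5.2, 6.1]
w_b = [4.5, 5.2, 5.1, 4.6, 4.9, 5.0]
t_value = paired_t_eq17(w_a, w_b)
```

With six districts the statistic has 5 degrees of freedom, for which the two-tailed critical value at the 0.05 level is approximately 2.571; the two-tailed P value itself would normally be obtained from a statistics package.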

The results of the paired t-test are given in Table 9. At a significance level of 0.05, if the two-tailed P value is less than 0.05, it can be concluded that the values of congestion degree are significantly different. Table 9 shows that restriction scenario (4, 9) is significantly different from all four other scenarios; there is a significant difference between scenario (1, 6) and scenario (2, 7), and between scenario (3, 8) and scenario (2, 7); there are no significant differences among the other scenarios. Therefore, the ALPR policy has an important influence on traffic congestion patterns.

Table 9 Paired t-test results among different restriction scenarios

6.3 Spatial dependence of traffic congestion patterns

As shown in Fig. 3, traffic congestion patterns are spatially dependent, that is, different districts have different numbers of traffic congestion patterns. A traffic congestion index greater than 4.0 means that the traffic situation is congested (see Fig. 2). In Table 10, the minimum, maximum, mean, and variance of the congestion index, as well as the duration of congestion, are calculated for every congestion pattern in every district. It is found that the maximum value, mean value, variance, and congestion duration increase gradually when the congestion pattern changes from H and MTW to TF, while the minimum value remains almost unchanged. Most importantly, the maximum value, mean value, variance, and congestion duration differ significantly across districts, which again illustrates that congestion patterns are spatially dependent.
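The indicators described above can be computed from a daily congestion index curve as in the following minimal sketch. Several details are assumptions rather than the procedure used for Table 10: the function name `pattern_indicators`, the use of the population variance, the sampling interval passed as `interval_minutes`, and the definition of congestion duration as the number of samples above 4.0 multiplied by that interval.

```python
from statistics import mean, pvariance

CONGESTION_THRESHOLD = 4.0  # an index above this value means "congested"


def pattern_indicators(curve, interval_minutes):
    """Summarise one congestion index curve (one value per sampling interval)."""
    if not curve:
        raise ValueError("the curve must contain at least one value")
    congested_samples = sum(1 for v in curve if v > CONGESTION_THRESHOLD)
    return {
        "min": min(curve),
        "max": max(curve),
        "mean": mean(curve),
        "variance": pvariance(curve),  # assumption: population variance
        "congestion_minutes": congested_samples * interval_minutes,
    }


# Hypothetical four-sample curve at 15-minute resolution; not real data.
indicators = pattern_indicators([2.0, 4.5, 6.0, 3.0], interval_minutes=15)
```

In this made-up example two samples exceed 4.0, so the congestion duration is 30 minutes. A value exactly equal to 4.0 is not counted as congested, following the "more than 4.0" criterion in the text.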

Table 10 The indicator values of different traffic congestion patterns at different districts

Note that the shapes of the H/MTW/TF patterns also differ significantly across districts, as shown in Fig. 3. As the district gets closer to the downtown, the valley between the morning peak and the evening peak becomes sharper, the peak value becomes greater, and the congestion lasts longer, which can also be observed in Table 10. Interestingly, although Haidian and Chaoyang are at a similar distance from the downtown, their traffic congestion patterns are extremely different, which is reflected in both the number of congestion patterns and the specific indicator values. This phenomenon can be explained by the functional differences between these two districts in Beijing: Chaoyang is an important center of business and foreign affairs with relatively active traffic, while Haidian is an education center with relatively light traffic congestion.

7 Conclusions

Alleviating traffic congestion has always been an important challenge for the sustainable development of megacities. An accurate understanding of traffic congestion patterns and their characteristics helps to formulate scientific congestion prevention measures. In this paper, a traffic congestion pattern analysis framework was constructed based on congestion index data. First, an improved weighted K-means clustering method was proposed to identify the traffic congestion patterns. Second, based on the identified traffic congestion patterns, the spatial–temporal variations of traffic congestion patterns were analyzed. Case studies with real-life data illustrated that the traffic congestion patterns are both spatially and temporally dependent, and that the automobile license plate restriction has an important influence on the traffic congestion patterns.

On the basis of the results of this study, several issues deserve future study. First, the traffic congestion index data used in this study are collected according to the administrative divisions of Beijing; a finer division, such as congestion pattern analysis for individual roads or blocks, should be carried out to obtain more valuable information. Second, in analyzing the factors that influence congestion patterns, more issues should be considered, such as weather conditions, geographical conditions, and emergency events. Third, the congestion pattern analysis in this paper is based on congestion index data from Beijing; the congestion patterns and characteristics of other cities should be analyzed and compared with those of Beijing.