1 Introduction

Accident statistics indicate that on average, 50 lives are lost daily in road crashes in Iran [1]. It is evident that this problem will be exacerbated due to the increasing number of vehicles and consequently increased traffic on the roads. Therefore, there is a need for solutions to develop intelligent transportation systems aimed at enhancing safety to the maximum extent possible. The European Union has set standards with the vision of achieving smart cities, particularly in the field of intelligent transportation, and countries are expected to take steps towards achieving these standards [2].

One of the current issues related to roads is the continuous monitoring of road conditions to ensure the preservation and maintenance of road quality during their service life [3]. Additionally, the installation of signs indicating the status of the road and potential black spots, especially in newly opened roads, has always been a major concern for organizations responsible for managing road conditions and preventing road accidents [4]. The conventional solution for these two problems often involves the use of geospatial information systems based on the network coordinates of the country’s roads, relying on annual traffic accident data that have been published over the past years. “Accident black spots,” also known as black spots or accident-prone areas, refer to specific locations on roads, highways, or intersections where a high number of traffic accidents or collisions occur. These areas have a history of a disproportionately high frequency of accidents, injuries, or fatalities compared to other parts of the road network. The existing methods for identifying black spots are incapable of predicting these areas before accidents occur. This problem becomes more prominent when constructing new roads as there would be no existing data on accidents and other related information [5, 6].

On the other hand, mobile phones are recognized as the most penetrating medium within societies, to the extent that in some cases, their penetration rate even surpasses certain vital health indicators within a community [7]. Therefore, a claim that smartphones are likely to be more prevalent in vehicles in the near future than any other in-car devices is not far-fetched. This notion reinforces the motivation for proposing a system aimed at reducing costs and maximizing utilization of existing facilities.

1.1 Literature Review

Modern smartphones possess various capabilities that have proven useful in the transportation industry. In a study, a safety enhancement system based on smartphone accelerometer and GPS was proposed [8]. In this system, if the deceleration or braking intensity exceeds a certain threshold value, it detects a collision. In another study [9], both the smartphone sensor and an external hardware module were utilized to report traffic collisions to an emergency and rescue center, specifically checking the status of the vehicle’s airbags. In a study [10], two performance indicator measures were used to determine accident black spots on the road, namely the time to collision and the minimum deceleration required to prevent an accident. In order to estimate these two measures related to two consecutive vehicles, it is assumed that these vehicles are moving in the same direction on the same lane. The values of these measures are a function of time, position and instantaneous speed of vehicles, and are calculated between two consecutive vehicles as they drive through a road section. This study demonstrated that the probability of a collision on a specific road segment is higher than others due to the presence of a gas station and more traffic issues in that segment.
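The two surrogate safety measures described for [10] can be illustrated with a minimal sketch. It assumes the commonly used textbook definitions for two vehicles in the same lane (gap divided by closing speed, and squared closing speed divided by twice the gap); the exact formulation used in [10] is not given here, and the function and parameter names are hypothetical.

```python
from typing import Optional


def time_to_collision(gap_m: float, v_follow: float, v_lead: float) -> Optional[float]:
    """Seconds until the follower reaches the leader if both keep their speed.

    Speeds are in m/s. Returns None when the follower is not closing in.
    """
    if gap_m <= 0:
        raise ValueError("gap must be positive")
    closing_speed = v_follow - v_lead
    if closing_speed <= 0:
        return None
    return gap_m / closing_speed


def required_deceleration(gap_m: float, v_follow: float, v_lead: float) -> float:
    """Minimum constant deceleration (m/s^2) the follower needs to avoid a crash.

    Assumes the leader keeps a constant speed (an illustrative simplification).
    """
    if gap_m <= 0:
        raise ValueError("gap must be positive")
    closing_speed = v_follow - v_lead
    if closing_speed <= 0:
        return 0.0
    return closing_speed ** 2 / (2.0 * gap_m)
```

For example, a follower at 25 m/s that is 30 m behind a leader at 20 m/s would have a time to collision of 6 s under these assumptions.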

In [11] and [12], the authors claim that sudden braking and low speed can serve as indicators of heavy traffic on a route. When such situations are detected, they are reported to a central traffic control server, which can then issue warnings to the vehicles on that route via appropriate media.

In [13], road surface irregularities were identified from sudden changes in the vehicle's vertical acceleration, using a GPS receiver and the accelerometer of a mobile phone placed inside the car. The proposed algorithm depended on a fixed acceleration threshold. However, since sensor sensitivity and measurement accuracy vary among smartphones, a fixed threshold may not be reliable. Therefore, in [11], a scalable method is proposed in which the threshold value is obtained through a machine learning process. This method is expected to produce less error than the fixed-threshold solution because the learned threshold lies closer to the actual value. This concept was implemented to determine road surface irregularities in the city of Boston. In this system, vehicles are equipped with hardware consisting of a 380 Hz accelerometer and a 1 Hz GPS receiver. The data from these sensors are processed by a clustering algorithm, and the output is a sequence of irregularities that are mapped onto a graphical map [14, 15]. The proposed system is in fact a traffic information system that monitors the dynamic behavior of vehicles during movement on a specific route and stores the relevant information for further processing. The collected raw data can be classified and categorized in order to analyze and determine problematic spots.

Overall, the literature review underscores the pivotal role of cutting-edge technology and sophisticated data analytics in revolutionizing the domain of traffic management and road infrastructure. By leveraging the power of advanced sensors, such as GPS receivers and accelerometers, in vehicles, the system enables real-time monitoring of traffic conditions. Key indicators like sudden braking and variations in speed are meticulously analyzed to promptly identify congestion and heavy traffic, allowing for timely warnings to be disseminated to drivers.

2 Methods

In this study, a traffic information system relying on smartphone sensors has been developed to identify accident-black spots on roads. The assumption is that recurrent sudden braking by vehicles at specific points, together with road surface irregularities, can be a potential indicator of accident-black spots along a route. Therefore, the main challenge is to find these points and analyze the behavior of drivers at these locations. This system can also be considered a form of Mobile Crowd Sensing problem, where the mobile nodes are the smartphones responsible for collecting information related to vehicle movements [16]. The architecture and components of this system are illustrated in Fig. 1. The implementation of the system comprises the following basic steps: (a) establishing the data collection infrastructure, (b) creating an initial database, (c) processing the collected data, and (d) displaying the results on a map, as shown in Fig. 2. These steps are explained in detail below.

Fig. 1 Overall architecture of the proposed traffic information system

In this system, it is necessary to gather information that serves as an output from the sensors of smartphones during vehicle movement. Therefore, it was decided to design a mobile application with the purpose of collecting data, which can be installed on smartphones of road users while on the move.

Fig. 2 Different steps of project implementation

Considering the high utilization of the Android operating system in smartphones [7], an application was designed and developed for this operating system. Instantaneous acceleration values, changes in acceleration, speed, geographic coordinates of the vehicle, and clock time were chosen as the initial data to be recorded. The application records the instantaneous three-dimensional acceleration values from the output of the smartphone's accelerometer, and it obtains real-time speed and geographic coordinates from the phone's GPS receiver. Furthermore, an algorithm was developed to calculate and record the change in the vehicle's acceleration at each time step.

The software is event-driven: it is triggered by changes in instantaneous acceleration. At each time step, it records the instantaneous acceleration value together with the speed, time of occurrence, and geographic coordinates. It also compares the current acceleration value with the previous one. If the difference exceeds a threshold value, the acceleration change is recorded; otherwise, it is treated as noise and the stored value remains unchanged. The acceleration values and their changes are recorded as three-dimensional vectors along the X, Y, and Z axes (Fig. 3).
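The recording logic described above can be sketched as follows. This is an illustrative reading rather than the application's actual code: the `Sample` fields, the `record_events` name, the per-axis comparison, and the threshold value of 1.5 m/s² are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Sample:
    """One reading per time step (hypothetical field names)."""
    t: float        # seconds since start of trip
    ax: float       # acceleration along X, m/s^2
    ay: float       # acceleration along Y, m/s^2
    az: float       # acceleration along Z, m/s^2
    speed: float    # km/h, from GPS
    lat: float
    lon: float


Delta = Tuple[float, float, float]


def record_events(samples: Iterable[Sample], threshold: float = 1.5) -> List[Tuple[Sample, Delta]]:
    """Keep a sample when any axis changed by more than `threshold` since the previous step.

    Smaller changes are treated as noise and are not recorded.
    """
    events: List[Tuple[Sample, Delta]] = []
    previous = None
    for current in samples:
        if previous is not None:
            delta = (current.ax - previous.ax,
                     current.ay - previous.ay,
                     current.az - previous.az)
            if max(abs(d) for d in delta) > threshold:
                events.append((current, delta))
        previous = current
    return events
```

Comparing each axis separately, rather than the vector magnitude, is a design choice of this sketch; it keeps the direction of the change available for later classification.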

Fig. 3 User interface of the data recording application software

Once the requirements for data collection were in place, the next step was the data collection itself. In order to evaluate the performance of the assembled data collection system, a 42 km route on the Mashhad-Chenaran Highway (Azadi Highway) was monitored in both directions (Fig. 4). The average daily traffic on this route is about 100,000 vehicles. The posted speed limit is 90 km/h, and five years of crash data (2016 to 2021), including both injury and fatal crashes, were available for this site. Because each method for detecting accident-black segments has its own benefits and drawbacks, several techniques were used. The segments were identified using the crash frequency (CF), crash rate (CR), and kernel density approaches. Given that the crash locations were recorded with a spatial precision of 1 km, both the bandwidth for the kernel density approach and the length of each road segment were set to 1 km [17]. During the last 5 years, 23 segments along this route recorded non-zero crashes, of which 5 segments were identified as accident-black segments; these accounted for 49% of all injury crashes and 57% of all fatal crashes. It should be noted that crashes at these five black spots led to 21 deaths and 237 injuries over the past five years. The location analysis was performed in the presence of highway police officers to confirm the suitability of the selected accident-black segments. In addition to the determination of accident-black segments, speed reduction points and points with road surface irregularities were also specified on the selected route segments.
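As a rough illustration of two of the screening measures named above, the sketch below computes a crash rate per million vehicle-kilometres and a one-dimensional kernel density along the route. The normalisation of the crash rate and the Gaussian kernel are assumptions made for this example; the study does not state which formulation or kernel shape was used, and the function names are hypothetical.

```python
import math
from typing import Sequence


def crash_rate(crashes: int, daily_traffic: float, years: float, length_km: float) -> float:
    """Crashes per million vehicle-kilometres of exposure on one segment."""
    exposure_veh_km = daily_traffic * 365.0 * years * length_km
    return crashes * 1e6 / exposure_veh_km


def kernel_density(x_km: float, crash_positions_km: Sequence[float],
                   bandwidth_km: float = 1.0) -> float:
    """Gaussian kernel density of crashes at chainage x_km (crashes per km).

    The 1 km default bandwidth follows the text; the Gaussian kernel is an assumption.
    """
    total = 0.0
    for position in crash_positions_km:
        u = (x_km - position) / bandwidth_km
        total += math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return total / bandwidth_km
```

With the traffic volume reported for this route (about 100,000 vehicles per day) and a five-year period, a 1 km segment accumulates 182.5 million vehicle-kilometres of exposure, so a hypothetical segment with 10 crashes would have a rate of about 0.055 crashes per million vehicle-kilometres.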

Fig. 4 Mashhad-Chenaran Highway (Azadi Highway)

The software was validated using data recorded from the smartphones of 20 drivers who used this route. The software recorded data at a rate of one record per second. In total, about 50,000 raw records were collected. All drivers were licensed, and none of them had traveled on this route in the preceding six months.

The recorded data were stored directly on the smartphones used for data collection, in a plain-text format of the kind opened by Notepad. This is a typical approach when a mobile app collects and stores data locally on the device. After data collection was completed, all records were imported into Excel for evaluation and analysis.

3 Results

After extracting the recorded data from the application software, a database comprising a large number of raw records was obtained. The recorded data then had to be classified to reflect the road conditions that may have contributed to crashes. Based on the preliminary studies, it was decided to classify the data collected in this phase of research into three categories. The classification and characteristics of road condition in each category are presented in Table 1. This led to a sample of 700 speed/acceleration driving data sets and their associated events, categorized in accordance with Table 1.

Table 1 Event classification based on project objectives

It was now possible to plot the data stored in the databases for analysis purposes. For example, Fig. 5 illustrates a one-minute trip by a driver along a section of this route. In this trip, the smartphone is placed in a fixed position, with the Y-axis as the direction of movement and the Z-axis as the vertical axis. The Z-axis represents the vertical changes of the vehicle. To interpret this axis accurately, consideration must also be given to the vehicle's speed. When the vehicle's speed is very low, changes in the Z-axis indicate speed reduction. Conversely, intermittent changes in the Z-axis at high speeds may indicate road surface irregularities. In Fig. 5, it can be observed that at 19:14:16 clock time, there are significant fluctuations in the acceleration values along the Z-axis, while the speed remains relatively constant. Based on the previous descriptions, this event can be assigned to the category of road surface irregularities, which falls under Class C. However, at approximately 19:14:42, there are significant fluctuations in acceleration along the direction of movement together with a decrease in speed. This event indicates sudden braking and is labeled as Class A. Furthermore, at 19:14:56, there are fluctuations in acceleration along the direction of movement, a decrease in speed, and fluctuations along the Z-axis. This event could be attributed to the presence of a speed hump along this route section and is labeled as Class B. These presumptions were then confirmed with field observations. In order to identify further accident-black spots that would fit any of these categories, an event registration form was designed and completed through site visits. On this basis, a sample of approximately 700 data recordings associated with these event categories was identified and labeled for further analysis.
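The visual interpretation of Fig. 5 can be summarised as a simple rule set. The sketch below is only an illustration of that reasoning: in the study the labels were assigned by inspection and confirmed in the field, and the function name, the input features, and both threshold values are hypothetical.

```python
from typing import Optional


def label_event(speed_drop_kmh: float,
                longitudinal_range: float,
                vertical_range: float,
                drop_thresh: float = 10.0,
                accel_thresh: float = 2.0) -> Optional[str]:
    """Assign an event class from summary features of a short time window.

    speed_drop_kmh     : decrease in speed over the window
    longitudinal_range : fluctuation of Y-axis acceleration (direction of movement), m/s^2
    vertical_range     : fluctuation of Z-axis acceleration (vertical), m/s^2
    Returns "A" (sudden braking), "B" (speed hump), "C" (surface irregularity) or None.
    """
    braking = speed_drop_kmh > drop_thresh and longitudinal_range > accel_thresh
    bumpy = vertical_range > accel_thresh
    if braking and bumpy:
        return "B"   # braking plus vertical fluctuation, as at 19:14:56
    if braking:
        return "A"   # braking without vertical fluctuation, as at 19:14:42
    if bumpy and speed_drop_kmh <= drop_thresh:
        return "C"   # vertical fluctuation at steady speed, as at 19:14:16
    return None
```

Windows that match none of the three patterns are left unlabeled rather than forced into a class.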

Fig. 5 Output of a one-minute trip

To evaluate the labeled dataset, two types of neural network models, namely the Multilayer Perceptron (MLP) and Radial Basis Function (RBF) models, were utilized. These neural networks were trained and modeled using the sample dataset in the MATLAB environment [18]. The MLP neural network is a type of feedforward network that consists of an input layer, hidden layers, and an output layer. It receives 7 parameters as model inputs: the three instantaneous acceleration components, the three components of the change in acceleration, and the speed. It has a single output that determines the type of event based on Table 1. The RBF neural network is an exact-design network, in which the number of neurons is equal to the number of vectors in the input matrix. It also has 7 inputs and a single output.
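The idea of an RBF network with one neuron per input vector can be sketched in plain Python. This is not the MATLAB routine used in the study: the Gaussian basis function, the absence of a bias term, the `spread` value, and all function names are assumptions chosen to keep the example short.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def gaussian(a: Vector, b: Vector, spread: float) -> float:
    """Gaussian radial basis response between two input vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * spread ** 2))


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular kernel matrix (duplicate inputs?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_exact_rbf(X: Sequence[Vector], y: Sequence[float], spread: float = 1.0) -> List[float]:
    """One hidden neuron per training vector; weights reproduce the targets exactly."""
    phi = [[gaussian(a, b, spread) for b in X] for a in X]
    return solve(phi, list(y))


def predict_rbf(X_train: Sequence[Vector], weights: Sequence[float],
                x: Vector, spread: float = 1.0) -> float:
    """Weighted sum of the hidden-neuron responses for a new input vector."""
    return sum(w * gaussian(x, c, spread) for w, c in zip(weights, X_train))
```

Because every training vector is its own centre, such a network fits the training targets exactly, which is why its quality has to be judged on records that were not used for fitting.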

The dataset was divided into training, testing, and validation datasets. From the total of 700 samples, 500 were randomly selected for model training. These samples were used to train both the Multilayer Perceptron (MLP) and Radial Basis Function (RBF) models. The remaining 200 samples were split evenly between the testing and validation datasets. The testing dataset was used for fine-tuning the models and selecting optimal hyperparameters. By monitoring the models' performance on the testing dataset, appropriate decisions on model architecture, regularization techniques, and learning rates were made. The testing dataset thus played a crucial role in preventing overfitting and ensuring that the selected models generalized well to unseen data. The validation dataset was used to assess the performance of the trained models on unseen data. Performance evaluation measures such as accuracy, precision, recall, and F1-score were calculated to measure the models' performance on the validation data.
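The 500/100/100 partition described above can be reproduced with a short sketch. The function name, the fixed seed, and the use of Python's standard random module are assumptions; the study performed this step in MATLAB.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(records: Sequence[T], n_train: int = 500, n_test: int = 100,
                  seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Randomly partition records into training, testing (tuning) and validation sets.

    Whatever remains after the first two sets becomes the validation set.
    """
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    train = [records[i] for i in indices[:n_train]]
    test = [records[i] for i in indices[n_train:n_train + n_test]]
    validation = [records[i] for i in indices[n_train + n_test:]]
    return train, test, validation
```

Note that the naming follows this paper, in which the testing set is used for tuning and the validation set for the final assessment.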

For the MLP model, the final architecture comprised an input layer with 7 neurons representing the input parameters, two hidden layers with 10 neurons each, and an output layer with 1 neuron representing the output category. Similarly, the RBF network consisted of an input layer with 7 neurons, a hidden layer comprising 700 neurons (equal to the number of categorized records in the dataset), and an output layer with 1 neuron representing the output category.
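To make the 7-10-10-1 MLP architecture concrete, the following sketch builds such a network with random weights and runs a forward pass. The tanh hidden activation, the linear output, and the weight initialisation are assumptions for illustration only; the trained MATLAB model is not reproduced here.

```python
import math
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def make_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create one fully connected layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Propagate one input vector through the network (tanh hidden, linear output)."""
    activation = list(x)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        activation = z if i == last else [math.tanh(v) for v in z]
    return activation


rng = random.Random(0)
sizes = [7, 10, 10, 1]  # inputs, two hidden layers, one output
layers = [make_layer(sizes[i], sizes[i + 1], rng) for i in range(len(sizes) - 1)]

# Trainable parameters: weights plus biases in every layer.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

Under these assumptions the network has 201 trainable parameters (80 + 110 + 11), which is small relative to the 500 training samples.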

The selection of transfer functions (activation functions) for each model was determined based on the specific requirements and characteristics of the problem. Depending on the desired behavior of the network and the nature of the output, commonly used activation functions such as sigmoid, tanh, ReLU, or softmax were employed.

Figure 6 illustrates the output prediction error for the validation data records. The mean square error for the RBF and MLP neural networks was 0.0066 and 0.1399, respectively, indicating an acceptable prediction accuracy for both models, although the RBF model demonstrated better overall performance. The Root Mean Square Error (RMSE) for the RBF and MLP neural networks was 0.081 and 0.373, respectively.
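The relation between the two reported error measures can be checked directly, since the RMSE is the square root of the MSE. The short sketch below uses only the MSE values quoted above; the helper function and variable names are hypothetical.

```python
import math
from typing import Sequence


def mean_square_error(targets: Sequence[float], predictions: Sequence[float]) -> float:
    """Average of squared differences between target and predicted class values."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# MSE values as reported for the validation records.
reported_mse = {"RBF": 0.0066, "MLP": 0.1399}

# RMSE follows from the MSE by taking the square root.
rmse = {name: math.sqrt(value) for name, value in reported_mse.items()}
```

The square roots come to roughly 0.081 and 0.374, which agrees with the reported RMSE values up to rounding.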

Fig. 6 Prediction error plot for RBF and MLP neural network models

In order to show the location of the various events predicted by the developed models, it is desirable to display them on a geographical map. For this purpose, a software tool was developed that uses online Google Maps. It receives the output points from the previous stages as input and maps them onto the online maps, using the geographic coordinates of each event point previously recorded by the smartphone's GPS receiver. The display of information and reports in this software can be customized according to the user's needs. For instance, the location of a bumpy point on one of the highways leading to the city of Mashhad is shown in Fig. 7.
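One map-independent way to hand classified event points to a web map is to export them as GeoJSON. The sketch below is an assumption about how such an export could look, not a description of the software developed in the study; the record keys and the sample coordinates are placeholders.

```python
import json
from typing import Any, Dict, Iterable, List


def events_to_geojson(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Convert classified event records into a GeoJSON FeatureCollection.

    Each record is expected to carry 'lat', 'lon', 'event_class' and 'time'
    (hypothetical key names). GeoJSON stores coordinates as [longitude, latitude].
    """
    features: List[Dict[str, Any]] = []
    for event in events:
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point",
                         "coordinates": [event["lon"], event["lat"]]},
            "properties": {"event_class": event["event_class"],
                           "time": event["time"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder record; the coordinates are illustrative, not measured values.
sample_events = [{"lat": 36.40, "lon": 59.40, "event_class": "C", "time": "19:14:16"}]
geojson_text = json.dumps(events_to_geojson(sample_events), indent=2)
```

The resulting text can be loaded by most web mapping libraries as a point layer, with the event class available for styling or filtering.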

Fig. 7 A graphic display of an event point on the Google map

4 Discussion

The proposed solution in this study represents a significant advancement in leveraging modern technology, particularly smartphone sensors, and integrating them into a centralized traffic information system. The aim of this integration is to enhance road safety by actively monitoring the dynamic behavior of vehicles and collecting trip-related data for further in-depth analysis. By employing a classification approach based on key metrics such as instantaneous three-dimensional accelerations, changes in accelerations, and speed of vehicles during their journey on the road, the system identifies accident-prone locations or black spots. This approach is in line with the conclusions drawn in a study by [19], which explored machine learning-based methods for identifying accident-black spots using smartphone data. Their study highlighted the integration of smartphone sensor data with other pertinent sources like traffic data and weather conditions to effectively pinpoint these vulnerable areas. This aligns with the methodology used in this study, where machine learning algorithms, including neural networks, were applied to analyze smartphone-derived data.

The use of two distinct neural network models, namely the Multilayer Perceptron (MLP) and Radial Basis Function (RBF), for analyzing the classified dataset proved effective in achieving acceptable prediction accuracy. Notably, the RBF network outperformed the MLP network in terms of Mean Square Error (MSE), implying a higher level of precision in its predictions. This finding is consistent with a study by Liu et al. (2020) [20], in which various neural network models were investigated for road accident prediction. The outcome reinforces the potential of RBF networks for applications such as accident prediction and identifying accident-black spots in traffic safety analysis.

The holistic approach of integrating smartphone sensor data and employing advanced neural network models can significantly contribute to enhanced road safety. By effectively identifying accident-prone areas and predicting road incidents, this approach offers valuable insights to traffic management authorities and helps prioritize safety measures. The use of machine learning algorithms, particularly neural networks, demonstrates their versatility and efficacy in processing and analyzing complex datasets derived from smartphone sensors. These advancements pave the way for further research and practical implementation in the field of traffic safety, holding the promise of reducing accidents and ultimately saving lives on the road.

5 Conclusion

In this study, a traffic information system is proposed and developed to identify black-spot locations on roads. In this system, the speed and acceleration values of vehicles collected by smartphone sensors are used for analyzing the dynamic behavior of vehicles on the road and identifying accident-black spots. The system not only reduces costs but also provides reliable reports on road safety conditions as an output. To evaluate the system, a highway in Iran was monitored, and a total of 7 parameters were examined as input. The classified dataset was modeled using two types of neural network models, namely the Multilayer Perceptron (MLP) and the Radial Basis Function (RBF) network. The mean square error for the RBF and MLP neural networks was 0.0066 and 0.1399, respectively, indicating a respectable level of prediction accuracy for both models, although the RBF model performed better overall.

In conclusion, this study contributes to the growing body of literature emphasizing the potential of smartphone-based approaches and machine learning techniques in improving road safety. The low prediction error rates and acceptable prediction accuracy achieved in this study support the feasibility and potential of the proposed solution.

While this study provides valuable insights into the feasibility and effectiveness of smartphone-based solutions for road safety analysis, it is important to acknowledge certain limitations. Firstly, the evaluation was conducted on a specific route on a highway in Iran, which may restrict the generalizability of the findings to other regions and road types. To enhance the applicability of the proposed solution, future research should encompass diverse settings and road conditions. Furthermore, the analysis focused on a selected set of parameters derived from smartphone sensor data. Incorporating additional parameters or alternative data sources could enhance the comprehensiveness of the analysis.

The proposed system is a centralized and non-participatory traffic information system. However, since smartphones support the IEEE 802.11 wireless networking standard, it will be possible in the future to use this capability to create a local network among smartphones inside vehicles. In this case, the developed application will be able to broadcast messages over the network under certain conditions.