1 Introduction

Globally, the rate of urbanization is increasing at an alarming pace. High urbanization combined with an intense desire for car ownership causes rapid growth in motorization. The main purpose of transportation is to provide efficient movement from one place to another, and many kinds of vehicles, such as cars, motorcycles, and buses, serve this purpose. Such movement faces many challenges, caused mainly by rapid urbanization and rising car ownership, which in turn affect both traffic flow and the environment.

Before cars were invented, road traffic injuries already occurred, involving people, animals, and carts. The numbers grew exponentially as cars, buses, and other motor vehicles were invented and became more common. The first recorded injury case involved a cyclist in New York City on 30 May 1896, and the first recorded motor vehicle death was that of a pedestrian in London on 17 August of the same year (World’s first road death 2003).

The cumulative total of road traffic deaths had reached an estimated 25 million by 1997. In 2002, an estimated 1.18 million people died in road traffic accidents, an average of 3242 deaths per day. Road traffic accidents accounted for 2.1% of all deaths, making them the 11th leading cause of death globally.

Road traffic injuries place a heavy burden not only on global and national economies but also on household finances. Many families are driven deeper into poverty by the loss of breadwinners and the added burden of caring for members disabled by road traffic injuries.

The Global Status Report on Road Safety, launched by the World Health Organization (WHO) in December 2018, highlights that the number of road traffic deaths has reached 1.35 million annually (World Health Organization 2018). Road traffic crashes cost most countries 3% of their gross domestic product. According to a WHO report from 2012, an estimated 30–50 million people worldwide are exposed to physical disability each year through traffic accidents. The same report states that traffic accidents can cause more than 600 billion dollars in property damage annually.

Road traffic injuries are currently estimated to be the 9th leading cause of death across all age groups globally and are predicted to become the 7th leading cause of death by 2030. If current trends continue, about 2 million people could be killed in vehicle accidents each year. Road traffic deaths occur disproportionately among young males under the age of 25, who are almost three times as likely to be killed as young females (Global Health Estimates 2014).

In middle- and low-income countries, road traffic injuries particularly affect the active age group, that is, those who contribute to family, society, and the workforce in general. Many families are pushed into poverty by the expense of prolonged medical care or by the added burden of caring for a family member who is disabled by a road traffic accident. Various documents confirm that traffic accidents ranked 7th among the causes of death in low-income countries in 2019 ().

Every day, about 3700 people die globally in vehicle accidents. More than 50% of those killed are pedestrians and motorcyclists. Accidents are estimated to be the 8th leading cause of death globally for all age groups and the leading cause of death for children and young people aged 5–29 years (World Health Organization 2018). More people now die in traffic accidents than from HIV/AIDS. It is estimated that accidents will cost the world economy approximately 1.8 trillion dollars from 2015 to 2030, equivalent to a yearly tax of 0.12% on world GDP (Chen et al. 2019).

The WHO Global Burden of Disease study makes the following predictions for road traffic injuries between 1990 and 2020 (The Global Burden of Disease 1990):

  • Road traffic injuries will rise in rank to 6th place as a major cause of death worldwide.

  • Road traffic deaths will increase worldwide, from 0.99 million to 2.34 million (representing 3.4% of all deaths).

  • Road traffic deaths will increase on average by over 80% in low-income and middle-income countries and decline by almost 30% in high-income countries.

  • Road traffic injuries will become the second leading cause of disability-adjusted life years (DALYs) lost for low-income and middle-income countries

  • In high-income countries, the annual number of road traffic deaths is projected to decrease by 27% from 2000 to 2020

More than 90% of road traffic deaths occur in middle- and low-income countries, even though these countries have 60% of the world's vehicles. In developing countries, the situation is made worse by unplanned and rapid urbanization. Inadequate urban infrastructure, together with the lack of a legal framework, drives the exponential rise in the number of road accidents. The African continent has the highest road death rate of all the world's regions: 26.6 per 100,000 population, compared with the global rate of 17.5 per 100,000 population. Although the continent owns only 2% of the world's vehicles, it contributes 16% of worldwide deaths. The region is expected to continue to have the highest road traffic death rates because of rapid urbanization and motorization combined with lagging road infrastructure development and poor road and vehicle safety (World Report on Road Traffic Injury Prevention 2015; AFRO factsheet 2013).

The rates of death and serious injury are many times higher in car-to-truck collisions than in car-to-car collisions (Mackay and Wodzin 2002). Other problems are the failure of passengers to use protective equipment, the lack of features that stop passengers from being ejected from the car, and the lack of other safety features, such as high-mounted rear stop lamps. Pedestrians involved in accidents sustain multiple injuries resulting in death or disability more often than car occupants do.

In Ethiopia, the number of deaths due to traffic accidents is reported to be among the highest in the world. According to the WHO, the road accident fatality rate in Ethiopia in 2013 was 4984.3 deaths per 100,000 vehicles annually, compared with 574 across sub-Saharan African countries. Ethiopia ranks 24th in the world. The number of people injured or killed in one crash is about 30 times higher than in crashes in the United States (Mackay and Wodzin 2002). According to the WHO statistics report of 2015, cause-specific mortality due to accidents in Ethiopia is 94 per 100,000 population, which is more than 8% of all deaths. Ethiopia is among the countries experiencing the most road traffic accidents in the world, with a 79% fatality rate for injuries related to traffic accidents (Global Status Report on Road Safety 2015).

The Sustainable Development Goals (SDGs) set a target of reducing road traffic deaths by 50% by 2020 (Loewe and Rippin 2015). The Ethiopian Health Sector Development Plan gives attention to accidents and aims to reduce the fatalities related to them, which matters in a country like Ethiopia, where the majority of the deaths occur among the economically active population (MoH 2015).

Estimates of the magnitude of road traffic incidents and traffic injuries in low-income countries are primarily obtained from police records and sometimes from hospital registry data. In Ethiopia, traffic police reports are the source of data for road-related incidents. Although deaths related to road traffic accidents are one of the major health challenges in Ethiopia, they are highly underreported (Abegaz et al. 2014).

According to a review undertaken by the Economic Commission for Africa and the Secretariat of the UN Secretary-General's Special Envoy for Road Safety, Mr. Jean Todt, in collaboration with the Ministry of Transport of Ethiopia, about 13 people on average die in road traffic accidents in Ethiopia each day. About 85% of all road traffic accidents occur in Oromia, Amhara, and SNNP regional states as well as Addis Ababa city administration (Economic Commission for Africa (ECA) xxxx).

According to WHO data published in 2017, road traffic accident deaths in Ethiopia reached 27,140 or 4.27% of total deaths. The age-adjusted death rate is 36.36 per 100,000 population ranking Ethiopia 22nd in the world (Yohannis 2019).

Factors contributing to the high incidence of RTAs in Ethiopia include drivers' and pedestrians' lack of awareness, poor technical inspections of vehicles, older vehicles, driving license problems, weather conditions, failure to enforce traffic laws, and poor road quality (Mackay and Wodzin 2002).

Oromia is the largest and most populated region of Ethiopia and has the highest road traffic movement after Addis Ababa (Finfinnee), the capital city of both Ethiopia and Oromia Regional State. It accounts for 58% of all fatalities and two-thirds of all injuries. Many cross-country roads from other regions pass through Oromia to enter Addis Ababa, and the resulting congested traffic flow makes the region particularly prone to traffic accidents.

The Oromia region has 22 zonal administrations, one of which is the Oromia Special Zone. This special zone includes Woredas such as Mulo, Sululta, Sabeta Hawas, Barak, Chancho, Sendafa, Welmera, Akaki, Burayu, Holeta, and Sebeta Town. Because the zone surrounds the capital city, Addis Ababa, traffic is heavily congested and traffic accidents occur frequently in the area. The number of registered vehicles keeps increasing, and with this growth in road traffic the accident rate is increasing as well.

Many factors may cause accidents, including geographical factors such as road and weather conditions, human factors such as age, gender, and alcohol or drug use, and vehicle technical factors such as vehicle age and body type. Therefore, traffic accident prediction and forecasting are recommended to reduce road accidents in the future.

Traffic accident prediction and forecasting involve complex relationships between the variables or parameters of road traffic accidents. The advantage of artificial neural networks (ANNs) over other machine learning algorithms lies in their ability to solve complex and non-algorithmic problems. ANN learning is a good tool for solving prediction and classification problems, and it allows a large number of variables to be included. Neural networks can relate inputs to outputs and are also error-tolerant. Therefore, the primary interest of this study is to analyze current road traffic accidents, predict accident severity from the input parameters, and forecast accident numbers using different artificial neural network (ANN) models.

1.1 Statement of the problem

Economic development is strongly related to the availability of transportation. The increase in the number of vehicles has created a major social problem through traffic accidents. Traffic accidents have a major impact on a country's economic growth by destroying infrastructure and by injuring and killing young people of working age. However, the analysis of traffic accidents remains poor because accident monitoring and control are carried out manually. Factors contributing to the high incidence of RTAs in Ethiopia include rampant reckless driving, a poor road network, substandard road conditions, failure to enforce traffic laws, and poor vehicle condition (Road traffic accidents in Ethiopia 2008). Many factors can contribute to traffic accidents, but according to statistics from the Oromia Region Police Commission and the Oromia Special Zone, only certain factors appear on certain roads.

In general, the major factors of road traffic accident in Oromia Special Zone are:

  • Driver Faults

  • Poor road quality and poorly visible signals

  • Lack of lanes for pedestrians and bicycles

  • Lack of knowledge of road safety

  • Poor vehicle maintenance tradition

  • Animal-drawn carts and animals frequently using main highways

  • Lack of safety-conscious design and planning of road network

  • Disrespect for traffic rules and regulations

These problems motivated this research, which predicts the severity of traffic accidents and forecasts the number of accidents using machine learning and thereby contributes indirectly to reducing road traffic accidents. Neural networks are expected to be effective in correctly classifying and forecasting accident severity.

1.2 Objectives

1.2.1 General objective

The main objective of this study is to forecast road traffic accidents and predict the severity of accidents in Oromia Special Zone using ANN-based models.

1.2.2 Specific objectives

  • To collect data from the Oromia Police Office in Addis Ababa and preprocess it with feature extraction and feature selection using an embedded model.

  • To build artificial neural network (ANN) models such as BPNN, FFNN, MLPNN, RNN, RBFNN, and LSTM for accident severity prediction.

  • To compare the performance of the BPNN, FFNN, MLPNN, RNN, RBFNN, and LSTM models and identify the model best suited to accident severity prediction.

  • To construct the forecasting models using the LSTM model.

  • To develop an interactive Graphical User Interface prototype (GUI) for the accident severity prediction and an email notification system to notify the stakeholders about the future number of accidents.

2 Literature review

2.1 Introduction

This section covers the level of road traffic accidents in Ethiopia, common factors of road traffic accidents, machine learning applications, and a review of local and global research on road traffic accident prediction (Tables 1, 2, 3, 4).

Table 1 Total road accident in Oromia region from 2011 to 2012 EC
Table 2 Total road accident in Oromia Special Zone Woredas from 2011 to 2012 EC
Table 3 Dataset parameters/attributes and their descriptions
Table 4 Data resampling results

2.2 Traffic accident in Ethiopia

In Ethiopia, road traffic accidents are a common public health problem and one of the most common causes of death and injury. According to a World Health Organization (WHO) report, Ethiopia is considered one of the worst countries in the world in terms of the number of road users killed and injured in road traffic accidents. Every year nearly 2000 people die in road traffic accidents, of whom 48% are pedestrians, 45% passengers, and 7% drivers, and 400 to 500 million ETB are lost yearly as a result (Amdeslasie et al. 2016).

According to a report of the Ministry of Transport National Road Safety Council, pedestrians account for 51% and passengers for 46% of fatalities. Of the total traffic accidents, 93% of the cases were associated with human factors, 5% with vehicle factors, and 2% with road-related problems. Similarly, the Ethiopian Federal Police report of 2011 stated that the road accident level of the country, expressed per 10,000 vehicles, is one of the worst in the world. The average traffic accident record of the Federal Police Commission from 2003 to 2007 shows that, of the total fatal accidents, 76% were caused by driver error, 6% by vehicle technical problems, 5% by pedestrian error, 2% by road defects, and 12% by other factors. This indicates that driver error contributes to the majority of road accidents in Ethiopia. The report also shows that 82% of the accidents occurred on straight and level roads with a good surface, while 83% of all the accidents occurred on good asphalt roads (Minister of Transport 2011). According to the same report, the underlying reasons for road traffic accidents were:

  a. Improper behavior or low skill of drivers, which is associated with:

    • Drivers not respecting pedestrian priority

    • Overspeeding

    • Overloading of passengers

  b. Vehicle defects (technical problems)

  c. Animals and carts using the highways

  d. Pedestrians not taking proper precautions

  e. Poor traffic law enforcement

  f. Low safety consideration during road development

Ethiopia is one of the 50 countries with the deadliest roads in the world according to a WHO report in 2015. In developing countries, injured people occupy 30 to 70 percent of orthopedic beds in hospitals. Road traffic accident-related casualties are very high in Ethiopia, which creates an urgent need to bring road safety onto the country's public health agenda. Pedestrian fatalities in Ethiopia are common: about 74% of the fatal, serious, and slight injury accidents happened when pedestrians tried to cross roads (Abagaz 2018).

2.3 Paradigms of machine learning

2.3.1 Supervised learning

Supervised learning is machine learning in which the network is trained (learns a function from training data) by being provided with input and matching output patterns. These input–output pairs can be provided by an external teacher or by the system that contains the network (self-supervised). There are two kinds of supervised learning tasks: regression and classification. Classification predicts discrete responses: the algorithm labels each example by choosing among two or more classes. If the choice is between two classes, the task is called binary classification; if it is among more than two classes, it is called multi-class classification. Regression predicts continuous responses: the algorithm returns a numerical value. Some of the most common algorithms are support vector machine (SVM), decision tree (DT), and artificial neural network (ANN).
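To make the distinction between the two tasks concrete, the following minimal sketch fits a regression line by ordinary least squares and classifies a point with a one-nearest-neighbour rule. The function names and the toy data are assumptions made for illustration only; they are not taken from the study dataset, and neither method is one of the models evaluated in this paper.

```python
# Toy illustration of the two supervised learning tasks.
# All data below are invented for illustration only.

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nearest_neighbour(train, query):
    """Return the label of the training point closest to `query`."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(train, key=lambda pair: sq_dist(pair[0], query))
    return label

# Regression: the response is continuous (the data follow y = 2x + 1)
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Classification: the response is discrete (three classes, so multi-class)
train = [((0.0, 0.0), "slight"), ((5.0, 5.0), "serious"), ((10.0, 10.0), "fatal")]
predicted = nearest_neighbour(train, (9.0, 8.5))
```

Here `slope` and `intercept` are continuous outputs, whereas `predicted` is one label from a fixed set of classes.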

2.3.2 Unsupervised learning

Unsupervised learning is machine learning in which an output unit is trained to respond to clusters of patterns within the input. There are no predefined categories into which the patterns are to be classified; rather, the system develops its own representation of the input. It mostly discovers unknown patterns in the data, but these approximations are usually weak compared with supervised learning. There are two unsupervised learning tasks: association and clustering. In clustering, the input data are formed into clusters or groupings. Some of the most common algorithms are K-means clustering and association rule learning.
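As an illustration of clustering, the sketch below runs K-means (Lloyd's algorithm) on a handful of scalar values. The data, the initial centroids, and the function name `kmeans_1d` are hypothetical; the point is only to show that the groups emerge from the inputs alone, without labels.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on scalar data; the caller supplies the initial centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            sum(c) / len(c) if c else centroids[k]
            for k, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]          # invented values
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

With these inputs the algorithm settles on two groups, one around 2 and one around 11.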

2.3.3 Reinforcement learning

Reinforcement learning is a kind of machine learning method concerned with how software agents automatically determine the ideal behavior within a specific context in order to maximize their performance. It consists of two learning tasks: classification and control. Some applications of reinforcement learning are computer-played board games, robotic hands, and self-driving cars. The most commonly used algorithms are Q-learning, temporal difference, and deep adversarial networks (Haug 2018).
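The Q-learning update mentioned above can be sketched on a tiny, invented environment: a corridor of four states in which only reaching the rightmost state is rewarded. The environment, the constants, and the fixed sweep over all state-action pairs (used instead of a randomly exploring agent so that the sketch is deterministic) are assumptions for illustration.

```python
N_STATES = 4          # states 0..3; state 3 is the terminal goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (arbitrary choices)

def step(state, action):
    """Deterministic toy environment: reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):
    for s in range(N_STATES - 1):          # the terminal state is never updated
        for a in ACTIONS:
            s_next, r = step(s, a)
            best_next = max(q[(s_next, b)] for b in ACTIONS)
            # Q-learning (temporal-difference) update
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])

# Greedy policy: in each state pick the action with the highest Q-value
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

The learned policy moves right in every state, and the Q-values decay by the discount factor with distance from the goal.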

2.3.4 Neural networks

In 1943, McCulloch and Pitts (McCulloch and Pitts 1943) were the first to introduce the concept of an artificial neuron. The motivation for developing neural network technology was the desire to build an artificial intelligence system that could perform intelligent tasks like the human brain. Their research showed that a simple class of network can analyze highly complex problems, and they concluded that a system of neurons interconnected by synaptic weights could implement any complex function. Neural networks (NNs) form the basis of deep learning within artificial intelligence. ANNs are computational modeling tools that have recently emerged and found extensive acceptance in many disciplines for modeling complex real-world problems. They are composed of simple elements operating in parallel, inspired by biological nervous systems: artificial neurons (nodes) that try to imitate the basic abilities of the human brain and are capable of performing parallel computations for data processing and knowledge representation.

An ANN is similar to the human brain in two ways:

  • It acquires knowledge through learning.

  • Its knowledge is stored within interneuron connection strengths known as synaptic weights.

The nodes are stacked next to each other in three layers: the input layer, the hidden layer(s), and the output layer.

  • The input layer is the bottom layer that takes input from the dataset. It is also called the visible layer because it is the exposed part of the network. A neural network is usually drawn with a visible layer that has one neuron per input value (column in the dataset).

  • Hidden layers are the layers after the input layer; they are called hidden because they are not directly exposed to the input. This is where most of the computation is performed. The simplest network structure has a single neuron in the hidden layer that directly outputs the value. As the number of hidden layers increases, a very deep neural network can be constructed.

  • The output layer is the final layer, responsible for outputting a value or vector of values in the format required for the problem. A multi-class classification problem may have multiple neurons in the output layer.

The attractiveness of ANNs comes from the remarkable information processing characteristics of the biological system, such as nonlinearity, high parallelism, robustness, fault and failure tolerance, learning, and the ability to handle imprecise and fuzzy information. Nonlinearity allows a better fit to the data, noise insensitivity provides accurate prediction in the presence of uncertain data and measurement errors, and hardware failure tolerance, adaptivity, and learning allow the system to modify its internal structure in response to a changing environment. The problem-solving capability of ANN models can be increased by increasing the number of hidden layers, the number of neurons in any given layer, or the number of paths between neurons.

In an ANN, the independent variables are called inputs and the dependent variables are called outputs. Each neuron multiplies its inputs by their weights, adds a bias value to the sum, and passes the result to an activation function, which determines the final value output by the neuron. Different activation functions are available, depending on the nature of the input values. Once the output has been generated by the final layer, the loss function is calculated and backpropagation is performed, in which the weights are adjusted to minimize the loss. Weights are numeric values that are multiplied with the inputs; in backpropagation, they are modified according to the difference between the predicted outputs and the target outputs. An activation function is a mathematical formula that determines whether, and how strongly, the neuron switches on. The input layer represents the dimensions of the input vector. The hidden layer takes a set of weighted inputs and produces output through an activation function. The output layer represents the output of the neural network (Figs. 1, 2).
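The computation performed by a single neuron, and one weight adjustment driven by the loss, can be sketched as follows. The sigmoid activation, the squared-error loss, the learning rate, and the input values are assumptions chosen for illustration; the paper does not fix these choices at this point.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_forward(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def gradient_step(inputs, target, weights, bias, lr=0.5):
    """One backpropagation update for the loss L = 0.5 * (y - target)^2."""
    y = neuron_forward(inputs, weights, bias)
    delta = (y - target) * y * (1.0 - y)     # dL/dz via the chain rule
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias

inputs, target = [1.0, 0.5], 1.0             # invented example
weights, bias = [0.0, 0.0], 0.0
before = neuron_forward(inputs, weights, bias)      # sigmoid(0) = 0.5
weights, bias = gradient_step(inputs, target, weights, bias)
after = neuron_forward(inputs, weights, bias)
```

After the update the neuron's output moves from 0.5 toward the target of 1.0, which is the sense in which the weights are adjusted to reduce the loss.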

Fig. 1 Causes of death in low-income countries

Fig. 2 Sample pictures of traffic accidents in Oromia Special Zone

2.3.5 Backpropagation neural network (BPNN)

In this study, a neural net was used to create a model and forecast visibility. The goal was to create a statewide model rather than one for each location or a set of locations by category. As shown in Fig. 3, a neural network has an input layer, a hidden layer, and an output layer. Each layer is made up of a group of neurons. Synapses, which are links with numerical weights, connect the neurons of the input layer to those of the hidden layer. At the hidden layer level, the outputs of the input layer neurons are multiplied by their respective numeric weights and summed.

Fig. 3 Network architecture for ANN

Externally applied thresholds, or biases, are then used to lower the cumulative input to an activation function, which generates the output in the output layer (Arqub et al. 2021). This study's BPNN visibility model has multiple layers and employs the feedforward backpropagation learning algorithm. The tangent sigmoid function (Arqub et al. 2021) is a transfer function commonly used for the output and hidden layers. As the learning algorithm, Bayesian regularization is employed to minimize a linear combination of squared errors and weights while avoiding costly cross-validation (Abu Arqub 2021). The BPNN visibility model was trained with the objective of reducing overall prediction error.
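A minimal sketch of a backpropagation network of this kind is given below, under several assumptions that are not taken from the paper: one input, three tangent sigmoid (tanh) hidden neurons, a linear output neuron, a toy target function, and plain gradient descent. A fixed L2 weight penalty stands in for Bayesian regularization, which would instead adapt the balance between squared errors and weights during training.

```python
import math
import random

random.seed(0)
H = 3                                   # hidden neurons (arbitrary choice)
w1 = [random.uniform(-0.5, 0.5) for _ in range(H)]   # input -> hidden weights
b1 = [0.0] * H                                       # hidden biases
w2 = [random.uniform(-0.5, 0.5) for _ in range(H)]   # hidden -> output weights
b2 = 0.0                                             # output bias
LR, PENALTY = 0.05, 1e-3                # learning rate and fixed L2 penalty (assumed)

# Toy training set: y = x^2 on nine points in [-1, 1]
data = [(x / 4.0, (x / 4.0) ** 2) for x in range(-4, 5)]

def forward(x):
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
    return hidden, sum(w2[j] * hidden[j] for j in range(H)) + b2

def loss():
    """Squared errors plus the weight penalty, summed over all samples."""
    sse = sum((forward(x)[1] - y) ** 2 for x, y in data)
    weight_norm = sum(w * w for w in w1) + sum(w * w for w in w2)
    return sse + len(data) * PENALTY * weight_norm

initial_loss = loss()
for _ in range(2000):
    for x, y in data:
        hidden, out = forward(x)
        err = out - y                                   # output-layer error
        for j in range(H):
            # Error propagated back through the tanh hidden neuron
            grad_hidden = err * w2[j] * (1.0 - hidden[j] ** 2)
            w2[j] -= LR * (2 * err * hidden[j] + 2 * PENALTY * w2[j])
            w1[j] -= LR * (2 * grad_hidden * x + 2 * PENALTY * w1[j])
            b1[j] -= LR * 2 * grad_hidden
        b2 -= LR * 2 * err
final_loss = loss()
```

Comparing `final_loss` with `initial_loss` shows the effect of training: the penalized error is lower after the weight updates than before them.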

2.3.6 Multilayer perceptron neural network

A multilayer perceptron neural network is the most common type of feedforward neural network. The framework for multilayer perceptron neural networks is the same as for single-layer neural networks, but with more hidden layers. An MLP has at least three node layers: an input layer, a hidden layer, and an output layer. In general, an MLP is trained using a supervised learning method known as backpropagation. Among the various types of NNs, the multilayer perceptron neural network has become the most common, universal, and basic choice for the majority of tasks.

MLPNN has been developed and compared with other neural network models for predicting the severity level of accidents. These studies investigated whether MLPNN can predict accident severity more accurately than other conventional models.

The ANN with MLP architecture was used to predict the severity of traffic accidents using twelve input variables and three severity levels: fatal, slight, and minor (Abu Arqub 2021). The ANN provided the best prediction performance in this study in Addis Ababa Special Zone, Ethiopia, with an R-value of around 0.87.
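The forward pass of a small MLP classifier can be sketched as below. The layer sizes, the ReLU hidden activation, the softmax output, and every weight value are hypothetical; in practice the weights would be learned by backpropagation, and the network described above uses twelve inputs rather than two. Only the three class names follow the severity levels mentioned in the text.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def dense(v, weights, biases):
    """Fully connected layer: one row of `weights` per output neuron."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, biases)]

def softmax(v):
    peak = max(v)
    exps = [math.exp(x - peak) for x in v]     # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical fixed weights: 2 inputs -> 2 hidden neurons -> 3 output classes
W_HIDDEN = [[1.0, -1.0], [0.5, 0.5]]
B_HIDDEN = [0.0, 0.0]
W_OUT = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_OUT = [0.0, 0.0, 0.0]
CLASSES = ["fatal", "slight", "minor"]

def predict(features):
    hidden = relu(dense(features, W_HIDDEN, B_HIDDEN))
    probs = softmax(dense(hidden, W_OUT, B_OUT))
    return CLASSES[probs.index(max(probs))], probs

label, probs = predict([2.0, 0.0])
```

The output layer has one neuron per class, and the predicted severity level is the class with the highest probability.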

2.3.7 Radial basis function neural network

The radial basis function neural network model explains the more specific interactions between crash frequency and risk factors (Abu Arqub 2021). This model allows for the rapid replacement of redundant nodes (Momani et al. 2011). The RBFNN has good approximation accuracy and a fast convergence speed, which can effectively address the shortcomings of BP neural networks (Momani et al. 2011). RBFNN has been widely used in the fields of function approximation, problem solving, computer vision, system identification, and so on due to its deep physiology foundation, simple network structure, fast learning speed, and outstanding approximate solution abilities (Song and Li 2011).

According to the findings of this study, the RNN neural network was the best model for determining the severity of driver injuries. Huang et al. (2016b) developed a method for predicting accident frequency and determining the risk parameters that were more significant for road accidents to occur. Optimizing the RBFNN increases the efficiency of its estimation, although other variables were found to have little effect on the frequency of occurrence for the data.
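The structure of an RBF network can be sketched as a layer of Gaussian basis functions followed by a linear output layer. In the sketch below, the centers, the width, the two training samples, and the learning rate are all invented for illustration, and only the output layer is trained (with the delta rule); this is one common way to train such a network, not necessarily the procedure used in the cited studies.

```python
import math

def gaussian_rbf(x, center, width):
    """Activation depends only on the distance between x and the center."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))

def rbf_forward(x, centers, width, weights, bias):
    activations = [gaussian_rbf(x, c, width) for c in centers]
    output = sum(w * a for w, a in zip(weights, activations)) + bias
    return output, activations

centers = [(0.0, 0.0), (1.0, 1.0)]       # hypothetical fixed centers
weights, bias, width = [0.0, 0.0], 0.0, 0.5
samples = [((0.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]   # invented (input, target) pairs

# Only the linear output layer is trained here (delta rule)
for _ in range(500):
    for x, target in samples:
        out, acts = rbf_forward(x, centers, width, weights, bias)
        err = out - target
        weights = [w - 0.5 * err * a for w, a in zip(weights, acts)]
        bias -= 0.5 * err

at_second_center = rbf_forward((1.0, 1.0), centers, width, weights, bias)[0]
near_second = rbf_forward((0.9, 1.0), centers, width, weights, bias)[0]
near_first = rbf_forward((0.1, 0.0), centers, width, weights, bias)[0]
```

Because each basis function responds only to inputs near its center, a query close to the second center yields an output close to that center's target, which illustrates the local approximation behavior that makes the output layer quick to fit.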

2.3.8 Long short-term memory

Hochreiter and Schmidhuber (1997) proposed the long short-term memory (LSTM) model (Yohannis 2019), which is a variant of the recurrent neural network (RNN). It introduces a specialized memory storage unit and is trained with the backpropagation through time algorithm. It can address the RNN's inability to capture long-term dependencies.

In the standard LSTM notation, the input sequence up to step $t$ is $X = \{x_1, x_2, \ldots, x_t\}$, where $i_t$ is the input gate, $f_t$ is the forget gate, and $o_t$ is the output gate. The memory cell state $c_t$ regulates what is remembered and what is forgotten via these gates.
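The gate equations themselves are not reproduced in the text. A commonly used formulation of the LSTM cell is given below; the weight matrices $W$, the bias vectors $b$, the hidden state $h_t$, the candidate state $\tilde{c}_t$, the logistic function $\sigma$, and the element-wise product $\odot$ are notation assumed here and may differ from the notation of the cited sources.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate scales the previous cell state, the input gate scales the new candidate content, and the output gate controls how much of the cell state is exposed as the hidden state.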

Different architectures are feasible for neural networks. The direction of information flow is set by the connections between their neurons. They are classified as feedforward or recurrent based on their network connections (backpropagating).

2.3.8.1 Feedforward neural networks

Information flows in only one direction in these neural networks, from the input layer to the output layer. Once the weights are determined, they are rarely changed. Weights can be set explicitly or computed using functions such as the radial basis function. The nodes perform their functions without regard to whether the results produced are accurate (i.e., they do not re-adjust based on the results produced). There is no feedback from later layers.

2.3.8.2 Recurrent neural networks (backpropagating)

Recurrent neural networks are neural networks that are generally designed to describe functions and that improve through feedback links. They perform better and are more technically acceptable than feedforward networks. Hochreiter and Schmidhuber (1997) proposed a solution to the vanishing or exploding gradient problem of the traditional RNN algorithm.

To produce a result, information is passed from the input layer to the output layer. Errors in the result are then communicated back to the previous layers, so that each node learns how much it contributed to the incorrect answer. The weights are re-adjusted accordingly, and the network improves. Information therefore flows in both directions, which in effect combines the feedforward and backpropagation algorithms.

2.3.9 Researches on road traffic accident prediction

Machine learning is now one of the most advanced fields of artificial intelligence and can be applied to obtain better results. Many studies have analyzed accident data, identified the major factors causing accidents, and used different techniques to predict the severity of road traffic accidents. Some of them are summarized below.

Maria Lígia Chuerubim, Alan Valejo, Barbara Stolte Bezerra, and Irineu Da Silva (2019) classified the severity of road accidents using artificial neural network (ANN) modeling with MLP. The traffic accident database used in this research consists of accidents that occurred between 2009 and 2012 from km 125 to km 145.5 of Dom Pedro I Highway (SP-065), located in the city of Campinas (Brazil). During this period, 2824 accidents occurred, excluding missing observations and incomplete data. Eleven variables were selected to classify accident severity as fatal or non-fatal. Of the data, 1964 records (70%) were used for ANN training and 860 (30%) for the ANN test. The ANN model with MLP classified accidents into two categories: accidents with no injury (NI) and accidents with injury (WI). The ANN classified fatal accidents (WI) with an accuracy of 2.6%, whereas non-fatal accidents (NI) were classified with high accuracy (97.4%). Overall, the ANN achieved an accuracy of 77.9% and a mean AUC (area under curve) of 0.618.
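The gap between a high overall accuracy and a very low accuracy on the minority class is typical of imbalanced data, and it can be made visible by computing per-class recall next to overall accuracy. The sketch below does this for a confusion matrix whose counts are invented for illustration; they are not the counts of the cited study.

```python
def per_class_recall(confusion):
    """confusion[actual][predicted] holds counts; returns recall per actual class."""
    return {
        actual: row[actual] / sum(row.values())
        for actual, row in confusion.items()
    }

def overall_accuracy(confusion):
    correct = sum(row[actual] for actual, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total

# Invented counts for an imbalanced two-class test set
# (NI = no injury, WI = with injury)
confusion = {
    "NI": {"NI": 90, "WI": 10},
    "WI": {"NI": 18, "WI": 2},
}
recall = per_class_recall(confusion)
accuracy = overall_accuracy(confusion)
```

In this toy case the overall accuracy is about 77% although only 10% of the minority class is recognized, which is why per-class measures and the AUC are reported alongside accuracy.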

Meisam Siamidoudaran and Ersun Iscioglu (2019) applied an MLP neural network (which can be used to categorize various road safety problems), an SVM (applied to the classification of collision injury severity), and a hybrid MLP-SVM (a combination of the MLP and SVM networks). The data were obtained from road accident records for the city of London. The SVM and MLPNN models were applied to predict accident severity using MATLAB software. All factors were normalized between zero and one. The dataset was randomly split into 70% training data and 30% test data. For the MLP model, the prediction accuracy was 84.41% in the training phase and 83.82% in the test phase. For the SVM model, the accuracy was 81.63% in the training phase and 80.88% in the test phase. The hybrid model provided a better fit than the MLP and SVM models, and the results obtained were very satisfactory. The hybrid model increased the accuracy of injury severity prediction by 10%, reaching a testing classification accuracy of 90.58%.
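The two preparation steps mentioned here, normalization to the range zero to one and a random 70/30 split, can be sketched as follows. The function names, the seed, and the example values are assumptions for illustration; the cited study carried out these steps in MATLAB.

```python
import random

def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]      # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in column]

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of `rows` and cut it at the requested fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

speeds = [30, 50, 80, 130]                # invented values
scaled = min_max_scale(speeds)            # smallest value maps to 0, largest to 1
train, test = train_test_split(list(range(10)))
```

Fixing the seed makes the split reproducible, which matters when several models are compared on the same training and test sets.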

3 Methodology

3.1 Data collection

The dataset consists of road traffic accident records collected from Oromia Police Commission Excel files and Oromia Special Zone Police Department hardcopy records. It covers accidents in the districts (Woredas) of the Oromia Special Zone from 2005 to 2012. Most of the data were available only in hardcopy, so these records were first entered into Excel; the remaining records were retrieved from existing Excel files. All records were then converted to CSV (comma-separated values) format, which can be viewed in spreadsheet software such as MS Excel and is easy to import using the pandas library.
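As a minimal sketch of the import step, the snippet below reads a small CSV excerpt with pandas. The column names and the three rows are hypothetical placeholders, not records from the study's dataset; in practice the path of the integrated CSV file would be passed to `read_csv` instead of the in-memory text.

```python
import io

import pandas as pd

# Hypothetical CSV excerpt; column names and rows are illustrative only.
csv_text = """District,Accident Date,Cause of Accident,Road Surface,Weather Conditions,Accident Severity
Burayu,2010-03-14,Overspeeding,Wet,Raining,Fatal
Sendafa,2011-07-02,Vehicle defect,Dry,Fine,Slight
Holeta Town,2012-01-21,Pedestrian fault,Dry,Fine,Serious
"""

# parse_dates converts the date column to datetime values on import.
accidents = pd.read_csv(io.StringIO(csv_text), parse_dates=["Accident Date"])

print(accidents.shape)
print(accidents.dtypes)
```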

3.2 Study area

The Oromia Special Zone Surrounding Finfinnee (in Afaan Oromo: Godina Addaa Oromiyaa Nannawaa Finfinnee) is one of the zones of the Oromia Region in Ethiopia and surrounds Addis Ababa. It was created in 2003 E.C. from parts of the North Shewa, East Shewa, Southwest Shewa, and West Shewa Zones. The zone was created to support the cooperation and development of the areas surrounding Addis Ababa. The administrative center of this zone is in Addis Ababa, the capital of the Oromia region. The districts and towns in this zone include Akaki, Berek, Burayu, Holeta Town, Mulo, Sebeta Hawas, Sebeta Town, and Sendafa.

3.3 Dataset parameters

The dataset contains person-related parameters (such as casualty, gender, and age), accident-related parameters (such as accident time, day of accident, cause of accident, and location of accident), environmental characteristics (such as weather condition and light condition), vehicle-related characteristics (such as vehicle type), and road-related parameters (such as road type and road surface).

Real-world datasets are often incomplete, noisy, redundant, and inconsistent, and working with low-quality data leads to low-quality results. Therefore, the first step in every machine learning application is to explore the dataset and understand its characteristics so that it is ready for the modeling stage. This process is commonly known as data preprocessing: a set of techniques that transform raw data into a simple and understandable format. Data preprocessing includes data integration, data cleaning, normalization, transformation, data balancing, feature extraction, and feature selection. It prepares data for further processing by correcting outliers, handling missing values, and normalizing and scaling the data (Abu Arqub 2021). Data preprocessing helps the neural network learn the important patterns, which subsequently improves data fitting and prediction accuracy.

3.4 Data preprocessing techniques

3.4.1 Data integration

Road traffic accident data were collected from hardcopy records and Excel softcopy files. These data need to be integrated into a single file in a common format, namely comma-separated values (CSV).

3.4.2 Data cleaning

Data cleaning is the removal or repair of missing or faulty values. Some data parameters may be incomplete and may not carry the information needed to address the problem; such instances may need to be removed. Additionally, some attributes may contain sensitive information, and these features may need to be removed from the data entirely. The dataset may also contain many unnecessary entries that take up a large amount of memory. This is addressed by retaining only the parameters that directly bear on road accidents.

  • Set Null (Missing) Values: For various reasons, many real-world datasets contain missing values, often encoded as blanks or NaNs. Such datasets are incompatible with scikit-learn estimators, which assume that all values in an array are numerical and that all of them hold meaning. A better strategy than discarding incomplete records is to impute, or fill in, the missing values. Different procedures are used to estimate missing values.

    • Univariate Imputation is a simple strategy in which the missing values of a column are filled using a statistic of that same column. With the mean, a feature's mean value computed from the entire column is substituted for the missing values; a smarter variant uses the feature mean of all samples belonging to the same class rather than the general feature mean. With the mode, the value of the feature that occurs most often is used for all unknown values of that feature.

  • Dealing with outliers: Outliers are data points that lie far from the other data points. They are problematic for many analyses because they can cause tests to miss significant findings or distort real results. Outliers can be treated like missing data and either removed or replaced using similar techniques, e.g., with the mean, mode, or median.

  • Noisy Data Correction: Noise is a random error or variance in a measured variable, so the data need to be smoothed. The following smoothing technique is used.

    • Binning methods smooth a sorted data value by consulting the values around it. The sorted values are distributed into several buckets, or bins. Because each value is smoothed using only its neighborhood, binning performs local smoothing.
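The cleaning steps above can be sketched on a toy table. The column names and values below are hypothetical, and smoothing by bin means is only one of several binning variants, chosen here as an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records with missing entries (not the study's data).
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0, np.nan, 58.0],
    "weather": ["Fine", "Raining", None, "Fine", "Fine", "Raining"],
    "severity": ["Slight", "Slight", "Fatal", "Fatal", "Fatal", "Slight"],
})

# Mean imputation: use the mean of the whole column.
age_mean = df["age"].fillna(df["age"].mean())

# Class-wise mean imputation: use the mean of rows with the same class label.
class_means = df.groupby("severity")["age"].transform("mean")
age_class_mean = df["age"].fillna(class_means)

# Mode imputation: use the most frequent category.
weather_mode = df["weather"].fillna(df["weather"].mode().iloc[0])


def smooth_by_bin_means(values, n_bins):
    """Sort the values, split them into bins, replace each by its bin mean."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    smoothed = np.empty_like(values)
    for idx in np.array_split(order, n_bins):
        smoothed[idx] = values[idx].mean()
    return smoothed


smoothed = smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], n_bins=3)
print(age_mean.tolist(), age_class_mean.tolist(), weather_mode.tolist())
print(smoothed.tolist())
```

Outliers could be handled with the same `fillna` pattern after first marking them as missing, for example by a rule on the distance from the median.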

3.4.3 Feature extraction

Feature extraction is a dimensionality reduction technique. Unlike feature selection, which selects and retains the most significant attributes, feature extraction transforms the attributes. The transformed attributes, or features, are linear combinations of the original attributes.

Rules for Feature Extraction

Rules for extracting fatal accidents


Condition 1:


If cause of accident is overspeeding or head collision or rear collision or angular collision and


If the road surface is wet or snow or mud and


If weather conditions are raining or fog and


If lighting conditions are dark


Condition 2:


If cause of accident is vehicle defect and


If the road surface is wet or snow or mud and


If weather conditions are raining or fog


Condition 3:


If cause of accident is roll over or fall and


If the road surface is wet or snow or mud and


If weather conditions are raining or fog


Condition 4:


If cause of accident is pedestrian fault and


If the road surface is wet or snow or mud and


If weather conditions are raining or fog or snowing

3.4.4 Rules for extracting serious accidents

Condition 1:


If cause of accident is vehicle defect and


If weather conditions are raining or fog or snowing


Condition 2:


If cause of accident is overspeeding or side collision or sliding and


If weather conditions are fine


Condition 3:


If cause of accident is driver fault or pedestrian fault and


If the road surface is wet or snow or mud and


If weather conditions are fine


Condition 4:


If cause of accident is fall down or roll over and


If the road surface is dry


Condition 5:


If cause of accident is hitting animals or poor light and


If the road surface is wet, snow, or mud and


If weather conditions are raining or fog or snowing

3.4.5 Rules for extracting slight accidents

Condition 1:


If cause of accident is hitting animals or driver fault or pedestrian fault and


If the road surface is dry


Condition 2:


If cause of accident is poor light, weather condition, or road condition and


If weather conditions are fine and


If the road surface is dry


Condition 3:


If the cause of accident is overspeeding or vehicle defect and


If weather conditions are fine and


If the road surface is dry
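The fatal, serious, and slight rules above can be written as data and checked mechanically. The sketch below is one possible encoding, not the authors' implementation: the lowercase value spellings, the key names (`cause`, `surface`, `weather`, `light`), and the decision to test the rules in the order fatal, serious, slight with the first match winning are all assumptions. Some rules overlap (for example, overspeeding in fine weather on a dry road satisfies both a serious and a slight rule), so the evaluation order affects the label.

```python
# Each rule maps an attribute to the set of accepted values; every listed
# attribute must match (logical AND). Names and spellings are assumptions.
WET = {"wet", "snow", "mud"}
RAIN_FOG = {"raining", "fog"}
RAIN_FOG_SNOW = {"raining", "fog", "snowing"}

RULES = [
    ("fatal", {"cause": {"overspeeding", "head collision", "rear collision",
                         "angular collision"},
               "surface": WET, "weather": RAIN_FOG, "light": {"dark"}}),
    ("fatal", {"cause": {"vehicle defect"}, "surface": WET, "weather": RAIN_FOG}),
    ("fatal", {"cause": {"roll over", "fall"}, "surface": WET, "weather": RAIN_FOG}),
    ("fatal", {"cause": {"pedestrian fault"}, "surface": WET,
               "weather": RAIN_FOG_SNOW}),
    ("serious", {"cause": {"vehicle defect"}, "weather": RAIN_FOG_SNOW}),
    ("serious", {"cause": {"overspeeding", "side collision", "sliding"},
                 "weather": {"fine"}}),
    ("serious", {"cause": {"driver fault", "pedestrian fault"}, "surface": WET,
                 "weather": {"fine"}}),
    ("serious", {"cause": {"fall down", "roll over"}, "surface": {"dry"}}),
    ("serious", {"cause": {"hitting animals", "poor light"}, "surface": WET,
                 "weather": RAIN_FOG_SNOW}),
    ("slight", {"cause": {"hitting animals", "driver fault", "pedestrian fault"},
                "surface": {"dry"}}),
    ("slight", {"cause": {"poor light", "weather condition", "road condition"},
                "weather": {"fine"}, "surface": {"dry"}}),
    ("slight", {"cause": {"overspeeding", "vehicle defect"},
                "weather": {"fine"}, "surface": {"dry"}}),
]


def label_severity(record):
    """Return the severity of the first matching rule, or None if none match."""
    for severity, conditions in RULES:
        if all(record.get(key) in allowed for key, allowed in conditions.items()):
            return severity
    return None


example = {"cause": "overspeeding", "surface": "wet",
           "weather": "raining", "light": "dark"}
print(label_severity(example))
```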

3.4.6 Feature selection

Working with a large number of parameters may degrade model performance because training time grows with the number of features. Feature selection methods are therefore used to choose a subset of the input features for training the model, keeping the inputs that contribute most to classification and ignoring irrelevant or redundant ones. Feature selection is thus a critical factor in reducing overfitting, reducing training time, reducing model complexity so that the model is easier to interpret, and obtaining more accurate predictions. The Python library scikit-learn (sklearn) was used to discard the features of lesser importance. To obtain the most essential features, we ran an experiment applying three different feature selection algorithms.

4 Result and discussion

As presented in the previous section, the data have been described, explored, selected, and cleaned sufficiently to be used for model building. This section presents the detailed activities carried out in selecting a modeling technique, implementing the selected technique with the most appropriate algorithms, and evaluating the models to select the best one for prediction. The study focuses on identifying the driver-related and road-related factors that lead to traffic accidents and on building a prediction model. Various classification models have been built using ANNs. The models have been tested on different numbers of the selected attributes, and finally the model with the best performance is selected.

4.1 Data distributions of collected dataset

The dataset contains a total of 6170 instances, and a total of fourteen variables (District, Area, Accident Date, Time, Time of Accident, Cause of Accident, Road Type, Road Surface, Lighting Conditions, Weather Conditions, Casualty, Sex of Casualty, Age of Casualty, and Type of Vehicle) are used as input variables (Figs. 4, 5).

Fig. 4 The capital of the Oromia region

Fig. 5 Low-quality data lead to low-quality results

4.2 Exploratory data analysis

4.2.1 Accident severity distribution

Figure 6 shows the accident severity distribution.

Fig. 6 Accident severity distribution

4.2.2 Accident severity distribution by year

Figure 7 shows the accident severity by year.

Fig. 7 Accident severity by year

4.2.3 Accident-prone areas

Figure 8 shows the accident distribution by time.

Fig. 8 Accident distribution by time

4.3 Data preprocessing

A total of 6170 entries are present in the dataset. The categorical attributes (District, Time of Accident, Cause of Accident, Road Type, Road Surface, Lighting Conditions, Weather Conditions, Casualty, Sex of Casualty, Type of Vehicle, and Accident Severity) are converted into numeric codes using DataFrame.replace(). The Accident Date attribute is split into new attributes (day, month, and year), which can be used as features for the LSTM model to forecast the number of accidents.
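A minimal sketch of this encoding step is shown below. The three rows and the code tables are hypothetical; the actual integer codes used in the study are not given in the text. The frame is built with `dtype=object` so that `DataFrame.replace` can swap strings for integers, and the result is cast to integers explicitly.

```python
import pandas as pd

# Hypothetical records; object dtype lets replace() swap strings for integers.
df = pd.DataFrame({
    "Accident Date": ["2010-03-14", "2011-07-02", "2012-01-21"],
    "Road Surface": ["Wet", "Dry", "Dry"],
    "Accident Severity": ["Fatal", "Slight", "Serious"],
}, dtype=object)

# Assumed code tables: {column: {category: integer code}}.
codes = {
    "Road Surface": {"Dry": 0, "Wet": 1},
    "Accident Severity": {"Slight": 0, "Serious": 1, "Fatal": 2},
}

encoded = df.replace(codes)
for column in codes:
    encoded[column] = encoded[column].astype(int)

# Split the date into day, month, and year features.
dates = pd.to_datetime(encoded["Accident Date"])
encoded["day"] = dates.dt.day
encoded["month"] = dates.dt.month
encoded["year"] = dates.dt.year

print(encoded)
```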

4.4 Fill null values

The column mean is used to fill null entries in an attribute: the feature’s mean value, computed from the entire column, is substituted for the missing values.

4.4.1 Feature scaling

Features are scaled to the range 0 to 1 using the MinMaxScaler class of scikit-learn.
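For reference, min-max scaling maps each feature column x to (x − min) / (max − min). The NumPy sketch below applies that formula to a hypothetical two-column array; scikit-learn's `MinMaxScaler` uses the same per-feature formula with its default `feature_range=(0, 1)`. The handling of constant columns here is an illustrative choice.

```python
import numpy as np


def min_max_scale(X):
    """Scale every column of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Avoid division by zero: constant columns are mapped to 0.
    col_range[col_range == 0] = 1.0
    return (X - col_min) / col_range


# Hypothetical features: age of casualty and an encoded category.
X = np.array([[18, 0], [30, 1], [42, 2], [66, 1]])
scaled = min_max_scale(X)
print(scaled)
```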

4.5 Resampling unbalanced data

The imbalanced dataset is balanced using SMOTE, RandomOverSampler, and RandomUnderSampler from the Python library imblearn. SMOTE and RandomOverSampler are up-sampling techniques that balance the data by increasing the number of data points in the minority class. RandomUnderSampler balances the data by decreasing the number of data points in the majority class (Figs. 9, 10).
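To illustrate the idea behind random over-sampling and under-sampling, the sketch below resamples every class to a common size with plain NumPy. It is not the imblearn implementation and does not cover SMOTE, which synthesizes new minority points by interpolation rather than duplicating existing ones. The function name and the toy labels are assumptions.

```python
import numpy as np


def random_resample(X, y, target_count, rng):
    """Resample every class to target_count rows (over- or under-sampling)."""
    X, y = np.asarray(X), np.asarray(y)
    picked = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        if len(idx) >= target_count:
            # Under-sampling: keep a random subset without replacement.
            keep = rng.choice(idx, size=target_count, replace=False)
        else:
            # Over-sampling: keep all rows and add random duplicates.
            extra = rng.choice(idx, size=target_count - len(idx), replace=True)
            keep = np.concatenate([idx, extra])
        picked.append(keep)
    picked = np.concatenate(picked)
    return X[picked], y[picked]


rng = np.random.default_rng(0)
y = np.array([0] * 6 + [1] * 3 + [2] * 1)   # imbalanced toy labels
X = np.arange(20).reshape(10, 2)
counts = np.bincount(y)

X_over, y_over = random_resample(X, y, counts.max(), rng)
X_under, y_under = random_resample(X, y, counts.min(), rng)
print(np.bincount(y_over), np.bincount(y_under))
```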

Fig. 9 Original rows with a null value and substituted row after applying mean operation

Fig. 10 Output of dataset after applying feature scaling

4.6 Feature selection

To determine the most significant factors, we applied two approaches: an embedded method (LASSO regularization) and feature importance, including permutation importance.

4.6.1 LASSO regularization (L1) result

Figure 11 shows the LASSO regularization result.

Fig. 11 Lasso regularization result

4.6.2 Feature importance results

Figure 12 shows the feature importance scores of the attributes. Feature importance assigns a score to each input feature based on how useful it is for predicting the target variable.

Fig. 12 Feature importance

4.6.3 Permutation importance results

Figure 13 shows the permutation importance scores of the attributes. The method permutes the values of one parameter at a time and then measures the resulting increase in model error.

Fig. 13 Permutation importance scores
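The three scoring methods of this section can be sketched with scikit-learn on synthetic data. The text does not state which estimator produced the feature importance and permutation importance scores, so the random forest, the `alpha` value, the feature names, and the synthetic target below are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Lasso

# Synthetic stand-in data: three informative columns and one noise column.
rng = np.random.default_rng(0)
names = ["road_surface", "weather", "cause", "noise"]
X = rng.integers(0, 3, size=(300, 4)).astype(float)
y = (X[:, 0] + X[:, 1] + X[:, 2] >= 4).astype(int)

# 1) LASSO (L1): features whose coefficient shrinks to zero can be dropped.
lasso = Lasso(alpha=0.05).fit(X, y)
lasso_scores = np.abs(lasso.coef_)

# 2) Impurity-based feature importance from a tree ensemble.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
tree_scores = forest.feature_importances_

# 3) Permutation importance: drop in score after shuffling one column.
perm = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
perm_scores = perm.importances_mean

for name, l1, imp, pi in zip(names, lasso_scores, tree_scores, perm_scores):
    print(f"{name:>12}  lasso={l1:.3f}  importance={imp:.3f}  permutation={pi:.3f}")
```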

4.7 Accident severity prediction

Six neural network algorithms have been used for accident severity classification in this study. They are feedforward neural network, backpropagation neural network, multilayer perceptron neural network, radial basis function neural network, recurrent neural network, and long short-term memory neural network. The outcome of models and the performance evaluations metrics are discussed below.
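The evaluation metrics reported in the following subsections can be computed from true and predicted class codes as sketched below. The text does not state how precision, recall, and F-measure were averaged over the three severity classes, so macro averaging is assumed here; the labels are a hypothetical toy example, and the function name is illustrative.

```python
import numpy as np


def severity_metrics(y_true, y_pred, n_classes):
    """Accuracy, macro precision/recall/F1, Cohen's kappa, MAE, and RMSE."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)          # rows: actual, cols: predicted
    total = cm.sum()
    diag = np.diag(cm)

    accuracy = diag.sum() / total
    precision = diag / cm.sum(axis=0)           # per class
    recall = diag / cm.sum(axis=1)              # per class
    f1 = 2 * precision * recall / (precision + recall)

    # Cohen's kappa: agreement beyond what the class marginals give by chance.
    p_chance = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # MAE and RMSE computed on the integer class codes.
    err = y_true - y_pred
    return {
        "accuracy": accuracy,
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
        "kappa": kappa,
        "mae": np.abs(err).mean(),
        "rmse": np.sqrt((err**2).mean()),
    }


# Toy labels: 0 = slight, 1 = serious, 2 = fatal (assumed coding).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]
metrics = severity_metrics(y_true, y_pred, n_classes=3)
print(metrics)
```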

4.8 The outcome of feedforward neural network model

The feedforward neural network (FFNN) model was trained with hidden layers of 200 neurons and the ReLU activation function. FFNN scored an accuracy of 80.46%, and the precision, recall, and F-measure are 0.804, 0.804, and 0.804, respectively. Cohen’s Kappa statistic is 0.706. Mean absolute error and root mean squared error are 0.198 and 0.453, respectively (Figs. 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25).

Fig. 14 Confusion matrix of FFNN

Fig. 15 Confusion matrix of BPNN

Fig. 16 Confusion matrix of MLPNN

Fig. 17 Learning curve of MLPNN

Fig. 18 Confusion matrix of RBFNN

Fig. 19 RBFNN model accuracy and loss

Fig. 20 Confusion matrix of RNN

Fig. 21 Confusion matrix of LSTM

Fig. 22 LSTM loss and accuracy

Fig. 23 Models accuracy comparisons

Fig. 24 Accident number forecast

Fig. 25 LSTM model loss on train and test

4.9 The outcome of backpropagation neural network model

BPNN scored an accuracy of 77.26%, and the precision, recall, and F-measure are 0.756, 0.756, and 0.756, respectively. Cohen’s Kappa statistic is 0.556.

4.10 The outcome of multilayer perceptron neural network model

The MLPNN model is built with two hidden layers of 50 and 15 neurons, the logistic activation function, and the adam solver. MLPNN scored an accuracy of 97.13%, and the precision, recall, and F-score are 0.979, 0.979, and 0.979, respectively. Cohen’s Kappa statistic is 0.968. Mean absolute error and root mean squared error are 0.021 and 0.151, respectively.
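The stated configuration (hidden layers of 50 and 15 neurons, logistic activation, adam solver) matches the parameter names of scikit-learn's `MLPClassifier`, so a hedged sketch with that class is given below. It is trained on synthetic three-class data with fourteen features as a stand-in for the accident dataset; `max_iter`, the split ratio, and the random seeds are assumptions, and the resulting accuracy says nothing about the study's results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in: 14 input features and 3 severity classes.
X, y = make_classification(n_samples=600, n_features=14, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Two hidden layers (50 and 15 neurons), logistic activation, adam solver.
mlp = MLPClassifier(hidden_layer_sizes=(50, 15), activation="logistic",
                    solver="adam", max_iter=500, random_state=0)
mlp.fit(X_train, y_train)

test_accuracy = mlp.score(X_test, y_test)
print(f"test accuracy on synthetic data: {test_accuracy:.3f}")
```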

4.11 The outcome of radial basis function neural network model

RBFNN scored an accuracy of 87.00%, and the precision, recall, and F-score are 0.717, 0.717, and 0.717, respectively. Cohen’s Kappa statistic is 0.581. Mean absolute error and root mean squared error are 0.318 and 0.598, respectively.

4.12 The outcome of the recurrent neural network model

The RNN model is constructed with 3 cells of 150 neurons each and trained for 100 epochs. RNN scored an accuracy of 97.18%, and the precision, recall, and F-measure are 0.9718, 0.9718, and 0.9718, respectively. Cohen’s Kappa statistic is 0.957. Mean absolute error and root mean squared error are 0.03 and 0.18, respectively.

4.13 The outcome of long short-term memory neural network model

The LSTM model was trained using the ReLU activation function and the adam solver for 100 epochs and 300 batches. LSTM scored an accuracy of 91.00%, and the precision, recall, and F-measure are 0.894, 0.894, and 0.894, respectively. Cohen’s Kappa statistic is 0.808. Mean absolute error and root mean squared error are 0.105 and 0.327, respectively.

4.14 Comparison of the performance between algorithms

Table 5 compares the classification algorithms. The highest accuracy, 97.18%, was obtained by the RNN model; the accuracies of MLP, LSTM, RBFNN, FFNN, and BPNN are 97.13%, 91.00%, 87.00%, 80.46%, and 77.26%, respectively.

Table 5 Performance comparison between neural network algorithms

4.15 Accident forecasting

4.15.1 The outcome of long short-term memory neural network model

The LSTM model was trained using the ReLU activation function and the adam solver for 18 epochs and 300 batches, and it forecasted 3555 accidents over three years. The LSTM scored a mean absolute error of 1.81, a root mean squared error of 2.455, and a mean percentage error of −50.79.
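The three forecast error measures can be computed as in the sketch below. The sign convention for the mean percentage error (actual minus forecast, divided by actual) is an assumption, since the text does not define it; under this convention a negative value means the forecasts exceed the actual counts on average. The numbers are a toy example, not the study's series.

```python
import math


def forecast_errors(actual, forecast):
    """Return (MAE, RMSE, MPE in percent) for two equally long sequences."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Mean percentage error keeps the sign, so over- and under-forecasts
    # can cancel out; actual values must be non-zero.
    mpe = 100 * sum(e / a for e, a in zip(errors, actual)) / n
    return mae, rmse, mpe


# Toy accident counts per period (hypothetical).
actual = [4, 5, 8, 10]
forecast = [5, 5, 10, 9]
mae, rmse, mpe = forecast_errors(actual, forecast)
print(mae, rmse, mpe)
```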

4.16 Identifying accident-prone areas

Figure 26 shows areas with low, medium, and high accident severity. Red spots mark areas with a high accident severity number, green spots mark areas with a medium accident severity number, and light green spots mark areas with a low accident severity number (Table 6).

Fig. 26 Accident-prone areas

Table 6 Comparing actual accident number and forecasted accident number

5 Conclusion and future work

5.1 Conclusions

Road traffic accidents have become a major cause of death and injury. With increasing urbanization and growing populations, the volume of vehicles has risen sharply, and road traffic accidents have become one of the most serious problems draining material and human resources. Road traffic accidents usually cause damage and injuries ranging from minor harm to permanent disability or death. Road traffic accident forecasting and the identification of accident-prone areas can therefore help reduce the risk of traffic accidents. The objective of this research has been to explore the use of artificial neural networks as a technique for classifying accident severity levels, to identify the most significant factors that contribute to the severity of such crashes, and to forecast accident numbers. The road traffic accident data were obtained from the Oromia Police Commission and the Oromia Special Zone Police Department. The output, the accident severity class, was divided into three major categories: slight accident, serious accident, and fatal accident.

The major finding of this research is that ANN models perform well on the objectives above. In this paper, we have used six deep learning algorithms for prediction and LSTM for forecasting accident numbers. From this study, we can conclude the following points:

  1. In testing the performance of the ANN models developed in this study, the classification accuracies were found to be as follows:

    a. BPNN with a classification accuracy of 77.26%

    b. FFNN with a classification accuracy of 80.46%

    c. RBFNN with a classification accuracy of 87.00%

    d. LSTM with a classification accuracy of 91.00%

    e. MLP with a classification accuracy of 97.13%

    f. RNN with a classification accuracy of 97.18%

    The results have been compared using different evaluation metrics such as the kappa statistic, mean absolute error, root mean squared error, and F-measure. Comparing all these metrics shows that the recurrent neural network model gave the best performance among the models.

    This classification performance indicates that the ANN was successful in capturing the relationship between the different parameters and the severity of the accident cases used in testing the performance of the network.

  2. During the process of data reduction and classification, some factors were removed because they seemed to have little or no effect on the accident severity level.

  3. Road surface, weather conditions, and cause of accident are among the most important factors that influence the severity of an accident.

Although a significant improvement in accuracy has been observed, this study is limited by its accident dataset, which contains a limited number of samples covering only 2005 to 2012 because the zone was established recently. This study provides a way to analyze the severity of road accidents, predict the accident severity type, forecast accident numbers, and analyze the factors that lead to accidents. Factors such as road surface, weather conditions, and cause of accident were observed to have a strong effect on the severity of an accident. Although the occurrence of an accident cannot be controlled, the analysis of these data can enable the government and citizens to take precautionary steps toward greater safety.

5.2 Recommendations

From the study performed in this thesis, we recommend the following:

  • The number of instances in the dataset needs to be increased to build a more effective predictive model.

  • Several useful techniques, such as regression analysis, Bayesian belief networks, and Naïve Bayes, can be used for data preprocessing, and several algorithms, such as ensemble learning algorithms, time-domain neural networks (TDNN), and multi-class convolutional neural networks (CNN), will be implemented. Finally, an expert system will be developed based on the best model achieved in this research study.