1 Introduction

Indoor air pollution accounts for the majority of air pollution in Africa (Piqueras and Vizenor 2016; Dida et al. 2022). Carbon monoxide, carbon dioxide, and other primary gaseous pollutants are released into the environment when charcoal is burned in an uncontrolled manner as fuel for purposes such as food preparation (Abera et al. 2021; Obanya et al. 2018). Today, vehicle transportation is also recognized as a significant contributor to air pollution in newly developing megacities (Nieuwenhuijsen 2016; Singh et al. 2021; Haslina et al. 2022), contributing to both immediate health effects and future climate change (Caserini et al. 2013; Takuchev et al. 2014).

The African Union (AU) conference headquarters are located in Addis Ababa, the capital of Ethiopia and one of the continent’s emerging megacities (Habtamu 2020). Urbanization and industrialization are growing alarmingly quickly (Addis Ababa urban planning report, 2018), and as a result people migrate from rural areas to the city. This has produced a strong annual population growth rate, with an estimated 4.79 million people in the 2017 census (UN 2017), and has directly raised the demand on the city’s transportation system (Kume et al. 2010). The total number of vehicles was 8264 in 2005, and by 2015 there were three times as many automobiles as in 2005 owing to significant investment in the transportation and other service industries (Wondifraw 2019). According to the 2016 Addis Ababa Transport Authority report, the city’s annual growth rate for imported vehicles is 9.88%. Air pollution caused by traffic is now a serious issue in Addis Ababa (Fekadu 2017). Other obstacles to reducing traffic-related air pollution include increased long-distance travel (Kumie et al. 2021), improper maintenance of older vehicles, and slowly expanding road networks and design (Kebede et al. 2022).

According to the United Nations (2017), Megenagna is one of Addis Ababa’s main gathering places, where most people congregate to exchange goods and services, travel in various directions, and communicate. Because dense traffic and large crowds share the same space, people there are highly exposed to traffic emissions. The extreme traffic congestion in the streets and in Megenagna Square makes traffic-related air pollution a serious problem. It is critical to assess receptor exposure because it is difficult to accurately predict how pollutants will disperse in this area given the current traffic situation. Since air pollutants are widely dispersed contaminants, they are challenging to analyze numerically and to predict precisely (Mehrdad 2013).

The basic input variables for atmospheric dispersion models (ADMs) include meteorological data, emission rates, topography, building data, and elevation data (Sunil et al. 2015). The most popular model is the USEPA-approved American Environmental Regulatory Model (AERMOD), which assumes steady-state conditions and continuous emissions and accounts for environmental conditions (US EPA 2015). AERMOD can also represent line sources, with pollutant mixing treated in line, line-area, and line-volume configurations (Wei et al. 2016). This dispersion model has been employed to determine emission rates (Milando and Batterman 2018), and the line source formulation was developed by Wang et al. (2008) (Equation (1)).

$$C=\frac{Q}{2\sqrt{2\pi}\,\sigma_z u_e}\left[\exp\left\{-\frac{1}{2}\left(\frac{z-h_0}{\sigma_z}\right)^{2}\right\}+\exp\left\{-\frac{1}{2}\left(\frac{z+h_0}{\sigma_z}\right)^{2}\right\}\right]\times\left[\operatorname{erf}\left(\frac{\sin\theta\left(\frac{L}{2}-y\right)-x\cos\theta}{\sqrt{2}\,\sigma_y}\right)+\operatorname{erf}\left(\frac{\sin\theta\left(\frac{L}{2}+y\right)+x\cos\theta}{\sqrt{2}\,\sigma_y}\right)\right]$$
(1)

where C is the pollutant concentration (μg/m3), Q is the emission rate per unit length (g/m·s), ue is the effective wind speed (m/s), h0 is the plume center height at a distance x from the road (m), and z is the receptor’s height above the ground (m). The terms σy and σz are the standard deviations of the pollutant distribution in the crosswind horizontal and vertical directions (m). L is the length of the source (m), θ is the angle between the ambient wind and the road, erf is the error function, x is the receptor’s distance from the source line (m), and y is the receptor’s distance along the source line from the centerline of the road (m). The total emission of the pollutant (Ep) for each source is estimated by the model as Equation (2) (Kumar et al. 2016).

$$E_p=\sum_{i=0}^{n}\left(e_{ip}(V)\times N_i\right)L,$$
(2)

where eip(V) is the emission factor for pollutant p and vehicle class i as a function of average speed V; Ni is the number of vehicles of class i; and L is the length of the road segment. The emission rate is also influenced by the average speed, the mass of the vehicle, and the technology used to reduce emissions.
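As a rough illustration of how Equations (1) and (2) can be evaluated, the sketch below codes both directly. It reads the two erf arguments in Equation (1) as plain parenthesized terms, takes the dispersion parameters σy and σz as given inputs (the text does not say how they are obtained), and uses function names, argument names, and numerical inputs that are purely hypothetical.

```python
import math


def total_emission(emission_factors, vehicle_counts, length):
    """Equation (2): E_p = sum_i(e_ip(V) * N_i) * L.

    emission_factors[i] is e_ip(V) for vehicle class i, already evaluated
    at the average speed V; vehicle_counts[i] is N_i; length is L.
    """
    return sum(e * n for e, n in zip(emission_factors, vehicle_counts)) * length


def line_source_concentration(q, u_e, sigma_y, sigma_z, x, y, z, h0, length, theta):
    """Equation (1) for a finite line source.

    q is the emission rate per unit length, u_e the effective wind speed,
    theta the wind-road angle in radians; all lengths share one unit.
    """
    # Vertical term: direct plume plus its reflection at the ground.
    vertical = (math.exp(-0.5 * ((z - h0) / sigma_z) ** 2)
                + math.exp(-0.5 * ((z + h0) / sigma_z) ** 2))
    # Lateral term: contribution of the two half-lengths of the source.
    denom = math.sqrt(2.0) * sigma_y
    lateral = (math.erf((math.sin(theta) * (length / 2.0 - y) - x * math.cos(theta)) / denom)
               + math.erf((math.sin(theta) * (length / 2.0 + y) + x * math.cos(theta)) / denom))
    return q / (2.0 * math.sqrt(2.0 * math.pi) * sigma_z * u_e) * vertical * lateral


# Hypothetical inputs, chosen only to exercise the functions.
e_p = total_emission([0.1, 0.5], [1000, 100], 1.5)
c = line_source_concentration(q=1e-3, u_e=2.0, sigma_y=10.0, sigma_z=5.0,
                              x=20.0, y=0.0, z=1.5, h0=0.5,
                              length=500.0, theta=math.pi / 2)
```

With Q in g/(m·s) and lengths in meters, the returned concentration is in g/m3 and would need multiplying by 10^6 to express it in μg/m3. For a very long source, a perpendicular wind, and a ground-level receptor, the expression reduces to the familiar infinite line source form 2Q/(√(2π) σz ue), which is a convenient sanity check.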

It is crucial to aggregate vehicles properly by class and category to obtain an accurate estimate of air pollutant emissions (Caserini et al. 2013). Sensitivity tests and data validation are used to improve the simulated data (Kerr et al. 2014). Dispersion modeling can also be used to model and validate concentrations under various scenarios of a second independent variable (Bachtiar et al. 2020). Monitoring and forecasting traffic-related pollution is crucial for addressing Addis Ababa’s poor air quality and for informing the regulatory authority. Accordingly, this study used AERMOD to assess the dispersion of traffic-related air pollution in Megenagna, Addis Ababa, over a 1.5 km by 1.5 km region.

2 Methods

2.1 Description of the study areas

This study was conducted in Addis Ababa, the capital of Ethiopia, which is located at 8° 55′ N, 38° 45′ E. The city extends about 20 km from north to south and about 25 km from east to west, between longitudes 38° 43′ and 38° 50′ E and latitudes 8° 56′ and 9° 05′ N. The Megenagna area, which lies halfway between Bole and Yeka, was chosen as the study location. The Yeka and Bole sub-cities have total populations of 424,217 and 378,104 and densities of 5165.1 and 3190.7 people per square kilometer, respectively (Aklilu and Necha 2018).

2.2 Site selection

The sampling points were chosen along the six road lines of Megenagna shown in Fig. 1a. The Megenagna station has two main rings (R1 and R2), and samples were collected in two ways: at a root sample site near the bus station areas and along the road lines outside the root. The first square (R1), situated at the lower elevation (Kenenisa Avenue), contains the entrances of Bole-Goro road (Ring road), CMC road, Meskel square road, and Kenenisa road. The second square (R2) is situated above the bridge entrance of Bole-Goro road (Ring road) and Asmara road, at the upper station of Megenagna, which connects Sidst Kilo road (Fikremariam Aba Techan str.) and Sidst Kilo road (Kotebe road). Further line source samples were collected at each intersection of the roads leading away from Megenagna’s root. Images of the bus station’s traffic flow are shown in Fig. 1b. There were 43 sampling locations: 16 in the bus terminal and the remaining 27 along the road lines, placed with regard to the concentration of sensitive points along each road.

Fig. 1 Sampling station. a Root sample site and b bus station (source: Google Earth Pro 2.3)

Table 1 lists the 27 sampling locations on the six road lines, which are, in order, road 1 (Six-Kilo Road Line), road 2 (Kenenisa Avenue Road Line), road 3 (Meskel Square Road Line), road 4 (Bole-Goro Road Line), road 5 (CMC Road Line), and road 6 (Kotebe Road Line).

Table 1 Sample points, the distance between sample points, and their respective coordinates

2.3 Sample collection

Over the study period, both qualitative and quantitative data were gathered under various scenarios. The sampling period ran from January 1 to February 28, 2021, which falls within Ethiopia’s dry season. The first month’s samples (January) were used for model simulation and calibration, while the second month’s samples (February) were used for model validation by cross-monitoring at the same sample points. Three peak periods (morning, noon, and afternoon) were chosen for sample measurement. To cover all of the chosen sample points during sampling, measurements were made both on foot and by vehicle. Samples were taken at an average street-stand height of 1.5 m because that is the height advised for representing human respiratory intake.

PM2.5 and PM10 concentrations were measured in the field using Air-test portable sensing equipment (Model-CW-HAT2005). The portable Air-test sensor measures the PM2.5 and PM10 concentrations in the air for 60 s at a 500 mL flow rate using the accepted methodology (Lin et al. 2015). Gaseous pollutants (SO2 and NO2) were measured using an Aeroqual monitor (Model Series 500 (2016)), with the different sensors fitted in the same port. All vehicles that passed the sample points during the busiest time of the day were counted and divided into heavy-duty, medium-duty, and light-duty categories. Overall, the Air-test instrument (Model-CW-HAT2005) and the Aeroqual monitor are both effective tools for measuring particulate and gaseous contaminants (Table 2).

Table 2 Detection threshold, the data range, and some characteristics of the Air-test instrument (Model-CW-HAT2005) and Aeroqual (McKercher et al. 2017)

2.4 Data analysis and model setup

2.4.1 Data analysis

The raw data were compiled and summarized in an MS Excel sheet, and the average concentration of each parameter was used as input to the AERMOD software. The average daily traffic (ADT) and average monthly traffic (AMT) were calculated and examined, together with the traffic flow at the sample points, wind speed, wind direction, surface temperature, and meteorological temperature. Results were compared to the recommended 24-h air quality values established by the Ethiopian Environmental Protection Commission and the WHO.

2.4.2 Meteorological data preparation

The meteorological agency supplied the meteorological data, which included temperature, relative humidity, cloud cover, precipitation, wind speed, and wind direction. In addition, a site-based Air-test sensor measured temperature and relative humidity. Meteorological data were used as independent variables to assess their impact on the level of pollution at the study sites. Since meteorological factors are the primary element determining the dispersion of emissions in the ambient environment, the impact of wind speed and wind direction was emphasized in assessing the condition and extent of air pollution in Megenagna (Bachtiar et al. 2020). The wind rose was calculated using WRPLOT View 9.8, which resolves the wind flow in the study area into eight directions (north, south, east, west, and the intermediate directions between them). The meteorological bureau provided data on the daytime variation in wind speed and wind direction over the 2 months.

The wind rose in Fig. 2a shows the frequency of wind speed and direction in the Megenagna region; the color legend indicates how often each wind speed class occurs in the indicated direction. Most winds blew in an east-to-west direction, at an angle of 248°. At the Megenagna bus station, the wind blew predominantly toward the west, with a percentage contribution of 41%. Both maximum wind speeds of 11.1 m/s or more and minimum speeds of 0.5 m/s were frequently recorded.
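The eight-direction frequency count that underlies a wind rose can be sketched as follows. This is not how WRPLOT View works internally; it is a minimal illustration that assumes directions are given in degrees clockwise from north and uses hypothetical function names and sample values.

```python
from collections import Counter

# Eight 45-degree sectors, each centered on its compass direction.
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def sector_of(direction_deg):
    """Map a wind direction in degrees (0 = north, clockwise) to a sector."""
    index = int(((direction_deg % 360) + 22.5) // 45) % 8
    return SECTORS[index]


def wind_rose_frequencies(directions_deg):
    """Return the percentage of records falling in each of the eight sectors."""
    counts = Counter(sector_of(d) for d in directions_deg)
    total = len(directions_deg)
    return {s: 100.0 * counts.get(s, 0) / total for s in SECTORS}


# Hypothetical hourly wind directions, for illustration only.
frequencies = wind_rose_frequencies([248, 250, 90, 10])
```

Under this binning, a direction of 248° falls just inside the west sector, which spans 247.5° to 292.5°.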

Fig. 2 a Wind rose in January and February and b dominant wind rose at the sample location

In Fig. 2b, buildings were classified according to their height b as high (≥ 30 m), medium (15 m ≤ b < 30 m), or low (< 15 m). The length of the buffer zone from the major road was estimated. The elevations of the sample points and the residential population densities of the Yeka and Bole sub-cities were used as inputs. Using WRPLOT software on top of Google Earth Pro, the primary meteorological variables of wind speed and wind direction were added as independent variables.

2.4.3 Model setup

An observational study of air pollution concentration was replicated using AERMOD software. AERMOD’s input system has five pathways: the control pathway, source pathway, receptor pathway, meteorological pathway, and output pathway. The steps for operating AERMOD were sequential. A new project was created with its focal point at (47,723.20, 997,169.40), using the World Geodetic System 1984 projected coordinate system in UTM zone 37N and a 1.5 km by 1.5 km domain. In the first step of the source pathway, a template of the parameters from the Excel sheet was created for the line source ID of each chosen station. In the second step, average building height, width, and cross-flow were taken into consideration. Maximum hourly background concentration was assessed at the starting and finishing numbers for further validation purposes. Except for the reference point in the root, this pathway’s base elevation was 0 m, and road width was measured in increments of 20 to 30 m. With 43 sampling points and a user-specified average human height of 1.5 m, the number of receptors on the receptor pathway matched the number allocated by default. Starting at the root of Megenagna, the distance between receptors was estimated to be 25.5 m. The terrain options chosen were flat and elevated, with a uniform Cartesian receptor grid.

On the meteorological pathway, AERMET software was used to prepare the meteorological input for AERMOD. The raw meteorological data were extracted from the archived data file and organized, and 12-h data were retrieved from the 24-h data file. All data files required as AERMET input were prepared. Wind speed, wind direction, relative humidity, precipitation, dry-bulb temperature, and cloud cover were the primary variables that AERMET processed. The output of the AERMET preprocessing was in surface file (SFC) and profile file (PFL) formats.
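To make the receptor layout more concrete, the sketch below generates a uniform Cartesian receptor grid from the figures quoted above (focal point, 1.5 km extent, 25.5 m spacing, 1.5 m receptor height). How the 43 measured receptors relate to such a grid is not stated in the text, so this is only an assumed illustration, and the function name and the centering rule are hypothetical.

```python
def cartesian_receptors(center_x, center_y, extent_m, spacing_m, height_m):
    """Build a square, uniformly spaced receptor grid centered on a point.

    Returns a list of (x, y, z) tuples, where z is the receptor height
    above ground. The grid covers as much of extent_m as a whole number
    of spacing_m steps allows.
    """
    points_per_axis = int(extent_m // spacing_m) + 1
    span = (points_per_axis - 1) * spacing_m
    x0 = center_x - span / 2.0
    y0 = center_y - span / 2.0
    return [(x0 + i * spacing_m, y0 + j * spacing_m, height_m)
            for j in range(points_per_axis)
            for i in range(points_per_axis)]


# Values copied from the text; their use as a full grid is an assumption.
receptors = cartesian_receptors(47723.20, 997169.40, 1500.0, 25.5, 1.5)
```

With these inputs the grid has 59 points per axis, because 58 steps of 25.5 m (1479 m) is the largest whole number of steps that fits in 1500 m.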

The output of the AERMET program contains, at one or more levels, the profile of wind speed, wind direction, temperature, and the standard deviation of the fluctuating component of the wind. The surface data file (SFC) provides the boundary layer scaling parameters and the reference heights for wind and temperature. The contour profile automatically generated by AERMOD was examined. On the output pathway, the entire procedure from the projection to the meteorological data was double-checked. Finally, the model was run, and the run was verified by checking the error and warning messages it reported.

In the AERMOD software environment, the model output was produced for three time intervals at the chosen control hours (rush time, morning, and afternoon). The AERMOD plots were used to predict particulate matter and gaseous pollutants from vehicle emissions for averaging periods of 3, 6, and 12 h. These periods were chosen because they covered the traffic peak. Pollutant concentration and dispersion are read from the contour lines; the higher the contour line, the higher the concentration of pollutants. The first scenario represents the period from 1:00 to 3:00 AM local time, and the second scenario represents the 6-h average from 1:00 to 6:00 AM. The way the input data are processed to produce an outcome is shown in Fig. 3. Concentrations were simulated and displayed according to a color-coded legend. According to the contour surface simulated in the AERMOD environment, there was a considerable amount of pollution.
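The 3-, 6-, and 12-h averaging periods amount to averaging the first N hourly concentrations of the simulated period. A minimal sketch is given below; the hourly values are hypothetical and the function name is an assumption, not part of AERMOD.

```python
def first_period_average(hourly_concentrations, hours):
    """Average the first `hours` values of an hourly concentration series."""
    if hours <= 0 or hours > len(hourly_concentrations):
        raise ValueError("hours must be between 1 and the series length")
    return sum(hourly_concentrations[:hours]) / hours


# Hypothetical hourly concentrations (ug/m3) at one receptor.
hourly = [30, 36, 39, 20, 15, 10, 8, 8, 8, 8, 8, 8]
averages = {h: first_period_average(hourly, h) for h in (3, 6, 12)}
```

In this made-up series the peak falls in the first hours, so the average decreases as the averaging period lengthens, which is the pattern expected when the averaging window starts at the traffic peak.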

Fig. 3 Model setup and data analysis

2.5 Calibration and validation of the model

Model evaluation is required to raise the level of trust in the results before the research findings are reported to decision-makers addressing the socio-economic implications of air pollution. Model predictions must therefore be examined by statistical analysis. In calibration, the sensitivity of the output changes as the independent variables change. The predicted concentration (the dependent variable) was compared with the observed value. The model was then tested using the final set of simulated data against the second-month data. One variable was checked at a time, and the first simulated value was compared to the collected data. The predicted value was simulated in various ways by altering the independent meteorological variables while keeping the other input variables fixed. Wind speed and wind direction, rather than relative humidity and precipitation, were chosen for model testing since they are the key factors affecting the dispersion of pollutants during the dry season (US EPA 2003).

Dispersion modeling has been used to model and validate PM10 concentration with two sets of collected samples (Bachtiar et al. 2020). In that work, the first sample was used to simulate PM10 at default values of 0°, 30°, and 60°, while the second sample was used to validate the pollutants. In terms of traffic volume and density, angles of 0° < ɑ < 90° gave the highest concentration of PM10, with 61% and 51%, respectively. Following this approach, the present study was validated by trial-and-error adjustment of wind speed and wind direction against the concentration outcome in the January and February data.

3 Results and discussion

3.1 Dispersion of particulate matter (PM)

The contour lines show the dispersion and concentration of contaminants; the higher the contour line, the more pollutants are present. The highest simulated values were displayed in red with a high contour line, while the lowest concentrations were displayed in violet with a lower contour line. To scale the result in the AERMOD environment, the emission rate on the output pathway was multiplied by a user-defined factor of 10. The pink color was predicted to have the lowest value far from the Megenagna root, while red was predicted to have the highest value close to it. The AERMOD simulation was run to predict the concentration differences across the peak hours. The first 3-h average of PM2.5 was predicted to have a maximum concentration of 34.8 μg/m3 and a minimum concentration of μg/m3, as illustrated in Fig. 4a. The highest and lowest predictions for the first 6-h average PM2.5 concentration were 34.9 μg/m3 and 0.7 μg/m3, respectively.

Fig. 4 Prediction of PM in the first 3-h average: a PM2.5 and b PM10

The predicted maximum and minimum PM10 concentrations for the first 3 h were 68.8 μg/m3 and 0.7 μg/m3, respectively (Fig. 4b). According to Table 3, the maximum and minimum predicted PM10 concentrations for the 6-h average were 63.3 and 1.0 μg/m3, respectively, while for the 12-h prediction, the maximum predicted concentration was 40.6 μg/m3 and the lowest was 0.4 μg/m3. In a study of PM10 dispersion model prediction carried out in a high-rise metropolis, the maximum predicted values for the first 24-h average from area, line, and point sources were 339 μg/m3, 24 μg/m3, and 11 μg/m3, respectively (Onat 2016). That line source model’s predicted concentration of PM10 was lower than the present results, which could be a result of differences in traffic intensity or in time. A comparison of dispersion models of PM2.5 near roads was conducted for cities in India and the UK (Gulia et al. 2015). Delhi, Chennai, and Newcastle were predicted to have first-hour average PM2.5 concentrations of 114.14 μg/m3, 87.10 μg/m3, and 14.26 μg/m3, respectively. Compared with the current findings, Delhi and Chennai had higher predicted concentrations, while Newcastle had a lower concentration.

Table 3 Prediction of PM in different scenarios

The highest predicted concentration of PM was observed in the first 3-h average scenario. The lowest PM concentration was predicted along the lower, violet contour lines away from the bus station, while the highest PM concentration was generally predicted in the middle of Megenagna with a high contour line. For the first 3-h average, the largest predicted amount of pollutants dispersed toward the southwest, upwind. This includes emissions from a wide range of nearby traffic. In another study, the line source model underestimated concentrations under oblique and crosswind conditions while overestimating them under parallel wind (Goyal and Dhir 2015).

3.2 Prediction of gaseous pollutants

The dispersion of gaseous pollutant concentration was forecast together with its spatiotemporal variation. The vehicle count showed the highest traffic flow during the day’s busiest time, notably early in the morning. Based on the source’s emission inventory, the model simulated the concentration of gaseous pollutants with peak-hour emissions in the first 3-h, 6-h, and 12-h scenarios. In the AERMOD environment, the first 3-, 6-, and 12-h averages showed a considerable SO2 concentration. As seen in red at Megenagna’s center in Fig. 5a, the maximum SO2 prediction for the first 3-h average was 973.7 μg/m3. Due to traffic density and prolonged idling times, pollutant dispersion is lower in the center. Because traffic signals and crossroads at this station affect the concentration level, the pink band, between 700 and 900 μg/m3, was simulated next to the Lam Hotel on the Meskel square road line. In addition to the high traffic density on the CMC road line, a large concentration of construction and maintenance equipment is also noticeable there (Fig. 5).

Fig. 5 First 3-h average predicted gaseous pollutants for a SO2 and b NO2

The greatest NO2 prediction was simulated in the first 3-h average, as depicted in Fig. 5b. For the first 6-h average, the maximum and minimum NO2 concentrations were predicted to be 78.8 and 0.8 μg/m3, respectively, which indicates a moderate level of pollution. An AERMOD line source model of vehicular NOx emission was developed in a recent study (Amoatey et al. 2020). In that study, the first 3-h average had the highest simulated NO2 concentration among the 24-h predictions, at 188.22 μg/m3. Due to Muscat’s high traffic density, this number was larger than that found in the current study. The accuracy of the gaseous pollutant prediction was assessed using the correlation coefficient between observations and predictions, and a fit of more than 88% was obtained. The correlation coefficient between predicted and observed NO2 values was R2 = 0.904, which demonstrates that the simulated values fitted well. A review of multiple studies reported that in Sydney, the regression coefficients for monthly and hourly SO2 values were R2 = 0.57 and R2 = 0.34, respectively (Gibson 2013). The regression coefficient of SO2 in that previous study thus showed a poorer fit than the results of the current study, which may be caused by differences in meteorological and spatial conditions. When a general finite line source model was established, the correlation coefficients for SO2 and NO2 ranged from 0.73 to 0.96 and from 0.73 to 0.95, respectively (Kumar et al. 2016). The upper end of these ranges indicates a fit as good as that of the current study, whereas the lower end fits the data less well than the present result. This can be due to differences in meteorological and geodemographic conditions.

3.2.1 Sensitivity test and calibration of AERMOD

In calibration, changing the independent variables (the sensitivity test) changes the dependent variable. The observed value was compared with the dependent variable (the predicted concentration). One variable was checked at a time, and the first simulated value was compared to the collected data. The predicted value was simulated in various ways by altering the independent meteorological variables while keeping the other input variables fixed. Wind speed and wind direction were chosen for model testing because they have a greater impact on model predictions than relative humidity and precipitation during the dry season. As shown in Table 4, the five categories A, B, C, D, and E each had a distinct speed value; thus, the AERMOD software kept the default value for each category constant.

Table 4 Calibration test of the AERMOD default value

The calibration process involved five steps, one per category. The main prediction equation was built on changes in wind speed or wind direction, and the predicted value was simulated as either a maximum or a minimum value. As a function of wind speed and wind direction, the objective function (concentration) varied across the calibration sensitivity changes according to Equation (3). The impact at the receptors in the monitoring area was found to be sensitive to wind direction; if the wind was blowing from the south or west, the impact on the receptors became more pronounced in a different geographic location when the wind direction changed (Bachtiar et al. 2020).

$$Z_1\left(X_i,Y_i\right)=Q\left(aX_i+bY_i\right),$$
(3)

where Z1 is the predicted value due to the change in the independent variables Xi and Yi (wind speed and wind direction, respectively), Q is the input value of the emission rate and observed value, and a and b are constants. The maximum and minimum values of pollutants were simulated under different wind speed and wind direction variations, as presented in Table 5. After calibration, the regression values for SO2 and NO2 were R2 = 0.91 and R2 = 0.96, respectively, while those for PM2.5 and PM10 were R2 = 0.95 and R2 = 0.93, respectively (Table 5).

Table 5 Model calibration using different scenarios
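The calibration objective in Equation (3) and the R2 used to judge each scenario can be sketched as below. The sketch assumes that R2 is the coefficient of determination between observed and predicted concentrations (the text does not define it explicitly); the function names and the example numbers are hypothetical.

```python
def predicted_concentration(q, wind_speed, wind_direction, a, b):
    """Equation (3): Z1(X, Y) = Q * (a * X + b * Y)."""
    return q * (a * wind_speed + b * wind_direction)


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical scenario: emission rate 2.0, wind speed 1.0 m/s, direction 45 degrees.
z1 = predicted_concentration(2.0, 1.0, 45.0, a=0.5, b=0.1)
fit = r_squared([10.0, 20.0, 30.0], [11.0, 19.0, 31.0])
```

In a calibration loop of this kind, each scenario would change the wind speed or wind direction, recompute the predictions, and keep the scenario whose R2 against the first-month observations is highest.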

3.2.2 Validation

The model was validated using the 2 months of collected data. The data from the first month were used for model calibration and prediction, while the data from the second month, collected by self-monitoring at the same sampling locations, were used for validation. The second month of observed data was used as input to the simulation model, with the independent variables of the simulation left unchanged. The objective function of the simulation model was used for validation, with the dependent variable expressed as a function of the self-monitored concentration for the second month, as shown in Equation (4).

$$Z_2\left(Q_i\right)=aX\left(C_i\right)+bY\left(Q_i\right)+\varepsilon,$$
(4)

where Z2 is the output concentration when the second-month emission rate is imported while the independent variables are held at the constant optimum values (scenario 5) of the calibrated model; Qi is the emission rate of pollutants; Ci is the cross-monitored concentration of the second month; X and Y are the independent variables with constant coefficients a and b; and ε is the error.

The maximum concentrations of PM2.5 and PM10 were 36.6 and 66.6 μg/m3, respectively, with regression fits of 98% and 90%. As shown in Fig. 6, the validated PM2.5 fitted the monitoring data most closely, while PM10 was somewhat lower than the monitored value. For PM2.5 and PM10, the estimated error was 2% (0.02) and 10% (0.1), respectively. Overall, around 90% of the expected validation of PM was achieved. In another study, dispersion modeling was used to model and validate PM10 concentration under various scenarios of a second independent variable (Bachtiar et al. 2020). In that study, the analysis was done twice on the collected samples: the first sample was used to simulate PM10 at default values of 0°, 30°, and 60°, while the second sample was used to validate pollutants. The largest concentration of PM10 was found at an angle of 0° < ɑ < 90°, with 61% and 51%, respectively, of the total traffic volume.

Fig. 6 Validations of data of a PM2.5, b PM10, c SO2, and d NO2

For gaseous pollutants, around 95% of the calibrated validation against cross-monitoring was obtained. SO2 and NO2 fitted at 98% and 96%, with errors of 2% (0.02) and 4% (0.04), respectively. Using regression analysis, with the first month used for simulation modeling and the second month for validation, the gaseous pollutant model was validated with a good fit.

Statistical model evaluation depends essentially on the sampling points. Correlation, root mean square error (RMSE), bias, mean absolute error (MAE), mean bias error (MBE), and mean square error (MSE) were used to check and validate the emission-related variables, namely particulate matter and gaseous contaminants. Cross-validation of observed and predicted data is a technique used to assess a predictive model’s performance on unobserved data. It entails repeatedly dividing the data into two sets, training and testing. The model is trained on the training data and then used to predict the test data. This process is performed several times with different data splits, and the model’s average performance across all folds is calculated. The correlation between observed and predicted values was well fitted (between 0 and 1). In Table 6, RMSE values between 0 and 20% show that the model can predict the data relatively accurately; the lower the RMSE, the better the fit of the model. The higher RMSE for NO2 (1.042) implies that the cross-validation was less accurate for this pollutant, which may reflect the sensitivity of the model to NO2. The bias results were also good, although PM2.5 was under-predicted.
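The fold-based procedure described above can be sketched in a few lines. Because the text does not specify which model is refitted on each training split, a simple least-squares line relating a predictor to the observed concentration is used here purely as a stand-in; the function names, the interleaved fold assignment, and the data are all assumptions.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k):
    """Average test RMSE over k folds (fold = index modulo k)."""
    n = len(xs)
    fold_scores = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Train on the remaining folds, then predict the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        fold_scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(fold_scores) / k


# Hypothetical data: predictor values and noisy observations.
xs = list(range(10))
ys = [2 * x + 1 + (-1) ** x for x in xs]
score = k_fold_rmse(xs, ys, k=5)
```

A lower averaged RMSE indicates that the fitted relationship generalizes better to the held-out points.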

Table 6 Cross-validation of observed and predicted data

Generally, RMSE, bias, MAE, MBE, and MSE indicate a good fit between the observed and predicted data, except for NO2, which had higher RMSE and MSE values of 1.042 and 1.086, respectively.
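For reference, the error statistics named above can be computed as in the following sketch. It assumes that errors are taken as predicted minus observed (so a negative MBE means under-prediction); the function name and the sample values are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return MBE, MAE, MSE, and RMSE for paired observations and predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    return {
        "MBE": sum(errors) / n,                  # signed mean error (bias)
        "MAE": sum(abs(e) for e in errors) / n,  # mean absolute error
        "MSE": mse,                              # mean square error
        "RMSE": math.sqrt(mse),                  # root mean square error
    }


# Hypothetical observed and predicted concentrations (ug/m3).
metrics = error_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

Since RMSE is the square root of MSE, the two values reported for NO2 are mutually consistent: the square root of 1.086 is approximately 1.042.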

4 Conclusion

The main aim of this study was the assessment and forecasting of traffic-related air pollution in Megenagna, Addis Ababa. The best calibration was obtained with a wind speed of 1.0 m/s at 45°, with a higher probability of receptor impact. Air pollution predictions using AERMOD reached an R2 of over 75% against monitored data. When the predicted pollution levels were compared with the regulatory guideline, the concentrations of both gaseous and particulate pollutants in the Megenagna area did not exceed it, except for SO2. Because SO2 was significantly above the advised level, it may be necessary to evaluate fuel types and vehicle models. Owing to time and budget constraints, this research covered only the 2-month variation of January and February; further research with wider seasonal and geographical coverage is essential to address Addis Ababa’s air pollution issues.