1 Introduction

Geomagnetic storms (GSs) [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18] have been widely discussed over the past centuries. Generally, investigations focus on understanding GS dynamics via the solar wind parameters (SWp) (Bz, E, P, N, v, T) and the zonal geomagnetic indices (ZGi) (Dst, ap, AE), where Bz (nT) is the z component of the magnetic field B, N (1/cm3) is the proton density, T (K) is the temperature, v (km/s) is the plasma flow speed, P (nPa) is the dynamic pressure, and E (mV/m) is the electric field [17,18,19,20,21,24]. Moreover, various studies [7,8,9,10, 12, 14, 17, 18, 25,26,27,28,29,30] have proposed models relating the SWp to the ZGi. These models are mostly built with traditional methods [31,32,33,34,35,36] and occasionally with newer methods such as artificial neural network models (ANNm) [37,38,39]. The Dst (nT), ap (nT), and AE (nT) ZGi are the prominent indices in these models [24, 40,41,42]. This paper is built on the ANNm [10, 37,38,39, 43,44,45,46,47,58] that uses the back-propagation procedure of Rumelhart et al. (1986) [59]. The paper employs the scaled conjugate gradient (trainscg) training algorithm, and thirty-five neurons in the hidden layer provide the internal interaction of the model.

The study aims to model and compare two superstorms (November 20, 2003, Dst = –422 nT and November 08, 2004, Dst = –374 nT) using the ANN, with consistent results across both events. In the discussion, governed by the causality principle [60,61,62], the SWp is the cause and the ZGi is the effect. As in previous studies [8,9,10, 63], phenomena are dealt with in their physical context. In the ANN, the SWp is given as the input and the ZGi is obtained as the output. The first part of the study reviews the literature. The second part depicts the GS dynamics by considering the behavior of the variables during the events. In the third part, after the pairwise relations, hierarchical clusters, and distributions of the variables are discussed, the ANN analysis is carried out. The last part concludes the paper with a discussion of the results.

The region from the bow shock to the ionosphere is considered in the paper through ground-station data. All data are used at hourly resolution.

2 Data

The development of the two superstorms is illustrated first. Fig. 1a–b presents the 5-day (120-h) period centered on the day of each storm. The investigation employs SPEDAS to display the fluctuations of the GS variables. GSs can be classified as weak, moderate, or severe according to the Dst (nT) index.

Fig. 1 Dst (nT) index, Bz (nT), E (mV/m), P (nPa), N (1/cm3), v (km/s) SWp, ap (nT), and AE (nT) index. (a) November 20, 2003, storm; (b) November 8, 2004, storm

According to Fig. 1, at 07:00 UT on November 20, the first coronal mass ejection (CME) hits with a sudden increase in the dynamic pressure (nPa) and the proton density (1/cm3). This first CME marks the commencement of the November 2003 superstorm. Meanwhile, the turning of the Bz (nT) magnetic field from northward to southward immediately sets up the conditions for a storm. Afterwards, the Bz magnetic field reaches its peak (–50.9 nT) at 15:00 UT, and the Dst index reaches –422 nT at 20:00 UT, a 5-h response time; together they mark the heart of the storm.

According to Fig. 1, the first CME of the November 2004 superstorm hit at 15:00 UT on November 06 with a sudden increase in the dynamic pressure (from 1.11 to 5.31 nPa) and the proton density (from 5.3 to 18.0 1/cm3). The initial CME of this storm is much more abrupt than that of the November 2003 superstorm. The next day, the second CME hits at 03:00 UT. After the Bz magnetic field turned from north to south, the Dst index reached its minimum value of –374 nT at 06:00 UT on November 08.

3 Modeling

Pairwise relations in the data of the November 20, 2003, and November 08, 2004, superstorms are displayed as a Pearson correlation matrix in Table 1. The closer a coefficient in Table 1 is to ±1, the stronger the relationship. The hierarchical clustering of the two super GSs' data and the scattering of the variables are exhibited in Figs. 2a–b and 3a–b, respectively. Each branch of the dendrogram exhibits the correlation between variables.
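As an illustration of how the coefficients of such a correlation matrix can be obtained, the following is a minimal sketch in plain Python. The function names and the sample values are assumptions made for this example; they are not the storm data or the software used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict of pairwise coefficients from {name: series}."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Made-up hourly values for illustration only, NOT the storm measurements.
sample = {
    "Bz": [2.0, -5.0, -20.0, -35.0, -10.0],
    "Dst": [-10.0, -40.0, -150.0, -300.0, -200.0],
}
matrix = correlation_matrix(sample)
```

Each entry `matrix[a][b]` lies between -1 and +1, and the diagonal entries equal 1, which mirrors the reading of Table 1 described above.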

Table 1 Data correlation
Table 2 Some Correlation Coefficient (R) estimation scores
Fig. 2 Cluster of the variables. (a) November 20, 2003, storm; (b) November 8, 2004, storm

Fig. 3 Scattering of the data. (a) November 20, 2003, storm; (b) November 08, 2004, storm

As shown in Fig. 3, in the November 2003 superstorm, the variables cluster around two major centers. The T, v, Bz, and Dst variables are in one group, while the E, AE, ap, N, and P variables are in the other. In the November 2004 superstorm, the v, AE, ap, and E variables are clustered in one group, while the Dst, N, Bz, P, and T variables are clustered in the other.

When it comes to the statistical relationships in the data, the ANNm is a useful tool. The ANNm is inspired by the human brain, in which neurons interact with one another. Like regions of the brain, the ANN has layers, called the input, hidden, and output layers (Fig. 4). This complex structure learns by training with the aid of mathematical functions, especially nonlinear ones. The ANN needs no prior information or assumptions about its inputs and outputs for modeling [43].

Fig. 4 The ANN interaction frame for the estimation

This ANNm employs the equation

$$y_{ij} = \sum\limits_{k = 1}^{n} w_{kj} x_{ik} + b_{j}$$
(1)

where w is the weight vector, y is the independent variable of the activation function (the output of the weighted sum), x is the input, and b is the bias. The sigmoid transfer function [64] is f:

$$f\left( y \right) = \frac{1}{{1 + e^{ - y} }}$$
(2)

where f is the logistic function.
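Equations (1) and (2) can be read together as the forward pass of one layer. The sketch below is a minimal plain-Python illustration, assuming that k indexes the inputs and j the neurons for a single sample; the function names and the toy weights are hypothetical and are not the trained values of the study.

```python
import math


def sigmoid(y):
    """Logistic transfer function of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-y))


def layer_output(x, weights, biases):
    """Apply Eq. (1) and then Eq. (2) to one input sample.

    x       : list of n input values x_k
    weights : weights[k][j] is w_kj, linking input k to neuron j
    biases  : biases[j] is b_j
    """
    sums = [
        sum(weights[k][j] * x[k] for k in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]
    return [sigmoid(y) for y in sums]


# Hypothetical toy values: 2 inputs feeding 3 neurons.
x = [0.5, -1.0]
weights = [[0.1, 0.4, -0.2],
           [0.3, -0.5, 0.2]]
biases = [0.0, 0.1, -0.1]
activations = layer_output(x, weights, biases)
```

Every activation falls strictly between 0 and 1, which is the range of the logistic function.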

Supervised learning is a commonly utilized approach [65]. The ANN framework includes a few layers, each with a pre-defined number of neurons. The input layer is the first layer, the hidden layer [43] is the second, and the output layer is the last. One hidden layer is commonly preferred over multiple ones [66]. While the input layer consists of the independent variables (SWp: Bz, T, N, v, P, E), the output layer consists of the dependent variables (ZGi: Dst, ap, AE). The output layer uses the sigmoid transfer function. This paper employs 120 h of data in total; 84 h (70%) are used for training the ANN, 24 h (20%) for testing, and 12 h (10%) for validation.

The study utilizes the back-propagation algorithm for the estimation. This algorithm learns through feedback iteration. Each feedback iteration is a gradient descent step that updates the weights in the direction of the negative gradient of the error function. Newton’s method [67] and gradient descent are widely used as standard optimizers in back-propagation algorithms. By propagating the error backward for a fixed set of inputs, feedback training minimizes the total residual error. The study uses the scaled conjugate gradient (trainscg) training algorithm.
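The 70/20/10 division of the 120 hourly samples can be sketched as follows. The text does not state whether the hours are assigned randomly or in contiguous blocks, so the random shuffle with a fixed seed used here is an assumption, and `split_indices` is a hypothetical helper name.

```python
import random


def split_indices(n_samples, train_frac=0.70, test_frac=0.20, seed=0):
    """Split sample indices into training, test, and validation subsets.

    Assumption: samples are assigned at random (fixed seed for repeatability).
    Whatever is left after the training and test shares goes to validation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_frac)
    n_test = round(n_samples * test_frac)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation


# 120 hourly samples, as in the 5-day storm window.
train, test, validation = split_indices(120)
```

With 120 samples this yields subsets of 84, 24, and 12 hours that do not overlap.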

After the training algorithm is chosen, the number of neurons in the hidden layer must be specified. While too few neurons cause inadequate learning, too many cause memorization (overfitting) by the ANN. An appropriate number of neurons improves the generalization skills of the ANN [68, 69]. The number of hidden-layer neurons is set to 35, the value at which the mean square error (MSE) tends to become steady. The MSE is:

$${\text{MSE}} = \frac{1}{n}\sum \left( {y_{{{\text{observed}}}} - y_{{{\text{estimated}}}} } \right)^{2}$$
(3)
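Equation (3), together with the RMSE reported later for the estimated indices, can be computed as in the following minimal sketch. The function names and the three sample values are illustrative assumptions, not data from the study.

```python
import math


def mse(observed, estimated):
    """Mean square error of Eq. (3)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("series must be non-empty and equally long")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return sum(squared) / len(observed)


def rmse(observed, estimated):
    """Root mean square error, in the same unit as the data."""
    return math.sqrt(mse(observed, estimated))


# Made-up Dst-like values (nT) for illustration only.
dst_observed = [-100.0, -200.0, -300.0]
dst_estimated = [-110.0, -190.0, -300.0]
dst_mse = mse(dst_observed, dst_estimated)
dst_rmse = rmse(dst_observed, dst_estimated)
```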

The MSE values (Eq. 3) are calculated at each update (iteration, epoch), and the updates continue until the best validation performance is attained (Fig. 5). When the MSE value becomes steady, the ANN ends the iteration. As a consequence of the learning, the ANN should yield the most suitable result. The authors try to eliminate memorization during the iterations. The MSE does not rise in the validation, training, and test series after the marked iteration (Fig. 5). When the MSE of the validation, training, and test series shows no considerable memorization, the performance of the ANN reaches an acceptable level (Fig. 5a–b). Figure 5 exhibits the MSE (nT) values for the Dst (nT), ap (nT), and AE (nT) ZGi, respectively.
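One common way to stop the iterations at the best validation performance is early stopping. The sketch below is a hedged illustration of that idea only: the `patience` parameter, the function name, and the MSE curve are assumptions, and the exact stopping rule used in the study is not specified in the text.

```python
def early_stopping(validation_mse, patience=6):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation MSEs.

    Training stops once the validation MSE has not improved for `patience`
    consecutive epochs; the weights of `best_epoch` would then be kept.
    """
    if not validation_mse:
        raise ValueError("validation_mse must not be empty")
    best = 0
    for epoch in range(1, len(validation_mse)):
        if validation_mse[epoch] < validation_mse[best]:
            best = epoch
        elif epoch - best >= patience:
            return epoch, best
    return len(validation_mse) - 1, best


# Made-up validation curve: it improves, then starts to rise (memorization).
curve = [9.0, 5.0, 3.0, 2.5, 2.6, 2.7, 2.9, 3.2]
stop_epoch, best_epoch = early_stopping(curve, patience=3)
```

For this made-up curve the minimum is at epoch 3, and training stops three epochs later, at epoch 6.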

Fig. 5 From top to bottom, the MSE (nT) of the ANNm of the Dst, ap, and AE. (a) November 20, 2003, storm; (b) November 08, 2004, storm

Significant investigations have discussed the ANNm estimation of the Dst, the Kp (or ap), and the AE indices; some of them are listed in Table 2.

Figures 6a–b and 7a–b present the results by showing the three monitored and estimated ZGi. The agreement between the outputs and the targets can be observed in their R and RMSE (nT) values (Fig. 6a–b).

Fig. 6 The estimated and monitored Dst, ap, and AE ZGi and their regressions, respectively. (a) November 20, 2003, storm; (b) November 08, 2004, storm

Fig. 7 The estimated and monitored Dst, ap, and AE indices with absolute errors. (a) November 20, 2003, storm; (b) November 08, 2004, storm

For the November 20, 2003, storm, the correlation coefficients (R) of the estimated Dst, ap, and AE ZGi were 96.7, 98.2, and 98.3%, respectively (Fig. 6a).

For the November 8, 2004, storm, the correlation coefficients (R) of the estimated Dst, ap, and AE ZGi were 98.4, 98.8, and 98.2%, respectively (Fig. 6b).

The models for predicting the ZGi during the two superstorms perform similarly. The ANNm shows not only a good fit between the monitored and estimated ZGi outputs but also the consistency of the results.

Figure 7a–b shows the observed and estimated ZGi values with their average absolute errors. The absolute error rates of the estimated ZGi relative to the monitored ones are given by \({\text{Error}} = \left| \frac{{\text{Dst}}_{\text{est}} - {\text{Dst}}}{{\text{Dst}}} \right|\), \({\text{Error}} = \frac{\left| {\text{ap}}_{\text{est}} - {\text{ap}} \right|}{{\text{ap}}}\), and \({\text{Error}} = \frac{\left| {\text{AE}}_{\text{est}} - {\text{AE}} \right|}{{\text{AE}}}\), where \({\text{Dst}}_{\text{est}}\), \({\text{ap}}_{\text{est}}\), and \({\text{AE}}_{\text{est}}\) are the estimated Dst, ap, and AE index values, respectively. The estimated Dst (nT), ap (nT), and AE (nT) indices of the two super GSs are plotted in Fig. 7a–b. The low absolute error rates of the ANNm can be immediately recognized.
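The error rates defined above can be evaluated hour by hour and then averaged, as in the following minimal sketch. Skipping hours whose observed value is zero is an assumption made here to avoid division by zero, and the sample values are invented for illustration.

```python
def relative_errors(observed, estimated):
    """Hourly |estimated - observed| / |observed|.

    Assumption: hours with an observed value of zero are skipped,
    because the ratio is undefined there.
    """
    return [abs((e - o) / o) for o, e in zip(observed, estimated) if o != 0]


def mean_relative_error(observed, estimated):
    """Average of the hourly error rates."""
    errors = relative_errors(observed, estimated)
    if not errors:
        raise ValueError("no non-zero observed values to compare against")
    return sum(errors) / len(errors)


# Made-up Dst-like values (nT) for illustration only.
dst_observed = [-100.0, -200.0, -400.0]
dst_estimated = [-101.0, -198.0, -400.0]
hourly = relative_errors(dst_observed, dst_estimated)
average = mean_relative_error(dst_observed, dst_estimated)
```

Taking the absolute value of the whole ratio keeps the error positive for the negative Dst values as well as for the non-negative ap and AE values.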

Absolute error rates in Fig. 7a–b, together with their variance values:

           November 2003 storm          November 2004 storm
Index      Error            Variance    Error            Variance
Dst (nT)   0.395 (0.50%)    0.0281      0.340 (0.34%)    0.274
ap (nT)    0.228 (0.43%)    0.046       0.405 (0.43%)    0.488
AE (nT)    0.274 (0.06%)    0.146       0.243 (0.04%)    0.146

Fig. 7a–b presents the average errors of the estimated ZGi values for the two superstorms, assessed against data obtained from NASA.

The relationship [73] of the independent and dependent variables in the ANNm may be defined by:

$${\text{\% Effect}} = 100 \cdot \left( {1 - \frac{{R_{n} }}{{R_{{{\text{diff}}}} }}} \right)$$
(4)

Equation (4) is applied by omitting each input variable from the model in turn. The correlation coefficient governs Eq. (4): Rn is the correlation coefficient obtained when the related input is omitted, and Rdiff is the baseline correlation coefficient between the estimated and observed data. The influence of each variable on the ANNm can be observed in Table 3.
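Equation (4) amounts to a single arithmetic step per omitted input. In the sketch below, the baseline value 0.967 is the R reported for the Dst of the 2003 storm, while the value 0.839 for the model without Bz is a hypothetical number chosen only to illustrate the calculation; `percent_effect` is likewise an assumed name.

```python
def percent_effect(r_without_input, r_baseline):
    """Percentage effect of Eq. (4): 100 * (1 - Rn / Rdiff)."""
    if r_baseline == 0:
        raise ValueError("baseline correlation must be non-zero")
    return 100.0 * (1.0 - r_without_input / r_baseline)


r_baseline = 0.967    # R of the full model (Dst, 2003 storm)
r_without_bz = 0.839  # hypothetical R after omitting Bz, for illustration
effect = percent_effect(r_without_bz, r_baseline)
```

An effect of 0% means that omitting the input leaves R unchanged, and larger values indicate that the input contributes more to the estimation.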

Table 3 The ANN estimation performance effect of each variable

Here, Bz (nT) is the z component of the magnetic field B, N (1/cm3) is the proton density, v (km/s) is the flow speed, and P (nPa) is the dynamic pressure.

November 20, 2003, storm: According to Table 3, in the prediction of the Dst (nT) ZGi, the Bz (nT), N (1/cm3), and v (km/s) have the maximum effect. When these parameters are neglected, the R-value declines by 13.24, 12.20, and 11.99%, respectively. The R-value is also moderately affected, by 7.45%, when the P (nPa) is omitted. The Bz (nT), N (1/cm3), v (km/s), and P (nPa) are vital estimators for the Dst (nT) ZGi [2, 36]. Physically, the irregularities of the high-speed (v) energetic particles of hot, dense plasma produce coronal gaps. The magnetic field polarization responds to the high speed of the SW [77]. A GS develops when the Bz (nT) turns to a negative, southward direction and the flow speed becomes high. The Dst (nT) ZGi responds to the irregularities of the flow speed and the Bz field by reaching its negative peak. The SW reaches and compresses the magnetosphere through the paired proton density and plasma dynamic pressure [76]. The disturbance of the Dst index is the main effect of the compression caused by the flow speed [78,79,80]. As can be seen, the ANNm of the Dst values agrees with the literature.

According to Table 3, in the estimation of the ap (nT) index, the Bz (nT), the N (1/cm3), the v (km/s), and the P (nPa) have a high effect. Neglecting these variables affects the R-value of the ap (nT) ZGi by 19.14, 11.81, 10.08, and 9.37%, respectively. Physically, the electromagnetic field polarization behaves in parallel with the N (1/cm3), v (km/s), and P (nPa), while the ap (nT) ZGi responds nonlinearly to the instabilities [9, 12, 14, 50]. Table 3 indicates a clear relationship between the stated SWp and the ap (nT) ZGi.

According to Table 3, in the ANNm forecasting the auroral AE (nT) ZGi, the Bz (nT) has the maximum effect, with a value of 9.87%. The v (km/s) and the N (1/cm3) have a moderate impact, with values of 8.75 and 7.73%, respectively (Table 3) [40].

November 08, 2004, storm: When the SWp are neglected, the R scores of the ZGi estimation for the November 2004 super GS respond almost identically to those of the 2003 superstorm.

According to Table 3, in the forecasting of the Dst (nT) ZGi, the N (1/cm3), the Bz (nT), the v (km/s), and the P (nPa) have the maximum effect. When these SWp are neglected, the R-value declines by 14.43, 13.92, 10.37, and 9.65%, respectively.

According to Table 3, in the estimation of the ap (nT) ZGi, the N (1/cm3), the v (km/s), the Bz (nT), and the P (nPa) have a high effect. Neglecting these variables affects the R-value of the ap (nT) ZGi by 13.46, 12.35, 10.83, and 10.0%, respectively.

According to Table 3, in the ANNm estimating the auroral AE (nT) ZGi, the v (km/s) has a high effect, with a value of 10.08%. The Bz (nT) and the N (1/cm3) have a moderate impact, with values of 8.76 and 6.42%, respectively.

4 Conclusions

The present study discusses two super GSs (November 20, 2003, and November 08, 2004) of the 23rd solar cycle. The study systematically presents the estimation of the ZGi via the SWp. Similar conclusions are drawn from the results obtained for the two storms. For the 2003 storm, the Dst (nT), ap (nT), and AE (nT) indices are estimated with absolute error rates of 0.395, 0.228, and 0.274 and variances of 0.281, 0.046, and 0.146, respectively; for the 2004 storm, they are estimated with absolute error rates of 0.340, 0.405, and 0.243 and variances of 0.274, 0.488, and 0.146, respectively. In addition, for the 2003 superstorm, the Dst (nT), ap (nT), and AE (nT) indices are modeled with R-values of 96.7, 98.2, and 98.3% and RMSE values of 10.93 (nT), 5.71 (nT), and 40.31 (nT), respectively. For the 2004 superstorm, the Dst (nT), ap (nT), and AE (nT) indices are modeled with R-values of 98.4, 98.8, and 98.2% and RMSE values of 6.41 (nT), 4.78 (nT), and 31.97 (nT), respectively. Importantly, the models are accurate enough that the two superstorms, which occurred at different times, yield similar results. The model dynamics of GSs may contribute to interplanetary studies.

Eroğlu (2021b) [81] examined the first four moderate geomagnetic events of 2015 with an artificial neural network model. That study reveals that the model is more than 90% consistent in predicting the ZGi for these four moderate storms. The similarity between the results of the present study and those of previous studies supports the reliability of this study's findings.