Introduction

Solar radiation (Rs) provides the essential energy for life on Earth (Wild et al. 2005) and is the foundation of global climate formation (Antonopoulos et al. 2019). Solar energy is one of the most advantageous energy sources, as it is clean, free, abundant, and inexhaustible (Khatib et al. 2012; Desideri et al. 2013; Jamil and Akhtar 2017; Zhang et al. 2019). As the global energy demand is gradually increasing, solar energy has attracted increasing attention. The application of solar energy systems depends on the amount and intensity of global Rs; thus, reliable information on Rs directly affects the development of solar energy (Citakoglu 2015; Zhang et al. 2019). Furthermore, the level of Rs is directly related to the characteristics of regional climate change and the layout of agricultural production, especially crop production (Bailek et al. 2018; Fan et al. 2019; Jiang et al. 2020; Wu et al. 2022a). The most accurate Rs data can be obtained by measurements (Fan et al. 2019). However, the high requirements and costs of the measuring devices have resulted in few measurements worldwide (Besharat et al. 2013; Oates et al. 2017; Feng et al. 2020). China has the largest energy demand in the world. Among the 752 national meteorological stations in China, only 122 stations have measured Rs data (Pan et al. 2013). Thus, using other commonly available climatic data to predict Rs is a feasible alternative.

Various climatic variables, such as precipitation (P), sunshine duration (n), air temperature, and relative humidity (Hr), are effective factors for Rs estimation (Katiyar and Pandey 2010; Jamil and Akhtar 2017; Jamil and Siddiqui 2018; Kaba et al. 2018; Wu et al. 2022b). Thus, various types of models have been developed based on these climatic variables, including empirical models (Liu et al. 2009; Citakoglu 2015; Demircan et al. 2020; Feng et al. 2021a), machine learning models (Hossain et al. 2017; Fan et al. 2018; Feng et al. 2019c), and radiative transfer models (Gueymard 2001; Wu et al. 2020). Owing to their acceptable accuracy, low computational cost, and modest input requirements, empirical models are the most widely applied (Hassan et al. 2016); among them, the Hargreaves–Samani (HS) and Bristow–Campbell (BC) models are two well-known examples. Liu et al. (2009) modified the HS and BC models in different regions of China and found that the accuracy of the models improved by 4–7% after correction.

Because Rs has a nonlinear relationship with other climatic variables, as indicated by empirical models, machine learning models can improve the accuracy of Rs estimation and prediction (Chen et al. 2011). To date, many machine learning models have been extensively applied to estimate and simulate Rs (Katiyar and Pandey 2010; Jamil and Akhtar 2017; Kaba et al. 2018; Feng et al. 2019a), such as the adaptive neuro-fuzzy inference system (ANFIS) (Tabari et al. 2012), M5 model tree (Kisi 2016), random forests (Feng et al. 2017a), and gene expression programming (Shiri et al. 2014). Bueno et al. (2019) evaluated the performances of neural networks, support vector regression, and Gaussian processes for Rs prediction using satellite data as inputs, and reported that the three machine learning models provided reliable estimates. Zou et al. (2017) compared the ANFIS model with an improved BC model and Yang’s model for Rs estimation, and found that machine learning models showed better results than the BC model and Yang’s model. Fan et al. (2019) compared 12 machine learning models and 12 empirical models to estimate Rs. They showed that the ANFIS model, MARS model, and XGBoost model may be promising models in China.

Although machine learning models have improved the accuracy of Rs estimation, they still face some issues. The random selection of parameters in traditional machine learning models can degrade the calculation accuracy. The particle swarm optimization (PSO) algorithm can overcome this parameter limitation and improve the accuracy of traditional machine learning models. The Gaussian exponential model (GEM) is a novel machine learning model that has not yet been applied to Rs estimation. To further improve the accuracy of the GEM, the PSO algorithm was utilized and the hybrid PSO-GEM was developed in this paper. To assess the accuracy of the PSO-GEM and GEM, we compared them with three traditional machine learning models (M5 model tree (M5T), support vector machine (SVM), and random forest (RF)) and two empirical models (HS and BC). China consumes a large amount of energy, and a significant amount of energy is used for economic development every year (Liu et al. 2017; Fan et al. 2018). Clean solar energy is of great significance for energy conservation and emission reduction (Jin et al. 2005; Feng et al. 2021b). Northeast China, the main industrial production region, accounts for approximately 20% of China's energy consumption (Zheng et al. 2019). Therefore, determining an optimal Rs model for this region can provide scientific information for solar energy applications. However, the performance of different models in this region has not been well documented. Thus, in this paper, the PSO-GEM and GEM were developed to estimate Rs in Northeast China with different climatic data. The main purpose of this study was to examine the applicability of five machine learning models (M5T, SVM, RF, GEM, and PSO-GEM) and two empirical models (HS and BC) for Rs prediction in Northeast China.

Methods and materials

Study area and data collection

Northeast China generally consists of three provinces: Liaoning, Jilin, and Heilongjiang. In Liaoning province, the terrain is generally high in the north and low in the south, with mountains and hills distributed on the east and west sides. In Jilin province, the terrain is high in the southeast and low in the northwest. In Heilongjiang province, the terrain is higher in the northwest, northern, and southeastern regions, and lower in the northeast and southwest. Northeast China has a temperate monsoon climate (Feng et al. 2018), with an average annual temperature of 6.6 °C, an average annual relative humidity of 60%, and an annual precipitation of 608.3 mm. In this study, long-term climatic data, including Rs, n, maximum and minimum air temperature (Tmax and Tmin, respectively), Hr, wind speed at 2 m height, and P, during 1997–2016 were collected from four stations located in Northeast China (Fig. 1). Extra-terrestrial solar radiation (Ra), calculated from geographic information and the day of the year (DOY), was also used for modeling. These data were provided and quality-checked by the China Meteorological Administration. We further refined the data by linear interpolation for records that were (1) missing, (2) physically inconsistent (Tmin ≥ Tmax), or (3) implausible (n > N, where N is the theoretical sunshine duration). Figure 2 shows the monthly variations in climatic variables. Table 1 shows the climatic conditions of the study region.

Fig. 1
figure 1

The geographical distribution of the stations in Northeast China

Fig. 2
figure 2

Monthly variations of meteorological variables at the four stations in Northeast China

Table 1 Climatic conditions of the four stations in this study

Gaussian exponential model

The GEM was proposed by Liu et al. (2014). The model involves three steps. First, the learning samples are clustered by the k-means algorithm to obtain the initial allocation of samples. Second, the parameters of each cluster are estimated by maximum likelihood. Third, the learning samples are regrouped according to the maximum posterior probability criterion. The model can be defined as follows:

$$f\left(n\right)={H}_{i}\times \mathrm{exp}\left(-\frac{2{\left(n-{N}_{i}\right)}^{2}}{{W}_{i}^{2}}\right),i=1, 2,\dots , n$$
(1)

where Hi is the peak amplitude, Ni is the peak time position, and Wi is the half-width of the Gaussian wave.
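The three-step procedure above can be sketched in Python. This is a minimal illustration, not the authors' implementation (the study used Matlab 2018a): it assumes one-dimensional samples, uses a plain k-means pass for the initial allocation, and takes the cluster mean and standard deviation as simple stand-ins for the maximum likelihood estimates of the peak position Ni and half-width Wi. It also assumes the exponent of Eq. (1) contains the squared deviation (n − Ni)², as in a standard Gaussian wave.

```python
import numpy as np

def gaussian_component(n, H, N, W):
    """Gaussian wave of Eq. (1): peak amplitude H, peak position N, half-width W."""
    return H * np.exp(-2.0 * (n - N) ** 2 / W ** 2)

def init_gem_clusters(samples, k, iters=20, seed=0):
    """Steps 1-2 (sketch): k-means clustering, then per-cluster parameter estimates."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(samples, size=k, replace=False)
    for _ in range(iters):
        # assign each sample to the nearest center, then recompute the centers
        labels = np.argmin(np.abs(samples[:, None] - centers[None, :]), axis=1)
        centers = np.array([samples[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    params = []
    for j in range(k):
        cluster = samples[labels == j]
        params.append({"N": cluster.mean(),             # peak position estimate
                       "W": max(cluster.std(), 1e-6),   # half-width estimate
                       "H": len(cluster) / len(samples)})  # weight as amplitude proxy
    return params
```

In the full GEM, the assignment step would use the maximum posterior probability rather than the nearest-center rule sketched here.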

Hybrid Gaussian exponential model and particle swarm optimization

The PSO algorithm has been widely used in model optimization, and its applicability has been demonstrated (Yu et al. 2016; Zhu et al. 2020). In PSO, every particle has a fitness value; by evaluating the fitness, the optimal output is obtained.

In a D-dimensional search space, consider a population of n particles. The position and velocity of particle i are Xi = (xi1, xi2, …, xiD) and Vi = (vi1, vi2, …, viD), respectively. The position and velocity are updated as follows:

$${X}_{id}^{k+1}={X}_{id}^{k}+{V}_{id}^{k+1}\;(d=\mathrm{1,2},\dots ,D,\;i=\mathrm{1,2},\dots ,n)$$
(2)
$${V}_{id}^{k+1}=\omega {V}_{id}^{k}+{c}_{1}{r}_{1,i}^{k}({P}_{id}^{k}-{X}_{id}^{k})+{c}_{2}{r}_{2,i}^{k}({P}_{gd}^{k}-{X}_{id}^{k})$$
(3)

where ω is the inertia weight; k is the current iteration number; c1 and c2 are the acceleration coefficients; r1,ik and r2,ik are random numbers uniformly distributed in [0, 1]; Pidk is the personal best position of particle i, and Pgdk is the global best position of the swarm.

Although the GEM has been shown to offer high accuracy and computational speed (Jia et al. 2021), the PSO algorithm can further optimize the structure of the GEM and improve the model accuracy.
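The velocity and position updates of Eqs. (2)–(3) can be sketched as a minimal PSO loop. This is an illustrative Python sketch, not the implementation used in this study; the search bounds, population size, and coefficient values are assumptions chosen for the demonstration.

```python
import numpy as np

def pso(fitness, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: inertia weight w, cognitive (c1) and social (c2) terms."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (n_particles, dim))   # positions
    V = np.zeros((n_particles, dim))             # velocities
    P = X.copy()                                 # personal best positions
    p_fit = np.array([fitness(x) for x in X])
    g = P[p_fit.argmin()].copy()                 # global best position
    for _ in range(iters):
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)   # Eq. (3)
        X = X + V                                            # Eq. (2)
        fit = np.array([fitness(x) for x in X])
        better = fit < p_fit
        P[better], p_fit[better] = X[better], fit[better]
        g = P[p_fit.argmin()].copy()
    return g, p_fit.min()

# e.g. minimizing the sphere function drives the swarm toward the origin
best, val = pso(lambda x: float(np.sum(x ** 2)), dim=2)
```

In the hybrid model, the fitness function would be the training error of the GEM, so that each particle encodes a candidate set of GEM parameters.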

M5 model tree

Quinlan (1992) first developed the M5 model tree (M5T), which scans all possible splits and selects the one that maximizes the expected reduction in standard deviation (Feng et al. 2019b). The procedure consists of two parts. First, the data are divided into several subsets to build a decision tree, and the expected error of each subset is calculated. The standard deviation reduction (SDR) is defined as follows:

$$SDR=SD(Q)-\sum \frac{|{Q}_{i}|}{|Q|}SD({Q}_{i})$$
(4)

where SD(Q) is the standard deviation of Q, SDR is the standard deviation reduction, Q is the set of samples that reaches the node, and Qi is the ith subset produced by splitting Q.

To improve the application efficiency of the model, it is necessary to traverse each node of the initial model tree through the pruning process to merge some subtrees and replace them with leaf nodes (Sattari et al. 2013). The detailed model procedure of the M5T model is described by Quinlan (1992).
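The split criterion of Eq. (4) is straightforward to compute. The sketch below is an illustrative Python version, assuming the population standard deviation:

```python
import numpy as np

def sdr(Q, subsets):
    """Standard deviation reduction (Eq. 4) for a candidate split of Q."""
    Q = np.asarray(Q, dtype=float)
    return np.std(Q) - sum(len(s) / len(Q) * np.std(np.asarray(s, dtype=float))
                           for s in subsets)
```

For example, splitting `[1, 1, 1, 9, 9, 9]` into its two pure halves removes all within-subset spread, so the SDR equals the standard deviation of the parent set.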

Support vector machine

The SVM was first proposed by Vapnik (1999). This model is considered one of the best approaches for small-sample statistical estimation and prediction learning (Belaid and Mellit 2016; Shamshirband et al. 2016). The model replaces traditional empirical risk minimization with structural risk minimization, which can overcome many shortcomings of neural networks (Quej et al. 2017). The SVM function can be expressed as follows:

$$f(x)=\sum_{i=1}^{n}{\alpha }_{i}{y}_{i}\kappa (x,{x}_{i})+b$$
(5)

where κ(x, xi) is the kernel function that maps the input vectors x and xi into a higher-dimensional feature space, yi is the target value of the ith training vector, αi is the weight of the ith training vector, and b is the bias.
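Eq. (5) is a weighted kernel sum over the support vectors. The Python sketch below illustrates the prediction step only (training the weights is omitted), and it assumes an RBF kernel, which is one common choice rather than the kernel necessarily used in this study:

```python
import numpy as np

def rbf_kernel(x, xi, gamma=0.5):
    """RBF kernel: implicitly maps inputs into a higher-dimensional feature space."""
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(xi)) ** 2))

def svm_predict(x, support_vectors, alphas, ys, b, gamma=0.5):
    """Eq. (5): weighted kernel sum over the support vectors plus bias b."""
    return sum(a * y * rbf_kernel(x, xi, gamma)
               for a, y, xi in zip(alphas, ys, support_vectors)) + b
```

At a support vector itself the RBF kernel evaluates to 1, so that vector contributes its full weight αi·yi to the prediction.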

Random forest model

The RF model was proposed by Breiman (2001). The model introduces random attribute selection during training and draws bootstrap samples from the data, so that the randomness and diversity among trees can greatly improve decision accuracy. The procedures of the RF model are described by Buja et al. (2008).

Hargreaves–Samani model

The HS model only uses Tmax and Tmin data as inputs and is widely reported to have acceptable accuracy for Rs estimation. The model is as follows:

$${R}_{s}=[C{({T}_{\mathrm{max}}-{T}_{\mathrm{min}})}^{0.5}]\times {R}_{a}$$
(6)

where Rs is the global solar radiation (MJ m−2 d−1), Tmax and Tmin are the maximum and minimum air temperatures, respectively (℃), C is an empirical coefficient, and Ra is the extra-terrestrial solar radiation (MJ m−2 d−1).
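Eq. (6) can be written as a one-line function. The default coefficient below is only a commonly cited starting value (around 0.16 for interior and 0.19 for coastal locations); in this study C was calibrated locally at each station.

```python
def hargreaves_samani(tmax, tmin, ra, c=0.16):
    """Eq. (6): Rs = C * (Tmax - Tmin)^0.5 * Ra.

    tmax, tmin in degrees C; ra in MJ m-2 d-1; c is the empirical coefficient,
    which should be calibrated locally rather than taken at its default value.
    """
    return c * (tmax - tmin) ** 0.5 * ra
```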

Bristow–Campbell model

Bristow and Campbell (1984) developed the BC model, which only uses Ra and the diurnal temperature range (△T) as the input data. The model is defined as follows:

$${R}_{s}=a[1-\mathrm{exp}(-b\Delta {T}^{c})]\times {R}_{a}$$
(7)

where △T is the diurnal temperature range (℃) and a, b, and c are empirical coefficients.

Model training and testing

Five input combinations of meteorological data were used to train the machine learning models. Details of the combinations are presented in Table 2. The dataset was divided into two parts, i.e., 1997–2011 and 2012–2016, for training and testing the machine learning models, respectively. The coefficients of the empirical models were locally calibrated at each station by the least squares method using the training data (1997–2011). The model training/calibration and testing were performed in Matlab 2018a. The parameters of the machine learning models are presented in Table 3.
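Because the HS model (Eq. 6) is linear in its coefficient C, the least-squares calibration described above has a closed form. The Python sketch below illustrates the idea (the study itself performed the calibration in Matlab 2018a):

```python
import numpy as np

def calibrate_hs(rs_obs, tmax, tmin, ra):
    """Least-squares estimate of the HS coefficient C from training data.

    With z = (Tmax - Tmin)^0.5 * Ra, the model Rs = C * z is linear in C,
    so the least-squares solution is C = sum(z * Rs) / sum(z * z).
    """
    z = np.sqrt(np.asarray(tmax) - np.asarray(tmin)) * np.asarray(ra)
    rs = np.asarray(rs_obs)
    return float(np.dot(z, rs) / np.dot(z, z))
```

The three-parameter BC model (Eq. 7) is nonlinear in b and c, so its calibration requires an iterative nonlinear least-squares fit instead of a closed form.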

Table 2 Input combinations for training the machine learning models in this study
Table 3 Parameters applied for different machine learning models in this study

Statistical indicators

The root mean square error (RMSE), relative root mean square error (RRMSE), coefficient of determination (R2), mean absolute error (MAE), and coefficient of efficiency (Ens) were used to assess the Rs models (Feng et al. 2017b), as follows:

$$RMSE=\sqrt{\frac{1}{m}\sum_{i=1}^{m}{({Y}_{i}-{X}_{i})}^{2}}$$
(8)
$$RRMSE=\frac{\sqrt{\frac{1}{m}\sum_{i=1}^{m}{({Y}_{i}-{X}_{i})}^{2}}}{\overline{X} }\times 100\%$$
(9)
$${R}^{2}=\frac{{[{\sum }_{i=1}^{m}({X}_{i}-\overline{X })({Y}_{i}-\overline{Y })]}^{2}}{{\sum }_{i=1}^{m}{({X}_{i}-\overline{X })}^{2}{\sum }_{i=1}^{m}{({Y}_{i}-\overline{Y })}^{2}}$$
(10)
$$MAE=\frac{1}{m}\sum_{i=1}^{m}|{Y}_{i}-{X}_{i}|$$
(11)
$${E}_{ns}=1-\frac{{\sum }_{i=1}^{m}{({Y}_{i}-{X}_{i})}^{2}}{{\sum }_{i=1}^{m}{({X}_{i}-\overline{X })}^{2}}$$
(12)

where Xi and Yi are the measured and estimated values, respectively, m is the number of samples, and \(\overline{X }\) is the average value of Xi.
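Eqs. (8)–(12) can be computed directly from the measured and estimated series. A minimal Python sketch (the study computed these indicators in Matlab):

```python
import numpy as np

def evaluate(x, y):
    """Statistical indicators of Eqs. (8)-(12); x = measured, y = estimated."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rmse = np.sqrt(np.mean((y - x) ** 2))                 # Eq. (8)
    rrmse = rmse / x.mean() * 100.0                       # Eq. (9), in percent
    r2 = (np.sum((x - x.mean()) * (y - y.mean())) ** 2 /  # Eq. (10)
          (np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2)))
    mae = np.mean(np.abs(y - x))                          # Eq. (11)
    ens = 1.0 - np.sum((y - x) ** 2) / np.sum((x - x.mean()) ** 2)  # Eq. (12)
    return {"RMSE": rmse, "RRMSE": rrmse, "R2": r2, "MAE": mae, "Ens": ens}
```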

Because several evaluation indices are used, it is difficult to rank different models with any single index. Therefore, the global performance indicator (GPI) was introduced to comprehensively evaluate the model simulation results (Despotovic et al. 2015), as follows:

$$GP{I}_{i}=\sum_{j=1}^{5}{\alpha }_{j}({g}_{j}-{y}_{ij})$$
(13)

where αj is a coefficient that is equal to 1 for the RMSE, RRMSE, and MAE and equal to − 1 for Ens and R2; gj is the median of the scaled values of statistical indicator j, and yij is the scaled value of statistical indicator j for model i. A higher GPI value indicates better model performance.
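A sketch of the GPI computation of Eq. (13), assuming min-max scaling of each indicator across the models (as in Despotovic et al. 2015) and the column order RMSE, RRMSE, R2, MAE, Ens, which is an assumption of this example:

```python
import numpy as np

def gpi(indicator_matrix, alphas=(1, 1, -1, 1, -1)):
    """Eq. (13): GPI_i = sum_j alpha_j * (g_j - y_ij).

    indicator_matrix: rows = models, columns = indicators
    (assumed order RMSE, RRMSE, R2, MAE, Ens).
    Each column is min-max scaled to [0, 1]; g_j is the column median.
    """
    m = np.asarray(indicator_matrix, float)
    lo, hi = m.min(axis=0), m.max(axis=0)
    scaled = (m - lo) / np.where(hi > lo, hi - lo, 1.0)  # guard constant columns
    g = np.median(scaled, axis=0)
    return (np.asarray(alphas) * (g - scaled)).sum(axis=1)
```

The sign convention makes the GPI larger when error indicators fall below the median and when R2 and Ens rise above it, so the best model receives the highest GPI.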

Results and discussion

Results

Evaluation of the models on a daily basis

The statistical performance of the models at the four stations is presented in Table 4. At Harbin station, the PSO-GEM1 showed the highest accuracy under input scenario 1 with RMSE, RRMSE, R2, Ens, and MAE values of 2.893 MJ m−2 d−1, 36.1%, 0.569, 0.563, and 3.742 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, considering the values of their evaluation indices. Under input scenario 3, the five machine learning models had higher accuracies than the models under input scenarios 1 and 2, with an RMSE value of less than 2.900 MJ m−2 d−1, RRMSE value of less than 21.4%, R2 value of greater than 0.952, Ens value of greater than 0.931, and MAE of less than 2.237 MJ m−2 d−1. This indicated that introducing climatic variables into the model training greatly improved the model performance. Among the models under input scenario 3, the PSO-GEM3 had the highest accuracy, considering the values of their evaluation indices. The PSO-GEM4 showed the highest accuracy under input scenario 4, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 showed the highest accuracy with RMSE, RRMSE, R2, Ens, and MAE values of 1.719 MJ m−2 d−1, 12.7%, 0.964, 0.946, and 1.283 MJ m−2 d−1, respectively.

Table 4 Statistical performances of daily Rs of different models at the four stations. The best model in each station is marked in bold

At Jilin station, the PSO-GEM1 showed the highest accuracy under input scenario 1 with RMSE, RRMSE, R2, Ens, and MAE values of 5.393 MJ m−2 d−1, 43.6%, 0.503, 0.501, and 4.030 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, considering the values of their evaluation indices. Under input scenario 3, the PSO-GEM3 had the highest accuracy, considering the values of their evaluation indices. Under input scenario 4, the PSO-GEM4 showed the highest accuracy. The five machine learning models under input scenario 5 showed the highest accuracies among the models under different input scenarios. The PSO-GEM5 showed the highest accuracy with RMSE, RRMSE, R2, Ens, and MAE values of 1.245 MJ m−2 d−1, 9.1%, 0.974, 0.973, and 0.844 MJ m−2 d−1, respectively.

At Shenyang station, the PSO-GEM1 showed the highest accuracy under input scenario 1 with RMSE, RRMSE, R2, Ens, and MAE values of 5.355 MJ m−2 d−1, 37.7%, 0.503, 0.489, and 4.265 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, considering the values of their evaluation indices. Under input scenario 3, all the machine learning models had higher accuracies than the models under input scenarios 1 and 2, with an RMSE value of less than 2.619 MJ m−2 d−1, RRMSE of less than 18.4%, R2 of over 0.882, Ens of over 0.878, and MAE of less than 1.836 MJ m−2 d−1. The PSO-GEM3 had the highest accuracy. The PSO-GEM4 showed the highest accuracy under input scenario 4, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 showed the highest accuracy with RMSE, RRMSE, R2, Ens, and MAE values of 1.658 MJ m−2 d−1, 11.7%, 0.953, 0.951, and 1.147 MJ m−2 d−1, respectively.

At Yanji station, the PSO-GEM1 showed the highest accuracy under input scenario 1 with RMSE, RRMSE, R2, Ens, and MAE values of 5.030 MJ m−2 d−1, 36.6%, 0.484, 0.476, and 3.973 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, considering the values of their evaluation indices. Under input scenario 3, the five models had higher accuracies than the models under input scenarios 1 and 2, with an RMSE of less than 1.746 MJ m−2 d−1, RRMSE of less than 12.7%, R2 of over 0.946, Ens of over 0.937, and MAE of less than 1.322 MJ m−2 d−1. The PSO-GEM3 had the highest accuracy. The PSO-GEM4 showed the highest accuracy under input scenario 4, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 showed the highest accuracy with RMSE, RRMSE, R2, Ens, and MAE values of 1.045 MJ m−2 d−1, 7.6%, 0.981, 0.977, and 0.801 MJ m−2 d−1, respectively.

As for the empirical models, the HS and BC models showed lower accuracies compared with those of the machine learning models with the same inputs (input scenario 3), with an RMSE of 3.885–4.557 MJ m−2 d−1, R2 of 0.634–0.707, RRMSE of 28.3–32.2%, Ens of 0.630–0.696, and MAE of 3.069–3.526 MJ m−2 d−1. The accuracy of the machine learning models considering n was significantly higher than that of the models without n input, with the RMSE reduced by 44.3–79.9%, RRMSE reduced by 44.2–91.2%, MAE reduced by 40.2–80.6%, R2 increased by 67.7–95.6%, and Ens increased by 67.4–124.9%.

The boxplots of the statistical indicators of daily Rs for different models in the study area are presented in Fig. 3. Under input scenario 1, the five machine learning models showed low prediction accuracies for the whole region, with average RMSE, RRMSE, MAE, and Ens values of 4.668–9.627 MJ m−2 d−1, 38.8–69.9%, 4.002–7.579 MJ m−2 d−1, and 0.220–0.507, respectively. The PSO-GEM1 showed the highest accuracy among the five models. Under input scenario 2, the PSO-GEM2 was the best model, considering the values of their evaluation indices. The five models under input scenario 3 showed higher prediction accuracies than the models under input scenarios 1 and 2, which did not consider climatic variables as inputs. The PSO-GEM3 showed the best results, considering the values of their evaluation indices. Under input scenario 4, the PSO-GEM4 showed the highest accuracy, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 showed the highest accuracy with average RMSE, RRMSE, MAE, and Ens values of 1.417 MJ m−2 d−1, 10.26%, 1.019 MJ m−2 d−1, and 0.962, respectively. The HS and BC models showed much lower prediction accuracies compared with those of the machine learning models, with average RMSE, RRMSE, MAE, and Ens values of 4.306 MJ m−2 d−1 and 4.174 MJ m−2 d−1, 31.23% and 30.26%, 3.341 MJ m−2 d−1 and 3.218 MJ m−2 d−1, and 0.658 and 0.679, respectively.

Fig. 3
figure 3

Boxplots of the statistical indicators of daily Rs for different models

The GPI values of the different models at the four stations are presented in Fig. 4. The SVM1, M5T1, GEM1, RF1, and PSO-GEM1 models under input scenario 1 showed the lowest prediction accuracies compared with those of models under other input scenarios, with average GPI values of − 3.915, − 3.101, − 2.883, − 3.357, and − 2.163, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, followed by the GEM2, SVM2, RF2, and M5T2 models. Under input scenario 3, the PSO-GEM3 showed the highest accuracy, considering the values of their evaluation indices. The PSO-GEM4 was the best model under input scenario 4, followed by the GEM4, RF4, M5T4, and SVM4 models with average GPI values of 0.434, 0.375, 0.033, 0.019, and − 0.211, respectively. Under input scenario 5, the PSO-GEM5 and GEM5 showed much higher accuracies with average GPI values of 0.641 and 0.560, respectively. The accuracies of the HS and BC models were higher than those of the M5T1, SVM1, GEM1, RF1, and PSO-GEM1 models without climatic inputs, with average GPI values of − 1.745 and − 1.622, respectively. Relatively good estimates and high accuracies could be obtained from models with at least the DOY, Ra, and n as inputs, including models under input scenarios 3, 4, and 5. These results further confirm that n is the most important variable for estimating Rs.

Fig. 4
figure 4

GPI values of daily Rs of different models at the four stations in Northeast China

Evaluation of the models on a monthly basis

The statistical performance of monthly Rs of the different models at the four stations is presented in Table 5. At Harbin station, the PSO-GEM1 showed the highest accuracy under input scenario 1, with RMSE, RRMSE, R2, Ens, and MAE values of 0.878 MJ m−2 d−1, 13.5%, 0.984, 0.943, and 0.803 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, followed by the GEM2, considering the values of their evaluation indices. Under input scenario 3, the five models had higher accuracies than the models under input scenarios 1 and 2, with an RMSE of less than 0.825 MJ m−2 d−1, RRMSE of less than 8.5%, R2 of greater than 0.999, Ens of greater than 0.972, and MAE of less than 0.621 MJ m−2 d−1. The PSO-GEM3 had the highest accuracy. The PSO-GEM4 showed the best precision under input scenario 4, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 and GEM5 showed much higher accuracies among the five models, considering the values of their evaluation indices. The HS and BC models showed much poorer prediction accuracies, with RMSE values of 1.186 and 1.193 MJ m−2 d−1, RRMSE values of 8.8% and 8.8%, R2 values of 0.977 and 0.976, MAE values of 0.955 and 0.951 MJ m−2 d−1, and Ens values of 0.951 and 0.950, respectively.

Table 5 Statistical performances of monthly Rs of different models at the four stations. The best model in each station is marked in bold

At Jilin station, the PSO-GEM1 showed the highest accuracy under input scenario 1, with RMSE, RRMSE, R2, Ens, and MAE values of 0.932 MJ m−2 d−1, 6.6%, 0.971, 0.964, and 0.750 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 had the best precision, considering the values of their evaluation indices. Under input scenario 3, the PSO-GEM3 had the highest accuracy, considering the values of their evaluation indices. Under input scenario 4, the PSO-GEM4 showed the best precision, with RMSE, RRMSE, R2, Ens, and MAE values of 0.341 MJ m−2 d−1, 2.8%, 0.998, 0.994, and 0.301 MJ m−2 d−1, respectively. The five models under input scenario 5 showed the highest accuracies among the models under the different input scenarios. The PSO-GEM5 showed the highest accuracy, followed by the GEM5, with RMSE, RRMSE, R2, Ens, and MAE values of 0.197 and 0.242 MJ m−2 d−1, 1.5% and 1.8%, 0.999 and 0.998, 0.999 and 0.998, and 0.137 and 0.159 MJ m−2 d−1, respectively.

At Shenyang station, under input scenario 1, the PSO-GEM1 showed the highest accuracy, followed by the GEM1, with RMSE, RRMSE, R2, Ens, and MAE values of 0.932 and 0.962 MJ m−2 d−1, 6.6% and 6.8%, 0.971 and 0.979, 0.964 and 0.962, and 0.750 and 0.788 MJ m−2 d−1, respectively. Under input scenario 2, the PSO-GEM2 showed the best precision, considering the values of their evaluation indices. Under input scenario 3, the PSO-GEM3 and GEM3 models showed higher accuracies, considering the values of their evaluation indices. The PSO-GEM4, GEM4, and SVM4 models showed better precision under input scenario 4, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 showed the highest accuracy, followed by the GEM5, with RMSE, RRMSE, R2, Ens, and MAE values of 0.313 and 0.351 MJ m−2 d−1, 2.2% and 2.5%, 0.999 and 0.998, 0.996 and 0.995, and 0.256 and 0.273 MJ m−2 d−1, respectively.

At Yanji station, the PSO-GEM1 showed the highest accuracy under input scenario 1, considering the values of their evaluation indices. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, with RMSE, RRMSE, R2, Ens, and MAE values of 0.661 MJ m−2 d−1, 4.8%, 0.998, 0.979, and 0.603 MJ m−2 d−1, respectively. Under input scenario 3, the five models had higher accuracies than the models under input scenarios 1 and 2, with an RMSE of less than 0.694 MJ m−2 d−1, RRMSE of less than 5.4%, R2 of greater than 0.998, Ens of greater than 0.974, and MAE of less than 0.649 MJ m−2 d−1. The PSO-GEM3 had the highest accuracy, considering the values of their evaluation indices. The PSO-GEM4 showed the best precision under input scenario 4, considering the values of their evaluation indices. Under input scenario 5, the PSO-GEM5 showed the highest accuracy, followed by the GEM5, considering the values of their evaluation indices.

The boxplots of the statistical indicators of monthly Rs for the different models in the study area are presented in Fig. 5. Under input scenario 1, the five models showed lower prediction accuracies across the whole study area, with RMSE, RRMSE, MAE, and Ens values of 0.824–1.308 MJ m−2 d−1, 8.0–10.3%, 0.636–1.077 MJ m−2 d−1, and 0.898–0.944, respectively. The PSO-GEM1 showed the highest accuracy among the five models. Under input scenario 2, the PSO-GEM2 was the best model, considering the values of their evaluation indices. The five models under input scenario 3 showed higher accuracies than the models under input scenarios 1 and 2. The PSO-GEM3 showed the best precision, followed by the GEM3, considering the values of their evaluation indices. Under input scenario 4, the PSO-GEM4 showed the highest accuracy. Under input scenario 5, the PSO-GEM5 and GEM5 showed higher accuracies, considering the values of their evaluation indices. The HS and BC models showed poorer prediction accuracies, with RMSE values of 1.005 and 1.009 MJ m−2 d−1, RRMSE values of 7.3% and 7.4%, MAE values of 0.822 and 0.848 MJ m−2 d−1, and Ens values of 0.952 and 0.955, respectively.

Fig. 5
figure 5

Boxplots of the statistical indicators of monthly Rs for different models

The GPI values of monthly Rs of the different models across the whole study area are presented in Fig. 6. As shown in Fig. 6, the SVM1, M5T1, RF1, GEM1, and PSO-GEM1 models under input scenario 1 showed the lowest prediction accuracies, with average GPI values of − 2.957, − 2.406, − 1.979, − 1.553, and − 1.034, respectively. Under input scenario 2, the PSO-GEM2 showed the highest accuracy, considering the values of their evaluation indices. Under input scenario 3, the PSO-GEM3 showed the best precision, followed by the GEM3 model, considering the values of their evaluation indices. The PSO-GEM4 was the best model under input scenario 4, followed by the GEM4 model, with average GPI values of 0.755 and 0.686, respectively. Under input scenario 5, the PSO-GEM5 showed the best precision, with an average GPI value of 0.855. The accuracies of the HS and BC models were higher than those of the M5T1, SVM1, and RF1 models, with average GPI values of − 1.734 and − 1.742, respectively. The machine learning models with complete data inputs had the highest precision. Meanwhile, the models that considered n combined with Tmax and Tmin, or n combined with P, showed precision similar to that of the seven-input models. The models that considered only DOY and Ra showed the lowest prediction accuracies, with GPI values of − 4.114 to − 0.588. The accuracy of the monthly Rs models that considered DOY, Ra, Tmax, and Tmin was higher than that of the two-input models, with the GPI increased by 17.5–29.4%, although the increase was not significant. For the calculation of monthly Rs, sunshine duration was the most significant variable in the study area.

Fig. 6
figure 6

GPI values of monthly Rs of different models at the four stations in Northeast China

Discussion

The PSO can further improve the accuracy of the GEM, as PSO can improve the iteration rate of the GEM and avoid the randomness of the initial weights. Under the different input scenarios, the PSO-GEM showed the highest accuracy. The GEM can better capture the nonlinear relationship between radiation and meteorological factors by calculating the Gaussian exponents, and its accuracy has been demonstrated previously (Lesser et al. 2011; Jia et al. 2021). Wu et al. (2021) showed that PSO can improve the accuracy of extreme learning machine models and has a strong ability to optimize parameters, which supports the generalizability and robustness of the PSO-GEM. The machine learning models generally had higher accuracies than the HS and BC models when climatic variables were included as inputs. The machine learning models that considered only the DOY and Ra showed the lowest accuracies at the four stations, especially the SVM1 and RF1 models. Fan et al. (2019) showed that the SVM and RF models ranked worse in China, which agrees with our conclusion.

To further confirm the reliability of the PSO-GEM for Rs estimation, Taylor diagrams of the different models at the four stations were analyzed. The standard deviations and correlation coefficients of the models over the stations are shown in Figs. 7 and 8. The PSO-GEM5 had the lowest standard deviation, the lowest root mean square error, and the highest correlation coefficient with the measured values at the different stations. These results further confirm the performance of the PSO-GEM5 across Northeast China.

Fig. 7
figure 7

Taylor diagrams of daily Rs of different machine learning models at different stations

Fig. 8
figure 8

Taylor diagrams of monthly Rs of different machine learning models at different stations

The results of this study showed that the models with complete inputs had the highest accuracy. This indicates that each meteorological factor contributed positively to Rs estimation. However, the models with only Tmax and Tmin as inputs showed lower accuracy, especially the HS and BC models. The models considering n (input scenarios 3, 4, and 5) showed much higher accuracy, which reveals that n is the most important factor affecting Rs estimation in Northeast China. Mecibah et al. (2014) investigated the performance of different Rs models and found that the accuracy of models with n was much higher than that of models with air temperature. The same conclusion was reported by Zhang et al. (2018), because the magnitude of n directly affects the Rs reaching the surface of the earth. The amount of solar radiation reaching the Earth's surface is closely related to sunshine duration, and clouds and weather patterns are among the most important atmospheric phenomena limiting solar radiation at the surface. These are the main reasons for the higher accuracy of the models considering sunshine duration and precipitation. Part of the solar radiation reaching the Earth's surface is absorbed by the atmosphere or re-emitted as long-wave radiation, and the long-wave radiation absorbed by the atmosphere increases the air temperature; thus, temperature is also an important factor affecting solar radiation. However, because many factors affect air temperature, the relationship between solar radiation and temperature is not exact, which is why the temperature-based models are less accurate than the sunshine-duration-based models.

The PSO-GEM can be recommended for estimating Rs in Northeast China. The proposed model can provide scientific support for evapotranspiration estimation, agricultural irrigation management, and solar energy development. In this study, we used a simple train-test split of the dataset for training the machine learning models. K-fold cross-validation is an efficient training method recommended for training models (Shiri et al. 2015). In future research, the PSO-GEM can be combined with K-fold cross-validation to further improve the accuracy of Rs estimation.

Conclusions

Five machine learning models with five groups of input parameters and two empirical models were evaluated for Rs prediction using meteorological data from four stations in Northeast China. The PSO-GEM with full climatic data as inputs showed the highest accuracy, with RMSE, RRMSE, MAE, and Ens values of 1.416 MJ m−2 d−1, 10.27%, 1.018 MJ m−2 d−1, and 0.962, respectively. The PSO-GEM also showed the highest accuracy under the other input scenarios. Sunshine duration (n) was the most influential factor in Rs estimation by the machine learning models.

Overall, the PSO-GEM5 is recommended for estimating Rs in Northeast China when all the meteorological variables are available. The PSO-GEM3 is recommended when only n and air temperature data are accessible. The PSO-GEM4 and GEM4 are recommended only when sunshine data and P data are available.