
1 Introduction

Global energy demand is expected to more than double by 2050 and to more than triple by the end of the century. Incremental improvements to the existing energy grid will not be enough to meet this demand. Global security, economic growth, and quality of life are closely tied to a sufficient supply of clean energy, and one of the most daunting challenges facing the world is finding energy resources to meet rising demand. Solar prediction is a milestone in addressing this challenge. It depends on several factors, such as the characteristics of the solar power plants that convert the sun’s energy into electric power, scattering processes, knowledge of the Sun’s path, and the nature of the atmosphere [1]. Solar forecasting information is necessary for operation and for planning. It gives grid operators the means to forecast and align electricity production and consumption and to set up bilateral contract negotiations between suppliers and customers. Precise prediction methods increase the quality of the energy supplied to the grid and reduce the extra costs associated with ancillary equipment [2]. Various prediction approaches have been introduced, depending on the type of input data and the required forecasting time horizon. For very short time scales, on-site measurements are sufficient for a time series model. Intra-hour forecasts are obtained from ground-based sky imagers with high spatial and temporal resolution. Cloud motion vector forecasts based on satellite images show good results for intra-day horizons. Forecasts for longer horizons are based on numerical weather prediction (NWP) models. Photovoltaic systems integrated with the grid require forecast information at least 2 days ahead, or even beyond. Different types of solar power systems exist, such as concentrating and non-concentrating systems [3]. The output of a concentrated photovoltaic system is highly correlated with direct normal irradiance (DNI).
Measurement of DNI is very important for the operation and management of concentrated solar thermal power plants. DNI is strongly affected by factors such as dust storms, air pollution, and cirrus clouds, which can degrade it by up to 30%. For non-concentrating solar PV systems, the primary quantity to measure is global horizontal irradiance (GHI), which is less sensitive to errors in DNI.

2 Research Motivation

India, like the rest of the world, is facing an energy crisis. There is a substantial gap between energy demand and supply. As the country develops, this gap is widening, and addressing it is essential to sustain the country’s growth. A range of options with a strong emphasis on renewable energy is therefore being considered. Many researchers and academics are engaged in developing tools, models, and algorithms for today’s solar systems. With greater penetration of renewable energy resources and the implementation of power deregulation in industry, forecasting has become a critical part of business planning, and forecasting of solar power has become a major issue in power systems. Various techniques are used to forecast solar radiation in response to market needs. This review is therefore expected to help researchers and utility operators gain useful insight into solar power output and forecasting models. The knowledge gained can also assist the government and energy market participants in making more efficient and beneficial decisions regarding solar power system implementation.

3 Solar Radiation Component

Three fundamental components are assessed when measuring solar irradiance [4].

3.1 Direct Normal Irradiance

Direct Normal Irradiance (DNI) is the amount of solar radiation received in a direct path from the sun, without being scattered by the atmosphere, on a surface held perpendicular to the incoming rays. This component is very important for concentrating solar systems such as concentrated solar power and concentrated photovoltaics.

3.2 Diffuse Horizontal Irradiance

Diffuse Horizontal Irradiance (DHI) is the amount of solar radiation that reaches a horizontal surface along an indirect path, after it has been scattered by air molecules, aerosol particles, cloud particles, or other particles [5].

3.3 Global Horizontal Irradiance

Global Horizontal Irradiance (GHI) is the total amount of radiation received by a surface horizontal to the ground [6]. It consists of both the direct (DNI) and the diffuse (DHI) contributions.

The three fundamental components of solar irradiance are related to each other by the following equation:

$$ GHI= DNI\times \cos (Z)+ DHI $$
(1)

where Z represents the solar zenith angle.
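As a worked illustration of the relation between the three components, the following minimal sketch computes GHI from DNI, DHI, and the zenith angle. It assumes the standard closure relation, in which the direct beam is projected onto the horizontal plane by cos(Z) before the diffuse component is added. The function name and the sample values are hypothetical.

```python
import math

def ghi_from_components(dni, dhi, zenith_deg):
    """Global horizontal irradiance (W/m^2) from DNI, DHI and zenith angle in degrees."""
    cos_z = math.cos(math.radians(zenith_deg))
    # Assumption: when the sun is below the horizon the direct beam contributes nothing.
    cos_z = max(cos_z, 0.0)
    return dni * cos_z + dhi

# Hypothetical sample: DNI = 800 W/m^2, DHI = 100 W/m^2, zenith angle = 60 degrees.
ghi = ghi_from_components(dni=800.0, dhi=100.0, zenith_deg=60.0)
print(round(ghi, 6))
```

At a zenith angle of 60° only half of the direct beam reaches the horizontal plane, which is why DNI matters most for tracking (concentrating) systems and GHI for fixed horizontal ones.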

4 Need of Solar Forecasting

Forecasting is necessary for operation and for planning. The main reasons are listed below:

  • Solar generation is variable in nature

  • Necessity for successful bilateral contract negotiations between suppliers and customers

  • Operational planning decisions that determine the economic location, type, and scale of solar power plants

  • Solar forecasts provide grid operators with the means to forecast and align electricity production and consumption

  • Decisions on expansion and enhancement of transmission, augmentation of generation, planning of distribution, and exchange of regional electricity

5 Solar Forecasting Methodologies

The forecast horizon is the determining factor in classifying solar forecasting methodologies. Precise forecasting enables grid operators to balance consumption and production [7]. Table 1 shows three types of horizon: intra-hour, intra-day, and day-ahead.

  1. For very short time scales, time series models such as the Artificial Neural Network, the Autoregressive Integrated Moving Average, and the persistence model are used to forecast solar irradiance [8, 9].

  2. For short-term irradiance forecasting, observations of the temporal development of clouds may be used as a basis, since solar irradiance depends largely on the clouds:

    • For the sub-hour range, cloud data collected from ground-based sky images with high spatial resolution may be used to predict solar irradiance.

    • For horizons from 30 min up to 6 h, solar irradiance is predicted from cloud motion vectors derived from satellite images.

  3. For longer horizons, from about 4–6 h ahead onward, numerical weather prediction models perform better than satellite-based forecasts [9, 10].

  4. There are also integrated techniques that derive an optimized forecast across the different time horizons.

Table 1 Relationship between time horizon, prediction model and related operations

6 State of Art for Solar Irradiance Forecast

According to the literature, forecasting methods fall into three categories: statistical methods, physical methods, and ensemble methods.

6.1 Physical Methods

Physical methods rely on Total Sky Imagers (TSI), Numerical Weather Prediction (NWP), and physical parameters such as temperature, cloud cover, and humidity [11].

6.1.1 Numerical Weather Prediction

Numerical weather prediction is based on atmospheric physics. To forecast the future state of the weather, current observations are first processed through data assimilation. NWP models perform well for horizons from one day to several days ahead [12]. The NWP process is as follows:

  • Step 1: Satellite images and ground-based sky images are used to collect the current state of the atmosphere. This state is processed through data assimilation, which is a critical and complex step.

  • Step 2: The governing atmospheric equations, such as the thermodynamic equation and Newton’s second law for fluids, are integrated and solved [3].

Well-known examples of NWP models are global models, regional models, and the Weather Research and Forecasting (WRF) model (Table 2). They differ in their input parameters and spatial resolution [1].

Table 2 Comparison of various NWP model

6.1.2 Cloud Imagery and Satellite Models

Cloud imagery with high spatial resolution is used to analyze cloud conditions. Such methods detect the variability of clouds and predict global irradiance up to 6 h ahead. Solar irradiance is strongly affected by cloud cover and cloud optical depth. Information about the clouds, obtained from total sky imagers, helps to predict solar irradiance for very short-term forecasting. Some researchers develop their own TSIs, while others use commercially available TSIs such as the TSI-800 [4].

6.2 Statistical Methods

Statistical methods take previous time series data of solar irradiation as input and do not depend on the internal state of the system being modeled. They include the persistence model, ARIMA, ANN, fuzzy logic, and others.

6.2.1 Time Series Model

A time series model predicts future values from previously observed values. Observations are measured over time, for example hourly, daily, or weekly. The data may contain random variation, so the analysis focuses on the pattern in the data, which must be recognizable and predictable for forecasting techniques to work. The Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) are used to identify the pattern [5].

A time series can be expressed as:

$$ y(t)=S(t)+R(t)+T(t),\kern1em t=\dots ,-1,0,1,2,3,\dots $$
(2)

where S(t) is the seasonal term, R(t) is the random term, and T(t) is the trend term. Time series methods are also applied in stock market analysis, revenue forecasting, economic forecasting, budgetary analysis, and sales forecasting.

The ARIMA model is one of the benchmark models in solar irradiance forecasting. The Autoregressive (AR), Moving Average (MA), and Autoregressive Moving Average (ARMA) models are special cases of ARIMA. ARIMA is a widely used statistical tool for analyzing the relationship between actual and forecasted output. Forecasting with ARIMA involves three main steps: model identification, parameter estimation, and diagnostic checking [6]. Both seasonal and non-seasonal time series models can be used for forecasting. An ARIMA model is described by three elements (p, d, q), where p is the order of the autoregressive term, q is the order of the moving average term, and d is the number of differencing operations required to make the time series stationary. Mathematically, the autoregressive (AR) model can be expressed as:

$$ {y}_t=a+\sum \limits_{i=1}^p{\phi}_i{y}_{t-i}+{\varepsilon}_t=a+{\phi}_1{y}_{t-1}+{\phi}_2{y}_{t-2}+\dots +{\phi}_p{y}_{t-p}+{\varepsilon}_t $$
(3)

Here yt is the actual value at time t, ϕi are the model parameters, εt is the random error, a is a constant term, and p is the order of the model. The equation expresses the predicted value as a linear combination of past values plus a constant term and a random error. The moving average (MA) model instead expresses the value in terms of past random errors:

$$ {y}_t=\eta +\sum \limits_{j=1}^q{\theta}_j{\varepsilon}_{t-j}+{\varepsilon}_t=\eta +{\theta}_1{\varepsilon}_{t-1}+{\theta}_2{\varepsilon}_{t-2}+\dots +{\theta}_q{\varepsilon}_{t-q}+{\varepsilon}_t $$
(4)

θj are the model parameters, η is the mean of the time series, and q is the order of the model.

Combining Eqs. (3) and (4) gives the ARMA model, which can be expressed mathematically as

$$ {y}_t=a+\sum \limits_{i=1}^p{\phi}_i{y}_{t-i}+\sum \limits_{j=1}^q{\theta}_j{\varepsilon}_{t-j}+{\varepsilon}_t $$
(5)

Here p is the autoregressive order and q is the moving average order.

ARIMA is popular with users because of its statistical foundation, and tools such as the “Econometric Modeler app” available in MATLAB 2018 and 2019 make it easier to apply [7]. In comparison to ARMA, ARIMA requires an additional parameter, the order of differencing d, giving the triple (p, d, q). The mathematical expression of ARIMA is

$$ \phi (L){\left(1-L\right)}^d{y}_t=\theta (L){\varepsilon}_t $$
(6)

ϕ and θ are the model parameters, εt is the random error, L denotes the lag operator, and d is the order of differencing.
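The AR part of the model can be illustrated with a short sketch. It estimates the constant a and the coefficients ϕi of Eq. (3) by ordinary least squares on lagged values and then produces a one-step-ahead forecast. This is only one possible estimator, chosen here for brevity; the function names, the synthetic AR(2) series, and its parameters are assumptions made for illustration, not data from any cited study. For an ARIMA model with d > 0, the series would first be differenced d times (for example with `np.diff(y, n=d)`).

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares estimate of (a, phi_1..phi_p) for the AR(p) model of Eq. (3)."""
    y = np.asarray(y, dtype=float)
    # Each row is [1, y_{t-1}, ..., y_{t-p}] for one target value y_t.
    rows = [np.concatenate(([1.0], y[t - p:t][::-1])) for t in range(p, len(y))]
    X = np.array(rows)
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef[0], coef[1:]

def forecast_ar(y, a, phi):
    """One-step-ahead forecast a + sum_i phi_i * y_{t-i}."""
    p = len(phi)
    recent = np.asarray(y[-p:], dtype=float)[::-1]  # y_{t-1}, ..., y_{t-p}
    return a + float(np.dot(phi, recent))

# Synthetic AR(2) series with hypothetical parameters (not measured irradiance).
rng = np.random.default_rng(0)
n, a_true, phi_true = 2000, 1.0, np.array([0.6, 0.2])
y = np.zeros(n)
for t in range(2, n):
    y[t] = a_true + phi_true[0] * y[t - 1] + phi_true[1] * y[t - 2] + rng.normal(scale=0.5)

a_hat, phi_hat = fit_ar(y, p=2)
next_value = forecast_ar(y, a_hat, phi_hat)
print(a_hat, phi_hat, next_value)
```

With enough samples the estimated coefficients should lie close to the values used to generate the series, which is a simple way to check an implementation before applying it to measured data.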

6.2.2 Persistence Model

The persistence model is also known as the naïve predictor. It is very simple in comparison to other forecasting models: it forecasts the future value as equal to the previous value [8].

$$ {x}_{t+1}={x}_t $$
(7)

The persistence model performs best when the weather pattern changes little.
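Because the persistence model is commonly used as a baseline, a minimal sketch of Eq. (7) is given here. The sample values are hypothetical and the helper name is an assumption.

```python
def persistence_forecast(series):
    """Forecast for step t+1 is the observation at step t (Eq. 7)."""
    return list(series[:-1])

# Hypothetical GHI samples in W/m^2; the drop at the fourth sample mimics a passing cloud.
observed = [410.0, 455.0, 470.0, 300.0, 320.0]
forecast = persistence_forecast(observed)
actual = observed[1:]

# Mean absolute error of the baseline over the four forecast steps.
mae = sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)
print(mae)
```

Most of the error in this example comes from the single abrupt change, which matches the observation that persistence works well only when conditions are stable.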

6.2.3 Artificial Neural Network

An artificial neural network works in a way loosely modeled on the human brain, which makes decisions using biological neurons. Neurons in the human brain perform different types of parallel processing, pattern recognition, and so on. The same principle can be used to solve non-linear mathematical problems in modeling, image analysis, and other fields [10]. ANNs use different training algorithms to predict solar irradiation, such as scaled conjugate gradient, the Levenberg-Marquardt algorithm, and Polak-Ribière conjugate gradient. These techniques train the model to map inputs to outputs so as to obtain the best fit. Support vector machines, radial basis networks, multilayer perceptrons, and Hopfield networks are included under this heading. The ANN process is carried out in three stages: (1) design, (2) training, and (3) validation. The design stage fixes the input parameters, the type of neural network, and the number of hidden neurons. In the training stage the weights of the neurons are adjusted, and in the validation stage solar irradiance is forecast using the trained weights [11]. The basic architecture of an artificial neural network is shown in Fig. 1.

Fig. 1 Basic architecture of ANN

The multilayer perceptron (MLP) is one of the most important forms of neural network. An MLP consists of an input layer, a hidden layer, and an output layer. The hidden layer is characterized by its number of hidden neurons, and the input and output are denoted by the vectors p and q, respectively.

$$ q=q\left(p:w\right)=\sum \limits_{i=0}^h\left[{w}_if\left(\sum \limits_{j=0}^d{w}_{ij}{x}_j\right)\right] $$
(8)

Here wi and wij denote the weights and biases, indexed by i over the hidden neurons and by j over the inputs, f is the activation function, and the vector w governs the non-linear mapping. Babak Jahani et al. compared empirical models, an artificial neural network, and an artificial neural network optimized with a genetic algorithm for predicting global solar radiation. The genetic algorithm was used to reduce the error in the predictions [13]. Premalatha Neelamegam et al. proposed two artificial neural network models with different combinations of inputs; the accuracy of the models was measured using MAE, RMSE, and R2 [14]. Voyant et al. presented a review of solar radiation forecasting using machine learning techniques. According to the authors, standalone models such as artificial neural networks, linear regression, random forests, and support vector machines perform well in forecasting, while hybrid models are a viable way to improve prediction accuracy [15].
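The forward pass of Eq. (8) can be sketched in a few lines. The sketch assumes that index 0 in both sums carries the bias (a constant input of 1), that tanh is the activation function, and that the weights are arbitrary illustrative numbers rather than trained values; none of these choices comes from the cited studies.

```python
import numpy as np

def mlp_forward(x, W_hidden, w_out, f=np.tanh):
    """Evaluate q = sum_i w_i * f(sum_j w_ij * x_j) for one input vector x.

    Assumption: a constant 1 is prepended to the inputs and to the hidden
    outputs, so column 0 of W_hidden and element 0 of w_out act as biases.
    """
    x_aug = np.concatenate(([1.0], x))            # x_0 = 1 carries the hidden bias
    hidden = f(W_hidden @ x_aug)                  # one activation per hidden neuron
    hidden_aug = np.concatenate(([1.0], hidden))  # leading 1 carries the output bias
    return float(w_out @ hidden_aug)

# Two inputs (for example normalized temperature and humidity), two hidden neurons.
x = np.array([0.5, -1.0])
W_hidden = np.array([[0.1, 0.8, -0.3],
                     [-0.2, 0.4, 0.9]])
w_out = np.array([0.05, 1.2, -0.7])
q = mlp_forward(x, W_hidden, w_out)
print(q)
```

Training consists of adjusting `W_hidden` and `w_out` so that q matches the measured irradiance; the algorithms named above differ only in how that adjustment is computed.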

6.2.4 Support Vector Machine

The support vector machine (SVM) is a form of machine learning based on statistical learning theory, introduced in 1995 by Cortes and Vapnik. The approach was first developed for pattern recognition and is now widely used for applications such as image retrieval, fault diagnosis, regression, and forecasting [16]. The time series is used to train a model that is as simple as a neural network model, and SVM is less prone to overfitting and to getting stuck in local minima [17]. Essentially, it uses a mapping function to map the input vector (x1, x2, x3, …, xn) to the output (y1, y2, y3, …, yn). The SVM equation can be represented as

$$ y=\sum \limits_{i=1}^n{\phi}_ik\left(x,{x}_i\right)+b $$
(9)

where y is the output function, k(x, xi) is the kernel function, ϕi are the model coefficients, and b is the bias. The basic architecture of SVM is shown in Fig. 2.

Fig. 2 Architecture of SVM

Jie Shi et al. used an SVM model to forecast solar power. The data were divided into four groups according to season, and each group was fed into one of four SVM models. The performance of the models was evaluated using RMSE and MAE, and all four outperformed the benchmark model [18].
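To make Eq. (9) concrete, the sketch below evaluates the SVM output for one input vector. It assumes a Gaussian (RBF) kernel, which is a common choice but is not specified in the text, and it uses hypothetical support vectors, coefficients, and bias in place of trained values.

```python
import numpy as np

def rbf_kernel(x, xi, gamma=0.5):
    """Gaussian kernel k(x, x_i) = exp(-gamma * ||x - x_i||^2); gamma is an assumed value."""
    diff = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def svm_predict(x, support_vectors, coeffs, bias, kernel=rbf_kernel):
    """Evaluate y = sum_i phi_i * k(x, x_i) + b  (Eq. 9)."""
    return sum(c * kernel(x, sv) for c, sv in zip(coeffs, support_vectors)) + bias

# Hypothetical model: two support vectors with coefficients phi_i and bias b.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [2.0, -1.0]
bias = 0.5
y_hat = svm_predict([0.0, 0.0], support_vectors, coeffs, bias)
print(y_hat)
```

In practice the support vectors, coefficients, and bias are obtained by training, for example with an off-the-shelf regression implementation such as `sklearn.svm.SVR`; only the prediction step is shown here.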

6.2.5 Markov Chain

A Markov chain is a stochastic process that is used to forecast wind and solar irradiance. The process depends only on neighboring states: the current state depends on the previous one, and the next state depends on the current one [19], as shown in Fig. 3.

Fig. 3 Markov chain process

The process is described by a sequence of random variables {yn, n = 0, 1, 2, …} taking values in a finite set of states. The process being in state i at time n is written as

$$ {y}_n=i $$

and the probability that the next state is j is Pij, i.e.

$$ P\left\{{y}_{n+1}=j\mid {y}_n=i,{y}_{n-1}={i}_{n-1},\dots ,{y}_1={i}_1,{y}_0={i}_0\right\}={P}_{ij} $$
(10)

This equation shows that the next state depends only on the present state.

To estimate the power of a photovoltaic device, Sanjari et al. developed a Markov chain model. The input parameters were radiant energy and relative humidity. The proposed model outperformed other approaches in terms of MAPE [20].
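The transition probabilities Pij of Eq. (10) can be estimated from a sequence of observed states by counting transitions. The following sketch is a generic first-order illustration, not the model of Sanjari et al.; the three irradiance classes and the state sequence are hypothetical.

```python
import numpy as np

def estimate_transition_matrix(states, n_states):
    """Count i -> j transitions and normalize each row to obtain P_ij."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(states[:-1], states[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that were never left stay at zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical discretized irradiance classes: 0 = low, 1 = medium, 2 = high.
states = [0, 0, 1, 2, 2, 2, 1, 0, 1, 2]
P = estimate_transition_matrix(states, n_states=3)

# Most likely next class given the last observed class.
next_state = int(np.argmax(P[states[-1]]))
print(P)
print(next_state)
```

Each row of `P` is the conditional distribution of the next class given the current one, so a probabilistic forecast is available in addition to the single most likely class.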

7 Empirical Model

Empirical modeling is a generic term for activities that create models from observation and experiment. Samani and Hargreaves presented the first empirical model in 1982. Since then, a number of models have evolved by varying factors such as altitude, latitude, angular position, tilt angle, air particle dispersion, water vapor content, hours of sunlight, maximum temperature, minimum temperature, and cloud cover index. An empirical model is a mathematical technique that forecasts solar irradiance by establishing a linear or nonlinear relationship between climatological and solar variables [21]. Nadjem Bailek et al. addressed mathematical models for accurately estimating diffuse solar radiation. The developed models were divided into three categories based on sunshine period and clearness index. The performance of all three categories was evaluated using MAPE, RMSE, and U95 (uncertainty factor) and compared with eight models from the literature [22].

8 Deep Learning

The term deep learning was introduced in 1986 by Rina Dechter, and deep learning has become very prevalent in recent years. Also called deep structured learning, it is a branch of machine learning, which in turn is a subset of artificial intelligence. Artificial intelligence is a technique for enabling a machine to act like a person, whereas machine learning is a technique for achieving artificial intelligence through algorithms trained with data, as shown in Fig. 4. Deep learning is a set of statistical machine learning techniques, often based on ANNs, that learn feature hierarchies. Learning can be supervised, unsupervised, or semi-supervised. Deep learning algorithms such as CNN, RNN, and DBN are applied in computer vision, image processing, audio recognition, speech recognition, and other areas [23].

Fig. 4 Deep learning is a subset of machine learning

Deep learning is a modern extension of machine learning. Structured and unstructured data are available in many forms from every region of the world. Structured data can be analyzed easily, while unstructured data could take decades to yield relevant information by manual analysis. Deep learning is used to deal with very large amounts of data, known as big data, drawn from sources such as social media, online platforms (for example e-commerce), and internet search engines. This abundant data is readily accessible and can be shared through fintech applications such as mobile payment applications.

Wang et al. presented a forecasting model using deep learning techniques and applied a pre-processing technique to improve its performance [1]. Melit et al. conducted a review of machine learning techniques. According to the authors, deep learning techniques combined with numerical weather forecasts and feature extraction are used for long-term photovoltaic power forecasting, and convolutional and recurrent neural networks performed better at capturing time dependence in the forecast [24].

9 Hybrid Method

Hybrid models are used to enhance the precision of forecasts. An individual model does not consider many of the factors needed to forecast accurately. The hybrid approach integrates two or more methods to produce the forecast. Various data decomposition techniques are used together with forecasting models to increase their accuracy [25] (Table 3).

Table 3 Study of solar forecasting techniques

10 Factors Influencing Solar Radiation Forecasting

Several other factors affect the accuracy of model forecasts directly or indirectly. Solar forecasting depends on the forecast horizon, geographical conditions, day/night values and normalization, the testing period, climatic variability, and the pre-processing technique.

10.1 Input Parameter Selection

Measured solar radiation is an important input to solar radiation forecasting, but it is unavailable in many places because of the cost, upkeep, and calibration of measuring devices. Other input parameters are therefore needed to estimate solar radiation. These may include temperature, pressure, humidity, solar zenith angle, precipitation, latitude, longitude, wind direction, wind speed, and sunshine duration [48,49,50].

10.2 Forecast Horizon

The forecast horizon is the span of time over which the model is used for prediction. It can range from a few seconds to many hours. According to the literature, there are four types of time horizon: very short-term, short-term, mid-term, and long-term forecasting [48,49,50,51].

10.3 Climatic Variability

The variables in the input data may be systemic, endogenous, or exogenous. Different models behave differently for different combinations of input parameters. A model’s efficiency suffers as the number of insignificant meteorological variables increases, so the necessary parameters must be selected to improve it. To predict solar radiation, M.A. Behrang et al. proposed two neural network models based on various combinations of climatological variables [52].

10.4 Night Hour and Normalization

Solar irradiance is not available during the night hours, although energy providers require continuous supply at all times. Most tests are carried out on daytime data and omit the night-time hours. To avoid the effects of inaccurate readings, the periods just after sunrise and just before sunset are also excluded from the data [53].

10.5 Preprocessing Techniques

The quality of the input data plays a crucial role in improving a forecasting model. Data collected from various sites is mostly available in raw form and does not have the characteristics needed to deliver adequate accuracy. The data therefore has to be processed before it is passed to the model; this is called the preprocessing stage. Preprocessing means scaling the input measurements up or down and cleaning and defining the input data according to the specifications. A number of preprocessing techniques are available in the literature, such as wavelet transforms, the Kalman filter, empirical mode decomposition, self-organizing maps, normalization, and trend-free time series, all applied before model learning [54].
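Two of the simplest preprocessing steps mentioned in this section, removing night-time and low-sun samples and normalizing the remaining values, can be sketched as follows. The 85° zenith cutoff, the function names, and the sample data are assumptions made for illustration; published studies use a range of cutoffs and scaling schemes.

```python
import numpy as np

def drop_night_samples(ghi, zenith_deg, max_zenith=85.0):
    """Keep samples whose solar zenith angle is below the cutoff.

    This removes night hours and the periods just after sunrise and just
    before sunset; the 85-degree cutoff is an assumed value.
    """
    ghi = np.asarray(ghi, dtype=float)
    mask = np.asarray(zenith_deg, dtype=float) < max_zenith
    return ghi[mask]

def min_max_scale(values):
    """Scale values linearly to the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        return np.zeros_like(values)  # constant series: nothing to scale
    return (values - lo) / (hi - lo)

# Hypothetical samples across one day: irradiance in W/m^2 and zenith angle in degrees.
ghi = [0.0, 50.0, 400.0, 800.0, 600.0, 20.0, 0.0]
zenith = [100.0, 88.0, 60.0, 30.0, 45.0, 87.0, 105.0]

daytime = drop_night_samples(ghi, zenith)
scaled = min_max_scale(daytime)
print(daytime, scaled)
```

The same minimum and maximum obtained from the training data should be reused when scaling test data, so that the model sees inputs on a consistent scale.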

10.6 Training and Testing Period

The training and testing period is also one of the factors that affect the accuracy of the model. Various studies have shown that a larger training data set enhances learning capacity and improves accuracy. B. Sivaneasan et al. used a 4-month data set to train the model and a 1-month data set to test it [55], whereas Mohammed Bou-Rabee et al. used 3 years of data for training and 1 year for testing [36].

10.7 Geographical Location

The behavior of the model varies with geographical location. Model performance is directly affected by the climatic conditions of the site. For example, a model may perform better for Leh, India, a cold desert that receives an enormous amount of solar radiation, than for an area where the sky is cloudy most of the time [14].

11 Solar Forecasting Evaluation Metrics

Researchers have used various evaluation metrics to assess solar irradiation forecasts. The aim of an evaluation metric is to compare the actual observed value with the forecasted value. Different performance metrics have different units; for example, the statistical error of solar radiation is measured in W/m2, whereas power is measured in kW or MW. Forecast evaluation provides a forecaster with:

  • The ability to select the correct forecasting model, so that the maximum prediction accuracy can be achieved in comparison with others.

  • The ability to analyze forecasting error and use it to improve the performance of the forecasting model.

Forecasting model accuracy is the primary concern for the forecaster, and it can be evaluated using the following conventional statistical assessment metrics:

  • Normalized Error:

    It is denoted by nE and is used to identify outliers in a data set. Mathematically it is represented as [26]

    $$ nE=\frac{R_{prediction}-{R}_{real}}{\max \left({R}_{prediction}\right)} $$
    (11)
  • Mean Bias Error (MBE):

    This metric is used to measure the average bias of the system or model [56].

    $$ MBE=\frac{1}{n}\sum \limits_{i=1}^n\left({R}_{prediction,i}-{R}_{real,i}\right) $$
    (12)

    A positive MBE indicates that the model overestimates, whereas a negative value indicates underestimation.

  • Mean Absolute Error (MAE):

    It measures the average magnitude of the difference between the two data sets, weighting all errors equally [57].

    $$ MAE=\frac{1}{n}\sum \limits_{i=1}^n\left|{R}_{prediction,i}-{R}_{real,i}\right| $$
    (13)
  • Standard Deviation Error (SDE):

    This metric is used to assess the deviation of the error from its average [42].

    $$ SDE=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({R}_{prediction,i}-{R}_{real,i}- MBE\right)}^2} $$
    (14)
  • Root Mean Square Error (RMSE):

    It measures the overall magnitude of the forecast error and, because the errors are squared, gives more weight to large errors [58].

    $$ RMSE=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({R}_{prediction,i}-{R}_{real,i}\right)}^2} $$
    (15)
  • Mean Absolute Percentage Error (MAPE):

    It is a metric for uniform forecasting error expressed as a percentage [56]

    $$ MAPE=\frac{100}{n}\sum \limits_{i=1}^n\left|\frac{R_{prediction,i}-{R}_{real,i}}{R_{real,i}}\right| $$
    (16)
  • Median Absolute Percentage Error (MdAPE):

    Outliers have less of an effect on this metric than they do on the MAPE [59].

    $$ MdAPE=\operatorname{median}\left(\left|100\times \frac{R_{prediction}-{R}_{real}}{R_{real}}\right|\right) $$
    (17)
  • Relative root mean square error (rRMSE):

    It expresses the RMSE as a percentage of the mean observed value, which allows data sets of different magnitude to be compared [59]

    $$ rRMSE=\frac{RMSE}{{\overline{R}}_{real}}\times 100 $$
    (18)
  • Correlation Coefficient

    This metric represents the strength of the linear relationship between two data sets. The higher the correlation coefficient, the better the forecasting model; the optimal value is 1 [60]

    $$ \rho =\frac{\operatorname{Cov}\left({R}_{real},{R}_{prediction}\right)}{\sqrt{\operatorname{Var}\left({R}_{real}\right)\operatorname{Var}\left({R}_{prediction}\right)}} $$
    (19)

    where Rreal represents the real radiation value and Rprediction represents the predicted radiation value.

  • Determination Coefficient

    It indicates how much of the variance of the actual values is explained by the predictions, and it is denoted by R2 [61]

    $$ {R}^2=1-\frac{\operatorname{var}\left({R}_{real}-{R}_{prediction}\right)}{\operatorname{var}\left({R}_{real}\right)} $$
    (20)
  • Clear Sky Index

    It is defined as the ratio of the measured radiation to the clear-sky radiation

    $$ {K}_t=\frac{R_{real}}{R_{real(CKS)}} $$
    (21)
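Several of the conventional metrics listed above can be computed together from paired arrays of predicted and observed values. The sketch below is a minimal illustration with hypothetical numbers. It assumes that MAPE and MdAPE are scaled by 100 so that they read as percentages and that rRMSE is normalized by the mean observed value; observed values must be non-zero for the percentage metrics, which is one reason night hours are usually removed first.

```python
import numpy as np

def forecast_metrics(pred, real):
    """Return MBE, MAE, SDE, RMSE, MAPE, MdAPE and rRMSE for paired samples."""
    pred = np.asarray(pred, dtype=float)
    real = np.asarray(real, dtype=float)
    err = pred - real                        # R_prediction - R_real
    mbe = err.mean()
    rmse = np.sqrt((err ** 2).mean())
    ape = 100.0 * np.abs(err / real)         # absolute percentage error per sample
    return {
        "MBE": float(mbe),
        "MAE": float(np.abs(err).mean()),
        "SDE": float(np.sqrt(((err - mbe) ** 2).mean())),
        "RMSE": float(rmse),
        "MAPE": float(ape.mean()),
        "MdAPE": float(np.median(ape)),
        "rRMSE": float(100.0 * rmse / real.mean()),  # assumption: mean of observations
    }

# Hypothetical observed and predicted irradiance values in W/m^2.
real = [100.0, 200.0, 400.0, 500.0]
pred = [110.0, 190.0, 420.0, 480.0]
metrics = forecast_metrics(pred, real)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example the positive and negative errors cancel, so the MBE is zero even though MAE and RMSE are not, which shows why bias and magnitude metrics are reported together.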

11.1 Contemporary Statistical Metrics

MAPE, MAE, and RMSE cannot distinguish between two data sets that have the same mean and standard deviation but different skewness and kurtosis. Traditional metrics are still required to evaluate a system, but properties such as skewness and kurtosis may affect real-time operation.

  • Kolmogorov-Smirnov test integral (KSI) and OVER metrics

    The Kolmogorov-Smirnov test is used to determine whether two data sets differ significantly. The distance between their two cumulative distribution functions (CDFs) is [62]

    $$ D=\max \left|{F}_{(ni)}-{\hat{F}}_{(ni)}\right| $$
    (22)

    where F is the CDF of the actual solar power generation data set and \( \hat{F} \) is the CDF of the predicted solar power generation data set. If the D statistic is smaller than the critical value Vc, the two samples cannot be distinguished statistically. The critical value depends on the number of points N in the data series and, at a confidence level of 99%, is given by [62]

    $$ {V}_c=\frac{1.63}{\sqrt{N}},\kern1em N\ge 35 $$
    (23)

    The distance between the CDFs of the real and forecasted power is computed for each interval:

    $$ {D}_j=\max \left|{F}_{ni}-{\hat{F}}_{ni}\right|,\kern1em j=1,2,3,\dots ,m $$
    (24)

    where Pi ∈ [Pmin + (j − 1)d, Pmin + jd].

    The interval width d is calculated as follows:

    $$ d=\frac{P_{\mathrm{max}}-{P}_{\mathrm{min}}}{m} $$
    (25)

    The KSI is the integrated difference between the two CDFs:

    $$ KSI=\int_{x_{\mathrm{min}}}^{x_{\mathrm{max}}}{D}_n\, dx $$
    (26)

    where Dn is the difference between the two CDFs. A lower KSI indicates closer agreement between the actual and predicted distributions, and a KSI of zero means the two CDFs are identical [62].

  • OVER

    It characterizes the difference between the CDFs of the real and predicted solar values, counting only the part that exceeds the critical value [59].

    $$ OVER=\int_{x_{\mathrm{min}}}^{x_{\mathrm{max}}}T\, dx $$
    (27)

    where xmin and xmax are the minimum and maximum radiation values and T is defined as

    $$ T=\begin{cases}{D}_j-{V}_c& \mathrm{if}\ {D}_j>{V}_c\\ {}0& \mathrm{if}\ {D}_j\le {V}_c\end{cases} $$

    Vc is the critical value from Eq. (23).

  • Skewness and Kurtosis

    Skewness measures the asymmetry of a probability distribution [55].

    $$ \gamma =E\left[{\left(\frac{e-{\mu}_e}{\sigma_e}\right)}^3\right] $$
    (28)

    where e is the difference between the forecast and real solar power, μe is the mean error, and σe is the standard deviation of the error.

    Kurtosis: it is a metric for assessing how peaked the error distribution is and how heavy its tails are.

    $$ K=\frac{\mu_4}{{\sigma_e}^4}-3 $$
    (29)

    K is the kurtosis, μ4 is the fourth central moment of the error, and σe is the standard deviation of the error.

  • Uncertainty Quantification

    Rényi entropy of solar forecast error: the Rényi entropy is used to quantify the degree of uncertainty in solar prediction and is expressed as [63, 64]

    $$ {H}_a(x)=\frac{1}{1-a}{\log}_2\sum \limits_{i=1}^n{P_i}^a $$
    (30)

    where a is the order of the Rényi entropy and Pi is the probability of the i-th outcome of the forecast error distribution. A larger Rényi entropy indicates more uncertainty in the forecast.

  • Metrics for Ramp Characterization

    The main priority of grid operators is to maintain a steady solar power output, because weather variability causes many fluctuations in solar output. Solar ramps, which may be up-ramps or down-ramps, are also influenced by time and geographic factors. Accurate solar forecasting helps to reduce these uncertainties [65].

    For ramp characterization, Florita et al. developed a signal compression algorithm that extracts ramp intervals as a sequence of power cycles by specifying the beginning and end point of each ramp [66].

  • Ramp Detection Index (RDI)

    This metric measures the ability of a model to predict ramps over a short time frame [67].

    $$ RDI=\frac{N_{hit}}{N_{hit}+{N}_{miss}} $$
    (31)

    where Nhit is the number of ramps that are correctly predicted (hits) and Nhit + Nmiss is the total number of ramps that occur.

  • Ramp Magnitude (RM)

    It measures the difference between the radiation value at the current time and after a short time interval, relative to the clear-sky radiation value at the current time. Chu et al. used ramp magnitude to explore the ability of models to predict ramps [68]

    $$ RM=\frac{R_h(t)-{R}_h\left(t+\Delta t\right)}{R_{csk}(t)} $$
    (32)
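The distribution-based metrics of this section can be sketched with empirical CDFs and sample moments. The code below is a minimal illustration under stated assumptions: the KSI is approximated by trapezoidal integration of the absolute CDF difference on a uniform grid spanning both samples, skewness and kurtosis use the usual third and fourth standardized moments, and the Rényi entropy is computed from a histogram of the errors with an assumed number of bins. All function names, parameters, and sample values are hypothetical.

```python
import numpy as np

def empirical_cdf(sample, grid):
    """Fraction of sample values less than or equal to each grid point."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size

def ksi(real, pred, m=100):
    """Integral of |F - F_hat| between the smallest and largest value of both samples."""
    real = np.asarray(real, dtype=float)
    pred = np.asarray(pred, dtype=float)
    grid = np.linspace(min(real.min(), pred.min()), max(real.max(), pred.max()), m + 1)
    d = np.abs(empirical_cdf(real, grid) - empirical_cdf(pred, grid))
    return float(np.sum((d[1:] + d[:-1]) / 2.0 * np.diff(grid)))  # trapezoidal rule

def skewness(err):
    """Third standardized moment of the forecast error."""
    err = np.asarray(err, dtype=float)
    z = (err - err.mean()) / err.std()
    return float(np.mean(z ** 3))

def excess_kurtosis(err):
    """Fourth standardized moment of the forecast error minus 3."""
    err = np.asarray(err, dtype=float)
    z = (err - err.mean()) / err.std()
    return float(np.mean(z ** 4) - 3.0)

def renyi_entropy(err, a=2.0, bins=10):
    """Renyi entropy of order a (a != 1) from a histogram of the errors."""
    counts, _ = np.histogram(err, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(np.log2(np.sum(p ** a)) / (1.0 - a))

# Hypothetical observed and predicted power values.
real = np.array([120.0, 340.0, 560.0, 610.0, 450.0, 230.0])
pred = np.array([150.0, 310.0, 590.0, 580.0, 470.0, 260.0])
err = pred - real
print(ksi(real, pred), skewness(err), excess_kurtosis(err), renyi_entropy(err, bins=4))
```

The grid resolution `m` and the number of histogram bins both influence the numerical values, so they should be reported alongside the metrics when models are compared.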

12 Conclusion

This study reviews several statistical, physical, and ensemble methods. Among the physical methods, NWP models, satellite-based models, and cloud imagery are examined. These models are used for forecasting horizons ranging from a few hours to several days and are well suited to situations where no other information is available. The main drawback of the physical approach is its limited spatial and temporal resolution. Among the statistical methods, various time series and learning models are discussed. In the time series method, observations are measured over time; the AR, MA, ARMA, and ARIMA models belong to this group. Learning models include the Markov chain, the artificial neural network, and the support vector machine, which provide excellent information about solar irradiance when enough historical data is available. Hybrid methods are now used to overcome the shortcomings of individual models, and they also reduce the forecasting error. Various error metrics for evaluating the performance of prediction models are discussed. Assessing the solar prediction error helps in understanding a model and in re-evaluating it when the error is high.