Keywords

1 Introduction

Access to real-time assessments of the state of the economy as well as to forecasts of its expected evolution is essential in the decision-making process, either for policymakers or businesspeople. Up-to-date macroeconomic projections are critical inputs for designing and adjusting economic policy. This becomes even more important in non-stable and challenging economic environments, such as those faced with COVID-19 pandemic and the Russo-Ukrainian War.

However, data offered by the System of National Accounts is delivered with considerable delay. In the case of European countries, Eurostat provides the value of EU and Euro Area GDP 70 days after the end of each quarter, preceded by a first preliminary estimate and a second estimate 30 and 45 days after the end of the quarter, respectively. This delay, as pointed out by [8], is the consequence of the difficulties in producing timely and accurate low-frequency aggregates: (i) not all disaggregated data are available when needed to compute a relevant aggregate; (ii) many disaggregated time series are only preliminary estimates, subject to substantial revisions, so they are not accurate descriptions of the current conditions.

Nevertheless, a considerable number of short-term economic indicators available at a much earlier stage could be used to capture information about the state of the economy. For instance, we have access to monthly data from consumer surveys or the industrial production index, daily data from financial markets and even more frequently observed variables, such as Google or Twitter trends and, sometimes, phone mobility data. Although likely not as complete as data from the System of National Accounts, these high-frequency (HF) indicators can improve the prediction of relevant economic aggregates, such as GDP growth, inflation or unemployment (see, e.g., [33, 35]).

Classical models usually require using data observed at the same frequency, representing a setback when working with a more complete dataset integrated by variables observed at different frequencies. The way to extract information from the available high-frequency indicators is not always a simple task; there are several methodologies, with different levels of complexity, to do it. Various classes of models have been proposed to work explicitly with mixed-frequency datasets, some of the most widely used being the so-called MIDAS (MIxed DAta Sampling) models. These models have attracted considerable attention recently, even being adopted by many official institutions.

Original MIDAS models [22, 23, 25] were defined in terms of a Distributed Lag (DL) polynomial, explicitly modeling the relationship between variables observed at different frequencies. In order to keep parsimony, standard MIDAS models are built in terms of a few parameters. In this chapter, to differentiate them from other classes of MIDAS models, we will name this standard MIDAS model as ADL-MIDAS (Autoregressive Distributed Lag-MIxed DAta Sampling). This model has been used to nowcast GDP, private consumption and corporate bond spreads, among other variables (see [10, 11, 14, 26, 36]).

A specific variation of the previous standard model, known as Unrestricted MIDAS (U-MIDAS), was first introduced by [30] and later deeply analyzed by [17]. Based on a set of simulation exercises, these authors state that U-MIDAS nowcasting precision outperforms the standard MIDAS when the difference in sampling frequencies is not large, specially for monthly to quarterly frequencies, as it is usually the case of macroeconomic nowcasting.

A more recent variation of the original MIDAS model called TF-MIDAS, standing for Transfer Function MIDAS, is introduced by [4]. The authors demonstrate that TF-MIDAS is a general version of U-MIDAS and show that this model beats the latter in terms of out-of-sample nowcasting performance for several HF data generating processes in a set of simulation exercises.

A considerable number of studies compare the different variants of MIDAS models in terms of their nowcasting and forecasting accuracy when projecting macroeconomic variables.

A first group of papers focuses on the nowcasting and predictive performance of the original ADL-MIDAS model in comparison with a set of alternative non-MIDAS models. Specifically, [37] compares ADL-MIDAS model to bridge equations and an AR model through an empirical exercise, in which Euro Area GDP growth is nowcast for the period 2010Q1–2014Q4. The author finds that MIDAS models tend to outperform bridge equations for most predictors, but only for a few indicators do these models beat a simple AR model.

Jansen et al. [28] consider Euro Area and its five largest economies (Germany, France, Italy, Spain and the Netherlands) to test the predictive capacity of a wide range of models, including random walk, AR model, Bayesian VAR model (BVAR), bridge equation, dynamic factor model, MF-VAR and ADL-MIDAS. They analyze the predictive accuracy of these models in two disjoint evaluation samples: 1996Q1–2007Q4 and 2008Q1–2011Q3, which allows them to consider a stable and a volatile period. They conclude that MF-VAR and MIDAS models yield better predictions after the financial crisis, but this does not occur in stable times.

More recently [6] employ data from six developed countries (US, UK, Japan, France, Germany, Italy) and the Euro Area to obtain empirical evidence on the predictive performance of five classes of models (AR, Factor Augmented DL, MIDAS, BVAR, and DSGE model). They consider a general evaluation sample that goes from 1993Q1 up to 2011Q3 and also split that sample into 5-year windows for most of the considered countries. The conclusions are that MIDAS models work better at 1-period ahead horizons. Nevertheless, they also show t-statistics with large spreads, meaning that they work well for the median country but poorly for some individual countries. For 4-period ahead forecasts, BVAR clearly performs better.

Other papers comparing ADL-MIDAS with other forecasting models are [32], for the Euro Area with an evaluation sample 1999Q2–2008Q1, [15], again for the Euro Area with evaluation sample 2003Q1–2009Q1, [10], for US on 1985Q2–2005Q1, [9], for Canada on 2002Q1–2016Q2, [38], for Singapore on 2001Q1–2010Q4, [18], for Switzerland on 2005Q1–2015Q2, or [12], for Turkey on 2010Q2–2015Q1.

A second and less numerous group of papers compares the nowcasting and forecasting performance of U-MIDAS model with alternative non-MIDAS models. For example [2], consider U-MIDAS model versus the classical bridge equation model to build a daily indicator of growth for the Euro Area. The results show that forecasts obtained from U-MIDAS considering different indicators present a higher forecasting accuracy when they are combined with inverse Mean Square Error (MSE) weights.

In this sense, [31] compare the predictive performance of U-MIDAS versus Dynamic mixed-frequency Factor Model (DFM) for Baden Württemberg’s regional GDP growth. The evaluation sample, in this case, spans from 2012Q1 to 2019Q3. The paper finds MIDAS-based predictions to be more robust and to outperform DFM slightly.

Last, a third set of papers is formed by those simultaneously comparing ADL-MIDAS and U-MIDAS predictive precision against other non-MIDAS models. For instance, [13] analyses the nowcasting and short-term forecasting power of U-MIDAS and ADL-MIDAS against an AR and bridge equations benchmark models for the Euro Area in the evaluation sample 2007Q1–2012Q4. The results show that MIDAS models contribute to increasing predictive capacity. Additionally, differences in forecasting precision between ADL-MIDAS and U-MIDAS tend to vanish as the forecasting horizon increases.

Similarly, [17] compare U-MIDAS with the traditional ADL-MIDAS and an AR benchmark model in terms of their power to nowcast and short-term forecast US and Euro Area GDP growth. They use two evaluation samples for US (1985Q1–2006Q4 and 1985Q1–2011Q1) and one for the Euro Area (2003Q1–2010Q4). Similar results are observed for the two regions. Neither U-MIDAS nor ADL-MIDAS have a significantly superior performance during the more stable pre-crisis period, even failing against the AR benchmark. However, ADL-MIDAS and U-MIDAS significantly outperform the AR model for the crisis sample.

Additional papers simultaneously comparing ADL-MIDAS and U-MIDAS with other forecasting models are [29], employing Korean data with evaluation data 2000Q1–2013Q4, and [34], for the Philippines on 1999Q1–2019Q4.

In brief, researchers have yet to reach a consensus on which model, if any, presents the best performance at predicting macroeconomic variables. As this literature review suggests, it seems that MIDAS models (either, ADL- or U- MIDAS) have better results at nowcasting and short-term forecasting than most of the alternatives, although results may depend on the sample, country and HF indicator applied. Additionally, differences between MIDAS models’ accuracy still need to be clarified. For this reason, we run a comparative exercise of forecasting performance of MIDAS models in which a new MIDAS-class model, not yet considered by the literature, is added.

Therefore, this chapter assesses the empirical nowcasting and forecasting performance of the three variants of MIDAS models: ADL-MIDAS, TF-MIDAS and U-MIDAS. With this aim, we use data from the four major Euro Area economies, France, Germany, Italy and Spain, covering the period 1995Q1–2020Q3, analyze the out-of-sample forecast for the evaluation sample 2011Q4–2020Q3, and consider several sub-periods with different levels of uncertainty. When the forecasting errors obtained in the whole sample are observed, we find a slightly higher accuracy, ranged between 2.4 and \(4.1\%\) in terms of the root mean squared forecast error (RMSFE), of TF-MIDAS models. However, when these errors are analyzed in a meta-regression, where we include model, country, indicator, horizon and sample dummies, we do not find a statistically significant difference in the predictive performance of the models under comparison, nor in terms of RMSFE or the mean absolute forecast error (MAFE). Part of the content presented in this paper and further conclusions drawn from the country, indicator, horizon and sample effects (not included here due to the lack of space) can be found in the unpublished Chap. 4 of the Ph.D. Thesis [3].

The chapter is organized in five sections, including the present introduction. In Sect. 2, compared MIDAS models are briefly reviewed, describing the main differences among the three variants considered. Section 3 details the empirical forecasting performance evaluation exercise and presents the first results of the relative out-of-sample nowcasting performance. In Sect. 4, we estimate a meta-regression to obtain the individual effects of several variables on two accuracy forecasting measures. Finally, Sect. 5 summarizes the main results and concludes.

2 MIDAS Models Under Comparison

This section reviews the main theoretical features of the mixed-frequency models compared in the chapter: ADL-MIDAS, U-MIDAS and TF-MIDAS. Additionally, we briefly discuss the identification, estimation and how the nowcasts and forecasts have been computed.

Throughout the chapter we follow the notation used in [4], which is summarized in the following lines. A high-frequency (HF) indicator is denoted by letter x. Let t be the time index for this variable x, \(t=1,...,T\) (i.e., in this chapter, months), being T the last period for which data of variable x is available. L denotes the lag operator for this HF indicator. If \(x_{t}\) is the monthly industrial production index, then \(Lx_{t}\) will be the previous month’s value of the index.

Similarly, let y be the low-frequency (LF) variable that is aimed to be nowcast, sampled at periods denoted by time index \(t_{q}=1,...,T_{q}\) (i.e., in this chapter, quarters), being \(T_{q}\) the last period for which data of variable y is available. Usually, \(T\ge kT_{q}\), as observations of HF indicators are available earlier than LF ones. Past realizations of the LF variable will be denoted by the lag operator Z, where \(Z \equiv L^{k}\). So, if \(y_{t_{q}}\) is quarterly GDP, then \(Zy_{t_{q}}\) will be the previous quarter’s GDP value.

The HF indicator x is sampled k times between samples of y. For example, for quarterly GDP and monthly indicator, \(k=3\).

Finally, both the target variable y and the indicator x are assumed to be stationary, so these variables often correspond to a (log) differenced version of some raw series.

2.1 ADL-MIDAS Model

The original MIDAS (MIxed DAta Sampling) model was introduced by [22, 23, 25]. From now on, we will refer to it as ADL-MIDAS (or simply ADL-M) to emphasize the differences with other variants. In the ADL-M model, the response of the LF variable to an HF explicative variable is modeled through a Distributed Lag polynomial, but particular attention is paid to parsimony. To avoid the so-called “parameter proliferation” problem, lag coefficients are not free, but are defined as a function of a vector of a few parameters, \(\boldsymbol{\theta }\), known as hyperparameters.

Andreou et al. [1] extend the DL specification of MIDAS model introducing an autoregressive term. In the case of only one HF indicator and only one autoregressive LF term, which is the most widely used form in practice, ADL-M equation for h-steps ahead nowcast can be written as

$$\begin{aligned} y_t \ = \ \beta _{0} \ + \ \lambda \ y_{t - p_y k} \ + \ \beta \sum _{j=1}^{K_{\max }}b(j;\boldsymbol{\theta }) \ x_{t-j-h+1} \ + \ \epsilon _t \quad \qquad t= k, 2k, ..., Tk, \end{aligned}$$
(1)

where \(K_{\max }\) is the maximum number of lags of the HF variable included in the model and \(p_y\) is the lag of the autoregressive term, which is a function of the nowcasting/forecasting horizon (i.e., \(p_{y}=s \in \mathbb {N}\), such that h satisfies the condition \((s-1)k \le h \le sk\)). The process \(\epsilon _t\) is assumed to be a white noise in weak sense.

Function \(b(j,\boldsymbol{\theta })\), a component of the lag polynomial, is used to model the weights assigned to each lag of the HF indicator. This function depends on the indicator’s period, j, and the vector of hyperparameters \(\boldsymbol{\theta }\). An overview of different weighting functions proposed in the literature is provided by [20], the most popular being Exponential Almon and Beta functions.Footnote 1 Several variations have been built upon the basic MIDAS model. A detailed summary of their main features can be found in [21].

Once ADL-M model is estimated by Non-Linear Least Squares (NLS), nowcasts and forecasts for \(y_{Tk + k}\) conditional on information available at period \(Tk + k - h\) (i.e., h-steps ahead nowcasts and forecasts) are calculated as

$$\begin{aligned} \hat{y}_{Tk+k \ | \ Tk+k-h} \ = & \ \hat{\beta }_{0} \ + \ \hat{\lambda } \ y_{Tk - (p_{y}-1)k} \\ \nonumber + & \ \hat{\beta } \sum _{j=1}^{K_{\max }}b(j;\hat{\boldsymbol{\theta }}) \ x_{Tk + k -j -h+1} \end{aligned}$$
(4)

where h is the forecasting horizon (notice that Eq. (4) generates nowcasting if \(0 \le h < k\) and forecasting if \(h \ge k)\) and \(p_{y}\) is the lag of the autoregressive term. In the forecasting performance exercise, the ADL-M predictions will be built from Eq. (4). We will consider several values for \(K_{\max }\), ranging from 1 to 24, in order to account for ADL-M models with different levels of parsimony.

2.2 U-MIDAS Model

Koenig et al. [30] introduced a variant of MIDAS model known as U-MIDAS (hereinafter U-M), which was later thoroughly studied by [16, 17]. U-M does not employ functional distributed lag polynomials to model the relationship between x and y, but a linear lag polynomial.

The idea behind this variant is that when the difference in sampling frequencies is not large the risk of falling into the curse of dimensionality becomes less relevant, and so it does the need to resort to functional DL polynomials.

Similarly to the previous model, U-M model with one HF indicator and one autoregressive term is defined by:

$$\begin{aligned} y_t \ = \beta _{0} \ + \ \lambda \ y_{t- p_y k} \ + \ \sum _{j=1}^{K_{\max }} \beta _{j} \ x_{t-j-h+1} \ + \ \epsilon _{t} \qquad \qquad t= k, 2k, ..., Tk, \end{aligned}$$
(5)

where, again, h is the forecasting horizon, \(K_{\max }\) is the maximum number of lags of the HF variable, and \(p_y\) is the lag of the autoregressive term.

Foroni et al. [17] state that basic ADL-M model can be considered nested in U-M specification because it is the result of imposing a particular dynamic pattern on it. An important computational advantage of U-M model over the basic MIDAS models is that it can be estimated by OLS, as long as lag orders \(p_y\) and \(K_{\max }\) are long enough to make the error term, \(\epsilon _t\), uncorrelated and so the weak white noise assumption can be validated.

Once U-M model is fitted by OLS, nowcasts and forecasts for \(y_{Tk + k}\) conditional on information available at period \(Tk + k - h\) (i.e., \(h-\)steps ahead nowcasts and forecasts), when considering only one autoregressive term, are computed as

$$\begin{aligned} \hat{y}_{Tk+k \ | \ Tk+k-h} \ = \ \hat{\beta }_{0} \ + \ \hat{\lambda } \ y_{TK-(p_y -1)k} \ + \ \sum _{j=1}^{K_{\max }} \hat{\beta }_{j} \ x_{t-j-h+1} \end{aligned}$$
(6)

Similarly to ADL-M, the U-M predictions will be built from Eq. (6), and values ranged from 1 to 24 will be considered for \(K_{\max }\) to apply models with different levels of parsimony.

2.3 TF-MIDAS Model

Recently, [4] have proposed another variant of the basic MIDAS model, called TF-MIDAS model (or simply TF-M), in which instead of a DL polynomial expression, an alternative representation based on a transfer function is applied.

General TF-M model with one HF indicator is easier defined using two equations:

  1. (1)

    the equation that models the relation between y and x:

    $$\begin{aligned} y_{t} \ = \ \beta _{0} \ + \ \sum _{j=1}^{k} \ \dfrac{a_{j}(Z)}{b_{j}(Z)} \ x_{t-j} \ + \ \eta _{t}, \text { and} \end{aligned}$$
    (7a)
  2. (2)

    the equation that models the noise:

    $$\begin{aligned} \phi (Z)\eta _t \ = \ \theta (Z) \epsilon _t, \qquad t= k, 2k, ..., Tk. \end{aligned}$$
    (7b)

In Eq. (7a, 7b), \(a_{j}(Z)\) and \(b_{j}(Z)\) are finite lag polynomials and \(Z \equiv L^k\), while \(\phi (Z)\) and \(\theta (Z)\) are, respectively, autoregressive and moving average polynomials of order p and q. Additionally, let \(\phi (Z)\) and \(\theta (Z)\) have all their zeros lying outside the unit circle and do not have common factors [see, e.g., 5], and \(\epsilon _t\) be a weak white noise process.

For monthly-quarterly data (i.e., \(k=3\)), one HF indicator and \(1-\)step ahead nowcast, the previous equations become

$$\begin{aligned} y_t \ = \ \beta _{0} \ + \ \dfrac{a_{1}(Z)}{b_{1}(Z)} \ x_{t-1} \ + \ \dfrac{a_{2}(Z)}{b_{2}(Z)} \ x_{t-2} \ + \ \dfrac{a_{3}(Z)}{b_{3}(Z)} \ x_{t-3} \ + \ \eta _{t} \end{aligned}$$
(8a)
$$\begin{aligned} \phi (Z)\eta _t \ = \ \theta (Z) \epsilon _t, \qquad t= k, 2k, ..., Tk, \end{aligned}$$
(8b)

where \(x_{t-1}\) is the second monthly observation of the current quarter, \(x_{t-2}\) is the first monthly observation of the current quarter, and \(x_{t-3}\) is the third monthly observation of the previous quarter.

TF-M model is estimated by exact ML. To do so, it is transformed into its equivalent state space formulation.Footnote 2 As ML convergence sometimes depends on the initial values of the parameters and TF-MIDAS usually has a considerable number of them, we suggest applying a procedure to get consistent estimates for those values prior to the ML estimation. Here we use the procedure by [19]. Then, the exact ML is computed using the standard Kalman filter equations for a state space model with stochastic inputs [see, 7] by iterating on the set of parameters. Obviously, the estimation through iterative methods may entail some drawbacks with respect to LS techniques, as computational cost and stability issues.

Once the TF-M model is estimated, nowcasts and forecasts for \(y_{Tk + k}\) conditional on information available at period \(Tk + k - h\) (i.e., \(h-\)steps ahead nowcasts and forecasts) are calculated as

$$\begin{aligned} \hat{y}_{Tk+k \ | \ Tk+k-h} =\,& \hat{\beta }_0 \ + \dfrac{\hat{a}_{1}(Z)}{\hat{b}_{1}(Z)} \ x_{Tk+k-h} \\ \nonumber \ +\,& \dfrac{\hat{a}_{2}(Z)}{\hat{b}_{2}(Z)} \ x_{Tk+k-h-1} \ + \dfrac{\hat{a}_{3}(Z)}{\hat{b}_{3}(Z)} \ x_{Tk+k-h-2} \\ \nonumber \ +\,& \dfrac{\hat{\theta }(Z)}{\hat{\phi }(Z)} \ \hat{\epsilon }_{Tk+k} \end{aligned}$$
(9)

where \(\hat{\phi }(Z)=1+\hat{\phi }_1Z + \hat{\phi }_2Z^2 + ... + \hat{\phi }_p Z^p\), \(\hat{\theta }(Z)=\hat{\theta }_1Z + \hat{\theta }_2Z^2 + ... + \hat{\theta }_qZ^q\), and \(\hat{a}_{j}(Z)\) and \(\hat{b}_{j}(Z)\), with \(j=1,2,3\), are finite lag polynomials, whose order will be specified by means of information criteria (see Sect. 3.2 and Table 3 for more detail).Footnote 3

3 Forecasting Performance Evaluation

This section first describes the data used in the forecasting evaluation exercise. Later, it details how the performance evaluation has been designed. Finally, a discussion of the unconditional distribution of the forecasting errors is also presented.

3.1 Data Description

We employ data from the four major economies of the Euro Area (France, Germany, Italy and Spain) in the period 1995Q1–2020Q3. In all cases, we have transformed GDP data to make it stationary, so our target variable is the quarterly change in seasonally adjusted log real GDP. The source of GDP data is Eurostat.Footnote 4

We consider a set of fifteen monthly-observed economic indicators for each GDP, whose description is reported in Table 1. Each indicator series is seasonally adjusted and, as GDP, transformed to induce stationarity. These data were also obtained from Eurostat.Footnote 5

Table 1 Description of the indicators

3.2 Evaluation Design

As [27] show evidence of changing predictive capacity over time, we decide to separate our out-of-sample GDPs forecasts in three disjoint periods of three years (12 quarterly forecasts) each. The dates are 2011Q4–2014Q3, 2014Q4–2017Q3 and 2017Q4–2020Q3. We choose these periods deliberately to analyze the behavior of the methods in three substantially different economic contexts: the European sovereign debt crisis in the first period, a recovery and more stable phase during the second period, and a third convulsive period struck by the COVID-19 pandemic. We will check if changes in the underlying structure of the economies and the exogenous shocks affect the methods’ relative forecasting performances.

In addition to the different periods, each prediction is calculated for four countries (Germany, France, Italy and Spain), seven forecast horizons, fifteen indicators and nine methods. Horizons have been chosen to investigate if nowcasting and forecasting affect the methods’ performance differently. We set the forecasting horizons to 0, 1, 2, 3, 6, 9 and 12.

On the other hand, Table 2 presents the acronyms and a short description of the models employed. The methods are divided into four ADL-MIDAS models, ADL-M\(_3\), ADL-M\(_6\), ADL-M\(_{12}\) and ADL-M\(_{24}\), four U-MIDAS models, U-M\(_3\), U-M\(_6\), U-M\(_{12}\) and U-M\(_{24}\) and the TF-M model.

Table 2 Model acronyms

Each ADL-M\(_{K_{\max }}\) and U-M\(_{K_{\max }}\) method considers a set of models that range from 1 to \(K_{\max }\) lags of the HF variable and one autoregressive term with lag \(p_y\). In the case of TF-M, a set of sixteen models are considered, see Table 3, covering different orders of lag polynomials \(a_{j}(Z)\), \(b_{j}(Z)\), \(\theta (Z)\) and \(\phi (Z)\). Every specification for ADL-M\(_{K_{\max }}\), U-M\(_{K_{\max }}\) and TF-M is then fitted. We choose one specification for each method with the in-sample lowest BIC information criterion for each new observation. These chosen models are then used to compute the corresponding predictions.Footnote 6

Table 3 Polynomials structure of the TF-M model (8a8b)

Finally, each sub-period analyzed is made up of twelve out-of-sample forecasts. Every prediction is computed using a recursive (expanding) forecasting scheme, i.e., we use all observations available from the beginning of the sample up to the forecasting origin in both the identification and estimation process.

In order to check the robustness of the results obtained, two measures of point forecasting performance are used: (1) the root mean squared forecast error (RMSFE), and (2) the mean absolute forecast error (MAFE). Each of these measures is calculated with the previous twelve observations.

We use E4 [7] and Midas [20] MatLab Toolboxes to perform the estimation and prediction processes.Footnote 7

3.3 Unconditional Distribution of the Forecasting Errors

This section investigates the relative performance of the considered MIDAS methods in terms of their nowcasting and forecasting accuracy according to the two measures, RMSFE and MAFE. The results obtained for RMSFE are summarized in Fig. 1.Footnote 8

According to this figure, TF-M presents the lowest average RMSFE (in white in Fig. 1), although the discrepancy with respect to the other methods is not very large. The percentage difference in terms of average RMSFE between the best performing method (TF-M) and the second best (U-M\(_{24}\)) is \(2.4\%\), and with the worse one (ADL-M\(_{24}\)) is \(4.1\%\). Moreover, the RMSFE distributions displayed by the violin and box plots look really similar compared to the differences observed by countries, indicators and horizons.Footnote 9 Similar results are obtained for MAFE measure.

Fig. 1
figure 1

Violin and box plots for RMSFE by nowcasting/forecasting method. Values displayed in white inside the boxes are the average RMSFE

4 Forecasting Performance: A Meta-Regression Analysis

As mentioned, when looking at the mean of the forecasting performance measures by methods in the previous section, we do not account for the potential effect of the rest of the variables. In this section, we address this matter through a meta-regression.

4.1 Description of the Meta-Regression Analysis

We aim to study how the forecasting performance measure (RMSFE or MAFE) varies with the method applied, the source country, the HF indicator, the horizon and the specific sample considered. For that, we will run a meta-regression with the whole sample, containing 11,340 observations.

The main feature we consider is the model applied. To analyze its influence on the forecasting performance metric, we use eight dummy variables, each corresponding to a specific ADL-M and U-M model. The benchmark model is thus TF-M.

The impact of the country origin of the dataset is evaluated by including three dummy variables, choosing Germany as the baseline country.

The third feature whose impact on the forecasting performance is assessed is the HF indicator used, so we create fourteen dummy variables, one for each indicator, being IPI the baseline indicator.

We also include six dummy variables corresponding to horizons 1, 2, 3, 6, 9 and 12. The effect of horizon 0 is captured by the constant.

Finally, we also want to study the effect of the specific sample chosen on the forecasting performance measure. This way, we evaluate how stable or uncertain periods affect the nowcasting or forecasting performance metrics. So, we create two dummy variables: sample1 for the period 2011Q4–2014Q3 and sample2 for 2014Q4–2017Q3, being 2017Q4–2020Q3 the benchmark period.

From these definitions the meta-regression equation is

$$\begin{aligned} AM_i =\,& \beta _0 \ + \ \sum _{j=1}^{8} \beta _{j}^{M} D^{M}_{ij} \ + \ \sum _{j=1}^{3} \beta _{j}^{C} D^{C}_{ij} \\ \nonumber \ + & \ \sum _{j=1}^{14} \beta _{j}^{I} D^{I}_{ij} \ + \ \sum _{j=1}^{6} \beta _{j}^{H} D^{H}_{ij} \\ \nonumber \ + & \ \sum _{j=1}^{2} \beta _{j}^{S} D^{S}_{ij} \ + \ \epsilon _i, \end{aligned}$$
(10)

where \(D^{M}_{i},D^{C}_{i},D^{I}_{i},D^{H}_{i}\) and \(D^{S}_{i}\) are, respectively, Model, Country, Indicator, Horizon and Sample dummy variables for each \(AM_i\) (Accuracy Measure) observation obtained. \(AM_i\) is either the RMSFE or MAFE discrepancy quantity, computed with 12 observations characterized by the variables in Eq. (10) for the observation i. Obviously, dummy variables do not include TF-M, Germany, IPI, horizon0 and sample3, as these effects are captured by \(\beta _0\). The value of the rest of coefficients (all \(\beta \)s different from \(\beta _0\)) are interpreted as gains/losses relative to the benchmark model.

4.2 Meta-Regression Main Results

Table 4 presents estimates of the regression in Eq. (10), for RMSFE and MAFE accuracy measures. Results for both metrics are very similar, except for the expected different scale. As the stars denoting the statistical significance show, regressors’ significance do not depend on the metric considered.

Table 4 Main meta-regression

First, as the unconditional analysis showed, the \(\beta \)s corresponding to the estimation methods are all greater than zero, suggesting a better performance of the TF-M model. In fact, for RMSFE, the loss of precision when not using TF-M ranges from \(2.4\%\) to \(4.1\%\) (U-M\(_{24}\) and ADL-M\(_{24}\) coefficients, respectively, in terms of the mean dependent variable). The percentages for MAFE, although lower, show the same picture. However, when looking at the statistical significance, none of the methods presents a different nowcasting/forecasting accuracy, at least at a 10% significance level; see Table 4.

Regarding country dummies, all of them have a statistically significant and positive effect on the error quantity, either RMSFE or MAFE, meaning that predictions for France, Italy and particularly Spain are less accurate than those computed for Germany, apparently no matter the sample, method, horizon or indicator. This result would be originated in a more stable and thus predictable economic environment in Germany compared to the other countries in the sample.

All indicator dummies present a statistically significant increase in RMSFE and MAFE with respect to the use of IPI as HF indicator. First, this reveals that the inclusion of construction’s production in IPI2 does not contribute to improving the accuracy of GDP’s predictions. Second, it also shows that COF-related indicators do not provide more valuable information than IPI to reduce prediction errors.

Concerning the effect of the horizon on the accuracy measure, only horizon1 and horizon3 dummies have no statistically significant effect with respect to horizon0. The rest of horizon’s dummies present a negative relevant impact on RMSFE and MAFE compared to horizon0. Contrary to the literature, and leaving aside horizon2, whose effect deviates from the rest of shorter horizons, these results seem to point out that MIDAS models have a better performance at longer horizons, i.e., at forecasting much more than at nowcasting. This somehow baffling result might be induced by the interaction of these dummies with others in the meta-regression. Therefore, a more profound analysis in this direction is needed here to understand the roots of this finding.

Finally, \(\beta \)s associated to both period dummies reveal a strong negative effect on RMSFE and MAFE compared to the benchmark sample 3. This result is easily understandable, as sample 3 corresponds to the period including COVID-19 pandemic and the last observations in the sample are much more unpredictable than any in the other two sample periods.

In summary, the main findings obtained from this meta-regression analysis are: (i) the TF-M lowest error quantities are not large enough to show statistically significant coefficients of the rest of the model dummies, implying that either ADL-M or U-M models report a similar forecasting performance relative to TF-M, once we control for the other factors; (ii) all country dummies are relevant variables, indicating that nowcasts and forecasts for Italy, France and Spain are less accurate than the ones for Germany; (iii) using other indicator than IPI results in a statistically significant increase in RMSFE and MAFE; (iv) horizon dummies result to be most relevant, although with differences: while for shorter horizons (i.e., nowcasts), only horizon 2 presents a clear gain in accuracy in comparison to horizon 0, longer horizons (i.e., forecasts) do show in all cases a statistically significant reduction in RMSFE and MAFE relative to horizon 0; (v) predictions for the first period and second period (i.e., 2011Q4–2014Q3 and 2014Q4–2017Q3) are much more accurate than the ones for the third period (i.e., 2017Q4–2020Q3), which is consistent with the fact that this period involves a huge uncertainty associated with COVID-19 pandemic; and, finally, (vi) conclusions do not depend on the specific accuracy measure applied, as RMSFE and MAFE yield very similar results.

5 Conclusions

This chapter attempts to shed some light on the use of the different variants of MIDAS models in forecasting. To do so, we assess the empirical nowcasting and forecasting performance of the ADL-, U- and TF- MIDAS family models. We use data from the four main Euro Area economies (France, Germany, Italy and Spain) for the period 1995Q1–2020Q3, accounting for different HF indicators and horizons. We report the results of the out-of-sample forecasting analysis for the sample 2011Q4–2020Q3 and three disjoint sub-periods characterized by different levels of volatility and uncertainty: 2011Q4–2014Q3 (European sovereign debt crisis), 2014Q4–2017Q3 (recovery and stable phase) and 2017Q4–2020Q3 (including COVID-19 pandemic shock). The predictive accuracy of the distinct MIDAS models is compared according to two accuracy measures: RMSFE and MAFE.

The results of an unconditional analysis reveal a better performance of TF-MIDAS in terms of lowest RMSFE and MAFE. However, a meta-regression with the whole sample does not show a statistically significant difference in the predictive accuracy of the compared methods at standard significance levels. Some other interesting features were found instead: (i) German GDP seems to be more predictable than Italy, France and, specially, Spain’s; (ii) IPIs turn out to be the best HF indicators; (iii) contrary to the literature, MIDAS models seem to perform better at forecasting (longer horizons) than nowcasting; and, (v) as expected, the studied sub-periods can be decreasingly sorted according to predictability as 2014Q4–2017Q3, 2011Q4–2014Q3 and 2017Q4–2020Q3. All these findings are robust to the error metric used, either RMSFE or MAFE.

Finally, these results were obtained without including interaction terms in the meta-regression, which could cast some light on the conclusions drawn. This will be the object of future research.