1 Introduction

The conduct of monetary policy, and economic policy in general, requires an assessment of the state of the economy in real time. Macroeconomic indicators tend to be released with substantial delays, and this is especially true for gross domestic product (GDP). To deal with this issue, policy institutions have traditionally used simple forecasting models and judgement to predict the current state of the economy, as well as that of the recent past. This process is now commonly referred to as nowcasting.

In this paper, we estimate an approximate dynamic factor model (DFM) for Canada and evaluate its nowcasting performance for Canadian GDP. The model reflects many of the distinctive characteristics of the Canadian economy and its data availability. Following Giannone et al. (2008), there is now a large literature on nowcasting with DFMs. Recent contributions cover Brazil (Bragoli and Modugno 2015), BRIC countries and Mexico (Dahlhaus et al. 2015), China (Giannone et al. 2013), France (Barhoumi et al. 2010), Indonesia (Luciani et al. 2015), Ireland (D’Agostino et al. 2013), New Zealand (Matheson 2010), Norway (Aastveit et al. 2011 and Luciani and Ricci 2014), and the USA (Giannone et al. 2008). A recent paper by Bragoli and Modugno (2016) constructs a DFM for Canada that bears many similarities to the model developed in this paper. These papers have shown that DFM nowcasts not only outperform simple benchmarks and other competing nowcasting approaches, such as bridge models and MIDAS regressions, but also often produce nowcasts that are on par with those of professional forecasters.

Dynamic factor models have also been used to study the role of specific variables for nowcasting economic activity in a data-rich and real-time environment. Lahiri et al. (2015) study the role of ISM business surveys in nowcasting current quarter US GDP growth and find evidence that these indices improve the nowcasts. Similarly, Lahiri and Monokroussos (2013) study the role of consumer confidence in nowcasting real personal consumption expenditure in a real-time and data-rich environment. They establish a robust link between consumer confidence and nowcast performance for consumption in real time that is especially strong for service consumption. We also find that the ISM business survey for the USA is an important predictor for nowcasting Canadian GDP growth, given its timely release and the importance of the US economy for Canadian GDP growth.

Through a pseudo-real-time exercise using data from the first quarter of 1980 to the second quarter of 2016, we show that the root mean squared forecast error (RMSFE) of the DFM declines steadily as more information is released. The model also outperforms simple benchmarks such as autoregressive (AR) models and is competitive with other nowcasting models, such as bridge equations and MIDAS regressions, especially at the nowcasting horizon.

Apart from the aforementioned paper by Bragoli and Modugno (2016), there are very few papers on nowcasting Canadian macroeconomic variables. Binette and Chang (2013) propose a nowcasting model for Canadian GDP growth based on the Euro-STING of Camacho and Perez-Quiros (2010). Galbraith and Tkacz (2013) examine the usefulness of payments data to nowcast GDP growth. Relative to these papers, we contribute by proposing a different DFM, examining a larger set of monthly and quarterly predictors, and benchmarking the results against other nowcasting models commonly used in the literature.

The paper proceeds as follows: The next section describes our data set, followed by Sect. 3, which presents our dynamic factor model. Section 4 shows the performance of the DFM against various benchmarks and the contributions of each indicator. Finally, Sect. 5 concludes.

2 Data

When building a nowcasting model, it is important to find variables that are: (i) helpful to predict GDP growth; (ii) timely; and (iii) updated frequently (e.g., monthly). To help us meet these criteria, we choose variables that are followed by the market and reported in Statistics Canada’s official release bulletin “The Daily.” This results in a mix of hard and soft indicators. We also include commodity prices and a set of US economic indicators because of Canada’s close economic ties with the USA. Recent papers (e.g., Alvarez and Perez-Quiros 2016) that use a similar DFM show that models with medium-sized data sets (i.e., with 10–30 variables) perform as well as models with larger data sets of over 100 variables. With these considerations in mind, we select 23 predictors of the Canadian economy. There are two notable peculiarities in Canadian macroeconomic data: First, the data are released with a larger delay relative to other developed economies, and second, Canada has a monthly GDP indicator. Below, we provide a more thorough description of the variables used in the model.

The Canadian National Accounts data are released two months after the end of the reference quarter. This means that GDP for the first quarter of the year is released at the end of May. The monthly GDP data, denoted GDP at basic prices, have a similar lag, such that January GDP is released at the end of March. This is quite different from other developed countries such as the USA and the UK, which release their first estimates of GDP four weeks after the reference period. Other European countries and Japan release their real GDP figures with a delay of only six weeks. Furthermore, many of the most commonly used and timely leading indicators, such as industrial production, are not available for Canada in a timely manner. In the USA, industrial production is available within two weeks of the reference period, while in Canada it is reported as a special aggregation of monthly real GDP data. As such, it is released 60 days after the reference period.

The monthly GDP at basic prices series and the quarterly GDP at market prices are distinct measures that can have, at times, quite different growth rates. The difference lies in the treatment of taxes and subsidies on products. While production at basic prices excludes taxes and subsidies, GDP at market prices includes them. This discrepancy can lead to significant differences in the annualized quarterly growth between the two series, sometimes greater than 1 percentage point in absolute value. Figure 1 illustrates that, while monthly GDP (aggregated to the quarterly frequency) tracks quarterly GDP at market prices closely, it can deviate significantly at times. Nonetheless, monthly GDP at basic prices is a very important predictor of quarterly GDP, and we construct our DFM to take that into account, as detailed in the next section.

Fig. 1 GDP at market prices (quarterly) versus GDP at basic prices (monthly). Notes: This figure shows quarter-over-quarter growth rates, at annualized rates, of Canadian GDP at market prices, as published in the Quarterly National Accounts, and of monthly GDP at basic prices

Since Canada is a small, open, commodity-exporting economy with important trade and financial links to the USA, our DFM includes some US indicators, as well as commodity prices and world economic activity indicators. Of the 23 variables that we include, 14 are domestic, 6 are US variables, and the remaining 3 are the Bank of Canada non-energy commodity price index, WTI oil prices, and the Global Purchasing Managers’ Index (PMI).

The domestic variables cover most of the standard nowcasting variables: car sales, PMI, merchandise trade, housing variables, and various real activity measures. We also include an indicator from the Bank of Canada’s Business Outlook Survey (BOS). The BOS is a quarterly business survey of about 100 firms across Canada that reflects the diverse composition of the Canadian economy in terms of region, type of business activity, and firm size. Specifically, we use the balance of opinion on the future sales question, which has been shown to be useful in forecasting GDP growth (Pichette and Rennison 2011).

Finally, we turn to the foreign variables. Since the USA is such an important trading partner for Canada, we include several indicators of US economic activity in our DFM, such as the US PMI and a set of standard real activity indicators, including industrial production, retail sales, and non-farm payrolls.

Table 1 Domestic macroeconomic variables

We transform the series to ensure stationarity. Table 1 shows all the monthly and quarterly series, and their transformation. Furthermore, several series published by Statistics Canada have been re-based or undergone definitional changes, which makes finding series with a sufficiently long history difficult. To overcome this obstacle, series that suffer from this problem are simply spliced together with the corresponding older series.

3 Econometric framework

We follow the approach proposed by Giannone et al. (2008), with the maximum likelihood estimation methodology of Bańbura and Modugno (2014), which allows for arbitrary patterns of missing data. Doz et al. (2012) study the asymptotic properties of quasi-maximum likelihood estimation for large approximate dynamic factor models. The authors find that the maximum likelihood estimates of the factors are consistent, as the size of the cross section and sample go to infinity along any path. Furthermore, the estimator is robust to a limited degree of cross-sectional and serial correlation of the error terms. This is particularly interesting because in large panels the assumption of no cross-correlation could be too restrictive. We use a block factor structure similar to what is developed in Banbura et al. (2011).

First, the model has the factor representation:

$$\begin{aligned} y_t = \varLambda f_t + \epsilon _t, \end{aligned}$$
(1)

where \(y_t = (y_{1,t}, y_{2,t}, \ldots , y_{n,t} )\) is a vector of n standardized monthly stationary variables, \(f_t\) denotes a vector of r unobserved factors, and \(\varLambda \) is an \(n \times r\) matrix of loadings. The factors follow a vector autoregressive process of order p:

$$\begin{aligned} f_t = A_1f_{t-1} + \cdots + A_pf_{t-p} + u_t, \qquad u_t\sim \,\mathrm{i.i.d.} N(0,Q), \end{aligned}$$
(2)

where \(A_1, \ldots , A_p\) are \(r \times r\) matrices of autoregressive coefficients.

Finally, we assume that the ith idiosyncratic component of the monthly variables follows an AR(1) process:

$$\begin{aligned} \epsilon _{i,t} = \alpha _i\epsilon _{i,t-1} + \varepsilon _{i,t}, \qquad \varepsilon _{i,t} \sim \,\hbox {i.i.d.} N(0,\sigma ^2_i), \end{aligned}$$
(3)

with \({\mathbb {E}}[\varepsilon _{i,t}\varepsilon _{j,s}] = 0 \quad \) for \(i \ne j .\)
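To make the structure of Eqs. (1)–(3) concrete, the following minimal sketch simulates data from a small factor model with one lag in the factor VAR. All dimensions and parameter values (`n`, `r`, `T`, the loading matrix, `A1`, `alpha`) are illustrative assumptions and are not the estimates from this paper.

```python
import numpy as np

def simulate_dfm(n=5, r=2, T=200, seed=0):
    """Simulate y_t = Lambda f_t + eps_t with VAR(1) factors and AR(1) idiosyncratic terms."""
    rng = np.random.default_rng(seed)
    Lambda = rng.normal(size=(n, r))   # n x r loading matrix (illustrative values)
    A1 = 0.5 * np.eye(r)               # stable VAR(1) coefficient matrix (assumed)
    alpha = np.full(n, 0.3)            # AR(1) coefficients of idiosyncratic terms (assumed)
    f = np.zeros((T, r))
    eps = np.zeros((T, n))
    for t in range(1, T):
        f[t] = A1 @ f[t - 1] + rng.normal(size=r)          # Eq. (2) with Q = I
        eps[t] = alpha * eps[t - 1] + rng.normal(size=n)   # Eq. (3) with sigma_i = 1
    y = f @ Lambda.T + eps                                 # Eq. (1)
    return y, f, eps, Lambda

y, f, eps, Lambda = simulate_dfm()
print(y.shape)
```

In practice the series would be standardized before estimation, as stated above; the simulation omits that step.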

3.1 Quarterly series

Quarterly series are incorporated into the model by expressing them in terms of their partially observed monthly counterparts, as in Mariano and Murasawa (2003). Quarterly variables, like GDP (\(\hbox {GDP}_t^Q\)), are expressed as the sum of their unobserved monthly contributions (\(\hbox {GDP}_t^M\)):

$$\begin{aligned} \hbox {GDP}_t^Q = \hbox {GDP}_{t}^M + \hbox {GDP}_{t-1}^M + \hbox {GDP}_{t-2}^M, \end{aligned}$$
(4)

for \(t = 3,6,9,\ldots .\) Define \(Y_t^Q = 100 \times \log (\hbox {GDP}_{t}^Q)\) and \(Y_t^M = 100 \times \log (\hbox {GDP}_{t}^M)\). The unobserved monthly rate of GDP growth, \(y_t = \varDelta Y_t^M\), is also assumed to follow the same factor model representation as the monthly variables:

$$\begin{aligned} y_t= & {} \varLambda _Qf_t + \epsilon _t^Q \end{aligned}$$
(5)
$$\begin{aligned} \epsilon _t^Q= & {} \alpha _Q \epsilon _{t-1}^Q + \varepsilon _t^Q \end{aligned}$$
(6)

where \(\varepsilon _t^Q\) is an \(i.i.d. N(0,\sigma ^2_Q)\) process.

To link \(y_t\) with the observed quarterly GDP series, we construct a partially observed monthly series:

$$\begin{aligned} y_t^Q = {\left\{ \begin{array}{ll} Y_t^{Q} - Y_{t-3}^{Q}\,, &{} t=3,6,9,\ldots \\ \hbox {unobserved} \,, &{} \hbox {otherwise} \end{array}\right. } \end{aligned}$$

and use the approximation of Mariano and Murasawa (2003) to obtain:

$$\begin{aligned} y_t^Q&= Y_{t}^Q - Y_{t-3}^Q \approx (Y_{t}^M + Y_{t-1}^M + Y_{t-2}^M) - (Y_{t-3}^M + Y_{t-4}^M + Y_{t-5}^M) \nonumber \\&= y_{t} + 2y_{t-1} + 3y_{t-2} + 2y_{t-3} + y_{t-4}. \end{aligned}$$
(7)
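The weights in Eq. (7) follow from writing each three-month difference \(Y_{t-k}^M - Y_{t-k-3}^M\) as a sum of three monthly growth rates. The short check below verifies this identity numerically on a hypothetical random-walk log-level series; the series and the helper names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = np.cumsum(rng.normal(size=30))          # hypothetical monthly log levels Y_t^M
g = np.concatenate([[np.nan], np.diff(Y)])  # g[t] = Y[t] - Y[t-1], monthly growth y_t

WEIGHTS = np.array([1.0, 2.0, 3.0, 2.0, 1.0])  # weights on y_t, y_{t-1}, ..., y_{t-4}

def growth_from_levels(Y, t):
    """(Y_t + Y_{t-1} + Y_{t-2}) - (Y_{t-3} + Y_{t-4} + Y_{t-5})."""
    return Y[t - 2:t + 1].sum() - Y[t - 5:t - 2].sum()

def growth_from_monthly(g, t):
    """y_t + 2 y_{t-1} + 3 y_{t-2} + 2 y_{t-3} + y_{t-4}."""
    lags = g[t - 4:t + 1][::-1]  # ordered y_t, y_{t-1}, ..., y_{t-4}
    return float(WEIGHTS @ lags)

checks = [np.isclose(growth_from_levels(Y, t), growth_from_monthly(g, t))
          for t in range(5, len(Y))]
print(all(checks))
```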

3.2 Impact of new data releases

Nowcasters are frequently interested in the impact of each new data point. For example, it might be interesting to know what the impact of the latest industrial production figure is for the GDP forecast. Furthermore, the nowcasting environment is characterized by a large set of variables that can arrive at a high frequency. This results in the nowcaster studying a sequence of nowcasts that can be updated very frequently, reflecting the steady stream of new information arriving. The DFM framework used in this paper and developed by Giannone et al. (2008) allows us to study this so-called news. As discussed in Banbura et al. (2011), by analyzing the forecast revision, we have a way of quantifying the change in information set and the average impact of each variable.

Let \(\varOmega _v\) denote a vintage of data available at time v, where v refers to the date of a particular data release. Since data are constantly arriving, \(\varOmega _v\) expands throughout the nowcast period. Furthermore, let us denote GDP growth at time t as \(y^Q_t\).

In this context, we can decompose a new forecast into two components.

$$\begin{aligned} \underbrace{{\mathbb {E}}\left[ y^{Q}_{t} |\varOmega _{v+1}\right] }_\text {new forecast} = \underbrace{{\mathbb {E}}\left[ y^{Q}_{t} |\varOmega _v\right] }_\text {old forecast} +\underbrace{{\mathbb {E}}\left[ y^{Q}_{t} |I_{v+1}\right] }_\text {revision}, \end{aligned}$$
(8)

where \(I_{v+1}\) is the subset of the set \(\varOmega _{v+1}\) that is orthogonal to all the elements of \(\varOmega _v\). As specified above, the change in nowcast is due to the unexpected part of the new data release, which is called the “news.” The news is useful because what matters in understanding the updating process of the nowcast is not the release itself but the difference between the release and the previous forecast.

Hence, the effect of the news is given by

$$\begin{aligned} \underbrace{{\mathbb {E}}\left[ y^{Q}_{t} |\varOmega _{v+1}\right] - {\mathbb {E}}\left[ y^{Q}_{t} |\varOmega _v\right] }_\text {forecast revision} = \sum _{j\in \mathbb {J}_{v+1}}b_{j,t,v+1}\underbrace{\Bigr (x_{j,T_{j,v+1}} - {\mathbb {E}}\Bigr [x_{j,T_{j,v+1}}| \varOmega _v\Bigr ]\Bigr )}_\text {news}, \end{aligned}$$
(9)

where \(b_{j,t,v+1}\) are weights obtained from the model estimation and \(\mathbb {J}_{v+1}\) is the set of variables released in vintage \(v+1\). The nowcast revision is a combination of the news associated with the data release for each variable and its relevance for the target variable (quantified by its weight \(b_{j,t,v+1}\)). This decomposition allows the nowcaster to trace forecast revisions back to unexpected movements in individual predictors.
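The logic of Eq. (9) can be illustrated in a stylized jointly Gaussian setting with one target and two predictors, where conditional expectations are linear projections. This is a sketch under assumed numbers, not the state-space computation used in the paper: the covariance matrix `Sigma` and the observed values `x1` and `x2` are hypothetical, and the weight `b` is computed as the covariance between the target and the news divided by the variance of the news.

```python
import numpy as np

# Hypothetical covariance matrix of (y, x1, x2), all with zero mean.
Sigma = np.array([[1.0, 0.6, 0.5],
                  [0.6, 1.0, 0.3],
                  [0.5, 0.3, 1.0]])
x1, x2 = 0.8, -0.4  # x1 is in the old vintage; x2 is the new release

# Old forecast: E[y | x1]
old_forecast = Sigma[0, 1] / Sigma[1, 1] * x1

# News: the release minus what the old vintage predicted for it
expected_x2 = Sigma[2, 1] / Sigma[1, 1] * x1
news = x2 - expected_x2

# Weight on the news: Cov(y, news) / Var(news)
cov_y_news = Sigma[0, 2] - Sigma[0, 1] * Sigma[1, 2] / Sigma[1, 1]
var_news = Sigma[2, 2] - Sigma[1, 2] ** 2 / Sigma[1, 1]
b = cov_y_news / var_news

# New forecast: E[y | x1, x2], computed directly by projection
new_forecast = Sigma[0, 1:] @ np.linalg.solve(Sigma[1:, 1:], np.array([x1, x2]))

revision = new_forecast - old_forecast
print(np.isclose(revision, b * news))
```

The revision is zero whenever the release equals its predicted value, regardless of how large the release itself is.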

3.3 Estimation

We estimate the model parameters by maximum likelihood using the implementation of the expectation maximization (EM) algorithm proposed by Bańbura and Modugno (2014). This implementation can deal with arbitrary patterns of missing observations.

An additional advantage of the maximum likelihood approach is that it easily allows us to impose restrictions on the parameters. This feature is especially appealing in the case of Canada, as it makes possible the addition of a factor that solely loads on the monthly and quarterly GDP series. Bork (2009) and Bork et al. (2009) show how to impose restrictions in the model described above. We assume that there are two factors that relate to quarterly GDP, monthly GDP, and the remaining macroeconomic and financial indicators, as follows:

  1. \(f_{1,t}\) is the factor that captures the co-movement among quarterly GDP, monthly GDP at basic prices, and all other monthly series;

  2. \(f_{2,t}\) is the factor that solely loads on quarterly GDP and monthly GDP at basic prices.

The block factor structure implies the following properties of the transition Eq. (2), where the subscript refers to the factors described above.

$$\begin{aligned} f_t = \quad \begin{pmatrix} f_{1,t}\\ f_{2,t}\\ \end{pmatrix}, \quad A = \begin{pmatrix} A_{1}&{}0\\ 0&{}A_{2}\\ \end{pmatrix}, \quad Q = \begin{pmatrix} Q_{1}&{}0\\ 0&{}Q_{2}\\ \end{pmatrix} \quad \end{aligned}$$
(10)
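As a rough illustration of the restrictions in Eq. (10), the sketch below builds the zero pattern of the loadings and the block-diagonal transition and covariance matrices for a toy system. The row ordering, the number of additional series `n_other`, and all numerical values are assumptions made for the example. Zeroing out entries of an unrestricted estimate, as done here, only displays the pattern; it is not the restricted estimation step of Bork (2009).

```python
import numpy as np

n_other = 3  # hypothetical number of monthly series other than GDP
n = 2 + n_other

# Assumed row order: quarterly GDP, monthly GDP at basic prices, other monthly series.
free = np.zeros((n, 2), dtype=bool)
free[:, 0] = True   # f1 loads on every series
free[:2, 1] = True  # f2 loads only on the two GDP series

rng = np.random.default_rng(2)
Lambda_unrestricted = rng.normal(size=(n, 2))         # placeholder loading estimates
Lambda = np.where(free, Lambda_unrestricted, 0.0)     # impose the zero restrictions

# Block-diagonal transition and innovation covariance, one lag (values assumed).
A = np.diag([0.7, 0.4])
Q = np.diag([1.0, 0.5])

print(Lambda)
```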

The modeling choice above differs from the Bragoli and Modugno (2016) nowcasting model of the Canadian economy. In their model, monthly GDP does not directly load on the factor; rather, it follows a vector autoregressive (VAR) process where it interacts with the factor in the state equation. The forecast for quarterly GDP is then the monthly GDP forecast aggregated within the model. Although the structure of their model is different, we share several key results, as the next section shows. Our papers also differ in that we examine the performance of a DFM relative to other nowcasting models and our benchmarks use final data, similar to the DFM.

4 Results

Since we do not have the real-time data vintages of every release, we perform a pseudo-real-time out-of-sample evaluation of our model in which we simulate the flow of data availability. We replicate the data availability pattern by creating over 5,000 vintages of data, which simulates the forecasting environment for every new release. Using these vintages, we update our predictions with every new release of data. Table 1 shows the assumed order of data availability for our empirical exercise. The model is estimated recursively, and the first out-of-sample forecast is for the first quarter of 2002. We start predicting quarter t GDP growth 30 days before the start of the quarter. The model is then updated with every variable release until the publication of the National Accounts for quarter t, about 60 days after the end of quarter t. Hence, we have 180 days over which the predictions for quarter t GDP growth rate are generated.
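The simulated flow of data availability can be sketched by masking observations that would not yet have been published on a given date. The function name `make_vintage`, the day-count convention, and the publication lags below are hypothetical and are not the release calendar in Table 1.

```python
import numpy as np

def make_vintage(data, period_end_day, pub_lag_days, vintage_day):
    """Return a copy of `data` with not-yet-published observations set to NaN.

    data           : (T, n) array of final data
    period_end_day : (T,) day index on which each reference month ends
    pub_lag_days   : (n,) days between the end of the reference month and release
    vintage_day    : day index of the simulated vintage
    """
    release_day = period_end_day[:, None] + pub_lag_days[None, :]
    vintage = data.astype(float).copy()
    vintage[release_day > vintage_day] = np.nan
    return vintage

data = np.arange(12).reshape(4, 3)             # 4 months, 3 series (toy values)
period_end_day = np.array([30, 60, 90, 120])
pub_lag_days = np.array([5, 30, 60])           # assumed timely, slower, slowest series
vintage = make_vintage(data, period_end_day, pub_lag_days, vintage_day=100)
print(vintage)
```

Looping `vintage_day` over every release date produces the sequence of ragged-edge data sets on which the model is re-run.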

As discussed in Sect. 3.3, we estimate the model with two block factors and one lag (\(p=1\)) in the VAR driving the dynamics of those factors. Finally, as specified in Eq. (3), we allow the idiosyncratic components to follow an AR(1) process.

As a first pass, we benchmark the DFM forecasts with two different versions of simple autoregressive (AR) models. The first one, which we denote the quarterly AR, is simply an AR model with quarterly GDP data.

$$\begin{aligned} y_{t}^Q = \alpha + \sum _{i=1}^p \rho _i y_{t-i}^Q + \varepsilon _{t} \end{aligned}$$
(11)

As discussed earlier, Canada releases data for a monthly GDP series. Thus, we also estimate a monthly AR model, whose monthly forecasts we then aggregate into a quarterly figure.

$$\begin{aligned} y_{t+h}^M = \alpha + \sum _{i=1}^p \rho _i y_{t-i}^M + \varepsilon _{t+h}, \end{aligned}$$
(12)

where \(h=1, 2,\ldots , 6\) months, depending on which month of the quarter the forecasts are being made.
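A direct regression of the form in Eq. (12) with \(p = 1\) can be sketched with ordinary least squares as follows. The helper `fit_direct_ar1` is a hypothetical name, and the deterministic toy series is used only so that the recovered coefficients can be checked by hand; aggregation of the monthly forecasts to the quarterly frequency is not shown.

```python
import numpy as np

def fit_direct_ar1(y, h):
    """OLS of y_{t+h} on a constant and y_{t-1}; returns (alpha, rho)."""
    x = y[:len(y) - h - 1]   # regressor y_{t-1}
    target = y[h + 1:]       # regressand y_{t+h}
    Z = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return coef[0], coef[1]

# Toy series following y_t = 0.5 + 0.8 * y_{t-1} exactly (no noise).
y = np.empty(20)
y[0] = 1.0
for t in range(1, 20):
    y[t] = 0.5 + 0.8 * y[t - 1]

alpha0, rho0 = fit_direct_ar1(y, h=0)   # one-step relation
alpha1, rho1 = fit_direct_ar1(y, h=1)   # direct two-step relation
forecast = alpha1 + rho1 * y[-1]        # forecast two steps past the last observation
print(alpha0, rho0, alpha1, rho1)
```

For this noiseless series the two-step coefficients equal the iterated ones, namely an intercept of 0.5 + 0.8 × 0.5 and a slope of 0.8 squared.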

Fig. 2 RMSFE as new data are released throughout the prediction horizon. Notes: This figure shows the RMSFE of the DFM as new data are released throughout the prediction horizon. The red line represents the RMSFE of the quarterly AR benchmark, whereas the green lines display the RMSFE of the monthly AR model. The out-of-sample forecast period runs from 2002Q1 to 2016Q2. (Color figure online)

Figure 2 shows the root mean squared forecast error (RMSFE) of the model over the 180 days it generates predictions for quarter t GDP. The red line shows the RMSFE of the quarterly AR model, whereas the green lines show the RMSFE of the monthly AR model. Both models are estimated with one lag, \(p = 1\). At the longest forecast horizon, 30 days before the start of the quarter, the DFM performs slightly better than the AR models, with an RMSFE about 9% lower than that of the monthly AR model. As new data arrive, the performance of the DFM improves substantially. Over the three months of the nowcasting horizon, the DFM improves upon the benchmarks by a large margin. For example, at the end of the first month of quarter t, the DFM improves upon the monthly AR model by 32%, and at the end of the second month, by 32% as well. As we move into the backcasting horizon, three months after the beginning of quarter t, the DFM is still more accurate than the monthly AR model for the next 30 days. Finally, in the second month of backcasting, when two months of monthly GDP are already known, the DFM forecasts are slightly worse than those from the monthly AR model.

Table 2 Average reduction in RMSFE by variable

Table 2 shows the average reduction in RMSFE due to each predictor in the model for each period in which the model generates predictions. At the longest prediction horizon, before the start of the reference quarter t, US variable releases lead to the largest reductions in RMSFE. The US and Global PMIs both lead to large decreases in RMSFE, of 4 and 5 basis points (bps), respectively. US industrial production, retail sales, and housing starts also make important contributions to the accuracy of the model. Among the domestic variables, the employment rate and the terms of trade are the two releases that reduce the RMSFE the most.

As we move into the nowcasting horizon, US variables continue to play an important role in reducing the model’s RMSFE. US PMI is the most important release at the first nowcast horizon, and US industrial production is also an important release. Imports and exports also lead to significant decreases in RMSFE, as do wholesale trade, manufacturing sales, and the Bank of Canada Commodity Price Index. Finally, when we reach the backcasting horizon, the predictors other than monthly GDP have very little impact on reducing the RMSFE, especially at the second backcast, when two months of GDP at basic prices (monthly GDP) for quarter t are already known.

Fig. 3 RMSFE as new data are released throughout the prediction horizon with and without US variables. Notes: This figure shows the RMSFE of the DFM as new data are released throughout the prediction horizon with and without the US variables

To better illustrate the importance of US variables, we estimate the DFM excluding the US variables. Figure 3 shows the RMSFE over the prediction horizon. As the analysis of Table 2 makes clear, the largest contribution of the US variables takes place during the forecasting horizon (\(T-29\) to T) and the first two months of the nowcasting horizon (T to \(T+60\)). During these periods, the US variables play an important role in reducing the RMSFE of the DFM. In the third month of the nowcast, when the monthly GDP data for the first month of the quarter is released, the performance of the DFM with and without US data is roughly equal.

As shown in Sect. 3.2, the DFM can be used to decompose the news component of every new economic release. Looking at the news provides a better understanding of the importance of the predictors for nowcasting Canadian GDP. Figure 4 shows the average absolute revision of the model’s forecast after the release of each predictor for each month of the prediction horizon. As the graphs show, the importance of the predictors varies widely over the prediction horizon. At the forecasting horizon, before the start of quarter t, US macroeconomic variables such as the PMI, industrial production, and employment affect the predictions significantly. To some degree, these results confirm the importance of US variables discussed above. The US variables are important because of the close economic ties between the USA and Canada and because of the timeliness of their release relative to Canadian data, a fact also highlighted by Bragoli and Modugno (2016). Monthly GDP, on the other hand, has an almost negligible impact on the prediction.

As we move into the nowcasting horizon, US variables still affect Canadian GDP predictions, as do the Canadian employment rate, terms of trade, exports, and imports. Nonetheless, as we move further into nowcasting quarter t, the importance of monthly GDP increases, especially in the third month of the nowcasts, when monthly GDP for the first month of quarter t is released. Finally, as we reach the backcasting horizons, the importance of the additional predictors is much diminished. Once two months of monthly GDP are known to the model, the additional predictors hardly move the final predictions.

Fig. 4 Quarterly GDP growth predictions. Notes: This figure shows the predictions of the DFM from 2002Q1 to 2016Q2 for three different horizons: the first nowcast, the last nowcast, and the last backcast. The solid black line displays actual GDP growth, while the dashed lines represent the DFM predictions

Fig. 5 Average impact of news releases. Notes: This figure shows the average absolute impact (basis points) on the forecast of every announcement in the forecast, first nowcast, second nowcast, third nowcast, as well as on the first and second backcasts

Finally, to further demonstrate the fit of the DFM, Figure 5 shows the predictions for QoQ GDP growth at annual rates for our out-of-sample period (first quarter of 2002 to second quarter of 2016) for three different horizons: at the end of the first nowcasting month, at the end of the last nowcasting month, and at the end of the last backcasting month, right before the release of the national accounts. As one can easily see, the DFM gets increasingly better as we move along the prediction horizon. At the end of the first nowcasting month, though the model does a good job of capturing average GDP growth, it does not show the sharp fall in the fourth quarter of 2008 or the rebound in the second quarter of 2009. At the end of the last nowcasting month, the model does capture the sharp fall in GDP growth during that period, and even more so in the last backcasting period, when all model predictors are known to the model (Fig. 5).

4.1 Comparison with other nowcasting models

In this section, we compare the results of the proposed dynamic factor model with other commonly used models for nowcasting, namely bridge equations models and MIDAS regressions.

4.1.1 Bridge models

Bridge models have a long tradition in short-term forecasting and are often used by central banks and policy-making institutions (see Baffigi et al. 2004; Golinelli and Parigi 2007, among many others). This technique involves forecasting high-frequency indicators with auxiliary models and using the results to forecast a low-frequency target variable. Since Canada has a monthly GDP series, we alter this procedure slightly. Instead of aggregating the high-frequency indicator to the quarterly frequency, we simply forecast monthly GDP using that indicator and an autoregressive (AR) term. The nowcast is then aggregated to the quarterly frequency.

We estimate bridge models using the following specification:

$$\begin{aligned} y_{t_m+h_m} = \alpha + \sum _{i=1}^p \rho _i y_{t_m-i} + \beta _j x_{j,t_m+h_m} + \varepsilon _{t_m+h_m}, \end{aligned}$$
(13)

where \(x_{j,t_m+h_m}\) is the jth of the remaining monthly indicators. We estimate bridge models featuring one indicator at a time and then average the resulting nowcasts of \(y_{t_m+h_m}\) into a single combined nowcast.

We use autoregressive models to forecast the missing observations of the monthly series. We estimate a total of 20 bridge models, one for each of the monthly series in the data set. The models are re-estimated over the quarter as the monthly indicators are released. The forecasts are then averaged with equal weights.
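The equal-weight combination of single-indicator bridge models can be sketched as below. This is an illustrative simplification with one AR lag: the function `bridge_nowcast` is a hypothetical helper, the indicator values for the target month (`x_next`) are taken as given rather than forecast with auxiliary AR models, and the data are simulated.

```python
import numpy as np

def bridge_nowcast(y, X, x_next):
    """Equal-weight average of single-indicator bridge forecasts.

    y      : (T,) monthly GDP growth
    X      : (T, k) monthly indicators aligned with y
    x_next : (k,) indicator values for the month being forecast
    """
    forecasts = []
    for j in range(X.shape[1]):
        # Regress y_t on a constant, y_{t-1}, and the j-th indicator x_{j,t}.
        Z = np.column_stack([np.ones(len(y) - 1), y[:-1], X[1:, j]])
        coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
        forecasts.append(coef @ np.array([1.0, y[-1], x_next[j]]))
    forecasts = np.array(forecasts)
    return float(forecasts.mean()), forecasts

rng = np.random.default_rng(0)
T = 40
X = rng.normal(size=(T, 2))
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.2 + 0.5 * y[t - 1] + X[t, 0]   # toy data driven by the first indicator only
x_next = np.array([0.3, -0.1])

combined, individual = bridge_nowcast(y, X, x_next)
print(combined, individual)
```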

4.1.2 MIDAS regressions

A more modern benchmark model is the mixed data sampling (MIDAS) regression (Ghysels et al. 2006, 2007; Clements and Galvão 2008). The defining feature of MIDAS models is the way they deal with mixed frequencies. These models use a polynomial weighting function to link high-frequency regressors to a low-frequency regressand. This makes the MIDAS regression a direct forecasting tool, which does not explicitly model the dynamics of the indicator. Instead, the MIDAS regression directly relates future quarterly GDP to present and lagged high-frequency indicators. This necessitates a separate model for each forecast horizon.

The basic model for forecasting \(h_q\) quarters ahead with \(h_q = h_m/3\) is:

$$\begin{aligned} y_{t_q+h_q} = y_{t_m+h_m} = \beta _0 + \beta _1 b(L_m,\theta )x^{(3)}_{t_m+w} + \varepsilon _{t_m+h_m}, \end{aligned}$$
(14)

where \(y_{t_m}\) is GDP growth and \(x_{t_m}^{(3)}\) is the corresponding skip-sampled monthly indicator, \(L_m\) is the monthly lag operator, and \(w = T_x - T_y\). The lag polynomial \(b(L_m,\theta )\) is defined as:

$$\begin{aligned} b(L_m,\theta ) = \sum _{k=0}^K c(k;\theta )L_m^k. \end{aligned}$$
(15)

The parsimonious parametrization of the lagged coefficients \(c(k;\theta )\) is one of the key features of MIDAS models. While there are several common ways to parameterize the lagged coefficients, we choose the so-called Beta Lag:

$$\begin{aligned} c(k;\theta _1,\theta _2) = \frac{f(\frac{k}{K},\theta _1,\theta _2)}{\sum _{j=1}^K f(\frac{j}{K},\theta _1,\theta _2)}, \end{aligned}$$
(16)

where \(f(u,\theta _1,\theta _2) = \frac{u^{\theta _1-1}(1-u)^{\theta _2-1}\varGamma (\theta _1+\theta _2)}{\varGamma (\theta _1)\varGamma (\theta _2)}\), \( \varGamma (\theta ) = \int _0^\infty e^{-x}x^{\theta -1}{} \textit{dx}\), and the parameters \(\theta _1\) and \(\theta _2\) govern the shape of the weighting function. This parametrization is quite general and can take various shapes with only a few parameters. These include increasing, decreasing, or hump-shaped patterns. Furthermore, we restrict the last lag to be equal to zero.
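The Beta lag weights of Eq. (16) can be computed as in the sketch below. The function name `beta_lag_weights` and the parameter values are illustrative. The sketch assumes \(\theta _2 > 1\), in which case the weight at \(k = K\) is exactly zero because \((1 - k/K)^{\theta _2 - 1} = 0\) there; this is one way to obtain a zero last lag and is not necessarily how the restriction is imposed in the paper.

```python
import numpy as np
from math import gamma

def beta_lag_weights(K, theta1, theta2):
    """Normalized Beta lag weights c(k; theta1, theta2) for k = 1, ..., K.

    Assumes theta2 > 1 so that the weight on the last lag (k = K) is zero.
    """
    if theta2 <= 1:
        raise ValueError("this sketch assumes theta2 > 1")
    u = np.arange(1, K + 1) / K
    const = gamma(theta1 + theta2) / (gamma(theta1) * gamma(theta2))
    f = const * u ** (theta1 - 1) * (1 - u) ** (theta2 - 1)
    return f / f.sum()

weights = beta_lag_weights(K=12, theta1=1.0, theta2=5.0)
print(weights.round(4))
```

With \(\theta _1 = 1\) the weights decline monotonically in the lag; other parameter choices give increasing or hump-shaped profiles.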

The MIDAS model is estimated using nonlinear least squares (NLS) in a regression of \(y_t\) on \(x^{(3)}_{t-h}\) for each forecast horizon \(h = 1,\ldots , H\). The direct forecast is given by the conditional expectation:

$$\begin{aligned} {\hat{y}}_{T_y+h|T_x} = {\hat{\beta }}_0 + {\hat{\beta }}_1 b(L_m,{\hat{\theta }})x^{(3)}_{t_m+w}, \end{aligned}$$
(17)

where \(T_x = T_y +w\) is such that the most recent observations of the indicator are included in the conditioning set of the projection. For example, if we were trying to forecast Q2 GDP and July PMI was available, the regression would include a lead of our indicator.

Since Canada has a monthly GDP measure, it is necessary to extend the basic MIDAS model to have multiple explanatory variables. Furthermore, we include a low-frequency autoregressive term. The forecasting model then becomes:

$$\begin{aligned} y_{t_q+h_q}= & {} y_{t_m+h_m} = \beta _0 + \beta _1 b(L_m,\theta _1)x^{(3)}_{1,t_m+w-h_m} \nonumber \\&+\, \beta _2 b(L_m,\theta _2)x^{(3)}_{2,t_m+w-h_m} + \lambda y_{t_m} + \varepsilon _{t_m+h_m} \end{aligned}$$
(18)

with \(x^{(3)}_{1}\) being monthly GDP measured at basic prices and \(x^{(3)}_{2}\) an additional leading indicator. As in the bridge models, we take the same set of leading indicators, create a model with each, and average the individual forecasts with equal weights to create the MIDAS class forecast.

4.1.3 Comparison results

Table 3 RMSFE of the DFM and benchmark models

Table 3 compares the RMSFE of our DFM with the two alternative models described above. We compare the models at the end of each month prior to the monthly GDP release, when all data except monthly GDP are known. Relative to both the bridge and MIDAS models, the DFM is more accurate before the first release of monthly GDP. The DFM improves over the MIDAS model by close to 19% and over the bridge equations by close to 28%. For the second month of the quarter, the same pattern emerges; the DFM outperforms the MIDAS and bridge models by approximately 13% and 14%, respectively. It is interesting to note how close the bridge and MIDAS models are in terms of RMSFE; it seems that in our context there are few gains from the more complicated weighting polynomial. This is likely because we forecast monthly GDP and then aggregate to the quarterly frequency. In this sense, we know the proper weights and thus do not have to estimate them as in the MIDAS regressions. At the shortest horizon, the DFM is slightly less accurate than the bridge and MIDAS models.

To test for the statistical differences in the forecast performance, we apply Diebold and Mariano (1995) tests of forecast accuracy. We find that differences between the performance of the nowcasting models are not statistically significant. However, the difference in accuracy between the DFM and the quarterly AR model is significant at most forecast horizons, as shown in Table 3.
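For reference, a basic version of the Diebold and Mariano (1995) statistic under squared-error loss can be sketched as follows. The function `diebold_mariano` is a hypothetical helper, the long-run variance uses a simple truncated sum of autocovariances up to lag \(h-1\), the p-value relies on the asymptotic normal distribution, and the forecast errors are simulated. No small-sample correction is applied, and the exact variant used for Table 3 may differ.

```python
import numpy as np
from math import erf, sqrt

def diebold_mariano(e1, e2, h=1):
    """DM statistic and two-sided normal p-value for equal squared-error loss."""
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2   # loss differential
    T = d.size
    d_bar = d.mean()
    dc = d - d_bar
    lrv = dc @ dc / T                                # lag-0 autocovariance
    for lag in range(1, h):
        lrv += 2.0 * (dc[lag:] @ dc[:-lag]) / T      # add autocovariances up to lag h-1
    stat = d_bar / sqrt(lrv / T)
    normal_cdf = 0.5 * (1.0 + erf(abs(stat) / sqrt(2.0)))
    return stat, 2.0 * (1.0 - normal_cdf)

rng = np.random.default_rng(3)
errors_a = rng.normal(scale=1.0, size=200)   # simulated errors of a more accurate model
errors_b = rng.normal(scale=2.0, size=200)   # simulated errors of a less accurate model
stat, p_value = diebold_mariano(errors_a, errors_b)
stat_rev, _ = diebold_mariano(errors_b, errors_a)
print(stat, p_value)
```

A negative statistic indicates that the first set of errors has the lower average squared loss.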

5 Conclusion

This paper proposes a medium-sized dynamic factor model to nowcast quarterly GDP in Canada. We deviate from the traditional dynamic factor models used in the nowcasting literature to accommodate specificities of the Canadian macroeconomic data availability. The model is estimated using a panel of 23 variables and features an additional restricted factor to properly take into account the publication of a monthly GDP series in Canada.

In a pseudo-real-time exercise, we show that the model performs well. Our proposed dynamic factor model is more accurate than traditional simple benchmarks such as univariate AR models. It also performs well against competing MIDAS and bridge models, which explicitly consider additional predictors, mixed frequencies, and ragged edges.