1 Introduction: benchmarking and reconciliation of time series

The benchmarking problem arises when two time series for the same target variable are measured at different frequencies with different levels of accuracy, and the discrepancies between the annual benchmarks and the corresponding aggregates (either sums or averages) of the sub-annual values need to be removed. For example, the optimal combination of annual levels and quarterly movements requires an adjustment that preserves as much as possible the short-term movements in the preliminary infra-annual figures, subject to the restrictions provided by the annual constraints.

The reconciliation problem refers to the adjustment of a system of \(m ({>}1)\) time series so that they are in accord with their low-frequency benchmarks and with at least one high-frequency aggregate series. In this case, both temporal aggregation constraints, for each individual series across the temporal dimension, and contemporaneous constraints, for each individual period across the variables of the system, must be satisfied. As for the benchmarking problem, this adjustment should be done according to some movement preservation principle, such that the temporal profiles of the original series are preserved to the highest possible degree.

Benchmarking and reconciliation problems are typically faced by statistical agencies in the production of official statistics. Typical examples are the compilation of quarterly supply and use tables (SUT) in national accounts, where the quarterly flows are required to satisfy more comprehensive level estimates from annual SUT and to be in line with the many row and column equalities of the tables; or the production of seasonally adjusted (SA) estimates following a direct approach, when SA series are required to be consistent with their annual unadjusted levels and SA aggregates need to be in line with SA components in any observed period.

Both benchmarking and reconciliation can be performed by setting up constrained minimization problems of some mathematical criterion aimed at preserving the movements in the sub-annual values as far as possible. It is commonly understood by many authors and practitioners that an ideal movement preservation principle should be formulated as an explicit preservation of the period-to-period rates of change of the preliminary series (Helfand et al. 1977; Bloem et al. 2001): the sum of the squared differences between the growth rates of the target series and the growth rates of the preliminary series is minimized. The Growth Rates Preservation (GRP) criterion, however, gives rise to a nonlinear problem (NLP), whose solution requires numerical optimization algorithms.

Denton (1971) proposed alternative movement preservation principles (i.e. objective criteria) for benchmarking that give rise to quadratic-linear optimization problems in the target values. The benchmarked values can thus be found using an explicit formula involving simple matrix operations. In particular, the Proportionate First Differences (PFD) criterion, one of the variants proposed by Denton, has been very successful in practical benchmarking applications. The PFD criterion seeks benchmarked estimates that minimize the sum of squared proportional differences between the target values and the unbenchmarked values.

Di Fonzo and Marini (2011) extended the PFD criterion to the reconciliation of a system of time series subject to both temporal and contemporaneous constraints. A simultaneous solution to the problem was proposed, which exploits the sparsity of the linear system to be solved. Furthermore, a two-step reconciliation strategy was recommended to reduce the complexity of the problem in the case of large systems: a benchmarking procedure is applied for each series at the first step, and then, the benchmarked series are reconciled year-by-year using a least squares balancing procedure. The work demonstrated empirically that a two-step procedure with a least squares adjustment proportional to the squared level of the benchmarked series at the second step results in a close approximation of the Denton PFD simultaneous solution.

At the individual series level (i.e. for univariate benchmarking), the PFD criterion has often been claimed to be a close approximation of the GRP principle. Owing to the satisfactory results obtained from the modified Denton PFD technique and from other related linear solutions (e.g. Dagum and Cholette 2006), little progress has been made towards the development of an efficient and robust optimization algorithm to minimize the nonlinear GRP function for benchmarking and reconciliation problems. A benchmarking procedure based on the GRP criterion was first implemented by Causey and Trager (1981; see also Trager 1982; Bozik and Otto 1988). To solve the NLP defined by the GRP criterion, Causey and Trager developed a steepest descent (SD) algorithm based on first-derivative information (i.e. the gradient). However, using only first-derivative information may result in inefficient procedures, characterized by slow convergence and possible trouble in finding actual minima of the objective function. More recently, Brown (2010, 2012; see also Titova et al. 2010) proposed a gradient-based procedure that uses a Conjugate Gradient (CG) algorithm, but the results were broadly in line with Causey and Trager's SD procedure in terms of efficiency and robustness of the solutions.

Our interest in developing GRP-based benchmarking and reconciliation procedures has been motivated by two reasons. First, we think that recent advances in optimization algorithms, along with the huge increase in computational power, make it possible to solve the nonlinear GRP problem more accurately and more rapidly than in the past. Second, once a more efficient optimization algorithm is available, we aim at assessing empirically how well the linear PFD solution approximates the optimal, nonlinear GRP objective function in both benchmarking and reconciliation problems.

In our recent works we have found that massive improvements in both efficiency and robustness of the benchmarking results can be obtained by exploiting both the analytical gradient vector and the Hessian matrix of the GRP function. First, Di Fonzo and Marini (2013a, b) found that an interior-point method (Nocedal and Wright 2006), which uses second-order derivative information, provides more accurate and faster solutions than gradient-based optimization procedures. Second, Di Fonzo and Marini (2012a) proposed a Newton's method with Hessian modification that can be applied after the original constrained benchmarking problem is transformed into an unconstrained problem.

With these effective implementations of the GRP benchmarking procedure we have been able to clarify the nature of the PFD approximation. Using a simulation exercise, we showed the conditions under which the PFD solution provides a close approximation to GRP (Di Fonzo and Marini 2013a): the approximation works particularly well when the movements in the preliminary series are smooth (low variance in the growth rates) and the relationship with the (annual) target series is relatively stable (no changes in the ratio between their levels), but deteriorates as soon as the preliminary series is lumpy (or affected by strong seasonal effects) or presents sudden level shifts compared to the target series.

In this paper we continue our research on the GRP criterion by extending this approach to the reconciliation of systems of time series subject to both temporal and contemporaneous constraints. We show that using an interior-point method to solve the resulting constrained nonlinear optimization problem is a fast and feasible choice even for very large systems. Then, we compare the results with the PFD-based reconciliation procedures presented in Di Fonzo and Marini (2011) in order to demonstrate the effectiveness of the GRP adjustment in terms of both computational effort and quality of the results.

The paper is organized as follows. In Sect. 2 the optimization problem in terms of GRP is discussed and compared to the classical benchmarking procedure by Denton (1971). The extension to reconciliation with both temporal and contemporaneous constraints is then presented. In Sect. 3 we introduce the interior-point method, and in Sect. 4 we discuss two-step reconciliation procedures based on the GRP criterion. In order to analyze the distinctive features of the proposed procedures, Sect. 5 presents two applications of the new reconciliation procedures for real-life systems of series, namely 175 quarterly series from the EU Quarterly Sector Accounts (EUQSA), and 236 monthly series from the Canadian Monthly Retail Trade Survey (MRTS). Section 6 presents some final remarks and conclusions.

2 Growth rates preservation in benchmarking and reconciliation

Benchmarking and reconciliation problems are solved through the minimization of an objective function of the unobserved values of one or more target series, which must satisfy given temporal and contemporaneous aggregation constraints. Let us denote the target series by \(y_{j,t}\), where the two sub-indices indicate the cross-sectional dimension and the temporal dimension, respectively: \(j=1,\ldots ,m\) and \(t=1,\ldots ,n\), with \(m\) the number of series considered (if \(m=1\), we have a univariate benchmarking problem) and \(n\) the number of high-frequency periods. Each series is denoted in vector form as \(\mathbf{y}_j=[y_{j,1}, y_{j,2},\ldots , y_{j,n}]^\prime \). The whole system of series can be conveniently stacked in a single vector as \(\mathbf{y} = [\mathbf{y}^\prime _1, \ldots ,\mathbf{y}^\prime _m]^\prime \). In the following, we assume that for benchmarking and reconciliation problems the system constraints take the general form

$$\begin{aligned} \mathbf{A}\mathbf{y}=\mathbf{b} \end{aligned}$$
(1)

where \(\mathbf{A}\) is a \((r \times mn)\) matrix of given real numbers defining the relationships between the \(mn\) observations, and \(\mathbf{b}\) is the \(r\)-dimensional vector of the known quantities (benchmarks) of the system (for details, see Appendix A in Di Fonzo and Marini 2012b).
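To fix ideas, the temporal aggregation part of \(\mathbf{A}\) can be built with a Kronecker product. The sketch below (in Python; the paper's own computations were done in MATLAB, and the function name is ours) constructs the matrix that sums each block of \(s\) high-frequency periods into one low-frequency total, for a single series:

```python
import numpy as np

def temporal_aggregation_matrix(N, s):
    """(N x N*s) matrix summing each block of s high-frequency periods
    into one low-frequency total (illustrative sketch)."""
    return np.kron(np.eye(N), np.ones((1, s)))

# Example: N = 2 years of quarterly data (s = 4, so n = 8)
A = temporal_aggregation_matrix(2, 4)
y = np.arange(1.0, 9.0)   # a toy high-frequency series
b = A @ y                 # the implied annual benchmarks: [10, 26]
```

For a system of \(m\) series, block-diagonal stacking of such matrices, plus rows encoding the contemporaneous (e.g. accounting) identities, yields the full constraint matrix \(\mathbf{A}\) of (1).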

The objective function is usually set as a distance metric between the unobserved target series \(y_{j,t}\) and some preliminary series \(p_{j,t}\) observed at the same frequency. For economic series, especially those observed at the infra-annual level, it is sensible to define this metric in terms of movements of the series: the user of such statistics is generally much more interested in the dynamics of a monthly or quarterly series (e.g. how much it has grown since the last month or quarter) than in its level (e.g. what the level is in the month or quarter). For this reason, objective functions for benchmarking and reconciliation problems are most commonly known as movement preservation principles.

2.1 Benchmarking (\(m=1\))

For benchmarking, Causey and Trager (1981) consider a criterion to be minimized that is explicitly related to the growth rate, a natural measure of the movement of a time series:

$$\begin{aligned} f_\textit{GRP} = \displaystyle \sum _{t=2}^n \left( \frac{y_{t}-y_{t-1}}{y_{t-1}} - \frac{p_{t}-p_{t-1}}{p_{t-1}} \right) ^2 = \displaystyle \sum _{t=2}^n \left( \frac{y_{t}}{y_{t-1}} - \frac{p_{t}}{p_{t-1}} \right) ^2. \end{aligned}$$
(2)

The benchmarked values \(y_t^*,\, t=1,\ldots ,n\), minimize the criterion (2) subject to the aggregation constraints \(\sum _{t\in T}y_t = Y_T,\, T=1,\ldots ,N\), where index \(T\) denotes the low-frequency period (e.g. the year). In other words, the benchmarked series is estimated in such a way that its temporal dynamics, as expressed by the growth rates \(\left( \displaystyle \frac{y_t - y_{t-1}}{y_{t-1}}\right) , t=2,\ldots , n\), be ‘as close as possible’ to the temporal dynamics of the preliminary series, where the ‘distance’ from the preliminary growth rates \(\left( \displaystyle \frac{p_t - p_{t-1}}{p_{t-1}}\right) \) is given by the sum of the squared differences.
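Criterion (2) is straightforward to evaluate. A minimal sketch (function name ours) computing \(f_\textit{GRP}\) for given target and preliminary series:

```python
import numpy as np

def f_grp(y, p):
    """Growth Rates Preservation criterion (2): sum of squared differences
    between the growth rates of the target and preliminary series."""
    y, p = np.asarray(y, float), np.asarray(p, float)
    return float(np.sum((y[1:] / y[:-1] - p[1:] / p[:-1]) ** 2))
```

The criterion is zero when the target series reproduces the growth rates of the preliminary series exactly, e.g. when \(y_t = c\,p_t\) for a constant \(c\).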

Two observations are in order. First, looking at the criterion \(f_\textit{GRP}\), it clearly appears that it is grounded on an “ideal” movement preservation principle, “formulated as an explicit preservation of the period-to-period rate of change” of the preliminary series (Bloem et al. 2001, p. 100). Second, the constrained minimization of \(f_\textit{GRP}\) is nonlinear in the target values \(y_{t}\) and does not have linear first-order conditions for a stationary point, so an explicit, analytic expression for the solution cannot be found; a solution can only be obtained using nonlinear optimization algorithms. We think that this complication has made the GRP benchmarking approach less appealing for data-producing agencies and has limited its use in practical applications.

Denton (1971) proposed a benchmarking procedure grounded on the Proportionate First Differences (PFD) between the target and the preliminary series:Footnote 1

$$\begin{aligned} f_\textit{PFD} = \displaystyle \sum _{t=2}^n \left( \frac{y_{t}-p_{t}}{p_{t}} - \frac{y_{t-1}-p_{t-1}}{p_{t-1}} \right) ^2 = \displaystyle \sum _{t=2}^n \left( \frac{y_{t}}{p_{t}} - \frac{y_{t-1}}{p_{t-1}} \right) ^2. \end{aligned}$$
(3)

Since \(f_\textit{PFD}\) is a quadratic function of the target values, the benchmarked series can be expressed in closed form and its values found with simple matrix operations (Dagum and Cholette 2006; Di Fonzo and Marini 2013a).
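The closed form follows from the first-order (KKT) conditions of the quadratic program: writing \(f_\textit{PFD} = \mathbf{y}^\prime \mathbf{Q}\mathbf{y}\) with \(\mathbf{Q}\) built from a first-difference matrix and \(\mathrm{diag}(\mathbf{p})^{-1}\), the benchmarked series solves a linear system. A sketch of the modified Denton PFD procedure along these lines (function name ours; the cited references give the authors' own formulations):

```python
import numpy as np

def denton_pfd(p, b, s):
    """Modified Denton PFD benchmarking (sketch): minimize criterion (3)
    subject to the temporal constraints sum_{t in T} y_t = b_T."""
    p = np.asarray(p, float)
    n, N = p.size, len(b)
    A = np.kron(np.eye(N), np.ones((1, s)))   # temporal aggregation rows
    D = np.diff(np.eye(n), axis=0)            # (n-1 x n) first differences
    Dp = D / p                                # D @ diag(1/p)
    Q = Dp.T @ Dp                             # f_PFD = y' Q y
    # KKT system for min y'Qy subject to Ay = b
    K = np.block([[2 * Q, A.T], [A, np.zeros((N, N))]])
    rhs = np.concatenate([np.zeros(n), np.asarray(b, float)])
    return np.linalg.solve(K, rhs)[:n]
```

For a constant preliminary series the solution simply rescales it to meet the benchmark, since a proportional rescaling leaves all proportionate first differences at zero.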

In the literature (Cholette 1984; Bloem et al. 2001; Dagum and Cholette 2006) it is often claimed that the PFD procedure produces results very close to GRP benchmarking, and indeed \(f_\textit{GRP}\) and \(f_\textit{PFD}\) are closely related. In fact, it can be seen that

$$\begin{aligned} \sum _{t=2}^n \left( \frac{y_{t}}{p_{t}} - \frac{y_{t-1}}{p_{t-1}} \right) ^2 = \sum _{t=2}^{n} \left[ \frac{y_{t-1}}{p_t} \left( \frac{y_{t}}{y_{t-1}}-\frac{p_{t}}{p_{t-1}} \right) \right] ^2. \end{aligned}$$
(4)

The term in parentheses on the right-hand side is the difference between the growth rates of the target series and those of the preliminary series, namely the summand of (2). In the PFD criterion these terms are weighted by the ratio between the target series at \(t-1\) and the preliminary series at \(t\). When these ratios are relatively stable over time, which is the case when the ‘benchmark-to-indicator ratio’ \(\frac{ \displaystyle Y_T}{\sum _{t \in T} \displaystyle p_t},\, T=1,\ldots ,N\) (Bloem et al. 2001), is a smooth series, criteria (2) and (3) are very close to each other. On the contrary, when the ratios \(\frac{ \displaystyle y_{t-1}}{ \displaystyle p_t}\) behave erratically, each term of the right-hand side summation in expression (4) is over- (or under-) weighted according to the specific relationship between the target and preliminary series in that period. For example, sudden breaks in the movements of \(y_{t-1}/p_t\) might arise in case of large differences between the annual benchmarks and the annually aggregated preliminary series. Di Fonzo and Marini (2013a) studied this issue empirically, showing that PFD and GRP benchmarked estimates are close when the variability of the preliminary series and/or its bias with respect to the target variable are low. When this is not the case (e.g. preliminary series with large growth rates and/or bias), the GRP and PFD results diverge.
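Identity (4) is purely algebraic and can be checked numerically term by term on arbitrary nonzero data (toy values below, generated at random):

```python
import numpy as np

# Numerical check of identity (4): each PFD summand equals the GRP
# summand weighted by y_{t-1} / p_t.
rng = np.random.default_rng(0)
y = rng.uniform(1.0, 2.0, 6)   # any nonzero series works
p = rng.uniform(1.0, 2.0, 6)
lhs = (y[1:] / p[1:] - y[:-1] / p[:-1]) ** 2
rhs = ((y[:-1] / p[1:]) * (y[1:] / y[:-1] - p[1:] / p[:-1])) ** 2
assert np.allclose(lhs, rhs)
```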

A GRP benchmarking problem can be solved very efficiently by using Newton's optimization methods, which exploit the analytical expressions of the gradient and the Hessian of \(f_\textit{GRP}\) (Di Fonzo and Marini 2012a, 2013b). First, the class of methods known as interior-point (IP) methods (also referred to as barrier methods, Nocedal and Wright 2006), which has proved to be fast and accurate for many nonlinear constrained optimization problems, has been considered. Second, a Newton's method with Hessian modification applied to a suitably reduced, unconstrained problem has been developed.

2.2 Reconciliation (\(m>1\))

For reconciliation problems, both (2) and (3) can be extended to consider all the \(m\) series in the system. The global GRP criterion is defined as

$$\begin{aligned} F_\textit{GRP} = \displaystyle \sum _{j=1}^m \sum _{t=2}^n \left( \frac{y_{j,t}-y_{j,t-1}}{y_{j,t-1}} - \frac{p_{j,t}-p_{j,t-1}}{p_{j,t-1}} \right) ^2 = \displaystyle \sum _{j=1}^m \sum _{t=2}^n \left( \frac{y_{j,t}}{y_{j,t-1}} - \frac{p_{j,t}}{p_{j,t-1}} \right) ^2 \end{aligned}$$
(5)

whereas the global PFD function is

$$\begin{aligned} F_\textit{PFD} = \displaystyle \sum _{j=1}^m \sum _{t=2}^n \left( \frac{y_{j,t}-p_{j,t}}{p_{j,t}} - \frac{y_{j,t-1}-p_{j,t-1}}{p_{j,t-1}} \right) ^2 = \displaystyle \sum _{j=1}^m \sum _{t=2}^n \left( \frac{y_{j,t}}{p_{j,t}} - \frac{y_{j,t-1}}{p_{j,t-1}} \right) ^2. \end{aligned}$$
(6)

Since there is no cross-sectional interaction between different series in (5) and (6), the relationship between the global criteria follows straightforwardly from (4):

$$\begin{aligned} \sum _{j=1}^m \sum _{t=2}^n \left( \frac{y_{j,t}}{p_{j,t}} - \frac{y_{j,t-1}}{p_{j,t-1}} \right) ^2 = \sum _{j=1}^m \sum _{t=2}^{n} \left[ \frac{y_{j,t-1}}{p_{j,t}} \left( \frac{y_{j,t}}{y_{j,t-1}}-\frac{p_{j,t}}{p_{j,t-1}} \right) \right] ^2. \end{aligned}$$
(7)

A reconciliation procedure based on \(F_\textit{PFD}\) was presented by Di Fonzo and Marini (2011). As with the benchmarked estimates, the reconciled estimates can be derived as (part of) the solution of a linear system. Compared to the benchmarking case, however, there are two major complications in the coefficient matrix of the system: (1) its size tends to be large, growing rapidly with the number of variables and the number of periods, and (2) the presence of constraints of different nature causes rank deficiency. Using very efficient sparse algorithms available in MATLAB, Di Fonzo and Marini (2011) showed that a simultaneous adjustment of all the variables in the system is still feasible even when the system is very large. Two-step reconciliation procedures were also considered, because they are computationally less demanding if sparse matrix facilities are not available (we will consider them again in Sect. 4).

To our knowledge, a reconciliation procedure minimizing the global GRP criterion (5) has never been attempted. The two complications mentioned before, large size and rank deficiency of the system matrix, are likely to make the application of nonlinear optimization algorithms even more difficult, if not impossible. However, the Newton-type algorithms considered for the GRP benchmarking case have proved to be very efficient and robust for a single series, and look promising for dealing with many variables and many constraints at the same time. Similarly to our previous works on benchmarking, our main interests in this paper are:

  • to verify whether reconciliation problems (possibly large and complex) based on the minimization of the global GRP criterion (5) can effectively be solved using Newton's methods exploiting second-order information;

  • once an effective GRP reconciliation procedure is developed, to assess through practical applications how closely the PFD reconciled estimates approximate the GRP solution.

3 Optimization algorithms for the GRP problem

3.1 Temporal benchmarking

The GRP criterion \(f_{GRP}\) is a nonlinear function of the target values. More precisely, it can be shown that it is a non-convex function (Di Fonzo and Marini 2012a). Differently from the PFD case, the constrained minimization problem based on the GRP function does not have linear first-order conditions for a stationary point, and therefore it is not possible to find an explicit, analytic expression for the solution. On the other hand, provided that both \(p_t\) and \(y_t,\, t=1,\ldots ,n-1\), are different from zero,Footnote 2 \(f_{GRP}\) is a twice continuously differentiable function, which makes it possible to use several iterative minimization algorithms (Nocedal and Wright 2006).

The first implementation of an algorithm for the GRP benchmarking problem was given by Causey and Trager (1981), who used a constrained steepest descent algorithm based on first-derivative information (i.e. the gradient) of the GRP function. The minimization problem is solved in the original variables \(y_t\) by using a feasible direction method, according to which at each iteration the unconstrained search direction is projected onto the feasible set of solutions defined by the constraints of the problem. Causey and Trager's procedure is implemented in the DOS-executable programme BMK1, which has been used extensively by the US Census Bureau. Brown (2010, 2012) has recently proposed a similar feasible direction method based on the conjugate gradient algorithm and implemented in SAS. However, according to the results, this implementation seems to offer little improvement over the original procedure by Causey and Trager.

As a matter of fact, gradient-based algorithms may result in inefficient procedures, characterized by slow convergence and possible trouble in finding actual minima of the objective function. Improvements in both efficiency and robustness may be obtained by considering second-order information from the objective function, i.e. the Hessian matrix.

Di Fonzo and Marini (2012a) developed an efficient Newton's method with Hessian modification to solve the GRP benchmarking problem. The algorithm consists of the following steps. First, the analytical expression of the Hessian of the GRP function is derived; second, the original constrained (benchmarking) problem is transformed into an equivalent unconstrained problem; third, a Newton's method that allows a modification of the Hessian in order to preserve positive definiteness is applied. The Newton's method was compared with different gradient-based procedures (including the feasible steepest descent algorithm implemented in BMK1) through a benchmarking exercise involving hundreds of series. In all the cases considered, the Newton's method significantly outperformed gradient-based methods in terms of both accuracy of the solution and convergence rates.
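The third step above (Newton with Hessian modification) can be sketched generically: whenever the Hessian is not positive definite, a multiple of the identity is added until a Cholesky factorization succeeds, guaranteeing a descent direction; a backtracking line search then fixes the step length. The sketch below is a generic illustration of this technique (demonstrated on the non-convex Rosenbrock function, not on the GRP function itself, whose gradient and Hessian are given in the Appendix):

```python
import numpy as np

def modified_newton(f, grad, hess, x0, tol=1e-8, max_iter=100):
    """Newton's method with Hessian modification (sketch): shift the
    Hessian towards positive definiteness, then backtrack on the step."""
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        tau = 0.0
        while True:                    # smallest tried shift making H + tau*I SPD
            try:
                np.linalg.cholesky(H + tau * np.eye(x.size))
                break
            except np.linalg.LinAlgError:
                tau = max(2 * tau, 1e-3)
        d = np.linalg.solve(H + tau * np.eye(x.size), -g)
        a = 1.0                        # Armijo backtracking line search
        while f(x + a * d) > f(x) + 1e-4 * a * (g @ d):
            a *= 0.5
        x = x + a * d
    return x

# Toy demonstration: the non-convex Rosenbrock function
f = lambda x: 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2
g = lambda x: np.array([-400 * x[0] * (x[1] - x[0] ** 2) - 2 * (1 - x[0]),
                        200 * (x[1] - x[0] ** 2)])
h = lambda x: np.array([[1200 * x[0] ** 2 - 400 * x[1] + 2, -400 * x[0]],
                        [-400 * x[0], 200.0]])
x_opt = modified_newton(f, g, h, [-1.2, 1.0])
```

Since the shifted Hessian is symmetric positive definite, the computed direction is always a descent direction, so the backtracking search terminates.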

3.2 Reconciliation of systems of time series

In this paper we extend the GRP approach to the reconciliation of systems of time series. To our knowledge, the simultaneous reconciliation of time series subject to both temporal and contemporaneous constraints according to the global GRP criterion has never been considered in the literature. Due to the large and sparse nature of the constrained optimization problem for \(F_{GRP}\), we only consider an interior-point (IP) method, which is a powerful algorithm for solving large-scale nonlinear constrained problems (Nocedal and Wright 2006).Footnote 3

The IP method requires the gradient vector and the Hessian matrix of \(F_{GRP}\), whose analytical expressions are shown in the “Appendix”.

In a sense, IP methods represent the most advanced methods for solving large and sparse reconciliation problems based on the global GRP nonlinear function. One appealing feature of IP methods is that they are “infeasible” algorithms; that is, it is not required that the method start from a point in the feasible region. This feature is quite relevant in solving reconciliation problems, because it allows the user to start the algorithm by using the preliminary series \(p_{j,t}\).Footnote 4
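As a rough illustration of this setup (not the authors' implementation), a general-purpose constrained solver can minimize the GRP criterion subject to linear constraints of the form (1), starting from the preliminary series itself. The sketch below uses SciPy's `trust-constr` method, an interior-point-style trust-region algorithm for constrained problems; the function name is ours:

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

def grp_benchmark(p, b, s):
    """GRP benchmarking via a general-purpose constrained solver (sketch).
    The start point is the (infeasible) preliminary series itself."""
    p = np.asarray(p, float)
    n, N = p.size, len(b)
    A = np.kron(np.eye(N), np.ones((1, s)))   # temporal constraints A y = b
    rp = p[1:] / p[:-1]                       # preliminary growth factors

    def f(y):                                 # criterion (2)
        return np.sum((y[1:] / y[:-1] - rp) ** 2)

    res = minimize(f, p, method="trust-constr",
                   constraints=LinearConstraint(A, b, b),
                   options={"gtol": 1e-10, "xtol": 1e-12})
    return res.x
```

For a constant preliminary series the GRP solution coincides with the PFD one (a rescaling that meets the benchmark), which provides a simple sanity check.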

4 GRP in two-step reconciliation procedures

When the system of time series is very large, a simultaneous solution can be operationally difficult to apply, especially if the practitioner either does not intend to or cannot use sparse matrix computation facilities. Simplified solutions are however possible, based on a generalization of the two-step approach proposed by Quenneville and Rancourt (2005) for restoring the additivity of a system of SA time series such that their sum is in line with directly derived SA totals: in the first step, a univariate benchmarking procedure (e.g. the modified Denton PFD benchmarking procedure or the more general regression-based benchmarking procedure by Cholette and Dagum 1994) is used to restore the temporal additivity of every series; in the second step, the component series are reconciled 1 year at a time using a least squares balancing procedure.

Di Fonzo and Marini (2011) discussed and applied to real-life systems of time series two-step procedures based on the PFD criterion, and compared them with the simultaneous PFD reconciliation solution based on (6). In the first step, the modified Denton PFD benchmarking technique is used. In the second step, instead, two alternative least-squares adjustments of the benchmarked estimates are employed to reconcile them with the contemporaneous constraints of the system 1 year at a time. Denoting with \(y_{j,t}\) the target (reconciled) series, and with \(y^{\textit{PFD}}_{j,t}\) the benchmarked series obtained at the first step with the modified Denton PFD method, the second step is based on the following objective criteria (Di Fonzo and Marini 2011):Footnote 5

$$\begin{aligned} F_T^{\textit{PFD}-\textit{BB}}= & {} \displaystyle \sum _{j=1}^m\sum _{t=(T-1)s+1}^{Ts} \displaystyle \frac{\left( y_{j,t}-y^\textit{PFD}_{j,t} \right) ^2}{|y^{PFD}_{j,t}|}, \quad T=1,\ldots ,N \end{aligned}$$
(8)
$$\begin{aligned} F_T^{\textit{PFD}-\textit{ST}}= & {} \displaystyle \sum _{j=1}^m\sum _{t=(T-1)s+1}^{Ts} \left( \displaystyle \frac{ y_{j,t}-y^\textit{PFD}_{j,t} }{y^\textit{PFD}_{j,t}}\right) ^2, \quad T=1,\ldots ,N \end{aligned}$$
(9)

where \(s\) is the order of temporal aggregation (e.g. \(s=3\) for monthly/quarterly aggregation, \(s=4\) for quarterly/annual, \(s=12\) for monthly/annual). For each low-frequency period \(T\), a constrained optimization of either function (8) or (9) is performed, with constraints given by suitable subsets of the system constraints (1), which can be written by analogy as \( \mathbf{A}_T \mathbf{y}_T = \mathbf{b}_T \), where \(\mathbf{A}_T,\, \mathbf{y}_T\), and \(\mathbf{b}_T\) have dimensionsFootnote 6 \((h_T \times ms),\, (ms \times 1)\), and \((h_T \times 1)\), respectively, with \(\displaystyle \sum \nolimits _{T=1}^N h_T = r\), the row-dimension of matrix \(\mathbf{A}\) in (1).
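Since (8) and (9) are diagonally weighted least-squares criteria, each period's balancing problem has a closed-form solution: minimizing \(\sum_i (y_i - \tilde{y}_i)^2 / w_i\) subject to \(\mathbf{A}_T \mathbf{y}_T = \mathbf{b}_T\) gives \(\mathbf{y}_T = \tilde{\mathbf{y}}_T + \mathbf{W}\mathbf{A}_T^\prime (\mathbf{A}_T \mathbf{W} \mathbf{A}_T^\prime)^{-1}(\mathbf{b}_T - \mathbf{A}_T \tilde{\mathbf{y}}_T)\), with \(\mathbf{W} = \mathrm{diag}(w)\). A sketch (function name ours; \(\mathbf{A}_T\) is assumed to have full row rank):

```python
import numpy as np

def balance_period(yb, A_T, b_T, weights):
    """Second-step least-squares balancing for one low-frequency period
    (sketch): min sum_i (y_i - yb_i)^2 / w_i  subject to  A_T y = b_T.
    weights = |yb| corresponds to criterion (8), weights = yb**2 to (9)."""
    yb = np.asarray(yb, float)
    W = np.diag(np.asarray(weights, float))
    lam = np.linalg.solve(A_T @ W @ A_T.T, b_T - A_T @ yb)
    return yb + W @ A_T.T @ lam
```

With weights proportional to the squared levels (criterion (9)), the discrepancy is spread in proportion to each component's level, which is what preserves the benchmarked temporal dynamics best.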

As shown by Di Fonzo and Marini (2011), the choice of the criterion on which the second step is grounded may have a clear impact on the temporal dynamics of the reconciled series, the adjustments to the preliminary growth rates due to criterion (8) being generally larger than those produced by criterion (9).

However, in our previous work we considered only the modified Denton PFD technique at the first step. We did not investigate other options because our main concern was to compare two-step procedures with the simultaneous PFD reconciliation procedure. Now we are instead interested in two-step procedures having the GRP benchmarking technique at the first step, and a least-squares adjustment at the second step alternatively based on the following two criteria:

$$\begin{aligned} F_T^{\textit{GRP}-\textit{BB}}= & {} \displaystyle \sum _{j=1}^m\sum _{t=(T-1)s+1}^{Ts} \displaystyle \frac{\left( y_{j,t}-y^\textit{GRP}_{j,t} \right) ^2}{|y^\textit{GRP}_{j,t}|} \end{aligned}$$
(10)
$$\begin{aligned} F_T^{\textit{GRP}-\textit{ST}}= & {} \displaystyle \sum _{j=1}^m\sum _{t=(T-1)s+1}^{Ts} \left( \displaystyle \frac{ y_{j,t}-y^\textit{GRP}_{j,t} }{y^\textit{GRP}_{j,t}}\right) ^2. \end{aligned}$$
(11)

As for the PFD case, we wish (1) to assess how much alternative choices of the criterion adopted in the second step affect the temporal dynamics of the reconciled series as compared to the preliminary ones, and (2) to obtain empirical confirmation of the expectation that the two-step procedure based on the GRP benchmarking method at the first step and a least-squares adjustment based on (11) at the second step is a close approximation to the simultaneous GRP reconciliation procedure.

5 Applications

In this section we consider the reconciliation of two systems of time series:Footnote 7 (1) the EUQSA system, 175 quarterly variables of the European Union’s quarterly national accounts (28 quarters) to be reconciled with known annual totals and 30 accounting constraints, and (2) the MRTS system, 236 monthly series of Canadian seasonally adjusted retail trade by provinces (156 months) to be reconciled with annual unadjusted totals and 32 contemporaneous constraints for geographical aggregations (for more details on the two datasets, see Di Fonzo and Marini 2011).

Aggregate statistics on both temporal \(\left( \sum \nolimits _{t \in T} p_{j,t} - Y_{j,T},\, j=1, \ldots , m,\, T=\right. \left. 1, \ldots , N\right) \) and contemporaneous \(\left( \sum \nolimits _{j=1}^m p_{j,t} - z_t,\, t=1, \ldots , n\right) \) discrepancies in the two systems are presented in Table 1. The table presents averages of (in order) mean, absolute mean, standard deviation, minimum, maximum, and range of the discrepancy for both temporal and contemporaneous constraints. The average discrepancy is expressed in % of the benchmark values.Footnote 8 It can be noted that EUQSA presents much higher discrepancies than MRTS in both temporal and contemporaneous constraints. Discrepancies in EUQSA are mostly generated by the fact that quarterly preliminary figures are available for just a few member states and not all the countries of the European Union (data-driven discrepancy); on the contrary, MRTS discrepancies come from the direct application of seasonal adjustment to each component in the system (procedure-driven discrepancy).

Table 1 Average statistics on discrepancy (in % of benchmark values) in the EUQSA and MRTS systems

We extend the comparison in Di Fonzo and Marini (2011) by including reconciliation procedures based on the GRP criterion. We apply two simultaneous procedures

  • Sim GRP, through the application of the interior-point method presented in Sect. 3 to minimize the global criterion (5);

  • Sim PFD, through the direct solution of the linear system coming from the constrained minimization of the global criterion (6);

and four two-step procedures

  • PFD-BB: the modified Denton PFD benchmarking is applied at the first step, and a least-squares adjustment based on (8) at the second step;

  • PFD-ST: the modified Denton PFD benchmarking is applied at the first step, and a least-squares adjustment based on (9) at the second step;

  • GRP-BB: the GRP benchmarking is applied at the first step, and a least-squares adjustment based on (10) at the second step;

  • GRP-ST: the GRP benchmarking is applied at the first step, and a least-squares adjustment based on (11) at the second step.

In order to assess the performance of the procedures, for each series we calculate the Mean Absolute Adjustment (MAA) to the percentage growth rates, that is

$$\begin{aligned} \textit{MAA}_j = 100 \times {\displaystyle \frac{1}{n-1}\displaystyle \sum _{t=2}^n \left| r^y_{j,t} - r^p_{j,t} \right| }, \quad j=1,\ldots ,m \end{aligned}$$

where \( r^y_{j,t}=( y_{j,t}-y_{j,t-1} ) /y_{j,t-1} \) and \(r^p_{j,t}=(p_{j,t}-p_{j,t-1}) /p_{j,t-1} \) are the growth rates of the reconciled and preliminary series, respectively. Overall indices for the whole system of time series are calculated accordingly. In our previous works, we used the Root Mean Squared Adjustment (RMSA) statistic to assess the adjustment. The RMSA measures the adjustment in terms of growth rates, which is the most natural choice for assessing the effects of the reconciliation process. The RMSA, however, is based on the square root of the GRP criterion, which is minimized by construction by GRP-based procedures. Compared with the RMSA, the MAA represents a more neutral statistic of growth rate preservation for evaluating and comparing the performance of the GRP and PFD procedures.
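The MAA statistic for a single series can be sketched directly from its definition (function name ours):

```python
import numpy as np

def maa(y, p):
    """Mean Absolute Adjustment to the percentage growth rates of one series."""
    y, p = np.asarray(y, float), np.asarray(p, float)
    ry = (y[1:] - y[:-1]) / y[:-1]   # reconciled growth rates
    rp = (p[1:] - p[:-1]) / p[:-1]   # preliminary growth rates
    return 100.0 * np.mean(np.abs(ry - rp))
```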

Table 2 shows summary statistics on the MAA values for the two systems. Sim GRP outperforms the other procedures in both systems, with the lowest mean, median and standard deviation. Sim GRP is also the procedure with the highest number of series with minimum MAA value (41.1 and 29.2 %).

Table 2 Summary statistics of MAA for the EUQSA and MRTS systems

As expected, we note that Sim PFD is a good approximation of Sim GRP. The median MAA for EUQSA is 0.421 % for Sim PFD and 0.378 % for Sim GRP; on MRTS we find 0.487 % for Sim PFD versus 0.456 % for Sim GRP. Nevertheless, with Sim PFD we notice a higher standard deviation of MAA (2.367 vs. 1.464 %) and a maximum equal to 24.455 %, much larger than the 6.366 % of Sim GRP (we return to this difference below).

Concerning the two-step procedures, we note that using the GRP benchmarking at the first step slightly improves the results compared with using the modified Denton PFD benchmarking procedure. However, the choice of the type of adjustment at the second step is much more important: whatever the first step is, using ST instead of BB guarantees an overall smaller adjustment in terms of MAA.

Looking at the two-step procedures using ST at the second step (PFD-ST and GRP-ST), we note a difference in the maximum MAA value similar to that found when comparing Sim PFD and Sim GRP (24.396 % for PFD-ST and 6.915 % for GRP-ST). It appears that using GRP makes the adjustment process less volatile across the variables in the system, whether GRP is applied at the first step (provided ST is used) or as a simultaneous procedure. Similarly to the already known relationship between PFD-ST and Sim PFD, the GRP-ST reconciliation procedure turns out to be a close approximation of Sim GRP.
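The distinction between the BB and ST second steps can be sketched as a weighted distribution of each period's contemporaneous discrepancy, with weights proportional to the level (BB) or the squared level (ST) of the temporally benchmarked series. The sketch below is a deliberate simplification: it only illustrates the per-period distribution logic, whereas the actual least-squares adjustments in (8)–(11) also take the temporal constraints into account; the function name and array layout are our own assumptions.

```python
import numpy as np

def second_step_adjust(Y, z, weights="ST"):
    """Distribute each period's contemporaneous discrepancy across the
    m benchmarked series, with weights proportional to the level (BB)
    or the squared level (ST) of each series.

    Y : (m, n) array of temporally benchmarked series
    z : (n,) aggregate the columns of Y must sum to
    """
    Y = np.asarray(Y, dtype=float)
    w = Y ** 2 if weights == "ST" else np.abs(Y)
    disc = z - Y.sum(axis=0)             # per-period discrepancy
    return Y + w * disc / w.sum(axis=0)  # weighted proportional shares
```

With ST weights the squared levels make the shares grow faster with the size of the series, which is consistent with the finding above that ST penalizes the smaller series less than BB does.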

Our next step is to relate the MAA values to the size of the variables in the system. We restrict our attention to Sim GRP, Sim PFD, and the two-step procedure PFD-BB. In Di Fonzo and Marini (2011) it is shown that PFD-BB preserves the movements in larger variables better than in smaller ones (in terms of the RMSA statistic), while Sim PFD distributes the adjustment more uniformly across the variables. We are interested in seeing whether a similar picture emerges with the MAA statistic and, more importantly, how the adjustment resulting from Sim GRP is distributed. Figure 1 shows a scatter plot for EUQSA between the size of the variables (\(x\)-axis), each expressed as a percentage of their total sum, and the MAA values (\(y\)-axis) for the three procedures. The presence of ‘\(\times \)’ marks along the \(y\)-axis, and those associated with smaller MAA values along the \(x\)-axis, confirms that PFD-BB tends to over-adjust smaller variables in favour of the larger ones. The isolated green dot for Sim PFD on the \(y\)-axis signals that the maximum adjustment from this procedure (24.455 %) is made to a very small variable. It can be noticed that the adjustments from Sim GRP are more evenly distributed than those from Sim PFD.

Fig. 1

EUQSA system. MAA by share (%) of variables for Sim GRP, Sim PFD and PFD-BB

In previous GRP-PFD comparisons on univariate benchmarking (Di Fonzo and Marini 2013a), GRP was found to outperform PFD when large discrepancies and/or high variability in the preliminary series were present. It is thus interesting to look at the scatter plots of MAA versus (1) the average temporal discrepancy (in % and absolute value, Fig. 2), and (2) the coefficient of variation (in absolute value, Fig. 3), taken as a standardized measure of variability of the series. Moving along the \(x\)-axis we find the variables with higher temporal discrepancy and larger variability, respectively. In Fig. 2 we notice that the maximum adjustment from Sim PFD moves to the right-hand side of the plot, which identifies that variable as the one with the highest (percentage) temporal discrepancy in the system. Furthermore, from Fig. 3 we note that the highest values of MAA for PFD-BB (i.e. the highest ‘\(\times \)’s in the scatter plot) are attained by variables with medium–high variability in the system. These distributional aspects of the adjustment in the two systems confirm that, even in a reconciliation process of a system of time series, PFD deviates from GRP when discrepancies are large and the preliminary series is volatile.

Fig. 2

EUQSA system. MAA by average temporal discrepancy (in absolute value) for Sim GRP, Sim PFD and PFD-BB

Fig. 3

EUQSA system. MAA by coefficient of variation (in absolute value) of variables for Sim GRP, Sim PFD and PFD-BB

Finally, Table 3 presents an indication of the computational time needed to complete the reconciliation of the two systems of time series. Times are expressed in seconds and are derived as the average of five sequential executions of the same reconciliation process. The simultaneous GRP solution takes less than 1 s to reconcile EUQSA and about 18 s for MRTS. Noting the difference in computational times between EUQSA and MRTS, we may expect that even the interior-point method could become infeasible when the system dimension is very large. In that case, the two-step procedure GRP-ST would represent a valuable alternative to Sim GRP.

Table 3 Average computational times (in seconds) to perform the reconciliation process

6 Conclusions

In this paper we have proposed simultaneous and two-step reconciliation procedures based on the GRP movement principle, which is widely recognized as the most natural choice for preserving the movements in an economic series. To solve the nonlinear GRP problem we have resorted to Newton's optimization methods that exploit the full derivative information of the problem, namely the gradient and the Hessian. Using two systems of economic series with both temporal and cross-sectional constraints, we have shown that these procedures are accurate, feasible, and time-efficient in finding an optimal solution of the GRP problem.
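As a hedged illustration of the nonlinear GRP problem in its simplest (univariate benchmarking) form, the sketch below minimizes the GRP criterion subject to the temporal aggregation constraints using SciPy's general-purpose `trust-constr` solver, not the tailored interior-point implementation used in the paper; the starting value and the flow-series assumption are ours.

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

def grp_benchmark(p, b, s=4):
    """Benchmark the preliminary series p to low-frequency sums b by
    minimizing the GRP criterion
        sum_t (y_t/y_{t-1} - p_t/p_{t-1})^2
    subject to the temporal aggregation constraints A y = b."""
    p = np.asarray(p, dtype=float)
    rp = p[1:] / p[:-1]                      # preliminary growth factors
    def grp(y):
        return np.sum((y[1:] / y[:-1] - rp) ** 2)
    A = np.kron(np.eye(len(b)), np.ones(s))  # temporal aggregation
    x0 = p * (np.sum(b) / p.sum())           # start feasible in total
    res = minimize(grp, x0, method="trust-constr",
                   constraints=LinearConstraint(A, b, b))
    return res.x
```

When the benchmarks are a uniform rescaling of the preliminary annual sums, the GRP-optimal solution rescales `p` itself, since growth rates are scale-invariant; this also gives a quick sanity check of the solver output.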

This work has largely benefited from our previous findings on benchmarking and reconciliation. First, we had already solved efficiently the nonlinear GRP problem for benchmarking (i.e. one variable at a time) using the same Newton's optimization methods proposed in this paper. Second, we knew that only by preserving and exploiting the sparsity structure of the matrices involved could we solve a reconciliation problem with many variables and many observations simultaneously, that is, with all the constraints of the system (temporal and contemporaneous) considered at the same time. Finally, we were aware that a satisfactory approximation of the optimal simultaneous solution could be obtained through a more convenient and simplified two-step procedure.

The purpose of this paper was twofold. First and foremost, we wanted to prove that the nonlinear GRP reconciliation problem could be solved accurately and efficiently using Newton's optimization algorithms. Then, we wanted to compare the GRP solution with the Denton PFD solution to verify whether, as often claimed, PFD is a close approximation of GRP. We believe that both objectives have been successfully achieved.

On the implementation aspects, we derived the analytical expressions of the gradient and Hessian of the global GRP function. These expressions are necessary to feed any Newton-type optimization method, and to our knowledge they have never been derived before. Next, we used two different (but related) algorithms to solve the GRP reconciliation problem: an interior-point algorithm applied to the original constrained problem, and a Newton's method with Hessian modification applied to a suitably reduced unconstrained problem. The interior-point algorithm turned out to be very fast, accurate and robust, solving both the problems faced in this paper. Its only drawback is that it is available through commercial software (Matlab), which may not be affordable for many potential "customers" of reconciliation procedures (generally public data-producing agencies). Since the interior-point algorithm is very difficult to replicate, we are currently studying a generalization of the GRP benchmarking procedure described in Di Fonzo and Marini (2012a), which is based on a Newton's method with Hessian modification for the unconstrained problem.
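The Newton's method with Hessian modification mentioned above can be sketched generically: inflate the Hessian by a multiple of the identity until it admits a Cholesky factorization (i.e. is positive definite), then take a backtracked Newton step. This is a textbook sketch under our own defaults, not the authors' implementation for the reduced GRP problem.

```python
import numpy as np

def modified_newton(f, grad, hess, x0, tol=1e-8, max_iter=100):
    """Newton's method with Hessian modification: add tau*I until the
    Hessian is positive definite (checked via Cholesky), then take a
    damped Newton step chosen by Armijo backtracking."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H, tau = hess(x), 0.0
        while True:                          # Hessian modification loop
            try:
                L = np.linalg.cholesky(H + tau * np.eye(len(x)))
                break
            except np.linalg.LinAlgError:
                tau = max(2.0 * tau, 1e-6)
        # Solve (H + tau I) d = -g using the Cholesky factor
        d = np.linalg.solve(L.T, np.linalg.solve(L, -g))
        t = 1.0                              # Armijo backtracking
        while f(x + t * d) > f(x) + 1e-4 * t * (g @ d):
            t *= 0.5
        x = x + t * d
    return x
```

On a strictly convex quadratic the modification is never triggered and the method reduces to plain Newton, converging in a single step; for the nonconvex GRP objective the added `tau*I` guarantees a descent direction far from the solution.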

To assess the GRP and PFD results we compared the two simultaneous solutions along with four two-step procedures. The latter are derived from using alternatively the PFD and GRP benchmarking techniques at the first step, and from using the level or the squared level of the temporal benchmarked series as normalizing factor at the second step. The comparison, based on a metric different from both the GRP and PFD criteria, showed that the simultaneous GRP solution is always the best method at preserving the movements in the preliminary series. We found, in particular, that using the simultaneous GRP guarantees a less volatile adjustment process across the variables. In general, the simultaneous PFD solution was shown to be very close to the GRP; but for a few series in the system, the most volatile ones with large temporal discrepancies to be distributed, PFD resulted in markedly higher adjustments than GRP.

As regards the two-step procedures, the same robustness issue noted above arises whether the GRP or the PFD benchmarking technique is used at the first step. In general, choosing GRP at the first step guarantees a less volatile adjustment than PFD. However, the choice made at the second step counts much more in terms of quality of the adjustment, as already highlighted by Di Fonzo and Marini (2011) for the PFD case. The simultaneous GRP solution is best approximated when GRP is used at the first step and the squared temporally benchmarked series is used as normalizing factor at the second step (i.e. the GRP-ST procedure). The use of the (absolute) level, instead, was found to penalize too much the smaller series in the system in favour of the larger ones.

Practitioners may want to know what the implications of our findings on the GRP are for the routine work they conduct in benchmarking and reconciliation. As we have shown in this paper, and in the companion paper Di Fonzo and Marini (2012a) on benchmarking, the nonlinear GRP problem can be solved accurately and efficiently through optimization algorithms exploiting the full derivative information of the problem. If one aims at preserving the growth rates of the variables in the system under adjustment, our simultaneous GRP solution is undoubtedly the most accurate and reliable approach. Nonetheless, we have also shown that very similar performance can be reached by using a two-step procedure in which the GRP is more conveniently applied at the first step to each individual series of the system.

We note, however, that in the great majority of cases the PFD solution is very close to the GRP one. Large differences in the adjustment arise only when the series are volatile and present large discrepancies with respect to their low-frequency benchmark values. When the series are smooth and the discrepancies are consistently small across the system (two desirable conditions in any reconciliation problem), simultaneous or two-step reconciliation procedures based on the Denton PFD principle can approximate the GRP results very well.