1 The spatio-temporal model

The approach used in Secchi et al. (2015) for analyzing the spatial and temporal variability of Erlang data bears a strong resemblance to the spatio-temporal models proposed by Olives et al. (2014) and Lindström et al. (2014) for studying air pollution data. As with air pollution data, the spatio-temporal distribution of Erlang quantities is characterized by both strong temporal seasonality and strong spatial dependence.

Using the approach introduced by Lindström et al. (2014), the spatio-temporal Erlang data \(E_{{\mathbf {x}}}(t)\) can be modeled in a more general form as

$$\begin{aligned} E_{{\mathbf {x}}}(t)=y({\mathbf {x}},t)=\mu ({\mathbf {x}},t)+\epsilon ({\mathbf {x}}, t) \end{aligned}$$
(1)

where

$$\begin{aligned} \mu ({\mathbf {x}}, t)=\sum _{k=1}^K\beta _k({\mathbf {x}})\psi _k(t). \end{aligned}$$
(2)

Here \(\{\psi _k(t)\}_{k=1}^K\) is a set of (smooth) temporal basis functions with \(\psi _1(t)=1\) that can be estimated by the modified singular value decomposition method (see Fuentes et al. 2006; Szpiro et al. 2010). The terms \(\beta _k({\mathbf {x}})\) are spatially varying coefficients for the temporal functions. They can be estimated using universal kriging (Matheron 1969), where the trend is a linear regression on geographical covariates and the spatial dependence structure is provided by a set of covariance matrices \(\varSigma _{\beta _k}(\theta _k)\), each parameterized by an unknown parameter vector \(\theta _k\). In the particular case of Erlang data, the trend of the regression kriging could contain information about land use (e.g., university, residential, or industrial areas).
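To make the decomposition in (2) concrete, the following is a minimal sketch of how a temporal basis and site-wise coefficients could be obtained. It is a simplification and not the modified singular value decomposition of Fuentes et al. (2006): it assumes a complete \(T\times N\) data matrix, uses a plain SVD without smoothing, and estimates each \(\beta _k({\mathbf {x}})\) by ordinary least squares rather than by universal kriging. The function names and the simulated data are hypothetical.

```python
import numpy as np

def temporal_basis(Y, K):
    """Return a T x K basis: a constant column plus K-1 leading left singular vectors.

    Assumption: Y (T time points x N sites) has no missing values.
    """
    # Standardize each site's series so that no single site dominates the SVD.
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    T = Y.shape[0]
    # psi_1(t) = 1, followed by the leading (unsmoothed) singular vectors.
    return np.column_stack([np.ones(T), U[:, :K - 1]])

def site_coefficients(Y, Psi):
    """Least-squares estimate of beta_k(x) at every site; result has shape (K, N)."""
    beta, *_ = np.linalg.lstsq(Psi, Y, rcond=None)
    return beta

# Hypothetical data: a daily cycle whose amplitude varies from site to site.
rng = np.random.default_rng(0)
T, N = 200, 30
t = np.arange(T)
season = np.sin(2 * np.pi * t / 24)
Y = 5.0 + np.outer(season, rng.uniform(1, 3, N)) + 0.1 * rng.standard_normal((T, N))

Psi = temporal_basis(Y, K=2)
beta = site_coefficients(Y, Psi)
mu = Psi @ beta  # fitted mean surface mu(x, t), shape (T, N)
```

In a full analysis the rows of `beta` would be treated as spatial fields and modeled jointly with the covariates, which this sketch does not attempt.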

The residual space-time component \(\epsilon =\epsilon ({\mathbf {x}}, t)\) is assumed to be independent in time with stationary, parametric spatial covariance \(\varSigma _{\epsilon }^t(\theta _{\epsilon })\), for \(t=1, \ldots , T\). In particular, the residual \(\epsilon ({\mathbf {x}}, t)\) consists of a correlated component \(\epsilon ^*({\mathbf {x}}, t)\) and a nugget-effect \(\epsilon _{nugget}({\mathbf {x}}, t)\) including small-scale variability and measurement errors, that is,

$$\begin{aligned} \epsilon ({\mathbf {x}}, t) =\epsilon ^*({\mathbf {x}}, t)+\epsilon _{nugget}({\mathbf {x}}, t). \end{aligned}$$
(3)

Assuming the independence of the components of (3), the spatial covariance of \(\epsilon ({\mathbf {x}}, t)\) can be written as \(\varSigma _{\epsilon }=\varSigma ^*_{\epsilon }+\varSigma _{\epsilon , \text {nugget}}\), where \(\varSigma _{\epsilon , \text {nugget}}\) is a diagonal matrix.
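As an illustration of this decomposition, the sketch below builds \(\varSigma _{\epsilon }\) for a set of sites. The exponential covariance function and the parameter names (`sill`, `range_`, `nugget`) are assumptions made for the example; the model itself only requires a stationary parametric covariance.

```python
import numpy as np

def exp_cov(coords, sill, range_):
    """Correlated part: exponential covariance sill * exp(-d / range_) (assumed form)."""
    # Pairwise Euclidean distances between all sites.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def residual_cov(coords, sill, range_, nugget):
    """Sigma_eps = Sigma*_eps + Sigma_eps,nugget, with a diagonal nugget matrix."""
    return exp_cov(coords, sill, range_) + nugget * np.eye(len(coords))

# Hypothetical site coordinates on the unit square.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 1, size=(20, 2))
Sigma_eps = residual_cov(coords, sill=2.0, range_=0.3, nugget=0.5)
```

The nugget only changes the diagonal, so it adds variance at each site without altering the covariance between distinct sites.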

Then note that the model proposed in Secchi et al. (2015) could be considered a particular case of (1) where the components \(\beta _k\) (denoted in Secchi et al. (2015) by \(D_1, \ldots , D_K\)) do not take into account the spatial dependence, and the residual component \(\epsilon \) is a random error variable, independent in time and space.

In addition, note that an alternative definition of the mean component \(\mu ({\mathbf {x}},t)\) in (1) has been recently proposed by Olives et al. (2014) as

$$\begin{aligned} \mu ({\mathbf {x}}, t)=\sum _{k=1}^K\{\beta _k({\mathbf {x}})+\gamma _k({\mathbf {x}})\}\psi _k(t), \end{aligned}$$

where \(\beta _k({\mathbf {x}})\) are Gaussian spatial random fields distributed as \(\beta _k({\mathbf {x}})\sim N(0,\varSigma _{\beta _k}(\theta _k))\) as in Lindström et al. (2014), and \(\gamma _k({\mathbf {x}})\) are i.i.d. random effects distributed as \(\gamma _k\sim N(0, \sigma ^2_k\mathbf {I})\). The terms \(\gamma _k({\mathbf {x}})\) can be considered the nugget effect of the \(\beta _k({\mathbf {x}})-\)fields.

1.1 Smooth temporal functions

The objective of the smooth temporal basis functions \(\psi _k(t)\) is to capture the temporal variability in the data using deterministic functions, or functions obtained as smoothed singular vectors (see Fuentes et al. 2006). Nicolis and Nychka (2012) and Matsuo et al. (2011) suggest using non-orthogonal wavelet bases (such as the W-transform) for their ability to fit a variety of standard covariance models. The major drawback of these approaches is that one needs to specify the functional form of the basis \(\psi \). The treelets used in Secchi et al. (2015) provide an interesting tool for the analysis of the temporal behavior of Erlang data, especially because they form a ‘data-driven’ basis.

However, other methods take different approaches to constructing data-driven bases. The Tree-Based Wavelet (Gavish et al. 2010) and the lifting scheme are two examples. While the Tree-Based Wavelet transform is defined via a hierarchical tree (built through an adaptive Haar-like orthonormal basis) that is assumed to capture the geometry and structure of the input data, lifting schemes provide a simple and general construction of second generation wavelets, where the choice of primal and/or dual lifting is fully determined by the values of the data (Jansen and Oonincx 2005; Sweldens 1997).

In particular, the lifting scheme allows one to custom design the filters needed in the transform algorithms to the situation at hand, that is, the filters generate functions whose form depends on each particular case. Finally, the lifting scheme leads to a fast, fully in-place implementation of the wavelet transform (Sweldens 1997). We think that these methods could provide alternative tools for the analysis of temporal Erlang data.
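The split, predict, and update steps that define a lifting transform can be sketched in a few lines. The example below uses the simplest possible choice, the Haar filters, on a series of even length; it is meant only to show the mechanics and is not data-adaptive. In an adaptive scheme the predict and update steps would depend on the data. The function names are hypothetical.

```python
import numpy as np

def haar_lift_forward(x):
    """One level of the Haar wavelet transform written as lifting steps."""
    x = np.asarray(x, dtype=float)
    # Split: separate even- and odd-indexed samples (length of x assumed even).
    even, odd = x[0::2].copy(), x[1::2].copy()
    # Predict: approximate each odd sample by its even neighbour; keep the error.
    detail = odd - even
    # Update: adjust the even samples so that they hold the pairwise means.
    approx = even + detail / 2
    return approx, detail

def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order to recover the original series."""
    even = approx - detail / 2
    odd = detail + even
    x = np.empty(2 * len(approx))
    x[0::2], x[1::2] = even, odd
    return x

approx, detail = haar_lift_forward([2, 4, 6, 10])
restored = haar_lift_inverse(approx, detail)
```

Because every step is inverted by simply reversing its sign, perfect reconstruction holds for any choice of predict and update operators, which is what makes custom-designed filters possible.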

1.2 Spatial dependence and dimension reduction

Parameter estimation of a spatio-temporal model tends to be challenging in practice, and methods for reducing the computational burden are becoming more common for very large data sets. Some recent methods are based on ‘low-rank’ (or ‘reduced rank’) approaches, whose aim is to project the spatial process onto a subspace of lower dimension in order to increase computational efficiency (see Banerjee et al. 2008; Nicolis and Nychka 2012; Olives et al. 2014). The idea of the low-rank approach proposed by Olives et al. (2014) is to replace the covariance \(\varSigma =\{||C({\mathbf {x}}_i-{\mathbf {x}}_j)||\}_{i,j\in {\mathcal {S}}}\), where \({\mathcal {S}}\) is the observed space of spatial locations \({\mathbf {x}}\), with a low-rank covariance \(Z{\tilde{\varSigma }}^{-1 }Z^{\top }\), where \(Z=\{||C({\mathbf {x}}_i-\kappa _j)||\}_{i\in {\mathcal {S}}, j\in {\mathcal {K}}}\), \({\tilde{\varSigma }}=\{||C(\kappa _i-\kappa _j)||\}_{i,j\in {\mathcal {K}}}\), and \({\mathcal {K}}\) is a set of spatial locations \(\kappa \) of cardinality \(n\ll N\) (\(N\) is the number of observations in the space \({\mathcal {S}}\)). Then, the \(\beta _k-\)fields can be approximated by a vector of dimension \(n\times 1\). A similar approach has been used by Nicolis and Nychka (2012), who use a multiresolution approach based on non-orthogonal wavelet functions to reduce the dimension of the original space. Conditional simulation is then used to estimate the process over the whole original space. Banerjee et al. (2008) use ‘knots’ and predictive processes to reduce the dimension.
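The low-rank replacement of the covariance can be illustrated with the following sketch, in which \(Z\) is the \(N\times n\) cross-covariance between sites and knots and the approximation is formed as \(Z{\tilde{\varSigma }}^{-1}Z^{\top }\). The exponential covariance function, its parameters, and the random placement of the knots are assumptions made for the example and are not taken from Olives et al. (2014).

```python
import numpy as np

def cov_fn(a, b, sill=1.0, range_=1.0):
    """Exponential covariance between two sets of locations (assumed form)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def low_rank_cov(sites, knots):
    """Low-rank approximation Z Sigma_tilde^{-1} Z^T of the N x N site covariance."""
    Z = cov_fn(sites, knots)            # N x n cross-covariance, sites vs knots
    Sigma_tilde = cov_fn(knots, knots)  # n x n covariance among the knots
    # Solve a linear system instead of forming the explicit inverse.
    return Z @ np.linalg.solve(Sigma_tilde, Z.T)

# Hypothetical layout: N = 50 observed sites and n = 8 knots on the unit square.
rng = np.random.default_rng(2)
sites = rng.uniform(0, 1, size=(50, 2))
knots = rng.uniform(0, 1, size=(8, 2))
Sigma_lr = low_rank_cov(sites, knots)
```

Only the \(n\times n\) system has to be solved, so the cost of the linear algebra is governed by the number of knots rather than by the number of sites. When the knots coincide with the sites the approximation reproduces the full covariance.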

In the approach proposed by Secchi et al. (2015) the spatial dependence is estimated empirically over several subsets of data using the Voronoi tessellation, and ‘low rank’ matrices are produced. Summary statistics on the simulation results (obtained with the bootstrap technique) then provide the estimate of the process on the complete space. We think that the low-rank approaches mentioned above can be considered an important contribution to the estimation of spatial dependence of non-stationary processes. For all these approaches the optimal choice of the data subset, of size \(n\), remains an open problem.

2 Pre-processing of data: denoising with missing data

If some data are missing, several denoising methods cannot be applied directly. When the data are spatially and temporally correlated, many methods have been proposed for infilling missing data and smoothing irregular curves (Glasbey 1995; Haworth and Cheng 2012; Olives et al. 2014; Onorati et al. 2013; Smith et al. 1996, 2003). For example, Smith et al. (1996) use spatial patterns from empirical orthogonal functions (EOF) to reconstruct the data in a given temporal period, and Olives et al. (2014) and Onorati et al. (2013) apply cubic smoothing splines to some of the left singular vectors of the singular value decomposition. A comparison of methods for smoothing and gap filling in time series is given by Kandasamy et al. (2013).

Similarly, Erlang data show strong spatial dependence in the principal surfaces that can be used for infilling temporal data. The Fourier-based technique used by Secchi et al. (2015) for denoising and infilling missing data has the following advantages: (i) it transforms discrete data into functional data that can be used in the space-time model; (ii) it acts as a denoising technique; and (iii) it resolves the problem of missing data.
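A generic version of Fourier-based smoothing with gaps can be sketched as a least-squares fit of a truncated Fourier basis to the observed time points only, followed by evaluation of the fitted curve at all time points. This is an illustration of the general idea and not the exact procedure of Secchi et al. (2015); the period, the number of harmonics, the encoding of missing values as NaN, and the function names are assumptions.

```python
import numpy as np

def fourier_design(t, period, n_harmonics):
    """Design matrix with a constant plus cosine/sine pairs up to n_harmonics."""
    cols = [np.ones_like(t, dtype=float)]
    for h in range(1, n_harmonics + 1):
        w = 2 * np.pi * h * t / period
        cols += [np.cos(w), np.sin(w)]
    return np.column_stack(cols)

def fourier_smooth(t, y, period, n_harmonics):
    """Fit the basis on observed points (non-NaN) and evaluate it everywhere."""
    obs = ~np.isnan(y)
    B = fourier_design(t, period, n_harmonics)
    coef, *_ = np.linalg.lstsq(B[obs], y[obs], rcond=None)
    return B @ coef, coef

# Hypothetical series: four days of hourly data with a ten-hour gap.
t = np.arange(96.0)
truth = 3.0 + 2.0 * np.sin(2 * np.pi * t / 24)
y = truth.copy()
y[10:20] = np.nan
fitted, coef = fourier_smooth(t, y, period=24.0, n_harmonics=3)
```

The fitted curve is defined at every time point, including the gap, so denoising and infilling are obtained in a single step. Each site is treated separately here, so spatial dependence plays no role in the infilling.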

However, the main drawbacks of the proposed pre-processing methodology are that: (i) it requires a basis of very high dimension in order to capture all relevant localized features, with a consequent increase in the computational burden, and (ii) it does not consider the spatial dependence among sites.

We think that including denoising and imputation of missing data in the estimation procedure of the space-time model could improve the computational efficiency of the algorithm and the precision of the estimates.