Abstract
It has long been known that the Errors-In-Variables (EIV) Model is a special case of the nonlinear Gauss–Helmert Model (GHM) and can, therefore, be adjusted by standard least-squares techniques in iteratively linearized GH-Models, which is the approach by Helmert (Adjustment Computations Based on the Least-Squares Principle (in German), 1907) and – later – by Deming (Phil Mag 11:146–158, 1931; Phil Mag 17:804–829, 1934).
Apart from the fact that there are, at least, two other nonlinear models that are equivalent to the above GH-Model, thus allowing two more classical least-squares approaches based on iterative linearization, it was the seminal paper by Golub and van Loan (SIAM J Numer Anal 17:883–893, 1980) in which they proved that a purely nonlinear approach can be followed as well, thereby avoiding any model linearization. They called such an approach “Total Least-Squares adjustment” by which any normal equations may be replaced by a simple eigenvalue problem, as long as only diagonal dispersion matrices are involved.
Here, an attempt will be made to show the differences and parallels in various algorithms, even in the fully weighted case, which obviously all generate the same results, but without necessarily showing equal efficiency in doing so, as is well known since the publications by Schaffrin and Wieser (J Geodesy 82:415–421, 2008), Fang (Weighted Total Least-Squares solutions with applications in geodesy, 2011), and Mahboub (J Geodesy 86:359–367, 2012).
1 Introduction
The Errors-In-Variables (EIV) Model has recently seen a lot of attention since, in accordance with Golub and van Loan (1980), it can be treated in its nonlinear form by a least-squares approach that they coined “Total Least-Squares adjustment”. It eventually leads to a (generalized) eigenvalue problem that needs to be solved in lieu of the sequence of normal equations that would result from a traditional “Least-Squares adjustment” within iteratively linearized models. The latter approach dates, at least, back to Helmert (1907), but has as well been used by Deming (1931, 1934) for the approximation of curves and, more recently, by Neitzel (2010) to determine the parameters of a similarity transformation.
In contrast, the nonlinear Total Least-Squares (TLS) approach which, in its original formulation, could tolerate only “element-wise weighting” and thus only diagonal weight matrices, has since been generalized in several steps by Schaffrin and Wieser (2008), Fang (2011), and Mahboub (2012) to now accept any positive-definite weight matrices. This development will be presented in the following Sect. 2, thereby showing how the more specialized algorithms can be derived from the more general ones by simplification.
Moreover, it should be noted that progress has also been made towards the use of positive-semidefinite dispersion matrices in TLS adjustment, which may be handled as described by Schaffrin et al. (2014). These cases are quite relevant whenever the random error matrix needs to show a certain pattern or structure after the adjustment. Due to the limited space, these advanced methods will not be discussed below.
Instead, attention will be paid to a triplet of classical nonlinear models that all can be constructed to be equivalent to the EIV-Model and, furthermore, may undergo a sequence of Least-Squares adjustments via iterative linearization which, in the end, converge to the very same TLS solution. This will be the theme in Sect. 3 although many details have to be left out; for those, see Schaffrin (2015).
2 Nonlinear TLS Adjustment in an EIV-Model
2.1 Fang’s Algorithm
Let the EIV-Model be defined by
\( y=\left(A-{E}_A\right)\xi +{e}_y, \) (1a)
\( e:=\left[\begin{array}{c}{e}_y\\ {}{e}_A\end{array}\right]\sim \left(0,{\sigma}_0^2Q\right), \) (1b)
where
- \( y \) is the \( n\times 1 \) observation vector;
- \( A \) is the \( n\times m \) (random) coefficient matrix with full column rank (aka “data matrix”);
- \( {E}_A \) is the \( n\times m \) (unknown) random error matrix associated with \( A \);
- \( \xi \) is the \( m\times 1 \) (unknown) parameter vector;
- \( {e}_y \) is the \( n\times 1 \) (unknown) random error vector associated with \( y \);
- \( {e}_A \) is the \( nm\times 1 \) vectorial form of the matrix \( {E}_A \);
- \( Q \) is the \( n\left(m+1\right)\times n\left(m+1\right) \) block-diagonal positive-definite cofactor matrix;
- \( P:={Q}^{-1} \) is the corresponding block-diagonal positive-definite weight matrix;
- \( {\sigma}_0^2 \) is the (unknown) variance component (unit-free);
- \( \mathrm{Cov}\left\{{e}_y,\mathrm{vec}\,{E}_A\right\}=0 \) for the sake of simplicity.
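For concreteness, the quantities above can be generated in a small simulation sketch (Python with NumPy; all names, the dimensions, and the identity cofactor choice are purely illustrative and not part of the model itself):

```python
import numpy as np

# Hypothetical simulation of the model above: the observed y and A both
# carry random errors generated with cofactor matrices Qy and QA.
rng = np.random.default_rng(42)
n, m = 10, 2
xi_true = np.array([1.0, -2.0])
A_true = rng.normal(size=(n, m))      # error-free part of the data matrix
Qy, QA = np.eye(n), np.eye(n * m)     # diagonal blocks of the cofactor matrix Q
sigma0 = 0.01                         # square root of the variance component
e_y = sigma0 * rng.multivariate_normal(np.zeros(n), Qy)
e_A = sigma0 * rng.multivariate_normal(np.zeros(n * m), QA)
E_A = e_A.reshape((n, m), order='F')  # un-vec by column-wise stacking
A = A_true + E_A                      # observed (random) coefficient matrix
y = (A - E_A) @ xi_true + e_y         # observation vector
```

Note that the column-wise stacking convention for \( \mathrm{vec} \) corresponds to `order='F'` in NumPy.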
The model generalizes the one used by Schaffrin and Wieser (2008), where a Kronecker product structure for \( {Q}_A \) was assumed, as well as the one used by Golub and van Loan (1980), who allowed only diagonal cofactor matrices.
The objectives of a nonlinear Total Least-Squares (TLS) adjustment are now based on the principle
\( {e}_y^T{P}_y{e}_y+{e}_A^T{P}_A{e}_A=\min, \) subject to (1a), (4)
where \( {P}_y \) and \( {P}_A \) denote the two diagonal blocks of \( P \),
which can be given the equivalent form of a Lagrange target function, namely:
\( \varPhi \left({e}_y,{e}_A,\xi, \lambda \right):={e}_y^T{P}_y{e}_y+{e}_A^T{P}_A{e}_A+2{\lambda}^T\left(y-A\xi +{E}_A\xi -{e}_y\right)=\mathrm{stationary}. \) (5)
Consequently, the Euler-Lagrange necessary conditions result in the following system of nonlinear “normal equations”:
\( \tfrac{1}{2}\,\partial \varPhi /\partial {e}_y={P}_y{\tilde{e}}_y-\widehat{\lambda}\doteq 0, \) (6a)
\( \tfrac{1}{2}\,\partial \varPhi /\partial {e}_A={P}_A{\tilde{e}}_A+\left(\widehat{\xi}\otimes {I}_n\right)\widehat{\lambda}\doteq 0, \) (6b)
\( \tfrac{1}{2}\,\partial \varPhi /\partial \xi =-{\left(A-{\tilde{E}}_A\right)}^T\widehat{\lambda}\doteq 0, \) (6c)
\( \tfrac{1}{2}\,\partial \varPhi /\partial \lambda =y-A\widehat{\xi}+{\tilde{E}}_A\widehat{\xi}-{\tilde{e}}_y\doteq 0, \) (6d)
which still needs to be reduced by partial elimination since the sufficient condition is fulfilled as
Now, (6a, b) are transformed to provide the residual vectors through
\( {\tilde{e}}_y={Q}_y\widehat{\lambda},\kern2em {\tilde{e}}_A=-{Q}_A\left(\widehat{\xi}\otimes {I}_n\right)\widehat{\lambda}, \) (7a, b)
so that (6d) can be rewritten as
\( y-A\widehat{\xi}=\left[{Q}_y+\left({\widehat{\xi}}^T\otimes {I}_n\right){Q}_A\left(\widehat{\xi}\otimes {I}_n\right)\right]\widehat{\lambda}=:{Q}_1\widehat{\lambda}, \) (8)
with \( {Q}_1={Q}_1\left(\widehat{\xi}\right) \) being nonsingular, thus leading to
\( \widehat{\lambda}={Q}_1^{-1}\left(y-A\widehat{\xi}\right), \) (9)
and, together with (6c), to the system
\( {\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}\left(y-A\widehat{\xi}\right)\doteq 0. \) (10)
Obviously, the estimated parameter vector is now obtained as in Fang (2011, p. 27) via
\( \widehat{\xi}={\left[{\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}\left(A-{\tilde{E}}_A\right)\right]}^{-1}{\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}\left(y-{\tilde{E}}_A\widehat{\xi}\right) \) (11)
and allows updates for \( {Q}_1 \), \( \widehat{\lambda} \), and \( {\tilde{e}}_A \), from which a new estimate \( \widehat{\xi} \) results.
The Total Sum of weighted Squared Residuals (TSSR) may now readily be computed from
\( \mathrm{TSSR}={\tilde{e}}_y^T{P}_y{\tilde{e}}_y+{\tilde{e}}_A^T{P}_A{\tilde{e}}_A={\widehat{\lambda}}^T\left(y-A\widehat{\xi}\right), \) (12)
so that a suitable variance component estimate may be obtained through
\( {\widehat{\sigma}}_0^2=\mathrm{TSSR}/\left(n-m\right), \) (13)
as the redundancy in model (1a, b) is still \( n-m \).
Alternatively, system (10) can be given the asymmetric form
\( {\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}A\widehat{\xi}={\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}y, \) (14)
which would then provide the estimated parameter vector through
\( \widehat{\xi}={\left[{\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}A\right]}^{-1}{\left(A-{\tilde{E}}_A\right)}^T{Q}_1^{-1}y \) (15)
and should lead to a similar iteration as before. Note that (15) also appears as formula (21) in Xu et al. (2012), but essentially represents a variant of Fang’s algorithm; also, cf. Fang (2013) where further alternatives are presented.
2.2 Mahboub’s Algorithm
On the other hand, combining (9) with (6c) leads to the following sequence of identities:
where \( K \) denotes an \( nm\times nm \) “commutation matrix” that is also known as “vec-permutation matrix”; for more details, see Magnus and Neudecker (2007).
Obviously, (16) translates into the estimated parameter vector
with \( {R}_1={R}_1\left(\widehat{\xi},\widehat{\lambda}\right) \) and, from (16), with
without necessarily implying that \( {R}_1=-{\tilde{E}}_A^T \). Therefore, the sequence of solutions to (15) may differ from the sequence of solutions to (17a) when iteratively updating \( {Q}_1 \), \( \widehat{\lambda} \), and \( {R}_1 \), before a new parameter vector estimate \( \widehat{\xi} \) can be found; yet the ultimate convergence points will be the same.
Again, the TSSR can be computed from (12) which will lead to the variance component estimate in (13).
2.3 A New Variant of Mahboub’s Algorithm
After giving (16) the form
the estimated parameter vector may as well be obtained from
thus allowing updates for \( {Q}_1 \) and \( \widehat{\lambda} \). This algorithm will be further explored in the near future.
2.4 The Schaffrin–Wieser Algorithm
This algorithm was designed for the somewhat more special case where the cofactor matrix \( {Q}_A \) can be split into a Kronecker product, thereby indicating that all columns have cofactor matrices proportional to each other. This implies
and thus
with
(20) and (21) together generate the identity
suggesting the iteration
with
while (12) and (13) generate first the TSSR and then a suitable variance component estimate.
2.5 The Golub-van-Loan Algorithm
Now, the condition (19) is further specialized to
and
so that (22a) becomes
with
and this, from (24a, b), becomes
(24a) and (24c) allow the problem to be rephrased as a generalized eigenvalue problem, specifically as:
with the variance component estimate
The original situation, treated by Golub and van Loan (1980), was characterized by the further specializations \( {Q}_y={I}_n \) and \( {Q}_A={I}_{nm} \), which, in turn, lead to the standard eigenvalue problem
\( \left[\begin{array}{cc}{A}^TA& {A}^Ty\\ {}{y}^TA& {y}^Ty\end{array}\right]\left[\begin{array}{c}\widehat{\xi}\\ {}-1\end{array}\right]={\nu}_{\min }\left[\begin{array}{c}\widehat{\xi}\\ {}-1\end{array}\right], \)
whose solution, belonging to the smallest eigenvalue \( {\nu}_{\min } \), provides the Total Least-Squares Solution (TLSS).
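A minimal numerical sketch of this special case (hypothetical helper name; it assumes the classical normalization of the eigenvector with last component \( -1 \)):

```python
import numpy as np

def tls_eig(y, A):
    """Hypothetical sketch of the Golub-van Loan case (P = I): the TLS
    solution follows from the smallest eigenpair of [A y]^T [A y]."""
    n, m = A.shape
    C = np.column_stack([A, y])
    w, V = np.linalg.eigh(C.T @ C)    # eigenvalues in ascending order
    v = V[:, 0]                       # eigenvector of the smallest eigenvalue
    xi = -v[:m] / v[m]                # normalize the last component to -1
    return xi, w[0] / (n - m)         # TSSR equals the smallest eigenvalue
```

The returned estimate satisfies the equivalent characterization \( \left({A}^TA-{\nu}_{\min }{I}_m\right)\widehat{\xi}={A}^Ty \), which can serve as a quick numerical check.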
In the next section, a few equivalent models will be presented for which, traditionally, an identical weighted LEast-Squares Solution (LESS) would have been found after iterative linearization.
3 Traditional Models, Equivalent to the EIV-Model
3.1 The Nonlinear Gauss–Helmert Model
Here, the new vectors
are introduced. Then,
with the nonlinear vector-valued vector function
due to the term \( {E}_A\cdot \xi \), forms an equivalent Gauss–Helmert Model that would traditionally be linearized for an iterative Least-Squares adjustment.
The truncated Taylor series, following Pope (1972), then reads:
with suitable approximations ξ o and \( {\mu}_o:=Y-\underset{\sim }{0} \) where \( \underset{\sim }{0} \) here denotes a “stochastic zero vector” of size \( n\left(m+1\right)\times 1 \). This leads first to
then to
and eventually to the linearized Gauss–Helmert Model:
Note that the weighted LEast-Squares Solution (LESS) is now being formed through the normal equations
with
and the residual vectors through
Looking at the next and all the following iteration steps, it becomes clear that this represents one specific iterative solver of Fang’s TLS normal equations (11).
For more details, see Fang (2011, ch. 4.4), Snow (2012, ch. 4), and the forthcoming OSU-Report by Schaffrin (2015), as well as Neitzel (2010) for a specific application.
3.2 The Nonlinear Gauss–Markov Model
In this case, the expectation of the data matrix A is introduced as a new \( n\times m \) “parameter matrix” \( {\varXi}_A:=E\left\{A\right\} \), leading to the equivalent Gauss–Markov Model
with the nonlinear vector-valued vector function
due to the term \( {\varXi}_A\cdot \xi \). The linearization of model (37a, b) with respect to the approximations ξ o and \( {\xi}_A^{(o)}:={\it vec}\left({A}^{(o)}\right)={\it vec}\left(A-\underset{\sim }{0}\right) \), where \( \underset{\sim }{0} \) now denotes a “stochastic zero matrix” of size \( n\times m \), then leads first to
and finally to the linearized Gauss–Markov Model
After a number of further manipulations, the weighted LESS for model (39a, b) can be shown to fulfill the “normal equations”
with
which nicely corresponds to (35a, b). More details can be found in the forthcoming OSU-Report by Schaffrin (2015).
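A Gauss-Newton sketch of this iteratively linearized adjustment may look as follows (hypothetical helper names; the stacked-observation set-up with parameters \( \xi \) and \( {\xi}_A=\mathrm{vec}\,{\varXi}_A \) is an assumption based on the model description above, not the report's own implementation):

```python
import numpy as np

def gmm_eiv(y, A, Qy, QA, max_iter=100, tol=1e-12):
    """Hypothetical sketch: Gauss-Newton iteration in the equivalent
    nonlinear Gauss-Markov model, with the enlarged parameter set
    (xi, xiA = vec of the 'parameter matrix'), stacked observations
    L = [y; vec(A)], and block-diagonal weight matrix P."""
    n, m = A.shape
    Py, PA = np.linalg.inv(Qy), np.linalg.inv(QA)
    P = np.block([[Py, np.zeros((n, n * m))],
                  [np.zeros((n * m, n)), PA]])
    xi = np.linalg.solve(A.T @ Py @ A, A.T @ Py @ y)   # initial LESS
    xiA = A.reshape(-1, order='F')                     # initial Xi_A := A
    for _ in range(max_iter):
        XiA = xiA.reshape((n, m), order='F')
        ZT = np.kron(xi.reshape(1, -1), np.eye(n))     # (xi^T ⊗ I_n)
        # Jacobian of the stacked (nonlinear) observation equations
        J = np.block([[XiA, ZT],
                      [np.zeros((n * m, m)), np.eye(n * m)]])
        r = np.concatenate([y - XiA @ xi,
                            A.reshape(-1, order='F') - xiA])
        dtheta = np.linalg.solve(J.T @ P @ J, J.T @ P @ r)
        xi = xi + dtheta[:m]
        xiA = xiA + dtheta[m:]
        if np.linalg.norm(dtheta) <= tol * (1 + np.linalg.norm(xi)):
            break
    return xi, xiA.reshape((n, m), order='F')
```

Again, in the unweighted case the iterates converge to the classical TLS solution, in line with the equivalence asserted for all three models.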
3.3 The Model of Direct Observations with Nonlinear Constraints
Now, the expectation of the observation vector y is introduced as just another parameter vector \( {\xi}_y \) of size \( n\times 1 \), so that the new model combines the direct observation equations
with the nonlinear constraints
which might be linearized into
In the already mentioned OSU-Report by Schaffrin (2015), it will be shown how the resulting iterative LESS’s do converge to the Total Least-Squares Solution.
For another take on this model, refer to Donevska et al. (2011) who stress the equivalence to orthogonal regression as applied by Deming (1931, 1934).
4 Conclusions
It has been clarified that the TLS approach towards the EIV-Model requires a nonlinear treatment of the nonlinear model. A number of different algorithms have been presented to generate the Total Least-Squares Solution from a certain set of nonlinear normal equations. A triplet of conventional nonlinear models has also been considered, suggesting that the LEast-Squares Solutions from iterative linearization do converge to the nonlinear TLS-Solution in all three cases. Most of the details, however, will be published in a forthcoming OSU-Report, due to the space restrictions for these Proceedings.
References
Deming WE (1931) The application of least squares. Phil Mag 11:146–158
Deming WE (1934) On the application of least squares II. Phil Mag 17:804–829
Donevska S, Fišerová E, Hron K (2011) On the equivalence between orthogonal regression and linear model with type II constraints. Acta Univ Palacki Olomuc, Fac rer nat, Mathematica 50(2):19–27
Fang X (2011) Weighted Total Least-Squares solutions with applications in geodesy. Publ. No. 294, Department of Geodesy and Geoinformatics, Leibniz University, Hannover
Fang X (2013) Weighted Total Least-Squares: necessary and sufficient conditions, fixed and random parameters. J Geodesy 87:733–749
Golub GH, van Loan CF (1980) An analysis of the Total Least-Squares problem. SIAM J Numer Anal 17:883–893
Helmert FR (1907) Adjustment computations based on the Least-Squares Principle (in German), 2nd edn. Teubner, Leipzig
Magnus JR, Neudecker H (2007) Matrix differential calculus with applications in statistics and economics, 3rd edn. Wiley, Chichester
Mahboub V (2012) On weighted Total Least-Squares for geodetic transformations. J Geodesy 86:359–367
Neitzel F (2010) Generalization of Total Least-Squares on example of unweighted and weighted 2D similarity transformation. J Geodesy 84:751–762
Pope A (1972) Some pitfalls to be avoided in the iterative adjustment of nonlinear problems. In: Proceedings of the 38th Ann. ASPRS Meeting, Amer. Soc. of Photogrammetry: Falls Church, pp 449–477
Schaffrin B, Wieser A (2008) On weighted Total Least-Squares adjustment for linear regression. J Geodesy 82:415–421
Schaffrin B (2015) The Errors-In-Variables (EIV) model. Nonlinear Total Least-Squares (TLS) adjustment or iteratively linearized Least-Squares (LS) adjustment? OSU-Report, Division of Geodetic Science, School of Earth Sciences, The Ohio State University, Columbus
Schaffrin B, Snow K, Neitzel F (2014) On the Errors-In-Variables model with singular covariance matrices. J Geodetic Sci 4:28–36
Snow K (2012) Topics in Total Least-Squares adjustment within the Errors-In-Variables model: singular covariance matrices and prior information, Report No. 502, Division of Geodetic Science, School of Earth Sciences, The Ohio State University, Columbus
Xu P, Liu JN, Shi C (2012) Total Least-Squares adjustment in partial Errors-In-Variables models: algorithms and statistical analysis. J Geodesy 86:661–675
Acknowledgment
The author is very much indebted to three anonymous reviewers for their thorough reading of the text; their recommendations have improved the readability quite substantially. Also appreciated are many discussions with his long-time collaborator Kyle Snow.
© 2015 Springer International Publishing Switzerland
Schaffrin, B. (2015). Adjusting the Errors-In-Variables Model: Linearized Least-Squares vs. Nonlinear Total Least-Squares. In: Sneeuw, N., Novák, P., Crespi, M., Sansò, F. (eds) VIII Hotine-Marussi Symposium on Mathematical Geodesy. International Association of Geodesy Symposia, vol 142. Springer, Cham. https://doi.org/10.1007/1345_2015_61