1 Introduction

The concept of the typical/classic errors-in-variables (EIV) model has already been introduced by several authors in recent years. A typical EIV model is similar to a Gauss–Markov (GM) model, except that all the variables are subject to random errors. For further reading see, e.g., Van Huffel and Vandewalle (1991), Schaffrin and Wieser (2008), Felus (2004), Schaffrin et al. (2012a, b), Mahboub et al. (2012), Fang (2013, 2014), Snow and Schaffrin (2012), Snow (2012) and Mahboub (2014). Meanwhile, some other researchers have investigated this problem traditionally; see, e.g., Neitzel (2010) and Shen et al. (2011). The term “total least-squares (TLS)” was coined in the field of numerical analysis by Golub and Van Loan (1980) as one of the standard solutions of this model. Its applications have also been investigated in several contributions, particularly in the geodetic literature. Linear regression (Schaffrin and Wieser 2008; Fang 2011), geodetic resection (Schaffrin and Felus 2008), transformation (Mahboub 2012) and rapid satellite positioning (Mahboub and Sharifi 2013) are some examples. Nevertheless, there are other problems, such as curve fitting, in which all the variables are subject to errors but the model is not similar to a GM model. In other words, the design matrix in these kinds of models is a non-linear function of random variables. Some examples are as follows (both the y and x coordinates are subject to random errors, and underlining indicates random variables):

$$\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = a\sin \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} + b\cos \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} + c = [\begin{array}{ccc} {\sin \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} } & {\cos \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} } & 1 \\ \end{array} ]\left[ {\begin{array}{c} a \\ b \\ c \\ \end{array} } \right] = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )\left[ {\begin{array}{c} a \\ b \\ c \\ \end{array} } \right], $$
$$\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = a_{0} + a_{1} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} + a_{2} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t}^{2} + a_{3} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t}^{3} = \left[ {\begin{array}{cccc} 1 & {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} } & {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t}^{2} } & {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t}^{3} } \\ \end{array} } \right]\left[ {\begin{array}{c} {a_{0} } \\ {a_{1} } \\ {a_{2} } \\ {a_{3} } \\ \end{array} } \right] = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )\left[ {\begin{array}{c} {a_{0} } \\ {a_{1} } \\ {a_{2} } \\ {a_{3} } \\ \end{array} } \right].$$

In fact, in the non-typical EIV model, the first-order moment/mean of the non-linear random design matrix A(t) is not known; we only know the mean of the random variables t. In other words, in the non-typical EIV model, the functional dependence of A on t is considered through \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{A} = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} ). \) Mathematically, one encounters the following functional models for the typical and non-typical EIV models:

$$ \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )\xi \quad {\text{with}}\quad \left\{ {\begin{array}{l} {E(A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )) = A(E(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} ))\;\quad{\text{for}}\;{\text{typical}}\;{\text{EIV}}\;{\text{model}}\;({\text{I}})}, \\ {E(A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )) \ne A(E(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} ))\; \quad{\text{for}}\;{\text{non-typical}}\;{\text{EIV}}\;{\text{model}}\; ( {\text{II}})}, \\ \end{array} } \right. $$

where ξ is the vector of unknown parameters and E(·) denotes the expectation operator. In geodesy, the non-typical EIV model appears in several problems. Curve fitting is clearly one of them, due to the above formulas, and it will be discussed in Sect. 4 (numerical results and discussions). Another example is surface reconstruction. “One well established technique to construct a surface that best fits to an observed scattered point cloud is based on the Kriging methodology that uses semi-variograms. As this semi-variogram regularly turns out to have a steep slope near the origin—where it matters most—, a better idea seems to be seeking a best fit on the basis of the Total Least-Squares (TLS) principle” (Schaffrin and Uzun 2008). Although they correctly mentioned that considering errors for both ordinate and abscissa provides an estimated semi-variogram that is “nearest” to the empirical values in the geometric sense, namely measured along perpendicular projections onto the graph of the semi-variogram, they should have treated the empirical semi-variogram as a non-typical EIV model, since the observed quantities in its linearized versions have non-linear relationships. The examples of this paper numerically indicate the significance of this problem. In other words, the standard TLS solution of model (II) is not admissible when it is incorrectly treated as model (I).
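The inequality in model (II) is easy to verify numerically. The following sketch (with illustrative values of our own choosing, not data from this paper) compares one row of the cubic design matrix of the second introductory example evaluated at the mean, A(E(t)), against the Monte-Carlo mean of the matrix evaluated at noisy inputs, E(A(t)):

```python
import numpy as np

rng = np.random.default_rng(42)

t_mean = 1.3    # E(t): hypothetical noise-free value of the observed variable
sigma = 0.5     # assumed standard deviation of its observational noise
t_noisy = t_mean + sigma * rng.standard_normal(1_000_000)

# one row of the cubic design matrix of the second example: [1, t, t^2, t^3]
A_of_mean = np.array([1.0, t_mean, t_mean**2, t_mean**3])        # A(E(t))
mean_of_A = np.array([np.ones_like(t_noisy), t_noisy,
                      t_noisy**2, t_noisy**3]).mean(axis=1)      # E(A(t))

# the quadratic entry differs by sigma^2 = 0.25 in expectation,
# since E((t + e)^2) = t^2 + sigma^2 for zero-mean noise e
print(mean_of_A - A_of_mean)
```

Only the linear entries agree; the non-linear entries of E(A(t)) are shifted away from A(E(t)), which is exactly what distinguishes model (II) from model (I).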

Although the method of least-squares is one of the oldest methods of estimation, it is still “the automobile of modern statistical analysis” (Stigler 1999). Therefore, in this paper we first review an algorithm for solving the non-typical EIV model, which is a novel model, using the traditional least-squares method with linearization of a non-linear mixed model and iterative improvement of the solution. We employ the traditional Lagrange approach to optimize the target function of this problem. Then, by comparison, a modified weighted TLS (WTLS) algorithm is proposed in order to treat the non-typical EIV model (II) as a typical EIV model (I), accounting for the dependence of A on t through \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{A} = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} ). \)

The TLS algorithms modify the elements of the design matrix in the typical and non-typical EIV models; see, e.g., matrix A(t) in models (I) and (II). This modification is appropriate for the typical EIV models, since there the design matrix is a linear function of the random variables; however, it is not necessarily correct for the non-typical EIV models [model (II)], which have a non-linear design matrix. Therefore, the proposed approach based on the modified WTLS algorithm is not necessarily equivalent to the former one; but, as the numerical examples show, it yields an excellent approximate solution. This finding is important because the TLS algorithms need no linearization, and their rate of convergence is usually better than that of the traditional approach.

This paper is organized as follows. In the next section, the concept of the non-typical EIV model is introduced, and an algorithm for solving this problem is reviewed using the traditional non-linear least-squares method within a non-linear mixed model. We then propose the modified WTLS algorithm to deal with this problem. In a later section, two simulation studies give insight into the efficiency of the proposed algorithms. Finally, we conclude the paper.

2 Non-typical EIV model and a review of its classic solution based on the traditional least-squares method with linearization of a non-linear mixed model: algorithm 1

The typical/classic EIV model has been introduced in several contributions. In this model the first-order moment/mean of the design matrix is linear; however, this is not true for the non-typical EIV model, in which the design matrix is a non-linear function of random variables.

The mathematical definitions of these two types of models clarify our discussion. The typical EIV model is given as follows:

$$ \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = \left( {A - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{E}_{A} } \right)\xi + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} ,\quad rank\;A = m < n, $$
(1)
$$ \left[ {\begin{array}{c} {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} } \\ {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{A} } \\ \end{array} } \right]\!\text{:= }\left[ {\begin{array}{c} {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} } \\ {{\text{vec(}}\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{E}_{A} )} \\ \end{array} } \right]\sim \left( {\left[ {\begin{array}{c} 0 \\ 0 \\ \end{array} } \right]|\sigma_{0}^{2} \left[ {\begin{array}{cc} {Q_{y} } & 0 \\ 0 & {Q_{A} } \\ \end{array} } \right]} \right),\quad {\text{P}}_{\text{y}} = Q_{y}^{ - 1} ,\quad {\text{P}}_{\text{A}} = Q_{\text{A}}^{ - 1} . $$
(2)

Here \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} \) is the n × 1 observation vector, \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} \) is the respective n × 1 vector of observational noise, A is the n × m coefficient matrix of input variables (observed), \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{E}_{A} \) is the corresponding n × m matrix of random noise, ξ is the m × 1 vector of unknown parameters, \( D(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} ) = \sigma_{0}^{2} Q_{y} \) and \( D(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{A} ) = \sigma_{0}^{2} Q_{A} \) are the corresponding dispersion matrices of size n × n and mn × mn (partly known), and \( \sigma_{0}^{2} \) is the (unknown) variance component.

We define the non-typical EIV model as follows:

$$ \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = A\left( {t - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{t} } \right)\xi + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} ,\quad rank\;A\left( {t - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{t} } \right) = m < n, $$
(3)

or

$$ E(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} ) = E(A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} ))\xi \quad {\text{with}}\quad E(A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )) \ne A(E(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} )), $$
(4)
$$ \left[ {\begin{array}{c} {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} } \\ {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{t} } \\ \end{array} } \right]\sim \left( {\left[ {\begin{array}{c} 0 \\ 0 \\ \end{array} } \right]|\sigma_{0}^{2} \left[ {\begin{array}{cc} {Q_{y} } & 0 \\ 0 & {Q_{t} } \\ \end{array} } \right]} \right),\quad {\text{P}}_{\text{y}} = Q_{y}^{ - 1} ,\quad {\text{P}}_{\text{t}} = Q_{\text{t}}^{ - 1} , $$
(5)

where \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} \) is the n × 1 observation vector, \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} \) is the respective n × 1 vector of observational noise, the coefficient matrix A(·) is an n × m non-linear function of the l × 1 vector of input random variables t (observed), \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{t} \) is the corresponding l × 1 vector of random noise, ξ is the m × 1 vector of unknown parameters, \( D(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{y} ) = \sigma_{0}^{2} Q_{y} \) and \( D(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{e}_{t} ) = \sigma_{0}^{2} Q_{t} \) are the corresponding dispersion matrices of size n × n and l × l (known), and \( \sigma_{0}^{2} \) is the (unknown) variance component.

Equation (4) can be easily converted into a non-linear mixed model which we define as follows (see e.g., Leick 2004, Chap. 4):

$$ f(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} ,\;\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} ,\;\xi ) = 0, $$
(6)

where f(·) indicates an implicit non-linear relationship between two groups of random observed vectors \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} \) and t and one deterministic unknown vector ξ. Snow (2012) denoted all the observed quantities (here \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} \) and t) by a vector Y and the unknown vector ξ by Ξ, and introduced this model as the non-linear Gauss–Helmert model.

In fact, the old works by Deming (1931) treat very special cases which we might now classify as non-linear types of EIV models, but they do not treat what we call the non-typical EIV model in all its generality. The paper by Xu et al. (2012) offers a presentation of preceding work plus two new results in Sects. 2.2 and 2.3. The first (Sect. 2.2) is a WTLS algorithm treating the classical EIV model, where the authors assume that all elements of A are independent random variables in the sense that Q_A is invertible (\( \omega^{-1} \) in their notation). This is hardly the case, because we almost never observe all the elements of A but only a much smaller number of variables t on which A(t) depends. As a consequence, Q_A is singular, and one has to use Rao’s unified theory to obtain a proper weight matrix out of infinitely many. In this respect our approach is much more general and closer to the reality of actual applications.

Also, the non-typical EIV model (3), and even its linearized version to be given by Eq. (7), differ from the partial EIV model proposed by Xu et al. (2012), because they merely partitioned the matrix A into a stochastic and a deterministic part, while in our case the observed vector t in A(t) [Eq. (4)] cannot be separated from the design matrix before linearization, since its elements are arbitrary non-linear functions. In other words, the deterministic matrix B in Eq. (25-a) of Xu et al. (2012) cannot be extracted from our non-typical EIV model. In fact, the approach of Xu et al. (2012) can solve only the typical EIV model; their approach is useful for statistical analysis.

In order to adjust the non-typical EIV model defined by Eqs. (3)–(6) using traditional non-linear LS within a mixed model, one must first linearize it with respect to the two random unknowns, which refer to the observable quantities \( \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} \) and t, and the one deterministic unknown ξ, as follows:

$$ y_{0} - A\left( {t_{0} } \right)\xi_{0} + I_{n} e_{y} - \left( {\xi^{T} \otimes I_{n} } \right)\frac{\partial A(t)}{\partial t}_{{\left| {\begin{array}{l} {t = t_{0} } \\ {\xi = \xi_{0} } \\ \end{array} } \right.}} e_{t} - A\left( {t_{0} } \right)\delta \xi = 0, $$
$$ w + e_{y} - Be_{t} - A\delta \xi = 0, $$
(7)

with

$$ w = y_{0} - A\left( {t_{0} } \right)\xi_{0} ,\quad B = \left( {\xi^{T} \otimes I_{n} } \right)\frac{\partial A(t)}{\partial t}_{{\left| {\begin{array}{l} {t = t_{0} } \\ {\xi = \xi_{0} } \\ \end{array} } \right.}}\!\!\!{,}\quad A = A\left( {t_{0} } \right). $$
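For a concrete model, the matrix B collapses to a simple form. In the trigonometric example of the introduction, each row of A depends on a single t_i, so B is diagonal with entries ξ₁ cos t_i − ξ₂ sin t_i. The following sketch (function names and test values are ours, chosen for illustration) cross-checks this analytic B against a finite-difference Jacobian of t ↦ A(t)ξ:

```python
import numpy as np

def A(t):
    # design matrix of the trigonometric example: rows [sin t_i, cos t_i, 1]
    return np.column_stack([np.sin(t), np.cos(t), np.ones_like(t)])

def B_analytic(t, xi):
    # B = (xi^T ⊗ I_n) ∂vec A(t)/∂t reduces to a diagonal matrix here,
    # because row i of A depends only on t_i
    return np.diag(xi[0] * np.cos(t) - xi[1] * np.sin(t))

t0 = np.array([0.2, 1.1, 2.7])
xi0 = np.array([2.0, -1.0, 0.5])

# central finite-difference Jacobian of t -> A(t) @ xi as a cross-check
eps = 1e-7
B_num = np.column_stack([
    (A(t0 + eps * e) @ xi0 - A(t0 - eps * e) @ xi0) / (2 * eps)
    for e in np.eye(len(t0))])

print(np.allclose(B_analytic(t0, xi0), B_num, atol=1e-6))
```

Such a numerical check is a cheap way to validate the hand-derived Jacobian before running the adjustment.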

The Lagrange target function can be set up as follows:

$$ \Upphi \left( {e_{y} ,\;{\text{e}}_{\text{t}} ,\;\lambda ,\;\updelta \xi } \right): = {\text{e}}_{\text{y}}^{\text{T}} {\text{P}}_{\text{y}} e_{y} + {\text{e}}_{\text{t}}^{\text{T}} {\text{P}}_{\text{t}} {\text{e}}_{\text{t}} + 2\lambda^{T} \left( {w + e_{y} - Be_{t} - A\delta \xi } \right), $$
(8)

where λ is the n × 1 (unknown) Lagrange multiplier vector.

Since the derivation of this problem is well-known, we only present the following algorithm for the non-typical EIV model:

  • First step: input \( y^{(0)} = y \) (observed), \( t^{(0)} = t \) (observed) and

    $$ \hat{\xi }^{(0)} = \left( {A\left( {t^{(0)} } \right)^{\text{T}} {\text{P}}_{\text{y}} A\left( {t^{(0)} } \right)} \right)^{ - 1} A\left( {t^{(0)} } \right)^{\text{T}} {\text{P}}_{\text{y}} y^{(0)} . $$
  • Second step: for \( i \in \mathbb{N} \) compute:

    $$ w^{(i)} = y - A(t)\hat{\xi }^{(i - 1)} ,\quad B^{(i)} = \left( {\xi^{T} \otimes I_{n} } \right)\frac{\partial A(t)}{\partial t}_{{\left| {\begin{array}{l} {t = t^{(i - 1)} } \\ {\xi = \hat{\xi }^{(i - 1)} } \\ \end{array} } \right.}} ,\quad A^{(i)} = A\left( {t^{(i - 1)} } \right), $$
    (9)
    $$ R^{(i)} = \left[ Q_{y} + B^{(i)} Q_{t} B^{(i)T} \right]^{ - 1} , $$
    (10)
    $$ \delta \hat{\xi }^{(i)} = \left( A^{(i){\rm T}} R^{(i)} A^{(i)} \right)^{ - 1} A^{(i){\rm T}} R^{(i)} w^{(i)} , $$
    (11)
    $$ \hat{\lambda }^{(i)} = R^{(i)} \left( {w^{(i)} - A^{(i)} \delta \hat{\xi }^{(i)} } \right), $$
    (12)
    $$ \tilde{e}_{\text{y}}^{(i)} = - Q_{y} \hat{\lambda }^{(i)} , $$
    (13)
    $$ \tilde{e}_{\text{t}}^{(i)} = Q_{t} B^{(i)T} \hat{\lambda }^{(i)} , $$
    (14)
    $$ \hat{\xi }^{(i)} = \hat{\xi }^{(i - 1)} + \delta \hat{\xi }^{(i)} , $$
    (15)
    $$ y^{(i)} = y + {\tilde{\text{e}}}_{\text{y}}^{(i)} , $$
    (16)
    $$ t^{(i)} = t + {\tilde{\text{e}}}_{\text{t}}^{(i)} . $$
    (17)
  • Third step: repeat 2nd step until one sees convergence.
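The steps above can be sketched compactly in numerical code. The following is a simplified sketch for the trigonometric model of the introduction, not the authors' implementation: it follows Eqs. (9)–(17) with the linearization point updated at every iteration, and all function names, the convergence test and the noise-free check data are our own choices:

```python
import numpy as np

def A(t):
    # design matrix of the trigonometric model y = a sin t + b cos t + c
    return np.column_stack([np.sin(t), np.cos(t), np.ones_like(t)])

def B(t, xi):
    # B = (xi^T ⊗ I_n) ∂vec A(t)/∂t; diagonal here, row i depends on t_i only
    return np.diag(xi[0] * np.cos(t) - xi[1] * np.sin(t))

def algorithm1(y, t, Qy, Qt, tol=1e-12, max_iter=50):
    # first step: weighted LS for the initial parameter values
    Py = np.linalg.inv(Qy)
    A0 = A(t)
    xi = np.linalg.solve(A0.T @ Py @ A0, A0.T @ Py @ y)
    ti = t.copy()
    for _ in range(max_iter):
        # second step: Eqs. (9)-(17), linearizing at the current point
        Ai, Bi = A(ti), B(ti, xi)
        w = y - Ai @ xi
        R = np.linalg.inv(Qy + Bi @ Qt @ Bi.T)               # Eq. (10)
        dxi = np.linalg.solve(Ai.T @ R @ Ai, Ai.T @ R @ w)   # Eq. (11)
        lam = R @ (w - Ai @ dxi)                             # Eq. (12)
        xi = xi + dxi                                        # Eq. (15)
        ti = t + Qt @ Bi.T @ lam                             # Eqs. (14), (17)
        if np.linalg.norm(dxi) < tol:                        # third step
            break
    return xi

# noise-free sanity check: the algorithm reproduces the true parameters
t_true = np.linspace(0.0, 2 * np.pi, 8, endpoint=False)
xi_true = np.array([2.0, -1.0, 0.5])
xi_hat = algorithm1(A(t_true) @ xi_true, t_true, np.eye(8), np.eye(8))
print(xi_hat)
```

With identity cofactor matrices and noise-free data, the first weighted-LS step is already exact and the iteration terminates immediately; with noisy data the loop iterates until the parameter update falls below the threshold.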

The variance component \( \hat{\upsigma }_{0}^{2} \) can be estimated within the proposed algorithm by inserting Eqs. (12)–(14) into the following quadratic forms:

$$ \hat{\upsigma }_{0}^{2} = \frac{{{\tilde{\text{e}}}_{\text{y}}^{\text{T}} {\text{P}}_{\text{y}} {\tilde{\text{e}}}_{\text{y}} + {\tilde{\text{e}}}_{\text{t}}^{\text{T}} {\text{P}}_{\text{t}} {\tilde{\text{e}}}_{\text{t}} }}{{{\text{n}} - {\text{m}}}} = \frac{{\hat{\uplambda }^{\text{T}} {\text{R}}^{ - 1} \hat{\uplambda }}}{{{\text{n}} - {\text{m}}}}. $$
(18)

Important note: since Q_y and Q_t are usually invertible and the matrix B has full row rank, the normal equations of this algorithm are well conditioned, and consequently the algorithm is usually stable. The iterative method based on the GHM is also not sensitive to the initial values, because the estimates of the first iteration already provide a very good approximate solution; see Shen et al. (2011).

3 A total least-squares (TLS) algorithm to deal with non-typical EIV model: algorithm 2

Although there is only one least-squares criterion, there are several techniques by which least squares may be applied. Regardless of which technique is applied, the results of an adjustment of a given set of measurements must be the same. The choice of a technique is therefore mostly a matter of convenience and/or computational economy (Mikhail 1976). In fact, both the TLS and the traditional non-linear LS methods are based on the L2-norm estimator; however, some theoretical properties make the former more attractive than the latter. For instance, the TLS algorithms do not need linearization, and they are less sensitive to the approximate initial values of the unknown parameters than the traditional approaches. As a result, one is justified in seeking a TLS algorithm for the non-typical EIV model, although in this case the first-order moment/mean of the random design matrix is not linear, and we cannot employ a TLS algorithm to solve it directly.

A comparison between the algorithm of the previous section and the WTLS algorithm within the typical EIV model [given by Eqs. (1) and (2)] in Mahboub (2012) shows that this WTLS algorithm has the potential to deal with the non-typical EIV model after some slight modifications; in other words, to treat the non-typical EIV model as a typical EIV model. The actual computation could be based on Monte-Carlo methods; however, a first-order approximation can be obtained by applying the law of covariance propagation to the noisy design matrix \( A\left( {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{t} } \right). \)

Approximately, the converted typical EIV model reads \( {\text{y - e}}_{\text{y}} = \left( {{\text{A}}({\text{t}}) - \frac{{\partial {\text{A}}({\text{t}})}}{{\partial {\text{t}}}}{\text{e}}_{\text{t}} } \right)\xi , \) i.e., \( {\text{E}}_{\text{A}} = \frac{{\partial {\text{A}}({\text{t}})}}{{\partial {\text{t}}}}{\text{e}}_{\text{t}} , \) and the dispersion matrix Q_A is easily derived by the law of error propagation as \( {\text{Q}}_{\text{A}} = \frac{{\partial {\text{A}}({\text{t}})}}{{\partial {\text{t}}}}{\text{Q}}_{\text{t}} \left( {\frac{{\partial {\text{A}}({\text{t}})}}{{\partial {\text{t}}}}} \right)^{\text{T}} . \) The converted typical EIV model can then be solved by WTLS. Summarizing, the following WTLS algorithm is proposed for the non-typical EIV model:

  • First step: \( [N,\;c] = A^{\text{T}} P_{y} [A,\;y],\quad \xi^{(0)} = N^{-1} c. \)

  • Second step: \( Q_{A} = \frac{\partial vec(A(t))}{\partial t}_{{\left| {t = t^{(0)} } \right.}} Q_{t} \left( {\frac{\partial vec(A(t))}{\partial t}_{{\left| {t = t^{(0)} } \right.}} } \right)^{\text{T}} . \)

  • Third step: for \( i \in \mathbb{N} \) compute:

    $$ R_{1}^{{({\text{i}})}} = \left[ {Q_{y} + \left( {\hat{\xi }^{{(i - 1){\text{T}}}} \otimes I_{n} } \right)Q_{A} \left( {\hat{\xi }^{(i - 1)} \otimes I_{n} } \right)} \right]^{ - 1} , $$
    $$ \hat{\lambda }^{(i)} = R_{1}^{{({\text{i}})}} \left( {y - A\hat{\xi }^{(i - 1)} } \right), $$
    $$ R_{2}^{{({\text{i}})}} = \left( {I_{\text{m}} \otimes \hat{\lambda }^{{(i){\text{T}}}} } \right)Q_{A} \left( {\hat{\xi }^{(i - 1)} \otimes I_{n} } \right)R_{1}^{{({\text{i}})}} , $$
    $$ \hat{\xi }^{(i)} = \left( {A^{\text{T}} R_{1}^{{({\text{i}})}} A + R_{2}^{{({\text{i}})}} A} \right)^{ - 1} \left( {A^{\text{T}} R_{1}^{{({\text{i}})}} + R_{2}^{{({\text{i}})}} } \right){\text{y}}. $$
  • Fourth step: repeat third step until one sees convergence

    $$ \left\| {\hat{\xi }^{(i)} - \hat{\xi }^{(i - 1)} } \right\| < \delta . $$
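Algorithm 2 can likewise be sketched numerically for the trigonometric model. This is a simplified sketch under assumptions of our own (identity cofactor matrices in the check, our function names, a Jacobian of vec(A) written out by hand), not the authors' code:

```python
import numpy as np

def A(x):
    # trigonometric model again: y = a sin x + b cos x + c
    return np.column_stack([np.sin(x), np.cos(x), np.ones_like(x)])

def jac_vecA(x):
    # Jacobian of vec(A(x)) (columns stacked) with respect to x
    n = len(x)
    J = np.zeros((3 * n, n))
    J[:n, :] = np.diag(np.cos(x))        # d sin(x_i) / d x_i
    J[n:2 * n, :] = np.diag(-np.sin(x))  # d cos(x_i) / d x_i
    # last block stays zero: the column of ones does not depend on x
    return J

def algorithm2(y, x, Qy, Qt, tol=1e-12, max_iter=50):
    n, m = len(y), 3
    Amat, Py, In, Im = A(x), np.linalg.inv(Qy), np.eye(n), np.eye(m)
    # first step: weighted LS start
    xi = np.linalg.solve(Amat.T @ Py @ Amat, Amat.T @ Py @ y)
    # second step: first-order covariance propagation for Q_A
    J = jac_vecA(x)
    QA = J @ Qt @ J.T
    for _ in range(max_iter):
        # third step: modified WTLS iteration
        xiT_kron = np.kron(xi.reshape(1, -1), In)   # (xi^T ⊗ I_n)
        xi_kron = np.kron(xi.reshape(-1, 1), In)    # (xi ⊗ I_n)
        R1 = np.linalg.inv(Qy + xiT_kron @ QA @ xi_kron)
        lam = R1 @ (y - Amat @ xi)
        R2 = np.kron(Im, lam.reshape(1, -1)) @ QA @ xi_kron @ R1
        xi_new = np.linalg.solve(Amat.T @ R1 @ Amat + R2 @ Amat,
                                 (Amat.T @ R1 + R2) @ y)
        # fourth step: stop once the parameter change is below the threshold
        done = np.linalg.norm(xi_new - xi) < tol
        xi = xi_new
        if done:
            break
    return xi

# noise-free sanity check: the true parameters are reproduced
x_true = np.linspace(0.0, 2 * np.pi, 8, endpoint=False)
xi_true = np.array([2.0, -1.0, 0.5])
xi_hat = algorithm2(A(x_true) @ xi_true, x_true, np.eye(8), np.eye(8))
print(xi_hat)
```

Note that, unlike algorithm 1, no re-linearization happens inside the loop: A and Q_A are computed once from the observed x, and only the weight matrices R1 and R2 are updated.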

4 Numerical results and discussions

The determination of the initial position and constant velocity from redundant position measurements is an example of curve fitting. Let g(x) be an unknown function of which we measure m points, both of whose coordinates (x, g(x)) are contaminated by random noise (see Fig. 1):

$$ y_{i} \approx g\left( {x_{i} } \right),\quad i = 1, \ldots ,m. $$
(19)
Fig. 1

The sampling of an unknown function g(x) for x_i, i = 1,…, m

Clearly, one is not able to reconstruct an arbitrary function g(x) from a finite set of its samples, and additional information about it is required. This additional information can be expressed by a linear combination of n known base functions b_r(x), r = 1,…, n,

$$ g(x) = \mathop \sum \limits_{r = 1}^{n} a_{r} b_{r} (x), $$
(20)

with the n unknown coefficients a_r. Equations (19) and (20) give the following system of equations:

$$ \left[ {\begin{array}{c} {y_{1} } \\ \vdots \\ {y_{m} } \\ \end{array} } \right] \approx \left[ {\begin{array}{ccc} {b_{1} (x_{1} )} & \ldots & {b_{n} (x_{1} )} \\ \vdots & {} & \vdots \\ {b_{1} (x_{m} )} & \cdots & {b_{n} (x_{m} )} \\ \end{array} } \right]\left[ {\begin{array}{c} {a_{1} } \\ \vdots \\ {a_{n} } \\ \end{array} } \right]. $$
(21)

This system is non-linear because of the randomness of the coordinates x_i, which appear in the base functions b_r. Therefore, one is not allowed to solve this system using the linear LS method within a GM model. Also, as discussed in the previous sections, the TLS algorithms within a typical EIV model are not admissible theoretically. One should employ one of the two approaches proposed in this research.
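Assembling the coefficient matrix of system (21) is mechanical once the base functions are fixed. A minimal sketch (helper name and sample abscissas are ours, for illustration only):

```python
import numpy as np

def design_matrix(x, base_functions):
    # column r holds the r-th base function evaluated at every observed x_i,
    # exactly the m x n coefficient matrix of system (21)
    return np.column_stack([b(x) for b in base_functions])

x_obs = np.array([0.1, 0.8, 1.7, 2.5, 3.3])

# polynomial basis 1, x, x^2, x^3 (second example of the introduction)
poly_basis = [lambda x, r=r: x**r for r in range(4)]
A_poly = design_matrix(x_obs, poly_basis)

# trigonometric basis sin x, cos x, 1 (first example of the introduction)
trig_basis = [np.sin, np.cos, lambda x: np.ones_like(x)]
A_trig = design_matrix(x_obs, trig_basis)

print(A_poly.shape, A_trig.shape)
```

The non-linearity discussed above lies not in this assembly but in the fact that the x_i entering the base functions are themselves noisy.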

Due to the nature of the function g(x), different base functions b_r(x) can be used. Here we examine two useful sets of base functions, namely trigonometric and polynomial series.

4.1 Curve fitting using trigonometric base functions

Suppose that we measure the coordinates of 10 points of the function g(x), which is given as follows:

$$ \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = a\sin \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} + b\cos \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} + c = [\begin{array}{ccc} {\sin \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} } & {\cos \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} } & 1 \\ \end{array} ]\left[ {\begin{array}{c} a \\ b \\ c \\ \end{array} } \right] = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} )\left[ {\begin{array}{c} a \\ b \\ c \\ \end{array} } \right], $$
(22)

where a, b and c are the unknown coefficients and \( (\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} ,\;\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} ) \) denotes the noisy coordinates, which have been observed with different precisions. The coordinates of these samples with their weights are given in Table 1 and Fig. 2.

Table 1 Observed points and their weights
Fig. 2

The trigonometric curve and the noisy coordinates of 10 points as its samples

We adjust the system of Eq. (22) in four ways: (1) the linear LS method; (2) the standard TLS method under the assumption that the non-typical EIV model is a typical EIV model; (3) algorithm 1 of Sect. 2, based on the traditional non-linear LS method within a mixed model; and (4) algorithm 2 of Sect. 3, based on the modified WTLS method within a typical EIV model. The estimated unknown parameters using the four methods and the true values of the unknown parameters are given in Table 2.

Table 2 The estimated unknown parameters using the four methods and the true values of the unknown parameters

From Table 2, one can clearly see that the linear LS method gives a biased estimate for the non-typical EIV model (22), which in fact is a non-linear system of equations. One is also not permitted to adjust this problem using the standard TLS method under the assumption that the non-typical EIV model is a typical EIV model, since, as has been shown theoretically, in such a case the non-linear relationships of the elements of the random design matrix are neglected. The results of the linear LS and standard TLS methods (third column) are therefore incorrect. The correct solutions can be obtained by the two algorithms developed in this paper. Furthermore, algorithm 2 is more stable than algorithm 1 in terms of the number of iterations: the former converges after two iterations, while, starting from the same initial unknown parameters, more than five iterations are required for algorithm 1 to meet the same threshold. In addition, algorithm 1 needs linearization while algorithm 2 does not. Algorithm 2 is also not sensitive to the approximate initial values of the parameters, which is the bottleneck that restricts the application of non-linear techniques, whose solutions are somewhat critical to handle due to the many pitfalls described by Pope (1972).
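The bias of the naive linear LS solution is easy to reproduce with a small Monte-Carlo experiment. The sketch below uses illustrative parameter values and noise levels of our own choosing, not the data of Table 1: for Gaussian abscissa noise, E[sin(x+η)] = e^{−σ²/2} sin x, so regressing on the noisy abscissas systematically attenuates the estimated sine and cosine amplitudes:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical true parameters of the trigonometric curve
a, b, c = 2.0, -1.0, 0.5
n, sigma_x, sigma_y, reps = 50, 0.5, 0.1, 1000
x_true = np.linspace(0.0, 2 * np.pi, n, endpoint=False)

est = np.empty((reps, 3))
for k in range(reps):
    x_obs = x_true + sigma_x * rng.standard_normal(n)   # noisy abscissas
    y_obs = (a * np.sin(x_true) + b * np.cos(x_true) + c
             + sigma_y * rng.standard_normal(n))        # noisy ordinates
    A = np.column_stack([np.sin(x_obs), np.cos(x_obs), np.ones(n)])
    est[k] = np.linalg.lstsq(A, y_obs, rcond=None)[0]   # naive linear LS

# the amplitude estimates are shrunk towards zero, roughly by the
# factor exp(-sigma_x**2 / 2) ≈ 0.88 here, a systematic bias
print(est.mean(axis=0))
```

Averaging over many replications isolates the systematic part of the error, which no amount of data removes; this is precisely the bias that algorithms 1 and 2 are designed to avoid.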

4.2 Curve fitting using polynomial base functions

Let g(x) be an unknown function of which we measure the coordinates of 10 points:

$$ \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} = a_{0} + a_{1} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} + a_{2} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x}^{2} + a_{3} \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x}^{3} = \left[ {\begin{array}{cccc} 1 & \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{{x}} & {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{{x}}^{2} } & {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{{x}}^{3} } \\ \end{array} } \right]\left[ {\begin{array}{c} {a_{0} } \\ {a_{1} } \\ {a_{2} } \\ {a_{3} } \\ \end{array} } \right] = A(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} )\left[ {\begin{array}{c} {a_{0} } \\ {a_{1} } \\ {a_{2} } \\ {a_{3} } \\ \end{array} } \right], $$
(23)

where a₀, a₁, a₂ and a₃ are the unknown coefficients and \( (\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{x} ,\;\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{y} ) \) denotes the noisy coordinates, which have been observed with different precisions. The coordinates of these samples with their weights are given in Table 3 and Fig. 3.

Table 3 Observed points and their weights
Fig. 3

The polynomial curve and the noisy coordinates of 10 points as its samples

Similarly, we adjust the system of Eq. (23) in four ways: (1) the linear LS method; (2) the standard TLS method under the assumption that the non-typical EIV model is a typical EIV model; (3) algorithm 1 of Sect. 2, based on the traditional non-linear LS method within a mixed model; and (4) algorithm 2 of Sect. 3, based on the modified WTLS method within a typical EIV model. The estimated unknown parameters using the four methods and the true values of the unknown parameters are given in Table 4.

Table 4 The estimated unknown parameters using the four methods and the true values of the unknown parameters

As for the trigonometric curve, the linear LS method gives a biased estimate for the polynomial curve. By comparison, this bias is larger than in the previous example, which can be attributed to the different non-linear properties of trigonometric and polynomial functions. Similar reasoning applies to the standard TLS method under the assumption that the non-typical EIV model is a typical EIV model. Reasonable solutions are again obtained by the stable algorithms 1 and 2, although there is a negligible difference between their results.

5 Conclusions

In this paper, the non-typical EIV model was introduced, in which the elements of the design matrix are non-linear functions of observed noisy quantities. As a result, the first-order moment/mean of the design matrix is not directly known; consequently, TLS algorithms within a typical/classic EIV model are not applicable to it.

The non-typical EIV model appears in applications such as curve fitting and surface reconstruction. Two algorithms were presented to deal with it. Algorithm 1 is based on the traditional non-linear LS method within a mixed model, and algorithm 2 is a modified WTLS algorithm. Although the former, classic approach produces the optimal LS solution to this problem, traditional non-linear techniques usually have their own difficulties (see, e.g., Pope 1972). Algorithm 2, in contrast, does not need linearization, and its initial values can easily be computed by a linear LS estimation, thereby avoiding the bottleneck that restricts the application of non-linear techniques.

As the numerical examples show, the linear LS method gives a biased estimate for the non-typical EIV model. The amount of this bias depends on the non-linear properties of the system of equations. This conclusion had already been obtained for the typical EIV model in several contributions.

Both examples convincingly demonstrate that the standard TLS solution of the non-typical EIV model is not admissible when it is incorrectly considered as a typical EIV model; i.e., the non-linear relationships of the elements of the random design matrix are neglected.

Although we do not claim that our results based on algorithm 2 are better than those of algorithm 1, algorithm 2 is a proper TLS approach which is more accurate than existing TLS algorithms in dealing with the non-typical EIV model. It is also simpler than the traditional non-linear LS method within a GH model (algorithm 1): if one formulates the non-typical EIV model in terms of the non-linear GH model, one has to solve a complicated model, while in our method (algorithm 2) one can work directly with the design/coefficient matrix and only needs to compute the dispersion matrix Q_A. Furthermore, the examples demonstrated that algorithm 2 converges in fewer iterations than algorithm 1.

Finally, we emphasize that neither the non-linear LS method nor our algorithms (algorithms 1 and 2) are unbiased estimators for either the typical or the non-typical TLS problem, even when the non-linear function in the coefficient matrix is properly considered. One may design a bias-corrected estimator for the non-typical EIV model; see, e.g., Box (1971) and Xu et al. (2012).