1 Introduction

The parameter estimation problem for two-dimensional (2D) chirp models arises in many real-life applications such as 2D homomorphic signal processing, magnetic resonance imaging (MRI), optical imaging, interferometric synthetic aperture radar (InSAR), and modeling non-homogeneous patterns in texture images captured by a camera due to perspective or orientation (see, e.g., Francos & Friedlander, 1995, 1998, 1999 and the references cited therein). 2D chirp signals have also been used to model Newton's rings (Guo & Li, 2018); these rings are predominantly used in testing spherical and flat optical surfaces and in measuring curvature radii. 2D chirp signals have further been used as a spreading function or base for digital watermarking (Stankovic et al., 2001), which is also helpful in data security, medical safety, fingerprinting, and detecting content manipulation; see Zhang et al. (2010). 2D chirps with linear frequency modulation have been used extensively because of their linear feature in the time-frequency domain. These models have also been employed in many applications for other important properties, e.g., spectral shaping of the watermark by appropriately choosing the frequency and frequency rate parameters, which allows minimal overlap of the spreading function with the image data. 2D chirps also allow adaptive watermarking for enhanced robustness to stationary filtering attacks, i.e., filtering whose parameters do not change over an image; see Stankovic et al. (2001).

Many algorithms based on different approaches have been put forward in the literature to solve such problems. The polynomial phase differencing (PD) operator was introduced in Friedlander and Francos (1995) as an extension of the polynomial phase transform proposed in Peleg and Porat (1991). Several works, Friedlander and Francos (1996, 1998, 1999), utilized the PD operator to develop computationally efficient algorithms for estimating similar polynomial phase signals. The cubic phase function (CPF), proposed in O'Shea (2002), was extended in Zhang et al. (2008) for similar 2D chirp signal modeling. Further, the CPF was utilized to estimate 2D cubic phase signals using a genetic algorithm in Djurović et al. (2010). Consistency and asymptotic normality of the LSEs for a general 2D polynomial phase signal (PPS) model have been derived in Lahiri and Kundu (2017). A finite-step, computationally efficient procedure for a similar 2D chirp signal model, proposed in Lahiri et al. (2013), was proved asymptotically equivalent to the LSEs. The quasi-maximum likelihood (QML) algorithm proposed for 1D PPS in Djurović and Stanković (2014) was generalized for 2D PPS in Djurović (2017). Further, the approximate least squares estimators (ALSEs) proposed in Grover and Kundu (2018) have been proved to be asymptotically equivalent to the LSEs. An efficient estimation procedure based on a fixed-dimension technique, presented in Grover et al. (2021), was shown to be asymptotically equivalent to the optimal LSEs for a 2D chirp model without the product term \(mn\); compare, e.g., model (1) and the one considered in Grover et al. (2021).

Estimators based on phase differencing strategies, the high-order ambiguity function (HAF), or some of their modifications are computationally easy to obtain. However, their estimation performance deteriorates below a relatively high signal-to-noise ratio (SNR) threshold, and they are sub-optimal. Methods that use PD in the estimation steps usually estimate the coefficients of the highest degree first, and then subsequently estimate the lower-degree coefficients from the demodulated or de-chirped signal. Therefore, the estimation error of the highest-degree coefficients accumulates and quite seriously affects the estimation accuracy of the lower-degree coefficients. For more details, one can refer to Barbarossa et al. (1998), Djurović (2017) and Wu et al. (2008). To date, there is no detailed study of the theoretical properties of the CPF and QML estimators, such as strong consistency and asymptotic normality. Recently, optimal estimators for a simpler 2D chirp model without the interaction term have been developed in Grover et al. (2021). However, the results in Grover et al. (2021) cannot be generalized directly to the underlying model (1). It may be noted that the model considered in this paper is more general, as it takes into account the interaction term \(\mu ^0 mn\). Due to the presence of this interaction term, the estimation becomes more difficult: the estimators of \(\alpha ^0\) and \(\gamma ^0\) are no longer independent (as they are in Grover et al., 2021), which makes both their computation and their theoretical analysis more challenging. The problem becomes still more complicated under the general stationary linear process error assumption.

The main contributions of this paper are: providing a computationally efficient algorithm to estimate the parameters of the model defined in (1), and establishing the theoretical asymptotic properties of the proposed estimators. The proposed algorithm is motivated by the fact that a 2D chirp model with five non-linear parameters can be viewed through a number of 1D chirp models with two non-linear parameters each, and hence the computational complexity of the estimators for the 2D model can be reduced.

The key attributes of the proposed method are that it is computationally faster than the conventional optimal methods, such as the LSEs, maximum likelihood estimators, or ALSEs, while retaining desirable statistical properties, such as attaining the same rates of convergence as the optimal LSEs. In fact, the proposed estimators of the chirp rate parameters have the same asymptotic variance as the traditional LSEs, and hence are asymptotically optimal.

The rest of the paper is organised as follows: the mathematical model and the methodology to obtain the proposed estimators are presented in Sect. 2. The model assumptions and the asymptotic theoretical results are given in Sect. 3. In Sect. 4, the finite sample performance of the proposed estimators is demonstrated through simulation studies; this section also presents a comparison of the proposed estimators with state-of-the-art methods such as the least squares method, the approximate least squares method, and the 2D multilag HAF method. Finally, Sect. 5 concludes the paper, followed by detailed proofs in the appendices.

2 Estimation methodology

This paper addresses the problem of parameter estimation of a 2D chirp signal model defined as follows:

$$\begin{aligned} y(m,n)&=A^0\cos (\alpha ^0 m+\beta ^0 m^2+\gamma ^0 n+\delta ^0 n^2+\mu ^0mn)\nonumber \\&\quad \;+B^0\sin (\alpha ^0 m+\beta ^0 m^2+\gamma ^0 n+\delta ^0 n^2+\mu ^0mn)+X(m,n),\nonumber \\& m=1,2,\ldots ,M, n=1,2,\ldots ,N. \end{aligned}$$
(1)

Here, \(y(m,n)\) is the observed real valued signal and \(X(m,n)\) is the additive noise term. \(A^0,B^0\) are amplitude parameters, \(\alpha ^0,\gamma ^0\) are frequency parameters, \(\beta ^0,\delta ^0\) are frequency rates or chirp rates, and \(\mu ^0\) is the coefficient of the product (interaction) term. \(\mathrm {{\varvec{\xi }}}^0 = (\alpha ^0,\beta ^0, \gamma ^0, \delta ^0, \mu ^0)^\top \) represents the vector of non-linear parameters. This model can be used to describe signals having constant amplitude whose frequency is a linear function of the spatial co-ordinates. The product term \(mn\) in chirp models such as (1) is an important characteristic of numerous interferometric measurement signals, radar signal returns, and digital watermark detection.
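As a concrete illustration, model (1) can be simulated in a few lines. This is only a sketch: the parameter values below are illustrative (the simulation values of Sect. 4 could equally be substituted) and are not prescribed by the model itself.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from the paper).
A0, B0 = 1.0, 0.5
alpha0, beta0, gamma0, delta0, mu0 = 0.4, 0.1429, 0.25, 0.1250, 0.1667
M, N = 50, 50
sigma = 0.1

rng = np.random.default_rng(0)
m = np.arange(1, M + 1)[:, None]   # m = 1, ..., M as a column
n = np.arange(1, N + 1)[None, :]   # n = 1, ..., N as a row
# phase of model (1): alpha*m + beta*m^2 + gamma*n + delta*n^2 + mu*m*n
phase = alpha0 * m + beta0 * m**2 + gamma0 * n + delta0 * n**2 + mu0 * m * n
# observed data: signal plus i.i.d. Gaussian noise (one admissible X(m,n))
Y = A0 * np.cos(phase) + B0 * np.sin(phase) + sigma * rng.standard_normal((M, N))
```

The broadcasting of the column vector `m` against the row vector `n` produces the full \(M\times N\) data matrix \({\varvec{Y}}\) in one step.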

Now we discuss the proposed method of estimation. Let the data matrix for model (1) be denoted as

$$\begin{aligned} {\varvec{Y}} = \begin{bmatrix} y(1,1)&{} y(1,2)&{}\ldots &{}y(1,N)\\ y(2,1)&{} y(2,2)&{} \ldots &{}y(2,N)\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ y(M,1)&{}y(M,2)&{}\ldots &{} y(M,N) \end{bmatrix}_{M\times N}. \end{aligned}$$

The proposed algorithm uses the fact that, for each fixed column (or row) of \({\varvec{Y}}\), the 2D chirp model reduces to a 1D chirp model.

Observe that if we fix one dimension, say \(n=n_0\), in (1), then the 2D chirp can be seen as a 1D chirp in \(m=1,2,\ldots ,M\), as follows:

$$\begin{aligned} y(m,n_0)&= A^0(n_0)\cos \big ((\alpha ^0 +n_0\mu ^0)m+\beta ^0 m^2 \big )\nonumber \\&\quad \;+B^0(n_0)\sin \big ((\alpha ^0 +n_0\mu ^0)m+\beta ^0 m^2 \big )\nonumber \\&\quad \; +X(m,n_0), \nonumber \\ \text {where, }A^0(n_0)&= A^0\cos ( \gamma ^0 n_0+\delta ^0 n_0^2)+B^0\sin ( \gamma ^0 n_0+\delta ^0 n_0^2),\nonumber \\ B^0(n_0)&= -A^0\sin ( \gamma ^0 n_0+\delta ^0 n_0^2)+B^0\cos ( \gamma ^0 n_0+\delta ^0 n_0^2). \end{aligned}$$
(2)

Similarly, for a fixed \(m=m_0\), we have a 1D chirp for \(n=1,2,\ldots ,N\):

$$\begin{aligned} y(m_0,n) = {\widetilde{A}}^0(m_0)\cos \big ((\gamma ^0+m_0\mu ^0) n+\delta ^0 n^2 \big )&+{\widetilde{B}}^0(m_0)\sin \big ((\gamma ^0+m_0\mu ^0) n+\delta ^0 n^2 \big )\nonumber \\&+X(m_0,n). \end{aligned}$$
(3)

Equation (2) represents a 1D chirp signal model with \(\alpha ^0+n_0\mu ^0\) and \(\beta ^0\) as the frequency and frequency rate parameters, respectively. Similarly, Eq. (3) represents a 1D chirp signal model with \(\gamma ^0+m_0\mu ^0\) and \(\delta ^0\) as the frequency and frequency rate parameters, respectively.
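The identity behind (2) is elementary: writing the full phase as \(\theta +\psi \) with \(\theta =(\alpha ^0+n_0\mu ^0)m+\beta ^0 m^2\) and \(\psi =\gamma ^0 n_0+\delta ^0 n_0^2\), the angle-addition formulas fold the constant \(\psi \) into the amplitudes \(A^0(n_0),B^0(n_0)\). A quick numerical check of this decomposition, using illustrative (assumed) parameter values:

```python
import numpy as np

# Illustrative parameter values (assumptions for the check only).
A0, B0 = 1.0, 0.5
alpha0, beta0, gamma0, delta0, mu0 = 0.4, 0.1429, 0.25, 0.1250, 0.1667
M, n0 = 40, 7
m = np.arange(1, M + 1)

# Left side: the noise-free 2D model (1) evaluated along column n0.
full = alpha0 * m + beta0 * m**2 + gamma0 * n0 + delta0 * n0**2 + mu0 * m * n0
lhs = A0 * np.cos(full) + B0 * np.sin(full)

# Right side: the 1D chirp form (2), with psi folded into the amplitudes.
psi = gamma0 * n0 + delta0 * n0**2
A_n0 = A0 * np.cos(psi) + B0 * np.sin(psi)
B_n0 = -A0 * np.sin(psi) + B0 * np.cos(psi)
theta = (alpha0 + n0 * mu0) * m + beta0 * m**2
rhs = A_n0 * np.cos(theta) + B_n0 * np.sin(theta)
# lhs and rhs agree to machine precision
```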

Hence, our methodology estimates the parameters of these 1D chirps based on individual column (or row) vectors of the data matrix, rather than estimating all the 2D chirp parameters based on the full data matrix. This procedure therefore drastically reduces the computational burden of estimating the model parameters. Further, suppose the column vector \({\varvec{Y}}_{Mn_0}\) denotes the \(n_0^{th}\) column of the data matrix \({\varvec{Y}}\) and the column vector \({\varvec{Y}}_{m_0N}\) denotes the transpose of the \(m_0^{th}\) row of \({\varvec{Y}}\). Define the \(k\times 2\) matrix \(Z_k(\alpha _1,\alpha _2)\) as

$$\begin{aligned} Z_k(\alpha _1,\alpha _2)= \begin{bmatrix} \cos (\alpha _1+\alpha _2) &{} \sin (\alpha _1+\alpha _2)\\ \cos (2\alpha _1+2^2\alpha _2) &{} \sin (2\alpha _1+2^2\alpha _2)\\ \vdots &{}\vdots \\ \cos (k\alpha _1+k^2\alpha _2) &{} \sin (k\alpha _1+k^2\alpha _2)\\ \end{bmatrix}_{k\times 2}. \end{aligned}$$
(4)

Since (2) is a 1D chirp model, we use least squares to estimate its parameters. We obtain the LSEs of the non-linear parameters in model (2) for a fixed \(n_0\) by defining the following reduced sum of squares, see Grover et al. (2021):

$$\begin{aligned} R_{Mn_0}(\alpha _1,\alpha _2) = {\varvec{Y}}_{Mn_0}^{\top }\Big (I_{M\times M}-P_{Z_M}(\alpha _1,\alpha _2)\Big ){\varvec{Y}}_{Mn_0}, \end{aligned}$$
(5)

where, \(P_{Z_M}(\alpha _1,\alpha _2) = Z_M(\alpha _1,\alpha _2)\Big (Z_M(\alpha _1,\alpha _2)^\top Z_M(\alpha _1,\alpha _2)\Big )^{-1}Z_M(\alpha _1,\alpha _2)^\top \) and \(I_{M\times M}\) is the \(M\times M\) identity matrix.

Then,

$$\begin{aligned} ({\widehat{\alpha }}_{n_0},{\widehat{\beta }}_{n_0})^\top = {\displaystyle \mathop {\mathrm {arg\; min}}\limits _{\alpha _1,\alpha _2}}R_{Mn_0}(\alpha _1,\alpha _2) \end{aligned}$$
(6)

is the proposed estimator of \((\alpha ^0+n_0\mu ^0,\beta ^0)^\top \), based on minimizing the sum of squares corresponding to the \(n_0^{th}\) column of the data matrix \({\varvec{Y}}\). Similarly, we can obtain the LSEs of the parameters in model (3) by defining

$$\begin{aligned} R_{m_0N}(\alpha _1,\alpha _2) = {\varvec{Y}}_{m_0N}^{\top }\Big (I_{N\times N}-P_{Z_N}(\alpha _1,\alpha _2)\Big ){\varvec{Y}}_{m_0N}, \end{aligned}$$
(7)

where, \(P_{Z_N}(\alpha _1,\alpha _2) = Z_N(\alpha _1,\alpha _2)\Big (Z_N(\alpha _1,\alpha _2)^\top Z_N(\alpha _1,\alpha _2)\Big )^{-1}Z_N(\alpha _1,\alpha _2)^\top \) and \(I_{N\times N}\) is the \(N\times N\) identity matrix. Then,

$$\begin{aligned} ({\widehat{\gamma }}_{m_0},{\widehat{\delta }}_{m_0})^\top = {\displaystyle \mathop {\mathrm {arg\; min}}\limits _{\alpha _1,\alpha _2}}R_{m_0N}(\alpha _1,\alpha _2) \end{aligned}$$
(8)

is the proposed estimator of \((\gamma ^0+m_0\mu ^0,\delta ^0)^\top \), based on minimizing the sum of squares corresponding to the \(m_0^{th}\) row of the data matrix \({\varvec{Y}}\).
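A minimal numpy sketch of the reduced sum of squares (5) and the columnwise minimization (6) follows. The parameter values are illustrative assumptions, the column is taken noise-free, and `lstsq` is used in place of forming the projection matrix \(P_{Z_M}\) explicitly (the residual sum it yields is the same quantity).

```python
import numpy as np

def Z(k, a1, a2):
    """The k x 2 design matrix Z_k(a1, a2) of (4)."""
    t = np.arange(1, k + 1)
    phase = a1 * t + a2 * t**2
    return np.column_stack([np.cos(phase), np.sin(phase)])

def reduced_ss(y, a1, a2):
    """R(a1, a2) = y'(I - P_Z)y, computed via least squares."""
    Zk = Z(len(y), a1, a2)
    coef = np.linalg.lstsq(Zk, y, rcond=None)[0]
    return float(y @ (y - Zk @ coef))   # y'y - y'P_Z y

# Noise-free column n0 of model (1), illustrative parameter values.
A0, B0, alpha0, beta0, mu0, n0, M = 1.0, 0.5, 0.4, 0.1429, 0.1667, 3, 60
m = np.arange(1, M + 1)
theta = (alpha0 + n0 * mu0) * m + beta0 * m**2
y = A0 * np.cos(theta) + B0 * np.sin(theta)

# On a local grid containing the truth, R is minimised exactly at the
# true (frequency, chirp rate) pair, as (6) requires.
cands = [(reduced_ss(y, alpha0 + n0 * mu0 + da, beta0 + db), da, db)
         for da in np.linspace(-0.1, 0.1, 21)
         for db in np.linspace(-0.02, 0.02, 21)]
_, da_best, db_best = min(cands)
```

In practice the search would run over the full ranges \([0,2\pi ]\times [0,\pi /2]\); the local grid here only keeps the sketch fast.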

We observe that for each fixed column, (6) gives an estimate of the same chirp rate parameter \(\beta ^0\), along with an estimate of a frequency parameter that is a linear combination of \(\alpha ^0\) and \(\mu ^0\). Similarly, for each fixed row, (8) gives an estimate of \(\delta ^0\) and of a linear combination of \(\gamma ^0\) and \(\mu ^0\). It is important to note that the linearity in the parameters of the 1D chirp models plays a crucial role in obtaining the proposed estimators of \(\alpha ^0,\gamma ^0\), and \(\mu ^0\) by fitting a linear regression model as follows.

Once the parameters of all \(M+N\) 1D chirp models have been estimated, we apply the following three steps to obtain the final estimates of the parameters of model (1):

  1. Step-1.

    Let \({\varvec{\Gamma }}^\top =\begin{bmatrix} 1&{}1&{}\cdots &{}1&{}0&{}0&{}\cdots &{}0\\ 0&{}0&{}\cdots &{}0&{}1&{}1&{}\cdots &{}1\\ 1&{}2&{}\cdots &{}N&{}1&{}2&{}\cdots &{}M \end{bmatrix}\), and \({\varvec{\Lambda }}^\top =\begin{bmatrix} {\widehat{\alpha }}_1&{\widehat{\alpha }}_2&\cdots&{\widehat{\alpha }}_N&{\widehat{\gamma }}_1&{\widehat{\gamma }}_2&\cdots&{\widehat{\gamma }}_M \end{bmatrix}\). Combine the obtained estimates as follows:

    $$\begin{aligned} {\varvec{\Lambda }}={\varvec{\Gamma }}\begin{bmatrix} \alpha \\ \gamma \\ \mu \end{bmatrix}. \end{aligned}$$
    (9)

    Then the estimate of \((\alpha ^0,\gamma ^0,\mu ^0)^\top \) is \(\Big ({\varvec{\Gamma }}^\top {\varvec{\Gamma }}\Big )^{-1}{\varvec{\Gamma }}^\top {\varvec{\Lambda }}\).

  2. Step-2.

    The estimates of \(\beta ^0\) and \(\delta ^0\) are simply the averages \(\displaystyle {\widehat{\beta }}= \frac{1}{N}\sum _{n=1}^{N}{\widehat{\beta }}_{n}\) and \(\displaystyle {\widehat{\delta }} =\frac{1}{M}\sum _{m=1}^{M}{\widehat{\delta }}_{m}\), respectively.

  3. Step-3.

    After obtaining the estimates of the non-linear parameters, \(\widehat{\mathrm {{\varvec{\xi }}}} = ({\widehat{\alpha }},{\widehat{\beta }},{\widehat{\gamma }},{\widehat{\delta }}, {\widehat{\mu }})^\top \), the amplitude parameter estimates are given as follows, with \({\widehat{\phi }} = {\widehat{\alpha }} m+{\widehat{\beta }} m^2+{\widehat{\gamma }} n+{\widehat{\delta }} n^2+{\widehat{\mu }} mn\):

$$\begin{aligned} \begin{bmatrix} {\widehat{A}}\\ {\widehat{B}} \end{bmatrix} =\begin{bmatrix} \frac{2}{{MN}}\displaystyle \sum _{m=1}^{M}\sum _{n=1}^{N}y(m,n)\cos {\widehat{\phi }} \\ \frac{2}{{MN}}\displaystyle \sum _{m=1}^{M}\sum _{n=1}^{N}y(m,n)\sin {\widehat{\phi }} \end{bmatrix}. \end{aligned}$$
(10)
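The three steps can be sketched end-to-end as follows. The per-column and per-row estimates are idealised (noise-free) stand-ins here, so only the combination logic of Steps 1-3 is exercised; the parameter values are illustrative assumptions, and \(\widehat{\phi }\) is taken as the estimated phase.

```python
import numpy as np

# Idealised (noise-free) 1D estimates with illustrative parameter values.
A0, B0 = 1.0, 0.5
alpha0, beta0, gamma0, delta0, mu0 = 0.4, 0.1429, 0.25, 0.1250, 0.1667
M = N = 50

alpha_hat = alpha0 + np.arange(1, N + 1) * mu0   # estimates of alpha0 + n*mu0
gamma_hat = gamma0 + np.arange(1, M + 1) * mu0   # estimates of gamma0 + m*mu0
beta_hat_n = np.full(N, beta0)                   # per-column chirp-rate estimates
delta_hat_m = np.full(M, delta0)                 # per-row chirp-rate estimates

# Step 1: regress the stacked frequency estimates on the design matrix of (9).
Gamma = np.zeros((N + M, 3))
Gamma[:N, 0] = 1.0
Gamma[N:, 1] = 1.0
Gamma[:N, 2] = np.arange(1, N + 1)
Gamma[N:, 2] = np.arange(1, M + 1)
Lam = np.concatenate([alpha_hat, gamma_hat])
alpha_est, gamma_est, mu_est = np.linalg.lstsq(Gamma, Lam, rcond=None)[0]

# Step 2: average the repeated chirp-rate estimates.
beta_est, delta_est = beta_hat_n.mean(), delta_hat_m.mean()

# Step 3: plug the estimated phase into (10) for the amplitudes.
m = np.arange(1, M + 1)[:, None]
n = np.arange(1, N + 1)[None, :]
phase0 = alpha0 * m + beta0 * m**2 + gamma0 * n + delta0 * n**2 + mu0 * m * n
Y = A0 * np.cos(phase0) + B0 * np.sin(phase0)    # noise-free data for the check
phi = (alpha_est * m + beta_est * m**2 + gamma_est * n
       + delta_est * n**2 + mu_est * m * n)
A_est = 2 / (M * N) * np.sum(Y * np.cos(phi))
B_est = 2 / (M * N) * np.sum(Y * np.sin(phi))
```

With exact 1D inputs, Step 1 recovers \((\alpha ^0,\gamma ^0,\mu ^0)\) exactly, and the averaged sums in Step 3 return the amplitudes up to the small oscillatory remainder of the chirp sums.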

3 Theoretical results

In this section, we first state explicitly the model assumptions required to derive the theoretical asymptotic properties. These are as follows:

Assumption 1

\(X(m,n)\) can be expressed as a linear combination of a double array sequence of independently and identically distributed (i.i.d.) random variables \(\{\epsilon (m,n)\}\) with mean 0, variance \(\sigma ^2\), and finite fourth moment:

$$\begin{aligned} X(m,n) = \displaystyle {\sum _{i=-\infty }^{\infty }\sum _{j=-\infty }^{\infty }}a(i,j)\epsilon (m-i,n-j), \end{aligned}$$
(11)

such that

$$\begin{aligned} \displaystyle {\sum _{i=-\infty }^{\infty }\sum _{j=-\infty }^{\infty }}|a(i,j)|<\infty . \end{aligned}$$
(12)

Assumption 2

The true parameter \( \mathrm {{\varvec{\theta }}}^0 = (A^0,B^0,\alpha ^0,\beta ^0,\gamma ^0,\delta ^0,\mu ^0)^\top \) is an interior point of the parameter space \(\Theta = (-K,K)\times (-K,K)\times [0,2\pi ]\times [0,\pi /2]\times [0,2\pi ]\times [0,\pi /2]\times [0,2\pi ]\) for some \(K>0\), and \(A^{0^2}+B^{0^2}>0\).

Assumption 1 places the model under a very general set-up of noise contamination, as it allows dependence in the errors. Assumption 2 ensures the absence of any identifiability problem and a non-zero deterministic part of the signal. Under these general assumptions, we have derived the strong consistency and asymptotic normality of the estimators, stated in the following theorems.

Theorem 1

Under assumptions 1 and 2, the proposed estimator of parameter \(\mathrm {{\varvec{\theta }}}^0\) is strongly consistent, i.e.,

$$\begin{aligned} \widehat{\mathrm {{\varvec{\theta }}}}\xrightarrow {a.s.}\mathrm {{\varvec{\theta }}}^0 \hbox { as } \min \{M,N\}\xrightarrow {}\infty . \end{aligned}$$

Proof

Please see Appendix A for the proof. \(\square \)

Theorem 2

Under assumptions 1 and 2, the proposed estimator of \(\mathrm {{\varvec{\theta }}}^0\) is asymptotically normally distributed:

$$\begin{aligned} {\varvec{D}}^{-1}(\widehat{ \mathrm {{\varvec{\theta }}}} - \mathrm {{\varvec{\theta }}}^0) \xrightarrow {d} {\mathcal {N}}_7(0,\displaystyle {\varvec{\Sigma }})\hbox { as } M=N\rightarrow \infty , \end{aligned}$$

where \(c = \displaystyle {\sum _{i=-\infty }^{\infty }\sum _{j=-\infty }^{\infty }}a(i,j)^2\),   \({\varvec{D}}^{-1} = diag(M^{\frac{1}{2}}N^{\frac{1}{2}}, M^\frac{1}{2}N^{\frac{1}{2}}, M^{\frac{3}{2}}N^{\frac{1}{2}}, M^{\frac{5}{2}}N^{\frac{1}{2}}, M^{\frac{1}{2}}N^{\frac{3}{2}}, M^{\frac{1}{2}}N^{\frac{5}{2}}, M^{\frac{3}{2}}N^{\frac{3}{2}} )\), and

$$\begin{aligned}{} & {} {\varvec{\Sigma }}= \frac{c\sigma ^2}{(A^{0^2}+B^{0^2})}\\{} & {} \quad \begin{bmatrix} 2A^{0^2}+187B^{0^2}&{}-185A^0B^0&{}-378B^0&{}60B^0&{}-378B^0&{}60B^0&{}612B^0\\ -185A^0B^0&{}2B^{0^2}+187A^{0^2}&{} 378A^0&{}-60A^0&{}378A^0&{}-60A^0&{}-612A^0\\ -378B^0 &{}378A^0 &{}996&{}-360&{}612&{}0&{}-1224\\ 60B^0&{}-60A^0 &{}-360&{}360&{}0&{}0&{}0\\ -378B^0 &{}378A^0 &{}612&{}0&{}996&{}-360&{}-1224\\ 60B^0&{}-60A^0 &{}0&{}0&{}-360&{}360&{}0\\ 612B^0 &{}-612A^0 &{}-1224&{}0&{}-1224&{}0&{}2448 \end{bmatrix}. \end{aligned}$$

Here \(diag(a_1, a_2,\ldots , a_k)\) represents \(k\times k\) diagonal matrix with elements \(a_1, a_2,\ldots , a_k\) in the principal diagonal and \({\mathcal {N}}_k({\varvec{{\mathcal {M}}}},{\varvec{{\mathcal {S}}}})\) represents the k-variate normal distribution with mean vector \({\varvec{{\mathcal {M}}}}\) and variance-covariance matrix \({\varvec{{\mathcal {S}}}}\).

Proof

Please see Appendix B for the proof. \(\square \)

Although Theorem 2 has been proved for increasing sample sizes with \(M=N\rightarrow \infty \), asymptotic normality still holds if \(M/N\rightarrow p \) as \(M,N\rightarrow \infty \), for some \(p>0\). It is interesting to note that the asymptotic properties of the proposed estimators of the chirp rates \(\beta ^0,\delta ^0 \) remain the same even if we only require \( \min \{M,N\}\rightarrow \infty \). The asymptotic variance-covariance matrix of the estimators of \((\alpha ^0,\gamma ^0,\mu ^0)\) will, however, change depending on the value of \(p\).

If we further assume that the errors in (1) are i.i.d. Gaussian random variables, then the proposed estimators of the chirp rate parameters \(\beta ^0\) and \(\delta ^0 \) asymptotically attain the Cramér-Rao lower bound (CRLB). The CRLBs for the estimators of the other non-linear parameters \(\alpha ^0\), \(\gamma ^0\), and \(\mu ^0\) are \( \frac{456c\sigma ^2}{(A^{0^2}+B^{0^2})}\), \( \frac{456c\sigma ^2}{(A^{0^2}+B^{0^2})}\), and \( \frac{288c\sigma ^2}{(A^{0^2}+B^{0^2})} \), respectively; see Lahiri and Kundu (2017).

4 Simulation results

The simulation studies performed in this paper are divided into three parts. The first part evaluates the finite sample performance of the proposed estimators; we compare them with the asymptotically optimal estimators, such as the LSEs and ALSEs, and with the fast but sub-optimal 2D-multilag-HAF estimators. The second part demonstrates the lower computational cost of the proposed estimators compared to the LSEs. Finally, the third part exemplifies the ability of the proposed estimators to recover an original gray-scale texture from a noise-contaminated one. We have performed the simulations on the complex counterpart of model (1) (see Barbarossa, 2014) for comparison purposes.

4.1 Finite sample performance

To provide a detailed assessment of the performance of the proposed estimators, we have chosen sample sizes \(M=N=20,40,60,80\), and 100. The fixed parameter values used to generate the complex-valued chirp data are:

$$\begin{aligned} A^0= 1, \alpha ^0 =0.4, \beta ^0=0.1429, \gamma ^0 =0.25, \delta ^0 = 0.1250, \mu ^0 = 0.1667. \end{aligned}$$
(13)

The data obtained from the model are then contaminated with noise \(X(m,n)\). We consider two distinct noise structures in our simulations:

  • Independently and identically distributed (i.i.d.) normal errors with mean 0 and variance \(\sigma ^2\);

  • Autoregressive moving average (ARMA) errors with following representation:

    $$\begin{aligned} X(m,n) =\;&0.06X(m-1,n-1)-0.054X(m,n-1)+0.087X(m-1,n)\nonumber \\&+\epsilon (m,n)+0.01\epsilon (m-1,n-1)+0.035\epsilon (m,n-1)+0.042\epsilon (m-1,n), \end{aligned}$$
    (14)

    where \(\epsilon (m,n)\) is a sequence of i.i.d. Gaussian random variables with mean 0 and variance \(\sigma ^2\).
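The ARMA error field (14) can be simulated recursively. The sketch below is one straightforward way to do it; the burn-in margin is an assumed device (not part of (14)) used to wash out the zero boundary conditions before the field is returned.

```python
import numpy as np

def arma_field(M, N, sigma, rng, burn=20):
    """Simulate the 2D ARMA error field (14) on an enlarged grid and
    drop a burn-in margin so that edge effects are negligible."""
    P, Q = M + burn, N + burn
    eps = sigma * rng.standard_normal((P, Q))   # i.i.d. Gaussian innovations
    X = np.zeros((P, Q))
    for i in range(1, P):
        for j in range(1, Q):
            X[i, j] = (0.06 * X[i - 1, j - 1] - 0.054 * X[i, j - 1]
                       + 0.087 * X[i - 1, j] + eps[i, j]
                       + 0.01 * eps[i - 1, j - 1] + 0.035 * eps[i, j - 1]
                       + 0.042 * eps[i - 1, j])
    return X[burn:, burn:]
```

The AR coefficients are small in absolute value, so the recursion is stable and the field is approximately stationary away from the discarded boundary.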

We have obtained the estimates over 1000 replications for each fixed sample size \(M=N\), under a particular error structure with fixed \(\sigma ^2\). The estimators do not have an explicit closed form expression, so we use the Nelder-Mead algorithm (the “optim” function in R) to optimize the objective function and obtain the estimators. The mean square errors (MSEs) obtained over the 1000 replications are displayed in Figs. 1a, b, 2a and b for four different values of \(\sigma \): 0.1, 0.5, 0.9, and 1. The MSEs are plotted on a negative logarithmic scale. The findings of these simulation results can be summarized as follows:

  • MSEs of the proposed estimators decrease rapidly as the sample size increases, which supports the consistency of the estimators. Further, as the sample size increases, the gap between the MSEs of the proposed estimators and those of the LSEs decreases.

  • The MSEs of the proposed estimators of \(\beta ^0\) and \(\delta ^0\) are on par with those of the LSEs and ALSEs.

4.2 Time comparison

The computational advantage of the proposed estimators over the conventional LSEs is quite significant. To compare the two methods, we measure their computational complexities in terms of the number of grid points needed to find the initial guesses for these estimators. Once precise initial guesses are available, applying an iterative algorithm like Nelder-Mead takes only seconds. The “gridSearch” function from the R package “NMOF” is used to calculate the initial guesses. We report the observed time to obtain the estimates for a fixed sample size, along with the total number of grid points over which cost function evaluations are required, in Table 1. The choice of parameters is the same as in (13), with i.i.d. normal errors with mean 0 and standard deviation \(\sigma =0.9\). For a fixed sample size \(M=N\), the order of computation for the LSEs is \(M^4N^4=M^8\), whereas for the proposed method it is \(M^3N+N^3M=2M^4\). The numerical experiments comparing time efficiency were performed on a system with processor: Intel(R) Core(TM) i3-5005U CPU @ 2.00GHz; installed memory (RAM): 4.00 GB; system type: 64-bit operating system. Codes were written and run in R version 4.0.4 (2021-02-15), “Lost Library Book”, a free software environment for statistical computing and graphics.
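The grid-point counts behind this comparison can be tabulated directly. The sketch below simply evaluates the stated orders \(M^4N^4\) and \(M^3N+N^3M\); the per-parameter grid sizes in the comments are our reading of the usual resolution requirements, not figures taken from Table 1.

```python
def lse_grid(M, N):
    # one grid point per resolution cell for (alpha, beta, gamma, delta, mu):
    # sizes M, M^2, N, N^2, and M*N respectively  ->  M^4 * N^4 points
    return M * M**2 * N * N**2 * (M * N)

def proposed_grid(M, N):
    # N columnwise 1D searches of M * M^2 points each, plus
    # M rowwise 1D searches of N * N^2 points each  ->  2 * M^4 when M = N
    return N * M**3 + M * N**3

# the advantage grows like M^4 / 2 for square data matrices
ratio = lse_grid(40, 40) / proposed_grid(40, 40)
```

For \(M=N=40\) the ratio is \(40^4/2\), i.e., over a million times fewer cost function evaluations, consistent with the years-versus-minutes gap reported below.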

Fig. 1
figure 1

Plots of \(-\log (\hbox {MSEs})\) versus the sample size for estimators of non-linear parameters

Fig. 2
figure 2

Plots of \(-\log (\hbox {MSEs})\) versus the sample size for estimators of non-linear parameters

Table 1 Time and number of grid points taken to compute the estimates

For the considered machine, it was not feasible to perform the grid-search for the LSEs beyond \(M=N=7\). When the logarithm of the sample size \(M(=N)\) is plotted against the logarithm of the time taken to compute the initial guesses for the LSEs, the relationship is observed to be linear. So, to gauge how the time for the LSEs deviates from that of the proposed estimates at larger sample sizes, we predict the time to obtain the LSEs via grid-search by fitting a simple linear regression of the \(\log \) of the time on the \(\log \) of the sample size. From the results in Table 2, we can clearly observe the massive time difference between obtaining the proposed estimates and the LSEs. For example, for a sample size of \(M=N=40\), it would take more than 20 years to obtain the LSEs using grid-search on the same machine (even assuming a machine with a large amount of RAM), while it took less than 10 min to obtain the proposed estimates.

Table 2 Comparing the time efficiency of the proposed estimators with the predicted time for the LSEs

4.3 Texture pattern estimation

2D chirp signals create interesting gray-scale texture patterns. To analyze the effectiveness of the proposed estimators in estimating texture patterns accurately, we generate data from the complex counterpart of the model with the same set of parameters as in (13). The real and imaginary parts of the obtained data are then contaminated independently with i.i.d. normal errors having mean 0 and variance \(\sigma ^2=0.09\). The data matrix obtained is of size \(100\times 100\). We analyze this data using the proposed estimators, the optimal LSEs, the ALSEs, and the 2D-multilag-HAF method. Note that, because of the computational complexity, we have used the true values as initial guesses for the optimal estimators, the LSEs and ALSEs, whereas the grid-search method is used to obtain the 2D-multilag-HAF estimators and the proposed estimators.

Plugging these estimates into the deterministic part of the model, we obtain the estimated texture patterns as the real part of the reproduced data. We also present the real part of the original dataset to compare the original and estimated textures. It is clear from the figures that the texture pattern obtained using the proposed method is visually the same as those obtained using the optimal LSEs and ALSEs, while the 2D-multilag-HAF estimator gives a slightly different pattern from the original one (Fig. 3).

Fig. 3
figure 3

A comparison of estimated textures using LSEs, ALSEs, 2D-multilag-HAF and the proposed estimators

5 Conclusion

This paper proposes computationally efficient estimators with the same convergence rates as the LSEs or ALSEs. The key idea is to disintegrate the 2D model into several 1D chirp models and then design an optimal estimation method to obtain the estimates of the 2D model parameters. The proposed estimators are not only asymptotically unbiased but also asymptotically normally distributed, with the same rate of convergence as the LSEs; furthermore, they converge strongly to the true values of the parameters. Extensive numerical simulations firmly support the theoretical results and also reveal the enormous gap between the times required to obtain the proposed estimates and the LSEs. The synthetic data analysis illustrates the effectiveness of the proposed estimators in recovering 2D gray-scale textures contaminated with noise. Several applications, such as enhancing the robustness of watermarking against common attacks, e.g., cropping, rotation, and compression, require multi-component chirp signals as the watermark base; see Stankovic et al. (2001). The proposed method can effectively be used to address such problems. Further, our methodology can be adapted for parameter estimation of more general signal models by utilizing a sequential procedure similar to Grover et al. (2021). We believe that the estimation technique and the results obtained in this paper can lead to computationally efficient algorithms for estimating higher-order polynomial phase signals, thus making a significant contribution to research in this area.