1 Introduction

Fractional calculus has attracted increasing interest over the last decades and has become a powerful tool for the compact modeling of real dynamical processes with hereditary characteristics. Such processes are encountered in diffusive phenomena, biological processes, electrical systems, etc. [1,2,3].

The identification of fractional order systems has been an active research area, and most studies have focused on the linear case [4,5,6,7,8,9]. However, most real-world systems are nonlinear to some degree, and their modeling and identification is still an open research topic owing to the diversity of their structures [10,11,12]. Within the class of nonlinear models, block-oriented structures have gained wide recognition in the system identification community since they allow the description of a wide spectrum of real nonlinear processes [13,14,15]. Typically, these models are built by joining linear dynamic subsystems with static nonlinear blocks in various forms of interconnection (Hammerstein, Wiener, Hammerstein–Wiener, etc.).

The Hammerstein model consists of a static nonlinear part followed by a linear block [16]; this structure may represent a linear system in the presence of a nonlinear actuator or other nonlinear effects [17]. Its applications include pH neutralization systems [14], biological processes [15] and heat exchangers [18]. This study addresses the identification of fractional nonlinear Hammerstein systems, in which the linear subsystem is of fractional order.

Different methods have been proposed in the literature for the identification of classical integer order block-structured systems [19,20,21,22], and only a few papers have considered the fractional order case [23,24,25,26,27,28,29]. However, in these studies, the fractional orders were often assumed to be known a priori and only parametric estimation was performed. Other studies were limited to the particular case of commensurate order systems, where the fractional orders are multiples of the same value.

In [25], the parametric identification of a Hammerstein system described by a fractional recurrence equation is performed using an iterative method, while in [26], the Levenberg–Marquardt (L–M) algorithm is used to estimate the parameters of a Hammerstein system whose linear part is a fractional transfer function, with the fractional orders assumed to be known. A subspace identification method based on instrumental variables is reported in [24], and an iterative linear optimization algorithm combined with a Lyapunov method is used in [23]. However, both studies consider the commensurate case only.

The main contribution of this paper is the identification of a fractional Hammerstein nonlinear system in which both the parameters and the fractional orders are estimated. Moreover, the general non-commensurate order case, where the orders are totally independent, is addressed. The Hammerstein linear part is described by a fractional state-space model, and the system is first converted to the polynomial nonlinear state-space (PNLSS) model. This avoids the coupled cross products of the linear part and nonlinear part parameters that occur in input/output (I/O) models.

In this way, the optimization problem has a better conditioning and the computational effort is reduced.

The PNLSS model identification is based on a nonlinear optimization method, namely the Levenberg–Marquardt (L–M) algorithm, which combines the gradient descent and the Gauss–Newton methods. It requires the calculation of sensitivity functions, which may be laborious and sometimes complex depending on the chosen model. For this purpose, a novel multivariable fractional PNLSS model is developed for the implementation of the parametric sensitivity functions, which reduces the computational load of the method.

Numerical simulations are used to test the effectiveness of the approach and the estimator statistical properties are analyzed using Monte Carlo simulation. Finally, the method’s performance is validated on a heating benchmark.

The remainder of this paper is organized as follows. In Sect. 2, some preliminary notions on fractional calculus are recalled. Section 3 presents the fractional Hammerstein model based on the PNLSS description. Section 4 formulates the identification method, where the calculation of the parametric sensitivity functions is developed. The simulation results of academic examples and the experimental heating benchmark are reported in Sect. 5, and finally, the main conclusions along with some perspectives are outlined in Sect. 6.

2 Mathematical background on fractional calculus

Fractional calculus is a powerful tool applied in control and in the modeling of many physical processes. It generalizes the differential operator \(\dfrac{\mathrm{d}}{\mathrm{d}t}\) to the fractional order differintegral operator \(_{t_{0}}D{_{t}}^{\tilde{\alpha }}\), where \(\tilde{\alpha }\) is the non-integer (fractional) order and \(t_0\) is the initial time.

Different definitions of this fractional order operator have been proposed in the literature [30,31,32,33,34]. In this paper, the Grünwald–Letnikov (GL) definition, which is useful when dealing with fractional discrete systems, will be used [2, 30]. It is given as follows:

$$\begin{aligned} \varDelta ^{\tilde{\alpha }}f(kh)=\dfrac{1}{h^{\tilde{\alpha }}}\sum \limits _{j=0}^{k}(-1)^{j} \left( {\begin{array}{c}\tilde{\alpha }\\ j\end{array}}\right) f((k-j)h) \end{aligned}$$
(1)

where \(\varDelta ^{\tilde{\alpha }}\) denotes the fractional order difference operator, f(kh) is a discrete function, h is the sampling interval which is assumed to be equal to 1 and k is the number of samples.

\(\left( {\begin{array}{c}\tilde{\alpha }\\ j\end{array}}\right) \) is the binomial term given by the following relation

$$\begin{aligned} \left( {\begin{array}{c}\tilde{\alpha }\\ j\end{array}}\right) =\left\{ \begin{array}{ccl} 1&{}\quad \hbox {for} \,\, j=0\\ \frac{\tilde{\alpha }(\tilde{\alpha }-1)\ldots (\tilde{\alpha }-j+1)}{j!} &{}\quad \hbox {for} \,\,j>0 \end{array} \right. \end{aligned}$$
(2)
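As an illustration, Eqs. (1) and (2) can be evaluated directly. The following is a minimal sketch, not an optimized implementation; the function names `gl_binomial` and `gl_difference` are hypothetical and chosen only for this example.

```python
import math
import numpy as np

def gl_binomial(alpha, j):
    """Generalized binomial coefficient of Eq. (2) for a real order alpha."""
    if j == 0:
        return 1.0
    num = 1.0
    for m in range(j):
        num *= (alpha - m)          # alpha (alpha-1) ... (alpha-j+1)
    return num / math.factorial(j)

def gl_difference(f, alpha, h=1.0):
    """Grunwald-Letnikov difference of Eq. (1), evaluated at every sample k."""
    f = np.asarray(f, dtype=float)
    out = np.zeros_like(f)
    for k in range(len(f)):
        acc = 0.0
        for j in range(k + 1):
            acc += (-1) ** j * gl_binomial(alpha, j) * f[k - j]
        out[k] = acc / h ** alpha
    return out
```

For an order equal to 1 the sum collapses to the ordinary backward difference f(k) - f(k-1), and for an order equal to 0 it returns the sequence itself, which gives two quick sanity checks of the definition.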

A fractional order linear system can be represented as in the integer case, based on different models such as the differential equation, the transfer function, and the state-space model.

The discrete fractional order state-space model is described by two equations:

$$\begin{aligned} \begin{array}{rcl} &{}&{}\varDelta ^{\alpha }x(k+1)=Ax(k)+B u(k)\\ &{}&{}y(k)=Cx(k)+Du(k)\\ \end{array} \end{aligned}$$
(3)

where \(x(k) \in \mathbb {R}^n\) is the state vector, u(k) and \(y(k) \in \mathbb {R}\) are, respectively, the input and the output of the system; \(A\in \mathbb {R}^{n\times n}\), \(B\in \mathbb {R}^{n\times 1}\), \(C\in \mathbb {R}^{1\times n}\) and \(D\in \mathbb {R}^{1}\) are the system matrices.

$$\begin{aligned} \varDelta ^{\alpha }x= & {} [\varDelta ^{\alpha _{1}}x_1\quad \varDelta ^{\alpha _{2}}x_2\quad \cdots \quad \varDelta ^{\alpha _{n}}x_n]^{\mathrm{T}} \quad \in \mathbb {R}^n \end{aligned}$$
(4)

is the vector of fractional differences of the state variables. For non-commensurate systems, the components of the fractional order vector \(\alpha \) are different, with:

$$\begin{aligned} \alpha =[\,\alpha _1\quad \alpha _2\quad \cdots \quad \alpha _n] \end{aligned}$$

In the particular case of commensurate order systems, the fractional orders are multiples of the same value \(\tilde{\alpha } \), and the state variables are all differentiated to the order \(\tilde{\alpha } \), with:

$$\begin{aligned} \tilde{\alpha } =\quad \alpha _1= \quad \alpha _2 = \quad \cdots = \quad \alpha _n \end{aligned}$$

and

$$\begin{aligned} \varDelta ^{\tilde{\alpha }}x(k+1)= & {} \varDelta ^{\tilde{\alpha }}[x_1(k+1)\quad \cdots \quad x_n(k+1)]^{\mathrm{T}} \end{aligned}$$
(5)

The simulation of the fractional model (3) based on the GL difference can be performed using the following equations [35]:

$$\begin{aligned} \begin{array}{rcl} &{}&{}\varDelta ^{\alpha }x(k+1)=Ax(k)+B u(k)\\ &{}&{}x(k+1)=\varDelta ^{\alpha }x(k+1)\\ &{}&{}-\sum \limits _{j=1}^{k+1}(-1)^{j} \left( {\begin{array}{c}\alpha \\ j\end{array}}\right) x(k+1-j)\\ &{}&{}y(k)=Cx(k)+Du(k)\\ \end{array} \end{aligned}$$
(6)

Equation (6) is rewritten as

$$\begin{aligned} \begin{array}{rcl} &{}&{}\varDelta ^{\alpha }x(k+1)=Ax(k)+B u(k)\\ &{}&{}x(k+1)=\varDelta ^{\alpha }x(k+1)-\sum \limits _{j=1}^{k+1}\beta (j)x(k+1-j)\\ &{}&{}y(k)=Cx(k)+Du(k)\\ \end{array} \end{aligned}$$
(7)

where

$$\begin{aligned}&\beta (j)= \mathrm{diagonal}\,[\beta _{i}(j)] \quad \mathrm{for} \quad i=1,2,\cdots ,n \end{aligned}$$
(8)
$$\begin{aligned}&\beta _{i}(j)= (-1)^{j} \left( {\begin{array}{c}\alpha _{i}\\ j\end{array}}\right) , \end{aligned}$$
(9)

and the recurrence equation is:

$$\begin{aligned}&\beta _i(0)=1\nonumber \\&\beta _{i}(j)=\beta _{i}(j-1)\,\frac{j-1-\alpha _{i}}{j}\quad \mathrm{for} \quad j=1,\ldots ,k+1 \end{aligned}$$
(10)

The above procedure will be used for the simulation of the nonlinear fractional system presented in this paper.
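A minimal sketch of this simulation procedure is given below, assuming a unit sampling interval, a zero initial state and a single input. The weights follow the definition of Eq. (9) and are generated recursively through \(\beta_i(j)=\beta_i(j-1)(j-1-\alpha_i)/j\), which follows from the binomial term of Eq. (2). The names `gl_weights` and `simulate_fractional_ss` are hypothetical.

```python
import numpy as np

def gl_weights(alpha, n_terms):
    """beta_i(j) = (-1)^j binom(alpha_i, j), j = 0..n_terms, one column per state."""
    alpha = np.atleast_1d(np.asarray(alpha, dtype=float))
    beta = np.ones((n_terms + 1, alpha.size))
    for j in range(1, n_terms + 1):
        beta[j] = beta[j - 1] * (j - 1 - alpha) / j
    return beta

def simulate_fractional_ss(A, B, C, D, alpha, u):
    """Simulate Eq. (7) with x(0) = 0 and h = 1.

    A: (n, n), B: (n,), C: (n,), D: scalar, alpha: scalar or (n,), u: (K,).
    Returns the states x(0..K) and the outputs y(0..K-1).
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    n, K = A.shape[0], len(u)
    alpha = np.broadcast_to(np.asarray(alpha, dtype=float), (n,))
    beta = gl_weights(alpha, K)
    x = np.zeros((K + 1, n))
    y = np.zeros(K)
    for k in range(K):
        y[k] = C @ x[k] + D * u[k]
        dx = A @ x[k] + B * u[k]                 # Delta^alpha x(k+1)
        # memory term: sum_{j=1}^{k+1} beta(j) x(k+1-j)
        memory = np.sum(beta[1:k + 2] * x[k::-1], axis=0)
        x[k + 1] = dx - memory
    return x, y
```

With all orders equal to 1, only the weight of index 1 is nonzero and the recursion reduces to the integer order model x(k+1) = (I + A)x(k) + Bu(k), which is a convenient check. Note that the memory sum grows with k, so the cost of this plain version is quadratic in the data length.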

3 Problem setting

Consider a fractional SISO Hammerstein system shown in Fig. 1, which consists of a static nonlinear block followed by a fractional linear block.

Fig. 1 Hammerstein system

An output-error framework is considered, where the linear part is described by a fractional state-space model and a controllable form is assumed:

$$\begin{aligned} \begin{array}{rcl} &{}&{}\varDelta ^{\alpha }x(k+1)=A_0x(k)+B_0 \tilde{u}(k)\\ &{}&{}\tilde{y}(k)=C_0x(k)+D_0\tilde{u}(k)\\ \end{array} \end{aligned}$$
(11)

where \(x(k) \in \mathbb {R}^n\), \(\tilde{u}(k)\), and \(\tilde{y}(k)\) are, respectively, the state vector, the input, and the noise-free output of the linear part; \(\tilde{u}(k)\) is an internal variable, namely the output of the nonlinear part.

The nonlinear part is assumed to be a polynomial of order r [36] with unknown coefficients \(p_i\) \((i=1,2,\cdots ,r)\):

$$\begin{aligned} \tilde{u}(k)=f(u(k))=\sum \limits ^{r}_{i=1}p_{i}u^{i}(k) \end{aligned}$$
(12)

The system overall output y(k) in presence of noise v(k) is as follows:

$$\begin{aligned} y(k)=\tilde{y}(k)+v(k) \end{aligned}$$
(13)

Substituting Eq. (12) into Eq. (11) yields the Hammerstein model equations:

$$\begin{aligned}&\displaystyle \varDelta ^{\alpha }x(k+1)=A_0x(k)+B_0 \sum \limits ^{r}_{i=1}p_{i}u^{i}(k)\nonumber \\&\displaystyle y(k)=C_0x(k)+D_0\sum \limits ^{r}_{i=1}p_{i}u^{i}(k)+v(k) \end{aligned}$$
(14)

Separating the linear term \(p_{1}u(k)\) from the terms of higher degree gives:

$$\begin{aligned} \begin{array}{rcl} \varDelta ^{\alpha }x(k+1)&{}=&{}A_0x(k)+B_0p_{1}u(k) \\ &{}&{} +\,B_{0}\sum \limits ^{r}_{i=2}p_{i}u^{i}(k)\\ y(k)&{}=&{}C_0x(k)+D_0p_{1}u(k)+D_0\sum \limits ^{r}_{i=2}p_{i}u^{i}(k)\\ &{}&{}+\,v(k) \end{array} \end{aligned}$$
(15)

Without loss of generality, the coefficient \(p_1\) of the nonlinear block polynomial can be normalized and set equal to 1. The obtained model of Eq. (15) is a state-space model containing nonlinear elements of polynomial type; thus, the polynomial nonlinear state-space model, defined for the integer case in [36], can be extended to the fractional case. It is expressed as follows:

$$\begin{aligned} \varDelta ^{\alpha }x(k+1)= & {} Ax(k)+Bu(k)+E\eta (k)\nonumber \\ y(k)= & {} Cx(k)+Du(k)+F\zeta (k) \end{aligned}$$
(16)

where the regular linear part of this state-space model is described by the matrices A, B, C and D; the matrices \(E\in \mathbb {R}^{n\times n_{\eta }}\) and \(F\in \mathbb {R}^{1\times n_{\zeta }}\) contain the coefficients associated with the nonlinear terms. The vectors \(\zeta (k)\) and \(\eta (k)\) contain the monomials in u(k) and x(k) of degree 2 up to r.

Let us derive the PNLSS model for the fractional Hammerstein system. The link between Eq. (15) and the PNLSS model of Eq. (16) is as follows:

$$\begin{aligned} A= & {} A_{0} \,\quad B = B_{0}\,\quad C = C_{0} \,\quad D = D_{0} \end{aligned}$$
(17)
$$\begin{aligned} E= & {} [p_{2}B_{0} \cdots p_{r}B_{0}]\,\quad F = [p_{2}D_{0}\cdots p_{r}D_{0}] \end{aligned}$$
(18)

The vectors \(\zeta (k)\) and \(\eta (k)\) in this Hammerstein case are equal, and they contain the monomials in u(k) of degree 2 up to r as follows:

$$\begin{aligned} \zeta (k)\,=\,\eta (k)= \left[ \,u^2(k)\quad u^3(k)\cdots \quad u^{r}(k)\,\right] ^{\mathrm{T}} \end{aligned}$$
(19)

The PNLSS model for the fractional Hammerstein system of Fig. 1 is:

$$\begin{aligned} \varDelta ^{\alpha }x(k+1)= & {} Ax(k)+Bu(k)+E\eta (k)\nonumber \\ y(k)= & {} Cx(k)+Du(k)+F\eta (k)+v(k) \end{aligned}$$
(20)

The Hammerstein system has been transformed into the nonlinear fractional PNLSS model, whose matrices explicitly provide the parameters of the original model. Indeed, its matrices A, B, C and D are equal to the matrices of the original Hammerstein linear part, while the matrices E and F contain the nonlinear part coefficients \(p_i\), with redundancy. The advantage of this model is a better parameterization than that of the original one, since the parameters of its two subsystems are explicitly separated.
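The mapping of Eqs. (17)–(19) can be sketched in a few lines. This is an illustrative construction only, assuming \(p_1=1\) and a single input; the helper names `hammerstein_to_pnlss` and `monomials` are hypothetical.

```python
import numpy as np

def hammerstein_to_pnlss(A0, B0, C0, D0, p):
    """Build the PNLSS matrices of Eqs. (17)-(18).

    p = [p_1, p_2, ..., p_r] with p_1 = 1 (normalized nonlinearity).
    Returns A, B, C, D, E (n x (r-1)) and F ((r-1,)).
    """
    A = np.asarray(A0, dtype=float)
    B = np.asarray(B0, dtype=float).ravel()
    C = np.asarray(C0, dtype=float).ravel()
    D = float(D0)
    p = np.asarray(p, dtype=float)
    E = np.outer(B, p[1:])      # columns p_2 B0, ..., p_r B0
    F = D * p[1:]               # entries p_2 D0, ..., p_r D0
    return A, B, C, D, E, F

def monomials(u, r):
    """eta(k) = zeta(k) = [u^2, ..., u^r]^T of Eq. (19) for a scalar u."""
    return np.array([u ** i for i in range(2, r + 1)], dtype=float)
```

By construction, \(Bu(k)+E\eta(k)\) equals \(B_0\tilde{u}(k)\) and \(Du(k)+F\eta(k)\) equals \(D_0\tilde{u}(k)\), so the PNLSS model reproduces the Hammerstein equations exactly while keeping the linear and nonlinear parameters in separate matrices.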

As reported in the literature, the drawbacks in the identification of nonlinear systems are the growing number of parameters of the transformed models (over-parameterized models) and the occurrence of cross-coupled parameters of the nonlinear and linear parts. In this study, the use of the PNLSS model avoids these difficulties and ensures a better conditioning of the nonlinear Hammerstein identification problem.

The objective of this work is the identification of the fractional Hammerstein system described by the PNLSS model. For this purpose, a nonlinear optimization approach based on Levenberg–Marquardt is developed in the next section.

4 Identification method

Consider the Hammerstein fractional order system with its linear part assumed to be completely observable and controllable with matrices:

$$\begin{aligned} \begin{array}{rcl} A_{0}&{}=&{}\left[ \begin{array}{ccccc} 0 &{} 1 &{}0&{} \dots &{} 0\\ \vdots &{} &{} &{} &{} \vdots \\ 0 &{} 0 &{} \dots &{} 0 &{} 1\\ a_{1}&{} a_{2}&{} \dots &{}a_{n-1}&{} a_{n} \end{array} \right] \quad \quad B_{0}= \left[ \begin{array}{ccccc} 0\\ \vdots \\ 0\\ 1 \end{array} \right] \\ {} &{}\\ C_{0}&{}=&{} \left[ \begin{array}{ccccc} \;c_1\quad c_2 \quad \ldots \quad c_n\; \end{array} \right] \quad \quad D_0=[d] \end{array} \end{aligned}$$
(21)

and the fractional order vector \(\alpha =[\,\alpha _1\quad \alpha _2\quad \cdots \quad \alpha _n]\). The nonlinear block equation is as follows:

$$\begin{aligned} \tilde{u}(k)=\sum \limits ^{r}_{i=1}p_{i}u^{i}(k) \end{aligned}$$
(22)

The obtained fractional Hammerstein PNLSS model of Eq. (20) exhibits the following matrices:

$$\begin{aligned} \begin{array}{lllll} A\;=\;A_{0} \;\quad B\;=\;B_{0} \;\quad C\;=\;C_{0} \;\quad D\;=\;D_{0}\\ {} &{}\\ E=\left[ \begin{array}{lllll} 0 &{} 0 &{}0&{} \dots &{} 0\\ \vdots &{} &{} &{} &{} \vdots \\ 0 &{} 0 &{} \dots &{} 0 &{} 0\\ p_{2}&{} p_{3}&{} \dots &{}p_{r-1}&{} p_{r} \end{array} \right] \\ {} &{}\\ F=[p_{2}D_{0}\quad \cdots \quad p_{r}D_{0}] \end{array}\end{aligned}$$
(23)

Let \(\theta \) represent the vector of parameters to be estimated, of length \(n_\theta \):

$$\begin{aligned} \theta =[\,\tilde{\theta } \quad \alpha \,]\in \mathbb {R}^{n_\theta } \end{aligned}$$

It includes the Hammerstein model coefficients in \(\tilde{\theta } \) and the fractional orders vector \(\alpha \) with:

$$\begin{aligned}&\tilde{\theta } =[a \quad c \quad d \quad p]\nonumber \\&a=[a_1,\cdots a_n],\quad c=[c_1,\cdots c_n], \nonumber \\ \quad&p=[1,p_2 \cdots ,p_r]. \end{aligned}$$
(24)

For the identification of these parameters, a nonlinear optimization algorithm, Levenberg–Marquardt (L–M), is adopted [37]; it is widely used in nonlinear optimization and offers robust convergence. It is based on the computation of the Gradient and the Hessian, which are calculated using the parametric sensitivity functions. Optimization is performed by minimizing the objective function \(J(\theta )\):

$$\begin{aligned} J(\theta )\,=\dfrac{1}{K}\sum \limits _{k=1}^K\epsilon ^2(k)\quad \hbox {where} \quad \epsilon (k)=y(k)-\hat{y}(k) \end{aligned}$$
(25)

\(\epsilon (k)\) being the prediction error, \(\hat{y}(k)\) the corresponding output estimate, and K the number of samples. The vector \(\theta \) is updated using the following iterative rule:

$$\begin{aligned}&\theta ^{(i+1)}=\theta ^{(i)}-\left\{ \left[ J^{''}+ \lambda I \right] ^{-1}J^{'}\right\} _{\hat{\theta }=\theta ^{(i)}}\nonumber \\&J_\theta ^{'}=-\dfrac{2}{K} \sum \limits _{k=1}^K \epsilon (k) \; \left( \dfrac{\partial \hat{y}(k)}{\partial \theta }\right) \; {\hbox {the Gradient}}\nonumber \\&J_\theta ^{''}= \dfrac{2}{K} \sum \limits _{k=1}^K \left( \dfrac{\partial \hat{y}(k)}{\partial \theta }\right) \left( \dfrac{\partial \hat{y}(k)}{\partial \theta }\right) ^{\mathrm{T}}\;{\hbox {the Hessian}}\nonumber \\&\sigma _{\hat{y} (k)/{\theta }}=\dfrac{\partial \hat{y}(k)}{\partial \theta } \; {\hbox {the output sensitivity function}} \nonumber \\&\lambda \ :\quad {\hbox {a tuning parameter for the convergence}} \end{aligned}$$
(26)

The Gradient and the Hessian of the criterion J are obtained by implementing the model output sensitivity functions \({\sigma _{\hat{y} (k)/{\theta }}}\); the Hessian expression in Eq. (26) is the Gauss–Newton approximation, which retains only the first-order sensitivities. The sensitivity functions are a major indicator of the identification conditioning, and their computation is detailed in what follows.
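A single update of Eq. (26) can be sketched as follows, assuming that the prediction errors and the output sensitivities have already been computed and stacked row by row. The function name `lm_step` and the array layout are assumptions made for this example.

```python
import numpy as np

def lm_step(theta, eps, sens, lam):
    """One Levenberg-Marquardt update following Eq. (26).

    theta: (n_theta,) current estimate
    eps:   (K,) prediction errors y(k) - y_hat(k)
    sens:  (K, n_theta) output sensitivities, row k = d y_hat(k) / d theta
    lam:   damping parameter lambda >= 0
    """
    K = eps.size
    grad = -2.0 / K * sens.T @ eps            # J'
    hess = 2.0 / K * sens.T @ sens            # J'' (Gauss-Newton form)
    step = np.linalg.solve(hess + lam * np.eye(theta.size), grad)
    return theta - step
```

For a model that is linear in the parameters and \(\lambda =0\), this step returns the least-squares solution in one iteration; increasing \(\lambda \) shortens the step and turns it toward the gradient direction.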

4.1 Implementation of sensitivity functions

Let us derive the parametric sensitivity functions related to \(\tilde{\theta }\) by differentiating the fractional PNLSS equations in (27) with respect to each component \(\tilde{\theta }_i\).

$$\begin{aligned}&\varDelta ^{\alpha }x(k+1) = Ax(k)+Bu(k)+E\eta (k)\nonumber \\&y(k) = Cx(k)+Du(k)+F\zeta (k) \end{aligned}$$
(27)
$$\begin{aligned}&\varDelta ^\alpha \left[ \dfrac{\partial x(k+1)}{\partial \tilde{\theta }_i}\right] = \dfrac{\partial A}{\partial \tilde{\theta }_i} x(k) + A \dfrac{\partial x(k)}{\partial \tilde{\theta }_i} + \dfrac{\partial B}{\partial \tilde{\theta }_i}u(k)\nonumber \\&\quad +\, B\dfrac{\partial u(k)}{\partial \tilde{\theta }_i}+E\dfrac{\partial \eta (k)}{\partial \tilde{\theta }_i} + \dfrac{\partial E}{\partial \tilde{\theta }_i} \eta (k)\nonumber \\&\dfrac{\partial \hat{y}(k)}{\partial \tilde{\theta }_i}= \dfrac{\partial C}{\partial \tilde{\theta }_i} x(k) + C \dfrac{\partial x(k)}{\partial \tilde{\theta }_i} + \dfrac{\partial D}{\partial \tilde{\theta }_i}u(k)\nonumber \\&\quad + \,D\dfrac{\partial u(k)}{\partial \tilde{\theta }_i}+F\dfrac{\partial \eta (k)}{\partial \tilde{\theta }_i} + \dfrac{\partial F}{\partial \tilde{\theta }_i} \eta (k)\nonumber \\&\quad i=1,\cdots ,n_{\tilde{\theta }} \end{aligned}$$
(28)

Note that \(\dfrac{\partial B}{\partial \tilde{\theta }_i}=0\), \(\dfrac{\partial u(k)}{\partial \tilde{\theta }_i}=0\) and \(\dfrac{\partial \eta (k)}{\partial \tilde{\theta }_i}=0\).

Equation (28) reduces to:

$$\begin{aligned}&\varDelta ^\alpha \left[ \sigma _{x(k+1)/ \tilde{\theta }_i}\right] =A \sigma _{ x(k)/ \tilde{\theta }_i} +\left[ \dfrac{\partial A}{\partial \tilde{\theta }_i} \quad 0 \right] \left[ \begin{array}{ccccc} x(k)\\ u(k) \end{array} \right] \nonumber \\&\quad + \,\dfrac{\partial E}{\partial \tilde{\theta }_i} \eta (k)\nonumber \\&\sigma _{\hat{y}(k)/ \tilde{\theta }_i}= C \sigma _{ x(k)/ \tilde{\theta }_i} +\left[ \dfrac{\partial C}{\partial \tilde{\theta }_i} \dfrac{\partial D}{\partial \tilde{\theta }_i}\right] \left[ \begin{array}{ccccc} x(k)\\ u(k) \end{array} \right] \nonumber \\&\quad +\, \dfrac{\partial F}{\partial \tilde{\theta }_i} \eta (k) \quad i=1,\cdots ,n_{\tilde{\theta }} \end{aligned}$$
(29)

where \({ \sigma _{x(k)/\tilde{\theta }_{i}}=\frac{\partial x(k)}{\partial \tilde{\theta }_{i}}}\) and \({\sigma _{\hat{y} (k)/\tilde{\theta }_{i}}=\frac{\partial \hat{y}(k)}{\partial \tilde{\theta }_{i}}}\) are, respectively, the state sensitivity function and the output sensitivity function with respect to \(\tilde{\theta }_i\).

The overall sensitivity functions model is a fractional state-space model which contains nonlinear elements in \(\eta (k) \), and hence it can be written under the PNLSS form:

$$\begin{aligned}&\varDelta ^\alpha \left[ \sigma _{x(k+1)/\tilde{\theta }}\right] = A_s \;\sigma _{x(k)/\tilde{\theta }}+ B_{s}u_{s}(k)+ E_s\eta _s(k) \nonumber \\&\sigma _{\hat{y}(k)/\tilde{\theta }}= C_s \;\sigma _{x(k)/\tilde{\theta }}+ D_s u_s(k)\nonumber \\&+ F_s\eta _s(k) \end{aligned}$$
(30)

where \( u_s=\left[ x \quad u \; \right] ^{\mathrm{T}}\), and \( \quad \eta _s = \; \eta \). The matrices corresponding to the PNLSS model of Eq. (30) can be derived from the sensitivity function calculation with respect to each element \(\theta _i\) in what follows:

  • Sensitivity functions with respect to the vector a

It is denoted \(\sigma _{\hat{y}(k)/ a}\) and includes the coefficients:

$$\begin{aligned} \sigma _{\hat{y}(k)/ a}=\left[ \sigma _{\hat{y}(k)/ a_{1}} \quad \sigma _{\hat{y}(k)/ a_{2}} \quad \cdots \quad \sigma _{\hat{y}(k)/ a_{n}}\right] ^{\mathrm{T}} \end{aligned}$$

Let us calculate these functions from Eq. (29):

$$\begin{aligned}&\varDelta ^\alpha \left[ \sigma _{x(k+1)/ a_{i}}\right] =A \sigma _{ x(k)/ a_{i}} +\left[ \dfrac{\partial A}{\partial a_{i}} \quad 0 \right] \left[ \begin{array}{ccccc} x(k)\\ u(k) \end{array} \right] \nonumber \\&\quad +\, \dfrac{\partial E}{\partial a_{i}} \eta (k)\nonumber \\&\sigma _{\hat{y}(k)/ a_{i}} = C \sigma _{x(k)/ a_{i}}+\left[ \dfrac{\partial C}{\partial a_{i}} \quad \dfrac{\partial D}{\partial a_{i}}\right] \left[ \begin{array}{ccccc} x(k)\\ u(k) \end{array} \right] \nonumber \\&\quad + \, \dfrac{\partial F}{\partial a_{i}} \eta (k)\quad i=1,\cdots ,n \end{aligned}$$
(31)

with

$$\begin{aligned} \dfrac{\partial A}{\partial a_{i}}= & {} I^{n\times n}_{i}\nonumber \\ \dfrac{\partial C}{\partial a_{i}}= & {} \dfrac{\partial D}{\partial a_{i}}=\dfrac{\partial E}{\partial a_{i}}=\dfrac{\partial F}{\partial a_{i}}= 0 \end{aligned}$$
(32)

where \(I^{m\times l}_{i}\) is an \(m\times l\) zero matrix except for the element \((m,i)\), which is equal to one; Eq. (31) reduces to:

$$\begin{aligned} \varDelta ^\alpha \left[ \sigma _{x(k+1)/a_{i}}\right]= & {} A \sigma _{ x(k)/ a_{i}}+ I^{n\times n}_{i} x(k)\nonumber \\ \sigma _{\hat{y}(k)/ a_{i}}= & {} C \sigma _{ x(k)/ a_{i}} \end{aligned}$$
(33)

Similar to the computation of \(\sigma _{\hat{y}(k)/ a}\), the output sensitivity functions with respect to the components of the vectors c, d, p are obtained as follows:

  • Sensitivity functions with respect to the vector c

$$\begin{aligned}&\varDelta ^\alpha \left[ \sigma _{x(k+1)/ c_{i}}\right] =A \sigma _{ x(k)/ c_{i}}\nonumber \\&\sigma _{\hat{y}(k)/ c_{i}} =C \sigma _{ x(k)/ c_{i}}+ I^{1\times n}_{i} x(k)\quad i=1,2,\cdots , n \end{aligned}$$
(34)
  • Sensitivity with respect to d

$$\begin{aligned}&\varDelta ^\alpha \left[ \sigma _{x(k+1)/ d}\right] =A \; \sigma _{ x(k)/ d}\nonumber \\&\sigma _{\hat{y}(k)/ d} =C \sigma _{ x(k)/ d}+ [1] \; u(k)\nonumber \\&+\,[p_2 \quad \cdots \quad p_r]\; \eta (k) \end{aligned}$$
(35)
  • Sensitivity functions with respect to the vector p

$$\begin{aligned}&\varDelta ^\alpha \left[ \sigma _{x(k+1)/ p_{i}}\right] =A \sigma _{ x(k)/ p_{i}}+ I^{n\times (r-1)}_{i-1} \eta (k)\nonumber \\&\sigma _{\hat{y}(k)/ p_{i}}=C \sigma _{ x(k)/ p_{i}}+ d \, I^{1\times (r-1)}_{i-1} \eta (k)\quad i=2,3,\cdots , r \end{aligned}$$
(36)

Therefore, the matrices of the sensitivity functions model of Eq. (30) can be computed using:

$$\begin{aligned} \begin{array}{lll} &{}A_{s}= {\hbox {Diagonal block} }\left[ A \right] ,\\ &{} C_{s}= {\hbox { Diagonal block}} \left[ C \right] ,\\ {} &{}\\ &{}B_{s}=\left[ \frac{\partial A}{\partial \tilde{\theta }} \quad 0 \right] , \quad D_{s}= \left[ \frac{\partial C}{\partial \tilde{\theta }} \quad \frac{\partial D}{\partial \tilde{\theta }} \right] , \\ {} &{}\\ &{}E_{s}=\left[ \frac{\partial E}{\partial \tilde{\theta }} \right] , \quad F_s=\left[ \frac{\partial F}{\partial \tilde{\theta }} \right] ,\\ {} &{}\\ &{}u_s(k)=\left[ x(k)\quad u(k)\right] ^{\mathrm{T}}, \quad \eta _s(k)= [u^2(k) \cdots u^{r}(k)]^{\mathrm{T}}. \end{array} \end{aligned}$$
(37)
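As a hedged illustration of the sensitivity model, the sketch below simulates Eq. (33) for the coefficients \(a_i\) alongside the estimated model. It assumes the companion form of Eq. (21), a unit sampling interval, zero initial conditions and \(p_1=1\); the function names `fractional_states` and `output_and_sens_a` are hypothetical.

```python
import numpy as np

def fractional_states(A, alpha, forcing):
    """Solve Delta^alpha z(k+1) = A z(k) + forcing[k], z(0) = 0 (GL, h = 1).

    forcing: (K, n). Returns z(0..K-1) as a (K, n) array.
    """
    alpha = np.asarray(alpha, dtype=float)
    K, n = forcing.shape
    beta = np.ones((K + 1, n))
    for j in range(1, K + 1):
        beta[j] = beta[j - 1] * (j - 1 - alpha) / j
    z = np.zeros((K + 1, n))
    for k in range(K):
        memory = np.sum(beta[1:k + 2] * z[k::-1], axis=0)
        z[k + 1] = A @ z[k] + forcing[k] - memory
    return z[:K]

def output_and_sens_a(a, c, d, p, alpha, u):
    """Model output and its sensitivities with respect to a_1..a_n (Eq. (33))."""
    n = len(a)
    A = np.eye(n, k=1)
    A[-1] = a                                  # companion form of Eq. (21)
    B = np.zeros(n)
    B[-1] = 1.0
    u_tilde = sum(p_i * u ** (i + 1) for i, p_i in enumerate(p))
    x = fractional_states(A, alpha, np.outer(u_tilde, B))
    y_hat = x @ c + d * u_tilde
    sens = np.zeros((len(u), n))
    for i in range(n):
        forcing = np.zeros_like(x)
        forcing[:, -1] = x[:, i]               # I_i^{n x n} x(k): x_i(k) drives the last state
        sens[:, i] = fractional_states(A, alpha, forcing) @ c
    return y_hat, sens
```

The sensitivity model reuses the same fractional recursion as the estimated model, driven by the states instead of the input, which is why both can share one simulation routine. A practical way to validate such an implementation is to compare each column of `sens` with a finite-difference quotient of the output.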
  • Sensitivity functions with respect to the fractional orders vector

The vector \(\theta \) of parameters to be estimated includes the fractional orders \(\alpha _i\) with: \(\theta =[\,\tilde{\theta } \quad \alpha _1 \quad \alpha _2 \quad \cdots \quad \alpha _n \,]\)

The sensitivity functions with respect to the fractional orders vector \(\alpha \) are calculated numerically [8].

A first-order Taylor expansion of the output with respect to each component \(\alpha _i\) \((i=1,2,\cdots ,n)\) is applied as follows:

$$\begin{aligned}&\hat{y}(k, \alpha _i + \delta \alpha _i ) - \hat{y} (k, \alpha _i) \approx \delta \alpha _i\; \frac{\partial \hat{y}(k)}{\partial \alpha _i}\nonumber \\&\quad =\delta \alpha _i \; \sigma _{\hat{y}(k)/\alpha _i}\,\,i=1,\cdots ,n\nonumber \\&\quad \mathrm{with}\; \delta \alpha _i\; {\hbox {a small variation of}}\; \alpha _i. \end{aligned}$$
(38)
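The numerical approximation of Eq. (38) amounts to a forward difference on each order. A minimal sketch is given below; `simulate` is a hypothetical user-supplied callable that returns the model output sequence for a given order vector, and the default step `delta` is an arbitrary choice for this example.

```python
import numpy as np

def order_sensitivities(simulate, alpha, delta=1e-6):
    """Forward-difference estimate of d y_hat(k) / d alpha_i, as in Eq. (38).

    simulate: callable mapping an order vector (n,) to an output sequence (K,)
    Returns an array of shape (K, n), one column per fractional order.
    """
    alpha = np.asarray(alpha, dtype=float)
    y0 = np.asarray(simulate(alpha), dtype=float)
    sens = np.zeros((y0.size, alpha.size))
    for i in range(alpha.size):
        perturbed = alpha.copy()
        perturbed[i] += delta                   # alpha_i + delta alpha_i
        sens[:, i] = (np.asarray(simulate(perturbed)) - y0) / delta
    return sens
```

Each order requires one extra simulation of the model, so the cost of this step grows linearly with n. The choice of \(\delta \alpha _i\) is a trade-off: too large a value increases the truncation error of the Taylor expansion, too small a value amplifies rounding errors.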

The overall sensitivity functions vector \( \sigma _{\hat{y}(k)/{\theta }}\) of the model is expressed as:

$$\begin{aligned} \sigma _{\hat{y}(k)/\theta } =[ \sigma _{\hat{y}(k)/{\tilde{\theta }}} \quad \sigma _{\hat{y}(k)/\alpha }]^{\mathrm{T}} \end{aligned}$$

Hence, we may use these parametric sensitivity functions to calculate the Gradient \(J_{\theta }'\) and the Hessian \(J_{\theta }''\) which are expressed by the following equations:

$$\begin{aligned} J_\theta ^{'}= & {} -\dfrac{2}{K} \; \sum \limits _{k=1}^K \epsilon (k) \quad (\sigma _{\hat{y}(k)/\theta })\nonumber \\ J_\theta ^{''}= & {} \dfrac{2}{K} \sum \limits _{k=1}^K (\sigma _{\hat{y}(k)/\theta }) (\sigma _{\hat{y}(k)/\theta })^{\mathrm{T}} \end{aligned}$$
(39)

They are used in the recurrence Eq. (26) to update the parameters vector \(\theta \).

At each iteration, the identification procedure requires the simulation of two fractional PNLSS models:

  • The estimated model, which provides the estimated output vector \(\hat{y}(k)\) and the state vector \(\hat{x}(k)\).

  • The sensitivity functions model necessary for the Hessian and Gradient computation and the update of the vector \(\hat{\theta }\).

The main steps for estimating the parameter vector \(\tilde{\theta }\) and the fractional orders \(\alpha \) with the iterative L–M algorithm of Eqs. (25)–(39) are listed below:

  1. Let \(i=1\), and set the initial values \({\tilde{\theta }}^{0}\), \(\alpha ^{0}\) and \(\delta \alpha \).

  2. Compute the cost function J.

  3. Compute the sensitivity functions with respect to \(\tilde{\theta }\) and \(\alpha \).

  4. Compute the gradient and Hessian \(J^{'}\) and \(J^{''}\) using Eq. (39).

  5. Update the parameter estimate \(\theta ^{(i)}\) using the recursive rule of Eq. (26).

  6. Compute the quadratic function J.

  7. If \(J(\theta ^{(i+1)})<J(\theta ^{(i)})\), decrease \(\lambda \), otherwise increase \(\lambda \) and set \(\hat{\theta }=\theta ^{(i)}\), \(J(\hat{\theta })=J({\theta ^{i}})\) and go to step 4.
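The steps above can be sketched as a generic loop with an accept/reject rule on the damping parameter. This is only an illustration on a toy exponential model, not the fractional PNLSS model of this paper; the names `levenberg_marquardt` and `exp_model`, the initial value of \(\lambda \), the scaling factor of 10 and the fixed iteration count are all assumptions.

```python
import numpy as np

def levenberg_marquardt(model, theta0, y, lam=1e-2, factor=10.0, n_iter=100):
    """Iterate Eq. (26) with the lambda schedule of step 7.

    model: callable theta -> (y_hat, sens), with sens of shape (K, n_theta)
    """
    theta = np.asarray(theta0, dtype=float)
    K = y.size
    y_hat, sens = model(theta)
    J = np.mean((y - y_hat) ** 2)
    for _ in range(n_iter):
        eps = y - y_hat
        grad = -2.0 / K * sens.T @ eps
        hess = 2.0 / K * sens.T @ sens
        candidate = theta - np.linalg.solve(hess + lam * np.eye(theta.size), grad)
        y_new, sens_new = model(candidate)
        J_new = np.mean((y - y_new) ** 2)
        if J_new < J:                      # accept the step and relax the damping
            theta, y_hat, sens, J = candidate, y_new, sens_new, J_new
            lam /= factor
        else:                              # reject the step and increase the damping
            lam *= factor
    return theta, J

def exp_model(theta):
    """Toy model y_hat(t) = a exp(-b t) with analytic sensitivities (illustration only)."""
    a, b = theta
    t = np.linspace(0.0, 4.0, 50)
    y_hat = a * np.exp(-b * t)
    sens = np.column_stack([np.exp(-b * t), -a * t * np.exp(-b * t)])
    return y_hat, sens
```

In the identification problem considered here, `model` would be replaced by a routine that simulates the estimated PNLSS model and its sensitivity model and returns the stacked sensitivities with respect to \(\tilde{\theta }\) and \(\alpha \). A rejected step keeps the previous estimate, so the criterion never increases from one accepted iterate to the next.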

5 Simulation examples

This section investigates the estimation performance of the developed approach. In the first part, two academic examples are considered: a commensurate case and a non-commensurate one. In the second part, the efficiency of the method is assessed on experimental data from a heating benchmark.

An important step in achieving a good model identification is the choice of the model structure describing the relationship between the system input/output variables; choosing a wrong structure may result in a poor parametric estimation.

In this study, this task is solved by analyzing the evolution of the criterion for different structures. The structure with the smallest criterion value is chosen.

5.1 Numerical examples

The input u(k) is a persistently exciting sequence of zero mean and unit variance, the disturbance v(k) is a white noise sequence of zero mean and the data length is \(K=500\).

The PNLSS model is constructed and the identification is carried out in the absence of noise for different orders of the linear part and nonlinear part.

Then, using the best structure, the identification is performed with noisy data. Monte Carlo simulations were performed to test the robustness of the estimated parameters to data noise, with 50 noise realizations for each of the signal-to-noise ratios \(\hbox {SNR}=34\,\hbox {dB}\) and \(\hbox {SNR}=25\,\hbox {dB}\).
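One possible way of generating such noisy data is sketched below. It assumes the usual power-ratio definition \(\hbox {SNR}=10\log _{10}(P_{\tilde{y}}/P_{v})\) and Gaussian white noise, neither of which is specified in the text; the function names are hypothetical.

```python
import numpy as np

def add_noise_at_snr(y_clean, snr_db, rng):
    """Add zero-mean white Gaussian noise v(k) scaled for a target SNR in dB.

    Assumes SNR = 10 log10(signal power / noise power).
    """
    y_clean = np.asarray(y_clean, dtype=float)
    signal_power = np.mean(y_clean ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    v = np.sqrt(noise_power) * rng.standard_normal(y_clean.size)
    return y_clean + v

def realized_snr_db(y_clean, y_noisy):
    """SNR actually obtained for one noise realization."""
    v = np.asarray(y_noisy) - np.asarray(y_clean)
    return 10.0 * np.log10(np.mean(np.asarray(y_clean) ** 2) / np.mean(v ** 2))
```

A Monte Carlo study then calls `add_noise_at_snr` once per run with the same noise-free output and an independent random draw, and collects the parameter estimates obtained from each noisy data set. For short records the realized SNR fluctuates slightly around the target value from one realization to the next.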

Example 1: Fractional commensurate case

The nonlinear system to be identified is a fractional commensurate Hammerstein system, with its linear part of order 3 (\(n\; =\; 3\)), and the fractional order \( \tilde{\alpha }\;=\;0.3\).

$$\begin{aligned} \varDelta ^{\tilde{\alpha }}x(k+1)= & {} A_0x(k)+B_0\tilde{u}(k)\nonumber \\ \tilde{y}(k)= & {} C_0x(k)+D_0\tilde{u}(k) \end{aligned}$$
(40)

with

$$\begin{aligned} \begin{array}{rcl} A_0&{}=&{}\left[ \begin{array}{ccccc} 0 &{} 1&{}0 \\ 0 &{} 0&{} 1 \\ 0.40&{} -\,0.10&{}-\,0.60 \end{array} \right] \,\, B_0= \left[ \begin{array}{ccccc} 0\\ 0\\ 1 \end{array} \right] \\ \\ C_0&{}=&{} \left[ \begin{array}{ccccc} -\,0.20, -\,0.80, -\,0.70 \end{array} \right] \,\, D_0=[0.10]\\ \end{array} \end{aligned}$$
(41)

The nonlinearity is a third-order polynomial (\(r=3\)):

$$\begin{aligned} \tilde{u}(k)=f(u(k))=u(k)+0.75u^2(k)+0.35u^3(k) \end{aligned}$$
(42)

Hence, the goal is to estimate the following parameter vector:

$$\begin{aligned}&\theta =[0.40 \quad -\,0.10 \quad -\,0.60 \quad -\,0.20 \nonumber \\&\quad -\,0.80 \quad -\,0.70\nonumber \\&\qquad 0.10 \quad 0.75 \quad 0.35 \quad 0.30] \end{aligned}$$
(43)

The input/output sets are generated, and the identification is carried out in absence of noise, for different orders n and r, for the best structure analysis.

Table 1 lists the values of the quadratic function J, and the evolution of the criterion J for each structure is illustrated in Fig. 2.

Among the examined structures, the best fit is obtained for the exact structure (\(n=3\), \(r=3\)) with \(J\;\approx 1\mathrm{e}{-}31\).

For the chosen structure, Fig. 3 plots the simulation results for the noise-free case; the error is negligible and the estimated output overlaps with the data.

In the presence of noisy measurements, a Monte Carlo simulation is performed with 50 noise realizations for \(\hbox {SNR}=34\,\hbox {dB}\) and \(\hbox {SNR}=25\,\hbox {dB}\). The mean values of the estimated parameters and their variances \(\delta (\%)\) are summarized in Table 2, together with the criterion values (\(J \approx 10^{-4}\) for \(\hbox {SNR}=34\,{\hbox {dB}}\) and \(J \approx 10^{-1}\) for \(\hbox {SNR}=25\,{\hbox {dB}}\)). Figures 4 and 5 show the simulated versus the estimated outputs for the two noise levels. We can conclude that the obtained models agree closely with the data.

The statistical performance of the estimator is thus analyzed by Monte Carlo simulation for different noise levels, and good efficiency of the optimization method is obtained.

Table 1 Structure test results of example 1

Fig. 2 Evolution of the criteria versus the number of iterations for example 1

Fig. 3 Identification results for example 1 for the noise-free case

Table 2 Monte Carlo simulation results of example 1

Example 2: Fractional non-commensurate example

The Hammerstein linear part is a non-commensurate fractional state-space model of order \(n=2\), with the fractional orders vector \(\alpha \;=\;[0.4 \quad 0.6]\).

The linear part matrices are given below:

$$\begin{aligned} \begin{array}{rcl} A_0&{}=&{}\left[ \begin{array}{ccccc} 0 &{} 1 \\ -\,0.37&{} -\,0.58 \end{array} \right] \,\,B_0= \left[ \begin{array}{ccccc} 0\\ 1 \end{array} \right] \\ C_0&{}=&{} \left[ \begin{array}{ccccc} -\,0.10,-\,0.20 \end{array} \right] \,\, D_0=[0.10] \end{array} \end{aligned}$$
(44)

The nonlinearity is described by the following polynomial of order \(r=3\):

$$\begin{aligned} \tilde{u}(k)=f(u(k))=u(k)+0.5u^2(k)+0.25u^3(k) \end{aligned}$$
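The static nonlinearity above maps each input sample independently. A minimal sketch of its evaluation is given below; the function name and the coefficient ordering (powers 1 to r) are illustrative choices.

```python
import numpy as np

def hammerstein_nonlinearity(u, coeffs=(1.0, 0.5, 0.25)):
    """Evaluate u_tilde = sum_i coeffs[i-1] * u^i for i = 1..r (no constant term).

    The default coefficients are those of the polynomial of example 2.
    """
    u = np.asarray(u, dtype=float)
    return sum(c * u ** (i + 1) for i, c in enumerate(coeffs))

u_tilde = hammerstein_nonlinearity([0.0, 1.0, 2.0])
```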

The parameters vector to be estimated is as follows:

$$\begin{aligned} \theta = [\,-\,0.37 \quad -\,0.58 \quad -\,0.10 \quad -\,0.20 \quad 0.10 \quad 0.50 \quad 0.25 \quad 0.40 \quad 0.60\,] \end{aligned}$$
(45)
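The ordering of the entries of \(\theta \) in Eq. (45) can be read off by comparison with Eq. (44) and the polynomial: the last row of \(A_0\), the entries of \(C_0\), \(D_0\), the polynomial coefficients of \(u^2\) to \(u^r\) (the coefficient of u being fixed to 1), and finally the fractional orders. The sketch below unpacks such a vector under that inferred ordering; the function name is hypothetical, and the companion form of \(A_0\) with \(B_0\) fixed is assumed as in Eq. (44).

```python
import numpy as np

def unpack_theta(theta, n, r):
    """Split theta into (A, B, C, D, poly, alphas) for a non-commensurate model.

    Assumed layout: [last row of A (n), C (n), D (1), poly coeffs of u^2..u^r (r-1), alphas (n)].
    """
    theta = np.asarray(theta, dtype=float)
    if theta.size != 3 * n + r:
        raise ValueError("theta must have 3*n + r entries")
    A = np.eye(n, k=1)            # ones on the superdiagonal (companion form)
    A[-1, :] = theta[:n]          # estimated last row
    B = np.zeros(n)
    B[-1] = 1.0                   # fixed input vector
    C = theta[n:2 * n]
    D = theta[2 * n]
    poly = np.concatenate(([1.0], theta[2 * n + 1:2 * n + r]))  # coefficient of u fixed to 1
    alphas = theta[2 * n + r:]
    return A, B, C, D, poly, alphas

theta_true = [-0.37, -0.58, -0.10, -0.20, 0.10, 0.50, 0.25, 0.40, 0.60]
A, B, C, D, poly, alphas = unpack_theta(theta_true, n=2, r=3)
```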

In the first step, the choice of the best structure is investigated by evaluating the cost function J for different structures. The best-performing structures, \((n=2,r=2)\), \((n=2,r=3)\), \((n=3,r=2)\) and \((n=5,r=9)\), are displayed in Table 3, and the best criterion value is obtained for the exact structure orders \((n=2,r=3)\). Figure 6 compares the candidates visually: the criterion J converges rapidly for the exact structure compared with the other examined structures.

The simulation results for the noise-free case and for noisy data with \(\hbox {SNR}=34\,\hbox {dB}\) and \(\hbox {SNR}=25\,\hbox {dB}\) are depicted, respectively, in Figs. 7, 8 and 9.

The means and variances of the parameter estimates and the values of the criterion J, obtained from a Monte Carlo simulation of 50 runs, are listed in Table 4.

We can see that both the fractional orders and the other parameters are recovered in all cases, and the obtained errors are very low (\(J= 2\mathrm{e}{-}6\) for \(\hbox {SNR}=34\,\hbox {dB}\), and \(J= 1\mathrm{e}{-}3\) for \(\hbox {SNR}=25\,\hbox {dB}\)).

For the noise-free case, the estimated output overlaps with the simulated one and the prediction error is practically null (\(\epsilon \approx 10^{-16}\)). These results are illustrated in Fig. 7.

For noisy data, the results are depicted in Figs. 8 and 9. The estimated outputs closely match the simulated outputs and the prediction errors are very low: \(\epsilon \approx 10^{-6}\) for \(\hbox {SNR}=34\,{\hbox {dB}}\) and \(\epsilon \approx 10^{-3}\) for \(\hbox {SNR}=25\,{\hbox {dB}}\).

Fig. 4 Identification results for example 1 for \(\hbox {SNR}=34\,\hbox {dB}\)

Fig. 5 Identification results for example 1 for \(\hbox {SNR}=25\,{\hbox {dB}}\)

The obtained results highlight the method’s efficiency even in the presence of noise. The numerical simulations show that the developed approach performs quite well in terms of fractional Hammerstein model fit for both the commensurate and the non-commensurate cases.

In the presence of noise, the statistical efficiency of the estimator has been confirmed with Monte Carlo simulations.

Table 3 Structure test results of example 2

Fig. 6 Evolution of the criteria versus the number of iterations for example 2

Table 4 Monte Carlo simulation results of example 2

Fig. 7 Noise-free identification results of example 2

Fig. 8 Identification results for example 2 for \(\hbox {SNR}=34\,{\hbox {dB}}\)

Fig. 9 Identification results for example 2 for \(\hbox {SNR}=25\,{\hbox {dB}}\)

5.2 Heating system benchmark

In this section, an experimental heating system is selected for testing the method’s performance. We use measurement data taken from Daisy, a database for system identification [38]. The experimental heating system is described in Fig. 10. The input u(k) is the voltage supplied to a 300-W halogen lamp, which is driven by the computer through a D/A board and a power amplifier. The lamp is suspended above a thin metal plate, which it heats. The output is the temperature of the metal plate, measured by a thermocouple mounted on the underside of the plate. The input/output signals are represented in Fig. 11.

Fig. 10 Experimental heating system

Fig. 11 System input and output

Heating is a diffusive phenomenon that is known to exhibit fractional nonlinear behavior; the algorithm developed above is therefore applied to these data to identify a model of the system.

The crucial step of model structure selection is tackled first: different orders n of the linear part and r of the nonlinear part of the Hammerstein model are tried out.

The quality of each tested model structure is measured in terms of the root-mean-square error (RMSE) and the relative error (RE):

$$\begin{aligned} \hbox {RMSE}=\dfrac{1}{K} \sqrt{ \sum \limits ^{K}_{k=1}\epsilon ^{2}(k)} \end{aligned}$$
(46)
$$\begin{aligned} \hbox {RE}=\sqrt{ \dfrac{\sum \nolimits ^{K}_{k=1}\epsilon ^{2}(k)}{\sum \nolimits ^{K}_{k=1} y^2(k)}} \end{aligned}$$
(47)
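Both criteria are straightforward to compute from the measured output y(k) and the model output. The sketch below follows Eqs. (46) and (47) exactly as written, with \(\epsilon (k)\) taken as the difference between the measured and the estimated outputs; note that Eq. (46) places the factor 1/K outside the square root, whereas the more common RMSE definition places it inside. The function names are illustrative.

```python
import numpy as np

def rmse_eq46(y, y_hat):
    """RMSE as written in Eq. (46): (1/K) * sqrt(sum of squared errors)."""
    eps = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return np.sqrt(np.sum(eps ** 2)) / eps.size

def relative_error(y, y_hat):
    """RE of Eq. (47): sqrt(sum of squared errors / sum of squared outputs)."""
    y = np.asarray(y, dtype=float)
    eps = y - np.asarray(y_hat, dtype=float)
    return np.sqrt(np.sum(eps ** 2) / np.sum(y ** 2))
```

For instance, with y = [1, 2, 2] and an estimate [1, 2, 1], both functions return 1/3.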

The results are listed in Table 5, while Fig. 12 shows the curves of the RMSE versus the number of iterations for each of the tested structures.

The best model structure appears to be the one with the following orders: \(n=3\) and \(r=5\).

The estimated model of the heating system is a fractional commensurate Hammerstein PNLSS model with fractional order \(\tilde{\alpha }=0.9\); the linear part is given by the following matrices \(A_0\), \(B_0\), \(C_0\), \(D_0\):

$$\begin{aligned} \begin{array}{rcl} A_0&{}=&{}\left[ \begin{array}{ccccc} 0 &{} 1&{}0 \\ 0 &{} 0&{} 1 \\ 0.725&{} 0.353&{}-\,0.118 \end{array} \right] , B_0= \left[ \begin{array}{ccccc} 0\\ 0\\ 1 \end{array} \right] \\ {} &{}\\ C_0&{}=&{} \left[ \begin{array}{ccccc} 0.171\quad -\,0.117\quad 0.018 \end{array} \right] , D_0=[-8\mathrm{e}{-}4]. \end{array} \end{aligned}$$
(48)

The nonlinear part is a fifth-order polynomial formulated as follows:

$$\begin{aligned} \tilde{u}(k)&= u(k)+1.84u^2(k)+1.76u^3(k) \\&\quad -\,0.20u^4(k)+30.83u^{5}(k) \end{aligned}$$
(49)
Table 5 Structure test of the heating system

Fig. 12 RMSE versus number of iterations of the benchmark example

For the parametric model estimation, the measured output is compared with the estimated output of the selected structure (\(n=3, r=5\)) in Fig. 13, and the corresponding prediction error is shown in Fig. 14. The results show good agreement between the output of the estimated model and that of the real experimental system, and the parameters were estimated with lower errors than those reported in the literature [23], with \(\hbox {RMSE} = 0.198\) and \(\hbox {RE} = 0.043\).

Fig. 13 Estimated output and data of the heating benchmark

Fig. 14 Prediction error of the heating system

6 Conclusion

In this paper, a novel approach for the identification of nonlinear Hammerstein systems whose dynamics have a fractional order behavior is presented.

To avoid the difficulty of the coupling terms between the linear and the nonlinear subsystems in Hammerstein model identification, the original nonlinear system is converted to the fractional PNLSS structure, which is suitable for taking into account the nonlinear terms.

This structure has the advantage of being a flexible model that is better conditioned than the original model, since the parameters of its two subsystems are explicitly separated.

The Levenberg–Marquardt algorithm is used to estimate both the linear and nonlinear parts as well as the fractional orders of the fractional Hammerstein system. A drawback of this algorithm is that the critical computation of the parametric sensitivity functions is time-consuming. This difficulty is overcome by representing the sensitivity functions as a new fractional PNLSS model. In this way, the computational burden is reduced considerably and the efficiency of the identification algorithm is increased.

Various examples illustrate the effectiveness of the proposed approach, which gives consistent estimates and fits the fractional Hammerstein model quite well for both the commensurate and the non-commensurate academic examples. The statistical efficiency of the optimization method has been confirmed with Monte Carlo simulations.

Furthermore, the validity of the developed method is verified through its application to data from an experimental heating system, where a good model fit is achieved.

Further research will focus on the identification of other block-oriented structure combinations.