1 Introduction

Many real-world dynamic systems are operated by switching between different subsystems or modes; such systems are called switched systems. In practice, the performance of most switched systems depends not only on the current state and control but also on past states and controls; these systems are called time-delay switched systems. Time-delay switched systems are widely employed in different areas, such as evaporation and purification processes [1,2,3], fermentation processes [4,5,6,7,8,9,10], and so on.

For general optimal control problems with time-delay switched systems, the delay h is normally a constant or a decision variable that needs to be chosen optimally. However, in many real-world applications, the delay may itself be a function of time [11]. In this paper, we focus on optimal control problems with time-varying time-delay switched systems (TVTDSS). The standard TVTDSS is expressed mathematically as follows:

$$\begin{aligned} \dot{\varvec{x}}(t)=\varvec{f}_{k}(\varvec{x}(t),\varvec{x}(t-\mathrm{d}(t)), \varvec{u}(t), \varvec{u}(t-\mathrm{d}(t))),~t\in [\ell _{k-1},\ell _{k}),\quad k=1,2, \cdots ,M, \end{aligned}$$
(1.1)

and the initial condition:

$$\begin{aligned} \varvec{x}(t)=\varvec{\phi }(t),~t\in [-h,0], \end{aligned}$$
(1.2)
$$\begin{aligned} \varvec{u}(t)=\varvec{\omega }(t),~t\in [-h,0), \end{aligned}$$
(1.3)

where \(\varvec{x}(t)\in {\mathbb {R}}^{n}\) and \(\varvec{x}(t-\mathrm{d}(t))\in {\mathbb {R}}^{n}\) are the current and delayed state trajectories, respectively; \(\varvec{u}(t)\in {\mathbb {R}}^{r}\) and \(\varvec{u}(t-\mathrm{d}(t))\in {\mathbb {R}}^{r}\) denote the current and delayed control vectors, respectively; \(\ell _{k},k=1,2, \cdots ,M-1\) are the switching times; \(\ell _{0}=0\) is the initial time and \(\ell _{M}=T>0\) is a given terminal time; \(h>0\) is a given constant and \(\mathrm{d}(t):[0,T]\rightarrow [0,h]\) is a given continuously differentiable function; \(\varvec{f}_{k}: {\mathbb {R}}^{n}\times {\mathbb {R}}^{n}\times {\mathbb {R}}^{r}\times {\mathbb {R}}^{r} \rightarrow {\mathbb {R}}^{n}\) and \(\varvec{\phi }(t):{\mathbb {R}}\rightarrow {\mathbb {R}}^{n}\) are given continuously differentiable functions; and \(\varvec{\omega }(t):{\mathbb {R}}\rightarrow {\mathbb {R}}^{r}\) is a given piecewise continuous function.

The main theoretical tool for solving optimal control problems governed by (1.1)–(1.3) analytically is the Pontryagin minimum principle. However, it is generally very difficult, especially for practical problems, to obtain a closed-form solution, and hence numerical methods are indispensable for solving optimal control problems involving TVTDSS. One of the most popular numerical methods is the control parameterization method [12,13,14,15,16,17,18].

In the switched system (1.1)–(1.3), the switching sequence is assumed to be fixed, and the decision variables to be optimized are the control vectors and the switching times. However, taking the switching times as decision variables raises two difficulties: (i) the partial derivatives of the cost and constraint functions with respect to the switching times exist only when the switching times are distinct; (ii) the numerical integration of dynamic systems over subintervals of variable length is difficult to implement [13, 14, 19, 20].

The time-scaling transformation technique is an efficient method for handling these difficulties. It was first proposed by Lee et al. in 1997, under the name control parameterization enhancing transform [15]. The idea is to introduce a new so-called duration vector that maps the variable switching times to fixed points on a new time horizon, yielding an equivalent optimization problem in which the switching times are fixed [14,15,16, 20,21,22,23,24,25,26]. The technique has been applied successfully to a range of optimal control problems. In [16], it was used to solve optimal control problems in which the range of the control function is a discrete set. Building on [16], it was used in 2002 to convert approximate optimal control problems with variable partition points into equivalent standard optimal control problems with multiple characteristic times [17]. In 2006, Li et al. applied it to optimal control problems governed by switched systems for the first time, transforming the original problem into a parameter selection problem on a new time horizon [12]. Based on this prior work, Ryan et al. first adopted the time-scaling transformation to convert optimal control problems with nonlinear continuous inequality constraints on the state and the control into a class of corresponding semi-infinite programming problems, and in 2009 proposed an algorithm that computes a sequence of suboptimal controls for the original problem [21]. In 2011, Li et al. applied the technique to a class of optimal control problems subject to equality terminal state constraints and continuous state and control inequality constraints, solving the resulting problem with an exact penalty function method [22].

Despite its success in optimizing switching times, the time-scaling transformation technique encounters a serious problem when the dynamic system under consideration contains a delay in the state or control: after the transformation, a pre-given time-delay becomes variable on the new time horizon, which makes the new dynamic system difficult to solve. To handle this issue, Yu et al. presented a hybrid time-scaling transformation method for solving nonlinear time-delay optimal control problems in 2016. The method is called hybrid because it involves two coupled time-delay systems: one defined on the original time scale, where the switching times are variable, and the other defined on the new time scale, where the switching times are fixed [13]. However, a closed-form expression for the variable delay is not available on the new time horizon, so the delayed state is very difficult to obtain there. Furthermore, the duration of each subsystem is required to be greater than or equal to a pre-given positive value. In [20], Wu et al. presented a new computational method for optimal control problems with multiple-delay dynamic systems and derived an analytical formula for the time-delay on the new time scale, so that the durations between switching times need not be bounded below by a pre-given positive value. With this method, the time-delay system is completely converted to the new time scale, on which the switching times are fixed.

In the optimal control problems with time-delay dynamic systems mentioned above, the delay is a given constant. In practical applications, however, the delay usually changes with time, that is, it is a time-dependent function. Moreover, the time-scaling transformation technique has not previously been applied to optimal control problems with TVTDSS. The main contribution of this paper is therefore to adapt the time-scaling transformation technique to optimal control problems with time-delay switched systems in which the delay is a function of time t.

The rest of the paper is organized as follows. Section 2 gives a standard mathematical formulation of optimal control problems with TVTDSS. In Sect. 3, the original problem is transformed into an equivalent problem by applying the control parameterization method and the time-scaling transformation technique. Section 4 provides the technical details for computing the gradients of the cost and constraint functions with respect to the decision variables, so that the equivalent problem can be solved by gradient-based methods. Finally, we verify the correctness of the theory through several numerical examples.

2 Problem Formulation

Consider a time-varying time-delay switched system defined in [0, T] with M subsystems:

$$\begin{aligned} \dot{\varvec{x}}(t)=\varvec{f}_{k}(\varvec{x}(t), \varvec{x}(t-\mathrm{d}(t)),\varvec{u}(t),\varvec{u}(t-\mathrm{d}(t))), ~t\in [\ell _{k-1},\ell _{k}),\quad k=1,2, \cdots ,M, \end{aligned}$$
(2.1)

and the initial conditions are

$$\begin{aligned} \varvec{x}(t)= & {} \varvec{\phi }(t),~t\in [-h,0], \end{aligned}$$
(2.2)
$$\begin{aligned} \varvec{u}(t)= & {} \varvec{\omega }(t),~t\in [-h,0). \end{aligned}$$
(2.3)

Define

$$\begin{aligned} U:=\{\varvec{u}(t)=[u_{1}(t),u_{2}(t),\cdots ,u_{r}(t)]^{\top } \in {\mathbb {R}}^{r}, a_{q}\leqslant u_{q}(t)\leqslant b_{q}, t\in [0,T]\}, \end{aligned}$$

where \(a_{q}\) and \(b_{q}, q=1, \cdots ,r\), are given real numbers such that \(a_{q}\leqslant b_{q}\). Any Borel measurable function \(\varvec{u}: [-h,T] \rightarrow {\mathbb {R}}^{r}\) is called an admissible control if \(\varvec{u}(t)\in U\) for almost all \(t\in [0,T]\) and \(\varvec{u}(t)=\varvec{\omega }(t)\) for all \(t\in [-h,0)\). Let \({\mathcal {U}}\) denote the set of all admissible controls.

Define

$$\begin{aligned} \varXi :=\{\varvec{\ell }=[\ell _{1},\ell _{2},\cdots ,\ell _{M-1}]^{\top } \in {\mathbb {R}}^{M-1}, \ell _{k-1}\leqslant \ell _{k}, k=1, \cdots ,M\}, \end{aligned}$$

as the set of all admissible switching time vectors. For each \(\varvec{\ell }\in \varXi \) and \(\varvec{u}(t)\in {\mathcal {U}}\), let \(\varvec{x}(\cdot \mid \varvec{\ell },\varvec{u}(t))\) denote the solution of (2.1)–(2.3), and we assume that the following conditions are satisfied.

A1: There exists a real number \(C>0\) such that

$$\begin{aligned} \Vert \varvec{f}_{k}(\varvec{e},\varvec{\upsilon }, \varvec{\tau },\varvec{\alpha })\Vert \leqslant C(1+\Vert \varvec{e}\Vert +\Vert \varvec{\upsilon }\Vert +\Vert \varvec{\tau }\Vert +\Vert \varvec{\alpha }\Vert ), \end{aligned}$$

for all \((\varvec{e},\varvec{\upsilon },\varvec{\tau }, \varvec{\alpha })\in {\mathbb {R}}^{n}\times {\mathbb {R}}^{n} \times {\mathbb {R}}^{r}\times {\mathbb {R}}^{r}\).

A2: \(\varvec{f}_{k}\) is twice continuously differentiable.

A1 and A2 ensure the uniqueness of the solution of the dynamic system considered in this paper. In particular, under A1, the solution of the dynamic system (2.1)–(2.3) is bounded [27]. In addition, A2 guarantees the existence of the gradients of the cost and constraint functions with respect to their arguments. These two assumptions are widely used in the optimal control literature [12, 13, 19,20,21,22, 28,29,30].

Our optimization problem is formally defined as follows.

Problem 1

Given the dynamic system (2.1)–(2.3), find an admissible control vector \(\varvec{u}(t)\in {\mathcal {U}}\) and an admissible switching time vector \(\varvec{\ell }\in \varXi \) such that the cost function

$$\begin{aligned} J_{0}(\varvec{\ell },\varvec{u}(t))=\varPsi _{0}(\varvec{x} (T\mid \varvec{\ell },\varvec{u}(t))) \end{aligned}$$

is minimized subject to the canonical constraints

$$\begin{aligned} J_{n}(\varvec{\ell },\varvec{u}(t))=\varPsi _{n}(\varvec{x} (T\mid \varvec{\ell },\varvec{u}(t))) \left\{ \begin{array}{c} =0,\\ \geqslant 0,\\ \end{array}\right. \quad \quad n=1,\cdots ,m, \end{aligned}$$

where \(\varPsi _{n}:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}},n=0,\cdots ,m\) are given continuously differentiable functions.

3 Problem Transformation

3.1 Control Parameterization

The control parameterization method involves approximating the control by a linear combination of basis functions, thereby yielding an approximate optimization problem with a finite number of decision variables [14].

In this paper, we approximate the control signal \(\varvec{u}(t)\) as follows:

$$\begin{aligned} \varvec{u}(t)\approx \varvec{u}^{M}(t)=\varvec{\vartheta }_{k}, \ t\in [\ell _{k-1},\ell _{k}), \end{aligned}$$
(3.1)

where \({\varvec{\vartheta }}_{k}\in {\mathbb {R}}^{r},k=1,\cdots ,M\), is the value of the control vector on \([\ell _{k-1},\ell _{k})\), and \(M\geqslant 1\) is the number of subsystems.

Similarly, the delayed control vector \(\varvec{u}(t-\mathrm{d}(t))\) can be approximated by

$$\begin{aligned} \varvec{u}(t-\mathrm{d}(t))=\left\{ \begin{array}{ll} \varvec{\vartheta }_{k^{'}}, &{} \ \text {if} \ t-\mathrm{d}(t)\in [\ell _{k^{'}-1},\ell _{k^{'}}),\\ &{} \ \text {for some} \ k^{'}\in \{1,\cdots ,M\},\\ \varvec{\omega }(t-\mathrm{d}(t)), &{} \ \text {if}~t-\mathrm{d}(t)<0. \end{array}\right. \end{aligned}$$
(3.2)
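The piecewise-constant rules (3.1)–(3.2) are straightforward to implement: evaluating the delayed control only requires locating the subinterval containing \(t-\mathrm{d}(t)\), or falling back to \(\varvec{\omega }\) when that point is negative. The sketch below illustrates this for a scalar control; the switching times, parameter values, and history function are illustrative assumptions, not data from the paper:

```python
import bisect

def u_param(t, ell, theta, omega):
    """Piecewise-constant control (3.1)-(3.2).

    ell   : switching times [ell_0 = 0, ell_1, ..., ell_M = T]
    theta : theta[k] is the control value on [ell_k, ell_{k+1})
    omega : history function used for t < 0
    """
    if t < 0:
        return omega(t)
    # locate k with ell_k <= t < ell_{k+1}
    k = min(bisect.bisect_right(ell, t) - 1, len(theta) - 1)
    return theta[k]

# Illustrative data: M = 3 subsystems on [0, 3], constant delay d = 0.5
ell = [0.0, 1.0, 2.0, 3.0]
theta = [0.2, -0.4, 0.7]
omega = lambda t: 0.0

u_now = u_param(1.25, ell, theta, omega)        # theta_2 on [1, 2)
u_del = u_param(1.25 - 0.5, ell, theta, omega)  # t - d = 0.75, theta_1 on [0, 1)
```

The same lookup with \(t-\mathrm{d}(t)<0\) returns \(\varvec{\omega }(t-\mathrm{d}(t))\), matching the second branch of (3.2).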

Define

$$\begin{aligned} \varOmega :=\{{\varvec{\vartheta }}=[\vartheta _{1}^{1},\cdots , \vartheta _{1}^{r},\cdots ,\vartheta _{M}^{1},\cdots ,\vartheta _{M}^{r}]^{\top } \in {\mathbb {R}}^{M\times r}, a_{q}\leqslant \vartheta _{k}^{q}\leqslant b_{q}\}, \end{aligned}$$

as the set of all admissible parameter vectors, where \(q=1,\cdots ,r\), \(k=1,\cdots ,M\) and \(a_{q},b_{q}\) are given real numbers such that \(a_{q}\leqslant b_{q}\).

Substituting (3.1)–(3.2) into (2.1)–(2.3) yields the following new switched system, which is defined on the subinterval \(t\in [\ell _{k-1},\ell _{k})\):

$$\begin{aligned} \dot{\varvec{x}}(t)=\left\{ \begin{array}{ll} \varvec{f}_{k}(\varvec{x}(t),\varvec{x}(t-\mathrm{d}(t)), \varvec{\vartheta }_{k},\varvec{\vartheta }_{k^{'}}), &{}\ \text {if} \ t-\mathrm{d}(t)\in [\ell _{k^{'}-1},\ell _{k^{'}}),\\ &{} \ \text {for some} \ k,k^{'}\in \{1,\cdots ,M\},\\ \varvec{f}_{k}(\varvec{x}(t),\varvec{\phi }(t-\mathrm{d}(t)), \varvec{\vartheta }_{k},\varvec{\omega }(t-\mathrm{d}(t))), &{} \ \text {if}~t-\mathrm{d}(t)<0, \end{array}\right. \end{aligned}$$
(3.3)

and the initial condition is

$$\begin{aligned} \varvec{x}(t)=\varvec{\phi }(t),~~ t\in [-h,0]. \end{aligned}$$
(3.4)
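The parameterized system (3.3)–(3.4) can be integrated with any standard scheme once the delayed state is read back from the stored trajectory, or from the history \(\varvec{\phi }\), \(\varvec{\omega }\) when \(t-\mathrm{d}(t)<0\). The following forward-Euler sketch shows this for a scalar state; the two subsystems, the delay function, and all numerical data are illustrative assumptions, not taken from the paper:

```python
import math

def simulate(ell, theta, d, phi, omega, fks, dt=1e-3):
    """Forward-Euler integration of the parameterized delayed
    switched system (3.3)-(3.4), scalar state, for illustration.

    The delayed state x(t - d(t)) is read from the stored grid
    values; for t - d(t) < 0 the history phi / omega is used.
    """
    T = ell[-1]
    n = int(round(T / dt))
    ts = [i * dt for i in range(n + 1)]
    xs = [phi(0.0)]

    def control(t):
        if t < 0:
            return omega(t)
        for k in range(len(theta)):
            if ell[k] <= t < ell[k + 1]:
                return theta[k]
        return theta[-1]

    def delayed_state(t):
        td = t - d(t)
        if td < 0:
            return phi(td)
        return xs[min(int(td / dt), len(xs) - 1)]  # nearest grid value

    for i in range(n):
        t = ts[i]
        k = next(j for j in range(len(theta)) if ell[j] <= t < ell[j + 1])
        xs.append(xs[-1] + dt * fks[k](xs[-1], delayed_state(t),
                                       control(t), control(t - d(t))))
    return ts, xs

# Illustrative data: two subsystems on [0, 2], d(t) = 0.25 + 0.1 sin t
ell = [0.0, 1.0, 2.0]
theta = [1.0, -0.5]
d = lambda t: 0.25 + 0.1 * math.sin(t)
phi = lambda t: 1.0
omega = lambda t: 0.0
fks = [lambda x, xd, u, ud: -x + 0.5 * xd + u,
       lambda x, xd, u, ud: -2.0 * x + 0.25 * xd + u + 0.1 * ud]

ts, xs = simulate(ell, theta, d, phi, omega, fks)
```

Note that this direct simulation still works on the original time scale with variable switching times, which is exactly the difficulty the time-scaling transformation of Sect. 3.2 removes.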

Let \(\varvec{x}^{M}(\cdot \mid \varvec{\ell }, \varvec{\vartheta })\) denote the solution of the switched system above for \((\varvec{\ell },\varvec{\vartheta })\in \varXi \times \varOmega \). After control parameterization, Problem 1 becomes an optimal parameter selection problem, which is a finite-dimensional optimization problem, formally defined as follows.

Problem 2

Given the dynamic system (3.3)–(3.4), find an admissible control parameter vector \({\varvec{\vartheta }}\in \varOmega \) and an admissible switching time vector \(\varvec{\ell }\in \varXi \) such that the cost function

$$\begin{aligned} J_{0}(\varvec{\ell },\varvec{\vartheta })=\varPsi _{0}(\varvec{x}^{M} (T\mid \varvec{\ell },\varvec{\vartheta })) \end{aligned}$$

is minimized subject to the canonical constraints

$$\begin{aligned} J_{n}(\varvec{\ell },\varvec{\vartheta })=\varPsi _{n}(\varvec{x}^{M} (T\mid \varvec{\ell },\varvec{\vartheta })) \left\{ \begin{array}{l} =0,\\ \geqslant 0,\\ \end{array}\right. \qquad n=1,\cdots ,m, \end{aligned}$$

where \(\varPsi _{n}:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}},n=0,\cdots ,m\) are given continuously differentiable functions.

3.2 The Time-Scaling Transformation

We first introduce a new time variable \(s\in [-h,M]\); the relationship between the new time variable s and the original time variable t is defined as follows:

$$\begin{aligned} t(s)=\nu (s|\varvec{\delta })=\left\{ \begin{array}{ll} s,\quad &{}{s\in [-h,0]},\\ \sum \limits _{i=1}^{\lfloor s\rfloor }\delta _i+\delta _{{\lfloor s\rfloor }+1} (s-{\lfloor s\rfloor }),\quad &{}{s\in (0,M)},\\ T,\quad &{}{s=M}, \end{array}\right. \end{aligned}$$
(3.5)

where \(\delta _{i}=\ell _{i}-\ell _{i-1}\) is the length of the ith subinterval and \(\lfloor \cdot \rfloor \) denotes the floor function.

Clearly,

$$\begin{aligned} \nu (i|{\varvec{\delta }})=\delta _{1}+\cdots +\delta _{i}=\ell _{i},\quad i=1,\cdots ,M. \end{aligned}$$
(3.6)

Let \(\varDelta \) denote the set of all duration vectors \(\varvec{\delta }:=[\delta _{1},\cdots ,\delta _{M}]^{\top }\in {\mathbb {R}}^{M}\) satisfying the following conditions:

$$\begin{aligned} \mathrm{(a)}~~\delta _{i}\geqslant 0, \quad i=1,\cdots ,M, \qquad \quad \mathrm{(b)}~~\delta _{1}+\cdots +\delta _{M}=T. \end{aligned}$$
(3.7)
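The time-scaling function (3.5) is simple to evaluate directly. The sketch below implements it and checks property (3.6) numerically for an illustrative duration vector (chosen here with one zero-length subinterval, which (3.7)(a) permits; these values are not from the paper):

```python
import math

def nu(s, delta, T):
    """Time-scaling function (3.5): maps s in [-h, M] to t in [-h, T].

    Identity on [-h, 0]; linear with slope delta[i-1] on each
    (i-1, i); equals T at s = M.
    """
    M = len(delta)
    if s <= 0:
        return s
    if s >= M:
        return T
    j = math.floor(s)
    return sum(delta[:j]) + delta[j] * (s - j)

# Illustrative duration vector satisfying (3.7)
delta = [1.0, 0.0, 2.0]
T = sum(delta)

# Check (3.6): nu(i | delta) equals the i-th switching time ell_i
ell = [sum(delta[:i + 1]) for i in range(len(delta))]
assert all(abs(nu(i + 1, delta, T) - e) < 1e-12
           for i, e in enumerate(ell))
```

Sampling this function on a grid also confirms the continuity and monotonicity properties established in the lemma below: the graph is flat exactly over the zero-length subinterval.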

The following lemma collects some useful properties of the function \(\nu (\cdot |{\varvec{\delta }})\).

Lemma 1

For each admissible duration vector \({\varvec{\delta }}\in \varDelta \), the function \(\nu (\cdot |{\varvec{\delta }})\) has the following properties:

  1. \(\nu (\cdot |{\varvec{\delta }})\) is continuous;

  2. \(\nu (\cdot |{\varvec{\delta }})\) is non-decreasing;

  3. \(\nu (\cdot |{\varvec{\delta }})\) is strictly increasing on \([i-1,i]\) if and only if \(\delta _i>0\).

Proof

For part 1, let \(i\leqslant M\) be a positive integer and consider \(\nu (s|{\varvec{\delta }})\) on the open interval \((i-1,i)\):

$$\begin{aligned} \nu (s|{\varvec{\delta }})=\sum _{l=1}^{i-1}\delta _{l} +\delta _{\lfloor s\rfloor +1}(s-\lfloor s\rfloor ), \quad s\in (i-1,i), \end{aligned}$$
(3.8)

where the summation term is empty if \(i\leqslant 1\). It is clear from (3.8) that \(\nu (s|{\varvec{\delta }})\) is linear, and therefore continuous, on \((i-1,i)\). Using Eqs. (3.5) and (3.6), it is easy to see that for any integer i,

$$\begin{aligned} \lim _{s\rightarrow i-}\nu (s|{\varvec{\delta }})=\lim _{s\rightarrow i+} \nu (s|{\varvec{\delta }})=\nu (i|{\varvec{\delta }}). \end{aligned}$$
(3.9)

Equations (3.8) and (3.9) show that the time-scaling function \(\nu (\cdot |{\varvec{\delta }})\) is continuous on [0, M].

It is clear from Eqs. (3.8) and (3.9) that \(\nu (\cdot |{\varvec{\delta }})\) is linear with slope \(\delta _i\geqslant 0\) on \((i-1,i)\). Parts 2 and 3 then follow immediately.

Let

$$\begin{aligned} \varvec{z}(s)=\varvec{x}(\nu (s\mid \varvec{\delta }))=\varvec{x}(t). \end{aligned}$$
(3.10)

Substituting (3.5) into the switched system (3.3)–(3.4), for \(s\in [k-1,k)\), we can transform the time-varying time-delay switched system into the following form:

$$\begin{aligned} \dot{\varvec{z}}(s)=\left\{ \begin{array}{ll} \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{x}(\mu (s\mid \varvec{\delta })),\varvec{\vartheta }_{k},\varvec{\vartheta }_{k^{'}}), &{} \ \text {if} \ \mu (s\mid \varvec{\delta })\in [\ell _{k^{'}-1},\ell _{k^{'}}),\\ &{} \ \text {for some} \ k,k^{'}\in \{1,\cdots ,M\},\\ \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{\phi } (\mu (s\mid \varvec{\delta })),\varvec{\vartheta }_{k}, \varvec{\omega }(\mu (s\mid \varvec{\delta }))), &{} \ \text {if} \ \mu (s\mid \varvec{\delta })<0,\\ \end{array}\right. \end{aligned}$$
(3.11)

and the initial condition is

$$\begin{aligned} \varvec{z}(s)=\varvec{\phi }(s),~~s\in [-h,0], \end{aligned}$$
(3.12)

where \(\mu (s\mid \varvec{\delta })=\nu (s\mid \varvec{\delta }) -d(\nu (s\mid \varvec{\delta }))\).

Note that there are two different state vectors in the above switched system: one defined on the original time scale and one defined on the new time scale. As mentioned in [14], the key to fully converting the state vector to the new time scale is to find the relationship between s and \(\nu (s\mid \varvec{\delta })-d(\nu (s\mid \varvec{\delta }))\), which appears in the switched system.

To find the relationship between s and \(\nu (s\mid \varvec{\delta })-d(\nu (s\mid \varvec{\delta }))\), for each \({\varvec{\delta }}\in \varDelta \), we define a new function \(\kappa (s|{\varvec{\delta }})\) in the new time scale as follows:

$$\begin{aligned} \kappa (s|{\varvec{\delta }})=\sup \big \{\,\gamma \in [-h,M]: \,\nu (\gamma |{\varvec{\delta }})=\nu (s|{\varvec{\delta }}) -d(\nu (s|{\varvec{\delta }}))\,\big \}, \quad s\in [-h,M]. \end{aligned}$$
(3.13)

From the properties of \(\nu (\cdot |{\varvec{\delta }})\), we know that for each \(s\in [-h,M]\), the set on the right-hand side of (3.13) is non-empty, and hence \(\kappa (\cdot |\varvec{\delta })\) is well-defined.

Since the time-scaling function \(\nu (\cdot |{\varvec{\delta }})\) is continuous and non-decreasing, it is easy to obtain

$$\begin{aligned} \nu (\kappa (s|{\varvec{\delta }})|{\varvec{\delta }}) =\nu (s|{\varvec{\delta }})-d(\nu (s|{\varvec{\delta }})). \end{aligned}$$
(3.14)

The relationship between \((s,\nu (s|\varvec{\delta }))\) and \((\kappa (s|{\varvec{\delta }}),\nu (s|{\varvec{\delta }}) -d(\nu (s|{\varvec{\delta }})))\) is illustrated in Fig. 1.

Fig. 1
figure 1

The relationship between \((s,\nu (s|\varvec{\delta }))\) and \((\kappa (s|{\varvec{\delta }}),\nu (s|{\varvec{\delta }}) -d(\nu (s|{\varvec{\delta }})))\)

Lemma 2

For each admissible duration vector \({\varvec{\delta }}\in \varDelta \), the function \(\kappa (\cdot |{\varvec{\delta }})\) satisfies \(\kappa (s|{\varvec{\delta }})\leqslant s\) for all \(s\in [-h,M]\).

Proof

For notational simplicity, we omit the argument \({\varvec{\delta }}\) in \(\nu (\cdot |{\varvec{\delta }})\) and \(\kappa (\cdot |{\varvec{\delta }})\).

Suppose, to the contrary, that \(\kappa (s)>s\) for some s. Since \(\nu (\cdot |{\varvec{\delta }})\) is non-decreasing, we obtain from Eq. (3.14) that

$$\begin{aligned} \nu (s)-d(\nu (s))=\nu (\kappa (s))>\nu (s), \end{aligned}$$
(3.15)

which is a contradiction because \(d(\nu (s))\geqslant 0\). Hence, \(\kappa (s)\leqslant s\) for all \(s\in [-h,M]\).

To derive an explicit formula for \(\kappa (s)\), we need the following lemma.

Lemma 3

For any given \(s\in [0,M)\) and \(\varvec{\delta }\), there exists a unique \(u\in \{1,\cdots ,M\}\) such that \(\delta _{u}>0\) and \(s\in [\nu (u-1|\varvec{\delta }),\nu ( u|\varvec{\delta }))\).

Proof

The proof of Lemma 3 is similar to the proof of Lemma 1 in [20], and hence is omitted here.

We can now give an explicit formula for \(\kappa (s)\), presented in the following theorem.

Theorem 3.1

Let \({\varvec{\delta }}\in \varDelta \). For each \(s\in [-h,M]\), if \(\nu (s|{\varvec{\delta }})-d(\nu (s|{\varvec{\delta }}))<0\), then

$$\begin{aligned} \kappa (s|{\varvec{\delta }}) = \nu (s|{\varvec{\delta }}) -d(\nu (s|{\varvec{\delta }})). \end{aligned}$$

Otherwise, let \(\rho (s|\varvec{\delta })\) denote the unique integer such that \(\delta _{\rho (s|\varvec{\delta })+1}>0\) and

$$\begin{aligned} \nu (s|\varvec{\delta })-d(\nu (s|{\varvec{\delta }})) \in \bigg [\displaystyle \sum _{k=1}^{\rho (s|{\varvec{\delta }})} \delta _{k}, \sum _{k=1}^{\rho (s|{\varvec{\delta }})+1}\delta _{k}\bigg ). \end{aligned}$$
(3.16)

Then, the following equation holds:

$$\begin{aligned} \kappa (s|{\varvec{\delta }})=\rho (s|{\varvec{\delta }}) +\displaystyle \sum _{l=\rho (s|{\varvec{\delta }})+1}^{\lfloor s\rfloor } \delta _{\rho (s|{\varvec{\delta }})+1}^{-1}\delta _l +\delta _{\rho (s|{\varvec{\delta }})+1}^{-1}\delta _{\lfloor s\rfloor +1} (s-\lfloor s\rfloor )-d(\nu (s|{\varvec{\delta }})) \delta _{\rho (s|{\varvec{\delta }})+1}^{-1}. \end{aligned}$$

Proof

The proof of Theorem 3.1 is similar to the proof of Theorem 1 in [20], and hence is omitted here.
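Theorem 3.1 can be checked numerically against the defining identity (3.14). The sketch below implements \(\kappa (s|{\varvec{\delta }})\) via the explicit formula (with a 0-based index `rho` playing the role of \(\rho (s|{\varvec{\delta }})\), so `delta[rho]` corresponds to \(\delta _{\rho +1}\)); the duration vector and delay function are illustrative assumptions:

```python
import math

def nu(s, delta, T):
    """Time-scaling function (3.5)."""
    M = len(delta)
    if s <= 0:
        return s
    if s >= M:
        return T
    j = math.floor(s)
    return sum(delta[:j]) + delta[j] * (s - j)

def kappa(s, delta, d, T):
    """Explicit kappa(s | delta) from Theorem 3.1."""
    tau = nu(s, delta, T) - d(nu(s, delta, T))
    if tau < 0:
        return tau  # first case of the theorem (nu is the identity here)
    # rho: tau lies in [ell_rho, ell_{rho+1}) with delta[rho] > 0,
    # cf. (3.16) and Lemma 3 (0-based index)
    acc, rho = 0.0, 0
    for k in range(len(delta)):
        if acc <= tau < acc + delta[k] and delta[k] > 0:
            rho = k
            break
        acc += delta[k]
    return rho + (tau - sum(delta[:rho])) / delta[rho]

# Verify the defining identity (3.14): nu(kappa(s)) = nu(s) - d(nu(s))
delta = [1.0, 0.5, 1.5]
T = sum(delta)
d = lambda t: 0.3 + 0.1 * math.sin(t)
for s in [0.2, 0.9, 1.3, 2.4, 2.9]:
    lhs = nu(kappa(s, delta, d, T), delta, T)
    rhs = nu(s, delta, T) - d(nu(s, delta, T))
    assert abs(lhs - rhs) < 1e-10
```

The point s = 0.2 exercises the first case of the theorem (the delayed time falls before 0), while the remaining samples exercise the explicit formula.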

Let

$$\begin{aligned} \varvec{z}(\kappa (s))=\varvec{x}(\nu (s\mid \varvec{\delta }) -d(\nu (s\mid \varvec{\delta })))=\varvec{x}(t-\mathrm{d}(t)). \end{aligned}$$
(3.17)

For \(s\in [k-1,k)\), we can transform the time-varying time-delay switched system into the following form:

$$\begin{aligned} \dot{\varvec{z}}(s)=\left\{ \begin{array}{ll} \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{z}(\kappa (s)), \varvec{\vartheta }_{k},\varvec{\vartheta }_{k^{'}}), &{} \ \text {if} \ \kappa (s)\in [k^{'}-1,k^{'}),\\ &{}\ \text {for some} \ k, k^{'}\in \{1,\cdots ,M\},\\ \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{\phi }(\kappa (s)), \varvec{\vartheta }_{k},\varvec{\omega }(\kappa (s))), &{} \ \text {if} \ \kappa (s)<0,\\ \end{array}\right. \end{aligned}$$
(3.18)

and the initial condition is

$$\begin{aligned} \varvec{z}(s)=\varvec{\phi }(s),~~s\in [-h,0]. \end{aligned}$$
(3.19)

Let \(\varvec{z}(\cdot \mid \varvec{\delta }, \varvec{\vartheta })\) denote the solution of (3.18)–(3.19); then Problem 2 can be transformed into Problem 3.

Problem 3

Given the dynamic system (3.18)–(3.19), find \((\varvec{\delta },\varvec{\vartheta })\in \varDelta \times \varOmega \) such that the cost function

$$\begin{aligned} \tilde{J}_{0}(\varvec{\delta },\varvec{\vartheta }) =\tilde{\varPsi }_{0}(\varvec{z}(M\mid \varvec{\delta },\varvec{\vartheta })) \end{aligned}$$

is minimized subject to the canonical constraints

$$\begin{aligned} \tilde{J}_{n}(\varvec{\delta },\varvec{\vartheta }) =\tilde{\varPsi }_{n}(\varvec{z}(M\mid \varvec{\delta },\varvec{\vartheta })) \left\{ \begin{array}{l} =0,\\ \geqslant 0,\\ \end{array}\right. \qquad \quad n=1,\cdots ,m, \end{aligned}$$

where \(\tilde{\varPsi }_{n}:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}},n=0,\cdots ,m\) are given continuously differentiable functions.

4 Gradient Computation

Since we adopt a gradient-based optimization method to solve Problem 3, we need to compute the gradients of the cost and constraint functions with respect to \(\varvec{\delta }\) and \(\varvec{\vartheta }\).

First, we consider the gradients of the cost and constraint functions with respect to \(\varvec{\delta }\). Note that

$$\begin{aligned} \frac{\partial \tilde{J}_{n}(\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{\delta }}=&\frac{\partial \tilde{\varPsi }_{n} (\varvec{z}(M|\varvec{\delta },\varvec{\vartheta }))}{\partial \varvec{z}(M|\varvec{\delta },\varvec{\vartheta })}\cdot \frac{\partial \varvec{z}(M|\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{\delta }}. \end{aligned}$$
(4.1)

Thus, we need to calculate

$$\begin{aligned} \frac{\partial \varvec{z}(M|\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{\delta }}. \end{aligned}$$
(4.2)

To achieve that, we have the following theorem.

Theorem 4.1

For each pair \((\varvec{\delta },\varvec{\vartheta }) \in \varDelta \times \varOmega \), we have

$$\begin{aligned} \frac{\partial \varvec{z}(s\mid \varvec{\delta }, \varvec{\vartheta })}{\partial \varvec{\delta }} =\varvec{\varUpsilon }(s\mid \varvec{\delta },\varvec{\vartheta }), \ s\in [0,M], \end{aligned}$$

where \(\varvec{\varUpsilon }(\cdot \mid \varvec{\delta }, \varvec{\vartheta })\) is the solution of the following auxiliary dynamic system on each subinterval \([k-1,k)\):

$$\begin{aligned} \dot{\varvec{\varUpsilon }}(s\mid \varvec{\delta },\varvec{\vartheta })&= \frac{\partial \hat{\varvec{f}}_{k}(\varvec{z}(s),\varvec{z} (\kappa (s\mid \varvec{\delta })),\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{z(s)}}\varvec{\varUpsilon }(s\mid \varvec{\delta }, \varvec{\vartheta })\nonumber \\&\quad +\frac{\partial \hat{\varvec{f}}_{k}(\varvec{z}(s), \varvec{z}(\kappa (s\mid \varvec{\delta })),\varvec{\delta }, \varvec{\vartheta })}{\partial \varvec{z}(\kappa (s\mid \varvec{\delta }))} \left[ \varvec{\varUpsilon }(\kappa (s\mid \varvec{\delta })\mid \varvec{\delta }, \varvec{\vartheta })\right. \nonumber \\&\quad \left. +\frac{\partial \varvec{z}(\kappa (s\mid \varvec{\delta }))}{\partial \kappa (s\mid \varvec{\delta })}\frac{\partial \kappa (s\mid \varvec{\delta })}{\partial \varvec{\delta }}\right] +\frac{\partial \hat{\varvec{f}}_{k}(\varvec{z}(s), \varvec{z}(\kappa (s\mid \varvec{\delta })),\varvec{\delta }, \varvec{\vartheta })}{\partial \varvec{\delta }} \end{aligned}$$
(4.3)

with the initial condition

$$\begin{aligned} \varvec{\varUpsilon }(s\mid \varvec{\delta },\varvec{\vartheta }) = \varvec{0}, \ s\leqslant 0, \end{aligned}$$
(4.4)

for \(s\in [k-1,k)\),

$$\begin{aligned} \hat{\varvec{f}}_{k}(\varvec{z}(s),\varvec{z}(\kappa (s)), \varvec{\delta },\varvec{\vartheta })=\left\{ \begin{array}{ll} \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{z}(\kappa (s)), \varvec{\vartheta }_{k},\varvec{\vartheta }_{k^{'}}), &{} \ \kappa (s)\in [k^{'}-1,k^{'}),\\ \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{\phi }(\kappa (s)), \varvec{\vartheta }_{k},\varvec{\omega }(\kappa (s))), &{} \ \kappa (s)<0, \end{array}\right. \end{aligned}$$

where \(k,k^{'}=1,\cdots ,M\).

Proof

The proof of Theorem 4.1 is similar to that given for Theorem 3.1 in [13], and hence is omitted.

From Theorem 4.1, it is clear that we need to calculate the derivative of \(\kappa (\cdot |{\varvec{\delta }})\) with respect to \(\delta _k\), \(k=1,\cdots ,M\). Let \({\mathcal {S}}'\) denote the set of points s such that \(\kappa (s|{\varvec{\delta }})\in \{0,1,\cdots ,M-1\}\). We have the following result.

Theorem 4.2

For all \(s\notin {\mathcal {S}}'\),

$$\begin{aligned} \frac{\partial \kappa (s|{\varvec{\delta }})}{\partial \delta _k} =\delta _{\rho (s|{\varvec{\delta }})+1}^{-1}\bigg \{\frac{\partial \nu (s|{\varvec{\delta }})}{\partial \delta _k}(1-\frac{\partial d(\nu (s|{\varvec{\delta }}))}{\partial \nu (s|{\varvec{\delta }})}) -\frac{\partial \nu (\kappa (s|{\varvec{\delta }})|{\varvec{\delta }})}{\partial \delta _k}\bigg \}, \quad k=1,\cdots ,M, \end{aligned}$$

where \(\rho (s|{\varvec{\delta }})\) is defined in Theorem 3.1.

Proof

We omit the argument \({\varvec{\delta }}\) in \(\nu (\cdot |{\varvec{\delta }})\), \(\kappa (\cdot |{\varvec{\delta }})\) and \(\rho (\cdot |{\varvec{\delta }})\) for simplicity. First, note from the definition of \(\nu (\cdot )\) that for arbitrary s,

$$\begin{aligned} \frac{\partial \nu (s)}{\partial \delta _k}={\left\{ \begin{array}{ll} 1, &{} \text {if}\ s\geqslant 0\ \text {and}\ k=1,\cdots ,\lfloor s\rfloor ,\\ s-\lfloor s\rfloor , &{} \text {if}\ s\geqslant 0\ \text {and}\ k=\lfloor s\rfloor +1,\\ 0, &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Now, for all \(s\notin {\mathcal {S}}'\), we can differentiate Eq. (3.14) with respect to \(\delta _k\) to obtain

$$\begin{aligned} \frac{\partial \nu (s)}{\partial \delta _k}=\frac{\partial \nu (\kappa (s))}{\partial \delta _k}+\frac{\partial \nu (\kappa (s))}{\partial \kappa (s)} \frac{\partial \kappa (s)}{\partial \delta _k}+\frac{\partial d(\nu (s|{\varvec{\delta }}))}{\partial \nu (s)} \frac{\partial \nu (s)}{\partial \delta _k}. \end{aligned}$$
(4.5)

Clearly,

$$\begin{aligned} \frac{\partial \nu (s)}{\partial s}=\delta _{\lfloor s\rfloor +1}, \quad s\in [k-1,k), \quad k=1,\cdots ,M. \end{aligned}$$
(4.6)

Since \(s\notin {\mathcal {S}}'\), \(\partial \nu (\kappa (s))/\partial \kappa (s)\) exists, and it follows from (4.6) that

$$\begin{aligned} \frac{\partial \nu (\kappa (s))}{\partial \kappa (s)}=\delta _{\lfloor \kappa (s)\rfloor +1}=\delta _{\rho (s)+1}. \end{aligned}$$
(4.7)

Since \(\kappa (s)\notin \{0,\cdots ,M-1\}\), it follows from the definition of \(\kappa (s)\) that \(\delta _{\rho (s)+1}=\delta _{\lfloor \kappa (s)\rfloor +1}> 0\). Substituting (4.7) into (4.5) and rearranging the terms, we obtain

$$\begin{aligned} \frac{\partial \kappa (s|{\varvec{\delta }})}{\partial \delta _k} =\delta _{\rho (s|{\varvec{\delta }})+1}^{-1} \bigg \{\frac{\partial \nu (s|{\varvec{\delta }})}{\partial \delta _k}\left( 1-\frac{\partial d(\nu (s|{\varvec{\delta }}))}{\partial \nu (s|{\varvec{\delta }})}\right) -\frac{\partial \nu (\kappa (s|{\varvec{\delta }})|{\varvec{\delta }})}{\partial \delta _k}\bigg \}. \end{aligned}$$
(4.8)
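Formula (4.8) can be validated against a finite-difference quotient. The sketch below implements \(\partial \nu /\partial \delta _k\), the explicit \(\kappa \), and the Theorem 4.2 formula (0-based k, with `delta[rho]` standing for \(\delta _{\rho +1}\)); the duration vector and delay function are illustrative assumptions:

```python
import math

def nu(s, delta):
    """Time-scaling function (3.5)."""
    M, T = len(delta), sum(delta)
    if s <= 0:
        return s
    if s >= M:
        return T
    j = math.floor(s)
    return sum(delta[:j]) + delta[j] * (s - j)

def dnu_ddelta(s, k, delta):
    """Partial derivative of nu(s | delta) w.r.t. delta_k (0-based k)."""
    if s < 0:
        return 0.0
    j = math.floor(s)
    if k < j:
        return 1.0
    if k == j:
        return s - j
    return 0.0

def kappa(s, delta, d):
    """Explicit kappa(s | delta) from Theorem 3.1 (tau >= 0 branch)."""
    tau = nu(s, delta) - d(nu(s, delta))
    if tau < 0:
        return tau
    acc, rho = 0.0, 0
    for i in range(len(delta)):
        if acc <= tau < acc + delta[i] and delta[i] > 0:
            rho = i
            break
        acc += delta[i]
    return rho + (tau - sum(delta[:rho])) / delta[rho]

def dkappa_ddelta(s, k, delta, d, dprime):
    """Theorem 4.2 formula; dprime is d'(t)."""
    ks = kappa(s, delta, d)
    rho = math.floor(ks)  # valid for s outside the exceptional set S'
    return (dnu_ddelta(s, k, delta) * (1.0 - dprime(nu(s, delta)))
            - dnu_ddelta(ks, k, delta)) / delta[rho]

# Finite-difference check at a point s outside S'
delta = [1.0, 0.5, 1.5]
d = lambda t: 0.3 + 0.1 * math.sin(t)
dprime = lambda t: 0.1 * math.cos(t)
s, k, eps = 2.4, 1, 1e-6

pert = delta[:]
pert[k] += eps
fd = (kappa(s, pert, d) - kappa(s, delta, d)) / eps
assert abs(dkappa_ddelta(s, k, delta, d, dprime) - fd) < 1e-4
```

The agreement to finite-difference accuracy reflects that, away from \({\mathcal {S}}'\), \(\kappa (s|{\varvec{\delta }})\) depends smoothly on the duration vector.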

Similarly, we also need the gradients of the cost function and constraint functions with respect to \(\varvec{\vartheta }\), which is given in the following theorem.

Theorem 4.3

The gradients of \(\tilde{J}_{n}(\varvec{\delta }, \varvec{\vartheta }), \ n=0,1,\cdots ,m,\) with respect to \(\varvec{\vartheta }\) can be written as:

$$\begin{aligned} \frac{\partial \tilde{J}_{n}(\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{\vartheta }}=&\frac{\partial \tilde{\varPsi }_{n} (\varvec{z}(M|\varvec{\delta },\varvec{\vartheta }))}{\partial \varvec{z}(M|\varvec{\delta },\varvec{\vartheta })}\cdot \frac{\partial \varvec{z}(M|\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{\vartheta }}. \end{aligned}$$
(4.9)

Proof

The proof follows by applying the chain rule.

From Theorem 4.3, we see that computing the gradients of the cost and constraint functions with respect to \({\varvec{\vartheta }}\) requires the gradient of the state vector with respect to \({\varvec{\vartheta }}\). This is given in the following theorem.

Theorem 4.4

For each pair \((\varvec{\delta },\varvec{\vartheta }) \in \varDelta \times \varOmega \), we have

$$\begin{aligned} \frac{\partial \varvec{z}(s\mid \varvec{\delta }, \varvec{\vartheta })}{\partial \varvec{\vartheta }} =\varvec{\varPi }(s\mid \varvec{\delta },\varvec{\vartheta }), \ s\in [0,M], \end{aligned}$$

where \(\varvec{\varPi }(\cdot \mid \varvec{\delta }, \varvec{\vartheta })\) is the solution of the following auxiliary dynamic system on each subinterval \([k-1,k)\):

$$\begin{aligned} \dot{\varvec{\varPi }}(s\mid \varvec{\delta },\varvec{\vartheta })&= \frac{\partial \hat{\varvec{f}}_{k}(\varvec{z}(s), \varvec{z}(\kappa (s)),\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{z}(s)}\varvec{\varPi }(s\mid \varvec{\delta }, \varvec{\vartheta })\nonumber \\&\quad +\frac{\partial \hat{\varvec{f}}_{k}(\varvec{z}(s), \varvec{z}(\kappa (s)),\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{z}(\kappa (s))}\varvec{\varPi }(\kappa (s) \mid \varvec{\delta },\varvec{\vartheta })\nonumber \\&\quad +\frac{\partial \hat{\varvec{f}}_{k}(\varvec{z}(s), \varvec{z}(\kappa (s)),\varvec{\delta },\varvec{\vartheta })}{\partial \varvec{\vartheta }} \end{aligned}$$
(4.10)

with the initial condition

$$\begin{aligned} \varvec{\varPi }(s)=\ \varvec{0}, \ s\leqslant 0, \end{aligned}$$
(4.11)

for \(s\in [k-1,k)\),

$$\begin{aligned} \hat{\varvec{f}}_{k}(\varvec{z}(s),\varvec{z}(\kappa (s)), \varvec{\delta },\varvec{\vartheta })=\left\{ \begin{array}{ll} \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{z}(\kappa (s)), \varvec{\vartheta }_{k},\varvec{\vartheta }_{k^{'}}), &{} \ \kappa (s)\in [k^{'}-1,k^{'}),\\ \delta _{k}\varvec{f}_{k}(\varvec{z}(s),\varvec{\phi }(\kappa (s)), \varvec{\vartheta }_{k},\varvec{\omega }(\kappa (s))), &{} \ \kappa (s)<0, \end{array}\right. \end{aligned}$$

where \(k,k^{'}=1,\cdots ,M\).

Proof

The proof of Theorem 4.4 is similar to that given for Theorem 3.3 in [13], and hence is omitted.
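To see the mechanics of the auxiliary system (4.10)–(4.11) in miniature, consider the following sketch (a deliberate simplification of our own: scalar state, no delay term, no mode switching). It integrates a state together with its sensitivity \(\varPi =\partial z/\partial \vartheta \) by forward Euler, and the result can be checked against the analytic derivative:

```python
def simulate(theta, T=1.0, n=2000):
    """Forward-Euler integration of dz/ds = -theta*z together with the
    sensitivity equation dPi/ds = (df/dz)*Pi + df/dtheta, Pi(0) = 0."""
    h = T / n
    z, Pi = 1.0, 0.0
    for _ in range(n):
        df_dz, df_dth = -theta, -z        # partials of f(z, theta) = -theta*z
        z, Pi = z + h * (-theta * z), Pi + h * (df_dz * Pi + df_dth)
    return z, Pi

z, Pi = simulate(1.0)
# Analytically z(1) = exp(-1) and dz/dtheta = -exp(-1); Euler matches to O(h).
```

The full system in Theorem 4.4 has the same structure, with the extra term involving \(\varvec{\varPi }(\kappa (s))\) accounting for the dependence of the delayed state on \(\varvec{\vartheta }\).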

After obtaining the gradients of the cost function and the constraints with respect to the decision variables, we can now use the existing constrained nonlinear optimization packages such as FMINCON in MATLAB or NLPQLP in FORTRAN to solve Problem 3. In the next section, we will demonstrate the effectiveness of this approach through solving two numerical examples.
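As an illustration of this final step, a gradient-based NLP solver can consume the computed gradients directly. The sketch below uses SciPy's SLSQP as a stand-in for FMINCON/NLPQLP, with a placeholder quadratic cost in place of \(\tilde{J}_0\) (in practice the cost and its gradient would come from integrating the transformed and auxiliary systems); the equality constraint plays the role of the fixed total duration \(\sum _k \delta _k = T\):

```python
import numpy as np
from scipy.optimize import minimize

def cost_and_grad(p):
    # Placeholder smooth cost standing in for J~_0 and its gradient;
    # the real pair comes from Theorems 4.2-4.4.
    J = np.sum((p - 0.5) ** 2)
    return J, 2.0 * (p - 0.5)

res = minimize(cost_and_grad, x0=np.array([0.1, 0.7, 0.2]), jac=True,
               method="SLSQP",
               constraints=[{"type": "eq",
                             "fun": lambda p: np.sum(p) - 1.0,
                             "jac": lambda p: np.ones_like(p)}],
               bounds=[(0.0, 1.0)] * 3)
print(res.x)  # minimizer of the placeholder cost on the simplex-like set
```

Any solver accepting user-supplied gradients, bounds, and nonlinear constraints can be substituted here.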

5 Numerical Examples

5.1 Example 1: Optimal Control Problem with Time-Varying Time-Delay Switched System

Consider the following nonlinear time-varying time-delay switched system with two subsystems. It is a slight modification of the second example in [31], with the constant delay replaced by a function of \(t\):

$$\begin{aligned}&S_1:\left\{ \begin{aligned} \dot{x}_1(t)&=-5x_1(t)-4x_2(t)-3x_1(t-\exp (-t/3))\\&\quad +2x_2(t-\exp (-t/3))+u_1(t)x_1(t)+u_2(t)x_2(t)\\&\quad +0.1\tanh (x_1(t)),\\ \dot{x}_2(t)&= 0.1x_1(t)-7x_2(t)+0.5u_1(t)x_1(t)+0.5u_2(t)x_2(t)\\&\quad -\sin (x_2(t-\exp (-t/3))), \end{aligned}\right. \qquad \text {if}\ 0<t\leqslant \ell _{1}, \end{aligned}$$
(5.1)
$$\begin{aligned}&S_2: \left\{ \begin{aligned} \dot{x}_1(t)&= -4x_1(t)+0.5x_2(t)+0.2\sin (x_2(t))+t^2+8,\\ \dot{x}_2(t)&= 5x_1(t)-5x_2(t)+0.5\sin (x_2(t-\exp (-t/3)))\\&\quad -u_1(t)x_1(t)-u_2(t)x_2(t), \end{aligned}\right. \qquad \text {if}\ \ell _{1}<t\leqslant 1.5, \end{aligned}$$
(5.2)

and the initial state is given by

$$\begin{aligned} x_1(t)=6,\quad x_2(t)=t^2+2,\quad t\leqslant 0, \end{aligned}$$
(5.3)

where the switching time \(\ell _{1}\) and control vector \(\varvec{u}(t)=[u_{1}(t),u_{2}(t)]^{\top }\) are decision variables that need to be optimized.

Our optimal control problem is thus stated as follows: Subject to the initial conditions (5.3) and the dynamic systems (5.1)–(5.2), choose the switching time \(\ell _{1}\) and control vector \(\varvec{u}(t)=[u_{1}(t),u_{2}(t)]^{\top }\) to minimize

$$\begin{aligned} J_0(T,\varvec{u}(t),{\varvec{\ell }})=(x_1(1.5)-2)^2+(x_2(1.5)-1)^2, \end{aligned}$$
(5.4)

where \((x_1(1.5),x_2(1.5))\) denotes the terminal state and \((2,1)\) is the desired final state.

A switched system consists of a number of subsystems and a switching law; here, we assume that the switching sequence is \(S_{1}\Rightarrow S_{2}\). In order to apply the control parameterization method and the time-scaling transformation technique to this problem, we introduce two new variables \(\varvec{\delta }=[\delta _{1},\delta _{2}]^{\top }\) and \(\varvec{\vartheta }=[\vartheta _{1},\vartheta _{2}, \vartheta _{3},\vartheta _{4}]^{\top }\), representing the subsystem durations and the control parameter vector, respectively, where \(\delta _{1}=\ell _{1}\) and \(\delta _{2}=1.5-\ell _{1}\). Moreover, we have \(\delta _1+\delta _2=1.5\).

By applying the control parameterization method and the time-scaling transformation technique, (5.1)–(5.2) can be converted into the following form:

$$\begin{aligned}&\tilde{S}_1:\left\{ \begin{aligned} \dot{z}_1(s)&= \delta _1(-5z_1(s)-4z_2(s)-3z_1(\kappa (s))+2z_2(\kappa (s))\\&\quad +\vartheta _1z_1(s)+\vartheta _3z_2(s)+0.1\tanh (z_1(s))),\\ \dot{z}_2(s)&= \delta _1(0.1z_1(s)-7z_2(s)+0.5\vartheta _1z_1(s)+0.5\vartheta _3z_2(s)\\&\quad -\sin (z_2(\kappa (s)))), \end{aligned}\right. \qquad \text {if}~ 0<s\leqslant {1}, \end{aligned}$$
(5.5)
$$\begin{aligned}&\tilde{S}_2:\left\{ \begin{aligned} \dot{z}_1(s)&= \delta _2(-4z_1(s)+0.5z_2(s)+0.2\sin (z_2(s))+\nu (s)^2+8),\\ \dot{z}_2(s)&= \delta _2(5z_1(s)-5z_2(s)+0.5\sin (z_2(\kappa (s)))\\&\quad -\vartheta _2z_1(s)-\vartheta _4z_2(s)), \end{aligned}\right. \qquad \text {if}~ 1<s\leqslant {2}, \end{aligned}$$
(5.6)

where \(\varvec{z}(s)=\varvec{x}(t(s))\) and \(\varvec{z}(\kappa (s))=\varvec{x}(t(s)-\mathrm{d}(t(s)))\).

The initial conditions (5.3) become

$$\begin{aligned} z_1(s)=6,\quad z_2(s)=\nu (s)^2+2,\quad s\leqslant 0. \end{aligned}$$
(5.7)

Furthermore, the cost function (5.4) becomes

$$\begin{aligned} \tilde{J}_0(s,{\varvec{\delta }},{\varvec{\vartheta }}) =(z_1(2)-2)^2+(z_2(2)-1)^2. \end{aligned}$$
(5.8)

Thus, the transformed problem can be stated as follows: Subject to the new dynamic systems (5.5)–(5.6) and the new initial conditions (5.7), choose the mode durations \(\delta _{i},i=1,2\) and parameter vectors \(\vartheta _{i},i=1,2,3,4\), to minimize \(\tilde{J}_0\).

Let the control constraints be \(-4.0<\vartheta _1,\vartheta _2, \vartheta _3,\vartheta _4\leqslant 6.0\). In the new problem, the initial guess for the parameter vector is \([0.1,0.2,0.5,0.8]^{\top }\) and the initial guess for the duration vector is \([0.9,0.6]^{\top }\).
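For a feel of the dynamics at this initial guess, the following forward-Euler sketch (our own illustration; the step size, the linear interpolation of the delayed state, and holding the controls at the initial guesses \(u=(\vartheta _1,\vartheta _3)\) on \(S_1\) and \(u=(\vartheta _2,\vartheta _4)\) on \(S_2\) are all assumptions) simulates (5.1)–(5.2) in the original time variable and evaluates the cost (5.4):

```python
import numpy as np

def delayed(ts, x, tau):
    """Delayed state x(tau): history (5.3) for tau <= 0, else interpolation."""
    if tau <= 0.0:
        return np.array([6.0, tau**2 + 2.0])
    return np.array([np.interp(tau, ts, x[:, 0]), np.interp(tau, ts, x[:, 1])])

def simulate(l1=0.9, th=(0.1, 0.2, 0.5, 0.8), T=1.5, n=3000):
    h = T / n
    ts = np.linspace(0.0, T, n + 1)
    x = np.zeros((n + 1, 2))
    x[0] = [6.0, 2.0]                       # phi(0) = (6, 0^2 + 2)
    for i in range(n):
        t = ts[i]
        x1, x2 = x[i]
        xd1, xd2 = delayed(ts, x, t - np.exp(-t / 3.0))
        if t <= l1:                         # subsystem S1
            u1, u2 = th[0], th[2]
            f1 = (-5*x1 - 4*x2 - 3*xd1 + 2*xd2
                  + u1*x1 + u2*x2 + 0.1*np.tanh(x1))
            f2 = 0.1*x1 - 7*x2 + 0.5*u1*x1 + 0.5*u2*x2 - np.sin(xd2)
        else:                               # subsystem S2
            u1, u2 = th[1], th[3]
            f1 = -4*x1 + 0.5*x2 + 0.2*np.sin(x2) + t**2 + 8.0
            f2 = 5*x1 - 5*x2 + 0.5*np.sin(xd2) - u1*x1 - u2*x2
        x[i + 1] = x[i] + h * np.array([f1, f2])
    J = (x[-1, 0] - 2.0)**2 + (x[-1, 1] - 1.0)**2
    return ts, x, J

ts, x, J = simulate()
print(J)  # cost at the initial guess, before any optimization
```

Since the delay \(\mathrm{e}^{-t/3}\) always exceeds \(t\) for small \(t\), the history function is active near the start of the horizon, exactly as in the second case of \(\hat{\varvec{f}}_{k}\) above.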

Comparing the numerical results obtained by the new method with those obtained by the traditional control parameterization method shows that the new method achieves a more accurate optimal value. The detailed numerical results are listed in Tables 1 and 2. In addition, the optimal state trajectories obtained by the two methods are illustrated in Figs. 2 and 3, and the corresponding optimal controls are illustrated in Figs. 4 and 5.

Table 1 Numerical results of Example 1 by using the traditional control parameterization method
Table 2 Numerical results of Example 1 by using the new method
Fig. 2
figure 2

Optimal state trajectory for Example 1 by using traditional control parameterization method

Fig. 3
figure 3

Optimal state trajectory for Example 1 by using the new method

Fig. 4
figure 4

Optimal control value for Example 1 by using the new method

Fig. 5
figure 5

Optimal control value for Example 1 by using the traditional control parameterization method

5.2 Example 2: Optimal Control Problem with Time-Varying Time-Delay Switched System

Consider the following nonlinear time-varying time-delay switched system with three subsystems. Again, the dynamic system comes from the first example in [31, 32], with the constant delay replaced by another function of \(t\):

$$\begin{aligned}&S_1:\left\{ \begin{array}{l} \dot{x}_1(t) = 2x_1(t)x_2(t)+x_2(t-\exp (-t)),\\ \dot{x}_2(t) = 3x_1(t)+4x_2(t-\exp (-t)), \end{array}\right. \qquad \text {if}\ 0<t\leqslant \ell _{1}, \end{aligned}$$
(5.9)
$$\begin{aligned}&S_2: \left\{ \begin{array}{l} \dot{x}_1(t) = -2x_1(t)x_2(t)+\sin (x_2(t-\exp (-t))),\\ \dot{x}_2(t) = x_1(t)x_2(t)+x_1(t-\exp (-t))x_2(t-\exp (-t)), \end{array}\right. \qquad \text {if}\ \ell _{1}<t\leqslant \ell _{2}, \end{aligned}$$
(5.10)
$$\begin{aligned}&S_3:\left\{ \begin{array}{l} \dot{x}_1(t) = t^2-2x_1(t)+3x_2(t-\exp (-t)),\\ \dot{x}_2(t) = -x_2(t)+x_1(t-\exp (-t))x_2(t-\exp (-t)), \end{array}\right. \qquad \text {if}\ \ell _{2}<t\leqslant 1, \end{aligned}$$
(5.11)

and the initial conditions are

$$\begin{aligned} x_1(t)=t-1,\quad x_2(t)=t^2+1,\quad t\leqslant 0, \end{aligned}$$
(5.12)

where the switching time vector \({\varvec{\ell }} =[\ell _{1},\ell _{2}]^{\top }\) is a decision variable that needs to be optimized.

Our optimal control problem is thus stated as follows: Subject to the initial conditions (5.12) and the dynamic systems (5.9)–(5.11), choose the switching time vector \({\varvec{\ell }} =[\ell _{1},\ell _{2}]^{\top }\) to minimize

$$\begin{aligned} J_0(T)=(x_1(1)-0.5)^2+(x_2(1)-0.25)^2, \end{aligned}$$
(5.13)

where \((x_1(1),x_2(1))\) denotes the terminal state and \((0.5,0.25)\) is the desired final state.

As in [31], we assume that the switching sequence is \(S_1 \Rightarrow S_2\Rightarrow S_3\). Let \(\delta _{1}=\ell _{1}\), \(\delta _{2}=\ell _{2}-\ell _{1}\) and \(\delta _{3}=1-\ell _{2}\); clearly, \(\delta _1\), \(\delta _2\) and \(\delta _3\) represent the durations of the subsystems \(S_1\), \(S_2\) and \(S_3\), respectively. Moreover, we have \(\delta _1+\delta _2+\delta _3=1\) and \(\delta _i \geqslant 0,\ i=1,2,3\).

By applying the time-scaling transformation technique, (5.9)–(5.11) can be converted into the following form:

$$\begin{aligned}&\tilde{S}_1:\left\{ \begin{array}{l} \dot{z}_1(s) = \delta _1(2z_1(s)z_2(s)+z_2(\kappa (s))),\\ \dot{z}_2(s) = \delta _1(3z_1(s)+4z_2(\kappa (s))), \end{array}\right. \qquad 0<s\leqslant 1, \end{aligned}$$
(5.14)
$$\begin{aligned}&\tilde{S}_2:\left\{ \begin{array}{l} \dot{z}_1(s) = \delta _2(-2z_1(s)z_2(s)+\sin (z_2(\kappa (s)))),\\ \dot{z}_2(s) = \delta _2(z_1(s)z_2(s)+z_1(\kappa (s))z_2(\kappa (s))), \end{array}\right. \qquad 1<s\leqslant 2, \end{aligned}$$
(5.15)
$$\begin{aligned}&\tilde{S}_3:\left\{ \begin{array}{l} \dot{z}_1(s) = \delta _3(\nu (s)^2-2z_1(s)+3z_2(\kappa (s))),\\ \dot{z}_2(s) = \delta _3(-z_2(s)+z_1(\kappa (s))z_2(\kappa (s))), \end{array}\right. \qquad 2<s\leqslant 3. \end{aligned}$$
(5.16)

The initial conditions (5.12) become

$$\begin{aligned} z_1(s)=\nu (s)-1,\quad z_2(s)=\nu (s)^2+1,\quad s\leqslant 0. \end{aligned}$$
(5.17)

Furthermore, the cost function becomes

$$\begin{aligned} \tilde{J}_0(s,{\varvec{\delta }})=(z_1(3)-0.5)^2+(z_2(3)-0.25)^2. \end{aligned}$$
(5.18)

Thus, the transformed problem can be stated as follows: Subject to the new dynamic systems (5.14)–(5.16) and the new initial conditions (5.17), choose the mode durations \(\delta _{i},i=1,2,3\), to minimize \(\tilde{J}_0\).

In the new problem, we let the initial durations be \(\delta _1=0.1\), \(\delta _2=0.7\), \(\delta _3=0.2\), and the new time horizon is \(s\in [0,3]\). The optimal cost obtained is \(\tilde{J}^*_0=9.536~0\times 10^{-5}\). The remaining numerical results are given in Table 3, and Fig. 6 shows the optimal state trajectory.
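As with Example 1, the system at the initial guess \((\ell _1,\ell _2)=(0.1,0.8)\) (equivalent to the stated initial durations) can be simulated directly in the original time variable. The sketch below (our own forward-Euler illustration, with assumed step size and linear interpolation of the delayed state) evaluates the cost (5.13) at this guess:

```python
import numpy as np

def hist_or_interp(ts, x, tau):
    """Delayed state: history (5.12) for tau <= 0, else interpolation."""
    if tau <= 0.0:
        return np.array([tau - 1.0, tau**2 + 1.0])
    return np.array([np.interp(tau, ts, x[:, 0]), np.interp(tau, ts, x[:, 1])])

def simulate(l1=0.1, l2=0.8, T=1.0, n=2000):
    h = T / n
    ts = np.linspace(0.0, T, n + 1)
    x = np.zeros((n + 1, 2))
    x[0] = [-1.0, 1.0]                      # (t - 1, t^2 + 1) at t = 0
    for i in range(n):
        t = ts[i]
        x1, x2 = x[i]
        xd1, xd2 = hist_or_interp(ts, x, t - np.exp(-t))
        if t <= l1:                         # subsystem S1
            f1, f2 = 2*x1*x2 + xd2, 3*x1 + 4*xd2
        elif t <= l2:                       # subsystem S2
            f1, f2 = -2*x1*x2 + np.sin(xd2), x1*x2 + xd1*xd2
        else:                               # subsystem S3
            f1, f2 = t**2 - 2*x1 + 3*xd2, -x2 + xd1*xd2
        x[i + 1] = x[i] + h * np.array([f1, f2])
    J = (x[-1, 0] - 0.5)**2 + (x[-1, 1] - 0.25)**2
    return x, J

x, J = simulate()
print(J)  # cost at the initial guess, before any optimization
```

Embedding this simulation in the gradient-based optimization of Sect. 4 is what drives \(J_0\) down to the reported optimal value.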

Table 3 Numerical result of Example 2
Fig. 6
figure 6

Optimal state trajectory for Example 2

6 Conclusion

In this paper, we consider a class of optimal control problems with TVTDSS. We first transform the original problem into a parameter selection problem with finitely many decision variables via the control parameterization method. In order to obtain the optimal switching times, we adopt the time-scaling transformation technique to convert the optimization problem with variable switching times into an equivalent problem defined on a new time horizon with fixed switching times. We then derive the gradients of the objective and constraint functions with respect to the control heights and the subsystem durations. Finally, two numerical examples are provided to illustrate the effectiveness of the proposed approach.