Sliding—motion along the discontinuity threshold—is central to the most novel phenomena of nonsmooth dynamics. The theory was largely developed in Filippov’s work [51], but seems to originate in earlier Russian texts, perhaps first from G. N. Nikol’skii [109] (see also discussion in [4, 107, 148]). The standard definition would describe sliding as motion along an ideal threshold \({\mathcal D}\). We shall define it as follows.

Definition 6.1

A solution of an implementation (5.3) of a piecewise-smooth system (5.1) is said to slide if it evolves inside the layer \({\mathcal D}^\varepsilon \) for a time \(\varDelta t={{\mathcal O}{\left ({{1}}\right )}}\) (i.e., a time not vanishing as ε → 0).

This allows us to discuss sliding in implementations as well as in the ideal discontinuous system. Intuitively, sliding occurs because solutions tend towards some invariant set in the layer \({\mathcal D}^\varepsilon \) around the discontinuity threshold \({\mathcal D}\). More precisely we can state the following.

Lemma 6.1

Consider system (5.1) on an open region W, defined piecewise on regions \(\mathcal {R}_i={\left \{{\mathbf {x}\in \mathbb R^n\;:\;i=\operatorname {step}(\sigma (\mathbf {x})),\;\sigma \neq 0}\right \}}\) in terms of vector fields f i that are differentiable on σ ≥−ε (for i = 1) and σ ≤ +ε (for i = 0), and a scalar function σ differentiable for all x. Take an implementation of (5.1) on W according to Definition 5.1 . If the discontinuity threshold \({\mathcal D}={\left \{{\mathbf {x}\;:\;\sigma =0}\right \}}\) is either attracting or repelling with respect to the vector fields f i for σ → 0 on W, and the vector fields f i have non-vanishing components normal to \({\mathcal D}\) for σ → 0 on W, then there exists some E > 0 such that the layer \({\mathcal D}^\varepsilon \), where \(\sigma ={{\mathcal O}{\left ({{\varepsilon }}\right )}}\), is invariant inside W for 0 < ε < E.

Proof

The attractivity of \({\mathcal D}\) implies f 1(x) ⋅∇σ < 0 < f 0(x) ⋅∇σ, and repulsivity of \({\mathcal D}\) implies the opposite signs, given that the normal components f i ⋅∇σ are non-vanishing, evaluated at x as σ → 0. Since f 0(x) and f 1(x) are differentiable, at any point x ∈ W on σ = 0, there exists \(\tilde E(\mathbf {x})>0\) such that f 1(u) ⋅∇σ < 0 < f 0(u) ⋅∇σ for all u such that \(|\mathbf {u}\cdot \nabla \sigma |<\tilde E(\mathbf {x})\). Let \(\tilde E(\mathbf {x})\) be the largest such value at each x, and let E be the infimum of all \(\tilde E(\mathbf {x})\) for \(\mathbf {x}\in {\mathcal D}\cap W\). Then for any ε < E the vector fields satisfy f 1(x) ⋅∇σ < 0 for x such that σ < +ε, and f 0(x) ⋅∇σ > 0 for x such that σ > −ε, therefore the region |σ(x)| < ε is invariant.□

This suggests that if we can define a switching multiplier (some μ or ν), and a dynamics on it in the switching layer, then sliding constitutes an invariant of the dynamics confined to the layer, when the dynamics would otherwise carry trajectories through the layer. Even if we cannot define the layer dynamics in closed form, we can still consider the attracting or repelling objects it forms that constitute sliding.

Lemma 6.1 treats only the case when \({\mathcal D}\) is attracting or repelling with respect to the flows outside it, but attractivity/repulsivity is neither necessary nor sufficient for sliding to take place. The extension of Lemma 6.1 to a discontinuity threshold \({\mathcal D}\) formed by the intersection of manifolds \({\mathcal D}_1\cap \dots \cap {\mathcal D}_m\), for example, is quite straightforward, but this too has mainly been considered only under conditions of uniform attraction of \({\mathcal D}\) with respect to the surrounding flows, e.g., in [5, 38].

The situation in general is greatly more complicated. The discontinuity threshold \({\mathcal D}\) need not be attracting or repelling for sliding to occur (as we saw in the example of ‘sticky genes’ in Sect. 1.3, see also [81]). Moreover, an intersection of manifolds \({\mathcal D}_1\cap \dots \cap {\mathcal D}_m\) can be attracting without the individual thresholds \({\mathcal D}_j\) being attracting, for example, if the flow spirals around a point x 1 = x 2 = 0 by crossing through discontinuity thresholds x 1 = 0 and x 2 = 0 (see, e.g., [36, 51]). The permutations are enormous and no substantial accounting of the possibilities has been made. Perhaps the most ambitious steps in this direction are in [66], where the authors classify the behaviours a solution can exhibit once it enters an intersection in a planar system.

The only general statement that can be made is that if the set \(\mathcal {G}\) from (5.2) has an intersection with the tangent space \(\mathcal {T}_{\mathcal D}\) of the discontinuity threshold at a given point \(\mathbf {x}\in {\mathcal D}\),

$$\displaystyle \begin{aligned} \mathcal{G}(\mathbf{x})\cap\mathcal{T}_{\mathcal D}(\mathbf{x})\neq\emptyset\;, \end{aligned} $$
(6.1)

then sliding motion is possible in (5.1). Similarly if the set \(\mathcal {G}^\varepsilon \) of a given implementation from (5.3) has an intersection with the tangent space \(\mathcal {T}_{\mathcal D}\) of the discontinuity threshold at a given point \(\mathbf {x}\in {\mathcal D}\),

$$\displaystyle \begin{aligned} \mathcal{G}^\varepsilon(\mathbf{x})\cap\mathcal{T}_{\mathcal D}(\mathbf{x})\neq\emptyset\;, \end{aligned} $$
(6.2)

then sliding motion is possible in the implementation of (5.1). A solution arriving at \({\mathcal D}^\varepsilon \) evolves onto some attractor that lies in \({\mathcal D}^\varepsilon \), on which solutions x(t) evolve in a direction ε-close to the tangent space to \({\mathcal D}\). Thus we formalize the notion of sliding more precisely as follows.

Definition 6.2

A solution x(t) of (5.1) is said to slide along the discontinuity threshold \({\mathcal D}\) if \(\dot {\mathbf {x}} (t)\in \mathcal {G}(\mathbf {x})\cap \mathcal {T}_{\mathcal D}(\mathbf {x})\). A solution x(t) of an implementation of (5.1) is said to slide along the discontinuity threshold \({\mathcal D}\) (more strictly along the layer \({\mathcal D}^\varepsilon \) approximating \({\mathcal D}\)) if \(\dot {\mathbf {x}} (t)\in \mathcal {G}^\varepsilon (\mathbf {x})\cap \mathcal {T}_{\mathcal D}(\mathbf {x})\).

By saying \(\dot {\mathbf {x}}(t)\in \mathcal {G}^{\varepsilon }(\mathbf {x})\cap \mathcal {T}_{\mathcal D}(\mathbf {x})\), we mean that the tangent vector along a trajectory, \(\dot {\mathbf {x}}(t)\), which must lie in \(\mathcal {G}^{\varepsilon }(\mathbf {x})\), lies tangent to the discontinuity threshold \({\mathcal D}\), implying that x(t) evolves along \({\mathcal D}^{\varepsilon }\) as t changes.

To determine whether sliding will occur in either situation requires a look at the dynamics, to find out whether the vector fields in the sets (6.1) and (6.2) possess attractors or repellers in the layer that can be followed by any solutions x(t). We explore the different approaches to this over Sects. 6.1 to 6.3.

The experiments in Chap. 4 show sliding under the implementations from Definition 4.1. In each case sliding occurs in a layer \({\mathcal D}_j^{\varepsilon _j}\) that forms an order ε j-neighbourhood around an ideal discontinuity threshold \({\mathcal D}_j={\left \{{\mathbf {x}\;:\;\sigma _j(\mathbf {x})=0}\right \}}\). Regardless of implementation, the attractivity of the layer implies the existence of local attractors inside it. The effects of Lemma 6.1 are seen in Figs. 4.1 to 4.6, with sliding occurring along the threshold x = 0 in Fig. 4.1, the thresholds x 1 = 0 < x 2 and x 2 = 0 < x 1 in Figs. 4.2 and 4.3, and the threshold x 1 = 0 < x 2 in Figs. 4.4, 4.5, and 4.6. With smoothing, these attractors are normally hyperbolic manifolds or equilibria. With hysteresis, the attractors are cycles oscillating between the boundaries x 1 = ±ε 1 and x 2 = ±ε 2, therefore reducible to return maps on those surfaces. For time stepping or delay the attractors can be bounded within definite ε 1 and ε 2 neighbourhoods of x 1 = 0 and x 2 = 0, respectively, and are described by piecewise-smooth two-dimensional maps on the plane.

Before returning to look at these experiments again closely, we need to build up a more general picture of the dynamics inside the switching layer that result from these definitions of layers, implementations, and sliding.

6.1 Sliding Perspective I: The Piecewise-Smooth System

Given the description of a piecewise-smooth system in terms of switching multipliers, as given by (5.2) with (5.8), we can analyse dynamics at the discontinuity as follows.

In (1.15) we defined the multipliers \(\nu _j=\operatorname {step}(\sigma _j)\) only as taking values ν j ∈ [0, 1] at σ j = 0. We shall now ask how each ν j varies across the interval [0, 1] and derive dynamics on them induced by the vector fields in σ j ≠ 0. (Clearly outside the discontinuity surface the ν js are piecewise-constant and so obey \(\dot \nu _j=0\) for any j).

On a discontinuity threshold \({\mathcal D}_j\), we can treat each ν j as a ‘blow-up’ variable of the discontinuity set σ j = 0. This is done by letting σ j = ε jν j for some small ε j ≥ 0, so that the discontinuity occurs across an interval σ j ∈ [0, ε j] → 0 as ε j → 0. (This method is developed in [79, 81] but is essentially just a scaling of the quantity σ j that maps its values on σ j ∈ [0, ε] → 0 to values ν j ∈ [0, 1], and has no doubt been used earlier, e.g., in [148]. The term ‘blow-up’ itself appears to originate from singular perturbation literature [40]). We then use this to find the dynamics of ν j on the interval ν j ∈ [0, 1].

At a point where \({\mathcal D}\) is a codimension m manifold, let \(\mathbf {x}|{ }_{\mathcal D}\in \mathbb R^{n-m}\) denote the space of x restricted to \({\mathcal D}\). The multipliers ν j in (5.10) lie on intervals [0, 1], so the dynamics on \({\mathcal D}\) can be said to take place inside a switching layer

$$\displaystyle \begin{aligned} {\mathcal D}^\varepsilon={\left\{{\;(\nu_1,\dots,\nu_m,\;\mathbf{x}|{}_{\mathcal D})\;\in\;[0,1]^m\times\mathbb R^{n-m}\;}\right\}}\;. \end{aligned} $$
(6.3)

For economy of nomenclature we use the term ‘switching layer’ to describe both the parameterization of \({\mathcal D}\) given by (6.3) in the piecewise-smooth system, and the region around \({\mathcal D}\) given by (5.3) in the implementation, each with an associated small parameter ε. The concepts are closely related, and one may refer to the ‘switching layer of the piecewise-smooth system’ or the ‘switching layer of the implementation’ if necessary to avoid confusion.

At a point on a discontinuity threshold where σ 1 = ⋯ = σ m = 0 for some m ≥ 1, let us take local coordinates \(\mathbf {x}=(X, \underline {x})\), where X = (σ 1, …, σ m) and \( \underline {x}\in \mathbb R^{n-m}\), so \({\mathbf {x}}_{\mathcal D}=(0,\dots ,0, \underline {x})\). We then obtain on \({\mathcal D}\) a switching layer

$$\displaystyle \begin{aligned} {\mathcal D}^\varepsilon={\left\{{\;(\nu_1,\dots,\nu_m,\;\underline{x})\;\in\;[0,1]^m\times\mathbb R^{n-m}\;}\right\}}\;, \end{aligned} $$
(6.4)

with each multiplier ν j constituting the blow-up variable of the set σ j = 0, as some small parameter ε j → 0+. The switching layer is n dimensional, and differentiating σ j = ε jν j according to \(\dot {\mathbf {x}}=\mathbf {F}(\mathbf {x};\nu _1,\dots ,\nu _m)\) in these coordinates, given \(\dot \sigma _j=\mathbf {F}\cdot \nabla \sigma _j= F_j\), we have

$$\displaystyle \begin{aligned} \varepsilon_j\dot\nu_j=\dot \sigma_j &=F_j(\varepsilon_1\nu_1,\dots,\varepsilon_m\nu_m,\underline{x};\nu_1,\dots,\nu_m)\\ &=F_j(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)+{{\mathcal O}{\left({{{\varepsilon_1,\dots,\varepsilon_m}}}\right)}}\;. \end{aligned} $$
(6.5)

Neglecting the higher order term for ε j → 0, we obtain a well-defined layer system on X = (0, …, 0),

$$\displaystyle \begin{aligned} \varepsilon_j\dot\nu_j&=F_j(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)\;,\quad j=1,\dots,m,{} \end{aligned} $$
(6.6a)
$$\displaystyle \begin{aligned} {\underline{\dot x}} &=\underline{F}(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)\;,{} \end{aligned} $$
(6.6b)

up to terms of order ε 1, …, ε m, on the right-hand side. We recall that in this notation \(\mathbf {F}=(F, \underline {F})\) and F = (F 1, …, F m) = (F ⋅∇σ 1, …, F ⋅∇σ m).

In deriving this system we have fixed a very simple relationship between ν j and σ j, namely a linear (if singular as ε j → 0) mapping. We can do this without loss of generality because, through hidden terms, we are able to express any more complex functional relationship between some switching multiplier ν j and the switching function σ j using nonlinearity.

Say, for example, a vector field has a component \(F_j=1+\operatorname {step}(\sigma _j)\), representing perhaps the reaction force from an object stuck to a surface with σ j = 0, and say that F j is known to pass through zero twice as the function σ j changes sign. Clearly the function F j = 1 + ν j does not satisfy this, as ν j ∈ [0, 1] implies 1 + ν j ∈ [1, 2]. The function F j = 1 + ν j − rν j(1 − ν j), however, varies over \(F_j\in [\frac 14(6-r^{-1}-r),2]\) for r > 1, and if, say, F j is known to vanish at some ν j = k, then r = (1 + k)∕((1 − k)k) provides this.
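As a quick numerical check of this example, the following Python sketch evaluates the candidate function F j = 1 + ν j − rν j(1 − ν j) on ν j ∈ [0, 1]. The function names and the sample value k = 1∕2 are illustrative assumptions, not part of the text; the sketch only confirms the stated minimum and the choice of r that places a zero at ν j = k.

```python
import math

def hidden_F(nu, r):
    """Candidate component F = 1 + nu - r*nu*(1 - nu), with nu in [0, 1]."""
    return 1.0 + nu - r * nu * (1.0 - nu)

def r_for_root(k):
    """Value of r that makes F vanish at nu = k, for 0 < k < 1."""
    return (1.0 + k) / ((1.0 - k) * k)

def min_value(r):
    """Minimum of F over nu in [0, 1], valid for r > 1."""
    return 0.25 * (6.0 - 1.0 / r - r)

def roots(r):
    """Real roots of r*nu^2 + (1 - r)*nu + 1 = 0, i.e. the zeros of F."""
    disc = (r - 1.0) ** 2 - 4.0 * r
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return sorted([((r - 1.0) - s) / (2.0 * r), ((r - 1.0) + s) / (2.0 * r)])

if __name__ == "__main__":
    k = 0.5                      # illustrative choice of the known zero
    r = r_for_root(k)
    grid_min = min(hidden_F(i / 1000.0, r) for i in range(1001))
    print("r =", r)
    print("F(k) =", hidden_F(k, r))
    print("grid minimum =", grid_min, " formula =", min_value(r))
    print("zeros of F in nu:", roots(r))
```

For this choice of k both zeros lie inside [0, 1], so F j changes sign twice across the layer, whereas the endpoint values F j = 1 and F j = 2 are unchanged by the hidden term.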

With the dynamics at a discontinuity threshold \({\mathcal D}\) thus described by (6.6), sliding occurs if there exist fixed points of (6.6a). These are sets of points satisfying \(\dot \nu _j=0\), and they generate sliding manifolds

$$\displaystyle \begin{aligned} \mathcal{M}={\left\{{\begin{array}{l}(\underline{x},\nu_1,\dots,\nu_m)\in\mathbb R^{n-m}\times[0,1]^m\;\mbox{ such that}\\F_j(0,\dots,0, \underline{x};\nu_1,\dots,\nu_m)=0,\;\;j=1,\dots,m\end{array}}\right\}}\;, \end{aligned} $$
(6.7)

which are invariant wherever they are normally hyperbolic (see [81]). (Recall that \(\mathbf {x}|{ }_{\mathcal D}=(0,\dots ,0, \underline {x})\) denotes x restricted to \({\mathcal D}\), and F ⋅∇σ j = F j).

On \(\mathcal {M}\) the dynamics takes the form of a sliding mode, given by

$$\displaystyle \begin{aligned} 0&=F_j(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)\;,\quad j=1,\dots,m,{} \end{aligned} $$
(6.8a)
$$\displaystyle \begin{aligned} {\underline{\dot x}}&=\underline{F}(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)\;,{} \end{aligned} $$
(6.8b)

Because (6.7) consists of m equations F 1 = ⋯ = F m = 0, in m unknowns given by the switching multipliers ν 1, …, ν m, they typically define a well-defined set \(\mathcal {M}\) inside a given switching layer. This set may consist of branches of different stability (determined by considering the eigenvalues of the Jacobian matrix ∂(F 1, …, F m)∕∂(ν 1, …, ν m)), connected by non-hyperbolic points (where those eigenvalues have zero real part).
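To make the sliding-mode conditions (6.8) concrete, the following Python sketch treats the simplest case of a single switch (m = 1) under the assumption that F depends linearly on the multiplier, F = (1 − ν)f 0 + νf 1, with f 1 applying for σ > 0 and f 0 for σ < 0. The helper names and the sample vector fields are hypothetical; with nonlinear dependence on ν the equation F ⋅∇σ = 0 would instead have to be solved numerically and could have several branches.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def sliding_mode(f0, f1, grad_sigma):
    """Solve F.grad_sigma = 0 for F = (1 - nu)*f0 + nu*f1 (linear in nu).

    Returns (nu, F_slide, dF_dnu), or None when the solution falls outside
    [0, 1], in which case no sliding mode exists at this point.
    dF_dnu is the 1x1 Jacobian of the layer equation; a negative value
    means the sliding branch attracts in the fast (layer) dynamics.
    """
    n0 = dot(f0, grad_sigma)   # normal component of f0
    n1 = dot(f1, grad_sigma)   # normal component of f1
    if n0 == n1:
        raise ValueError("normal components coincide: no isolated solution")
    nu = n0 / (n0 - n1)
    if not 0.0 <= nu <= 1.0:
        return None
    F_slide = [(1.0 - nu) * a + nu * b for a, b in zip(f0, f1)]
    return nu, F_slide, n1 - n0

if __name__ == "__main__":
    # Illustrative constant fields with threshold sigma = x1.
    f0 = (2.0, 1.0)    # points towards sigma = 0 from sigma < 0
    f1 = (-1.0, 3.0)   # points towards sigma = 0 from sigma > 0
    print(sliding_mode(f0, f1, (1.0, 0.0)))
```

In this sample the multiplier settles between 0 and 1 and the Jacobian is negative, so the sliding manifold is an attracting branch of (6.7); reversing the signs of the normal components would give a repelling branch instead.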

In a system of many switches \(\nu _j=\operatorname {step}(\sigma _j)\), a solution may evolve between places where the discontinuity threshold \({\mathcal D}\) consists of an intersection of m different submanifolds \({\mathcal D}_j\), as defined in (5.11). On each different such region of \({\mathcal D}\) for different m, we first blow up each \({\mathcal D}_j\) into a switching layer, then derive the sliding modes, which occupy local sliding manifolds \(\mathcal {M}\) of dimension n − m.

If there is nonlinear dependence on the multipliers ν j, then there may exist multiple equilibria, periodic or complex attractors, undergoing bifurcations and any other nonlinear phenomena inside the switching layer.

We will apply these ideas later when we look at some examples of linear versus nonlinear dynamics on the discontinuity threshold.

6.2 Sliding Perspective II: Hybrid Implementations

We define an implementation as hybrid if it cannot be expressed by means of a set of ordinary differential equations alone, but instead is given by a hybrid of the system

$$\displaystyle \begin{aligned} \dot{\mathbf{x}}={{\mathbf{f}}^{i}}(\mathbf{x})\quad {\mathrm{if}}\quad \mathbf{x}\in\mathcal{R}_i^\varepsilon\;,\quad i\in Z_N\;, \end{aligned} $$
(6.9)

along with a map

$$\displaystyle \begin{aligned} \dot{\mathbf{x}}={{\mathbf{f}}^{i}}(\mathbf{x})\quad {\mathrm{with}}\quad i\mapsto\begin{cases}i&\text{if}\;\text{event}(\mathbf{x};\varepsilon)=\text{false}\;,\\ \varPsi(\mathbf{x};i)&\text{if}\;\text{event}(\mathbf{x};\varepsilon)=\text{true}\;,\end{cases} \end{aligned} $$
(6.10)

with Z N as before being some discrete set of N labels, and with the regions \(\mathcal {R}_i^\varepsilon \) obeying \(\mathcal {R}_i^\varepsilon \supset \mathcal {R}_i\), such that the switching layer \(\mathcal {D}^\varepsilon \) is formed by the overlap of two or more regions \(\mathcal {R}_i^\varepsilon \), though \(\mathcal {R}_i^0=\mathcal {R}_i\).

The system evolves as \(\dot {\mathbf {x}}={{\mathbf {f}}^{i}}(\mathbf {x})\) until a condition ‘event(x;ε)’ is satisfied, then i is updated to a new mode Ψ(x;i). The hysteretic, delay, stochastic, and time-stepping implementations in Definition 4.1 are all of this type. Typically in such situations there is an implementation layer \({\mathcal D}^\varepsilon \) on which the system may exist in more than one mode i, and its dynamics therefore depends not only on its state x but also on the current mode, which therefore appears in the update map Ψ(x;i). In other implementations the dynamics in \(\mathcal {D}^\varepsilon \) is instead governed by a transition rule that only depends on x, i.e., the state lies in a transition ‘mode’ and not in any mode i.

Let \(\varOmega \subset \mathcal {D}^\varepsilon \) be a set of points inside the switching layer on which ‘event(x;ε) = true’ is satisfied, and let x 1, x 2, … denote a set of points inside Ω visited by x(t) at times t 1, t 2, …. Integrating between these provides a map

$$\displaystyle \begin{aligned} {\mathbf{x}}_{n}=\varPhi({\mathbf{x}}_{n-1};\varepsilon)\;\quad {\mathrm{where}}\quad \varPhi:\varOmega\mapsto\varOmega\;. \end{aligned} $$
(6.11)

We have been deliberately vague in defining the set Ω and map Φ, as the form of both depends on the implementation. In Chap. 7 we will see how these definitions apply to the ‘experiments’ from Chap. 4. Still, with these definitions we can derive some useful results. In particular, without knowing the map (6.11) explicitly, we can derive some implications for the dynamics in the switching layer.

For sliding to occur, the map Φ must have an invariant set on Ω, but that set need not be unique, and by implication the sliding dynamics need not be unique.

According to the map (6.11), x(t) evolves in increments along each field f i, e.g.,

$$\displaystyle \begin{aligned} {\mathbf{x}}_{n}&={\mathbf{x}}_{n-1}+{{\mathbf{f}}^{i}}({\mathbf{x}}_{n-1})\varDelta t+{{\mathcal O}{\left({{{\varDelta t^2}}}\right)}}\\ {\mathrm{if}}\;&\;\quad {\mathbf{x}}_{n-1}\in\varOmega_i({\mathbf{x}}_{n-1},{\mathbf{x}}_0,t_{n-1})\;. \end{aligned} $$
(6.12)

The mode i selected at each time increment can depend not only on the current state x n−1, but is also typically history dependent, depending on the initial state x 0 and the current time t n−1. The regions Ω i ⊂ Ω may therefore overlap. Because this selection has discontinuities wherever the mode i changes, this raises the possibility that attractors of the map can bifurcate in a far more arbitrary manner than in continuous or differentiable maps, able to make abrupt jumps in topology and periodicity (as in Fig. 2.2(ii), for example).

We can also derive the effect of an attractor on the system’s dynamics. Let x(t) evolve along an ε-infinitesimal neighbourhood of the discontinuity thresholds for a time interval [0, T], switching between modes \(i\in Z_N={\left \{{1,2,\dots ,N}\right \}}\), at a sequence of times t 1, t 2, …, t r, where 0 = t 0 < t 1 < t 2 < ⋯ < t r = T. Thus x(t) evolves along a different vector field f i in mode i ∈ Z N on each time interval [t j−1, t j] for some i ∈{1, …, N} and j ∈{1, …, r}. Let μ i denote the total proportion of the time T spent evolving along f i,

$$\displaystyle \begin{aligned} \mu_i=\frac 1T\sum_{n=1}^{r}\left\{\begin{array}{ll}t_n-t_{n-1}&{\mathrm{if}}\;\;{\mathbf{x}}_n\in \mathcal{R}_i\;,\\0&{\mathrm{if}}\;\;{\mathbf{x}}_n\notin \mathcal{R}_i\;.\end{array}\right. \end{aligned} $$
(6.13)

Let γ i(x n) = 1 if x n is currently in mode i (evolving along f i) and γ i(x n) = 0 otherwise. Then the total change in x(t) over the time increment T = \(\displaystyle \sum _{n=1}^{r}\)(t n − t n−1) is

$$\displaystyle \begin{aligned} \qquad \quad \frac{\varDelta\mathbf{x}}{T} &= \frac 1T\sum_{n=1}^{r}\varDelta{\mathbf{x}}_{n-1} \\ &= \frac 1T\sum_{n=1}^{r}\sum_{i=1}^{N}\gamma_i({\mathbf{x}}_{n-1}){{\mathbf{f}}^{i}}({\mathbf{x}}_{n-1})(t_n-t_{n-1})+{{\mathcal O}{\left({{{\varDelta t^2}}}\right)}}\\ &= \sum_{i=1}^{N} \mu_i {{\mathbf{f}}^{i}}(\mathbf{x}) \;+\;{{\mathcal O}{\left({{{T}}}\right)}}\;, \end{aligned} $$
(6.14)

where μ i ≥ 0 and \(\sum _{i=1}^N \mu _i = 1 \). In the limit T → 0 this gives an effective equation of motion,

$$\displaystyle \begin{aligned} \dot{\mathbf{x}}={{\mathbf{F}}^{{\mathrm{co}}}} (\mathbf{x};\mu_1,\dots,\mu_N) := \sum_{i=1}^{N} \mu_i {{\mathbf{f}}^{i}}(\mathbf{x})\;. \end{aligned} $$
(6.15)

Comparing to (5.5a), we see that as the coefficients μ i vary over [0, 1] the effective vector field F co traces out the convex hull \(\mathcal {F}(\mathbf {x})\) of the f i’s.

Using these effective equations of motion we can understand a basic separation of scales that distinguishes motion across and along the discontinuity threshold. Assume that x evolves along an ε-neighbourhood of the intersection of discontinuity thresholds σ 1 = ⋯ = σ m = 0. Define scaled coordinates \(\mathbf {x}=(u,v,\dots ,w, \underline {x})\), where u = σ 1∕ε, v = σ 2∕ε, …, w = σ m∕ε, and \( \underline {x}\in \mathbb R^{n-m}\). Let \({{\mathbf {F}}^{{\mathrm{co}}}}=(F^{\mathrm{co}}_u,\dots ,F^{\mathrm{co}}_w, \underline {F}^{\mathrm{co}})\), then \(\dot {\mathbf {x}}={\mathbf {F}}^{\mathrm{co}}\) becomes

$$\displaystyle \begin{aligned} \varepsilon\dot u&=F^{\mathrm{co}}_u(\varepsilon u,\dots,\varepsilon w,\underline{x};\mu_1,\dots,\mu_m)\\ &=F^{\mathrm{co}}_u(0,\dots,0,\underline{x};\mu_1,\dots,\mu_m)+{{\mathcal O}{\left({{\varepsilon}}\right)}}\;,\\ \vdots&\qquad \vdots\\ \varepsilon\dot w&=F^{\mathrm{co}}_w(\varepsilon u,\dots,\varepsilon w,\underline{x};\mu_1,\dots,\mu_m)\\ &=F^{\mathrm{co}}_w(0,\dots,0,\underline{x};\mu_1,\dots,\mu_m)+{{\mathcal O}{\left({{\varepsilon}}\right)}}\;,{} \end{aligned} $$
(6.16a)
$$\displaystyle \begin{aligned} {\underline{\dot x}}&=\underline{F}^{\mathrm{co}}(\varepsilon u,\dots,\varepsilon w,\underline{x};\mu_1,\dots,\mu_m)\\ &=\underline{F}^{\mathrm{co}}(0,\dots,0,\underline{x};\mu_1,\dots,\mu_m)+{{\mathcal O}{\left({{\varepsilon}}\right)}}\;.{} \end{aligned} $$
(6.16b)

(Here part (a) labels the fast equations \(\varepsilon \dot {{[\;\;]}}=\!...\) and (b) the slow equation \({ \underline {\dot x}}=\!...\)). The (u, …, w) coordinates therefore evolve on a fast timescale τ = t∕ε. Denoting the derivative with respect to τ by a prime, the system instead becomes

$$\displaystyle \begin{aligned} u^{\prime}&=F^{\mathrm{co}}_u(0,\dots,0,\underline{x};\mu_1,\dots,\mu_m)+{{\mathcal O}{\left({{\varepsilon}}\right)}}\;,\\ \vdots&\qquad \vdots\\ w^{\prime}&=F^{\mathrm{co}}_w(0,\dots,0,\underline{x};\mu_1,\dots,\mu_m)+{{\mathcal O}{\left({{\varepsilon}}\right)}}\;, \end{aligned} $$
(6.17a)
$$\displaystyle \begin{aligned} \underline{x}^{\prime}&={{\mathcal O}{\left({{\varepsilon}}\right)}}\;. \end{aligned} $$
(6.17b)

When we simulate the system on the τ-timescale, the variables (u, …, w) evolve across the \({{\mathcal O}{\left ({{\varepsilon }}\right )}}\) space of the switching layer, while the variables x remain quasi-static.

To identify the coefficients μ i in (6.14), we therefore simulate (6.17) for a time interval T, keeping \( \underline {x}\) fixed. If the time interval T can be taken long enough that the coefficients μ i as calculated by (6.13) reach a steady state, their values define an effective equation of motion (6.15) at any given x. If one of the μ i takes a value of unity, with all μ j = 0 for j ≠ i, then the system is determined to have crossed the discontinuity threshold. If one or more μ i settle to steady values between 0 and 1, then the simulation of the fast (u, …, w) subsystem of (6.17) must have reached a steady state or other attractor inside the switching layer, and is said to be sliding along the discontinuity.
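The procedure just described can be sketched in Python for a hysteretic implementation of a single switch. Everything specific here is an assumption made for illustration: the two constant vector fields, the threshold σ = x 1, the hysteresis rule (switch to mode 1 when σ ≥ +ε and back to mode 0 when σ ≤ −ε), and the forward Euler stepping. Because the sample fields are constant, it makes no difference whether the slow variable is frozen, so the sketch simply integrates the whole system and records the proportion of time spent in each mode, as in (6.13).

```python
def estimate_mode_fractions(f0, f1, eps=1e-2, dt=1e-5, T=1.0):
    """Estimate the time fractions mu_0, mu_1 under hysteretic switching.

    f0, f1 : constant planar fields (sigma-component, slow component).
    Mode 0 follows f0 until sigma >= +eps, mode 1 follows f1 until
    sigma <= -eps. Returns ([mu_0, mu_1], average slow velocity).
    """
    sigma, y = 0.0, 0.0
    mode = 0
    time_in_mode = [0.0, 0.0]
    for _ in range(int(round(T / dt))):
        f = f1 if mode == 1 else f0
        sigma += f[0] * dt          # fast motion across the layer
        y += f[1] * dt              # slow motion along the threshold
        time_in_mode[mode] += dt
        # Hysteresis rule: the mode changes only at the layer boundaries.
        if mode == 0 and sigma >= eps:
            mode = 1
        elif mode == 1 and sigma <= -eps:
            mode = 0
    total = time_in_mode[0] + time_in_mode[1]
    mu = [t / total for t in time_in_mode]
    return mu, y / total

if __name__ == "__main__":
    f0 = (2.0, 1.0)    # illustrative field for sigma < 0
    f1 = (-1.0, 3.0)   # illustrative field for sigma > 0
    mu, slow_velocity = estimate_mode_fractions(f0, f1)
    print("mu_0, mu_1 =", mu)
    print("effective slow velocity =", slow_velocity)
```

For these sample fields both coefficients settle strictly between 0 and 1, so the solution is sliding, and the measured average velocity along the threshold agrees with the convex combination (6.15) evaluated at the measured μ i.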

The maps (6.11) and their invariants may not in general have any closed form expression that can be determined from the system (5.1), but must be discovered by simulation and approximated.

Although (6.16) is similar formally to (6.6), the latter describes a continuous flow on σ 1 = ⋯ = σ m = 0, while the former describes a hybrid implementation that jumps between the \(2^m\) different modes specified by (6.16) when each μ j takes a value 0 or 1 in the neighbourhood of σ 1 = ⋯ = σ m = 0.

6.3 Sliding Perspective III: Smoothed Implementations

If switching is implemented by a smooth process, then we can proceed by steps that are actually very similar to Sect. 6.1. This has the advantage that the analysis then follows standard methods of singular perturbation theory, but this familiarity disguises unobvious ambiguities that accompany smoothing. Though we can describe these to some extent here, we are still learning what kind of dynamics, and more specifically what kind of singularities, persists under smoothing.

We begin with the system \(\dot {\mathbf {x}}=\mathbf {F}(\mathbf {x};\nu _1,\dots ,\nu _m)\) in terms of switching multipliers \(\nu _j=\operatorname {step}(\sigma _j)\). To smooth the discontinuity we simply make the replacement \(\nu _j\mapsto \phi ^{\varepsilon _j}(\sigma _j)\), where \(\phi ^{\varepsilon _j}(\sigma _j)\) is a smooth monotonic function, satisfying \(\phi ^{\varepsilon _j}(\sigma _j)=\operatorname {step}(\sigma _j)+{{\mathcal O}{\left ({{{\varepsilon _j}}}\right )}}\) for |σ j|≥ ε j.

Observe that for |σ j|≥ ε j this definition gives \(\phi ^{\varepsilon _j}(\sigma _j)=\phi ^1(\sigma _j/\varepsilon _j)\), so let us assume this also holds for |σ j| < ε j. The dynamics of the quantity ν j is found simply by differentiating and applying the chain rule, \(\dot \nu _j=\varepsilon _j^{-1}(\partial \phi ^{\varepsilon _j}\!/\partial \sigma _j)\dot \sigma _j=\hat \varepsilon _j^{-1}\dot {\mathbf {x}} \cdot \nabla \sigma _j=\hat \varepsilon _j^{-1}\mathbf {F}\cdot \nabla \sigma _j=\hat \varepsilon _j^{-1}F_j\), where \(\hat \varepsilon _j=\varepsilon _j(\partial \phi ^{\varepsilon _j}\!/\partial \sigma _j)^{-1}\).

The result is formally the same as (6.5)–(6.6), except that the definition of ν j now differs and, more crucially, the small quantity \(\hat \varepsilon _j\) is now a function of σ j. We have

$$\displaystyle \begin{aligned} \hat\varepsilon_j(\sigma_j)\dot\nu_j&=F_j(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)+{{\mathcal O}{\left({{{\varepsilon_1,\dots,\varepsilon_m}}}\right)}}\;, \end{aligned} $$
(6.18a)
$$\displaystyle \begin{aligned} {\underline{\dot x}}&=\underline{F}(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)+{{\mathcal O}{\left({{{\varepsilon_1,\dots,\varepsilon_m}}}\right)}}\;. \end{aligned} $$
(6.18b)

By the definition of \(\phi ^{\varepsilon _j}\) the quantity \(\hat \varepsilon _j\) is non-zero on the layer, where |σ j| < ε j. We can refine this if we limit the derivative of \(\phi ^{\varepsilon _j}\) away from zero, for example, choose \(\phi ^{\varepsilon _j}\) such that \(\partial \phi ^{\varepsilon _j}/\partial \sigma _j>K\) for \(|\sigma _j|<1-\varepsilon _j^p\), for fixed K > 0 and p ≥ 1 such that ε j∕K → 0 as ε j → 0. Then \(\hat \varepsilon _j\) behaves as a small quantity \(\hat \varepsilon _j/K={{\mathcal O}{\left ({{{\varepsilon _j}}}\right )}}\) for \(\sigma _j\in [-1+\varepsilon _j^p,+1-\varepsilon _j^p]\).

We see that this is analogous to the system (6.6) obtained by piecewise-smooth methods, and in fact they can be shown to be equivalent in the limit ε 1, ε 2 → 0, see [81].

Equations of the form (6.18) can be found in singular perturbation studies of climate and gene regulation, e.g., [96, 103]. An equivalent form in common use when using Sotomayor–Teixeira regularization [136] is to define a variable u j = σ j∕ε j, to obtain instead

$$\displaystyle \begin{aligned} \varepsilon_j\dot u_j&=F_j(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)+{{\mathcal O}{\left({{{\varepsilon_1,\dots,\varepsilon_m}}}\right)}}\;, \end{aligned} $$
(6.19a)
$$\displaystyle \begin{aligned} {\underline{\dot x}}&=\underline{F}(0,\dots,0,\underline{x};\nu_1,\dots,\nu_m)+{{\mathcal O}{\left({{{\varepsilon_1,\dots,\varepsilon_m}}}\right)}}\;. \end{aligned} $$
(6.19b)

In either case (6.18) or (6.19), analysis proceeds using standard concepts from geometric singular perturbation theory, see, e.g., [49, 86]. If we assume all of the ε js are of the same order, that is every ratio ε i∕ε j is non-vanishing as ε i, ε j → 0 for i, j = 1, …, m, then the analysis is closely analogous to that of the piecewise-smooth system in Sect. 6.1. The switching layer of the implementation is given as in (6.4) treating the ν j as variables, or the same expression with ν j replaced by u j in the alternative variables. The slow-fast system has a critical manifold, corresponding precisely to the sliding manifold \(\mathcal {M}\) in (6.7), where the right-hand side of the fast ν j subsystem (6.6a) vanishes. According to Fenichel’s theory [49], wherever \(\mathcal {M}\) is normally hyperbolic with respect to the fast ν j subsystem, for ε j > 0, there exists an invariant manifold \(\mathcal {M}^{\varepsilon _j}\) in an ε j-neighbourhood of \(\mathcal {M}\). The dynamics on \(\mathcal {M}\) is precisely the sliding dynamics (6.8), and moreover the dynamics on \(\mathcal {M}^{\varepsilon _j}\) is topologically equivalent to (6.8).
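The attraction onto the critical manifold can be illustrated with a small Python sketch of a smoothed single switch. The choices below are assumptions for illustration only: the constant fields f 0 and f 1, a vector field depending linearly on ν, the sigmoid ϕ(s) = 1∕2 + s∕(2√(1 + s²)) of the form (6.22a), and forward Euler integration with a step small compared with ε.

```python
import math

def phi(s):
    """Smooth monotonic sigmoid of the scaled variable s = sigma/eps."""
    return 0.5 + s / (2.0 * math.sqrt(1.0 + s * s))

def simulate_smoothed(eps=1e-2, dt=1e-4, T=1.0, sigma0=0.5):
    """Integrate x' = (1 - nu)*f0 + nu*f1 with nu = phi(sigma/eps).

    Returns the final (sigma, y, nu). The fields are illustrative
    constants whose first component is the normal (sigma) direction.
    """
    f0 = (2.0, 1.0)     # limit of the field for sigma << -eps
    f1 = (-1.0, 3.0)    # limit of the field for sigma >> +eps
    sigma, y = sigma0, 0.0
    for _ in range(int(round(T / dt))):
        nu = phi(sigma / eps)
        sigma += ((1.0 - nu) * f0[0] + nu * f1[0]) * dt   # fast variable
        y += ((1.0 - nu) * f0[1] + nu * f1[1]) * dt       # slow variable
    return sigma, y, phi(sigma / eps)

if __name__ == "__main__":
    sigma, y, nu = simulate_smoothed()
    print("final sigma =", sigma)
    print("final nu    =", nu)
```

Starting outside the layer, the solution first follows f 1 towards σ = 0 and then collapses onto the value of ν at which the normal component of the field vanishes, remaining inside |σ| < ε thereafter. For these sample fields that value coincides with the multiplier of the sliding mode (6.8) of the corresponding piecewise-smooth system.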

If all of the ε js are of different orders, then the dynamics in the switching layer will be more intricate, involving a separation onto more timescales, but still falls under standard methods of singular perturbation theory. The case \(\varepsilon _j=\varepsilon _1^j\) for j = 1, …, m, for instance, falls under Fenichel’s analysis in [49]. The author is not aware of any studies to date applying such many timescale dynamics to piecewise-smooth problems.

The Sotomayor–Teixeira approach of replacing the switching multipliers ν j by smooth (but non-analytic) functions \(\phi ^{\varepsilon _j}(\sigma _j)\) can be weakened so that the functions \(\phi ^{\varepsilon _j}(\sigma _j)\) are analytic. It is then impossible for these functions to be constant outside the switching layer, so they must instead be defined as \(\phi ^{\varepsilon _j}(\sigma _j)=\operatorname {step}(\sigma _j)+E\) where E is small, for example, \(E={{\mathcal O}{\left ({{{\varepsilon _j}}}\right )}}\) or \({{\mathcal O}{\left ({{{\varepsilon _j/\sigma _j}}}\right )}}\) or \({{\mathcal O}{\left ({{{\operatorname {e}^{-\sigma _j/\varepsilon _j}}}}\right )}}\). This is often the case in applications. One example is in [96], where \(\phi ^{\varepsilon _j}(\sigma _j)=\frac 12+\frac 1\pi \arctan (\sigma _j/\varepsilon _j)=\operatorname {step}(\sigma _j)+{{\mathcal O}{\left ({{{\varepsilon _j/\sigma _j}}}\right )}}\). Another example is in [103, 117], where \(\phi ^{\varepsilon _j}(\sigma _j)=Z(\sigma _j+1)=\operatorname {step}(\sigma _j)+{{\mathcal O}}\big (\sigma _j^{1/\varepsilon _j}\big )\) in terms of the Hill function \(Z(w)=1/(1+w^{-1/\varepsilon _j})\) [70]. Both of these functions are analytic and are asymptotic to \(\operatorname {step}(\sigma _j)\) for large argument.
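The size of the error term E for these two analytic sigmoids can be inspected with a short Python sketch. The arctangent sigmoid is normalised here with the factor 1∕π, as in (6.22c), so that it ranges over (0, 1); the function names and sample values of σ j and ε j are illustrative assumptions, and the Hill form is only evaluated for σ j > −1, where its argument is positive.

```python
import math

def step(sigma):
    """Ideal switch: 1 for sigma > 0, 0 for sigma < 0."""
    return 1.0 if sigma > 0 else 0.0

def phi_arctan(sigma, eps):
    """Arctangent sigmoid, normalised to take values in (0, 1)."""
    return 0.5 + math.atan(sigma / eps) / math.pi

def hill(w, eps):
    """Hill function Z(w) = 1/(1 + w**(-1/eps)), defined here for w > 0."""
    return 1.0 / (1.0 + w ** (-1.0 / eps))

def phi_hill(sigma, eps):
    """Hill sigmoid Z(sigma + 1); requires sigma > -1.

    Arguments w very close to 0 with very small eps can overflow floats.
    """
    return hill(sigma + 1.0, eps)

if __name__ == "__main__":
    eps = 0.05
    for sigma in (-0.5, 0.5):
        e_atan = phi_arctan(sigma, eps) - step(sigma)
        e_hill = phi_hill(sigma, eps) - step(sigma)
        print(f"sigma={sigma:+.1f}  E_arctan={e_atan:+.3e}  E_hill={e_hill:+.3e}")
```

At a fixed distance from the threshold the arctangent error decays only algebraically, like ε j∕σ j, whereas the Hill error decays much faster as ε j → 0. This difference in how quickly each sigmoid approaches the step is the kind of distinction that the hidden terms discussed in the text can represent.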

Good examples of these methods applied to genetic models like that of Sect. 4.2 can be found in [41, 76, 102, 103, 117]. They tell a story similar to that obtained by piecewise-smooth analysis, but an appreciation of possible hidden terms would bring more insight into the robustness of these studies. Hill functions as a class are sometimes used without rigorous justification from the biology, and in such cases the possible differences between alternative sigmoid functions can be calculated as hidden terms, including representing the difference between Hill functions of different stiffnesses (different powers 1∕ε j).

Hidden terms survive when we smooth a discontinuity, and have the interpretation that they vanish (asymptotically at least) outside the discontinuity threshold. If we smooth by replacing \(\nu _j\mapsto \phi ^{\varepsilon _j}(\sigma _j)=\operatorname {step}(\sigma _j)+E(\varepsilon _j)\), then the term ν(ν − 1) which from (5.16) typically characterizes hidden terms, simplifies to

$$\displaystyle \begin{aligned}\nu(\nu-1)\;\;\mapsto\;\; \operatorname{sign}(\sigma_j)E(\varepsilon_j)+E^2(\varepsilon_j)\;.\end{aligned}$$

Hence the hidden term is of order \(E(\varepsilon_j)\), vanishing (asymptotically) outside the discontinuity threshold with E.
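The localization of the factor ν(ν − 1) to the switching layer can be checked numerically. The following is a minimal sketch, assuming the tanh sigmoid as the smoothing \(\phi^{\varepsilon_j}\); the function names `phi` and `hidden_factor` are illustrative and not part of the text.

```python
import math

def phi(sigma, eps):
    """Assumed smoothing of the switching multiplier: a tanh sigmoid."""
    return 0.5 + 0.5 * math.tanh(sigma / eps)

def hidden_factor(sigma, eps):
    """The factor nu*(nu - 1) evaluated with nu replaced by phi(sigma, eps)."""
    nu = phi(sigma, eps)
    return nu * (nu - 1.0)

eps = 0.01
# Inside the layer (|sigma| comparable to eps) the factor is of order one;
# outside the layer it decays with the error E of the smoothing.
for sigma in (0.0, 0.5 * eps, 2.0 * eps, 10.0 * eps):
    print(sigma, hidden_factor(sigma, eps))
```

At σ = 0 the factor takes the value −1/4, while a few layer widths away it is negligible, which is the sense in which hidden terms vanish asymptotically outside the threshold.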

Hidden terms therefore allow us to distinguish between different kinds or rates of switching according to different methods of smoothing. For instance, consider the one-dimensional system

$$\displaystyle \begin{aligned} \dot x=F(x;\nu_{(r)})=a(x)+\nu b(x)\;, \end{aligned} $$
(6.20)

defined in terms of a switching multiplier

$$\displaystyle \begin{aligned} \nu\;=\;\nu_{(r)}:=\displaystyle\lim_{\varepsilon\rightarrow0}\phi^\varepsilon_{(r)}(x)\;, \end{aligned} $$
(6.21)

for different smooth functions \(\phi ^\varepsilon _{(1)}(x),\;\phi ^\varepsilon _{(2)}(x),\;\dots \), such that \(\nu _{(r)}\rightarrow \operatorname {step}(x)\) for any r. Does it matter how we choose the function \(\phi ^\varepsilon _{(r)}\), or do we always obtain the same piecewise-smooth system (6.20) in the limit ε → 0?

As examples consider the following sigmoid quantities

$$\displaystyle \begin{aligned} \phi^\varepsilon_{(0)}(x)&=\frac 12+\frac{x/\varepsilon}{2\sqrt{1+(x/\varepsilon)^2}}{} \end{aligned} $$
(6.22a)
$$\displaystyle \begin{aligned} \phi^\varepsilon_{(1)}(x)&=\frac 12+\frac{x/\varepsilon}{2\sqrt{1+(x/\varepsilon)^2}}+\frac{ A(x)}{2{\left({{1+(x/\varepsilon)^2}}\right)}^{k}}\;,{} \end{aligned} $$
(6.22b)
$$\displaystyle \begin{aligned} \phi^\varepsilon_{(2)}(x)&=\frac 12+\frac 1\pi\arctan(x/\varepsilon)\;,{} \end{aligned} $$
(6.22c)
$$\displaystyle \begin{aligned} \phi^\varepsilon_{(3)}(x)&=\frac 12+\frac 12\tanh(x/\varepsilon)\;,{} \end{aligned} $$
(6.22d)

where k > 0 and A(x) is a smooth function of x.
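The four sigmoids in (6.22) can be compared directly. The sketch below is an illustration only: the choices A(x) = 1 and k = 1 for (6.22b), the sample points, and the helper names `step`, `SIGMOIDS` and `max_gap` are assumptions made here, not taken from the text.

```python
import math

def phi0(x, eps):
    """(6.22a): algebraic sigmoid."""
    u = x / eps
    return 0.5 + u / (2.0 * math.sqrt(1.0 + u * u))

def phi1(x, eps, A=lambda x: 1.0, k=1.0):
    """(6.22b): phi0 plus a bump; A and k are illustrative choices."""
    u = x / eps
    return phi0(x, eps) + A(x) / (2.0 * (1.0 + u * u) ** k)

def phi2(x, eps):
    """(6.22c): arctan sigmoid."""
    return 0.5 + math.atan(x / eps) / math.pi

def phi3(x, eps):
    """(6.22d): tanh sigmoid."""
    return 0.5 + 0.5 * math.tanh(x / eps)

def step(x):
    """Ideal switch for x != 0."""
    return 1.0 if x > 0 else 0.0

SIGMOIDS = (phi0, phi1, phi2, phi3)

def max_gap(x, eps):
    """Largest distance of any of the four sigmoids from step(x)."""
    return max(abs(phi(x, eps) - step(x)) for phi in SIGMOIDS)

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, max_gap(0.5, eps), max_gap(-0.5, eps))

# At x = 0 the sigmoids need not agree, however small eps is.
print([phi(0.0, 1e-3) for phi in SIGMOIDS])
```

Away from x = 0 all four functions approach step(x) as ε shrinks, whereas at x = 0 the value of (6.22b) depends on A, so the functions are distinguished only inside the layer.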

We will show the following.

Lemma 6.2

We can write each system \(\dot x=F(x;\nu _{(r)})\) from (6.20) as

$$\displaystyle \begin{aligned} \dot x=F(x;\nu_{(r)})=F(x;\operatorname{step}(x))+H_{(r)}^\varepsilon(x)\;, \end{aligned} $$
(6.23)

an asymptotic expansion whose tail satisfies \(H_{(r)}^\varepsilon (x)\rightarrow 0\) as ε → 0 for x ≠ 0. This can be re-written in an ε-independent form as

$$\displaystyle \begin{aligned} \dot x=F(x;\nu_{(r)})=F(x;\nu_{(0)})+H_{(r)}(x;\nu_{(0)})\;, \end{aligned} $$
(6.24)

with hidden terms satisfying \(H_{(r)}(x;0)=H_{(r)}(x;1)=0\), but with \(H_{(r)}(x;\nu)\) being non-vanishing in the layer |x| < ε for r = 1, 2, 3.

Proof

The proof follows directly by asymptotic expansion and straightforward calculations, so for brevity we place some of the details in Appendix C. The expansions of the sigmoid functions \(\phi ^\varepsilon _{(r)}\) for large argument all take the form \(\phi ^\varepsilon _{(r)}(x)=\operatorname {step}(x)+{{\mathcal O}{\left ({{{\varepsilon /x}}}\right )}}\) (the precise expressions are given in Appendix C, and in fact for r = 3 the error is \({{\mathcal O}{\left ({{{\operatorname {e}^{-2|x|/\varepsilon }}}}\right )}}\)). Substituting these into (6.20) gives (6.23) with the tail of the expansion given by \(H_{(r)}^\varepsilon (x)={{\mathcal O}{\left ({{{\varepsilon /x}}}\right )}}\).

This means that the systems (6.20) with (6.21) and (6.22) are equivalent for x ≠ 0. For x = 0, however, these asymptotic series diverge. To compare the different systems at x = 0, the second part of the lemma instead seeks a form of the expansions that is independent of ε, by expressing them all in terms of \(\nu_{(0)}\).

To do this we first rearrange (6.22a) to find that

$$\displaystyle \begin{aligned}x/\varepsilon=(\phi^\varepsilon_{(0)}-\frac 12)/\sqrt{\phi^\varepsilon_{(0)}(1-\phi^\varepsilon_{(0)})}\;,\end{aligned}$$

followed by substituting this into the expressions in (6.22) and then into (6.20). Obtaining (6.24) for r = 1 is then just a matter of algebra. To obtain (6.24) for r = 2, 3, it is better to substitute into the asymptotic expansions for each \(\phi _{(r)}^\varepsilon \), thus expressing them in terms of \(\phi ^\varepsilon _{(0)}\). We then replace \(\phi ^\varepsilon _{(0)}\) with \(\nu_{(0)}\). The algebra is set out in Appendix C, giving (6.24) for r = 1, 2, 3, with

$$\displaystyle \begin{aligned} H_{(1)}(x;\nu)&=(4h)^k A(x)b(x)\;,{} \end{aligned} $$
(6.25a)
$$\displaystyle \begin{aligned} H_{(2)}(x;\nu)&={\left\{{h\;C_1(2\nu-1)+\sqrt{h}C_2(2\nu-1)}\right\}}b(x)\;,{} \end{aligned} $$
(6.25b)
$$\displaystyle \begin{aligned} H_{(3)}(x;\nu)&={\left\{{h\;C_1(2\nu-1)+\operatorname{e}^{-|2\nu-1|/\sqrt{h}}C_3(2\nu-1)}\right\}}b(x)\;,{} \end{aligned} $$
(6.25c)
$$\displaystyle \begin{aligned} \mbox{where}\quad h&=\nu(1-\nu)\;, \end{aligned} $$

in terms of functions \(C_i(2\nu-1)\) that are finite valued for all ν ∈ [0, 1], given in Appendix C. Clearly \(H_{(r)}(x;0)=H_{(r)}(x;1)=0\) in each case. These expressions are now independent of ε and therefore remain well-defined in terms of \(\nu_{(0)}\) as ε → 0.
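The rearrangement of (6.22a) used in the proof, and the identity \(1+(x/\varepsilon)^2=1/(4h)\) that it implies for \(h=\nu(1-\nu)\) with \(\nu=\phi^\varepsilon_{(0)}\), can be checked numerically. This is a minimal sketch; the helper names `u_from_phi0` and `h` and the sample values of u = x/ε are illustrative assumptions.

```python
import math

def phi0(u):
    """(6.22a) written in terms of the layer variable u = x / eps."""
    return 0.5 + u / (2.0 * math.sqrt(1.0 + u * u))

def u_from_phi0(p):
    """Inverse of phi0: recovers x / eps from a value p in (0, 1)."""
    return (p - 0.5) / math.sqrt(p * (1.0 - p))

def h(nu):
    """The factor nu * (1 - nu) appearing throughout (6.25)."""
    return nu * (1.0 - nu)

for u in (-3.0, -0.5, 0.0, 0.7, 4.0):
    p = phi0(u)
    # Round trip u -> phi0 -> u, and the product (1 + u^2) * 4h, which should be 1.
    print(u, u_from_phi0(p), (1.0 + u * u) * 4.0 * h(p))
```

Because \(1+(x/\varepsilon)^2\) becomes a function of h alone, any power of it in a sigmoid turns into a power of ν(1 − ν), which is how the ε-independent expressions arise.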

Compare the hidden terms in (6.25) with the expression (5.12) in (5.3). Note how the term “ν(1 − ν)” appears throughout the hidden terms \(H_{(r)}(x;\nu)\) in (6.25), which also demonstrate more general forms that hidden terms can take than those we derived by polynomial expansion in (5.16).

This merely demonstrates that a piecewise-defined function corresponds not to one unique function of a switching multiplier ν or limiting smooth function \(\phi^\varepsilon(x)\), but a whole class of such functions. Comparing to (5.3) we see that the difference between the alternate smoothings \(F(x;\nu_{(r)})\) lies in hidden terms.

Our interest, of course, concerns the dynamical implications of these hidden terms, left behind from the asymptotic approximations above. It should be quite clear that they can affect the system’s dynamics. There are examples in [78, 79, 81, 83] of hidden terms deciding whether solutions slide along or cross through a discontinuity threshold. The anomalous sliding we described in Sects. 1.2 and 1.3 came from hidden terms, and more generally they can take all manner of non-trivial forms. An interesting example is given by taking (6.22b) and letting A(x) be a matrix. The following example shows how this can destabilize an equilibrium under smoothing. Consider the piecewise-linear problem

$$\displaystyle \begin{aligned} {\left({{\begin{array}{c}\dot x\\\dot y\\\dot z\end{array}}}\right)} &={\left({{\begin{array}{c}1\\0\\0\end{array}}}\right)}+ \nu{\left({{\begin{array}{c}-3\\ay+z\\az-y\end{array}}}\right)}\;,\quad \nu=\operatorname{step}(x)\;, \end{aligned} $$
(6.26)

for a < 0, which has sliding modes satisfying ν = 1∕3, with an attracting focus equilibrium at y = z = 0. If we smooth this system by replacing ν with \(\phi ^\varepsilon _{(0)}\), then we obtain a topologically equivalent system, with an attracting focus equilibrium on an invariant manifold, where \(\phi ^\varepsilon _{(0)}=1/3\). Consider instead smoothing by replacing ν with \(\phi ^\varepsilon _{(1)}\), and let

$$\displaystyle \begin{aligned} A=c{{\left({{\begin{array}{ccc}-1/3&0&0\\0&a&-1\\0&1&a\end{array}}}\right)}}\;, \end{aligned} $$
(6.27)

for small c > 0. As this is now a smooth system it succumbs to standard stability analysis. The system has an equilibrium at \((x,y,z)=(x_*,0,0)\), where \(\phi _{(0)}(x_*/\varepsilon )=\frac 13+\frac {8c}{27}+{{\mathcal O}{\left ({{{c^2}}}\right )}}\). This has eigenvalues \(-\frac 3\varepsilon (1-\frac {4c}{9})\phi _{(0)}^{\prime }(x_*/\varepsilon )\) and \(\frac 13(a\pm i)+\frac {8c}{27}(3+a+3a^2\pm i)\) to order \(c^2\). This implies that for

$$\displaystyle \begin{aligned}\frac{-3a}{8[(a+1)^2-\frac 53a]}<c<\frac 94\;,\end{aligned}$$

the equilibrium will de-stabilize in the (y, z) directions, becoming a saddle-focus as depicted in Fig. 6.1.

Fig. 6.1 A focus destabilized by smoothing
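The baseline case, smoothing (6.26) with \(\phi^\varepsilon_{(0)}\), can be checked by hand or with a few lines of code. The sketch below assumes the sample values a = −0.5 and ε = 0.01 and uses illustrative function names; it covers only the c = 0 case and does not attempt the matrix-valued smoothing (6.27).

```python
import cmath
import math

def phi0(u):
    """(6.22a) in terms of u = x / eps."""
    return 0.5 + u / (2.0 * math.sqrt(1.0 + u * u))

def dphi0(u):
    """Derivative of phi0 with respect to u."""
    return 0.5 / (1.0 + u * u) ** 1.5

def rhs(x, y, z, a, eps):
    """Right-hand side of (6.26) with nu replaced by phi0(x / eps)."""
    nu = phi0(x / eps)
    return (1.0 - 3.0 * nu, nu * (a * y + z), nu * (a * z - y))

def eigenvalues_at_equilibrium(a, eps):
    """Eigenvalues of the Jacobian at (x_*, 0, 0), where phi0 = 1/3."""
    u_star = -1.0 / (2.0 * math.sqrt(2.0))  # solves phi0(u) = 1/3
    nu = phi0(u_star)
    lam_x = -3.0 * dphi0(u_star) / eps      # decoupled x direction
    # At y = z = 0 the (y, z) block of the Jacobian is nu * [[a, 1], [-1, a]].
    tr = 2.0 * nu * a
    det = nu * nu * (a * a + 1.0)
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return lam_x, (tr + disc) / 2.0, (tr - disc) / 2.0

a, eps = -0.5, 0.01                         # assumed sample values, a < 0
x_star = -eps / (2.0 * math.sqrt(2.0))
print(rhs(x_star, 0.0, 0.0, a, eps))
print(eigenvalues_at_equilibrium(a, eps))
```

For a < 0 all three eigenvalues have negative real part, with the complex pair equal to (a ± i)/3, in agreement with the c = 0 limit of the eigenvalues quoted above.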

In this section we have seen some of the less obvious complexity of switching and sliding dynamics when considered from different viewpoints, expressed through layers, nonlinearity, and implementations. Let us now return to see what insight these give us into the ambiguities of the examples in Chap. 4.