
1 Introduction

This study is a sequel to the work [1] devoted to optimizing a controller stabilizing a wheel at a point. The problem of a wheel rolling on a plane or an uneven terrain is of importance in many practical applications. A rising tide of interest in this classical problem is due to the appearance of robotic systems of a new type (ball-shaped or spherical robots and robot-wheels) and the search for new actuators for such systems [2,3,4,5,6]. The problem of motion control for mobile robots of this type, which move owing to displacements of masses (pendulums) inside the shell (wheel), is discussed in many publications (see, for example, [2, 4, 6, 7]). In this paper, we consider the simplest model of a robot-wheel, assuming that it is driven by a control torque applied to the wheel axis. We do not go into the details of the actuator implementation, assuming only that the control torque is constrained, with the limit value being determined by physical parameters of the robot [2, 7]. On the one hand, such a model, in spite of its simplicity, is of interest by itself in the study of advanced control strategies, including optimal ones. On the other hand, this model can be used as a reference model in studying more complicated systems, with the solutions obtained for the reference model serving as a set of target trajectories for the original system [8].

We set the problem of synthesizing a control law in the form of feedback that brings the wheel from an arbitrary initial position on a straight line to a given one, with the velocity of motion being limited. To meet the phase and control constraints, an advanced feedback law in the form of nested saturation functions depending on four coefficients was suggested in [1]. Feedback laws of this type were studied in [9, 10]. The basic advantage of such laws is that they ensure global stability of the closed-loop system and guarantee the fulfilment of the phase and control constraints under an appropriate choice of the feedback coefficients.

Two of the four feedback coefficients are uniquely determined by the limit value of the control torque and the maximum allowed wheel velocity, while the selection of the other two coefficients can be used to optimize the performance of the controller. The optimality criterion employed in this study, as well as in [1], is similar to that in [11], where the selection of feedback coefficients of a saturated linearizing feedback for a wheeled robot with constrained control resource was discussed. The optimality is meant in the sense that the phase portrait of the nonlinear closed-loop system is similar to that of a linear system with a stable node, with the asymptotic rate of approaching the target point being as high as possible. The problem statement in this study differs from that in [1] in the definition of the node-like phase portrait. In [1], it was defined only for the domain of the phase plane satisfying the phase constraints, and the asymptote dividing the domain into two invariant sets was assumed to be straight; in this work, the definition is extended to the entire phase plane, and the asymptote is allowed to be curvilinear. The optimal asymptotic convergence rate derived in this work under the new definition is considerably greater than that obtained under the definition introduced in [1].

The paper is organized as follows. In Sect. 2, the wheel stabilization problem statement is given, the governing equations are reduced to a dimensionless form, and some earlier obtained results from [1] are presented. The optimization problem statement is formulated in Sect. 3, and the solution of the optimization problem is presented in Sect. 4. Section 5 summarizes the results of the study and discusses prospects for future research.

2 Stabilization Problem Statement

We consider a wheel rolling without slipping on a plane along a straight line (Fig. 1). The dynamics of the wheel are described by the equations [1]

$$ M\ddot{x}= R,\; Mr^2\ddot{\theta } = rR-f\dot{\theta }-U, $$

where M and r are mass and radius of the wheel, x is the coordinate of the wheel center, \(\theta \) is the rotation angle, R is the reaction force, f is the viscous friction coefficient, and U is the control torque.

Fig. 1. Schematic of the robot-wheel.

Applying the condition of rolling without slipping \( \dot{x}+r\dot{\theta }=0 \), we reduce the system equations to one second-order equation

$$\begin{aligned} \mu \ddot{x} = -\frac{f\dot{x}}{r^2}+\frac{U}{r}, \end{aligned}$$
(1)

where \(\mu =2M\). In the point stabilization problem, it is required to synthesize a control law U in the form of a feedback that brings the wheel to a given target point on the line. Without loss of generality, we set the target point to be at the origin. The control torque U is assumed to be limited, and we also assume that the velocity of the wheel center cannot exceed a prescribed value:

$$\begin{aligned} |U| \le U_{max} , \; |\dot{x}| \le V_{max}. \end{aligned}$$
(2)

The problem is further simplified by going to dimensionless form. Indeed, by introducing the dimensionless time, coordinate, and control

$$\begin{aligned} \tilde{t}= tV_{max}/r,\; \tilde{x} =x/r,\; \tilde{U}=U/U_{max}, \end{aligned}$$
(3)

as well as dimensionless parameters

$$ {\tilde{\mu }}=\frac{\mu V^2_{max}}{U_{max}}, \; \tilde{f}=\frac{f V_{max}}{rU_{max}}, $$

using the dot notation for the derivatives with respect to the new time, and assuming that \(U_{max}-fV_{max}/r>0\) (see [1] for detail), Eq. (1) turns to the dimensionless form:

$$\begin{aligned} {\tilde{\mu }}\ddot{\tilde{x}} = -\tilde{f}\dot{\tilde{x}}+\tilde{U}, \end{aligned}$$
(4)

where \(0\le \tilde{f}<1\), with constraints (2) taking the form

$$\begin{aligned} |\tilde{U}|\le 1,\; |\dot{\tilde{x}}|\le 1. \end{aligned}$$
(5)
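As a brief check of this reduction (a sketch that uses only definitions (3), with the tildes still written explicitly), note that \(d/dt=(V_{max}/r)\,d/d\tilde{t}\) and \(x=r\tilde{x}\), so that

$$ \frac{dx}{dt}=V_{max}\frac{d\tilde{x}}{d\tilde{t}},\qquad \frac{d^2x}{dt^2}=\frac{V^2_{max}}{r}\frac{d^2\tilde{x}}{d\tilde{t}^2}. $$

Substituting these expressions and \(U=U_{max}\tilde{U}\) into Eq. (1) and multiplying the result by \(r/U_{max}\), we get

$$ \frac{\mu V^2_{max}}{U_{max}}\frac{d^2\tilde{x}}{d\tilde{t}^2} = -\frac{fV_{max}}{rU_{max}}\frac{d\tilde{x}}{d\tilde{t}}+\tilde{U}, $$

which is Eq. (4) with the dimensionless parameters \({\tilde{\mu }}\) and \(\tilde{f}\) introduced above. The condition \(U_{max}-fV_{max}/r>0\) is equivalent to \(\tilde{f}<1\), and the velocity constraint \(|dx/dt|\le V_{max}\) becomes \(|d\tilde{x}/d\tilde{t}|\le 1\).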

In what follows, only the dimensionless model is used, and we omit tilde over all variables and parameters to avoid messy notation.

In [1], it was proposed to stabilize the wheel by applying the feedback in the form of nested saturators given by

$$\begin{aligned} U(x, \dot{x}) = - k_4\mathrm {Sat}(k_3(\dot{x}+k_2\mathrm {Sat}(k_1x))) + f\dot{x}, \end{aligned}$$
(6)

where \(\mathrm {Sat}(x)\) is the saturation function defined by the conditions \(\mathrm {Sat}(x)=x\) for \(|x|\le 1\) and \(\mathrm {Sat}(x)= \mathrm {sign}(x)\) for \(|x|>1\), and \(k_i>0\), \(i=1,2,3,4\), are the feedback coefficients.

It has been shown [1] that (6) is a stabilizing feedback. Moreover, if \(k_2=1\) and \(k_4=1-f\), then constraints (5) hold for any positive \(k_1\) and \(k_3\). Substituting (6) into (4) with the above-specified coefficients \(k_2\) and \(k_4\), we get the following equation governing the closed-loop system:

$$\begin{aligned} \ddot{x} = -\eta \mathrm {Sat}(k_3(\dot{x}+\mathrm {Sat}(k_1x))), \end{aligned}$$
(7)

where \(\eta =(1-f)/\mu \) is the control resource per unit mass.
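The feedback law and the closed-loop equation can be illustrated by a minimal Python sketch. It is not taken from [1]: the function names are hypothetical, and it assumes that, in the dimensionless model, the friction-compensating term of the feedback is \(f\dot{x}\), so that substitution into (4) yields (7).

```python
def sat(s):
    """Saturation function: identity on [-1, 1], sign(s) outside."""
    return max(-1.0, min(1.0, s))


def feedback(x, v, k1, k3, f):
    """Nested-saturation control with k2 = 1 and k4 = 1 - f (dimensionless).

    Assumption: the friction-compensating term is f * v, so that
    mu * a = -f * v + U reduces to the closed-loop equation (7).
    """
    return -(1.0 - f) * sat(k3 * (v + sat(k1 * x))) + f * v


def acceleration(x, v, k1, k3, mu, f):
    """Right-hand side of the closed-loop equation (7)."""
    eta = (1.0 - f) / mu  # control resource per unit mass
    return -eta * sat(k3 * (v + sat(k1 * x)))
```

For any state with \(|\dot{x}|\le 1\), the value returned by `feedback` satisfies \(|U|\le (1-f)+f|\dot{x}|\le 1\), which is the control constraint in (5).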

Fig. 2. An example of inappropriate selection of feedback coefficients in (7).

Although feedback (6) with the coefficients \(k_2=1\) and \(k_4=1-f\) stabilizes the system and respects the constraints, inappropriate selection of the other two coefficients can result in poor performance of the control system and large overshoots. Figure 2 illustrates this. It shows a phase trajectory (curve 2) of the wheel with \(\mu =1\) and \(f=0\). Because of inappropriate selection of the feedback coefficients (here, \(k_1=9\) and \(k_3=100\)), the wheel misses the target point several times, with the overshoots being quite large. The phase portrait of the system in this case resembles that of a focus, which is undesirable. The broken blue line (marked by 1) shows the curve \(x_2+\mathrm {Sat}(k_1x_1)=0\), where \(x_1=x\) and \(x_2=\dot{x}\). Thus, the freedom in the selection of \(k_1\) and \(k_3\) can be employed to optimize the performance of the controller, which is discussed in the remainder of the paper.
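The overshooting behavior described above can be reproduced with a rough numerical experiment. The sketch below integrates Eq. (7) by the explicit Euler method for \(\eta =1\), \(k_1=9\), and \(k_3=100\). The initial state \((-3, 0)\), the step size, and the time horizon are assumptions made for illustration only; they are not the values used to produce Fig. 2.

```python
def sat(s):
    """Saturation function: identity on [-1, 1], sign(s) outside."""
    return max(-1.0, min(1.0, s))


def simulate(k1, k3, eta=1.0, x0=-3.0, v0=0.0, dt=1e-3, t_end=20.0):
    """Integrate Eq. (7) by explicit Euler; return the list of positions."""
    x, v = x0, v0
    xs = [x]
    for _ in range(int(t_end / dt)):
        a = -eta * sat(k3 * (v + sat(k1 * x)))
        x, v = x + dt * v, v + dt * a
        xs.append(x)
    return xs


def sign_changes(xs, tol=1e-6):
    """Count sign changes of the position, ignoring samples with |x| < tol."""
    signs = [1 if x > 0 else -1 for x in xs if abs(x) >= tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


xs_bad = simulate(k1=9.0, k3=100.0)   # coefficients of the Fig. 2 example
overshoot_bad = max(xs_bad)           # largest excursion beyond the target
crossings_bad = sign_changes(xs_bad)  # how many times the target is passed
```

In this experiment, the wheel starts to the left of the target, so a positive value of `overshoot_bad` measures how far it runs past the origin, and `crossings_bad` counts the passes through the target point.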

3 Optimization Problem Statement

Intuitively, speaking of desirable behavior, we want to have fast asymptotic convergence to the origin in the time domain and the phase portrait of the nonlinear system to look like that of a linear system with a node, when any trajectory approaches the origin monotonically, or has at most one overshooting. Recall that, in the linear case, the phase plane is divided into two invariant half-planes by a straight line, which is the asymptote for all (but two if the node is not a degenerate one) phase trajectories of the system. The concept of a node-like phase portrait for a nonlinear system can formally be defined in terms of a curvilinear asymptote dividing the phase plane into two invariant sets, which is a generalization of the straight asymptote for a linear system.

Definition 1

We will say that the phase portrait of a nonlinear system is of the node-like type if there is a curvilinear asymptote lying completely in the second and fourth quadrants.

The property of being node-like defined above is a global one. It means that not only the origin is a node of the linearized system but also that the behavior of the phase trajectories in the entire phase plane is similar to the behavior of the phase trajectories of a linear system. Like the straight asymptote in the linear case, the curvilinear asymptote divides the phase plane into two invariant sets such that any phase trajectory passes through only two quadrants of the phase plane.

Now, the problem to be solved in this study can be formulated as follows.

Problem. Determine feedback coefficients \(k_1\) and \(k_3\) for which the asymptotic rate of approaching the target point is maximal under the condition that the phase portrait of system (7) is of the node-like type.

4 Solution of the Optimization Problem

First, we establish the general form of a curvilinear asymptote (further, simply asymptote) for system (7) and then determine under what conditions the asymptote passes only through the second and fourth quadrants.

Let us introduce the notation \(x_1=x\) and \(x_2=\dot{x}\) and rewrite (7) in the state-space form as

$$\begin{aligned} \begin{array}{ccl} \dot{x}_1 &{} = &{} x_2\\ \dot{x}_2 &{} = &{} -\eta \mathrm {Sat}(k_3(x_2+\mathrm {Sat}(k_1x_1))). \end{array} \end{aligned}$$
(8)

It is easy to see that the closed-loop system (8) is piecewise linear. Figure 3 shows the partitioning of the phase plane. Here, the dashed lines depict the boundaries between different linearity regions where one linear system switches to another. The solid broken line

$$\begin{aligned} x_2+\mathrm {Sat}(k_1x_1)=0 \end{aligned}$$
(9)

is the set of points where the right-hand side of the second equation in (8) vanishes. The control reaches saturation outside the broken strip bounded by the two dashed lines parallel to (9).

In the intersection of the sets \(|x_1|\le 1/k_1\) and \(|x_2+k_1x_1|\le 1/k_3\), which includes the origin, Eq. (7) takes the form

$$\begin{aligned} \ddot{x}+\eta k_3\dot{x}+\eta k_1k_3x=0. \end{aligned}$$
(10)

Fig. 3. Partition of the phase plane for system (8).

To simplify the following calculations, we confine our consideration in this paper to the case of a degenerate node (repeated root of the characteristic equation) of the linearized system, which is governed by the equation

$$\begin{aligned} \ddot{x}+2\lambda \dot{x}+\lambda ^2 x=0,\; \lambda >0, \end{aligned}$$
(11)

where \(\lambda \) is the rate of the asymptotic convergence. Comparing (10) and (11), we find that the coefficients \(k_1\) and \(k_3\) are to be selected from the one-parameter family

$$\begin{aligned} k_1=\frac{\lambda }{2}, \; k_3=\frac{2\lambda }{\eta } \end{aligned}$$
(12)

parameterized by the exponent \(\lambda \). We therefore seek the maximal \(\lambda \) for which the phase portrait of system (7) is of the node-like type.
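As an illustration of family (12), the following Python sketch (the helper names are hypothetical) computes \(k_1\) and \(k_3\) for a given \(\lambda \) and finds the roots of the characteristic equation of the linear equation (10), so that the repeated root \(-\lambda \) can be checked numerically.

```python
import cmath


def coefficients(lam, eta):
    """Return (k1, k3) from the one-parameter family (12)."""
    return lam / 2.0, 2.0 * lam / eta


def char_roots(k1, k3, eta):
    """Roots of s^2 + eta*k3*s + eta*k1*k3 = 0, cf. Eq. (10)."""
    b, c = eta * k3, eta * k1 * k3
    root = cmath.sqrt(complex(b * b - 4.0 * c))
    return (-b + root) / 2.0, (-b - root) / 2.0
```

For comparison, the coefficients \(k_1=9\), \(k_3=100\) used in Fig. 2 with \(\eta =1\) give the distinct real roots \(-10\) and \(-90\). The origin of the linearized system is then a node, although the global phase portrait resembles a focus, which is why the node-like property is defined globally rather than locally.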

Clearly, being a curve dividing the phase plane into two invariant sets, any asymptote must be an integral curve of the system [12]. It was proved in [1, Lemma 1] that any trajectory of equation (7) beginning in the strip \(|x_2|\le 1\) never leaves it, i.e., the strip is an invariant set of the system. It is easy to prove that a trajectory beginning outside the strip cannot intersect the horizontal segments of line (9) either. This follows from the facts that the horizontal segments of line (9) are negative half-trajectories with the initial points \((-1/k_1, 1)\) and \((1/k_1, -1)\), respectively, and that no trajectories can intersect [12]. Indeed, let \(x_1(0)=-1/k_1, x_2(0)=1\). Since the right-hand side of the second equation in (8) is zero on this segment, \(x_2(t)\equiv 1\). Then, by virtue of the first equation, \(x_1(t)=x_1(0)+t\le -1/k_1\) for \(t\le 0\), so that the negative half-trajectory beginning at the point \((-1/k_1, 1)\) is the left horizontal segment of line (9). Similarly, it is proved that the right horizontal segment is the negative half-trajectory beginning at the point \((1/k_1, -1)\). Note also that the positive half-trajectories of the system beginning at the same points asymptotically approach the origin, since the origin is the equilibrium point of the stabilized closed-loop system. This brings us to the following lemma.

Lemma 1

The asymptote of system (7) is an integral curve consisting of the singular phase trajectory (equilibrium point) \(x(t)\equiv 0\) and two pairs of half-trajectories (negative and positive) beginning at the points \((-1/k_1, 1)\) and \((1/k_1, -1)\).

Thus, solving the Problem reduces to finding the maximal exponent \(\lambda \) for which the positive half-trajectories beginning at the points \((-1/k_1, 1)\) and \((1/k_1, -1)\) lie completely in the second and fourth quadrants, respectively. Taking into account the symmetry of the phase portrait with respect to the origin, it suffices to consider only one of these half-trajectories, say, the one beginning at the point \((-1/k_1, 1)\) in the second quadrant.

In [1], the estimate \(\tilde{\lambda }=\eta \) for the maximal \(\lambda \) was obtained by seeking a straight asymptote that divides the strip \(|x_2|\le 1\) into two invariant sets. Further in this section, we show that, if the asymptote is allowed to be curvilinear, an exact value of the maximal \(\lambda \) can be obtained, which is considerably greater than the estimate from [1]. Moreover, the curvilinear asymptote is shown to divide the entire phase plane, rather than the strip, into two invariant sets.

In view of the system symmetry, we may confine our consideration to the trajectories beginning in the left half-plane. It is evident that the positive half-trajectory beginning at the corner \((-1/k_1, 1)\) of the broken line (9) lies completely in the second quadrant if and only if it does not intersect the straight asymptote \(x_2=-\lambda x_1\) of the linear system (10). The latter may hold in the following two cases. First, this obviously happens when the trajectory does not intersect the saturation line \(x_2+k_1 x_1=1/k_3\) (i.e., when the control does not reach saturation). The other case takes place when the system does reach saturation but the trajectory still does not intersect the straight asymptote. Whether the second case is possible is verified below.

Consider the first case. The solution of the linear equation (11) with the initial conditions \(x_1(0)=-1/k_1=-2/\lambda \), \(x_2(0)=1\) is given by

$$\begin{aligned} x_1(t) =-\frac{1}{\lambda }(\lambda t+ 2)\exp (-\lambda t), \; x_2(t) =(\lambda t+ 1)\exp (-\lambda t), \end{aligned}$$
(13)

from which it follows that

$$\begin{aligned} \frac{x_1}{x_2}=-\frac{1}{\lambda }\frac{\lambda t +2}{\lambda t +1}. \end{aligned}$$
(14)

Equation (11) in the half-plane \(x_2>0\) can be written as

$$ \frac{d\,x_2}{d\,x_1}= -2\lambda -\lambda ^2 \frac{x_1}{x_2}= -2\lambda +\lambda \frac{\lambda t +2}{\lambda t+1}=-\frac{\lambda ^2 t}{\lambda t + 1}. $$

Taking into account that \(k_1=\lambda /2\), the slope of the saturation (switching) line \(x_2+k_1 x_1=1/k_3\) is \(-\lambda /2\).

Let us find \(t_*\) for which the trajectory (13) has the same slope,

$$ -\frac{\lambda ^2 t}{\lambda t + 1} =-\frac{\lambda }{2}. $$

This yields \(\lambda t_*=1\) and \(t_*=1/\lambda \); the corresponding trajectory point is given by \(x_1(t_*)=-3e^{-1}/\lambda \), \(x_2(t_*)=2e^{-1}\). Substituting these into the equation of the saturation line, we get

$$ 2e^{-1} - \frac{3}{2}e^{-1}=\frac{\eta }{2\lambda }, $$

from which it follows that \(\lambda _*=\eta e\). This estimate is e times greater than the one obtained with the help of a straight asymptote in [1]. The coordinates of the touching point are \((-3 e^{-2}/\eta ,2e^{-1})\).
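The same bound can be obtained by a short direct calculation, given here as a sketch that uses only solution (13) and family (12). With \(k_1=\lambda /2\), the left-hand side of the saturation-line equation evaluated along (13) is

$$ x_2(t)+k_1x_1(t)=(\lambda t+1)e^{-\lambda t}-\frac{1}{2}(\lambda t+2)e^{-\lambda t}=\frac{\lambda t}{2}e^{-\lambda t}, $$

which is nonnegative and attains its maximum \(1/(2e)\) at \(\lambda t=1\). Hence, the control remains unsaturated along the whole positive half-trajectory if and only if

$$ \frac{1}{2e}\le \frac{1}{k_3}=\frac{\eta }{2\lambda }, \quad \text{i.e.,}\quad \lambda \le \eta e, $$

with equality corresponding to the touching point at \(t_*=1/\lambda \).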

Thus, in the considered case, the desired asymptote is defined parametrically by Eq. (13) for \(\lambda =\eta e\) as t varies from 0 to \(\infty \). On the asymptote, the system is linear, and the control reaches saturation at a single point.

Now, let us check whether the exponent \(\lambda \) can be increased if we permit saturation on the positive half-trajectory. After intersecting the saturation line, system (8) turns to

$$ \dot{x}_1=x_2, \; \dot{x}_2 = -\eta . $$

Rewriting these equations as

$$ \frac{dx_1}{dx_2}=-\frac{x_2}{\eta }, $$

and integrating the resulting equation, we find that the system trajectory is the parabola

$$\begin{aligned} x_1(t)=-\frac{1}{2\eta }x_2^2(t)+C(x_{10},x_{20}), \end{aligned}$$
(15)

where \(x_{10}\) and \(x_{20}\) are the coordinates of the point where the trajectory intersects the saturation line and

$$ C(x_{10},x_{20})=x_{10}+\frac{x_{20}^2}{2\eta }= \frac{1}{8\eta }\left( \lambda ^2x_{10}^2+6x_{10}\eta +\frac{\eta ^2}{\lambda ^2}\right) . $$

Equating the slope of the parabola to that of the straight asymptote, we find that the tangent line to the parabola is parallel to the asymptote when \(x_2=\eta /\lambda \). The parabola touches the asymptote if it passes through the point with the coordinates \(x^{**}_1=-\eta /\lambda ^2\), \(x^{**}_2=\eta /\lambda \). Note also that this is the point at which the asymptote and the saturation line intersect. Substituting these coordinates into the parabola equation, we obtain

$$ -\frac{\eta }{\lambda ^2}=-\frac{\eta }{2\lambda ^2} + C $$

from which it follows that \(C=-\eta /(2\lambda ^2)\). Equating the two expressions for C, we get the following quadratic equation in \(x_{10}\):

$$ \frac{\lambda ^2x_{10}^2}{\eta ^2}+\frac{6x_{10}}{\eta }+\frac{5}{\lambda ^2}=0. $$

The two solutions of this equation are \(-5\eta /\lambda ^2\) and \(-\eta /\lambda ^2\). The former is the abscissa of the first intersection point, where the trajectory leaves the strip and the control reaches saturation, and the latter is the abscissa of the second point, where the trajectory re-enters the strip.

The above implies that, in order that the trajectory return to the strip at the required point \((x_1^{**}, x_2^{**})\), the first intersection with the saturation line must be at the point with the coordinates \(x_1^*=-5\eta /\lambda ^2\), \(x_2^*=3\eta /\lambda \). The value of \(\lambda \) and the corresponding time \(t^*\) are found by equating solutions (13) at \(t=t^*\) to the coordinates obtained:

$$ -\frac{1}{\lambda }(\lambda t^* + 2)e^{-\lambda t^*}= -\frac{5\eta }{\lambda ^2} $$

and

$$ (\lambda t^* + 1)e^{-\lambda t^*}= \frac{3\eta }{\lambda }. $$

Dividing the first equation by the second one and solving the equation obtained, we get

$$ \lambda t^*=\frac{1}{2}, \; \lambda =2\eta \sqrt{e}. $$

As can be seen, the exponential rate of approaching the origin obtained is \(2/\sqrt{e}\approx 1.2\) times greater than that in the previous case and \(2\sqrt{e}\approx 3.3\) times greater than the estimate obtained in [1].
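The algebra of the second case can be verified numerically. The Python sketch below evaluates solution (13) at \(\lambda t^*=1/2\) for \(\lambda =2\eta \sqrt{e}\) and checks that the resulting point lies on the saturation line and that the parabola (15) through it returns to the line at the point \((-\eta /\lambda ^2, \eta /\lambda )\) of the straight asymptote. The value of \(\eta \) and all variable names are chosen for illustration only.

```python
import math

eta = 0.75                                  # illustrative value; any eta > 0
lam = 2.0 * eta * math.sqrt(math.e)         # candidate optimal rate
k1, k3 = lam / 2.0, 2.0 * lam / eta         # family (12)

# Exit point: solution (13) evaluated at lam * t = 1/2.
t_star = 0.5 / lam
x1_exit = -(lam * t_star + 2.0) * math.exp(-lam * t_star) / lam
x2_exit = (lam * t_star + 1.0) * math.exp(-lam * t_star)

res_exit_x1 = x1_exit - (-5.0 * eta / lam**2)
res_exit_x2 = x2_exit - 3.0 * eta / lam
res_exit_line = x2_exit + k1 * x1_exit - 1.0 / k3

# Parabola (15) through the exit point, evaluated at x2 = eta / lam.
c = x1_exit + x2_exit**2 / (2.0 * eta)
x2_ret = eta / lam
x1_ret = -x2_ret**2 / (2.0 * eta) + c

res_ret_x1 = x1_ret - (-eta / lam**2)
res_ret_asymptote = x2_ret + lam * x1_ret          # on x2 = -lam * x1
res_ret_line = x2_ret + k1 * x1_ret - 1.0 / k3     # on the saturation line

residuals = [res_exit_x1, res_exit_x2, res_exit_line,
             res_ret_x1, res_ret_asymptote, res_ret_line]
```

All entries of `residuals` should vanish up to rounding error if the formulas above are mutually consistent.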

Moreover, this is the exact value of the maximal rate \(\lambda \): \(\lambda _{max}=2\eta \sqrt{e}\). Indeed, for any \(\lambda >\lambda _{max}\), the positive half-trajectory emerging from the corner of the broken line (9) necessarily intersects the straight asymptote and, being the trajectory of the linear system (10), will intersect the \(x_2\)-axis and enter the first quadrant.

The above results are summarized in the following theorem.

Theorem 1

The greatest exponential rate \(\lambda \) of decrease of the deviation x for which the phase portrait of the nonlinear system (8) is of the node-like type is

$$ \lambda _{max}=2\eta \sqrt{e}. $$

The corresponding coefficients \(k_1\) and \(k_3\) are given by

$$\begin{aligned} k_1=\eta \sqrt{e},\; k_3=4\sqrt{e}. \end{aligned}$$
(16)

Fig. 4. Optimal asymptote for system (8) with \(\mu =1\) and \(f=0\).

The optimal curvilinear asymptote for the system with \(\mu =1\) and \(f=0\) (and, hence, \(\eta =1\)), corresponding to the optimal value of \(\lambda \) is depicted in Fig. 4 by the bold black curve. For this system, \(\lambda _{max}=2\sqrt{e}\), \(k_1=\sqrt{e}\), and \(k_3=4\sqrt{e}\). The two broken dashed lines in the figure are boundaries of the region where the control does not reach saturation. The straight dashed line is the straight asymptote \(x_2=-\lambda _{max} x_1\) of the linear system (10).

Let us describe the part of the asymptote lying in the second quadrant. As noted earlier, it consists of the negative and positive half-trajectories beginning at the point \((-1/k_1,1)\). The former is the straight half-line (marked by 4) given parametrically by \(x_1(t)=-1/k_1+t\), \(x_2(t)\equiv 1\), \(-\infty < t\le 0\). The latter, in turn, consists of three segments: the first segment (curve 3) is the trajectory of the linear equation (11) given by (13), where \(0\le t<t^*=1/(2\lambda )\); the second segment (curve 2) is a piece of parabola (15), \(t^*\le t<t^{**}=5/(2\lambda )\); and the third segment (line 1) is a piece of the straight asymptote of (11), \(t^{**}\le t< \infty \). The other part of the asymptote, in the fourth quadrant, is symmetric to this one with respect to the origin.
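The piecewise description above can be written as a single parametrization. The following Python sketch (the function name is hypothetical) returns the point of the optimal asymptote that starts from the corner \((-1/k_1, 1)\) for a given time t. The parabola segment is written in time-parametric form, \(x_2=x_2^*-\eta (t-t^*)\), which is equivalent to (15), and the last segment follows the straight asymptote \(x_2=-\lambda x_1\) with exponential decay at the rate \(\lambda \).

```python
import math


def asymptote_point(t, eta):
    """Point (x1, x2) of the optimal asymptote through (-1/k1, 1) at time t."""
    lam = 2.0 * eta * math.sqrt(math.e)   # optimal rate from Theorem 1
    k1 = lam / 2.0
    t1 = 1.0 / (2.0 * lam)                # t*: control becomes saturated
    t2 = 5.0 / (2.0 * lam)                # t**: control leaves saturation
    if t <= 0.0:                          # horizontal segment of line (9)
        return -1.0 / k1 + t, 1.0
    if t < t1:                            # linear zone, solution (13)
        e = math.exp(-lam * t)
        return -(lam * t + 2.0) * e / lam, (lam * t + 1.0) * e
    if t < t2:                            # saturated control, parabola (15)
        tau = t - t1
        x1 = -5.0 * eta / lam**2 + (3.0 * eta / lam) * tau - 0.5 * eta * tau**2
        return x1, 3.0 * eta / lam - eta * tau
    x1 = -(eta / lam**2) * math.exp(-lam * (t - t2))  # straight asymptote
    return x1, -lam * x1
```

Evaluating the function just before and just after \(t=0\), \(t^*\), and \(t^{**}\) gives a quick continuity check of the three junction points.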

Fig. 5. Phase portrait of system (8) with \(\mu =1\) and \(f=0.25\).

Figure 5 shows the phase portrait of system (8) with \(\mu =1\) and \(f=0.25\) (\(\eta =0.75\)) for the optimal value \(\lambda =\lambda _{max}\). The black bold line is the asymptote of the system. The green broken lines are the boundaries of the region where the control is not saturated. As can be seen, any trajectory beginning below (above) the asymptote lies completely below (above) it, and any trajectory intersects the \(x_2\)-axis at most once. If we further increase \(\lambda \), the asymptote will intersect the \(x_2\)-axis and pass through all quadrants, which means that the phase portrait will no longer be node-like.
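The loss of the node-like property for \(\lambda >\lambda _{max}\) can be observed in a rough numerical experiment. The sketch below integrates system (8) by the explicit Euler method from the corner \((-1/k_1, 1)\) and records the largest value of \(x_1\) reached. The value \(\eta =1\), the factor 1.5, the step size, and the time horizon are assumptions made for illustration.

```python
import math


def sat(s):
    """Saturation function: identity on [-1, 1], sign(s) outside."""
    return max(-1.0, min(1.0, s))


def max_x1_from_corner(lam, eta, dt=1e-3, t_end=15.0):
    """Euler-integrate system (8) from (-1/k1, 1); return the largest x1."""
    k1, k3 = lam / 2.0, 2.0 * lam / eta   # family (12)
    x1, x2 = -1.0 / k1, 1.0
    peak = x1
    for _ in range(int(t_end / dt)):
        a = -eta * sat(k3 * (x2 + sat(k1 * x1)))
        x1, x2 = x1 + dt * x2, x2 + dt * a
        peak = max(peak, x1)
    return peak


eta = 1.0
lam_max = 2.0 * eta * math.sqrt(math.e)
peak_opt = max_x1_from_corner(lam_max, eta)         # stays at or below ~0
peak_over = max_x1_from_corner(1.5 * lam_max, eta)  # clearly positive
```

A positive `peak_over` means that the half-trajectory has crossed the \(x_2\)-axis and entered the first quadrant, whereas for \(\lambda =\lambda _{max}\) the computed trajectory approaches the origin without a noticeable overshoot.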

5 Conclusions

In the paper, the problem of optimizing a controller stabilizing a robot-wheel at a target point on a straight line subject to phase and control constraints has been discussed. The controller implementing an advanced feedback law in the form of nested saturation functions was suggested in [1]. The feedback depends on four coefficients, two of which ensure the fulfillment of the phase and control constraints, while the other two can be adjusted to optimize the performance of the controller. An optimal controller has been defined as the one that ensures the greatest convergence rate near the target point while preserving a node-like phase portrait of the nonlinear system. Optimal values of the feedback coefficients have been found, and the corresponding asymptote dividing the phase plane into two invariant sets has been constructed. The use of the new definition of the node-like phase portrait, which relies on the concept of a curvilinear asymptote, made it possible to get a greater value of the asymptotic convergence rate near the target point compared to that in [1].

In the future, we plan to apply the approach developed in this paper to optimizing the coefficients of a controller for a more complicated system of a robot-wheel with a pendulum. We also plan to synthesize a hybrid control law where the selection of the feedback coefficients will depend on whether the system is in the neighborhood of the target point or far from it.