Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

In safety-critical robotics applications involving autonomous ground and air vehicles, it is desirable to unambiguously specify the desired system behavior and automatically synthesize a controller that provably implements this behavior. Additionally, autonomous systems often have high-dimensional, nonlinear dynamics and require high-performance (not just feasible) controllers.

Linear temporal logic (LTL) is an expressive task-specification language for specifying a variety of tasks such as responding to the environment, visiting goals, periodically monitoring areas, staying safe, and remaining stable. These properties generalize classical point-to-point motion planning. Also, the widespread use of LTL in software verification [2] makes it appealing as a common language for reasoning about the software and dynamics of autonomous systems.

Standard methods for motion planning with LTL task specifications first create a finite abstraction of the original dynamical system. This abstraction can informally be viewed as a labeled graph that represents possible behaviors of the system. Approximate finite abstractions can be computed using either sampling-based methods (e.g., RRTs) [6, 14, 17] or reachability-based approaches [1, 3, 10, 16, 29].

Given a finite abstraction of a dynamical system and an LTL specification, controllers can be automatically constructed using an automata-based approach [2, 6, 9, 14, 16]. This approach first transforms the LTL formula into an equivalent Büchi automaton whose size may be exponential in the length of the formula [2]. A product automaton is created from the finite abstraction and the Büchi automaton, and then a controller is found by graph search in the product automaton.

The main drawback of this approach is that it is expensive to compute a finite abstraction. The product automaton might also be quite large due to the size of the abstraction and the Büchi automaton. Finally, although optimal controllers can be computed for the discrete abstraction [21, 27], optimality is only with respect to the abstraction’s level of refinement or asymptotic [14].

Instead of the automata-based approach, we directly encode a large class of temporal logic formulas as mixed-integer linear constraints on the original dynamical system. These constraints enforce that an infinite sequence of system states satisfies a task specification. A key component of our formulation is enforcing that the system is in a (non-convex) region at a given time. We introduce an alternative formulation for this that gives a tighter convex relaxation than the commonly used big-M approach. Our approach applies to any deterministic system model that is amenable to finite-dimensional optimization, as the temporal logic constraints are independent of any particular system dynamics or cost functions. We specifically investigate Mixed Logical Dynamic (MLD) systems [4] and certain differentially flat systems [19], whose dynamics can be encoded with mixed-integer linear constraints. MLD systems include constrained linear systems, linear hybrid automata, and piecewise affine systems. Differentially flat systems include quadrotors and car-like vehicles.

It is well-known that mixed-integer linear programming can be used for reasoning about propositional logic [7, 11], generating state-constrained trajectories [8, 20, 24], and modeling vehicle routing problems [13, 22]. The work most similar to ours is Karaman et al. [15], who consider controller synthesis for MLD systems subject to finite-horizon LTL specifications. However, finite-horizon properties are too restrictive to model a large class of interesting robotics problems, including persistent surveillance, repeated assembly, periodic motion, and environmental monitoring. Our work specifically addresses these types of periodic tasks with a novel mixed-integer formulation.

Our main contributions are (1) a novel method for encoding both finite- and infinite-horizon temporal logic properties as mixed-integer linear constraints on a system and (2) an improved encoding that has a tighter convex relaxation and uses significantly fewer binary variables for common tasks than related work [15]. The fragment of temporal logic that we consider allows one to specify properties such as safety, stability, liveness, guarantee, and response. We demonstrate how this mixed-integer programming formulation can be used with off-the-shelf optimization solvers (e.g. CPLEX [23]) to compute both feasible and optimal controllers for high-dimensional systems with temporal logic specifications.

2 Preliminaries

An atomic proposition is a statement that is \( True \) or \( False \). A propositional formula is composed of only atomic propositions and propositional connectives, i.e., \(\wedge \) (and), \(\vee \) (or), and \(\lnot \) (not). Let \(\mathcal {T}= \{ 0,1,2,\ldots , T\} \subset {\mathbb N}\) denote a bounded set of discrete time instances and \(\mathcal {T}^{\infty } = \{0,1,2,\ldots \}\) denote an unbounded set of discrete time instances.

2.1 System Model

We consider discrete-time nonlinear systems of the form

$$\begin{aligned} x(t+1) = f( x(t),u(t) ), \end{aligned}$$
(1)

where \(t \in \mathcal {T}^{\infty }\), \(x \in \mathcal {X}\subseteq {\mathbb R}^{n_c} \times \{0,1\}^{n_l}\) are the continuous and binary states, \(u \in \mathcal {U}\subseteq {\mathbb R}^{m_c} \times \{0,1\}^{m_l}\) are the inputs, and \(x(0) = x_0 \in \mathcal {X}\) is the initial state. We assume that the system is deterministic, i.e., an initial state \(x_0\) and a control input sequence \(\mathbf {u} = {u_0 u_1 u_2\ldots }\) produces a unique trajectory (or run) \(\mathbf {x} = \mathbf {x}(x_0,\mathbf {u}) = x_0 x_1 x_2\ldots \).

Let AP be a finite set of atomic propositions. The (time-dependent) labeling function \(L_t: \mathcal {X}\rightarrow 2^{AP}\) maps the continuous part of each state to the set of atomic propositions that are \( True \) at time t. Each atomic proposition \(\psi \in AP\) is represented by a union of polyhedrons. The finite index set \(I^{\psi }_t\) lists the polyhedrons where \(\psi \) holds at time t. The ith polyhedron is \(\{x \in \mathcal {X}\mid H^{\psi _i}_t x \le K^{\psi _i}_t \}\), where \(i \in I^{\psi }_t\). Thus, the set of states where atomic proposition \(\psi \) holds at time t is given by . This (potentially) time-varying set is the finite union of polyhedrons (finite conjunctions of halfspaces).

2.2 A Fragment of Temporal Logic

We do not attempt to reason about all possible temporal logic formulas (see [2]); instead, we develop a useful library of temporal operators for robotic tasks. This fragment of temporal logic can concisely and unambiguously specify a wide range of tasks such as safe navigation, surveillance, persistent coverage, response to the environment, and visiting goals. In the following definitions, \(\psi \), \(\phi \), and \(\psi _j\) (for a finite number of indices j) are propositional formulas. To simplify the presentation, we split these into three groups: core \(\varPhi _{\text {core}}\), response \(\varPhi _{\text {resp}}\), and fairness \(\varPhi _{\text {fair}}\). We first define the syntax of the temporal operators and then their semantics.

Syntax

The core operators, \(\varPhi _{\text {core}} := \{ \varphi _{\text {safe}}, \varphi _{\text {goal}}, \varphi _{\text {per}}, \varphi _{\text {live}}, \varphi _{\text {until}} \}\), specify fundamental properties such as safety, guarantee, persistence, liveness (recurrence), and until. These operators are,

$$\begin{aligned} \varphi _{\text {safe}}&:= \square \psi ,&\varphi _{\text {goal}}&:= \Diamond \psi ,&\varphi _{\text {per}}&:= \Diamond \square \psi ,&\varphi _{\text {live}}&:= \square \Diamond \psi ,&\varphi _{\text {until}}&:= \psi \mathcal {U}\phi , \end{aligned}$$

where \(\varphi _{\text {safe}}\) specifies safety, i.e., a property should invariantly hold, \(\varphi _{\text {goal}}\) specifies goal visitation, i.e., a property should eventually hold, \(\varphi _{\text {per}}\) specifies persistence, i.e., a property should eventually hold invariantly, and \(\varphi _{\text {live}}\) specifies liveness (recurrence), i.e., a property should hold repeatedly, as in surveillance, and \(\varphi _{\text {until}}\) specifies until, i.e., a property \(\psi \) should hold until another property \(\phi \) holds.

The response operators, \(\varPhi _{\text {resp}} := \{ \varphi _{\text {resp}}^1, \varphi _{\text {resp}}^2, \varphi _{\text {resp}}^3, \varphi _{\text {resp}}^4 \}\), specify how the system responds to the environment. These operators are,

$$\begin{aligned} \varphi _{\text {resp}}^1&:= \square ( \psi \implies \bigcirc \phi ),&\varphi _{\text {resp}}^2&:= \square ( \psi \implies \Diamond \phi ), \\ \varphi _{\text {resp}}^3&:= \Diamond \square ( \psi \implies \bigcirc \phi ),&\varphi _{\text {resp}}^4&:= \Diamond \square ( \psi \implies \Diamond \phi ), \end{aligned}$$

where \(\varphi _{\text {resp}}^1\) specifies next-step response to the environment, \(\varphi _{\text {resp}}^2\) specifies eventual response to the environment, \(\varphi _{\text {resp}}^3\) specifies steady-state next-step response to the environment, and \(\varphi _{\text {resp}}^4\) specifies steady-state eventual response to the environment.

Finally, the fairness operators, \(\varPhi _{\text {fair}} := \{ \varphi _{\text {fair}}^1, \varphi _{\text {fair}}^2, \varphi _{\text {fair}}^3 \}\), allow one to specify conditional tasks. These operators are,

$$\begin{aligned} \varphi _{\text {fair}}^1&:= \Diamond \psi \implies \bigwedge _{j=1}^m \Diamond \phi _j,&\varphi _{\text {fair}}^2&:= \Diamond \psi \implies \bigwedge _{j=1}^m \square \Diamond \phi _j, \\ \varphi _{\text {fair}}^3&:= \square \Diamond \psi \implies \bigwedge _{j=1}^m \square \Diamond \phi _j, \end{aligned}$$

where \(\varphi _{\text {fair}}^1\) specifies conditional goal visitation, and \(\varphi _{\text {fair}}^2\) and \(\varphi _{\text {fair}}^3\) specify conditional repeated goal visitation.

The fragment of LTL that we consider is built from the temporal operators defined above as follows,

$$\begin{aligned} \varphi {:}:= \varphi _{\text {core}} \mid \varphi _{\text {resp}} \mid \varphi _{\text {fair}} \mid \varphi _1 \wedge \varphi _2, \end{aligned}$$
(2)

where \(\varphi _{\text {core}} \in \varPhi _{\text {core}}\), \(\varphi _{\text {resp}} \in \varPhi _{\text {resp}}\), and \(\varphi _{\text {fair}} \in \varPhi _{\text {fair}}\).

This LTL fragment specifies many properties relevant to robotics, especially for surveillance tasks for which no mathematical programming-based approaches currently exist. However, it does not include nested properties [2]. Determining all temporal properties that can be expressed in this framework is future work.

Remark 1

To include disjunctions (e.g., \(\varphi _1 \vee \varphi _2\)), one can rewrite a formula in disjunctive normal form, where each clause is of the form (2). In what follows, each clause can then be considered separately, as the system (1) is deterministic.

Semantics

We use set operations between a trajectory (run) \(\mathbf {x} = \mathbf {x}(x_0,\mathbf {u})\) and subsets of \(\mathcal {X}\) where particular propositional formulas hold to define satisfaction of a temporal logic formula [2]. We denote the set of states where propositional formula \(\psi \) holds by . A run \(\mathbf {x}\) satisfies the temporal logic formula \(\varphi \), denoted by \(\mathbf {x} \models \varphi \), if and only if certain set operations hold. Given propositional formulas \(\psi \) and \(\phi \), we relate satisfaction of (a partial list of) formulas of the form (2) with set operations as follows:

  • \(\mathbf {x} \models \square \psi \) iff for all i,

  • \(\mathbf {x} \models \Diamond \square \psi \) iff there exists an index j such that for all \(i \ge j\),

  • \(\mathbf {x} \models \Diamond \psi \) iff for some i,

  • \(\mathbf {x} \models \square \Diamond \psi \) iff for infinitely many i,

  • \(\mathbf {x} \models \psi \mathcal {U}\phi \) iff there exists an index j such that and for all \(i < j\),

  • \(\mathbf {x} \models \square (\psi \implies \bigcirc \phi )\) iff or for all i,

  • \(\mathbf {x} \models \square (\psi \implies \Diamond \phi )\) iff or for some \(k \ge i\) for all i,

  • \(\mathbf {x} \models \Diamond \square (\psi \implies \bigcirc \phi )\) iff there exists an index j such that or for all \(i \ge j\),

  • \(\mathbf {x} \models \Diamond \square (\psi \implies \Diamond \phi )\) iff there exists an index j such that or for some \(k \ge i\) for all \(i \ge j\).

A run \(\mathbf {x}\) satisfies a conjunction of temporal logic formulas \(\varphi = \bigwedge _{i=1}^m \varphi _i\) if and only if the set operations for each temporal logic formula \(\varphi _i\) holds. The LTL formula \(\varphi \) is satisfiable by a system at state \(x_0 \in \mathcal {X}\) if and only if there exists a control input sequence \(\mathbf {u}\) such that \(\mathbf {x}(x_0, \mathbf {u}) \models \varphi \).

3 Problem Statement

In this section, we formally state both a feasibility and an optimization problem and give an overview of our solution approach. Let \(\varphi \) be an LTL formula of the form (2) defined over AP.

Problem 1

Given a system of the form (1), with initial condition \(x_0\), and an LTL formula \(\varphi \) of the form (2), determine whether or not there exists a control input sequence \(\mathbf {u}\) such that \(\mathbf {x}(x_0, \mathbf {u}) \models \varphi \).

We now introduce a cost function to distinguish among all trajectories that satisfy Problem 1. Since LTL formulas are defined over infinite state sequences, we define a cost function over infinite state sequences. We use a maximum cost function to simplify the presentation; it can easily be extended to discounted, limit-maximum, and average cost functions (see [26]). Let the cost \(c: \mathcal {X}\times \mathcal {U}\rightarrow {\mathbb R}\) be bounded.

Definition 1

Let \(\mathbf {x}\) be a trajectory and \(\mathbf {u}\) be the corresponding control input sequence. The maximum cost of trajectory \(\mathbf {x}\) is

$$\begin{aligned} J (\mathbf {x}, \mathbf {u}) := \sup _{t \in \mathcal {T}^{\infty }} c( x_t, u_t), \end{aligned}$$
(3)

where J maps trajectories and control inputs to \({\mathbb R}\cup \infty \).

Problem 2

Given a system of the form (1), with initial condition \(x_0\), and an LTL formula \(\varphi \) of the form (2), compute a control input sequence \(\mathbf {u}\) such that \(\mathbf {x}({x}_0, \mathbf {u}) \models \varphi \) and \(J(\mathbf {x}(x_0, \mathbf {u}), \mathbf {u})\) is minimized.

We now give a brief overview of our solution approach. We parameterize the system trajectory (control input) as a periodic prefix-suffix structure. Every LTL operator of the form (2) is encoded as mixed-integer linear constraints on this finite parameterization. These temporal logic constraints (see Sect. 5) are then combined with dynamic constraints (see Sect. 6) as constraints on a combined mixed-integer optimization problem with an appropriate cost function. For MLD systems and certain differentially flat systems (see Sect. 6) with linear costs, Problems 1 and 2 can thus be solved using a mixed-integer linear program (MILP) solver. While even checking feasibility of a MILP is NP-hard, modern solvers using branch and bound methods routinely solve large problems [23]. We show promising results (see Fig. 1) on high-dimensional (10\(+\) continuous state) systems in Sect. 7.

Fig. 1
figure 1

Illustration of a problem instance. The task is to repeatedly visit regions P, D1, and D2, where dark regions are obstacles that must be avoided. Representative trajectories for a quadrotor (left) and nonlinear car (right) are shown with the prefix (blue) and suffix (black)

Remark 2

We only consider open-loop trajectory generation, which is already a challenging problem due to the nonlinear dynamics and LTL specifications. Disturbances can be dealt with by wrapping a feedback controller around the trajectory. Incorporating disturbances during trajectory generation is the subject of future work.

4 A Periodic Trajectory Parameterization

We parameterize the system trajectory by a periodic prefix-suffix form that is commonly used in model checking for finite systems. In this structure, the prefix is a finite trajectory and the suffix is a finite trajectory that is repeated infinitely often. This gives a sufficient condition that is amenable to computation, although it may miss valid non-periodic trajectories.

A walk is a finite sequence of states \(\mathbf {x} = x_0 x_1 x_2\ldots x_N\) that satisfy the constraints in (1). A cycle is a walk \(\mathbf {x} = x_0 x_1 x_2\ldots x_{N}\) where \(f(x_{N},u) = x_0\) for some \(u \in \mathcal {U}\). A trajectory \(\mathbf {x}\) induces a corresponding word (i.e., sequence of labels) \(L(\mathbf {x}) = L_0(x_0) L_1(x_1) L_2(x_2)\ldots \) through the labeling function. A word is similarly defined for a walk or cycle. We now define a trajectory in prefix-suffix form.

Definition 2

Let \(\mathbf {x}_{\text {pre}}\) be a finite walk and \(\mathbf {x}_{\text {suf}}\) be a finite cycle. A trajectory \(\mathbf {x}\) is in prefix-suffix form if it is of the form \(\mathbf {x} = \mathbf {x}_{\text {pre}}(\mathbf {x}_{\text {suf}})^{\omega }\), where \(\omega \) denotes infinite repetition.

We will require that the (time-varying) labeling function \(L_t\) is eventually periodic.

Assumption 1

There exists a finite \(t' \in \mathcal {T}^{\infty }\) and a \(\varOmega \in \mathbb {N}\) such that \(L_{t} = L_{t+\varOmega }\) for all \(t \ge t' \in \mathcal {T}^{\infty }\). We further assume that \(\varOmega \) is minimal among all possible values.

In the sequel, we will only consider trajectories \(\mathbf {x} = \mathbf {x}_{\text {pre}}(\mathbf {x}_{\text {suf}})^{\omega }\) in prefix-suffix form. While both \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\) are finite, the constraint that \(\mathbf {x}_{\text {suf}}\) is a cycle allows us to repeat that sequence of states forever. Repeating the same sequence of states is a sufficient condition that the word \(L(\mathbf {x}_{\text {suf}})\) (i.e., the sequence of atomic propositions) is also repeated (using Assumption 1). However, only the word matters for the feasibility of an LTL formula, not the exact sequence of states. In fact, there may exist other trajectories that produce the same word L(x), but are not eventually periodic. Our approach cannot find such trajectories, although we have not noticed this limitation in our experiments. This differs from the case of finite discrete systems, where a prefix-suffix form is sufficient to find a feasible solution if one exists [2].

In the next section, we will encode the temporal operators as mixed-integer constraints on \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\). Let \(\mathbf {x}_{\text {cat}}:= \mathbf {x}_{\text {pre}}\mathbf {x}_{\text {suf}}\) denote the concatenation of \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\), and assign time indices to \(\mathbf {x}_{\text {cat}}\) as \(\mathcal {T}_{\text {cat}} := \{0, 1,\ldots , T_s, \ldots , T \}\). Let \(\mathcal {T}_{\text {pre}} := \{0, 1,\ldots , T_s-1 \}\) and \(\mathcal {T}_{\text {suf}} := \{ T_s,\ldots , T \}\), where \(T_s\) is the first time instance on the suffix. The infinite repetition of \(\mathbf {x}_{\text {suf}}\) is enforced by the constraint \(\mathbf {x}_{\text {cat}}(T_s) = f(\mathbf {x}_{\text {cat}}(T), u)\) for some \(u \in \mathcal {U}\). By Assumption 1, it is sufficient that \(T_s\) is greater than \(t'\) and that the length of \(\mathcal {T}_{\text {suf}}\) is an integer multiple of \(\varOmega \). We often identify \(\mathbf {x}_{\text {pre}}(0) \cdots \mathbf {x}_{\text {pre}}(T_{\text {pre}})\) with \(\mathbf {x}_{\text {cat}}(0) \cdots \mathbf {x}_{\text {cat}}(T_s -1)\) and \(\mathbf {x}_{\text {suf}}(0) \cdots \mathbf {x}_{\text {suf}}(T_{\text {suf}})\) with \(\mathbf {x}_{\text {cat}}(T_s) \cdots \mathbf {x}_{\text {cat}}(T)\) in the obvious manner.

5 A Mixed-Integer Linear Formulation of LTL Constraints

In this section, we develop a mixed-integer programming formulation for a given prefix-suffix trajectory parameterization, \(\mathbf {x}_{\text {cat}}= \mathbf {x}_{\text {pre}}\mathbf {x}_{\text {suf}}\). The corresponding system trajectory is \(\mathbf {x} = \mathbf {x}_{\text {pre}}(\mathbf {x}_{\text {suf}})^{\omega }\). Since the system is deterministic, this defines a corresponding control input sequence. The split between \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\) can either be specified a priori or left as a variable (see [26] for details). We mix notation in the following and refer to x and \(\mathcal {T}\) instead of \(\mathbf {x}_{\text {cat}}\) and \(\mathcal {T}_{\text {cat}}\) when clear from context.

5.1 Relating the Dynamics and Propositions

We now relate the state of a system to the set of atomic propositions that are \( True \) at each time instance. We assume that each propositional formula \(\psi \) is described at time t by the union of a finite number of polytopes, indexed by the finite index set \(I^{\psi }_t\). Let represent the set of states that satisfy propositional formula \(\psi \) at time t. We assume that these have been constructed as necessary from the system’s original atomic propositions. We note that a proposition preserving partition [1] is not necessary or even desired.

For each propositional formula \(\psi \), introduce binary variables \(z^{\psi _i}_t \in \{ 0,1 \}\) for all \(i \in I^{\psi }_t\) and for all \(t \in \mathcal {T}\). Let \(x_t\) be the state of the system at time t and M be a vector of sufficiently large constants. The big-M formulation

$$\begin{aligned} H^{\psi _i}_t x_t&\le K^{\psi _i}_t + M ( 1 - z^{\psi _i}_t), \quad \forall i \in I^{\psi }_t \nonumber \\ \sum _{i \in I^{\psi }_t} z^{\psi _i}_t&= 1 \end{aligned}$$
(4)

enforces the constraint that at time t. Define \(P^{\psi }_t := \sum _{i \in I^{\psi }_t} z^{\psi _i}_t\). If \(P^{\psi }_t =~1\), then . If \(P^{\psi }_t = 0\), then nothing can be inferred.

The big-M formulation may give poor continuous relaxations of the binary variables, i.e., \(z^{\psi _i}_t \in \left[ 0, 1 \right] \), which may lead to poor performance during optimization [23]. Such relaxations are frequently used during the solution of mixed-integer linear programs [23]. Thus, we introduce an alternate representation whose continuous relaxation is the convex hull of the original set . This formulation is well-known in the optimization community [12], but does not appear in the trajectory generation literature ([8, 20, 24] and references therein). As such, this formulation may be of independent interest for trajectory planning with obstacles.

The convex hull formulation

$$\begin{aligned} H^{\psi _i}_t x^i_t&\le K^{\psi _i}_t z^{\psi _i}_t, \quad \forall i \in I^{\psi }_t \nonumber \\ \sum _{i \in I^{\psi }_t} z^{\psi _i}_t&= 1, \nonumber \\ \sum _{i \in I^{\psi }_t} x^i_t&= x_t \end{aligned}$$
(5)

represents the same set as the big-M formulation (4). While the convex hull formulation introduces more continuous variables, it gives the tightest linear relaxation of the disjunction of the polytopes and reduces the need to select the M parameters [12]. Note that we will only use the convex hull formulation (5) for safety and persistence formulas (i.e., \(\varphi _{\text {safe}}\) and \(\varphi _{\text {per}}\)) in Sect. 5.2, as \(P^{\psi }_t = 0\) enforces \(x = 0\).

Regardless if one uses the big-M or convex hull formulation, only one binary variable is needed for each polyhedron (i.e., finite conjunction of halfspaces). This compares favorably with the approach in [15], where a binary variable is introduced for each halfspace. Additionally, the auxiliary continuous variables and mixed-integer constraints previously used are not needed because we use implication. For simple tasks such as \(\varphi = \Diamond \psi \), our method can use significantly fewer binary variables than previously needed, depending on the number of halfspaces and polytopes needed to describe .

For every temporal operator described in the following section, the constraints in (4) or (5) should be understood to be implicitly applied to the corresponding propositional formulas so that \(P^{\psi }_t = 1\) implies that the system satisfies \(\psi \) at time t. Also, note that we use different binary variables for each formula—even when representing the same set.

5.2 The Mixed-Integer Linear Constraints

In this section, the trajectory parameterization \(\mathbf {x}\) has been a priori split into a prefix \(\mathbf {x}_{\text {pre}}\) and a suffix \(\mathbf {x}_{\text {suf}}\). This assumption can be relaxed, so that the size of \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\) are optimization variables (see [26] for details). We further assume that \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\) satisfy Assumption 1.

In the following, the correctness of the constraints applied to \(\mathbf {x}_{\text {pre}}\) and \(\mathbf {x}_{\text {suf}}\) comes directly from the temporal logic semantics given in Sect. 2.2 and the form of the trajectory \(\mathbf {x} = \mathbf {x}_{\text {pre}}(\mathbf {x}_{\text {suf}})^{\omega }\). The most important factors are whether a property can be verified over finite- or infinite-horizons. All infinite-horizon (liveness) properties must be satisfied on the suffix \(\mathbf {x}_{\text {suf}}\).

We begin with the fundamental temporal operators \(\varPhi _{\text {core}}\). Safety and persistence require a mixed-integer linear constraint for each time step, while guarantee and liveness only require a single mixed-integer linear constraint.

Safety, \(\varphi _{\text {safe}} = \square \psi \), is satisfied by the constraints

$$\begin{aligned} P^{\psi }_t = 1, \quad \forall t \in \mathcal {T}_{\text {pre}}, \\ P^{\psi }_t = 1, \quad \forall t \in \mathcal {T}_{\text {suf}}, \end{aligned}$$

which ensure that the system is always in a region. Similarly, persistence, \(\varphi _{\text {per}} = \Diamond \square \psi \), is enforced by

$$\begin{aligned} P^{\psi }_t = 1, \quad \forall t \in \mathcal {T}_{\text {suf}}, \end{aligned}$$

which ensures the system eventually remains in a region.

Guarantee, \(\varphi _{\text {goal}} = \Diamond \psi \), is satisfied by the constraints

$$\begin{aligned} \sum _{t \in \mathcal {T}_{\text {pre}}} P^{\psi }_t + \sum _{t \in \mathcal {T}_{\text {suf}}} P^{\psi }_t = 1, \end{aligned}$$

which ensures the system eventually visits a region. Similarly, liveness \(\varphi _{\text {live}} = \square \Diamond \psi \) is enforced by

$$\begin{aligned} \sum _{t \in \mathcal {T}_{\text {suf}}} P^{\psi }_t = 1, \end{aligned}$$

which ensures the system repeatedly visits a region.

Until, \(\varphi _{\text {until}} = \psi \mathcal {U}\phi \), is enforced by

$$\begin{aligned} P^{\phi }_0= & {} s_0, \\ P^{\phi }_t= & {} s_t - s_{t-1}, \quad t=1,\ldots ,T \\ P^{\psi }_t= & {} 1-s_t, \quad \forall t \in \mathcal {T}, \end{aligned}$$

where we use auxiliary binary variables \(s_t \in \{0,1\}\) for all \(t \in \mathcal {T}\) such that \(s_t \le s_{t+1}\) for \(t=0,\ldots ,T-1\) and \(s_T = 1\).

Now consider the response temporal operators \(\varPhi _{\text {resp}}\). For these formulas, the definition of implication is used to convert each inner formula into a disjunction between a property that holds at a state and a property that holds at some point in the future. The response formulas require a mixed-integer linear constraint for each time step.

For next-step response, \(\varphi _{\text {resp}}^1 = \square ( \psi \implies \bigcirc \phi ) = \square ( \lnot \psi \vee \bigcirc \phi )\), the additional constraints are

$$\begin{aligned} P^{\lnot \psi }_t + P^{\phi }_{t+1}&= 1, \quad t= 0,\ldots , T_s, \ldots , T-1, \\ P^{\lnot \psi }_T + P^{\phi }_{T_s}&= 1, \end{aligned}$$

Similarly, steady-state next-step response, \(\varphi _{\text {resp}}^3 = \Diamond \square ( \psi \implies \bigcirc \phi ) = \Diamond \square ( \lnot \psi \vee \bigcirc \phi )\), is satisfied by

$$\begin{aligned} P^{\lnot \psi }_t + P^{\phi }_{t+1}&= 1, \quad t= T_s,\ldots , T-1, \\ P^{\lnot \psi }_{T} + P^{\phi }_{T_s}&= 1, \end{aligned}$$

Eventual response, \(\varphi _{\text {resp}}^2 = \square ( \psi \implies \Diamond \phi ) = \square ( \lnot \psi \vee \Diamond \phi )\), requires the following constraints

$$\begin{aligned} P^{\lnot \psi }_t + \sum _{\tau =t}^{T} P^{\phi }_{\tau } = 1, \quad \forall t \in \mathcal {T}_{\text {pre}}, \\ P^{\lnot \psi }_t + \sum _{t \in \mathcal {T}_{\text {suf}}} P^{\phi }_t = 1, \quad \forall t \in \mathcal {T}_{\text {suf}}. \end{aligned}$$

Similarly, for steady-state eventual response, \(\varphi _{\text {resp}}^4 = \Diamond \square ( \psi \implies \Diamond \phi ) = \Diamond \square ( \lnot \psi \vee \Diamond \phi )\), the additional constraints are

$$\begin{aligned} P^{\lnot \psi }_t + \sum _{t \in \mathcal {T}_{\text {suf}}} P^{\phi }_t = 1, \quad \forall t \in \mathcal {T}_{\text {suf}}. \end{aligned}$$

Now consider the fairness temporal operators \(\varPhi _{\text {fair}}\). In the following, the definition of implication is used to rewrite the inner formula as disjunction between a single safety (persistence) property and a conjunction of guarantee (liveness) properties. These formulas require a mixed-integer linear constraint for each conjunction in the response and each time step.

Conditional goal visitation, \(\varphi _{\text {fair}}^1 = \Diamond \psi \implies \bigwedge _{j=1}^m \Diamond \phi _j \;= \square \lnot \psi \vee \bigwedge _{j=1}^m \Diamond \phi _j\), is specified by

$$\begin{aligned} P^{\lnot \psi }_t + \sum _{t \in \mathcal {T}} P^{\phi _j}_t = 1, \quad \forall j=1,\ldots ,m, \forall t \in \mathcal {T}. \end{aligned}$$

Conditional repeated goal visitation, \(\varphi _{\text {fair}}^2 = \Diamond \psi \implies \bigwedge _{j=1}^m \square \Diamond \phi _j = \square \lnot \psi \vee \bigwedge _{j=1}^m \square \Diamond \phi _j\), is enforced as

$$\begin{aligned} P^{\lnot \psi }_t + \sum _{t \in \mathcal {T}_{\text {suf}}} P^{\phi _j}_t = 1, \quad \forall j=1,\ldots ,m, \forall t \in \mathcal {T}. \end{aligned}$$

Similarly, \(\varphi _{\text {fair}}^3 = \square \Diamond \psi \!\implies \! \bigwedge _{j=1}^m \square \Diamond \phi _j = \Diamond \square \lnot \psi \vee \bigwedge _{j=1}^m \square \Diamond \phi _j\), is represented by

$$\begin{aligned} P^{\lnot \psi }_t + \sum _{t \in \mathcal {T}_{\text {suf}}} P^{\phi _j}_t = 1, \quad \forall j=1,\ldots ,m, \; \forall t \in \mathcal {T}_{\text {suf}}. \end{aligned}$$

We have encoded the temporal logic specifications on the system variables using mixed-integer linear constraints. Note that the equality constraints on the binary variables dramatically reduce search space. In Sect. 6 we discuss adding dynamics to further constrain the possible behaviors of the system.

6 System Dynamics

The mixed-integer constraints in Sect. 5 are over a sequence of states, and thus are independent of the specific system dynamics. Dynamic constraints on the sequence of states can also be enforced by standard transcription methods [5]. However, the resulting optimization problem may then be a mixed-integer nonlinear program due to the dynamics. We highlight two useful classes of nonlinear systems where the dynamics can be encoded using mixed-integer linear constraints.

6.1 Mixed Logical Dynamical Systems

Mixed Logical Dynamical (MLD) systems have both continuous and discrete-valued states and allow one to model nonlinearities, logic, and constraints [4]. These systems include constrained linear systems, linear hybrid automata, and piecewise affine systems. An MLD system is of the form

$$\begin{aligned}&x(t+1) = A x(t) + B_{1} u(t) + B_{2} \delta (t) + B_{3} z(t) \nonumber \\ \text { subject to }\;\;&E_{2} \delta (t) + E_{3} z(t) \le E_{1} u(t) + E_{4} x(t) + E_{5}, \end{aligned}$$

where \(t \in \mathcal {T}^{\infty }\), \(x \in \mathcal {X}\subseteq {\mathbb R}^{n_c} \times \{0,1\}^{n_l}\) are the continuous and binary states, \(u \in \mathcal {U}\subseteq {\mathbb R}^{m_c} \times \{0,1\}^{m_l}\) are the inputs, and \(\delta \in \{0,1\}^{r_l}\) and \(z \in {\mathbb R}^{r_l}\) are auxiliary binary and continuous variables, respectively. The terms A, \(B_1\), \(B_2\), \(B_3\), \(E_1\), \(E_2\), \(E_3\), \(E_4\), and \(E_5\) are system matrices of appropriate dimension. We assume that the system is deterministic and well-posed (see Definition 1 in [4]).

6.2 Differentially Flat Systems

A system is differentially flat if there exists a set of outputs such that all states and control inputs can be determined from these outputs without integration. If a system has states \(x \in {\mathbb R}^n\) and control inputs \(u \in {\mathbb R}^m\), then it is flat if we can find outputs \(y \in {\mathbb R}^m\) of the form \(y = y(x,u,\dot{u},\ldots , u^{(p)})\) such that \(x = x(y,\dot{y},\ldots , y^{(q)})\) and \(u = u(y,\dot{y},\ldots , y^{(q)})\). Thus, we can plan trajectories in output space and then map these to control inputs [17].

Differentially flat systems may be encoded using mixed integer linear constraints in certain cases, e.g., the flat output is constrained by mixed integer linear constraints. This holds for relevant classes of robotic systems, including quadrotors and car-like robots. However, control input constraints are typically non-convex in the flat output. Common approaches to satisfy control constraints are to plan a sufficiently smooth trajectory or slow down along a trajectory [19].

7 Examples

We demonstrate our techniques on a variety of motion planning problems. The first example is a chain of integrators parameterized by dimension. Our second example is a quadrotor model from [25]. Our final example is a nonlinear car-like vehicle with drift. All computations were done on a laptop with a 2.4 GHz dual-core processor and 4 GB of memory using CPLEX [23] through Yalmip [18].

The environment and task is motivated by a pickup and delivery scenario. All properties should be understood to be with respect to regions in the plane (see Fig. 1). Let P be a region where supplies can be picked up and D1 and D2 be regions where supplies must be delivered. The robot must remain in the safe region S (in white). Formally, the task specification is \(\varphi = \square S \wedge \square \Diamond P \wedge \square \Diamond D1 \wedge \square \Diamond D2\). Additionally, we minimize the maximum cost function (3) where \(c( x_t, u_t) = |u_t|\) penalizes the control input.

In the remainder of this section, we consider this temporal logic motion planning problem for different system models. We use the simultaneous (sim.) approach described in Sect. 5.2, and also a sequential (seq.) approach from [26] that first computes the suffix and then the prefix. A trajectory of length 60 (split evenly between the prefix and suffix) is used in all cases, and all results are averaged over 20 randomly generated environments. The simultaneous approach uses between 300 and 469 binary variables with a mean of 394. Finally, all continuous-time models are discretized using a first-order hold and time-step of 0.5 s.

7.1 Chain of Integrators

The first system is a chain of orthogonal integrators in the x and y directions. The kth derivative of the x and y positions are controlled, i.e., \(x^{(k)} = u_x\) and \(y^{(k)} = u_y\), subject to the constraints \(|u_x| \le 0.5\) and \(|u_y| \le 0.5\). The general state constraints are \(|x^{(i)}| \le 1\) and \(|y^{(i)}| \le 1\) for \(i=1,\ldots , k-1\). Results are given in Tables 1 and 2 under “chain-2”, “chain-6”, and “chain-10”, where “chain-k” indicates that the kth derivative in both the x and y positions is controlled.

Table 1 Time until a feasible solution was found (mean ± standard error) and number of problems (out of 20) solved in 45 s using the big-M formulation (4) with M \(=\) 10
Table 2 Time until a feasible solution was found (mean ± standard error) and number of problems (out of 20) solved in 45 s using the convex hull formulation (5)

7.2 Quadrotor

We now consider the quadrotor model used in [25] for point-to-point motion planning, to which we refer the reader for a complete description of the model. The state \(x = (p, v, r, w)\) is 10-dimensional, consisting of position \(p \in {\mathbb R}^3\), velocity \(v \in {\mathbb R}^3\), orientation \(r \in {\mathbb R}^2\), and angular velocity \(w \in {\mathbb R}^2\). This model is the linearization of a nonlinear model about hover with the yaw constrained to be zero. The control input \(u \in {\mathbb R}^3\) is the total, roll, and pitch thrust. Results are given in Tables 1 and 2 under “quadrotor”, and a sample trajectory is shown in Fig. 1.

Also, we use the fact that the quadrotor is differentially flat [19] to generate trajectories for the nonlinear model (with fixed yaw). We parameterize the flat output \(p \in {\mathbb R}^3\) with eight piecewise polynomials of degree three, and then optimize over their coefficients to compute a smooth trajectory. Afterwards, we check that the trajectory does not violate the control input constraints. Results are given in Table 1 under “quadrotor-flat”.

7.3 Nonlinear Car

Consider a nonlinear car-like vehicle with state \(x = (p_x, p_y, \theta )\) and dynamics \(\dot{x} = ( v \cos (\theta ), v \sin (\theta ), u )\). The variables \(p_x,p_y\) are position (m) and \(\theta \) is orientation (rad). The vehicle’s speed v is fixed at 0.8 (m/s) and its control input is constrained as \(|u| \le 2.5\). We form a hybrid MLD model by linearizing the system about different orientations \(\hat{\theta }_i\) for \(i = 1,\ldots , k\). The dynamics are governed by the closest linearization to the current \(\theta \). Results with \(k=3\) and \(k=4\) are given in Table 1 under “car-3” and “car-4”, respectively. A sample trajectory of “car-4” is show in Fig. 1.

Additionally, we use the flat output \((x,y) \in {\mathbb R}^2\) to generate trajectories for the nonlinear car-like model in a similar manner as for the quadrotor model. Results are given in Table 1 under “car-flat”.

7.4 Discussion and Comparison

We first compare our approach to reachability-based algorithms that compute a finite abstraction [16, 28]. We used the method in [28] to compute a discrete abstraction for a two dimensional system in 22 s, and [16] reports abstracting a four dimensional system in just over a minute. This contrasts with our mixed-integer approach that can routinely find solutions to such problems in seconds, although we do not compute a feedback controller. Our results appear particularly promising for situations where the environment is dynamically changing and a finite abstraction must be repeatedly computed.

We also compare to the finite-horizon mixed-integer formulation given in [15]. Consider the task \(\varphi = \Diamond \psi \), where is a convex polytope defined by m halfspaces. Our method uses one binary variable at each time step, while their approach uses m. Additionally, while we encode eventually (\(\Diamond \)) using a single constraint, their approach uses a number of constraints quadratic in the trajectory length.

In most of our examples, we are able to quickly compute a feasible trajectory that satisfies a temporal logic formula by solving a mixed-integer linear program. This is aided by the sequential approach, which separates the problem into computing a suffix and then a prefix [26]. It typically takes a long time to compute a trajectory that is provably globally optimal, although this does happen in finite time.

Finally, the convex hull formulation performed poorly in our examples. There is an empirical tradeoff between having tighter continuous relaxations and the number of continuous variables in the formulation. We hypothesize that the convex hull formulation will be most useful in cases when (1) the number of binary variables is large, or (2) the cost function is minimized near the boundary of the region.

8 Conclusion

We presented a novel mixed-integer programming-based method for control of nonlinear systems with a useful fragment of LTL that allows both finite- and infinite-horizon properties to be specified. Our method is efficient in the number of binary variables used to model the an LTL formula. Additionally, we showed the computational effectiveness of our approach on temporal logic motion planning examples.

Future work will consider reactive environments by including both continuous and discrete disturbances using a receding horizon control approach. Additionally, we will expand the space of tasks that can be specified by including additional temporal operators and timing constraints.