
1 Introduction

Probabilistic programs are used to describe randomized algorithms and probability distributions, with applications in many areas. As an example, consider the well-known program which models the race between a tortoise and a hare (see, e.g., [10, 24, 30]). As long as the tortoise (variable t) is not behind the hare (variable h), the tortoise takes one step in each iteration. With probability \(\tfrac{1}{2}\), the hare stays at its position and with probability \(\tfrac{1}{2}\) it makes a number of steps chosen uniformly between 0 and 10. The race ends when the hare is in front of the tortoise. Here, the hare wins with probability one and the technique of [30] infers the upper bound \(\tfrac{2}{3}\cdot \max (t-h+9,0)\) on the expected number of loop iterations. Thus, the program is positively almost surely terminating.

Section 2 recapitulates preliminaries on probabilistic programs and on the connection between their expected runtime and their corresponding recurrence equation. Then we show in Sects. 3 and 4 that classical results on random walk theory directly yield a very simple decision procedure for (positive) almost sure termination of CP programs like the tortoise and hare example. In this way, we also obtain asymptotically tight bounds on the expected runtime of any CP program. Based on these bounds, in Sect. 5 we develop the first algorithm to compute closed forms for the exact expected runtime of such programs. In Sect. 6, we present its implementation in our tool KoAT [9] and discuss related and future work. We refer to [20] for a collection of examples to illustrate the application of our algorithm and for all proofs.

[Program: the race between the tortoise and the hare]

2 Expected Runtimes of Probabilistic Programs

Example 1

(Tortoise and Hare). The program \(\mathcal {P}_{race}\) on the right formulates the race of the tortoise and the hare as a CP program. In the loop guard, we use the scalar product \((1, -1)\bullet (t, h)\) which stands for \(t - h\). Exactly one of the instructions with numbers in brackets \([\ldots ]\) is executed in each loop iteration and the number indicates the probability that the corresponding instruction is chosen.

[Program \(\mathcal {P}_{race}\) from Example 1]

We now define the kind of probabilistic programs considered in this paper.

[General form of a probabilistic program, cf. Definition 2]

Definition 2

(Probabilistic Program). A program has the form on the right, where \(\varvec{x}=(x_1,\ldots ,x_r)\) for some \(r \ge 1\) is a tuple of pairwise different program variables, \(\varvec{a},\varvec{c}_1,\ldots ,\varvec{c}_{n}\in \mathbb {Z}^{r}\) are tuples of integers, the \(\varvec{c}_j\) are pairwise distinct, \(b\!\in \!\mathbb {Z}\), \(\bullet \) is the scalar product (i.e., \((a_1,\ldots ,a_{r}) \bullet (x_1,\ldots ,x_{r}) = a_1 \cdot x_1 + \ldots + a_{r} \cdot x_{r}\)), and \(\varvec{d} \in \mathbb {Z}^{r}\) with \(\varvec{a} \bullet \varvec{d} \le b\). We require \(p_{\varvec{c}_1}(\varvec{x}), \ldots , p_{\varvec{c}_{n}}(\varvec{x}), p^{\prime }(\varvec{x}) \in \mathbb {R}_{\ge 0}= \big \{r \in \mathbb {R}\mid r \ge 0\big \}\) and \(\sum \nolimits _{1 \le j \le n} p_{\varvec{c}_j}(\varvec{x}) + p^{\prime }(\varvec{x}) = 1\) for all \(\varvec{x} \in \mathbb {Z}^{r}\). It is a program with direct termination if there is an \(\varvec{x} \in \mathbb {Z}^{r}\) with \(\varvec{a} \bullet \varvec{x} > b\) and \(p'(\varvec{x}) > 0\). If all probabilities are constant, i.e., if there are \(p_{\varvec{c}_1}, \ldots ,p_{\varvec{c}_n}, p' \in \mathbb {R}_{\ge 0}\) such that \(p_{\varvec{c}_j}(\varvec{x}) = p_{\varvec{c}_j}\) and \(p'(\varvec{x}) =p'\) for all \(1 \le j \le n\) and all \(\varvec{x} \in \mathbb {Z}^{r}\), we call it a constant probability (CP) program.

Such a program means that in each loop iteration, the integer variables \(\varvec{x}\) are changed to \(\varvec{x}+\varvec{c}_j\) with probability \(p_{\varvec{c}_j}(\varvec{x})\), and with probability \(p'(\varvec{x})\) they are set to \(\varvec{d}\) (which leaves the loop, since \(\varvec{a} \bullet \varvec{d} \le b\)). For inputs \(\varvec{x}\) with \(\varvec{a} \bullet \varvec{x} \le b\) the program terminates immediately. Note that the program in Example 1 has no direct termination (i.e., \(p'(\varvec{x}) = 0\) for all \(\varvec{x} \in \mathbb {Z}^{r}\)). Since the values of the program variables only depend on their values in the previous loop iteration, our programs correspond to Markov chains [32] and they are related to random walks [16, 21, 33], cf. [20] for details.
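To make these semantics concrete, the following minimal Python sketch (our own encoding, not part of the paper's tool) simulates a single run of a CP program as in Definition 2; the instance at the end encodes \(\mathcal {P}_{race}\) from Example 1.

import random

def simulate_cp(a, b, updates, d, p_direct, x, max_steps=10**6):
    # One run of a CP program (Definition 2): while a . x > b, apply x := x + c_j with
    # probability p_{c_j}, or x := d with probability p'. Returns the number of iterations.
    moves = [c for c, _ in updates] + ["direct"]
    probs = [p for _, p in updates] + [p_direct]
    steps = 0
    while sum(ai * xi for ai, xi in zip(a, x)) > b and steps < max_steps:
        move = random.choices(moves, weights=probs)[0]
        x = list(d) if move == "direct" else [xi + ci for xi, ci in zip(x, move)]
        steps += 1
    return steps

# P_race from Example 1: guard (1,-1) . (t,h) > -1; the hare stays with probability 6/11
# or jumps by j in {1,...,10} with probability 1/22 each; p' = 0 (no direct termination).
race_updates = [((1, 0), 6/11)] + [((1, j), 1/22) for j in range(1, 11)]
print(simulate_cp(a=(1, -1), b=-1, updates=race_updates, d=(0, 1), p_direct=0.0, x=[5, 1]))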

Clearly, in general termination is undecidable and closed forms for the runtimes of programs are not computable. Thus, decidability results can only be obtained for suitably restricted forms of programs. Our class nevertheless includes many examples that are often regarded in the literature on probabilistic programs. So while other approaches are concerned with incomplete techniques to analyze termination and complexity, we investigate classes of probabilistic programs where one can decide the termination behavior, always find complexity bounds, and even compute the expected runtime exactly. Our decision procedure could be integrated into general tools for termination and complexity analysis of probabilistic programs: As soon as one has to investigate a sub-program that falls into our class, one can use the decision procedure to compute its exact runtime. Our contributions provide a starting point for such results and the considered class of programs can be extended further in future work.

In probability theory (see, e.g., [2]), given a set \(\varOmega \) of possible events, the goal is to measure the probability that events are in certain subsets of \(\varOmega \). To this end, one regards a set \(\mathfrak {F}\) of subsets of \(\varOmega \), such that \(\mathfrak {F}\) contains the full set \(\varOmega \) and is closed under complement and countable unions. Such a set \(\mathfrak {F}\) is called a \(\sigma \)-field, and a pair of \(\varOmega \) and a corresponding \(\sigma \)-field \(\mathfrak {F}\) is called a measurable space.

A probability space \((\varOmega , \mathfrak {F}, \mathbb {P})\) extends a measurable space \((\varOmega , \mathfrak {F})\) by a probability measure \(\mathbb {P}\) which maps every set from \(\mathfrak {F}\) to a number between 0 and 1, where \(\mathbb {P}(\varOmega ) = 1\), \(\mathbb {P}(\varnothing ) = 0\), and \(\mathbb {P}(\biguplus \nolimits _{j \ge 0} A_j) = \sum \nolimits _{j \ge 0} \mathbb {P}(A_j)\) for any pairwise disjoint sets \(A_0, A_1, \ldots \in \mathfrak {F}\). So \(\mathbb {P}(A)\) is the probability that an event from \(\varOmega \) is in the subset A. In our setting, we use the probability space \(((\mathbb {Z}^{r})^\omega , \mathfrak {F}^{\mathbb {Z}^{r}}, \mathbb {P}_{\varvec{x}_0}^\mathcal {P})\) arising from the standard cylinder-set construction of MDP theory, cf. [20]. Here, \((\mathbb {Z}^{r})^\omega \) corresponds to all infinite sequences of program states and \(\mathbb {P}_{\varvec{x}_0}^\mathcal {P}\) is the probability measure induced by the program \(\mathcal {P}\) when starting in the state \(\varvec{x}_0\in \mathbb {Z}^{r}\). For example, if \(A \subseteq (\mathbb {Z}^{2})^\omega \) consists of all infinite sequences starting with \((5, 1)\), \((6, 1)\), \((7, 6)\), then \(\mathbb {P}_{(5, 1)}^{\mathcal {P}_{race}}(A) = \tfrac{6}{11} \, \cdot \, \tfrac{1}{22} = \tfrac{3}{121}\). So, if one starts with \((5, 1)\), then \(\tfrac{3}{121}\) is the probability that the next two states are \((6, 1)\) and \((7, 6)\). Once a state is reached that violates the loop guard, then the probability to remain in this state is 1. Hence, if B contains all infinite sequences starting with \((7, 8)\), \((7, 8)\), then \(\mathbb {P}_{(7, 8)}^{\mathcal {P}_{race}}(B) = 1\). In the following, for any set of numbers M let \(\overline{M} = M \cup \{ \infty \}\).

Definition 3

(Termination Time). For a program \(\mathcal {P}\) as in Definition 2, its termination time is the random variable \(T^{\mathcal {P}}:(\mathbb {Z}^{r})^\omega \rightarrow \overline{\mathbb {N}}\) that maps every infinite sequence \(\langle \varvec{z}_0, \varvec{z}_1, \ldots \rangle \) to the first index j where \(\varvec{z}_j\) violates \(\mathcal {P}\)’s loop guard (and to \(\infty \) if no such index exists).

Thus, \(T^{\mathcal {P}_{race}}(\langle (5, 1),(6, 1),(7, 8),(7, 8), \ldots \rangle ) = 2\) and \(T^{\mathcal {P}_{race}}(\langle (5, 1),(6, 1),(7, 6), (8, 6), (9, 6),\ldots \rangle ) = \infty \) (i.e., this sequence always satisfies \(\mathcal {P}_{race}\)’s loop guard as the jth entry is \((5+j, 6)\) for \(j\ge 2\)). Now we can define the different notions of termination and the expected runtime of a probabilistic program. As usual, for any random variable X on a probability space \((\varOmega , \mathfrak {F}, \mathbb {P})\), \(\mathbb {P}(X = j)\) stands for \(\mathbb {P}(X^{-1}(\{j\}))\). So \(\mathbb {P}_{\varvec{x}_0}^\mathcal {P}(T^{\mathcal {P}} = j)\) is the probability that a sequence has termination time j. Similarly, \(\mathbb {P}_{\varvec{x}_0}^\mathcal {P}(T^{\mathcal {P}} < \infty ) = \sum \nolimits _{j \in \mathbb {N}} \mathbb {P}_{\varvec{x}_0}^\mathcal {P}(T^{\mathcal {P}} = j)\). The expected value \(\mathbb {E}(X)\) of a random variable \(X: \varOmega \rightarrow \overline{\mathbb {N}}\) for a probability space \((\varOmega , \mathfrak {F}, \mathbb {P})\) is the weighted average under the probability measure \(\mathbb {P}\), i.e., \(\mathbb {E}(X) = \sum \nolimits _{j \in \overline{\mathbb {N}}} \; j \cdot \mathbb {P}(X = j)\), where \(\infty \cdot 0 = 0\) and \(\infty \cdot u = \infty \) for all \(u \in \mathbb {N}_{>0}\).

Definition 4

(Termination and Expected Runtime). A program \(\mathcal {P}\) as in Definition 2 is almost surely terminating (AST) if \(\mathbb {P}^{\mathcal {P}}_{\varvec{x}_0}(T^{\mathcal {P}}\!\!<\infty )=1\) for any initial value \(\varvec{x}_0 \in \mathbb {Z}^{r}\). For any \(\varvec{x}_0 \in \mathbb {Z}^{r}\), its expected runtime \(rt_{\varvec{x}_0}^\mathcal {P}\) (i.e., the expected number of loop iterations) is defined as the expected value of the random variable \(T^{\mathcal {P}}\) under the probability measure \(\mathbb {P}^{\mathcal {P}}_{\varvec{x}_0}\), i.e., \(rt_{\varvec{x}_0}^\mathcal {P}= \mathbb {E}^\mathcal {P}_{\varvec{x}_0}\left( T^{\mathcal {P}}\right) = \sum \nolimits _{j \in \mathbb {N}} \;j \cdot \mathbb {P}_{\varvec{x}_0}^\mathcal {P}(T^{\mathcal {P}}\!\!=\!j)\) if \(\mathbb {P}_{\varvec{x}_0}^\mathcal {P}(T^{\mathcal {P}}\!\!<\!\infty )=1\), and \(rt_{\varvec{x}_0}^\mathcal {P}\!= \mathbb {E}^\mathcal {P}_{\varvec{x}_0}\left( T^{\mathcal {P}}\right) =\infty \) otherwise.

The program \(\mathcal {P}\) is positively almost surely terminating (PAST) if for any initial value \(\varvec{x}_0 \in \mathbb {Z}^{r}\), the expected runtime of \(\mathcal {P}\) is finite, i.e., if \(rt_{\varvec{x}_0}^\mathcal {P}= \mathbb {E}^\mathcal {P}_{\varvec{x}_0}\left( T^{\mathcal {P}}\right) < \infty \).

Example 5

(Expected Runtime for \(\mathcal {P}_{race}\)). By the observations in Sect. 4 we will infer that \(\tfrac{2}{3}\cdot (t-h+1) \le rt_{(t,h)}^{\mathcal {P}_{race}} \le \tfrac{2}{3}\cdot (t-h+1) + \tfrac{16}{3}\) holds whenever \(t-h > -1\), cf. Example 22. So the expected number of steps until termination is finite (and linear in the input variables) and thus, \(\mathcal {P}_{race}\) is PAST. The algorithm in Sect. 5 will even be able to compute \(rt_{(t,h)}^{\mathcal {P}_{race}}\) exactly, cf. Example 34.
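As a quick sanity check of these bounds, here is a small Monte Carlo sketch (ours; the function race_iterations is hypothetical) that estimates the expected runtime of \(\mathcal {P}_{race}\) for \((t,h)=(5,1)\) and compares it with the two bounds.

import random

def race_iterations(t, h):
    # one run of P_race (Example 1); returns the number of loop iterations
    n = 0
    while t - h > -1:
        t += 1
        if random.random() >= 6/11:        # with probability 10/22 the hare jumps ...
            h += random.randint(1, 10)     # ... by 1,...,10 steps, each with probability 1/22
        n += 1
    return n

runs = 100_000
estimate = sum(race_iterations(5, 1) for _ in range(runs)) / runs
print(2/3 * 5, estimate, 2/3 * 5 + 16/3)   # the estimate should lie between the two bounds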

If the initial values \(\varvec{x}_0\) violate the loop guard, then the runtime is trivially 0.

Corollary 6

(Expected Runtime for Violating Initial Values). For any program \(\mathcal {P}\) as in Definition 2 and any \(\varvec{x}_0 \in \mathbb {Z}^{r}\) with \(\varvec{a} \bullet \varvec{x}_0 \le b\), we have \(rt^\mathcal {P}_{\varvec{x}_0} = 0\).

To obtain our results, we use an alternative, well-known characterization of the expected runtime, cf. e.g., [3, 8, 15, 24,25,26,27, 32, 34]. To this end, we search for the smallest (or “least”) solution of the recurrence equation that describes the runtime of the program as 1 plus the sum of the runtimes in the next loop iteration, multiplied by the corresponding probabilities. Here, functions are compared pointwise, i.e., for \(f,g: \mathbb {Z}^{r} \rightarrow \overline{\mathbb {R}_{\ge 0}}\) we have \(f \le g\) if \(f(\varvec{x}) \le g(\varvec{x})\) holds for all \(\varvec{x} \in \mathbb {Z}^{r}\). So we search for the smallest function \(f:\mathbb {Z}^{r} \rightarrow \overline{\mathbb {R}_{\ge 0}}\) that satisfies

$$\begin{aligned} \!\!f(\varvec{x}) = \sum \nolimits _{1 \le j \le n} p_{\varvec{c}_j}(\varvec{x}) \cdot f(\varvec{x}+\varvec{c}_j)+p'(\varvec{x}) \cdot f(\varvec{d})+1 \;\;\; \text {for all }\varvec{x} \text { with }\varvec{a} \bullet \varvec{x} > b. \; \end{aligned}$$
(1)

Equivalently, we can search for the least fixpoint of the “expected runtime transformer” \(\mathcal {L}^\mathcal {P}\) which transforms the left-hand side of (1) into its right-hand side.

Definition 7

(\(\mathcal {L}^\mathcal {P}\), cf. [32]). For \(\mathcal {P}\) as in Definition 2, we define the expected runtime transformer \(\mathcal {L}^\mathcal {P}\!: (\mathbb {Z}^{r}\!\rightarrow \overline{\mathbb {R}_{\ge 0}}) \rightarrow (\mathbb {Z}^{r}\!\rightarrow \overline{\mathbb {R}_{\ge 0}})\), where for any \(f\!: \mathbb {Z}^{r}\!\rightarrow \overline{\mathbb {R}_{\ge 0}}\):

$$\mathcal {L}^\mathcal {P}(f)(\varvec{x})={\left\{ \begin{array}{ll}\sum \nolimits _{1 \le j \le n} p_{\varvec{c}_j}(\varvec{x}) \cdot f(\varvec{x}+\varvec{c}_j)+p^{\prime }(\varvec{x})\cdot f(\varvec{d}) + 1,&{}\text {if } \varvec{a} \bullet \varvec{x} > b\\ f(\varvec{x}),&{}\text {if } \varvec{a} \bullet \varvec{x} \le b \end{array}\right. } $$

Example 8

(Expected Runtime Transformer for \(\mathcal {P}_{race}\)). For \(\mathcal {P}_{race}\) from Example 1, \(\mathcal {L}^{\mathcal {P}_{race}}\) maps any function \(f: \mathbb {Z}^{2} \rightarrow \overline{\mathbb {R}_{\ge 0}}\) to \(\mathcal {L}^{\mathcal {P}_{race}}(f)\), where \(\mathcal {L}^{\mathcal {P}_{race}}(f)(t,h) =\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \tfrac{6}{11} \cdot f(t+1, h) + \tfrac{1}{22} \cdot \sum \nolimits _{1 \le j \le 10} f(t+1, h+j) + 1, &{}\text {if }t-h > -1\\ f(t,h), &{}\text {if }t-h \le -1 \end{array}\right. } \end{aligned}$$
(2)

Theorem 9 recapitulates that the least fixpoint of \(\mathcal {L}^\mathcal {P}\) indeed yields an equivalent characterization of the expected runtime. In the following, let \(\varvec{0}\) denote the function with \(\varvec{0}(\varvec{x}) = 0\) for all \(\varvec{x} \in \mathbb {Z}^{r}\).

Theorem 9

(Connection Between Expected Runtime and Least Fixpoint of \(\mathcal {L}^\mathcal {P}\), cf. [32]). For any \(\mathcal {P}\) as in Definition 2, the expected runtime transformer \(\mathcal {L}^\mathcal {P}\) is continuous. Thus, it has a least fixpoint \(\mathrm {lfp}(\mathcal {L}^\mathcal {P}): \mathbb {Z}^{r} \rightarrow \overline{\mathbb {R}_{\ge 0}}\) with \(\mathrm {lfp}(\mathcal {L}^\mathcal {P}) = \sup \nolimits _{n \in \mathbb {N}} \, (\mathcal {L}^\mathcal {P})^{n}(\varvec{0})\). Moreover, the least fixpoint of \(\mathcal {L}^\mathcal {P}\) is the expected runtime of \(\mathcal {P}\), i.e., for any \(\varvec{x}_0 \in \mathbb {Z}^{r}\), we have \(\mathrm {lfp}(\mathcal {L}^\mathcal {P})(\varvec{x}_0)=rt^\mathcal {P}_{\varvec{x}_0}\).

So the expected runtime \(rt^{\mathcal {P}_{race}}_{(t,h)}\) can also be characterized as the smallest function \(f\!: \mathbb {Z}^{2}\!\rightarrow \overline{\mathbb {R}_{\ge 0}}\) satisfying \(f(t,h)\!=\!(2)\), i.e., as the least fixpoint of \(\mathcal {L}^{\mathcal {P}_{race}}\).
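To see Theorem 9 in action, the following Python sketch (ours, not the paper's tool) approximates \(\mathrm {lfp}(\mathcal {L}^{\mathcal {P}_{race}})\) by Kleene iteration \((\mathcal {L}^{\mathcal {P}_{race}})^n(\varvec{0})\). Since the right-hand side of (2) only refers to states whose difference t-h is shifted by a constant, every iterate depends only on d = t-h, so the sketch tracks one value per difference; the state space is truncated at a bound D (our choice), so the printed value is only an approximation.

# Kleene iteration (L^P)^n(0) for P_race on the differences d = t - h, truncated at d = D
D = 200
f = [0.0] * (D + 11)                 # f[i] holds the value for difference d = i - 10, i.e. d in [-10, D]
def idx(d): return min(d, D) + 10    # lookups beyond D are clamped (source of the approximation error)
for _ in range(5000):                # number of Kleene iterations (our choice)
    g = f[:]
    for d in range(0, D + 1):        # guard t - h > -1, i.e. d >= 0
        g[idx(d)] = 6/11 * f[idx(d + 1)] + sum(f[idx(d + 1 - j)] for j in range(1, 11)) / 22 + 1
    f = g
print(f[idx(4)])                     # approximates rt for (t,h) = (5,1); Example 5 predicts a value in [10/3, 26/3]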

3 Expected Runtime of Programs with Direct Termination

We start by stating a decidability result for the case where for all \(\varvec{x}\) with \(\varvec{a}\,\bullet \,\varvec{x} > b\), the probability \(p'(\varvec{x})\) for direct termination is at least \(p'\) for some \(p' > 0\). Intuitively, these programs have a termination time whose distribution is closely related to the geometric distribution with parameter \(p'\) (which has expected value \(\tfrac{1}{p'}\)). By using the alternative characterization of \(rt^\mathcal {P}_{\varvec{x}_0}\) from Theorem 9, one obtains that such programs are always PAST and their expected runtime is indeed bounded by the constant \(\tfrac{1}{p'}\). This result will be used in Sect. 5 when computing the exact expected runtime of such programs. The more involved case where \(p'(\varvec{x}) = 0\) is considered in Sect. 4.

Theorem 10

(PAST and Expected Runtime for Programs With Direct Termination). Let \(\mathcal {P}\) be a program as in Definition 2 where there is a \(p' > 0\) such that \(p'(\varvec{x}) \ge p'\) for all \(\varvec{x} \in \mathbb {Z}^{r}\) with \(\varvec{a}\bullet \varvec{x} > b\). Then \(\mathcal {P}\) is PAST and its expected runtime is at most \(\tfrac{1}{p'}\), i.e., \(rt^\mathcal {P}_{\varvec{x}_0} \le \tfrac{1}{p'}\) if \(\varvec{a}\bullet \varvec{x}_0 > b\), and \(rt^\mathcal {P}_{\varvec{x}_0} = 0\) if \(\varvec{a}\bullet \varvec{x}_0 \le b\).

Example 11

(Example 1 with Direct Termination). Consider the variant \(\mathcal {P}_{direct}\) of \(\mathcal {P}_{race}\) on the right, where in each iteration, with probability \(\tfrac{9}{10}\) the hare does nothing and with probability \(\tfrac{1}{10}\) one directly reaches a configuration where the hare is ahead of the tortoise. By Theorem 10 the program is PAST and its expected runtime is at most \(\tfrac{1}{\frac{1}{10}}=10\), i.e., independent of the initial state it takes at most 10 loop iterations on average. In Sect. 5 it will turn out that 10 is indeed the exact expected runtime, cf. Example 32.

[Program \(\mathcal {P}_{direct}\) from Example 11]

4 Expected Runtimes of Constant Probability Programs

Now we present a very simple decision procedure for termination of CP programs (Sect. 4.2) and show how to infer their asymptotic expected runtimes (Sect. 4.3). This will be needed for the computation of exact expected runtimes in Sect. 5.

4.1 Reduction to Random Walk Programs

[General form of a random walk program, cf. Definition 12]

As a first step, we show that we can restrict ourselves to random walk programs, i.e., programs with a single program variable x and the loop condition \(x > 0\).

Definition 12

(Random Walk Program). A CP program \(\mathcal {P}\) is called a random walk program if there exist \(m,k \in \mathbb {N}\) and \(d \in \mathbb {Z}\) with \(d \le 0\) such that \(\mathcal {P}\!\) has the form on the right. Here, we require that \(m> 0\) implies \(p_m > 0\) and that \(k > 0\) implies \(p_{-k} > 0\).

Definition 13 shows how to transform any CP program as in Definition 2 into a random walk program. The idea is to replace the tuple \(\varvec{x}\) by a single variable x that stands for \(\varvec{a} \bullet \varvec{x} -b\). Thus, the loop condition \(\varvec{a} \bullet \varvec{x} > b\) now becomes \(x > 0\). Moreover, a change from \(\varvec{x}\) to \(\varvec{x} + \varvec{c}_j\) now becomes a change from x to \(x + \varvec{a} \bullet \varvec{c}_j\).

[Program listings for Definition 13 and Example 14]

Definition 13

(Transforming CP Programs to Random Walk Programs). Let \(\mathcal {P}\) be the CP program on the left with \(\varvec{x} =(x_1,\ldots ,x_r)\) and \(\varvec{a} \bullet \varvec{d} \le b\). Let \( rdw _{\mathcal {P}}\) denote the affine map \( rdw _{\mathcal {P}}\!: \mathbb {Z}^{r}\!\!\rightarrow \mathbb {Z}\) with \( rdw _\mathcal {P}(\varvec{z}) = \varvec{a} \bullet \varvec{z} - b\) for all \(\varvec{z} \in \mathbb {Z}^{r}\). Thus, \( rdw _\mathcal {P}(\varvec{d})\le 0\). Let \(k_\mathcal {P},m_\mathcal {P}\in \mathbb {N}\) be minimal such that   \(-k_\mathcal {P}\le \varvec{a} \bullet \varvec{c}_j \le m_\mathcal {P}\)   holds for all \(1 \le j \le n\). For all \(-k_\mathcal {P}\le j \le m_\mathcal {P}\), we define \(p^{ rdw }_j = \sum \nolimits _{1 \le u \le n, \; \varvec{a} \bullet \varvec{c}_u = j} p_{\varvec{c}_u}\). This results in the random walk program \(\mathcal {P}^{ rdw }\) on the right.
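The transformation itself is mechanical. The following Python sketch (our own function name and encoding, using the same representation of updates as the simulation sketch in Sect. 2) computes \(k_\mathcal {P}\), \(m_\mathcal {P}\), and the probabilities \(p^{ rdw }_j\) from \(\varvec{a}\), b, and the \(\varvec{c}_j\).

from collections import defaultdict

def to_random_walk(a, b, updates, d, p_direct):
    # Definition 13: project a CP program onto x = a . x - b.
    # `updates` is a list of (c_j, p_j); returns (k, m, {j: p_j^rdw}, rdw(d), p').
    proj = {tuple(c): sum(ai * ci for ai, ci in zip(a, c)) for c, _ in updates}
    k = max(0, -min(proj.values()))
    m = max(0, max(proj.values()))
    p_rdw = defaultdict(float)
    for c, p in updates:
        p_rdw[proj[tuple(c)]] += p
    return k, m, dict(p_rdw), sum(ai * di for ai, di in zip(a, d)) - b, p_direct

# P_race (Example 1): yields k = 9, m = 1, p_1 = 6/11 and p_0 = ... = p_{-9} = 1/22, cf. Example 14.
race_updates = [((1, 0), 6/11)] + [((1, j), 1/22) for j in range(1, 11)]
print(to_random_walk((1, -1), -1, race_updates, d=(0, 1), p_direct=0.0))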

Example 14

(Transforming \(\mathcal {P}_{race}\)). For the program \(\mathcal {P}_{race}\) of Example 1, the mapping \( rdw _{\mathcal {P}_{race}}:\mathbb {Z}^{2} \rightarrow \mathbb {Z}\) is \( rdw _{\mathcal {P}_{race}}(t, h) = (1, -1)\bullet (t, h) +1 = t-h+1\). Hence we obtain the random walk program \(\mathcal {P}^{ rdw }_{race}\) on the right, where \(x= rdw _{\mathcal {P}_{race}}(t, h)\) represents the distance between the tortoise and the hare.

Approaches based on supermartingales (e.g., [1, 4, 10, 12, 13, 17]) use mappings similar to \( rdw _\mathcal {P}\) in order to infer a real-valued term which over-approximates the expected runtime. However, in the following (non-trivial) theorem we show that our transformation is not only an over- or under-approximation, but the termination behavior and the expected runtime of \(\mathcal {P}\) and \(\mathcal {P}^{ rdw }\) are identical.

Theorem 15

(Transformation Preserves Termination & Expected Runtime). Let \(\mathcal {P}\) be a CP program as in Definition 2. Then the termination times \(T^{\mathcal {P}}\!\) and \(T^{\mathcal {P}^{ rdw }}\!\!\!\!\) are identically distributed w.r.t. \( rdw _\mathcal {P}\), i.e., for all \(\varvec{x}_0 \in \mathbb {Z}^{r}\) with \(x_0 = rdw _{\mathcal {P}}(\varvec{x}_0)\) and all \(j\!\in \!\overline{\mathbb {N}}\) we have \( \mathbb {P}_{\varvec{x}_0}^{\mathcal {P}}(T^{\mathcal {P}}\!\!\!=\!j)=\mathbb {P}^{\mathcal {P}^{ rdw }}_{x_0}(T^{\mathcal {P}^{ rdw }}\!\!\!=\!j)\). So in particular, \(\mathbb {P}_{\varvec{x}_0}^{\mathcal {P}}(T^{\mathcal {P}}\!\!<\!\! \infty )\!= \mathbb {P}^{\mathcal {P}^{ rdw }}_{x_0}(T^{\mathcal {P}^{ rdw }}\!\!\!<\!\!\infty )\) and \(rt_{\varvec{x}_0}^\mathcal {P}\!=\!\mathbb {E}^\mathcal {P}_{\varvec{x}_0}(T^{\mathcal {P}}) \!=\! \mathbb {E}^{\mathcal {P}^{ rdw }}_{x_0}(T^{\mathcal {P}^{ rdw }}) \!=\! rt_{x_0}^{\mathcal {P}^{ rdw }}\). Thus, the expected runtimes of \(\mathcal {P}\) on the input \(\varvec{x}_0\) and of \(\mathcal {P}^{ rdw }\) on \(x_0\) coincide.

The following definition identifies pathological programs that can be disregarded.

[The trivial random walk program, cf. Definition 16]

Definition 16

(Trivial Program). Let \(\mathcal {P}\) be a CP program as in Definition 2. We call \(\mathcal {P}\) trivial if \(\varvec{a} = \varvec{0} = (0,0,\ldots ,0)\) or if \(\mathcal {P}^{ rdw }\) is the program on the right.

Note that a random walk program \(\mathcal {P}\) is trivial iff it has the form \(\texttt {while} (x\!>\!0) \{ x = x \;\; [1];\}\), since \(\mathcal {P}\!=\!\mathcal {P}^{ rdw }\) holds for random walk programs \(\mathcal {P}\). From now on, we will exclude trivial programs \(\mathcal {P}\) as their termination behavior is obvious: for inputs \(\varvec{x}_0\) that satisfy the loop condition \(\varvec{a} \bullet \varvec{x}_0 > b\), the program never terminates (i.e., \(rt_{\varvec{x}_0}^\mathcal {P}= \infty \)) and for inputs \(\varvec{x}_0\) with \(\varvec{a} \bullet \varvec{x}_0 \le b\) we have \(rt_{\varvec{x}_0}^\mathcal {P}= 0\). Note that if \(\varvec{a} = \varvec{0}\), then the termination behavior just depends on b: if \(b < 0\), then \(rt_{\varvec{x}_0}^\mathcal {P}= \infty \) for all \(\varvec{x}_0\) and if \(b \ge 0\), then \(rt_{\varvec{x}_0}^\mathcal {P}= 0\) for all \(\varvec{x}_0\).

4.2 Deciding Termination

We now present a simple decision procedure for (P)AST of random walk programs \(\mathcal {P}\). By the results of Sect. 4.1, this also yields a decision procedure for arbitrary CP programs. If \(p' > 0\), then Theorem 10 already shows that \(\mathcal {P}\) is PAST and its expected runtime is bounded by the constant \(\tfrac{1}{p'}\). Thus, in the rest of Sect. 4 we regard random walk programs without direct termination, i.e., \(p' = 0\).

Definition 17 introduces the drift of a random walk program, i.e., the expected value of the change of the program variable in one loop iteration, cf. [4].

Definition 17

(Drift). Let \(\mathcal {P}\) be a random walk program as in Definition 12. Then its drift is \(\mu _\mathcal {P}= \sum \nolimits _{-k \le j \le m} j \cdot p_j\).

Theorem 18 shows that to decide (P)AST, one just has to compute the drift.

Theorem 18

(Decision Procedure for (P)AST of Random Walk Programs). Let \(\mathcal {P}\) be a non-trivial random walk program without direct termination.

  • If \(\mu _\mathcal {P}> 0\), then the program is not AST.

  • If \(\mu _\mathcal {P}= 0\), then the program is AST but not PAST.

  • If \(\mu _\mathcal {P}< 0\), then the program is PAST.

Example 19

(\(\mathcal {P}_{race}\,\)is PAST). The drift of \(\mathcal {P}^{ rdw }_{race}\) in Example 14 is \(\mu _{\mathcal {P}^{ rdw }_{race}}=1\cdot \tfrac{6}{11}+ \tfrac{1}{22} \cdot \sum \nolimits _{-9 \le j \le 0} j=-\tfrac{3}{2}<0\). So on average the distance x between the tortoise and the hare decreases in each loop iteration. Hence by Theorem 18, \(\mathcal {P}^{ rdw }_{race}\) is PAST and the following Corollary 20 implies that \(\mathcal {P}_{race}\) is PAST as well.
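The resulting decision procedure fits in a few lines; a small Python sketch (ours, using exact rational arithmetic via the fractions module), applied to \(\mathcal {P}^{ rdw }_{race}\):

from fractions import Fraction as F

def drift(p):                            # Definition 17: p maps each step j to its probability p_j
    return sum(j * pj for j, pj in p.items())

def decide(p):                           # Theorem 18, for a non-trivial random walk program with p' = 0
    mu = drift(p)
    if mu > 0:
        return "not AST"
    if mu == 0:
        return "AST but not PAST"
    return "PAST"

# P_race^rdw from Example 14: p_1 = 6/11 and p_0 = ... = p_{-9} = 1/22, so the drift is -3/2.
p_race_rdw = {1: F(6, 11), **{-j: F(1, 22) for j in range(0, 10)}}
print(drift(p_race_rdw), decide(p_race_rdw))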

Corollary 20

(Decision Procedure for (P)AST of CP programs). For a non-trivial CP program \(\mathcal {P}\), \(\mathcal {P}\) is (P)AST iff \(\mathcal {P}^{ rdw }\) is (P)AST. Hence, Theorems 15 and 18 yield a decision procedure for AST and PAST of CP programs.

In [20], we show that Theorem 18 follows from classical results on random walks [33]. Alternatively, Theorem 18 could also be proved by combining several recent results on probabilistic programs: The approach of [28] could be used to show that \(\mu _\mathcal {P}= 0\) implies AST. Moreover, one could prove that \(\mu _\mathcal {P}< 0\) implies PAST by showing that x is a ranking supermartingale of the program [4, 10, 13, 17]. That the program is not PAST if \(\mu _\mathcal {P}\ge 0\) and not AST if \(\mu _\mathcal {P}> 0\) could be proved by showing that \(-x\) is a \(\mu _\mathcal {P}\)-repulsing supermartingale [12].

While the proof of Theorem 18 is based on known results, the formulation of Theorem 18 shows that there is an extremely simple decision procedure for (P)AST of CP programs, i.e., checking the sign of the drift is much simpler than applying existing (general) techniques for termination analysis of probabilistic programs.

4.3 Computing Asymptotic Expected Runtimes

It turns out that for random walk programs (and thus by Theorem 15, also for CP programs), one can not only decide termination, but one can also infer tight bounds on the expected runtime. Theorem 21 shows that the computation of the bounds is again very simple.

Theorem 21

(Bounds on the Expected Runtime of CP Programs). Let \(\mathcal {P}\) be a non-trivial CP program as in Definition 2 without direct termination which is PAST (i.e., \(\mu _{\mathcal {P}^{ rdw }}<0\)). Moreover, let \(k_\mathcal {P}\) be obtained according to the transformation from Definition 13. If \( rdw _\mathcal {P}(\varvec{x}_0) \le 0\), then \(rt_{\varvec{x}_0}^\mathcal {P}= 0\). If \( rdw _\mathcal {P}(\varvec{x}_0) > 0\), then \(\mathcal {P}\)’s expected runtime is asymptotically linear and we have

$$-\tfrac{1}{\mu _{\mathcal {P}^{ rdw }}}\cdot rdw _{\mathcal {P}}(\varvec{x}_0) \quad \le \quad rt_{\varvec{x}_0}^\mathcal {P}\quad \le \quad -\tfrac{1}{\mu _{\mathcal {P}^{ rdw }}}\cdot rdw _{\mathcal {P}}(\varvec{x}_0) + \tfrac{1-k_\mathcal {P}}{\mu _{\mathcal {P}^{ rdw }}}. $$

Example 22

(Bounds on the Runtime of \(\mathcal {P}_{race}\)). In Example 19 we saw that the program \(\mathcal {P}^{ rdw }_{race}\) from Example 14 is PAST as it has the drift \(\mu _{\mathcal {P}^{ rdw }_{race}}=-\tfrac{3}{2}<0\). Note that here \(k=9\). Hence by Theorem 21 we get that whenever \( rdw _{\mathcal {P}_{race}}(t, h) = t-h+1\) is positive, the expected runtime \(rt^{\mathcal {P}_{race}}_{(t, h)}\) is between \(-\tfrac{1}{\mu _{\mathcal {P}^{ rdw }_{race}}}\cdot rdw _{\mathcal {P}_{race}}(t, h)=\tfrac{2}{3}\cdot (t-h+1)\) and \(-\tfrac{1}{\mu _{\mathcal {P}^{ rdw }_{race}}}\cdot rdw _{\mathcal {P}_{race}}(t, h) + \tfrac{1-k}{\mu _{\mathcal {P}^{ rdw }_{race}}}=\tfrac{2}{3}\cdot (t-h+1) + \tfrac{16}{3}\). The same upper bound \(\tfrac{2}{3}\cdot (t-h+1) + \tfrac{16}{3}\) was inferred in [30] by an incomplete technique based on several inference rules and linear programming solvers. In contrast, Theorem 21 allows us to read off such bounds directly from the program.
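The bounds of Theorem 21 are equally easy to evaluate; a small sketch (ours), here instantiated for \(\mathcal {P}_{race}\) with \((t,h) = (5,1)\), i.e., \( rdw _{\mathcal {P}_{race}}(t,h) = 5\):

from fractions import Fraction as F

def runtime_bounds(mu, k, x):            # Theorem 21: requires mu < 0 and x = rdw_P(x_0) > 0
    lower = -x / mu
    upper = lower + (1 - k) / mu
    return lower, upper

# P_race: drift -3/2 and k = 9, so for x = 5 the bounds are 10/3 and 26/3 (cf. Example 5)
print(runtime_bounds(F(-3, 2), 9, 5))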

Our proof of Theorem 21 in [20] again uses the connection to random walks and shows that the classical Lemma of Wald [21, Lemma 10.2(9)] directly yields both the upper and the lower bound for the expected runtime. Alternatively, the upper bound in Theorem 21 could also be proved by considering that \( rdw _{\mathcal {P}}(\varvec{x}_0)+(1-k_{\mathcal {P}})\) is a ranking supermartingale [1, 4, 10, 13, 17] whose expected decrease in each loop iteration is \(\mu _\mathcal {P}\). The lower bound could also be inferred by considering the difference-bounded submartingale \(- rdw _{\mathcal {P}}(\varvec{x}_0)\) [7, 19].

5 Computing Exact Expected Runtimes

While Theorems 10 and 21 state how to deduce the asymptotic expected runtime, we now show that based on these results one can compute the runtime of CP programs exactly. In general, whenever possible, inferring the exact runtimes of programs is preferable to asymptotic runtimes, which ignore the “coefficients” of the runtime.

Again, we first consider random walk programs and generalize our technique to CP programs using Theorem 15 afterwards. Throughout Sect. 5, for any random walk program \(\mathcal {P}\) as in Definition 12, we require that \(\mathcal {P}\) is PAST, i.e., that \(p'>0\) (cf. Theorem 10) or that the drift \(\mu _\mathcal {P}\) is negative if \(p'=0\) (cf. Theorem 18). Note that whenever \(k=0\) and \(\mathcal {P}\) is PAST, then \(p'>0\).

To compute \(\mathcal {P}\)’s expected runtime exactly, we use its characterization as the least fixpoint of the expected runtime transformer \(\mathcal {L}^\mathcal {P}\) (cf. Theorem 9), i.e., \(rt^\mathcal {P}_{x}\) is the smallest function \(f:\mathbb {Z}\rightarrow \overline{\mathbb {R}_{\ge 0}}\) satisfying the constraint

$$\begin{aligned} f(x) \;= \; \sum \nolimits _{-k \le j \le m} p_j \cdot f(x+j)+p' \cdot f(d)+1 \quad \text {for all }x > 0, \end{aligned}$$
(3)

cf. (1). Since \(\mathcal {P}\) is PAST, f never returns \(\infty \), i.e., \(f: \mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\). Note that the smallest function \(f: \mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) that satisfies (3) also satisfies

$$\begin{aligned} f(x) \;= \; 0 \quad \text {for all }x \le 0. \end{aligned}$$
(4)

Therefore, as \(d \le 0\), the constraint (3) can be simplified to

$$\begin{aligned} f(x) \;= \; \sum \nolimits _{-k \le j \le m} p_j \cdot f(x+j)+1 \quad \text {for all }x > 0. \end{aligned}$$
(5)

In Sect. 5.1 we recapitulate how to compute all solutions of such inhomogeneous recurrence equations (cf., e.g., [14, Ch. 2]). However, to compute \(rt^\mathcal {P}_{x}\), the challenge is to find the smallest solution \(f: \mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) of the recurrence equation (5). Therefore, in Sect. 5.2 we will exploit the knowledge gained in Theorems 10 and 21 to show that there is only a single function f that satisfies both (4) and (5) and is bounded by a constant (if \(p' > 0\), cf. Theorem 10) resp. by a linear function (if \(p' = 0\), cf. Theorem 21). This observation then allows us to compute \(rt^\mathcal {P}_x\) exactly. So the crucial prerequisites for this result are Theorem 9 (which characterizes the expected runtime as the smallest solution of the recurrence equation (5)), Theorem 18 (which allows the restriction to negative drift if \(p' = 0\)), and in particular Theorems 10 and 21 (since Sect. 5.2 will show that the results of Theorems 10 and 21 on the asymptotic runtime can be translated into suitable conditions on the solutions of (5)).

5.1 Finding All Solutions of the Recurrence Equation

Example 23

(Modification of \(\mathcal {P}^{ rdw }_{race}\)). To illustrate our approach, we use a modified version of \(\mathcal {P}^{ rdw }_{race}\) from Example 14 to ease readability. In Sect. 6, we will consider the original program \(\mathcal {P}^{ rdw }_{race}\) resp. \(\mathcal {P}_{race}\) from Example 14 resp. Example 1 again and show its exact expected runtime inferred by the implementation of our approach. In the modified program \(\mathcal {P}^{mod}_{race}\) on the right, the distance between the tortoise and the hare still increases with probability \(\tfrac{6}{11}\), but the probability of decreasing by more than two is distributed to the cases where it stays the same and where it decreases by two. We have \(p'=0\) and the drift is \(\mu _{\mathcal {P}^{mod}_{race}} =1 \cdot \tfrac{6}{11} + 0 \cdot \tfrac{1}{11} -1 \cdot \tfrac{1}{22} -2 \cdot \tfrac{7}{22}=-\tfrac{3}{22}<0\). So by Theorem 18, \(\mathcal {P}^{mod}_{race}\) is PAST. By Theorem 9, \(rt^{\mathcal {P}^{mod}_{race}}_{x}\) is the smallest function \(f : \mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) satisfying

$$\begin{aligned} f(x) \;= \; \tfrac{6}{11}\cdot f(x+1) + \tfrac{1}{11}\cdot f(x) + \tfrac{1}{22}\cdot f(x-1) + \tfrac{7}{22}\cdot f(x-2) +1 \; \text {for all }x > 0. \end{aligned}$$
(6)
[Program \(\mathcal {P}^{mod}_{race}\) from Example 23]

Instead of searching for the smallest \(f:\mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) satisfying (5), we first calculate the set of all functions \(f:\mathbb {Z}\rightarrow \mathbb {C}\) that satisfy (5), i.e., we also consider functions returning negative or complex numbers. Clearly, (5) is equivalent to

$$\begin{aligned} \begin{array}{r@{\;\;}c@{\;\;}l} 0&{}=&{}p_m \cdot f(x+m)+\ldots +p_1 \cdot f(x+1)+(p_0-1) \cdot f(x) \;+\\ &{}&{}p_{-1} \cdot f(x-1)+\ldots +p_{-k} \cdot f(x-k)+1 \qquad \qquad \text {for all }x>0. \end{array} \end{aligned}$$
(7)

The set of solutions on \(\mathbb {Z}\rightarrow \mathbb {C}\) of this linear, inhomogeneous recurrence equation is an affine space which can be written as an arbitrary particular solution of the inhomogeneous equation plus any linear combination of \(k+m\) linearly independent solutions of the corresponding homogeneous recurrence equation.

We start with computing a solution to the inhomogeneous equation (7). To this end, we use the bounds for \(rt^\mathcal {P}_{x}\) from Theorems 10 and 21 (where we take the upper bound \(\tfrac{1}{p'}\) if \(p' > 0\) and the lower bound \(-\tfrac{1}{\mu _\mathcal {P}} \cdot x\) if \(p' = 0\)). So we define

$$ C_{const}=\tfrac{1}{p'}, \; \text {if }p' > 0 \qquad \text { and }\qquad C_{lin}=-\tfrac{1}{\mu _\mathcal {P}}, \; \text {if }p' = 0.$$

One easily shows that if \(p' > 0\), then \(f(x) = C_{const}\) is a solution of the inhomogeneous recurrence equation (7) and if \(p' = 0\), then \(f(x)=C_{lin} \cdot x\) solves (7).

Example 24

(Example 23 cont.). In the program \(\mathcal {P}^{mod}_{race}\) of Example 23, we have \(p' = 0\) and \(\mu _{\mathcal {P}^{mod}_{race}} =-\tfrac{3}{22}\). Hence \(C_{lin}= \tfrac{22}{3}\) and \(C_{lin}\cdot x\) is a solution of (6).
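This claim can also be checked mechanically; a small SymPy sketch (ours) verifies that \(\tfrac{22}{3}\cdot x\) satisfies (6):

import sympy as sp
x = sp.symbols('x')
f = sp.Rational(22, 3) * x               # the candidate particular solution C_lin * x
residual = (sp.Rational(6, 11) * f.subs(x, x + 1) + sp.Rational(1, 11) * f
            + sp.Rational(1, 22) * f.subs(x, x - 1) + sp.Rational(7, 22) * f.subs(x, x - 2)
            + 1 - f)
print(sp.simplify(residual))             # prints 0, so 22/3 * x indeed satisfies (6)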

After having determined one particular solution of the inhomogeneous recurrence equation (7), we now compute the solutions of the homogeneous recurrence equation which results from (7) by replacing the add-on “+ 1” with 0. To this end, we consider the corresponding characteristic polynomial \(\chi _\mathcal {P}\):

$$\begin{aligned} \chi _\mathcal {P}(\lambda )=p_m \cdot \lambda ^{k+m}+\ldots +p_1 \cdot \lambda ^{k+1}+(p_0-1) \cdot \lambda ^k + p_{-1} \cdot \lambda ^{k-1}+\ldots +p_{-k} \;\;\;\; \end{aligned}$$
(8)

Let \(\lambda _1,\ldots ,\lambda _c\) denote the pairwise different (possibly complex) roots of the characteristic polynomial \(\chi _\mathcal {P}\). For all \(1 \le j \le c\), let \(v_j\in \mathbb N\setminus \{0\}\) be the multiplicity of the root \(\lambda _j\). Thus, we have \(v_1 + \ldots + v_c = k + m\).

Then we obtain the following \(k+m\) linearly independent solutions of the homogeneous recurrence equation resulting from (7):

$$\begin{aligned} \lambda _j^x \cdot x^u \quad \text { for all }1 \le j \le c\text { and all }0 \le u \le v_j-1 \end{aligned}$$

So \(f\!:\!\mathbb {Z}\!\rightarrow \!\mathbb {C}\) is a solution of (5) (resp. (7)) iff there exist coefficients \(a_{j,u}\!\in \!\mathbb {C}\) with

$$\begin{aligned} f(x) \;= \; C(x) \, + \, \sum \nolimits _{1 \le j \le c} \;\; \sum \nolimits _{0\le u \le v_j-1}a_{j,u} \cdot \lambda _j^x \cdot x^u \quad \text {for all }x > -k, \end{aligned}$$
(9)

where \(C(x) = C_{const} = \tfrac{1}{p'}\) if \(p' > 0\) and \(C(x) = C_{lin} \cdot x = -\tfrac{1}{\mu _\mathcal {P}} \cdot x\) if \(p' = 0\). The reason for requiring (9) for all \(x > -k\) is that \(-k+1\) is the smallest argument where f’s value is taken into account in (5).

Example 25

(Example 24 cont.). The characteristic polynomial for the program \(\mathcal {P}^{mod}_{race}\) of Example 23 has the degree \(k+m=2+1=3\) and is given by

$$\begin{aligned} \chi _{\mathcal {P}^{mod}_{race}}(\lambda )\;=\;\tfrac{6}{11}\cdot \lambda ^3 -\tfrac{10}{11}\cdot \lambda ^2 +\tfrac{1}{22}\cdot \lambda +\tfrac{7}{22}. \end{aligned}$$

Its roots are \(\lambda _1=1\), \(\lambda _2=-\tfrac{1}{2}\), and \(\lambda _3=\tfrac{7}{6}\). So here, all roots are real numbers and they all have the multiplicity 1. Hence, three linearly independent solutions of the homogeneous part of (6) are the functions \(1^x=1\), \((-\tfrac{1}{2})^x\), and \((\tfrac{7}{6})^x\). Therefore, a function \(f:\mathbb {Z}\rightarrow \mathbb {C}\) satisfies (6) iff there are \(a_1,a_2,a_3 \in \mathbb {C}\) such that

$$\begin{aligned} \begin{array}{r@{\;\;}c@{\;\;}l} f(x)&{}=&{}C_{lin} \cdot x + a_1 \cdot 1^x+a_2 \cdot (-\tfrac{1}{2})^x\!+ a_3 \cdot (\tfrac{7}{6})^x\\ &{}=&{}\tfrac{22}{3}\cdot x + a_1+a_2 \cdot (-\tfrac{1}{2})^x\!+ a_3\cdot (\tfrac{7}{6})^x \qquad \text {for }x > -2. \end{array} \end{aligned}$$
(10)
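The roots can be computed (numerically) with standard libraries; a small sketch (ours) using NumPy reproduces the roots of \(\chi _{\mathcal {P}^{mod}_{race}}\):

import numpy as np
coeffs = [6/11, -10/11, 1/22, 7/22]      # characteristic polynomial (8) of P_race^mod, highest degree first
print(np.roots(coeffs))                  # approximately 7/6, 1, and -1/2 (in some order)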

5.2 Finding the Smallest Solution of the Recurrence Equation

In Sect. 5.1, we recapitulated the standard approach for solving inhomogeneous recurrence equations which shows that any function \(f:\mathbb {Z}\rightarrow \mathbb {C}\) that satisfies the constraint (5) is of the form (9). Now we will present a novel technique to compute \(rt^\mathcal {P}_x\), i.e., the smallest non-negative solution \(f:\mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) of (5). By Theorems 10 and 21, this function f is bounded by a constant (if \(p' > 0\)) resp. linear (if \(p' = 0\)). So, when representing f in the form (9), we must have \(a_{j,u} = 0\) whenever \(|\lambda _j| > 1\). The following lemma shows how many roots with absolute value less than or equal to 1 there are (i.e., these are the only roots that we have to consider). It is proved using Rouché’s Theorem, which allows us to infer the number of roots whose absolute value is below a certain bound. Note that 1 is a root of the characteristic polynomial iff \(p' = 0\), since \(\sum \nolimits _{-k \le j \le m} p_j = 1 - p'\).

Lemma 26

(Number of Roots With Absolute Value \(\le 1\)). Let \(\mathcal {P}\) be a random walk program as in Definition 12 that is PAST. Then the characteristic polynomial \(\chi _\mathcal {P}\) has k roots \(\lambda \in \mathbb {C}\) (counted with multiplicity) with \(|\lambda |\le 1\).

Example 27

(Example 25 cont.). In \(\mathcal {P}^{mod}_{race}\) of Example 23 we have \(k = 2\). So by Lemma 26, \(\chi _{\mathcal {P}^{mod}_{race}}\) has exactly two roots with absolute value \(\le 1\). Indeed, the roots of \(\chi _{\mathcal {P}^{mod}_{race}}\) are \(\lambda _1=1\), \(\lambda _2=-\tfrac{1}{2}\), and \(\lambda _3=\tfrac{7}{6}\), cf. Example 25. So \(|\lambda _3|>1\), but \(|\lambda _1|\le 1\) and \(|\lambda _2|\le 1\).

Based on Lemma 26, the following lemma shows that when imposing the restriction that \(a_{j,u} = 0\) whenever \(|\lambda _j| > 1\), then there is only a single function of the form (9) that also satisfies the constraint (4). Hence, this must be the function that we are searching for, because the desired smallest solution \(f:\mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) of (5) also satisfies (4).

Lemma 28

(Unique Solution of (4) and (5) when Disregarding Roots With Absolute Value \(> 1\)). Let \(\mathcal {P}\) be a random walk program as in Definition 12 that is PAST. Then there is exactly one function \(f:\mathbb {Z}\rightarrow \mathbb {C}\) which satisfies both (4) and (5) (thus, it has the form (9)) and has \(a_{j,u} = 0\) whenever \(|\lambda _j| > 1\).

The main theorem of Sect. 5 now shows how to compute the expected runtime exactly. By Theorems 10 and 21 on the bounds for the expected runtime and by Lemma 28, we no longer have to search for the smallest function that satisfies (4) and (5), but we just search for any solution of (4) and (5) which has \(a_{j,u} = 0\) whenever \(|\lambda _j| > 1\) (because there is just a single such solution). So one only has to determine the values of the remaining k coefficients \(a_{j,u}\) for \(|\lambda _j| \le 1\), which can be done by exploiting that f(x) has to satisfy both (4) for all \(x \le 0\) and it has to be of the form (9) for all \(x > -k\). In other words, the function in (9) must be 0 for \(-k+1 \le x \le 0\).

Theorem 29

(Exact Expected Runtime for Random Walk Programs). Let \(\mathcal {P}\) be a random walk program as in Definition 12 that is PAST and let \(\lambda _1,\ldots ,\lambda _c\) be the roots of its characteristic polynomial with multiplicities \(v_1,\ldots ,v_c\). Moreover, let \(C(x) = C_{const} = \tfrac{1}{p'}\) if \(p' > 0\) and \(C(x) = C_{lin} \cdot x = -\tfrac{1}{\mu _\mathcal {P}} \cdot x\) if \(p' = 0\). Then the expected runtime of \(\mathcal {P}\) is \(rt^\mathcal {P}_x = 0\) for \(x \le 0\) and

$$ rt^\mathcal {P}_x \; = \; C(x)\;\; + \; \; \sum \nolimits _{1 \le j \le c, \; |\lambda _j|\le 1} \; \; \; \sum \nolimits _{0\le u \le v_j-1} \; a_{j,u} \cdot \lambda _j^x \cdot x^u \quad \text {for }x > 0,$$

where the coefficients \(a_{j,u}\) are the unique solution of the k linear equations:

$$\begin{aligned} 0\; = \; C(x) +\sum \nolimits _{1 \le j \le c, \; |\lambda _j|\le 1} \, \sum \nolimits _{0\le u \le v_j-1}a_{j,u} \cdot \lambda _j^x \cdot x^u \quad \text {for }-k+1 \le x \le 0 \end{aligned}$$
(11)

So in the special case where \(k = 0\), we have \(rt^\mathcal {P}_x = C(x) = C_{const} = \tfrac{1}{p'}\) for \(x > 0\).

Thus for \(x>0\), the expected runtime \(rt^\mathcal {P}_x\) can be computed by summing up the bound C(x) and an add-on \(\sum \nolimits _{1 \le j \le c, \; |\lambda _j|\le 1} \; \sum \nolimits _{0\le u \le v_j-1} \ldots \;\) Since C(x) is an upper bound for \(rt^\mathcal {P}_x\) if \(p' >0\) and a lower bound for \(rt^\mathcal {P}_x\) if \(p'=0\), this add-on is non-positive if \(p' > 0\) and non-negative if \(p' = 0\).

Example 30

(Example 27 cont.). By Theorem 29, the expected runtime of the program \(\mathcal {P}^{mod}_{race}\) from Example 23 is \(rt^{\mathcal {P}^{mod}_{race}}_{x} = 0\) for \(x \le 0\) and

$$ rt^{\mathcal {P}^{mod}_{race}}_{x} \; = \; \tfrac{22}{3} \cdot x +a_1+a_2\cdot (-\tfrac{1}{2} )^x \quad \mathrm{for}\ \,\,\, x > 0, \,\; \mathrm{cf.}\ \, \, (10). $$

The coefficients \(a_1\) and \(a_2\) are the unique solution of the \(k=2\) linear equations

$$\begin{aligned} 0&= \tfrac{22}{3}\cdot 0+a_1+a_2\cdot (-\tfrac{1}{2})^0=a_1+a_2\\ 0&= \tfrac{22}{3}\cdot (-1)+a_1+a_2\cdot (-\tfrac{1}{2})^{-1}=-\tfrac{22}{3}+a_1-2\cdot a_2 \end{aligned}$$

So \(a_1=\tfrac{22}{9}\), \(a_2=-\tfrac{22}{9}\), and hence \(rt^{\mathcal {P}^{mod}_{race}}_{x} \; = \; \tfrac{22}{3}\cdot x+\tfrac{22}{9}-\tfrac{22}{9}\cdot (-\tfrac{1}{2})^x\) for \(x > 0\).
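The same computation can be done mechanically; a small SymPy sketch (ours) sets up the two equations from (11) and solves them:

import sympy as sp
a1, a2, x = sp.symbols('a1 a2 x')
C = sp.Rational(22, 3) * x
eqs = [sp.Eq(C.subs(x, v) + a1 + a2 * sp.Rational(-1, 2)**v, 0) for v in (0, -1)]   # the equations (11)
sol = sp.solve(eqs, (a1, a2))
rt = C + sol[a1] + sol[a2] * sp.Rational(-1, 2)**x
print(sol, rt)                           # {a1: 22/9, a2: -22/9} and the closed form from Example 30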

By Theorem 15, we can lift Theorem 29 to arbitrary CP programs \(\mathcal {P}\) immediately.

Corollary 31

(Exact Expected Runtime for CP Programs). For any CP program, its expected runtime can be computed exactly.

Note that irrespective of the degree of the characteristic polynomial, its roots can always be approximated numerically with any chosen precision. Thus, “exact computation” of the expected runtime in the corollary above means that a closed form for \(rt^\mathcal {P}_{\varvec{x}}\) can also be computed with any desired precision.

Example 32

(Exact Expected Runtime of \(\mathcal {P}_{direct}\)). Reconsider the program \(\mathcal {P}_{direct}\) of Example 11 with the probability \(p' = \tfrac{1}{10}\) for direct termination. \(\mathcal {P}_{direct}\) is PAST and its expected runtime is at most \(\tfrac{1}{p'} = 10\), cf. Example 11. The random walk program \(\mathcal {P}_{direct}^{ rdw }\) on the right is obtained by the transformation of Definition 13. As \(k = 0\), by Theorem 29 we obtain \(rt^{\mathcal {P}_{direct}^{ rdw }}_x = \tfrac{1}{p'} = 10\) for \(x > 0\). By Theorem 15, this implies \(rt^{\mathcal {P}_{direct}}_{(t, h)} = rt^{\mathcal {P}_{direct}^{ rdw }}_{ rdw _{\mathcal {P}_{direct}}(t, h)} = 10\) if \( rdw _{\mathcal {P}_{direct}}(t,h) = t - h + 1 > 0\), i.e., 10 is indeed the exact expected runtime of \(\mathcal {P}_{direct}\).

[Program \(\mathcal {P}_{direct}^{ rdw }\) from Example 32]

Note that Theorem 29 and Corollary 31 imply that for any \(\varvec{x}_0 \in \mathbb {Z}^r\), the expected runtime \(rt^\mathcal {P}_{\varvec{x}_0}\) of a CP program \(\mathcal {P}\) that is PAST and has only rational probabilities \(p_{\varvec{c}_1},\ldots ,p_{\varvec{c}_n},p' \in \mathbb {Q}\) is always an algebraic number. Thus, one could also compute a closed form for the exact expected runtime \(rt^\mathcal {P}_{\varvec{x}}\) using a representation with algebraic numbers instead of numerical approximations.

Nevertheless, Theorem 29 may yield a representation of \(rt^\mathcal {P}_x\) which contains complex numbers \(a_{j,u}\) and \(\lambda _j\), although \(rt^\mathcal {P}_x\) is always real. However, one can easily obtain a more intuitive representation of \(rt^\mathcal {P}_x\) without complex numbers:

Since the characteristic polynomial \(\chi _\mathcal {P}\) only has real coefficients, whenever \(\chi _\mathcal {P}\) has a complex root \(\lambda \) of multiplicity v, its conjugate \(\overline{\lambda }\) is also a root of \(\chi _\mathcal {P}\) with the same multiplicity v. So the pairwise different roots \(\lambda _1, \ldots , \lambda _c\) can be distinguished into pairwise different real roots \(\lambda _1, \ldots , \lambda _s\), and into pairwise different non-real complex roots \(\lambda _{s+1}, \overline{\lambda _{s+1}}, \ldots , \lambda _{s+t}, \overline{\lambda _{s+t}}\), where \(c = s + 2 \cdot t\).

Since \(rt^\mathcal {P}_x\) is real, in the representation of Theorem 29 the coefficient \(a_{j,u}'\) of \(\overline{\lambda _{j}}^x \cdot x^u\) is the complex conjugate of the coefficient \(a_{j,u}\) of \(\lambda _{j}^x \cdot x^u\). For \(j \in \{s+1,\ldots ,s+t\}\) and \(u \in \{0,\ldots , v_{j}-1\}\), let \(b_{j,u} = 2 \cdot \mathrm {Re}(a_{j,u})\in \mathbb {R}\) and \(b_{j,u}' = -2\cdot \mathrm {Im}(a_{j,u})\in \mathbb {R}\). Then \(a_{j,u} \cdot \lambda _{j}^x + a_{j,u}' \cdot \overline{\lambda _{j}}^x \; = \; b_{j,u} \cdot \mathrm {Re}(\lambda _{j}^x) + b_{j,u}' \cdot \mathrm {Im}(\lambda _{j}^x)\). Hence, by Theorem 29 we get the following representation of the expected runtime which only uses real numbers:

$$\begin{aligned} {rt^\mathcal {P}_x\!=\!\left\{ \begin{array}{lll} C(x) &{}+ \sum \limits _{1 \le j \le s, \; |\lambda _j|\le 1} \; \; \;\sum \limits _{0\le u \le v_j-1}a_{j,u} \cdot \lambda _j^x \cdot x^u\\ &{}+ \sum \limits _{s+1 \le j \le s+t, \; |\lambda _{j}|\le 1} \; \sum \limits _{0 \le u \le v_{j}-1} \left( b_{j,u}\!\cdot \!\mathrm {Re}(\lambda _{j}^x) + b_{j,u}'\!\cdot \!\mathrm {Im}(\lambda _{j}^x)\right) \cdot x^u, &{} \text {for }x > 0\\ 0, &{}&{} \text {for }x \le 0 \end{array} \right. }\end{aligned}$$
(12)

To compute \(\mathrm {Re}(\lambda _{j}^x)\) and \(\mathrm {Im}(\lambda _{j}^x)\), take the polar representation of the non-real roots \(\lambda _{j} = w_j\cdot e^{\theta _j \cdot i}\). Then \(\mathrm {Re}(\lambda _{j}^x) = w^x_j \cdot \cos (\theta _j \cdot x)\) and \(\mathrm {Im}(\lambda _{j}^x) =w^x_j \cdot \sin (\theta _j \cdot x)\).
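A small numeric sketch (ours, with hypothetical values for \(\lambda \) and its coefficient) illustrates that the complex and the real representation coincide:

import math
lam = complex(0.3, 0.4)                  # a hypothetical non-real root with |lam| <= 1
a = complex(1.5, -2.0)                   # its coefficient; the conjugate root carries the conjugate coefficient
b, b_prime = 2 * a.real, -2 * a.imag
w, theta = abs(lam), math.atan2(lam.imag, lam.real)                        # polar form: lam = w * e^(i*theta)
for x in range(1, 6):
    lhs = a * lam ** x + a.conjugate() * lam.conjugate() ** x              # complex representation
    rhs = (b * math.cos(theta * x) + b_prime * math.sin(theta * x)) * w ** x   # real representation
    print(x, abs(lhs - rhs) < 1e-12)     # prints True: the two representations agree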

Therefore, we obtain the following algorithm to deduce the exact expected runtime automatically.

Algorithm 33

(Computing the Exact Expected Runtime). To infer the runtime of a CP program \(\mathcal {P}\) as in Definition 2 that is PAST, we proceed as follows:

  1. Transform \(\mathcal {P}\) into \(\mathcal {P}^{ rdw }\) by the transformation of Definition 13. Thus, \(\mathcal {P}^{ rdw }\) is a random walk program as in Definition 12.

  2. Compute the solution \(C(x) = C_{const} = \tfrac{1}{p'}\) resp. \(C(x) = C_{lin}\cdot x = -\tfrac{1}{\mu _{\mathcal {P}^{ rdw }}} \cdot x\) of the inhomogeneous recurrence equation (7).

  3. Compute the \(k+m\) (possibly complex) roots of the characteristic polynomial \(\chi _{\mathcal {P}^{ rdw }}\) (cf. (8)) and keep the k roots \(\lambda \) with \(|\lambda |\le 1\).

  4. Determine the coefficients \(a_{j,u}\) by solving the k linear equations in (11).

  5. Return the solution (12) where \(b_{j,u} = 2 \cdot \mathrm {Re}(a_{j,u})\), \(b_{j,u}' = -2 \cdot \mathrm {Im}(a_{j,u})\), and for \(\lambda _{j} = w_j\cdot e^{\theta _j \cdot i}\) we have \(\mathrm {Re}(\lambda _{j}^x) = w^x_j \cdot \cos (\theta _j \cdot x)\) and \(\mathrm {Im}(\lambda _{j}^x) =w^x_j \cdot \sin (\theta _j \cdot x)\). Moreover, x must be replaced by \( rdw _\mathcal {P}(\varvec{x})\).
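The following Python sketch (ours, not KoAT's actual implementation) implements steps 2-5 for a random walk program; step 1 is the to_random_walk transformation sketched in Sect. 4.1. For simplicity it assumes that all roots of the characteristic polynomial are distinct (as in the examples above), and it finds the roots numerically, so the resulting closed form is exact only up to the chosen precision.

import numpy as np

def exact_runtime(p, p_direct):
    # Steps 2-5 of Algorithm 33 for a random walk program (Definition 12) that is PAST.
    # p maps each step j in {-k,...,m} to its probability p_j.
    k, m = max(0, -min(p)), max(0, max(p))
    mu = sum(j * pj for j, pj in p.items())                        # drift (Definition 17)
    C = (lambda x: 1 / p_direct) if p_direct > 0 else (lambda x: -x / mu)
    coeffs = [p.get(m - i, 0.0) - (1 if m - i == 0 else 0) for i in range(k + m + 1)]   # polynomial (8)
    roots = [lam for lam in np.roots(coeffs) if abs(lam) <= 1 + 1e-9]                   # the k roots of Lemma 26
    xs = list(range(-k + 1, 1))                                    # arguments of the k equations (11)
    A = np.array([[lam ** x for lam in roots] for x in xs], dtype=complex)
    a = np.linalg.solve(A, np.array([-C(x) for x in xs], dtype=complex)) if k > 0 else []
    return lambda x: 0.0 if x <= 0 else float(np.real(C(x) + sum(ai * lam ** x for ai, lam in zip(a, roots))))

# P_race^mod from Example 23: the closed form is 22/3*x + 22/9 - 22/9*(-1/2)^x, so rt(1) = 11.
rt = exact_runtime({1: 6/11, 0: 1/11, -1: 1/22, -2: 7/22}, p_direct=0.0)
print(rt(1), rt(2), rt(3))

For a CP program such as \(\mathcal {P}_{race}\) one would first apply to_random_walk from Sect. 4.1 and then pass the resulting step probabilities to exact_runtime, replacing x by \( rdw _\mathcal {P}(\varvec{x})\) as in step 5.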

6 Conclusion, Implementation, and Related Work

We presented decision procedures for termination and complexity of classes of probabilistic programs. They are based on the connection between the expected runtime of a program and the smallest solution of its corresponding recurrence equation, cf. Sect. 2. For our notion of probabilistic programs, if the probability for leaving the loop directly is at least \(p'\) for some \(p' > 0\), then the program is always PAST and its expected runtime is asymptotically constant, cf. Sect. 3. In Sect. 4 we showed that a very simple decision procedure for AST and PAST of CP programs can be obtained by classical results from random walk theory and that the expected runtime is asymptotically linear if the program is PAST. Based on these results, in Sect. 5 we presented our algorithm to automatically infer a closed form for the exact expected runtime of CP programs (i.e., with arbitrarily high precision). All proofs and a collection of examples to demonstrate our algorithm can be found in [20].

Implementation. We implemented Algorithm 33 in our tool KoAT [9], which was already one of the leading tools for complexity analysis of (non-probabilistic) integer programs. The implementation is written in OCaml and uses the Python libraries MpMath [22] and SymPy [29] for solving linear equations and for finding the roots of the characteristic polynomial. In addition to the closed form for the exact expected runtime, our implementation can also compute the concrete number of expected loop iterations if the user specifies the initial values of the variables. For further details, a set of benchmarks, and to download our implementation, we refer to https://aprove-developers.github.io/recurrence/.

Example 34

(Computing the Exact Expected Runtime of \(\mathcal {P}_{race}\) Automatically). For the tortoise and hare program \(\mathcal {P}_{race}\) from Example 1, our implementation in KoAT computes the following expected runtime within 0.49 s on an Intel Core i7-6500 with 8 GB memory (when selecting a precision of 2 decimal places):

$${\begin{array}{rl} rt^{\mathcal {P}_{race}}_{(t, h)}= &{}0.049\cdot {0.65}^{(t-h+1)}\;\cdot \sin \left( 2.8 \cdot (t-h+1) \right) - 0.35\cdot {0.65}^{(t-h+1)}\!\cdot \cos \left( 2.8\cdot (t-h+1) \right) \\ &{}+ 0.15\cdot {0.66}^{(t-h+1)}\! \cdot \sin \left( 2.2\cdot (t-h+1) \right) - 0.35\cdot {0.66}^{(t-h+1)}\! \cdot \cos \left( 2.2\cdot (t-h+1) \right) \\ &{}+ 0.3\cdot {0.7}^{(t-h+1)}\! \cdot \sin \left( 1.5 \cdot (t-h+1) \right) - 0.39\cdot {0.7}^{(t-h+1)}\! \cdot \cos \left( 1.5\,(t-h+1) \right) \\ &{}+ 0.62\cdot {0.75}^{(t-h+1)}\! \cdot \sin \left( 0.83\cdot (t-h+1) \right) - 0.49\cdot {0.75}^{(t-h+1)}\! \cdot \cos \left( 0.83 \cdot (t-h+1) \right) \\ &{}+\tfrac{2}{3}\cdot (t-h) \; + \; 2.3\\ \end{array}}$$

So when starting in a state with \(t = 1000\) and \(h = 0\), according to our implementation the expected number of loop iterations is \(rt^{\mathcal {P}_{race}}_{(1000, 0)}= 670\).

Related Work. Many techniques to analyze (P)AST have been developed, which mostly rely on ranking supermartingales, e.g., [1, 4, 10, 12, 13, 17, 19, 28, 30]. Indeed, several of these works (e.g., [1, 4, 17, 19]) present complete criteria for (P)AST, although (P)AST is undecidable. However, the corresponding automation of these techniques is of course incomplete. In [13] it is shown that for affine probabilistic programs, a superclass of our CP programs, the existence of a linear ranking supermartingale is decidable. However, the existence of a linear ranking supermartingale is sufficient but not necessary for PAST or an at most linear expected runtime.

Classes of programs where termination is decidable have already been studied for deterministic programs. In [35] it was shown that for a class of linear loop programs over the reals, the halting problem is decidable. This result was transferred to the rationals [5] and under certain conditions to integer programs [5, 18, 31]. Termination analysis for probabilistic programs is substantially harder than for non-probabilistic ones [23]. Nevertheless, there is some previous work on classes of probabilistic programs where termination is decidable and asymptotic bounds on the expected runtime are computable. For instance, in [6] it was shown that AST is decidable for certain stochastic games and [11] presents an automatic approach for inferring asymptotic upper bounds on the expected runtime by considering uni- and bivariate recurrence equations.

[Program: a bounded simple random walk]

However, our algorithm is the first which computes a general formula (i.e., a closed form) for the exact expected runtime of arbitrary CP programs. To our knowledge, up to now such a formula was only known for the very restricted special case of bounded simple random walks (cf. [16]), i.e., programs of the form on the right for some \(1 {\ge } p {\ge } 0\) and some \(b \in \mathbb {Z}\). Note that due to the two boundary conditions \(x > 0\) and \(b > x\), the resulting recurrence equation for the expected runtime of the program only has a single solution \(f: \mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) that also satisfies \(f(0) = 0\) and \(f(b) = 0\). Hence, standard techniques for solving recurrence equations suffice to compute this solution. In contrast, we developed an algorithm to compute the exact expected runtime of unbounded arbitrary CP programs where the loop condition only has one boundary condition \(x > 0\), i.e., x can grow infinitely large. For that reason, here the challenge is to find an algorithm which computes the smallest solution \(f: \mathbb {Z}\rightarrow \mathbb {R}_{\ge 0}\) of the resulting recurrence equation. We showed that this can be done using the information on the asymptotic bounds of the expected runtime from Sects. 3 and 4.

Future Work. There are several directions for future work. In Sect. 4.1 we reduced CP programs to random walk programs. In future work, we will consider more advanced reductions in order to extend the class of probabilistic programs where termination and complexity are decidable. Moreover, we want to develop techniques to automatically over- or under-approximate the runtime of a program \(\mathcal {P}\) by the runtimes of corresponding CP programs \(\mathcal {P}_1\) and \(\mathcal {P}_2\) such that \(rt^{\mathcal {P}_1}_{\varvec{x}} \le rt^\mathcal {P}_{\varvec{x}} \le rt^{\mathcal {P}_2}_{\varvec{x}}\) holds for all \(\varvec{x} \in \mathbb {Z}^{r}\). Furthermore, we will integrate the easy inference of runtime bounds for CP programs into existing techniques for analyzing more general probabilistic programs.