1 Introduction

Probabilistic programs [28] are imperative programs with the ability to toss a (possibly biased) coin and to continue their execution depending on the outcome of the coin toss. Whereas an ordinary program maps inputs to outputs, a probabilistic program maps inputs to a (posterior) distribution over outputs. Probabilistic programs are used to describe randomized algorithms and Bayesian networks [17]. Other application areas include, amongst others, machine learning, systems biology, security [3], planning and control, quantum computing [44], and software-defined networks [15]. This paper focuses on the computational hardness of the following program analyses: two notions of probabilistic termination, approximating preexpectations [29, 31], and determining covariances of preexpectations.

Program termination Probabilistic programs are normal-looking programs, but reasoning about their correctness is intricate. The key property of program termination exemplifies this. Whereas a classical program either certainly terminates on a given input or not, this is no longer true for probabilistic programs. The program

figure a

Footnote 1 computes a geometric distribution with parameter \(\nicefrac {1}{3}\), as the probability that i equals \(N > 0\) on termination is \(\left( \nicefrac {2}{3}\right) ^{(N{-}1)}\cdot \nicefrac {1}{3}\). This program does not always terminate as it admits an infinite run in which the variable c stays one forever. The probability of this single non-terminating run, however, is zero. The program thus does not terminate certainly (meaning that every single one of its runs terminates), but it terminates with probability one. This is referred to as almost-sure termination [41].
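A program with exactly this behavior can, for instance, be written as follows (in the guarded-command syntax formally introduced in Definition 1):

    i := 0; c := 1;
    while (c = 1) {
        i := i + 1;
        {c := 0} [1/3] {c := 1}
    }

Each loop iteration first increments i and then sets c to 0 with probability \(\nicefrac {1}{3}\), so the loop body is executed exactly N times with probability \(\left( \nicefrac {2}{3}\right) ^{(N{-}1)}\cdot \nicefrac {1}{3}\).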

If a classical, deterministic program terminates, it reaches a unique final state in finitely many steps. This is not always true for probabilistic programs: different final states may each be reached with some probability and after different numbers of steps. For probabilistic programs, we are thus interested in the expected (or average) number of steps required until termination. As the expected value of a geometric distribution is finite, the above program terminates almost surely and needs, on average, finitely many steps to do so.
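For the program above this can be made explicit: the expected number of loop iterations is the mean of a geometric distribution with parameter \(\nicefrac {1}{3}\),

$$\begin{aligned} \sum _{N = 1}^{\infty } N \cdot \left( \nicefrac {2}{3}\right) ^{(N{-}1)}\cdot \nicefrac {1}{3} ~=~ 3~, \end{aligned}$$

and since every iteration takes a constant number of computation steps, the expected runtime is finite as well.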

Probabilistic programs that terminate in expected finitely many steps are called positively almost-surely terminating—a terminology introduced by Bournez and Garnier [5]. Their inspiration for the name “positive” comes from Markov chain theory, more specifically from the distinction between positive recurrent states (a state is revisited with probability one and the expected time until a revisit is finite) and null recurrent states (a state is revisited with probability one but the expected time to revisit is infinite) [39, Section A.2, p. 588].

The difference between almost-sure termination and positive almost-sure termination is highlighted by the following program:

figure b
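The figure can, for instance, be realized by the program

    x := 10;
    while (x > 0) {
        {x := x - 1} [1/2] {x := x + 1}
    }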

This program describes a left-bounded one-dimensional random walk where the particle starts at position 10. With probability \(\nicefrac {1}{2}\), the variable x is decremented by one (the particle moves to the left); with the same likelihood x is incremented by one (the particle moves to the right). This program terminates almost-surely [41], but requires, on average, infinitely many computation steps to do so [23, Section 7.1]. It is thus not positively almost-surely terminating.

These two sample programs show that there are several nuances when considering termination of probabilistic programs. Reasoning about positive almost-sure termination is not trivial. For instance, positive almost-sure termination is not preserved under sequential composition. Consider program P, given by

figure c

which is positively almost-surely terminating for the same reason as the geometric distribution program. The simple program Q:

figure d

terminates certainly on any input, and does so in a finite number of steps. The program P; Q, however, is not positively almost-surely terminating, as it requires on average an infinite number of steps until termination. This is due to the fact that the variable x grows exponentially during P’s execution, the probability of reaching a large value of x decays only at the same exponential rate, and program Q requires k steps when x equals k.
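To make this concrete, P and Q could, for instance, read

    x := 1; c := 1;
    while (c = 1) {
        {c := 0} [1/2] {x := 2 * x}
    }

for P and

    while (x > 0) {
        x := x - 1
    }

for Q. With these instances, P terminates leaving \(x = 2^k\) with probability \(\nicefrac {1}{2^{k+1}}\), and Q then needs \(2^k\) loop iterations, so the expected runtime of P; Q is bounded from below by

$$\begin{aligned} \sum _{k = 0}^{\infty } \frac{1}{2^{k+1}} \cdot 2^k ~=~ \sum _{k = 0}^{\infty } \frac{1}{2} ~=~ \infty ~. \end{aligned}$$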

Computational hardness of program termination Almost-sure termination is an active field of research [9, 14] with seminal papers by Hart et al.  [19, 41] going back to 1983. A lot of work has been done towards automated reasoning for almost-sure termination. For instance, [42] gives an overview of some particularly interesting examples of probabilistic logic programs and the corresponding intuition for proving almost-sure termination. Arons et al. [1] reduce almost-sure termination to termination of nondeterministic programs by means of a planner. This idea has been further exploited and refined into a pattern-based approach with prototypical tool support [13]. Chakarov and Sankaranarayanan use ranking supermartingales to reason about positive almost-sure termination [7]. A related approach for reasoning about expected runtimes is presented in [24]. A technique for synthesizing ranking supermartingales is presented by Chatterjee et al. [8]. Dal Lago and Grellois have presented a completely different approach for proving almost-sure termination by means of a type system [30].

Despite the existence of several (sometimes automated) approaches to tackle almost-sure termination, most authors claim that it must intuitively be more difficult than the termination problem for ordinary programs. Esparza et al. [13], for instance, claim that almost-sure termination is harder to decide than ordinary termination, since for the latter a topological argument suffices while for the former arithmetical reasoning is needed. Hence, one cannot use standard termination provers that have been developed for ordinary programs.

There are some results on computational hardness in connection with probabilistic programs, like non-recursive-enumerability results for probabilistic rewriting logic [6] and decidability results for restricted probabilistic programming languages [34]. The computational hardness of almost-sure termination has, however, received scant attention. As a notable exception, [43] establishes that deciding almost-sure termination of certain concurrent probabilistic programs is in \(\varPi _2^0\), the second level of the arithmetical hierarchy.

In this paper, we study the computational hardness of almost-sure termination and positive almost-sure termination. We obtain that deciding almost-sure termination of probabilistic program P on a given input \(\sigma \) is \(\varPi _2^0\)-complete. While for ordinary programs we have a complexity leap when moving from the non-universal to the universal halting problem, we establish that this is not the case for probabilistic programs: Deciding universal almost-sure termination turns out to be \(\varPi _2^0\)-complete too. Our hardness results are established by reductions from the universal halting problem for ordinary, i.e. non-probabilistic, programs. The case for positive almost-sure termination is different, however: While deciding (non-universal) positive almost-sure termination is \(\varSigma _2^0\)-complete, we show that universal positive almost-sure termination is \(\varPi _3^0\)-complete.

Preexpectations Establishing the correctness of probabilistic programs requires formal reasoning, due to their intricate behavior even more so than for ordinary programs. Weakest-precondition calculi à la Dijkstra [12] provide an important apparatus to enable such reasoning. To develop such calculi, one takes into account that, due to its random nature, the final state of a program on termination is not unique. Thus, rather than a mapping from inputs to outputs (as in Dijkstra’s approach), probabilistic programs map an initial state to a distribution on possible final states. More precisely, we have sub-distributions where the “missing” probability mass represents the likelihood of divergence. Given a random variable f and an initial distribution, a key issue now is to determine f’s expected value on the probabilistic program’s termination. Kozen’s seminal work on a probabilistic variant of propositional dynamic logic [29] focuses on this. McIver and Morgan [31] extended Kozen’s approach with (amongst others) nondeterminism. In their jargon, f is called an expectation and—à la Dijkstra—pre- and postexpectations are the quantitative counterparts of pre- and postconditions. To illustrate preexpectations, consider the program:

figure e

that flips a pair of coins, one being fair and one being biased. The preexpectation of the random variable \(x + y\) (i.e. the expected value of \(x + y\) after termination of the above program) equals \(\nicefrac {1}{2} \cdot 0 + \nicefrac {1}{2} \cdot 1 + \nicefrac {1}{3} \cdot 0 + \nicefrac {2}{3} \cdot 1 = \nicefrac {7}{6}\). We can also capture probabilities of events, using indicator functions. For instance, we denote by \([x+y=0]\) the indicator function of the event that x plus y equals 0. The preexpectation of the random variable \([x+y=0]\) equals \(\nicefrac {1}{6}\), i.e. the probability that after termination x plus y equals 0 is \(\nicefrac {1}{6}\). As another example, the preexpectation of \([x = y]\) equals \(\nicefrac {1}{2}\). As with ordinary programs, the difficult part of the program analysis is to deal with loops; for example by finding appropriate loop invariants [25].
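In the syntax formally introduced in Definition 1, this coin-flipping program can, for instance, be written as

    {x := 0} [1/2] {x := 1};
    {y := 0} [1/3] {y := 1}

where the first probabilistic choice models the fair coin and the second one the biased coin; the preexpectations computed above can be verified directly against this rendering.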

Computational hardness of computing preexpectations and covariances This paper deals with establishing the computational hardness of approximating preexpectations of probabilistic programs. Morgan [33] remarks that while partial correctness for small-scale examples is not harder to prove than for ordinary programs, the case for total correctness of a probabilistic loop must be harder to analyze. This paper gives precise classifications of the level of arithmetical reasoning that is needed to compute preexpectations by establishing the following results: We first show that computing lower bounds on the expected value of a random variable f after executing a probabilistic program P on a given input \(\sigma \) is \(\varSigma _1^0\)-complete and therefore arbitrarily close approximations from below are computably enumerable. Computing upper bounds, on the other hand, is shown to be \(\varSigma _2^0\)-complete, thus arbitrarily close approximations from above are not computably enumerable in general. We also show that deciding whether the expected value is at all finite is \(\varSigma _2^0\)-complete. Deciding whether an expected value equals some rational is shown to be \(\varPi _2^0\)-complete.

We complement these results by also considering bounds on (co)variances of probabilistic programs. We show that obtaining such bounds is computationally more difficult than for expected values. In particular, we prove that computing upper bounds and computing lower bounds for (co)variances of preexpectations are both \(\varSigma _2^0\)-complete, and thus neither problem is computably enumerable. We also show that determining the precise values of (co)variances is in \(\varDelta _3^0\) and both \(\varSigma ^0_2\)- and \(\varPi ^0_2\)-hard. The covariance problem is thus a problem that lies “properly” in \(\varDelta _3^0\). These results rule out analysis techniques based on finite loop-unrollings as complete approaches for reasoning about covariances.

Organization of the paper Section 2 introduces probabilistic programs and computational and arithmetical complexity. Section 3 formally defines the various decision problems for preexpectations and determines their hardness. We treat preexpectations before termination, since our hardness results on termination make use of the results on preexpectations. Section 4 defines (positive) almost-sure termination and presents detailed proofs of their computational hardness. Our hardness results on probabilistic termination are then applied in Sect. 5 to study the hardness of determining covariances. Section 6 concludes the paper. This paper is based on the conference papers [21] and [22], provides a unified treatment of these results, sharpens some of the results, and presents detailed proofs.

2 Preliminaries

2.1 Syntax and semantics of probabilistic programs

Our development builds upon a simple programming language called probabilistic guarded command language (pGCL) [28, 31]—a variant of Dijkstra’s guarded command language [12] endowed with probabilistic choice constructs. Its syntax is given as follows:

Definition 1

(Syntax of pGCL [28, 31]) Let \({\textsf {Var}} \) be the (countable) set of program variables. The set \(\textsf {Prog}\) of pGCL programs adheres to the grammar

$$\begin{aligned} {\textsf {Prog}} ~~\longrightarrow ~~&{\textsf {skip}}~|~ {\textsf {diverge}}~|~ {v} :={e} ~|~ {{\textsf {Prog}}} ;~ {{\textsf {Prog}}} ~|~ {\textsf {if}}\, \left( {b} \right) \, \left\{ {{\textsf {Prog}}} \right\} \, {\textsf {else}}\, \left\{ {{\textsf {Prog}}} \right\} \\&\qquad \quad ~|~ {\textsf {while}}\, \left( {b} \right) \, \left\{ {{\textsf {Prog}}} \right\} ~|~ \left\{ {{\textsf {Prog}}} \right\} \mathrel {\left[ {p} \right] } \left\{ {{\textsf {Prog}}} \right\} ~, \end{aligned}$$

where \(v \in {\textsf {Var}} \), e is an arithmetical expression over \({\textsf {Var}}\), \(p \in [0,\, 1] \cap \mathbb {Q}\), and b is a Boolean expression over arithmetical expressions over \({\textsf {Var}} \). We call the set of programs that do not contain any probabilistic choices the set of ordinary programs and denote this set by ordProg. \(\triangle \)

Let us briefly go over the language constructs: \({\textsf {skip}}\) does nothing. \({\textsf {diverge}}\) is a program that certainly enters an infinite loop.Footnote 2 \({v} :={e}\) assigns the value of expression e evaluated in the current variable valuation to variable v. \({P_1} ;~ {P_2}\) is the sequential composition of \(P_1\) and \(P_2\). \({\textsf {if}}\, \left( {b} \right) \, \left\{ {P_1} \right\} \, {\textsf {else}}\, \left\{ {P_2} \right\} \) is a conditional construct guarded by a Boolean expression b. \({\textsf {while}}\, \left( {b} \right) \, \left\{ {P'} \right\} \) is a loop construct also guarded by a Boolean expression b. \(\left\{ {P_1} \right\} \mathrel {\left[ {p} \right] } \left\{ {P_2} \right\} \) denotes a probabilistic choice between the program \(P_1\) (which is executed with probability p) and the program \(P_2\) (which is executed with probability \(1-p\)). A simple operational semantics for pGCL is given in the following. We first define what a program configuration is.

Definition 2

(Program States and Configurations) Let

$$\begin{aligned} \mathbb {S}= \{\sigma ~|~ \sigma :V \rightarrow \mathbb {Q},~ V \subset {\textsf {Var}},~ V \text { finite}\}~. \end{aligned}$$

be the set of program states. Notice that \(\mathbb {S}\) is countable since the domain V of any state \(\sigma :V \rightarrow \mathbb {Q}\) is finite. A state \(\sigma \) is considered a valid input for a given program P, iff all program variables that occur in P are in the domain of \(\sigma \). Notice that the domain of \(\sigma \) may contain more variables than occur in P. For each program \(P \in \textsf {Prog}\), let

$$\begin{aligned} \mathbb {S}_P = \{\sigma \in \mathbb {S}~|~ \sigma \text { is a valid input for } P \} \end{aligned}$$

denote the set of valid inputs for P.Footnote 3

For \(\sigma : V \rightarrow \mathbb {Q}\), we denote by \(\sigma [x \mapsto v]\) the state

$$\begin{aligned} \sigma [x \mapsto v] :\quad V \cup \{x\} \rightarrow \mathbb {Q},~ y \mapsto {\left\{ \begin{array}{ll} \sigma (y), &{}\text {if } x \ne y\\ v, &{}\text {if } x = y~, \end{array}\right. } \end{aligned}$$

i.e. we enlarge the domain of \(\sigma \) if necessary and let variable x evaluate to value v.

Let \(\textsf {Prog}_{\downarrow } = {\textsf {Prog}} \cup \{{\downarrow }\} \cup \{{\downarrow };\, P ~|~ P \in {\textsf {Prog}} \}\). For \(P \in \textsf {Prog}_{\downarrow }\), we define \(\mathbb {S}_P\) analogously to the case for \(P \in \textsf {Prog}\). The set of program configurations (or simply configurations) is given by

$$\begin{aligned} \mathbb {K}~=~ \{\langle P,\, \sigma ,\, a,\, \Theta \rangle \in \textsf {Prog}_{\downarrow } \times \mathbb {S}\times \big ([0,\, 1] \cap \mathbb {Q}\big ) \times \{L,\, R\}^* ~|~ \sigma \in \mathbb {S}_P\}~. \end{aligned}$$

Notice that the configurations in \(\mathbb {K}\) couple only those states \(\sigma \) with programs P, where \(\sigma \) is a valid input for P. Notice furthermore that \(\mathbb {K}\) is countable since it is a Cartesian product of countable sets. The last two components of a configuration are used for bookkeeping of probabilistic choices as explained later. A configuration is a terminal configuration if it is of the form \(\langle {\downarrow },\, \sigma ,\, a,\, \Theta \rangle \). For a program \(P \in {\textsf {Prog}} \) and an initial state \(\sigma \in \mathbb {S}_P\), we denote by \(\gamma _{P,\sigma }\) the initial configuration \(\langle P,\, \sigma ,\, 1,\, \varepsilon \rangle \). \(\triangle \)

Using the notion of configurations, we can now define our operational semantics in terms of a configuration transition relation:

Definition 3

(Operational Semantics of pGCL) Let \(\llbracket e \rrbracket _\sigma \in \mathbb {Q}\) and \(\llbracket b \rrbracket _\sigma \in \{\textsf {true},\, \textsf {false}\}\) be the evaluation of an arithmetical expression e and a Boolean expression b in state \(\sigma \), respectively. Then the semantics of pGCL is given by the least relation \({\vdash } \subseteq \mathbb {K}\times \mathbb {K}\) satisfying the following inference rules:

$$\begin{aligned} (\text {skip})~&\frac{}{\langle {\textsf {skip}},\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle {\downarrow },\, \sigma ,\, a,\, \theta \rangle }\\ (\text {diverge})~&\frac{}{\langle {\textsf {diverge}},\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle {\textsf {diverge}},\, \sigma ,\, a,\, \theta \rangle }\\ (\text {assign})~&\frac{}{\langle v\text {\texttt { := }} e,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle {\downarrow },\, \sigma [v \mapsto \llbracket e \rrbracket _\sigma ],\, a,\, \theta \rangle }\\ (\text {seq-1})~&\frac{\langle P_1,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle P_1',\, \sigma ',\, a',\, \theta '\rangle }{\langle P_1\text {\texttt {;}}\,P_2,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle P_1'\text {\texttt {;}}\, P_2,\, \sigma ',\, a',\, \theta '\rangle }\\ (\text {seq-2})~&\frac{}{\langle {\downarrow }\text {\texttt {;}}\,P_2,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle P_2,\, \sigma ,\, a,\, \theta \rangle }\\ (\text {if-t})~&\frac{\llbracket b \rrbracket _\sigma = \textsf {true}}{\langle {\textsf {if}}\, \left( {b} \right) \, \left\{ {P_1} \right\} \, {\textsf {else}}\, \left\{ {P_2} \right\} ,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle P_1,\, \sigma ,\, a,\, \theta \rangle }\\ (\text {if-f})~&\frac{\llbracket b \rrbracket _\sigma = \textsf {false}}{\langle {\textsf {if}}\, \left( {b} \right) \, \left\{ {P_1} \right\} \, {\textsf {else}}\, \left\{ {P_2} \right\} ,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle P_2,\, \sigma ,\, a,\, \theta \rangle }\\ (\text {while-t})~&\frac{\llbracket b \rrbracket _\sigma = \textsf {true}}{\langle {\textsf {while}}\, \left( {b} \right) \, \left\{ {P'} \right\} ,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle {P'} ;~ {{\textsf {while}}\, \left( {b} \right) \, \left\{ {P'} \right\} },\, \sigma ,\, a,\, \theta \rangle }\\ (\text {while-f})~&\frac{\llbracket b \rrbracket _\sigma = \textsf {false}}{\langle {\textsf {while}}\, \left( {b} \right) \, \left\{ {P'} \right\} ,\, \sigma ,\, a,\, \theta \rangle ~\vdash ~ \langle {\downarrow },\, \sigma ,\, a,\, \theta \rangle }\\ (\text {prob-L})~&\frac{}{\big \langle \{P_1\}\,[p]\,\{P_2\},\, \sigma ,\, a,\, \theta \big \rangle ~\vdash ~ \langle P_1,\, \sigma ,\, a \cdot p,\, \theta \cdot L \rangle }\\ (\text {prob-R})~&\frac{}{\big \langle \{P_1\}\,[p]\,\{P_2\},\, \sigma ,\, a,\, \theta \big \rangle ~\vdash ~ \langle P_2,\, \sigma ,\, a \cdot (1 - p),\, \theta \cdot R\rangle }~, \end{aligned}$$

where

$$\begin{aligned} \sigma [v \mapsto \llbracket e \rrbracket _\sigma ]\left( v'\right) ~=~ {\left\{ \begin{array}{ll} \llbracket e \rrbracket _\sigma , &{}\text {if } v' = v, \quad \text {and}\\ \sigma \left( v'\right) , &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

We use \(\gamma \vdash ^k \gamma '\) in the usual sense, i.e. there exist \(k-1\) configurations \(\gamma _1,\dots ,\gamma _{k-1} \in \mathbb {K}\), such that \(\gamma \vdash \gamma _1 \vdash \cdots \vdash \gamma _{k-1} \vdash \gamma '\). \(\triangle \)

The semantics given by the \(\vdash \)-relation is mostly a straightforward operational semantics except for two features: in addition to the program that is to be executed next and the current variable valuation, each configuration also stores a sequence \(\theta \) over the alphabet \(\{L,\, R\}\) that encodes a history of the branches (Left or Right) that were chosen at the probabilistic choices as well as the probability a that all those branches were chosen. It can easily be seen that the graph that is spanned by the \(\vdash \)-relation is just an unfolding of the Markov decision process semantics for pGCL provided in [18]. An equivalent semantics has already been provided in Kozen’s seminal work on probabilistic propositional dynamic logic (PPDL), where each pGCL construct is shown to correspond to a PPDL formula [29].

It is a well-known result due to Kleene that for any ordinary program \(Q \in {\mathsf {ordProg}} \) and an associated configuration \(\gamma = \langle Q,\, \sigma ,\, a,\, \theta \rangle \) the k-th successor with respect to \(\vdash \) is unique and computable. If, however, program P contains probabilistic choices, the k-th successor of a configuration is not necessarily unique, because at various points of the execution the program must choose a left or a right branch with some probability. However, if we resolve those choices by providing a sequence w over the alphabet \(\{L,\, R\}\) that encodes, for each probabilistic choice that occurs, whether the Left or the Right branch shall be chosen at the branching point, we can construct a computable function that returns a unique k-th successor with respect to the sequence w. Notice that for this purpose a sequence of finite length is sufficient. We obtain the following:

Proposition 1

(The Successor Configuration Function) Let \(\mathbb {K}_\bot = \mathbb {K}\cup \{\bot \}\). There exists a computable total function \( T :\mathbb {N}\times \mathbb {K}_\bot \times \{L,\, R\}^* \rightarrow \mathbb {K}_\bot \), such that

$$\begin{aligned} T _0(\gamma ,\, w)&= {\left\{ \begin{array}{ll} \gamma , &{}\mathrm{if }~w = \varepsilon ,\\ \bot , &{}\mathrm{otherwise,} \end{array}\right. }\\ T _{k+1}(\gamma ,\, w)&= {\left\{ \begin{array}{ll} T_{k}(\gamma ',\, w'), &{}\mathrm{if }~\gamma = \langle P,\, \sigma ,\, a,\, \theta \big \rangle \vdash \langle P',\, \sigma ',\, a',\, \theta \cdot b\big \rangle = \gamma ',\\ &{} ~~~ \mathrm{with }~w = b\cdot w'~\mathrm{and }~ b \in \{L,\, R,\, \varepsilon \},\\ \bot , &{}\mathrm{otherwise.} \end{array}\right. } \end{aligned}$$

Hence \( T _k(\gamma ,\, w)\) returns a successor configuration \(\gamma ' \ne \bot \), if \(\gamma \vdash ^k \gamma '\) and the inference rules for probabilistic choices (prob-L) and (prob-R) have been applied exactly |w| times to resolve all probabilistic choices in \(\gamma \) according to w. Otherwise \( T _k(\gamma ,\, w)\) returns \(\bot \). Note in particular that \( T _k(\gamma ,\, w)\) results in \(\bot \) both if a terminal configuration \(\langle {\downarrow },\, \sigma ,\, a,\, \theta \rangle \) is inferred within fewer than k steps and if a terminal configuration is inferred through fewer or more than |w| probabilistic choices.
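As a small example, consider the program \(P = \left\{ {x} :={0} \right\} \mathrel {\left[ {\nicefrac {1}{2}} \right] } \left\{ {x} :={1} \right\} \) and any \(\sigma \in \mathbb {S}_P\). Applying rule (prob-L) and then rule (assign) yields

$$\begin{aligned} T _2(\gamma _{P,\sigma },\, L) ~=~ \langle {\downarrow },\, \sigma [x \mapsto 0],\, \nicefrac {1}{2},\, L\rangle ~, \end{aligned}$$

whereas \( T _2(\gamma _{P,\sigma },\, \varepsilon ) = \bot \) and \( T _2(\gamma _{P,\sigma },\, LL) = \bot \), since the empty word resolves no probabilistic choice and LL provides one resolution too many.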

2.2 Computational and arithmetical complexity

Our hardness results will be stated in terms of levels in the arithmetical hierarchy—a notion for classifying sets according to the complexity required to define them in the language of first-order Peano arithmetic.Footnote 4 We first briefly recall this concept:

Definition 4

(Arithmetical Hierarchy [26, 35]) The class \(\varSigma _n^0\) is defined asFootnote 5

$$\begin{aligned} \varSigma _n^0 =&\big \{ \mathcal {A} ~\big |~ \mathcal {A} = \big \{ x ~\big |~ \exists y_1\, \forall y_2\, \exists y_3\, \cdots \, \exists /\forall y_n:~ ( x,\, y_1,\, y_2,\, y_3,\, \ldots ,\, y_n) \in \mathcal {R}\big \},\, \\&\mathcal {R} \text { is a decidable relation}\big \}~, \end{aligned}$$

the class \(\varPi _n^0\) is defined as

$$\begin{aligned} \varPi _n^0 =&\big \{ \mathcal {A} ~\big |~ \mathcal {A} = \big \{ x ~\big |~ \forall y_1\, \exists y_2\, \forall y_3\, \cdots \, \exists /\forall y_n:~ ( x,\, y_1,\, y_2,\, y_3,\, \ldots ,\, y_n) \in \mathcal {R}\big \},\, \\ {}&\mathcal {R} \text { is a decidable relation}\big \}~, \end{aligned}$$

and the class \(\varDelta _n^0\) is defined as \(\varDelta _n^0 = \varSigma _n^0 \cap \varPi _n^0\), for every \(n \in \mathbb {N}\). Multiple consecutive quantifiers of the same type can be contracted into one quantifier of that type, so the number n refers to the number of necessary quantifier alternations rather than to the actual number of quantifiers occurring in a defining formula. A set \(\mathcal {A}\) is called arithmetical, iff \(\mathcal {A} \in \varGamma _n^0\), for some \(\varGamma \in \{\varSigma ,\, \varPi ,\, \varDelta \}\) and \(n \in \mathbb N\). The inclusions \(\varSigma _n^0 \cup \varPi _n^0 \subseteq \varDelta _{n+1}^0\) hold for every n, and since \(\varSigma _n^0 \ne \varPi _n^0\) holds for every \(n \ge 1\), the arithmetical sets form a strict hierarchy. Furthermore, note that \(\varSigma _0^0 = \varPi _0^0 = \varDelta _0^0 = \varDelta _1^0\) is the class of the decidable sets and \(\varSigma _1^0\) is the class of the computably enumerable sets. \(\triangle \)

The notion of the arithmetical hierarchy is of great utility: besides establishing a strong connection between computability and logic, stating precisely at which level in the arithmetical hierarchy a decision problem must be placed amounts to giving a measure of just “how unsolvable” the decision problem is [10].

In order to establish results of the form “problem \(\mathcal {A}\) is at least as hard to solve as problem \(\mathcal {B}\)”, we make use of two notions called many-one reducibility and many-one completeness.

Definition 5

(Many-One Reducibility and Completeness [11, 35, 38]) Let \(\mathcal {A}\) and \(\mathcal {B}\) be arithmetical sets and let X and Y be some appropriate universes such that \(\mathcal {A} \subseteq X\) and \(\mathcal {B} \subseteq Y\). \(\mathcal {A}\) is called many-one reducible (or simply reducible) to \(\mathcal {B}\), denoted

$$\begin{aligned} \mathcal {A} ~\mathrel {\,{\le }_{\text {m}}}~ \mathcal {B}~, \end{aligned}$$

iff there exists a computable function \(f:X \rightarrow Y\), such that

$$\begin{aligned} \forall \, {x} \in X:~~x \in \mathcal {A} ~\Longleftrightarrow ~ f( x) \in \mathcal {B}~. \end{aligned}$$

If f is a function such that f reduces \(\mathcal {A}\) to \(\mathcal {B}\), we denote this by \(f:\mathcal {A} \mathrel {\,{\le }_{\text {m}}}\mathcal {B}\). Note that \(\mathrel {\,{\le }_{\text {m}}}\) is transitive.

Given \(\varGamma \in \{\varSigma ,\, \varPi ,\, \varDelta \}\), a set \(\mathcal {A}\) is called many-one complete for\(\varvec{\varGamma _n^0}\) (or simply \(\varvec{\varGamma _n^0}\)-complete) iff both \(\mathcal {A}\) is a member of \(\varGamma _n^0\) and \(\mathcal {A}\) is \(\varvec{\varGamma _n^0}\)-hard, meaning \(\mathcal {C} \mathrel {\,{\le }_{\text {m}}}\mathcal {A}\), for any set \(\mathcal {C} \in \varGamma _n^0\). Note that if \(\mathcal {A}\) is \(\varGamma _n^0\)-complete and \(\mathcal {A} \mathrel {\,{\le }_{\text {m}}}\mathcal {B}\), then \(\mathcal {B}\) is necessarily \(\varGamma _n^0\)-hard. Furthermore, note that if \(\mathcal {A}\) is \(\varSigma _n^0\)-complete, then \(\mathcal {A} \in \varSigma _n^0\setminus \varPi _n^0\). Analogously if \(\mathcal {A}\) is \(\varPi _n^0\)-complete, then \(\mathcal {A} \in \varPi _n^0\setminus \varSigma _n^0\). \(\triangle \)

Many well-known and natural problems are complete for some level of the arithmetical hierarchy. Arguably one of the most prominent problems is the halting problem for ordinary programs:

Theorem 1

(The Halting Problem [36]) The halting problem is the problem of whether an ordinary program terminates on a given valid input. The corresponding problem set \(\mathcal {H} \subset {\mathsf {ordProg}} \times \mathbb {S}\) is defined as

$$\begin{aligned} (P,\, \sigma ) \in \mathcal {H} \quad \mathrm{iff}\quad \sigma \in \mathbb {S}_P~\mathrm{and }~\exists \, k \in \mathbb {N}~ \exists \, \sigma ' \in \mathbb {S}: T _k(\gamma _{P,\sigma },\, \varepsilon ) = \langle {\downarrow },\, \sigma ',\, 1,\, \varepsilon \rangle ~. \end{aligned}$$

The complement of the halting problem is the problem whether a program does not terminate on a given valid input. It is given by

$$\begin{aligned} (P,\, \sigma ) \in \overline{{\small {\mathcal {H}}}} \quad \mathrm{iff}\quad \sigma \in \mathbb {S}_P~\mathrm{and }~(P,\, \sigma ) \not \in \mathcal {H} ~. \end{aligned}$$

\(\mathcal {H} \) is \(\varSigma _1^0\)-complete and \(\overline{{\small {\mathcal {H}}}} \) is \(\varPi _1^0\)-complete.

The halting problem is the problem of whether a given program terminates on a given valid input. Its universal version is the problem of whether a given program terminates on all valid inputs.

Theorem 2

(The Universal Halting Problem [36]) The universal halting problem is the problem of whether an ordinary program terminates on all possible valid inputs. The corresponding problem set \(\,\mathcal {UH} \subset {\mathsf {ordProg}} \) is defined as

$$\begin{aligned} P \in \,\mathcal {UH} \quad \mathrm{iff} \quad \forall \,\sigma \in \mathbb {S}_P :(P,\, \sigma ) \in \mathcal {H} ~. \end{aligned}$$

The complement of\(\,\mathcal {UH} \) is given by \(\,\overline{{\small {\mathcal {UH}}}} = {\mathsf {ordProg}} \setminus \,\mathcal {UH} \).

\(\,\mathcal {UH} \) is \(\varPi _2^0\)-complete and \(\,\overline{{\small {\mathcal {UH}}}} \) is \(\varSigma _2^0\)-complete.

We observe that we have a complexity jump from \(\mathcal {H} \) to \(\,\mathcal {UH} \), namely from \(\varSigma _1^0\) to \(\varPi _2^0\), i.e. a jump one level up and to the “other side” of the hierarchy.

Since we go as high as the third level of the arithmetical hierarchy, we introduce a third problem, which is \(\varSigma _3^0\)-complete: the problem of whether the set of valid inputs on which a given ordinary program does not terminate is finite.

Theorem 3

(The Cofiniteness Problem [36]) The cofiniteness problem is the problem of deciding whether the set of valid inputs on which an ordinary program P terminates is cofinite.Footnote 6 The corresponding problem set \(\mathcal {COF} \subset {\mathsf {ordProg}} \) is given byFootnote 7

$$\begin{aligned} P \in \mathcal {COF} \quad \mathrm{iff}\quad \{\sigma \in \mathbb {S}_P ~|~ (P,\, \sigma ) \in \mathcal {H} \}~\mathrm{is~cofinite}. \end{aligned}$$

Let \(\overline{{\small {\mathcal {COF}}}} = {\mathsf {ordProg}} \setminus \mathcal {COF} \) denote the complement of\(\mathcal {COF} \).

\(\mathcal {COF} \) is \(\varSigma _3^0\)-complete and \(\overline{{\small {\mathcal {COF}}}} \) is \(\varPi _3^0\)-complete.

3 The hardness of approximating preexpectations

In this section we investigate the computational hardness of approximating expected values of random variables assigning numbers to program states. For such random variables we want to know the expected value after program execution. We work with a certain class of random variables commonly called expectations [31]:

Definition 6

(Expectations [20, 31]) The set of expectations is defined as

$$\begin{aligned} \mathbb {E}~=~ \left\{ f ~\Big |~ f:\mathbb {S}\rightarrow \mathbb {Q}_{\ge 0},~ f \text { computable} \right\} ~. \end{aligned}$$

Notice that \(\mathbb {E}\) is countable as there are only countably many computable functions. \(\triangle \)

Given a probabilistic program P, initial state \(\sigma \), and expectation f, we would like to answer the question:

“What is the expected value of f after termination of P on input \(\sigma \)?”

For this problem, f is referred to as a postexpectation and the aforementioned expected value of f is referred to as the preexpectation.Footnote 8 We have restricted ourselves to computable postexpectations since (a) this makes the set \(\mathbb {E}\) countable and (b) there is no hope of determining the value of a non-computable postexpectation in a final state.

In order to express preexpectations in our formalism, we need—in addition to the successor configuration function \( T \)—another computable operation:

Proposition 2

There exists a total computable function \(\wp :\mathbb {K}_\bot \times \mathbb {E}\rightarrow \mathbb {Q}_{\ge 0}\), with

$$\begin{aligned} \wp (\gamma ,\, f) ~~{=}~{\left\{ \begin{array}{ll} f(\sigma ) \cdot a,&{}\mathrm{if }~\gamma = \langle {\downarrow },\, \sigma ,\, a,\, \underline{~\,}\,\rangle ,\\ 0,&{}\mathrm{otherwise,} \end{array}\right. } \end{aligned}$$

where \(\underline{~\,}\) represents an arbitrary value in \(\{L,\, R\}^*\).

\(\wp \) takes a configuration \(\gamma \) and an expectation f and returns the probability of reaching \(\gamma \) multiplied by the value of f in the configuration \(\gamma \) (this is where computability of f is needed). It does so only if the provided configuration \(\gamma \) is a terminal configuration. Otherwise it returns 0.

We can now define preexpectations using \( T \) and \(\wp \) as follows:

Definition 7

(Preexpectations) Let \(f \in \mathbb {E}\) and \(A^{{}\le k} = \bigcup _{i=0}^k A^i\) for a finite alphabet A. Then the preexpectation of \(\varvec{P}\) with respect to postexpectation \(\varvec{f}\) in initial state \(\varvec{\sigma \in \mathbb {S}_P}\) is given by

$$\begin{aligned} E _{P,\sigma }\left( {f}\right) ~=~ \sum _{k = 0}^{\infty } ~ \sum _{w \in \{L,\, R\}^{{}\le k}} ~ \wp \big ( T _{k}(\gamma _{P,\sigma },\, w),\, f \big )~. \end{aligned}$$

For later use, we also define the notation \( E _{P,\sigma }^{k}\left( {f}\right) \) for the k-th summand of the outer sum of \( E _{P,\sigma }\left( {f}\right) \):

$$\begin{aligned} E _{P,\sigma }^{k}\left( {f}\right) ~=~ \sum _{w \in \{L,\, R\}^{{}\le k}} ~ \wp \big ( T _{k}(\gamma _{P,\sigma },\, w),\, f \big )~. \end{aligned}$$

The preexpectation \( E _{P,\sigma }\left( {f}\right) \) as defined here coincides with the weakest preexpectation \( wp .P.f\,(\sigma )\) à la McIver and Morgan [31] for fully probabilistic programs, which in turn coincides with Kozen’s [29] eventuality operation \(\langle P \rangle f\) in probabilistic propositional dynamic logic. In the above definition of \( E _{P,\sigma }\left( {f}\right) \), we sum over all possible inference step lengths k and, for each k, over all probabilistic-choice-resolving sequences of length at most k. Using \(\wp \) we filter out the terminal configurations \(\gamma \) and sum up the values of \(\wp (\gamma ,\, f)\).
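As an example, consider again the coin flip \(P = \left\{ {x} :={0} \right\} \mathrel {\left[ {\nicefrac {1}{2}} \right] } \left\{ {x} :={1} \right\} \) with postexpectation \(f = x\): every terminal configuration is reached in exactly two steps using exactly one probabilistic choice, so only the summand for \(k = 2\) contributes and

$$\begin{aligned} E _{P,\sigma }\left( {x}\right) ~=~ \wp \big ( T _{2}(\gamma _{P,\sigma },\, L),\, x \big ) + \wp \big ( T _{2}(\gamma _{P,\sigma },\, R),\, x \big ) ~=~ 0 \cdot \nicefrac {1}{2} + 1 \cdot \nicefrac {1}{2} ~=~ \nicefrac {1}{2}~. \end{aligned}$$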

In order to investigate the complexity of approximating \( E _{P,\sigma }\left( {f}\right) \), we define three sets: \(\mathcal {LEXP}\), which relates to the set of rational lower bounds of \( E _{P,\sigma }\left( {f}\right) \), \(\mathcal {REXP}\), which relates to the set of rational upper bounds, and \(\mathcal {EXP}\) which relates to the value of \( E _{P,\sigma }\left( {f}\right) \) itself:

Definition 8

(Approximation Problems for Preexpectations) The problem sets \(\mathcal {LEXP},\mathcal {REXP},\mathcal {EXP}\subset {\textsf {Prog}} \times \mathbb {S}\times \mathbb {E}\times \mathbb {Q}_{\ge 0}\) are defined as

$$\begin{aligned} (P,\, \sigma ,\, f,\, q) \in \mathcal {LEXP}\quad&\text {iff} \quad \sigma \in \mathbb {S}_P \text { and } q < E _{P,\sigma }\left( {f}\right) , \\ (P,\, \sigma ,\, f,\, q) \in \mathcal {REXP}\quad&\text {iff} \quad \sigma \in \mathbb {S}_P \text { and } q > E _{P,\sigma }\left( {f}\right) , \quad \text {and} \\ (P,\, \sigma ,\, f,\, q) \in \mathcal {EXP}\quad&\text {iff} \quad \sigma \in \mathbb {S}_P \text { and } q = E _{P,\sigma }\left( {f}\right) . \end{aligned}$$

The computational hardness of approximating preexpectations coincides with the hardness of deciding these problem sets. The first hardness result we establish is the \(\varSigma _1^0\)-completeness of \(\mathcal {LEXP}\). For that, we show that \(\mathcal {LEXP}\) is a \(\varSigma _1^0\) problem and then show by a reduction from the (non-universal) halting problem for ordinary programs that \(\mathcal {LEXP}\) is \(\varSigma _1^0\)-hard.

Theorem 4

\(\mathcal {LEXP}\) is \(\varSigma _1^0\)-complete.

Proof

For proving the membership \(\mathcal {LEXP}\in \varSigma _1^0\), first notice that \(\sigma \in \mathbb {S}_P\) is clearly decidable and then consider the following:

$$\begin{aligned} ~&(P,\, \sigma ,\, f,\, q) \in \mathcal {LEXP}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, q< E _{P,\sigma }\left( {f}\right) \quad (\hbox {Definition }8)\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, q< \sum _{k = 0}^{\infty } ~ E _{P,\sigma }^{k}\left( {f}\right) \quad (\hbox {Definition }7)\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \exists \, y:q < \sum _{k = 0}^{y} ~ E _{P,\sigma }^{k}\left( {f}\right) \quad \text {(all summands are non-negative)} \\ \Longrightarrow ~&\mathcal {LEXP}\in \varSigma _1^0\quad (\text {the above is a } \varSigma _1^0\text {-formula}) \end{aligned}$$

Figure 1 (left) gives an intuition on the resulting \(\varSigma _1^0\)-formula: With increasing maximum computation length y more and more probability mass of the expected value can be accumulated until eventually a probability mass strictly larger than q has been accumulated.

It remains to show that \(\mathcal {LEXP}\) is \(\varSigma _1^0\)-hard by constructing a computable function r such that \(r:\mathcal {H} \mathrel {\,{\le }_{\text {m}}}\mathcal {LEXP}\). This function r takes an ordinary program \(Q \in {\mathsf {ordProg}} \) and variable valuation \(\sigma \) as its input and computes \((P,\, \sigma ,\, 1,\, \nicefrac 1 2)\), where P is the following probabilistic program:

figure f
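A program with the behavior required in the correctness argument below is, for instance,

    {skip} [1/2] {Q}

where the right branch executes the given ordinary program Q on the current state.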

Correctness of the reduction: There are two cases: (1) Q terminates on input \(\sigma \). Then P terminates with probability 1 and the expected value of 1Footnote 9 after executing the program P on input \(\sigma \) is thus 1. As \(\nicefrac 1 2 < 1\), we have that \((P,\, \sigma ,\, 1,\, \nicefrac 1 2) \in \mathcal {LEXP}\).

(2) Q does not terminate on input \(\sigma \). Then P terminates with probability \(\nicefrac {1}{2}\) and the expected value of 1 after executing the program P on input \(\sigma \) is thus \(\nicefrac {1}{2} \cdot 1 = \nicefrac {1}{2}\) since the right branch contributes 0 to the expected value of 1. As \(\nicefrac 1 2 \not < \nicefrac 1 2\), we have that \((P,\, \sigma ,\, 1,\, \nicefrac 1 2) \not \in \mathcal {LEXP}\). \(\square \)

As an immediate consequence of Theorem 4, \(\mathcal {LEXP}\) is computably enumerable (cf. Definition 5). This means that all lower bounds for preexpectations can be effectively enumerated by some algorithm. Now, if upper bounds were computably enumerable as well, then preexpectations would correspond to computable reals. However, we show that the contrary holds, because \(\mathcal {REXP}\) is \(\varSigma _2^0\)-complete. Thus \(\mathcal {REXP}\not \in \varSigma _1^0\), i.e. upper bounds are not computably enumerable.

Fig. 1
figure 1

Schematic depiction of the formulae defining \(\mathcal {LEXP}\) (left diagram) and \(\mathcal {REXP}\) (right diagram), respectively. In each diagram, the solid line represents the monotonically increasing graph of \(\sum _{k = 0}^{y} \, \sum _{w \in \{L,\, R\}^{{}\le k}} \, \wp \big ( T _{k}(\gamma _{P,\sigma },\, w),\, v \big )\) plotted over increasing y

We now establish the \(\varSigma _2^0\)-hardness of \(\mathcal {REXP}\) by a reduction from \(\,\overline{{\small {\mathcal {UH}}}} \), i.e. the complement of the universal halting problem for ordinary programs (see Theorem 2).

To simplify notation, we define the following abbreviation (function \(\wp \) is introduced in Proposition 2):

$$\begin{aligned} \alpha :\quad \mathbb {K}_\bot \rightarrow \mathbb {Q}_{\ge 0},~ \gamma \mapsto \wp (\gamma ,1)~. \end{aligned}$$

Intuitively, \(\alpha \) can be used to express termination probabilities, as we will see later.

Overall we get the following hardness result for approximating preexpectations from above:

Theorem 5

\(\mathcal {REXP}\) is \(\varSigma _2^0\)-complete.

Proof

For proving the membership \(\mathcal {REXP}\in \varSigma _2^0\), consider the following:

$$\begin{aligned} ~&(P,\, \sigma ,\, f,\, q) \in \mathcal {REXP}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, q> E _{P,\sigma }\left( {f}\right) \quad (\text {Definition}~8) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, q> \sum _{k = 0}^{\infty } ~ E _{P,\sigma }^{k}\left( {f}\right) \quad (\text {Definition}~7) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \exists \, \delta> 0 \, \forall \, y:q - \delta > \sum _{k = 0}^{y} ~ E _{P,\sigma }^{k}\left( {f}\right) \\ \Longrightarrow ~&\mathcal {REXP}\in \varSigma _2^0\quad (\text {the above is a } \varSigma _2^0\text {-formula}) \end{aligned}$$

Figure 1 (right) gives an intuition on the resulting \(\varSigma _2^0\)-formula: No matter what maximum computation length y we allow and thereby no matter how much probability mass of the actual expected value we accumulate, this probability mass is strictly smaller than q (ensured by the safety margin \(\delta \)).

It remains to show that \(\mathcal {REXP}\) is \(\varSigma _2^0\)-hard: We do this by constructing a function \(r:\,\overline{{\small {\mathcal {UH}}}} \mathrel {\,{\le }_{\text {m}}}\mathcal {REXP}\): This function r takes an ordinary program \(Q \in {\mathsf {ordProg}} \) as its input and computes the tuple \((P,\, \sigma [v \mapsto 0],\, v,\, 1)\), where \(\sigma \) is an arbitrary but fixed input, and \(P \in {\textsf {Prog}} \) is the following probabilistic program:

figure g
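Matching the correctness argument below, the program could, for instance, take the following shape:

    i := 0; c := 1;
    while (c = 1) { {c := 0} [1/2] {i := i + 1} };
    k := 0; c := 1;
    while (c = 1) { {c := 0} [1/2] {k := k + 1} };
    TQ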

where TQ is an ordinary program that computesFootnote 10\(\alpha \big ( T _k\big (\gamma _{Q,g_Q(i)},\, \varepsilon \big )\big ) \cdot 2^{k+1}\) and stores the result in the variable v, and \(g_Q:\mathbb {N} \rightarrow \mathbb {S}_Q\) is some computable enumeration of valid inputs for Q. Thus \(\alpha \big ( T _k\big (\gamma _{Q,g_Q(i)},\, \varepsilon \big )\big ) \cdot 2^{k+1}\) returns \(2^{k+1}\) if and only if Q terminates on input \(g_Q(i)\) after exactly k steps (otherwise it returns 0).

Correctness of the reduction: The two while-loops generate independent geometric distributions with parameter \(\nicefrac 1 2\) on i and k, respectively, so the probability of generating exactly the numbers i and k is \(\nicefrac {1}{2^{i+1}} \cdot \nicefrac {1}{2^{k+1}} = \nicefrac {1}{2^{i+k+2}}\). The expected valuation of v after executing the program P is hence independent of the input \(\sigma \) and given by

$$\begin{aligned} \sum _{i= 0}^{\infty } \sum _{k= 0}^{\infty } \frac{1}{2^{i+k+2}} \cdot \alpha \bigg ( T _k\Big (\gamma _{Q,g_Q(i)},\, \varepsilon \Big )\bigg ) \cdot 2^{k+1}~. \end{aligned}$$

Since for each input, the number of steps until termination of Q is either unique or does not exist, the formula for the expected outcome reduces to \(\sum _{i = 0}^{\infty } \nicefrac {1}{2^{i+1}} = 1\) if and only if Q terminates on every input after some finite number of steps. Thus if there exists an input on which Q does not eventually terminate, then \((P,\, \sigma [v \mapsto 0],\, v,\, 1) \in \mathcal {REXP}\) as then the preexpectation of v is strictly less than 1. If, on the other hand, Q does terminate on every input, then this preexpectation is exactly 1 and hence \((P,\, \sigma [v \mapsto 0],\, v,\, 1) \not \in \mathcal {REXP}\). \(\square \)

As mentioned before, a consequence of Theorem 4 and Theorem 5 for approximating preexpectations is that upper bounds are not computable at all whereas lower bounds are at least computably enumerable. Upper bounds would be computably enumerable if we had access to an oracle for the (non-universal) halting problem \(\mathcal {H} \) for ordinary programs. Given a rational q it would then be semi-decidable whether q is an upper bound when provided access to an oracle for \(\mathcal {H} \). Next, we establish that this is not the case for the problem of deciding whether q equals exactly the value of the preexpectation. Formally, we establish the following hardness result:

Theorem 6

\(\mathcal {EXP}\) is \(\varPi _2^0\)-complete.

Proof

For proving the membership \(\mathcal {EXP}\in \varPi _2^0\) consider the following: By Theorem 5 there exists a decidable relation \(\mathcal {R}\), such that

$$\begin{aligned} (P,\, \sigma ,\, v,\, q) \in \mathcal {REXP}\quad \text {iff}\quad \exists \, r_1 \, \forall \, r_2 :(r_1,\, r_2,\, P,\, \sigma ,\, v,\, q) \in \mathcal {R}~. \end{aligned}$$

Furthermore, by Theorem 4 there exists a decidable relation \(\mathcal {L}\), such that

$$\begin{aligned} (P,\, \sigma ,\, v,\, q) \in \mathcal {LEXP}\quad \text {iff}\quad \exists \, \ell :(\ell ,\, P,\, \sigma ,\, v,\, q) \in \mathcal {L}~. \end{aligned}$$

Let \(\lnot \mathcal {R}\) and \(\lnot \mathcal {L}\) be the (decidable) negations of \(\mathcal {R}\) and \(\mathcal {L}\), respectively. Then:

$$\begin{aligned} ~&(P,\, \sigma ,\, v,\, q) \in \mathcal {EXP}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, q = E _{P,\sigma }\left( {v}\right) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {Definition}~8)\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, q \le E _{P,\sigma }\left( {v}\right) \,\wedge \, q \ge E _{P,\sigma }\left( {v}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \lnot \big (q > E _{P,\sigma }\left( {v}\right) \big ) \,\wedge \, \lnot \big (q < E _{P,\sigma }\left( {v}\right) \big )\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \lnot \big (\exists \, r_1\, \forall \, r_2 :(r_1,\, r_2,\, P,\, \sigma ,\, v,\, q) \in \mathcal {R}\big ) \,\wedge \, \lnot \big (\exists \, \ell :(\ell ,\, P,\, \sigma ,\, v,\, q) \in \mathcal {L}\big )\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \big (\forall \, r_1\, \exists \, r_2 :(r_1,\, r_2,\, P,\, \sigma ,\, v,\, q) \in \lnot \mathcal {R}\big ) \,\wedge \, \big (\forall \, \ell :(\ell ,\, P,\, \sigma ,\, v,\, q) \in \lnot \mathcal {L}\big )\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \forall \, r_1\, \forall \, \ell \, \exists \, r_2 :(r_1,\, r_2,\, P,\, \sigma ,\, v,\, q) \in \lnot \mathcal {R} \,\wedge \, (\ell ,\, P,\, \sigma ,\, v,\, q) \in \lnot \mathcal {L}\\ \Longrightarrow ~&\mathcal {EXP}\in \varPi _2^0~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {the above is a } \varPi _2^0\text {-formula}) \end{aligned}$$

It remains to show that \(\mathcal {EXP}\) is \(\varPi _2^0\)-hard. We do this by proving \(\,\mathcal {UH} \mathrel {\,{\le }_{\text {m}}}\mathcal {EXP}\). Reconsider the reduction function r from the proof of Theorem 5: Given an ordinary program Q, r computes the tuple \((P,\, \sigma [v \mapsto 0],\, v,\, 1)\), where P is a probabilistic program with \( E _{P,\sigma [v \mapsto 0]}\left( {v}\right) = 1\) if and only if Q terminates on all inputs. Thus \(Q \in \,\mathcal {UH} \) iff \((P,\, \sigma [v \mapsto 0],\, v,\, 1) \in \mathcal {EXP}\) and therefore \(r:\,\mathcal {UH} \mathrel {\,{\le }_{\text {m}}}\mathcal {EXP}\). \(\square \)

Apart from approximating preexpectations, we consider the question whether a preexpectation is finite. This is formalized by the problem set \(\mathcal {FEXP}\):

Definition 9

The problem set \(\mathcal {FEXP}\subset {\textsf {Prog}} \times \mathbb {S}\times \mathbb {E}\) is defined as

Since deciding whether a given rational number is an upper bound of a preexpectation is \(\varSigma _2^0\)-complete (cf. Theorem 5), it is not surprising that deciding finiteness of preexpectations is also in \(\varSigma _2^0\). It is in fact \(\varSigma _2^0\)-complete as well.

Theorem 7

\(\mathcal {FEXP}\) is \(\varSigma _2^0\)-complete.

Proof

For proving the membership \(\mathcal {FEXP}\in \varSigma _2^0\), consider the following:

$$\begin{aligned}&(P,\sigma ,f) \in \mathcal {FEXP}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, E _{P,\sigma }\left( {f}\right) ~<~ \infty \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \exists \,q: E _{P,\sigma }\left( {f}\right) ~<~ q \\ \Longleftrightarrow ~&\exists \,q:\sigma \in \mathbb {S}_P \,\wedge \, E _{P,\sigma }\left( {f}\right) ~<~ q \\ \Longleftrightarrow ~&\exists \,q:(P,\sigma ,f,q) \in \mathcal {REXP}~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(\text {Definition}~8) \\ \Longrightarrow ~&\mathcal {FEXP}\in \varSigma _2^0~~~\!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {Theorem}~5, \text {the above is a } \varSigma _2^0\text {-formula}) \end{aligned}$$

The proof that \(\mathcal {FEXP}\) is \(\varSigma _2^0\)-hard is deferred to Lemma 1, because we use a reduction from the positive almost-sure termination problem (Definition 11), which is studied in detail in the next section. \(\square \)

This concludes our study of the computational hardness of approximating preexpectations. In the next section, we study computational aspects of analyzing the termination behavior of probabilistic programs.

4 The hardness of deciding probabilistic termination

We now turn towards the computational hardness of analyzing the termination behavior of probabilistic programs. We are interested in two notions: the termination probability and the finiteness of the expected runtime of the program. Note that these two notions do not coincide; the latter, however, implies the former. Using the function \(\alpha \) as defined earlier, we can capture the termination probability and the expected runtime of a probabilistic program on a given input as follows: Recall \(\gamma _{P,\sigma } = \langle P,\, \sigma ,\, 1,\, \varepsilon \rangle \) (see Definition 2) and \(A^{{}\le k} = \bigcup _{i=0}^k A^i\) (see Definition 7). Then we define the following:

Definition 10

(Termination Probabilities and Expected Runtimes) Let \(P \in {\textsf {Prog}} \) and \(\sigma \in \mathbb {S}_P\). Then

1. the probability that \(\varvec{P}\) terminates on \(\varvec{\sigma }\) is given by

   $$\begin{aligned} \text {Pr}_{P,\sigma }({\downarrow }) ~=~ E _{P,\sigma }\left( {1}\right) ~, \end{aligned}$$

2. the expected runtime of \(\varvec{P}\) on \(\varvec{\sigma }\) is given by

   $$\begin{aligned} E _{P,\sigma }\left( {\downarrow }\right) ~=~ \sum _{k = 0}^{\infty } \left( 1 - \sum _{w \in \{L,\, R\}^{{}\le k}} ~ \alpha \big ( T _{k}(\gamma _{P,\sigma },\, w) \big )\right) ~. \end{aligned}$$

\(\triangle \)

For the termination probability \(\text {Pr}_{P,\sigma }({\downarrow })\), we sum up the probabilities of all reachable terminal configurations. This coincides with computing the expected value of the constant function \(\lambda \sigma \,.\, 1\). As for \( E _{P,\sigma }\left( {\downarrow }\right) \), it is pointed out in [14] that the expected runtime of P on \(\sigma \) can be expressed as

$$\begin{aligned} E _{P,\sigma }\left( {\downarrow }\right) =&\sum _{k=0}^{\infty } \text {Pr}(\text {``}P \text { runs for more than }k\text { steps on }\sigma \text {''})\\ =&\sum _{k=0}^{\infty }\big ( 1 - \text {Pr}(\text {``}P\text { terminates within }k\text { steps on }\sigma \text {''})\big )~. \end{aligned}$$

Regarding the termination probability of a probabilistic program, the case of almost-sure termination is of special interest: A program P terminates almost-surely on input \(\sigma \) iff P terminates on \(\sigma \) with probability 1. Furthermore, we say that P terminates positively almost-surely on \(\sigma \) iff the expected runtime of P on \(\sigma \) is finite. Lastly, we say that P terminates universally (positively) almost-surely, if it does so on all possible inputs \(\sigma \). Notice that almost-sure termination does not imply positive almost-sure termination. The reverse implication, however, is true. Thus positive almost-sure termination is the stronger notion.

For analyzing the hardness of deciding (universal) (positive) almost-sure termination, we formally define the corresponding problem sets:

Definition 11

(Probabilistic Termination Problem Sets) The sets \(\mathcal {AST} \), \(\mathcal {PAST} \), \(\mathcal {UAST} \), and \(\mathcal {UPAST} \) are defined as follows:

$$\begin{aligned} (P,\, \sigma ) \in \mathcal {AST} \quad&\text {iff}\quad \sigma \in \mathbb {S}_P \text { and }\text {Pr}_{P,\sigma }({\downarrow }) = 1\\ (P,\, \sigma ) \in \mathcal {PAST} \quad&\text {iff}\quad \sigma \in \mathbb {S}_P \text { and } E _{P,\sigma }\left( {\downarrow }\right) < \infty \\ P \in \mathcal {UAST} \quad&\text {iff}\quad \forall \, \sigma \in \mathbb {S}_P :(P,\, \sigma ) \in \mathcal {AST} \\ P \in \mathcal {UPAST} \quad&\text {iff}\quad \forall \, \sigma \in \mathbb {S}_P :(P,\, \sigma ) \in \mathcal {PAST} \end{aligned}$$

Notice that both \(\mathcal {PAST} \subset \mathcal {AST} \) and \(\mathcal {UPAST} \subset \mathcal {UAST} \) hold. \(\triangle \)

The problem of (universal) almost-sure termination is often considered as the probabilistic counterpart to the (universal) halting problem for ordinary programs. Supported by our hardness results, we will, however, argue why perhaps (universal) positive almost-sure termination is a more suitable probabilistic analog.

As a first hardness result, we establish that deciding almost-sure termination of a program on a given input is \(\varPi _2^0\)-complete:

Theorem 8

\(\mathcal {AST} \) is \(\varPi _2^0\)-complete.

Proof

For proving \(\mathcal {AST} \in \varPi _2^0\), we show \(\mathcal {AST} \mathrel {\,{\le }_{\text {m}}}\mathcal {EXP}\). For that, consider the following function r which takes a probabilistic program Q and a state \(\sigma \) as its input and computes the tuple \((Q,\, \sigma ,\, 1,\, 1)\).

Correctness of the reduction: The expected value of postexpectation 1 after executing Q on input \(\sigma \) is exactly the termination probability, see Definition 10. Thus \((Q,\, \sigma ,\, 1,\, 1) \in \mathcal {EXP}\) iff \((Q,\, \sigma ) \in \mathcal {AST} \) and therefore \(r:\mathcal {AST} \mathrel {\,{\le }_{\text {m}}}\mathcal {EXP}\). Since \(\mathcal {EXP}\) is \(\varPi _2^0\)-complete by Theorem 6, it follows that \(\mathcal {AST} \in \varPi _2^0\).

It remains to show that \(\mathcal {AST} \) is \(\varPi _2^0\)-hard. For that we reduce the \(\varPi _2^0\)-complete universal halting problem to \(\mathcal {AST} \). Our reduction function \(r'\) takes an ordinary program Q as its input and computes the pair \((P',\, \sigma )\), where \(\sigma \) is some arbitrary input in \(\mathbb {S}_{P'}\) and \(P'\) is the probabilistic program

figure h
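One possible shape of this program, consistent with the correctness argument below, is

    i := 0; c := 1;
    while (c = 1) { {c := 0} [1/2] {i := i + 1} };
    SQ(i)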

where SQ(i) is an ordinary program that regardless of its input \(\sigma \) simulates the program Q on input \(g_Q(i)\), and \(g_Q:\mathbb {N} \rightarrow \mathbb {S}_Q\) is some computable enumeration of valid inputs for Q.

Correctness of the reduction: The while-loop in \(P'\) establishes a geometric distribution with parameter \(\nicefrac 1 2\) on i and hence a geometric distribution on all possible inputs for Q. After the while-loop, the program Q is simulated on the input generated probabilistically in the while-loop. Obviously then the entire program \(P'\) terminates with probability 1 on any arbitrary input \(\sigma \), i.e. terminates almost-surely on \(\sigma \), if and only if the simulation of Q terminates on every input. Thus \(Q \in \,\mathcal {UH} \) if and only if \((P',\, \sigma ) \in \mathcal {AST} \). \(\square \)

While for ordinary programs there is a complexity gap between the halting problem for some given input and the universal halting problem (\(\varSigma _1^0\)-complete vs. \(\varPi _2^0\)-complete), we establish that there is no such gap for almost-sure termination, i.e. \(\mathcal {UAST} \) is exactly as hard to decide as \(\mathcal {AST} \):

Theorem 9

\(\mathcal {UAST} \) is \(\varPi _2^0\)-complete.

Proof

For proving \(\mathcal {UAST} \in \varPi _2^0\), consider that, by Theorem 8, there exists a decidable relation \(\mathcal {R}\), such that

$$\begin{aligned} (P,\, \sigma ) \in \mathcal {AST} \quad \text {iff}\quad \forall \, y_1 \, \exists \, y_2:(y_1,\, y_2,\, P,\, \sigma ) \in \mathcal {R}~. \end{aligned}$$

By that we have that

$$\begin{aligned} P \in \mathcal {UAST} \quad \text {iff}\quad \forall \, \sigma \in \mathbb {S}_P\, \forall \, y_1 \, \exists \, y_2:(y_1,\, y_2,\, P,\, \sigma ) \in \mathcal {R} \end{aligned}$$

which is a \(\varPi _2^0\)-formula and therefore \(\mathcal {UAST} \in \varPi _2^0\).

It remains to show that \(\mathcal {UAST} \) is \(\varPi _2^0\)-hard. This can be done by proving \(\mathcal {AST} \mathrel {\,{\le }_{\text {m}}}\mathcal {UAST} \) as follows: On input \((Q,\, \sigma )\) the reduction function \(r:\mathcal {AST} \mathrel {\,{\le }_{\text {m}}}\mathcal {UAST} \) computes a probabilistic program P that first initializes all variables according to \(\sigma \) and then executes Q. This reduction is clearly correct. \(\square \)

As mentioned above, this result contrasts with the situation for ordinary programs. There, a \(\varSigma _1^0\)-formula expressing termination on a given input is prepended with a universal quantifier over all possible inputs, yielding a \(\varPi _2^0\)-formula. The reason for the missing complexity gap between non-universal and universal almost-sure termination is that non-universal almost-sure termination is already a \(\varPi _2^0\)-property, essentially due to the inherent universal quantification over all resolutions of the probabilistic choices. Prepending this \(\varPi _2^0\)-formula with another universal quantifier over all possible inputs does not increase the complexity, as two adjacent universal quantifiers can computably be contracted into a single one, yielding again a \(\varPi _2^0\)-formula.

We now investigate the computational hardness of deciding positive almost-sure termination: It turns out that deciding \(\mathcal {PAST} \) is \(\varSigma _2^0\)-complete. Thus, \(\mathcal {PAST} \) becomes semi-decidable when given access to an \(\mathcal {H} \)-oracle, whereas \(\mathcal {AST} \) does not. We establish \(\varSigma _2^0\)-hardness by a reduction from \(\,\overline{{\small {\mathcal {UH}}}} \). This reduction is counterintuitive in that the reduction function effectively transforms each ordinary program that does not terminate on all of its inputs into a probabilistic program that does terminate within an expected finite number of steps.

Theorem 10

\(\mathcal {PAST} \) is \(\varSigma _2^0\)-complete.

Proof

For proving \(\mathcal {PAST} \in \varSigma _2^0\), consider the following:

$$\begin{aligned} ~&(P,\, \sigma ) \in \mathcal {PAST} \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \infty> E _{P,\sigma }\left( {\downarrow }\right) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(\text {Definition}~11) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \exists \, c :c> E _{P,\sigma }\left( {\downarrow }\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \exists \, c :c> \sum _{k = 0}^{\infty } \left( 1-\sum _{w \in \{L,\, R\}^{{}\le k}} ~ \alpha \big ( T _{k}(\sigma _{P,\sigma },\, w) \big )\right) ~~~~~~~~~~~~~~~~ (\text {Definition}~10) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \,\wedge \, \exists \, c \, \forall \, \ell :c > \sum _{k = 0}^{\ell } \left( 1-\sum _{w \in \{L,\, R\}^{{}\le k}} ~ \alpha \big ( T _{k}(\sigma _{P,\sigma },\, w) \big )\right) \\ \Longrightarrow ~&\mathcal {PAST} \in \varSigma _2^0\,~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {the above is a } \varSigma _2^0\text {-formula}) \end{aligned}$$

It remains to show that \(\mathcal {PAST} \) is \(\varSigma _2^0\)-hard. For that, we use a reduction function \(r:\,\overline{{\small {\mathcal {UH}}}} \mathrel {\,{\le }_{\text {m}}}\mathcal {PAST} \) with \(r(Q) = (P,\, \sigma )\), where \(\sigma \) is an arbitrary valid input for P and P is the probabilistic program

figure i

where \( InitQ (i)\) is an ordinary program that initializes a simulation of the program Q on input \(g_Q(i)\) (recall the enumeration \(g_Q\) from Theorem 5), \( StepQ (i)\) is an ordinary program that does one single (further) step of that simulation and sets term to 1 if that step has led to termination of Q, and \( Cheer (x)\) is an ordinary program that executes \(2^x\) many effectless computation steps. We refer to this as “cheering”.Footnote 11
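A Python rendering of this construction might look as follows; init_Q, step_Q, cheer, and g_Q are hypothetical stand-ins for the ordinary programs described above, so this is a sketch rather than the formal program:

```python
import random

def P(init_Q, step_Q, cheer, g_Q):
    i, x = 0, 0                      # current input index, iteration count
    state = init_Q(g_Q(i))           # start simulating Q on g_Q(0)
    c = 1
    while c == 1:
        state, term = step_Q(state)  # one further step of Q on g_Q(i)
        if term == 1:                # Q terminated on g_Q(i):
            cheer(x)                 # 2**x effectless "cheering" steps
            i += 1                   # move on to the next input
            state = init_Q(g_Q(i))
        c = random.randint(0, 1)     # fair coin: continue the loop?
        x += 1                       # x counts iterations = coin tosses
```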

Correctness of the reduction: Intuitively, the program P starts by simulating Q on input \(g_Q(0)\). During the simulation it gradually (figuratively speaking) loses interest in further simulating Q by tossing a coin after each simulation step to decide whether to continue the simulation or not. If eventually P finds that Q has terminated on input \(g_Q(0)\), it “cheers” for a number of steps exponential in the number of coin tosses made so far, namely for \(2^x\) steps. P then continues with the same procedure for the next input \(g_Q(1)\), and so on.

The variable x keeps track of the number of loop iterations (starting from 0), which equals the number of coin tosses. The x-th loop iteration takes place with probability \(\nicefrac {1}{2^x}\).

Notice that the simulation of a single step of the ordinary program Q, i.e. the program \( StepQ (i)\), requires (in our runtime model) a number of steps that is at most linear in the number of instructions of program Q. Since the size of program Q is fixed in the construction of program P, we consider the time required to execute \( StepQ (i)\) to be constant.Footnote 12

One loop iteration then consists of a constant number \(c_1\) of steps in case Q did not terminate on input \(g_Q(i)\) in the current simulation step. Such an iteration therefore contributes \(\nicefrac {c_1}{2^x}\) to the expected runtime of P. In case Q did terminate, a loop iteration takes a constant number \(c_2\) of steps plus \(2^x\) additional “cheering” steps. Such an iteration therefore contributes \(\nicefrac {c_2 + 2^x}{2^x} = \nicefrac {c_2}{2^x} + 1 > 1\) to the expected runtime. Overall, the expected runtime \( E _{P,\sigma }\left( {\downarrow }\right) \) roughly resembles a geometric series with exponentially decreasing summands. However, for each time the program Q terminates on an input, a summand of the form \(\nicefrac {c_2}{2^x} + 1\) appears in this series. There are now two cases:

(1) \(Q \in \,\overline{{\small {\mathcal {UH}}}} \), so there exists a minimal index \(i\) such that Q does not terminate on the valid input \(g_Q(i)\). In that case, summands of the form \(\nicefrac {c_2}{2^x} + 1\) appear only finitely often in the series (once for each terminating input preceding \(g_Q(i)\) in the enumeration) and therefore the series converges: the expected runtime is finite, so \((P,\, \sigma ) \in \mathcal {PAST} \).

(2) \(Q \not \in \,\overline{{\small {\mathcal {UH}}}} \), so Q terminates on every input. In that case, summands of the form \(\nicefrac {c_2}{2^x} + 1\), each of which exceeds 1, appear infinitely often in the series and therefore the series diverges: the expected runtime is infinite, so \((P,\, \sigma ) \not \in \mathcal {PAST} \). \(\square \)
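The dichotomy between the two cases can also be checked numerically. The following sketch sums the first N summands of the runtime series, under the simplifying (and purely illustrative) assumption that a simulated run of Q, if it terminates, does so after ten loop iterations, and with placeholder constants c1 = c2 = 5:

```python
def runtime_series(terminating_inputs, N, c1=5, c2=5):
    # Partial sum of the expected-runtime series from the proof.
    total, i = 0.0, 0
    for x in range(N):
        if i in terminating_inputs and x % 10 == 9:
            total += c2 / 2**x + 1   # "cheering" summand, always > 1
            i += 1                   # next input of Q
        else:
            total += c1 / 2**x       # ordinary summand
    return total

# Q halts on every input: infinitely many summands > 1, divergence.
print(runtime_series(terminating_inputs=range(10**9), N=100_000))
# Q diverges on its third input: only finitely many such summands.
print(runtime_series(terminating_inputs={0, 1}, N=100_000))
```

The first partial sum grows without bound (roughly N/10), whereas the second stabilizes at a finite value, mirroring cases (2) and (1), respectively.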

We are now in a position to present the missing part of the proof of Theorem 7: We show that deciding \(\mathcal {FEXP}\), i.e. the question whether a preexpectation computed by a probabilistic program for a given input is finite, is \(\varSigma _2^0\)-hard by reduction from the positive almost-sure termination problem.

Lemma 1

\(\mathcal {FEXP}\) is \(\varSigma _2^0\)-hard.

Proof

We use a reduction function \(r:\mathcal {PAST} \mathrel {\,{\le }_{\text {m}}}\mathcal {FEXP}\) that is given by \(r(P,\, \sigma ) = (P',\, \sigma ',\, v)\), where \(\sigma '\) is an arbitrary valid input for \(P'\) and \(P'\) is the probabilistic program

figure j

where \( TP (k)\) is an ordinary program that computes

$$\begin{aligned} \text {Pr}(\text {``}P\text { terminates after exactly }k\text { steps on input }\sigma \text {''}) \cdot k \cdot 2^k \qquad (\dagger ) \end{aligned}$$

and stores this value in variable v.

Correctness of the reduction: Regardless of its input \(\sigma '\), the while loop of program \(P'\) establishes a geometric distribution on variable k such that the probability that \(k = i\) is given by \(\nicefrac {1}{2^i}\). This while loop terminates almost-surely. Thereafter, the program computes (\(\dagger \)). Due to the geometric distribution on k, the program in effect stores \(\text {Pr}(\text {``}P\text { terminates after exactly }i\text { steps on input }\sigma \text {''}) \cdot i \cdot 2^i\) in variable v with probability \(\nicefrac {1}{2^i}\). The expected value of variable v is thus given by

$$\begin{aligned} E _{P',\sigma '}\left( {v}\right)&= \sum _{i=1}^{\infty } \frac{\text {Pr}(\text {``}P\text { terminates after exactly }i\text { steps on input }\sigma \text {''}) \cdot i \cdot 2^i}{2^i} \\&= \sum _{i=1}^{\infty } \text {Pr}(\text {``}P\text { terminates after exactly }i \text { steps on input }\sigma \text {''}) \cdot i \\&= E _{P,\sigma }\left( {\downarrow }\right) ~. \end{aligned}$$

We see that the expected value of v after executing \(P'\) on an arbitrary input equals exactly the expected runtime of P on input \(\sigma \) and this expected value is infinite if and only if the expected runtime is infinite. Thus, we have \((P,\, \sigma ) \in \mathcal {PAST} \) iff \(r(P,\, \sigma ) = (P',\, \sigma ',\, v) \in \mathcal {FEXP}\) and hence \(r :\mathcal {PAST} \mathrel {\,{\le }_{\text {m}}}\mathcal {FEXP}\). \(\square \)
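The reweighting trick can be sanity-checked numerically for any hypothetical runtime distribution of P; below, p(i) is a stand-in for the probability that P terminates after exactly i steps on \(\sigma \):

```python
def expected_v(p, N):
    # E(v): with probability 1/2**i the value p(i) * i * 2**i is stored.
    return sum((p(i) * i * 2**i) / 2**i for i in range(1, N))

def expected_runtime(p, N):
    # The expected runtime of P on sigma, truncated at N.
    return sum(i * p(i) for i in range(1, N))

p = lambda i: 1 / 2**i          # toy runtime distribution (geometric)
print(expected_v(p, 60))        # ~2.0
print(expected_runtime(p, 60))  # ~2.0, as the two sums coincide
```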

It is noteworthy that we cannot just annotate the given program with a runtime counter and determine the expected value of that runtime counter. In order to see that, consider the program \({\textsf {while}}\, \left( {\textsf {true}} \right) \, \left\{ {{\textsf {skip}}} \right\} \) and its annotated version

figure k

The expected value of variable \( counter \) is 0, since the program terminates with probability 0 and there is no probability mass whatsoever that could contribute to the expected value of \( counter \). On the other hand, the expected runtime of \({\textsf {while}}\, \left( {\textsf {true}} \right) \, \left\{ {{\textsf {skip}}} \right\} \) is clearly \(\infty \). The expected value of the runtime counter thus differs from the expected runtime, which is why we need the more involved construction presented in the proof of Lemma 1.

Coming back to termination problems, the last problem we study is universal positive almost-sure termination. In contrast to almost-sure termination, we do have a complexity leap when moving from the non-universal to the universal variant of positive almost-sure termination: We establish that \(\mathcal {UPAST} \) is \(\varPi _3^0\)-complete and thus strictly harder to decide than \(\mathcal {UAST} \). We do this by a reduction from \(\overline{{\small {\mathcal {COF}}}} \), the complement of the cofiniteness problem (see Theorem 3):

Theorem 11

\(\mathcal {UPAST} \) is \(\varPi _3^0\)-complete.

Proof

By Theorem 10, there exists a decidable relation \(\mathcal {R}\), such that

$$\begin{aligned} (P,\, \sigma ) \in \mathcal {PAST} \quad \text {iff}\quad \exists \, y_1 \, \forall \, y_2:(y_1,\, y_2,\, P,\, \sigma ) \in \mathcal {R}~. \end{aligned}$$

Therefore \(\mathcal {UPAST} \) is definable by

$$\begin{aligned} P \in \mathcal {UPAST} \quad \text {iff}\quad \forall \, \sigma \in \mathbb {S}_P \, \exists \, y_1 \, \forall \, y_2:(y_1,\, y_2,\, P,\, \sigma ) \in \mathcal {R}~, \end{aligned}$$

which is a \(\varPi _3^0\)-formula and therefore \(\mathcal {UPAST} \in \varPi _3^0\).

It remains to show that \(\mathcal {UPAST} \) is \(\varPi _3^0\)-hard. For that we reduce the \(\varPi _3^0\)-complete complement of the cofiniteness problem to \(\mathcal {UPAST} \) using the following function f: f takes an ordinary program Q as its input and computes the probabilistic program P given by

figure l

where \( InitQ (i)\) is an ordinary program that initializes a simulation of the program Q on input \(g_Q(i)\) (recall the enumeration \(g_Q\) from Theorem 5), \( StepQ (i)\) is an ordinary program that does one single (further) step of that simulation and sets term to 1 if that step has led to termination of Q, and \( Cheer (x)\) is an ordinary program that executes \(2^x\) many effectless computation steps.

Note that program P is the same program as in the proof of Theorem 10 with one exception: The variable i is not initialized with 0 but with the initial value of i in the given program state, rounded up to the next natural number, say \(\ell \).Footnote 13 Thus, for every input \(\sigma \), the program P skips all inputs (in the order given by \(g_Q\)) up to the \(\ell \)-th input, where \(\ell = \max \{\lceil \sigma (i) \rceil ,\, 0\}\). After that, program P simulates Q on all remaining inputs, starting with input \(g_Q(\ell )\).
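Relative to the sketch given for Theorem 10, only the initialization of i changes. In Python, with the input state sigma modelled as a hypothetical dictionary, the new initialization reads:

```python
import math

def initial_i(sigma):
    # ell = max(ceil(sigma(i)), 0): the initial value of variable i,
    # rounded up to the next natural number (negatives clamped to 0).
    return max(math.ceil(sigma.get("i", 0)), 0)

print(initial_i({"i": 3.2}))  # 4: skip the inputs g_Q(0), ..., g_Q(3)
print(initial_i({"i": -7}))   # 0: skip nothing
```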

This ability to skip any number of inputs for some input state \(\sigma \) is crucial for the correctness of the reduction. Intuitively, program P terminates in finite expected time on all input states, i.e. \(P \in \mathcal {UPAST} \), if it is impossible to find an input state \(\sigma \) (and thus a value \(\ell \in \mathbb {N}\) determined by \(\sigma (i)\)) such that executing P on \(\sigma \) skips all input states on which Q does not terminate. Otherwise, P (when executed on such an input state \(\sigma \)) keeps simulating terminating runs of Q and thus “cheers” infinitely often. In this case, the expected runtime of P on input state \(\sigma \) becomes infinite, i.e. \(P \notin \mathcal {UPAST} \).

Correctness of the reduction: The problem \(\overline{{\small {\mathcal {COF}}}} \) can alternatively be defined as

$$\begin{aligned} Q \in \overline{{\small {\mathcal {COF}}}} \quad \text {iff}\quad \big \{\sigma \in \mathbb {S}_Q ~|~ (Q,\, \sigma ) \in \overline{{\small {\mathcal {H}}}} \big \} \text { is infinite}. \end{aligned}$$

There are now two cases:

(1) \(Q \not \in \overline{{\small {\mathcal {COF}}}} \). Then there are only finitely many inputs on which Q does not terminate, so there is some \(\ell \in \mathbb {N}\) such that the program Q terminates on all input states \(g_Q(j)\) with \(j > \ell \). Now, consider the execution of program P on some input \(\sigma \) with \(\sigma (i) > \ell \). Then the “cheering” steps in the if-branch of the while-loop of P are executed infinitely often. Consequently, the expected runtime of P on that input \(\sigma \) is infinite (analogously to the proof of Theorem 10). Hence, \(P \notin \mathcal {UPAST} \).

(2) \(Q \in \overline{{\small {\mathcal {COF}}}} \). Then there are infinitely many inputs on which Q does not terminate. For every input state \(\sigma \) of P (and thus regardless of the number of skipped input states of Q, i.e. the value \(\ell = \max \{\lceil \sigma (i)\rceil ,\,0\} \in \mathbb {N}\) that is initially assigned to variable i), the variable i will eventually be incremented to some value \(j > \ell \) such that Q does not terminate on input \(g_Q(j)\). From this point on, the “cheering” steps in the if-branch of the while-loop of P are not executed anymore. Consequently, for every input state \(\sigma \), the expected time until termination of P on input state \(\sigma \) is finite. Hence, \(P \in \mathcal {UPAST} \). \(\square \)

This concludes our study of the computational hardness of analyzing probabilistic termination. We have seen that there is a complexity leap when moving from positive almost-sure termination to universal positive almost-sure termination, namely from \(\varSigma _2^0\)-complete (i.e. \(\exists \,\forall \,\cdots \)) to \(\varPi _3^0\)-complete (i.e. \(\forall \, \exists \, \forall \, \cdots \)).

Considering the quantifier ordering and the type of objects that are quantified, we believe that the problem of (universal) positive almost-sure termination is perhaps a more natural probabilistic analog of the (universal) halting problem for ordinary programs: For ordinary programs, we can define the halting problem by \(\exists \)-quantifying over a witness computation length and then running the (unique) computation of at most that length on a given input. For the universal halting problem, we additionally prepend a \(\forall \)-quantification over all possible inputs.

Somewhat analogously, for probabilistic programs, we can define the problem of positive almost-sure termination by \(\exists \)-quantifying over a witness expected computation length and then running all computations (captured by a \(\forall \)-quantifier) of at most that expected length on a given input. For the universal version of the problem, we additionally prepend a \(\forall \)-quantifier over all possible inputs.
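Schematically, suppressing the decidable kernels from the respective proofs, the quantifier prefixes read:

$$\begin{aligned} \mathcal {H}&:~ \exists \, n \qquad&\mathcal {UH}&:~ \forall \, \sigma \, \exists \, n \\ \mathcal {PAST}&:~ \exists \, c \, \forall \, \ell \qquad&\mathcal {UPAST}&:~ \forall \, \sigma \, \exists \, c \, \forall \, \ell \end{aligned}$$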

Another argument stems more from a user perspective. For a user, the expected runtime of an algorithm might be more relevant than its termination probability: an algorithm whose expected runtime can be bounded by a finite value is probably more useful in practice than one for which one must expect to wait forever until it finishes its computation (even if it terminates with probability 1). Along this line of thought, the expected runtime of an algorithm is also a key notion in defining probabilistic complexity classes such as ZPP, the class of decision problems that can be decided by a probabilistic program that always gives the correct answer within expected polynomial time [16].

5 Hardness of approximating (Co)variances

The last group of analysis problems we study concerns the approximation of variances and covariances. More precisely, we are given a probabilistic program P, an initial state \(\sigma \) and two expectations, i.e. random variables, \(f,g \in \mathbb {E}\). Furthermore, let \(\mu \) denote the final distribution of program states that is generated by running P on input \(\sigma \). What is then the covariance of f and g (the variance of f or the variance of g) under the final distribution \(\mu \)? Having the operator \( E _{P,\sigma }\left( {\,\cdot \,}\right) \) for obtaining preexpectations, i.e. expected values of random variables, readily available, the textbook definition of variances and covariances of expectations is as follows (cf. [2, Definition 4.10.10, Lemma 4.10.6] or [27, Definition 5.1]):

Definition 12

Let \(P \in {\textsf {Prog}} \), \(\sigma \in \mathbb {S}_P\), and \(f,g \in \mathbb {E}\). Then

1.

the covariance of \(\varvec{f}\) and \(\varvec{g}\) after executing \(\varvec{P}\) on \(\varvec{\sigma }\) is given by

$$\begin{aligned} \text {Cov}_{P,\sigma }(f,\, g)&= E _{P,\sigma }\left( {\left( f - E _{P,\sigma }\left( {f}\right) \right) \cdot \left( g - E _{P,\sigma }\left( {g}\right) \right) }\right) \\&= E _{P,\sigma }\left( {f \cdot g}\right) - E _{P,\sigma }\left( {f}\right) \cdot E _{P,\sigma }\left( {g}\right) \end{aligned}$$

if \( E _{P,\sigma }\left( {f}\right) \), \( E _{P,\sigma }\left( {g}\right) \), and \( E _{P,\sigma }\left( {f \cdot g}\right) \) are finite; otherwise it is undefined,

2.

the variance of \(\varvec{f}\) after executing \(\varvec{P}\) on \(\varvec{\sigma }\) is given by

    $$\begin{aligned} \text {Var}_{P,\sigma }(f) = E _{P,\sigma }\left( {f^2}\right) - \left( E _{P,\sigma }\left( {f}\right) \right) ^2 \end{aligned}$$

    if \( E _{P,\sigma }\left( {f}\right) \) is finite; otherwise it is undefined. In particular, if \( E _{P,\sigma }\left( {f^2}\right) \) is also finite, then \(\text {Var}_{P,\sigma }(f) ~=~ \text {Cov}_{P,\sigma }(f,\, f)\).

\(\triangle \)
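For a final distribution with finite support, Definition 12 can be evaluated directly. The following Python sketch assumes that mu maps final states to probabilities and that the expectations f and g are evaluated pointwise on states:

```python
def expectation(mu, f):
    # E(f) under a finite-support (sub-)distribution mu of final states.
    return sum(p * f(s) for s, p in mu.items())

def covariance(mu, f, g):
    # Cov(f, g) = E(f * g) - E(f) * E(g), cf. Definition 12.
    return (expectation(mu, lambda s: f(s) * g(s))
            - expectation(mu, f) * expectation(mu, g))

# Toy example: a fair coin flip stored in the single variable s.
mu = {0: 0.5, 1: 0.5}
f = lambda s: s
print(covariance(mu, f, f))  # the variance of f: 0.25
```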

Since definedness of covariances is not always guaranteed, we first address the question whether a covariance is defined. According to Definition 12, the covariance \(\text {Cov}_{P,\sigma }(f,\, g)\) is defined if and only if

$$\begin{aligned} E _{P,\sigma }\left( {f}\right) ~<~ \infty \quad \text {and}\quad E _{P,\sigma }\left( {g}\right) ~<~ \infty \quad \text {and}\quad E _{P,\sigma }\left( {f \cdot g}\right) ~<~ \infty . \end{aligned}$$

Hence, we are concerned with the following problem set:

Definition 13

The problem set \(\mathcal {DCOVAR}\subset {\textsf {Prog}} \times \mathbb {S}\times \mathbb {E}\times \mathbb {E}\) is defined as

$$\begin{aligned} (P,\sigma ,f,g) \in \mathcal {DCOVAR}\quad \text {iff}\quad&\sigma \in \mathbb {S}_P ~\text {and}~ E _{P,\sigma }\left( {f}\right) ~<~ \infty ~\text {and}~ E _{P,\sigma }\left( {g}\right) ~<~ \infty \\&\text {and}~ E _{P,\sigma }\left( {f \cdot g}\right) ~<~ \infty ~. \end{aligned}$$

Apart from the usual condition that \(\sigma \) is a valid input state for P, each of these conditions can be reduced to the question whether the expected value of a random variable is finite. As a consequence of Definition 13 and Theorem 7, we obtain the following theorem on the hardness of deciding definedness of covariances:

Theorem 12

\(\mathcal {DCOVAR}\) is \(\varSigma _2^0\)-complete.

Proof

By Theorem 7 there is a decidable relation \(\mathcal {F}\) such that

$$\begin{aligned} (P,\, \sigma ,\, f) \in \mathcal {FEXP}\quad \text {iff}\quad E _{P,\sigma }\left( {f}\right) ~<~ \infty \quad \text {iff}\quad \exists \, y_1 \, \forall \, y_2:(y_1,\, y_2,\, P,\, \sigma ,\, f) \in \mathcal {F} \end{aligned}$$

By the above we have the following:

$$\begin{aligned}&(P,\, \sigma ,\, f,\, g) \in \mathcal {DCOVAR}\\ \Longleftrightarrow \quad&\sigma \in \mathbb {S}_P \,\wedge \, E _{P,\sigma }\left( {f}\right) ~<~ \infty \,\wedge \, E _{P,\sigma }\left( {g}\right) ~<~ \infty \,\wedge \, E _{P,\sigma }\left( {f \cdot g}\right) ~<~ \infty \\ \Longleftrightarrow \quad&\sigma \in \mathbb {S}_P \,\wedge \, \exists \, y_1 \, \forall \, y_2:(y_1,\, y_2,\, P,\, \sigma ,\, f) \in \mathcal {F} \\&\quad \,\wedge \, \exists \, y_1' \, \forall \, y_2':(y_1',\, y_2',\, P,\, \sigma ,\, g) \in \mathcal {F} \,\wedge \, \exists \, y_1'' \, \forall \, y_2'':(y_1'',\, y_2'',\, P,\, \sigma ,\, f \cdot g) \in \mathcal {F} \\ \Longrightarrow \quad&\mathcal {DCOVAR}\in \varSigma _2^0 \end{aligned}$$

It remains to show that \(\mathcal {DCOVAR}\) is \(\varSigma _2^0\)-hard. For that, we reduce the \(\varSigma _2^0\)-complete problem \(\mathcal {FEXP}\) to \(\mathcal {DCOVAR}\) by \(r(P,\, \sigma ,\, f) = (P,\, \sigma ,\, f,\, 0)\).

Correctness of the reduction: Consider the following:

$$\begin{aligned}&(P,\, \sigma ,\, f) \in \mathcal {FEXP}\\ \Longleftrightarrow \quad&E _{P,\sigma }\left( {f}\right)< \infty \\ \Longleftrightarrow \quad&E _{P,\sigma }\left( {f}\right)< \infty \,\wedge \, 0< \infty \,\wedge \, 0< \infty \\ \Longleftrightarrow \quad&E _{P,\sigma }\left( {f}\right)< \infty \,\wedge \, E _{P,\sigma }\left( {0}\right)< \infty \,\wedge \, E _{P,\sigma }\left( {0}\right)< \infty \\ \Longleftrightarrow \quad&E _{P,\sigma }\left( {f}\right)< \infty \,\wedge \, E _{P,\sigma }\left( {0}\right)< \infty \,\wedge \, E _{P,\sigma }\left( {f \cdot 0}\right) < \infty \\ \Longleftrightarrow \quad&(P,\, \sigma ,\, f,\, 0) \in \mathcal {DCOVAR}\end{aligned}$$

Thus, we have that \((P,\, \sigma ,\, f) \in \mathcal {FEXP}\) if and only if \(r(P,\, \sigma ,\, f) \in \mathcal {DCOVAR}\) and hence \(r:\mathcal {FEXP}\mathrel {\,{\le }_{\text {m}}}\mathcal {DCOVAR}\). \(\square \)

We now turn towards the hardness of approximating covariances. Analogously to our studies on the hardness of approximating preexpectations, we define three problem sets: one for lower bounds, one for upper bounds, and one for the exact value of \(\text {Cov}_{P,\sigma }(f,\,g)\).

Definition 14

(Approximation Problems for Covariances) The problem sets \(\mathcal {LCOVAR},\mathcal {RCOVAR},\mathcal {COVAR}\subset {\textsf {Prog}} \times \mathbb {S}\times \mathbb {E}\times \mathbb {E}\times \mathbb {Q}\) are defined as

$$\begin{aligned}&(P,\, \sigma ,\, f,\, g,\, q) \in {\mathcal {LCOVAR}} \quad \\&\qquad \text {iff}\quad \sigma \in \mathbb {S}_P \text { and } (P,\, \sigma ,\, f,\, g) \in {\mathcal {DCOVAR}} \text { and } q < \text {Cov}_{P,\sigma }(f,\, g)~,\\&(P,\, \sigma ,\, f,\, g,\, q) \in {\mathcal {RCOVAR}} \quad \\&\qquad \text {iff}\quad \sigma \in \mathbb {S}_P \text { and } (P,\, \sigma ,\, f,\, g) \in {\mathcal {DCOVAR}} \text { and } q > \text {Cov}_{P,\sigma }(f,\, g)~,\\&(P,\, \sigma ,\, f,\, g,\, q) \in {\mathcal {COVAR}} \quad \\&\qquad \text {iff}\quad \sigma \in \mathbb {S}_P \text { and } (P,\, \sigma ,\, f,\, g) \in {\mathcal {DCOVAR}} \text { and } q = \text {Cov}_{P,\sigma }(f,\, g)~. \end{aligned}$$

The first fact we establish on computational hardness of approximating covariances is that approximating lower bounds of covariances is \(\varSigma _2^0\)-complete:

Theorem 13

\(\mathcal {LCOVAR}\) is \(\varSigma _2^0\)-complete.

Proof

For proving the membership \(\mathcal {LCOVAR}\in \varSigma _2^0\), consider the following:

$$\begin{aligned}&(P,\sigma ,f,g,q) \in \mathcal {LCOVAR}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ (P,\sigma ,f,g) \in \mathcal {DCOVAR}~\wedge ~ q < \text {Cov}_{P,\sigma }(f,\, g)~~~~~~~~~~~~~~~~~ (\text {Definition}~14) \end{aligned}$$

By Theorem 12 (\(\mathcal {DCOVAR}\) is in \(\varSigma _2^0\)), there is a decidable relation \(\mathcal {D}\) such that

$$\begin{aligned}&(P,\sigma ,f,g,q) \in \mathcal {LCOVAR}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists y_1\, \forall \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} ~\wedge ~ q< \text {Cov}_{P,\sigma }(f,\, g) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists y_1\, \forall \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {Definition}~12) \\&\wedge ~ q< E _{P,\sigma }\left( {f \cdot g}\right) - E _{P,\sigma }\left( {f}\right) \cdot E _{P,\sigma }\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists y_1\, \forall \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {Definition}~7) \\&\wedge ~ q< \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists y_1\, \forall \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\wedge ~ \exists \,z_1:q< \sum _{k=0}^{z_1} E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists y_1\, \forall \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\wedge ~ \exists \,z_1 \forall \, z_2:q< \sum _{k=0}^{z_1} E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists y_1 \exists \, z_1\, \forall \, y_2 \forall \, z_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\wedge ~ q < \sum _{k=0}^{z_1} E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {g}\right) \\ \Longrightarrow ~&\mathcal {LCOVAR}\in \varSigma _2^0~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {the above is a } \varSigma _2^0\text {-formula}) \end{aligned}$$

For proving the \(\varSigma _2^0\)-hardness of \({\mathcal {LCOVAR}}\) we reduce \(\overline{{\small {\mathcal {AST}}}} \) to \(\mathcal {LCOVAR}\). Consider the reduction function \(r :\overline{{\small {\mathcal {AST}}}} \mathrel {\,{\le }_{\text {m}}}\mathcal {LCOVAR}\) with \(r(P,\, \sigma ) = (P',\, \sigma ,\, 1,\, 1,\, 0)\), where \(P'\) is given by

$$\begin{aligned} \left\{ {{\textsf {skip}}} \right\} \mathrel {\left[ {\nicefrac 1 2} \right] } \left\{ {P} \right\} ~. \end{aligned}$$

Correctness of the reduction: First observe that the expected value \( E _{P',\sigma }\left( {1}\right) \) ranges from 0 to 1. In particular, this means that the expected values of 1 as well as of \(1^2\) are finite, i.e. the covariance \(\text {Cov}_{P',\sigma }(1,\, 1)\) is defined. Then

$$\begin{aligned} \text {Cov}_{P',\sigma }(1,\, 1)&= E _{P',\sigma }\left( {1^2}\right) - E _{P',\sigma }\left( {1}\right) ^2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (\text {Definition}~12) \\&= E _{P',\sigma }\left( {1}\right) - E _{P',\sigma }\left( {1}\right) ^2~. \end{aligned}$$

Note that \( E _{P',\sigma }\left( {1}\right) \) is exactly the probability of \(P'\) terminating on input \(\sigma \). A plot of this termination probability against the resulting variance is given in Fig. 2. We observe that \(\text {Cov}_{P',\sigma }(1,\, 1) = E _{P',\sigma }\left( {1}\right) - E _{P',\sigma }\left( {1}\right) ^2 > 0\) holds iff \(P'\) terminates neither with probability 0 nor with probability 1. Since, however, \(P'\) terminates by construction at least with probability \(\nicefrac 1 2\), we obtain that \(\text {Cov}_{P',\sigma }(1,\, 1) > 0\) iff \(P'\) terminates with probability less than 1, which is the case iff P terminates with probability less than 1. Thus \(r(P,\, \sigma ) = (P',\, \sigma ,\, 1,\, 1,\, 0) \in \mathcal {LCOVAR}\) iff \((P,\, \sigma ) \in \overline{{\small {\mathcal {AST}}}} \) and therefore \(r:\overline{{\small {\mathcal {AST}}}} \mathrel {\,{\le }_{\text {m}}}\mathcal {LCOVAR}\). Since \(\overline{{\small {\mathcal {AST}}}} \) is \(\varSigma _2^0\)-complete, it follows that \(\mathcal {LCOVAR}\) is \(\varSigma _2^0\)-hard. \(\square \)

Fig. 2
figure 2

Plot of the termination probability of a program \(P'\) on input \(\sigma \) against the resulting variance. The curve is the graph of the polynomial \(x - x^2\)
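Numerically, the reduction behaves as follows: if P terminates on \(\sigma \) with probability p, then \(P'\) terminates with probability \(t = \nicefrac {1}{2} + \nicefrac {p}{2}\), and the resulting variance \(t - t^2\) is positive precisely for \(p < 1\). A minimal sketch:

```python
def variance_of_termination(p):
    # Termination probability of P' = {skip} [1/2] {P}.
    t = 0.5 + 0.5 * p
    # Cov(1, 1) = E(1) - E(1)**2, the curve x - x**2 of Fig. 2.
    return t - t * t

for p in (0.0, 0.5, 0.99, 1.0):
    print(p, variance_of_termination(p))  # positive iff p < 1
```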

Next, we show that approximating upper bounds for covariances is exactly as hard as approximating lower bounds, namely \(\varSigma _2^0\)-complete:

Theorem 14

\({\mathcal {RCOVAR}}\) is \(\varSigma _2^0\)-complete.

Proof

For proving membership \(\mathcal {RCOVAR}\in \varSigma _2^0\), consider the following:

$$\begin{aligned}&(P,\sigma ,f,g,q) \in \mathcal {RCOVAR}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ (P,\sigma ,f,g) \in \mathcal {DCOVAR}~\wedge ~ q > \text {Cov}_{P,\sigma }(f,\,g)~~~~~~~~~~~~~~~~~ (\text {Definition}~14) \end{aligned}$$

Now, by Theorem 12 (\(\mathcal {DCOVAR}\) is in \(\varSigma _2^0\)) there is a decidable relation \(\mathcal {D}\) such that

$$\begin{aligned}&(P,\sigma ,f,g,q) \in \mathcal {RCOVAR}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists \,y_1 \forall \,y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} ~\wedge ~ q> \text {Cov}_{P,\sigma }(f,\,g) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \wedge ~ \exists \,y_1 \forall \,y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} ~\wedge ~ \exists \, \delta> 0:q - \delta> \text {Cov}_{P,\sigma }(f,\,g) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \wedge ~ \exists \,y_1 \forall \,y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \quad (\text {Definition}~12) \\&\wedge ~ \exists \, \delta> 0:q - \delta> E _{P,\sigma }\left( {f \cdot g}\right) - E _{P,\sigma }\left( {f}\right) \cdot E _{P,\sigma }\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \wedge ~ \exists \,y_1 \forall \,y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \quad (\text {Definition}~7) \\&\wedge ~ \exists \, \delta> 0:q - \delta> \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \wedge ~ \exists \,y_1 \forall \,y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\wedge ~ \exists \, \delta> 0 \, \exists \, z_2 :q - \delta> \sum _{k=0}^{\infty } E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \wedge ~ \exists \,y_1 \forall \,y_2:(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\wedge ~ \exists \, \delta> 0 \, \exists \, z_2 \forall \, z_1 :q - \delta> \sum _{k=0}^{z_1} E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {g}\right) \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P \wedge ~ \exists \,y_1 \exists \,\delta> 0\, \exists \, z_2 \forall \,y_2 \forall \, z_1 :(y_1,\, y_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\wedge ~ q - \delta > \sum _{k=0}^{z_1} E _{P,\sigma }^{k}\left( {f \cdot g}\right) - \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {f}\right) \cdot \sum _{k=0}^{z_2} E _{P,\sigma }^{k}\left( {g}\right) \\ \Longrightarrow ~&\mathcal {RCOVAR}\in \varSigma _2^0\quad (\text {the above is a }\varSigma _2^0\text {-formula}) \end{aligned}$$

For proving the \(\varSigma _2^0\)-hardness of \(\mathcal {RCOVAR}\), we reduce the \(\varSigma _2^0\)-complete \(\overline{{\small {\mathcal {AST}}}} \) to \(\mathcal {RCOVAR}\). Let \((P,\sigma )\) be an instance of \(\overline{{\small {\mathcal {AST}}}} \). Consider the reduction function \(r(P,\, \sigma ) = \left( P',\, \sigma ,\, 1,\, 1,\, \nicefrac 1 4\right) \), with \(P'\) being the program

$$\begin{aligned} \left\{ {{\textsf {diverge}}} \right\} \mathrel {\left[ {\nicefrac 1 2} \right] } \left\{ {P} \right\} ~. \end{aligned}$$

Analogously to the proof of Theorem 13, \(\text {Cov}_{P',\sigma }(1,\, 1)\) is defined and we have

$$\begin{aligned} \text {Cov}_{P',\sigma }(1,\, 1) ~=~ E _{P',\sigma }\left( {1}\right) - E _{P',\sigma }\left( {1}\right) ^2~. \end{aligned}$$

Recall that \( E _{P',\sigma }\left( {1}\right) \) is exactly the probability of \(P'\) terminating on input \(\sigma \). By reconsidering Fig. 2, we can see that \(\text {Cov}_{P',\sigma }(1,\, 1) = E _{P',\sigma }\left( {1}\right) - E _{P',\sigma }\left( {1}\right) ^2 < \nicefrac {1}{4}\) holds iff \(P'\) does not terminate with probability \(\nicefrac 1 2\). Since by construction \(P'\) terminates with a probability of at most \(\nicefrac 1 2\), it follows that \(\text {Cov}_{P',\sigma }(1,\, 1) < \nicefrac 1 4\) holds iff \(P'\) terminates with probability less than \(\nicefrac 1 2\), which is the case iff P terminates with probability less than 1. Thus \(r(P,\, \sigma ) = \left( P',\, \sigma ,\, 1,\, 1,\, \nicefrac 1 4\right) \in \mathcal {RCOVAR}\) iff \((P,\, \sigma ) \in \overline{{\small {\mathcal {AST}}}} \) and therefore we have \(r:\overline{{\small {\mathcal {AST}}}} \mathrel {\,{\le }_{\text {m}}}\mathcal {RCOVAR}\). Since \(\overline{{\small {\mathcal {AST}}}} \) is \(\varSigma _2^0\)-complete, it follows that \(\mathcal {RCOVAR}\) is \(\varSigma _2^0\)-hard. \(\square \)

Regarding the hardness of deciding whether a given rational equals the covariance, we establish that \(\mathcal {COVAR}\) can be solved in \(\varDelta _3^0 = \varSigma _3^0 \cap \varPi _3^0\), that \(\mathcal {COVAR}\) is \(\varPi _2^0\)-hard, and that \(\mathcal {COVAR}\) is \(\varSigma _2^0\)-hard. \(\mathcal {COVAR}\) is therefore at least as hard as deciding whether an ordinary program terminates on all inputs, and at least as hard as deciding whether a probabilistic program terminates positively almost-surely.

Theorem 15

\(\mathcal {COVAR}\) is in \(\varDelta _3^0\).

Proof

To prove \(\mathcal {COVAR}\in \varDelta _3^0 = \varSigma _3^0 \cap \varPi _3^0\), consider that \((P,\sigma ,f,g,q) \in \mathcal {COVAR}\) if and only if

$$\begin{aligned}&\sigma \in \mathbb {S}_P ~\wedge ~ (P,\sigma ,f,g) \in \mathcal {DCOVAR}\\ ~\wedge ~&(P,\sigma ,f,g,q) \notin \mathcal {LCOVAR}~\wedge ~ (P,\sigma ,f,g,q) \notin \mathcal {RCOVAR}. \end{aligned}$$

Now, by Theorems 12, 13, and 14 there exist decidable relations \(\mathcal {D}\), \(\mathcal {L}\), \(\mathcal {R}\) such that

$$\begin{aligned}&(P,\sigma ,f,g,q) \in \mathcal {COVAR}\\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists x_1\, \forall \, x_2:(x_1,\, x_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\qquad \wedge ~ \lnot \exists \, y_1 \forall \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g,q) \in \mathcal {L} ~\wedge ~ \lnot \exists \, z_1 \forall \, z_2:(z_1,\, z_2,\, P,\sigma ,f,g,q) \in \mathcal {R} \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists x_1\, \forall \, x_2:(x_1,\, x_2,\, P,\sigma ,f,g) \in \mathcal {D} \\&\qquad \wedge ~ \forall \, y_1 \exists \, y_2:(y_1,\, y_2,\, P,\sigma ,f,g,q) \notin \mathcal {L} ~\wedge ~ \forall \, z_1 \exists \, z_2:(z_1,\, z_2,\, P,\sigma ,f,g,q) \notin \mathcal {R} \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \exists x_1\, \forall \, x_2 \forall \, y_1 \forall \, z_1 \exists \, y_2 \exists \, z_2 :(x_1,\, x_2,\, P,\sigma ,f,g) \in \mathcal {D}~~~~~~ (\text {this is a }\varSigma _3^0\text {-formula})\\&\qquad \wedge ~ (y_1,\, y_2,\, P,\sigma ,f,g,q) \notin \mathcal {L} ~\wedge ~ (z_1,\, z_2,\, P,\sigma ,f,g,q) \notin \mathcal {R} \\ \Longleftrightarrow ~&\sigma \in \mathbb {S}_P ~\wedge ~ \forall \, y_1 \forall \, z_1 \exists \, y_2 \exists \, z_2 \exists \,x_1 \forall \, x_2 :(x_1,\, x_2,\, P,\sigma ,f,g) \in \mathcal {D}~~~~~~ (\text {this is a }\varPi _3^0\text {-formula})\\&\qquad \wedge ~ (y_1,\, y_2,\, P,\sigma ,f,g,q) \notin \mathcal {L} ~\wedge ~ (z_1,\, z_2,\, P,\sigma ,f,g,q) \notin \mathcal {R} \end{aligned}$$

Hence, there exists a \(\varSigma _3^0\)-formula and a \(\varPi _3^0\)-formula both defining \(\mathcal {COVAR}\) and therefore \(\mathcal {COVAR}\in \varSigma _3^0 \cap \varPi _3^0 = \varDelta _3^0\). \(\square \)

Theorem 16

\(\mathcal {COVAR}\) is \(\varPi _2^0\)-hard.

Proof

We reduce the \(\varPi _2^0\)-complete \(\mathcal {AST} \) to \(\mathcal {COVAR}\). Let \((P,\sigma )\) be an instance of \(\mathcal {AST} \). Consider the reduction function \(r(P,\, \sigma ) = \left( P',\, \sigma ,\, 1,\, 1,\, \nicefrac 1 4\right) \), with \(P'\) being the program

$$\begin{aligned} \left\{ {{\textsf {diverge}}} \right\} \mathrel {\left[ {\nicefrac 1 2} \right] } \left\{ {P} \right\} ~. \end{aligned}$$

Again, the covariance \(\text {Cov}_{P',\sigma }(1,\,1)\) is defined and we have

$$\begin{aligned} \text {Cov}_{P',\sigma }(1,\, 1) ~=~ E _{P',\sigma }\left( {1}\right) - E _{P',\sigma }\left( {1}\right) ^2 \end{aligned}$$

of which the plot is depicted in Fig. 2. Recall that \( E _{P',\sigma }\left( {1}\right) \) is exactly the probability of \(P'\) terminating on input \(\sigma \). We can see that \(\text {Cov}_{P',\sigma }(1,\, 1) = E _{P',\sigma }\left( {1}\right) - E _{P',\sigma }\left( {1}\right) ^2 = \nicefrac 1 4\) holds iff \(P'\) terminates with probability exactly \(\nicefrac 1 2\), since \(\nicefrac 1 4\) is the maximum of \(x - x^2\), attained only at \(x = \nicefrac 1 2\). As \(P'\) by construction terminates with exactly half the probability of P, this is the case iff P terminates almost-surely. Thus \(r(P,\, \sigma ) = \left( P',\, \sigma ,\, 1,\, 1,\, \nicefrac 1 4\right) \in \mathcal {COVAR}\) iff \((P,\, \sigma ) \in \mathcal {AST} \) and therefore \(r:\mathcal {AST} \mathrel {\,{\le }_{\text {m}}}\mathcal {COVAR}\). Since \(\mathcal {AST} \) is \(\varPi _2^0\)-complete, we obtain that \(\mathcal {COVAR}\) is \(\varPi _2^0\)-hard. \(\square \)

Theorem 17

\(\mathcal {COVAR}\) is \(\varSigma _2^0\)-hard.

Proof

We reduce the \(\varSigma _2^0\)-complete \(\mathcal {FEXP}\) to \(\mathcal {COVAR}\). For that, consider the reduction function \(r(P,\, \sigma ,\, f) = \left( P,\, \sigma ,\, f,\, 0,\, 0\right) \).

Correctness of the reduction: The covariance of f and 0 is defined if and only if the expected value of f is finite, the expected value of 0 is finite (which is trivially satisfied), and the expected value of \(f \cdot 0 = 0\) is finite (which is again trivially satisfied). Whenever this covariance is defined, it equals 0, since \(\text {Cov}_{P,\sigma }(f,\, 0) = E _{P,\sigma }\left( {f \cdot 0}\right) - E _{P,\sigma }\left( {f}\right) \cdot E _{P,\sigma }\left( {0}\right) = 0\) for any random variable f. Thus, the covariance is defined and its value is 0 if and only if the expected value of f after executing P on input \(\sigma \) is finite. Hence, \(r :\mathcal {FEXP}\mathrel {\,{\le }_{\text {m}}}\mathcal {COVAR}\). \(\square \)

Remark 1

(The Hardness of Approximating Variances) Variance approximation is not easier than covariance approximation, i.e. the same hardness results as in Theorems 13 through 16 hold for analogous variance approximation problems. In fact, we have always reduced to approximating a variance—the variance of termination—for obtaining our hardness results on covariances. The only exception is Theorem 17, where two different expectations are used.

Fig. 3
figure 3

The complexity landscape of analysis problems for probabilistic programs. All analysis problems, except for the \(\varSigma _2^0\)- and \(\varPi _2^0\)-hard problem \(\mathcal {COVAR}\), are complete for the respective level in the arithmetical hierarchy

It can furthermore be seen that variance approximation is not harder than covariance approximation. The proofs are analogous to the corresponding proofs for covariances presented in this section. The main difference is that we additionally have to consider the case that \( E _{P,\sigma }\left( {f^2}\right) \) is infinite (otherwise, we have \(\text {Var}_{P,\sigma }(f) = \text {Cov}_{P,\sigma }(f,\, f)\)). In this case it suffices to drop the finiteness check for \( E _{P,\sigma }\left( {f^2}\right) \) from \(\mathcal {DCOVAR}\) when approximating lower bounds of variances. This does not change the complexity, because we still have to check that the expected value of f is finite. When approximating upper bounds, no change is required: If \( E _{P,\sigma }\left( {f^2}\right) \) is infinite, so is the variance, and no a priori fixed constant q is an upper bound of the variance. For computing exact variances, we proceed as in Theorem 15. \(\triangle \)

As an immediate consequence of Theorems 13 and 14, computing upper bounds and computing lower bounds for covariances are equally difficult. This is in contrast to the situation for expected values: While computing upper bounds for expected values is \(\varSigma _2^0\)-complete, we have seen that computing lower bounds is “only” \(\varSigma _1^0\)-complete; lower bounds are thus computably enumerable. We can therefore even computably enumerate an ascending sequence that converges monotonically to the sought-after expected value. By Theorems 13 and 14, this is not possible for covariances, as \(\varSigma _2^0\)-sets are in general not computably enumerable.

For deciding whether the covariance equals a given rational, we have established membership in \(\varDelta _3^0\) (see Theorem 15), and this containment is proper in the following sense: Since \(\mathcal {COVAR}\) is \(\varSigma _2^0\)-hard (Theorem 17), it cannot lie in \(\varPi _2^0\), and since it is \(\varPi _2^0\)-hard (Theorem 16), it cannot lie in \(\varSigma _2^0\). Since there exist no complete problems in \(\varDelta _n^0\), for \(n \ge 2\) (in the sense of many-one reducibility) [40, Exercise 14–14, p. 332], it is impossible to establish a completeness result for \(\mathcal {COVAR}\) similar to our other hardness results.

Our hardness results rule out techniques based on finite loop-unrollings as complete approaches for reasoning about the covariances of values of probabilistic programs. An invariant-based approach to overcome this problem is outlined in [22]. This approach also extends to reasoning on runtime variances.

6 Conclusion and discussion

We have studied the computational complexity of solving a variety of natural problems which appear in the analysis of probabilistic programs: Approximating lower bounds, upper bounds, and exact preexpectations (\(\mathcal {LEXP}\), \(\mathcal {REXP}\), and \(\mathcal {EXP}\)), deciding whether a preexpectation is finite (\(\mathcal {FEXP}\)), deciding non-universal and universal almost-sure termination (\(\mathcal {AST} \) and \(\mathcal {UAST} \)), deciding non-universal and universal positive almost-sure termination (\(\mathcal {PAST} \) and \(\mathcal {UPAST} \)), and approximating lower bounds, upper bounds, and exact covariances (\(\mathcal {LCOVAR}\), \(\mathcal {RCOVAR}\), and \(\mathcal {COVAR}\)). Our complexity results are summarized in Fig. 3. Lower bounds of preexpectations are computably enumerable. All other problems are strictly more difficult. Each of the examined problems, except for \(\mathcal {COVAR}\), is complete for its respective level of the arithmetical hierarchy. For \(\mathcal {COVAR}\) we have established that it is both \(\varSigma _2^0\)- and \(\varPi _2^0\)-hard but lies in \(\varDelta _3^0\).

An interesting issue that is raised by our results is the following: We see that universal positive almost-sure termination (\(\mathcal {UPAST} \)) is computationally strictly harder to decide than universal almost-sure termination (\(\mathcal {UAST} \)), namely \(\varPi _3^0\)-complete as opposed to \(\varPi _2^0\)-complete. Yet for \(\mathcal {UPAST} \) there exist several complete approaches based on supermartingales [7, 24]. These approaches seem to work quite well in practice [8] and moreover, they are conceptually quite easy. Techniques for proving \(\mathcal {UAST} \), on the other hand, seem much more involved [32]. So while \(\mathcal {UPAST} \) is harder to verify than \(\mathcal {UAST} \) in theory, it appears to be more approachable in practice.

This issue is not new to us: While \(\mathcal {UPAST} \) requires proving an upper bound on the expected runtime, \(\mathcal {UAST} \) requires proving a (non-strict) lower bound (namely 1) on the termination probability. We have encountered a very similar phenomenon in [24]: While proving upper bounds on the expected runtime is conceptually easy, proving lower bounds is much more involved. The same discrepancy seems to apply to lower and upper bounds of expected values: Proving upper bounds is conceptually easier than proving lower bounds, whereas in theory proving upper bounds should be computationally harder.

While we have no solution to this “paradox”, we do believe that the issue is worth investigating. An interesting direction could be to perform a smoothed analysis of the problemFootnote 14 [4] for both \(\mathcal {UPAST} \) and \(\mathcal {UAST} \) and to see whether this yields valuable insights into the paradox.

Another direction for future research is to investigate the interplay of nondeterministic and probabilistic behavior. We conjecture that for demonic nondeterminism, all our results remain valid. For angelic nondeterminism, there is evidence that the problems might become harder, i.e. the complexity classes may change [9].