1 Introduction

Formal methods [4, 5, 9] advocate the critical role played by the algebra of programs in the specification and implementation of programs [6, 8]. These studies lead to the conclusion that both the top-down approach (with the denotational model as its origin) [2, 3, 11] and the bottom-up approach (a journey that starts from the operational model) [10] can meet in the middle.

This paper proposes a new roadmap for linking theories of probabilistic programming. Our new journey consists of the following steps:

Step 1: First we present an algebraic framework for a probabilistic programming language, which provides a set of algebraic laws for probabilistic programs, and introduces the concept of finite normal form. This paper then defines the refinement relation \(\sqsubseteq _{A}\), and demonstrates how to reduce finite programs into finite normal form, and to transform an infinite program into an ascending chain of finite normal forms.

Step 2: Within the given program algebra we discuss the algebraic properties of the test operator \(\mathcal{T}\), which composes a test case tc with the program P under test:

$$\mathcal{T}(tc,\,P)~=_{df}(tc;P)$$

where tc is represented by a total constant assignment

$$\begin{aligned} x,y,..,z:=a,\,b,\,..,c \end{aligned}$$

Based on the algebra of test, this paper identifies a probabilistic program P as a binary relation [P] which relates the test case with the final observation

$$[P]\,=_{df}~\{(tc,\,\,obs)~|~\mathcal{T}(tc,\,P)\,\sqsubseteq _{A}\,obs\}$$

and selects the set inclusion as the refinement relation \(\sqsubseteq _{rel}\)

$$\begin{aligned} P\,\sqsubseteq _{rel}\,Q~=_{df}~([P]\supseteq [Q]) \end{aligned}$$

We establish the consistency of the denotational model against the algebraic framework by proof of

$$\begin{aligned} \sqsubseteq _{rel}\,=\,\sqsubseteq _{A} \end{aligned}$$

Step 3: We propose an algebraic definition of the consistency of the step relation of the transition system of programs, such that any consistent transition system \((O,\,\sqsubseteq _{O})\) satisfies

$$\begin{aligned} \sqsubseteq _{O}\,=\,\sqsubseteq _{A} \end{aligned}$$

The paper is organised in the following way:

Section 2 is devoted to the algebraic framework of a probabilistic programming language with a collection of algebraic laws. Section 3 introduces the normal forms of finite and infinite probabilistic programs and proves that any probabilistic program can be converted into normal form with the algebraic laws. Section 4 presents a test-based model, where each program is identified with a binary relation between a test case and the visible observation recorded during the execution of the test. It is shown that the refinement relation \(\sqsubseteq _{rel}\) in the test model is equivalent to the algebraic refinement \(\sqsubseteq _{A}\). Section 5 proposes a formal definition for the consistency of the step relation of a transition system against the algebra of programs. Moreover, it provides a transition system for the probabilistic programming language, and establishes its correctness. The paper ends with a short summary.

2 Probabilistic Programming Language

This section constructs an algebraic framework for the probabilistic programming language introduced in [12]:

\(P\,::=\,\bot ~|~\mathbf{skip}~|~var:=exp\)

\(|~P\,\lhd \,bexp\,\rhd \,P\)

\(|~P\,;\,P\)

\(|~\oplus \{\underline{G}\}\)

\(|~\mu \,X\,\bullet \,P(X)\)

\(\oplus \{\underline{G}\}\) denotes the probabilistic choice with a list of weighted alternatives \(\underline{G}\) as its argument

                     \(\underline{G}\,::=\,<>~|~\alpha (v):P,\,\underline{G}\)

where the expression \(\alpha (v)\) maps any given value of program variable v to a nonnegative real number.
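To fix intuitions, the abstract syntax above can be represented directly as a small datatype. The following Python sketch is purely illustrative (the class names and the encoding of expressions and weights as functions of the data state are our own choices, not part of [12]):

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Bottom:                           # ⊥ : chaos
    pass

@dataclass
class Skip:                             # skip
    pass

@dataclass
class Assign:                           # var := exp
    var: str
    exp: Callable[[dict], object]       # an expression, evaluated on the data state

@dataclass
class Cond:                             # P ◁ bexp ▷ Q
    left: object
    test: Callable[[dict], bool]
    right: object

@dataclass
class Seq:                              # P ; Q
    first: object
    second: object

@dataclass
class PChoice:                          # ⊕{α1(v):P1, ..., αn(v):Pn}
    alts: List[Tuple[Callable[[dict], float], object]]   # weight functions and guarded programs

@dataclass
class Mu:                               # µ X • P(X)
    body: Callable[[object], object]    # the context P(·), applied to the bound identifier

# Example: flip x fairly, then set y to x + 1.
prog = Seq(PChoice([(lambda s: 0.5, Assign('x', lambda s: 0)),
                    (lambda s: 0.5, Assign('x', lambda s: 1))]),
           Assign('y', lambda s: s['x'] + 1))
```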

2.1 Probabilistic Choice

This section presents the algebraic properties of the probabilistic choice, which plays a crucial role in construction of normal form for the probabilistic language, and provides an elegant representation for finite observation. Later we are also going to use these algebraic laws to show that any finite program can be converted into a probabilistic choice.

The probabilistic choice is commutative.

(\(\oplus \)-1) \(\oplus \{\beta _1:P_1,...,\beta _{m}:P_{m}\}~=_{A} \oplus \{\beta _{\rho (1)}:P_{\rho (1)},...,\beta _{\rho (m)}:P_{\rho (m)}\}\)

where \(\rho \) is an arbitrary permutation of the list \(<1,...,m>\).

The alternative \((1-\beta ):\bot \) can be added to the probabilistic choice construct where \(\beta ~=_{df}~\varSigma _{i} \beta _{i}\).

(\(\oplus \)-2) \(\oplus \{\beta _1:P_1,...,\beta _{m}:P_{m}\}~=_{A} \oplus \{\beta _1:P_1,...,\beta _{m}:P_{m},\,(1-\beta ):\bot \}\)

The probabilistic choice operator becomes void whenever it contains an alternative with the probability 1.

(\(\oplus \)-3) \(\oplus \{1:Q\}~=_{A}~Q\)

Corollary

\(\oplus \{\}~=_{A}~\bot \)

Proof

From \(\oplus \)-2 and \(\oplus \)-3.

The next law shows how to eliminate the nested choices.

(\(\oplus \)-4) Let \(P\,=\,\oplus \{\beta _1:P_1,...,\beta _{m}:P_{m}\}\), then

\(\oplus \{\alpha :P,\,\underline{G}\}\)

\(~=_{A}~\oplus \{(\alpha \cdot \beta _1):P_1,..., (\alpha \cdot \beta _{m}):P_{m},\,\underline{G}\}\)

The probabilistic choice operator distributes over sequential composition.

(\(\oplus \)-5) \(\oplus \{\beta _1:P_1,...,\beta _{m}:P_{m}\}\,;\,Q\)

\(=_{A}~\oplus \{\beta _1:(P_1;Q),...,\,\beta _{m}:(P_{m};Q)\}\)

Assignment distributes through the probabilistic choice.

(\(\oplus \)-6) \((v:=e)\,;\,\oplus \{\beta _1:P_1,...,\beta _{m}:P_{m}\}\)

\(=_{A}~\oplus \{\beta _1[e/v]:(v:=e;P_1),...,\,\beta _{m}[e/v]:(v:=e;P_{m})\}\)

Two alternatives with the same guarded program can be merged.

(\(\oplus \)-7) \(\oplus \{\alpha :Q,\,\beta :Q,\,\underline{G}\}~=_{A}~ \oplus \{(\alpha +\beta ):Q,\,\underline{G}\}\)

Any alternative with zero probability can be removed.

(\(\oplus \)-8) \(\oplus \{0:Q,\,\underline{G}\}~=_{A}~ \oplus \{\underline{G}\}\)
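Taken together, laws ⊕-2, ⊕-7 and ⊕-8 justify a simple normalisation of any list of weighted alternatives whose weights have already been evaluated to constants: drop zero-weight entries, merge duplicate programs, and attach the missing probability mass to ⊥. A hedged sketch (the helper name and representation are ours):

```python
def normalise(alts):
    """alts: list of (weight, program) pairs with constant weights summing to at most 1.
    Returns an equivalent alternative list: zero weights dropped (law ⊕-8), alternatives
    with the same program merged (⊕-7), and the missing mass attached to ⊥ (⊕-2)."""
    merged = {}
    for w, p in alts:
        if w > 0:                                   # ⊕-8
            merged[p] = merged.get(p, 0.0) + w      # ⊕-7
    missing = 1.0 - sum(merged.values())
    if missing > 1e-12:                             # ⊕-2
        merged['⊥'] = merged.get('⊥', 0.0) + missing
    return list(merged.items())

# ⊕{0.3:Q, 0:R, 0.2:Q} is reduced to ⊕{0.5:Q, 0.5:⊥}
print(normalise([(0.3, 'Q'), (0.0, 'R'), (0.2, 'Q')]))
```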

2.2 Conditional Choice

Conditional choice can be seen as a special form of probabilistic choice.

cond-1 \((P\lhd b\rhd Q)~=_{A}~\oplus \{(1\lhd b\rhd 0):P,\,(0\lhd b\rhd 1):Q\}\)

From Law cond-1 and the laws of probabilistic choice presented in the previous section, we can derive the following set of well-known properties of conditional choice:

Theorem 2.1

(1) \(P\lhd b\rhd P~=_{A}~P\)

(2) \(P\lhd b\rhd Q~=_{A}~Q\lhd \lnot b\rhd P\)

(3) \((P\lhd b\rhd Q)\lhd c\rhd R~=_{A}~P\lhd b\wedge c\rhd (Q\lhd c\rhd R)\)

(4) \(P\lhd b\rhd (Q\lhd c\rhd R)~=_{A}~(P\lhd b\rhd Q)\lhd c\rhd (P\lhd b\rhd R)\)

(5) \(P\lhd true\rhd Q~=_{A}~P~=_{A}~Q\lhd false\rhd P\)

(6) \((P\lhd b\rhd Q);R~=_{A}~(P;R)\lhd b\rhd (Q;R)\)

(7) \((v:=e);(P\lhd b\rhd Q)~=_{A}~((v:=e);P)\lhd b[e/v]\rhd ((v:=e);Q)\)

Proof

For any finite program P:

\(\begin{array}{clr} (1) &{} P\lhd b\rhd P &{}\{\text{ cond }-1\} \\ =_{A} &{} \oplus \{(1\lhd b\rhd 0):P,~(0\lhd b\rhd 1):P\} &{}\{\oplus -7\} \\ =_{A} &{} \oplus \{1:P\} &{}\{\oplus -3\} \\ =_{A} &{} P \end{array}\)

\(\begin{array}{clr} (2) &{} P\lhd b\rhd Q &{}\{\text{ cond }-1\} \\ =_{A} &{} \oplus \{(1\lhd b\rhd 0):P,\,(0\lhd b\rhd 1):Q\} &{}\{\oplus -1\} \\ =_{A} &{} \oplus \{(1\lhd \lnot b\rhd 0):Q,\,(0\lhd \lnot b\rhd 1):P\} &{}\{\text{ cond }-1\} \\ =_{A} &{} Q\lhd \lnot b\rhd P \end{array}\)

\(\begin{array}{clr} (3) &{} (P\lhd b\rhd Q)\lhd c\rhd R &{}\{\text{ cond }-1\} \\ =_{A} &{}\oplus \left\{ \begin{array}{l} (1\lhd c\rhd 0):\oplus \{(1\lhd b\rhd 0):P,\,(0\lhd b\rhd 1):Q\},\,\\ (0\lhd c\rhd 1):R \end{array}\right\} &{}\{\oplus -4\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (1\lhd b\wedge c\rhd 0):P,\\ (1\lhd \lnot b\wedge c\rhd 0):Q,\\ (1\lhd \lnot c\rhd 0):R \end{array}\right\} &{}\{\oplus -4\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (1\lhd b\wedge c\rhd 0):P,\\ (0\lhd b\wedge c\rhd 1):\oplus \{(1\lhd c\rhd 0):Q,\,(0\lhd c\rhd 1):R\} \end{array}\right\} &{}\{\text{ cond }-1\} \\ =_{A} &{} P\lhd b\wedge c\rhd (Q\lhd c\rhd R) \end{array}\)

\(\begin{array}{clr} (4) &{} P\lhd b\rhd (Q\lhd c\rhd R) &{}\{\text{ cond }-1\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (1\lhd b\rhd 0):P,\\ (0\lhd b\rhd 1):\oplus \{(1\lhd c\rhd 0):Q,~(0\lhd c\rhd 1):R\} \end{array}\right\} &{}\{\oplus -4\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (1\lhd b\rhd 0):P,\\ (1\lhd \lnot b\wedge c\rhd 0):Q,~\\ (1\lhd \lnot b\wedge \lnot c\rhd 0):R \end{array}\right\} &{}\{\oplus -7\} \\ =_{A} &{}\oplus \left\{ \begin{array}{l} (1\lhd b\wedge c\rhd 0):P,\\ (1\lhd b\wedge \lnot c\rhd 0):P,\\ (1\lhd \lnot b\wedge c\rhd 0):Q,~\\ (1\lhd \lnot b\wedge \lnot c\rhd 0):R \end{array}\right\} &{}\{\oplus -1~\text{ and } \text{ cond }-1\} \\ =_{A} &{}\oplus \left\{ \begin{array}{l} (1\lhd c\rhd 0):(P\lhd b\rhd Q),\\ (0\lhd c\rhd 1):(P\lhd b\rhd R) \end{array}\right\} &{}\{\text{ cond }-1\} \\ =_{A} &{} (P\lhd b\rhd Q)\lhd c\rhd (P\lhd b\rhd R) \end{array}\)

\(\begin{array}{clr} (5) &{}P\lhd true\rhd Q &{}\{\text{ cond }-1\} \\ =_{A} &{} \oplus \{(1\lhd true\rhd 0):P,~(0\lhd true\rhd 1):Q\} &{}\{\oplus -8\} \\ =_{A} &{} \oplus \{1:P\} &{}\{\oplus -3\} \\ =_{A} &{} P &{}\{\oplus -1,\,3~\text{ and }~8\} \\ =_{A} &{} \oplus \{0:Q,~1:P\} &{}\{\text{ calculation }\} \\ =_{A} &{} \oplus \{(1\lhd false\rhd 0):Q,~(0\lhd false\rhd 1):P\} &{}\{\text{ cond }-1\} \\ =_{A} &{} Q\lhd false\rhd P \end{array}\)

(6) From cond-1 and \(\oplus \)-5.

(7) From cond-1 and \(\oplus \)-6.

The probabilistic choice operator distributes over conditional.

Theorem 2.2

Let \(P\,=\,\oplus \{\beta _1:P_1,...,\beta _{m}:P_{m}\}\), then

\((P\lhd b\rhd Q)~=_{A}~ \oplus \{\beta _1:(P_1\lhd b\rhd Q),...,\beta _{m}:(P_{m}\lhd b\rhd Q)\}\)

provided that \(\varSigma _{i}\beta _{i}\,=\,1\)

Proof

\(\begin{array}{clr} &{} P\lhd b\rhd Q &{}\{\mathbf{cond}-1\} \\ =_{A} &{} \oplus \{(1\lhd b\rhd 0):P,~(0\lhd b\rhd 1):Q\} &{}\{\oplus -4\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (\beta _1\lhd b\rhd 0):P_1,\\ ....,\\ (\beta _{m}\lhd b\rhd 0):P_{m},\\ (0\lhd b\rhd 1):Q \end{array}\right\}&\{(\oplus -7)~\text{ and } \text{ assumption: }\varSigma _{i}\beta _{i}=1\} \end{array}\)

\(\begin{array}{clr} =_{A} &{} \oplus \left\{ \begin{array}{l} (\beta _1\lhd b\rhd 0):P_1,\\ ....,\\ (\beta _{m}\lhd b\rhd 0):P_{m},\\ (0\lhd b\rhd \beta _1):Q,\\ .....,\\ (0\lhd b\rhd \beta _{m}):Q \end{array}\right\} &{}\{\text{ cond }-1\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} \beta _1:(P_1\lhd b\rhd Q),\\ ....,\\ \beta _{m}:(P_{m}\lhd b\rhd Q) \end{array}\right\} \end{array}\)

2.3 Sequential Composition

Sequential composition in the probabilistic programming language inherits the algebraic laws of its counterpart in the conventional programming language. It is associative, and has \(\bot \) as its zero, and skip as its unit.

seq-1. \(P;(Q;R)~=_{A}~(P;Q);R\)

seq-2. \(\bot ;Q~=_{A}~\bot ~=_{A}~P;\bot \)

seq-3. \(\mathbf{skip};Q~=_{A}~Q~=_{A}~Q;\mathbf{skip}\)

2.4 Total Assignment

An assignment is a total one if all the variables of the program appear on the left hand side in some standard order

$$\begin{aligned} x,\,y,\,..,z~:=~e,\,f,\,...,\,g \end{aligned}$$

A non-total assignment \(x:=e\) can be transformed to a total assignment by addition of identity assignments

asgn-1. \((x:=e) ~=_A~ (x,\,y,\,..,z:=e,\,y,...,z)\)

For the notational simplicity we will use v to stand for the list \(x,y,\,..,z\) of program variables and \(v:=e\) for a total assignment.

The list of variables may be sorted into any desired order, provided that the right-hand side is subject to the same permutation.

asgn-2. \((x,\,y,\,..:=e,\,f,..)~=_A~(y,\,x,..:=f,\,e,..)\)

The following law enables us to eliminate sequential composition between total assignments

asgn-3. \((v:=e\,;\,v:=f(v))~=_A~(v:=f(e))\)

where the expression f(e) is calculated by substituting the expressions in the list e for the corresponding variables in the list v.

The following law deals with the conditional of total assignments

asgn-4. \(((v:=e)\lhd b\rhd (v:=f))~=_A~(v:=(e\lhd b\rhd f))\)

where the conditional expression \(e\lhd b\rhd f\) is defined mathematically:

\( e\lhd b\rhd f~ \left\{ \begin{array}{ll} =_{df}~e &{} \text{ if }~b\\ =_{df}~f &{} \text{ if }~\lnot b \end{array}\right. \)

Finally, we need a law that determines when two total assignments are equal.

asgn-5. \((v:=f)~=_{A}~(v:=g)\) iff \(\forall v\bullet f(v)=g(v)\)
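Reading a total assignment v := e as a function from data states to data states, law asgn-3 is simply function composition, and asgn-5 is extensional equality of those functions. A small illustrative sketch (our own encoding, with expressions as functions of the whole state):

```python
def total_assign(exps):
    """A total assignment v := e as a state transformer.
    exps maps every program variable to an expression, i.e. a function of the whole state."""
    return lambda state: {x: e(state) for x, e in exps.items()}

def seq_assign(first, second):
    """asgn-3: (v:=e ; v:=f(v)) = (v := f(e)) -- composition of the two transformers."""
    return lambda state: second(first(state))

inc_x = total_assign({'x': lambda s: s['x'] + 1, 'y': lambda s: s['y']})
swap  = total_assign({'x': lambda s: s['y'],     'y': lambda s: s['x']})
both  = seq_assign(inc_x, swap)
print(both({'x': 1, 'y': 5}))   # {'x': 5, 'y': 2}, i.e. the single total assignment x, y := y, x + 1
```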

3 Normal Form Reduction

This section is devoted to the concept of normal form. It will deal with the following issues:

  • Transform a finite program into a finite normal form based on the algebraic laws of the previous section.

  • Introduce the least upper bound operator for an ascending chain of finite normal forms.

  • Establish the continuity of programming combinators.

  • Verify the continuity of the recursion operator.

  • Convert an infinite program into an ascending chain of finite normal forms.

First we introduce the concept of finite normal form.

3.1 Finite Normal Form

Definition 3.1

(finite normal form).

A finite normal form is a probabilistic choice with total assignments as its alternatives:

$$\begin{aligned} \oplus \{\beta _1:(v:=e_1),\,...\,,\beta _{k}:(v:=e_{k})\} \end{aligned}$$

Theorem 3.2

Let \(S_1~=~\oplus \{\beta _1:(v:=e_1),\,..,\,\beta _{m}:(v:=e_{m})\}\)

and \(S_2~=~\oplus \{\alpha _1:(v:=f_1),\,..,\,\alpha _{n}:(v:=f_{n})\}\).

Then

(1) \(S_1;S_2~=_{A} \oplus \left\{ \begin{array}{l} (\beta _1\cdot \alpha _1):(v:=f_1(e_1)),\\ ...,\\ (\beta _1\cdot \alpha _{n}):(v:=f_{n}(e_1)),\\ ...\\ (\beta _{m}\cdot \alpha _1):(v:=f_1(e_{m})),\\ ...,\\ (\beta _{m}\cdot \alpha _{n}):(v:=f_{n}(e_{m})) \end{array}\right\} \)

(2) \(S_1\lhd b\rhd S_2~=_{A} \oplus \left\{ \begin{array}{l} (\beta _1\lhd b\rhd 0):(v:=e_1),\\ ......,\\ (\beta _{m}\lhd b\rhd 0):(v:=e_{m}),\\ (\alpha _1\lhd \lnot b\rhd 0):(v:=f_1),\\ ......,\\ (\alpha _{n}\lhd \lnot b\rhd 0):(v:=f_{n}) \end{array}\right\} \)

Proof

\(\begin{array}{clr} (1) &{} S_1;S_2 &{}\{\oplus -5\} \\ =_{A} &{} \oplus \{\beta _1:(v:=e_1;S_2),\,..,\,\beta _{m}:(v:=e_{m};S_2)\} &{}\{\oplus -6\} \\ =_{A}&{} \oplus \left\{ \begin{array}{l} \beta _1: \oplus \left\{ \begin{array}{l} \alpha _1[e_1/v]:(v:=e_1;v:=f_1),\\ ...,\\ \alpha _{n}[e_1/v]:(v:=e_1;v:=f_{n}) \end{array}\right\} ,\\ ......, \\ \beta _{m}: \oplus \left\{ \begin{array}{l} \alpha _1[e_{m}/v]:(v:=e_{m};v:=f_1),\\ ...,\\ \alpha _{n}[e_{m}/v]:(v:=e_{m};v:=f_{n}) \end{array}\right\} \\ \end{array}\right\} &{}\{\mathbf{asgn}-3\} \\ =_{A}&{} \oplus \left\{ \begin{array}{l} \beta _1: \oplus \left\{ \begin{array}{l} \alpha _1[e_1/v]:(v:=f_1(e_1)),\\ ..., \alpha _{n}[e_1/v]:(v:=f_{n}(e_1)) \end{array}\right\} ,\\ ......, \\ \beta _{m}:\oplus \left\{ \begin{array}{l} \alpha _1[e_{m}/v]:(v:=f_1(e_{m})),\\ ...,\\ \alpha _{n}[e_{m}/v]:(v:=f_{n}(e_{m})) \end{array}\right\} \\ \end{array}\right\} &{}\{\oplus -4\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (\beta _1\cdot \alpha _1):(v:=f_1(e_1)),\\ ...,\\ (\beta _1\cdot \alpha _{n}):(v:=f_{n}(e_1)),\\ ...\\ (\beta _{m}\cdot \alpha _1):(v:=f_1(e_{m})),\\ ...,\\ (\beta _{m}\cdot \alpha _{n}):(v:=f_{n}(e_{m})) \end{array}\right\} \end{array}\)

\(\begin{array}{clr} (2) &{} S_1\lhd b\rhd S_2 &{}\{\text{ cond }-1\} \\ =_{A} &{} \oplus \{(1\lhd b\rhd 0):S_1,~(1\lhd \lnot b\rhd 0):S_2\} &{}\{\oplus -4\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} (\beta _1\lhd b\rhd 0):(v:=e_1),\\ ......,\\ (\beta _{m}\lhd b\rhd 0):(v:=e_{m}),\\ (\alpha _1\lhd \lnot b\rhd 0):(v:=f_1),\\ ......,\\ (\alpha _{n}\lhd \lnot b\rhd 0):(v:=f_{n}) \end{array}\right\} \end{array}\)
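Since a finite normal form is just a sub-distribution over total assignments, Theorem 3.2(1) can be computed directly: compose each pair of assignments and multiply their probabilities. A sketch under the assumption that all weights are constants, as in the statement of the theorem (the representation as lists of (probability, state transformer) pairs is ours):

```python
def seq_nf(S1, S2):
    """Theorem 3.2(1): sequential composition of two finite normal forms.
    A normal form is a list of (probability, assignment) pairs, an assignment being
    a function from data states to data states; all probabilities are constants."""
    return [(b * a, lambda s, e=e, f=f: f(e(s)))   # probability β·α, assignment v := f(e)
            for b, e in S1 for a, f in S2]

# Example: a fair coin on x, followed by doubling x.
coin   = [(0.5, lambda s: {**s, 'x': 0}), (0.5, lambda s: {**s, 'x': 1})]
double = [(1.0, lambda s: {**s, 'x': 2 * s['x']})]
for p, e in seq_nf(coin, double):
    print(p, e({'x': 7}))        # 0.5 {'x': 0} and 0.5 {'x': 2}
```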

Theorem 3.3

Assume that \(S_{i}~=~\oplus \{\alpha _{i,\,1}:(v:=e_{i,\,1}),...,\alpha _{i,\,k_{i}}: (v:=e_{i,\,k_{i}})\}\) for \(1\le i\le n\). Then

\(\oplus \{\beta _1:S_1,...,\,\beta _{n}:S_{n}\}~=_{A}~ \oplus \left\{ \begin{array}{l} (\beta _1\cdot \alpha _{1,1}):(v:=e_{1,\,1}),\\ ......,\\ (\beta _1\cdot \alpha _{1,\,k_{1}}):(v:=e_{1,\,k_{1}}),\\ ......,\\ (\beta _{n}\cdot \alpha _{n,\,1}):(v:=e_{n,\,1}),\\ ......,\\ (\beta _{n}\cdot \alpha _{n,\,k_{n}}):(v:=e_{n,\,k_{n}}) \end{array}\right\} \)

Proof: Similar to Theorem 3.2(1).

Theorem 3.4

(finite normal reduction).

Any finite program can be converted into a finite normal form.

Proof

Basic case:

(1) From \(\oplus \)-3, we have \((v:=e)~=_{A}~\oplus \{1:(v:=e)\}\)

(2) From Corollary of \(\oplus \)-3, it follows that \(\bot \,=_{A}\,\oplus \{\}\)

Induction: The conclusion follows from Theorems 3.2 and 3.3.

The following law permits comparison of finite normal forms

norm-1. Let \(S_1\,=\,\oplus \{\beta _1(v):(v:=e_1(v)),~..,~\beta _{m}(v):(v:=e_{m}(v))\}\)

and \(S_2\,=\,\oplus \{\alpha _1(v):(v:=f_1(v)),~...,~\alpha _{n}(v):(v:=f_{n}(v))\}\).

Then \(S_1\,\sqsubseteq _{A}\,S_2\) iff

\(\forall c,\,d\bullet (\varSigma _{i}\{\beta _{i}(c)~|~e_{i}(c)=d\} \le \varSigma _{j}\{\alpha _{j}(c)~|~f_{j}(c)=d\})\)
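For a fixed initial state, norm-1 amounts to comparing, final state by final state, the probability mass that the two normal forms deliver; S2 may only add mass that S1 leaves to ⊥. A sketch of this check (hypothetical representation: each normal form is listed as (probability, final state) pairs after the initial state has been substituted in):

```python
from collections import defaultdict

def mass(nf):
    """Probability that a finite normal form assigns to each final data state
    (nf lists (probability, final_state) pairs for one fixed initial state)."""
    dist = defaultdict(float)
    for p, d in nf:
        dist[d] += p
    return dist

def refines(S1, S2):
    """norm-1 at a single initial state: S1 ⊑_A S2 iff S2 gives every final state at
    least as much probability as S1 does (S1 may leave more of the mass to ⊥)."""
    m1, m2 = mass(S1), mass(S2)
    return all(m2[d] + 1e-12 >= p for d, p in m1.items())

S1 = [(0.4, 'x=0'), (0.3, 'x=1')]            # the remaining 0.3 of the mass is left to ⊥
S2 = [(0.5, 'x=0'), (0.5, 'x=1')]
print(refines(S1, S2), refines(S2, S1))      # True False
```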

Theorem 3.5

\(S_1\,\sqsubseteq _{A}\,S_2\) iff for all constants c

$$\begin{aligned} ((v:=c);S_1)\,\sqsubseteq _{A}\,((v:=c);S_2) \end{aligned}$$

Proof

From \(\oplus \)-6 and norm-1.

Corollary

Assume that

$$\begin{aligned} T_1= & {} \oplus \{\beta _1:(v:=c_1),..,\beta _{m}:(v:=c_{m})\} \\ T_2= & {} \oplus \{\alpha _1:(v:=d_1),..,\alpha _{n}:(v:=d_{n})\} \end{aligned}$$

where all \(\beta _{i}\) and \(\alpha _{j}\) are constants, and furthermore both \(\{c_1,..,c_{m}\}\) and \(\{d_1,..,d_{n}\}\) are lists of distinct constants.

Then \(T_1\sqsubseteq _{A}T_2\) iff there exists an injective mapping \(\phi \) from \(\{1,..,m\}\) to \(\{1,..,n\}\) such that

$$\forall i\bullet (c_{i}=d_{\phi (i)})~\wedge ~ (\beta _{i}\le \alpha _{\phi (i)})$$

The next theorem shows that all programming combinators F satisfy

$$\begin{aligned} F(S_1)\,\sqsubseteq _{A}\,F(S_2) \end{aligned}$$

whenever both \(S_1\) and \(S_2\) are finite normal forms, and \(S_1\sqsubseteq _{A}S_2\)

Theorem 3.6

If \(S_1\) and \(S_2\) are finite normal forms satisfying \(S_1\,\sqsubseteq _{A}\,S_2\), then

(1) \((S_1\lhd b\rhd R)\,\sqsubseteq _{A}\,(S_2\lhd b\rhd R)\)

provided that R is a finite program.

(2) \(\oplus \{\gamma :\,S_1,~\underline{G}\}\,\sqsubseteq _{A}\, \oplus \{\gamma :\,S_2,~\underline{G}\}\)

where \(\underline{G}\,=\,\xi _1:R_1,..,\xi _{n}:R_{n}\) and all \(R_{i}\) are finite.

(3) \((S_1;R)\,\sqsubseteq _{A}\,(S_2;R)\) provided that R is a finite program.

(4) \((R;S_1)\,\sqsubseteq _{A}\,(R;S_2)\) provided that R is a finite program.

Proof

From \(\oplus \)-6 we can transform \((v:=c);S_1\) and \((v:=c);S_2\) into the following form

$$\begin{aligned} (v:=c);S_1&=_{A}&\oplus \{\beta _1:(v:=c_1),..,\beta _{m}:(v:=c_{m})\} \\ (v:=c);S_2&=_A&\oplus \{\alpha _1:(v:=d_1),...,\alpha _{n}:(v:=d_{n})\} \end{aligned}$$

where all \(\beta _{i}\) and \(\alpha _{j}\) are constants, and furthermore both \(\{c_1,..,c_{m}\}\) and \(\{d_1,..,d_{n}\}\) are lists of distinct constants. From Corollary of Theorem 3.5 it follows that there exists an injective mapping \(\phi \) from \(\{1,..,m\}\) to \(\{1,..,n\}\) satisfying

$$\forall i\bullet (c_{i}=d_{\phi (i)})~\wedge ~ (\beta _{i}\le \alpha _{\phi (i)})$$

Proof of (1).

Case 1:\(b[c/v]=true\)

\(\begin{array}{clr} (1) &{} (v:=c);(S_1\lhd b\rhd R) &{}\{\text{ Theorem } 2.1(5)~\text{ and }~(7)\} \\ =_{A} &{} (v:=c);S_1 &{}\{\text{ Theorem } 3.5\} \\ \sqsubseteq _{A} &{}(v:=c);S_2 &{}\{\text{ Theorem } 2.1(5)~\text{ and }~(7)\} \\ =_{A} &{} (v:=c);(S_2\lhd b\rhd R) \end{array}\)

Case 2: \(b[c/v]=false\). From Theorem 2.1 it follows that

$$\begin{aligned} (v:=c);(S_1\lhd b\rhd R)~=_{A}~(v:=c);R~=_{A}~(v:=c);(S_2\lhd b\rhd R) \end{aligned}$$

The conclusion follows from Theorem 3.5.

Proof of (2). From Theorem 3.4 it follows that all \(R_{i}\) can be transformed into finite normal forms:

$$R_{i}~=_{A}~\oplus \{\rho _{i,\,1}(v):(v:=f_{i,\,1}),..., \rho _{i,\,k_{i}}(v):(v:=f_{i,\,k_{i}})\}$$

for \(i\in \{1,..,n\}\).

From Theorem 3.3 it follows that

\((v:=c);\oplus \{\gamma :\,S_1,\,\underline{G}\}~=_{A}\)

\(\oplus \left\{ \begin{array}{l} (\gamma (c)\cdot \beta _1):\,(v:=c_1),...., (\gamma (c)\cdot \beta _{m}):\,(v:=c_{m}), \\ (\xi _1(c)\cdot \rho _{1,\,1}(c)):\,(v:=f_{1,\,1}(c)),...., (\xi _1(c)\cdot \rho _{1,\,k_1}(c)):\,(v:=f_{1,\,k_1}(c)),\\ .....,\\ (\xi _{n}(c)\cdot \rho _{n,\,1}(c)):\,(v:=f_{n,\,1}(c)), ....., (\xi _{n}(c)\cdot \rho _{n,\,k_{n}}(c)):\,(v:=f_{n,\,k_{n}}(c)) \end{array}\right\} \)

\((v:=c);\oplus \{\gamma :\,S_2,\,\underline{G}\}~=_{A}\)

\(\oplus \left\{ \begin{array}{l} (\gamma (c)\cdot \alpha _1):\,(v:=d_1),...., (\gamma (c)\cdot \alpha _{n}):\,(v:=d_{n}), \\ (\xi _{1}(c)\rho _{1,\,1}(c)):\,(v:=f_{1,\,1}(c)),...., (\xi _{1}(c)\cdot \rho _{1,\,k_1}(c)):\,(v:=f_{1,\,k_1}(c)),\\ .....,\\ (\xi _{n}(c)\cdot \rho _{n,\,1}(c)):\,(v:=f_{n,\,1}(c)), ....., (\xi _{n}(c)\cdot \rho _{n,\,k_{n}}(c)):\,(v:=f_{n,\,k_{n}}(c)) \end{array}\right\} \)

Then for any constant r

\(\begin{array}{clr} &{}\varSigma \left( \begin{array}{l} \{(\gamma (c)\cdot \beta _{i})~|~(c_{i}=r)\}\\ \cup ~\{(\xi _{i}(c)\cdot \rho _{i,\,j}(c))~|~(f_{i,\,j}=r)\} \end{array}\right) &{}\{\forall i\bullet \,(c_{i}=d_{\phi (i)})\} \\ = &{} \varSigma \left( \begin{array}{l} \{(\gamma (c)\cdot \beta _{i})~|~(d_{\phi (i)}=r)\}\\ \cup ~\{(\xi _{i}(c)\cdot \rho _{i,\,j}(c))~|~(f_{i,\,j}=r)\} \end{array}\right) &{}\{\forall i\bullet \,(\beta _{i}\le \alpha _{\phi (i)})\} \\ \le &{} \varSigma \left( \begin{array}{l} \{(\gamma (c)\cdot \alpha _{\phi (i)})~|~(d_{\phi (i)}=r)\}\\ \cup ~\{(\xi _{i}(c)\cdot \rho _{i,\,j}(c))~|~(f_{i,\,j}=r)\} \end{array}\right) &{}\{(\varSigma X)~\le ~(\varSigma \,(X\cup Y))\} \\ \le &{} \varSigma \left( \begin{array}{l} \{(\gamma (c)\cdot \alpha _{l})~|~(d_{l}=r)\}\\ \cup ~\{(\xi _{i}(c)\cdot \rho _{i,\,j}(c))~|~(f_{i,\,j}=r)\} \end{array}\right) \end{array}\)

which leads to the conclusion

$$(v:=c);\oplus \{\gamma :S_1,\,\underline{G}\} ~\sqsubseteq _{A}~(v:=c);\oplus \{\gamma :S_2,\,\underline{G}\}$$

Proof of (3). From Theorem 3.4 we can convert R into a finite normal form

$$\begin{aligned} R~=_{A}~\oplus \{\rho _1(v):(v:=f_1),...,\rho _{n}(v):(v:=f_{n})\} \end{aligned}$$

From \(\oplus \)-5 and 6 we obtain

\((v:=c);S_1;R~=_{A}\)

\(\oplus \left\{ \begin{array}{l} \beta _{1}\cdot \rho _1(c_{1}):(v:=f_1(c_{1})),...., \beta _{1}\cdot \rho _{n}(c_{1}):(v:=f_{n}(c_{1})),\\ .....,\\ \beta _{m}\cdot \rho _1(c_{m}):(v:=f_1(c_{m})), ...., \beta _{m}\cdot \rho _{n}(c_{m}):(v:=f_{n}(c_{m})) \end{array}\right\} \)

\((v:=c);S_2;R~=_{A}\)

\(\oplus \left\{ \begin{array}{l} (\alpha _{1}\cdot \rho _1(d_{1})):\,(v:=f_1(d_{1})), ...., (\alpha _{1}\cdot \rho _{n}(d_{1})):\,(v:=f_{n}(d_{1})),\\ .....,\\ (\alpha _{n}\cdot \rho _1(d_{n})):\,(v:=f_1(d_{n})), ...., (\alpha _{n}\cdot \rho _{n}(d_{n})):(v:=f_{n}(d_{n})) \end{array}\right\} \)

Then for any constant r we have

\(\begin{array}{clr} &{}\varSigma _{i,\,j}\{\beta _{i}\cdot \rho _{j}(c_{i})~|~f_{j}(c_{i})=r\} &{}\{\forall i\bullet (c_{i}=d_{\phi (i)})\} \\ = &{} \varSigma _{i,\,j}\{\beta _{i}\cdot \rho _{j}(d_{\phi (i)})~|~f_{j}(d_{\phi (i)})=r\} &{}\{\forall i\bullet (\beta _{i}\le \alpha _{\phi (i)})\} \\ \le &{} \varSigma _{i,\,j}\{\alpha _{\phi (i)}\cdot \rho _{j}(d_{\phi (i)})~|~ f_{j}(d_{\phi (i)})=r\} &{}\{\varSigma X\,\le \,\varSigma (X\cup Y)\} \\ \le &{} \varSigma _{i,\,j}\{\alpha _{i}\cdot \rho _{j}(d_{i})~|~f_{j}(d_{i})=r\} \end{array}\)

which leads to the conclusion that

$$\begin{aligned} (v:=c);S_1;R\,\sqsubseteq _{A}\,(v:=c);S_2;R \end{aligned}$$

3.2 Infinite Normal Form

Definition 3.2

(infinite normal form).

An infinite normal form is represented by an infinite sequence of finite normal forms

$$\begin{aligned} S~=~\{S_{i}~|~i\in Nat\} \end{aligned}$$

where each \(S_{i+1}\) is a more accurate description than its predecessor

$$\begin{aligned} (S_{i+1}\,\sqsupseteq _{A}\,S_{i})\text{ for } \text{ all } i\in Nat \end{aligned}$$

This is called the ascending chain condition. It is this type of chain that will be taken as the normal form for programs that contain recursion. The exact behaviour of the normal form is captured by the least upper bound of the whole sequence, written

$$\begin{aligned} \bigsqcup \,S \end{aligned}$$

The least upper bound operator is characterised by two laws:

norm-2. \(\bigsqcup \,S\,\sqsubseteq _{A}\,Q\) iff \(\forall i\bullet (S_{i}\,\sqsubseteq _{A}\,Q)\)

norm-3. If P is a finite normal form, then \(P\,\sqsubseteq _{A}\,\bigsqcup \,T\) iff

$$\begin{aligned} \forall c,\,\exists j\bullet ((v:=c);P)\,\sqsubseteq _{A}\,((v:=c);T_{j}) \end{aligned}$$

The following theorem states that \(\bigsqcup S\) is actually the least upper bound of the ascending chain with respect to the refinement order \(\sqsubseteq _{A}\).

Theorem 3.7

(1) \(S_{i}\,\sqsubseteq _{A}\,\bigsqcup \,S\) for all \(i\in Nat\).

(2) If \(S_{i}\,\sqsubseteq _{A}\,Q\) for all \(i\in Nat\) then \(\bigsqcup \,S\,\sqsubseteq _{A}\,Q\)

Proof

(1) From norm-3.

(2) From norm-2.

3.3 Continuity

This section deals with the continuity of the programming combinators (including recursion) before we show how to transform a program into an ascending chain of finite normal forms.

Definition 3.3

(continuity).

An operator is continuous if it distributes through the least upper bound of ascending chains.

The following laws explore the continuity of finite programming combinators.

norm-4. \((\bigsqcup \,S)\lhd b\rhd P~=_{A}~ \bigsqcup _{i}(S_{i}\lhd b\rhd P)\)

norm-5. \((\bigsqcup S);P~=_{A}~\bigsqcup _{i}(S_{i};P)\)

norm-6. \(P;(\bigsqcup S)~=_{A}~\bigsqcup _{i}(P;S_{i})\)

provided that P is a finite normal form.

norm-7. \(\oplus \{\alpha :(\bigsqcup \,S),~\underline{G}\}~=_{A}~ \bigsqcup _{i}\oplus \{\alpha :S_{i},\,\underline{G}\}\)

The next concern is how to eliminate the nested least upper bound operators.

norm-8. \(\bigsqcup _{k}(\bigsqcup _{l}S_{k,\,l})~=_{A}~ \bigsqcup _{i}S_{i,\,i}\)

provided that \((S_{k,\,i}\,\sqsubseteq _{A}\,S_{k,\,i+1})\) and \((S_{i,\,l}\,\sqsubseteq _{A}\,S_{i+1,\,l})\) for all i, k and l.

Law norm-8 lays down the foundation for computation of normal forms by eliminating programming operators.

Theorem 3.8

(Continuity of finite programming combinators).

(1) \((\bigsqcup \,S)\lhd b\rhd (\bigsqcup \, T)~=_{A}~ \bigsqcup _{i}(S_{i}\lhd b\rhd T_{i})\)

(2) \((\bigsqcup \,S)\,;\,(\bigsqcup \, T)~=_{A}~ \bigsqcup _{i}(S_{i}\,;\,T_{i})\)

(3) \(\oplus \{\alpha :(\bigsqcup \,S),\,..,\,\beta :(\bigsqcup \, T)\}~=_{A}~ \bigsqcup _{i}\oplus \{\alpha :S_{i},\,...,\,\beta :T_{i}\}\)

Proof

\(\begin{array}{clr} (2) &{} (\bigsqcup S);(\bigsqcup T) &{}\{\text{ norm }-5\} \\ =_{A} &{} \bigsqcup _{i}(S_{i};\bigsqcup T) &{}\{\text{ norm }-6\} \\ =_{A} &{} \bigsqcup _{i}\bigsqcup _{j}(S_{i};T_{j}) &{}\{\text{ norm }-8\} \\ =_{A} &{} \bigsqcup _{i}(S_{i};T_{i}) \end{array}\)

The continuity theorem ensures that ascending chains constitute a valid normal form for all the combinators of our probabilistic language, and the stage is set for treatment of recursion.

3.4 Recursion

Consider first an innermost recursive program

$$\begin{aligned} \mu X\bullet P(X) \end{aligned}$$

where P(X) contains X as its only free identifier. Because X is certainly not in normal form, it is impossible to express P(X) in normal form. However, all other components of P(X) are expressible in finite normal form, and all its combinators permit reduction to finite normal form. So if X is replaced by \(\bot \), then \(P(\bot )\) can be reduced to finite normal form, and so can \(P^{2}(\bot )\), \(P^{3}(\bot )\), and so on. Furthermore, from Theorem 3.6 it follows that P is monotonic, so these reductions constitute an ascending chain of finite normal forms.

\(\mathbf rec \)-1. \(\mu X\bullet P(X)~=_{A}\,\bigsqcup _{n}P^{n}(\bot )\) provided that P is continuous.

where \(P^{0}(X)\,=_{df}\,\bot \) and \(P^{n+1}(X)\,=_{df}\,P(P^{n}(X))\).
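As an illustration of rec-1, consider the loop that keeps flipping a fair coin until it delivers heads, i.e. \(\mu X\bullet \oplus \{1/2:(x:=heads),\,1/2:X\}\); this example is ours, not taken from the paper. Its finite unwindings \(P^{n}(\bot )\) form exactly the ascending chain of finite normal forms described above, converging to the distribution that terminates with probability 1:

```python
def unwind(n):
    """Normal form of P^n(⊥) for P(X) = ⊕{1/2 : (x := heads), 1/2 : X}.
    Returned as a sub-distribution: the probability of having produced heads,
    with the remaining mass standing for the alternative ⊥."""
    p_heads, weight = 0.0, 0.5
    for _ in range(n):
        p_heads += weight      # the unwinding that finally takes the (x := heads) branch
        weight *= 0.5
    return {'heads': p_heads, '⊥': 1.0 - p_heads}

for n in range(5):
    print(n, unwind(n))        # the chain ascends: 0, 1/2, 3/4, 7/8, 15/16, ...
```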

Finally we are going to show that the \(\mu \) operator is also continuous.

Theorem 3.9

(Continuity of the recursion operator).

If \(S_{i}(X)\) contains X as its only free recursive identifier for all i,

and all \(S_{i}\) are continuous and form an ascending chain for every finite normal form X:

$$\begin{aligned} S_{i+1}(X)\,\sqsupseteq _{A}\,S_{i}(X)~\text{ for } \text{ all }~i\in Nat \end{aligned}$$

then \(\mu X\bullet \bigsqcup _{i}S_{i}(X)~=_{A}~\bigsqcup _{i}\mu X\bullet S_{i}(X)\)

Proof

Let \(P(X)~=_{df}~\bigsqcup _{i}S_{i}(X)\). By induction we are going to establish for all \(n\in Nat\)

$$\begin{aligned} P^{n}(\bot )~=_{A}~\bigsqcup _{i}S_{i}^{n}(\bot )~~~~~(*) \end{aligned}$$

Base case: \(n=0\)

$$\begin{aligned} P^{0}(\bot )~=_{A}~\bot ~=_{A}\bigsqcup _{i}\bot ~=_{A}~\bigsqcup _{i}S_{i}^{0}(\bot ) \end{aligned}$$

Induction:

\(\begin{array}{clr} &{} P^{n+1}(\bot ) &{}\{\text{ Def } \text{ of } P^{n+1}\} \\ =_{A} &{} \bigsqcup _{i}S_{i}(P^{n}(\bot )) &{}\{\text{ induction } \text{ hypothesis }\} \\ =_{A} &{} \bigsqcup _{i}S_{i}(\bigsqcup _{j}S_{j}^{n}(\bot )) &{} \{S_{i}~\text{ is } \text{ continuous }\} \\ =_{A} &{} \bigsqcup _{i}\bigsqcup _{j}S_{i}(S_{j}^{n}(\bot )) &{}\{\text{ norm }-8\} \\ =_{A} &{} \bigsqcup _{i}S_{i}(S_{i}^{n}(\bot )) &{}\{\text{ Def } \text{ of } S_{i}^{n+1}\} \\ =_{A} &{} \bigsqcup _{i}S_{i}^{n+1}(\bot ) \end{array}\)

which leads to the conclusion:

\(\begin{array}{clr} &{}\mu X\bullet P(X) &{}\{\text{ rec }-1\} \\ =_{A} &{}\bigsqcup _{n}P^{n}(\bot ) &{}\{\text{ Conclusion }~(*)\} \\ =_{A} &{}\bigsqcup _{n}(\bigsqcup _{i}S_{i}^{n}(\bot ))~ &{}\{\text{ norm }-8\} \\ =_{A} &{}\bigsqcup _{i}(\bigsqcup _{n} S_{i}^{n}(\bot )) &{}\{\text{ rec }-1\} \\ =_{A} &{}\bigsqcup _{i}\mu X\bullet S_{i}(X) \end{array}\)

Now we reach the stage to eliminate the recursion operator.

Theorem 3.10

Any recursive program \(\mu X\bullet F(X)\) can be converted into the least upper bound of an ascending chain.

Proof

\(\begin{array}{clr} &{}\mu X\bullet F(X,\,\mu Y\bullet G_1(Y),...,\mu Y\bullet G_{m}(Y)) &{}\{\text{ rec }-1\} \\ =_{A} &{} \mu X\bullet F(X,\,\bigsqcup _{n}G_1^{n}(\bot ),\,..,\, \bigsqcup _{n}G_{m}^{n}(\bot )) &{}\{\text{ Theorem } 3.8\} \\ =_{A} &{} \mu X\bullet \bigsqcup _{n}(F(X,\,G_1^{n}(\bot ),..,\,G_{m}^{n}(\bot ))) &{}\{\text{ Theorem } 3.9\} \\ =_{A} &{} \bigsqcup _{n}\mu X\bullet F(X,\,G_1^{n}(\bot ),..,G_{m}^{n}(\bot )) &{}\left\{ \begin{array}{l} \text{ rec }-1~~\text{ and } \text{ let } \\ F_{n}=_{df}F(X,\,G_1^{n}(\bot ),..,G_{m}^{n}(\bot )) \end{array}\right\} \\ =_{A} &{} \bigsqcup _{n}\bigsqcup _{k}F_{n}^{k}(\bot ) &{}\{\text{ norm }-8\} \\ =_{A} &{} \bigsqcup _{n}F_{n}^{n}(\bot ) \end{array}\)

Theorem 3.11

(1) If \(P\,\sqsubseteq _{A}\,Q\), then

(a) \((P\lhd b\rhd R)\,\sqsubseteq _{A}\,(Q\lhd b\rhd R)\)

(b) \((P;R)\,\sqsubseteq _{A}\,(Q;R)\)

(c) \(\oplus \{\gamma :\,P,\,\xi _1:U_1,...,\xi _{l}:U_{l}\}\, \sqsubseteq _{A}\,\oplus \{\gamma :\,Q,\,\xi _1:U_1,...,\xi _{l}:U_{l}\}\)

(d) \((R;P)\,\sqsubseteq _{A}\,(R;Q)\)

(2) If \(P(S)\,\sqsubseteq _A\,Q(S)\) for any finite normal form S, then

    \(\mu X\bullet P(X)\,\sqsubseteq _{A}\,\mu X\bullet Q(X)\)

Proof: From Theorems 3.4 and 3.10 we can transform P, Q and R into ascending chains of finite normal forms:

$$ P~=_{A}~\bigsqcup _{i} P_{i},\quad Q~=_{A}~\bigsqcup _{j} Q_{j},\quad R~=_{A}~\bigsqcup _{k} R_{k} $$

From Definition 3.2 it follows that for any constant c there exists a mapping \(\psi _{c}\) satisfying \(\forall i\bullet (i\le \psi _{c}(i))\), and

$$\forall i\bullet ((v:=c);P_{i})\,\sqsubseteq _{A}\,((v:=c);Q_{\psi _{c}(i)}) (*)$$

Proof of 1.(a): From \((*)\) and Theorem 3.6(1) we reach the conclusion

$$\forall i\bullet ((v:=c);(P_{i}\lhd b\rhd R_{i}))\,\sqsubseteq _{A}\, ((v:=c);(Q_{\psi _{c}(i)}\lhd b\rhd R_{\psi _{c}(i)}))$$

which implies

\(\begin{array}{clr} &{} (P\lhd b\rhd R) &{}\{\text{ Theorem } 3.8\} \\ =_{A} &{} \bigsqcup _{i}(P_{i}\lhd b\rhd R_{i}) &{}\{\text{ norm }~2~\text{ and }~3\} \\ \sqsubseteq _{A} &{} \bigsqcup _{j}(Q_{j}\lhd b\rhd R_{j}) &{}\{\text{ Theorem } 3.8\} \\ =_{A} &{} (Q\lhd b\rhd R) \end{array}\)

Proof of 1.(b): From \((*)\) and Theorem 3.6(3)(4) we reach the conclusion

$$\forall i\bullet ((v:=c);(P_{i};R_{i}))\,\sqsubseteq _{A}\, ((v:=c);(Q_{\psi _{c}(i)};R_{\psi _{c}(i)}))$$

which implies

\(\begin{array}{clr} &{} (P;R) &{}\{\text{ Theorem } 3.8\} \\ =_{A} &{} \bigsqcup _{i}(P_{i};R_{i}) &{}\{\text{ norm }~2~\text{ and }~3\} \\ \sqsubseteq _{A} &{} \bigsqcup _{j}(Q_{j};R_{j}) &{}\{\text{ Theorem } 3.8\} \\ =_{A} &{} (Q;R) \end{array}\)

Proof of 1.(c): From Theorems 3.4 and 3.10 there exists a family \(\{\{U_{i,\,n}~|~n\in Nat\}~|~1\le i\le l\}\) of ascending chains such that for all i, \(U_{i}~=_{A}~\bigsqcup _{j}U_{i,\,j}\). Then we have

\(\begin{array}{clr} &{}(v:=c);\oplus \left\{ \begin{array}{l} \gamma :\,P_{i},\,\\ \xi _{1}:\,U_{1,i},\\ \,..,\\ \xi _{l}:\,U_{l,\,i}\end{array}\right\} &{}\{\oplus -6\} \\ =_{A} &{} \oplus \left\{ \begin{array}{l} \gamma (c):\,((v:=c);P_{i}),\\ \xi _{1}(c):\,((v:=c);U_{1,\,i}),\\ ...,\\ \xi _{l}(c):\,((v:=c);U_{l,\,i}) \end{array}\right\} &{}\{\text{ Theorem } 3.6\} \\ \sqsubseteq _{A} &{} \oplus \left\{ \begin{array}{l} \gamma (c):\,((v:=c);Q_{\psi _{c}(i)}),\\ \xi _{1}(c):\,((v:=c);U_{1,\,i}),\\ ...,\\ \xi _{l}(c):\,((v:=c);U_{l,\,i}) \end{array}\right\}&\{(i\le \psi _{c}(i))\implies \forall j\bullet (U_{j,\,i}\sqsubseteq _{A}U_{j,\,\psi _{c}(i)})\} \end{array}\)

\(\begin{array}{clr} \sqsubseteq _{A} &{} \oplus \left\{ \begin{array}{l} \gamma (c):\,((v:=c);Q_{\psi _{c}(i)}),\\ \xi _{1}(c):\,((v:=c);U_{1,\,\psi _{c}(i)}),\\ ...,\\ \xi _{l}(c):\,((v:=c);U_{l,\,\psi _{c}(i)}) \end{array}\right\} &{}\{\oplus -6\} \\ =_{A} &{}(v:=c);\oplus \left\{ \begin{array}{l} \gamma :\,Q_{\psi _{c}(i)},\\ \xi _{1}:\,U_{1,\psi _{c}(i)},\\ ...,\\ \xi _{l}:\,U_{l,\,\psi _{c}(i)} \end{array}\right\} \end{array}\)

which leads to the conclusion.

Proof of 1.(d): Assume that

$$R_{i}\,=_{A}\,\oplus \{\xi _{i,\,1}:\,(v:=e_{i,\,1}),\,...,\, \xi _{i,\,n_{i}}:\,(v:=e_{i,\,n_{i}})\}$$

Define \(\varPhi _{c}(i)~=_{df}~\mathbf{max}(\psi _{e_{i,\,1}(c)}(i),\,...,\, \psi _{e_{i,\,n_{i}}(c)}(i))\). Then we have

\(\begin{array}{clr} &{} (v:=c);R_{i};P_{i} &{}\{\oplus -5~\text{ and }~6\} \\ =_{A} &{}\oplus \left\{ \begin{array}{l} \xi _{i,\,1}(c):\,((v:=e_{i,\,1}(c));P_{i}),\\ ....,\\ \xi _{i,\,n_{i}}(c):\,((v:=e_{i,\,n_{i}}(c));P_{i}) \end{array}\right\} &{}\{\text{ Conclusion }~1(c)\} \\ \sqsubseteq _{A} &{}\oplus \left\{ \begin{array}{l} \xi _{i,\,1}(c):\,((v:=e_{i,\,1}(c));Q_{\psi _{e_{i,\,1}(c)}(i)}),\\ ....,\\ \xi _{i,\,n_{i}}(c):\,((v:=e_{i,\,n_{i}}(c));Q_{\psi _{e_{i,\,n_{i}}(c)}(i)}) \end{array}\right\} &{}\{\text{ Def } \text{ of } \varPhi ~\text{ and } \text{ Conclusion }\,1(c)\} \\ \sqsubseteq _{A} &{}\oplus \left\{ \begin{array}{l} \xi _{i,\,1}(c):\,((v:=e_{i,\,1}(c));Q_{\varPhi _{c}(i)}),\\ ....,\\ \xi _{i,\,n_{i}}(c):\,((v:=e_{i,\,n_{i}}(c));Q_{\varPhi _{c}(i)}) \end{array}\right\} &{}\{\oplus -6\} \\ =_{A} &{} (v:=c);R_{i};Q_{\varPhi _{c}(i)} &{}\{i\le \varPhi _{c}(i)\} \\ \sqsubseteq _{A} &{} (v:=c);R_{\varPhi _{c}(i)};Q_{\varPhi _{c}(i)} \end{array}\)

which leads to the conclusion.

4 Testing Programs

An operational approach usually defines the relationship between a program and its possible execution by machine. In an abstract way, a computation consists of a sequence of individual steps with the following features:

  • each step takes the machine from one state to a closely similar state;

  • each step is drawn from a very limited repertoire.

In a stored program computer, the machine states are represented as pairs

$$\begin{aligned} (s,~P) \end{aligned}$$

where

(1) s is a text, defining the data state as an assignment of constants to all the variables of the alphabet

$$\begin{aligned} x,\,y,\,...,z\,:=\,a,\,b,\,...,c \end{aligned}$$

(2) P is a program text, representing the rest of the program that remains to be executed. When this becomes the empty text \(\epsilon \), there is no more program to be executed. The machine state

$$\begin{aligned} (t,\, \epsilon ) \end{aligned}$$

is the last state of any execution sequence that contains it, and t records the final values of the variables at the end of the execution.

The following lemma indicates that data states are maximal with respect to the refinement order; they cannot be refined any further.

Lemma 4.1

\((s\,\sqsubseteq _{A}\,P)\) implies \((s\,=_{A}\,P)\).

Definition 4.1

(Probabilistic state).

Let \(S_{i}\,=\,\oplus \{\xi _{i,\,1}:(v:=c_{i,\,1}),\,..,\,\xi _{i,\,m_{i}}:(v:=c_{i,\,m_{i}})\}\) be a finite normal form for all \(i\in Nat\), in which all \(\xi _{i,\,j}\) and \(c_{i,\,j}\) are constants,

and \(c_{i,\,l}\ne c_{i,\,m}\) for all \(l\ne m\).

If \(S_{i}\,\sqsubseteq _{A}\,S_{i+1}\) for all \(i\in Nat\), then

$$\begin{aligned} \bigsqcup _{i}S_{i} \end{aligned}$$

is called a probabilistic state.

The execution of the machine state (s, P) can be seen as a test on P with the test case s. The result of such a test gives rise to a set of possible outcomes. We are then able to compare the behaviours of two programs based on testing.

Formally, the test operator for our probabilistic programming language is defined by

$$\begin{aligned} \mathcal{T}(s,\,P)\,=_{df}\,(s;P) \end{aligned}$$

When \(\bot \) is taken as the test case, we obtain

$$\begin{aligned} \mathcal{T}(\bot ,\,P)\,=_{A}\,\bot \end{aligned}$$
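Concretely, testing a finite program amounts to running its finite normal form from the constant state supplied by the test case; the observation is the probabilistic state reached. A brief sketch reusing the list-of-pairs representation from Sect. 3 (the encoding is ours):

```python
def test(tc, nf):
    """T(tc, P) = tc ; P -- run the finite normal form nf of P from the constant state tc.
    nf is a list of (probability, assignment) pairs; the result is the probabilistic state
    reached, i.e. a distribution over final data states."""
    outcome = {}
    for p, e in nf:
        final = tuple(sorted(e(dict(tc)).items()))   # hashable snapshot of the final state
        outcome[final] = outcome.get(final, 0.0) + p
    return outcome

coin = [(0.5, lambda s: {**s, 'x': 0}), (0.5, lambda s: {**s, 'x': 1})]
print(test({'x': 7, 'y': 2}, coin))   # two final states, each with probability 0.5
```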

Execution of a test will deliver a probabilistic state.

Theorem 4.1

For any test \(\mathcal{T}(s,\,P)\), there exists a probabilistic state t such that

$$\begin{aligned} \mathcal{T}(s,\,P)~=_{A}~t \end{aligned}$$

Proof

From Theorem 3.4 it follows that any finite program P can be converted into a finite normal form:

$$\begin{aligned} P~=_{A}\oplus \{\beta _1:(v:=e_1),...,\,\beta _{m}:(v:=e_{m})\} \end{aligned}$$

The conclusion is derived from \(\oplus -6\).

For any program P there exists an ascending chain \(S~=~\{S_{i}~|~i\in Nat\}\) of finite normal forms such that

$$\begin{aligned} P~=_{A}~\bigsqcup S \end{aligned}$$

The conclusion follows from \(\mathbf{norm}-6\).

Corollary

\(P\,\sqsubseteq _{A}\,Q\) iff for all test cases s

$$\begin{aligned} \mathcal{T}(s,\,P)\,\sqsubseteq _{A}\,\mathcal{T}(s,\,Q) \end{aligned}$$

Definition 4.2

A program P can be identified as a binary relation [P] between a test case s and a final probabilistic data state t it may enter at the end of testing

$$\begin{aligned}{}[P]~=_{df}~\{(s,\,t)~|~\mathcal{T}(s,\,P)\,\sqsubseteq _{A}\,t\} \end{aligned}$$

As usual we define the refinement relation \(\sqsubseteq _{rel}\) on the relational model by the set inclusion

$$\begin{aligned} P\sqsubseteq _{rel} Q\,=_{df}\,([P]\,\supseteq \,[Q]) \end{aligned}$$

Theorem 4.2

\(\sqsubseteq _{rel}\,=\,\sqsubseteq _{A}\)

Proof

\(\begin{array}{clr} &{} P\,\sqsubseteq _{A}\,Q &{}\{\text{ Corollary } \text{ of } \text{ Theorem } 4.1\} \\ \equiv &{} \forall s\bullet \mathcal{T}(s,\,P)\,\sqsubseteq _{A}\,\mathcal{T}(s,\,Q) &{}\{\text{ Theorem } 4.1\} \\ \equiv &{} \forall s,t\bullet (\mathcal{T}(s,\,Q)\sqsubseteq _{A}\,t) \implies (\mathcal{T}(s,\,P)\,\sqsubseteq _{A}\,t) &{}\{\text{ Definition } 4.2\} \\ \equiv &{} [Q]\subseteq [P] &{}\{\text{ Definition } \text{ of } \sqsubseteq _{rel}\} \\ \equiv &{} P\sqsubseteq _{rel}Q \end{array}\)

Theorem 4.3

(1) \([P\lhd b\rhd Q](v:=c)\,=\,[P](v:=c)\lhd b[c/v]\rhd [Q](v:=c)\)

(2) \([\oplus \{\beta _1:P_1,..,\beta _{m}:P_{m}\}](v:=c)\,=\, \left\{ \begin{array}{l} \oplus \{\beta _1[c/v]:t_1,..,\beta _{m}[c/v]:t_{m}\}~|~\\ \forall i\bullet t_{i}\in [P_{i}](v:=c) \end{array}\right\} \)

(3) \([P;Q]\,=\,[P]\circ [Q]\uparrow \)

where \([Q]\uparrow \) is defined inductively:

$$\begin{aligned}{}[Q]\uparrow (v:=c)&=_{df}&[Q](v:=c) \\ \,[Q]\uparrow (\oplus \{\rho _1:(v:=c_1),..,\rho _{m}:(v:=c_{m})\})&=_{df}&\left\{ \begin{array}{l} \oplus \{\rho _1:t_1,..,\rho _{m}:t_{m}\}~|~\\ \forall i\bullet t_{i}\in [Q]\uparrow (v:=c_{i}) \end{array}\right\} \\ \,[Q]\uparrow (\bigsqcup _{i}t_{i})&=_{df}&\{\bigsqcup _{i} u_{i}~|~\forall i\bullet u_{i}\in [Q]\uparrow (t_{i})\} \end{aligned}$$

(4) \([\mu X\bullet P(X)]\,=\, \bigcap _{n}[P^{n}(\bot )]\)

Proof of (3)

\(\begin{array}{clr} &{}(s,\,t)\in [P;Q] &{}\{\text{ Definition }~4.2\} \\ \equiv &{} \mathcal{T}(s,\,(P;Q))\,\sqsubseteq _{A}\,t &{}\{\text{ Theorem } \text{4.1 }\} \\ \equiv &{} \exists u\bullet \mathcal{T}(s,\,P)=_{A}u\,\wedge \,(u;Q)\,\sqsubseteq _{A}\,t &{}\{\text{ Def } \text{ of } [Q]\uparrow \} \\ \equiv &{} \exists u\bullet \mathcal{T}(s,\,P)=_{A}u\,\wedge \, (u,\,t)\in [Q]\uparrow &{}\{\text{ Theorem } 3.11\} \\ \equiv &{} \exists u\bullet (\mathcal{T}(s,\,P)\sqsubseteq _{A}u)\,\wedge \, (u,\,t)\in [Q]\uparrow &{}\{\text{ Definition } 4.2\} \\ \equiv &{} \exists u\bullet ((s,\,u)\in [P]\,\wedge \, (u,\,t)\in [Q]\uparrow ) &{}\{\text{ Def } \text{ of } \text{ relational } \text{ composition }\} \\ \equiv &{} (s,\,t)\in ([P]\circ [Q]\uparrow ) \end{array}\)

5 Operational Approach

This section provides an operational semantics for our probabilistic programming language. We will introduce the concept of the consistency of an operational framework with respect to the algebra of programs, and present a transition system for the probabilistic language. This section also explores the link between a consistent transition system and the normal form representation of probabilistic programs.

There are two types of transitions for our language

(1) Transition \((s,\,P)\rightarrow (t,\,Q)\) means P transfers to Q with the data state s replaced by t.

We define the concept of divergence, a machine state being divergent if it can lead to an infinite execution

$$\mathbf{divergence}(s,\,P)~=_{df}~\forall n\, \exists t,\,Q\bullet ((s,\,P)\rightarrow _{n}(t,\,Q))$$

where \(\rightarrow _{0}\,=_{df}\,id\), and \(\rightarrow _{n+1}\,=_{df}\,(\rightarrow \,;\rightarrow _{n})\).

(2) Transition \((s,\,P){\mathop {\rightarrow }\limits ^{r}}(s,\,Q)\) (where \(0<r\le 1\)) means Q is chosen by P to be executed with probability r, whereas the data state remains unchanged.

We also examine the concept of finitary, a machine state being finitary if it can only engage in a finite number of probabilistic choices

$$\mathbf{finitary}(s,\,P)~=_{df}~\exists n\,\forall t,Q, r,m\bullet \left( \begin{array}{l} (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,Q)\,\wedge \,m>n\\ \implies \,\mathbf{divergence}(t,\,Q) \end{array}\right) $$

where \({\mathop {\rightarrow }\limits ^{r}}_{1}=_{df} {\mathop {\rightarrow }\limits ^{r}}\)

and \({\mathop {\rightarrow }\limits ^{r}}_{n+1}=_{df}\, (\rightarrow \,;\,{\mathop {\rightarrow }\limits ^{r}}_{n})\,\cup \, \{({\mathop {\rightarrow }\limits ^{r_1}};{\mathop {\rightarrow }\limits ^{r_2}}_{n})~|~r_1\cdot r_2\,=\,r\}\)

and \({\mathop {\rightarrow }\limits ^{r}}_{*}=_{df} \bigcup _{n}{\mathop {\rightarrow }\limits ^{r}}_{n}\)

Definition 5.1

A transition system is consistent with respect to the algebraic semantics if for all machine states \((s,\,P)\)

(1) \(\mathbf{divergence}(s,\,P)\) implies \(\mathcal{T}(s,\,P)~=_{A}~\bot \), and

(2) \(\mathbf{finitary}(s,\,P)\) if P does not contain the \(\mu \) operator.

(3) \(\mathcal{T}(s,\,P)~=_{A}~\oplus \{r:\mathcal{T}(t,\,Q)~|~(s,\,P){\mathop {\rightarrow }\limits ^{r}} (t,\,Q)\}\),

where we extend the definition of the test operator to deal with the empty program text \(\epsilon \) by

$$\begin{aligned} \mathcal{T}(s,\,\epsilon )~=_{df}~s \end{aligned}$$

Theorem 5.1

Let \(\rightarrow \) be a consistent transition system.

If \(\mathbf{finitary}(s,\,P)\), then there exists n such that

$$\mathcal{T}(s,\,P)~=_{A}~ \oplus \,\{r\,:\,t~|~\exists m\bullet (m\le n)\,\wedge \, (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon )\}$$

Otherwise

$$\mathcal{T}(s,\,P)~=_{A}~ \bigsqcup _{n}\,\oplus \{r\,:\,t~|~\exists m\bullet (m\le n)\,\wedge \, (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon )\} $$

Proof

(1) Assume that \(\mathbf{finitary}(s,\,P)\). For \(k>0\) define

$$\mathbf{finitary}_{k}(s,\,P)~=_{df}~\forall (t,\,Q),\,\forall r,\,m\bullet \left( \begin{array}{l} (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,Q)\,\wedge \,m>k\\ \implies \,\mathbf{divergence}(t,\,Q) \end{array}\right) $$

The following inductive proof is based on the length of transition sequences

Basic case: \(\mathbf{finitary}_{1}(s,\,P)\).

The conclusion directly follows from (1) and (3) of Definition 5.1.

Induction step: \(\mathbf{finitary}_{k+1}(s,\,P)\).

From the definition of \(\mathbf{finitary}_{k+1}\) it follows that

$$(s,\,P){\mathop {\rightarrow }\limits ^{r}}(t,\,Q)~\implies ~\mathbf{finitary}_{k}(t,\,Q) (*)$$

\(\begin{array}{clr} &{} \mathcal{T}(s,\,P) &{}\{\text{ Def } 5.1(3)\} \\ =_{A} &{} \oplus \,\{r:\mathcal{T}(u,\,Q)~|~(s,\,P){\mathop {\rightarrow }\limits ^{r}}(u,\,Q)\} &{}\{(*)~\text{ and } \text{ inductive } \text{ hypothesis }\} \\ =_{A} &{} \oplus \,\left\{ \begin{array}{l} r:\oplus \,\{\lambda \,:t~|~\exists m\bullet (m\le n)\wedge \\ (s,\,P){\mathop {\rightarrow }\limits ^{r}}(u,\,Q)\,\wedge \, (u,\,Q){\mathop {\rightarrow }\limits ^{\lambda }}_{m}(t,\,\epsilon )\} \end{array}\right\} &{}\{\oplus -4~\text{ Let }~l=n+1\} \\ =_{A} &{} \oplus \,\{\beta :t~|~ \exists m\le l\bullet (s,\,P){\mathop {\rightarrow }\limits ^{\beta }}_{m}(t,\,\epsilon )\} \end{array}\)

(2) Consider the case where \(\lnot \mathbf{finitary}(s,\,P)\).

First we are going to establish the inequality

$$\mathcal{T}(s,\,P)\,\sqsupseteq _{A}\, \bigsqcup _{n}\, \oplus \,\{r\,:\,t~|~\exists m\le n\bullet (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon )\}$$

By norm-2 we are required to prove for all n

$$\mathcal{T}(s,\,P)\,\sqsupseteq _{A} \oplus \{r\,:\,t~|~\exists m\bullet (m\le n)\wedge (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon )\} $$

Basic case: \(n=1\).

\(\begin{array}{clr} &{}\mathcal{T}(s,\,P) &{}\{\text{ Def } 5.1(3)\} \\ =_{A} &{} \oplus \,\left\{ \begin{array}{l} \{\lambda \,:\,\mathcal{T}(t,\,Q)~|~ (s,\,P){\mathop {\rightarrow }\limits ^{\lambda }}(t,\,Q)\}~\cup \\ \{\beta \,:\,t~|~(s,\,P){\mathop {\rightarrow }\limits ^{\beta }}(t,\,\epsilon )\} \end{array}\right\} &{}\{\text{ Theorem } 3.11(c)\} \\ \sqsupseteq _{A} &{}\oplus \,\left\{ \begin{array}{l} \{\lambda \,:\,\bot ~|~ (s,\,P){\mathop {\rightarrow }\limits ^{\lambda }}(t,\,Q)\}~\cup \\ \{\beta \,:\,t~|~(s,\,P){\mathop {\rightarrow }\limits ^{\beta }}(t,\,\epsilon )\} \end{array}\right\} &{}\{\oplus -2\} \\ =_{A} &{} \oplus \,\{r\,:\,t~|~(s,\,P){\mathop {\rightarrow }\limits ^{r}}(t,\,\epsilon )\} \end{array}\)

Induction:

\(\begin{array}{clr} &{}\mathcal{T}(s,\,P) &{}\{\text{ Def } 5.1(3)\} \\ =_{A} &{} \oplus \,\left\{ \begin{array}{l} \{\lambda \,:\,\mathcal{T}(t,\,Q)~|~ (s,\,P){\mathop {\rightarrow }\limits ^{\lambda }}(t,\,Q)\}~\cup \\ \{\beta \,:\,t~|~(s,\,P){\mathop {\rightarrow }\limits ^{\beta }}(t,\,\epsilon )\} \end{array}\right\} &{}\{\text{ inductive } \text{ hypothesis }\} \\ \sqsupseteq _{A} &{}\oplus \,\left\{ \begin{array}{l} \{\lambda \,:\,\oplus \{~\gamma :u~|~ (s,\,P){\mathop {\rightarrow }\limits ^{\lambda }}(t,\,Q)~\wedge ~\\ \exists m\le n\bullet (t,\,Q){\mathop {\rightarrow }\limits ^{\gamma }}_{m}(u,\,\epsilon )\}\}~\cup \,\\ \{\beta :\,t~|~(s,\,P){\mathop {\rightarrow }\limits ^{\beta }}(t,\,\epsilon )\} \end{array}\right\} &{}\{\text{ Def } \text{ of } {\mathop {\rightarrow }\limits ^{r}}_{n+1}\} \\ =_{A} &{} \oplus \,\{ r\,:\,t~|~\exists m\le n+1\bullet (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon )\} \end{array}\)

Now we are going to prove the inequality

$$\mathcal{T}(s,\,P)\,\sqsubseteq _{A} \bigsqcup _{n}\,\oplus \{r\,:\,t~|~\exists m\bullet (m\le n)\wedge (s,\,P){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon )\} $$

From Definition 5.1(2) we conclude that P must contain the \(\mu \) operator. Let us begin with the simplest case:

$$\begin{aligned} P\,=\,\mu X\bullet F(X) \end{aligned}$$

where F(X) does not refer to the \(\mu \) operator. Clearly, from Definition 5.1(2) we have for all n

(i) \(\mathbf{finitary}(s,\,F^{n}(\bot ))\)

By induction it can be shown that

(ii) \((s,\,F^{n}(\bot )){\mathop {\rightarrow }\limits ^{\lambda }}_{m}(t,\,\epsilon ) \,\implies \,\exists k\le n\bullet (s,\,\mu X\bullet F(X)){\mathop {\rightarrow }\limits ^{\lambda }}_{m+k} (t,\,\epsilon )\)

From (i) it follows that for all n there exists \(k_{n}\) such that

\(\begin{array}{clr} &{}\mathcal{T}(s,\,F^{n}(\bot )) &{} \{\mathbf{finitary}(F^{n}(\bot ))\} \\ =_{A} &{} \oplus \,\{r:t~|~\left( \begin{array}{l} \exists m\bullet (m\le k_{n})\,\wedge \,\\ (s,\,F^{n}(\bot )){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon ) \end{array}\right) \} &{}\{(ii)~\text{ and } \text{ Corollary } \text{ of } {} \mathbf{norm}-1\} \\ \sqsubseteq _{A} &{}\oplus \,\{r:t~|~\left( \begin{array}{l} \exists m\bullet (m\le (k_{n}+n))\,\wedge \\ (s,\,\mu X\bullet F(X)){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon ) \end{array}\right) \} &{}\{\mathbf{norm}-3\} \\ \sqsubseteq _{A} &{} \bigsqcup _{n}\oplus \,\{r:t~|~\left( \begin{array}{l} \exists m\bullet (m\le n)\,\wedge \\ (s,\,\mu X\bullet F){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon ) \end{array}\right) \} \end{array}\)

which leads to the conclusion

\(\begin{array}{clr} &{}\mathcal{T}(s,\,\mu X\bullet F(X)) &{}\{\text{ Theorem } 3.8~\text{ and }~\text{ rec }-1\} \\ =_{A} &{} \bigsqcup _{n}\mathcal{T}(s,\,F^{n}(\bot )) &{}\{\text{ previous } \text{ conclusion }\} \\ \sqsubseteq _{A} &{} \bigsqcup _{n}\oplus \,\{r:t~|~\left( \begin{array}{l} \exists m\bullet (m\le n)\,\wedge \\ (s,\,\mu X\bullet F){\mathop {\rightarrow }\limits ^{r}}_{m}(t,\,\epsilon ) \end{array}\right) \} \end{array}\)

Finally let us examine the case where

$$\begin{aligned} P~=~F(\mu X\bullet Q(X),..,\mu X\bullet R(X)) \end{aligned}$$

In a similar way we can prove

\(\begin{array}{clr} \bullet &{} \mathcal{T}(s,\,F(\mu X\bullet Q(X),\,..,\mu X\bullet R(X))) &{}\{\text{ Theorem }~3.8\} \\ =_{A} &{} \bigsqcup _{n} \mathcal{T}(s,\,F(Q^{n}(\bot ),..,R^{n}(\bot ))) &{}\{\mathbf{finitary}(F(Q^{n}(\bot ),..,R^{n}(\bot )))\} \\ =_{A} &{} \bigsqcup _{n} \oplus \,\{r:t~|~\left( \begin{array}{l} \exists m\bullet (m\le n_{k})\,\wedge \\ (s,\,F(Q^{n}(\bot ),...))\\ {\mathop {\rightarrow }\limits ^{r}}_{m} (t,\,\epsilon )\end{array}\right) \} &{}\{\text{ proof } \text{ for }~(P\,=\,\mu X\bullet F(X))\} \\ \sqsubseteq _{A} &{} \bigsqcup _{n} \oplus \,\{r:t~|~\left( \begin{array}{l} \exists m\bullet (m\le n)\,\wedge \,\\ (s,\,P)\\ {\mathop {\rightarrow }\limits ^{r}}_{m} (t,\,\epsilon )\end{array}\right) \} \end{array}\)

We propose the following transition system for our probabilistic programming language.

Definition 5.2

(1) Assignment

\(((v:=c),\,v:=e)\rightarrow ((v:=e[c/v]),\,\epsilon )\).

(2) Probabilistic Choice

(a) \(((v:=c),~\oplus \,\{r_1:P_1,..,r_{m}:P_{m}\}) {\mathop {\longrightarrow }\limits ^{r_{k}[c/v]}}\,((v:=c),\,P_{k})\)

provided that \(r_{k}[c/v]>0\)

(b) \(((v:=c),~\oplus \,\{r_1:P_1,..,r_{m}:P_{m}\}) {\mathop {\longrightarrow }\limits ^{1-\sum _{k}r_{k}[c/v]}}\,((v:=c),\,\bot )\)

provided that \(\sum _{k}r_{k}[c/v]<1\).

(3) Conditional

\(\begin{array}{rcll} (a)~((v:=c),~P\lhd b\rhd Q)&{} \rightarrow &{} ((v:=c),\,P) &{} \text{ if }~b[c/v]=true \\ (b)~((v:=c),~P\lhd b\rhd Q)&{} \rightarrow &{} ((v:=c),\,Q) &{} \text{ if }~b[c/v]=false \end{array}\)

(4) Composition

\(\begin{array}{rcll} (a)~(s,\,P;Q)&{} {\mathop {\rightarrow }\limits ^{r}} &{} (t,\,R;Q)&{} \text{ if }~ (s,\,P){\mathop {\rightarrow }\limits ^{r}} (t,\,R) \\ (b)~(s,\,P;Q)&{} \rightarrow &{} (t,\,R;Q)&{} \text{ if }~ (s,\,P)\,\rightarrow \,(t,\,R) \\ (c)~(s,\,P;Q)&{} {\mathop {\rightarrow }\limits ^{r}} &{} (t,\,Q) &{} \text{ if }~(s,\,P){\mathop {\rightarrow }\limits ^{r}} (t,\,\epsilon ) \\ (d)~(s,\,P;Q)&{} \rightarrow &{} (t,\,Q) &{} \text{ if }~(s,\,P)\,\rightarrow \,(t,\,\epsilon ) \end{array}\)

(5) Recursion

\((s,\,\mu X\bullet P(X))\rightarrow (s,\,P(\mu X\bullet P(X)))\)

(6) Chaos

\((s,\,\bot )\rightarrow (s,\,\bot )\)
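The rules of Definition 5.2 translate directly into a one-step successor function on machine states. The following Python sketch is only illustrative (programs are encoded as nested tuples, the data state as a dictionary, and the empty text \(\epsilon \) as None; all names are our own):

```python
def step(state, prog):
    """One-step successors of the machine state (state, prog), following Definition 5.2.
    Returns a list of (r, state', prog') triples; r is None for an ordinary transition
    and a probability for a probabilistic one.  The empty text epsilon is None."""
    tag = prog[0]
    if tag == 'bot':                                     # rule (6): (s, ⊥) -> (s, ⊥)
        return [(None, state, prog)]
    if tag == 'assign':                                  # rule (1): perform the assignment;
        _, exps = prog                                   # unmentioned variables keep their value (asgn-1)
        return [(None, {**state, **{x: e(state) for x, e in exps.items()}}, None)]
    if tag == 'choice':                                  # rule (2a): pick an alternative with positive weight
        _, alts = prog
        succ = [(w(state), state, p) for w, p in alts if w(state) > 0]
        missing = 1.0 - sum(r for r, _, _ in succ)
        if missing > 1e-12:                              # rule (2b): the missing mass goes to ⊥
            succ.append((missing, state, ('bot',)))
        return succ
    if tag == 'cond':                                    # rule (3): evaluate the condition in the data state
        _, p, b, q = prog
        return [(None, state, p if b(state) else q)]
    if tag == 'seq':                                     # rule (4): advance the first component
        _, p, q = prog
        return [(r, s2, q if p2 is None else ('seq', p2, q))
                for r, s2, p2 in step(state, p)]
    if tag == 'mu':                                      # rule (5): unfold the recursion once
        _, body = prog
        return [(None, state, body(prog))]
    raise ValueError(tag)

# Example: one step of a fair coin on x from the state {'x': 7}.
coin = ('choice', [(lambda s: 0.5, ('assign', {'x': lambda s: 0})),
                   (lambda s: 0.5, ('assign', {'x': lambda s: 1}))])
print(step({'x': 7}, coin))
```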

We are going to show that Definition 5.2 gives a consistent transition system. First, we show that the given transition system satisfies Definition 5.1(3).

Lemma 5.2

\(\mathcal{T}(s,\,P)\,=_{A}\, \oplus \,\{r:\mathcal{T}(t,\,Q)~|~ (s,\,P){\mathop {\rightarrow }\limits ^{r}}(t,\,Q)\}\)

Proof

Direct from the following properties of the test operator \(\mathcal{T}\):

(1) From \(\oplus \)-6 it follows that

\(\mathcal{T}((v:=c),\,\oplus \{r_1:P_1,..,r_{n}:P_{n}\}) ~=_{A}~\oplus \{r_1[c/v]:\mathcal{T}((v:=c),\,P_1),..., r_{n}[c/v]:\mathcal{T}((v:=c),\,P_{n})\}\)

(2) From Theorem 2.1(7) we obtain

\(\mathcal{T}((v:=c),\,(P\lhd b\rhd Q))~=_{A}~ \mathcal{T}((v:=c),\,P)\lhd b[c/v]\rhd \mathcal{T}((v:=c),\,Q)\)

(3) From Theorem 4.1 we have

\(\mathcal{T}((v:=c),\,(P;Q))~=_{A}~ \oplus \{r_1:\mathcal{T}((v:=d_1),\,Q),...,r_{m}:\mathcal{T}((v:=d_{m}),\,Q)\}\)

provided that \(\mathcal{T}((v:=c),\,P)~=_{A}~\oplus \, \{r_1:(v:=d_1),..,r_{m}:(v:=d_{m})\}\)

(4) \(\mathcal{T}(s,\,\mu X\bullet P(X))~=_{A}~ \mathcal{T}(s,\,P(\mu X\bullet P(X)))\)

Next we deal with the condition (1) of Definition 5.1.

Lemma 5.3

If P is a finite program, then

$$\begin{aligned} \mathbf{divergence}(s,\,P)\,\implies \,\mathcal{T}(s,\,P)\,=_{A}\,\bot \end{aligned}$$

Proof

We give an induction proof based on the structure of program text P:

Base case: Clearly the conclusion holds for the cases \(P\,=\,(v:=e)\) and \(P\,=\,\bot \).

Inductive step:

\(\begin{array}{clr} &{} \mathbf{divergence}((v:=c),~(P\lhd b\rhd Q)) &{}\{\text{ Rule } (3)~\text{ in } \text{ Definition } \text{5.2 }\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}((v:=c),\,P)\\ \lhd b[c/v]\rhd \\ \mathbf{divergence}((v:=c),\,Q) \end{array}\right) &{}\{\text{ Induction } \text{ hypothesis }\} \\ \implies &{} \left( \begin{array}{l} (\mathcal{T}((v:=c),\,P)\,=_{A}\,\bot )\\ \lhd b[c/v]\rhd \\ (\mathcal{T}((v:=c),\,Q)\,=_{A}\,\bot ) \end{array}\right) &{}\{\text{ Theorem } 2.1(7)\} \\ \implies &{} \mathcal{T}((v:=c),~(P\lhd b\rhd Q))\,=_{A}\,\bot \end{array}\)

\(\begin{array}{clr} &{} \mathbf{divergence}((v:=c),~\oplus \{r_1:P_1,..,r_{n}:P_{n}\}) &{}\{\text{ Rule } (2)~\text{ in } \text{ Definition } \text{5.2 }\} \\ \implies &{} \varSigma \{(r_{k}[c/v])| \mathbf{divergence}((v:=c),\,P_{k})\}=1 &{}\{\text{ Induction } \text{ hypothesis }\} \\ \implies &{} \varSigma \{(r_{k}[c/v])| \mathcal{T}((v:=c),\,P_{k})~=_{A}~\bot \}=1 &{}\{\oplus -3~\text{ and }~6 \} \\ \implies &{} \mathcal{T}((v:=c),~\oplus \,\{r_1:P_1,..,r_{n}:P_{n}\})\,=_{A}\,\bot \end{array}\)

Finally we are going to tackle infinite programs.

Lemma 5.4

If \((s,\,G(Q))\rightarrow _{*}(t,\,\epsilon )\),

then either \(\mathbf{divergence}(s,\,G(\bot ))\) or \((s,\,G(\bot ))\rightarrow _{*}(t,\,\epsilon )\).

Proof

Induction on the structure of G

Base case: \(G(Q)\,=\,Q\). From Rule (6) in Definition 5.2

$$\begin{aligned} (s,\,\bot )\rightarrow (s,\,\bot ) \end{aligned}$$

it follows that \(\mathbf{divergence}(s,\,G(\bot ))\) holds.

Inductive step:

(1) \(G(Q)=G_1(Q)\lhd b\rhd G_2(Q)\)

\(\begin{array}{clr} &{}(s,\,G(Q))\rightarrow _{*}(t,\,\epsilon ) &{}\{\text{ Rule } (3)~\text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \left( \begin{array}{l} (s,\,G_1(Q))\rightarrow _{*}(t,\,\epsilon )\\ \lhd (s;b)\rhd \\ (s,\,G_2(Q))\rightarrow _{*}(t,\,\epsilon ) \end{array}\right) &{}\{\text{ Induction } \text{ hypothesis }\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}(s,\,G_1(\bot ))~\vee ~ (s,\,G_1(\bot ))\rightarrow _{*}(t,\,\epsilon )\\ \lhd (s;b)\rhd \\ \mathbf{divergence}(s,\,G_2(\bot ))~\vee ~ (s,\,G_2(\bot ))\rightarrow _{*}(t,\,\epsilon ) \end{array}\right) &{}\{\text{ Rule } (3)~\text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \mathbf{divergence}(s,\,G(\bot ))\,\vee \, (s,\,G(\bot ))\rightarrow _{*}(t,\,\epsilon ) \end{array}\)

(2) \(G(Q)=\oplus \{\alpha _1:G_1(Q),\,...\,\alpha _{k}:G_{k}(Q)\}\). Similar to Case (1).

(3) \(G(Q)=G_1(Q);G_2(Q)\)

\(\begin{array}{clr} &{}(s,\,G(Q))\rightarrow _{*}(t,\,\epsilon ) &{}\{\text{ Rule } (4)~\text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \exists u\bullet \left( \begin{array}{l} (s,\,G_1(Q))\rightarrow _{*}(u,\,\epsilon )~\wedge \\ (u,\,G_2(Q))\rightarrow _{*}(t,\,\epsilon ) \end{array}\right) &{}\{\text{ Induction } \text{ hypothesis }\} \\ \implies &{} \exists u\bullet \left( \begin{array}{l} \mathbf{divergence}(s,\,G_1(\bot ))~\vee ~\\ (s,\,G_1(\bot ))\rightarrow _{*}(u,\,\epsilon )~\vee ~\\ \mathbf{divergence}(u,\,G_2(\bot ))~\vee ~\\ (u,\,G_2(\bot ))\rightarrow _{*}(t,\,\epsilon ) \end{array}\right) &{}\{\text{ Rule } (4)~\text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \mathbf{divergence}(s,\,G(\bot ))\,\vee \, (s,\,G(\bot ))\rightarrow _{*}(t,\,\epsilon ) \end{array}\)

(4) \(G(Q)=\mu X\bullet P(Q,\,X)\)

\(\begin{array}{clr} &{} (s,\,\mu X\bullet P(Q,\,X))\rightarrow _{*} (t,\, \epsilon ) &{}\{\text{ Rule } (5)~\text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} (s,\,P(Q,\,\mu X\bullet P(Q,\,X)))\rightarrow _{*} (t,\, \epsilon ) &{}\{\text{ Induction } \text{ hypothesis }\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}(s,\,P(\bot ,\,\mu X\bullet P(\bot ,\,X)))~\vee \\ (s,\,P(\bot ,\,\mu X\bullet P(\bot ,\,X)))\rightarrow _{*} (t,\, \epsilon ) \end{array}\right) &{}\{\text{ Rule } (5)~\text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}(s,\,\mu X\bullet P(\bot ,\,X))~\vee \\ (s,\,\mu X\bullet P(\bot ,\,X))\rightarrow _{*} (t,\, \epsilon ) \end{array} \right) \end{array}\)

Lemma 5.5

(1) \(\mathbf{divergence}(s,\,F(P)) \,\implies \,\mathbf{divergence}(s,\,F(\bot ))\)

(2) \(\mathbf{divergence}(s,\,F(\mu X\bullet P(X)))\,\implies \, \mathbf{divergence}(s,\,F(P(\mu X\bullet P(X))))\)

Proof

(1) Based on induction on the structure of F.

Base case: \(F(X)=X\). The conclusion follows from Rule (6) in Definition 5.2.

Inductive Step:

\(\begin{array}{clr} &{}\mathbf{divergence}(s,\,F_1(Q)\lhd b\rhd F_2(Q)) &{}\{\text{ Rule } \text{(3) } \text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}(s,\,F_1(Q))\,\\ \lhd (s;b)\rhd \, \\ \mathbf{divergence}(s,\,F_2(Q)) \end{array} \right) &{}\{\text{ induction } \text{ hypothesis }\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}(s,\,F_1(\bot ))\,\\ \lhd (s;b)\rhd \,\\ \mathbf{divergence}(s,\,F_2(\bot )) \end{array} \right) &{}\{\text{ Rule } \text{(3) } \text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \mathbf{divergence}((s,\,(F_1(\bot )\lhd b\rhd F_2(\bot )))) \end{array}\)

\(\begin{array}{clr} &{}\mathbf{divergence}(s,\,F_1(Q);F_2(Q))&{} \{\text{ Rule } \text{(4) } \text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \mathbf{divergence}(s,\,F_1(Q))\,\vee \, \\ &{} \exists t\bullet (s,\,F_1(Q))\rightarrow _{*}(t,\,\epsilon )\,\wedge \,\mathbf{divergence}(t,\,F_2(Q)) &{}\{\text{ Lemma }~5.4\} \\ \implies &{} \left( \begin{array}{l} \mathbf{divergence}(s,\,F_1(\bot ))\,\vee \,\\ (s,\,F_1(\bot )) \rightarrow _{*} (t,\,\epsilon )\,\wedge \,\mathbf{divergence}(t,\,F_2(\bot )) \end{array} \right) &{}\{\text{ Rule } \text{(4) } \text{ in } \text{ Def } \text{5.2 }\} \\ \implies &{} \mathbf{divergence}(s,\,F_1(\bot );F_2(\bot )) \end{array}\)

Lemma 5.6

\(\mathbf{divergence}(s,\,F(\mu X\bullet P(X))) \,\implies \, \mathcal{T}(s,\,F(\mu X\bullet P(X)))\,=_{A}\,\bot \)

Proof

\(\begin{array}{clr} &{} \mathbf{divergence}(s,\,F(\mu X\bullet P(X))) &{}\{\text{ Lemma } 5.5(2)\} \\ \implies &{} \forall n\bullet \mathbf{divergence}(s,\,F(P^{n}(\mu X\bullet P(X)))) &{}\{\text{ Lemma } 5.5(1)\} \\ \implies &{} \forall n\bullet \mathbf{divergence}(s,\,F(P^{n}(\bot ))) &{}\{\text{ Lemma } 5.3\} \\ \implies &{} \forall n\bullet \mathcal{T}(s,\,F(P^{n}(\bot )))\,=_{A}\,\bot &{}\{\text{ Theorem }~3.8\} \\ \implies &{}\mathcal{T}(s,\,F(\mu X\bullet P(X)))\,=_{A}\,\bot \end{array}\)

Lemma 5.7

If P is finite, then \(\mathbf{finitary}(s,\,P)\) holds for all data states s.

Proof

By structural induction on the program text P.

Combining Lemmas 5.2, 5.6 and 5.7, we conclude:

Theorem 5.8

The transition system defined in Definition 5.2 is consistent.

6 Conclusions

This paper begins with an algebraic framework for our probabilistic programming language, and then shows how to deliver the corresponding denotational and operational representations consistently. The main contributions include:

  • Clarify the type of observations we are able to record during the execution of a probabilistic program:

    • The behaviour of a program cannot simply be modelled as a relation between the initial data state and a finite distribution on the possible final data states.

    • The normal form approach permits us to distinguish a program which can terminate and deliver a final distribution function from a program which can only generate approximations to that distribution function during its never-ending execution.

  • The test algebra lays down the foundation for construction of a denotational framework for our probabilistic programming language.

  • The consistency of an operational approach against the algebra of programs can be formalised and validated within the algebra of programs.

The language we put forward in this paper does not include the nondeterministic choice operator found in traditional programming languages. As a result, we lose the case where a probabilistic choice can be identified as a refinement of a nondeterministic choice. Moreover, the refinement order in conventional languages is directly induced from the choice operator, whereas we were forced to adopt an inductive definition in Sects. 2 and 3 based on finite and infinite normal forms. Consequently, the proofs of monotonicity of the programming combinators in this paper look cumbersome.

In future, we will investigate a language armed with both probabilistic and nondeterministic choice operators, and follow up the algebraic approach advocated in this paper to explore the links among various programming presentations for the probabilistic languages.