1 Introduction

With the ever-increasing interest in probabilistic processes, robust notions and formal tools are needed to analyze their behavior. It has been argued many times by now that behavioral metrics [2, 6, 7, 13, 14, 20, 23, 34] are preferable to behavioral relations in this setting in that they overcome the high sensitivity of equivalences and preorders with respect to tiny variations in the values of probabilities. Instead of stating whether the behavior of two processes is exactly the same or not, behavioral metrics measure the disparities in their behavior.

A fundamental aspect of process specification and verification is compositional reasoning, namely proving the compatibility of the language operators with the chosen behavioral semantics. In [16] the authors proposed to express the compositional properties of the bisimilarity metric [6, 13, 14], i.e., the quantitative analogue of bisimulation equivalence, in terms of its continuity properties. In detail, a notion of uniform continuity was proposed, subsuming the properties of non-extensiveness [4], non-expansiveness [14] and Lipschitz continuity of the metric. Informally, an operator is non-extensive wrt. a metric if the distance of two systems defined via that operator is not greater than the maximum of the pairwise distances of their components. The notion of non-expansiveness relaxes that of non-extensiveness by allowing the distance of the two systems to be bounded by the sum of the pairwise distances of their components. Lipschitz continuity, in turn, relaxes non-expansiveness by taking as bound a multiplicative factor of the sum of the pairwise distances of the components of the two systems.
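In symbols, writing d for the metric and f for an \(\mathfrak {n}\)-ary operator, the three notions impose the following increasingly liberal bounds:

$$ \begin{aligned} d(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) &\le \max _{i=1}^{\mathfrak {n}} d(s_i,t_i) &&\text {(non-extensiveness)}\\ d(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) &\le \sum _{i=1}^{\mathfrak {n}} d(s_i,t_i) &&\text {(non-expansiveness)}\\ d(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) &\le L \cdot \sum _{i=1}^{\mathfrak {n}} d(s_i,t_i) &&\text {(Lipschitz continuity)} \end{aligned} $$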

Since [33], specification formats have been used to guarantee compositionality properties by construction: a specification format is a set of syntactic constraints on the form of the SOS inference rules [29, 30] used to define the operational semantics of processes (see [1, 28] for surveys). A first step towards specification formats for the uniform continuity properties of the bisimilarity metric can be found in [19]. However, the formats in [19] are not fully syntactic: they do not simply require an inspection of the patterns of the rules, but also require computing how many copies of a process argument can be spawned by each operator. This computation is necessarily recursive, since the SOS rules define operators in terms of other operators.

Our Contribution. In this paper we provide classic, fully syntactic formats for the uniform continuity properties of the bisimilarity metric. In detail, we propose three formats: the non-extensiveness, the non-expansiveness, and the Lipschitz continuity format. These formats build on the PGSOS format [11, 12] and on syntactic constraints regulating the form of the targets of rules. We recall that the PGSOS format guarantees that probabilistic bisimilarity, i.e., the kernel of the bisimilarity metric, is a congruence [12]. Hence, to obtain the desired continuity properties it is fundamental to start from the PGSOS format, since it ensures that by composing processes which are at distance 0, i.e., bisimilar, we obtain processes which are at distance 0. We will see that less demanding continuity properties require less demanding rule constraints.

2 Background

In the process algebra setting, processes are constructed inductively as closed terms over a suitable signature, namely a countable set \(\varSigma \) of operators. We let f range over \(\varSigma \) and denote by \(\mathfrak {n}\) the rank of \(f \in \varSigma \). Operators with rank 0 are called constants. We let \(\varSigma ^0\) denote the subset of constants in the signature \(\varSigma \).

Assume a set of (state, or process) variables \(\mathbf {V}\!_s\) disjoint from \(\varSigma \) and ranged over by \(x,y,\ldots \). The set \(\textsf {T}(\varSigma , V)\) of terms over \(\varSigma \) and a set of variables \(V \subseteq \mathbf {V}\!_s\) is the least set satisfying: (i) \(V \subseteq \textsf {T}(\varSigma , V)\), and (ii) \(f(t_1,\ldots ,t_{\mathfrak {n}}) \in \textsf {T}(\varSigma , V)\) whenever \(f \in \varSigma \) and \(t_1,\ldots ,t_{\mathfrak {n}} \in \textsf {T}(\varSigma , V)\). By \(\mathbf {T}(\varSigma )\) we denote the set of the closed terms \(\textsf {T}(\varSigma , \emptyset )\), or processes. By \(\mathbb {T}(\varSigma )\) we denote the set of all open terms \(\textsf {T}(\varSigma , \mathbf {V}\!_s)\). By \(\mathrm {var}(t)\) we denote the set of the variables occurring in the term t.

To equip processes with a semantics we rely on nondeterministic probabilistic labeled transition systems (PTSs) [31], which allow for modeling reactive behavior, nondeterminism and probability. The state space in a PTS is the set of processes \(\mathbf {T}(\varSigma )\). Transitions take processes to discrete probability distributions over processes, i.e., mappings \(\pi :\mathbf {T}(\varSigma )\rightarrow [0,1]\) with \(\sum _{t \in \mathbf {T}(\varSigma )} \pi (t) = 1\). The support of a distribution \(\pi \) is \(\mathsf {supp}(\pi ) = \{ t \in \mathbf {T}(\varSigma )\mid \pi (t) >0\}\). By \(\varDelta (\mathbf {T}(\varSigma ))\) we denote the set of all finitely supported distributions over \(\mathbf {T}(\varSigma )\).

Definition 1

(PTS, [31]). A PTS is a triple \((\mathbf {T}(\varSigma ),\mathcal A,\xrightarrow { \, {} \, })\), where \(\varSigma \) is a signature, \(\mathcal A\) a countable set of action labels, and \(\xrightarrow { \, {} \, } \subseteq {\mathbf {T}(\varSigma )\times \mathcal A\times \varDelta (\mathbf {T}(\varSigma ))}\) a transition relation.

As usual, a transition \((t,\alpha ,\pi ) \in \xrightarrow { \, {} \, }\) is denoted \(t \xrightarrow { \, {\alpha } \, } \pi \). Then, \(t \not \xrightarrow { \, {\alpha } \, }\) denotes that there is no distribution \(\pi \) with \(t \xrightarrow { \, {\alpha } \, } \pi \), and is called a negative transition.

Distributions and Distribution Terms. For a process \(t \in \mathbf {T}(\varSigma )\), \(\delta _t\) is the Dirac distribution with \(\delta _t(t)= 1\) and \(\delta _t(s)=0\) for all \(s \ne t\). For \(f \in \varSigma \) and \(\pi _i \in \varDelta (\mathbf {T}(\varSigma ))\), \(f(\pi _1,\dots , \pi _{\mathfrak {n}})\) is the distribution defined by \(f(\pi _1,\dots ,\pi _{\mathfrak {n}})(f(t_1,\dots ,t_{\mathfrak {n}})) = \prod _{i = 1}^{\mathfrak {n}}\pi _i(t_i)\) and \(f(\pi _1,\dots ,\pi _{\mathfrak {n}})(t)=0\) for all t not of the form \(f(t_1,\dots ,t_{\mathfrak {n}})\). The convex combination \(\sum _{i \in I} p_i \pi _i\) of a family of distributions \(\{\pi _i\}_{i \in I} \subseteq \varDelta (\mathbf {T}(\varSigma ))\) with \(p_i \in (0,1]\) and \(\sum _{i \in I} p_i = 1\) is defined by \((\sum _{i \in I} p_i \pi _i)(t) = \sum _{i \in I} (p_i \pi _i(t))\) for all \(t \in \mathbf {T}(\varSigma )\).
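These constructions on finitely supported distributions can be made concrete; a minimal Python sketch, under our own encoding (distributions as dictionaries from closed terms to probabilities, an application of an operator f as tuple formation):

```python
from itertools import product

def dirac(t):
    # Dirac distribution delta_t: all mass on the single term t
    return {t: 1.0}

def lift_op(f, *dists):
    # f(pi_1, ..., pi_n): the probability of f(t_1, ..., t_n) is the
    # product of the pi_i(t_i); terms not of that shape get probability 0
    # (and are simply absent from the dictionary).
    out = {}
    for combo in product(*(d.items() for d in dists)):
        terms = tuple(t for t, _ in combo)
        p = 1.0
        for _, q in combo:
            p *= q
        key = (f,) + terms
        out[key] = out.get(key, 0.0) + p
    return out

def convex(weighted):
    # sum_i p_i * pi_i for a family [(p_i, pi_i)] with sum_i p_i = 1
    out = {}
    for p, d in weighted:
        for t, q in d.items():
            out[t] = out.get(t, 0.0) + p * q
    return out
```

For instance, `lift_op('f', dirac('s'), dirac('t'))` yields the Dirac distribution on the term f(s, t).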

Then, we need the notion of distribution terms. Assume a countable set of distribution variables \(\mathbf {V}\!_d\) disjoint from \(\varSigma \) and \(\mathbf {V}\!_s\) and ranged over by \(\mu , \nu ,\ldots \). Intuitively, any \(\mu \in \mathbf {V}\!_d\) may instantiate to a distribution in \(\varDelta (\mathbf {T}(\varSigma ))\). The set of distribution terms over \(\varSigma \) and the sets of variables \(V_s \subseteq \mathbf {V}\!_s\) and \(V_d \subseteq \mathbf {V}\!_d\), notation \(\textsf {D}\textsf {T}(\varSigma , V_s, V_d)\), is the least set satisfying: (i) \(\{\delta _t \mid t \in \textsf {T}(\varSigma , V_s)\} \subseteq \textsf {D}\textsf {T}(\varSigma , V_s, V_d)\), (ii) \(V_d \subseteq \textsf {D}\textsf {T}(\varSigma , V_s, V_d)\), (iii) \(f(\varTheta _1, \ldots , \varTheta _n) \in \textsf {D}\textsf {T}(\varSigma , V_s, V_d)\) whenever \(f \in \varSigma \) and \(\varTheta _i \in \textsf {D}\textsf {T}(\varSigma , V_s, V_d)\), (iv) \({\textstyle \sum _{j\in J} p_j \varTheta _j \in \textsf {D}\textsf {T}(\varSigma , V_s, V_d)}\) whenever \(\varTheta _j \in \textsf {D}\textsf {T}(\varSigma , V_s, V_d)\) and \(p_j \in (0,1]\) with \(\sum _{j\in J} p_j = 1\). We write \(\mathbb {DT}(\varSigma )\) for \(\textsf {D}\textsf {T}(\varSigma , \mathbf {V}\!_s,\mathbf {V}\!_d)\), i.e. the set of all open distribution terms, and \({\mathbf {DT}}(\varSigma )\) for \(\textsf {D}\textsf {T}(\varSigma , \emptyset ,\emptyset )\), i.e. the set of the closed distribution terms. Notice that closed distribution terms denote distributions. We denote by \(\mathrm {var}(\varTheta )\) the set of the variables occurring in distribution term \(\varTheta \). We let \(\delta _{\mathbf {V}\!_s}\) denote the set \(\{\delta _x \mid x \in \mathbf {V}\!_s\}\).

Informally, a distribution term is in normal form if the Dirac operator is applied only to single variables, as in \(\delta _x\), and never to non-variable terms \(\delta _t\) with \(t \not \in \mathbf {V}\!_s\), and, moreover, either no convex combination occurs, or a single convex combination occurs as the outermost operation.

Definition 2

(Normal form, [19]). A distribution term \(\varTheta \in \mathbb {DT}(\varSigma )\) is in normal form iff: either (i) \(\varTheta = \mu \in \mathbf {V}\!_d\), or (ii) \(\varTheta = \delta _x\) for \(x \in \mathbf {V}\!_s\), or (iii) \(\varTheta = f(\varTheta _1,\dots ,\varTheta _{\mathfrak {n}})\) where all the \(\varTheta _i\) are in normal form and no convex combination occurs in any of the \(\varTheta _i\), or (iv) \(\varTheta = \sum _{j \in J} p_j \varTheta _j\) where all the \(\varTheta _j\) are in normal form and no convex combination occurs in any of the \(\varTheta _j\).

A substitution is a mapping \(\sigma :\mathbf {V}\!_s\cup \mathbf {V}\!_d\rightarrow \mathbb {T}(\varSigma )\cup \mathbb {DT}(\varSigma )\) with \(\sigma (x) \in \mathbb {T}(\varSigma )\) for all \(x \in \mathbf {V}\!_s\) and \(\sigma (\mu ) \in \mathbb {DT}(\varSigma )\) for all \(\mu \in \mathbf {V}\!_d\). A substitution \(\sigma \) is closed if it maps terms to closed terms in \(\mathbf {T}(\varSigma )\cup {\mathbf {DT}}(\varSigma )\).

We say that two distribution terms \(\varTheta _1,\varTheta _2\) are equivalent, written \(\varTheta _1 \equiv \varTheta _2\), if under all closed substitutions \(\sigma \) the closed distribution terms \(\sigma (\varTheta _1)\) and \(\sigma (\varTheta _2)\) express the same probability distribution. Each distribution term is equivalent to one in normal form.

Proposition 1

([19]). For each distribution term \(\varTheta \in \mathbb {DT}(\varSigma )\) there is a distribution term \(\varTheta ' \in \mathbb {DT}(\varSigma )\) in normal form such that \(\varTheta \equiv \varTheta '\).
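The rewriting behind Proposition 1 distributes operators over convex combinations, exploiting the identity \(f(\ldots , \sum _{j} p_j \varTheta _j, \ldots ) \equiv \sum _{j} p_j f(\ldots , \varTheta _j, \ldots )\) argument by argument, and flattens nested combinations. A sketch of this normalization in Python (the term encoding is ours; the case of a Dirac \(\delta _t\) over a compound term t, which first unfolds into an operator applied to Diracs, is omitted):

```python
def normalize(th):
    """Return a distribution term as a list [(p, t)] where each t is
    free of convex combinations: the list represents the single
    outermost convex combination of the normal form."""
    kind = th[0]
    if kind in ('var', 'dirac'):           # mu, or delta_x
        return [(1.0, th)]
    if kind == 'op':                       # f(Theta_1, ..., Theta_n)
        f, args = th[1], th[2:]
        combos = [(1.0, ())]
        for arg in args:
            na = normalize(arg)            # distribute each argument's
            combos = [(p * q, ts + (t,))   # combination over f
                      for p, ts in combos for q, t in na]
        return [(p, ('op', f) + ts) for p, ts in combos]
    if kind == 'sum':                      # flatten nested combinations
        return [(p * q, t)
                for p, sub in th[1]
                for q, t in normalize(sub)]
```

For example, a combination nested inside a combination is flattened by multiplying the weights, and an operator applied to a combination is pushed inside each summand.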

PGSOS Specifications. PTSs are defined by means of SOS rules [30], which are syntax-driven inference rules allowing us to infer the behavior of processes inductively with respect to their structure. They are based on expressions of the form \(t \xrightarrow { \, {\alpha } \, } \varTheta \) and \(t \not \xrightarrow { \, {\alpha } \, }\), called, resp., positive and negative literals, with t a term and \(\varTheta \) a distribution term, which will instantiate to transitions through substitutions. The literals \(t \xrightarrow { \, {\alpha } \, } \varTheta \) and \(t \not \xrightarrow { \, {\alpha } \, }\), with the same term t on the left-hand side and the same label \(\alpha \), are said to deny each other.

We assume SOS rules in the probabilistic GSOS (PGSOS) format [12], which allows for specifying most probabilistic process algebra operators [16]. As examples, all SOS rules in Tables 1, 2 and 3 are PGSOS rules according to the following definition.

Definition 3

(PGSOS rules, [12]). A PGSOS rule r has the form:

$$ \frac{\displaystyle \{x_h \xrightarrow { \, {\alpha _{h,m}} \, } \mu _{h,m} \mid h \in H, m \in M_h\} \qquad \{x_h \not \xrightarrow { \, {\alpha _{h,n}} \, } \mid h \in H, n \in N_h\}}{\displaystyle f(x_1,\dots ,x_{\mathfrak {n}}) \xrightarrow { \, {\alpha } \, } \varTheta } $$
(1)

where \(f\in \varSigma \) is an operator, \(H \subseteq \{1,\dots ,\mathfrak {n}\}\), \(M_h\) and \(N_h\) are finite sets of indexes, \(\alpha _{h,m},\alpha _{h,n},\alpha \in \mathcal A\) are action labels, \(x_h\in \mathbf {V}\!_s\) are state variables, \(\mu _{h,m}\in \mathbf {V}\!_d\) are distribution variables and \(\varTheta \in \mathbb {DT}(\varSigma )\) is a distribution term. Furthermore: (i) all distribution variables \(\mu _{h,m}\) with \(h\in H\) and \(m\in M_h\) are distinct, (ii) all variables \(x_1, \dots , x_{\mathfrak {n}}\) are distinct, (iii) \(\mathrm {var}(\varTheta )\subseteq \{\mu _{h,m} \mid h\in H, m\in M_h\} \cup \{x_1,\dots , x_{\mathfrak {n}}\}\).

Constraints (i)–(iii) are inherited from the classical GSOS rules in [5] and are necessary to ensure that probabilistic bisimulation is a congruence [8, 9, 12]. For a PGSOS rule r, the positive (resp. negative) literals above the line are called the positive (resp. negative) premises, notation \(\mathrm {pprem}(r)\) (resp. \(\mathrm {nprem}(r)\)). We denote by \(\mathrm {der}(x,r) = \{\mu \mid x \xrightarrow { \, {a} \, }\mu \in \mathrm {pprem}(r)\}\) the set of derivatives of x in r. The literal \(f(x_1, \dots , x_{\mathfrak {n}})\xrightarrow { \, {\alpha } \, } \varTheta \) is called the conclusion, notation \(\mathrm {conc}(r)\), the term \(f(x_1, \dots , x_{\mathfrak {n}})\) is called the source, notation \(\mathrm {src}(r)\), and the distribution term \(\varTheta \) is called the target, notation \(\mathrm {trg}(r)\).

Definition 4

(PTSS, [12]). A PGSOS-transition system specification (PTSS) is a tuple \(P=(\varSigma ,\mathcal A,R)\), with \(\varSigma \) a signature, \(\mathcal A\) a set of actions and R a set of PGSOS rules.

We conclude with the notion of disjoint extension for a PTSS which allows us to introduce new operators without affecting the behavior of those already specified.

Definition 5

(Disjoint extension). A PTSS \(P' = (\varSigma ',\mathcal A,R')\) is a disjoint extension of a PTSS \(P = (\varSigma ,\mathcal A,R)\), written \(P \sqsubseteq P'\), if \(\varSigma \subseteq \varSigma '\), \(R \subseteq R'\) and \(R'\) introduces no new rules for the operators in \(\varSigma \).

3 Behavioral Metrics and Their Compositional Properties

Behavioral equivalences, such as probabilistic bisimulation [27, 32], answer the question of whether two processes behave precisely the same way or not with respect to the observations we can make on them. Behavioral metrics [2, 6, 13, 14, 34] answer the more general question of measuring the differences in the behavior of processes. They are usually defined as 1-bounded pseudometrics expressing the behavioral distance between processes, namely they quantify the disparities in the observations that we can make on them.

A 1-bounded pseudometric on a set X is a function \(d :X \times X \rightarrow [0,1]\) such that: (i) \(d(x,x) =0\), (ii) \(d(x,y) = d(y,x)\), and (iii) \(d(x,y) \le d(x,z) + d(z,y)\), for all \(x,y,z \in X\). The kernel of a pseudometric d on X consists of the set of the pairs of elements of X that are at distance 0, namely \(ker(d) = \{(x,y) \in X \times X \mid d(x,y) = 0\}\). For simplicity, we denote by \(\mathcal {D}(X)\) the set of 1-bounded pseudometrics on the set X.

As elsewhere in the literature, we will sometimes use the term metric in place of pseudometric. Below we recall the notion of bisimilarity metric, which is one of the most studied behavioral metrics in the literature, and we discuss its compositionality.

The notion of bisimulation metric [6, 13, 14] is based on the quantitative analogue of the bisimulation game: two processes can be at some given distance \(\varepsilon <1\) only if they can mimic each other’s transitions and evolve to distributions that are, in turn, at a distance \(\le \varepsilon \). To formalize this intuition, we need to lift pseudometrics on processes to pseudometrics on distributions. To this purpose, we rely on the notions of matching (also known as coupling, or weight function) and Kantorovich lifting.

Definition 6

(Matching). Assume two sets X and Y. A matching for distributions \(\pi \in \varDelta (X)\) and \(\pi ' \in \varDelta (Y)\) is a distribution over the product space \(\mathfrak {w}\in \varDelta (X \times Y)\) with \(\pi \) and \(\pi '\) as left and right marginal, namely: (i) \(\sum _{y\in Y} \mathfrak {w}(x,y)=\pi (x)\), for all \(x \in X\), and (ii) \(\sum _{x\in X} \mathfrak {w}(x,y)=\pi '(y)\), for all \(y \in Y\). We let \(\mathfrak {W}(\pi ,\pi ')\) denote the set of all matchings for \(\pi \) and \(\pi '\).

Definition 7

(Kantorovich metric, [21]). Given \(d \in \mathcal {D}(X)\), the Kantorovich lifting of d is the pseudometric \({{\mathrm{\mathbf {K}}\,}}(d):\varDelta (X) \times \varDelta (X) \rightarrow [0,1]\) defined for all \(\pi ,\pi ' \in \varDelta (X)\) by

$$ {{\mathrm{\mathbf {K}}\,}}(d)(\pi ,\pi ') = \min _{\mathfrak {w}\in \mathfrak {W}(\pi ,\pi ')} \sum _{x,y \in X}\mathfrak {w}(x,y)\cdot d(x,y). $$
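To get a feel for the lifting, consider the discrete metric \(d(x,y)=1\) for all \(x \ne y\): an optimal matching keeps as much mass as possible on the diagonal, where the cost is 0, so the Kantorovich lifting coincides with the total variation distance. A minimal Python sketch, assuming finitely supported distributions as dictionaries (the function name is ours):

```python
def kantorovich_discrete(pi, rho):
    # Kantorovich lifting of the discrete metric = total variation
    # distance: the best matching places mass min(pi(x), rho(x)) on
    # each diagonal pair (x, x) at cost 0, and pays cost 1 for all
    # the remaining mass.
    common = sum(min(pi.get(x, 0.0), rho.get(x, 0.0))
                 for x in set(pi) | set(rho))
    return 1.0 - common
```

For distributions with disjoint supports the result is 1, and for identical distributions it is 0, matching the two extremes of the lifted metric.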

Bisimulation metrics are normally parametric with respect to a discount factor, which allows us to specify how much the distance of future transitions is mitigated [3, 14]. Informally, any difference that can be observed only after a long sequence of computation steps does not have the same impact as the differences that can be witnessed at the beginning of the computation. In our context, the discount factor is a value \(\lambda \in (0,1]\) such that the distance arising at step n is mitigated by \(\lambda ^n\) (\(\lambda =1\) means no discount).

Definition 8

(Bisimulation metric, [13]). Let \(\lambda \in (0,1]\). We say that \(d \in \mathcal {D}(\mathbf {T}(\varSigma ))\) is a bisimulation metric if for all \(s,t \in \mathbf {T}(\varSigma )\) with \(d(s,t) < 1\), we have that:

$$ \forall s \xrightarrow { \, {a} \, } \pi _s \; \exists t \xrightarrow { \, {a} \, } \pi _t \text{ such } \text{ that } \lambda \cdot {{\,\mathrm{\mathbf {K}}\,}}(d)(\pi _s,\pi _t) \le d(s,t). $$

Notice that any bisimulation metric is a pseudometric by definition. In [13] it was proved that the smallest bisimulation metric exists; it is called the bisimilarity metric and is denoted by \(\mathbf {b}^{\lambda }\). Moreover, its kernel induces an equivalence relation that coincides with probabilistic bisimilarity, namely \(\mathbf {b}^{\lambda }(s,t) = 0\) if and only if s and t are bisimilar [6, 13].
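For intuition, \(\mathbf {b}^{\lambda }\) can be approximated by iterating the defining condition as a fixed-point functional, starting from the constant-zero metric. Below is a toy Python sketch for the simplified case of deterministic transitions to Dirac distributions, where \({{\,\mathrm{\mathbf {K}}\,}}(d)(\delta _u,\delta _v) = d(u,v)\) and no genuine Kantorovich computation is needed; the encoding and the example system are ours:

```python
def bisim_metric(states, trans, lam, iters=100):
    # trans maps a state to {action: successor}; every transition goes
    # to a Dirac distribution. Iterating from the zero metric
    # approximates the least fixed point.
    d = {(s, t): 0.0 for s in states for t in states}
    for _ in range(iters):
        d = {(s, t): step(d, trans, lam, s, t)
             for s in states for t in states}
    return d

def step(d, trans, lam, s, t):
    ds, dt = trans.get(s, {}), trans.get(t, {})
    val = 0.0
    for a in set(ds) | set(dt):
        if a in ds and a in dt:
            val = max(val, lam * d[(ds[a], dt[a])])
        else:
            val = 1.0  # an action available on one side only: distance 1
    return val

# 'u' loops on b while 'v' is inert, so they are at distance 1;
# 's' and 't' reach them after one a-step, discounted by lam.
trans = {'s': {'a': 'u'}, 't': {'a': 'v'}, 'u': {'b': 'u'}}
d = bisim_metric(['s', 't', 'u', 'v'], trans, lam=0.5)
```

With \(\lambda = 0.5\), the distance between s and t is half the distance between their successors u and v, illustrating the discount.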

3.1 Compositionality

In order to specify and verify systems in a compositional manner, it is necessary that the behavioral semantics be compatible with all the operators of the language describing these systems. In the behavioral metric approach, we must guarantee that the distance between two composed systems \(f(s_1,\ldots ,s_{\mathfrak {n}})\) and \(f(t_1,\ldots ,t_{\mathfrak {n}})\) depends on the distances between the pairs of arguments \(s_i,t_i\), so that the closer all the pairs \(s_i\) and \(t_i\), the closer the composed systems \(f(s_1,\ldots ,s_{\mathfrak {n}})\) and \(f(t_1,\ldots ,t_{\mathfrak {n}})\).

Following [18, 19], we define compositionality properties for bisimulation metric relying on the notion of modulus of continuity for an operator with respect to a distance.

Definition 9

(Modulus of continuity). Let \(P=(\varSigma ,\mathcal A,R)\) be a PTSS, \(f \in \varSigma \) some \(\mathfrak {n}\)-ary operator and \(d \in \mathcal {D}(\mathbf {T}(\varSigma ))\). Then, a mapping \(\mathfrak {m}:[0,1]^{\mathfrak {n}} \rightarrow [0,1]\) is a modulus of continuity for f with respect to d iff:

  • for all processes \(s_i,t_i \in \mathbf {T}(\varSigma )\) we have:

    $$\begin{aligned} d(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) \le \mathfrak {m}(d(s_1,t_1),\ldots ,d(s_{\mathfrak {n}},t_{\mathfrak {n}})), \end{aligned}$$
  • \(\mathfrak {m}\) is continuous at \((0,\ldots ,0)\), i.e. \(\lim _{(\varepsilon _1, \ldots , \varepsilon _{\mathfrak {n}}) \rightarrow (0, \ldots , 0)} \mathfrak {m}(\varepsilon _1,\ldots ,\varepsilon _{\mathfrak {n}}) = \mathfrak {m}(0,\ldots ,0)\),

  • \(\mathfrak {m}(0,\ldots ,0)=0\).

In [16, 18, 19] an operator is called uniformly continuous if it admits some modulus of continuity with respect to \(\mathbf {b}^{\lambda }\). Intuitively, a uniformly continuous \(\mathfrak {n}\)-ary operator f ensures that for any non-zero bisimulation distance \(\varepsilon \) (understood as the admissible tolerance on the operational behavior of the composed processes \(f(s_1,\ldots ,s_{\mathfrak {n}})\) and \(f(t_1,\ldots ,t_{\mathfrak {n}})\)) there are non-zero bisimulation distances \(\varepsilon _i\) such that the distance between the composed processes is at most \(\varepsilon \) whenever each component \(t_i\) is at distance at most \(\varepsilon _i\) from \(s_i\).

Definition 10

(Uniformly continuous operator). Let \(P=(\varSigma ,\mathcal A,R)\) be a PTSS. We say that an \(\mathfrak {n}\)-ary operator \(f \in \varSigma \) is

  • non-extensive if \(\mathfrak {m}(\varepsilon _1,\ldots ,\varepsilon _{\mathfrak {n}}) = \max _{i=1}^{\mathfrak {n}} \varepsilon _i\) is a modulus of continuity for f;

  • non-expansive if \(\mathfrak {m}(\varepsilon _1,\ldots ,\varepsilon _{\mathfrak {n}}) = \sum _{i=1}^{\mathfrak {n}} \varepsilon _i\) is a modulus of continuity for f;

  • Lipschitz continuous if \(\mathfrak {m}(\varepsilon _1,\ldots ,\varepsilon _{\mathfrak {n}}) = L \cdot \sum _{i=1}^{\mathfrak {n}} \varepsilon _i\) is a modulus of continuity for f, for some \(L > 1\).

Non-extensiveness and non-expansiveness are the most studied notions of compositionality (see, e.g., [4, 10, 13, 14, 15, 17, 22, 34, 35]). However, they do not allow for reasoning on recursive processes [16, 19, 24]. Essentially, recursive operators are Lipschitz continuous but not non-expansive [16].

We conclude this section by showing that the PGSOS format is necessary to obtain the uniform continuity properties of operators. In the following example we exhibit an operator f, defined by a rule which does not respect the PGSOS format, for which no modulus of continuity among those in Definition 10 can be established wrt. \(\mathbf {b}^{\lambda }\).

Example 1

(Counterexample for PGSOS format). For simplicity of notation, in all the Examples we will let a.t stand for \(a.\delta _{t}\). Assume the prefix operator \(a.\bigoplus _{i \in I}[p_i]\_\) in Table 1 and a PTSS containing the following rules

$$ \frac{\displaystyle x \mathbin {\Vert }_{\mathcal A} x \xrightarrow { \, {a} \, } \mu }{\displaystyle f(x) \xrightarrow { \, {a} \, } f(\mu )} \quad \frac{\displaystyle }{\displaystyle \eta \xrightarrow { \, {\surd } \, } \delta _{\mathrm {nil}}} \quad \frac{\displaystyle x \xrightarrow { \, {a} \, } \mu \quad y \xrightarrow { \, {a} \, } \nu \quad a \ne \surd }{\displaystyle x \mathbin {\Vert }_{\mathcal A} y \xrightarrow { \, {a} \, } \mu \mathbin {\Vert }_{\mathcal A} \nu } \quad \frac{\displaystyle x \xrightarrow { \, {\surd } \, } \mu \quad y \xrightarrow { \, {\surd } \, } \nu }{\displaystyle x \mathbin {\Vert }_{\mathcal A} y \xrightarrow { \, {\surd } \, } \delta _{\mathrm {nil}}} $$

in which \(\mathbin {\Vert }_{\mathcal A}\) denotes the fully synchronous parallel composition operator, \(\eta \) denotes the skip process, \(\mathrm {nil}\) is the process that cannot perform any action and \(\surd \) is a special action in \(\mathcal A\) denoting successful termination. Clearly, the rule for f falls outside the PGSOS format since an arbitrary term different from a single process variable occurs as the left-hand side of the premise. Given any process \(u \in \mathbf {T}(\varSigma )\) s.t. \(\mathbf {b}^{\lambda }(u,\eta ) = 1\), consider processes \(s_1 = a.u\) and \(t_1 = a.([1-\varepsilon ] u \oplus [\varepsilon ] \eta )\) and for all \(k > 1\) define \(s_k = a.s_{k-1}\) and \(t_k = a.t_{k-1}\). Clearly, for all \(k \ge 1\) we have \(\mathbf {b}^{\lambda }(s_k,t_k) = \varepsilon \cdot \lambda ^k\). We now aim at evaluating \(\mathbf {b}^{\lambda }(f(s_k),f(t_k))\), for all \(k\ge 1\). Since the target of the rule for f is of the form \(f(\mu )\), where \(\mu \) is derived from the synchronous parallel composition of the source variable with itself, we can infer that \(f(s_k)\) via a sequence of k a-labeled transitions reaches a distribution that assigns probability 1 to the parallel composition of \(2^k\) copies of process u. Similarly, after k a-labeled transitions \(f(t_k)\) reaches a distribution assigning probability \((1-\varepsilon )^{2^k}\) to the process corresponding to the parallel composition of \(2^k\) copies of u; probability \(\varepsilon ^{2^k}\) to the parallel composition of \(2^k\) copies of \(\eta \), and the remaining probability \(1-(1-\varepsilon )^{2^k}-\varepsilon ^{2^k}\) to processes that cannot execute any action (being the parallel composition of copies of u and \(\eta \)). Therefore we get \(\mathbf {b}^{\lambda }(f(s_k),f(t_k)) = \lambda ^k(1-(1-\varepsilon )^{2^k})\). 
This implies that for any k with \(2^k>L\), \(\mathbf {b}^{\lambda }(f(s),f(t)) / \mathbf {b}^{\lambda }(s,t) = (1-(1-\varepsilon )^{2^k})/\varepsilon > L\) holds for \(s = s_k\), \(t = t_k\) and \(\varepsilon \in (0, (2^k-L)/(2^{k-1}(2^k-1)))\). We infer that no Lipschitz constant can be found for f and thus none of the continuity properties in Definition 10 can be established for f.
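The final bound is easy to check numerically; a quick sanity check in Python, for the illustrative choices \(L = 10\) and \(k = 4\) (so \(2^k = 16 > L\)):

```python
# Parameters from Example 1: any k with 2^k > L admits a suitable eps.
L_const, k = 10, 4                      # illustrative choices (ours)
n = 2 ** k                              # number of parallel copies of u
eps_bound = (n - L_const) / (2 ** (k - 1) * (n - 1))
eps = eps_bound / 2                     # any eps in (0, eps_bound) works
ratio = (1 - (1 - eps) ** n) / eps      # b(f(s_k), f(t_k)) / b(s_k, t_k)
```

Here `ratio` evaluates to roughly 13.3, comfortably above \(L = 10\), confirming that no Lipschitz constant bounds f.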

4 The Non-extensiveness Format

In this section we provide a specification format ensuring that all operators in PTSSs respecting the syntactic constraints imposed by the format are non-extensive. Briefly, the non-extensiveness format admits only convex combinations of variables and constants as targets of the rules. In the following, we rely on Proposition 1 and, without loss of generality, we always assume that the targets of all PGSOS rules are in normal form.

Definition 11

(Specification format for non-extensiveness). A PTSS \(P = (\varSigma ,\mathcal A,R)\) is in non-extensiveness format if all rules \(r \in R\) are PGSOS rules as in Definition 3 and \(\varTheta = \mathrm {trg}(r)\) is of the form \(\varTheta = \sum _{j \in J} p_j \varTheta _j \text {, with } \varTheta _j \in \mathbf {V}\!_d\cup \delta _{\mathbf {V}\!_s} \cup \varSigma ^0 \text { for all } j \in J\).

Then, all operators in \(\varSigma \) are called P-non-extensive.

Essentially, in each element \(\varTheta _j\) there is at most one occurrence of a process argument \(x_h\) or of one of its derivatives \(\mu _{h,m}\), implying that \(\varTheta _j\) contributes to the distance \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}}))\) by at most \(\mathbf {b}^{\lambda }(s_h,t_h)\). As a consequence, we have that \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) \le \sum _{j \in J} p_j \max _{i=1}^{\mathfrak {n}}\mathbf {b}^{\lambda }(s_i,t_i) = \max _{i=1}^{\mathfrak {n}} \mathbf {b}^{\lambda }(s_i,t_i)\).

We notice that all rules in Table 1 respect Definition 11. Conversely, for each operator defined in Tables 2 and 3 there is at least one rule violating the constraints in Definition 11. For instance, the target of the first rule for the sequentialization operator contains both the derivative \(\mu \) of the process argument x and the process argument y. Similarly, the target of the first rule for the parallel composition operator \(x \mathbin {\Vert }_{B} y\) contains one derivative for each of the two process arguments.

To obtain the non-extensiveness property of the bisimilarity metric from our format, we apply a quantitative analogue of the standard technique used to prove the congruence property of behavioral equivalences. Informally, we introduce the notion of non-extensive closure of a pseudometric, seen as the quantitative analogue of the congruence closure of behavioral relations. Briefly, the non-extensive closure lifts a pseudometric to a pseudometric that satisfies the non-extensiveness property.

Definition 12

(Non-extensive closure). Assume a PTSS \((\varSigma ,\mathcal A,R)\). The non-extensive closure \(\mathrm {Ext} :\mathcal {D}(\mathbf {T}(\varSigma )) \rightarrow \mathcal {D}(\mathbf {T}(\varSigma ))\) is defined for \(d \in \mathcal {D}(\mathbf {T}(\varSigma ))\) and \(s,t \in \mathbf {T}(\varSigma )\) by

$$ \mathrm {Ext}(d)(s,t) = {\left\{ \begin{array}{ll} \displaystyle \min \{d(s,t),\; \max _{i = 1,\dots ,\mathfrak {n}} \mathrm {Ext}(d)(s_i,t_i)\} &{} \text {if } s= f(s_1,\dots ,s_{\mathfrak {n}}), \\ {} &{} t=f(t_1,\dots ,t_{\mathfrak {n}}) \text { and } f \in \varSigma \\[0.5ex] d(s,t) &{} \text {otherwise.} \end{array}\right. } $$

In particular, we denote by \(\mathbf {Ext}^{\lambda }\) the non-extensive closure of \(\mathbf {b}^{\lambda }\).

Clearly, \(\mathbf {Ext}^{\lambda }(s,t) \le \mathbf {b}^{\lambda }(s,t)\) for all \(s,t \in \mathbf {T}(\varSigma )\). Moreover, each operator f admits the modulus of continuity for non-extensiveness wrt. \(\mathbf {Ext}^{\lambda }\). Therefore, the proof of the non-extensiveness theorem (Theorem 1 below), which states the correctness of the non-extensiveness format (Definition 11), consists in proving that under the constraints of our format the metric \(\mathbf {Ext}^{\lambda }\) is itself a bisimulation metric. As a consequence, from \(\mathbf {Ext}^{\lambda }(s,t) \le \mathbf {b}^{\lambda }(s,t)\) and \(\mathbf {b}^{\lambda }\) being the least bisimulation metric, we can infer that \(\mathbf {Ext}^{\lambda }= \mathbf {b}^{\lambda }\), and thus that \(\mathbf {b}^{\lambda }\) is non-extensive wrt. all the operators satisfying our non-extensiveness format.
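Definition 12 translates directly into a recursive computation on finite terms. A small Python sketch, with terms encoded as nested tuples \((f, t_1, \ldots , t_{\mathfrak {n}})\) and the base pseudometric d given as a function (the encoding and the example values are ours, for illustration only):

```python
def ext(d, s, t):
    # Non-extensive closure Ext(d): if s and t share the outermost
    # operator, take the minimum of d(s, t) and the maximum of the
    # closures of the corresponding arguments; otherwise just d(s, t).
    if isinstance(s, tuple) and isinstance(t, tuple) \
            and s[0] == t[0] and len(s) == len(t):
        inner = max((ext(d, si, ti) for si, ti in zip(s[1:], t[1:])),
                    default=0.0)
        return min(d(s, t), inner)
    return d(s, t)

# A hypothetical base pseudometric: the arguments 'x' and 'y' are
# close (0.3) even though d judges f(x) and f(y) to be far apart (0.9).
def d(s, t):
    if s == t:
        return 0.0
    if {s, t} == {'x', 'y'}:
        return 0.3
    if {s, t} == {('f', 'x'), ('f', 'y')}:
        return 0.9
    return 1.0
```

Here `ext(d, ('f', 'x'), ('f', 'y'))` evaluates to 0.3, witnessing that \(\mathrm {Ext}(d) \le d\) and that the closure bounds the distance of composed terms by the maximum distance of their arguments.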

Theorem 1

(Non-extensiveness). Let \(\lambda \in (0,1]\) and \(P=(\varSigma ,\mathcal A,R)\) be a PTSS in non-extensiveness format. Then the operators in \(\varSigma \) are non-extensive wrt. \(\mathbf {b}^{\lambda }\).

Proof sketch

We present only a sketch of the proof. The proofs of Theorems 2 and 3 follow the same reasoning schema.

As \(\mathbf {Ext}^{\lambda }\le \mathbf {b}^{\lambda }\) by definition and \(\mathbf {b}^{\lambda }\) is the least bisimulation metric, proving the thesis amounts to proving that \(\mathbf {Ext}^{\lambda }\) is a bisimulation metric. We need to show that

$$\begin{aligned} \begin{aligned}&\text {whenever } \mathbf {Ext}^{\lambda }(\sigma (t),\sigma '(t)) < 1 \text { and } P \vdash \sigma (t) \xrightarrow { \, {a} \, } \pi \\&\text {then } P \vdash \sigma '(t) \xrightarrow { \, {a} \, } \pi ' \text { and } \lambda \cdot {{\,\mathrm{\mathbf {K}}\,}}(\mathbf {Ext}^{\lambda })(\pi ,\pi ') \le \mathbf {Ext}^{\lambda }(\sigma (t),\sigma '(t)). \end{aligned} \end{aligned}$$
(2)

This is proved by induction over the structure of t.

For the non-trivial inductive step with \(t = f(t_1,\dots ,t_{\mathfrak {n}})\) and \(\mathbf {Ext}^{\lambda }(\sigma (t),\sigma '(t)) = \max _{i = 1,\dots ,\mathfrak {n}} \mathbf {Ext}^{\lambda }(\sigma (t_i), \sigma '(t_i))\) the proof can be sketched as follows. Assume that \(P \vdash \sigma (t) \xrightarrow { \, {a} \, } \pi \), so that there are a rule r respecting the non-extensiveness format, with conclusion \(f(x_1,\dots ,x_{\mathfrak {n}}) \xrightarrow { \, {a} \, } \sum _{j\in J}p_j \varTheta _j\), and a closed substitution \(\sigma _1\), such that \(\sigma _1(x_i) = \sigma (t_i)\) for all \(i=1,\dots ,\mathfrak {n}\) and \(\sigma _1(\sum _{j\in J}p_j \varTheta _j) = \pi \), from which we can derive such a transition.

For each \(h \in H, m \in M_h\), the inductive hypothesis gives that \(\sigma '(t_h) \xrightarrow { \, {a_{h,m}} \, } \pi _{h,m}\) and \(\lambda \cdot {{\,\mathrm{\mathbf {K}}\,}}(\mathbf {Ext}^{\lambda })(\sigma _1(\mu _{h,m}),\pi _{h,m}) \le \mathbf {Ext}^{\lambda }(\sigma (t_h),\sigma '(t_h))\). For the closed substitution \(\sigma _2\) with \(\sigma _2(x_i) = \sigma '(t_i)\) for all \(i =1,\dots ,\mathfrak {n}\) and \(\sigma _2(\mu _{h,m}) = \pi _{h,m}\) for all \(h \in H, m \in M_h\), we get that \(\{\sigma _2(x_h) \xrightarrow { \, {a_{h,m}} \, } \sigma _2(\mu _{h,m}) \mid h \in H, m \in M_h\}\) is provable from P. Moreover, by the inductive hypothesis we can infer that \(\sigma '(t_h) \not \xrightarrow { \, {a_{h,n}} \, }\) for all \(h \in H, n \in N_h\), so that, for \(\sigma _2\) defined as above, the set of negative premises is provable from P. Hence, all the premises of the closed instance of r with respect to \(\sigma _2\) are provable from P, thus implying that \(\sigma '(t) \xrightarrow { \, {a} \, } \sigma _2(\sum _{j \in J}p_j\varTheta _j)\) is provable from P.

Then, \(\lambda \cdot {{\,\mathrm{\mathbf {K}}\,}}(\mathbf {Ext}^{\lambda })(\sum _{j \in J}p_j\sigma _1(\varTheta _j), \sum _{j \in J} p_j \sigma _2(\varTheta _j)) \le \mathbf {Ext}^{\lambda }(\sigma (t),\sigma '(t))\) follows by: 1. the convexity properties of the Kantorovich metric, 2. the form of the \(\varTheta _j\) in the target of a non-extensive rule, and 3. the inductive hypothesis.    \(\square \)

Table 1. Non-extensive process algebra operators. \(\mu \oplus _p \nu \) stands for \(p \cdot \mu + (1-p) \cdot \nu \).

By applying Theorem 1, we get the non-extensiveness of all operators in Table 1.

Corollary 1

The operators defined by the PGSOS rules in Table 1 are non-extensive.

Since we already know from [16] that none of the operators in Tables 2 and 3 is non-extensive, we can conclude that our format is not too restrictive, meaning that no non-extensive operator among those used in standard probabilistic process algebras is out of the format. Moreover, we can also infer that the constraints of the format cannot be relaxed in any immediate way.

5 The Non-expansiveness Format

In this section we focus on the non-expansiveness property. We show that by relaxing the constraints of the non-extensiveness format we obtain a specification format for non-expansiveness. In detail, we allow the target of each rule to be a convex combination of distribution terms in which non-constant operators can also appear, applied to source variables and to their derivatives, provided that, for each source variable, at most one occurrence of that variable or of its derivatives occurs.

Definition 13

(Specification format for non-expansiveness). Assume a PTSS \(P_1 = (\varSigma _1,\mathcal A,R_1)\) in non-extensiveness format. A PTSS \(P_2 = (\varSigma _2,\mathcal A,R_2)\) with \(P_1 \sqsubseteq P_2\) is in non-expansiveness format if all rules \(r \in R_2 \setminus R_1\) are of the form

where:

  1. the target \(\varTheta \) is of the form \(\varTheta = \sum _{j \in J} p_j \varTheta _j\),

  2. for all \(j \in J\) and \(h \in H\), at most one among the variables in \(\{x_h\} \cup \{\mu _{h,m} \mid m \in M_h\}\) occurs in \(\varTheta _j\), and it occurs at most once, and

  3. for all \(j \in J\) and \(i \in \{1,\dots ,\mathfrak {n}\} \setminus H\), the variable \(x_i\) occurs at most once in \(\varTheta _j\).

Moreover, all operators defined by rules in \(R_2 \setminus R_1\) are called \(P_2\)-non-expansive.

Since in each \(\varTheta _j\) we can have at most one occurrence of a source variable \(x_h\) or of one of its derivatives \(\{\mu _{h,m} \mid m \in M_h\}\), we can infer that \(\varTheta _j\) contributes to \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}}))\) by, at most, \(\sum _{i =1}^{\mathfrak {n}}\mathbf {b}^{\lambda }(s_i,t_i)\). Consequently, we obtain \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) \le \sum _{j \in J} p_j \sum _{i=1}^{\mathfrak {n}}\mathbf {b}^{\lambda }(s_i,t_i) = \sum _{i=1}^{\mathfrak {n}} \mathbf {b}^{\lambda }(s_i,t_i)\).

Notice that all non-extensive operators are also non-expansive. Further, all operators defined by rules in Table 2, as well as those in Table 1, respect the non-expansiveness format. Conversely, the operators in Table 3, which are known to be expansive [16], have at least one rule falling out of the format. For instance, the rule for the bang operator !x, which stands for the parallel composition of infinitely many copies of the process argument, violates Definition 13 since both the source variable x and its derivative \(\mu \) occur in the same distribution term in the target.

As in the case of non-extensiveness, the formal proof of the correctness of the non-expansiveness format is based on a suitable notion of closure of a pseudometric. We define the non-expansive closure of a pseudometric as its lifting to a pseudometric which is non-extensive on operators satisfying Definition 11, and non-expansive on the others.

Definition 14

(Non-expansive closure). Assume the PTSSs \(P_1 = (\varSigma _1,\mathcal A,R_1)\) in non-extensiveness format and \(P_2 = (\varSigma _2,\mathcal A,R_2)\) with \(P_1 \sqsubseteq P_2\). The non-expansive closure \(\mathrm {Exp} :\mathcal {D}(\mathbf {T}(\varSigma _2)) \rightarrow \mathcal {D}(\mathbf {T}(\varSigma _2))\) is defined for \(d \in \mathcal {D}(\mathbf {T}(\varSigma _2))\) and \(s,t \in \mathbf {T}(\varSigma _2)\) by

$$ \mathrm {Exp}(d)(s,t) = {\left\{ \begin{array}{ll} \displaystyle \min \{d(s,t),\; \max _{i = 1,\dots ,\mathfrak {n}} \mathrm {Exp}(d)(s_i,t_i)\} &{} \text {if } s= f(s_1,\dots ,s_{\mathfrak {n}}), \\ {} &{} t=f(t_1,\dots ,t_{\mathfrak {n}}),\; f \in \varSigma _1 \\ \displaystyle \min \{d(s,t),\; \sum _{i = 1}^{\mathfrak {n}} \mathrm {Exp}(d)(s_i,t_i)\} &{} \text {if } s= f(s_1,\dots ,s_{\mathfrak {n}}), \\ {} &{} t=f(t_1,\dots ,t_{\mathfrak {n}}),\; f \in \varSigma _2 \setminus \varSigma _1 \\ d(s,t) &{} \text {otherwise.} \end{array}\right. } $$

In particular we denote by \(\mathbf {Exp}^{\lambda }\) the non-expansive closure of \(\mathbf {b}^{\lambda }\).
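The recursion in Definition 14 can be illustrated with a minimal sketch on a toy term encoding, assuming terms are either leaf strings or tuples \((\mathrm {op}, \mathrm {arg}_1, \ldots )\); the operator names and the base pseudometric below are hypothetical placeholders, not part of the formal development.

```python
# Minimal sketch of the non-expansive closure of Definition 14 on a toy term
# encoding. SIGMA1 holds the (assumed non-extensive) operators of Sigma_1;
# SIGMA2_ONLY those of Sigma_2 minus Sigma_1. All names are illustrative.
SIGMA1 = {"plus"}
SIGMA2_ONLY = {"par"}

def exp_closure(d, s, t):
    """Exp(d)(s, t): cap d(s, t) by the max (Sigma_1 case) or the sum
    (Sigma_2 minus Sigma_1 case) of the closures of pairwise subterm
    distances; on non-matching terms, fall back to d(s, t)."""
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t) and len(s) > 1):
        subs = [exp_closure(d, si, ti) for si, ti in zip(s[1:], t[1:])]
        if s[0] in SIGMA1:
            return min(d(s, t), max(subs))
        if s[0] in SIGMA2_ONLY:
            return min(d(s, t), sum(subs))
    return d(s, t)
```

For instance, with a base pseudometric assigning 0.3 to the pair of leaves `"a"`, `"b"` and 1 to every other distinct pair, `exp_closure` caps the distance of `("par", "a", "a")` and `("par", "b", "a")` at 0.3.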

We can then show that under the constraints of Definition 13, the non-expansive closure \(\mathbf {Exp}^{\lambda }\) of \(\mathbf {b}^{\lambda }\) is indeed a bisimulation metric. Since by definition \(\mathbf {Exp}^{\lambda }(s,t) \le \mathbf {b}^{\lambda }(s,t)\) for all \(s,t \in \mathbf {T}(\varSigma )\) and \(\mathbf {b}^{\lambda }\) is the least bisimulation metric, we can conclude that \(\mathbf {Exp}^{\lambda }= \mathbf {b}^{\lambda }\) and the operators in a PTSS in non-expansiveness format are non-expansive with respect to \(\mathbf {b}^{\lambda }\), as stated in Theorem 2 below.

Theorem 2

(Non-expansiveness). Let \(\lambda \in (0,1]\) and \((\varSigma ,\mathcal A,R)\) be a PTSS in non-expansiveness format. Then the operators in \(\varSigma \) are non-expansive with respect to \(\mathbf {b}^{\lambda }\).

By applying Theorem 2 we get the non-expansiveness of all operators in Table 2.

Corollary 2

The operators defined by the PGSOS rules in Tables 1 and 2 are non-expansive with respect to \(\mathbf {b}^{\lambda }\).

Since we already know from [16] that none of the operators in Table 3 is non-expansive, we can conclude that our format is not too restrictive, meaning that no non-expansive operator among those used in standard probabilistic process algebras is out of the format. Moreover, the following examples show that the constraints in Definition 13 cannot be relaxed in any trivial way.

Example 2

(Counterexample for Definition 13.2, I). Consider (i) \(s_1 = a.([\nicefrac {1}{2}] u_1 \oplus [\nicefrac {1}{2}] u_2)\); (ii) \(t_1 = a.([\nicefrac {1}{3}] u_1 \oplus [\nicefrac {1}{3}] u_2 \oplus [\nicefrac {1}{3}] u_3)\); (iii) \(u_1 = b.\eta \); (iv) \(u_2 = c.\eta \); (v) \(u_3 = u_1 + u_2\). Clearly we have \(\mathbf {b}^{\lambda }(s_1,t_1) = \nicefrac {1}{3} \cdot \lambda \), due to \(u_3\). Let \(\mathbin {\Vert }_{}\) denote \(\mathbin {\Vert }_{\emptyset }\). Consider the operator f defined by the following PGSOS rule

$$\begin{aligned} \frac{\displaystyle x \xrightarrow { \, {a} \, } \mu }{\displaystyle f(x) \xrightarrow { \, {a} \, } \delta _x \mathbin {\Vert }_{} \mu } \end{aligned}$$

which violates the non-expansiveness format of Definition 13 since both x and \(\mu \) occur in the target. Let us evaluate \(\mathbf {b}^{\lambda }(f(s_1),\, f(t_1))\). Let \(\pi _{s_1} = \nicefrac {1}{2} \delta (s_1 \mathbin {\Vert }_{} u_1) + \nicefrac {1}{2} \delta (s_1 \mathbin {\Vert }_{} u_2)\) and \(\pi _{t_1} = \nicefrac {1}{3} \delta (t_1 \mathbin {\Vert }_{} u_1) + \nicefrac {1}{3} \delta (t_1 \mathbin {\Vert }_{} u_2) + \nicefrac {1}{3} \delta (t_1 \mathbin {\Vert }_{} u_3)\). We have \(f(s_1) \xrightarrow { \, {a} \, } \pi _{s_1}\), \(f(t_1) \xrightarrow { \, {a} \, } \pi _{t_1}\) and \(\mathbf {b}^{\lambda }(f(s_1),f(t_1)) \ge \lambda \cdot {{\,\mathrm{\mathbf {K}}\,}}(\mathbf {b}^{\lambda })(\pi _{s_1},\pi _{t_1})\). Then \(\mathbf {b}^{\lambda }(s_1 \mathbin {\Vert }_{}u_1,\, t_1 \mathbin {\Vert }_{} u_3) = 1 =\mathbf {b}^{\lambda }(s_1 \mathbin {\Vert }_{} u_2,\, t_1 \mathbin {\Vert }_{} u_3)\), whereas \(\mathbf {b}^{\lambda }(s_1 \mathbin {\Vert }_{} u_1,\, t_1 \mathbin {\Vert }_{} u_1) = \mathbf {b}^{\lambda }(s_1 \mathbin {\Vert }_{} u_2,\, t_1 \mathbin {\Vert }_{} u_2) = \mathbf {b}^{\lambda }(s_1,t_1)\). Hence, \(\mathbf {b}^{\lambda }(f(s_1),\, f(t_1)) = \nicefrac {1}{3}\cdot \lambda + \nicefrac {2}{9} \cdot \lambda ^2 > \mathbf {b}^{\lambda }(s_1,t_1)\).
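The last value can be obtained by evaluating the Kantorovich distance on the optimal coupling, which matches the \(u_1\)- and \(u_2\)-components pairwise (mass \(\nicefrac {1}{3}\) each, at distance \(\mathbf {b}^{\lambda }(s_1,t_1) = \nicefrac {\lambda }{3}\)) and routes the residual mass \(\nicefrac {1}{3}\) of \(\pi _{s_1}\) onto \(t_1 \mathbin {\Vert }_{} u_3\) at distance 1:

$$ \lambda \cdot {{\,\mathrm{\mathbf {K}}\,}}(\mathbf {b}^{\lambda })(\pi _{s_1},\pi _{t_1}) = \lambda \Big ( \tfrac{1}{3} \cdot \tfrac{\lambda }{3} + \tfrac{1}{3} \cdot \tfrac{\lambda }{3} + \tfrac{1}{3} \cdot 1 \Big ) = \tfrac{1}{3}\cdot \lambda + \tfrac{2}{9} \cdot \lambda ^2. $$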

Example 3

(Counterexample for Definition 13.2, II). Consider process \(u_1\) from Example 2 and processes \(s_2 = a.([\nicefrac {1}{2}] u_1 \oplus [\nicefrac {1}{2}] \eta )\) and \(t_2 = a.u_1\). Clearly, \(\mathbf {b}^{\lambda }(s_2,t_2) = \nicefrac {1}{2}\cdot \lambda \). Consider the following PGSOS rule defining the behavior of operator g

$$\begin{aligned} \frac{\displaystyle x \xrightarrow { \, {a} \, } \mu }{\displaystyle g(x) \xrightarrow { \, {a} \, } \mu \mathbin {\Vert }_{\mathcal A} \mu } \end{aligned}$$

which violates the non-expansiveness format as variable \(\mu \) occurs in the right-hand side of the premise and twice in the target. Let us evaluate \(\mathbf {b}^{\lambda }(g(s_2),g(t_2))\). We have that \(g(s_2) \xrightarrow { \, {a} \, } \nicefrac {1}{4} \delta (u_1 \mathbin {\Vert }_{\mathcal A} u_1) + \nicefrac {1}{4} \delta (\eta \mathbin {\Vert }_{\mathcal A} \eta ) + \nicefrac {1}{4} \delta (u_1 \mathbin {\Vert }_{\mathcal A} \eta ) + \nicefrac {1}{4} \delta (\eta \mathbin {\Vert }_{\mathcal A} u_1)\). Conversely, \(g(t_2) \xrightarrow { \, {a} \, } \delta (u_1 \mathbin {\Vert }_{\mathcal A} u_1)\). Therefore, since \(\mathbin {\Vert }_{\mathcal A}\) corresponds to the fully synchronous parallel composition, we can directly infer that \(\mathbf {b}^{\lambda }(g(s_2),g(t_2)) \ge \nicefrac {3}{4} \cdot \lambda > \mathbf {b}^{\lambda }(s_2,t_2)\).
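The bound can also be checked numerically: since \(g(t_2)\) reaches a Dirac distribution, the Kantorovich distance collapses to the expected distance from its single support point. A minimal sketch, with illustrative state names, an arbitrary choice of \(\lambda \), and the worst-case distance 1 between non-matching derivative states:

```python
# K(d)(pi, delta(t0)) = sum_s pi(s) * d(s, t0): against a Dirac distribution
# the Kantorovich lifting is just an expected distance. Names are illustrative.
def kantorovich_to_dirac(pi, t0, d):
    return sum(p * d(s, t0) for s, p in pi.items())

lam = 0.9  # an arbitrary discount factor in (0, 1]
d = lambda s, t: 0.0 if s == t else 1.0  # worst-case distances on derivatives
pi_s2 = {"u1||u1": 0.25, "eta||eta": 0.25, "u1||eta": 0.25, "eta||u1": 0.25}
lower_bound = lam * kantorovich_to_dirac(pi_s2, "u1||u1", d)  # = 3/4 * lam
```

With these assumptions the bound evaluates to \(\nicefrac {3}{4} \cdot \lambda \), exceeding \(\mathbf {b}^{\lambda }(s_2,t_2) = \nicefrac {1}{2} \cdot \lambda \).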

Example 4

(Counterexample for Definition 13.3). Consider process \(u_1\) from Example 2 and processes \(s_2,t_2\) from Example 3. Consider the following PGSOS rule defining the behavior of operator h

$$\begin{aligned} \frac{\displaystyle x \xrightarrow { \, {b} \, } \mu }{\displaystyle h(x,y) \xrightarrow { \, {b} \, } \delta _y \mathbin {\Vert }_{\mathcal A} \delta _y} \end{aligned}$$

which violates the non-expansiveness format in Definition 13 as variable y occurs twice in the target. From similar calculations to those applied in Example 3, we can infer that \(\mathbf {b}^{\lambda }(h(u_1,s_2),h(u_1,t_2)) \ge \nicefrac {3}{4}\cdot \lambda ^2\), which is greater than \(\mathbf {b}^{\lambda }(s_2,t_2)\) for \(\lambda > \nicefrac {2}{3}\).

Table 2. Non-expansive process algebra operators. The symmetric version of the first rule for \(\mathbin {\Vert }_{B}\) and the one for \(\mid \mid \mid _p\) have been omitted.

6 The Lipschitz Continuity Format

We dedicate this section to the definition of a specification format for Lipschitz continuity. This is obtained by relaxing the constraints of the non-expansiveness format in Definition 13 as follows: firstly, for each operator f we fix \(n_f,m_f \in \mathbb {N}\). Then, we allow the targets of the rules for f to be convex combinations of distribution terms in which (i) the number of occurrences of source variables and their derivatives is bounded by \(n_f\); (ii) the number of nested operators in which a source variable can occur is bounded by \(m_f\); (iii) derivatives can only occur in the scope of non-expansive operators.

Definition 15

(Specification format for Lipschitz continuity). Assume a PTSS \(P_1 = (\varSigma _1,\mathcal A,R_1)\) in non-expansiveness format. A PTSS \(P_2 = (\varSigma _2,\mathcal A,R_2)\) with \(P_1 \sqsubseteq P_2\) is in Lipschitz continuity format if for all operators \(f \in \varSigma _2 \setminus \varSigma _1\) there are naturals \(n_f, m_f\) and all rules \(r \in R_2 \setminus R_1\) are of the form

where:

  1. the target \(\varTheta \) is of the form \(\varTheta = \sum _{j \in J} p_j \varTheta _j\),

  2. for all \(j \in J\) and \(i \in \{1,\dots ,\mathfrak {n}\}\), \(x_i\) occurs at most \(n_f\) times in \(\varTheta _j\) and such occurrences appear in the scope of at most \(m_f\) operators, and

  3. for all \(j \in J\) and \(h \in H\), there are at most \(n_f\) occurrences of variables in \(\{\mu _{h,m} \mid m \in M_h\}\) in \(\varTheta _j\), and they appear only in the scope of non-expansive operators in \(\varSigma _1\).

Moreover, all operators defined by rules in \(R_2 \setminus R_1\) are called \(P_2\)-Lipschitz.

Informally, for each \(h \in H\), bounding to \(n_f\) the number of occurrences of derivatives \(\{\mu _{h,m} \mid m \in M_h\}\) that may occur in \(\varTheta _j\) and requiring them in the scope of non-expansive operators, guarantees that process argument \(x_h\) and \(\varTheta _j\) contribute to \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}}))\) by, at most, \(n_f \cdot \mathbf {b}^{\lambda }(s_h,t_h)\). Considering instead source variables, constraint 2 of Definition 15 guarantees that for each \(i=1,\dots ,\mathfrak {n}\) there is a constant \(L_i \ge 1\) such that process argument \(x_i\) and \(\varTheta _j\) contribute to distance \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}}))\) by, at most, \(\lambda \cdot n_f \cdot L_i^{m_f} \cdot \mathbf {b}^{\lambda }(s_i,t_i)\), where \(L_i\) is the maximal Lipschitz constant among those of operators applied on top of \(x_i\), and factor \(\lambda \) arises since \(x_i\) is delayed by one computation step. Therefore, we can infer that for each \(j \in J\) there is a constant \(L_j \ge n_f\) such that \(\varTheta _j\) contributes to \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}}))\) by, at most, \(L_j \sum _{i = 1}^{\mathfrak {n}} \mathbf {b}^{\lambda }(s_i,t_i)\). Finally, we obtain \(\mathbf {b}^{\lambda }(f(s_1,\ldots ,s_{\mathfrak {n}}),f(t_1,\ldots ,t_{\mathfrak {n}})) \le \sum _{j \in J} p_j L_j \cdot \sum _{i=1}^{\mathfrak {n}}\mathbf {b}^{\lambda }(s_i,t_i) = L\cdot \sum _{i=1}^{\mathfrak {n}} \mathbf {b}^{\lambda }(s_i,t_i)\) for some constant \(L \ge 1\).

Notice that the rules in Table 3, as those in Tables 1 and 2, respect the constraints in Definition 15. We observe that Table 3 contains the specification of standard recursive operators \(x^{\omega }\), !x and \(x^*y\) which were ruled out by Definitions 11 and 13.

Next, we formalize the definition of a Lipschitz factor for an operator f, namely the constant \(L_f\) wrt. which f is Lipschitz continuous, and its construction over the rules for f and their targets. Since this construction is recursive for recursive operators, in order to have well-defined Lipschitz factors for these operators we need a discount factor strictly less than one.

Definition 16

(Lipschitz factor). Assume \(\lambda \in (0,1)\), a PTSS \(P_1 = (\varSigma _1,\mathcal A,R_1)\) in non-extensiveness format and a PTSS \(P_2 = (\varSigma _2,\mathcal A,R_2)\) with \(P_1 \sqsubseteq P_2\). For each operator \(f \in \varSigma _2 \setminus \varSigma _1\) let \(R_f \subseteq R_2\) denote the set of rules defining the behavior of f. Then, the set of Lipschitz factors of operators in \(\varSigma _2 \setminus \varSigma _1\) is defined as the set of minimum reals \(L_f \ge 1\) satisfying the system of inequalities of the form

$$ L_f \ge \max _{r \in R_f}\;\max _{x \in \mathrm {src}(r)}\Big \{ \ell _{\mathrm {trg}(r)}(x) + \sum _{\mu \in \mathrm {der}(x,r)} \ell _{\mathrm {trg}(r)}(\mu ) \Big \} $$

for each \(f \in \varSigma _2 \setminus \varSigma _1\) where, for any distribution term \(\varTheta \in \mathbb {DT}(\varSigma _2)\) in normal form, the factor \(\ell _{\varTheta }\) is defined inductively as follows

$$\begin{aligned}&\ell _{\varTheta }(x) = {\left\{ \begin{array}{ll} 0 &{} \text {if } x \not \in \mathrm {var}(\varTheta ) \\ \lambda &{} \text {if } \varTheta = \delta _x \\ \max _{i = 1,\dots ,\mathfrak {n}} \ell _{\varTheta _i}(x) &{} \text {if } \varTheta = f(\varTheta _1,\dots ,\varTheta _{\mathfrak {n}}) \text { and } f \in \varSigma _1 \\ L_f \sum _{i = 1}^{\mathfrak {n}} \ell _{\varTheta _i}(x) &{} \text {if } \varTheta = f(\varTheta _1,\dots ,\varTheta _{\mathfrak {n}}) \text { and } f \in \varSigma _2\setminus \varSigma _1\\ \sum _{j \in J}p_j \ell _{\varTheta _j}(x) &{} \text {if } \varTheta = \sum _{j \in J} p_j\varTheta _j \end{array}\right. } \\&\ell _{\varTheta }(\mu ) = {\left\{ \begin{array}{ll} 0 &{} \text {if } \mu \not \in \mathrm {var}(\varTheta ) \\ 1 &{} \text {if } \varTheta = \mu \\ \max _{i = 1,\dots ,\mathfrak {n}} \ell _{\varTheta _i}(\mu ) &{} \text {if } \varTheta = f(\varTheta _1,\dots ,\varTheta _{\mathfrak {n}}) \text { and } f \in \varSigma _1 \\ L_f \sum _{i = 1}^{\mathfrak {n}} \ell _{\varTheta _i}(\mu ) &{} \text {if } \varTheta = f(\varTheta _1,\dots ,\varTheta _{\mathfrak {n}}) \text { and } f \in \varSigma _2\setminus \varSigma _1\\ \sum _{j \in J}p_j \ell _{\varTheta _j}(\mu ) &{} \text {if } \varTheta = \sum _{j \in J} p_j\varTheta _j \end{array}\right. } \end{aligned}$$
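The inductive clauses for \(\ell _{\varTheta }(x)\) can be sketched, for illustration only, on a toy encoding of distribution terms in normal form; the sketch covers only the source-variable case \(\ell _{\varTheta }(x)\) (the case \(\ell _{\varTheta }(\mu )\) is analogous, with base value 1), and the value of \(\lambda \), the operator names and the Lipschitz-factor table are assumptions of the example.

```python
# Minimal sketch of the factor l_Theta(x) of Definition 16 (source-variable
# case only) on a toy encoding: ("delta", x) for the Dirac delta_x,
# ("op", f, Theta_1, ...) for operator application, and
# ("sum", [(p_1, Theta_1), ...]) for convex combinations.
lam = 0.5                      # illustrative discount factor in (0, 1)
SIGMA1 = {"plus"}              # non-extensive operators: take the max
L_FACTORS = {"par2": 2.0}      # assumed factors for Sigma_2 minus Sigma_1 ops

def ell(theta, x):
    kind = theta[0]
    if kind == "delta":                         # lambda if Theta = delta_x, else 0
        return lam if theta[1] == x else 0.0
    if kind == "op":                            # max over Sigma_1, L_f * sum otherwise
        subs = [ell(t, x) for t in theta[2:]]
        return max(subs) if theta[1] in SIGMA1 else L_FACTORS[theta[1]] * sum(subs)
    if kind == "sum":                           # convex combination of the factors
        return sum(p * ell(t, x) for p, t in theta[1])
```

For example, for \(\varTheta = \mathrm {par2}(\delta _x, \delta _x)\) this yields \(L_{\mathrm {par2}} \cdot (\lambda + \lambda ) = 2.0\), matching the clause \(L_f \sum _{i} \ell _{\varTheta _i}(x)\).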

The following proposition states that Lipschitz factors are well-defined if the PTSS is in Lipschitz format and \(\lambda \in (0,1)\). Moreover, \(L_f = 1\) if f is non-expansive.

Proposition 2

Let \(\lambda \in (0,1)\). Assume a PTSS \(P_1 = (\varSigma _1,\mathcal A,R_1)\) in non-extensiveness format and an extension \(P_1 \sqsubseteq P_2 = (\varSigma _2,\mathcal A,R_2)\) in Lipschitz format. Then:

  • for each operator \(f \in \varSigma _2\), the Lipschitz factor \(L_f < + \infty \) is well-defined.

  • moreover, if \(f \in \varSigma _2\) is non-expansive then \(L_f=1\).

The following two examples show that if \(P_2\) falls out of the Lipschitz continuity format, then Proposition 2 does not hold.

Example 5

Consider operators f and g specified by the following rules for all \(k\in \mathbb {N}\):

$$\begin{aligned} \frac{\displaystyle x \xrightarrow { \, {a_k} \, } \mu }{\displaystyle f(x) \xrightarrow { \, {a_k} \, } \underbrace{\mu \mathbin {\Vert }_{\mathcal A} \ldots \mathbin {\Vert }_{\mathcal A} \mu }_{k-\text {times}}} \quad \quad \frac{\displaystyle }{\displaystyle g(x) \xrightarrow { \, {a_k} \, } \delta (\underbrace{h(\ldots h(}_{k-\text {times}}x)))} \quad \quad \frac{\displaystyle x \xrightarrow { \, {a_k} \, } \mu }{\displaystyle h(x) \xrightarrow { \, {a_k} \, } \mu \mathbin {\Vert }_{\mathcal A} \mu } \end{aligned}$$

Notice that the rules for both f and g are out of the Lipschitz format since we cannot define any \(n_f\) and \(m_g\) according to Definition 15. Then, the inductive definition of Lipschitz factors in Definition 16 gives that \(L_{\mathbin {\Vert }_{\mathcal A}} = 1\), \(L_h = 2\) and, then, \(L_f \ge k\) and \(L_g \ge \lambda \cdot 2^k\) for all \(k \in \mathbb {N}\), thus implying that \(L_f=L_g=+\infty \).

Example 6

Consider the copy operator [5] defined by the PGSOS rules below

$$\begin{aligned} \frac{\displaystyle x \xrightarrow { \, {a} \, } \mu \quad a \not \in \{\mathrm {l},\mathrm {r}\}}{\displaystyle \mathrm {cp}(x) \xrightarrow { \, {a} \, } \mu } \qquad \frac{\displaystyle x \xrightarrow { \, {\mathrm {l}} \, } \mu \quad x \xrightarrow { \, {\mathrm {r}} \, } \nu }{\displaystyle \mathrm {cp}(x) \xrightarrow { \, {\mathrm {s}} \, } \mathrm {cp}(\mu ) \mathbin {\Vert }_{\mathcal A} \mathrm {cp}(\nu )}. \end{aligned}$$

Firstly, notice that the second rule violates the non-expansiveness format since we have two occurrences of derivatives of x in the same distribution term (Definition 13.2). Consequently, it also violates the Lipschitz continuity format in that \(\mu \) and \(\nu \) occur in the scope of an expansive operator (Definition 15.3). By Definition 16 we would have \(L_{\mathrm {cp}} \ge L_{\mathbin {\Vert }_{\mathcal A}}\cdot L_{\mathrm {cp}} + L_{\mathbin {\Vert }_{\mathcal A}}\cdot L_{\mathrm {cp}} = 2 \cdot L_{\mathrm {cp}}\) which is satisfied only by \(L_{\mathrm {cp}} = +\infty \).

Then, the following example shows that, due to the presence of recursive operators, we must require \(\lambda \) to be strictly less than one to guarantee the validity of Proposition 2.

Example 7

Consider the rule for the bang operator \(!\_\) in Table 3. By Definition 16 we obtain that \(L_{!} \ge \lambda \cdot L_{!} + 1\), thus giving \(L_{!} \ge \frac{1}{1-\lambda }\), which admits no finite solution for \(\lambda = 1\).
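The least solution of this inequality can be checked numerically by Kleene-style iteration from \(L = 1\); the chosen values of \(\lambda \) and the iteration bound are illustrative.

```python
# Numeric check of Example 7: the least L >= 1 satisfying L >= lam*L + 1,
# computed by iterating L -> max(1, lam*L + 1), matches the closed form
# 1/(1-lam) for lam in (0, 1) and diverges as lam approaches 1.
def bang_factor(lam, iters=10_000):
    L = 1.0
    for _ in range(iters):
        L = max(1.0, lam * L + 1.0)
    return L
```

For instance, `bang_factor(0.5)` converges to 2 and `bang_factor(0.9)` to 10, in accordance with \(\frac{1}{1-\lambda }\).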

We remark that by restricting to non-recursive operators, such as \(x^n\) and \(!^n\) in Table 3, Proposition 2 would hold also for \(\lambda = 1\).

As one can expect, the proof of the correctness of our Lipschitz continuity format is based on the notion of Lipschitz closure of a pseudometric, which lifts it to a pseudometric that is non-extensive on non-extensive operators, non-expansive on non-expansive operators and Lipschitz continuous on Lipschitz operators.

Definition 17

(Lipschitz closure). Let \(\lambda \in (0,1)\). Assume a PTSS \(P_1 = (\varSigma _1,\mathcal A,R_1)\) in non-extensiveness format and a PTSS \(P_2 = (\varSigma _2,\mathcal A,R_2)\) with \(P_1 \sqsubseteq P_2\). Then, the Lipschitz closure \(\mathrm {Lip} :\mathcal {D}(\mathbf {T}(\varSigma _2)) \rightarrow \mathcal {D}(\mathbf {T}(\varSigma _2))\) is defined for all \(d\in \mathcal {D}(\mathbf {T}(\varSigma _2))\) and \(s,t \in \mathbf {T}(\varSigma _2)\) by

$$ \mathrm {Lip}(d)(s,t) = {\left\{ \begin{array}{ll} \displaystyle \min \{d(s,t),\; \max _{i = 1,\dots ,\mathfrak {n}} \mathrm {Lip}(d)(s_i,t_i)\} &{} \text {if } s= f(s_1,\dots ,s_{\mathfrak {n}}), \\ {} &{} t=f(t_1,\dots ,t_{\mathfrak {n}}),\; f \in \varSigma _1 \\ \displaystyle \min \{d(s,t),\; L_f \cdot \sum _{i = 1}^{\mathfrak {n}} \mathrm {Lip}(d)(s_i,t_i)\} &{} \text {if } s= f(s_1,\dots ,s_{\mathfrak {n}}), \\ {} &{} t=f(t_1,\dots ,t_{\mathfrak {n}}),\; f \in \varSigma _2 \setminus \varSigma _1 \\ d(s,t) &{} \text {otherwise.} \end{array}\right. } $$

In particular we denote by \(\mathbf {Lip}^{\lambda }\) the Lipschitz closure of \(\mathbf {b}^{\lambda }\).

We can show that on a PTSS respecting Definition 15 the Lipschitz closure \(\mathbf {Lip}^{\lambda }\) of \(\mathbf {b}^{\lambda }\) is a bisimulation metric and thus the operators in the PTSS are Lipschitz continuous wrt. \(\mathbf {b}^{\lambda }\). In particular, the constraints of the Lipschitz continuity format combined with the definition of the Lipschitz factor \(L_f\) of operator f (Definition 16) guarantee that f is \(L_f\)-Lipschitz wrt. \(\mathbf {b}^{\lambda }\).

Theorem 3

(Lipschitz continuity). Let \((\varSigma ,\mathcal A,R)\) be a PTSS in Lipschitz continuity format and let \(\lambda \in (0,1)\). Then the operators in \(\varSigma \) are Lipschitz continuous wrt. \(\mathbf {b}^{\lambda }\). In particular, each Lipschitz operator f is \(L_f\)-Lipschitz wrt. \(\mathbf {b}^{\lambda }\).

As an application of Theorem 3, we obtain that \(\mathbf {b}^{\lambda }\) is compositional wrt. all the operators defined in Table 3 in the sense of Lipschitz continuity.

Corollary 3

Let \(\lambda \in (0,1)\). The operators defined by the PGSOS rules in Tables 1, 2 and 3 are Lipschitz continuous with respect to \(\mathbf {b}^{\lambda }\).

Again, we remark that in Corollary 3 the condition that \(\lambda \) be strictly less than one is required by the operators \(x^\omega \), !x and \(x^*y\).

Since all operators that have been proved in [16] to be Lipschitz continuous are in Table 3, we can infer that our format is not too restrictive, as it does not exclude any known Lipschitz continuous operator. Moreover, the following two examples show that the constraints of Definition 15 cannot be trivially relaxed.

Table 3. Lipschitz continuous process algebra operators. We recall that \(\mathbin {\Vert }_{}\) denotes \(\mathbin {\Vert }_{\emptyset }\).

Example 8

(Counterexample for bounded number of occurrences). Assume \(\mathcal A= \{a_k \mid k \in \mathbb {N}\}\). Consider the operators f and g as in Example 5, where we already noticed that the rules defining them violate Definition 15. Let \(\varepsilon \in (0,1)\). Following [19], consider processes \(u_k = a_k.a_k.\mathrm {nil}\) and \(v_k = a_k.([1-\varepsilon ]a_k.\mathrm {nil}\oplus [\varepsilon ]\mathrm {nil})\) with \(k \in \mathbb {N}\). Clearly, \(\mathbf {b}^{\lambda }(u_k,v_k) = \lambda \varepsilon \) for all \(k \in \mathbb {N}\).

Consider first operator f. We have \(f(u_k) \xrightarrow { \, {a_k} \, } \pi _{u,k}\) and \(f(v_k) \xrightarrow { \, {a_k} \, } \pi _{v,k}\), with \(\pi _{u,k} (a_k.\mathrm {nil}\mathbin {\Vert }_{\mathcal A} \ldots \mathbin {\Vert }_{\mathcal A} a_k.\mathrm {nil}) = 1\), \(\pi _{v,k} (a_k.\mathrm {nil}\mathbin {\Vert }_{\mathcal A} \ldots \mathbin {\Vert }_{\mathcal A} a_k.\mathrm {nil}) = (1-\varepsilon )^k\) and \(\pi _{v,k}\) giving the remaining probability \(1-(1-\varepsilon )^k\) to processes that cannot move. We get \(\mathbf {b}^{\lambda }(f(u_k),f(v_k)) = \lambda (1-(1-\varepsilon )^k)\). Hence, \(\sup _{k \in \mathbb {N}}\mathbf {b}^{\lambda }(f(u_k),f(v_k)) = \sup _{k \in \mathbb {N}} \lambda (1-(1-\varepsilon )^k) = \lambda \), thus giving that the distance between f-composed processes is bounded by \(\mathfrak {m}(\varepsilon )=\lambda \) if \(\varepsilon >0\) and \(\mathfrak {m}(0)=0\), which is not a modulus of continuity since it is not continuous at 0. Therefore, f is not uniformly continuous according to Definition 10.
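The divergence of the sup from the component distance can be illustrated numerically; the chosen values of \(\lambda \) and \(\varepsilon \), and the finite range standing in for the sup over k, are assumptions of the sketch.

```python
# Numeric illustration for Example 8: for any fixed eps > 0 the distance
# lam * (1 - (1 - eps)**k) tends to lam as k grows, so the sup over k is
# lam regardless of how small eps is, while b(u_k, v_k) = lam * eps.
lam, eps = 0.8, 0.01
dist = lambda k: lam * (1 - (1 - eps) ** k)
sup = max(dist(k) for k in range(1, 3001))  # numerically indistinguishable from lam
```

Hence any candidate modulus of continuity would have to jump from 0 to \(\lambda \) at \(\varepsilon = 0\), contradicting continuity at 0.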

Consider now operator g. We have \(g(u_k) \xrightarrow { \, {a_k} \, } \delta (h(\ldots h(u_k)\ldots ))\) (with k nested occurrences of h) and \(g(v_k) \xrightarrow { \, {a_k} \, } \delta (h(\ldots h(v_k)\ldots ))\). Then, \(h(\ldots h(u_k)\ldots ) \xrightarrow { \, {a_k} \, } \pi '_{u,k}\) and \(h(\ldots h(v_k)\ldots ) \xrightarrow { \, {a_k} \, } \pi '_{v,k}\), with \(\pi '_{u,k} (a_k.\mathrm {nil}\mathbin {\Vert }_{\mathcal A} \ldots \mathbin {\Vert }_{\mathcal A} a_k.\mathrm {nil}) = 1\) (parallel composition of \(2^k\) copies of \(a_k.\mathrm {nil}\)), \(\pi '_{v,k}(a_k.\mathrm {nil}\mathbin {\Vert }_{\mathcal A} \ldots \mathbin {\Vert }_{\mathcal A} a_k.\mathrm {nil}) = (1-\varepsilon )^{2^k}\) and \(\pi '_{v,k}\) giving probability \(1-(1-\varepsilon )^{2^k}\) to processes that cannot move. We get \(\mathbf {b}^{\lambda }(g(u_k),g(v_{k})) = \lambda ^2(1-(1-\varepsilon )^{2^k})\) and, then, \(\sup _{k \in \mathbb {N}}\mathbf {b}^{\lambda }(g(u_k),g(v_k)) = \sup _{k \in \mathbb {N}} \lambda ^2(1-(1-\varepsilon )^{2^k}) = \lambda ^2\). As for operator f, we conclude that g is not uniformly continuous.

Example 9

(Counterexample for occurrences in non-expansive operators). In Example 6 we noticed that the rules for the copy operator \(\mathrm {cp}\) violate Definition 15. Let \(\varepsilon \in (0,1)\). Following [19], consider processes \(s_1 = \mathrm {l}.([1-\varepsilon ]a.\mathrm {nil}\oplus [\varepsilon ]\mathrm {nil}) + \mathrm {r}.([1-\varepsilon ]a.\mathrm {nil}\oplus [\varepsilon ]\mathrm {nil})\) and \(t_1 = \mathrm {l}.a.\mathrm {nil}+ \mathrm {r}.a.\mathrm {nil}\) for some \(a \ne \mathrm {l,r}\). Then, for \(k>1\), let \(s_k = \mathrm {l}.s_{k-1} + \mathrm {r}.s_{k-1}\) and \(t_k = \mathrm {l}.t_{k-1} + \mathrm {r}.t_{k-1}\). Clearly, \(\mathbf {b}^{\lambda }(s_k,t_k) = \lambda ^k \cdot \varepsilon \). Similar calculations to those in Example 1 give \(\mathbf {b}^{\lambda }(\mathrm {cp}(s_k),\mathrm {cp}(t_k)) = \lambda ^k(1-(1-\varepsilon )^{2^k})\). Thus, for any k with \(2^k>L\), \(\mathbf {b}^{\lambda }(\mathrm {cp}(s),\mathrm {cp}(t)) / \mathbf {b}^{\lambda }(s,t) = (1-(1-\varepsilon )^{2^k})/\varepsilon > L\) holds for \(s = s_k\), \(t = t_k\) and \(\varepsilon \in (0, (2^k-L)/(2^{k-1}(2^k-1)))\). We can conclude that \(\mathrm {cp}\) is not Lipschitz continuous.
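The final inequality can be verified numerically for a concrete choice of constants; the values of L, k and \(\varepsilon \) below are illustrative instances satisfying the side condition of the example.

```python
# Numeric check for Example 9: once 2**k exceeds a candidate Lipschitz
# constant L, a small enough eps makes the ratio (1 - (1-eps)**(2**k)) / eps
# exceed L, so no finite L can bound cp's amplification of distances.
L, k = 4.0, 3          # 2**k = 8 > L
eps = 0.01             # within (0, (2**k - L) / (2**(k-1) * (2**k - 1)))
ratio = (1 - (1 - eps) ** (2 ** k)) / eps
```

With these values the ratio exceeds L (while always staying below \(2^k\), since \(1-(1-\varepsilon )^n < n\varepsilon \)).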

7 Conclusions

We have provided three specification formats for the bisimilarity metric. These formats guarantee, respectively, the non-extensiveness, non-expansiveness and Lipschitz continuity of the operators satisfying the related format wrt. the bisimilarity metric.

To the best of our knowledge, the only other formats for the bisimilarity metric proposed in the literature are those in [19]. As outlined in the Introduction, the difference between the two proposals is that our formats are purely syntactic, whereas the ones in [19] are semantic, in the sense that they require a recursive computation of how many copies of a process argument can be spawned by each operator. However, we remark that we have applied the same technique as in [19] to the evaluation of Lipschitz factors of operators. This is due to the fact that from the syntactic constraints on PGSOS rules for an operator f we can only guarantee that f is Lipschitz continuous for some \(L_f\) and that \(L_f \ge n_f\), but no further information can be inferred. This is reasonable since the validity of the Lipschitz continuity property does not depend on the value of the Lipschitz constant, but only on its existence. The actual value of \(L_f\) can be established, in general, only from the semantics of f.

As future work, we plan to provide specification formats for weak bisimulation metrics [15], whose applicability has been demonstrated in [24,25,26]. Intuitively, we need to consider TSSs in a format guaranteeing that the kernel of the chosen metric is a congruence and enrich our three formats with constraints allowing us to deal with silent moves.