
1 Introduction

Secure information flow is a rigorous technique for evaluating the security of a system. A system satisfies confidentiality requirements if it does not leak any secret information to its public outputs. However, imposing a no-leakage policy is too restrictive, and in practice the security policy of a system tends to permit minor leakages. For example, a password-checking program leaks information about what the password is not when it shows a message indicating that the user has entered a wrong password. Quantitative information flow is a well-established attempt to overcome this deficiency. Given a system with secret (high confidentiality) inputs and public (low confidentiality) outputs, quantitative information flow addresses the problem of measuring the amount of information leakage, i.e., the amount of information that an attacker can deduce about the secret inputs by observing the outputs. Quantitative information flow is widely used in analyzing timing attacks [23, 24], differential privacy [1], anonymity protocols [13, 25, 29, 30], and cryptographic algorithms [18, 19].

Assume a program with a secret input and a public output, and an attacker that executes the program and observes the public output. A common approach to measuring the amount of leaked information is based on the notion of uncertainty [31]. Before executing the program, the attacker has an initial uncertainty about the secret, determined by her prior knowledge of the secret. After executing the program and observing the output, she may infer information about the secret, and thus her uncertainty may be reduced. This yields the following intuitive definition of information leakage [31]: leaked information = initial uncertainty - remaining uncertainty.

In this paper, a practical and automated formal approach is proposed to quantify the information leakage of terminating concurrent programs. The approach takes into account leakages in intermediate states of the program executions and the effect of the scheduling policy.

We assume the program has a secret input h, a public output l, and zero or more neutral variables. Neutral variables are temporary and/or auxiliary components of the runtime program configuration that do not belong to a certain confidentiality level by nature, e.g., the stack pointer and loop indexes. h is fixed and does not change during program executions; this is the case in any confidentiality analysis that assumes data integrity to be out of scope, e.g., [2, 7]. We also assume that the public and neutral variables have single initial values. Furthermore, a probabilistic attacker is assumed, who has full knowledge of the source code of the concurrent program and is able to choose a scheduler and execute the program under the control of that scheduler. She can observe the sequences of values of l during the executions, called execution traces. We also assume that the attacker can execute the program an arbitrary number of times and can then guess the value of h in a single attempt. This is called the one-try guessing model [31].

In order to model the concurrent program, Markov decision processes (MDPs) are used. MDPs provide a powerful state transition system, capable of modeling both probabilistic and nondeterministic behaviors [28]. The scheduler is assumed to be probabilistic; it resolves the nondeterminism in the MDP and induces a Markov chain (MC). States of the MC contain the values of h, l, and any neutral variables. For computing the leakage, however, the MC should capture the attacker’s view of the program. The attacker, while executing the program and observing the execution traces, does not know the exact value of h in each step. She can only guess a set of possible values, based on the executed program statements and the observed traces. She also cannot distinguish those executions of the MC that have the same trace. In this regard, we define an equivalence relation for a given MC, called back-bisimulation, that captures these requirements of the threat model. Back-bisimulation induces a quotient which models the attacker’s view of the program. A partition-refinement algorithm is proposed to compute the back-bisimulation quotient.

Each state of the back-bisimulation quotient contains a secret distribution, which gives the possible values of h in that state and thus determines the attacker’s uncertainty about h. Each execution trace of the quotient shows a reduction of the attacker’s uncertainty from the initial state to the final state of the trace. Therefore, the secret distribution in the initial state of the quotient determines the attacker’s initial uncertainty, and the secret distributions in the final states determine the remaining uncertainty. In the literature, uncertainty is measured based on the notion of entropy. The entropy of h expresses the difficulty for an attacker to discover its value. Based on the program model and the attacker’s observational power, various definitions of entropy have been proposed. As Smith [31] shows, in the context of the one-try guessing model, uncertainty about a random variable should be defined in terms of Renyi’s min-entropy. Consequently, the information leakage is computed as the difference between the Renyi’s min-entropy of h in the initial state of the quotient and the expected value of the Renyi’s min-entropy of h in the final states of the quotient.

We also identify a subclass of MCs, called Markov chains with pseudoback-bisimilar states, in which back-bisimulation cannot correctly construct the attacker’s view of the program behavior. Extending back-bisimulation to handle this situation is left as future work. Briefly, our contributions include

  • proposing back-bisimulation equivalence, in order to capture the attacker’s observation of the program,

  • developing an algorithm to compute back-bisimulation quotient of an MC,

  • proposing a method to compute the leakage of a concurrent program from the back-bisimulation quotient, and

  • analyzing the dining cryptographers problem.

1.1 Paper Outline

The paper proceeds as follows. Section 2 provides the necessary background on basic notions, information theory, Markovian models, and probabilistic schedulers. Section 3 presents the proposed approach. It starts by introducing the program and threat models. It then formally defines back-bisimulation and discusses how to compute the program leakage. Finally, it describes how to construct the attacker’s view of the program model, the back-bisimulation quotient. Section 4 concludes the paper and proposes future work. Finally, the case study and related work are discussed in Appendix A and Appendix B, respectively.

2 Background

In this section, we provide preliminary concepts and notations required for the proposed approach.

2.1 Basics

A probability distribution \( Pr \) over a set \(\mathcal {X}\) is a function \( Pr : \mathcal {X} \rightarrow [0,1]\), such that \(\sum _{x \in \mathcal {X}} Pr (x) = 1\). We denote the set of all probability distributions over \(\mathcal {X}\) by \(\mathcal {D}(\mathcal {X})\).

Let S be a set and \(\mathcal {R}\) an equivalence relation on S. For \(s \in S\), \([s]_\mathcal {R}\) denotes the equivalence class of s under \(\mathcal {R}\), i.e., \([s]_\mathcal {R} = \{ s^\prime \in S \ | \ s \mathcal {R} s^\prime \}\). Note that for \(s^\prime \in [s]_\mathcal {R}\) we have \([s^\prime ]_\mathcal {R} = [s]_\mathcal {R}\). The set \([s]_\mathcal {R}\) is often referred to as the \(\mathcal {R}\)-equivalence class of s. The quotient space of S under \(\mathcal {R}\), denoted by \(S/\mathcal {R} = \{ [s]_\mathcal {R} \ | \ s \in S \}\), is the set consisting of all \(\mathcal {R}\)-equivalence classes. A partition for S is a set \(\varPi = \{B_1, \ldots , B_k \}\) such that \(B_i \ne \varnothing \) (for \(0 < i \le k\)), \(B_i \cap B_j = \varnothing \) (for \(0< i < j \le k\)) and \(S = \cup _{0 < i \le k} B_i\). \(B_i \in \varPi \) is called a block. \(C \subseteq S\) is a superblock of \(\varPi \) if \(C = B_{i_1} \cup \dots \cup B_{i_l}\) for some \(B_{i_1}, \ldots , B_{i_l} \in \varPi \). Note that for equivalence relation \(\mathcal {R}\) on S, the quotient space \(S/\mathcal {R}\) is a partition for S.

2.2 Information Theory

Let \(\textrm{X}\) denote a random variable with the finite set of values \(\mathcal {X}\). The vulnerability [31] of \(\textrm{X}\) is defined as \( Vul (\textrm{X}) = \underset{x \in \mathcal {X}}{\max } \ Pr(\textrm{X}=x)\), i.e., the highest probability of correctly guessing the value of the variable in a single attempt. In order to quantify information leaks, this probability is converted into bits using Renyi’s min-entropy [31].

Definition 1

The Renyi’s min-entropy of a random variable \(\textrm{X}\) is given by \(\mathcal {H}_\infty (\textrm{X}) = - \log _2 \ Vul (\textrm{X})\).
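Both quantities are mechanical to compute. The following Python sketch (the dictionary encoding of distributions is our assumption, not the paper’s notation) computes vulnerability and min-entropy:

```python
import math

def vulnerability(dist):
    """Vul(X): the highest probability of guessing X correctly in one try."""
    return max(dist.values())

def min_entropy(dist):
    """Renyi's min-entropy: H_inf(X) = -log2(Vul(X))."""
    return -math.log2(vulnerability(dist))

# A uniform 2-bit secret: Vul = 1/4, so H_inf = 2 bits.
h = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
print(vulnerability(h), min_entropy(h))  # 0.25 2.0
```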

2.3 Markovian Models

We use Markov decision processes (MDPs) to model operational semantics of concurrent programs. MDPs are state transition systems that permit both probabilistic and nondeterministic choices [28]. In any state of an MDP, a nondeterministic choice between probability distributions exists. Once a probability distribution is chosen nondeterministically, the next state is selected in a probabilistic manner. Nondeterminism is used to model concurrency between threads by means of interleaving, i.e., all possible choices of the threads are considered. Formally,

Definition 2

A Markov decision process (MDP) is defined as a tuple \(\mathcal {M}=(S, Act , {\textbf {P}}, \zeta , AP , V)\) where,

  • S is a set of states,

  • \( Act \) is a set of actions,

  • \({\textbf {P}} : S \rightarrow ( Act \rightarrow (S \rightarrow [0,1]))\) is a transition probability function such that for all states \(s \in S\) and actions \(\alpha \in Act \), \({\sum }_{s^\prime \in S} {\textbf {P}}(s)(\alpha )(s^\prime ) \in \{0,1\}\),

  • \(\zeta :S \rightarrow [0,1]\) is an initial distribution such that \({\sum }_{s \in S} \zeta (s) = 1\),

  • \( AP \) is a set of atomic propositions,

  • \(V:S \rightarrow AP \) is a labeling function.

Atomic propositions represent simple known facts about the states. The function V labels each state with atomic propositions. An MDP \(\mathcal {M}\) is called finite if S, \( Act \), and AP are finite. An action \(\alpha \) is enabled in state s iff \({\sum }_{s^\prime \in S} {\textbf {P}}(s)(\alpha )(s^\prime ) = 1\). Let \( Act (s)\) denote the set of enabled actions in s. Each state \(s^\prime \) for which \({\textbf {P}}(s)(\alpha )(s^\prime ) > 0\) is called an \(\alpha \)-successor of s. The set of \(\alpha \)-successors of s is denoted by \( Succ (s, \alpha )\). The set of successors of s is defined as \( Succ (s) = \underset{\alpha \in Act (s)}{\cup } Succ (s, \alpha )\). The set of successors of a set of states \(\mathcal {S}\) is defined as \( Succ (\mathcal {S}) = \underset{s \in \mathcal {S}}{\cup } Succ (s)\). The set of predecessors of s is defined as \( Pre (s)=\{ s^\prime \in S \ | \ s \in Succ (s^\prime ) \}\). The set of labels that are associated with the predecessors of s is defined as \( PreLabels (s) = \{ V(s^\prime ) \ | \ s^\prime \in Pre (s), s^\prime \ne s \}\).

MDP Semantics. The intuitive operational behavior of an MDP \(\mathcal {M}\) is as follows. At the beginning, an initial state \(s_0\) is probabilistically chosen such that \(\zeta (s_0) > 0\). Assuming that \(\mathcal {M}\) is in state s, first a nondeterministic choice between the enabled actions needs to be resolved. Suppose action \(\alpha \in Act (s)\) is selected. Then, one of the \(\alpha \)-successors of s is selected probabilistically according to the transition function \({\textbf {P}}\). That is, with probability \({\textbf {P}}(s)(\alpha )(s^\prime )\) the next state is \(s^\prime \).

Initial and Final States. The states s with \(\zeta (s) > 0\) are considered as the initial states. The set of initial states of \(\mathcal {M}\) is denoted by \( Init (\mathcal {M})\). To ensure \(\mathcal {M}\) is non-blocking, we add a self-loop to each state s that has no successor, i.e., \({\textbf {P}}(s)(\tau )(s) = 1\). The distinguished action label \(\tau \) indicates that the self-loop’s action is not of further interest. A state s is then called final if \( Succ (s) = \{s\}\). In the literature, these states are called absorbing [3]. We call them final because, in our program model, they indicate termination of the program. The set of final states of \(\mathcal {M}\) is denoted by \( final (\mathcal {M})\).

Execution Paths. Alternating sequences of states that may arise by resolving both nondeterministic and probabilistic choices in an arbitrary MDP \(\mathcal {M}\) are called (execution) paths. More precisely, a finite path fragment \(\hat{\sigma }\) of \(\mathcal {M}\) is a finite state sequence \(s_0s_1\ldots s_n\) such that \(s_i \in Succ (s_{i-1})\) for all \(0<i\le n\). A path \(\sigma \) is an infinite state sequence \(s_0 s_1 \ldots s_{n-1} s_n^\omega \) such that \(s_0 \in Init (\mathcal {M})\), \(s_i \in Succ (s_{i-1})\) for all \(0<i\le n\), and \(s_n \in final (\mathcal {M})\), where \(s_n^\omega \) denotes the infinite iteration over \(s_n\). The final state of \(\sigma \), i.e., \(s_n\), is given by \( final (\sigma )\). The set of execution paths of \(\mathcal {M}\) is denoted by \( Paths (\mathcal {M})\). The set of finite path fragments starting in s and ending in \(s^\prime \) is denoted by \( PathFrags (s,s^\prime )\).

Traces and Trace Fragments. A trace of an execution path is the sequence of atomic propositions of the states of the path. Formally, the trace of a finite path fragment \(\hat{\sigma } = s_0s_1 \ldots s_n\) is defined as \(\hat{T} = trace (\hat{\sigma }) = V(s_0) V(s_1) \ldots V(s_n)\). For a path \(\sigma = s_0s_1\ldots \), \(trace_{\ll i}(\sigma )\) is defined as the prefix of \( trace (\sigma )\) up to index i, i.e., \(trace_{\ll i}(\sigma ) = V(s_0) V(s_1) \ldots V(s_i)\). Let \( Paths (T)\) be the set of paths that have the trace T, i.e., \( Paths (T) = \{ \sigma \in Paths (\mathcal {M}) \ | \ trace (\sigma ) = T \}\). We define \( final ( Paths (T))\) to denote the set of final states that result from the trace T, i.e., \( final ( Paths (T)) = \{ final (\sigma ) \ | \ \sigma \in Paths (T) \}\).

MDPs are suitable for modeling concurrent programs, but since they contain nondeterministic choices, they are too abstract to implement. We need to resolve these nondeterministic choices into probabilistic ones. The result is a Markov chain, which contains neither actions nor nondeterminism.

Definition 3

A (discrete-time) Markov chain (MC) is a tuple \(\mathcal {M}=(S, {\textbf {P}}, \zeta , AP, V)\) where,

  • S is a set of states,

  • \({\textbf {P}} : S \times S \rightarrow [0,1]\) is a transition probability function such that for all states \(s \in S\), \({\sum }_{s^\prime \in S} {\textbf {P}}(s, s^\prime ) = 1\),

  • \(\zeta :S \rightarrow [0,1]\) is an initial distribution such that \({\sum }_{s \in S} \zeta (s) = 1\),

  • AP is a set of atomic propositions,

  • \(V:S \rightarrow AP\) is a labeling function.

The function \({\textbf {P}}\) determines for each state s the probability \({\textbf {P}}(s,s^\prime )\) of a single transition from s to \(s^\prime \).

Reachability Probabilities. We define the probability of reaching a state s from an initial state in an MC \(\mathcal {M}\) as \( Pr (s) = \sum _{s_0 \in Init (\mathcal {M})} \ \sum _{\hat{\sigma } \in PathFrags (s_0 , s)} Pr (\hat{\sigma })\), where

$$ Pr (\hat{\sigma } = s_0 s_1 \ldots s_n) = {\left\{ \begin{array}{ll} \zeta (s_0) & \text {if} \ n=0,\\ \zeta (s_0) \cdot \prod \limits _{0 \le i < n} {\textbf {P}}(s_i , s_{i+1}) & \text {otherwise}. \end{array}\right. } $$

Trace Probabilities. The occurrence probability of a trace T is defined as \( Pr (T)=\underset{ \sigma \in Paths (T) }{\sum } Pr (\sigma )\), where \( Pr (\sigma = s_0 s_1 \ldots s_n^\omega ) = Pr (\hat{\sigma } = s_0 s_1 \ldots s_n)\).
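For the DAG-shaped models used in this paper (see below), there are finitely many paths, so both probabilities can be computed by direct enumeration. A minimal Python sketch, assuming a dict-of-dicts encoding P[s][t] of the transition function and an explicit list of paths (both encodings are ours):

```python
def path_prob(path, zeta, P):
    """Pr(sigma_hat) for a finite path fragment s_0 ... s_n:
    zeta(s_0) times the product of the transition probabilities."""
    p = zeta.get(path[0], 0.0)
    for s, t in zip(path, path[1:]):
        p *= P[s][t]
    return p

def trace_prob(trace, paths, V, zeta, P):
    """Pr(T): the summed probability of all paths whose trace is T.
    V maps each state to its label (atomic proposition)."""
    return sum(path_prob(p, zeta, P) for p in paths
               if [V[s] for s in p] == list(trace))
```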

DAG Structure of Program Models. We assume that programs always terminate and that states record the current values of the variables and the program counter. This implies that the Markovian model of every terminating program takes the form of a directed acyclic graph (DAG), modulo self-loops in final states. Therefore, reachability probabilities coincide with long-run probabilities [3]. Initial states of the program are the roots of the DAG, and final states are its leaves. Each state of a Markovian model is located at a level equal to its least distance from an initial state. The level of state s is denoted by level(s).
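Because the model is a DAG up to final-state self-loops, the levels can be computed by a breadth-first search from the initial states. A small Python sketch under that assumption (succ maps each state to its successor set):

```python
from collections import deque

def levels(init, succ):
    """level(s): the least distance of s from an initial state."""
    level = {s: 0 for s in init}
    queue = deque(init)
    while queue:
        s = queue.popleft()
        for t in succ[s]:
            if t not in level:  # first visit = shortest distance in BFS
                level[t] = level[s] + 1
                queue.append(t)
    return level
```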

2.4 Probabilistic Schedulers

A probabilistic scheduler implements the scheduling policy of a concurrent program. It determines the order and probability of execution of threads. When a probabilistic scheduler is applied to a concurrent program, nondeterministic choices are replaced by probabilistic ones. As we modeled concurrency between threads using nondeterminism in the MDP, the scheduler is used to resolve the possible nondeterminism in the MDP. For demonstration purposes, it suffices to consider a simple but important subclass of schedulers called memoryless probabilistic schedulers. Given a state s, a memoryless probabilistic scheduler returns a probability for each action \(\alpha \in Act (s)\). This random choice is independent of what has happened in the history, i.e., which path led to the current state; this is why the scheduler is called memoryless. Formally,

Definition 4

Let \(\mathcal {M}=(S, Act , {\textbf {P}}, \zeta , AP, V)\) be an MDP. A memoryless probabilistic scheduler for \(\mathcal {M}\) is a function \(\delta :S \rightarrow \mathcal {D}( Act )\), such that \(\delta (s) \in \mathcal {D}( Act (s))\) for all \(s \in S\).

As all nondeterministic choices in an MDP \(\mathcal {M}\) are resolved by a scheduler \(\delta \), a Markov chain \(\mathcal {M}_\delta \) is induced. Formally,

Definition 5

Let \(\mathcal {M}=(S, Act , {\textbf {P}}, \zeta , AP, V)\) be an MDP and \(\delta :S \rightarrow \mathcal {D}( Act )\) be a memoryless probabilistic scheduler on \(\mathcal {M}\). The MC of \(\boldsymbol{\mathcal {M}}\) induced by \(\boldsymbol{\delta }\) is given by \(\mathcal {M}_\delta = (S, \textbf{P}_\delta , \zeta , AP, V)\), where \({\textbf {P}}_\delta (s, s^\prime ) = \sum _{\alpha \in Act (s)} \delta (s)(\alpha ) \cdot {\textbf {P}}(s)(\alpha )(s^\prime )\).
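The induced transition function is a weighted average of the per-action distributions. A minimal Python sketch of Definition 5, assuming nested-dictionary encodings P_mdp[s][alpha][t] for the MDP and delta[s][alpha] for the scheduler (the encodings are ours):

```python
def induce_mc(P_mdp, delta):
    """P_delta(s, t) = sum over alpha of delta(s)(alpha) * P(s)(alpha)(t)."""
    P_mc = {}
    for s, by_action in P_mdp.items():
        row = {}
        for alpha, successors in by_action.items():
            w = delta[s].get(alpha, 0.0)  # probability of picking alpha in s
            for t, p in successors.items():
                row[t] = row.get(t, 0.0) + w * p
        P_mc[s] = row
    return P_mc
```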

3 The Proposed Approach

Consider a concurrent program P running under the control of a scheduling policy \(\delta \). The proposed approach proceeds in three steps: (1) defining an MDP representing P and applying \(\delta \) to the MDP to resolve its nondeterminism (Sect. 3.1), (2) constructing the back-bisimulation quotient (Sect. 3.2), and (3) computing the leakage (Sect. 3.3). Finally, an algorithm for computing the back-bisimulation quotient is presented (Sect. 3.4).

3.1 The Program and Threat Models

It is assumed that P has a secret input variable h and a public output variable l, and that h has a fixed value during the program executions. If the program has several secret variables, they can be encoded (e.g., concatenated) into one secret variable; the same is done for public and neutral variables. The possible values of l and h are denoted by \( Val _l\) and \( Val _h\).

The attacker has prior knowledge of the secret, which is modeled as a prior probability distribution over the possible values of h, i.e., Pr(h). Here, the attacker is assumed to be probabilistic, i.e., she knows the size of the secret, in addition to some accurate constraints on the values of h. For instance, the attacker could know that h is 2 bits long, that its value is not 1, that the probability of its value being 2 is 0.6, and that the probability of its value being 3 is thrice the probability of it being 0; since the probabilities must sum to 1, this gives \( Pr (h=0) = 0.1\). The prior distribution encoding these constraints is \( Pr (h) = \{ 0 \mapsto 0.1, 2 \mapsto 0.6, 3 \mapsto 0.3 \}\). A special case of the probabilistic attacker is the ignorant attacker [6], who has no prior information about the value of h except its size. Thus, the ignorant attacker’s initial knowledge is a uniform prior distribution on h.

Define an MDP Representing P. The operational semantics of the concurrent program P is represented by an MDP \(\mathcal {M}^{\texttt {P}}=(S, Act , {\textbf {P}}, \zeta , Val _l, V)\). Each state \(s \in S\) is a tuple \(\langle \overline{l}, \overline{h}, \overline{n}, pc \rangle \), where \(\overline{l}\), \(\overline{h}\), and \(\overline{n}\) are the values of the public, secret, and neutral variables, respectively, and pc is the program counter. The actions \( Act \) are the program statements of P. The function \(\textbf{P}\) defines the probabilities of transitions between states. The atomic propositions are \( Val _l\), and the function V labels each state with the value of l in that state. In fact, a state label is what an attacker observes in a state, and the traces of \(\mathcal {M}^{\texttt {P}}\) are the sequences of public values that are valid during the execution.

The initial distribution \(\zeta \) is determined by the prior knowledge of the attacker \( Pr (h)\), i.e., \(\zeta (s) = Pr (h=\overline{h})\) for all \(s \in Init (\mathcal {M}^{\texttt {P}})\), where \(s = \langle ., \overline{h}, ., . \rangle \).

Example 1

(Program P1). Consider the following program, where h is a 2-bit random variable and || denotes parallel composition:

[Program listing of P1, rendered as a figure in the original; its statements are l:=h/2, l:=h mod 2, and l:=1 (cf. Fig. 1).]

The attacker’s prior knowledge is only the size of h, yielding a uniform distribution on h, i.e., \( Pr (h) = \{ 0 \mapsto \frac{1}{4}, 1 \mapsto \frac{1}{4}, 2 \mapsto \frac{1}{4}, 3 \mapsto \frac{1}{4} \}\). The MDP \(\mathcal {M}^{\texttt {P1}}\) of the program is shown in Fig. 1. The initial distribution \(\zeta \) is determined by \( Pr (h)\), i.e., \(\zeta = \{ s_0 \mapsto \frac{1}{4}, s_4 \mapsto \frac{1}{4}, s_7 \mapsto \frac{1}{4}, s_{10} \mapsto \frac{1}{4} \}\). Each state is labeled with the value of l in that state. Each transition is labeled with an action (a program statement) and a probability. For instance, the transition from \(s_0\) to \(s_1\) has the action \(\alpha : \texttt {l:=h/2}\) and the probability \({\textbf {P}}(s_0)(\alpha )(s_1)=1\), and the transition from \(s_0\) to \(s_2\) has the label \(\beta : \texttt {l:=h}\ \texttt {mod}\ \texttt {2}\) and the probability \({\textbf {P}}(s_0)(\beta )(s_2)=1\).

Fig. 1. \(\mathcal {M}^{\texttt {P1}}\): MDP of the program P1, with \(\alpha \) denoting l:=h/2, \(\beta \) denoting l:=h mod 2, \(\gamma \) denoting l:=1, and \(\tau \) denoting termination of the program

Resolve the Nondeterminism in the MDP. The scheduling policy is represented by a memoryless probabilistic scheduler \(\delta \). As the MDP \(\mathcal {M}^{\texttt {P}}\) is run under the control of the scheduler \(\delta \), all nondeterministic transitions are resolved and an MC \(\mathcal {M}^\texttt {P}_\delta = (S, {\textbf {P}}_\delta , \zeta , Val _l, V)\) is produced.

Example 2

(MC of P1). We choose the scheduler to be uniform. The uniform scheduler, denoted by the function uni, picks each thread with the same probability. This yields the following definition of the scheduler:

$$\begin{aligned} uni(s_0)&=uni(s_4)=uni(s_7)=\{ \alpha \mapsto \tfrac{1}{2}, \beta \mapsto \tfrac{1}{2} \}, & uni(s_{10})&=\{ \gamma \mapsto 1 \},\\ uni(s_1)&=uni(s_5)=uni(s_{13})=\{ \beta \mapsto 1 \}, & uni(s_2)&=uni(s_8)=uni(s_{11})=\{ \alpha \mapsto 1 \},\\ uni(s_{3})&=uni(s_6)=uni(s_{9})=uni(s_{12})=uni(s_{14})=uni(s_{15})=\{ \tau \mapsto 1 \}. \end{aligned}$$

The MC \(\mathcal {M}^{\texttt {P1}}_{uni}\) of the program P1 running under the control of the uniform scheduler is depicted in Fig. 2. In this figure, transitions are labeled with their probabilities.

Fig. 2. \(\mathcal {M}^{\texttt {P1}}_{uni}\): MC of the program P1 with the uniform scheduler

3.2 The Attacker’s View of the Program: Back-Bisimulation Quotient

In order to measure the amount of information the attacker can deduce about h, we need to construct the attacker’s view of the program. First, the attacker can distinguish a final state from a non-final one by observing termination of the program. Second, she cannot discriminate between paths that have the same trace. For instance, in \(\mathcal {M}^{\texttt {P1}}_{uni}\) (Fig. 2) the attacker only observes the traces \(\{ \langle 0,0,0^\omega \rangle , \langle 0,0,1^\omega \rangle , \langle 0,1,0^\omega \rangle , \langle 0,1^\omega \rangle \}\), whereas there are seven different execution paths. The implication is that she cannot distinguish final states that have the same public values and result from the same traces. Third, she does not know the secret values in the final states, but may guess the value of h based on a probability distribution that she can compute from the possible values of h in each final state. These three requirements are captured by an equivalence relation, called back-bisimulation and denoted by \(\sim _b\).

Definition 6

Let \(\mathcal {M}^\texttt {P}_\delta \) be an MC. A back-bisimulation for \(\mathcal {M}^\texttt {P}_\delta \) is a binary relation \(\mathcal {R}\) on S such that for all \(s_1\ \mathcal {R}\ s_2\), the following three conditions hold:

(1) \(V(s_1) = V(s_2)\), (2) if \(s_1^\prime \in Pre(s_1 )\), then there exists \(s_2^\prime \in Pre(s_2 )\) with \(s_1^\prime \ \mathcal {R}\ s_2^\prime \), (3) if \(s_2^\prime \in Pre(s_2 )\), then there exists \(s_1^\prime \in Pre(s_1 )\) with \(s_1^\prime \ \mathcal {R}\ s_2^\prime \).

States \(s_1\) and \(s_2\) are back-bisimilar, denoted by \(s_1 \sim _b s_2\), if there exists a back-bisimulation \(\mathcal {R}\) for \(\mathcal {M}^\texttt {P}_\delta \) with \(s_1\ \mathcal {R}\ s_2\).

Condition (1) requires that the states \(s_1\) and \(s_2\) have the same public values. According to condition (2), every incoming transition of \(s_1\) must be matched by an incoming transition of \(s_2\); the reverse is assured by condition (3).

Theorem 1

Back-bisimulation is an equivalence relation.

As \(\sim _b\) is an equivalence relation, it induces a set of equivalence classes on the state space of an MC. Given an MC \(\mathcal {M}^\texttt {P}_\delta \), the quotient space \(\mathcal {M}^\texttt {P}_\delta /\sim _b\) captures the attacker’s view of the program P. The MC \(\mathcal {M}^\texttt {P}_\delta /\sim _b\) aggregates the same-trace paths of \(\mathcal {M}^\texttt {P}_\delta \) into one path.

Fig. 3. \(\mathcal {M}^{\texttt {P1}}_{uni}/\sim _b\): back-bisimulation quotient of \(\mathcal {M}^{\texttt {P1}}_{uni}\)

Definition 7

For MC \(\mathcal {M}^\texttt {P}_\delta = (S, {\textbf {P}}_\delta , \zeta , Val _l, V)\) and back-bisimulation \(\sim _b\), the back-bisimulation quotient of \(\mathcal {M}^\texttt {P}_\delta \) is defined as \(\mathcal {M}^\texttt {P}_\delta /\sim _b = ( S/\sim _b , {\textbf {P}}^\prime _\delta , s^b_{init}, Val _l, V, Pr (h) )\), where

  • \(S/\sim _b\) is the quotient space of S under \(\sim _b\),

  • \({\textbf {P}}^\prime _\delta : (S/\sim _b) \times (S/\sim _b) \rightarrow [0,1]\) is a transition probability function between equivalence classes of \(S/\sim _b\) such that for all \(s^b, t^b \in S/\sim _b\), \({\textbf {P}}^\prime _\delta (s^b, t^b) = \frac{1}{ Pr (s^b)} \sum _{s \in s^b, \ t \in t^b} Pr (s) \cdot {\textbf {P}}_\delta (s,t)\), where \( Pr (s)\) is the probability of reaching s in the MC \(\mathcal {M}^\texttt {P}_\delta \) and \( Pr (s^b) = \sum _{s \in s^b} Pr (s)\),

  • \(s^b_{init} = Init(\mathcal {M}^\texttt {P}_\delta )\),

  • \(V ([s]_{\sim _b}) = V(s)\),

  • \( Pr (h)\) is a mapping from each quotient state \(s^b\) to \( Pr (h_{s^b})\), where \( Pr (h_{s^b})\) is the probability distribution of h in the state \(s^b\) and is computed, for all \(\overline{h} \in Val _h\), as

    $$ Pr (h_{s^b})(\overline{h}) = \frac{\sum _{s \in s^b, \ s = \langle \cdot , \overline{h}, \cdot , \cdot \rangle } Pr (s)}{ Pr (s^b)}, $$

    where \( Pr (s^b)\) is the reachability probability of \(s^b\) in \(\mathcal {M}^\texttt {P}_\delta /\sim _b\).

The public variable has a single initial value. Thus, all of the initial states of \(\mathcal {M}^\texttt {P}_\delta \) have the same public value and form a single equivalence class \(s^b_{init}\). Each state \(s^b\) is labeled with a probability distribution \( Pr (h_{s^b})\), which gives the probabilities of the possible values of h in that state.
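Concretely, the secret distribution of a quotient state normalizes, over the block, the reachability mass carried by each value of h. A small Python sketch (the state names and dictionary encodings are ours), checked against the block \(s^b_1\) of Example 3 below:

```python
def secret_dist(block, reach_pr, h_of):
    """Pr(h_{s^b}): reachability mass of each h-value in the block,
    normalized by Pr(s^b) = sum of Pr(s) over s in the block."""
    total = sum(reach_pr[s] for s in block)
    dist = {}
    for s in block:
        dist[h_of[s]] = dist.get(h_of[s], 0.0) + reach_pr[s] / total
    return dist

# s^b_1 = {s1, s2, s5, s8}: s1, s2 carry h=0, s5 carries h=1, s8 carries h=2;
# each state is reached with probability 1/8.
print(secret_dist(["s1", "s2", "s5", "s8"],
                  {"s1": 0.125, "s2": 0.125, "s5": 0.125, "s8": 0.125},
                  {"s1": 0, "s2": 0, "s5": 1, "s8": 2}))
# {0: 0.5, 1: 0.25, 2: 0.25}
```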

Example 3

(Back-bisimulation quotient of P1). The back-bisimulation quotient \(\mathcal {M}^{\texttt {P1}}_{uni}/\sim _b\) is depicted in Fig. 3. Each state of \(\mathcal {M}^{\texttt {P1}}_{uni}/\sim _b\) is an equivalence class, containing related states of \(\mathcal {M}^{\texttt {P1}}_{uni}\):

$$\begin{aligned} s^b_0&= \{s_0, s_4, s_7, s_{10}\}, \quad s^b_1 = \{s_1, s_2, s_5, s_8\}, \quad s^b_2 = \{s_3\}, \quad s^b_3 = \{s_{6}, s_{9}\},\\ s^b_4&= \{s_{15}\}, \quad s^b_5 = \{s_{12}, s_{14}\}, \quad s^b_6 = \{s_{11}, s_{13}\}. \end{aligned}$$

States are labeled with the value of l, together with the distribution of h:

$$\begin{aligned} Pr (h_{s^b_0})&= \{0 \mapsto \frac{1}{4}, 1 \mapsto \frac{1}{4}, 2 \mapsto \frac{1}{4}, 3 \mapsto \frac{1}{4}\},\qquad Pr (h_{s^b_3}) = \{1 \mapsto \frac{1}{2}, 2 \mapsto \frac{1}{2}\}, \\ Pr (h_{s^b_1})&= \{0 \mapsto \frac{1}{2}, 1 \mapsto \frac{1}{4}, 2 \mapsto \frac{1}{4}\}, \quad Pr (h_{s^b_5}) = \{1 \mapsto \frac{1}{2}, 2 \mapsto \frac{1}{2}\},\\ Pr (h_{s^b_2})&= \{0 \mapsto 1\}, \quad Pr (h_{s^b_4}) = \{3 \mapsto 1\}, \quad Pr (h_{s^b_6}) = \{1 \mapsto \frac{1}{2}, 2 \mapsto \frac{1}{2}\}. \end{aligned}$$

The back-bisimulation quotient can be automatically constructed from the MC. This will be discussed in Sect. 3.4. After constructing the quotient, the next step is to compute the program leakage from the back-bisimulation quotient.

3.3 Measuring the Leakage Using Back-Bisimulation Quotient

Let \(\mathcal {M}^\texttt {P}_\delta /\sim _b = ( S/\sim _b , {\textbf {P}}^\prime _\delta , s^b_{init}, Val _l, V, Pr (h) )\) be the attacker’s view of the program P running with the scheduler \(\delta \). In each state \(s^b\) of \(\mathcal {M}^\texttt {P}_\delta /\sim _b\), the secret distribution \( Pr (h_{s^b})\) determines the attacker’s uncertainty about h. Depending on the program statements that are chosen and executed by the scheduler, and the public values observed by the attacker, the distribution of h changes from state to state along each trace of \(\mathcal {M}^\texttt {P}_\delta /\sim _b\). In fact, \(\mathcal {M}^\texttt {P}_\delta /\sim _b\) transforms the prior distribution of h in the initial state \(s^b_{init}\) into posterior distributions in the final states \( final (\mathcal {M}^\texttt {P}_\delta /\sim _b)\).

The attacker’s uncertainty about h in a state \(s^b\) with the secret distribution \( Pr (h_{s^b})\) is measured by \(\mathcal {H}_\infty (\texttt {h}_{s^b})\). Thus, the initial uncertainty is measured by \(\mathcal {H}_\infty (\texttt {h}_{s^b_{init}})\).

Since there might be more than one final state, with different reachability probabilities, and the MC can be seen as a discrete probability distribution over its final states, the remaining uncertainty is defined as the expectation of the uncertainties in the final states: \(\sum _{s^b_f \in final (\mathcal {M}^\texttt {P}_\delta /\sim _b)} Pr (s^b_f) \cdot \mathcal {H}_\infty (\texttt {h}_{s^b_f})\), where \( Pr (s^b_f)\) is the probability of reaching \(s^b_f\) from the initial state \(s^b_{init}\). It follows that the leakage of the concurrent program P running under the control of the scheduler \(\delta \) is computed as \(\mathcal {L}(\texttt {P}_\delta ) = \mathcal {H}_\infty (\texttt {h}_{s^b_{init}}) - \sum _{s^b_f \in final (\mathcal {M}^\texttt {P}_\delta /\sim _b)} Pr (s^b_f) \cdot \mathcal {H}_\infty (\texttt {h}_{s^b_f})\).
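Once the final states of the quotient, their reachability probabilities, and their secret distributions are known, the leakage is a one-line expectation. A minimal Python sketch (the pair encoding of final states is our assumption), reproducing the numbers of Example 5 below:

```python
import math

def min_entropy(dist):
    return -math.log2(max(dist.values()))

def leakage(init_dist, finals):
    """L(P_delta) = H_inf(h_init) - sum_f Pr(f) * H_inf(h_f),
    with finals given as (reachability probability, secret distribution) pairs."""
    remaining = sum(p * min_entropy(d) for p, d in finals)
    return min_entropy(init_dist) - remaining

# Example 5: uniform 2-bit prior; four final states, each reached with prob. 1/4.
init = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
finals = [(0.25, {0: 1.0}), (0.25, {1: 0.5, 2: 0.5}),
          (0.25, {1: 0.5, 2: 0.5}), (0.25, {3: 1.0})]
print(leakage(init, finals))  # 2 - 0.5 = 1.5 bits
```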

Notice that for measuring the leakage of P, we computed the min-entropy of the initial and final states only, and did not consider the min-entropy of intermediate states. This does not contradict our assumption of taking into account the intermediate values of l along the execution paths, because in \(\mathcal {M}^\texttt {P}_\delta /\sim _b\) the distributions of h in the final states result from the values of l and the distributions of h in the intermediate states. Thus, when computing the remaining uncertainty from the final distributions, the intermediate values of l are automatically taken into account. The final distributions of h also result from the program statements chosen by the scheduler; therefore, the effect of the scheduler choices is considered as well.

Moreover, in the literature, the remaining uncertainty is usually measured by the conditional entropy \(\mathcal {H}_\infty ({h}|{l})\), but we measure it by the non-conditional entropy \(\mathcal {H}_\infty ({h})\). These entropies are identical in our program model, because in \(\mathcal {M}^\texttt {P}_\delta /\sim _b\) the entropy \(\mathcal {H}_\infty ({h})\) is computed from final states that result from traces observed by the attacker. This is exactly the same as the conditional entropy \(\mathcal {H}_\infty ({h}|{l})\).

Example 4

(Back-bisimulation quotient of P1 is a distribution transformer.). In the initial state \(s^b_0\) of \(\mathcal {M}^{\texttt {P1}}_{uni}/\sim _b\), the distribution of h is \( Pr (h_{s^b_0}) = \{0 \mapsto \frac{1}{4}, 1 \mapsto \frac{1}{4}, 2 \mapsto \frac{1}{4}, 3 \mapsto \frac{1}{4}\}\). This means that before executing the program, the attacker only knows that the value of h belongs to the set \(\{0,1,2,3\}\) and if she guesses the value of h, then the likelihood of her being successful is \(\frac{1}{4}\). Therefore, \( Pr (h_{s^b_0})\) determines the attacker’s initial uncertainty about h. Now consider the final state \(s^b_3\), in which the distribution of h is \( Pr (h_{s^b_3}) = \{1 \mapsto \frac{1}{2}, 2 \mapsto \frac{1}{2}\}\). In this state, the attacker knows that the value of h belongs to \(\{1,2\}\), and thus her uncertainty about h is reduced. This means that after executing the program and observing the trace \(\langle 0,1,0^\omega \rangle \), if the attacker guesses the value of h, then the likelihood of her being successful is \(\frac{1}{2}\). These considerations imply that the back-bisimulation quotient is a distribution transformer.

Example 5

(Information leakage of P1). The initial uncertainty is quantified as the Renyi’s min-entropy of h in the initial state \(s^b_0\), i.e., \(\mathcal {H}_\infty (\texttt {h}_{s^b_0}) = -\log _2 \frac{1}{4}=2 \ (bits)\).

The remaining uncertainty is quantified as the Renyi’s min-entropy of h in the final states. There are four final states, \(s^b_2\), \(s^b_5\), \(s^b_3\), and \(s^b_4\). Consequently, the remaining uncertainty is quantified as the expectation of the Renyi’s min-entropy of h in these states:

$$ \sum _{s^b \in \{s^b_2, s^b_5, s^b_3, s^b_4\}} Pr (s^b) \cdot \mathcal {H}_\infty (\texttt {h}_{s^b}) = -\tfrac{1}{4}\log _2 1 - \tfrac{1}{4}\log _2 \tfrac{1}{2} - \tfrac{1}{4}\log _2 \tfrac{1}{2} - \tfrac{1}{4}\log _2 1 = 0.5 \ (bits), $$

where \( Pr (s^b)\) denotes the probability of reaching \(s^b\) from the initial state \(s^b_0\). Finally, the leakage of the program P1 running with the uniform scheduler is computed as \(\mathcal {L}(\texttt {P1}_{uni}) = 2 - 0.5 = 1.5 \ (bits)\).

The following section explains how to compute the back-bisimulation quotient.

3.4 Computing Back-Bisimulation Quotient

In this section, we discuss how to compute the back-bisimulation quotient. We first describe a subclass of MCs, called Markov chains with pseudoback-bisimilar states, in which back-bisimulation cannot correctly construct the attacker’s view of the program behavior.

Pseudoback-Bisimilar States. In order to compute the states of a back-bisimulation quotient, we need to aggregate back-bisimilar states into one equivalence class. For that, we define the back-bisimulation signature, a kind of fingerprint for the states of a back-bisimulation equivalence class.

Definition 8

The back-bisimulation signature of a state s is defined as

$$sig_{\sim _b}(s) = \{ \ \big (V(s), [s^\prime ]_{\sim _b}\big ) \ | \ s^\prime \in Pre(s) \ \}.$$

Thus, two states that have the same public value and whose predecessors belong to the same equivalence classes have the same signature.

Definition 9

Let \(\mathcal {M}^\texttt {P}_\delta \) be an MC. Two states \(s_1 , s_2 \in S\) are pseudoback-bisimilar iff (1) \(V(s_1) = V(s_2)\), (2) \(level(s_1) = level(s_2)\), (3) \(sig_{\sim _b}(s_1) \ne sig_{\sim _b}(s_2)\), (4) \(PreLabels(s_1) \cap PreLabels(s_2) \ne \varnothing \). An MC that contains some pseudoback-bisimilar states is denoted by MC\(_\mathfrak {p}\) and an MC with no pseudoback-bisimilar state is denoted by MC\(_\mathfrak {n}\).

Stated in words, two states are pseudoback-bisimilar if they have the same label, are at the same level (distance from an initial state), and have different signatures but intersecting pre-labels. In an MC\(_\mathfrak {n}\), states at the same level and with the same label either have disjoint pre-labels or have the same signature.
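A small Python sketch of Definitions 8 and 9; the dictionary encodings of Pre, V, and level, and a map block_of from states to identifiers of their current equivalence classes, are our assumptions:

```python
def signature(s, pre, V, block_of):
    """Definition 8: sig(s) = { (V(s), [s']) | s' in Pre(s) }."""
    return frozenset((V[s], block_of[p]) for p in pre[s])

def pseudoback_bisimilar(s1, s2, pre, V, level, block_of):
    """Definition 9: same label and level, different signatures,
    but intersecting pre-labels."""
    def pre_labels(s):
        return {V[p] for p in pre[s] if p != s}
    return (V[s1] == V[s2]
            and level[s1] == level[s2]
            and signature(s1, pre, V, block_of) != signature(s2, pre, V, block_of)
            and bool(pre_labels(s1) & pre_labels(s2)))
```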

Example 6

(An example \(MC_\mathfrak {p}\)). Consider the following program:

[Program listing of P2, rendered as a figure in the original.]

where \( Val _h = \{0,1\}\) and \( Pr (h) = \{ 0 \mapsto \frac{1}{2}, 1 \mapsto \frac{1}{2} \}\). A uniform scheduler is selected for both parallel operators. The MC \(\mathcal {M}^{\texttt {P2}}_{uni}\) is shown in Fig. 4a.

Fig. 4. \(\mathcal {M}^{\texttt {P2}}_{uni}\) and \(\mathcal {M}^{\texttt {P2}}_{uni}/\sim _b\)

In \(\mathcal {M}^{\texttt {P2}}_{uni}\), states \(s_8\) and \(s_9\) are pseudoback-bisimilar. They both have the label 3, are at the level 3, and have different signatures:

$$sig_{\sim _b}(s_8) = \Big \{ \big (3, \{s_5, s_7\}\big ), \big (3, \{s_3\}\big ) \Big \}, sig_{\sim _b}(s_9) = \Big \{ \big (3, \{s_5, s_7\}\big ) \Big \},$$

but intersecting pre-labels: \( PreLabels (s_8) = \{2,1\}\) and \( PreLabels (s_9) = \{2\}\).

Back-bisimulation captures the attacker’s view for all programs that do not contain pseudoback-bisimilar states, i.e., those final states that have the same public values and result from the same trace are indistinguishable and fall into the same \(\sim _b\)-equivalence class. Formally,

Theorem 2

Let \(\mathcal {M}^\texttt {P}_\delta \) be an MC\(_\mathfrak {n}\). For all paths \(\sigma _1, \sigma _2 \in Paths(\mathcal {M}^\texttt {P}_\delta )\) with \(\sigma _1 = s_{0,1} s_{1,1} \ldots s_{n-1,1} (s_{n,1})^\omega \), \(\sigma _2 = s_{0,2} s_{1,2} \ldots s_{n-1,2} (s_{n,2})^\omega \), and \(n \ge 0\), it holds that \(s_{n,1} \sim _b s_{n,2}\) iff \(trace(\sigma _1) = trace(\sigma _2)\).

Theorem 2 states that same-trace final states are back-bisimilar. A similar result holds for non-final states.

Theorem 3

Let \(\mathcal {M}^\texttt {P}_\delta \) be an MC\(_\mathfrak {n}\). For all paths \(\sigma _1, \sigma _2 \in Paths(\mathcal {M}^\texttt {P}_\delta )\) with \(\sigma _1 = s_{0,1} s_{1,1} \ldots s_{n-1,1} (s_{n,1})^\omega \), \(\sigma _2 = s_{0,2} s_{1,2} \ldots s_{m-1,2} (s_{m,2})^\omega \), \(n, m > 0\), and \(0 \le i < min(n,m)\), it holds that \(s_{i,1} \sim _b s_{i,2}\) iff \(trace_{\ll i}(\sigma _1) = trace_{\ll i}(\sigma _2)\).

Therefore, all paths of \(\mathcal {M}^\texttt {P}_\delta \) with the same trace form a single path in \(\mathcal {M}^\texttt {P}_\delta /\sim _b\). Stated formally, let \(\sigma ^\prime \in Paths(\mathcal {M}^\texttt {P}_\delta /\sim _b)\), \(s^b_f = final (\sigma ^\prime ) \in final (\mathcal {M}^\texttt {P}_\delta /\sim _b)\), and \(T = trace(\sigma ^\prime )\). The path \(\sigma ^\prime \) is the aggregation of all the paths \(Paths(T) \subseteq Paths(\mathcal {M}^\texttt {P}_\delta )\). All final states of \(\mathcal {M}^\texttt {P}_\delta \) that result from the trace T fall into the same \(\sim _b\)-equivalence class \(s^b_f = \{ s_f \ | \ s_f \in final (Paths(T)) \}\). For \(s^b_f\), the secret distribution \( Pr (h_{s^b_f})\) contains the probabilities of the possible values of h that the attacker might be able to guess by observing T.

Pseudoback-bisimilar states do not fall into the same \(\sim _b\)-equivalence class, and thus \(\sim _b\) is not able to aggregate all paths with the same trace. For instance, in \(\mathcal {M}^{\texttt {P2}}_{uni}/\sim _b\) (Fig. 4b) there are two paths \(s^b_3 s^b_7 s^b_{10} s^b_5 s^b_2 (s^b_6)^\omega \) and \(s^b_3 s^b_7 s^b_{10} s^b_4 s^b_9 (s^b_{12})^\omega \) with the same trace \(\langle 0,1,2,3,4,5^\omega \rangle \). The attacker, after observing the trace, cannot determine whether the value of h is 0 or 1. But in the attacker’s view of the MC constructed by back-bisimulation (Fig. 4b), the value of h is distinguished in the final states of the two paths. Furthermore, the probability of some traces in the back-bisimulation quotient might differ from their probability in the concrete model. For example, the probability of the trace \(\langle 0,1,2,3,4,5^\omega \rangle \) in \(\mathcal {M}^{\texttt {P2}}_{uni}\) is \(\frac{5}{8}\), while it is \(\frac{9}{16}\) in \(\mathcal {M}^{\texttt {P2}}_{uni}/\sim _b\). The implication is that back-bisimulation cannot correctly construct the attacker’s view of an MC\(_\mathfrak {p}\). For MC\(_\mathfrak {p}\)s, we use the trace-exploration-based method introduced in [25], which computes the program leakage directly from the MC\(_\mathfrak {p}\) \(\mathcal {M}^{\texttt {P}}_{\delta }\).

Algorithm for Computing the Back-Bisimulation Quotient Space. In this section, an algorithm is proposed for obtaining the back-bisimulation quotient space for a finite MC\(_\mathfrak {n}\). This algorithm is similar to Kanellakis and Smolka’s algorithm for computing the bisimulation quotient space [20]. It relies on a partition refinement technique, where the state space is partitioned into blocks. It starts from an initial partition \(\varPi _0\) and computes successive refinements of \(\varPi _0\) until a stable partition is reached. A partition is stable if no further refinements are possible. The obtained partition is \(S/\sim _b\), the largest back-bisimulation over the input finite MC\(_\mathfrak {n}\). The essential steps are outlined in Algorithm 1. The algorithm consists of two main parts: (a) computing the initial partition, and (b) successively refining the partitions.

[Algorithm 1: computing the back-bisimulation quotient space (listing rendered as a figure in the original).]

Computing the Initial Partition. Since back-bisimilar states have the same public value, it is sensible to use this in determining the initial partition \(\varPi _0\).

All initial states have the same public value and have no predecessors. Consequently, they form a single block \(s^b_{init} = Init(\mathcal {M}^\texttt {P}_\delta )\). This block will remain unchanged during the refinements.

For the remaining states \(S {\setminus } s^b_{init}\), each group of states with the same public value forms a block. Same-label blocks can be obtained by the equivalence relation \(\mathcal {R} = \{ (s_1, s_2) \ | \ V(s_1)=V(s_2) \}\), which induces the quotient space \(\big (S \ \backslash \ Init(\mathcal {M}^\texttt {P}_\delta ) \big )\ / \ \mathcal {R}\). Thus, the initial partition is obtained as \(\varPi _0 = \{ s^b_{init} \} \cup \Big ( \big (S \ \backslash \ Init(\mathcal {M}^\texttt {P}_\delta ) \big ) \ / \ \mathcal {R} \Big )\).

Partition Refinement. Since all partitions are a refinement of the initial partition \(\varPi _0\), each block in these partitions contains states with the same public value. However, blocks of \(\varPi _0\), except \(s^b_{init}\), do not consider the one-step predecessors of states. This is taken care of in the successive refinement steps, by the refinement operator.

Definition 10

Let \(\varPi \) be a partition for S and \(\mathcal {C}\) be a superblock of \(\varPi \). Then,

$$ Refine _b(\varPi , \mathcal {C}) = \underset{\mathcal {B} \in \varPi }{\cup } Refine _b(\mathcal {B}, \mathcal {C}),$$

where \( Refine _b(\mathcal {B}, \mathcal {C}) = \{ \mathcal {B} \cap Succ(\mathcal {C}),\ \mathcal {B} {\setminus } Succ (\mathcal {C}) \} {\setminus } \{ \varnothing \}\). Here, \(\mathcal {C}\) is called a splitter for \(\varPi \), refining blocks of \(\varPi \) into subblocks.

Using \(\mathcal {C}\), \( Refine _b(\mathcal {B}, \mathcal {C})\) decomposes the block \(\mathcal {B}\) into two subblocks, provided that the subblocks are nonempty.

A key step of computing the back-bisimulation quotient space is to determine a splitter \(\mathcal {C}\) for a given partition \(\varPi \). Algorithm 1 uses the blocks of the previous partition \(\varPi _{old}\) as splitter candidates for \(\varPi \).
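The following Python sketch puts these pieces together; it is our reading of the description above, not the authors’ listing, and assumes set/dictionary encodings of the model:

```python
def back_bisim_quotient(states, init, succ, V):
    """Compute S/~b by partition refinement, in the spirit of Algorithm 1."""
    # Initial partition: the initial states form one block; the remaining
    # states are grouped by public value. In the DAG-shaped program model,
    # initial states are never successors, so the initial block never splits.
    rest = {}
    for s in states:
        if s not in init:
            rest.setdefault(V[s], set()).add(s)
    pi = [frozenset(init)] + [frozenset(b) for b in rest.values()]

    changed = True
    while changed:
        changed = False
        for C in list(pi):  # blocks of the previous partition as splitters
            succ_C = {t for s in C for t in succ[s]}
            refined = []
            for B in pi:
                # Refine_b(B, C) = {B & Succ(C), B \ Succ(C)} \ {empty}
                parts = [p for p in (B & succ_C, B - succ_C) if p]
                refined.extend(frozenset(p) for p in parts)
                changed |= len(parts) == 2
            pi = refined
    return pi  # the stable partition, i.e., the quotient space S/~b
```

Each refinement pass only splits blocks, never merges them, so the loop terminates after at most |S| splits, in line with Theorem 4.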

Theorem 4

Algorithm 1 always terminates and correctly computes the back-bisimulation quotient space \(S/\sim _b\).

The following theorem discusses the time complexity of Algorithm 1.

Theorem 5

The time complexity of Algorithm 1 is \(O(|S| \cdot |E|)\), where E denotes the set of transitions of \(\mathcal {M}^\texttt {P}_\delta \).

4 Conclusions and Future Work

In this paper, an approach is proposed for quantifying the information leakage of terminating concurrent programs. Back-bisimulation equivalence is defined to model the attacker’s view of the program behavior, and a partition-refinement algorithm is developed to compute the back-bisimulation quotient of the program model. The quotient is constructed automatically and contains the secret distributions that are used to compute the information leakage of the program.

The back-bisimulation quotient contains all the execution traces that the attacker can observe while executing the program. Thus, it can be used to compute the maximal and minimal leakages that might occur during the program executions. Furthermore, the quotient is an abstract model of the program, so the quantification analysis is done on a minimized model, most likely saving time and space.

Back-bisimulation equivalence creates many exciting opportunities for future work. It can be used to verify any trace-equivalence-based property, such as observational determinism [21, 26, 33], a widely studied confidentiality property of secure information flow for concurrent programs. It can also be defined on multi-terminal binary decision diagrams (MTBDDs), in order to greatly improve the scalability of the quantification approach. We aim to lift the program-termination restriction and extend the proposed approach to non-terminating concurrent programs. We also aim to study the bounded leakage problem [25] and channel capacity [29] on the back-bisimulation quotient. Reductions such as on-the-fly techniques may further improve scalability. Furthermore, handling programs with pseudoback-bisimilar states using back-bisimulation is a possible future work. Another avenue is to analyze the running time of the proposed approach, e.g., on the dining cryptographers protocol.