1 Introduction

Reversible computing [1, 26] has gained interest for its application to different fields: from modelling biological/chemical phenomena [18], to simulation [29], debugging [13] and modelling fault-tolerant systems [11, 19, 32]. Our interest focuses on this latter application and stems from the fact that reversibility can be used to rigorously model, implement and revisit programming abstractions for reliable software systems.

Recent works [4, 6, 24, 25, 30] have studied the effect of reversibility in communication-centric scenarios, as a way to correct faulty computations by bringing back the system to a previous consistent state. In this setting, processes’ behaviours are strongly disciplined by their types, prescribing the actions they have to perform within a session. A session consists of a structured series of message exchanges, whose flow can be controlled via conditional choices, branching and recursion. Correctness of communication is statically guaranteed by a framework based on a (session) type discipline [16]. None of the aforementioned works addresses systems in which the participants can explicitly abort the session, commit a computation and roll it back to a previous checkpoint. In this paper, we aim at filling this gap. We explain below the distinctive aspects of our checkpoint-based rollback recovery approach.

Linguistic Primitives to Explicitly Program Reversible Sessions. We introduce three primitives to: (i) commit a session, preventing undoing the interactions performed so far along the session; (ii) roll back a session, restoring the last saved process checkpoints; (iii) abort a session, to discard the session, and hence all interactions already performed in it, thus allowing another session of the same protocol to start with possible different participants. Notice that most proposals in the literature (e.g., [2,3,4]) only consider an abstract view, as they focus on reversible contracts (i.e., types). Instead, we focus on programming primitives at process level, and use types for guaranteeing a safe and consistent system evolution.

Asynchronous Commits. Our commit primitive does not require a session-wide synchronisation among all participants, as it is a local decision. However, its effect is on the whole session, as it affects the other session participants. This means that each participant can independently decide when to commit. Such flexibility comes at the cost of being error-prone, especially considering that the programmer has not only to deal with the usual forward executions, but also with the backward ones. Our type discipline allows for ruling out programs which may lead to these errors. The key idea of our approach is that a session participant executing a rollback action is interested in restoring the last checkpoint he/she has committed. For the success of the rollback recovery it is irrelevant whether the ‘passive’ participants go back to their own last checkpoints. Instead, if the ‘active’ participant is unable to restore the last checkpoint he/she has created, because it has been replaced by a checkpoint imposed by another participant, the rollback recovery is considered unsatisfactory.

In our framework, programmers are relieved from the burden of ensuring the satisfaction of rollbacks, since undesired situations are prevented at design time (statically) by relying on a compliance check at the type level. To this aim, we introduce \(\mathtt {cherry\text {-}pi}\) (checkpoint-based rollback recovery pi-calculus), a variant of the session-based \(\pi \)-calculus [17, 36] enriched with rollback recovery primitives. We present here a binary version of the calculus, which is more convenient to demonstrate the essence of our rollback recovery approach; the proposed approach can be seamlessly extended to multiparty sessions (see the companion technical report [27] available online). A key difference with respect to the standard binary type discipline is the relaxation of the duality requirement. The types of two session participants are not required to be dual, but they will be compared with respect to a compliance relation (as in [5]), which also takes into account the effects of commit and rollback actions. Such relaxation also involves the requirements concerning selection and branching types, and those concerning branches of conditional choices. The \(\mathtt {cherry\text {-}pi}\) type system is used to infer types of session participants, which are then combined together for the compliance check.

Reversibility in \(\mathtt {cherry\text {-}pi}\) is controlled via two specific primitives: a rollback one telling when a reverse computation has to take place, and a commit one limiting the scope of a potential reverse computation. This implies that the calculus is not fully reversible (i.e., backward computations are not always enabled), leading to have properties that are relaxed and different with respect to other reversible calculi [9, 10, 21, 30]. We prove that \(\mathtt {cherry\text {-}pi}\) satisfies the following properties: (i) a rollback always brings back the system to a previous visited state and (ii) it is not possible to bring the computation back to a point prior to the last checkpoint, which implies that our commits have a persistent effect. Concerning soundness properties, we prove that (a) our compliance check is decidable, (b) compliance-checked \(\mathtt {cherry\text {-}pi}\) specifications never lead to communication errors (e.g., a blocked communication where there is a receiver without the corresponding sender), and (c) compliance-checked \(\mathtt {cherry\text {-}pi}\) specifications never activate undesirable rollbacks (according to our notion of rollback recovery mentioned above). Property (b) resembles the type safety property of session-based calculi (see, e.g., [36]), while property (c) is a new property specifically defined for \(\mathtt {cherry\text {-}pi}\). The technical development of property proofs turns out to be more intricate than that of standard properties of session-based calculi, due to the combined use of type and compliance checking. To demonstrate feasibility and effectiveness of our rollback recovery approach, we have concretely implemented the compliance check using the MAUDE [8] framework (the code is available at https://github.com/tiezzi/cherry-pi).

Outline. Section 2 illustrates the key idea of our rollback recovery approach; Sect. 3 introduces the \(\mathtt {cherry\text {-}pi}\) calculus; Sect. 4 introduces typing and compliance checking; Sect. 5 presents the properties satisfied by \(\mathtt {cherry\text {-}pi}\); Sect. 6 concludes the paper with related and future work. Omitted rules, extension to multiparty sessions, proofs of the results, and a further example are reported on the companion technical report [27].

2 A Reversible Video on Demand Service Example

We discuss the motivations underlying our work by introducing our running example, a Video on Demand (VOD) scenario. The key idea is that a rollback requester is satisfied only if her restored checkpoint was set by herself. In Fig. 1(a), a service (S) offers to a user (U) videos with two different quality levels, namely high definition (HD) and standard definition (SD). After the login, U sends her video request, and receives the corresponding price and metadata (actors, directors, description, etc.) from S. According to this information, U selects the video quality. Then, she receives, first, a short test video (to check the audio and video quality in her device) and, finally, the requested video. If the vision of the HD test video is not satisfactory, U can roll back to her last checkpoint to possibly change the video quality, instead in the SD case U can abort the session.

Fig. 1.
figure 1

VOD example: (a) a full description without commit actions; (b, d) runs with undesired rollback; (c) a run with satisfactory rollback.

Let us now add commit actions as in the run shown in Fig. 1(b). After receiving the price, U commits, while S commits after the quality selection. In this scenario, however, if U activates the rollback, she is unable to go back to the checkpoint she set with her commit action because the actual effect of rollback is to restore the checkpoint set by the commit action performed by S.

In the scenario in Fig. 1(c), instead, S commits after sending the price to U. In this case, no matter who first performed the commit action, the rollback results to be satisfactory. Also if S commits later, the checkpoint of U remains unchanged, as U performed no other action between the two commits. This would not be the case if both U and S committed after the communication of the metadata, as in Fig. 1(d). If S commits before U, no rollback issue arises, but if U commits first it may happen that her internal decision is taken before S commits. In this case, U would not be able to go back to the checkpoint set by herself, and she would be unable to change the video quality.

These undesired rollbacks are caused by bad choices of commit points. We propose a compliance check that identifies these situations at design time.

3 The \(\mathtt {cherry\text {-}pi}\) Calculus

In this section, we introduce \(\mathtt {cherry\text {-}pi}\), a calculus (extending that in [36]) devised for studying sessions equipped with our checkpoint-based rollback recovery mechanism.

Syntax. The syntax of the \(\mathtt {cherry\text {-}pi}\) calculus relies on the following base sets: shared channels (ranged over by a), used to initiate sessions; session channels (ranged over by s), consisting of pairs of endpoints (ranged over, with a slight abuse of notation, by s, \(\bar{s}\)) used by the two parties to interact within an established session; labels (ranged over by l), used to select and offer branching choices; values (ranged over by v), including booleans, integers and strings (whose sorts, ranged over by S, are \(\texttt{bool}\), \(\texttt{int}\) and \(\texttt{str}\), respectively), which are exchanged within a session; variables (ranged over by x, y, z), storing values and session endpoints; process variables (ranged over by X), used for recursion.

Collaborations, ranged over by C, are given by the grammar in Fig. 2. The key ingredient of the calculus is the set of actions for controlling the session rollback. Actions \(\textsf {commit}\), \(\textsf {roll}\) and \(\textsf {abort}\) are used, respectively, to commit a session (producing a checkpoint for each session participant), to trigger the session rollback (restoring the last committed checkpoints) or to abort the whole session. We discuss below the other constructs of the calculus, which are those typically used for session-based programming [15]. A \(\mathtt {cherry\text {-}pi}\) collaboration is a collection of session initiators, i.e. terms ready to initiate sessions by synchronising on shared channels. A synchronisation of two initiators \(\bar{a}(x) . P\) and \(a(y) . Q\) causes the generation of a fresh session channel, whose endpoints replace variables x and y in order to be used by the triggered processes P and Q, respectively, for later communications. No subordinate sessions can be initiated within a running session.

Fig. 2.
figure 2

\(\mathtt {cherry\text {-}pi}\) syntax.

When a session is started, each participant executes a process. Processes are built up from the empty process \(\textbf{0}\) and basic actions by means of action prefix \(\_\,.\,\_\,\), conditional choice \(\textsf {if}\ e\ \textsf {then}\ \_\ \textsf {else}\ \_\), and recursion \(\mu X.\_\). Actions \(x!\langle e\rangle \) and \(y?(z:S)\) denote output and input via session endpoints replacing x and y, respectively. These communication primitives realise the standard synchronous message passing, where messages result from the evaluation of expressions, which are defined by means of standard operators on boolean, integer and string values. Variables that are arguments of input actions are (statically) typed by sorts. There is no need for statically typing the variables occurring as arguments of session initiating actions, as they are always replaced by session endpoints. Notice that in \(\mathtt {cherry\text {-}pi}\) the exchanged values cannot be endpoints, meaning that session delegation (i.e., channel-passing) is not considered. Actions \(x \triangleleft l\) and \(x \triangleright \{l_1\, :\, P_1 ,\ldots ,l_n\, :\, P_n\}\) denote selection and branching (where \(l_1\), ..., \(l_n\) are pairwise distinct).

Example 1

Let us consider the VOD example informally introduced in Sect. 2. The scenario described in Fig. 1(a) with commit actions placed as in Fig. 1(b) is rendered in \(\mathtt {cherry\text {-}pi}\) as \( C_{\texttt {US}} \ = \ \overline{login}(x).\ P_\texttt {U} \ \mid \ login(y).\ P_\texttt {S} \), where:

$$ \begin{array}{rcl} P_\texttt {U} &{} = &{} x!\langle v_{{{{\textsf {\textit{req}}}}}}\rangle . \, x?(x_{{{{\textsf {\textit{price}}}}}}:\texttt{int}) . \, \textsf {commit}. \, x?(x_{{{{\textsf {\textit{meta}}}}}}:\texttt{str}) . \, \textsf {if}\ (f_{{{{\textsf {\textit{eval}}}}}}(x_{{{{\textsf {\textit{price}}}}}},x_{{{{\textsf {\textit{meta}}}}}})) \\ &{}&{} \quad \textsf {then}\ {x} \triangleleft l_{{{{\textsf {\textit{HD}}}}}} . \, x?(x_{{{{\textsf {\textit{testHD}}}}}}:\texttt{str}) . \,\\ &{}&{} \qquad \quad (\textsf {if}\ (f_{{{{\textsf {\textit{HD}}}}}}(x_{{{{\textsf {\textit{testHD}}}}}}))\ \textsf {then}\ x?(x_{{{{\textsf {\textit{videoHD}}}}}}:\texttt{str}) . \textbf{0}\ \textsf {else}\ \textsf {roll}) \\ &{}&{} \quad \textsf {else}\ {x} \triangleleft l_{{{{\textsf {\textit{SD}}}}}} . \, x?(x_{{{{\textsf {\textit{testSD}}}}}}:\texttt{str}) . \,\\ &{}&{} \qquad \quad (\textsf {if}\ (f_{{{{\textsf {\textit{SD}}}}}}(x_{{{{\textsf {\textit{testSD}}}}}}))\ \textsf {then}\ x?(x_{{{{\textsf {\textit{videoSD}}}}}}:\texttt{str}) . \textbf{0}\ \textsf {else}\ \textsf {abort}) \end{array} $$
$$ \begin{array}{rcl} P_\texttt {S} &{} = &{} y?(y_{{{{\textsf {\textit{req}}}}}}:\texttt{str}) . \, y!\langle f_{{{{\textsf {\textit{price}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, y!\langle f_{{{{\textsf {\textit{meta}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \\ &{}&{} y \triangleright \{\ l_{{{{\textsf {\textit{HD}}}}}}\!\, :\, \!\textsf {commit}. \, y!\langle f_{{{{\textsf {\textit{testHD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, y!\langle f_{{{{\textsf {\textit{videoHD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \textbf{0} \ ,\\ &{}&{} \qquad \ \ l_{{{{\textsf {\textit{SD}}}}}}\!\, :\, \!\textsf {commit}. \, y!\langle f_{{{{\textsf {\textit{testSD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, y!\langle f_{{{{\textsf {\textit{videoSD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \textbf{0}\ \} \end{array} $$

Notice that expressions used for decisions and computations are abstracted by relations \(f_{{{{\textsf {\textit{n}}}}}}(\cdot )\), whose definitions are left unspecified. Considering the placement of commit actions depicted in Fig. 1(c), the \(\mathtt {cherry\text {-}pi}\) specification of the service’s process becomes:

$$ \begin{array}{l} y?(y_{{{{\textsf {\textit{req}}}}}}:\texttt{str}) . \, y!\langle f_{{{{\textsf {\textit{price}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \textsf {commit}. \, y!\langle f_{{{{\textsf {\textit{meta}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \\ y \triangleright \{\ l_{{{{\textsf {\textit{HD}}}}}}\!\, :\, \! y!\langle f_{{{{\textsf {\textit{testHD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, y!\langle f_{{{{\textsf {\textit{videoHD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \textbf{0} \ ,\\ \qquad \ \ l_{{{{\textsf {\textit{SD}}}}}}\!\, :\, \! y!\langle f_{{{{\textsf {\textit{testSD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, y!\langle f_{{{{\textsf {\textit{videoSD}}}}}}(y_{{{{\textsf {\textit{req}}}}}})\rangle . \, \textbf{0}\ \} \end{array} $$

Finally, considering the placement of commit actions depicted in Fig. 1(d), the \(\mathtt {cherry\text {-}pi}\) specification of the user’s process becomes:

$$ \begin{array}{l} x!\langle v_{{{{\textsf {\textit{req}}}}}}\rangle . \, x?(x_{{{{\textsf {\textit{price}}}}}}:\texttt{int}) . \, x?(x_{{{{\textsf {\textit{meta}}}}}}:\texttt{str}) . \, \textsf {commit}. \, \textsf {if}\ (f_{{{{\textsf {\textit{eval}}}}}}(x_{{{{\textsf {\textit{price}}}}}},x_{{{{\textsf {\textit{meta}}}}}}))\ \textsf {then}\ \ldots \end{array} $$

Semantics. The operational semantics of \(\mathtt {cherry\text {-}pi}\) is defined for runtime terms, generated by the extended syntax of the calculus in Fig. 3 (new constructs are highlighted by a grey background). We use \(r\) to denote session identifiers, i.e. session endpoints and variables. Those runtime terms that can be also generated by the grammar in Fig. 2 are called initial collaborations.

At collaboration level, two constructs are introduced: \((\nu s:C_1)\ C_2\) represents a session along the channel s with associated starting checkpoint \(C_1\) (corresponding to the collaboration that has initialised the session) and code \(C_2\); \(\langle P_1 \rangle \blacktriangleright P_2\) represents a log storing the checkpoint \(P_1\) associated to the code \(P_2\). At process level, the only difference is that session identifiers \(r\) are used as first argument of communicating actions. We extend the standard notion of binders to take into account \((\nu s:C_1)\ C_2\), which binds session endpoints s and \(\bar{s}\) in \(C_2\) (in this respect, it acts similarly to the restriction of \(\pi \)-calculus, but its scope cannot be extended in order to avoid involving processes that do not belong to the session in the rollback effect). The derived notions of bound and free names (where names stand for variables, process variables and session endpoints), alpha-equivalence, and substitution are standard and we assume that bound names are pairwise distinct. The semantics of the calculus is defined for closed terms, i.e. terms without free variables and process variables.

Fig. 3.
figure 3

\(\mathtt {cherry\text {-}pi}\) runtime syntax (the rest of processes P and expressions e are as in Fig. 2).

Not all processes allowed by the extended syntax correspond to meaningful collaborations. In a general term the processes stored in logs may not be consistent with the computation that has taken place. We get rid of such malformed terms, as we will only consider those runtime terms, called reachable collaborations, obtained by means of reductions from initial collaborations.

The operational semantics of \(\mathtt {cherry\text {-}pi}\) is given in terms of a standard structural congruence \(\equiv \) [17] and a reduction relation \(\rightarrowtail \) given as the union of the forward reduction relation \(\twoheadrightarrow \) and backward reduction relations . The definition of the relation \(\twoheadrightarrow \) over closed collaborations relies on an auxiliary labelled relation \(\xrightarrow {\ell }\) over processes that specifies the actions that processes can initially perform and the continuation process obtained after each such action. Given a reduction relation \(\mathcal {R}\), we will indicate with \(\mathcal {R}^{+}\) and \(\mathcal {R}^{*}\) respectively the transitive and the reflexive-transitive closure of \(\mathcal {R}\).

The operational rules defining the auxiliary labelled relation are in Fig. 4 (omitted rules are in [27]). We use \(k\) to denote generic session endpoints (s or \(\bar{s}\)). Action label \(\ell \) stands for either \(k!\langle v\rangle \), \(k?(x)\), \(k \triangleleft l\), \(k \triangleright l\), \( cmt \), \( roll \), \( abt \), or \(\tau \). The meaning of the rules is straightforward, as they just produce as labels the actions currently enabled in the process. In doing that, expressions of sending actions and conditional choices are evaluated (auxiliary function \(e \downarrow v\) says that closed expression e evaluates to value v).

Fig. 4.
figure 4

\(\mathtt {cherry\text {-}pi}\) semantics: auxiliary labelled relation.

The operational rules defining the reduction relation \(\rightarrowtail \) are reported in Fig. 5 (standard rules for congruence, in the forward and backward case, are omitted). We comment on salient points. Once a session is created, its initiating collaboration is stored in the session construct (rule [F-Con]). Communication, branching selection and internal conditional choices proceed as usual, without affecting logs (rules [F-Com], [F-Lab] and [F-If]). A commit action updates the checkpoint of a session, by replacing the processes stored in the logs of the two involved parties (rule [F-Cmt]). Notably, this form of commit is asynchronous as it does not require the passive participant to explicitly synchronise with the active participant by means of a primitive for accepting the commit. On the other hand, under the hood, a low-level implementation of this mechanism would synchronously update the logs of the involved parties. Conversely, a rollback action restores the processes in the two logs (rule [B-Rll]). The abort action (rule [B-Abt]), instead, kills the session and restores the collaboration stored in the session construct formed by the two initiators that have started the session; this allows the initiators to be involved in new sessions. The other rules simply extend the standard parallel, restriction rules to forward and backward relations.

Fig. 5.
figure 5

\(\mathtt {cherry\text {-}pi}\) semantics: forward and backward reduction relations.

Example 2

Consider the first \(\mathtt {cherry\text {-}pi}\) specification of the VOD scenario given in Example 1. In the initial state \(C_{\texttt {US}}\) of the collaboration, U and S can synchronise in order to initialise the session, thus evolving to \( C_{\texttt {US}}^1 \ = \ (\nu s:C_{\texttt {US}})\ (\langle P_\texttt {U}[ \bar{s} / x ] \rangle \blacktriangleright P_\texttt {U}[ \bar{s} / x ] \ \mid \ \langle P_\texttt {S}[ s / y ] \rangle \blacktriangleright P_\texttt {S}[ s / y ] ) \).

Let us consider now a possible run of the session. After three reduction steps, U executes the commit action, obtaining the following runtime term:

figure b

After four further reduction steps, U chooses the HD video quality and S commits as well; the resulting runtime collaboration is as follows:

$$ \begin{array}{rcl} C_{\texttt {US}}^3 &{} = &{} (\nu s:C_{\texttt {US}})\ (\langle P_\texttt {U}'' \rangle \blacktriangleright P_\texttt {U}'' \ \mid \ \langle P_\texttt {S}'' \rangle \blacktriangleright P_\texttt {S}'' ) \\ P_\texttt {U}'' &{} = &{} \bar{s}?(x_{{{{\textsf {\textit{testHD}}}}}}:\texttt{str}) . \, \textsf {if}\ (f_{{{{\textsf {\textit{HD}}}}}}(x_{{{{\textsf {\textit{testHD}}}}}}))\ \textsf {then}\ \bar{s}?(x_{{{{\textsf {\textit{videoHD}}}}}}:\texttt{str}) . \textbf{0}\ \textsf {else}\ \textsf {roll}\\ P_\texttt {S}'' &{} = &{} s!\langle f_{{{{\textsf {\textit{testHD}}}}}}(v_{{{{\textsf {\textit{req}}}}}})\rangle . \, s!\langle f_{{{{\textsf {\textit{videoHD}}}}}}(v_{{{{\textsf {\textit{req}}}}}})\rangle . \, \textbf{0}\end{array} $$

In the next reductions, U evaluates the test video and decides to revert the session execution, resulting in \( C_{\texttt {US}}^4 \ = \ (\nu s:C_{\texttt {US}})\ (\langle P_\texttt {U}'' \rangle \blacktriangleright \textsf {roll}\ \mid \ \langle P_\texttt {S}'' \rangle \blacktriangleright s!\langle f_{{{{\textsf {\textit{videoHD}}}}}}(v_{{{{\textsf {\textit{req}}}}}})\rangle . \,\textbf{0}) \). The execution of the roll action restores the checkpoints \(P_\texttt {U}''\) and \(P_\texttt {S}''\), that is . After the rollback, U is not able to change the video quality as her own commit point would have permitted; in fact, it holds .

4 Rollback Safety

The operational semantics of \(\mathtt {cherry\text {-}pi}\) provides a description of the functioning of the primitives for programming the checkpoint-based rollback recovery in a session-based language. However, as shown in Example 2, it does not guarantee high-level properties about the safe execution of the rollback. To prevent such undesired rollbacks, we propose the use of compliance checking, to be performed at design time. This check is not done on the full system specification, but only at the level of session types.

Session Types and Typing. The syntax of the \(\mathtt {cherry\text {-}pi}\) session types \(T\) is defined as follows. Type \(![S].T\) represents the behaviour of first outputting a value of sort S (i.e., \(\texttt{bool}\), \(\texttt{int}\) or \(\texttt{str}\)), then performing the actions prescribed by type \(T\). Type \(?[S].T\) is the dual one, where a value is received instead of sent. Types \(\texttt{end}\) and \(\texttt{err}\) represent inaction and faulty termination, respectively. Type \(\triangleleft [l].T\) represents the behaviour that selects the label l and then behaves as \(T\). Type \(\triangleright [l_1\, :\, T_1, \ldots , l_n\, :\, T_n]\) describes a branching behaviour: it waits for one of the n options to be selected, and behaves as type \(T_i\) if the i-th label is selected (external choice). Type \(T_1 \oplus T_2\) behaves as either \(T_1\) or \(T_2\) (internal choice). Type \(\mu t.T\) represents a recursive behaviour. Type \(\texttt{cmt}.T\) represents a commit action followed by the actions prescribed by type \(T\). Finally, types \(\texttt{roll}\) and \(\texttt{abt}\) represent rollback and abort actions.

The \(\mathtt {cherry\text {-}pi}\) type system does not perform compliance checks, but only infers the types of collaboration participants, which will be then checked together according to the compliance relation. Typing judgements are of the form \(C \ \blacktriangleright \ A\), where \(A\), called type associations, is a set of session type associations of the form \(\hat{a}:T\), where \(\hat{a}\) stands for either \(\bar{a}\) or a. Intuitively, \(C \ \blacktriangleright \ A\) indicates that from the collaboration C the type associations in \(A\) are inferred. The definition of the type system for these judgements relies on auxiliary typing judgements for processes, of the form \(\varTheta ;\varGamma \ \vdash \ P \ \blacktriangleright \ \varDelta \), where \(\varTheta \), \(\varGamma \) and \(\varDelta \), called basis, sorting and typing respectively, are finite partial maps from process variables to type variables, from variables to sorts, and from variables to types, respectively. Updates of basis and sorting are denoted, respectively, by \(\varTheta \cdot X:t\) and \(\varGamma \cdot y:S\), where \(X \notin dom(\varTheta )\), \(t \notin cod(\varTheta )\) and \(y \notin dom(\varGamma )\). The judgement \(\varTheta ;\varGamma \ \vdash \ P \ \blacktriangleright \ \varDelta \) stands for “under the environment \(\varTheta ;\varGamma \), process P has typing \(\varDelta \)”. In its own turn, the typing of processes relies on auxiliary judgments for expressions, of the form \(\varGamma \ \vdash \ e \ \blacktriangleright \ S\). The axioms and rules defining the typing system for \(\mathtt {cherry\text {-}pi}\) collaborations and processes are given in Figs. 6 and 7; typing rules for expressions are standard (see [27]). The type system is defined only for initial collaborations, i.e. for terms generated by the grammar in Fig. 2. Other runtime collaborations are not considered here, as no check will be performed at runtime. We comment on salient points. Typing rules at collaboration level simply collect the type associations of session initiators in the collaboration. Rules at process level instead determine the session type corresponding to each process, by mapping each process operator to the corresponding type operator. Data and expression used in communication actions are abstracted as sorts, and a conditional choice is rendered as an internal non-deterministic choice.

Fig. 6.
figure 6

Typing system for \(\mathtt {cherry\text {-}pi}\) collaborations.

Fig. 7.
figure 7

Typing system for \(\mathtt {cherry\text {-}pi}\) processes.

Compliance Checking. To check compliance between pairs of session parties, we consider type configurations of the form \( ( T , T'):\langle \tilde{T}_1 \rangle \blacktriangleright T_2 \parallel \langle \tilde{T}_3 \rangle \blacktriangleright T_4 \), consisting in a pair \((T,T')\) of session types, corresponding to the types of the parties at the initiation of the session, and in the parallel composition of two pairs \(\langle \tilde{T}_c \rangle \blacktriangleright T\), where \(T\) is the session type of a party and \(\tilde{T}_c\) is the type of the party’s checkpoint. We use \(\tilde{T}\) to denote either a type \(T\), representing a checkpoint committed by the party, or \(\underline{T}\), representing a checkpoint imposed by the other party. The semantics of type configurations, necessary for the definition of the compliance relation, is given in Fig. 8, where label \(\lambda \) stands for either \(![S]\), \(?[S]\), \(\triangleleft \, l\), \(\triangleright \, l\), \(\tau \), \(\texttt{cmt}\), \(\texttt{roll}\), or \(\texttt{abt}\). We comment on the relevant rules. In case of a commit action, the checkpoints of both parties are updated, and the one of the passive party (i.e., the party that has not performed the commit) is marked as ‘imposed’ (rule [TS-Cmt]\(_1\)). However, if the passive party did not perform any action from its current checkpoint, this checkpoint is not overwritten by the active party (rule [TS-Cmt]\(_2\)), as discussed in Sect. 2 (Fig. 1(c)). In case of a roll action (rule [TS-Rll]\(_1\)), the reduction step is performed only if the active party (i.e., the party that has performed the rollback action) has a non-imposed checkpoint; in all other situations the configuration cannot proceed with the rollback. Finally, in case of abort (rule [TS-Abt]\(_1\)), the configuration goes back to the initial state; this allows the type computation to proceed, in order not to affect the compliance check between the two parties.

Fig. 8.
figure 8

Semantics of types and type configurations (symmetric rules for configurations are omitted).

On top of the above type semantics, we define the compliance relation, inspired by the relation in [3], and prove its decidability.

Definition 1 (Compliance)

Relation on configurations is defined as follows: holds if for all \(U_1'\), \(T_1'\), \(U_2'\), \(T_2'\) such that \(( T , T'):\langle \tilde{U_1} \rangle \blacktriangleright T_1 \parallel \langle \tilde{U_2} \rangle \blacktriangleright T_2 \longmapsto ^*\) we have that \(T_1'=T_2'=\texttt{end}\). Two types \(T_1\) and \(T_2\) are compliant, written , if .

Theorem 1

Let \(T_1\) and \(T_2\) be two session types, checking if holds is decidable.

This compliance relation is used to define the notion of rollback safety.

Definition 2 (Rollback safety)

Let C be an initial collaboration, then C is rollback safe (shortened roll-safe) if \(C \ \blacktriangleright \ A\) and for all pairs \(\bar{a}:T_1\) and \(a:T_2\) in \(A\) we have .

Example 3

Let us consider again the VOD example. As expected, the first \(\mathtt {cherry\text {-}pi}\) collaboration defined in Example 1, corresponding to the scenario described in Fig. 1(b), is not rollback safe, because the types of the two parties are not compliant. Indeed, the session types \(T_\texttt {U}\) and \(T_\texttt {S}\) associated by the type system to the user and the service processes, respectively, are as follows:

$$ \begin{array}{rcl} T_\texttt {U} &{} = &{} ![\texttt{str}].\, ?[\texttt{int}].\, \texttt{cmt}.\, ?[\texttt{str}].\, (\, \triangleleft [l_{{{{\textsf {\textit{HD}}}}}}].\, ?[\texttt{str}].\, (\, ?[\texttt{str}].\,\texttt{end}\ \oplus \ \texttt{roll}\,)\\ &{}&{}\qquad \qquad \qquad \qquad \qquad \qquad \oplus \ \triangleleft [l_{{{{\textsf {\textit{SD}}}}}}].\, ?[\texttt{str}].\, (\, ?[\texttt{str}].\,\texttt{end}\ \oplus \ \texttt{abt}\,)\,) \\ T_\texttt {S} &{} = &{} ?[\texttt{str}].\, ![\texttt{int}].\, ![\texttt{str}].\, \triangleright [l_{{{{\textsf {\textit{HD}}}}}}\, :\, \texttt{cmt}.\, ![\texttt{str}].\,![\texttt{str}].\,\texttt{end} \ ,\\ &{}&{} l_{{{{\textsf {\textit{SD}}}}}}\, :\, \texttt{cmt}.\, ![\texttt{str}].\,![\texttt{str}].\,\texttt{end}] \end{array} $$

Thus, the resulting initial configuration is \(( T_\texttt {U} , T_\texttt {S}):\langle T_\texttt {U} \rangle \blacktriangleright T_\texttt {U} \parallel \langle T_\texttt {S} \rangle \blacktriangleright T_\texttt {S}\), which can evolve to the configuration \( ( T_\texttt {U} , T_\texttt {S}):\langle \underline{T} \rangle \blacktriangleright \texttt{roll} \parallel \langle U \rangle \blacktriangleright ![\texttt{str}].\texttt{end} \), with \(T= ?[\texttt{str}].\, (?[\texttt{str}].\,\texttt{end}\ \oplus \ \texttt{roll})\) and \(U= ![\texttt{str}].\,![\texttt{str}].\,\texttt{end}\). This configuration evolves to \( ( T_\texttt {U} , T_\texttt {S}):\langle \underline{T} \rangle \blacktriangleright \texttt{err} \parallel \langle U \rangle \blacktriangleright \texttt{err} \), which cannot further evolve and is not in a completed state (in fact, type \(\texttt{err}\) is different from \(\texttt{end}\)), meaning that \({T_\texttt {U}}\) and \({T_\texttt {S}}\) are not compliant.

In the scenario described in Fig. 1(c), instead, the type of the server process is as follows: \( T_\texttt {S}' = \, ?[\texttt{str}].\, ![\texttt{int}].\, \texttt{cmt}.\, ![\texttt{str}].\, \triangleright [l_{{{{\textsf {\textit{HD}}}}}}\, :\, ![\texttt{str}].\,![\texttt{str}].\,\texttt{end} \, ,\, l_{{{{\textsf {\textit{SD}}}}}}\, :\, ![\texttt{str}].\) \(![\texttt{str}].\,\texttt{end}] \) and we have . Finally, the types of the processes depicted in Fig. 1(d) are:

figure m

and we have . Indeed, the corresponding initial configuration can evolve to the configuration \( ( T_\texttt {U}' , T_\texttt {S}''):\langle \underline{\triangleleft [l_{{{{\textsf {\textit{HD}}}}}}].\, \ldots } \rangle \blacktriangleright \texttt{roll} \parallel \langle \triangleright [l_{{{{\textsf {\textit{HD}}}}}}\, :\, \ldots , l_{{{{\textsf {\textit{SD}}}}}}\, :\, \ldots ] \rangle \blacktriangleright ![\texttt{str}].\texttt{end} \), which again evolves to a configuration that is not in a completed state.

MAUDE Implementation. To show the feasibility of our approach, we have implemented the semantics of type configurations in Fig. 8 in the MAUDE framework [8]. MAUDE provides an instantiation of rewriting logic [22] and it has been used to implement the semantics of several formal languages [23].

The syntax of \(\mathtt {cherry\text {-}pi}\) types and type configurations is specified by defining algebraic data types, while transitions and reductions are rendered as rewrites and, hence, inference rules are given in terms of (conditional) rewrite rules. Since MAUDE specifications are executable, we have obtained in this way an interpreter for \(\mathtt {cherry\text {-}pi}\) type configurations, which permits to explore the reductions arising from the initial configuration of two given session types.

Our implementation consists of two MAUDE modules. The CHERRY-TYPES-SYNTAX module provides the definition of the sorts that characterise the syntax of \(\mathtt {cherry\text {-}pi}\) types, such as session types, selection/branching labels, type variables and type configurations. In particular, basic terms of session types are rendered as constant operations on the sort \(\texttt {Type}\); e.g., the roll type is defined as

figure o

The other syntactic operators are instead defined as operations with one or more arguments; e.g., the output type takes as input a \(\texttt {Sort}\) and a continuation type:

figure p

To prevent undesired rewrites inside operator arguments, following the approach in [33], we have declared these operations as \(\texttt {frozen}\). The \(\texttt {prec}\) attribute has been used to define the precedence among operators. The CHERRY-TYPES-SEMANTICS module provides rewrite rules, and additional operators and equations, to define the \(\mathtt {cherry\text {-}pi}\) type semantics. For example, the operational rule [TS-Snd] is rendered as follows:

figure q

The correspondence between the operational rule and the rewrite rule is one-to-one; the only peculiarity is the fact that, since rewrites have no labels, we have made the transition label part of the resulting term. Reduction rules for type configurations are instead rendered in terms of conditional rewrite rules with rewrites in their conditions. For example, the [TS-Com] rule is rendered as:

figure r

Again, there is a close correspondence between the operational rule and the rewrite one.

The compliance check between two session types can be then conveniently realised on top of the implementation described above by resorting to the MAUDE command \(\texttt {search}\). This permits indeed to explore the state space of the configurations reachable from an initial configuration. Specifically, the compliance check between types \(\texttt {T1}\) and \(\texttt {T2}\) is rendered as follows:

figure s

This command searches for all terminal states (=>!), i.e. states that cannot be rewritten any more (see in Definition 1), and checks if at least one of the two session types in the corresponding configurations (\(\texttt {T1'}\) and \(\texttt {T2'}\)) is different from the \(\texttt {end}\) type. Thus, if this search has no solution, \(\texttt {T1}\) and \(\texttt {T2}\) are compliant; otherwise, they are not compliant and a violating configuration is returned.

Example 4

Let us consider the \(\mathtt {cherry\text {-}pi}\) types defined in Example 3 for the scenario described in Fig. 1(b). In our MAUDE implementation of the type syntax, the session types \(T_\texttt {U}\) and \(T_\texttt {S}\), and the corresponding initial type configuration, are rendered as follows:

figure u

where \(\texttt {(+)}\) represents the internal choice operator, \(\texttt {sel}\) the selection operator, \(\texttt {brn}\) the branching operator, \(\texttt {brnEl}\) an option offered in a branching, and \(\texttt {ckp}\) a non-imposed checkpoint. The compliance between the two session types can be checked by loading the two modules of our MAUDE implementation, and executing the following command:

figure v

This \(\texttt {search}\) command returns the following solution:

figure w

As explained in Example 3, the two types are not compliant. Indeed, the configuration above is a terminal state, and \(\texttt {T1}\) and \(\texttt {T2}\) are clearly different from \(\texttt {end}\).

The scenario in Fig. 1(c) is rendered by the following implementation of the service type:

figure x

In this case, as expected, the \(\texttt {search}\) command returns:

figure y

meaning that types \(\texttt {Tuser}\) and \(\texttt {Tservice'}\) are compliant. Finally, the \(\texttt {search}\) command applied to the type configuration related to the scenario depicted in Fig. 1(d) returns a solution, meaning that in that case the user and service types are not compliant.

5 Properties of \(\mathtt {cherry\text {-}pi}\)

This section presents the results regarding the properties of \(\mathtt {cherry\text {-}pi}\). The statement of some properties exploits labelled transitions that permit to easily distinguish the execution of commit and rollback actions from the other ones. To this end, we can instrument the reduction semantics of collaborations by means of labels of the form \( cmt \,s\), \( roll \,s\) and \( abt \,s\), indicating the rule used to derive the reduction and the session on which such operation has been done.

Rollback Properties. We show some properties concerning the reversible behaviour of \(\mathtt {cherry\text {-}pi}\) related to the interplay between rollback and commit primitives. The first two properties, namely Theorem 2 and Lemma 1, are an adaptation of typical properties of reversible calculi, while Lemma 2 and Lemma 3 are brand new.

The following theorem states that any reachable collaboration is also a forward only reachable collaboration. This means that all the states a collaboration reaches via mixed executions (also involving backward reductions) are states that we can reach from the initial configuration with just forward reductions. This assures us that if the system goes back it will reach previous visited states.

Theorem 2

Let \(C_0\) be an initial collaboration. If \(C_0 \rightarrowtail ^{*}C_1\) then \(C_0 \twoheadrightarrow ^{*}C_1\).

We now show a variant of the so-called Loop Lemma [10]. In a fully reversible calculus this lemma states that each computational step, either forward or backward, can be undone. Since reversibility in \(\mathtt {cherry\text {-}pi}\) is controlled, we have to state that if a reversible step is possible (e.g., a rollback is enabled) then the effects of the rollback can be undone.

Lemma 1 (Safe rollback)

Let \(C_1\) and \(C_2\) be reachable collaborations. If then \(C_2 \twoheadrightarrow ^{*} C_1\).

A rollback always brings the system to the last taken checkpoint. We recall that, since there may be sessions running in parallel, a collaboration may be able to do different rollbacks within different sessions. Thus, determinism only holds relative to a given session, and rollback within one session has no effect on any other parallel session.

Lemma 2 (Determinism)

Let C be a reachable collaboration. If and then \(C' \equiv C''\).

Fig. 9.
figure 9

\(\mathtt {cherry\text {-}pi}\) semantics: error reductions.

The last rollback property states that a collaboration cannot go back to a state prior to the execution of a commit action, that is commits have a persistent effect. Let us note that recursion does not affect this theorem, since at the beginning of a collaboration computation there is always a new session establishment, leading to a stack of past configurations. Hence it is never the case that from a collaboration C you can reach again C via forward steps.

Theorem 3 (Commit persistency)

Let C be a reachable collaboration. If \(C {\mathop {\twoheadrightarrow }\limits ^{cmt \,s}} C'\) then there exists no \(C''\) such that and \(C'' \twoheadrightarrow ^{+} C\).

Soundness Properties. The second group of properties concerns soundness guarantees. The definition of these properties requires formally characterising the errors that may occur in the execution of an unsound collaboration. We rely on error reduction (as in [7]) rather than on the usual static characterisation of errors (as, e.g., in [36]), since rollback errors cannot be easily detected statically. In particular, we extend the syntax of \(\mathtt {cherry\text {-}pi}\) collaborations with the \(\texttt {roll\_error}\) and \(\texttt {com\_error}\) terms, denoting respectively collaborations in rollback and communication error states:

figure ad

where \(\tilde{P}\) denotes either a checkpoint P committed by the party or a checkpoint \(\underline{P}\) imposed by the other party of the session. The semantics of \(\mathtt {cherry\text {-}pi}\) is extended as well by the (excerpt) of error reduction rules in Fig. 9. The error semantics does not affect the normal behaviour of \(\mathtt {cherry\text {-}pi}\) specifications, but it is crucial for stating our soundness theorems. Its definition is based on the notion of barb predicate: \(P\Downarrow _{\mu }\) holds if there exists \(P'\) such that \(P \Rightarrow P'\) and \(P'\) can perform an action \(\mu \), where \(\mu \) stands for k?, k! ,\(k\triangleleft l\), \(k\triangleright l\), or roll (i.e., input, output, select, branching action along session channel k, or roll action); \(\Rightarrow \) is the reflexive and transitive closure of \(\xrightarrow {\tau }\). The meaning of the error semantics rules is as follows. A communication error takes place in a collaboration when a session participant is willing to perform an output but the other participant is ready to perform neither the corresponding input nor a roll back (rule [E-Com]\(_1\)) or vice versa, or one participant is willing to perform a selection but the corresponding branching is not available on the other side or viceversa. Instead, a rollback error takes place in a collaboration when a participant is willing to perform a rollback action but her checkpoint has been imposed by the other participant ([E-Rll]\(_2\)). To enable this error check, the rules for commit and rollback have been modified to keep track of imposed overwriting of checkpoints. This information is not relevant for the runtime execution of processes, but it is necessary for characterising the rollback errors that our type-based approach prevents.

Besides defining the error semantics, we also need to define erroneous collaborations, based on the following notion of context: \( \mathbb {C}\ {::}= \ [\cdot ] \,\, \mid \,\, \mathbb {C}\!\mid \! C \,\, \mid \,\, (\nu s:C)\ \mathbb {C}\).

Definition 3 (Erroneous collaborations)

A collaboration C is communication (resp. rollback) erroneous if \(C = \mathbb {C}[\texttt {com\_error}]\) (resp. \(C = \mathbb {C}[\texttt {roll\_error}]\)).

The key soundness results follow: a rollback safe collaboration never reduces to either a rollback erroneous collaboration (Theorem 4) or a communication erroneous collaboration (Theorem 5).

Theorem 4 (Rollback soundness)

If C is a roll-safe collaboration, then we have that \(C\ {\rightarrowtail \!\!\!\!\!\!/\ }^*\ \mathbb {C}[\texttt {roll\_error}]\).

Theorem 5 (Communication soundness)

If C is a roll-safe collaboration, then we have that \(C\ {\rightarrowtail \!\!\!\!\!\!/\ }^*\ \mathbb {C}[\texttt {com\_error}]\).

We conclude with a progress property of \(\mathtt {cherry\text {-}pi}\) sessions: given a rollback safe collaboration that can initiate a session, each collaboration reachable from it either is able to progress on the session with a forward/backward reduction step or has correctly reached the end of the session. This result follows from Theorems 4 and 5, and from the fact that we consider binary sessions without delegation and subordinate sessions.

Theorem 6 (Session progress)

Let \(C=(\bar{a}(x_1) . P_1 \mid a(x_2) . P_2)\) be a roll-safe collaboration. If \(C \!\rightarrowtail ^*\! C'\) then either \(C' \!\rightarrowtail \! C''\) for some \(C''\) or \(C' \!\equiv \! (\nu s\!:\!C) (\langle \tilde{Q_1\!} \rangle \blacktriangleright \! \textbf{0}\!\mid \! \langle \tilde{Q_2\!} \rangle \blacktriangleright \! \textbf{0})\) for some \(\tilde{Q_1}\) and \(\tilde{Q_2}\).

6 Conclusion and Related Work

This paper proposes rollback recovery primitives for session-based programming. These primitives come together with session typing, enabling a design time compliance check which ensures checkpoint persistency properties (Lemma 1 and Theorem 3) and session soundness (Theorems 4 and 5). Our compliance check has been implemented in MAUDE.

In the literature we can distinguish two ways of dealing with rollback: either using explicit rollbacks and implicit commits [20], or by using explicit commits and spontaneous aborts [11, 34]. Differently from these works, we have introduced a way to control reversibility by both triggering it and limiting its scope. Reversibility is triggered via an explicit rollback primitive (as in [20]), while explicit commits limit the scope of potential future reverse executions (as in [11, 34]). Differently from [11, 34], commit does not require any synchronisation, as it is a local decision. This could lead to run-time misbehaviours where a process willing to roll back to its last checkpoint reaches a point which has been imposed by another participant of the session. Our type discipline rules out such cases.

Reversibility in behavioural types has been studied in different formalisms: contracts [2, 4], binary session types [24], multiparty session types [6, 25, 30, 31], and global graphs [14, 28]. In [2, 4] choices can be seen as implicit checkpoints and the system can go back to a previous choice and try another branch. In [2] rollback is triggered non-deterministically , while in [4] it is triggered by the system only when the computation is stuck. In both works reversibility (and rollbacks) is used to achieve a relaxed variant of client-server compliance: if there exists an execution in which the client is able to terminate then the client and server are compliant. Hence, reversibility is used as a means to explore different branches if the current one leads to a deadlock. In [24] reversibility is studied in the context of binary session types. Types information is used at run-time by monitors, for binary [24] and multiparty [25] settings, to keep track of the computational history of the system. allowing to revert any computational step. where global types are enriched with computational history. There, reversibility is uncontrolled, and each computational step can be undone. In [6] global types are enriched with history information, and choices are seen as labelled checkpoints. The information about checkpoints is projected into local types. At any moment, the party who decided which branch to take in a choice may decide to revert it, forcing the entire system to go back to a point prior to the choice. Hence, rollback is confined inside choices and it is spontaneous. meaning that the former can be programmed while the latter cannot. Checkpoints are not seen as commits, and a rollback can bring the system to a state prior to several checkpoints. In [30] an uncontrolled reversible variant of session \(\pi \)-calculus is presented, while [31] studies different notions of reversibility for both binary and multiparty single sessions. In [14, 28] global graphs are extended with conditions on branches. These conditions at runtime can trigger coordinated rollbacks to revert a distributed choice. Reversibility is confined into branches of a distributed choice and not all the computational steps are reversible; inputs, in fact, are irreversible unless they are inside an ongoing loop. to trigger a rollback several conditions and constraints about loops have to be satisfied. Hence, in order to trigger a rollback a runtime condition should be satisfied.

We detach from these works in several ways. Our checkpoint facility is explicit and checkpointing is not relegated to choices: the programmer can decide at any point when to commit. This is because the programmer may be interested in committing, besides choice points, a series of interactions (e.g., to make a payment irreversible). Once a commit is taken, the system cannot revert to a state prior to it. Our rollback is explicit, meaning that it is the programmer who deliberately triggers a rollback. The extension to the multiparty setting is natural and does not rely on a formalism to describe the global view of the system. Our compliance check, which is decidable, resembles those of [2,3,4], which are defined for different rollback recovery approaches based on implicit checkpoints.

As future work, we plan to extend our approach to deal with sessions where parties can interleave interactions performed along different sessions. This requires to deal with subordinate sessions, which may affect enclosing sessions by performing, e.g., commit actions that make some interaction of the enclosing sessions irreversible, similarly to nested transactions [35]. To tackle this issue it would be necessary to extend the notion of compliance relation to take into account possible partial commits (in case of nested sub-sessions) that could be undone at the top level if a rollback is performed. Also, the way our checkpoints are taken resembles the Communication Induced Checkpoints (CIC) approach [12]; we leave as future work a thoughtful comparison between these two mechanisms.