1 Introduction

Modern software development contains a big share of reusing previously developed software called services. Often these services are developed by third party companies and supplied as Web Service. Our work addresses the topic composability analysis for replaceability, compatibility, and process conformance, which was identified as a research challenge in [29].

While stateless services have no restrictions on the order of the call of operations of the interface, stateful services may restrict this order. For example, a file service may expect that a file is first opened for reading, then read operations may follow, and finally the file must be closed.

Such a required client behaviour needs to be obeyed during the execution of the service. If the client differs the service usage from the provided service protocol, the application might crash and cause a chain of events leading to unwished behaviour. In the worst-case scenario a problem within one service is appearing only for the current specific configuration may lead to the crash of the complete component-based software or service-oriented system.

Hence, our goal is to check automatically whether in a service composition each stateful service is used correctly. The correct usage of a stateful service must be specified by a service protocol and should be published together with the service interface. Therefore, it must be proven that the behaviour of clients of a service never violate the defined service protocol. A violation of a service protocol means that its operations are called not compliant to the protocol. Usually, these protocols for stateful services are specified as a finite state machine.

An automatic verification approach must consider the sequences of operation calls to a stateful services. Therefore, for a service composition, it must be checked whether the set of sequences of possible operation calls to a stateful service is a subset of the set of legal operation calls specified by the protocol. This kind of condition is called trace inclusion.

Remark 1

Note that in the literature on component-based and service-oriented architectures, there are also other kind of protocols, the interaction protocols, see e.g. [9]. Interaction protocols specify the behaviour of a service by request and reply messages, i.e., an abstraction service interaction interface is modelled. Thus, their primary focus is to model the receivable as well as the triggered interactions of a service. Interaction protocol checking examines whether for each request there is a reply message and vice versa, i.e., a bisimulation relation must be satisfied.

Service protocols consider a service interface as an application programming interface (API), i.e., the operations specified in an interface description are atomic units. Their primary focus is to model the set of legal sequences of operations calls. Service protocol conformance checking checks whether all sequences of operation calls to a service are legal, i.e. trace inclusion (or alternatively, a simulation relation) must be satisfied.

Service protocol conformance checking requires to know the behaviour of services. Often, service providers do not want to provide their code, business process etc., of the service’s implementation. Thus, abstractions of the behaviour of the services are necessary, i.e., each real behaviour corresponds to an abstract behaviour but not necessarily vice versa. These abstractions should be automatically be derived and published together with the service description. Therefore, the basis for this abstraction is the source code of the service implementation, usually written in higher-level programming languages as e.g. Java, C# etc. Therefore, the important language concepts have to be considered since everything possibility in a programming language is used—whether it makes sense in a certain context or not. In our previous work [3, 5], we modelled abstractions for the classical control structures such as loops, conditionals, sequential execution of statements etc. In particular, we considered recursion and concurrency without any restrictions on recursion depth or the number of concurrent threads. In this article, we consider in addition exception handling concepts.

Currently, abstractions are often specified using Petri-Nets (see e.g. [34, 39]), pushdown systems (see e.g. [10, 11]), finite state machines (see e.g. [28, 38]), or process-algebras (see e.g. [9]). Finite state machines only allow an adequate modelling of bound concurrency and bound recursion. This means that the number of parallel threads and the recursion depth are bound by a constant, respectively. If recursion is present within the component or the application, the language of interactions is not regular but context-free [40]. Therefore, it requires a pushdown automaton to describe all sequences of interactions, i.e., finite state machine or those process-algebras not taking into account sequential recursion cannot model it. Furthermore, if recursive callbacks are present, protocol conformance checking based on finite state machine abstractions may lead to false positives [40]. Unbound recursion can be adequately modelled by pushdown systems, but there is no adequate modelling of unbound concurrency. Petri-Nets may model adequately unbound concurrency but not unbound recursion. However, recursive Petri-nets may deal (cf., [20]) with unbound concurrency and unbound recursion. With some process algebras, it is possible to model unbound recursion and unbound parallelism including synchronization. This leads to a model equivalent to a Turing Machine [20] (or a coloured Petri-Net with an infinite number of colours) and is therefore not well-suited for analysis tools. In order to provide safe protocol conformance checking, a conservative representation of the actual component behaviour is required. Consequently the abstraction layer has to be capable of representing unbound recursion, unbound parallelism as well as exception handling, such that it can still be checked.

Protocol Conformance Checking verifies the protocol w.r.t. these abstractions. Protocol conformance implies that there is no protocol violation. However, it is possible that protocol conformance cannot be proven although there is no protocol violation in the real behaviour (false alarms). Furthermore, service may call operations of other services. Hence, protocol violations may occur in services not directly used by the client.

In order to take into account indirect uses of services, it is necessary to consider the behaviour of services. On the one hand, services implementations fully specify this behaviour. On the other hand, services providers may wish to keep secret their implementation, e.g., because it may contain business secrets. Thus, an abstract behaviour of services that hide implementation decisions and business secrets should be published. A protocol conformance checker composes the abstract behaviour of each service according to the architecture of the system such that it keeps track of the calls to operations provided by all services [5].

If this abstraction is too coarse-grained, a large number of false alarms may be produced. In our work, the abstract behaviour completely abstracts from data, but under this restriction the control flow should be modelled precisely. In particular, if recursive callbacks are present in the composed application it should be present in the abstraction, otherwise some protocol violations might not be discovered [40]. This approach follows the idea that everything that is possible should be considered. Often, it is argued that recursive callbacks are not present in the service-oriented context. However, there are candidates for recursive callbacks such as for example a map-reduce service [1].

It is not necessary to specify manually the abstract behaviour of a service implementation, since it can be automatically derived from the source code of the implementation using standard compiler technology [3, 5].

Both and Zimmermann [5] show that using Mayr’s process rewrite systems [26], both, recursion and parallelism, can be adequately modelled. However, the protocol conformance checking problem becomes undecidable. An approximation for this protocol conformance checking problem is shown in [5] and its feasibility was demonstrated. None of these works consider exception handling. However, in modern programming languages interactions can also be initiated by exceptions. Hence, their characteristic behaviour triggered by conditional execution of blocks as well as the execution of the finally-block might lead to service protocol violations. In particular, in the case of chains of interface calls, a protocol violation might be raised somewhere in the component-based system at the direct interfaces of the component raising the exception.

Hence, simplified abstractions are not acceptable for application in component systems. Therefore, it is important to tackle these exceptions to ensure a rugged composition of services. Our main contributions are:

  1. (i)

    showing that the previous concepts cannot deal with exceptions adequately

  2. (ii)

    providing an (automatic) abstraction of exception handling that can be combined with abstractions of other programming language concepts such as procedure call and return, forking and synchronizing parallel processes, loops, conditional statements, statement sequences,

  3. (iii)

    and showing an approach for protocol conformance checking based on this abstraction.

Section 2 defines process rewrite systems, protocols and provides a running example. In Sect. 3, the abstraction of exception semantics to process rewrite systems is demonstrated. Section 4 shows how to check protocol conformance. Section 5 discusses related work.

2 Preliminaries

This section introduces our service model, its execution semantics including exception handling, and summarizes [35]. In particular, it is shown that the execution semantics naturally corresponds to Mayr’s process rewrite systems [26].

A service \(s\) provides an interface \(I_s\) where an interface is a set of type descriptions and procedure signatures with exceptions that may be raised during execution. The implementation of \(s\) may call procedures of other services. The required interface \(R_s\) of \(s\) is the set of procedures of other services called by \(s\). A service-oriented system is a directed graph \(S\triangleq (WS,C)\) where \(WS\) is a set of services such that each service \(s \in WS,\,p \in R_s\) there is an edge \((s,s') \in C\) with \(p \in I_{s'}\). Hence, any call leaving existing service \(s \in WS\) calls a procedure of another service \(s' \in WS\). Consequently, if a service implementation might lead to a call of another procedure, it is represented within \(S\).

There are two kinds of procedures in interfaces, asynchronous and synchronous procedures. If a synchronous procedure is called, the caller waits until the callee is completed. If an asynchronous procedure is called, the caller and the callee concurrently continue their execution. A synchronize statement \(\mathbf{sync}~f\) is a barrier, i.e. the execution waits until the last asynchronous call of \(f\) is completed. Before a (synchronous or asynchronous) procedure \(p\) returns, all asynchronous procedures called by \(p\) must be completed.

For the implementation of the services, programming languages such as e.g. Java, C#, or BPEL can be used. For the purpose of this article, the complete consideration of all programming language concepts would be too much. Instead we consider the most important concepts such as loops (with the classical semantics of while-loops), conditionals, sequential execution of statements, a synchronization statement, synchronous and asynchronous procedures, and exception handling. Loops, conditional statements, and sequential execution of statements have the standard semantics. Synchronous and asynchronous procedure calls and the synchronization have a semantics as described above.

In contrast to these concepts, exception handling has a rather complex semantics: The statement \(\mathbf{raise}~E\) raises the exception \(E\). This means the execution is being interrupted, i.e., it is not being continued by the execution of the next statement. If the exception \(E\) has been raised outside of a try-block, then the current procedure stops with the exception \(E\), i.e., the corresponding call raises \(E\). A try statement

is executed as follows (this is according to Java, C#, BPEL): The statements in the try-block are being executed as usual. If an exception \(E\) is raised, then the block of the first exception handler when \(E_i \{ \cdots \}\) with \(E_i=E\) is executed.Footnote 1 If \(E\) is different from any exceptions in the exception handlers, the try statement terminates with exception \(E\). Note that asynchronous procedure calls within a try-block, an exception handler, or a finally-block must be synchronized when the block is left. Without loss of generality, we assume that the last statement of try-block synchronizes asynchronous procedure calls within the block—either by a synchronize or by a return statement.

However, the execution of the try statement definitely finishes with the execution of the finally-block, no matter what happens inside the try-block or the exception handler. Hence, a return statement or raising an unhandled exception within a try-block or an exception handler is earliest being executed after the finally-block has been executed. However, if the finally-block executes a return statement or raises an exception, then it returns from the current procedure or ends with an exceptional state without executing any open return or raise statement.

Example 1

Figure 1 shows a service-oriented system consisting of three services \(s_1,\,s_2\), and \(s_3\) with provided interfaces \(I_1,\,I_2\), and \(I_3\), respectively. The provided interfaces are shown by circles, the required interfaces are visualized by opened circles and service bindings are visualized by arrows. The symbols \(q_i\) represent program points indicating each statement. Procedure \(b\) of interface \(I_2\) is the sole asynchronous procedure. Every other procedure is synchronous. The execution starts with calling \(main\) of service \(s_1\). The return statement at program point \(q_{10}\) in Fig. 1 would not be executed if the finally block would be replaced by

The reason is that before returning by the return statement at \(q_{10}\), the finally block is being executed and this execution executes \(q_{18}\). If there would be a return statement, then this would be the return from procedure \(a\).

Fig. 1
figure 1

A service-oriented system

Remark 2

The semantics of loops, conditional statements, sequential execution, and the synchronous procedure call (and return) is surprisingly uniform across different programming languages. The semantics of exception handling – including the finally-statement– is according to Java, cf. Chapter 11.3 of [19]. The try statement with a finally clause of C# and .NET has an analogous semantics. We do not consider function calls as they could be transformed into procedure calls with result parameters. Similarly, other control structures such as different loops, switch statements can be transformed into while-loops and conditionals, respectively. Such transformations are often applied in compilers for intermediate code generation.

Remark 3

The interface \(I_s\) of a service can be specified using WSDL. The exceptions are specified by the fault-part in the operations. There are several approaches on Web Service Composition with exception handling, see e.g. [18, 23]. In particular, the stub generated from a WSDL interface description of a service \(s\) might raise exceptions that can be handled by the client using \(s\).

This article assumes that a protocol of a service \(s\) is given by a finite state machine \(A_s \triangleq (\varSigma _{s},R_{s},\rightarrow _{s},r_{0}^{s},F_{s})\) where \(\varSigma _{s}\) denotes the set of operations symbols in the interface of service \(s,\,R_s\) are the states of \(s,\,r_{0}^{s}\) is the state when \(s\) is started by a client, and \(F_s\) is the set of final states. A final state must be reached when the client finishes the use of \(s\). Thus, the language \(L(A_s)\) accepted by the finite state machine \(A_s\) defines the set of legal sequences of operation calls to \(s\). Hence, a service protocol contains only operation calls available in the service description. In particular, a software architect is enabled here to define a required behaviour without knowing (or specifying) the service implementation. This is a main distinction in comparison with interaction protocols [9]. Figure 2 shows the state diagrams of protocols of the services in Fig. 1 which are used as an example throughout this article (final states are indicated by squares and initial states by an arrow without source). If the protocols \(A_2\) or \(A_3\) reach state \(\mathsf{E}\), then a protocol violation occurs. Here, \(A_2\) permits only calls of \(b\) followed by a least one call of \(c\); all other sequences lead to an error state.

Fig. 2
figure 2

Protocols of the services in Fig. 1

Example 2

Figure 3 shows a possible execution of the service-oriented system in Fig. 1 when the value 1 is read. Steps (1) and (2) are synchronous calls. The program point after the call is pushed onto the runtime stack. Step (3) is an asynchronous call. In this case, the stack forks, i.e., it has now two branches: one for the caller and one for the callee. This kind of runtime structure is called a cactus stack. [13, 21] showed that a cactus stack can be used as a runtime system for concurrent processes. Note that each stack in a cactus stack might be maintained on a different processor. Hence, cactus stacks are a well-suited as a logical runtime model for distributed systems such as service compositions. Thus, in Fig. 3, the cactus stacks after steps (5) and (10) could be maintained by three processors.

Fig. 3
figure 3

A protocol violation in Fig. 1

The call of \(b(1)\) raises exception \(\mathsf{Exc}_1\). In a step, all interleavings are possible, i.e. any top element of a stack in the cactus stack can be taken for the next step. Suppose the call \(c(1)\) (Step 4) is taken. Then, in (5), \(b(1)\) is called, which also raises \(\mathsf{Exc}_1\). Since this exception is not handled, the synchronize statement results in \(\mathsf{Exc}_1\) (Step (6)). Step (7) shows that now the exception handler in the body of \(c\) is being executed and this calls \(f(0)\). Then – as above – \(a(0)\) is called. Note that the branch from \(q_9\) cannot be removed until \(q_9\) is on the top of the (main-) stack. This execution demonstrates that protocol \(A_2\) is violated because service \(s_2\) receives the calls \(b(1),\,c(1),\,b(1),\,b(0)\), i.e., \(A_2\) will be in state \(E\) after this sequence. Furthermore, this sequence stems from the execution of an exception handler.

Thus, an abstract state can be represented as a cactus stack of program points and exception states and an abstract semantics transforms cactus stacks into cactus stacks. The transformations are (i) changing a state on one of the top stack elements, (ii) pushing a state on one of the stacks, (iii) poping a state from one of stacks, (iv) forking to a new stack, (v) synchronizing two stacks (i.e., waiting for a forked stack to be emptied).

According to [3], there is a one-to-one correspondence between cactus stacks and process-algebraic expressions. The set \({\textit{PEX}}(Q)\) of process-algebraic expressions over a finite set \(Q\) (atomic processes) is the smallest set satisfying:

  1. (i)

    \(Q \subseteq ~{\textit{PEX}}(Q)\)

  2. (ii)

    If \(e,e' \in {\textit{PEX}}(Q)\), then \(e.e' \in {\textit{PEX}}(Q)\) and \(e \parallel e' {\in } {\textit{PEX}}(Q)\) (sequential and parallel composition, respectively).

  3. (iii)

    \(\varepsilon \in {\textit{PEX}}(Q)\)

The empty process, denoted by \(\varepsilon \), is the identity w.r.t. sequential and parallel composition.

For example, the cactus stack in Fig. 3 after step (10) can be represented by the process-algebraic expression \((((q_8 \!\!\parallel \!\!q_{22}).q_{35}.q_{31}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{35}.q_3\).

Hence, the transformations of cactus stacks can be represented by rewrite rules for process-algebraic expressions, as demonstrated in Fig. 4. These rules are a short summary of our previous work [4, 5]. The rewrite rules are labelled with the name of the operation provided by a service if an external service is being called while internal calls or control logic (e.g., thread operations) are labelled with \(\lambda \). The reason is that protocol conformance checking only considers interaction sequences between services (w.r.t. the considered interaction protocol) but not internal procedure calls within a service.

Fig. 4
figure 4

Cactus stack transformations/process rewrite rules for abstract semantics

A process rewrite system (short: PRS) is a tuple \(\varPi \triangleq (\varSigma , Q, \rightarrow , q_0, F)\) where

  1. (i)

    \(Q\) is a finite set (atomic processes),

  2. (ii)

    \(\varSigma \) is a finite alphabet disjoint from \(Q\) (actions),

  3. (iii)

    \(q_0 \in Q\) (the initial state),

  4. (iv)

    \(\rightarrow \subseteq ~{\textit{PEX}}(Q) \times (\varSigma \uplus \{\lambda \}) \times ~{\textit{PEX}}(Q)\) is a set of process rewrite rules (\(\lambda \in \varSigma ^{*}\) is the empty word),

  5. (v)

    \(F \subseteq Q \cup \{ \varepsilon \}\) (the set of final processes).

The PRS \(\varPi \) defines a derivation relation \(\Rightarrow \subseteq ~{\textit{PEX}}(Q) \times \varSigma ^* \times ~{\textit{PEX}}(Q)\,(\varSigma ^{*}\) is the set of all finite words over \(\varSigma \)) by the inference rules in Fig. 5. The set \( L(\varPi ) \triangleq \{ w \in \varSigma ^{*} : \exists f \in F \bullet q_0 {\mathop {\Rightarrow }\limits ^{w}} f \} \) is the language accepted by the PRS \(\varPi \).

Fig. 5
figure 5

Inference rules for the definition of the derivation relation in PRSs

Remark 4

The second inference rules implies that rewrite rules can only applied to the top of one of the stacks in a cactus stack.

A service-oriented system \(S\) is abstracted to a process rewrite system \(\varPi _S \triangleq (\varSigma ,Q,\rightarrow ,q_0,F)\) where \( \varSigma = \bigcup _{s \in S} \varSigma _s\) is the set of all procedures in the interface descriptions of the services of \(S,\,Q \triangleq ~PP \cup ~Exceptions,\,PP\) is the set of program points, \(Exceptions~\triangleq \{ q_E : E \text { is an exception} \},\,\rightarrow \) is defined by as in Fig. 4, \(q_0\) is the program point where \(S\) starts, \(F \triangleq Q_F \cup ~Exceptions\), and \(Q_F\) is the set of program points where the main program returns (i.e., the execution of the program terminates). Exceptional states are final because a program may terminate in an exceptional state.

Remark 5

In Fig. 4 there is slight difference to [4, 5]: it uses \(q {\mathop {\rightarrow }\limits ^{\lambda }} \varepsilon \) for a procedure. Furthermore, [5] shows a compositional construction of \(\varPi _S\). For each \(s \in S\), a PRS \(\varPi _s\) is automatically derived from the implementation of \(s\) by using classical compiler technology, and these PRSs are glued together to obtain \(\varPi _S\). For reasons of space, this construction has been omitted. \(L(\varPi _S)\) contains all possible interaction sequences between services that may happen during execution. Note that programming language concepts expressing fork–join parallelism (as e.g., in BPEL) can be abstracted analogously to calling asynchronous procedures and synchronize with them, respectively.

The use of a service \(s\) in service-oriented system \(S\) is defined as a PRS \(U_s \triangleq (\varSigma _s,Q,\rightarrow _s, q_0,F)\) where \(\varSigma _s\) is the set of operations in the provided interface of \(s,\,Q,\,q_0\), and \(F\) are defined as above, and \(\rightarrow _s\) is defined as \(\rightarrow \) except that all labels \(l \not \in \varSigma _s\) are replaced by \(\lambda \). The protocol conformance checking problem checks for each service \(s\) of \(S\) whether \(L(U_s) \subseteq L(A_s)\).

Mayr [26] classified the process rewrite rules according to the class of process-algebraic expressions on the left-hand side and right-hand side, respectively: Class 1 allows only single states, class \(S\) only single states or sequential expressions, class \(P\) only single states parallel expressions, and class \(G\) (general) allows arbitrary process-algebraic expressions. Figure 4 shows the class for each rule. An \((x,y)\) -PRS, \(x,y \in \{ 1,S, P, G \}\) allows only rules whose left-hand sides belong to class \(x\) and whose right-hand sides belong to class y.

Remark 6

\((1,1)\)-PRSs correspond to non-deterministic finite state machines (with \(\lambda \)-transitions), \((1,S)\)-PRSs correspond to context-free systems, \((S,S)\)-PRSs correspond to pushdown machines, \((P,P)\)-PRSs correspond to Petri-Nets, and \((1,G)\)-PRSs correspond to process-algebras, cf. [26].

Figure 6 summarizes the approach for protocol conformance checking of a service-oriented system, cf. [4]. Each service is equipped with a protocol \(A_i\). First, an abstract behaviour \(\varPi _i\) is determined and published for each service. Second, these abstract behaviours are glued to an abstract semantics \(\varPi _S\) that is defined as above. Third, the use \(U_i\) of each service \(S_i\) is determined as described above, and finally, for each service \(S_i\), it is checked whether \(L(U_i) \subseteq L(A_i)\).

Fig. 6
figure 6

Protocol conformance checking of a service-oriented system

3 Abstraction of exception handling

The aim of this section is to extend the abstraction to exception handling. We first discuss the abstractions due to exception handling without a finally-block and then discuss the abstraction of exception blocks with finally-blocks.

3.1 Exception handling without finally-Blocks

Figure 7 shows the PRS-rules specifying the abstract semantics for the raise statement and for exceptional procedure returns for synchronous and asynchronous procedures, i.e., the execution of the procedure stops with an unhandled exception. The left column shows the abstract semantics if the exception is not handled, the right column shows the abstract semantics if the exception is handled by a uniquely defined exception handler.

Fig. 7
figure 7

Exception handling without finally-blocks

Remark 7

For each statement (raise, procedure call, synchronization) that may raise an exception, it can be statically determined whether a given exception \(E\) is handled or unhandled. If exception \(E\) is handled, then the corresponding exception handler is statically defined. Without loss of generality, we assume that the set of unhandled exceptions of a procedure is contained in its signature (such a set can be determined statically). Thus, the execution of a procedure \(p\) may end with an exceptional state \(q_\mathsf{E}\). Thus, the PRS-rules in Fig. 7 can be computing from service implementations by using classical compiler technology.

The rules for the raise statement are straightforward, because they model the execution semantics exactly: If the exception is unhandled, the current execution stops with the exceptional state. Otherwise, the execution continues with the corresponding exception handler.

Consider now the case that a synchronous procedure \(m\) ends with an exceptional state \(q_\mathsf{E}\). Then, the caller of \(m\) raises exception \(E\). There are two cases: exception \(E\) is unhandled (left column, Fig. 7) or handled by a corresponding exception handler (right column). For both cases, \(q_E\) is popped and handling exceptions is analogous to the raise statement, i.e., the top state is replaced by \(q_\mathsf{E}\) if \(\mathsf{E}\) is unhandled and replaced by the program point of the corresponding exception handler, if \(\mathsf{E}\) is handled. The rules in the left row of Fig. 7 must be included into the abstract semantics for each procedure call of \(m\).

Remark 8

For modelling exception handling, it is necessary to use PRS-rules with the sequential composition operator on its left-hand side because it must be modelled that an unhandled exception \(q_E\) within a procedure \(m\) leads to a re-raising exception at the caller. Therefore, instead of continue with the program point \(q'\) after the call, the program point must be replaced by the exception state. In contrast to the normal procedure return (which can be modelled by \(q \rightarrow \varepsilon \)), this replacement is only possible by the rule \(q_E.q' \rightarrow q_E\).

The abstract semantics of exceptional returns from asynchronous procedures is analogous to exceptional returns from synchronous procedures. The main difference is that the state is a cactus stack that forks to a state \(q_E\) stemming from a call of an asynchronous procedure \(m\). Since this is not a regular return from \(m\), either \(m\) must be synchronized explicitly or implicitly by a returning from the callee.

3.2 Exception handling with finally-Blocks

The abstract semantics for finally-blocks must ensure that the finally block is being executed, even if the corresponding try-block (or a corresponding exception handler) executes a return statement or raise an exception not handled within a corresponding exception handler. This also applies for nested try-blocks. Consider for example the following situation:

The raised exception is only handled in an outer try-block. Before executing the exception handler for \(\mathsf{E'}\), all finally-blocks that correspond to inner finally-blocks containing the statement raising \(\mathsf{E}\) must be executed. Thus, the abstract semantics of the rules in Fig. 7 can only be used if none of the inner try-blocks contains a finally-block.

The main idea to ensure the execution of a finally-block is to push the first program point of the finally-block to the stack when entering a try-block, cf. Rule (1) of Fig. 8. For this purpose, program points are assigned to each try-block with a corresponding finally-block, to the end of each try-block, to the end of exception handling block, and to the end of the finally-block, cf. Fig. 8. Furthermore, if an inner try-block raises an exception \(\mathsf{E}\) that is not handled by its exception handlers, the try-block ends with state \(q_E\). Figure 8 shows the PRS-rules for the abstract semantics.

Fig. 8
figure 8

Abstract semantics for exception handling with finally-blocks

If the execution reaches the end of the try-block or the end of one of its exception handlers, then the abstract associated program point can be obtained by the second element of the stack, cf. Rules (2) and (3). Together with Rule (4), these rules ensure that the finally-block is being executed and after its execution, the statement \(s\) after the finally-block is being executed. This is the reason for pushing the program point of \(s\) onto the stack in Rules (2) and (3). If the try-block executes a return statement, then this execution must be postponed until the try-block has been completed and its program point must be pushed onto the stack. For this, the two elements on the top of the stack must be exchanged, cf. Rule (5). Together with Rule (6) for regularly leaving the finally-block, these rules ensure that the return statement is being executed after finishing the finally-block.

The situation is more complicated if the finally-block executes a return statement because in this case, the return statement of the try-block is not being executed. Consider for example Fig. 8: If \(q_7\): return is executed, then \(q_2\): return is not executed although \(q_2\) is the second top element on the stack (by Rule (5)). Therefore, the second element on the stack must be removed, cf. Rule (7). Rule (7) is also able to deal with nested try-blocks. Consider for example a synchronous procedure \(p\) where the body contains the following nested try-block:

When executing \(q_3:\) return, the four program points on the top of the stack are \(q_3.q_2.q_1.q'\) where \(q'\) is a program point after a procedure call of \(p\). According to Rule (7), the abstract semantics contains the rulesFootnote 2 \(q_3.q_2 {\mathop {\rightarrow }\limits ^{\lambda }} q_3\) and \(q_3.q_1 {\mathop {\rightarrow }\limits ^{\lambda }} q_3\). Together with the PRS-rule \(q_3.q' {\mathop {\rightarrow }\limits ^{\lambda }} q'\), it can be shown that \(q_3.q_2.q_1.q'{\mathop {\Rightarrow }\limits ^{\lambda }} q'\), i.e., the procedure returns by the return statement of the outer finally block. A simple inductive argument shows that such a derivation can be constructed for arbitrarily nested try-blocks with corresponding finally-blocks that contain return statements. Rules (8), (9), and (10) of Fig. 8 correspond to Rules (5), (6), and (7) when the try-block raises an unhandled exception \(\mathsf{E'}\) or an exception handler raises exception \(\mathsf{E'}\).

Remark 9

Similar arguments as above also apply to asynchronous procedures. Here, two stacks of a cactus stack have to be considered. The situation on the top of the two stacks is \((q_3.q_2.q_1)\!\!\parallel \!\!q'\) where \(q'\) is a synchronization statement or any program point between the asynchronous procedure call before the corresponding synchronization statement. Then, it holds \((q_3.q_2.q_1)\!\!\parallel \!\!q' {\mathop {\Rightarrow }\limits ^{\lambda }} q_3\!\!\parallel \!\!q'\). Now, the process rewrite rules of the abstract semantics for synchronization or asynchronous procedure return can be applied.

Example 3

Figure 9 shows the process rewrite rules of the abstract semantics for the service-oriented system in Fig. 1. The initial state is \(q_0\), the set of final states is \(F = \{ q_9, q_{\mathsf{Exc}_1}, q_{\mathsf{Exc}_2} \}\). The following derivation corresponds to the execution in Fig. 3 (the rule is as a lower index of the arrow):

$$\begin{aligned}&q_0 {\mathop {\Rightarrow }\limits ^{\lambda }}_{(2)} q_2 {\mathop {\Rightarrow }\limits ^{f}}_{(4)} q_{34}.q_3 {\mathop {\Rightarrow }\limits ^{a}}_{(56)} q_6. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(7)} q_7.q_{16}. q_{35}.q_3 {\mathop {\Rightarrow }\limits ^{b}}_{(8)} (q_8 \!\!\parallel \!\!q_{20}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(32)} (q_8 \!\!\parallel \!\!q_{21}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(34)} (q_8 \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{c}}_{(9)} ((q_{24}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(38)} ((q_{26}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(40)} ((q_{27}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{b}}_{(41)} (((q_{28}\!\!\parallel \!\!q_{20}).q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(32)} (((q_{28}\!\!\parallel \!\!q_{21}).q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3\\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(34)} (((q_{28}\!\!\parallel \!\!q_{Exc_1}).q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(45)} ((q_{Exc_1}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3\\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(47)} ((q_{30}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{f}}_{(48)} ((q_{34}.q_{31}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3\\&{\mathop {\Rightarrow }\limits ^{a}}_{(56)} ((q_6.q_{35}.q_{31}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}.q_{35}.q_3 \\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(7)} ((q_7.q_{16}.q_{35}.q_{31}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3\\&{\mathop {\Rightarrow }\limits ^{b}}_{(8)} (((q_8\!\!\parallel \!\!q_{20}).q_{16}.q_{35}.q_{31}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3\\&{\mathop {\Rightarrow }\limits ^{\lambda }}_{(33)} (((q_8\!\!\parallel \!\!q_{22}).q_{16}.q_{35}.q_{31}.q_{32}.q_9) \!\!\parallel \!\!q_{\mathsf{Exc}_1}).q_{16}. q_{35}.q_3 \end{aligned}$$

At the end of the second line, rules (9) and rules (32) are applicable. It is worth to compare the last expression of the seventh line with the cactus stack in Fig. 3 after step (3). It has basically the same shape except that program points \(q_{32}\) and \(q_{16}\) are not present in the cactus stack. These program points are the first program points of the finally-blocks and are pushed onto the stack when the corresponding try statement is executed.

Fig. 9
figure 9

Abstract semantics of the services in Fig. 1

4 Protocol conformance checking

For checking the conformance of a protocol of a service \(s\) in a service-oriented system \(S\), it must be checked whether \(L(U_s) \subseteq L(A_s)\) where the use of \(s\) is defined by the PRS \(U_s \triangleq (\varSigma _s,Q,\rightarrow ,q_0,F)\) and \(A_s \triangleq (\varSigma _s,R_s,\rightarrow _s,r_0^s,F_s)\) is the finite state machine defining the protocol of \(s\). Both and Zimmermann [5] proves that this problem is undecidable for the classes of \((x,G)\) process rewrite systems, i.e., if sequential composition and parallel composition occurs in a PRS, the protocol conformance checking becomes undecidable.

The goal is to construct—similar to [5]—the Combined Abstraction. The Combined Abstraction is a PRS \(K\) with the following properties:

  1. (i)

    \(K\) belongs to the same class of PRSs as \(U_s\),

  2. (ii)

    \(L(K) \supseteq L(U_s) \cap (\varSigma _{s}^{*} {\setminus } L(A_s))\), and

  3. (iii)

    if \(U_s\) belongs to one of the classes of \((x,y)\)-PRSs, \(y \in \{ 1,S \}\)) then \(L(K) = L(U_s) \cap (\varSigma _{s}^{*} {\setminus } L(A_s))\).

Note that \(\varSigma _{s}^{*} {\setminus } A_s\) is the language accepted by the finite state machine \(\bar{A}_s \triangleq (\varSigma _s,R_s,\rightarrow _s,r_0^s,\bar{F})\) where \(\bar{F}_s \triangleq R_s \setminus F_s\) is the set of all non-final states of \(A_s\). Thus, \(L(K) \ne \emptyset \) implies protocol conformance.

Both and Zimmermann [5] defines the Combined Abstraction for the class of \((1,G)\)-PRS and [4] extends it to the class of \((P,G)\)-PRS. This article extends it further to the class of \((G,G)\)-PRS. Here, the construction of the Combined Abstraction is based on a normalized process rewrite system that consists only of rules of the forms \(q {\mathop {\rightarrow }\limits ^{\alpha }} q',\,q.q' {\mathop {\rightarrow }\limits ^{\alpha }} q'',\,q {\mathop {\rightarrow }\limits ^{\alpha }} q'.q'',\,q \parallel q' {\mathop {\rightarrow }\limits ^{\alpha }} q''\), and \(q {\mathop {\rightarrow }\limits ^{\alpha }} q'\parallel q''\). Mayr [26] shows that for any PRS \(\varPi \) there exists a normalized PRS \(\varPi '\) with \(L(\varPi )=L(\varPi ')\). However, \(\varPi '\) may have more atomic processes.

In contrast to [5], the construction of the Combined Abstraction for \((G,G)\)-PRS is based on the construction of a pushdown system in [22] that accepts the intersection of a context-free language and a regular language. The Combined Abstraction of \(U_s=(\varSigma _s,Q,\rightarrow ,q_0,F)\) and \(\bar{A}_s=(\varSigma _s,R_s,\rightarrow _s,r_0^s,\bar{F}_s)\) is a process rewrite system \(K \triangleq (\varSigma _s,Q_K,\rightarrow _K,q_0^K,F_K)\) where \(Q_K \triangleq R_s \times Q,\,q_0^k \triangleq (r_0,q_0),\,\rightarrow _K \triangleq T_{11} \cup T_{1S} \cup T_{S1} \cup T_{1P} \cup T_{P1} \cup T_0\) as defined in Fig. 10, and \(F_K \triangleq F_s \times F\). The proof of its correctness is based on the following

Fig. 10
figure 10

Transition rules for the combined abstraction

Theorem 1

Let \(U_s \triangleq (\varSigma _s,Q,\rightarrow ,q_0,F)\) be a PRS , \(\bar{A}_s \triangleq (\varSigma _s,R_s,\rightarrow _s,r_0^s,\bar{F}_s)\) be a finite state machine, and \(K \triangleq (\varSigma _s,Q_K,\rightarrow _K,q_0^K, F_K)\) be the Combined Abstraction of \(U_s\) and \(\bar{A}_s\). Furthermore, let be \(e \in {\textit{PEX}}(Q)\) be a process-algebraic expression over \(Q\) and \(r \in R_s\) a protocol state, and \(x \in \varSigma _{s}^{*}\) such that there is a \(f \in F\) and \(\bar{r} \in \bar{F}_s\) with \(e {\mathop {\Rightarrow }\limits ^{x}} f\) and \(r {\mathop {\Rightarrow }\limits ^{x}}_{s} \bar{r}\). Then, there is a \(e' \in {\textit{PEX}}(Q_K)\) and a \(f_k \in F_K\) such that the following properties are satisfied:

  1. (i)

    \(e\) and \(e'\) have the same shape, i.e., \(e\) is obtained from \(e'\) by removing the first component contained in each atomic process of \(e'\).

  2. (ii)

    All states on the top of the stacks in the cactus stack corresponding to \(e'\) have the protocol state \(r\).

  3. (iii)

    \(e' {\mathop {\Rightarrow }\limits ^{x}}_{K} f_k\)

Proof

See Appendix 6 \(\square \)

Corollary 1

Let \(U_s \triangleq (\varSigma _s,Q,\rightarrow ,q_0,F)\) be a PRS , \(\bar{A}_s \triangleq (\varSigma _s,R_s,\rightarrow _s,r_0^s,\bar{F}_s)\) be a finite state machine, and \(K \triangleq (\varSigma _s,Q_K,\rightarrow _K,q_0^K, F_K)\) be the Combined Abstraction of \(U_s\) and \(\bar{A}_s\). Then, \(L(U_s) \cap L(A_s) \subseteq L(K)\).

Proof

Let be \(x \in L(A_s) \cap L(U_s)\). Then, \(q_0 {\mathop {\Rightarrow }\limits ^{x}} f\) for a \(f \in F\) and \(r_0 {\mathop {\Rightarrow }\limits ^{x}} \bar{r}\) for a \(\bar{r} \in \bar{F}_s\). Hence, by Theorem 1(iii) there is a \(f_k \in F_K\) such that \((r_0,q_0) {\mathop {\Rightarrow }\limits ^{x}}_{K} (a\bar{r}, f)\). Thus, \(x \in L(K)\) since \(q_0^K = (r_0,q_0)\). \(\square \)

Remark 10

If the Combined Abstraction only contains rules from \(T_{11} \cup T_{1S} \cup T_{S1}\), then equivalence holds. In particular, there is no need to apply Lemma 2. The combined abstraction specializes to the construction as well the proof in Appendix 6 to the intersection of pushdown machines and finite state machines as described in [22]. In this case, equivalence holds [22].

If condition (ii) is violated, then the cactus stack would have two top-of-stack elements with different protocol states. This makes no sense because there is only one protocol of service \(s\), and therefore, the protocol cannot be at the same time in two different states. We call cactus stack violating condition (ii) as inconsistent.

The transition rules \(T_{11},\,T_{1S}\), and \(T_{S1}\) are a slight generalization of those in [22]. Note that \(r {\mathop {\Rightarrow }\limits ^{\lambda }}_s r\), and for \(a \in \varSigma _s\) it is \(r {\mathop {\Rightarrow }\limits ^{a}}_s r'\) iff \(r {\mathop {\rightarrow }\limits ^{a}}_{s} r'\). The transition rules \(T_{1P}\) and \(T_{P1}\) are straightforward. The ideas stem from [4, 5]. The transition rules \(T_0\) are required for maintaining a consistent state of the protocol as demonstrated by Example 4:

Example 5

Consider the finite state machine \(A_2\) in Fig. 2 and suppose that \(U_s\) contains the following transition rules \(q_0 {\mathop {\rightarrow }\limits ^{b}} q_1 \parallel q_2,\,q_1 {\mathop {\rightarrow }\limits ^{c}} q_3,\,q_2 {\mathop {\rightarrow }\limits ^{b}} q_3,\,q_3 \parallel q_3{\mathop {\rightarrow }\limits ^{c}} q_4\) where \(q_4\) is the final state. Then, \(bbcc \in L(U_s) \setminus L(A_2)\), i.e., there is a protocol violation. Without the rules in \(T_0\), the following derivations are possible:

$$\begin{aligned} \begin{array}{lll} q_K &{}{\mathop {\Rightarrow }\limits ^{\lambda }} (r_0^2,q_0) &{} \text {by}~T_S \\ &{}{\mathop {\Rightarrow }\limits ^{b}} (r_1^2,q_1)\parallel (r_1^2,q_2) &{}\text {by}~T_{1P} \\ &{}{\mathop {\Rightarrow }\limits ^{b}} (r_1^2,q_1)\parallel (E,q_3) &{}\text {by}~T_{11}\\ &{}{\mathop {\Rightarrow }\limits ^{c}} (r_1^2,q_1)\parallel (E,q_4) &{}\text {by}~T_{11}\\ &{}{\mathop {\Rightarrow }\limits ^{c}} (r_2^2,q_3)\parallel (E,q_4) &{}\text {by}~T_{11}\\ \end{array} \qquad \begin{array}{lll} q_K &{}{\mathop {\Rightarrow }\limits ^{\lambda }} (r_0^2,q_0) &{} \text {by}~T_S \\ &{}{\mathop {\Rightarrow }\limits ^{b}} (r_1^2,q_1)\parallel (r_1^2,q_2) &{}\text {by}~T_{1P}\\ &{}{\mathop {\Rightarrow }\limits ^{b}} (r_1^2,q_1)\parallel (E,q_3) &{}\text {by}~T_{11}\\ &{}{\mathop {\Rightarrow }\limits ^{c}} (r_2^2,q_3)\parallel (E,q_3) &{}\text {by}~T_{11}\\ &{}{\mathop {\Rightarrow }\limits ^{c}} (r_2^2,q_4)\parallel (E,q_3) &{}\text {by}~T_{11} \end{array} \end{aligned}$$

For none of these two process-algebraic expressions, there are \(\lambda \)-transitions that lead to a final state. Thus, the protocol violation is not detected. The reason is that for both derivations, a change of the protocol state was not taken into account. For both derivations, after the second step, the left operand of \(\parallel \) still indicates that \(A_2\) is in state \(r_1^2\) although by the transitions the (final) protocol state \(E\) is reached for the right operand. Figure 11 shows the corresponding cactus stack. Each top element of a cactus stack should have the same protocol state because there is only one protocol, and therefore, the protocol state must be unique. With the rules of \(T_0\), it is possible to change the protocol state of the left operand to \(E\) before applying another transition rule. Thus, \((r_1^2,q_1)\parallel (E,q_3) {\mathop {\Rightarrow }\limits ^{\lambda }} (E,q_1) \parallel (E,q_3)\,{\mathop {\Rightarrow }\limits ^{c}} (E,q_3) \parallel (E,q_3) {\mathop {\Rightarrow }\limits ^{c}} (E,q_4) \in F_K\) for both cases. Hence, the protocol violation is detected.

Fig. 11
figure 11

Inconsistent cactus stack

Example 5

(Combined Abstraction) The Combined Abstraction of the PRS in Fig. 9 and the protocol automaton \(A_2\) in Fig. 2 has 220 atomic processes and 398 transition rules. For reasons of space we only give a derivation demonstrating the protocol violation discussed in Example 2 (the class of the applied transition rule according to Fig. 10 is indicated below the derivation step, \(T_{SS}\) is a combination of \(T_{S1}\) and \(T_{1S}\)), cf. Fig. 12. This derivation should be compared with the derivation in Example 3. Note that all operations different from \(b\) and \(c\) are replaced by \(\lambda \) since the other operations are not contained in the interface of service \(s_2\).

Fig. 12
figure 12

A derivation using the rules of a combined abstraction (Example 5)

5 Related work

Many works on static protocol checking of components consider local protocol checking on FSMs. The same approach can also be applied to check protocols of objects in object-oriented systems. The idea of static type checking by using FSMs goes back to Nierstrasz [28]. His approach uses regular languages to model the dynamic behaviour of objects, which is less powerful than context-free grammars (CFG). Therefore, the approach cannot handle recursive callbacks. In [25], object-life cycles for the dynamic exchange of implementations of classes and methods using a combination of the bridge/strategy pattern are considered. The approach comprises dynamic as well as static conformance checking. However, it is also based on FSMs, which are in general still used widely in similar approaches (e.g., [15, 27, 35]). Tenzer and Stevens [38] investigate approaches for checking object-life cycles. They assume that object-life cycles of UML-classes are described using UML state-charts and that for each method of a client, there is a FSM that describes the calling sequence from that method. In order to deal with recursion, Tenzer and Stevens add a rather complicated recursion mechanism to FSMs. It is not clear whether this recursion mechanism is as powerful as pushdown automata and therefore could accept general context-free languages. Pradel et al. [32] discuss protocol conformance checking of APIs analogous to ours. It is designed for object-oriented programs and focuses on containers and iterators. Their approach learns protocols and statically checks them. The static protocol conformance checker may report false positives but no false negatives. It is based on an intra-procedural branch-sensitive program analysis. Hence, their approach is able to take into account data flow but has a rather imprecise abstraction of recursion, since the latter would require a context-sensitive interprocedural program analysis. Furthermore, [32] does not discuss exception handling and concurrency.

Zimmermann and Schaarschmidt [40] show that if the behaviour is a context-free language due to recursive callbacks, finite state approaches may lead to false positives. Furthermore, they introduced an approach for protocol conformance checking based on context-free systems. Exception handling would require pushdown systems. Lin et al. [24] use pushdown systems and discuss an approach of adaptation to protocols based on pushdown systems. All these works are for sequential systems.

Schmidt et al. [37] propose an approach for protocol checking of concurrent component-based systems. Their approach is also FSM-based. Thus, it is also unable to deal with recursive callbacks. Both and Zimmermann [4, 5] use the restricted class of \((P,G)\)-PRS. Thus, it allows the adequate modelling of unbound recursion, unbound concurrency, and explicit synchronizations. However, exceptions are not considered in these works.

An alternative approach for an investigation of protocol conformance is the use of process-algebras such as CSP (e.g.,  [2]). These approaches are more powerful than FSMs and context-free grammars. However, mechanized checking requires some restrictions on the specification language. For example, [2] uses a subset of CSP that allows only the specification of finite processes. At the end the conformance checking is reduced to checking FSMs similar to [37]. In [30], behavioural protocol conformance is used to describe a problem similar to ours. In contrast to our approach the developer has to define not only the allowed receivable calls but also the calls of the component. This approach cannot handle recursive callbacks, since the verification is reduced to finite state model. Many works use process-algebras as abstractions for the formal (behavioural) analysis of e.g. BPEL applications.

[16] uses CSP, while [31, 36] use CCS-Process-algebras are similar to \((P,G)\)-PRSs. These two works do not verify the behaviour in our sense. To the best of our knowledge, we are not aware of works in protocol conformance checking taking into account unbound recursion, unbound concurrency, and exception handling.

Other works such as [8] use another notion of behavioural conformance as this article. Their notion of conformance basically implies absence of deadlocks and livelocks, i.e., they want to reach a desired state. In contrast, protocols in this article specifies sequences of operation calls that must be satisfied, i.e., it is more a safety condition rather than a liveness condition. Furthermore, [8] does not abstract the service behaviour from an implementation. The latter is done by [33] who abstracts the service implementation to a ZING model. They check also a kind of absence of deadlocks as [8] using a simulation relation.

Bouajjani and Emmi [7] discuss the analysis of recursive parallel programs. They restrict themselves to finite data types and explore the decidability of problems such as e.g. reachability. It seems that there model is slightly more general as there are situations where the reachability problem becomes undecidable. Their approach doesn’t consider exception handling.

de Caso et al. [12] abstract contracts to protocols (modelled as finite state machines) in the sense of this article and validates them. This is done for both, the client and the server and a simulation, bi-simulation or protocol conformance can be checked automatically. Abstractions from implementations are not considered. Dumez et al. [14] derive a composition specification from interaction protocols and derives an implementation of composed services satisfying the specification. Ghezzi et al. [17] discuss a dynamic approach for protocol conformance checking.

6 Conclusions

This article extends our previous work of protocol conformance checking towards exception handling. The approach is capable to represent exception handling even via service interactions. The abstractions are computed using an automatic translation. A more rugged composition of SOAs is now possible. In contrast to our previous work, the most general class of process rewrite systems is needed for modelling exception handling, unbound recursion, unbound concurrency, and explicit synchronization. Table 1 shows an interesting correspondence between the rule classifications according to Mayr and the adequate modelling of programming language concepts. In particular, it shows what is required for modelling the language concepts if one abstracts completely from data. Thus, we have the correspondence between Mayr’s hierarchy of process rewrite systems and programming language concepts shown in Fig. 13. In our previous work, we had a correspondence to the class \((P,G)\)-PRS (Process Algebra Nets). With exception handling, we have a correspondence to the \((G,G)\)-PRS, the general class of process rewrite systems.

Fig. 13
figure 13

PRS-Hierarchy and expressiveness w.r.t. programming language concepts a PRS-Hierarchy and its expressiveness b PRS-Hierarchy and programming language concepts

Table 1 Rule classes and programming language concepts

The reachability problem is decidable for each class of PRS while the inclusion problem to regular languages becomes undecidable in any class containing a \(G\), i.e., that includes parallel as well as sequential composition. In a similar way as [5], we have defined a Combined Abstraction that approximates the inclusion problem by a reachability problem such that the approximation is exact iff the process rewrite system belongs to a decidable class. The reachability problem can be solved by the algorithm in [26].

However, this algorithm requires exponential space (and therefore at least exponential time) in the worst case since the reachability problem for process rewrite systems is EXPSPACE-hard [26]. It is subject to future work to check whether the worst-case behaviour practically occurs and to apply some heuristics to get it more efficient if necessary. For the latter, the same ideas as in [3, 4, 6] may apply. In particular, [6] has shown that protocol conformance checking of complex systems is possible in acceptable time.

Taking into account data is a challenge: the data types of variables must be abstracted to finite domains. However, this leads to a severe state explosion problem as in classical model checking. Thus, in order to consider data in protocol conformance checking, a more goal-oriented abstraction is required.