
1 Introduction

Service-oriented architectures (SOA) aim at configuring service compositions out of existing, independently deployable services. With microservices, this fundamental principle has recently received new impetus. In a formal model-based design, the employed services are equipped with well-defined interface descriptions, and the composition can hence be checked against user requirements before assembly. Many such analysis methods for service compositions exist today (e.g. [5, 9, 10, 17, 20, 21]), using a variety of modelling formalisms.

While formal analysis techniques are numerous, techniques for localizing the cause of faults in erroneous compositions are scarce. Analysis techniques building on model checking return counterexamples, but these give no hint on whether to change the way of composition or to change services, and in the latter case which service to change. For software, this is a more frequently studied question. Techniques like delta debugging [7, 25], slicing [1, 24] or statistical methods [16] all aim at localizing the cause of faults in programs. Performance blame analysis [6] targets the identification of software components causing long runtimes.

In [15], Krämer and Wehrheim studied the applicability of software error localization techniques to service compositions. It turned out that such techniques are in general difficult to transfer to a model-based setting: software fault localization typically requires the entity under examination to be executable, i.e., it requires code, or entities with an accompanying interpretation, compilation or simulation tool to execute them. Programs are run on one or – in the case of statistical methods – even a large number of test cases, and the outcome of these tests is used to determine potential error locations. Models of service compositions, however, typically cannot be executed, as no concrete service implementations may exist yet, and modelling languages often do not come with interpreters providing some form of execution.

A notable exception to the requirement of executability in fault localization is the work of Jose et al. [13]. While they also study programs and make use of testing, they in addition employ bounded model checking for test input generation. This is similar to the SMT-based verification technique which we already employ for service compositions [23]. The basic principle of Jose et al. is the encoding of faulty program runs as trace formulae, and the localization of likely fault causes (in the faulty program trace) via maximum satisfiability solving (MAX-SAT).

In this paper, we transfer this concept to the model-based setting. The main challenge is precisely the fact that we work on the level of models: instead of (only) using the typical programming language types, service interfaces are formulated via the concepts of domain ontologies, and service assembly uses predicates and rules of these ontologies. Semantically, such concepts and predicates have no fixed interpretation; only ontology rules sometimes slightly restrict their meaning. A faulty test input (as required by Jose et al.) can thus not simply be an input variable of the program with its value. A trace formula encoding a faulty run cannot simply accumulate the concrete statements in the run, but needs to incorporate service calls of which just the interfaces are known. Technically, concepts and predicates of ontologies will in our approach be encoded as undefined function symbols, and a test input is then a complete logical structure fixing an interpretation of these symbols together with a valuation of variables in this logical structure. On this basis, we construct (1) logical formulae encoding the query for correctness of the whole service composition based on a given domain ontology and (2) trace formulae encoding faulty runs in case of errors, based on the logical structure used for the fault. The satisfiability of these formulae is checked using the SMT solver Z3 [18]. Out of the trace formula, we can derive fault sets via maximum satisfiability solving, i.e., locations in the service composition which – when changed – correct errors. We furthermore allow fault types to be prioritized, e.g., when we prefer to see faults in service calls over faults in the structure of the service composition. To this end, we impose weights on constraints in the MAX-SAT instance. Finally, we suggest a technique for finding more likely faults by identifying service composition entities which are the cause of faults in several cases.

The technique we propose here identifies single entities in service compositions as faults. It thus complements the techniques introduced by Krämer and Wehrheim [14] which aim at a repair of faulty service compositions via the replacement of large parts of the composition.

The paper is structured as follows. In the next section, we introduce some necessary background in logic and introduce the model of service compositions we employ in this paper. Section 3 describes correctness checking for service compositions, which is the basis for identifying faulty service compositions. The following section then introduces our technique for fault localization, for the prioritization of fault types and for the computation of likely faults. The last section concludes. Our description of the employed techniques is almost always given in terms of Z3 syntax, i.e., we give logical formulae in the SMT-Lib format [4].

2 Background

We start with describing the necessary background in logics, services and service composition. For the latter, we follow the definitions of Walther and Wehrheim [23].

A service composition describes the assembly of services from a specific domain (e.g., the tourism or health domain). The concepts occurring in such a domain and their interrelations are typically fixed in an ontology. Ontologies can be given in standardized languages, like for instance OWL [2]. Here, we simply fix the ingredients of ontologies.

Definition 1

A rule-enhanced ontology \(K=(\mathcal {T}, \mathcal {P}, \mathcal {F}, R)\) consists of a finite set of type (or concept) symbols \(\mathcal {T}\) together with constants of these types, a finite set of predicate symbols \(\mathcal {P}\), where every \(p\in \mathcal {P}\) denotes an n-ary relation on types, a finite set of function symbols \(\mathcal {F}\) and a set R of rules which are predicate logic formulae over some variables \( Var \) and the constants using \(\mathcal {P}\) and \(\mathcal {F}\). We use \(\varPhi _K( Var )\) to describe the set of such formulae.

Here, we implicitly assume the types to contain integers and booleans with the usual constants. Rules, which are not part of every ontology language, are used to describe relationships between concepts or properties of predicates (e.g., commutativity). An example of a rule language for ontologies can for instance be found in [12].

The running example of our paper operates in the tourism domain, containing types like \( Restaurant , Rating \) and \( Location \) and a constant R0 of type \( Restaurant \). On these, the ontology contains two predicates and two functions:

$$\begin{aligned} isHigh :\;& Rating \rightarrow \mathbf{bool }\\ isVegan :\;& Restaurant \rightarrow \mathbf{bool }\\ ratOf :\;& Restaurant \rightarrow Rating \\ locOf :\;& Restaurant \rightarrow Location \end{aligned}$$

For simplicity, the knowledge base contains just two rules specifying properties of the constant R0: \( isHigh(ratOf(R0)) \) and \( isVegan (R0)\).

With its types, an ontology defines – in a logical sense – a large number of undefined function symbols. It remains undefined what a \( Restaurant \) really is; it is just a name. The interpretation of these symbols and the universes of the types are not defined. A logical structure (or logical model) fixes this interpretation.

Definition 2

Let \(K = (\mathcal {T},\mathcal {P},\mathcal {F},R)\) be an ontology. A logical structure over K, \(\mathcal {S_K}=(\mathcal {U}, \mathcal {I})\), consists of

  • \(\mathcal {U} = \bigcup _{T \in \mathcal {T}} \mathcal {U}_T\) the universe of values,

  • \(\mathcal {I}\) an interpretation of the predicate and function symbols, i.e., for every \(p \in \mathcal {P}\) of type \(T_1 \times \ldots \times T_n \rightarrow \mathbf{bool }\) and every \(f\in \mathcal {F}\) of type \(T_1 \times \ldots \times T_n \rightarrow T\) we have a predicate

    $$ \mathcal {I}(p): \mathcal {U}_{T_1} \times \ldots \times \mathcal {U}_{T_n} \rightarrow \mathcal {U}_{\mathbf{bool }} $$

    and a function

    $$ \mathcal {I}(f): \mathcal {U}_{T_1} \times \ldots \times \mathcal {U}_{T_n} \rightarrow \mathcal {U}_T, $$

    respectively, and

  • we require the logical structure to satisfy all rules of the ontology.

Ontologies provide the concepts which can be used to describe the functionality of services. A service signature \(({ Svc },\mathcal {T}_{in},\mathcal {T}_{out})\) over an ontology K first of all specifies the name \({ Svc }\) of the service as well as the type of its input \(\mathcal {T}_{in} \in \mathcal {T}\) and the type of its output \(\mathcal {T}_{out} \in \mathcal {T}\). We restrict ourselves to services with single inputs and outputs. A service description in addition adds semantic information to this interface.

Definition 3

A service description \(( sig ,I,O, pre _{{ Svc }}, post _{{ Svc }})\) of a service named \({ Svc }\) over an ontology K consists of a service signature \( sig = ({ Svc },\mathcal {T}_{in},\mathcal {T}_{out})\), input variable i and output variable o, a precondition \( pre _{{ Svc }}\) over the input variable and a postcondition \( post _{{ Svc }}\) over input and output variable, both elements of \(\varPhi _K(\{i,o\})\).

Fig. 1. Service composition RestaurantChecker

The interface of a service thus states what the service requires to hold true upon calling it (precondition) and what it guarantees when it has finished (postcondition). Services can be assembled by means of a workflow language. While there are different languages in use (e.g. WS-BPEL [19]), we simply employ a programming-language-like notation here.

Definition 4

Let \(K =(\mathcal {T},\mathcal {P},\mathcal {F},R)\) be an ontology and \( Var \) a set of variables typed over \(\mathcal {T}\). The syntax of a workflow W over K is given by the following rules:

$$\begin{aligned} W \;{:}{:}{=}{}\;&Skip \mid u := t \mid W_1 ; W_2\mid u := { Svc }(v)&\\&\mid \mathbf {if~} B \mathbf {~then~} W_1\mathbf {~else~} W_2\mathbf {~fi} \mid \mathbf {while~} B \mathbf {~do~} W \mathbf {~od} \end{aligned}$$

with variables \(u, v \in Var \), expression t of type \( type (u)\), \(B\in \varPhi _K( Var )\), and \({ Svc }\) a service name.

Herein, we allow for standard expressions on integers and booleans. A complete service composition then consists of such a workflow together with a name, its inputs and outputs, a list of the employed services and their pre- and postcondition.

Figure 1 shows our running example of a service composition. In this, we use two services: \( GetRating \) with precondition true and postcondition \( r=ratOf(in) \) and \( GetLocation \) with precondition true and postcondition \( loc=locOf(out) \) (already instantiated to the variables used in the workflow).

A service composition is furthermore equipped with an overall pre- and postcondition. These pre- and postconditions constitute requirements on the service composition: whenever the whole composition is started with an input satisfying the precondition, it should terminate and at the end satisfy the postcondition. Our example service composition is not correct in that sense: whenever the input to the composition is a vegan restaurant (satisfying \( isVegan(in) \)) with a low rating (satisfying \(\lnot isHigh(r) \) for r being the rating of in), the service composition will output the value of in which does not satisfy the (first clause of the) postcondition.

To define this in a more formal way, we furthermore need the definition of a state: given a logical structure \(\mathcal {S}\), a state in \(\mathcal {S}\) is a mapping from variables to values of the universe, \(\sigma : Var \rightarrow \mathcal {U}\). In [23], a semantics for workflows is given which is parameterised in a logical structure: for a workflow W, \([\![W]\!]_\mathcal {S}\) is a mapping from a set of states in \(\mathcal {S}\) to sets of states. We will not repeat the definition here; instead, we concentrate on verification and fault localization. The semantics of service calls of Walther and Wehrheim [23] fixes (a) a service to be executable only when the precondition holds in the current state, otherwise it blocks, and (b) the after-state to satisfy the postcondition. As the postcondition is just a logical expression, there may be more than one after-state satisfying the postcondition; a service call thus introduces nondeterminism. Property (a) will play no role in the following; for a technique for computing precondition violations on service calls, see [14].

Definition 5

A service composition with workflow W is correct with respect to pre- and postconditions pre and post if the following holds for all logical structures \(\mathcal {S}\) for K:

$$ [\![W]\!]_\mathcal {S}([\![pre]\!]_\mathcal {S}) \subseteq [\![post]\!]_\mathcal {S}, $$

where for a formula \(p \in \varPhi _K\) without free variables, we let \([\![p]\!]_\mathcal {S}\) denote the set of states in \(\mathcal {S}\) satisfying p.

This is a standard partial correctness definition: whenever we start the service composition in a state satisfying the precondition, the state reached when the service composition is completed should satisfy the postcondition. The task of fault localization now requires finding the locations in the workflow which are the cause of incorrectness.

3 Correctness Checking

In our setting, fault localization directly builds on the results of correctness checking. Therefore, we next present our technique for checking the correctness of service compositions. In its basic approach, we follow Walther and Wehrheim [22], however with some adaptations to fit the approach to the later fault localization. The main adaptation concerns naming: while Walther and Wehrheim [22] build one formula for the whole service composition, we build separate formulae for the parts of the composition (i.e., one formula for a condition in an if statement, one for a service call, and so on). The formulae are assigned different names, and we can later use these names for the hard and soft constraints in the MAX-SAT instance.

Fig. 2. Service composition RestaurantChecker in SSA form

3.1 Brief Introduction to Z3 Syntax

We start with a very brief description of the Z3 syntax which we employ.

Declarations: We use three types of declarations: (1) declaring new sorts (types), written as (declare-sort \(\texttt {{<}name{>}}\) ), (2) declaring function symbols, written as (declare-fun \(\texttt {{<}name{>}}\) \(\texttt {{<}signature{>}}\) ), and (3) declaring constants, written as (declare-const \(\texttt {{<}name{>}}\) \(\texttt {{<}sort{>}}\) ). The declared sorts, functions and constants can from then on be used. Note that we do not fix the interpretation of these new concepts.

Definitions: Definitions, written as (define-fun \(\texttt {{<}name{>}}\) \(\texttt {{<}sig{>}}\) \(\texttt {{<}def{>}}\) ), allow giving an interpretation to functions.

Assertions: Assertions fix the facts that we would like to hold true. When asked for satisfiability, Z3 checks whether all assertions can jointly be made true. This involves trying to find an interpretation for the unknown (but declared and used) concepts.

Logic: Z3 uses standard predicate logic for writing logical expressions (e.g., quantifiers, boolean connectives). The connective ite stands for a conditional expression (if-then-else).
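As a small illustration, the following fragment, a sketch using the Rating type from our running example, exercises each of these constructs once (the helper asScore is hypothetical and not part of the ontology):

```smt2
; declare a new sort and an uninterpreted predicate on it
(declare-sort Rating)
(declare-fun isHigh (Rating) Bool)
(declare-const r1 Rating)
; a definition gives an interpretation to a function (asScore is hypothetical)
(define-fun asScore ((x Rating)) Int (ite (isHigh x) 1 0))
; an assertion fixes a fact to hold true
(assert (isHigh r1))
(check-sat) ; Z3 tries to find an interpretation of Rating and isHigh
```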

3.2 Encoding the Service Composition

Checking the correctness of a service composition proceeds in two steps: first, we bring the service composition into static single assignment (SSA) form; second, we translate the workflow, its precondition, the ontology and the negation of the postcondition into a logical formula which is then checked for satisfiability. We thereby check whether it is possible to start in a state satisfying the precondition but end in a state not satisfying the postcondition.

SSA form requires a variable to be assigned to at most once in the program text (for a technique for computing SSA forms, see [8]). Our service composition is not yet in SSA form, as it for instance contains three assignments to out. Figure 2 shows its SSA form. At merges of control flow (if, while), so-called \(\phi \)-nodes (assignments) are inserted in order to unite the variables assigned to in different branches.

Table 1. Translation of workflow statements

Given such an SSA form, the translation proceeds as follows. Table 1 shows the translation of the workflow, directly given in the syntax of Z3, the solver we use. Every statement of the workflow is translated to a function declaration, and this declaration is asserted to hold true. The names thereby introduced for statements will later prove helpful for fault localization. For assignments, we assert the variable to be equal to the assigned expression. For service calls, we assert the postcondition of the service to be true (as said before, precondition violations are not checked here). In the definition, the term \( post _{{ Svc }}(u,v)\) refers to the logical formula specified as the postcondition in the service description.

For if statements, we get two function declarations: one giving a name to the condition and the other setting variables to the correct version according to the condition and the \(\phi \)-nodes. In the table, we assume every if and while statement to be followed by just one \(\phi \)-node; the translation can, however, easily be generalized to more than one such node. The \(\phi \)-nodes are translated to assertions equating the assigned variable to one of the parameters, based on the condition of the if or while statement, respectively. For loops, we assume a loop invariant to be given (see [23]). The invariant together with the negation of the loop condition acts like a postcondition of the whole loop block. The invariant is checked to actually be an invariant in a separate step. The assertion for a loop thus states that the invariant and the negation of the loop condition hold after the loop, where the variables in both formulae have to be instantiated to the current variable version as given by the \(\phi \)-node after the loop.
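One possible shape of this translation for the if statement in line 3 of Fig. 2 is sketched below; the statement names cond3 and branch3 reappear in Sect. 4, while the concrete condition body and the SSA variable versions out1, out2, out3 are assumptions based on the running example:

```smt2
; a fresh boolean c3 stands for the condition of the if in line 3;
; the named assertion cond3 ties it to the actual expression
(declare-const c3 Bool)
(define-fun cond3 () Bool (= c3 (not (isHigh r))))
(assert cond3)
; the phi-node after the if selects the variable version via c3
(define-fun branch3 () Bool (= out3 (ite c3 out2 out1)))
(assert branch3)
```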

In addition, we need a translation of the ontology. This requires declaring all types, constants, predicates and functions and asserting the rules to hold true. Table 2 gives this translation.

Table 2. Translation of ontology \(K =(\mathcal {T},\mathcal {P},\mathcal {F},R)\)
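For the ontology of the running example, this translation amounts to the following sketch (a direct rendering of the types, symbols and rules from Sect. 2):

```smt2
; types become uninterpreted sorts, constants become declared constants
(declare-sort Restaurant)
(declare-sort Rating)
(declare-sort Location)
(declare-const R0 Restaurant)
; predicates and functions become uninterpreted function symbols
(declare-fun isHigh (Rating) Bool)
(declare-fun isVegan (Restaurant) Bool)
(declare-fun ratOf (Restaurant) Rating)
(declare-fun locOf (Restaurant) Location)
; the rules of the knowledge base are asserted to hold true
(assert (isHigh (ratOf R0)))
(assert (isVegan R0))
```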

In summary, this translation gives us a formula \(\varphi _{SC}\) for the workflow, a formula \(\varphi _K\) for the ontology and two formulae pre and post for the translation of the service composition’s pre- and postcondition. The translation for service composition RestaurantChecker is given in the appendix. The solver is now queried about the satisfiability of

$$\begin{aligned} pre \wedge \varphi _{SC} \wedge \varphi _K \wedge \lnot post \end{aligned}$$
(1)

(plus the satisfiability of \(inv \wedge \varphi _W \wedge \lnot inv'\) for all loops with invariant inv, loop body W and an SSA form tagging variables after the loop with primes). In case of our example service composition, the answer to (1) is “sat”, i.e., it is possible that the composition’s postcondition is not fulfilled at the end. Together with this answer, the solver returns a logical structure (which we call the fault structure), i.e., universes for all sorts (types) and interpretations of the undefined predicate and function symbols, plus a state \(\sigma \) mapping all variables in the service composition to values of the universe. This state can be seen as the final state of the service composition, i.e., the state in which the negation of the postcondition is satisfied. Since the service composition is in SSA form, the state of a variable is, however, never changed anyway (loops are summarized in the loop invariant). For our running example, Fig. 3 shows this information; the appendix contains it in the form returned by Z3. The fault structure constitutes our input to fault localization.
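Schematically, the query (1) has the following shape as solver input (the concrete assertions for RestaurantChecker are in the appendix; pre and post here abbreviate the translated pre- and postcondition):

```smt2
; phi_K: ontology declarations and rule assertions (see Table 2)
; phi_SC: named assertions for the workflow statements (see Table 1)
(assert pre)        ; precondition over the composition's input
(assert (not post)) ; negated postcondition
(check-sat)         ; "sat" means the postcondition can be violated
(get-model)         ; the model yields the fault structure and state
```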

4 Fault Localization

Fault localization now proceeds by taking the “faulty” input to the service composition, i.e., the universe, interpretation and the state of the input to the service composition, together with the service composition and ontology, and solving a partial maximum satisfiability problem. Essentially, we are determining which parts of the service composition plus non-negated postcondition can simultaneously be satisfied on the faulty input. The complementary part contains the potential faults.

Fig. 3. “Fault structure” for example service composition

4.1 Computing Fault Sets

A partial maximum satisfiability problem (pMAX-SAT) takes as input a predicate logic formula in conjunctive normal form. In this, some clauses are marked as hard (definitely need to be satisfied) and others as soft (need not necessarily be satisfied). The pMAX-SAT solver determines a solution satisfying all hard clauses and a maximum number of soft clauses. In addition, Z3 – when used as a pMAX-SAT solver [11] – allows giving weights to soft clauses. A weight sets a penalty for not satisfying a clause, and Z3 determines solutions with minimal penalties. We will make use of this for prioritizing certain faults over others.

For fault localization, we build a trace formula encoding the faulty “run” of the service composition. In our model-based setting, the trace formula needs to contain the whole (logical) fault structure \(\mathcal {S}\) and the state \(\sigma \) of the fault. Table 3 shows the translation of this into Z3 input. For the universe, we declare all values of a type and assert that all variables of that type take only values of the universe. For predicates and functions, we enumerate all cases of argument and result values. From the state, we fix just the value of the input to the service composition. This translation gives us two more formulae: \(\varphi _\mathcal {S}\) and \(\varphi _\sigma \). These two formulae describe a structure and state on which the postcondition cannot be satisfied.
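For the running example, this translation could look as follows; the universe size and the interpretation of \( isVegan \) are whatever the solver chose in the fault structure, so the concrete values below are assumptions for illustration:

```smt2
; universe of Restaurant: two values, and no others
(declare-const Restaurant!0 Restaurant)
(declare-const Restaurant!1 Restaurant)
(assert (forall ((x Restaurant))
  (or (= x Restaurant!0) (= x Restaurant!1))))
; the interpretation of a predicate is fixed by enumerating all cases
(assert (= (isVegan Restaurant!0) true))
(assert (= (isVegan Restaurant!1) false))
; from the state, only the composition's input is fixed
(assert (= in Restaurant!0))
```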

Proposition 1

Let SC be a service composition with pre- and postcondition pre and post, respectively, and K an ontology. Let \(pre \wedge \varphi _{SC} \wedge \varphi _K \wedge \lnot post\) be satisfiable and \((\mathcal {S},\sigma )\) be the logical structure and state making the formula true. Then

$$ pre \wedge \varphi _{SC} \wedge \varphi _K \wedge post \wedge \varphi _\mathcal {S}\wedge \varphi _\sigma $$

is unsatisfiable.

For fault localization, we next ask for partial maximum satisfiability, making all assertions hard except for those of the service composition (the fault is in the service composition). The query thus is (underlining all hard constraints)

$$\underline{pre} \wedge \varphi _{SC} \wedge \underline{\varphi _K} \wedge \underline{post} \wedge \underline{\varphi _\mathcal {S}} \wedge \underline{\varphi _\sigma } $$

For every clause in this formula, the solver can tell us whether the clause has or has not been made true. Let \(F= \{c_1, \ldots , c_n\}\) be the set of (soft) clauses not made true. F constitutes (one) fault set. By changing all assertions (and hence statements) in F, the service composition can be corrected. Similarly to Jose et al. [13], we can also find different fault sets (for the same fault) by adding a new hard constraint \(c_1 \vee \ldots \vee c_n\) to our formula.
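In Z3, this query can be sketched with soft assertions, assuming the statement names introduced by the translation; everything else stays a plain (hard) assertion:

```smt2
; hard: pre, post, ontology, fault structure and state (plain asserts)
; soft: the named workflow assertions, grouped under one id
(assert-soft a2      :id sc)
(assert-soft cond3   :id sc)
(assert-soft branch3 :id sc)
(check-sat)          ; Z3 maximizes the number of satisfied soft clauses
; after the fault set {cond3} is reported, block it with a hard clause
(assert cond3)
(check-sat)          ; the next solution yields a different fault set
```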

In case of our example service composition, the solver always returns a singleton fault set. The fault sets we get by repeatedly starting the solver and adding new hard clauses are \(F_1 = \{branch3\}\) (the if statement in line 3 itself), \(F_2 = \{cond3\}\) (the condition of the if statement in line 3), \(F_3 = \{cond1\}\) (the condition of the if statement in line 1), \(F_4 = \{ a4\}\) (the assignment in line 4) and \(F_5 = \{a2\}\) (the assignment in line 2). A correction of any one of these can make the service composition correct. The easiest correction is to change the condition in line 3 to \( isHigh(r) \).

Table 3. Translation of logical structure \(\mathcal {S}\) and state \(\sigma \)

4.2 Prioritizing Fault Types

The MAX-SAT solver typically returns the fault sets in an arbitrary order, giving no preference to any soft clause. If we are interested in certain types of faults and prefer to see these first, we need to assign weights to soft clauses. Z3 then solves an optimization problem: it tries to find a maximum satisfiability solution which minimizes penalties. A weight therein imposes a penalty for not making the clause true (with penalty 1 being the default).

If our interest is thus in localizing faults in conditions or service calls first, delaying more difficult faults in if- or while-statements to later phases, we can tag the corresponding clauses with higher penalties. For our translation, we replace the plain soft assertions for if-statements and loops by weighted soft assertions, with a high penalty for if- and an even higher one for while-statements. The solver would then first return fault sets with assignments, service calls and conditions. In our case, the fault set \(F_1\) would then be returned as the last set.
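With Z3's weighted soft assertions, this prioritization can be sketched as follows; the concrete weights are illustrative (only their relative order matters), and loop5 is a hypothetical name for a while-statement:

```smt2
(assert-soft a2      :weight 1)   ; assignments: blamed first
(assert-soft cond3   :weight 1)   ; conditions and service calls likewise
(assert-soft branch3 :weight 10)  ; if statements: blamed reluctantly
(assert-soft loop5   :weight 100) ; while statements: blamed last
```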

4.3 Finding Faults Occurring Multiple Times

The previous technique helps to prioritize certain fault types over others. For narrowing down the number of faults to look at, we can also determine the statements occurring as faults in more than one “faulty run”, i.e., in our case in more than one logical structure \(\mathcal {S}\) making formula (1) true. For this, consider the example service composition in Fig. 4.

Fig. 4. Service composition RestaurantChecker2

The service composition RestaurantChecker2 is supposed to check whether the restaurant of input parameter in is vegan and in addition either cheap (\( isCheap \)) or accessible to handicapped persons (\( isHAcc \)). We assume the knowledge base to specify \( isCheap(R0) \). Again, the service composition is not correct. Fault localization gives us – now phrased in terms of the service composition, not its translation to SMT code – the two fault sets \(F_1^1 = \{ \lnot (isVegan(in)) \}\) and \(F_2^1 = \{ isHAcc(in) \}\), both based on the same logical structure and state invalidating the postcondition. The state \(\sigma \) with its valuation of the input in describes a “test run” of the service composition passing through lines 1, 2 and 3. We call \(\mathcal {F}^1 =F_1^1 \cup F_2^1\) a test-specific fault set. The idea is now to generate different “test inputs” and compute their fault sets.

To this end, we add an assertion forcing the current value of the input to take a different interpretation via predicate complementation. Let \(\mathcal {I}\) be the interpretation and \(\sigma \) the state returned by query (1). Let i be the input to the service composition and T its type. We now choose a unary predicate \(p: T \rightarrow Bool\) from our ontology. If \(\mathcal {I}(p)(\sigma (i))\) is true, we add the assertion \(\lnot p(i)\), otherwise we add p(i). For our example, we add the corresponding negated assertion.
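A sketch of such an assertion, assuming the first fault structure interprets \( isCheap \) as true on the input value (the choice of complemented predicate is an assumption):

```smt2
; force the next "test input" to differ on isCheap
(assert (not (isCheap in)))
(check-sat) ; if still sat, a new fault structure and new fault sets follow
```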

If formula (1) together with this assertion is still satisfiable, we repeat the fault localization procedure thereby getting new fault sets. If the formula is not satisfiable, we choose a different predicate to be complemented and retry. This way, we get several test-specific fault sets \(\mathcal {F}^1, \mathcal {F}^2, \ldots \). The intersection of these test-specific fault sets gives us faults which – when corrected – can correct more than one faulty test input at the same time.

In our case, we get a second test-specific fault set

$$ \mathcal {F}^2 = \{ \lnot (isVegan(in)) , isCheap(in) \}\ . $$

Intersection with \(\mathcal {F}^1\) gives us \( \lnot (isVegan(in)) \), which is where the correction should be applied.

In general, predicates to be complemented should not be arbitrarily chosen. Predicates occurring in boolean conditions of the service composition are to be preferred over others (as they steer the execution to different branches). If none of the unary predicates bring us new fault sets, arbitrary n-ary predicates can be considered as well.

5 Conclusion

In this paper, we have introduced a fault localization technique for service compositions. It builds on an existing method for software fault localization via maximum satisfiability solving. Due to our setting of model-based development, our approach needs to account for several additional aspects: (1) In contrast to software, models are not executable and hence faulty test inputs cannot be derived via testing but only via SMT solving; (2) concepts of ontologies are undefined function symbols in a logical sense and hence have no fixed interpretation; (3) faulty inputs alone are not sufficient for fault localization when no interpretation is known. In addition to plain fault localization, we furthermore provide a method for prioritizing fault types and for finding faults occurring in multiple runs.

This fault localization technique is currently implemented within the modelling and verification tool SeSAME [3]. First experiments show that fault localization can be done within seconds and often just proposes singleton faults which can easily be corrected. However, more experiments are needed to see whether this observation can be generalized to more cases.