Keywords

1 Introduction

Model-Based Systems Engineering (MBSE) [1] is often applied to design large scale systems, because it can make sure of their reliability and save the cost of modification effectively. The systems modelling language SysML [2, 3] can support effective use of MBSE, for its well-designed mechanism for creating object-oriented models, which can be combined with software, people, material and other physical resources. In MBSE, SysML models are often used as the design for code. It means that whether the SysML model meets the users’ requirement in relation to the high reliability of the code. Unfortunately, to the best of our knowledge from the literature, there are few tools to support the verification of SysML models [4, 5] in particular rigorous ways of verification.

Testing-Based Formal Verification (TBFV) proposed by Liu [6,7,8] shows a rigorous, systematic, and effective technique for the verification and validation of code. TBFV integrated the specification-based testing approach and Hoare logic to verify the correctness of all the traversed program paths during testing. The advantage of TBFV is its potential and capability of achieving full automation for verification utilizing testing. However, the current TBFV is mainly designed for sequential code in which all of the details are formally expressed, and there is no research on applying it to verify SysML models yet.

In this paper, we discuss how the existing TBFV can be applied to SysML models for their verification and we use TBFV-M (testing-based formal verification for models) to represent the newly developed approach. Since SysML Activity Diagrams can model the systems dynamic behavior and describe complex control and parallel activities, our discussion in this paper focuses on the activity diagrams.

The essential idea of TBFV-M is as follows. All of the functional scenarios are first extracted from a given formal specification defining the users’ requirements. And at the same time, test paths are generated from corresponding SysML Activity Diagrams waiting to be verified. Then, test paths are matched with functional scenarios by a given algorithm. After this, the pre-condition of the test path is automatically derived by applying the assignment axiom in Hoare logic based on the functional scenario. Finally, the implication of the pre-condition of the specification with the guard condition of the functional scenario to the derived pre-condition of the path is verified which concerns the accuracy of the activity diagram. And the processing method of dealing with invocation is also be proposed by TBFV-M.

The remainder of the article will detail the TBFV-M method. Section 2 presents related work we have referenced. Section 3 characterizes the definitions of basic terms and concepts. Section 4 introduces TBFV and the derivation of the main idea of TBFV-M. Section 5 describes the principle of TBFV-M, showing the core technology of TBFV-M. Section 6 uses one case study to present the key point of TBFV-M. Finally, the details of the implementation are presented in Sect. 6 and Sect. 7 concludes the paper.

2 Related Work

2.1 Testing-Based Verification

Considering the shortcoming of formal verification based on Hoare logic being hard to automate, Liu proposed the TBFV (Testing-Based Formal Verification) method by combining specification-based testing with formal verification [6]. This method not only take the advantage of full automation for testing, but also the efficiency of error detection with formal verification. Liu also designed a group of algorithms [9] for test cases generation from formal specification written in SOFL [10]. A supporting tool [8] is also developed. These efforts have significantly improved the applicability of formal verification in industrial settings.

Raimondi [11] addressed the problem of verifying planning domains written in the Planning Domain Definition Language (PDDL). First, he translated test cases into planning goals, then verified planning domains using the planner. A tool PDVer is also generated. In this paper, testing is also used during verification and the effectiveness and the usability is improved.

2.2 Test Case Generation

Lasalle [12] utilized the existing UML/OCL Model-Based Test generation tool, Smartesting Test DesignetTM. He designed rewriting rules to translate a SysML model into an equivalent UML model. The advantage of this process is that we can use the existing UML tools to handle the SysML model.

Nayak [13] introduced an approach to transform the particular Activity Diagram into a model that can be used for testing, called ITM, based on its structure characteristics. The advantage of using ITM is that it can simplify the process of extracting and analyzing test scenarios based on the coverage criteria. However, it also has limitations on processing unstructured Activity Diagram because the unstructured Activity Diagrams shape is out of structure.

Oluwagbemi [14] proposed a new concept called activity flow tree (AFT) and it can store the information obtained by traversing the activity diagram. Then, AFT is used as an intermediate expression to generate test cases automatically. They designed the transformation and generation algorithm and compared their achievement with the work done by the predecessors.

Inspired by Liu’s work, we apply and extend the TBFV approach to models and propose the TBFV-M. A model is more intuitive than a formal specification because it requires less relevant background knowledge and is easier to communicate with customers. TBFV approach shows the treatment of code, while TBFV-M approach deals with SysML Activity Diagrams. And different with Feng Liang’s work, TBFV-M approach do not use other supporting tools, like Modelica, we merely use Hoare Logic to do the verification. Referring to test case generation, TBFV-M approach can deal with unstructured diagrams, which may have stronger processing power than existing approaches.

3 Related Concept

3.1 Formal Definition of Activity Diagram

Activity Diagram Formal Definition [2] can be represented as:

$$ AD = \left( {Node;\,Edge} \right) $$
(1)

Node is a set of nodes of which definition as follow:

$$ \begin{aligned} & Node \, = \{ InitialNode;\,FlowFinalNode;\,ActivityFinalNode;\,Action{ - }Node;\,ActivityNode; \\ & ForkNode;\,JoinNode;\,DecisionNode;\,MergeNode;\,RecieveSignaNode;\,SendSignalNode\} \\ \end{aligned} $$
(2)

InitialNode signifies the beginning of Activity Diagram, while ActivityFinalNode signifies the ending of Activity Diagram. Edges defines the relationship between nodes such that:

$$ Edge = \{ (x,y)|x,y \in Node\} $$
(3)

There are two types of edges: control flow and object flow. Control flow edges represent the process of executing token passing in AD and object flow edges are used to show the flow of data between the activities in AD.

3.2 Test Case

From a global view, test case based on the SysML activity diagram consists of test path and test data. And the definition is as followed:

$$ TC\left( {AD} \right) = \left( {Path;\,Data} \right) $$
(4)

For Activity Diagram, test path consists of a series of actions and edges in the diagram. Based on the formal definition of the activity diagram given above, the test path is defined as follow:

$$ path = \, \left( {a_{1}^{\prime } , \, a_{2}^{\prime } , \ldots ,a_{n}^{\prime } } \right) $$
(5)
$$ a_{i}^{\prime } = \left( {t_{n} , \, a_{n} } \right),\,\left( {i = 2, \ldots , \, n} \right) $$
(6)
$$ t_{n} = a_{i - 1} \to a_{i} ,\,\left( {i = 2, \ldots ,n} \right) $$
(7)

In these formulas, ai means node, ti means edge. In this case, a test path is a set of nodes, starting from node a1 and ending with node an through the transition edges t2 … tn.

Test data indicates the input information corresponding to a particular test scenario including various types of data, even user actions and so on.

3.3 Test Coverage Criteria

For software, the adequacy measurement of testing is reflected in the rate of coverage and effectiveness of the test case. These coverage criteria ensure the sufficiency of testing and provide implications for the test case generation algorithm. Here are four test coverage criteria used in our design, for test case generation of SysML activity diagram [15, 19, 20]:

  • Action coverage criteria: In software testing process, testers are often required to generate test cases to execute every action in the program at least once.

  • Edge coverage criteria: In software testing process, testers are often required to generate test cases to pass every edge in the program at least once.

  • Path coverage criteria: These coverage criteria require that all the execution paths from the programs entry to its exit are executed during testing.

  • Branch coverage criteria: These coverage criteria generate test cases from each reachable decision made true by some actions and false by others.

3.4 Hoare Logic

Hoare Logic is a formal system developed by Hoare [21, 22], and it is designed for the proof of partial correctness of a program. In Hoare Logic, the Hoare Triple [23] is best known and is also referenced in our method. The Hoare triple is of this form:

$$ \left\{ {\text{P}} \right\}\,{\text{C}}\,\left\{ {\text{Q}} \right\} $$
(8)

P and Q are assertions and C is a command. P is named the pre-condition, which is a predicate expression describing the initial states and Q the post-condition, which is also a predicate expression describing the final states.

Hoare also established necessary axioms to define the semantics of each program construct, including axiom of assignment, rules of consequence, axioms of composition, axioms of alternation, iteration and block. Axiom of assignment is used in our work, so we will briefly introduce it:

$$ \left\{ {Q\left( {E\backslash x} \right)} \right\}\,x: = E\,\left\{ Q \right\}, $$
(9)

where x is a variable identifier, E is an expression of a programming language without side effects, but possibly containing x, Q(E\x) is a predicate resulting from Q by substituting E for all occurrences of x in Q. This axiom means that to verify the correctness of the assignment, the postcondition Q should be satisfied. This equals to Q[E\x] are true because x is assigned by representing E after the execution.

3.5 Functional Scenario Form

A functional scenario is a logical expression that tells clearly what condition is used to constrain the output when the input satisfies some condition. Spre and Spost denote the pre- and post-conditions of operation S. Let:

$$ S_{post} = (G_{1} \wedge D_{1} ) \vee (G_{2} \wedge D_{2} ) \vee \ldots \vee (G_{n} \wedge D_{n} ), $$
(10)

Gi and Di (i ∈ 1, …, n) are two predicates, called guard condition and defining condition, respectively. The definition of functional scenarios and FSF (functional scenario form) are listed below:

$$ Functional\,Scenario = S_{pre} \wedge G_{i} \wedge D_{i} $$
(11)

In the definition of functional scenario, Spre ∧ Gi ∧ Di is treated as a scenario: when Spre ∧ Gi is satisfied by the initial state (or intuitively by the input variables), the final state (or the output variables) is defined by the defining condition Di. The conjunction Spre ∧ Gi is known as the test condition of the scenario, which serves as the basis for test case generation from this scenario.

$$ FSF \, = \, (S_{pre} \wedge G_{1} \wedge D_{1} ) \vee (S_{pre} \wedge G_{2} \wedge D_{2} ) \vee \ldots \vee (S_{pre} \wedge G_{n} \wedge D_{n} ) $$
(12)

3.6 Path Triple

The path triple is similar in structure to Hoare triple, but is specialized to a single path rather than the whole program and the definition is below:

$$ \{ S_{pre} \wedge G_{i} \} \,P\,\left\{ {D_{i} } \right\}, $$
(13)

P is called a program segment, which consists of decision (i.e., a predicate), an assignment, a return statement, or a printing statement. It means that if the pre-condition Spre and the guard condition Gi of the program are both true before path P is executed, the post-condition Di of path P will be true on its termination.

4 TBFV and TBFV-M

4.1 TBFV

TBFV is a novel technique that makes good use of Hoare logic to strengthen testing. The essential idea is first to use specification-based testing to discover all traversed program paths and then to use Hoare logic to prove their correctness. During the proof process, all errors on the paths can be detected.

Testing is a practical technique for detecting program errors. A strong point of testing superior to formal correctness verification is that it is much easier to be performed automatically if formal specifications are adopted [19], but a weak point is that existing errors on a program path may still not be uncovered even if it has been traversed using a test case. TBFV takes advantage of testing, realized full automation for error detection efficiency, and also overcome its weak point by making good use of relevant part of Hoare logic.

4.2 TBFV-M

In the last decade, the model-driven approach for software development has gained a growing interest of both industry and research communities as it promises easy automation and reduced time to market [17]. Because of the graphical notation for defining system design as nodes and edge diagrams, SysML model addresses the ease of adoption amongst engineers [18] (Fig. 1).

Fig. 1.
figure 1

TBFV-M usage scenario.

During the Model-Driven process, model is an important medium for the Model based system engineering development. The TBFV-M method takes the specification describing the users’ requirements and the SysML Activity Diagram model as input and verifies the correctness of the SysML model according to the specification. The TBFV-M method is mainly used to verify whether SysML Activity Diagram model meets the user’s requirements written in SOFL (Structured-Object-oriented-Formal Language).

5 Principle of TBFV-M

The procedure of TBFV-M is illustrated in Fig. 2. We find that functional scenarios are derived from the specification written in the pre-/ post-condition style, while test paths are generated from the Activity Diagram and the data constraints can be extracted from each test path. Then, the extracted data constraints are used to match with functional scenarios. A matching algorithm is defined by us. We will verify the successful matched the test path according to the requirements represented in specification. The verification part can be separated into three parts: first, create a path triple, and then use the axiom of Hoare Logic to derive pre-assertion for each test path. Finally, prove the implication of the pre-condition in the specification and pre-assertion. If we can prove all the implication of pre-assertion of all the test paths of the model and the matching pre-condition, then we conclude that the model is to meet the requirements.

Fig. 2.
figure 2

TBFV-M processing procedure.

5.1 Unified Formal Expression

Using a unified formal expression can not only reduce the ambiguity during communications, but also give a possibility to automate the entire process, making analysis and verification more accurate and efficient.

We establish the unified formal expression, including specification guide and modeling guide. Specification reflects complete requirements and we chose SOFL to describe formal specification. The SOFL method intergrades formal methods, structured methods and object-oriented methodology, which not only supports requirements analysis and specifications, but also play an import role during design and implementation stages. An example specification written in SOFL is given below. It describes that if a non-negative integer a equals to zero, TRUE will be returned; otherwise return FALSE.

figure a

5.2 Functional Scenarios Derivation

The overall goal of functional scenario derivation is to extract all functional scenarios completely in “Spre ∧ Gi ∧ Di” form (FSF), as mentioned above in related concept section. A systematic transformation procedure, algorithm, and software tool support for deriving an FSF from a pre-post style specification written in SOFL have been developed in our previous work [16].

The below segment of the process “equal_zero”, mentioned previously, shows the FSF generated from the specification described in the last one.

figure b

5.3 Test Path Generation

A test path auto-generation tool based on the SysML Activity Diagram model takes the model as input and generates test cases as outputs automatically, according to test path generation algorithms and coverage criteria chosen by test group members.

The SysML Activity Diagram test path generation includes three parts. First, we use transformation algorithm to compress the input Activity Diagram, which may contain unstructured module. The transformation is an iteration process, dealing with loop module, concurrent module and the problem of multiple starting nodes separately. After compressing, we transform this unstructured activity diagram into an intermediate representation form Intermediate Black box Model (IBM). IBM consists of one basic module and a map from black box to the corresponding original actions. The third phase of our approach is test path generation based on IBM. In this phase, two problems should be solved, which are basic module test path generation and black box test path generation. Details of automated test paths generation algorithm and implementation of unstructured SysML Activity Diagram has been developed in our previous work [24].

We give a motivation case to show the above process. Figure 3 is an unstructured SysML activity diagram model, which contains a concurrency module and a loop module.

Fig. 3.
figure 3

Motivation example.

Figure 4 shows how to compress an unstructured activity diagram and transform the unstructured module into a black box node. Eventually the unstructured activity diagram converts into an intermediate representation of IBM. The first step is to identify the loop module and compress it into a black box node while-do loop1, shown in Fig. 4(a). The compressed black box node is the intermediate representation of the loop shown in the following Fig. 5(a).

Fig. 4.
figure 4

Transformation process.

Fig. 5.
figure 5

Motivation example.

The second step is to identify the noJoin concurrency module and compress it into a black box node No FJ1, shown in Fig. 4(b). The compressed black box node is shown in the following Fig. 5(b).

Figure 5(b) is a compressed and structured SysML activity diagram that can be used to automatically generate test cases. Finally, the black box module can be replaced.

5.4 Matching Algorithm

Matching the test path with functional scenario is very important for verification. In order to verify the correctness of one path in Activity Diagram, we need to match it with corresponding functional scenario. The constraints of test path can be extracted from edges of each path, which are used to compare with Spre∧ Gi part of functional scenario. If unmatched test paths or functional scenarios appears, it means some errors may be exist in this model. And the model needs to be modified. The matching algorithm is given below.

figure c

Matching algorithm takes the edge list and FS_list as input. Edge list is the collection of guard conditions saved from test path and FS_list is extracted functional scenario form from specification. First, the algorithm sets the label of the two lists unvisited. And for each edge in edge list do data integration. Data integration is like data intersection. For example, if we contain two guard conditions x < 6 and x < 60, the integration of it is x < 6.

After completing the initialization step, find a matching functional scenario for each element in edge list. The specific operation is: the edge after the integration compares with Spre ∧ Gi in the functional scenario, if exactly the same, then we mean that we find the edge with the matched functional scenario. If there is no exact matched functional scenario, then there is an inaccurate modeling problem and needs to be refined. Therefore, immediately terminate the program, the problem of the edge will also be returned. After traversing all the edge_list, we also need to check whether each in FS_list has been visited. If there is an unvisited functional scenario, then it means that there is a requirement that the model fails to be represented in the specification, and the model needs to be refined.

5.5 Path Triple Establishment

Establish Path Triple and apply each node with the axiom in Hoare Logic. “(Spre ∧ Gi ∧ Di) (i = 2, …, n)” denote one functional scenario and P = [node1; node2; …; nodem] be a program path in which each nodej(j = 2, …, n) is called a functional node, which is a DecisionNode, ActionNode, or other activity diagram nodes. Assume each path P has its own target functional scenario, which is decided utilizing matching algorithm. To verify the correctness of P with respect to the functional scenario, we need to construct Path Triple: {Spre} P {Gi∧ Di}.

The path triple is similar in structure to Hoare triple, but is specialized to a single path rather than the whole program. It means that if the pre-condition Spre of the program is true before path P is executed, the post-condition Gi ∧ Di of path P will be true on its termination. Repeatedly apply the axiom for assignment to derive a pre-assertion, denoted by Ppre. Finally, we can form the following expression:

$$ \{ Spre \wedge Gi\} \, \to Ppre, $$
(14)

where Spre, Ppre and Gi ∧ Di are a predicate resulting from substituting every decorated input variable ~x for the corresponding input variable x in the corresponding predicate, respectively. And the correctness of the specific path is transformed into the implication Spre ∧ Gi → Ppre. If the implication can be proved, it means that no error exists on the path; otherwise, it indicates the existence of some error on the path.

5.6 Implication

Prove the implication. Finally, the correctness of one path whether it meets the corresponding requirement is changed into the proof of the implication “Spre ∧ Gi → Ppre”. If the implication can be proved, it means that the path can model one part of the requirement; otherwise, it indicates the existence of some error on the path.

Formally proving the implication “Spre ∧ Gi → Ppre” may not be done automatically, even with the help of a theorem prover such as PVS, depending on the complexity of Spre and Ppre. Our strategy is as follows: if the complexity of data structure is not high, we will transform the problem into solver, which can achieve full automation. Otherwise, if achieving a full automation is regarded as the highest priority, as taken in our approach, the formal proof of this implication can be “replaced” by a test. That is, we first generate sample values for variables in Spre and Ppre, and then evaluate both of them to see whether Ppre is false when Spre is true.

For example, if we need to judge the validity of the implication “(price > 0) → (price < 100 AND ~ price-5 = ~price2 - ~price”, use the test case (price, 60) and we can easily prove the implication is not correct.

5.7 Invocation

During the process of design, especially for the complex system, modularization is very necessary when modelling, according to users’ requirements. Model driven software development process often faces the problem of function or module invocation.

Because the TBFV-M method needs to deal with functional scenario derivation from specification describing users’ requirement and test path generation from SysML activity diagrams, we need to take both side into account while dealing with invocation.

For specification, if a function invocation is used as a statement, it can change the current state of a program. So that, the traversed path containing the invoked function should consider in deriving the pre-assertion of the invocated function. Our solution is utilizing the sub path of the invocated function to substitute the actual traversed path, while deriving the functional scenario form. Also, we need to append the pre-condition of invocated function into the Spre of particular functional scenario and during the above process parameter substitution needs to be considered.

To express the idea, we will give a motivation example.

figure d

While deriving, we can get the below functional scenario. HalfPrice is the invocation function.

figure e

According to the solution we mentioned above, we will substitute the original form with the sub path of invocation function and the actual parameter price is replaced by fare in the invocation function. The result is shown below.

figure f

For Activity Diagram, “Activity” is often used to realize the hierarchy design. Our solution is also utilizing the sub path of the invocated activity to substitute the actual traversed test path, while generating test path.

6 Case Study

Now we show a motivation example to detail the process of TBFV-M method. First, we will get a requirement from the user, which consists of inform the description, may like this: “The park will give the tourist fare discount according to their age. If he is younger than 6 or older than 70, he will be free; Or if he is between 60 and 70, he can enjoy the half price, otherwise he will pay the normal price”. This specification is formal and structured, as shown in the last section.

According to the specification, we can construct a set of SysML model and the Activity Diagram is shown below (Fig. 6).

Fig. 6.
figure 6

Case study.

First, we derive Functional Scenarios from specification and generate test paths from Activity Diagram. The result is shown as below.

figure g

Because of the invocated activity, we should substitute the original test path, like T4, into the update version, T4′, by substituting activity0 with its sub actions.

figure h

At the same time, we can extract data constraints from each test scenario, which is used for matching with functional scenario. Then, the matching process is shown below.

figure i

The blow segment chose the forth path and matched the first functional scenario as an example and shows the substitution process, from bottom to up.

figure j

Finally, we turn this verification problem into proving whether the pre-condition of specification can imply Ppre. If it can be proved, means that the path satisfies the requirement. As the strategy of implication mentioned before, this implication uses simple data structure, so that we use testing to access the procedure of verification. In this case, we prove it is correct.

7 Conclusion

We presented an approach, known as TBFV-M (Testing-Based Formal Verification for Model), for requirement error detection in SysML Activity Diagrams by integrating test cases generation and Hoare Logic. The principle underlying TBFV-M is first to derive functional scenarios form specification and generate test scenarios from Activity Diagrams. Then match them and verify each test scenario according to the corresponding functional scenario. Hoare logic is used during the verification process. TBFV-M method made up the limitation of TBFV, not concerning about models and solved the problem of inconsistent, incomplete, and inaccurate models. We also give a solution to deal with the invocation problem. It has advantage in reducing the probability of system error and shortening the developing time.