Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 The Software Product Line Challenge

Software product line engineering (SPLE) is an increasingly popular development paradigm for highly customizable software. SPLE allows companies to achieve economies of scale by developing several similar systems together.

SPLE is now widely embraced by the industry, with applications in a variety of domains ranging from embedded systems (e.g., automotive, medical), system software (e.g., operating systems) to software products and services (e.g., e-commerce, finance). However, the benefits of SPLE come at the cost of added complexity: the (potentially large) number of systems to be considered at once, and the need for managing their variability in all activities and artifacts.

This added complexity also applies to the verification of the products’ behaviour. A simple but cumbersome approach for product line verification consists in applying classical model checking algorithms [37] on each individual product of the family. However, for an SPL with n features, this would lead to \(2^n\) calls of the model checking algorithm. This solution is clearly unsatisfactory and should be replaced by new approaches that take the variability within the family into account. Those approaches often rely on compact mathematical representations on which a specialized model checking algorithm can be applied. The main difficulties are (1) to develop such a model checking algorithm, and (2) to propose mathematical structures that are compact and flexible enough to take the variability of the family and its specification into account.

In [10], we introduced Featured Transition Systems (FTS), an extension of transition systems used to represent the behaviour of all the products of an SPL in a single compact structure. We also showed how this representation can be exploited to perform model checking of product lines in an efficient way. In the rest of this paper, we briefly re-introduce FTS and summarize existing model checking algorithms for them. We also briefly show that FTS can be exploited to perform testing of software product lines. This is only a brief summary of the work that is presented at SOFSEM’17. More details can be found in our different papers cited below. Finally, we have to highlight that related work on product-line verification is vast and varied. To the best of our knowledge, effort in compiling related work on this topic can be found in the theses of Classen [5] and Cordy [11]. Beohar et al. recently compared the expressiveness of different SPL formalisms and found that FTS is the most expressive one [4].

Fig. 1.
figure 1

Several variants of a vending machine.

2 Featured Transition Systems

Let us introduce Featured Transition Systems with a classical vending machine example. The example is a short version of the one we presented in [8]. In its basic version, the vending machine takes a coin, returns change, serves soda, and eventually opens a compartment so that the customer can take her soda, before closing it again. This behaviour is modelled by the transition system shown in Fig. 1(a). There exist other variants of this vending machine. As an example, consider a machine that also sells tea, shown in Fig. 1(b). Another variant lets the customer cancel her purchase after entering a coin, see Fig. 1(c). A fourth one offers free drinks and has no closing beverage compartment, see Fig. 1(d). This variability hints that the vending machines could be developed as an SPL, of which four features can be already identified: the sale of soda, the sale of tea, the ability to cancel a purchase and the ability to offer drinks for free.

By combining these features differently, yet other vending machines can be obtained. However, not every combination of features yields a valid system (e.g., a vending machine should at least sell a beverage). One can use variability models to represent the sets of valid products. In SPLE, feature diagrams [30, 36] are the most common incarnation of variability models. The feature diagram for the vending machine SPL is shown in Fig. 2. This feature digram formally describes a set of vending machines; twelve of them. A model of the behaviour of a small example such as this would already require twelve, largely identical, behavioural descriptions, four of which are shown in Fig. 1.

Fig. 2.
figure 2

FD for the vending machines of Fig. 1.

Fig. 3.
figure 3

FTS of the vending machine.

FTS are meant to represent the behaviour of the myriad instances of an SPL in a single transition system. In fact, the main ingredient of FTS is to associate transitions with features that condition their existence. Consider again our vending machine example. Figures 1(b) and (c) show the impact of adding features Tea and CancelPurchase to a machine serving only soda: both add two transitions. FreeDrinks replaces ➀\(\begin{array}{l}\xrightarrow {pay}\end{array}\!\!\)\(\begin{array}{l}\xrightarrow {change}\end{array}\!\!\)➂ by a single transition ➀\(\begin{array}{l}\xrightarrow {free}\end{array}\!\!\)➂ and ➆\(\begin{array}{l}\xrightarrow {open}\end{array}\!\!\)\(\begin{array}{l}\xrightarrow {close}\end{array}\!\!\)➀ by ➆\(\begin{array}{l}\xrightarrow {take}\end{array}\!\!\)➀. The FTS of the whole vending machine SPL is given in Fig. 3. The feature label of a transition is shown next to its action label, separated by a slash. In these labels (and by conveniency in the rets of this paper), we use the abbreviated feature names from Fig. 2. The transitions are coloured in the same way as the features in Fig. 2.

3 Verifying SPLs with FTS

Over the years, we have developed a series of model checking algorithms that exploit the compact structure of FTS to verify sets of requirements on product lines. We first recap the meaning for product line requirements, and then briefly summarise our results.

3.1 What Are Product Lines Requirements?

The requirements of an SPL are requirements imposed over a subset of its products. As such, they can be represented as a formula in temporal logic preceded by a Boolean formula over the SPL features, which represents the set of products whose behaviour must satisfy the temporal formula. As an example, in single systems one can check that “the system can never reach a bad state”. The product-line counterpart of this property would be: “all valid products can never reach a bad state”. The objective of an SPL verification algorithm is thus to discover all products that do not satisfy a given property, and a proof of violation (i.e. a counterexample) for each of them. One can also extend our queries to quantitative, real-time or even stochastic requirements. For example, the single-product property “is the probability to satisfy the safety requirement greater than 0.5” becomes “what are the products for which the probability to satisfy the safety requirement is greater than 0.5” in the product-line realm.

3.2 How to Exploit the FTS Structure to Model Check Requirements: A Sketch

Let us now illustrate how one can exploit the FTS structure to reason on a classical verification problem: finding all the reachable states. Consider again the vending machine FTS of Fig. 3. State ➀ is an initial state, and thus reachable by all products. From there, the transition ➀\(\begin{array}{l}\xrightarrow {pay}\end{array}\!\!\)➁ can only be fired by products in \([\![v \wedge \lnot f]\!]\). Transition ➁\(\begin{array}{l}\xrightarrow {change}\end{array}\!\!\)➂ can be fired for all products in \([\![v]\!]\), and so state ➂ is reachable by the same products as state ➁. Proceeding in this way, we compute in one step the reachability relation of s for all the products. The presence of the feature diagrams permits us to ignore products that are not part of the product line. We also observe that if we find a state s reachable by a set of products A and discover later that s can also be reached by a set of product \(B\subseteq A\), then it is enough to consider the superset A.

Considering sets of products during the verification makes us move from an enumerative approach to a product line approach, which benefits from the common behaviour of the products. Interestingly, the theoretical complexity of our algorithms is higher than that of their enumerative counterpart. However, due to the structure of the FTS, experiments show that in practice the former is faster (see, e.g., [8] for extended comparisons). Observe that the efficiency of our approach also rely on efficient representation of sets of products. In [8], we showed that the best way is to represent them with Boolean formulas. We also showed that our approach remains efficient in quantitative settings, e.g. when properties are real-time [15] or in case the features are not Boolean [16].

3.3 Summary of Our Results

Let us now briefly summarize the results we have obtained over those six last years. Our first algorithms have mainly focused on extending model checking properties of Linear Temporal Logic (LTL) to FTS [8, 10]. We then moved to CTL and symbolic algorithms [9]. We have then proposed extensions of FTS that allow us to reason on more quantitative aspect of systems. This includes real time to specify timing constraints in a timed automaton fashion [15], and probabilities that allows us to make quantitative hypotheses via a combination of FTS and Markov chains [34]. Behavioural relations such as simulation were also extended to FTS. There, one tries to compute the set of products for which two states are in simulation [12]. This allows us, among many other possibilities, to define a CEGAR-based abstraction for FTS [14]. In all those algorithms, the root has always been to efficiently represents pairs of (state,product).

Fig. 4.
figure 4

Features of proveline

Tool. Some of our results have been implemented in ProVeLines: a product line of verification tools for QA on different types of product linesFootnote 1. The structure of the tool is that of an SPL, whose corresponding feature diagram is presented in Fig. 4 (taken from [17]). One can observe that the tool provides several opportunities to describe both systems (discrete, real-time) and requirements (reachability, simulation, LTL). The constraints on the top left of Fig. 4 informs us that using the real-time specification for systems disables the possibility to use LTL and simulation algorithms. Otherwise, this would require the use of dense-time verification algorithms.

Any ProVeLines variant requires at least two artefacts from the user: an FD and an fPromela model. For the former, we use TVL [6, 16], one of the latest incarnations of FDs, due to some of its advantages: high expressiveness, formal semantics and tool support. fPromela is a feature-oriented extension of Promela [27], which we defined as a high-level language on top of FTS. An fPromela model thus describes the behaviour of all the products defined by the FD [7, 16].

4 SPL Testing with FTS

FTSs, as concise models of SPLs behaviours, can also support model-based testing (MBT) activities. In our research we investigated two directions: (i) coverage and (ii) synergies between SPL testing and mutation analysis.

4.1 SPL Coverage Analysis

Extending Usual Coverage Criteria. Since FTS are extensions of transition systems, a natural research direction was to consider “usual” coverage criteria (e.g., all-states, all-transitions) for product-line test generation [21]. In our work, we modelled test cases in terms of sequences of actions. There are thus abstract by nature since in the FTS formalism, actions are simple labels without any input or output. Additionally, an Abstract Test Case (ATC) may not be executable. As we have seen, each transition can only be executed by the set of products that match the associated feature expression. If we consider a sequence of actions, we have to conjunct these feature expressions and check the satisfiability of the resulting expression to know which product(s) can execute this abstract test case. If the formula is not satisfiable, there is no product that can execute the behaviour described in this abstract test case. For example, ATC = \(\{pay,change,tea,serveTea,take\}\) leading to the run ➀\(\begin{array}{l}\xrightarrow {pay}\end{array}\!\!\)\(\begin{array}{l}\xrightarrow {change}\end{array}\!\!\)\(\begin{array}{l}\xrightarrow {tea}\end{array}\!\!\)\(\begin{array}{l}\xrightarrow {serveTea}\end{array}\!\!\)\(\begin{array}{l}\xrightarrow {take}\end{array}\!\!\)➀, is executable by products in \([\![v \wedge \lnot f \wedge v \wedge t \wedge t \wedge f]\!]\), which in turn trivially maps to the empty set. Such a negative test case can be useful to ensure whether an implementation does not allow more products than specified.

Thus, to be executable, an ATC can be executed by at least one product of the product line. We then extend this definition to executable test suites, by stating that they should contain only executable test cases. Equipped with such notions, we can defined product-line coverage as a function that takes a FTS and abstract test suite as parameters and returns a value between 0 and 1. This value represents the ratio between the number of actually covered elements (states, transitions, etc.) and the number of possible ones in the FTS, if the value is 1 then we obtain all-X coverage, where X is the set of elements under consideration for this coverage criterion. When such elements involve transitions, we impose that these transitions are executable by at least one product (see [21] for formal definitions). In our coffee machine, the following test suite both satisfies all-states and all-transitions coverage:

$$\begin{aligned} {}&\{(pay, change, soda, serveSoda, open, take, close) \\ {}&(free, tea, serveTea, take); (free, cancel, return) \} \end{aligned}$$

We also experimented using another criteria that is not based on the model structure but on the capture of usage model that describes usages of the system [19]. There are two ways to capture such usages: either by extracting them from logs (such as Apache logs) [20], and assign more importance to more frequent usages or by assiging them directly using a dedicated modelling tool such as MaTeLO [2]. Technically, these usage models take the form of a Markov chain that can be used to derive the most frequent test cases. There are then run on the FTS to derive the associated product-line coverage metrics. This scenario complements the one proposed by Samih et al. [35] who start by selecting a product prior to generate test cases using statistical testing techniques [38]. For more information about these dual scenarios, see [19].

Another interesting aspect that differs from “usual” coverage for single systems is the notion of P-coverage. P-coverage represent the ratio between the set of products executable by a given abstract test suite and the set of products derivable in the feature diagram that is \([\![FD]\!]\). Since ATCs relates the two types of coverage (products and their behaviours), their generation is de-facto a multi-objective problem. The compactness of the FTS formalism makes it easy for the SPL testing community to study different multi-objective scenarios and compare different criteria.

Multi-objective Coverage. Continuing previous line of work that considered coverage only a the structural (feature diagram) level [25, 26, 32], we initially started with a rather strange question: “what is the behavioural coverage of structural coverage?” [22]. The idea behind this question is that as some behavioural coverage criteria may be difficult to compute in practice because of their complexity, approximating them with less computationally expensive approaches at the feature diagram level can be of interest. To investigate this question, we measured the behavioural coverage (state, transitions and actions) of two FD coverage criteria: (i) pairwise coverage [29, 32] that covers any two combination of features and (ii) similarity coverage that maximises distances between configurations [1, 25]. Results [22] shown that it was indeed possible to cover large parts of behaviour by sampling few configurations (e.g. only 2 products were necessary to achieve all-transitions coverage for the Claroline SPL allowing more than 5,000,000 products). Nevertheless, the resulting test suites are not optimal and more experiments are needed to generalise our results.

Recently, we considered extending similarity at the behavioural level to design search algorithms that maximize both distances between configurations at the FD level and distance between test cases [23]. We considered various distances (Hamming, Anti-Dice, Jaccard, Levhenstein) and both single objective (operating on an initial random set of test cases) and bi-objective (also taking into account distance between products). We seeded our models with random faults to compare the various algorithms. In our models, being bi-objective is not necessary an advantage, and the efficiency seems largely influenced by the choice of the distance function we make. A threat to validity to these conclusions is the fact that our feature diagrams are not heavily constrained, favouring the accidental discovery of dissimilar products.

4.2 Mutation Analysis

A less expected application of FTSs in the field of software testing is mutation analysis [18, 23]. Mutation analysis (see [28] for a comprehensive survey) is a technique that assess the quality of test suites by mutating software artifacts (program, models) called mutants and measuring the ability of such test suites to distinguish them from the original (we also say that a test case kills a mutant) ones. The underlying idea is to mimic a “competent programmer” that would introduce a few mistakes in the implementation of a system. The mutation score measures the ratio of the number of killed mutants divided by the number total mutants for a given test suite. The contribution of FTS to mutation testing is first to model mutants as families [18] and then to exploit the FTS formalism to perform shared execution of the mutants “all-at-once” to speed up analysis [23].

Mutants as SPL Variants. We studied mutation analysis at the model level, where the original system and its mutants can be expressed as transition systems. Model-based mutation complements program-based mutation as they tend to exercise different faults. To generate mutants automatically we design so-called mutation operators. For a transition system, these operators are model transformations that for example remove a transition or replace an action by another one. As we have seen, the features in a FTS add or remove transitions in a similar way. Building on this analogy, we sketched a vision of managing (model) mutants as a SPL to bring all the advantages of FTS and variability modelling to mutation analysis [18]. We describe mutations as features and organise them in a feature diagram, which allows a precise control on the type and number of mutants we allow for analysis. From a behavioural perspective, all the mutants are represented in a centralised model (the FTS), which each eases their management and storage.

Accelerating Mutation Analysis. As noted by Jia and Harman [28], one of the practical obstacles to the development of mutation testing is the cost associated to mutation analysis. Traditional mutation testing proceeds by running every test case on every mutant. Since we need to have a large number of mutants to assess test suites’ sensitivity in a meaningful way, analysis time can be huge. In fact, this is equivalent as processing all mutants in isolation like the naive approach is doing for product line model-checking. Of course, an important justification of using the FTS formalism to model mutations is to avoid this naive approach and perform family-based mutation analysis [18]. We implemented this featured model-based mutation analysis recently [23]. The Featured Mutant Model (FMM) is thus comprised of a FTS modelling the mutant family and a feature diagram representing all the mutations supported by this family. To perform mutation analysis, we simply run test cases on the FTS. As we have seen, this yields a boolean formula describing all the mutants (in \([\![FD]\!]\)) that are killed by this test case. Therefore, we only need to run each test case once on the FTS, rather than on the \(2^n\) individual transition systems associated to this mutant family. Our experiments showed gains between 2.5 and 1,000 times than previous approaches. Additionally, the FMM favours higher-order mutation. Higher-order mutation consists in applying several mutation operators on the same model. In the FMM scheme, higher-order mutation is supported allowing certain features to be selected together in a mutant. If we want to restrict ourselves to the first order, we then need to specify that each feature excludes all the other ones. Computing the mutation score require enumerating the instances satisfying the union of formulas gathered for each test case and computing the total number of mutants from the feature diagram. Computing such values may be tricky for large models (even with BDD solvers) and optimisations require to be investigated [23].

4.3 ViBES: A Model-Based Framework for SPL Testing

All our research on model-based testing has been integrated in a framework called ViBES [24]. We designed an XML representation for FTS while the feature diagrams are encoded in TVL [6]. The framework is implemented in JAVA and provides a domain-specific language to create mutations operators and mutant families in a programmer-friendly way. The framework also contains the implementations of test coverage and generation techniques discussed above. Finally, the framework is open-source (MIT Licence) and can be downloaded here: https://projects.info.unamur.be/vibes/.

5 Conclusion

In ths paper, we summarised six years of efforts in harnessing the central problem of SPL analysis: the combinatorial explosion of the number of products to consider. To this end, we introduced featured transiation systems as a compact and efficient representation of the whole behaviour of a SPL. This unique representation of all the products served as a support to a family of verification algorithms itself implemented as a software product-line [17]. We also employed FTS for model-based testing activities such as coverage and test generation and prioritisation. The FTS formalism demonstrated its universality to readily be applide for mutation analysis of single systems, with substantial analysis speedups.

After having had a look on the past, let us have a look in the future. There are several research directions worth of investigation. First we would like to extend our verification algorithms to quantitative software product lines. This requires to extend the FTS formalism to specify quantities [31]. Another interesting information to specify in FTS is probabilities, in order to perform statistical model-checking activities [33]. Such extended FTS formalism is also of interest for testing [19, 20]. As is the addition of inputs and outputs for ioco conformance [3]. With respect to mutation, we would like to formally investigate the mutant equivalence problem using exact or approximate simulation techniques [13].