1 Introduction

Traceability in software development refers to creating traces between software artifacts [1]. A trace is a triplet comprising a source artifact, a target artifact, and a trace link [2]. Such artifacts include source code, requirements, mockups, and test cases, among others. Maintaining traceability between software artifacts facilitates quality-assurance tasks such as maintenance, verification, and validation, which are regular practices in information systems engineering [3, 4]. In practice, the effort required to maintain, validate, and generate traces between artifacts outweighs traceability benefits [5]. Therefore, some authors propose novel approaches that allow software development teams to create traces between artifacts [5,6,7,8,9,10,11,12,13]. Although such approaches are helpful, some of them require existing traceability data sets or existing traces between artifacts as input [5, 6, 8, 11], hindering their practical applicability for software development teams that do not currently trace their artifacts. On the other hand, other approaches limit their scope to specific artifacts [7, 9, 10, 12, 13], lacking generality.

In this paper, we propose OntoTrace: an ontology-based automatic reasoning tool for supporting trace generation in software development projects. OntoTrace uses a software development team’s context-dependent traceability ontology, which represents the source/target artifacts of the team’s specific context and the traces between them. Moreover, our approach supports software development teams when defining traceability links without relying on historical traceability data sets or limiting their scope to tracing specific software artifacts. Software development teams can then use OntoTrace to infer traceability-related information such as: i) which are the traceable source/target artifacts; ii) which artifacts are not yet traced; and iii) given a specific artifact, which are the possible traces between it and other artifacts.

To evaluate the feasibility of our approach and exemplify its application, we instantiate it in a real-world use case at LogicFlow AG, a Swiss startup that has a traceability gap between functional/non-functional requirements and test scenarios, mainly user interface (UI) test cases. We demonstrate OntoTrace using LogicFlow AG’s traceability ontology, an automatic reasoner, and a graph-like UI for visualizing software artifacts and traces. We show that OntoTrace allows for establishing and discovering traceability links. Furthermore, we discuss the remaining research challenges for a complete technology transfer.

The paper is structured as follows: in Sect. 2, we review the related works; in Sect. 3, we set up the running example describing a use case at LogicFlow AG; in Sect. 4, we introduce OntoTrace in the context of our running example; and, finally, in Sect. 5 we discuss conclusions and future work.

2 Related Work

Trace generation and discovery have gained researchers’ attention, generating novel and tool-supported approaches. Some authors propose historic-data-based approaches such as artificial neural networks [5, 8], Bayes classifier [13], and similarity-based algorithms [6] for automatically creating traces between artifacts. However, such proposals require large and well-labeled training data sets based on historical traceability data, which are not always available. This represents an entry barrier for software development teams that currently do not trace their artifacts.

On the other hand, some authors propose approaches that do not rely on historical traceability data sets, such as domain ontology-based recommendation systems [7, 13], pattern languages [9], expert systems [10], and metamodel-based ontologies [12]. Nevertheless, such approaches are limited to generating traces for specific artifacts, lacking generality. Some proposals [7, 8, 10] limit their source/target artifacts to text-based artifacts, e.g., textual requirements, source code, and standard norm documents. Therefore, mockups, models, UIs, and other non-textual artifacts are beyond their scope. Similarly, other approaches limit their artifacts to model-based artifacts [12], requirements [9, 13], and source code [9, 11, 13].

To address the gaps mentioned above, we propose an ontology-based automatic reasoning tool named OntoTrace that does not rely on historical traceability data and is not restricted to a specific set of traceable artifacts. Although some authors base their approach on ontologies [7, 10, 12, 13], the sources describing their proposed ontologies are not available for reuse. Therefore, OntoTrace also relies on a context-independent traceability ontology whose sources we make available for reuse.

3 Running Example: LogicFlow AG Case

In the rest of this paper, we use the LogicFlow AG case as a running example. LogicFlow AG is a Swiss startup whose main objective is to provide a platform that facilitates the generation of UI tests in software development projects. Currently, LogicFlow AG has a web platform that allows testers to record test scenarios of web-based applications (see Fig. 1). Such test scenarios are automatically transformed into Selenium Script [14], a domain-specific language used for modeling and executing UI test cases. Moreover, LogicFlow AG’s platform automatically identifies changes in the UIs by comparing screenshots of the current web-based application version with screenshots of former versions. We refer to this module as the UI automatic change identifier (UI-ACI) from now on. Despite the usefulness of the LogicFlow AG platform, startup members have identified that web-based application requirements are hardly traceable to the test scenarios. This traceability gap hinders the maintainability of test scenarios, increasing the testers’ effort to keep them consistent with the requirements. In Fig. 1, we show the LogicFlow AG platform setup and the missing traces between artifacts.

Fig. 1. LogicFlow AG platform setup and missing traces between artifacts.

For instance, a use case where such a traceability gap is evident is the following: a Swiss insurance company wants to use the LogicFlow AG platform to generate test scenarios for its web-based application for calculating insurance premiums. Therefore, the Swiss insurance company’s testers create a test scenario based on the company’s requirements (i.e., the source artifacts) by using the LogicFlow AG platform. As a result, the testers create one test scenario comprising 63 Selenium Script commands. Moreover, the testers run the test scenario and compare the web-based application versions using the UI-ACI. Then, the LogicFlow AG platform’s UI-ACI automatically identifies nine changes in the UI. As a result of using the LogicFlow AG platform, the testers have a set of 72 target artifacts in one test scenario (63 commands plus nine UI changes). However, up to this point, the testers do not have any trace between the requirements and the test scenario, hindering the test scenario’s maintainability. In Sect. 4, we show how this problematic case can be improved by using OntoTrace.

4 OntoTrace: Enabling Ontology-Based Automatic Reasoning for Supporting Trace Generation in Software Development

In this section, we introduce OntoTrace and exemplify it through the running example. OntoTrace allows software development teams to infer traces among software artifacts using ontology-based automatic reasoning. To do so, OntoTrace relies on a domain-independent traceability ontology founded on general traceability definitions taken from [1, 2, 15], with terms such as trace, artifact, source artifact, target artifact, and traceability link. Therefore, as the first step to using OntoTrace, software development teams should extend this traceability ontology to their specific contexts. We fully extended the traceability ontology to the context of LogicFlow AG, describing the source artifacts, the target artifacts, and the traces between them. However, for the sake of space, in this paper we show only an excerpt of this extension (see Fig. 2).

Fig. 2. Excerpt of the OntoTrace traceability ontology extension in the context of LogicFlow AG.

First, we extend the traceability ontology’s source and target artifacts based on the LogicFlow AG context, having requirements as source artifacts and test scenarios as target artifacts. We continue refining the class hierarchy until we identify two artifacts: non-functional requirement check texts as source artifacts and SeleniumScript execute commands as target artifacts. Check text is a non-functional requirement that checks whether a text in a UI matches a specific format, font, or size. In turn, LogicFlow AG testers use the SeleniumScript execute command to verify such non-functional requirements in a test scenario. Thus, a traceability link arises between check text and execute command artifacts.
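The extension step above can be sketched in a few lines of plain Python. This is only an illustrative stand-in for OWL subclass reasoning, not the Jena/Pellet implementation used by OntoTrace. All class names are assumptions made for the example, and the intermediate classes (e.g., `SeleniumScriptCommand`) are guesses at how the hierarchy might be refined; the identifiers in the authors' OWL files may differ.

```python
# Assumed class names; the identifiers in the authors' OWL files may differ.
SUBCLASS_OF = {
    "SourceArtifact": "Artifact",
    "TargetArtifact": "Artifact",
    "Requirement": "SourceArtifact",
    "NonFunctionalRequirement": "Requirement",
    "CheckText": "NonFunctionalRequirement",
    "TestScenario": "TargetArtifact",
    "SeleniumScriptCommand": "TestScenario",
    "ExecuteCommand": "SeleniumScriptCommand",
}

# Traceability links declared between classes: (source class, target class).
TRACE_LINKS = {("CheckText", "ExecuteCommand")}


def ancestors(cls):
    """Return cls followed by its superclasses, nearest first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_a(cls, parent):
    """True if cls equals parent or is a (transitive) subclass of it."""
    return parent in ancestors(cls)


def can_trace(source_cls, target_cls):
    """True if a declared traceability link covers this pair of classes."""
    return any(
        is_a(source_cls, s) and is_a(target_cls, t) for s, t in TRACE_LINKS
    )
```

With these assumed definitions, `can_trace("CheckText", "ExecuteCommand")` holds, while the reverse direction and the more general pair `("Requirement", "ExecuteCommand")` do not, because the link is declared only at the most specific level of the hierarchy.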

Having defined the traceability ontology extension for a specific context, software development teams should use a machine-readable knowledge representation language such as OWL (Web Ontology Language) [16] to describe the extended ontology. Software development teams can use OWL editors such as Protégé [17] to generate an OWL file describing the ontology. This OWL file is the primary input of OntoTrace. OntoTrace then processes the OWL file containing the context-dependent ontology using three main modules: i) the automatic reasoner, ii) the SPARQL query engine, and iii) the graph-like trace visualizer (see Fig. 3).

Fig. 3. OntoTrace overview.

To develop the OntoTrace modules, we use Apache Jena [18], a free and open-source Java framework for building ontology-based applications. Apache Jena allows us to integrate and develop the first two OntoTrace modules: the automatic reasoner and the SPARQL query engine. We select Pellet [19] as the OWL-based reasoner, which infers traceability-related data automatically from the context-dependent ontology. Then, we design a set of SPARQL queries to access the data inferred by the automatic reasoner. Apache Jena provides a default SPARQL query engine to execute such queries. For the sake of space, we do not show the SPARQL queries in this paper. However, we provide a public GitHub repository (see footnote 1) containing all the OWL files with the traceability ontology, the SPARQL queries, and the source code of OntoTrace.

After executing the SPARQL queries, the SPARQL query engine retrieves text-formatted triplets. However, we noticed that having just text-based information hinders the tool’s usability. Therefore, we create a graph-like visualizer using JGraphX [20] that allows software development teams to visualize the following information: i) all the source/target artifacts; ii) which artifacts are untraced; iii) the possible traces between artifacts resulting from the automatic reasoning; and iv) the existing traces between artifacts. Thus, OntoTrace allows software development teams to generate traces between artifacts by using the information inferred through ontology-based automatic reasoning.
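As a rough illustration of the four kinds of information listed above, the following Python sketch derives them from a small in-memory data set. It does not reproduce the SPARQL queries or the JGraphX visualizer. The individual names, the `ClickCommand` class, and the function `trace_views` are hypothetical and serve only this example, and links are matched on exact classes without subclass reasoning.

```python
def trace_views(individuals, source_classes, target_classes, links, existing):
    """Derive the four views from individuals, class-level links, and traces.

    individuals: dict mapping individual name -> class name
    links: set of (source class, target class) traceability links
    existing: set of (source individual, target individual) traces
    """
    sources = sorted(i for i, c in individuals.items() if c in source_classes)
    targets = sorted(i for i, c in individuals.items() if c in target_classes)
    # An artifact counts as traced if it takes part in any existing trace.
    traced = {artifact for pair in existing for artifact in pair}
    untraced = sorted(i for i in sources + targets if i not in traced)
    # A possible trace is allowed by a link but has not been created yet.
    possible = sorted(
        (s, t)
        for s in sources
        for t in targets
        if (individuals[s], individuals[t]) in links and (s, t) not in existing
    )
    return {
        "sources": sources,
        "targets": targets,
        "untraced": untraced,
        "possible": possible,
        "existing": sorted(existing),
    }


# Hypothetical individuals, loosely inspired by the running example.
INDIVIDUALS = {
    "req_check_text_1": "CheckText",
    "cmd_execute_17": "ExecuteCommand",
    "cmd_execute_42": "ExecuteCommand",
    "cmd_click_3": "ClickCommand",
}
VIEWS = trace_views(
    INDIVIDUALS,
    source_classes={"CheckText"},
    target_classes={"ExecuteCommand", "ClickCommand"},
    links={("CheckText", "ExecuteCommand")},
    existing={("req_check_text_1", "cmd_execute_17")},
)
```

In this toy data set, the only remaining suggestion is the pair `("req_check_text_1", "cmd_execute_42")`; the click command is reported as untraced but is never suggested, since no link is declared for its class.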

We test OntoTrace by using the Swiss insurance company use case in the context of LogicFlow AG. In the current status of OntoTrace, we manually create the source artifact individuals, which describe the functional and non-functional requirements. We do the same with the target artifacts, creating the individuals that describe the test scenario. We manually populate the whole ontology with individuals since OntoTrace is not yet integrated with the LogicFlow AG platform. However, in further versions of OntoTrace, we will automate the population of the ontology with individuals. After creating such individuals, OntoTrace allows testers to generate the traces between the requirements and the test scenario based on the information inferred by the automatic reasoner. We show in Fig. 4 an excerpt of such information regarding the Swiss insurance company use case, showing the possible traces between a non-functional requirement check text and target artifacts in the test scenario.

Fig. 4. Excerpt of OntoTrace showing the inferred use case information, representing the source artifacts as white boxes and the target artifacts as black boxes.

In the LogicFlow AG use case, LogicFlow testers using OntoTrace access the artifact suggestions based on the possible traceability links defined in the context-dependent traceability ontology. If the testers manually or automatically add, delete, or edit the traceability links contained in the context-dependent ontology, the OntoTrace suggestions change accordingly. Thus, OntoTrace facilitates the evolution of traceability links without modifying existing traces, SPARQL queries, or the definition of artifacts, thanks to its ontology definition and automatic reasoning.

Although OntoTrace facilitates the evolution of such traceability links, this also causes the number of suggestions to increase over time. Such an increase could affect the scalability of OntoTrace in the long run, since the more possible traceability links are established, the more possible traces can exist between source and target artifacts. As part of our future work, we plan to investigate how to mitigate this limitation by integrating OntoTrace with functionalities that allow it to generate traces between artifacts automatically. That includes integration with proposals from the literature, such as those reviewed in Sect. 2.
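To make this growth concrete, the following Python sketch counts an upper bound on the number of suggestions, assuming that every individual of a linked source class can be paired with every individual of the linked target class. The class names, the `UIChange` class, and the counts are hypothetical, and links are matched on exact classes only.

```python
from collections import Counter


def count_possible_traces(individual_classes, links):
    """Upper bound on suggestions: for each class-level link, multiply the
    number of source individuals by the number of target individuals."""
    per_class = Counter(individual_classes.values())
    return sum(per_class[s] * per_class[t] for s, t in links)


# Hypothetical population: 2 check texts, 3 execute commands, 2 UI changes.
POPULATION = {
    "req_1": "CheckText",
    "req_2": "CheckText",
    "cmd_1": "ExecuteCommand",
    "cmd_2": "ExecuteCommand",
    "cmd_3": "ExecuteCommand",
    "change_1": "UIChange",
    "change_2": "UIChange",
}

ONE_LINK = {("CheckText", "ExecuteCommand")}
TWO_LINKS = ONE_LINK | {("CheckText", "UIChange")}

BOUND_ONE_LINK = count_possible_traces(POPULATION, ONE_LINK)    # 2 * 3
BOUND_TWO_LINKS = count_possible_traces(POPULATION, TWO_LINKS)  # 2 * 3 + 2 * 2
```

Under these assumptions, a single link yields at most six suggestions and a second link raises the bound to ten, so each added link contributes a product of two population sizes rather than a constant.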

On the other hand, creating artifacts manually, as we did in the LogicFlow use case, is time-consuming. Therefore, we plan for OntoTrace to support the automatic generation of artifact instances in the future. We could achieve such automatic generation by using previously defined ontologies. For example, some proposals ontologically describe close-to-natural-language requirements, such as user stories [21]. Such existing ontologies can facilitate the automatic generation of artifact instances inside the context-dependent ontology.

5 Conclusions and Further Work

Trace generation between software development artifacts benefits quality assurance and software maintenance [3, 4]. However, the effort required to generate such traces outweighs traceability-related benefits [5]. In this paper, we reviewed some approaches in the literature for supporting trace generation. Although such approaches are helpful, we observed some of them require historical traceability data, hindering their implementation by software development teams that do not currently trace their artifacts. On the other hand, some approaches lack generality, limiting the set of possible traceable artifacts. Consequently, in this paper, we proposed an ontology-based automatic reasoning tool for supporting trace generation named OntoTrace, which addresses the gaps mentioned above.

OntoTrace requires that software development teams extend a traceability ontology, based on general traceability definitions in the literature, to their software development context. Thus, software development teams describe context-dependent artifacts such as requirements, source code, and test cases, among others, and the traces between them. Then, software development teams can use such an ontology together with OntoTrace to automatically infer traceability information such as: i) which are the traceable source/target artifacts; ii) which artifacts are not yet traced; and iii) given a specific artifact, which are the possible traces between it and other artifacts. In this paper, we showed how OntoTrace can be applied by using a running example: a Swiss startup named LogicFlow AG aiming to close the traceability gap between functional/non-functional requirements and UI test cases.

As future research steps, we expect to extend OntoTrace in several directions. First, OntoTrace depends on several external tools such as Protégé, Pellet, and JGraphX. In practice, we should provide a workspace that integrates all the OntoTrace functionalities, aiming to automate steps of our approach, e.g., automatically creating individuals. Moreover, as traces between artifacts evolve, we will include new techniques, such as machine learning algorithms, for automatically devising new traceability links while the software development team uses OntoTrace. Such techniques will support software development teams in maintaining the context-dependent traceability ontology over time. Finally, other steps such as user interaction design and empirical validation should be performed in future research endeavors.