1 Introduction

The role of traceability was recognised in the pioneering NATO working conference held in 1968 to discuss the problems of software engineering (Naur and Randell, 1969). One of the working papers in this conference examined the requirements for an effective methodology of computer system design and reported on the need to be able to ensure that a system being developed actually reflects its design. In a critique of three early projects focused on methodology, each was praised for the emphasis it placed on making “the system that they are designing contain explicit traces of the design process” (Randell, 1968).

Traceability was subsequently noted as a topic of interest in one of the earliest surveys on the state of the art and future trends in software engineering (Boehm, 1976), and its practice was certainly evident in those domains concerned with developing early tool support (Dorfman and Flynn, 1984; Pierce, 1978). By the 1980s, traceability could be found as a requirement in a large number of national and international standards for software and systems development, such as the high-profile DOD-STD-2167A (Dorfman and Thayer, 1990). Published research began to proliferate and diversify in the area of traceability in the 1990s, spurred somewhat by renewed interest in the topic arising from two newly formed International Requirements Engineering professional colloquia. Two early papers focused on the issues and problems associated with traceability (Ramesh and Edwards, 1993; Gotel and Finkelstein, 1994), the latter providing the first systematic analysis of the traceability problem. The topic of traceability continues to receive growing research attention in the twenty-first century, with a particular focus on automated trace generation (Cleland-Huang et al., 2007; Hayes et al., 2006) and with concomitant advances in model-driven development (Aizenbud-Reshef et al., 2006; Galvao and Goknil, 2007; Winkler and von Pilgrim, 2010).

However, despite the introduction of widely-available commercial tools claiming to support traceability in the 1980s, and substantive growth in this market through the 1990s and millennium, the actual practice of traceability remains poorly documented and, where it is examined (Mäder et al., 2009b), it appears to be little influenced by research. One confounding factor is inconsistency in the use of traceability terminology and concepts, not only between researchers and practitioners, but also within each of these communities themselves.

This chapter seeks to provide a resource on traceability fundamentals. It defines the essential traceability terminology in Section 2 and is supplemented by an extensive glossary that has been developed and endorsed by members of the traceability community. This glossary can be found as an appendix to this book and provides definitions for all the terms that are italicised in this chapter. The chapter also offers a model of a generic traceability process in Section 3 and describes the basic activities involved in the life cycle of a trace. This model is used as a frame of reference for articulating the grand challenge of traceability in the chapter by Gotel et al. of this book. Section 4 describes the basic types of traceability and explains some key associated concepts. Section 5 concludes the chapter.

2 Essential Traceability Terminology

At the most fundamental level, traceability is simply the potential to relate data that is stored within artifacts of some kind, along with the ability to examine this relationship. The ability to achieve traceability therefore depends upon the creation of navigable links between data held within artifacts that are otherwise disconnected. The value of traceability lies in the many software and systems engineering activities and tasks that the information provided through such interrelations can enable, such as change impact analysis, coverage analysis, dependency analysis, etc. (Gotel and Finkelstein, 1994; Lindvall and Sandahl, 1996; Ramesh and Jarke, 2001); tracing can provide visibility into required aspects of the software and systems development process and contribute to a better understanding of the software system under development.

This section defines two underlying terms, trace artifact and trace link, that are the building blocks of traceability. It subsequently uses these definitions to clarify the term trace. Based upon these definitions, the terms traceability and tracing are then defined.

2.1 Trace Artifact

Trace artifacts are traceable units of data. They refer to any residual data or marks of the software and systems development process that are made amenable to being traced. The term can apply to a single requirement, a cluster of requirements, or even to an entire requirements specification document. The term can apply to a Unified Modeling Language (UML) class diagram, a single class therein, or even to a particular class operation. For conceptual simplicity, the general term “artifact” is used to apply to both the object as a whole and to any internal delineation therein. What this means is that the granularity of a trace artifact is not pre-determined and may not even be consistent in any one particular project. It is this uncertainty in the granularity of trace artifacts that can lead to many problems in establishing and using traceability in practice.

Three terms closely associated with trace artifact are trace artifact type, source artifact and target artifact. The trace artifact type serves to classify the nature and function of the artifact, and is usually a recognised and “documented” by-product of the software and systems development process. The terms source artifact and target artifact serve to characterise the role of a particular trace artifact in a specified trace.

2.2 Trace Link

A trace link is a single association forged between two trace artifacts, one comprising the source artifact and one comprising the target artifact. This definition of trace link implies that the link has a primary direction for tracing, from the source artifact to the target artifact. Directionality between the two trace artifacts provides for the ability to traverse the trace link, or to follow it, so as to associate the two pieces of data. It is this juxtaposition that is sought through traceability, rather than the pure retrieval of one piece of data. In practice, however, every trace link can be traversed in two directions, so the trace link also has a reverse trace link direction and is effectively bidirectional, as illustrated in Fig. 1.

Fig. 1 Trace link directionality

The directionality of a trace link is therefore an important concept. Where a source artifact and a target artifact are defined, the semantics of the directionality is clear. Whether or not the trace link can physically be navigated in both directions, however, is usually a matter of implementation. Three terms clarify the directionality inherent in a trace link: the primary trace link direction, the reverse trace link direction and the bidirectional trace link.

Two interrelated terms that are closely associated with trace link are trace link type and link semantics. The trace link type serves to classify the nature and function of the trace link. It is usually characterised according to the meaning of the relationship between the two artifacts that the link associates, so the trace link type is generally defined in terms of the link’s semantic role. The trace link type is a broader term that may define a collection of links with the same link semantics.
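The building blocks defined so far can be pictured as a small data structure. The following is a minimal sketch rather than a reference model: the class names, field names, identifiers and link type labels are all assumptions made for illustration. It shows a trace link with a primary direction from source to target, and a derived link for the reverse direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceArtifact:
    """A traceable unit of data at any granularity."""
    identifier: str      # hypothetical ID scheme, e.g. "REQ-1"
    artifact_type: str   # trace artifact type, e.g. "requirement"


@dataclass(frozen=True)
class TraceLink:
    """A single association from a source artifact to a target artifact."""
    source: TraceArtifact
    target: TraceArtifact
    link_type: str       # trace link type, named after its semantic role

    def reverse(self, reverse_type: str) -> "TraceLink":
        """Return the same association in the reverse trace link direction."""
        return TraceLink(self.target, self.source, reverse_type)


# Illustrative artifacts only; the identifiers are invented.
requirement = TraceArtifact("REQ-1", "requirement")
method = TraceArtifact("Display.play", "class operation")

# Primary trace link direction: requirement -> implementing operation.
primary = TraceLink(requirement, method, "is implemented by")

# Reverse trace link direction: operation -> requirement.
backward = primary.reverse("implements")
```

Storing only the primary link and deriving the reverse on demand is one possible implementation choice; whether both directions are physically navigable remains, as noted above, a matter of implementation.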

The term trace relation is frequently used interchangeably with the term trace link in many publications. In reviewing the traceability fundamentals and encouraging the more consensual use of terminology within the traceability community, the proposal is to differentiate the two terms in the future. Following from database theory, a trace relation describes all the trace links that are specified between two defined artifact types acting as source artifacts and target artifacts. It is the trace relation that is captured in the commonly used traceability matrix.
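To make the distinction concrete, a trace relation can be sketched as the set of all links between two artifact types, from which a traceability matrix follows directly. The example below is only illustrative: the identifiers, the choice of requirements and test cases as the two artifact types, and the function name `traceability_matrix` are assumptions.

```python
# Each trace link is a (source id, target id) pair; all identifiers are invented.
requirements = ["REQ-1", "REQ-2", "REQ-3"]
test_cases = ["TC-1", "TC-2"]

# The trace relation: every link specified between the two artifact types.
trace_relation = {("REQ-1", "TC-1"), ("REQ-1", "TC-2"), ("REQ-3", "TC-2")}


def traceability_matrix(sources, targets, relation):
    """Return one row per source artifact and one column per target artifact.

    A cell is True when the relation contains a link from that source
    to that target.
    """
    return [[(s, t) in relation for t in targets] for s in sources]


matrix = traceability_matrix(requirements, test_cases, trace_relation)

# Render the matrix, marking each trace link with "x".
print("       " + "  ".join(test_cases))
for req, row in zip(requirements, matrix):
    print(req + "  " + "  ".join("  x " if cell else "  . " for cell in row))
```

An empty row, such as the one for REQ-2 here, is immediately visible in the matrix, which is one reason this representation is popular for coverage checks.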

2.3 Trace

Use of the term trace has led to some misunderstanding in the traceability community since it has two distinct meanings dependent upon whether the term is being used as a noun (i.e., “a mark remaining” (OED, 2007)) or as a verb (i.e., “tracking or following” (OED, 2007)). In a software and systems engineering context, the two meanings are often used interchangeably, whereas they need to be distinguished. “Trace” can, therefore, be defined in two ways.

When used as a noun, the term “trace” refers to the complete triplet of trace elements that enable the juxtaposition of two pieces of data: the source artifact, the target artifact and the trace link. Additional information, in the form of trace attributes, may qualify properties of the overall trace or of each of the three elements. Such traces can either be atomic or chained (see Fig. 2). Where chained, the trace links are strung together by the source and the target trace artifacts that they connect, the target artifact for one trace becoming the source artifact for the subsequent trace, to form a series of data juxtapositions.

Fig. 2 A trace provided via a single trace link or via a chain of trace links

When used as a verb, the term “trace” (i.e., to trace) is associated with the activity of tracing (see Section 2.5).
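The chaining of atomic traces in the noun sense, where the target artifact of one trace becomes the source artifact of the next, can be sketched as a simple traversal. This is a simplified illustration under stated assumptions: the function name `chain_traces` and all identifiers are invented, and each artifact has at most one outgoing link, so only linear chains are handled.

```python
def chain_traces(links, start):
    """Follow links from `start`, each target becoming the next source.

    `links` maps a source artifact id to a single target artifact id, so
    this sketch handles only linear chains; real projects usually branch.
    """
    chain = [start]
    seen = {start}
    while chain[-1] in links:
        nxt = links[chain[-1]]
        if nxt in seen:   # stop on a cycle rather than loop forever
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


# Three atomic traces; the identifiers are hypothetical.
atomic_links = {
    "REQ-1": "DESIGN-4",
    "DESIGN-4": "Display.play",
    "Display.play": "TEST-9",
}

print(chain_traces(atomic_links, "REQ-1"))
# ['REQ-1', 'DESIGN-4', 'Display.play', 'TEST-9']
```

Starting from an artifact with no outgoing link returns a chain containing only that artifact, which corresponds to there being no trace to follow.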

2.4 Traceability

Traceability is the potential for traces (as defined above in the noun sense) to be established (i.e., created and maintained) and used. The challenge for traceability is that each of the component elements (i.e., the trace artifacts and trace links) needs to be acquired, represented and stored, and then subsequently retrieved as a trace to enable software and systems engineering activities and tasks. Both the time and the manner in which traces are established and brought together for use will depend upon the purposes to which the traceability is put. Consequently, traces exist within their own life cycles and can (ideally) be reused in different contexts. The type and the granularity of the trace artifacts, and the semantics of the trace link, are therefore details that are best determined on a project-by-project basis. They could perhaps even be determined on a moment-to-moment basis in relation to an overarching traceability strategy. It is this process through which traces come into existence and eventually expire that influences the definition of a generic traceability process model in Section 3.

Frequently used terms include requirements traceability, software traceability and systems traceability. These all delineate the artifact types that are the primary objects of interest for tracing purposes. For example, requirements traceability focuses explicitly on the potential to establish and use traces that associate requirements-related artifacts in some way or another. Other more specific traceability terms are defined in the glossary that accompanies this book.

2.5 Tracing

Tracing implies undertaking all those activities required to put traceability in place, in addition to all those activities that exploit the results.

Tracing activities demand some form of agency, and this leads to the three associated terms of manual, automated and semi-automated tracing when referring to the nature of the activity that puts the traceability in place.

3 A Generic Traceability Process Model

Figure 3 depicts a generic traceability process model. It shows the essential activities that are required to bring traces into existence and to take them through to eventual retirement. Traces are created, maintained and used, all within the context of a broader traceability strategy. This strategy provides the detail of stakeholders’ needs, decisions regarding mechanism and automation, and also chains atomic traces in some agreed way to enable required activities and tasks. Continuous feedback is a critical aspect of the entire process to enable the traceability strategy to evolve over time. The four key activities of this generic traceability process model are described in the following sub-sections.

Fig. 3 A generic traceability process model

3.1 Traceability Strategy

Effective traceability rarely happens by chance or through ad hoc efforts. Minimally, it requires having retained the artifacts to be traced, having the capacity to establish meaningful links between these artifacts and having procedures to interrogate the resulting traces in a goal-oriented manner. Such simple requirements conceal complex decisions as to the granularity, categorisation and storage of assorted multi-media artifacts. They also conceal choices as to the approach for generating, classifying, representing and then maintaining their inter-artifact and intra-artifact linkages. Additional questions need to be answered, such as: Which of these tracing activities should be manual? Which should be automated? Where should the responsibilities for these activities lie? When should they be undertaken? There are many decisions that need to be made and, therefore, an enabling traceability strategy needs to be built into the engineering and management practices from day one on a software and systems engineering project. Figure 4 outlines the typical high-level activities associated with planning and managing a traceability strategy.

Fig. 4 Planning and managing a traceability strategy

Traceability is concerned with the provisioning of information to help in answering project-specific questions and in undertaking project-directed activities and tasks; it is thus a supporting system rather than a goal in its own right. This perspective demands an understanding of which stakeholders may need the potential for traceability, for what purpose and when. Acquiring clear-cut answers to these questions at the start of a project is not straightforward, as both stakeholders and their task needs will change. Even if these could be articulated exhaustively, building a traceability solution to service all needs is unlikely to be cost-effective, as resources are always limited. Determining whose needs to satisfy, and so which traceability-enabled activities and tasks to facilitate, is a value decision that lies at the heart of a traceability strategy; determining needs and resourcing constraints is a precursor to any discussion about trace artifacts, trace links and mechanism.

Ensuring that the traceability is then established as planned, and yet can adapt to remain effective as needs evolve and as a project’s artifacts change, is also the province of traceability strategy. Determining how the traceability will be provisioned such that the requisite quality can be continuously assured further demands analysis, assessment and potential modification of the current traceability solution. Assessing the quality and the execution of the traceability solution, and implementing a feedback loop to improve it, is a critical part of the traceability strategy for a project; it needs to develop and leverage historical traceability information.

Within the context of a broader traceability strategy, the creation, maintenance and use of individual traces and their constituent elements all need to be defined and managed. Given that atomic traces comprise source, target and relational elements, these data requirements need to be identified. This includes decisions as to meta-data to associate, dependent upon what kinds of traceability-enabled activities and tasks the trace is anticipated to participate in and support. Resourcing, planning and implementation decisions may hence vary on a trace-by-trace basis; for instance, it is quite possible that a particular trace is not created or maintained until its use is actually required. Traces thereby inhabit independent life cycles, the constituent activities of which are examined in the following sections.

3.2 Traceability Creation

When creating a trace, the elements of the trace have to be acquired, represented and then stored in some way, as illustrated in Fig. 5. Reference models and classification schemes characterising different types of trace link and trace artifacts drive the traceability creation process, as usually defined within the traceability information model of the overarching traceability strategy.

Fig. 5 Traceability creation

While project artifacts are generally pre-existing on a project, the links between them may not yet be defined. Techniques to support the creation of trace links can range from manual to automated approaches, each with differing degrees of efficiency and effectiveness. The differentiating factor is often whether the trace links are created concurrently with the forward engineering process (i.e., trace capture) or at some point later (i.e., trace recovery). Validation is therefore critical to the viability of the traceability creation process, regardless of how trace links are initially created, as it is concerned with determining and assuring the credibility of the trace as a whole.
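As a rough illustration of automated trace recovery followed by validation, the sketch below proposes candidate trace links by comparing the vocabulary of requirement and method descriptions. This is not a technique prescribed by this chapter: the Jaccard word-overlap measure is a deliberately simple stand-in for the information retrieval methods used in practice, and the function names, threshold and artifact texts are all assumptions.

```python
import re


def tokens(text):
    """Return the set of lower-cased words in an artifact's text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Overlap of two token sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recover_candidates(sources, targets, threshold=0.2):
    """Propose (source id, target id, score) links for later validation."""
    candidates = []
    for sid, stext in sources.items():
        for tid, ttext in targets.items():
            score = jaccard(tokens(stext), tokens(ttext))
            if score >= threshold:
                candidates.append((sid, tid, round(score, 2)))
    # Highest-scoring candidates first, so an analyst reviews them first.
    return sorted(candidates, key=lambda c: -c[2])


# Invented artifact texts, for illustration only.
requirements = {"REQ-1": "The player shall display the track title during play"}
methods = {
    "Display.show_title": "display the title of the current track",
    "Audio.set_volume": "change the output volume",
}

candidates = recover_candidates(requirements, methods)
print(candidates)
# [('REQ-1', 'Display.show_title', 0.4)]
```

The output is a list of candidates, not of trace links. Each candidate would still need to be accepted or rejected by someone with knowledge of the project before it is stored as a trace, which is the validation step discussed above.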

3.3 Traceability Maintenance

An association made between two artifacts at a moment in time to serve a particular purpose does not automatically mean that the resulting trace will have a persistent, useful life. The need for maintenance on a trace can be triggered by changes to any of the trace’s elements that, in turn, can be triggered by changes to elements within a chain. Traceability maintenance can also be required following changes to the requirements and constraints that drive the overarching traceability strategy.

To maintain a trace, it needs to be retrieved and the nature of the change analysed to determine what update is necessary, as illustrated in Fig. 6. This may necessitate the propagation of changes and/or the creation of entirely new traces. Updates need to be performed, where applicable, recorded and verified. Feedback on the maintenance process is also essential for evolving the overarching traceability strategy. As per traceability creation, traces can be maintained continuously or on-demand.
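One way to picture the retrieval and analysis step is to flag as suspect every trace link that a change may have invalidated, including links further along a chain. The following is a hedged sketch only: the function name `suspect_links`, the pair-based link representation and the identifiers are assumptions, and deciding what update each suspect link actually needs remains a separate analysis step.

```python
from collections import defaultdict


def suspect_links(links, changed):
    """Return links that may need updating after `changed` was modified.

    A link is suspect if it touches the changed artifact, or if it lies
    downstream of it in a chain of links.
    """
    outgoing = defaultdict(list)
    for source, target in links:
        outgoing[source].append(target)

    suspects, frontier, visited = [], [changed], {changed}
    while frontier:
        artifact = frontier.pop()
        for target in outgoing[artifact]:
            suspects.append((artifact, target))
            if target not in visited:
                visited.add(target)
                frontier.append(target)

    # Links that point at the changed artifact need review as well.
    suspects += [(s, t) for s, t in links if t == changed]
    return sorted(set(suspects))


# Hypothetical project links as (source id, target id) pairs.
links = [
    ("REQ-1", "DESIGN-4"),
    ("DESIGN-4", "Display.play"),
    ("REQ-2", "DESIGN-5"),
]

print(suspect_links(links, "DESIGN-4"))
# [('DESIGN-4', 'Display.play'), ('REQ-1', 'DESIGN-4')]
```

The link from REQ-2 to DESIGN-5 is untouched by the change and is therefore not flagged. Such a check could be run on every change (continuous maintenance) or only when the traces are about to be used (on-demand maintenance).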

Fig. 6 Traceability maintenance

3.4 Traceability Use

The availability and usefulness of traces has to be ensured to allow for their ongoing use throughout the software and systems development life cycle, potentially in a myriad of configurable ways. Here, it is helpful to distinguish between short-term traceability use during initial product development and long-term traceability use during subsequent product maintenance. Typical short-term uses for traceability include requirements completeness analysis, requirements trade-off analysis or requirements-to-acceptance-test mapping for final acceptance testing. Typical examples of long-term uses for traceability include the determination of effects of changes to a software system or the propagation of changes during its evolution.

Any atomic trace is likely to play a role in many use contexts. To use a trace in isolation, or as a constituent part of a chain, it needs to be retrieved and rendered visible in some task-specific way, as suggested in Fig. 7. An important component of the use process is assessing the quality of the traceability that is provided, in terms of its fitness for purpose with respect to the task or activity for which the traceability is required. Such information provides a feedback loop to improve the overall traceability strategy.

Fig. 7 Traceability use

4 Basic Types of Traceability and Associated Concepts

Additional terms that delineate different basic types of traceability are highlighted in the context of Fig. 8 and defined below.

Fig. 8 A simplified, but typical, tracing context

The traceability of Fig. 8 is bidirectional. Forward traceability offers the potential to link a single requirement statement to those methods of the class designed to implement it, and subsequently to follow this trace link to reveal the forward engineering process. Backward traceability offers the potential to link the class methods back to the requirement that they help to satisfy, and subsequently to follow this trace link to reveal the reverse engineering process. The forward and the backward direction pertain to the logical flow of the software and systems development process. These are the fundamental and primitive types of tracing.

In Fig. 8, the potential to trace from the requirement through to the code is vertical traceability, linking artifacts at differing levels of abstraction to accommodate life cycle-wide or end-to-end traceability. Any potential to trace between versions of the requirement or versions of the code is horizontal traceability, linking artifacts at the same level of abstraction at different moments in time to accommodate versioning and rollback. These two types of tracing, vertical and horizontal, employ both forward and backward tracing.

Two additional types of traceability are more conceptual in nature, and these can employ each of the above tracing types in some combination. Post-requirements (specification) traceability comprises those traces derived from or grounded in the requirements, and hence explicates the requirements’ deployment process. Pre-requirements (specification) traceability comprises all those traces that show the derivation of the requirements from their sources, and hence explicates the requirements’ production process. Only post-requirements traceability is evident in Fig. 8 since the requirement is the earliest development artifact available; this is the most common form of traceability in practice.

Figure 8 also serves to highlight some basic complexities surrounding traceability and so lends itself to the definition of a number of associated traceability concepts:

  • Do we create an atomic trace for each class method or for the cluster of methods within a class? This is an issue of trace granularity.

  • Do the three methods in the Display class fully satisfy the requirement? This is a question related to completeness. Does the trace then lead to the right code? This is a question of correctness. Is the trace up to date? This depends upon whether the traced artifacts reflect the latest project status. All of these questions are associated with the concept of traceability quality.

  • As Fig. 8 suggests, traces typically associate artifacts that are semantically very different, so the use of natural language alone to derive a trace link cannot always be trusted. For example, the play transition in the behavioural Statechart of Fig. 8 does not trace to the play method in the class diagram, or does it?

Open issues in traceability research and practice have led to the formulation of a set of traceability challenges by the traceability community, and work is now underway to develop a Traceability Body of Knowledge (TBOK).
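Two of the traceability quality questions raised in the list above lend themselves to simple mechanical checks, sketched below. The sketch makes several assumptions: the function names, the identifiers and the use of integer revision numbers for "last modified" and "last validated" are all invented for illustration. It can only detect a missing or outdated link; whether the linked methods fully satisfy a requirement, and whether a link is correct, still call for human judgement.

```python
def coverage_gaps(requirements, links):
    """Return requirements that are not the source of any trace link."""
    traced = {source for source, _ in links}
    return [req for req in requirements if req not in traced]


def stale_links(links, last_modified, last_validated):
    """Return links validated before either linked artifact last changed."""
    return [
        link for link in links
        if last_validated[link] < max(last_modified[link[0]],
                                      last_modified[link[1]])
    ]


# Hypothetical project data; revision numbers stand in for timestamps.
requirements = ["REQ-1", "REQ-2", "REQ-3"]
links = [("REQ-1", "Display.play"), ("REQ-3", "Display.stop")]
last_modified = {
    "REQ-1": 3, "REQ-3": 2, "Display.play": 7, "Display.stop": 4,
}
last_validated = {
    ("REQ-1", "Display.play"): 5,
    ("REQ-3", "Display.stop"): 6,
}

print(coverage_gaps(requirements, links))
# ['REQ-2']
print(stale_links(links, last_modified, last_validated))
# [('REQ-1', 'Display.play')]
```

Here REQ-2 has no trace at all, and the link from REQ-1 was last validated at revision 5 although the linked method changed at revision 7, so that trace may no longer reflect the latest project status.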

5 Conclusions

This chapter has defined terminology and concepts that are fundamental to the discipline of traceability. This includes the essential terms of trace, trace artifact, trace link, traceability and tracing in Section 2, along with a number of interrelated and dependent terms. The chapter has also described a generic traceability process model in Section 3 and characterised the basic activities involved in the life cycle of a trace. This includes a consideration of the activities comprising traceability strategy, traceability creation, traceability maintenance and traceability use. In Section 4, the chapter distinguishes between basic types of traceability and explains some key associated concepts.

The chapter is supplemented by an extensive glossary that has been developed and endorsed by members of the traceability community. This glossary contains additional terms and can be found as an appendix to this book.