Assessing the understandability of UML statechart diagrams with composite states—A family of empirical studies

Cruz-Lemus, José A.; Genero, Marcela; Manso, M. Esperanza; Morasca, Sandro; Piattini, Mario

doi:10.1007/s10664-009-9106-z


Published: 17 February 2009

Volume 14, pages 685–719, (2009)

Empirical Software Engineering


Abstract

The main goal of this work is to present a family of empirical studies that we have carried out to investigate whether the use of composite states may improve the understandability of UML statechart diagrams derived from class diagrams. Our hypotheses derive from conventional wisdom, which says that hierarchical modeling mechanisms are helpful in mastering the complexity of a software system. In our research, we have carried out three empirical studies, consisting of five experiments in total. The studies differed somewhat as regards the size of the UML statechart models, though their size and the complexity of the models were chosen so that they could be analyzed by the subjects within a limited time period. The studies also differed with respect to the type of subjects (students vs. professionals), the familiarity of the subjects with the domains of the diagrams, and other factors. To integrate the results obtained from each of the five experiments, we performed a meta-analysis study which allowed us to take into account the differences between studies and to obtain the overall effect that the use of composite states has on the understandability of UML statechart diagrams throughout all the experiments. The results obtained are not completely conclusive. They cast doubts on the usefulness of composite states for a better understanding and memorizing of UML statechart diagrams. Composite states seem only to be helpful for acquiring knowledge from the diagrams. At any rate, it should be noted that these results are affected by the previous experience of the subjects on modeling, as well as by the size and complexity of the UML statechart diagrams we used, so care should be taken when generalizing our results.


1 Introduction

Modeling is at the core of many disciplines, but it is especially important in engineering because it facilitates the communication and construction of complex systems from smaller parts (Thomas 2004). Models help us understand a complex problem and its potential solutions through abstraction. This is why software systems, which are often among the most complex of all engineering systems, can greatly benefit from using models and modeling techniques (Selic 2003). This idea is now receiving even more emphasis, since the software industry is moving towards Model-Driven Development (MDD) processes (Atkinson and Kühne 2003), in which software is developed at a higher level of abstraction than source code, based on models and model transformations. The MDD paradigm therefore focuses the effort of development on the design of models, rather than on coding. Correspondingly, the focus of software quality assurance is shifting from system implementation towards system modeling.

To be useful and effective, an engineering model must possess the following five key quality characteristics to a sufficient degree (Selic 2003): abstraction, understandability, accuracy, predictiveness and inexpensiveness.

In this paper, we focus on understandability because it is recognized as one of the main factors influencing maintainability (see Footnote 1), and it is well-recognized that a large part of the effort invested in the development of any software product is devoted to maintenance (Pigoski 1997). More specifically, we focus on the understandability of UML statechart diagrams, since UML has become the de facto standard for modeling software systems; added to this is the fact that UML statechart diagrams have become an important technique for describing the dynamic aspects of a software system (Denger and Ciolkowski 2003). UML statechart diagrams are also considered to be one of the most important UML diagrams and they should be used by practitioners as a starting point for training newcomers to UML (Bolloju and Leung 2006).

The main goal of this line of research, which we have pursued over the last 5 years, was to investigate which constructs influenced the understandability of UML statechart diagrams, since a UML statechart diagram must be understood before any desired change to it can be identified, designed, or implemented. In the quest to reach this objective, we carried out a controlled experiment and a replication of it (Cruz-Lemus et al. 2005b). We found that activities, guards, simple states and transitions were the UML constructs that most influenced the understandability of UML statechart diagrams, but that the effect of composite states was not clear. Considering these results as preliminary, we decided to continue investigating composite states.

Composite states allow modelers to structure UML statecharts in a hierarchical fashion. A composite state represents the abstraction of an entire UML statechart diagram into which the composite state can be refined. As such, composite states are an important construct of the UML statechart diagrams metamodel (OMG 2003) and they are believed to be a fundamental modeling abstraction mechanism to help modelers master the complexity of a software system. From a theoretical point of view, UML statechart diagrams with composite states extend finite state machines to facilitate the description of highly complex behaviors (Hu and Shatz 2006) by dividing the system into smaller, less complex parts thereby making this system easier to understand. This in turn leads to a model that is easier to develop and modify.
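The hierarchical structure described above can be illustrated with a minimal sketch. The `State` class, the helper functions, and the ATM state names below are hypothetical and introduced only for illustration; they are not taken from the experimental materials.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    """A statechart state; a state that has substates is a composite state."""
    name: str
    substates: List["State"] = field(default_factory=list)

    @property
    def is_composite(self) -> bool:
        return bool(self.substates)


def simple_states(state: State) -> List[str]:
    """Refine a state into the names of the simple (leaf) states it contains."""
    if not state.is_composite:
        return [state.name]
    names: List[str] = []
    for sub in state.substates:
        names.extend(simple_states(sub))
    return names


def nesting_level(state: State) -> int:
    """0 for a simple state, 1 for a composite of simple states, and so on."""
    if not state.is_composite:
        return 0
    return 1 + max(nesting_level(sub) for sub in state.substates)


# Hypothetical ATM fragment: "Serving" abstracts three closely related states.
serving = State("Serving", [State("ReadingCard"), State("CheckingPIN"), State("Dispensing")])
diagram = [State("Idle"), serving, State("OutOfService")]

flat_view = [name for s in diagram for name in simple_states(s)]
diagram_nesting = max(nesting_level(s) for s in diagram)
```

In this sketch the same behavior can be read either as three top-level states (one of them composite) or, after refinement, as the five simple states in `flat_view`.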

Taking as a starting point the common use of hierarchical structures in modeling techniques, we thus hypothesized that abstracting a UML statechart diagram composed of highly related simple states and transitions into a composite state could help improve the understandability of a UML statechart diagram. Empirical support needs to be provided to show if this belief is actually true and, if so, under what conditions.

As related works show (see Section 2), references on empirical studies related to dynamic modeling in general and UML statechart diagrams in particular are few and far between. To our knowledge, the influence of composite states on the understandability of UML statechart diagrams has not been studied in the literature previously, despite the importance of the topic. This fact motivated us to gather empirical evidence for our hypothesis.

In this work, we present a family of three empirical studies consisting of five controlled experiments, whose design and execution were gradually modified and improved to alleviate some threats to the validity of the different component studies. We used relatively small statechart diagrams (10 to 25 states) as experimental materials and the experimental subjects were undergraduate and graduate students of Computer Science at several universities, along with a number of professionals with an average of 2 years’ experience in UML modeling.

The data analysis carried out in each individual experiment did not allow us to obtain conclusive results. This led us to carry out a meta-analysis study. Meta-analysis has been recognized as an appropriate way to aggregate or integrate the findings of empirical studies in order to build a solid body of knowledge on a topic based on empirical evidence (Lipsey and Wilson 2001; Miller 2000; Pickard 2004). Moreover, the need for meta-analysis is gaining relevance in empirical research, as is demonstrated by the fact that it is a recurrent topic in various forums related to Empirical Software Engineering. In other areas, such as psychology or medicine, a single study is extremely unlikely to be definitive. Dozens and even hundreds of studies on the same topic may follow. In Empirical Software Engineering, it is unusual for a large number of studies concerning the same topic to take place, but it is necessary to cross the borders of individual studies to extract conclusions of a more general kind from families of experiments, with or without significant results.

Since we have not evaluated industrial systems with a large range of different size and complexity, we cannot generalize our findings to every usage of composite states in UML statechart diagrams. Nevertheless, our common family of experiments seems to indicate that the use of composite states is not always beneficial.

The paper is organized as follows. Section 2 presents related work. Section 3 provides a roadmap of the family of experiments that we have performed. Section 4 introduces the Cognitive Theory of Multimedia Learning (CTML) (Mayer 2001), which we have used as a background in some of our experiments. Sections 5, 6, 7 then explain in detail the experimental process used to carry out each of the studies that are part of the family of experiments. Section 8 summarizes the threats to the validity of the family of empirical studies. In Section 9 the results of the meta-analysis performed with the data are presented. The main conclusions achieved from this family of experiments and the future work that is planned are in Section 10.

2 Related Work

In this section, we situate our empirical study in relation to some other work found in the relevant literature.

Comprehension has been widely studied. In the literature, we can find works that have studied the comprehension of programs (Woodfield et al. 1981), complete models (Agarwal et al. 1999) or specific diagrams such as UML class diagrams (Purchase et al. 2001, 2002; Yusuf et al. 2007), UML collaboration diagrams (Glezer et al. 2005; Purchase et al. 2001, 2002) and UML sequence diagrams (Glezer et al. 2005; Xie et al. 2007). We can even find examples of pieces of work which study how the use of different artifacts, e.g. stereotypes, affects the way in which models are understood (Genero et al. 2008; Ricca et al. 2007; Staron et al. 2006).

As we have commented previously, understandability is considered to be a main factor influencing maintainability (Briand et al. 2001; Fenton and Pfleeger 1997; Harrison et al. 2000) and we can also find other works taking up this issue (Arisholm and Sjøberg 2004; Genero et al. 2007).

In some of these studies we have found that experience is a factor to be taken into account when measuring comprehension (Arisholm and Sjøberg 2004; Bolloju and Leung 2006; Ricca et al. 2007; Yusuf et al. 2007).

We found the following two papers dealing with empirical studies on the comprehension of UML diagrams which model dynamic aspects of an OO system:

Otero and Dolado (2004) evaluate the comprehension of the dynamic modeling in UML designs by using two experiments in which they compare the comprehension of UML sequence, collaboration, and statechart diagrams. They conclude that sequence diagrams are the most appropriate for comprehension of management information applications, collaboration diagrams are those best suited to real-time non-reactive systems, and statechart diagrams are the most appropriate for real-time reactive systems.
Otero and Dolado (2005) present two controlled experiments for evaluating the semantic comprehension of two standard languages, UML versus OPEN Modeling Language (Firesmith et al. 1998), from the perspective of dynamic modeling. The results reveal that the specification of dynamic behavior using OPEN Modeling Language is faster to comprehend and easier to interpret than when using the UML language, regardless of the type of dynamic diagram.

As we commented in the introduction, the main goal of our line of research over the last 5 years has been to investigate which constructs influenced the understandability of UML statechart diagrams, so the most closely related work is that done by ourselves prior to this. We had carried out a controlled experiment and a replication of it (Cruz-Lemus et al. 2005b) in which we found that some of the UML statechart diagram constructs (activities, guards, simple states and transitions) were the ones that most influenced the understandability of UML statechart diagrams. To perform that experiment, a group of teachers and students from the University of Castilla-La Mancha (Spain) performed a series of comprehension tasks on 20 different UML statechart diagrams which covered a broad range of values for the proposed metrics. In this study, composite states did not seem to affect the understandability of UML statechart diagrams.

In addition, in Cruz-Lemus et al. (2005c) we presented an experiment and its replication whose purpose was to find out the optimal nesting level of composite states within UML statechart diagrams. Thirty-eight Computer Science students from the University of Murcia (Spain) answered a set of comprehension questions related to the same system, but modeled using 0, 1, and 2 nesting levels in composite states, i.e., without composite states, with one composite state and with composite states within composite states. We concluded that a flat nesting level makes the diagrams more easily understandable.

This review of the literature reveals that the use of composite states and their impact on the comprehension of UML statechart diagrams have not been investigated in depth, despite the need for empirical study of UML diagram comprehension and despite the number of recently published works on the topic.

Even though in Cruz-Lemus et al. (2005b) we found that composite states seem not to affect the comprehension of UML statechart diagrams, we considered this result doubtful, so we decided to investigate it in greater depth. Starting from the common use of hierarchical structures in modeling techniques, we decided to hypothesize that abstracting a UML statechart diagram composed of highly related simple states and transitions into a composite state could help improve the understandability of a UML statechart diagram. This hypothesis was what led us to carry out the research that we are presenting in the current study.

3 The Family of Experiments

An experiment may be a part of a common family of studies, rather than being an isolated event (Basili et al. 1999). Common families of experiments allow researchers to answer questions that are beyond the scope of individual experiments and let them generalize findings across studies, thus providing evidence for confirming or rejecting specific hypotheses. In addition, common families of studies can contribute to devising important and relevant hypotheses that may not be suggested by individual experiments. A common family of experiments is not necessarily composed only of identical replications of the same study. Materials, hypotheses, and specific tasks assigned to the subjects may be refined across experiments, based on the knowledge obtained after each experiment.

Figure 1 shows the chronology of the family of experiments we have carried out in our study on the understandability of UML statechart diagrams.

The first experiment and its replication (E1 and R1) took place in two universities in Spain in 2005. The materials and tasks to be performed were quite simple and the background knowledge of the undergraduate students used as subjects was not advanced. These studies provided some initial results that were later strengthened with the other experiments of the family.

The second experiment and its replication (E2 and R2) took place in two universities, one in Spain and the other in Italy, in 2006. The Italian students’ background was similar to that of those in the previous study (E1 and R1), but the Spanish subjects were PhD students and had more experience in modeling. In addition, the materials and tasks assigned to the subjects were improved, especially with the use of the CTML (Mayer 2001) for assessing the complete set of variables of the experimental design. We describe this theory in more detail in Section 4.

In these studies, we used students as experimental subjects. The tasks to be performed did not require high levels of industrial experience, so we believed that this experiment could be considered appropriate, as suggested in the literature (Basili et al. 1999; Höst et al. 2000). Working with students also implies a set of advantages, such as the fact that the prior knowledge of the students is rather homogeneous, there is the possible availability of a large number of subjects (Verelst 2004), and there exists the chance to test experimental design and initial hypotheses (Sjoberg et al. 2005). An additional advantage of using novices as subjects in experiments on understandability is that the cognitive complexity of the objects under study is not hidden by the experience of the subjects.

The main difference between the first four studies (E1, R1, E2, and R2) and the third experiment (E3) lies in the fact that we had professionals as experimental subjects in E3. Another feature that made that experiment distinct was that the materials and tasks were further renewed and improved.

In studies E1 and R1, we used the variable understandability effectiveness, defined as the ability to understand the presented material correctly. In studies E2, R2 and E3, we added two new variables related to the CTML, retention and transfer. We explain these variables in Section 4.

These three variables were measured by using three separate tests based on questionnaires. The values of understandability effectiveness (UEffec), transfer (UTrans), and retention (UReten) were computed as the number of correct answers for each specific test divided by the number of questions.
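As a minimal sketch of this computation, each measure is the fraction of correctly answered questions in its test. The function name `score` and the sample counts are hypothetical and only illustrate the ratio described above.

```python
def score(correct: int, questions: int) -> float:
    """Return the number of correct answers divided by the number of questions."""
    if questions <= 0:
        raise ValueError("a test must contain at least one question")
    if not 0 <= correct <= questions:
        raise ValueError("correct answers must lie between 0 and the number of questions")
    return correct / questions


# Hypothetical results of one subject on the three questionnaires.
subject = {
    "UEffec": score(7, 10),  # understandability effectiveness
    "UReten": score(4, 8),   # retention
    "UTrans": score(3, 6),   # transfer
}
```

All three values fall in the range 0 to 1, so they can be compared across tests with different numbers of questions.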

The time needed to complete a test was also measured, but we chose not to use it because, from our own experience and following the advice of several experts, we have concluded that time is not a good indicator of understandability on its own. It provides information only about how quickly the tasks have been performed, but not about how well.

As for the design of the experiments, we used the guidelines provided in several works (Juristo and Moreno 2001; Kitchenham et al. 2002; Wohlin et al. 2000). Taking into account the kind of experimental designs used and the treatment of the studies, an appropriate statistical method for obtaining the results is an ANOVA (Kirk 1995; Winer et al. 1991). We set a statistical significance threshold of α = 0.05 in all of our studies, so we rejected the null hypotheses of our studies whenever the statistical tests we used yielded a p-value not higher than 0.05. We also studied the power of the statistical tests when the results were not statistically significant. We used SPSS (SPSS 2003) to perform all the statistical analyses.
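The decision rule can be sketched as follows. The analyses themselves were run in SPSS on the actual experimental designs; the one-way, between-groups F statistic below is a simplifying assumption used only to show the shape of the computation, and the function names are hypothetical. The p-value would be obtained from the F distribution with (k - 1, n - k) degrees of freedom, which this sketch does not compute.

```python
from statistics import mean
from typing import Sequence

ALPHA = 0.05  # significance threshold used in all the studies


def f_statistic(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def reject_null(p_value: float, alpha: float = ALPHA) -> bool:
    """Reject the null hypothesis when the p-value is not higher than alpha."""
    return p_value <= alpha


# Hypothetical scores for a group without and a group with composite states.
f_value = f_statistic([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```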

We examine all the threats to validity of the experiments in Section 8.

4 The Cognitive Theory of Multimedia Learning (CTML)

Models in general and conceptual models in particular include both graphics and text. Mayer (2001) proposed a definition of “multimedia” that covers descriptions containing both “words” and “pictures”. Conceptual models can be considered multimedia messages, since they include both words and graphic elements (Gemino and Wand 2005).

We have used CTML (Mayer 2001) to explain how individuals viewing explanative material develop an understanding of multimedia content being presented to them. One of the main strengths of this theory lies in the experimental studies that have been based on it to compare text-only presentations with graphics/text presentations in several fields (Craig et al. 2002; Gemino and Wand 2003; Mayer 1989; Mayer and Anderson 1991; Mayer 2001; Tabbers 2004).

There are a number of reasons for choosing CTML as a means of measuring how subjects understand the materials that are being presented (Gemino and Wand 2005). Firstly, CTML focuses on words and graphics, which are the elements used by UML. Secondly, CTML provides principles for the design of effective multimedia presentations that can be empirically tested. In third place, CTML has evolved over a decade of work, in which experimental instruments and methods have been developed (Mayer 1989, 2001).

CTML suggests that a learner is not an “empty vessel” waiting to be filled with domain information, but an active processor with limited cognitive capacity who attempts to integrate presented material with previous knowledge. This implies that individuals might differ in how they understand the same model, depending on prior knowledge and the attention they give to various parts of the model.

Mayer (2001) suggests that three outcomes are possible when presenting explanative material: (1) no learning, (2) fragmented learning, and (3) meaningful learning. These outcomes are primarily based on concepts that can be measured by two variables that Mayer labels retention and transfer.

Retention is defined as the comprehension of material being presented. Transfer is the ability to use knowledge gained from the material to solve related problems not directly answerable from it. No learning occurs where retention and transfer are low. Fragmented learning occurs where retention is high but transfer is low. This result indicates that material has been received but has not been integrated well with prior knowledge. It suggests that memorization has occurred, rather than meaningful learning. Finally, meaningful learning occurs when both retention and transfer are high. High transfer indicates that information has been integrated into long-term knowledge and a high level of understanding of the presented material has been achieved.
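The three outcomes can be expressed as a small classification rule. This is a sketch under a stated assumption: the text does not give a cut-off between "low" and "high", so the `threshold` parameter (0.5 by default) and the function name are hypothetical. The combination of low retention with high transfer is not covered by the three outcomes and is reported as unclassified.

```python
def learning_outcome(retention: float, transfer: float, threshold: float = 0.5) -> str:
    """Map retention and transfer scores (0 to 1) to a CTML learning outcome.

    The threshold separating "low" from "high" is an assumption of this sketch.
    """
    high_retention = retention >= threshold
    high_transfer = transfer >= threshold
    if high_retention and high_transfer:
        return "meaningful learning"
    if high_retention:
        return "fragmented learning"  # material memorized but not integrated
    if not high_transfer:
        return "no learning"
    return "unclassified"  # low retention with high transfer is not defined above
```

For example, `learning_outcome(0.8, 0.2)` returns `"fragmented learning"` under the default threshold.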

5 First Experiment and Replication (E1 and R1)

In this section, we outline the main characteristics and results of the first experiment (E1) and its replication (R1). More details about this study can be found in (Cruz-Lemus et al. 2005a).

All the subjects received a short training session before the experiment, in which the instructor commented on the main constructs of UML statecharts and showed two examples of the experimental tasks to be performed. These examples, as well as those performed in the rest of experiments and replications, were neutral with regard to the independent variable (whether using composite states or not), as one example contained composite states and the other did not.

We split the subjects randomly into two groups, which we here call Group A and Group B. Two different domains were used, one involving the functioning of an ATM (Automated Teller Machine) and the other a phone call. For each domain, two conceptually identical diagrams were used, but while one of the diagrams included composite state(s), the other did not.

In the first part of the experiment, we used the ATM domain, in which the subjects in Group A received a diagram without composite states, while the subjects in Group B received a diagram with composite states. In the second part of the experiment, we used the phone call domain. Subjects in Group A received a diagram with composite states, while the subjects in Group B received a diagram without composite states. The experiment design is summarized in Table 1.
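The assignment described above can be sketched in a few lines. The names `DESIGN` and `split_subjects` are hypothetical, and splitting the subjects into two halves of (nearly) equal size is an assumption, since the text only states that the split was random.

```python
import random
from typing import Dict, Iterable, List, Optional, Tuple

# (group, domain) -> True if the diagram handed out uses composite states.
DESIGN: Dict[Tuple[str, str], bool] = {
    ("A", "ATM"): False,
    ("B", "ATM"): True,
    ("A", "phone call"): True,
    ("B", "phone call"): False,
}


def split_subjects(subjects: Iterable[str], seed: Optional[int] = None) -> Dict[str, List[str]]:
    """Randomly split the subjects into Group A and Group B (equal halves assumed)."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}


# Hypothetical subject identifiers.
groups = split_subjects([f"S{i}" for i in range(1, 11)], seed=1)
```

With this layout every subject works with both kinds of diagram, and each domain is seen both with and without composite states.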

Table 1 E1 and R1 design

Button	Functionality
S1	Time / Date
S2	Set Alarm on / off
S3 normal	Chronometer
S3 long	Date/Time/Alarm Update
S4	Light
