1 Introduction

Since the inception of business process modeling, the dual needs of human understanding and executability of process models have been under discussion. Numerous studies have been conducted to gain insights into how these, often opposing, needs can be met. In practice, the understanding of a business process often depends on two aspects, that is, the understanding of the business process model and the understanding of any related business rules, which may or may not be part of the process model [29, 34, 36]. The understanding extracted from graphical process models is focused on the temporal or logical relationships between business activities, whereas the business rules comprise the constraints and mandates that control the behaviour of the business process and its activities [39]. When the two are not integrated, the risk of an incomplete understanding of the business process increases and the effectiveness of business process management is hampered.

To improve both the efficiency and the effectiveness of understanding, several studies have advocated the integration of business rules into business process models [3, 14, 15, 34, 36]. At the same time, however, there is evidence that existing business process modeling languages lack the representational capacity to represent business rules sufficiently [8, 27]. Due to such representational limitations of graphical process modeling techniques, it is not always possible, or indeed desirable, to represent related business rules within the process model [27]. Several studies have also explored the situations under which business process models and business rules are best kept separated, and those in which they are best integrated [3, 7, 13, 36].

Prior research has classified the integration of business rules with business process models into three approaches, namely text annotation integration, diagrammatic integration and linked integration (see Fig. 1). Text annotation integration represents business rules in business process models by adding textual descriptions of rules, e.g. in BPMN, using the BPMN text annotation construct [3, 7, 36]. In contrast, diagrammatic integration relies on control flow constructs, such as sequences and gateways, and other constructs to represent business rules in business process models [13, 36]. Linked integration is characterised by the use of links to an external rule repository. It can use either a static or a dynamic approach to integrate and link each business rule with the corresponding part of the business process model [32, 36].

Fig. 1. Business rules integration approaches [34]

Despite several studies proposing various approaches for business process and rule integration, there is limited knowledge on the effect these approaches have on process understanding [35]. Previous studies have demonstrated that the linked integration approach is associated with better business process understanding as compared to a separated representation of process model and related rules [36]. However, how the three different approaches to business process and rule integration compare in terms of process understanding remains unknown.

In this paper, we present the outcomes of an empirical analysis undertaken to study the effects of different process and rule integration approaches on business process model understanding. Using a cognitive load perspective, and with the help of eye tracking equipment, we conduct an experiment to compare the effects of link integration, text annotation and diagrammatic integration on business process model understanding. The experiment uses three measurements to conduct the comparison, namely, understanding accuracy, mental effort and time efficiency. Our study provides empirical findings on the relative merits of the integration approaches, which can help modelers make informed decisions regarding the integration of rules and process models.

In the following sections we first present the research background of business rule integration methods as well as the role of eye tracking methods in studying business process model understanding. Section 3 introduces our experiment design. Section 4 presents the data analysis methods, the results of the experiment and discussion of insights drawn from the results, and finally Sect. 5 summarizes the contribution of the paper, limitations of the study, and an outline of future extensions of this work.

2 Related Work

Business process modeling and business rule modeling are complementary approaches for modeling business activities. To improve the representational capacity of business process models, researchers have developed various business rule integration methods in the literature [14, 15]. In summary, three approaches to business rule integration have been proposed, namely, text annotation, diagrammatic integration and link integration, as shown in Fig. 1. The three integration methods differ in format and construction. Text annotation and link integration both use a textual expression to describe the business rules and connect them with the corresponding section of the process model. However, text annotation can result in repetition and, consequently, inconsistency of rule representation, i.e. the same rule being represented with slightly different text. In link integration, visual links explicitly connect corresponding rules with the relevant process section. Even though link integration requires access to an external business rules repository, it has been shown to reduce the cognitive load required to mentally connect rules with process models [34]. Since diagrammatic integration relies on graphical process model constructs, such as sequence flows and gateways, to represent business rules in the process model, limitations in the representational capacity of the modeling language inevitably create barriers or increase the complexity of the process model structure, which in turn may increase the cognitive load required to understand a business process with rules integrated in diagrammatic form.

At the same time, a variety of factors have been identified as affecting the understanding of a process. These can be classified into two categories: process model factors and human factors. Process model factors relate to the metrics of the process models, such as modularization [28, 33], block structuredness [1, 38], and complexity. Human factors, or personal factors, relate to characteristics of process model users, such as an individual's domain knowledge [33], modeling knowledge [5], modeling experience [22], and education level [37].

A number of prior studies have focused on different forms of process model complexity, with a broad consensus that most forms of complexity contribute to decreased understandability of process models. The independent variables investigated in these works include: number of arcs and nodes [22, 26], number of gateways [28, 30, 31], number of events [30], number of loops [5], number of concurrencies [21], length of the longest path [21, 31], depth of nesting [9], and gateway heterogeneity [21, 31]. For example, [19] studied the relationship between structural properties and process understandability, noting that the number of arcs in a model influences its understandability; later, in [20], they presented a set of seven process modeling guidelines that can help modelers create less error-prone models. Similarly, [17] measured the understandability of process models and, among their findings on structural model comprehension, argued that concurrency and exclusiveness are more difficult to comprehend than sequential order. Other researchers identified content-related factors, such as the separability, reliability and validity of a model, that can influence process understandability [18, 21].

Another area of relevance for our work is cognitive load theory, where cognitive load refers to the total amount of mental effort being used in working memory [23]. To assess mental effort, researchers have categorized the measurement of cognitive load into four main aspects: subjective ratings, performance measures, behavioural measures and physiological measures [2]. Subjective measures, also referred to as self-report measures, use one or more rating scales on which users rank or score their experienced level of load; performance measures consist of task completion time, answer correctness, etc.; physiological measures involve tracking galvanic skin response and heart rate; and behavioural measures involve observing patterns of interactive behaviour [2]. In practice, behavioural and physiological measures are often used as they provide a direct measurement of cognitive load. Among the various related measurements, eye-based measures are one of the main behavioural measurements as they provide a sensitive and reliable measure of cognitive load. Given the limited capacity of working memory and cognitive resources, we can conclude from prior research that a heavy cognitive load will lead to errors in process model understanding, and that error frequency will increase with the level of cognitive load [34]. Therefore, it is important to study the merits of integrating business rules into business process models in terms of the implications for cognitive load and the subsequent improvement (or lack thereof) in the understanding of the business process [34].

Eye tracking has emerged in recent years as one of the key sensor technologies applied in studies of visual cognition [4], and has been adopted by researchers across many fields. According to cognitive load theory, eye activity is one of the physiological variables that can be used to reflect changes in cognition [4, 23]. Through the use of eye tracker technologies, such as the Tobii Pro TX300, we can directly collect eye movement data and measure objective metrics, such as pupillary response and fixation duration, that correlate with cognitive function [2]. By detecting indicators such as fixations in each area of interest (AOI), we can directly identify the exact area that draws the attention of the participant. Although there is a long history of the use of eye tracking technologies in medical and psychological studies [12], the use of such technology in the business process modeling context is quite recent. To name a few examples, Petrusel and Mendling [24] defined the notions of Relevant Region and Scan-path to show that the Relevant Region is correlated with the answer given during question comprehension. In [11], researchers used an eye tracking method to measure and assess user satisfaction in process model understanding. In [25], the use of eye tracking technology enabled the researchers to identify visual cues of coloring and layout that can improve performance in process model understanding.

3 Research Design

We use an experimental research design to undertake an empirical evaluation of the three approaches to business process model and business rule integration. Our business process modeling language of choice is BPMN 2.0, due to its wide adoption and standing as an international process modeling standard. The experiment is inspired by methodologies proposed in [4], and has been adapted as explained below. Further, the experiment design takes into account the conditions of the lab environment, generalizability, and the need to control for learning effects.

The independent variable to be studied is the business rule integration approach, with three levels: text annotation, diagrammatic and link integration. The corresponding dependent variables are understanding accuracy, mental effort and time efficiency. Similar to other studies, we use measures of answer correctness and the time taken to answer questions to reflect the effectiveness of comprehension (or understanding accuracy) [6, 25]. Accordingly, we use the number of correct answers to measure understanding accuracy. For time efficiency, timing starts when the first process model is displayed on the computer screen and ends when the last question is answered and submitted. To measure participants' mental effort, we use fixation duration as the objective measure in this experiment, which is increasingly used as a mental effort measure in lieu of pupil dilation [16].
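To make this operationalization concrete, the following is a minimal sketch of how the three dependent variables could be derived from exported logs. The file layouts and column names are illustrative assumptions, not the actual export format used in the study.

```python
import pandas as pd

# Hypothetical exports: one row per answered question, one row per fixation.
answers = pd.read_csv("answers.csv", parse_dates=["submitted_at"])
# columns: participant, question, correct (0/1), submitted_at
fixations = pd.read_csv("fixations.csv")
# columns: participant, fixation_duration_ms
display_log = pd.read_csv("display_log.csv", parse_dates=["first_display"])
# columns: participant, first_display (first model shown on screen)

# Understanding accuracy: number of correct answers per participant.
accuracy = answers.groupby("participant")["correct"].sum()

# Time efficiency: elapsed time from first model display to last submission.
last_submit = answers.groupby("participant")["submitted_at"].max()
duration = last_submit - display_log.set_index("participant")["first_display"]

# Mental effort: total fixation duration per participant.
effort = fixations.groupby("participant")["fixation_duration_ms"].sum()
```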

The overall experiment design is illustrated in Fig. 2. Each group of participants is first provided with a BPMN tutorial and is then presented with two models, both using one of the three approaches of rule integration. One of our scenarios, on which the models and rules are based, originates from a travel booking diagram included in OMG's BPMN 2.0 examples. The second model is adopted from the Signavio website resources. For the purposes of this study, we have ensured, through multiple revisions, that the models for all three integration approaches are informationally equivalent. Due to space limitations, the models cannot be included in the paper, but the complete materials of the entire experiment are available for download on Dropbox.

Fig. 2. Overall experiment design

In the remainder of this section, we introduce the instruments, settings and participants of our experiment.

3.1 Instruments

In this research, the instruments we use include a tutorial, the treatments and a questionnaire. In addition, we keep all other potentially confounding factors constant, such as using the same eye-tracking lab equipment and the same tutorial content. We do not impose a time limit or word count limit on the participants. In the treatment, we used the three integration approaches across two models. The two models are independent and from different knowledge domains; however, we made every attempt to maintain information equivalence and comparable complexity between them. Both models were adjusted to ensure consistency of format for each of the integration approaches. The two models differ in terms of model constructs: for example, the diagrammatic integration version of model 1 has more parallel gateways (AND gateways, 18 vs. 6 in model 2), whereas model 2 has more exclusive gateways (XOR gateways, 15 vs. 3 in model 1).
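As an aside, construct counts such as these can be verified mechanically from a model's BPMN 2.0 XML serialization. The sketch below counts AND and XOR gateways with Python's standard library, under the assumption that the models are stored as standard .bpmn files; the file name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard BPMN 2.0 model namespace.
NS = {"bpmn": "http://www.omg.org/spec/BPMN/20100524/MODEL"}

def count_gateways(path):
    """Count parallel (AND) and exclusive (XOR) gateways in a BPMN file."""
    root = ET.parse(path).getroot()
    return {
        "AND": len(root.findall(".//bpmn:parallelGateway", NS)),
        "XOR": len(root.findall(".//bpmn:exclusiveGateway", NS)),
    }

print(count_gateways("model1_diagrammatic.bpmn"))  # e.g. {'AND': 18, 'XOR': 3}
```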

In Table 1 we outline this diversity in terms of the model constructs and model coverage of each question in each model. The listed model constructs indicate which constructs a participant has to review in order to answer that question. Model coverage relates to the span of the question: a participant may have to navigate only a specific section of the process model to answer the question (local), or the whole process (global). We deliberately introduced diversity in the questions to explore how each integration approach affects cognitive load depending on process characteristics. This diversity allowed us to gain further insights into the relationship between process model constructs, rule integration approach and cognitive load (further details in the results section).

Table 1. Comparison of questions

We designed the tutorial and tutorial exercises to help participants develop familiarity with BPMN and the format of the main experiment. Since this experiment does not require any substantial knowledge from participants, only basic BPMN constructs are used in the tutorial and experiment models. The tutorial was presented at the beginning of the experiment session and we encouraged each participant to ask any questions during the tutorial session, so as to ensure their readiness for the experiment.

To keep the groups balanced, we used a pre-experiment questionnaire to determine participants' prior knowledge and basic demographics, and distributed participants across the groups in a way that avoids accidental homogeneity of any group [4].
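One simple way to realize such a distribution, sketched below purely as an illustration (the scoring scheme is an assumption, not the study's exact procedure), is to rank participants by a prior-knowledge score and deal them round-robin into the three treatment groups.

```python
def assign_groups(participants, n_groups=3):
    """participants: list of (participant_id, prior_knowledge_score) tuples.

    Ranks participants by score and deals them round-robin so that no
    group accumulates a disproportionate share of experienced participants.
    """
    ranked = sorted(participants, key=lambda p: p[1], reverse=True)
    groups = [[] for _ in range(n_groups)]
    for i, (pid, _score) in enumerate(ranked):
        groups[i % n_groups].append(pid)
    return groups  # one list of participant ids per integration approach
```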

3.2 Setting

In this experiment, the questionnaire was implemented in Google Forms. The tutorial and experiment were carried out on an online web platform built with HTML, CSS, JavaScript and PHP, with a MySQL back-end database administered through phpMyAdmin. The Areas of Interest (AOI) were created in Tobii Studio, as shown in Fig. 3.

Fig. 3. Instrument illustrations of link integration

To faithfully record the eye tracking data, the experiment webpage was run in full screen mode and the complete models were displayed; zooming and scrolling were disabled as they were not necessary. During the pilot test, the visibility of the experiment text and diagrams was examined carefully, and we ensured that all text and diagrams were clear from a distance of 1.2 m.

To eliminate colour blindness bias, we used a black, white and grey colour scheme for the rule icon in the link integration model. In addition, all experiments were conducted in the same lab with the same equipment. The lab is a small room with only a few machines and no windows, with a ceiling light as the only light source. The eye tracker used in the experiment is the Tobii Pro TX300, with a 23-inch screen at a resolution of 1920 × 1080. Participants were able to adjust the chair height to find the most comfortable position before calibration.

We used multiple Areas of Interest (AOI) to capture eye movements. For models featuring text annotation and diagrammatic integration, the screen was divided into two areas: a process model area and a question area. The process model area displayed the business process model, and the question area contained one question at a time for each model. For models featuring link integration, there was an additional third area for rules, which displayed the corresponding business rules when participants clicked on each “R” icon in the model, as shown in Fig. 3.
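In data processing, each fixation can be attributed to one of these areas by a simple bounding-box test. The sketch below illustrates the idea; the pixel coordinates are placeholders for the 1920 × 1080 screen, as the actual AOI geometry was defined in Tobii Studio.

```python
# Illustrative AOI rectangles as (x_min, y_min, x_max, y_max) in pixels.
AOIS = {
    "model":    (0,    0,   1400, 1080),
    "question": (1400, 0,   1920,  540),
    "rules":    (1400, 540, 1920, 1080),  # link integration condition only
}

def aoi_of(x, y):
    """Return the AOI containing the fixation point (x, y), if any."""
    for name, (x0, y0, x1, y1) in AOIS.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None  # fixation outside all AOIs
```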

To ensure good quality of the eye-movement data used in the analysis, we had to eliminate the data of three participants whose eye movements were not properly recorded by the Tobii eye tracker, that is, the eye tracker lost track of the participant's eyes and the data did not faithfully reflect the fixations of their eye movements.

3.3 Participants

All participants were students invited from an Australian university. They were required to have only foundational knowledge of graphical conceptual models, such as flowcharts, UML or ER diagrams, and were not required to have any substantial knowledge of business process or rule modeling. Participation was on a voluntary basis, but participants were offered a $30 voucher for participating in this research. There were 25 participants in each group, with experiments conducted one at a time. In total, 75 students participated in this experiment. As in other similar experiments [10, 16], a sample size of 20 to 30 participants per group is feasible and provides a sufficient volume of data for testing statistical significance.

4 Results

Our data analysis is focused on understanding accuracy, time efficiency and mental effort. We use the number of correct answers of each participant (ordinal data) as the measure of understanding accuracy. For mental effort and time efficiency, we use fixation duration and visit duration (numerical data), respectively, based on the eye tracking data. We structured our analysis into three levels to draw out the subtle differences: overall results for each dependent variable, model level results, and question level results.

The approach taken for the analysis of the data captured in the experiments is as follows. For numerical data, we first use the Shapiro-Wilk test to check whether the dependent variable is normally distributed. If the data are normally distributed, we use Levene's test for homogeneity of variance to check whether the assumption of equal variance is met. If both conditions are met, we use one-way analysis of variance (ANOVA) to test the difference of means across the three groups. If there is a significant difference between the dependent variable and the integration groups, we use Tukey's HSD as the post-hoc test to further compare the differences between each pair of groups. If normality is violated, we use the Kruskal-Wallis test. If the Kruskal-Wallis test result is significant, we use Dunn's test, a commonly used post-hoc test for the Kruskal-Wallis test, to rank the groups in pairwise comparisons. The Bonferroni correction was not used because the independent variable has three groups. For ordinal data, we use the Kruskal-Wallis test; if the result is significant, we use the same post-hoc test to rank the groups in pairwise comparisons. The significance level of 0.05 is used in all tests.
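This decision procedure can be captured in a short script. The following is a minimal sketch using scipy, statsmodels and scikit-posthocs; the data layout and variable names are illustrative assumptions, not the study's actual analysis code.

```python
from scipy.stats import shapiro, levene, f_oneway, kruskal
from statsmodels.stats.multicomp import pairwise_tukeyhsd
import scikit_posthocs as sp
import pandas as pd

ALPHA = 0.05

def compare_groups(groups, labels):
    """groups: list of three arrays of a dependent variable, one per
    integration approach; labels: the three group names."""
    values = [v for g in groups for v in g]
    tags = [l for l, g in zip(labels, groups) for _ in g]

    normal = all(shapiro(g).pvalue >= ALPHA for g in groups)
    if normal and levene(*groups).pvalue >= ALPHA:
        # Parametric path: one-way ANOVA, then Tukey's HSD post hoc.
        if f_oneway(*groups).pvalue < ALPHA:
            print(pairwise_tukeyhsd(values, tags, alpha=ALPHA))
    else:
        # Non-parametric path: Kruskal-Wallis, then Dunn's test post hoc.
        if kruskal(*groups).pvalue < ALPHA:
            df = pd.DataFrame({"value": values, "group": tags})
            # p_adjust left at its default (no Bonferroni correction),
            # matching the choice described above.
            print(sp.posthoc_dunn(df, val_col="value", group_col="group"))
```

For ordinal data such as the number of correct answers, only the non-parametric path applies, which is consistent with the procedure described above.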

4.1 Understanding Accuracy

We first investigate whether there is a relationship between the rule integration approach and understanding accuracy, captured through correctness of answers.

Overall, the result of the Kruskal-Wallis test indicates that there is a significant difference between the three groups in terms of understanding accuracy (p = 0.000). The results of the post-hoc pairwise comparisons show that diagrammatic integration is associated with higher understanding accuracy than text annotation (one-tailed p = 0.000) and link integration (one-tailed p = 0.003), but that text annotation and link integration do not differ significantly (one-tailed p = 0.139).

Model Level:

As the results of the Kruskal-Wallis test in Table 2 show, there is a significant difference between the three groups in terms of understanding accuracy, both in model 1 (p = 0.002) and model 2 (p = 0.033). Given the results of the post-hoc pairwise comparisons, we can further conclude that diagrammatic integration is associated with higher understanding accuracy than text annotation and link integration in both models, at the significance level of 0.05.

Table 2. Understanding accuracy

Question Level:

From Fig. 4, we can observe a notable contrast in mean understanding accuracy between diagrammatic integration and the other two approaches for the first two questions in model 1 and the last two questions in model 2.

Fig. 4. Understanding accuracy breakdown by question

Conclusion 1:

Understanding accuracy is associated with the rule integration approach. Overall, diagrammatic integration shows better understanding accuracy than link and text annotation integration. The same applies in model 1 and model 2 individually. Link integration and text annotation do not differ significantly in understanding accuracy in either model.

4.2 Mental Effort

Overall, the result of the Kruskal-Wallis test indicates that the difference in fixation duration between the three groups is not statistically significant (p = 0.082). Therefore, we further analyse the data for each model and each question to explore any detailed differences.

Model Level:

The results of the Shapiro-Wilk test for mental effort indicate that the assumption of normality is not met in either model (p = 0.000 and p = 0.037, both p < 0.05). Hence, we use the Kruskal-Wallis test for both models. As shown in Table 3, the difference in fixation duration between the three groups in model 1 is not statistically significant (p = 0.946). For model 2, our analysis indicates that the difference in fixation duration across the three groups is statistically significant (p = 0.036).

Table 3. Mental effort

For model 2, the results of the post-hoc pairwise comparisons show that the diagrammatic integration group has a significantly higher fixation duration than text annotation (one-tailed p = 0.021) and link integration (one-tailed p = 0.008), but text annotation and link integration do not differ significantly (one-tailed p = 0.359).

Question Level:

From the mean comparison of fixation duration for each integration approach in Fig. 5, we can observe a notable difference between diagrammatic integration and the other two integration approaches in the last two questions of model 2. We note that both these questions involved loop constructs.

Fig. 5. Fixation duration breakdown by question

Conclusion 2:

Mental effort is partially associated with the rule integration approach. Overall, there is no significant difference in mental effort between the integration approaches; the same applies in model 1. In model 2, diagrammatic integration requires more mental effort than the other integration approaches, especially when loop constructs are involved. Text annotation and link integration do not differ significantly in mental effort in either model.

4.3 Time Efficiency

Overall, the result of the Kruskal-Wallis test indicates that the difference in visit duration between the three groups is not statistically significant (p = 0.273).

Model Level:

The results of the Shapiro-Wilk test for time efficiency indicate that the assumption of normality is not met in either model (p = 0.000 and p = 0.014, both p < 0.05). Therefore, we use the Kruskal-Wallis test for both models. As shown in Table 4, for model 1 the result indicates that the difference in time efficiency between the three groups is not statistically significant (p = 0.884). In model 2, the difference in time efficiency across the three groups is statistically significant (p = 0.021).

Table 4. Time efficiency

For model 2, the results of the pairwise comparisons show that the diagrammatic integration group has a significantly higher visit duration than text annotation (one-tailed p = 0.012) and link integration (one-tailed p = 0.006). However, text annotation and link integration do not differ significantly (one-tailed p = 0.394).

Question Level:

From the mean comparison of visit duration for each integration approach in Fig. 6, we can observe a notable difference between diagrammatic integration and the other two integration approaches in the last two questions of model 2, which involved loop constructs.

Fig. 6. Visit duration breakdown by question

Conclusion 3:

Time efficiency is partially associated with the rule integration approach. Overall, there is no significant difference in time efficiency between the integration approaches, nor in model 1 when considered in isolation. In model 2, diagrammatic integration requires more time than the other integration approaches when loop constructs are involved in the questions. Text annotation and link integration do not differ significantly in time efficiency in either model.

4.4 Analysis and Discussion

Overall, we observe that for model 1, diagrammatic integration is more understandable than text annotation and link integration, while there is no significant difference in mental effort and time between the integration approaches. For model 2, diagrammatic integration is again more understandable than text annotation and link integration, but requires more effort and time than the other two integration types. Reviewing these results against the diversity of model constructs and coverage (as outlined in Table 1), we conjecture that the differences in model constructs are the likely cause of these results. From a model construct perspective, we observe that model 2 has relatively more XOR gateways than model 1 in the diagrammatic integration approach (15 vs. 3). In model 2, the last two questions focus on the model area that involves looping, whereas there is no looping in model 1 and the other questions do not require the participant to mentally navigate gateways. Hence, we note that the presence and number of XOR and AND gateways, and the loops formed through these constructs, may influence mental effort and time efficiency. Indeed, as shown in Figs. 5 and 6, there is a notable difference in the last two questions of model 2, where diagrammatic integration requires the most mental effort and time. We posit that the increase in mental effort and time observed in model 2 is attributable to the number of XOR and AND gateways, and the loops formed through these constructs, as mentioned above. Moreover, we believe the reason that diagrammatic integration requires more mental effort and time than the other two approaches in model 2 is that it uses more gateways than link/text annotation integration (21 vs. 7) to integrate business rules into the model, which inevitably makes the model more complex.

Based on the above analysis, we consider mental effort and time efficiency to be only partially associated with the rule integration approach. That is, diagrammatic integration is associated with better understanding accuracy, but may require more mental effort and time than text annotation and link integration when the model involves complex loop constructs.

5 Conclusion

The central question in this research is the difference in effect between business process and rule integration approaches on business process model understanding. We set out to investigate this question from a cognitive load perspective using eye tracking, and studied the differences in terms of understanding accuracy, mental effort and time efficiency. Through the analysis, we discovered that the integration approach, when applied to models with specific characteristics, impacts cognitive load and consequently process understanding. For example, the presence and quantity of XOR gateways and AND gateways, and questions which require navigating constructs through loop structures, seem to influence understanding, as observed in model 2.

The findings of this research provide empirical evidence of the relative merits of the integration approaches. These findings can help process modelers make evidence-based decisions regarding the integration of rules and process models relative to model characteristics. The design of this research experiment also provides a valuable methodological contribution to the field of business process model understanding. In particular, we illustrate feasible protocols, and the resulting advantages, of using eye tracking to study business process model understanding.

Our study is not without limitations. First, due to the limitations of the eye tracking software and the display capacity of the screen, the complexity of the process models and rules was restricted. Second, only two models were used in the experiment, which may hamper the generalizability of the conclusions. Further, both models were created using BPMN, which also raises a question of generalizability across other business process modeling notations. Third, the validity of the results is potentially compromised by learning effects, since model 2 was assessed after model 1 in all experiments. Lastly, fatigue can also be considered a potential weakness, as there was no break for participants between the two models and no time limit for answering each question. Moreover, individual variability (e.g. in experience and domain knowledge) may influence the experiment results. Since all participants were students, we limit the generalizability of the research to novice modelers. While organizational models are often more complex in reality, our findings still provide valuable comparative evaluations towards understanding the differences between integration approaches.

In our future work, we seek to extend our study with a consideration of further diversity in model construction and in the model coverage of rules, to better understand under what conditions each of the three integration approaches performs better. We will design the structural characteristics of the models in a way that enables us to measure the effects of specific constructs on the dependent variables. We also plan to investigate the relationships between the dependent variables. Finally, this work can be extended to alternative process modeling notations beyond BPMN, as different notations have different mechanisms for integrating rules, which is likely to affect process model understanding.