1 Introduction

Enterprise modelling (EM) may serve a variety of purposes: developing or improving the organizational strategy, (re-)structuring business processes, eliciting requirements for information systems, promoting awareness of procedures and commitment to goals and decisions, etc. [16]. All these application scenarios require the involvement of a multitude of domain experts with different background knowledge [22]. It is therefore a challenge to express an enterprise model in a way equally well understood by all domain experts [11]. Limited understanding of the EM by stakeholders may result in low quality of the model and low commitment by stakeholders.

Traditional EM approaches involve an enterprise modelling expert who constructs an EM by interviewing domain experts, analyzing documentation and observing current practice, and validates the resulting model with stakeholders. Models constructed by such a consultative approach tend to exhibit low quality and poor commitment [21].

Recently, practitioners and researchers have advocated the potential of participative EM approaches, both in terms of promoting stakeholder agreement and commitment and in producing higher quality models [1, 4]. In other studies, tangible modelling approaches, in which physical tokens represent conceptual models, were found to be faster, easier and more interactive than computer-supported approaches, in which diagrams on a screen are manipulated [5, 8, 10]. In this paper we extend studies on tangible modelling to the EM domain by combining participative EM and tangible modelling in a hybrid approach.

We report on an empirical study in a graduate EM course in which we compared the effect of using a tangible modelling set with the use of computerized tools. The results were encouraging, as the tangible modelling groups showed a higher level of collaboration, produced better results, and scored higher on post-tests. On the other hand, they felt that it took longer to produce models and reported slightly lower levels of agreement. We discuss possible explanations and implications of these results and indicate several avenues for further research.

In the next section, we summarize background and related work on enterprise modeling. Section 3 describes our research design; Sect. 4 presents our observations and measurements, and discusses possible explanations and generalizations; Sect. 5 discusses implications for practice and for research.

2 Background

In our experiment we use 4EM, which consists of an EM language, as well as guidelines regarding the EM process and recommendations for involving stakeholders in moderated workshops [19]. The 4EM sub-models include the Goals, Business Rules, Concepts, Business Process, Actors and Resources, and Technical Components and Requirements models.

Participative EM, where modelling sessions in groups are led by EM practitioners, has been established as a practical approach to deal with organizational design problems. It relies on dedicated sessions where stakeholders create models collaboratively [21]. The participative EM process includes three general activities that can be performed iteratively: (1) extracting information about the enterprise, (2) transforming information into models, and (3) using enterprise models (after mutual agreement on models is achieved) [11]. Participative EM attempts to alleviate the burden of analyzing numerous intra- and inter-organizational processes, which makes the traditional consultative approach hard to apply [22].

With the EM practitioner serving as a facilitator during participative modelling sessions, the way participants interact is a crucial factor for EM success. Stirna et al. [21] claim that actively involving workshop participants in modelling makes it possible to generate models of better quality and also increases understanding of and commitment to the created models among the participants. Barjis [1] provides evidence that participation and interaction among stakeholders enable more effective and efficient model derivation and increase the validity of models. Front et al. [6] point out that a participative approach enables more efficient data acquisition and better understanding of enterprise processes.

Tangible modelling is a modelling process where components of the model can be grasped and moved by the participants [10]. Tangible modelling implies synchronicity: participants can perform changes to the model in parallel [9], making it suitable for participative modelling sessions [20]. In this way, tangible modelling is different from computer-based modelling, where models are often created by one person operating the modelling tool. There is evidence that tangible modelling sessions with domain experts can produce more accurate models and result in higher levels of collaboration as well as increased stakeholder engagement and agreement [5, 8, 10]. Related work has also found that the interactive nature of tangible modelling increases usability [20], while its resemblance to board games can make the modelling activity more fun [7]. Tangible process modelling, for instance, was found to provide better engagement [13], increase comprehensibility of the result [26] and promote higher consensus and more self-corrections while helping stakeholders involved in the tangible modelling sessions remember more details [3]. Similarly, some EM practitioners recommend using plastic cards as a means of improving the quality of models resulting from participative sessions [15, 19]. Advantages of tangible modelling can be related to evolutionary capabilities of human beings with regard to interacting with their physical surroundings. Psychological research has shown that by reducing cognitive load [14, 23] and improving cognitive fit [25], physical representations are easier to understand and manipulate [1]. This agrees with constructivist theories of learning, which maintain that learning takes place in project-based learning rather than in one-way communication, and that this is most effective when people create tangible objects in the real world [2].

3 Research Design

The research goal of this paper is to study the effects of employing a tangible approach to EM compared to conducting computer-based modelling sessions. This section describes our research design following the checklist provided by Wieringa [27]. We translate our research goal into the following research question:

What are the effects of introducing tangible modelling as part of participative EM sessions?

The effects we concentrate on are the quality of the models, as well as the difficulty, degree of collaboration, and efficiency of the modelling process. Furthermore, we are interested in the educational value, namely the relative learnability with regard to 4EM. Measurement design is presented later in Table 1.

3.1 Object of Study (OoS)

Our experiment with tangible enterprise modelling was carried out with graduate students of an enterprise modelling course at Jönköping University. Students were asked to form groups of no more than five members. Although all sessions were supervised, the supervisor did not lead the sessions (as an EM practitioner would do), but only observed and provided feedback with regard to the correct application of the 4EM method. Therefore, the objects of study, i.e. the entities about which we collect data, are EM sessions performed by students. The population to which we wish to generalize consists of enterprise modelling sessions carried out by domain experts.

OoS validity. The objects of study have both similarities and differences to the target of generalization. Similarities include general cognitive and social mechanisms that are present in both our objects of study and in the population, such as evolutionary capabilities of grasping physical objects and the role of construction and participation in group work in learning. We also recognize several differences: the students have no shared experience in the organization being modelled and the supervisor did not lead the modelling session as a real-world enterprise modelling facilitator would do. Furthermore, the student groups were self-formed and so, while some groups may consist of very conscientious students, others could contain uninterested ones. Besides, some students may be shy and thus could collaborate less with their group. Nevertheless, as all of these phenomena may exist in the real world as well, these aspects (arguably) also make our lab experiment more realistic in terms of external validity. To take these possible confounding factors into account, we tried to make the presence of these phenomena visible by performing most measurements on individuals instead of on groups and by observing group behavior, dynamics and outliers.

3.2 Treatment Design

Participants were first presented with a description of a real-world anonymized case of a sports retailer company. Each group was then given five weeks to perform a business diagnosis of the retailer by constructing three out of the six 4EM sub-models, namely the goal, concepts and business process viewpoints. The groups were instructed to perform as much of the modelling as possible together, during weekly, dedicated modelling sessions (one 4 h session a week). Treatment was self-allocated: groups were allowed to choose between tangible or computer-based modelling sessions, as long as there was an even split. The tangible modelling groups were given a large plastic sheet, colored paper cards and pens to create the models. Different colors of paper cards represented different types of elements (goals, problems, concepts and processes), similar to Fig. 5.1 of [19]. Cards could be easily attached to the plastic sheet and moved if necessary. These groups were instructed to make use of the cards when collaboratively building the models, and to create digital versions of the models afterwards. By contrast, the computer-based modelling groups (allowed to use a diagram tool of their choice) started working directly on a computer.

Treatment validity. While in real-life situations the modelling technique might sometimes be prescribed, it has been noted that free choice of the preferred notation to be used in EM activities, and its effects on ease-of-use and understandability, is desirable and worth investigating [26]. Our experiment is similar to situations where modellers have the freedom to choose their tools, and dissimilar to situations where the modelling technique is prescribed to them. Notably, this self-selection of tools may hamper the external validity of this study. In addition, internal validity may be threatened by the fact that participants were informed about both available treatments. This may cause an observer-expectancy effect, where participants change their behavior based on what they think the expectations of the experimenter are. In an attempt to mitigate this, we did not inform participants about the goal of the research nor of the measurements.

3.3 Measurement Design

We are interested in comparing the effects of tangible modelling versus computer-supported modelling on the quality of the result, on the modelling process, and in connection to their educational potential.

Table 1. Operationalized indicators and measurement scales

The quality of a conceptual model is commonly defined on three dimensions: syntax (adhering to language rules), semantics (meaning, completeness, and representing the domain) and aesthetics (or comprehensibility) [12, 24]. In this study we measured the semantic quality and syntactic quality of the resulting model and omitted measuring aesthetics due to its highly subjective nature. Both semantic and syntactic quality were estimated by the supervisor on a 5-point semantic difference scale by comparing the final models with the case description and 4EM syntax, respectively.

With regard to the modelling process, relevant factors are difficulty, amount of collaboration, as well as the overall task efficiency. Difficulty is a purely subjective measure [18] and was therefore measured as perceived difficulty via individual on-line questionnaires distributed at the end of the course. The questions (available at https://surfdrive.surf.nl/files/index.php/s/ixW4JlmtXma6OlE) were linked to a semantic difference scale, and provided room for optional free-text explanations. Collaboration, i.e. the amount of interaction between group members, is crucial for creating a shared understanding of a representation [17]. We indirectly measured collaboration by means of two indicators: observed collaboration (estimated by the supervisor throughout the five sessions) and perceived agreement (measured via the same on-line questionnaire). Task efficiency is the amount of time needed to produce the final, digital model. In our case, because the task was spread across several weeks and groups may have worked at home, we could not directly measure the time groups spent. Therefore, we operationalize task efficiency in terms of perceived duration (measured via the on-line questionnaire) and observed pace (progress achieved during the dedicated modelling sessions, as estimated by the supervisor).

Finally, to evaluate the educational value of a tangible modelling approach, we looked at the final results of the students. As indicators we use final report grades and students’ performance on two exam questions on 4EM. The final report grade was decided by the supervisors and lecturer together, while exams were graded by the course lecturer, who otherwise did not take part in this study.

Measurement validity. Potential issues with measurement validity might occur due to the qualitative and self-reported nature of the data (internal causes), as well as the loosely controlled environment (contextual causes). Differences in scales (e.g. ‘1’ can correspond to ‘poor’ in one case and ‘very easy’ in another) could potentially have confused the respondents. Model quality, collaboration (observed collaboration), task efficiency (observed pace) and learnability (final report grade) were assessed by one of the authors of the paper, who was also the supervisor of the modelling sessions; this might influence validity. This person aimed to perform the assessments objectively, but could still have been biased or have made mistakes. It should also be noted that students did not form groups randomly and that they were allowed to choose a diagram tool. To preserve construct validity, we tried to reduce mono-operation bias by operationalizing each concept in terms of two different indicators (where possible). We also attempted to minimize mono-method bias by using both self-reported and observed values where possible.
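As an illustration only, the operationalization described in this section can be written down as a small data structure and checked for the two biases mentioned above. This is a sketch, not the authors' instrument: the class `Indicator`, the labels "observed", "self_reported" and "graded", and the group/individual levels are our reading of the text and should be treated as assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    concept: str  # construct being measured
    name: str     # operationalized indicator
    method: str   # assumed labels: "observed", "self_reported", "graded"
    level: str    # assumed unit of measurement: "group" or "individual"


# Indicators as described in Sect. 3.3 (method and level labels are assumptions).
INDICATORS = [
    Indicator("model quality", "semantic quality", "observed", "group"),
    Indicator("model quality", "syntactic quality", "observed", "group"),
    Indicator("difficulty", "perceived difficulty", "self_reported", "individual"),
    Indicator("collaboration", "observed collaboration", "observed", "group"),
    Indicator("collaboration", "perceived agreement", "self_reported", "individual"),
    Indicator("task efficiency", "perceived duration", "self_reported", "individual"),
    Indicator("task efficiency", "observed pace", "observed", "group"),
    Indicator("learnability", "final report grade", "graded", "group"),
    Indicator("learnability", "exam questions on 4EM", "graded", "individual"),
]


def indicators_by_concept(indicators):
    """Group indicators by the concept they operationalize."""
    grouped = {}
    for ind in indicators:
        grouped.setdefault(ind.concept, []).append(ind)
    return grouped


def mono_operation(indicators):
    """Concepts measured by fewer than two indicators."""
    grouped = indicators_by_concept(indicators)
    return sorted(c for c, inds in grouped.items() if len(inds) < 2)


def mono_method(indicators):
    """Concepts whose indicators all rely on a single measurement method."""
    grouped = indicators_by_concept(indicators)
    return sorted(
        c for c, inds in grouped.items() if len({i.method for i in inds}) < 2
    )


print(mono_operation(INDICATORS))
print(mono_method(INDICATORS))
```

Under these assumed labels, the check makes the "where possible" qualification concrete: perceived difficulty is the only indicator for its concept, and model quality relies on supervisor estimates alone.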

4 Results

Measurements included data on the work of 38 students from Information Engineering and Management (School of Engineering) and IT, Management and Innovation (School of Business), who formed eight groups of three to five students. Although self-assigned, exactly half of the groups opted for “physical” (i.e. tangible) modelling. Every group submitted a report containing final, digital versions of their model (constructed in a tool of their choice), as well as justifications of their design decisions. No student dropped out of the modelling sessions, but only 23 filled in the questionnaire and 26 took the exam.

Table 2. Group measurements, aggregated per group type
Table 3. Individual measurements, aggregated per respondent group type

Results per group (Table 2) show a higher degree of collaboration and a faster pace in the tangible groups. We observed that these groups tended to communicate more and make better use of the dedicated modelling sessions, while computer-based groups tended to divide tasks and occasionally skip sessions. Also, tangible groups produced models with slightly lower syntactic quality but with a higher level of content correctness. Individually (Table 3), participants from tangible groups reported slightly lower perceived agreement (by 2 %) and lower difficulty (down 13 %). Furthermore, such participants sometimes reported longer durations than their peers from groups using only a computer. Figure 1(b) shows that more tangible modelling participants than computer-based modelling participants perceived the duration as being more than 20 h. Regarding the educational effect of using tangible models, we noticed a notable improvement on both measured indicators of learnability. The reports submitted by tangible groups were scored consistently higher than the others (see Figure 1(a)). Furthermore, tangible modelling students scored, on average, 7.5 % higher on questions related to the 4EM method and its application.
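For readers who wish to reproduce this kind of comparison on their own data, the following minimal sketch aggregates individual scores per group type and expresses the gap as a percentage. The scores below are invented placeholders, not the study's data, and computing the gap as a relative difference of means is an assumption; the text does not state how the reported percentages were derived.

```python
from statistics import mean


def aggregate(responses):
    """Average individual scores per group type.

    responses: iterable of (group_type, score) pairs.
    """
    by_type = {}
    for group_type, score in responses:
        by_type.setdefault(group_type, []).append(score)
    return {group_type: mean(scores) for group_type, scores in by_type.items()}


def relative_difference(treatment, baseline):
    """Difference of treatment from baseline, in percent of the baseline."""
    return 100.0 * (treatment - baseline) / baseline


# Hypothetical questionnaire answers (placeholders, NOT the study's data).
responses = [
    ("tangible", 3),
    ("tangible", 4),
    ("computer", 4),
    ("computer", 4),
]

means = aggregate(responses)
gap = relative_difference(means["tangible"], means["computer"])
print(means, gap)
```

A negative value indicates that the tangible groups scored lower than the computer-based groups on the indicator in question.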

Fig. 1. (a) Final report grades and (b) perceived duration

Discussion. We cannot exclude the possibility that all differences between tangible and computer-based groups are random fluctuations explained by chance alone. Also, since our sample and treatments were not formed and allocated randomly, we refrain from using statistical inference to generalize. However, plausible explanations to interpret the noted differences can be offered.

First, the reduced syntactic quality of tangible models can be explained by the fact that tangible modelling does not constrain the syntax of models as strictly as computers do. Thus, some students might have used this freedom to construct models that are not syntactically correct.

Second, our explanation for the higher semantic quality of tangible models is that the tangible groups interacted more (without dividing tasks) and seemed to work harder (higher pace and longer perceived duration). This may be because tangible modelling supported participation by providing a fun factor. The perceived duration might also have been influenced by the fact that, after completing the tangible models, the students had to enter them into a software tool.

Third, the lower perception of difficulty and better exam results of tangible groups can be explained by the theory of constructivism, which says that learning is most effective when people jointly create tangible objects in groups.

Finally, the slightly lower perceived agreement within tangible groups may be explained by higher levels of collaboration. Due to less subdivision of tasks, tangible modelling forced groups to promptly discuss disagreements. It is also possible that the computer-based groups had lower actual levels of agreement without noticing this. Since they divided tasks among members and discussed less than the tangible groups, they may have overlooked some disagreements or misinterpretations. While our data do not exclude this possibility, they do not support it either. More research is needed to test this hypothesis.

Generalizability. Given available data, this study employs generalization by analogy: “If an observation is explained by a general theory, then this observation may also occur in other cases where this general theory is applicable” [27]. Since social or psychological mechanisms can explain the observed phenomena using constructs such as synchronicity, cognitive load, cognitive fit, gamification, and constructive learning (see Sect. 2), we can expect similarities in practice.

5 Conclusions and Future Work

Implications for research. Our results are consistent with earlier research showing that tangible modelling promotes collaboration because of synchronicity, manipulability of physical tokens, and increased fun, while leading to better results due to the joint construction of physical models [10]. At the same time, the perception of increased duration contradicts our earlier research, where tangible modelling was observed to be faster than computer-based modelling. Our results also suggest that collaborative modelling may increase the effort required for modelling, contrary to [1, 6]. One explanation of this is that our earlier study [10] used iconic physical tokens, i.e. objects that resemble the entities being modelled, which made them easier to understand. To test this explanation, we need to compare tangible modelling with iconic tokens and with plastic cards in future research. A similar real-world experiment with EM practitioners is also needed to verify the external validity of our results. Another interesting direction for further investigation is computer-based participative modelling tools (such as smart boards and touch screens).

Implications for practice. Our results suggest that tangible enterprise modelling could be a useful tool for building consensus among stakeholders with diverse backgrounds and little EM experience. This is particularly useful in the early stages of enterprise modelling, when the goal is to improve the quality of the business [10, 21]. Our results also suggest that tangible EM has a positive educational effect by providing higher understandability and improved learnability.