1 Introduction

Modeling, in all its various forms, plays an important role in representing and supporting complex human design activities. Especially in the development, analysis and re-engineering of information systems, modeling has proved to be an essential element in achieving high-performing information systems [1]. More specifically, conceptual models are descriptions of the organizational context for which a particular system is developed [2]. According to Stachowiak [3], a model possesses three features. First, the mapping feature: a model can be seen as a representation of the ‘original’ system, expressed through a modeling language. Second, the reduction feature characterizes the model as only a subset of the original system. Finally, every model is created with an intended purpose or objective, i.e. the pragmatic feature. Due to many project failures that were the consequence of faulty requirement analysis in the 1960s, the importance of conceptual modeling grew substantially as a means to enable early detection and correction of errors. As a consequence, a wide range of conceptual modeling-based approaches and techniques was introduced. Criticism arose, however, stating that most of these modeling-based approaches and techniques were based on common sense and the intuition of their developers, therefore lacking sound theoretical foundations [4, 5]. This led to the introduction of ontologies, which provide a foundation for conceptual modeling by means of a formal specification of the semantics of models and describe precisely which modeling constructs represent which phenomena [6]. Although ontologies were originally used in the domain of conceptual modeling to analyze the constructs used in the models and evaluate conceptual grammars for their ontological expressiveness, the role of ontological theories evolved towards improving and extending conceptual modeling languages (CMLs).
From now on, we refer to all of these techniques as ontology-driven conceptual modeling (ODCM) approaches. We define ODCM as the utilization of ontological theories, coming from areas such as formal ontology, cognitive science and philosophical logics, to develop engineering artifacts (e.g. modeling languages, methodologies, design patterns and simulators) for improving the theory and practice of conceptual modeling [7]. In this paper, we intend to examine the mapping feature of conceptual models more closely in the context of ODCM. We aim to describe the use and the application of ontologies in mapping phenomena to models, and we are interested in whether any connections exist between representing kinds of phenomena and the use of certain ontologies and modeling languages. As such, we will survey the existing literature and determine which phenomena, ontologies and CMLs occur the most in the area of ODCM. Our survey of the literature will be conducted in the form of a systematic mapping review (SMR). The purpose of an SMR is to summarize prior research and to describe and classify what has been produced by the literature. Therefore, this paper aims to make the following contributions: (1) provide a classification founded on previously developed research that will categorize the different kinds of phenomena; (2) present two frequency tables that describe the types of ontologies and CMLs that occur the most; and (3) discuss the current and past use and application of ontologies and CMLs in representing phenomena.

2 Research Methodology

In order to achieve a rigorous mapping study, we based our method on the systematic literature study methods described in [8–10]. A mapping study aims to outline the structure of the investigated research area. In this paper, we thus perform an SMR on the use and application of ontologies and CMLs in the domain of ODCM. To conduct our SMR, we rely on the guidelines defined by [8]: (1) definition of the research questions; (2) formulation of a search strategy and the paper selection criteria; (3) construction of the classification and frequency table; (4) extraction of data and (5) synthesis of the results. In this section, we will describe guidelines (1) through (4). The synthesis of the results will be discussed in Sect. 3. We would like to note that this SMR builds further upon the literature set that was collected in [11]. In that paper, a literature study was conducted on the existing literature of ODCM in order to assess the kind of research that has been performed over the years. While that literature study focused more on the general research trends that occurred in ODCM, our paper intends to be more specific. Our objective is to focus on the type of ontologies and the kind of CMLs that have been applied in ODCM to represent different phenomena. As such, both the earlier literature study and the SMR of this paper target the same research domain, i.e. ODCM, but perform their study at a different level of depth and focus. Therefore, for a full explanation of the formulation of the search strategy and paper selection criteria, we refer to [11].

The research questions, as defined below, act as the foundation for all further steps of the literature study. The research questions should be formulated in such a way that they represent the objectives of this literature study. Our questions serve multiple purposes: RQ1 aims at gaining more insight into the kind of phenomena the modeling languages represent. The purpose of this question is to reveal which type of phenomena research in ODCM has been focusing upon, and to discover which phenomena have been disregarded. We define phenomena as: elements or concepts that embody real-world occurrences and can be represented by a conceptual modeling grammar which provides a set of rules and constructs that show how to model and represent these real-world domains and phenomena [12]. RQ2 aims to discover which type of ontology and which type of CML has been used in a specific article. This question will allow us to determine the ontologies and CMLs that have been applied the most in previous research efforts. Finally, RQ3 intends to deliver more insights on the relationship between phenomena, ontologies and CMLs. As such, we compare the results of RQ1 and RQ2, and aim to reveal if there exists any influence between the kind of phenomena that are being represented by a conceptual model and the kind of ontology and CML that is being used to construct this conceptual model.

  • RQ1: Which kinds of phenomena are considered the most in ODCM?

  • RQ2: Which type of ontologies and CMLs are being used in ODCM?

  • RQ3: How are ontologies and CMLs applied to represent phenomena?

Our classification and frequency tables are based upon these first two research questions. To answer RQ1, we construct a classification that will allow us to distinguish between different kinds of phenomena. We base our classification on the structuring principles defined by [12, 13]. In that work, various perspectives or structuring principles are distinguished, based upon previous research performed in classifying phenomena. A structuring principle or perspective is defined as a rule or assumption indicating how phenomena should be structured. We therefore construct our classification scheme and assign phenomena to different categories based upon these perspectives. Each of these categories is discussed in more detail below:

  • Static perspective: Phenomena that are characterized within the static perspective tend to describe the structure of a system. These kinds of phenomena are often represented with constructs named entity, thing or object. These entities are distinguished by a unique principle of identity and often hold a number of attributes, which represent specific values of the entity. Generally, these entities are also connected through a variety of relationships.

  • Dynamic perspective: The dynamic perspective collects phenomena that represent change and time. These phenomena are generally translated into constructs that describe events and processes. The happening of an operation or activity that has been triggered by an external factor is called an event. A process is the trace of the events during the existence of an entity.

  • Behavioral & Functional (B&F) perspective: The main phenomena that belong to the B&F perspective are social phenomena and states and their transitions or transformations. Social phenomena relate to entities such as actors, the roles they assume and the actions they perform. Rules and goals can also be categorized as social phenomena since they influence the behavior of an actor. A transformation of a state can be defined as an activity, based on a set of phenomena, that transforms them into another set of phenomena. Other terms used are function or task.

For example, if a paper introduces a new method to model and describe data structures used for representing and exchanging database information, we would add a reference from this paper to the static perspective. Similarly, if a paper focuses on the semantic incompleteness of models in the area of business process modeling, a reference is added to the dynamic perspective. Finally, a paper that aims to represent role-related and goal-related concepts in agent-oriented modeling will be classified as a reference to the B&F perspective.

In order to answer RQ2, we will construct a frequency table that lists all CMLs, and another frequency table that lists all ontologies that are being used in the papers of our literature set. We thus start off with an ‘empty’ frequency table, and populate this table during the analysis and the reading of the articles. Whenever we encounter a yet undefined CML or ontology, we insert this as a new category of our frequency table. It is important for the reader to realize that one paper can address multiple CMLs, ontologies or perspectives of phenomena. For example, if a paper performs an ontological analysis with the Bunge Wand Weber (BWW) ontology [14] on both the languages UML and EER, then this paper has one reference to the BWW ontology, and one reference each to UML and EER. Similarly, if a paper introduces an ontological framework based upon the Unified Foundational Ontology (UFO) [15] and explains how this framework can be adopted without specifically applying this framework to a CML, this paper will only be assigned a reference to UFO.
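The counting rule described above can be illustrated with a minimal Python sketch. The record layout, the paper identifiers (`paper_A`, `paper_B`) and the function name `build_frequency_tables` are assumptions made for illustration only; they are not part of the original study, which performed the extraction with dedicated tools. The two records mirror the two examples given in the text.

```python
from collections import Counter

# Hypothetical paper records mirroring the two examples in the text.
# Identifiers and field names are illustrative assumptions.
papers = [
    # Ontological analysis with BWW on both UML and EER
    {"id": "paper_A", "ontologies": ["BWW"], "cmls": ["UML", "EER"]},
    # UFO-based framework that is not applied to any CML
    {"id": "paper_B", "ontologies": ["UFO"], "cmls": []},
]


def build_frequency_tables(papers):
    """Start from empty tables and add one reference per distinct
    ontology or CML addressed by each paper."""
    ontology_freq = Counter()
    cml_freq = Counter()
    for paper in papers:
        # set() ensures a category is counted at most once per paper
        ontology_freq.update(set(paper["ontologies"]))
        cml_freq.update(set(paper["cmls"]))
    return ontology_freq, cml_freq


ontology_freq, cml_freq = build_frequency_tables(papers)
print(dict(ontology_freq))
print(dict(cml_freq))
```

A `Counter` creates a new category the first time a key is seen, which corresponds to inserting a yet undefined CML or ontology as a new row of the frequency table.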

After we collected our research articles, we applied our classification and started to perform our data extraction. In total, the literature set comprises 200 articles that are related to research in ODCM, and that were published from 1993 to 2015. All articles, classifications and other data of the SMR can be found at http://www.mis.ugent.be/ER2016/. To extract the data, we first gathered all the collected literature from our search strategy into the reference manager Mendeley, to organize the general demographic information such as title, author, publication year etc. Next, the extraction was performed through the qualitative analysis tool NVivo to analyze and structure our data. Both the data from Mendeley and NVivo were then merged in the statistical software tool SPSS to conduct some additional qualitative analyses. The results of this analysis can be found in the section below.

3 Systematic Mapping Study Results

3.1 RQ1: Which Kinds of Phenomena Are Considered the Most in ODCM?

In order to answer RQ1, we classified the articles according to our classification scheme. In total, 104 articles belonged to the static perspective (45.8 %), 74 articles (32.6 %) to the B&F perspective and 49 articles (21.6 %) to the dynamic perspective. Because one paper can address several perspectives, these counts add up to 227 references across the 200 articles, and the percentages are shares of those references. These findings are in line with the results of Fettke [16] and Davies et al. [17]. In their research, they investigated how practitioners applied conceptual modeling and which tools and techniques were the most popular. When asking the practitioners for the purpose of conceptual modeling use, the highest ranked application areas were database design & management and software development. These domains mostly require rather static phenomena to be modeled. Other main application areas were improvement of internal business processes and workflow management. These domains encompass more phenomena of the B&F perspective and the dynamic perspective. It seems logical that academic research would also focus on the same kind of areas and types of phenomena that are deemed important to practitioners and enterprises. To gain more insight into the evolution of which kinds of phenomena have been the topic of interest in the field of ODCM, we display in Fig. 1 the number of references per type of perspective over the period 1993–2015. As the figure demonstrates, phenomena of the static perspective have been dominating ODCM for almost its entire life span.
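The shares reported for the three perspectives can be checked with a few lines of Python. The sketch below uses only the counts stated in this subsection; the variable names are illustrative. Note that the counts sum to 227 rather than 200, which is consistent with the rule that one paper can contribute a reference to more than one perspective.

```python
# Reference counts per perspective, as reported in Sect. 3.1.
references = {"static": 104, "B&F": 74, "dynamic": 49}

# Total number of perspective references (not the number of articles).
total = sum(references.values())

# Share of each perspective, as a percentage rounded to one decimal.
shares = {
    perspective: round(100 * count / total, 1)
    for perspective, count in references.items()
}

print(total)
print(shares)
```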

Fig. 1. Perspectives over time

Only in the last five years has interest in the B&F perspective overtaken interest in the static perspective. Starting from 2005, both the phenomena of the dynamic and B&F perspectives have increased in interest. A possible explanation for this trend is that ontologies were first applied to analyze constructs that represented static phenomena, while after several years of successfully applying these practices, the research community shifted the application of ontologies to constructs belonging to the dynamic and B&F perspectives. Moreover, our observation is in line with Recker and Rosemann [18], who state that an increasing demand for a more disciplined approach towards process modeling and business process management (BPM) triggered related academic and commercial work aiming towards advanced process and business modeling solutions. Since these areas require concepts and elements that represent phenomena from both the dynamic and B&F perspectives, it is likely that the increased demand for process modeling and BPM solutions caused the ODCM community to focus more on this domain.

3.2 RQ2: Which Type of Ontologies and CMLs Are Being Used in ODCM?

To answer our second research question, we display the frequency tables in Tables 1 and 2, which represent respectively all the ontologies that have been applied and all of the modeling languages that have been used in the field of ODCM. As we can see from our first frequency table, the BWW ontology (68) is by far the most frequently occurring ontology. The second most frequently occurring ontology is UFO (24). It is no coincidence that both are foundational ontologies. Foundational ontologies are suitable for many different target domains since they provide a broad view of the world [19]. Therefore, they are a popular means to employ for different kinds of phenomena and modeling languages. This assumption is again confirmed when considering the many domain ontologies in the table and their frequency. Many of these ontologies have been referenced only once in a paper. Evidently, since a domain ontology is often developed for a specific purpose and targets a certain domain, its number of references is significantly lower compared to the domain-independent foundational ontologies. In our frequency table, we have made a distinction between foundational ontologies and domain ontologies, where we further categorized every domain ontology according to its application domain. Most of the domain ontologies in ODCM seem to apply to the business and enterprise domain, followed by domain ontologies in software systems development & architecture and the semantic web. The most frequently referenced domain ontology was the Resource-Event-Agent (REA) ontology.

Table 1. Frequency table - type of ontology
Table 2. Frequency table - type of CML

To get a closer look at the kinds of modeling languages that have been used by ODCM researchers, we summarize our results in frequency Table 2. As with ontologies, we can see that several CMLs dominate the field of ODCM. The most popular modeling language is by far the Unified Modeling Language (UML) with 68 references. EER holds second place, with a total number of 25 references. Again, these observations are similar to those of Fettke [16] and Davies et al. [17]. Their findings identified that the modeling languages UML and EER are two of the most frequently used modeling techniques of practitioners.

It is again no coincidence that the modeling languages UML and EER are most frequently applied to model static phenomena in areas such as database design and software development. Many modeling languages have been developed for specific purposes. For example, the EER modeling language was specifically developed for the purpose of describing the data and information aspects of databases while the Business Process Modeling Notation (BPMN) is more focused on specifying business processes. Other modeling languages that were frequently identified are the Web Ontology Language (OWL) and OntoUML. While most of the identified modeling languages are used to represent concepts and elements of a domain, the OWL language is often used to represent the structure of the ontology. One of the main advantages of using OWL is that it provides a machine-readable ontology, which can then be processed by applications. The language OntoUML is an example of a CML whose metamodel has been designed to comply with the ontological distinctions and axiomatic theories put forth by a foundational ontology, in this case UFO. When a model is built in OntoUML, the language induces the user to construct the resulting models via the combination of existing ontologically motivated design patterns. It is an interesting development to observe this kind of ontologically supported modeling language ranking fifth in the frequency table.

3.3 RQ3: How Are Ontologies and CMLs Applied to Represent Phenomena?

To gain a better understanding of the two most applied ontologies in ODCM, we have mapped their frequency of references over time. As we can see from Fig. 2, the BWW ontology has been especially popular in the years 2005–2009. However, since UFO’s introduction in 2005, researchers performing ODCM have keenly adopted the ontology. It is clear that many users of BWW have switched to UFO in the years 2010–2015.

Fig. 2. BWW and UFO over time

To better explain this shift in ontologies, we take a closer look at which phenomena the ontologies have been applied to in ODCM. As displayed in Table 3, more than half of all the phenomena that are related to the BWW ontology are categorized into the static perspective. The dynamic and B&F perspectives each represent around 25 % of the phenomena that correspond with the BWW ontology. In contrast, for the UFO ontology, more than half of the phenomena belong to the B&F perspective. These results imply that the BWW and UFO ontologies are being applied to specific kinds of phenomena. Our results would suggest that the BWW ontology is more convenient to apply to static phenomena while the UFO ontology is more suited to deal with B&F phenomena. A similar, theoretical observation has also been made by [20], who point to a lack of social or behavioral aspects in the BWW ontology that are necessary to model a social environment. Our assumption is further supported when observing the structure of UFO. The UFO ontology is divided into three incrementally layered compliance sets: (1) UFO-A, which defines the core of UFO, describing Endurants, i.e. entities that persist through time; (2) UFO-B, which defines terms related to Perdurants, i.e. entities that do not persist through time, such as events; and finally (3) UFO-C, which describes social entities (both Endurants and Perdurants) and their behavior, or more specifically the social aspects of actors, roles and goals. UFO thus has a layer that specifically targets behavioral phenomena.

Table 3. BWW and UFO per type of perspective
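A cross-tabulation of this kind can be derived from per-paper references, as the following Python sketch suggests. The records below are toy values invented purely to make the example runnable; they are not the data behind Table 3, and the function name `perspective_shares` is likewise an assumption for illustration.

```python
from collections import defaultdict

# Toy (paper, ontology, perspective) references.
# These are invented for illustration and are NOT the Table 3 data.
toy_references = [
    ("p1", "BWW", "static"),
    ("p1", "BWW", "dynamic"),
    ("p2", "UFO", "B&F"),
    ("p3", "UFO", "B&F"),
    ("p3", "UFO", "static"),
    ("p4", "BWW", "static"),
]


def perspective_shares(references):
    """Return, per ontology, the percentage of its references
    that fall into each perspective."""
    counts = defaultdict(lambda: defaultdict(int))
    for _paper, ontology, perspective in references:
        counts[ontology][perspective] += 1

    shares = {}
    for ontology, row in counts.items():
        row_total = sum(row.values())
        shares[ontology] = {
            perspective: round(100 * n / row_total, 1)
            for perspective, n in row.items()
        }
    return shares


toy_shares = perspective_shares(toy_references)
print(toy_shares)
```

Normalizing each row by its own total, rather than by the size of the literature set, is what makes the distributions of two ontologies with very different reference counts comparable.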

Our results suggest that certain ontologies are preferred depending on the kind of phenomena the modeler is dealing with. An interesting research opportunity would therefore be to investigate whether certain ontologies are in fact more advantageous to apply depending on the kind of phenomena. Further, as shown in Fig. 1, since the year 2005, the B&F perspective has gained much attention in the field of ODCM. Similarly, in Fig. 2, we also notice an increase starting from 2005 in the utilization of the UFO ontology. When linking both trends, the shift from BWW to UFO could therefore be explained by the increased interest in modeling phenomena from the B&F perspective, which has persuaded more researchers to apply UFO instead of BWW because of UFO’s ability to deal with this kind of phenomena.

To gain a better understanding of how CMLs are applied in ODCM, we map the ten most frequently used CMLs to the phenomena they are used to represent. The results are displayed in Fig. 3. For the static perspective, UML (39) is by far the most frequently occurring modeling language, followed by the EER language (19) and OWL (16). Concerning the dynamic perspective, these phenomena seem to be represented the most through the UML language (12) and BPMN (11). Languages such as EPC and Petri nets are also used mostly for this perspective. Finally, when looking at modeling languages in the B&F perspective, we see that UML (27) is the dominant modeling language. It seems that there does not really exist a second ‘competing’ or preferred modeling language in this perspective. We can see that modeling languages such as BPMN, ArchiMate and UEML are also applied to represent B&F phenomena, although they clearly are still far behind UML. Even though UML offers many types of diagrams (class, activity, interaction, statechart etc.) to model a wide variety of phenomena, it is curious that one modeling language dominates all three perspectives. As mentioned above, many CMLs have been developed to be applied in certain kinds of application areas. Even though UML is a standard modeling language for a wide spectrum of application domains, it still has its deficiencies in representing certain kinds of phenomena. Research by [21], for example, expressed the deficiencies of UML diagrams in modeling business organizations and the inadequacy of UML for abstracting high-level business-specific concepts. We should therefore carefully consider during the modeling process which kind of CML we will apply in order to represent certain kinds of phenomena.

Fig. 3. CMLs per perspective

3.4 Additional Results

Beyond the investigation into the state of the research in ODCM, we describe here additional results that can be of interest for producers and consumers of research in ODCM. We have identified the top five journals and the top five conferences that were the most frequently occurring publication forums in our literature set. These forums allow us to identify the main targets for ODCM research and to determine where previous research efforts can be found. The top five journals, with the respective number of papers, are: Information Systems Journal (14), Data and Knowledge Engineering (9), Scandinavian Journal of Information Systems (7), Decision Support Systems (6) and Journal of Database Management (5). The top five conferences are the International Conference on Conceptual Modelling (8), Americas Conference on Information Systems (7), European Conference on Information Systems (7), International Conference on Information Systems (7) and Enterprise Distributed Object Computing Conference (6).

4 Discussion

In order to contribute to the field of ODCM, we discuss certain shortcomings and possible research opportunities that have been identified within this literature study.

Research Opportunity 1. As observed in Sect. 3.1, the field of ODCM has focused mostly on phenomena of the static perspective. Only in the last decade did we observe an increased interest in the dynamic and especially the B&F perspective. Similarly, the BWW ontology was by far the most applied in ODCM. We did recognize a growing interest in the UFO ontology, which is likely related to the growing interest in the B&F perspective. Furthermore, our results indicated that UML is the principal modeling language in ODCM. Moreover, UML was the most applied CML in every perspective. Although we do not doubt that both the BWW ontology and the UML modeling language are very adequate in performing ODCM, we can ask ourselves whether this rather one-sided approach is desirable. As mentioned by Guizzardi [7], research in ODCM aims to develop engineering artifacts for improving the theory and practice of conceptual modeling. This research process is essential, not only to support acceptance among IS professionals, but also to establish the credibility of ODCM research among the larger body of researchers in the various engineering fields. If the field of ODCM produces artifacts that are mostly based upon the same existing knowledge base, we risk transforming this research process into routine design [22]. As such, we believe many opportunities in ODCM still lie in addressing important and unsolved problems with new ontologies and different conceptual modeling languages. This diversification will lead to unique and innovative ways of solving these problems. A good example of such an innovative solution is the pattern language OntoUML, which was referenced by several papers in our literature set.

Research Opportunity 2. Our results would suggest that certain ontologies are more advantageous to apply, depending on which kind of phenomena the modeler is dealing with. However, as observed in [11], many researchers remain vague in defining the specific application of the ontology and in motivating their choice of ontological theories for the intended purpose. As displayed in Table 3, we observed that more than half of all the phenomena that were applied together with the BWW ontology were phenomena from the static perspective, while more than half of the phenomena that were used with the UFO ontology belong to the B&F perspective. These results would suggest that the BWW ontology is more convenient to apply to static phenomena while the UFO ontology is more suited to deal with B&F phenomena. These implications could serve as a testing hypothesis for future research to investigate these topics more thoroughly. This opportunity can also be approached from a different perspective, by relating the choice of an ontology (and the choice of a CML) to the pragmatic feature of a model [3]. Since every model is created with an intended purpose (its pragmatics), the ontology should correspond to this purpose. In other words, we believe that an opportunity lies in properly investigating which ontology can be applied according to the pragmatics of the model.

Research Opportunity 3. Ontologies are increasingly seen as key to successfully achieving semantic interoperability between models and languages. As identified in frequency Table 1, many different types of ontologies are being applied. Consequently, the field of ODCM has a wide variety of ontological analyses, ontology-based models and numerous methods for creating or performing such analyses and models. However, this has re-introduced the interoperability problem, as also mentioned by Khan and Keet [23]. Especially in the long term, this raises the ambiguity between different ontology-founded models and increases the terminological confusion, which as a result leads to more complexity for both modelers and practitioners of ODCM. By increasing the interoperability between ontologies, we could facilitate their ease of use. Creating a mapping of elements between different ontological concepts and structures would reduce the workload for new research efforts, since they could build upon previously performed research. Efforts to increase interoperability can occur in many different forms. For example, as a way to increase the interoperability between ontologies, Khan and Keet [23] have created an online library of foundational ontologies called ROMULUS (Repository of Ontologies for MULtiple USes). ROMULUS maintains a catalogue of mappable and non-mappable elements among several foundational ontologies, and the pairwise machine-processable mapped ontologies.

5 Conclusion

This paper conducted a systematic mapping review in the field of ODCM. In total, our mapping study investigated 200 articles that originated from six digital libraries. We have provided a classification founded on previously developed research and two frequency tables, in order to clearly and thoroughly categorize papers dealing with ODCM. The classification scheme was used to identify which types of phenomena occurred the most, while the frequency tables aimed to discover the most frequently applied ontologies and CMLs. The results of the classification scheme indicate that phenomena of the static perspective have been considered the most in ODCM. However, during the last decade, we noticed an increased interest in phenomena of the dynamic and the B&F perspective. Our frequency tables determined that the BWW ontology and the UML modeling language have been applied most often. Based on these results, we formulated several research opportunities: (1) we emphasize the importance of applying new kinds of ontologies and types of modeling languages; (2) we suggest that the kind of ontology which is used to produce ODCM is of importance, and should be justified as a design choice in the modeling process; and (3) by increasing the interoperability between ontologies, we can link many of their analyses, models and frameworks and facilitate the overall ease of use in ODCM.