
1 Introduction

In business process modeling, semi-formal modeling languages such as BPMN are used to specify which activities occur in which order within business processes. Whereas the order of the activities is specified using constructs of the respective modeling language, the individual semantics of a model element such as “Check order” is bound to natural language. As long as models are created and read by humans only and a commonly agreed (potentially restricted) language is used, the use of natural language is no serious limitation. However, if models have to be interpreted by machines, e.g. for offering modeling support, search on a semantic level, content analysis in merger and acquisition scenarios, or re-use of implementation artifacts linked to process elements (e.g. web services), a machine-processable semantics of modeling elements is required [1]. In the past, several approaches have tried to formalize the semantics of individual model elements by annotating them with elements of ontologies or other predefined vocabularies that, to some degree, formally specify the semantics of a model element. To date, however, such approaches suffer from a major limitation: annotation is a highly manual and tedious task. The user has to select suitable elements of an ontology by browsing the ontology or performing a keyword-based search over its labels. Even if the system is capable of presenting annotation suggestions, e.g. based on lexical similarity of labels, the user has to make sure that annotations match the appropriate context in the process model by inspecting the structure of the ontology, which is typically organized as a hierarchy. For example, if the ontology contains two activities labelled “Accept invitation”, it matters whether this activity is part of the hiring process (where the applicant accepts e.g. a job interview) or of the planning process for business trips (where the employee accepts an invitation of a business partner).
In other words, the semantic context of an element that is to be annotated must be considered. Since no highly automated, context-sensitive approach for process model annotation is available so far, this contribution is meant to facilitate developing, comparing and optimizing such approaches. To bootstrap systematic research in this direction, use cases for automated annotation approaches are described and existing annotation approaches are reviewed. This is intended to raise interest in a very promising research topic, both with regard to scientific outcomes and practical usefulness.
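The ambiguity just described can be made concrete with a small sketch. The ontology content below is hypothetical; the point is that a purely lexical comparison of labels cannot distinguish the two candidates, so only the parent process (the semantic context) can decide.

```python
# Illustrative sketch with hypothetical ontology content: two ontology
# activities share the label "Accept invitation", so a purely lexical match
# is ambiguous and only the parent process (the semantic context) can decide.
from difflib import SequenceMatcher

ontology = [
    {"label": "Accept invitation", "parent": "Hiring process"},
    {"label": "Accept invitation", "parent": "Business trip planning"},
]

def lexical_score(a, b):
    # Case-insensitive similarity of two activity labels.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

model_label = "Accept invitation"
scores = [lexical_score(model_label, entry["label"]) for entry in ontology]
print(scores)  # [1.0, 1.0] -- a tie; only context can break it
```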

The remainder is structured as follows. Section 2 provides use cases for automatic process model annotation. In Sect. 3, existing annotation approaches are reviewed. In Sect. 4, a conclusion and a short outlook on research opportunities are provided.

2 Use Cases for Automated Annotation

In the following, application scenarios leveraging automated process model annotation are presented.

Modeling Support.

If process elements are automatically annotated with elements of an ontology or taxonomy containing a set of predefined activities, this knowledge can be exploited to help the modeler complete his or her modeling task. This is illustrated by Fig. 1, which shows a process fragment (bottom) being automatically annotated with a task ontology (top). This knowledge can then be exploited to provide modeling suggestions (right). The advantage of using this knowledge is that suggestions for the next model element are not derived solely from one (or more) previous model element(s). Rather, they can be based on the knowledge representation that is linked to the model element via annotation. For example, the knowledge representation may specify that after offering the job, potential candidates should be selected. The key difference from approaches that suggest activities retrieved from similar models, such as the work by Koschmider [2], is that in this way normative knowledge is used, i.e., how an enterprise should act. Besides modeling support, automated annotation also provides the basis for leveraging information from knowledge representations that may provide additional value. For example, the PCF taxonomy [3] contains key performance indicators for all of the activities it contains (approx. 1000 activities in the industry-independent version). Also, information to enact a process in a workflow environment may be linked to the set of specified reusable activities. All in all, automatic process model annotation enables new ways of modeling support and of providing additional assistance in the model-based design of process-supporting information systems.

Fig. 1. Automated modeling suggestions
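A minimal sketch of how such normative suggestions could be derived, assuming the task ontology encodes a "followed by" relation between predefined activities (the relation and its entries below are hypothetical):

```python
# Minimal sketch, assuming the task ontology encodes a normative
# "followed by" relation between predefined activities (hypothetical data).
follows = {
    "Offer job": ["Select candidates"],
    "Select candidates": ["Invite to interview"],
}

def suggest_next(annotated_task):
    # Suggestions come from the ontology linked via annotation,
    # not from previously observed similar models.
    return follows.get(annotated_task, [])

print(suggest_next("Offer job"))  # ['Select candidates']
```

Once an element is annotated with "Offer job", the suggestion is read directly from the normative knowledge rather than inferred from model similarity.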

Process Retrieval.

Current repositories are mainly equipped with keyword-based search mechanisms or rely on process query languages such as BPMN-Q [4]. These instruments allow searching the process space using natural language as well as structural and behavioral information. However, they cannot restrict the search to the broader content or topic of a process corresponding to the distinct functional areas in an enterprise, in short the business topic. Although it may be possible to manually assign descriptors to models, and manual annotation approaches have in fact been discussed recently [5], this imposes extra effort on modelers, who have to focus on delivering high-quality models in a timely manner. Moreover, descriptors must be kept up to date if the model is adapted. Hence, computing the business subject of a process model automatically, based on automatically annotated activities, creates additional value. It can be re-computed from time to time to keep the information up to date.

How the automatic annotation of processes may improve the retrieval of processes from a repository is shown in Fig. 2. The user types the keyword “review” into the search form (top). Since reviewing activities can occur in many areas of enterprise activity, the user specifies the category “Human resource management”, which automatically shows up when typing the special keyword “category” (much like the keyword-search functionality in the file explorers of common operating systems). Based on the automated annotation of processes, the activity “Review applications” is found, which belongs to a process in the HR realm. Hence, with automated annotation, the retrieval of process knowledge on a semantic level can be improved.

Fig. 2. Improved process retrieval
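The category-restricted search described for Fig. 2 can be sketched as follows; the repository content and field names are made up for illustration:

```python
# Sketch of category-restricted keyword search over automatically annotated
# activities; the repository content below is hypothetical.
activities = [
    {"label": "Review applications", "process": "Hiring",
     "category": "Human resource management"},
    {"label": "Review contract", "process": "Procurement",
     "category": "Purchasing"},
]

def search(keyword, category=None):
    hits = [a for a in activities if keyword.lower() in a["label"].lower()]
    if category is not None:
        # The category stems from automated annotation, not manual tagging.
        hits = [a for a in hits if a["category"] == category]
    return hits

print([a["label"] for a in search("review", "Human resource management")])
# ['Review applications']
```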

Process Analysis.

Similar to process retrieval, current approaches for analyzing the contents of process models rely on keyword search or specialized query languages. Another common way to analyze process models is to find similar process models, commonly referred to as process model matching. However, all these mechanisms have in common that the analysis is done in relation to what the user wants to know (which requires the user to know common terms in the business context) or to what is available (when models are compared). In some situations of process analysis, however, it may be favorable to introduce normative knowledge about which tasks typically occur in enterprises. With this, questions regarding coverage can be answered, such as “Do we have a process for managing product quality?”, which may be important e.g. for certain certification activities. Other examples would be “In which areas have we not yet specified processes?” or “Which of our processes are highly cross-cutting?” Fig. 3 illustrates how automated process model annotation may serve process analysis and comparison using normative knowledge.
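The coverage questions above can be sketched as a simple check against a normative taxonomy. The taxonomy entries and annotations below are hypothetical:

```python
# Sketch of a coverage analysis: which normative business functions are
# covered by at least one automatically annotated process? Data is made up.
taxonomy = [
    "Manage product quality",
    "Manage human resources",
    "Manage financial resources",
]
annotations = {  # taxonomy entry -> processes annotated with it
    "Manage human resources": ["Hiring process", "Onboarding process"],
}

uncovered = [t for t in taxonomy if not annotations.get(t)]
print(uncovered)  # areas for which no process has been specified yet
```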

Fig. 3. Advanced process analysis

At first, the user selects process models using a keyword search and adds them to the comparison (top). He or she subsequently inspects and compares the contents of the process models, using a taxonomy of pre-defined business functions to guide this inspection (center). In more detail, the result of automatic annotation is displayed for each process model in a separate column. Each matching activity is displayed as a square that is saturated according to its matching score. Multiple squares are composed into a visualization that slightly resembles the well-known equalizer displays of audio equipment. When the mouse hovers over a square, the matching score and other information can be shown, such as a link to open the process model or other meta-information about the process. To zoom in and out, the user may also expand or collapse taxonomy levels (left).

Other possible visualizations include histogram-like diagrams. In this way, exploiting annotation information enables advanced analysis and visualization capabilities.

3 State of the Art

Annotation in general was discussed in the early stages of the Semantic Web movement [6]. Subsequently, annotation has also been explored in relation to enterprise modeling. For example, Boudjlida and Panetto describe annotation types in enterprise modeling [7]. The authors identify various semantic relations between an enterprise model and an element of an ontology and provide a schema for describing annotations. However, they also acknowledge that automation in annotation is largely missing: “However, an important feature is missing: it is the one that permits the automatic or the semi-automatic provision of the annotations.” [7] Since no comprehensive overview of existing, manual annotation approaches for enterprise models is available so far, a structured literature analysis is conducted. It is intended to provide developers of automated annotation tools with an overview that informs and inspires the development of automated annotation procedures.

3.1 Selection and Analysis of Relevant Literature

For the literature analysis, the literature databases EBSCO, Springer, ScienceDirect and Google Scholar were examined. Different queries such as “process model” AND annotation, “model annotation”, or “semantic annotation” AND annotation, as well as variants of these queries, were executed, leading to 83 hits. The following inclusion and exclusion criteria were applied: Articles were excluded that use the term “annotation” merely to express that some automatically generated additional information is written into the process model (i.e. to point out semantic deficiencies). Further, works aiming at the semantic annotation of web services (e.g. via standards such as SAWSDL, or as described by [8,9,10,11,12]) or of paper-based forms [13] were excluded, since these are only loosely related to model annotation. Moreover, articles were excluded that describe high-level, general-purpose annotation frameworks, e.g. in the field of the Semantic Web (annotation of web pages). Articles were included that deal with business process modeling and discuss annotation in sufficient detail. Regarding the latter aspect, this means not merely using or exploiting process models that have been annotated somewhere else, but being concerned with annotation itself.

In terms of completeness of the literature search, it can be assumed that most relevant papers have been identified, since a high overlap between the hits from the databases and Google Scholar was found. Moreover, all works in the area of process model annotation contained in the recent survey [14] were also retrieved. Hence it is likely that all important works were identified. For this reason, a forward and backward search, as recommended e.g. by Webster and Watson [47], was not performed. In particular, a backward search did not prove fruitful, since it predominantly surfaced annotation tools of the Semantic Web community (such as OntoMat Annotizer, see [15, 16]) that are not specific to process model annotation. If such approaches were included, all the annotation work of the Semantic Web community (for an example list, see http://semanticweb.org/wiki/Tools) would be relevant. However, more focused approaches exist in the BPM community that leverage the process structure, such as the works from Born et al. [17]. Hence it is more useful to look strictly at the works from the BPM community that have developed annotation techniques, which is done in the paper at hand.

3.2 General Overview on the State of the Art

In the following, the results of the literature analysis are presented (cf. Table 1). Relevant works are compared and reviewed by first giving a Description of the overall approach. Besides, a precise account of the Annotation concept is given, that is, the specific approach the authors described, developed or implemented. In addition, approaches are compared with regard to whether they provide a (formal) definition of annotation (column Def) and the Used technologies, such as lexical databases, string similarities etc. Moreover, approaches are compared with regard to two key characteristics. The first is their implemented or envisioned degree of automation (column AU). The symbol ☐ is used to indicate manual, ▣ semi-automated and ◼ automated approaches. The second key criterion is whether the approach accounts for the semantic context of a process model element (column CO), i.e. which previous activities lead to the activity or which activities are triggered by it. This criterion is important for the annotation of process models, since processes are essentially about the order of tasks executed in a business process. Consequently, the flow of activities is important for annotation. If, for example, an order is captured, checked and finally executed, it is highly unlikely that after order execution an activity such as “Confirm order” is relevant for annotation, even if it lexically matches an activity label such as “Confirm order fulfilment”. So in essence, the criterion is about “knowing” the semantic context in which a process element occurs and considering it during automated annotation.
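The order-confirmation example can be sketched as a scoring scheme that combines lexical label similarity with a simple predecessor check. This is not an approach from the reviewed literature; the candidate entries, the 0.6/0.4 weighting and the exact-match context test are illustrative assumptions only.

```python
# Hedged sketch (not from the surveyed works): combine lexical label
# similarity with a simple predecessor check to make annotation
# context-sensitive. Weights and data are illustrative assumptions.
from difflib import SequenceMatcher

def label_sim(a, b):
    # Case-insensitive lexical similarity of two activity labels.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def score(model_label, model_pred, candidate):
    lex = label_sim(model_label, candidate["label"])
    ctx = 1.0 if model_pred.lower() == candidate["pred"].lower() else 0.0
    return 0.6 * lex + 0.4 * ctx  # weighting is an assumption

candidates = [
    {"label": "Confirm order", "pred": "Capture order"},
    {"label": "Confirm order fulfilment", "pred": "Execute order"},
]
# The model activity follows "Execute order"; the context term outweighs
# the closer lexical match of the plain "Confirm order" candidate.
best = max(candidates, key=lambda c: score("Confirm order", "Execute order", c))
print(best["label"])  # Confirm order fulfilment
```

With context ignored, the first candidate would win on label similarity alone; the context term reverses the ranking, which is exactly the behavior the CO criterion asks for.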

Table 1. Results of literature analysis

4 Conclusion and Outlook

In this study, general use cases that require an automated annotation approach have been presented. This underpins the relevance of such a research endeavor. Then, a comprehensive overview of the state of the art in the literature was presented. A major result of this overview is that annotation is rarely automated. Even where automation is suggested in the research works, it does not seem to be implemented. Prototypes are also rarely shown. Regarding the semantics of annotation, context information is (apart from one work) almost never used. This is a surprising research gap that persists even today, after almost a decade of research on semantic technologies applied to BPM that started with simple process model annotation proposals. Therefore, a research opportunity lies in developing (semi-)automated annotation approaches, first to leverage existing standards such as the PCF (cf. the use cases in Sect. 2) and second to make use of the wealth of semantic technologies (e.g. for search and matching of models on the semantic level) once process models have been annotated automatically. All in all, this contribution may be a starting point for developing more sophisticated (semi-)automatic approaches capable of linking semi-formal process models with more formal knowledge representations. With this, new use cases become possible, shifting the automated interpretation of process models to a new and more semantic level. This contribution should encourage research towards this goal.