1 Introduction

Information systems in organizations support and automate the processing of business transactions. Typically, these systems are deeply integrated into companies’ business processes. Each activity executed in an information system is recorded in an event log, which stores information about what activity was executed at what point in time. Due to the high competitive pressure in the market, organizations are strongly interested in analyzing these event logs to optimize their business processes. Process mining provides an accurate view of how processes are actually executed. Specifically, process discovery helps analysts to visually understand the relationships between activities, find potential bottlenecks, and compare the actual execution with the desired one.

Due to the growth of available event data and the increase in process complexity, finding interesting insights is becoming more challenging. Often, extensive domain knowledge is required to analyze highly complex process models (e.g., spaghetti-like models) and obtain valuable insights [16]. Efforts, such as the \(PM^2\) or the \(L^*\) methodology, have been made to systematize the workflow of analysts and to guide the planning and execution of process mining projects. Despite the increasing number of available tools that incorporate process mining capabilities, existing tools lack proper computer-assisted support. In practice, the work of analysts remains largely manual, leading to many ad-hoc tasks.

Let us consider the following scenario, which exemplifies a common situation an analyst may encounter today. Julia is an analyst who is interested in the process performance of the BPIC 2019 event log. Process discovery returns a spaghetti-like process model that reflects the actual behavior of the process. To simplify the view, Julia manually filters cases based on domain knowledge. For instance, she selects cases that start with a requisition and are of item type “subcontracting”. Afterwards, she computes the case duration of this subset and compares it with the case duration over all cases. For this particular selection, the case duration turns out to be significantly lower: 31.8 days compared to 71.5 days. Next, Julia considers a different selection and computes the case duration again. As we can see, this repetitive work is time-consuming and error-prone, hampering efficient exploration and analysis.

In this paper, we present ProcessExplorer, a novel approach that provides automatic guidance during process discovery. Our work is inspired by the typical workflow of analysts, intended to obtain a general overview of the different process behaviors and their performance observed in the event log. ProcessExplorer recommends subsets of interesting cases that follow a certain behavior pattern. Based on the identified subsets, ProcessExplorer recommends insightful process performance indicators (PPIs) that characterize each subset. As part of our work, we developed an interactive visual exploration system that implements the underlying techniques of our approach. Our system does not make any assumptions about the process or the event log, which makes it specifically useful for exploring unknown processes. We evaluated the usefulness of our system by conducting a user study with process analysts (Sect. 6). The results show that both types of recommendations are valuable and useful for analysts, specifically for event logs with a large set of events.

In summary, our contributions are:

  1. A novel approach to automatically compute recommendations that guide process analysts during the analysis of large event logs (Sect. 4). Specifically, we introduce a number of innovative techniques:

    • An adapted version of trace clustering capable of automatically generating subset recommendations of cases with interesting process behavior, related to the control-flow and data perspectives (Sect. 4.1).

    • A mechanism, based on statistical significance analysis, that identifies the most deviating PPIs that are relevant and insightful for the analyst to explore (Sect. 4.2).

    • A diversifying top-k ranking approach that generates a ranked list of subset and insights recommendations (Sect. 4.3).

  2. An interactive visual exploration system that enhances analytic support to quickly explore large event logs (Sect. 5).

The rest of the paper is organized as follows. In Sect. 2 we provide an overview of related work, followed in Sect. 3 by some preliminaries used throughout the paper. In Sect. 4 we describe the details of our approach. Then, in Sect. 5 we present the implementation of our approach, which we use to evaluate its usefulness, as described in Sect. 6. In Sect. 7 we briefly discuss the validity of our evaluation and the limitations of our approach. Finally, in Sect. 8 we conclude the paper with a discussion and future work.

2 Related Work

Exploratory Data Analysis. Our work is highly related to exploratory data analysis (EDA), which deals with the situation in which users do not know the characteristics of a data set beforehand. One direction of support is to provide the user with automatic visualization recommendations during the analysis. SEEDB [24] evaluates the interestingness of columns from relational data to obtain such recommendations. Voyager [26] suggests data charts according to statistical and perceptual measures, and provides faceted browsing. Both approaches focus on the selection of data attributes and aggregations. DeepEye [14] suggests which type of visualization yields the best data representation. VizRec [18] additionally provides personalized visualizations based on the user’s perception. Our approach, in contrast, supports the visual exploration of process executions in event data.

Data Insights. Another direction of exploratory data analysis is to automatically provide interesting insights rather than exploring data dimensions or encodings. Foresight [7] recommends visual insights by analyzing statistical properties (e.g., correlations between data attributes). DBExplorer [22] aims to improve the understanding of data attribute characteristics and the querying of hidden attributes by using conditional attribute dependency views. In [11] a smart interactive drill-down approach is presented that discovers and summarizes interesting data groups. Milo and Somech [17] introduce a next-step recommendation engine that suggests follow-up analysis actions based on prior action recordings. In process mining, a similar analysis is applicable to case attributes and PPIs. Our approach investigates how to extract interesting case attributes and PPIs from event logs automatically.

Interactive Browsing in Process Mining. Process mining tools such as ProM, Fluxicon Disco, or Celonis are highly exploratory, enabling the user to interactively inspect and analyze event logs. P-OLAP [3] enables analyzing and summarizing big process graphs and provides multiple views at different granularities. VIT-PLA [27] summarizes traces and automatically creates data explanations from trace attributes using regression analysis. A linguistic summary approach is presented in [8]. In [2] the authors evaluate PPIs to identify the key differences between process variants. Similarly, Bolt et al. [4] use annotated transition systems to obtain the differences between process variants. Business rules on decision points are generated from the case attributes, revealing the influence of data values on the control flow. However, different from our approach, existing tools lack explicit analysis guidance.

Complexity Reduction of Event Logs. Many different approaches have been investigated to reduce the complexity of event logs, such as trace or activity clustering. Smaller sub-logs obtained by grouping similar process behavior together tend to be less complex and easier to analyze. Clustering can be divided into alignment-based [5, 25] and model-based [21, 23] approaches. An artifact-centric approach is presented in [9], where complex models are split into smaller and simpler ones with fewer states and transitions using statistical significance testing. In our approach, we build on an existing trace clustering technique to extract subsets with similar process behavior and thus reduce analytic complexity.

3 Preliminaries

We first provide some formal definitions, derived from [1], which are used in the exposition of our approach.

Let \(\mathcal {E}\) be the set of all possible event identifiers, \(\mathcal {A}\) be the set of attributes, and \(\mathcal {V}_a\) the set of all possible values of attribute \(a \in \mathcal {A}\). For an event \(e \in \mathcal {E}\) and an attribute \(a \in \mathcal {A}\): \(\#_a(e)\) is the value of attribute a for event e.

Let \(\mathcal {C}\) be the set of all possible case identifiers. For a case \(c \in \mathcal {C}\) and an attribute \(a \in \mathcal {A}\): \(\#_a(c)\) is the value of attribute a for case c. Each case contains a mandatory attribute trace: \(\#_{trace}(c) \in \mathcal {E}^*\), denoted as \(\hat{c} = \#_{trace}(c)\).

A trace is a finite sequence of events \(\sigma \in \mathcal {E}^*\) such that each event occurs only once: for \(1 \le i < j \le |\sigma |\), \(\sigma (i) \ne \sigma (j)\). Finally, an event log is a set of cases \(L \subseteq \mathcal {C}\) such that each event occurs at most once in the entire log.
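
To make these definitions concrete, the following minimal Python sketch models events, cases, and traces as plain data structures. All names and sample values are illustrative and not taken from any process mining library.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Event:
    """An event e with an identifier and a mapping a -> #_a(e)."""
    id: str
    attributes: dict[str, Any] = field(default_factory=dict)

@dataclass
class Case:
    """A case c with case attributes and the mandatory trace attribute (c-hat)."""
    id: str
    attributes: dict[str, Any] = field(default_factory=dict)
    trace: list[Event] = field(default_factory=list)  # finite sequence of unique events

# An event log is a set of cases; each event occurs at most once in the log.
log: list[Case] = [
    Case("c1", {"item_type": "subcontracting"},
         [Event("e1", {"activity": "Create Requisition", "timestamp": "2019-01-02"}),
          Event("e2", {"activity": "Create Purchase Order", "timestamp": "2019-01-05"})]),
]
```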

4 ProcessExplorer Approach

ProcessExplorer follows three main steps to provide intelligent guidance for event log exploration:

  1. Discovery of Subset Recommendations. Our approach automatically discovers subsets that contain process cases of similar behavior, using trace clustering based on the control-flow and data perspectives. The event log is split into smaller, relevant subsets that are more structured and easier for analysts to understand.

  2. Discovery of Insights Recommendations. Our approach automatically discovers relevant PPIs and identifies the most interesting ones. The criterion for deciding on the interestingness of a PPI is how much the subset deviates from a given reference.

  3. Ranking of Recommendations. We rank the recommendations based on the analysis goals by applying diversifying top-k ranking. This ensures that our approach highlights the most diverse recommendations instead of showing very similar subsets or PPIs.

Next, we provide a detailed description of each step.

4.1 Discovery of Subset Recommendations

In this step, our approach automatically finds interesting process behaviors using a modified version of trace clustering. The event log is split into cases of similar behavior, which are then used as the basis for the subset recommendations. Subset recommendations are similar to the filters typically found in existing tools, but they are generated automatically from the event log.

Fig. 1. Architecture of the subset recommendation generation.

We adapt the trace clustering algorithm introduced in [21], which incorporates the control-flow and data perspectives to generate clusters of cases. It defines a combined similarity function that balances the two perspectives. Let L be an event log, and let \(L_1, L_2, ..., L_n \subseteq L\) be the variants of L based solely on the control flow. Furthermore, let \(L_i \cap L_j = \emptyset \) for all \(i, j \in [1, ..., n]\) with \(i \ne j\), i.e., cases are assigned to exactly one variant. For computing the similarity between cases, the two process perspectives are inspected separately before being combined (see Fig. 1):

  • For the control-flow perspective, we compute the Levenshtein edit distance \(lev(\hat{x}, \hat{y})\) between the traces of two cases \(x, y \in L\). The Levenshtein edit distance is defined as the minimum edit operation cost to transform one trace into another by insertion, deletion, or substitution of activities. For comparing traces of different lengths we use the normalized edit distance, i.e., the edit costs divided by the sum of the lengths of both traces. We compute the control-flow similarity between two sets of cases \(X, Y \subseteq L\) as the average normalized Levenshtein edit distance over all case pairs (a runnable sketch of both similarity functions follows this list):

    $$\begin{aligned} sim_C(X, Y) = \sum _{x \in X} \sum _{y \in Y} lev(\underline{(\hat{x})}, \underline{(\hat{y})}) / (|X| \cdot |Y|) \end{aligned}$$
    (1)
  • For the data perspective, we explore the relationships between case attributes. We are particularly interested in case attribute values that co-occur with other values. We extract the case attribute relationships by applying frequent pattern mining, an unsupervised data mining technique that searches for recurring patterns or relationships among itemsets. In particular, we use the FPclose algorithm [10] on case attributes (as itemsets) to find co-occurring case attribute values. The application of the FPclose algorithm is denoted as

    $$\begin{aligned} \mathcal {I}_{L_i} = \{ FPclose(X, s) : X \in L_i \} \end{aligned}$$
    (2)

    where s is the minimum support threshold in the range \(0 \le s \le 1\), and \(\mathcal {I}_{L_i}\) is the set of all case attribute value patterns in \(L_i\). The support threshold s determines the percentage of cases in which a pattern must be observed to be considered frequent. The FPclose algorithm only returns closed patterns, thus reducing the number of patterns considered. A pattern I is closed if there exists no proper superset J with the same support as I. Case attribute value patterns are calculated for each variant \(L_i\) separately and mapped to the corresponding cases. We denote all case attribute patterns as \(\mathcal {I} = \mathcal {I}_{L_1} \cup ... \cup \mathcal {I}_{L_n}\). Extracted case attribute value patterns are compared with the following formula, which returns the proportion of case attribute values contained in both sets:

    $$\begin{aligned} sim_{D}(I_i, I_j) = \frac{2 \cdot |I_i \cap I_j|}{|I_i| + |I_j|} \end{aligned}$$
    (3)

    where \(I_i, I_j \subseteq \mathcal {I}\).
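
The following Python sketch implements Eq. (1) and Eq. (3) under the assumption that traces are represented as lists of activity labels and case attribute value patterns as frozensets of (attribute, value) pairs; the function names are ours.

```python
from itertools import product

def levenshtein(s: list, t: list) -> int:
    """Minimum number of insertions, deletions, and substitutions to turn s into t."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]
        for j, b in enumerate(t, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (a != b)))  # substitution
        prev = curr
    return prev[-1]

def norm_lev(s: list, t: list) -> float:
    """Edit costs divided by the sum of both trace lengths (Sect. 4.1)."""
    return levenshtein(s, t) / (len(s) + len(t)) if (s or t) else 0.0

def sim_c(X: list[list], Y: list[list]) -> float:
    """Eq. (1): average normalized edit distance over all pairs of traces."""
    return sum(norm_lev(x, y) for x, y in product(X, Y)) / (len(X) * len(Y))

def sim_d(I_i: frozenset, I_j: frozenset) -> float:
    """Eq. (3): Dice coefficient of two case attribute value patterns."""
    return 2 * len(I_i & I_j) / (len(I_i) + len(I_j))
```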

For calculating the clusters, the two similarity functions are combined such that the influence of the control flow and the data perspective can be controlled by a weighting factor w, with \(0 \le w \le 1\). We define a similarity matrix over the case attribute value patterns instead of cases:

$$\begin{aligned} M = (m_{ij}), \quad m_{ij} = w \cdot sim_{C}(cases(I_i), cases(I_j)) + (1 - w) \cdot sim_{D}(I_i, I_j) \end{aligned}$$
(4)

where cases is the mapping function \(\mathcal {I} \rightarrow 2^{\mathcal {C}}\) that returns the set of cases matching a case attribute value pattern. We use Hierarchical Agglomerative Clustering with Ward linkage to compute the clusters for the similarity matrix M. Because we cluster case attribute value patterns, we need to obtain the corresponding cases to generate the subsets. These subsets may contain overlapping cases, because similar case attribute value patterns stemming from the same case may be shared among different clusters. We resolve this by assigning each case to the cluster with the minimum distance with respect to the control flow.
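
A minimal sketch of Eq. (4) and the subsequent clustering step, assuming the `sim_c` and `sim_d` helpers from above; converting similarity to distance as \(1 - m_{ij}\) is our assumption, since the conversion is not stated in the text.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_patterns(patterns, cases_of, w=0.5, k=5):
    """Eq. (4): combined similarity matrix over patterns, then Ward clustering.

    patterns: list of case attribute value patterns (frozensets).
    cases_of: mapping that returns the traces (activity lists) of the cases
              matching a pattern (stands in for the paper's `cases` function).
    """
    n = len(patterns)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            M[i, j] = (w * sim_c(cases_of(patterns[i]), cases_of(patterns[j]))
                       + (1 - w) * sim_d(patterns[i], patterns[j]))
    # Assumption: distance = 1 - similarity; symmetrize and zero the diagonal
    # so that squareform accepts the matrix.
    D = 1.0 - (M + M.T) / 2
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D), method="ward")
    return fcluster(Z, t=k, criterion="maxclust")  # one cluster label per pattern
```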

Finally, our method automatically determines the optimal number of clusters k, the minimum support threshold s, and the weighting factor w. We formulate an optimization problem that maximizes the fitness of the underlying process models of each cluster, maximizes the silhouette coefficient of the clusters, and minimizes the number of clusters to obtain k, s, and w. The optimization problem is solved by Particle Swarm Optimization [12], a population-based optimization algorithm inspired by the group dynamics of bird swarms. For each identified subset, we discover a C-Net using the Data-Aware Heuristics Miner [15] and compute the replay fitness.
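
The sketch below is a generic particle swarm optimizer in plain NumPy, not the authors’ exact implementation; the objective combining replay fitness, silhouette coefficient, and the number of clusters is assumed to be supplied by the caller.

```python
import numpy as np

def pso(objective, bounds, n_particles=20, iters=50, inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization: maximizes `objective` within `bounds`.

    bounds: one (low, high) pair per dimension, here for (k, s, w);
    k is optimized as a continuous value and rounded by the caller."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, float).T
    dim = len(bounds)
    pos = rng.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmax()]
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmax()]
    return gbest  # best (k, s, w) found

# Hypothetical usage; `fitness` and `silhouette` are assumed helper functions:
# best_k, best_s, best_w = pso(lambda p: fitness(p) + silhouette(p) - 0.1 * p[0],
#                              bounds=[(2, 15), (0.0, 1.0), (0.0, 1.0)])
```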

4.2 Discovery of Insights Recommendations

In this step, our approach automatically obtains relevant process performance indicators (PPIs), yielding only the most interesting insights for each of the subset recommendations. PPIs, which are visually prepared for analysis, measure potential process bottlenecks, inefficiencies, and compliance violations. Our approach helps alleviate the abundant manual and repetitive work caused by the wide range of different PPIs that need to be evaluated.

We adapt the idea introduced in SEEDB [24], where the interestingness of visualizations is judged by how much the visualized data deviates from a reference (e.g., a different event log, historical data, or other subsets). Similarly, we calculate PPIs from the selected subset and compare them to a reference to assess interestingness. Although there are other criteria that may make PPIs interesting, process analysts are particularly interested in deviations, as these are indicators of process inefficiencies or anomalies. We propose a set of basic PPIs that distinguishes between case- and subset-based PPIs (see Table 1). Case-based PPIs describe a case-specific characteristic, and subset-based PPIs are defined as aggregations over all cases within a subset. The set of PPIs does not necessarily cover all aspects of a process, but our approach is largely agnostic to the particular definition of the PPI.

Table 1. Case- and subset-based process performance indicators (PPIs).
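
As an illustration, the sketch below defines one plausible PPI of each kind, reusing the illustrative `Case`/`Event` structures from the sketch in Sect. 3; the actual PPI set is listed in Table 1 and may differ.

```python
from datetime import datetime

def case_duration_days(case: Case) -> float:
    """Case-based PPI: time between the first and last event of a case."""
    ts = [datetime.fromisoformat(e.attributes["timestamp"]) for e in case.trace]
    return (max(ts) - min(ts)).total_seconds() / 86400

def activity_occurrence(subset: list[Case], activity: str) -> float:
    """Subset-based PPI: share of cases in the subset containing `activity`."""
    hits = sum(any(e.attributes["activity"] == activity for e in c.trace)
               for c in subset)
    return hits / len(subset)
```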

We gather interesting insights from the event log by performing statistical significance testing on the selected PPIs. These tests determine whether the PPI values of the subset and the reference set are drawn from different distributions. Specifically, the null hypothesis of the statistical significance test is that the two PPI value sets are drawn from the same distribution. For each test, a significance level is set beforehand, which is the accepted probability of rejecting the null hypothesis although it is true. Based on the p-value and the significance level, we decide whether to reject the null hypothesis. If the null hypothesis is rejected, the corresponding PPI is considered an insight recommendation. We apply two different statistical significance tests depending on the type of PPI: for case-based PPIs, we use the Kolmogorov-Smirnov test, and for subset-based PPIs, we use the Jensen-Shannon divergence.
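
A hedged sketch of this decision procedure using SciPy: the two-sample Kolmogorov-Smirnov test for case-based PPIs and the Jensen-Shannon divergence for subset-based PPIs. The significance level and the divergence cutoff are our assumptions, not values from the paper.

```python
from scipy.stats import ks_2samp
from scipy.spatial.distance import jensenshannon

ALPHA = 0.05      # significance level (assumption)
JS_CUTOFF = 0.1   # divergence threshold for subset-based PPIs (assumption)

def case_ppi_is_insight(subset_values, reference_values) -> bool:
    """Case-based PPI: two-sample Kolmogorov-Smirnov test.

    Null hypothesis: both PPI value sets come from the same distribution."""
    stat, p_value = ks_2samp(subset_values, reference_values)
    return p_value < ALPHA

def subset_ppi_is_insight(subset_dist, reference_dist) -> bool:
    """Subset-based PPI: Jensen-Shannon divergence between two aggregate
    distributions (SciPy returns the square root of the divergence)."""
    return jensenshannon(subset_dist, reference_dist) ** 2 > JS_CUTOFF

# e.g., case durations in days of the subset vs. all cases
print(case_ppi_is_insight([3.2, 5.1, 4.8, 2.9], [10.4, 71.5, 33.0, 54.2]))
```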

We additionally calculate Cohen’s d effect size because the significance value does not necessarily indicate how large or small such a deviation is. For two measurement series \(x_1, x_2\), it is defined as follows:

$$d = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s^2_1 + s^2_2) / 2}} \quad \text {with} \quad s^2_i = \frac{1}{n_i - 1} \sum _{j=1}^{n_i} (x_{j,i} - \bar{x}_i)^2$$

The effect size expresses the amount of deviation on a comprehensible and understandable scale. Cohen defines ranges that determine how large or small the deviation is: \(0.2 < d \le 0.5\) indicates a small effect, \(0.5 < d \le 0.8\) a medium effect, and \(d > 0.8\) a large effect.
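
The formula translates directly into code (a small sketch; `ddof=1` yields the sample variances \(s_i^2\) defined above):

```python
import numpy as np

def cohens_d(x1, x2) -> float:
    """Cohen's d with the pooled standard deviation from the formula above."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    pooled = np.sqrt((x1.var(ddof=1) + x2.var(ddof=1)) / 2)
    return (x1.mean() - x2.mean()) / pooled

def effect_label(d: float) -> str:
    """Cohen's ranges for interpreting the magnitude of the deviation."""
    d = abs(d)
    if d > 0.8: return "large"
    if d > 0.5: return "medium"
    if d > 0.2: return "small"
    return "negligible"
```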

In our experiments, we found that some PPIs are strongly correlated with each other, which leads to redundant insights. We summarize correlated insights by grouping them together using clustering. The pairwise Spearman correlation matrix is computed over all relevant PPIs and used as the input for the clustering. The optimal number of clusters is determined by the elbow method.
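
One way to realize this grouping, sketched under the assumption that \(1 - |\rho|\) serves as the distance between PPIs (the paper does not specify the conversion):

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def group_redundant_ppis(ppi_matrix, n_groups):
    """ppi_matrix: one column of values per relevant PPI.
    Returns a group label per PPI; correlated PPIs share a label.
    n_groups would be chosen with the elbow method (not shown here)."""
    rho, _ = spearmanr(ppi_matrix)            # pairwise Spearman correlations
    D = 1.0 - np.abs(np.atleast_2d(rho))      # assumption: distance = 1 - |rho|
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method="average")
    return fcluster(Z, t=n_groups, criterion="maxclust")
```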

4.3 Ranking of Recommendations

In this step, our approach ranks subset recommendations based on their relevance, yielding a ranked top-k recommendation list. The score given to each subset is the average score of the insights identified within it, defined as the product of effect size and coverage. The coverage is the proportion of cases in a subset that exhibit the specific insight. This criterion ensures that subsets with highly representative insights are ranked higher.

In addition, we apply diversifying top-k ranking [20] over the list of relevant subsets, which returns a list of k items based on both score and diversity. We do this to avoid the typical issue of top-k lists, which tend to have very similar items at the top. Users tend to pay more attention to items at the top of the list, which can cause low selection diversity and the filter bubble effect [19]. We define diversity as the combination of trace and case attribute similarity of the cases within each subset, and we use the same similarity function that was used for generating the subset recommendations to identify the most diverse subsets. Based on the similarities of the subsets and the calculated relevance score, we return a ranked top-k list of subset recommendations.
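
A common way to realize diversifying top-k is a greedy, MMR-style selection; the following sketch illustrates the idea and is not necessarily the exact algorithm of [20]. `score` and `sim` stand in for the relevance score and the combined similarity function described above.

```python
def diversified_top_k(items, score, sim, k, lam=0.5):
    """Greedy diversifying top-k: each step picks the item with the best
    trade-off between its relevance score and its maximum similarity to
    the items already selected (lam balances the two)."""
    selected, candidates = [], list(items)
    while candidates and len(selected) < k:
        best = max(candidates,
                   key=lambda c: lam * score(c)
                                 - (1 - lam) * max((sim(c, s) for s in selected),
                                                   default=0.0))
        selected.append(best)
        candidates.remove(best)
    return selected
```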

5 ProcessExplorer Implementation System

We implemented our approach in an interactive exploration system to illustrate and evaluate its usefulness. Our system allows analysts to interactively explore event logs with the automatic guidance provided by the ranked subset and insights recommendations. We describe our system using the publicly available event log of the BPI Challenge 2019 as a use case example. This event log contains data of the purchase order handling process collected from a multinational company in the Netherlands that deals with coatings and paints. The event log consists of 1,595,923 events relating to 42 activities in 251,734 cases. The events are executed by 627 different users (607 human users and 20 batch users). The overall user interface of our system is depicted in Fig. 2 and consists of the subset recommendations, the insights recommendations, the process map, the stage view, and the subset statistics. A detailed view of the components is presented in Fig. 3.

Fig. 2. User interface of ProcessExplorer showing the subset and insights recommendations, the process map of the selected subset, the stage view, and the subset statistics.

Fig. 3. Detailed overview of the stage view, the subset recommendations, the insights recommendations, and the subset statistics (from top-left to bottom-right).

Process Map. Similar to other process mining tools, the process map visually shows the underlying process, with the navigation and visualization features found in existing tools (see Fig. 2). The user can hide activities and transitions using the relative occurrence slider at the bottom right of the process map. The figure presents the process map after a subset recommendation has been applied.

Subset Recommendations. The list of subset recommendations shown in Fig. 3 (top-right) is obtained from the event log. Subset recommendations are presented to the user as a list sorted by their assigned score. The user can modify subset recommendations by adding additional filters, such as the variant filter, the start and end activity filter, or the happy path filter. Subset recommendations can be applied or deleted by the user. For the use case event log, 10 subset recommendations are returned.

Subset Statistics. The subset statistics depicted in Fig. 3 (bottom-right) give an overview of the activity distribution, variants, traces, and transitions of the contained cases. The subset statistics of our example use case show that the selected subset contains only 6 out of 11 activities, covers 1956 event occurrences, and displays the distribution of events compared to the previous stage.

Insights Recommendations. The insights recommendations depicted in Fig. 2 are the insights of the subset that the user decided to apply. Case-based PPIs are rendered as text describing the identified deviations. Subset-based PPIs are rendered as bar charts showing the distribution of values. For our running example, Fig. 3 (bottom-left) shows 2 out of 6 insights recommendations. In the example, the subset PPIs were compared to all cases in the event log. The first insight indicates that the “Record Invoice Receipt” activity is directly followed by the “Remove Payment Block” activity in 59.1% of the cases, compared to 14.6% in the overall event log. The second insight shows the distribution of resources for the activity “Receive Order Confirmation”, which highlights “user_029”.

Stage View. The stage view shown in Fig. 3 (top-left) allows simplified navigation between subset recommendations. The stage view records the applied subset recommendations in a hierarchical structure. Applying a subset creates a new stage, for which new recommendations are computed based on the contained cases. Three different stages are depicted in our example use case; they were generated by accepting the first three subset recommendations.

6 Evaluation

In this section, we evaluate our proposed system in two different studies. First, we present a preliminary study that helped us define the requirements of our ProcessExplorer implementation system. Then, we present a user study aimed at evaluating the usefulness of our approach through the application of our implementation system to a particular case.

6.1 Preliminary Study: Identify Key Requirements

A preliminary study based on an early prototype was conducted to define, based on the opinions of expert analysts, the key requirements for building a fully-fledged version of the ProcessExplorer system. In particular, we are interested in which kind of guidance is most helpful to analysts for obtaining valuable knowledge from the event log more efficiently.

Setup. The prototype was shown to five process mining consultants (one female), all of whom are familiar with state-of-the-art tools, such as Fluxicon Disco and PAFnow Process Mining (all participants), QPR Process Analyzer (4), and Celonis and ProM (2). For the user study, we used the publicly available real-life event log from the BPI Challenge 2017. The interface was compared in three different modes. In the first mode, we consider only manual filtering, with no guidance being provided. In the second mode, we introduce the stage view functionality, allowing the analyst to focus on specific parts. In the third mode, we further introduce the subset recommendation feature, which enables analyzing certain parts of the process in a semi-automatic fashion. All participants were asked to familiarize themselves with the provided guidance and explore the given event log. After exploring the event log with a specific mode, the participants were asked to fill out a post-stage questionnaire to rate their experience with and preference for the interfaces. Finally, we asked the participants to fill out the User Experience Questionnaire (UEQ) [13]. The UEQ consists of six scales: attractiveness, which reflects the overall impression; perspicuity, efficiency, and dependability (pragmatic quality); and stimulation and novelty (hedonic quality).

Fig. 4. Results of the UEQ for the preliminary study and for the final prototype. Range from −3 (horribly bad) to 3 (extremely good).

Results. According to the responses to the post-stage questionnaire, participants rated the stage view, with its ability to navigate between the different subsets, as beneficial and useful for the systematic exploration of large event logs. However, two participants took longer to analyze the event log due to the increased navigation capabilities. The participants rated negatively the need to navigate between the different views to change the selection. The subset recommendation feature received positive comments, but the information presented was not sufficient to let users directly see what a subset is about. Furthermore, one participant criticized the top-down approach, which makes it difficult to find the really important parts of the process, and suggested recommending potentially interesting subsets that can be inspected one by one.

In Fig. 4 we present the results of the UEQ. We observed that stimulation and novelty received the highest scores; attractiveness, efficiency, and dependability only average scores; and perspicuity a negative score. Furthermore, our study shows that participants consider the idea of a systematic analysis innovative and promising. However, our results do not offer conclusive evidence that the analysis experience is improved with our preliminary prototype’s user interface. The detailed comments from the participants reveal a steep learning curve for the preliminary prototype, indicating that more guidance support was required.

6.2 User Study: Evaluation of Usefulness

A user study with the ProcessExplorer system was conducted to evaluate the usefulness of the system and the underlying concepts. Particularly, we are interested in how useful the subset and insights recommendations are for exploring large event logs.

Setup. The system was shown to six process mining experts, none of whom had participated in the preliminary study, in a user study workshop. The workshop took 60 min and was divided into two parts. In the first part, ProcessExplorer was presented to the participants, and we introduced the implemented guidance features. We used the BPI Challenge 2019 event log, which was known to all study participants, so that little or no explanation was required about the inspected process. Participants were asked to express their explicit opinion about the system. In the second part of the workshop, the participants were able to explore the implemented guidance features to see how ProcessExplorer guides their analysis work.

For the evaluation of ProcessExplorer, we applied the Technology Acceptance Model (TAM) [6], a quantitative method to evaluate the potential acceptance of a given technology by end users. The standard evaluation form of the TAM consists of 10 statements (see Table 2), each assigned to a specific cluster of usefulness aspects (A = job effectiveness, B = productivity and time savings, C = importance of the system to the users’ job, D = control over the job). The study participants rated the extent to which they agree with each statement on a scale from −3 (“Strongly disagree”) to 3 (“Strongly agree”).

Table 2. The questions and results of the TAM usefulness estimation.
Fig. 5. Overview of the TAM usefulness estimation for ProcessExplorer according to the question clusters, on a scale from −3 (strongly disagree) to 3 (strongly agree).

Results. Our results in Fig. 5 show that the ProcessExplorer system received a positive overall usefulness mean score of 2.8 (\(SD=0.4\)). For each specific cluster of the TAM, our system obtained a positive mean score. Statement 1 in cluster A received a highly positive rating with a mean of 2.6 (\(SD=0.49\)). This can be explained by the fact that the ProcessExplorer system provides analysts with a more in-depth view into the event log: analysts can quickly explore the different process behaviors. Despite this, we observed that job performance (statement 6) did not improve, showing a lower score of 1.17 (\(SD=2.11\)) and suggesting that not all participants share the same positive opinion about the system. This might be caused by the short exploration time. Most study participants agreed that productivity and time savings can be achieved through the system (cluster B). The participants also agreed on the importance of the system to their job (cluster C), because our system improves the exploration of the different process behaviors.

We also asked the participants to fill out the UEQ (see Fig. 4). In comparison to the preliminary study, all values improved. Attractiveness, novelty, and stimulation are now in the above average range.

Lastly, we report on some individual feedback we received from the workshop participants. Most of the participants liked the idea of getting additional guidance during the analysis, instead of beginning without any hints. The idea of generating sub-logs and presenting them as subset recommendations was described as “very innovative” (P1) and “super interesting” (P4). One participant wrote that “It’s a very useful tool to gain quick control over unknown data” (P5). However, one participant (P6) does not consider the insights recommendations useful; one could argue that the visual representation of these insights needs to be clarified. Still, the subset recommendations were seen “as the real added value to process mining” (P6). Two participants (P2, P4) found the user interface too overloaded, with all the information shown at the same time.

7 Discussion

Validity. While our evaluation of the usefulness of ProcessExplorer provided relevant observations, other aspects specific to our approach deserve further analysis. For instance, the effectiveness of the subset recommendation approach was discussed in the prior paper [21], which introduced the multi-perspective trace clustering. However, we did not conduct an effectiveness analysis of the insights proposed by ProcessExplorer for practical use. In terms of our evaluation methodology, we consider that a larger number of study participants could help find more insights on the practicality of our approach.

Limitations. We consider that our approach exhibits two main limitations. The first is related to the user interface, which is modeled on other process discovery tools and extends the process model view with recommendations. A non-static, dashboard-like interface with customization capabilities could further improve our system. The second limitation is that we score the interestingness of insights based on deviations, which may limit our approach to PPIs that follow this behavior pattern. However, related work [24] has shown the efficacy of this deviation-based metric.

8 Conclusion and Future Work

In this paper, we presented ProcessExplorer, a novel interactive process mining guidance approach. It automatically generates ranked subsets of interesting cases based on control flow and data attributes, similar to the workflow of analysts who manually select certain cases. ProcessExplorer also evaluates a range of relevant PPIs and suggests those with the most significant deviation. We implemented our approach in an interactive exploration system, which we built based on requirements gathered from expert analysts during a preliminary study. To evaluate the usefulness of our approach we conducted a user study with business process analysts. Our results show that our approach can be successfully applied to analyze and explore real-life data sets efficiently.

As future work, we plan to extend our subset recommendation mechanism by applying activity clustering, which will allow us to further narrow the analysis to specific parts of the process. It may also be of high interest to investigate different interestingness measures. We also plan to extend the user study to cover a longer period of time and a larger number of participants. Finally, it may be of interest to further investigate the ranking, as well as the negative implications of automatic guidance on experts’ analysis performance.