1 Introduction

Process mining has a history of over two decades of published research, with case studies starting to appear a little over a decade ago. At the core of process mining (originally called workflow mining) is the use of logged traces from the execution of a business process to model the actual behaviour of the process. It is this data-driven approach that distinguishes process mining from other forms of process analysis, which typically rely on developing an understanding of the process from people’s perceptions of the way the process behaves (or should behave). The model-from-logged-data approach can be traced back to 1998 [2], where the authors “describe an algorithm that, given a log of unstructured executions of a process, generates a graph model of the process”. The resulting process graph (capable of modelling any partial ordering of activities and of modelling loops) represents the control flow of the process.

Since 1998, process mining has received much attention from the research community. A search of the Scopus database for publications since 1998 which mention “process mining” returns nearly 6,000 documents. In the research space, ProM was developed as an open-source repository of process mining tools and techniques, and since then there have been many other tools, both open-source and commercial. In 2009, the IEEE CIS Task Force on Process Mining was established, and in 2011 the Process Mining Manifesto [3] posed a number of challenges for the discipline, including C10: Improving Usability for Non-Experts and C11: Improving Understandability for Non-Experts. These challenges can be interpreted as saying that, at the time of the Manifesto’s publication, process mining remained largely a research topic and still had some way to go before being accepted as a mainstream computing technology.

However, with the first process mining conference launched in June 2019, the field has reached an important milestone on the way to becoming a mature academic field [10]. It is thus timely to assess just how far the discipline has matured to date in terms of application in practice.

There is a variety of definitions of the maturity of a field of study and different approaches to measuring it. From a literature survey, Keathley et al. [17] synthesise criteria for assessing maturity and use the results to develop a generalised maturity assessment framework. Van der Aalst [1] refers to the maturity of the BPM field as its relevance, as acknowledged by practitioners and academics. Recker and Mendling [20] investigate the maturity of the BPM field based on academic impact (through measuring citations) and research methodologies (through measuring the presence of certain research components in a paper). In Information Systems research, Cheon et al. [9] examine the maturity of IS research based on the diversity of variables, research approaches, and generalisability of research findings.

In this paper, we rely on the maturity assessment framework in Keathley et al. [17] to identify dimensions, criteria, and metrics relevant to assessing the maturity of the process mining field in practice. Consistent with [1], we consider diffusion in industry and among practitioners [17] as one of the relevant dimensions of maturity of process mining in practice. According to [17, Table 5], criteria associated with the diffusion dimension include adoption in industry, which can be measured by the ‘number of industries adopting findings from the research area’. Accordingly, the first research question this paper addresses is RQ1: How widespread are process mining tools and techniques across different domains? To answer this question, we reviewed all process mining case studies published during the period 2007 to 2018 that are directly related to practice, i.e. that seek to address concerns raised by the industry partner, thus allowing us to assess the diffusion of process mining tools across different domains and different industries.

We also adapt the ‘research design characteristics’ maturity dimension from [17, Table 5] which is measured in terms of ‘rigo[u]r’. In [17], rigour comprises the sub-criteria ‘research objectives’ and ‘thoroughness’. In this paper we take thoroughness to refer to the combination of these two sub-criteria. Thus, the second research question in relation to maturity of the process mining field is RQ2: How thoroughly are process mining methodologies applied in the case studies to address practical problems? To answer this question, we define measures of thoroughness for each of the various phases of a generalised process mining methodology and assess process mining case studies against these measures.

The remainder of the paper is organised as follows. In Sect. 2, we position our work against related work and then introduce our generalised process mining methodology. In Sect. 3 we describe the criteria and metrics we use in assessing process mining maturity in practice. In Sect. 4, we describe how we identified and coded case study papers for maturity analysis. In Sect. 5, we present the results of this analysis. In Sect. 6, we reflect on our analysis results and offer some thoughts on the maturity of the field of process mining as derived from our analysis and provide some thoughts on potential future work. In Sect. 7, we offer some conclusions and reflect on the limitations of our current work.

2 Background and Related Work

2.1 Related Work

In the field of process mining, there have been a few other literature review studies with the aim of providing an overall descriptive view (for a definition of descriptive review see [22]). Tiwari and Turner [26] review 50 papers in the field of process mining to show the distribution of different analysis techniques and to examine how various challenges in the field were addressed (e.g. noise, hidden tasks). In contrast, our paper focuses on the diffusion of process mining tools and techniques in practice.

Ghasemi and Amyot [14] use an extensive literature review to identify the distribution of process mining studies across six different search engines. The authors demonstrate a growing trend of process mining research over the last 10 years in general and in the healthcare domain in particular. They also conclude that Scopus and Google Scholar together cover 96% of the papers published in the field of process mining.

Rojas et al. [21] describe a systematic descriptive literature review on the application of process mining in healthcare. The authors reviewed 74 papers in this area and provide some observations in terms of the types of processes encountered and data used, the process mining methodologies, tools and algorithms that were applied, and the emerging research opportunities they identified in the field. The structured literature review in Kurniati et al. [18] includes 37 process mining studies in the oncology domain. The authors reviewed the research questions posed, the methodologies used, and the findings and results presented, and suggest future research opportunities.

Thiede and Fuerstenau [24] explore the use and maturity of process mining as a technology in practice through a structured review of process mining research. The review analysed 68 papers published in 22 journals using a maturity model synthesised from maturity models in ERP and business analytics. They identified that cross-system and cross-organisational process mining is underrepresented in IS journals. Thiede et al. [25] continue this study, extending the search period to 2015 and 2016 and focusing more specifically on empirical studies. The study reviewed 144 papers in relation to their coverage of different types of systems in an organisational context, with findings confirming the results of the earlier study. The main distinction between the two papers with Thiede as first author and our paper is in the definition of process mining maturity. Unlike Thiede and Fuerstenau [24], who consider process mining as a technology and define maturity based on technology maturity models, we consider process mining as a field of study consisting of tools, techniques and research methodologies, and define criteria for maturity based on the maturity definitions of a field of research [17].

Overall, existing related work has paid little or no attention to the degree of maturity of application of process mining techniques.

2.2 Process Mining Methodology

To be able to assess the thoroughness of process mining case studies, in this section we introduce the common phases of a process mining methodology and the important considerations in relation to each phase. Process mining case studies, whether motivated by real world problems or by researchers’ intentions to examine existing tools in a practical context, usually follow some kind of process mining methodology. While there is, as yet, no agreed standard process mining methodology, there are several process mining methodologies described in the literature, e.g. [3, 7, 12, 15]. Each of these methodologies (i) is described in terms of phases, where each phase has an objective, some required inputs, and some defined outputs, and (ii) is not prescriptive in terms of tools and techniques. For our analysis, a methodology provides a standard against which each case study can be assessed. We therefore synthesised a set of methodology phases (and associated objectives and outputs) from the phases described in [3, 7, 12, 15] that we use as the basis for analysing case study thoroughness. We do not see any objections to this approach as (i) few of the case studies we reviewed mentioned application of any specific methodology, and (ii) we assess each case study against the set of objectives and outputs (and then track these assessments using the relevant generalised methodology phase), not against how closely the case study followed the generalised methodology. Thus no case study is penalised for not strictly applying the phases of the generalised methodology. Table 1 (Column 1) shows the phases of our generalised methodology lined up against synonymous phases of other published methodologies.

Phase 1 (Defining research questions): In the first phase of a process mining project, objectives and research questions should be specified. This should be done in consultation with organisational stakeholders and domain experts.

Phase 2 (Data collection): The objective of this phase is to understand the available data (as present in existing systems) and what can be extracted (event data and other attributes) and used (scope and granularity of data) to answer the research questions. According to the Process Mining Manifesto [3], the choice of data and data sources should be driven by the research questions. The outputs of this phase include (i) a conceptual data model (showing relationships between data sources and elements), and (ii) initial event logs.

Phase 3 (Data pre-processing): The objective of this phase is to ensure the extracted data is of high quality and is suitable for subsequent mining and analysis. Pre-processing may address missing data, incorrect data, or bringing data into the right or a uniform format (e.g. timestamps). A variety of process mining tools have been developed to transform data to the right format (e.g. [23]) and also to apply automated log cleaning methods on the extracted data (e.g. [8, 11]). However, data cleaning is an ad hoc task and usually depends heavily on domain knowledge [23]. It is therefore naive to rely solely on tools to automatically resolve data quality issues and not be mindful of the deeper underlying reasons as to why these quality issues emerge in the first place. They may result from the way systems have been configured (including both operational use and logging) or from organisational rules (e.g. [4, 5, 23]).

Phase 4 (Mining and analysis of results): In this phase, process mining techniques are applied to the data prepared thus far in order to answer the research questions and to obtain process-related insights from analysis of the results. In our study, we consider a number of different types of process mining: process model discovery, conformance checking, performance checking, social network analysis, and comparative analysis. The form of analysis appropriate for a process mining case study is dependent on the research question(s) and the requirements of the context [12]. A variety of tools and techniques have been developed in the past 20 years in relation to different forms of process mining analysis. These algorithms and tools have different functionality, deal with different characteristics of input data, and produce different quality of outputs [11]. Accordingly, the appropriate choice of process mining analysis type, tools and techniques is crucial when conducting a process mining project. Similarly, the presentation of the results of an analysis is critical. Results should go beyond merely reporting the output of whichever tool was used, to include an interpretation of the findings with respect to the research question(s).

Phase 5 (Stakeholder evaluation): This phase in our methodology is the presentation of the findings to the stakeholders with a view to gaining stakeholders’ feedback as to the validity, accuracy, reasonableness and relevance of the findings. Interpretation and evaluation of the findings could occur more or less simultaneously and could also evolve through a number of iterations.

Phase 6 (Implementation): In this phase of a process mining project, the insights derived from phases 4 and 5 are implemented with the objective of improving the process, and (possibly) providing further support through process mining. The actual implementation of a process mining project, however, often goes beyond the scope of the reported case study.

Table 1. Generalised process mining methodology phases and semantically synonymous phases from published methodologies

3 Process Mining Maturity Criteria

As discussed, to assess the maturity of the field of process mining we draw on the maturity framework in Keathley et al. [17] and adapt two maturity dimensions which suit process mining research: diffusion and research design characteristics. In this section we further define the measures that we apply in this paper to evaluate process mining maturity based on these two dimensions.

3.1 Diffusion of Process Mining

Keathley et al. [17] define diffusion as one of the dimensions of maturity of a field of study. Diffusion can be related to three main criteria: (i) adoption in industry, (ii) communities of practice, and (iii) technology development [17]. In this paper, we focus on adoption in industry as the main diffusion criterion and define it as the application of process mining tools and methods in different practical domains. To measure diffusion, we consider (i) the frequency of application of different process mining tools across the published case studies, (ii) the range and frequency of domains to which process mining has been applied as revealed by our literature search, and (iii) how process mining tools and techniques have achieved traction across different domains.

3.2 Thoroughness of Process Mining Case Studies

This study refers to ‘clarity of research questions’ and ‘thoroughness’ of process mining approaches as sub-criteria of process mining methodological rigour in practice. In our view (see Sect. 2.2), defining clear research questions should be part of any process mining project methodology. Hence, in this paper, we use the term thoroughness to refer to both these sub-criteria. According to the Process Mining Manifesto [3], the impact of process mining tools in practical domains can suffer due to immaturity of existing tools and obliviousness of researchers/practitioners to process mining methodology, i.e. a lack of knowledge of the limitations of process mining tools in the context of the study, and inattentiveness to research questions and domain knowledge. Accordingly, we define ‘thoroughness’ in relation to a process mining case study as thorough consideration of the stakeholders, their requirements and the study context through the different phases of a process mining methodology (Sect. 2.2), including: (i) unearthing the research questions of interest to the organisation involved, (ii) the way data is collected and pre-processed, (iii) the manner in which mining algorithms are applied and data is analysed, (iv) the attention that has been paid to presenting the results to the stakeholders, and (v) the way these results have been evaluated with the stakeholders. We assess the degree of thoroughness of a process mining case study based on the above considerations in relation to each phase of our generalised process mining methodology. For each phase, the highest level of thoroughness is ranked as 3 and the lowest level as 1. We rank a phase as 0 if that specific phase was not mentioned in the process mining case study at all. The details of our coding approach are described in Sect. 4.3.

4 Analysis of Process Mining Case Studies

In this section we describe in detail our approach to identifying relevant published process mining case studies and the criteria used to assess them.

4.1 Paper Extraction

According to Paré et al. [19] our work represents a combination of a descriptive and a critical review of the application of process mining techniques. Hence our review approach is influenced by a number of related guidelines [13, 19, 22]. In our approach we (i) extract process mining case studies of the last 18 years [22], (ii) determine a selection strategy [19], (iii) develop coding dimensions and related assessment criteria [19], and (iv) perform the coding and the analysis [6].

We aim to provide both a descriptive (research question 1) and a reflective (research question 2) review of process mining case studies. Rather than limiting the review to a selective or representative set of papers, we aim to be as comprehensive as possible in considering the corpus of process mining case studies [19]. According to Ghasemi and Amyot [14], the combination of Google Scholar and Scopus covers 96% of the published process mining papers in any topic and domain. Consequently, for this paper we used these two search engines to find process mining case studies. We consider the search period used, i.e. from 2000 to 2018, to be inclusive as the earliest process mining case study papers found were published in 2007. The search process was carried out in a number of phases [19, 22]. Firstly, the Scopus and Google Scholar databases were searched for papers (articles, conference papers and book chapters) containing the phrase “process mining” with a publication date after 1999. Secondly, the data set was scanned to remove duplicate papers. Thirdly, the articles were filtered to remove books, theses, literature reviews, position papers, state of the art papers, general BPM papers (which may mention process mining), data mining papers (which mention process mining) and ‘citation only’ references. Fourthly, the title, abstract and keywords were reviewed to exclude obviously irrelevant articles (for instance, articles that relate to the processing of minerals and ore mining). Lastly, the inclusion and exclusion criteria (explained in the next subsection) were applied to each of the articles. This resulted in a final set of 152 articles.
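The first three search phases can be pictured as a small filtering pipeline. The following is a minimal sketch only, not a script used in this study: the record fields (`title`, `year`, `kind`), the labels in `EXCLUDED_KINDS` and the function name are hypothetical. The screening of titles, abstracts and keywords and the inclusion and exclusion criteria of Sect. 4.2 rely on manual review and are therefore not modelled.

```python
from dataclasses import dataclass


# Hypothetical record layout for one search hit; the field names are assumptions.
@dataclass(frozen=True)
class Record:
    title: str
    year: int
    kind: str  # e.g. "article", "conference paper", "book chapter", "thesis"


# Publication kinds removed in the third phase (labels are illustrative only).
EXCLUDED_KINDS = {"book", "thesis", "literature review", "position paper", "citation only"}


def select_candidates(records, first_year=2000, last_year=2018):
    """Apply the search period, de-duplication and type filter, in that order."""
    seen = set()
    kept = []
    for r in records:
        if not (first_year <= r.year <= last_year):
            continue  # outside the search period
        key = r.title.strip().lower()
        if key in seen:
            continue  # same paper returned by both search engines
        seen.add(key)
        if r.kind in EXCLUDED_KINDS:
            continue  # not a candidate case study paper
        kept.append(r)
    return kept
```

Matching duplicates on a normalised title is a simplification chosen for brevity; in practice, matching on an identifier such as a DOI would be more robust where one is available.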

4.2 Inclusion and Exclusion Criteria

As our analysis concerns process mining case studies, we need to be specific about the criteria for a paper to be considered a case study. A process mining case study is focused on reporting the application of existing process mining tools and techniques to a specific domain to provide business value or address stakeholders’ requirements. To get a better picture of the application of process mining in a variety of contexts, we did not exclude any papers based on considerations of (perceived) quality [19].

We included, as case studies for our analysis, only those articles where process mining tools and techniques were the only forms of analysis used. We excluded the following articles: (1) articles where the principal contribution was a methodology, technique or tool, which was subsequently illustrated with a ‘case study’, (2) articles not written in English, (3) articles whose full text was not freely available to the authors, and (4) articles where process mining techniques were used for the purpose of data preparation as an input for data mining or statistical analysis rather than for process discovery and analysis.

After initial filtering and subsequent application of inclusion and exclusion criteria, we identified 152 case study papers for analysis. Table 2 shows, by year, the number of published case studies.

Table 2. Articles published per year

4.3 Coding Dimensions and Analysis Approach

To answer the first research question in relation to diffusion of process mining case studies, we used literature review profiling techniques. Literature profiling is an effective approach to identify thematic trends and diffusion of interests in a field of study [13]. To evaluate the increase in the application of process mining in practice, we analysed the overall distribution of published case studies over the years. In order to evaluate the dissemination of process mining in different practical domains and the distribution of process mining techniques, the case studies were classified based on the domain of application and also the process mining tools applied to conduct the project.
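As a rough illustration of this profiling step, the classification can be reduced to frequency counts. The sketch below is an assumption-laden example rather than the tooling used in this study: the dictionary keys `year`, `domain` and `tools` are hypothetical names for the coded attributes of each case study.

```python
from collections import Counter


def profile(case_studies):
    """Count case studies per year, per domain and per (domain, tool) pair.

    `case_studies` is a list of dicts with the hypothetical keys
    "year", "domain" and "tools" (a list of tool/technique names).
    """
    per_year = Counter(cs["year"] for cs in case_studies)
    per_domain = Counter(cs["domain"] for cs in case_studies)
    # set(...) ensures a tool named twice in one paper is counted once for it
    per_domain_tool = Counter(
        (cs["domain"], tool) for cs in case_studies for tool in set(cs["tools"])
    )
    return per_year, per_domain, per_domain_tool
```

Calling `per_domain.most_common(7)` on the result would then list the seven most frequent domains together with their article counts.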

To answer the second research question in relation to the thoroughness of application of process mining, case studies were evaluated, in each phase of process mining methodology, on a scale from 1 to 3. We assigned a coding value of 0 to any phase where, for one of a variety of legitimate reasons, the study authors skipped explaining the phase. Thus we were able to conduct our analysis without unduly penalising these studies. Table 3 shows the thoroughness criteria (Sect. 3.2) for this evaluation against each phase of process mining methodology (in Column 1) as discussed in Sect. 2.2.

Table 3. Thoroughness coding values for each phase of process mining methodology

To ensure coding reliability, two authors, using NVivo, coded the first 10 papers, resolved discrepancies and revised the coding criteria. Then the whole set of papers was coded by one author before being reviewed by all authors. Discrepancies were discussed and resolved, and, based on this feedback, the coding criteria were further revised. The whole set of papers was then coded a second time by the same author [16, 20].

5 Analysis

In this section we present an in-depth analysis of the selected process mining case studies to assess the maturity of the field of process mining in practice. This analysis is guided by our coding efforts and provides both qualitative and quantitative insights. In Sect. 5.1, we address the first research question in relation to diffusion of process mining tools and techniques. In Sect. 5.2, we report on the thoroughness of process mining case studies.

5.1 An Overview of the Process Mining Field

The increasing interest in publishing case studies in the process mining discipline since 2014, a finding consistent with Thiede et al. [25], is a positive indicator of applicability of the field of process mining to practice [1].

Our survey of process mining case studies indicates that more and more researchers and practitioners from various domains are interested in practical applications of process mining tools and techniques. The 152 case studies reviewed in this paper cover 34 different domains, including healthcare and education (the two most frequent application areas), manufacturing, banking, finance, customer service, audit and fraud detection, construction, cybersecurity, logistics, and even game playing. Figure 1 shows the number of case study articles, by year of publication, that address the 7 most frequently mentioned domains. Figure 1 highlights that process mining has gained increasing traction in both the healthcare and education domains over time and suggests the suitability and potential of process mining to address problems in these complex domains. However, identifying the reasons behind these observations is not in the scope of this paper and could be investigated in future studies.

Fig. 1. Domains by year of publication - number of articles.

To better understand how advanced the application of process mining tools and techniques is in these top 7 domains, we investigated how the most developed process mining tools/techniques have been applied across the case studies conducted in these domains. Figure 2 shows, for the 7 most common domains, the process mining tools/techniques that were applied. We note the frequent use of Fuzzy Miner and Heuristic Miner and further note that Inductive Miner is used mostly in studies involving the healthcare and education domains.

Fig. 2. Algorithms by domain - number of articles.

We further note that:

  • Fuzzy Miner, as one of the oldest and simplest techniques, is overall the most commonly applied tool/technique and has, on a year-by-year basis, been the most commonly applied tool/technique.

  • The Heuristic Miner has shown a decline over time in usage.

  • The Inductive Miner, since its release in 2016, has shown increasing usage.

  • Despite its known limitations, the Alpha Miner algorithm remains a popular tool/technique.

  • Even though Social Network Analysis is a common application area (19 articles include this form of analysis), the Social Network Miner tool/technique is infrequently mentioned by case study authors.

5.2 Process Mining Methodology

In this section we investigate the degree of thoroughness of the various case studies. To better present the trends in the whole set of case studies, we devised an aggregated thoroughness indicator for each paper. For any reviewed case study, c, we refer to \(T_c\) as the overall measure of the thoroughness of c where \(0 \le T_c \le 1\) is calculated by summing the thoroughness value for each of the methodology phases and then dividing by 21 (the maximum possible value of thoroughness). To derive trends over time, the thoroughness values were averaged over year of publication (see Fig. 3A). It is clear that, over time, the number of case studies published per year generally increases. However, the degree of methodological thoroughness (average \(T_c\) per year) has significantly dropped.
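The aggregated indicator can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the computation script used in this study: the function names are hypothetical, each coding value is taken to be an integer from 0 to 3, and the maximum total of 21 is taken from the text. Which coded items enter the sum is determined by the coding scheme and is not reproduced here.

```python
from collections import defaultdict

MAX_TOTAL = 21  # maximum possible summed thoroughness, as stated in the text


def thoroughness(scores, max_total=MAX_TOTAL):
    """Return T_c: the summed coding values divided by the maximum total."""
    if any(s not in (0, 1, 2, 3) for s in scores):
        raise ValueError("coding values must be 0, 1, 2 or 3")
    total = sum(scores)
    if total > max_total:
        raise ValueError("summed score exceeds the stated maximum")
    return total / max_total


def average_by_year(coded):
    """Average T_c per publication year; `coded` is a list of (year, scores) pairs."""
    by_year = defaultdict(list)
    for year, scores in coded:
        by_year[year].append(thoroughness(scores))
    return {year: sum(ts) / len(ts) for year, ts in sorted(by_year.items())}
```

For instance, a case study whose coding values sum to 14 obtains \(T_c = 14/21 \approx 0.67\).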

Fig. 3. A (left) - Thoroughness by Year of Publication, B (right) - Thoroughness (Cumulative) Frequency Distribution.

In Fig. 3A, the height of each bar shows average thoroughness per year. The x-axis represents year of publication with the width of each bar representing the number of case studies published in the indicated year.

Figure 3B shows a cumulative frequency for \(T_c\). It can be observed that 53% of case studies in our survey achieved \(T_c \le 0.33\) thoroughness. Further, only 12.5% of case studies achieved \(T_c \ge 0.5\) thoroughness. Finally, only 8 (about 5%) of the 152 case studies analysed achieved \(T_c \ge 0.67\) thoroughness (14 or better out of 21).
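Shares of this kind are single points read off a cumulative frequency distribution. A minimal sketch of how such shares could be computed from a list of \(T_c\) values is given below; the function names are hypothetical and the example is illustrative rather than the procedure used to produce Fig. 3B.

```python
def share_at_most(values, threshold):
    """Fraction of values that are <= threshold (one point of the cumulative curve)."""
    if not values:
        raise ValueError("no values given")
    return sum(1 for v in values if v <= threshold) / len(values)


def share_at_least(values, threshold):
    """Fraction of values that are >= threshold (the complementary tail)."""
    if not values:
        raise ValueError("no values given")
    return sum(1 for v in values if v >= threshold) / len(values)
```

Multiplying a returned fraction by 100 gives the percentage of case studies at or below (or at or above) the chosen threshold.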

Based on a preliminary observation of the case studies’ authorship, we hypothesised that the downward trend in the level of thoroughness of the papers could be related to changes in the patterns of authorship. Accordingly, we conducted an analysis of the co-authorship of the 8 most thorough case studies. The results of this analysis show that (i) one author is involved in 4 of the 8 papers, (ii) there are groups of co-authors involved in multiple papers, e.g. the same set of authors wrote 2 of the case studies, and one author is involved in 3 of the case studies with co-authors who are themselves involved in at least 2 of the case studies, and (iii) several of the case studies involve both process mining and domain experts as co-authors. Further, each of these authors has research experience in multiple aspects of process mining. Further analysis of the (co-)authorship of case studies with lower levels of thoroughness is warranted to determine whether (i) more domain experts are becoming involved in applying process mining techniques in practice, and (ii) less experienced (from a process mining perspective) researchers are applying process mining methods and techniques in practice.

Fig. 4. Thoroughness - Methodology Phases. 1 = Research Questions, 2 = Data Collection, 3 = Data Pre-processing, 4 = Process Discovery, 5 = Conformance Checking, 6 = Performance, 7 = SNA, 8 = Comparative Analysis, 9 = Results, 10 = Stakeholder Evaluation, 11 = Implementation.

Figure 4 shows the heat map representing the level of thoroughness for each phase of process mining methodology over the years. Each cell shows the average degree of thoroughness for the specific phase. The darker cells indicate higher levels of thoroughness of process mining phases (the columns) for the specific year (the rows). The darker colors at the top of the heat map confirm our observation from Fig. 3A, showing the overall downward trend in the level of thoroughness of case studies over time. Looking in more detail, we can observe that phase 1 (formulation of the research questions) is one of the most thorough phases and, after an initial dip, has shown some progress in terms of methodological thoroughness. The level of thoroughness for phase 2 (data collection) is low overall (<2) and does not show much progress over the years. Thoroughness for phase 3, data pre-processing (column 3), is also going down overall and shows few darker colors across the years, with a high level of thoroughness only for 2007 and 2011. For phase 4 (analysis, columns 4–8) the heat map shows the level of thoroughness in conducting these different forms of analysis, if present in the papers. The white cells show that we could not find instances of papers applying the related type of analysis in that year. We can observe that process discovery has a consistent downward trend in terms of its thoroughness. Conformance, performance, social network, and comparative analysis also trend down, while showing a few peaks in thoroughness across the years. Phase 5, column 9 (results), shows a clear downward trend over the years. Column 10 (evaluation) shows that an evaluation, if present in the papers, is conducted mostly in a thorough way (>2). For phase 6, implementation (column 11), except in 2007 and 2013, the level of thoroughness is low (<2) and generally decreasing.
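The cell values of such a heat map are per-year, per-column averages. The sketch below shows one way they could be computed; it is illustrative only, the function name and input layout are hypothetical, and it simply averages whatever coding values are supplied (whether values of 0 enter the averages in Fig. 4 is not modelled here).

```python
from collections import defaultdict


def heat_map_cells(coded):
    """Average coding value per (year, column).

    `coded` is a list of (year, {column: value}) pairs; a column is simply
    absent from the dict when the paper did not apply that type of analysis.
    """
    sums = defaultdict(lambda: [0, 0])  # (year, column) -> [total, count]
    for year, values in coded:
        for column, value in values.items():
            cell = sums[(year, column)]
            cell[0] += value
            cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

A (year, column) pair that is missing from the returned dictionary corresponds to a white cell, i.e. no paper from that year contributed a value for that column.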

Our main criteria for assessing thoroughness include consideration of the context and stakeholders’ requirements as well as being reflective in the choice of methods. The downward trends in the analysis columns (4–8), together with the increasing penetration of process mining into different practical domains, may be interpreted as researchers and practitioners (perhaps due to a lack of expertise and experience) placing little or no importance on reflecting on or explaining the reasoning behind their choice of methods and analytical tools. In contrast, phases 1 (research question) and 5 (evaluation) show the highest level of thoroughness, indicative of increased interaction with stakeholders leading to a deeper understanding of the problem context, and of the relevance of results and insights.

Fig. 5. A (left) - Thoroughness - Healthcare, B (right) - Thoroughness - Education.

As our analysis in Sect. 5.1 revealed that process mining has received much attention in the healthcare and education domains, we now analyse the case studies published in these domains to examine the respective levels of thoroughness and thus to investigate maturity of the application of process mining in these domains. Figure 5A and B respectively show the degree of methodological thoroughness for papers in the healthcare and education domains. We can observe that in healthcare, consistent with the whole set of process mining case studies analysed in this paper, there is a downward trend in terms of methodological thoroughness (from more than 0.6 to less than 0.4). However, this is not the case for case studies published in the education domain as they show a slightly upward trend, even though they are lower in the level of thoroughness (from 0.2 to 0.4) compared to studies in healthcare.

Fig. 6. A (left): Thoroughness, Methodology Phases, Healthcare. B (right): Thoroughness, Methodology Phases, Education.

To further understand these trends, we also examined the heat maps representing the thoroughness of process mining phases for these two domains. Figure 6A shows the heat map for process mining phases in healthcare. Compared to the heat map for the whole paper set (Fig. 4), we can observe that the level of thoroughness in phase 1 (research questions) in healthcare is always equal to or higher than that for the whole paper set across the years. This is consistent with the thoroughness of phase 1 in the education domain (except for 2009), with both domains showing an increasing trend in the thoroughness of phase 1. These results suggest that, over time, researchers in these two domains have developed their understanding of process mining and of how it relates to the problems in their domains. We also do not observe any significant differences in the thoroughness of phase 1 between these two domains. There are no significant differences in relation to phase 2 (data collection) between the healthcare case studies and the whole paper set, but we can clearly see a higher degree of thoroughness in the application of process mining tools in healthcare compared to education. However, unlike in the healthcare case studies and the whole paper set, the level of thoroughness for phase 2, phase 3 (pre-processing), phase 5 (results in column 9 and evaluation in column 10), and phase 6 (implementation in column 11) shows an increasing trend in the education domain. The different patterns in the level of thoroughness of process mining case studies in these two domains invite more investigation into the root causes of these variations: are process mining methods and techniques more suitable for specific domains and harder to apply in others? Do we need to tailor process mining methodologies according to the domain of application? Answering these questions is important in order to achieve higher levels of maturity and impact in practice.
We will further discuss the analysis results in Sect. 6.

6 Discussion

To answer the question “how mature is the process mining discipline in terms of its application in practice?”, we assessed the diffusion and thoroughness of process mining studies in practical domains by reviewing process mining case studies published between 2007 and 2018. In order to assess the diffusion of the process mining discipline in practice, we examined the dissemination of process mining tools and techniques across different domains through a literature review. The increasing number of process mining case studies and the broadening of domains in which these studies are conducted imply a growing maturity of this field in relation to adoption in industry.

This paper also investigated maturity in terms of rigour (thoroughness of application of a process mining methodology). We consider a thorough application of a process mining methodology to be one where consideration of the organisational context and stakeholders’ problems is reflected through all phases of the methodology. We first synthesised, from published methodologies, a generalised process mining methodology and defined measures of thoroughness for each phase of the methodology. We derived an overall thoroughness value for each case study by aggregating phase-thoroughness values. Our analyses revealed an overall decrease in the level of thoroughness of the case studies from 2007 to 2018. Furthermore, looking at the level of thoroughness for each phase of process mining separately shows that, even though the formulation of research questions has improved over the years, other phases, specifically analysis, results, evaluation and implementation, are not showing any improvement in terms of thoroughness (in fact, a decline is evident). One plausible explanation is that case studies are increasingly being carried out by domain experts and novice researchers. This is supported by the growing number of domains in which process mining is being applied, together with the continuing popularity of obsolete and limited tools and techniques such as the Alpha Miner (Footnote 6). This proposition can be further investigated by conducting a review of the authors of the paper set.
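
The aggregation step mentioned above could, for example, take the form of a normalised mean. The sketch below is one possible reading only: the text does not specify the aggregation function, so the averaging, the skipping of phases that were not reported, the division by `max_score`, the function name `overall_thoroughness`, and the example scores are all assumptions.

```python
def overall_thoroughness(phase_scores, max_score):
    """Normalised mean of the phase scores that are present.

    `phase_scores` maps a phase name to its score, or to None when the phase
    was not reported. Averaging and dividing by `max_score` are assumptions,
    not the paper's stated method.
    """
    present = [s for s in phase_scores.values() if s is not None]
    if not present:
        return None
    return sum(present) / (len(present) * max_score)


# Hypothetical case study on an assumed 0-3 scale.
case_study = {
    "research questions": 3,
    "data collection": 1,
    "pre-processing": 2,
    "analysis": None,  # not reported, so excluded from the mean
}
print(overall_thoroughness(case_study, max_score=3))
```

Normalising by the maximum score yields a value between 0 and 1, which makes case studies comparable even when they report different numbers of phases.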

Unfortunately, the decrease in thoroughness of process mining case study approaches implies that, despite increasing adoption in industry, process mining is still not able to deliver the promised outcomes for real-world problems. One way that the process mining research community can attend to this concern is by developing methodological guidelines (with emphasis on context and reflection throughout the process mining phases) to support knowledge transfer from experienced process mining researchers to those who are relatively new to the field.

Future research is warranted to investigate other possible reasons behind this downward trend in the field. For instance, is the complexity of more advanced process mining tools and techniques hindering their application in practice by those who are not process mining experts? (Footnote 7) Also, are more experienced process mining researchers moving away from publishing case studies (for example, because they are becoming harder to publish in good forums in the area), and do we thus see an increased number of case studies published by relative novices in the field (in lower-quality forums)? If so, the aforementioned guidelines would help in increasing the thoroughness of published case studies.

7 Conclusion

The inaugural International Conference on Process Mining (Aachen, Germany, June 2019) marked two decades of the existence of process mining as a field of research and practice. Through a detailed analysis of 152 published process mining case studies, each involving an industry partner, we examined the maturity of process mining in practice by assessing the maturity dimensions of diffusion and thoroughness. Our analysis revealed a growth in the diffusion of process mining tools and techniques across various domains, indicating maturation of the field in terms of usability for non-experts. However, we noted a continuing reliance on simple and outdated tools (such as Alpha Miner). We found an overall downward trend in the thoroughness of process mining case studies (with thoroughness differing across domains). As an area for future research, we suggest the development of more accessible and suitable guidelines, possibly specific to individual domains, to help new researchers and domain experts.

Despite the limitations of our investigation, namely the subjective nature of the coding practice and the restriction to what is recorded in publicly accessible papers, we believe that our observations pinpoint important considerations. Future investigations, such as authorship analysis or interviews with authors, may shed further light on the progress of process mining in practice.