1 Introduction

Organizations that operate in the current complex and turbulent environment need to implement changes in their structures and processes. Therefore, new management approaches are necessary to allow organizations to grow and increase their competitiveness [1,2,3].

As part of the management task, it is necessary to describe how business processes are structured and carried out so that they can be continually improved. Business Process Management (BPM) is an approach capable of supporting both the description and the improvement of processes. Both can be performed incrementally or radically [4].

BPM consists of several phases. One of them is the modeling of the current process (as is), during which the process is mapped as it is. Then the process is analyzed and improved, to design the future process (to be) [5].

One tool that can be used in the process discovery and modeling phase is data mining in information systems. Data mining techniques allow a series of applications in public administration or private companies, either as a verification process or as a discovery process [6].

The purpose of this article is to analyze the use of data mining in the modeling stage of complex and not well structured processes, specifically applying BPM to improve processes at a Federal Institute of Higher Education in Brazil.

2 Theoretical Reference

The way the processes are designed and carried out affects both the quality of the service and the efficiency with which the service is delivered. An organization can outperform another that offers similar services if it has better processes and performance. This holds true not only in customer-facing processes, but also in internal processes [7].

In the last decades, there has been a growing interest in BPM due to its ability to help organizations increase productivity by achieving operational excellence and reducing costs [8].

Research in this field originated in computer science, administration and information systems, and has resulted in a multitude of models, methods and tools that support the design, approval, management and analysis of business processes [8]. For Aalst, La Rosa and Santoro [9], “BPM is the discipline that combines approaches to the design, execution, control, measurement and optimization of business processes”.

BPM life cycles are steps and activities that must be followed to conduct BPM projects. Theoretical and empirical studies show differences in the number of steps and activities that must be carried out to promote BPM [10].

The BPM discipline can be seen as a continuous cycle that involves a series of phases, such as process identification; process modeling; process analysis; process redesign; process implementation; and process monitoring and control [7].

Business process modeling emerged from the need to explain and communicate business processes in an organization, making them easier for business users to understand. These users range from business analysts, who design the initial drafts of the processes, to the technical developers responsible for implementing them, and even the business team that carries out and monitors such processes [11].

According to Aalst [12], creating models is a difficult and error-prone task. He cites some typical modeling errors, such as: the model describes only one version of reality; the model is unable to adequately capture human behavior; and the model is at the wrong level of abstraction.

The practice of process modeling emerged as a key instrument to allow decision making in the context of the analysis and design of information systems with process recognition [13].

Process mining aims to build a process model using an event log and a process discovery algorithm; such a technique has been applied for the discovery, modeling and improvement of processes [14].

Tiwari et al. [15] explain that among the main drivers of increased business process mining is the need for companies to learn more about how their processes operate.

The goal of process mining is to leverage event data to understand how an organization works. With process mining, it is possible to discover the sequence of tasks that are performed in a given business process as well as the interactions that occur between the participants in that process [16].

Business process mining can be used as a tool to discover how people drive processes in the real world. Dustdar, Hoffmann and Aalst [17] distinguish three different perspectives in the mining of business processes:

  • Process perspective: focuses on the ordering of activities (the process control flow). The goal is to find an acceptable representation of all possible paths within the process. These paths can be expressed in terms of a process model (for example, Petri net or event-oriented process chain);

  • Organizational perspective: focuses on originators within a process, that is, the people and roles that are involved and how they are related. This approach can be used to portray the roles and relationships between individuals in a process in terms of a social network;

  • Case perspective: takes into account the properties of the cases, that is, attributes that can differentiate one path through one process (case) from another.
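The organizational perspective can be illustrated with a minimal sketch that counts how often work is handed over from one participant to another within the same case. The events, task names and user names below are hypothetical and are not taken from the institute's system; the sketch also assumes that events are already ordered by time within each case.

```python
from collections import Counter

# Hypothetical events: (case id, task name, user name),
# assumed to be sorted by completion time within each case.
events = [
    (1, "Request purchase", "ana"),
    (1, "Approve budget", "bruno"),
    (1, "Issue order", "carla"),
    (2, "Request purchase", "ana"),
    (2, "Approve budget", "bruno"),
    (2, "Review request", "ana"),
]


def handover_of_work(events):
    """Count how often work passes from one participant to another in a case."""
    handovers = Counter()
    last_user = {}  # case id -> user who performed the previous task
    for case_id, _task, user in events:
        previous = last_user.get(case_id)
        if previous is not None and previous != user:
            handovers[(previous, user)] += 1
        last_user[case_id] = user
    return handovers


network = handover_of_work(events)
for (source, target), count in sorted(network.items()):
    print(f"{source} -> {target}: {count}")
```

Each counted pair can be read as a weighted edge of a social network between participants.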

Ferreira [16] explains that the event log can be the real log of an information system or it can be a log file created from historical data recorded in a database, for example. Whatever the source, the data in an event log must have a specific structure and must contain at least the following information:

  • a case id, which identifies the instance of the process;

  • a task name, which identifies the activity that was performed;

  • a user name, which identifies the participant who performed the task;

  • a timestamp, indicating the date and time the task was completed.
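The four fields listed above can be represented as a simple record type. The following is a minimal sketch, assuming hypothetical field names and example rows; it groups the events by case id and orders them by timestamp, which yields one trace (sequence of tasks) per process instance.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One row of an event log with the four minimum fields."""
    case_id: int         # identifies the process instance
    task: str            # activity that was performed
    user: str            # participant who performed the task
    timestamp: datetime  # date and time the task was completed


def build_traces(events):
    """Group events by case and return the ordered task names per case."""
    traces = defaultdict(list)
    for event in sorted(events, key=lambda e: (e.case_id, e.timestamp)):
        traces[event.case_id].append(event.task)
    return dict(traces)


# Hypothetical rows, deliberately out of order to show the sorting step.
log = [
    Event(1, "Approve request", "user_b", datetime(2016, 3, 2, 14, 30)),
    Event(1, "Open request", "user_a", datetime(2016, 3, 1, 9, 0)),
    Event(2, "Open request", "user_c", datetime(2016, 5, 10, 8, 15)),
]

traces = build_traces(log)
print(traces)
```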

Process mining starts from event data and uses process models in several ways. For example, process models can be discovered from event data, serve as reference models, or be used to highlight bottlenecks. Algorithms are needed to process the information contained in event logs. Process discovery algorithms transform the information in the event logs into process models [12].
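To give an idea of what such a transformation involves, the sketch below computes the directly-follows relation, a basic building block of many discovery algorithms. It is an illustration under stated assumptions, not the algorithm used in this study: the traces and task names are hypothetical, and each trace is assumed to be already ordered by timestamp.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often one task is immediately followed by another."""
    counts = Counter()
    for trace in traces:
        # Pair each task with its immediate successor in the same trace.
        for current, following in zip(trace, trace[1:]):
            counts[(current, following)] += 1
    return counts


# Hypothetical traces: one list of task names per process instance.
traces = [
    ["Request", "Quote", "Approve", "Pay"],
    ["Request", "Approve", "Pay"],
    ["Request", "Quote", "Approve", "Pay"],
]

dfg = directly_follows(traces)
for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

The resulting counts can be drawn as a directed graph in which each pair is an arc; infrequent arcs, such as "Request -> Approve" here, point to deviations from the most common path.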

3 Research Method

This research is classified as exploratory and qualitative, and was developed as a case study in a Federal Institute of Higher Education in Brazil. The process analyzed was the Acquisition of Goods and Services. It was considered a critical process because it is extensive and complex and involves several sectors of the organization. The process uses an information system, but it is not well structured and has a low level of standardization. This complexity is mainly due to the different types of products and services that are requested by the users of the organization.

The research consists of four stages, the first being a literature review, in order to theoretically support the research. The sources of consultation used were Emerald Insight, Scopus, Web of Science, Springer, Google and books, in addition to documentary sources, from which information was obtained from the organization.

The second stage started with data collection through process mining in the institute’s information system, in order to describe the goods and services acquisition process, its related activities and the actors involved. The data were extracted to create an event log and cover the flow and processing of the goods and services acquisition processes from 2014 to 2017. These data served as the basis for developing the “as is” model.

In the third stage, in order to complement the information obtained with data mining and provide more quality in the modeling of the process (as is), it was necessary to hold working meetings with the objective of obtaining more details of the activities and answering questions regarding the results of the mining.

In the fourth stage, the results were analyzed and the conclusions of the research were drawn.

4 Development and Results

For the modeling of the processes and activity flow of the Goods and Services Acquisition process, SQL language was used to extract data from the institute’s information system and to create the event log. The SQL code can be seen in Fig. 1.
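The kind of extraction described above can be sketched as follows. This is an illustration only and does not reproduce the query in Fig. 1: the table name `movement`, its columns and the sample rows are assumptions, and an in-memory SQLite database stands in for the institute's information system.

```python
import sqlite3

# In-memory database standing in for the information system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movement (
    process_id  INTEGER,
    activity    TEXT,
    sector      TEXT,
    user_name   TEXT,
    finished_at TEXT   -- ISO-like text, so string comparison follows time order
);
INSERT INTO movement VALUES
    (1, 'Approve budget', 'Finance',         'user_b', '2016-03-04 16:20'),
    (1, 'Open request',   'Requesting unit', 'user_a', '2016-03-01 09:00'),
    (2, 'Open request',   'Requesting unit', 'user_c', '2013-11-05 10:00');
""")

# Select the four minimum event log fields for the period 2014 to 2017,
# ordered by case and then by completion time.
EVENT_LOG_QUERY = """
SELECT process_id  AS case_id,
       activity    AS task,
       user_name,
       finished_at AS event_time
FROM movement
WHERE finished_at BETWEEN '2014-01-01' AND '2017-12-31 23:59'
ORDER BY process_id, finished_at
"""

event_log = conn.execute(EVENT_LOG_QUERY).fetchall()
for row in event_log:
    print(row)
```

The row dated 2013 is excluded by the period filter, so only the two events of the first process instance remain in the event log.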

Fig. 1 SQL code for extracting information about the process from the system

After executing the SQL code, the event log contained 138 instances of the goods and services acquisition process, and more than 3,900 procedures were found across these instances. To make the event log easier to read and present in this study, two instances of the process were selected. An example of the event log is shown in Table 1.

Table 1 Example of the result of an SQL system database query (instance 6022)

Each of these participants plays a role in this process by carrying out some of the activities identified. However, when performing these activities, these agents interact with each other in a way that is not fully recorded by the system. Figure 2 shows the flow between the agents in each process activity.

Fig. 2 Modeling process based on mining of instance 6022

The model was built from the mined data and represents the flow of one instance (id = 6022) of the process. It makes the flow between sectors (actors) understandable. However, it was not possible to identify the activities performed by the process participants. The event log of instance 6022 of the purchasing process (see the rows of Table 1) represents the process procedures between the sectors of the institute. However, the process instances vary and do not allow a visualization of the complete interaction between all sectors. This is evidenced by the analysis of other instances, such as instance 6142 (Table 2).

Table 2 Example of the result of an SQL system database query (instance 6142)

In order to understand the flow of the process instance described in Table 2, a model was elaborated, illustrated in Fig. 3. The modeling of the instance mentioned allowed for the identification of 19 activities necessary for its completion.

Fig. 3 Modeling process based on the mining of instance 6142

Process mining also revealed a lack of standardization in the process. This is evident when comparing Tables 1 and 2, or Figs. 2 and 3, all of which result from the mining.

Another major discrepancy was the difference in processing times between sectors, recorded in the ‘total hours’ column in Tables 1 and 2. In a meeting with the users of the process and the information system, the times of some procedures were questioned. Users reported that sometimes processes move between sectors only physically, leaving the information system out of date.

This happens because the information system is vulnerable to misuse by the employees of the institute. There are no strict rules for using the system and, therefore, the real processes do not correspond to the digital record.

As an example, the seventh row of Table 2 describes the procedure in the Accounting and Finance Department, where the total processing time was 2,352 h and 22 min. It was found that processes often arrive and are held in the sector until the certificate of receipt of the equipment is issued or until the contracted service is fully paid to the supplier. Some services, such as the contract for internet supply, take up to two years to be fully paid, because the contract has monthly payments and can be extended for another year after the first.
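A ‘total hours’ value of this kind can be derived from the timestamps at which a process enters and leaves a sector. The sketch below is illustrative: the two timestamps are hypothetical and were chosen only so that their difference matches the 2,352 h and 22 min reported above, which corresponds to 98 days and 22 min.

```python
from datetime import datetime


def total_hours(received_at, forwarded_at):
    """Return the time spent in a sector as (whole hours, remaining minutes)."""
    seconds = int((forwarded_at - received_at).total_seconds())
    hours, remainder = divmod(seconds, 3600)
    return hours, remainder // 60


# Hypothetical timestamps, not taken from Table 2.
received_at = datetime(2016, 1, 4, 8, 0)
forwarded_at = datetime(2016, 4, 11, 8, 22)

hours, minutes = total_hours(received_at, forwarded_at)
days = hours // 24  # whole days spent in the sector

print(f"{hours:,} h {minutes} min (about {days} days)")
```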

Although process mining helped to understand the current situation of the goods and services acquisition process, the data collected were insufficient to model its current functioning (as is). For this reason, it was necessary to hold meetings with the users to improve the quality of the process model.

In order to obtain a complete representation and a final ‘as is’ model of the process with all details, it was decided to bring together the users of the departments responsible for each activity of the process, resulting in a report with information on the author, flow activities and description activities. This more detailed analysis allowed the development of an ‘as is’ model of the process which is closer to reality, with 25 activities being identified in the process. With the current model (as is) defined, it was possible to move on to the next stages of application of BPM at the institute.

5 Final Considerations

This research presented the description and analysis of process modeling using data mining at a Federal Institute of Higher Education in Brazil.

The ‘as is’ modeling was developed using process mining techniques in the database of the information system of the organization, and through interviews and meetings with the process users.

Because the products and services requested in the institute’s system are very varied, the processes are not standardized, and the use of the information system is not managed in a disciplined way, data mining was not sufficient to identify the complete processes and the flow of activities in detail.

Despite this, data mining facilitated the definition of instance flows and made it possible to detect critical points in the process, such as steps with long durations. With data mining, the analysis time, the number of meetings and the modeling effort of the analysts and users were reduced.

In organizations that have a high number of processes, such as the case of the organization analyzed, data mining has been demonstrated to be a tool that facilitates and streamlines the process of identifying, modeling and mapping processes in the application of BPM.

The study also showed that the efficiency of data mining depends on how the organization uses its information system. Because the recorded handling of the purchasing process does not follow a logical sequence of activities, the event log alone was not sufficient to model this process, and interviews and meetings with users were required.

As this research presented the results of only one case, its conclusions cannot be generalized; doing so would require investigating other institutes and/or other types of organizations. It is also suggested that future research reanalyze the goods and services acquisition process at the organization studied after the implementation of the ‘to be’ model, and verify how the process is behaving.