
1 Introduction

Measuring the performance of business processes is a key operation within an organization to determine whether the expected objectives are being achieved. According to Kronz [7], process management must involve the collection and analysis of key performance indicators (KPIs), also known as process performance indicators (PPIs), and this, in turn, forms a basis for consistent and continuous optimization of business processes. PPIs are quantifiable metrics that allow us to evaluate the efficiency and effectiveness of business processes. Certain metrics can be applied to almost any process, such as cycle time, defined as the time it takes to process a case or process instance from start to end [3]. However, to evaluate the performance of a process in more detail and verify that the results are aligned with the organization’s strategic objectives, it is important to be able to define measures and PPIs specific to the process domain under analysis, for example, the percentage of orders returned during a quarter with respect to the orders placed in that period.

PPIs can be computed directly from event logs [12]. In most traditional scenarios, an event log should have at least (i) a case identifier to indicate in which case or process instance the event occurred, (ii) an identifier of the task to which the event refers, and (iii) a timestamp indicating when the event occurred [3]. The attribute used to assign an event to a case is called the case notion [2]. An event log usually contains information on the executed activities of several cases, where it is assumed that each event is related to exactly one activity and is traceable to exactly one case. This feature, known as single case notion [2], is a widely accepted limitation in process management, and in particular in process mining, because it is considered not to be a faithful representation of reality.
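To make the three minimal attributes concrete, the following Python sketch computes the cycle time of each case from a toy single case notion log. The log contents, the tuple layout and the function name `cycle_times` are illustrative assumptions and are not taken from any of the cited works.

```python
from collections import defaultdict
from datetime import datetime

# Toy single-case-notion log: (case identifier, activity, timestamp).
# All values are made up for illustration.
events = [
    ("o1", "Place Order", "2023-01-02T09:00:00"),
    ("o1", "Pack Items", "2023-01-03T10:00:00"),
    ("o1", "Deliver Package", "2023-01-05T09:00:00"),
    ("o2", "Place Order", "2023-01-02T12:00:00"),
    ("o2", "Deliver Package", "2023-01-03T12:00:00"),
]


def cycle_times(log):
    """Return, per case, the time elapsed between its first and last event."""
    stamps = defaultdict(list)
    for case_id, _activity, ts in log:
        stamps[case_id].append(datetime.fromisoformat(ts))
    return {case: max(ts) - min(ts) for case, ts in stamps.items()}


ct = cycle_times(events)
for case, duration in ct.items():
    print(case, duration)
```

Because every event belongs to exactly one case, grouping by the case identifier is enough to obtain the measure.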

Recent proposals such as [1, 2, 5] focus on the management and analysis of process execution data by assuming that multiple attributes can be considered as case notions (called object types), that these objects can coexist, and that an event can refer to any number of objects of different types. This object-centric behavior, which includes the multiple case notion, is considered closer to reality and is thus intended to improve process management. Figure 1 shows a simplified version of the Order-to-Delivery (O2D) process. Figure 1a depicts the process using a traditional BPMN model, where a single case notion, the order, is used. Figure 1b shows the process considering the multiple case notion, where several objects are involved (simplified version of the process described in [1]). In the latter, multiple relationships of different cardinalities (one-to-many and many-to-many) are identified between process activities/events (left) and object types (right), hereafter referred to as objects.

Fig. 1. (a) Order-to-Delivery (O2D) process with order as case notion. (b) Relationships between activities (left) and objects (right) of the O2D process, adapted from [1].

This change in the way processes are dealt with is also reflected in the way data is recorded, giving rise to standards such as the Object-Centric Event Log (OCEL) [5], which is capable of supporting the multiple case notion and gives importance to object information and relationships. Moreover, this change also affects the way in which data is analyzed. Proposals such as [2] focus on calculating precision and fitness to determine the quality of the process itself. In the specific area of performance measurement, only [9] centers on the definition of a set of predefined process-related time measures. However, as far as we know, there is no proposal that allows the definition of customized and domain-focused PPIs and measures for each process in an object-centric context.

Based on the example in Fig. 1a, where there is a single case notion (order), it would be possible to calculate the total processing time of an order using the registered execution times of the first and last activity in the process; or the average number of orders delivered during a period of time. However, if we analyze Fig. 1b there is no direct relationship between all objects and all process events, so in that context, the task of calculating the total processing time of an order or the orders delivered is not a trivial task. Furthermore, returning to Fig. 1a, since the order is the case notion of the process, in this scenario it would be complex to assume and manage the fact that, for example, a package could contain items from several orders. Therefore, we would not be able to define measures such as the average number of items from different orders that are delivered in the packages.

From the previous examples, we can deduce that in order to calculate measures in an object-centric context, it is necessary to perform intermediate operations that allow us, on the one hand, to trace information between different objects and, on the other hand, to relate elements associated with different instances of a process. These new features open up a wide range of possibilities for optimization and automation of process performance analysis, extracting performance data from events and objects recorded in object-centric event logs. This paper seeks to make a contribution to the modeling of PPIs, and in particular to the definition of customized performance measures by considering an object-centric approach. To this end, we identified a set of requirements to be taken into account in the definition of such measures and then integrated them into the existing PPI definition metamodel PPINOT [12]. To help guide our research, we formulated two research questions: (RQ1) What are the main characteristics to be considered for measuring process performance from an object-centric event log?, and (RQ2) How can performance measures and PPIs be defined in an object-centric context?

The remainder of this paper is organized as follows. Section 2 introduces some basic concepts and mentions relevant work in the area. Section 3 describes our approach to answering the research questions posed. Section 4 provides a discussion of the proposal. Finally, Sect. 5 concludes this paper.

2 Background and Related Work

Conventionally, event logs are generated following standards that structure the information so that the data can be transferred in a unified way for later extraction and analysis. The eXtensible Event Stream (XES) [6] is one of the most widely used standards for recording events associated with processes in a single case notion context. Therefore, to analyze a different process perspective, that is, another notion (another object), it would be necessary to generate a new dedicated event log [8]. The eXtensible Object-Centric (XOC) format [8] supports the multiple case notion and aims to deal with the relationships (one-to-many and many-to-many) between objects, avoiding the definition of a case notion. XOC logs have the disadvantages that they are usually very large and complex and have performance issues [1, 5]. Recently, the Object-Centric Event Log (OCEL) [5] standard was proposed with the objective of exchanging data with multiple case notions between different information systems and process mining tools. The OCEL format allows the log data to be analyzed from different perspectives (multiple case notion) and makes it possible to deal with two concepts derived from this scenario: convergence and divergence [1]. Convergence occurs when an event can be related to different cases. For example, for the process in Fig. 1a, if instead of using order as case notion we use another object, such as item, and two or more items are associated with an order, the Place Order activity for that order would appear in as many traces as there are items linked to that order. Divergence, on the other hand, occurs when a given case may contain multiple instances of the same activity. For example, for the process in Fig. 1a, divergence arises when, for a process instance (an order), the Pack Items activity can be executed several times and the items packed in the first instance of that activity can be associated with the first, the second or the last package associated with the order. These multiple relationships derived from the multiple object analysis are not considered in traditional event logs, so relevant process data is lost.
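The convergence effect can be reproduced with a small Python sketch that flattens a toy object-centric log onto a single case notion. The event layout (a dictionary mapping object types to identifiers) and the helper `flatten` are assumptions made for illustration only; they do not follow the OCEL serialization.

```python
# Toy object-centric events: each event lists the objects it refers to, by type.
# Identifiers are hypothetical.
events = [
    {"id": "e1", "activity": "Place Order",
     "objects": {"Order": ["o1"], "Item": ["i1", "i2"]}},
    {"id": "e2", "activity": "Pack Items",
     "objects": {"Item": ["i1"], "Package": ["p1"]}},
    {"id": "e3", "activity": "Pack Items",
     "objects": {"Item": ["i2"], "Package": ["p2"]}},
]


def flatten(log, object_type):
    """Build one trace per object of the chosen type (a single case notion)."""
    traces = {}
    for event in log:
        for obj in event["objects"].get(object_type, []):
            traces.setdefault(obj, []).append(event["activity"])
    return traces


by_item = flatten(events, "Item")
by_order = flatten(events, "Order")

# Convergence: the single Place Order event e1 is replicated once per item.
place_order_copies = sum(trace.count("Place Order") for trace in by_item.values())
print(by_item)
print(place_order_copies)
```

With Item as case notion, the single Place Order event is copied into both item traces, whereas with Order as case notion the packing events are not reachable at all in this toy log, since they do not refer to an order directly.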

The other area of interest related to our paper is performance measurement. Although there is no single way to define PPIs, they are usually described by a set of attributes. In this paper, we rely on the definitions of performance measures and PPIs proposed in [11] and formally described in the PPINOT metamodel presented in [10] (adapted from [12]), which provides a detailed structure for defining PPIs. This metamodel emphasizes the need to include as PPI attributes at least an identifier, a descriptive name, the process in which the PPI is defined, a set of goals indicating the relevance of the PPI, a measure definition that specifies how to calculate the PPI, a target value to be reached, a scope that defines the subset of instances to be considered when calculating the PPI value, and a set of human resources to be responsible, accountable, and informed about the PPI. A PPI is calculated through a measure, which in turn can be defined by other measures. Each measure is related to a condition that specifies when the measure value is obtained. A condition relates an activity, an event or a data object to their respective states. Three types of base measures are defined in [10], where a value is obtained for each measure: time (measures the time elapsed between two events), count (measures the number of times an event occurs), and data (measures the value of a certain part of a data object). More complex measures, called derived measures, can be defined by combining the previous measures and applying mathematical or logical operations to them. Finally, all those measures can be aggregated using aggregation functions such as sum, maximum, or average. Given the importance of the measure definition for the calculation of PPIs, in this paper we focus on how to define these measures.

PPIs can be calculated automatically from data recorded in event logs [12]. Currently, most process mining tools can handle the event logs in XES format [3]. However, as there is a change in the way logs are generated, from single case notion to object-centric approach, it is reasonable to foresee some changes in the way calculation and analysis of object-centric data is performed. Recent works have begun to propose adaptations of the focus of analysis. In [4], a new measure is included to focus on conditions based on data objects attributes, but only under the single case notion context. Related to the definition of performance measures in an object-centric approach, we have only identified one proposal focused on the calculation of predefined time measures, such as waiting times or cycle time [9]. However, we have not identified any proposal that addresses the definition of customized performance measures in an object-centric context. In this paper, we seek to fill this gap and propose first steps for defining and modeling customized performance measures in an object-centric context.

3 Performance Measurement in the Multiple Case Notion

Event logs capable of reflecting the multiple case notion (hereafter, object-centric logs) are formed by a set of events and objects. According to [5], an event represents an execution record of a business process and associates multiple elements (identifier, activity, timestamp and relevant objects) and optional characteristics such as event attributes. An object holds the information of an object instance in the business process. It must contain a type and may contain several attributes that describe it. Since the content and structure of the data recorded in object-centric logs differ from those of traditional logs, the way in which performance information is extracted and analyzed may also vary.
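The structure just described can be pictured with the following Python sketch. The class names `OCEvent` and `OCObject`, their fields and the sample values are hypothetical and only mirror the elements listed above; they are not the schema of the OCEL standard.

```python
from dataclasses import dataclass, field


@dataclass
class OCObject:
    """An object instance: it must have a type and may have attributes."""
    oid: str
    otype: str
    attributes: dict = field(default_factory=dict)


@dataclass
class OCEvent:
    """An execution record: identifier, activity, timestamp, related objects."""
    eid: str
    activity: str
    timestamp: str
    objects: list  # identifiers of the objects the event refers to
    attributes: dict = field(default_factory=dict)  # optional event attributes


# Made-up sample content.
objects = {
    "o1": OCObject("o1", "Order", {"department": "sales"}),
    "i1": OCObject("i1", "Item", {"category": "technology"}),
    "i2": OCObject("i2", "Item"),
}
events = [OCEvent("e1", "Place Order", "2023-01-02T09:00:00", ["o1", "i1", "i2"])]


def objects_of_type(event, otype):
    """Identifiers of the objects of a given type that an event refers to."""
    return [oid for oid in event.objects if objects[oid].otype == otype]


print(objects_of_type(events[0], "Item"))
```

Unlike a traditional log, no field plays the role of case identifier; any object type can be chosen later as the reference for a measure.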

To establish these differences, we relied on the PPINOT modeling characteristics for the definition of PPIs and measures to identify modeling requirements that support the object-centric approach (Sect. 3.1). In PPINOT, a BaseMeasure (bm) is calculated based on a case, in such a way that \(bm(case) = mvalue\). The measures described by PPINOT (Sect. 2) are based on conditions that make it possible to specify (i) the time when something happens, for example, the start or end of the execution of an activity, or the time when the state of a data object changes; or (ii) the value of a certain part of a data object (data property), for example, the name of the customer (property) that requests an order (data object). However, in an object-centric context, it is necessary to take into account not only the relationships among the events that occurred, but also those among objects and their characteristics. The new modeling requirements were integrated into the PPINOT metamodel, as shown in Fig. 2. They mainly affect the elements on which a measure can be calculated; in addition, a new measure was incorporated. These changes are described in detail in Sect. 3.2.

Fig. 2. Extension of the PPINOT metamodel integrating object-centric concepts.

3.1 Comparison of Measure Definitions

In this subsection, we describe how different types of measures would be defined in a traditional context and the drawbacks and/or necessary changes identified when defining similar measures in an object-centric context. For this purpose, we rely on the types of measures proposed in [10]. Hereafter, when we refer to a traditional event log, we mean a single case notion event log. In a traditional log, the events recorded are related to a single case or process instance.

Notion of Measures. In a traditional context, a BaseMeasure bm is related to a single case c, \(bm(c) = mvalue\), where mvalue is the value of the measure. In the object-centric context, since there is a multiple case notion, the definition of the measure must change to specify the reference notion (the object type) on which each measure will be calculated. Requirement 1: A measure must specify the object type used as a reference for its calculation.

Time Measures. A time measure measures the duration between two time instants. In a traditional log, a time instant is related to the execution of a process activity or event (without duration), the whole process, or the change in the value of an object. For the Order-to-Delivery process in Fig. 1a, a measure of the total time to process an order could be defined as: \(M_{time}\) = The duration between the time instant when Place Order changes to state active and the time instant when activity Deliver Package changes to state completed. The tracking between the two instants is relatively straightforward, since both activities, and all the activities executed in between, are related to the same case notion, an order. However, as shown in Fig. 1b, in the object-centric context not all objects are related to all activities, so there is no direct traceability between the different relationships. It would be relatively easy to calculate the instant at which an order is registered, since there is a relationship between Place Order and Order. Nevertheless, an order would be considered completed when all the packages associated with it are delivered. In this case, there is no direct relationship between Order and End Route, which means that the event log contains no End Route event that refers directly to any Order. Therefore, to calculate the measure, it would be necessary to perform a series of operations to trace the end of an activity (End Route) to another object (Order). Thus, Requirement 2: Identify traceability between objects.

If the reference object is order, the two time instants in our example should be calculated differently using a traceability function, but the reference object should be indicated for both of them (Requirement 1). Given that there is a direct relationship between the event Place Order and the object Order, the start instant could be defined as the traceability function \(startInstant = getAttribute(attribute, Event, Object)\), where attribute indicates the attribute in the log to be analyzed (timestamp); Event is the event from which the attribute is obtained (Place Order), and Object is the reference object (Order) related to the event. To obtain the end instant, we must know when the last package associated with each order was delivered. To do this, we must define a traceability function between the objects Order and Package to then obtain the maximum timestamp of the delivery times registered of all the packages associated with an order. The traceability function for end instant would be defined as:

\(itemsOrder = getObjects(Item, Order, Place Order)\)

\(packsItems = getObjects(Package, Item:itemsOrder, Pack Item)\)

\(packsDelivered = getObjects(Package, Package:packsItems, Deliver Package)\)

\(routesPacksDel = getObjects(Route, Package:packsDelivered, Deliver Package)\)

\(deliveries = getEvents(Deliver Package, Route:routesPacksDel)\)

\(endInstant = max(timestamp,deliveries)\)

In the above, we used getObjects(ObjectA, ObjectB, Event) to select objects, where ObjectA is the type of the objects returned and ObjectB is the object type to which ObjectA is related in the event Event. getObjects(ObjectA, ObjectB:listObjectsB, Event) is similar, but in this case the relationship of ObjectA is analyzed with a subset of elements of type ObjectB. getEvents(Event, Object) returns the subset of events of the type indicated in Event in which objects of type Object, or a given set of them, participate. Finally, the highest timestamp among the deliveries of all the packages in an order is assigned as the end instant.
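The traceability chain for the start and end instants can be sketched in Python as follows. This is one possible reading under stated assumptions, not a reference implementation: the log layout, the helper `ev`, the function names `get_attribute`, `get_objects` and `get_events`, and all identifiers and timestamps are hypothetical, and each step is restricted to the objects obtained in the previous step.

```python
from datetime import datetime


def ev(activity, ts, **objects):
    """Build a toy event; keyword arguments map an object type to identifiers."""
    return {"activity": activity, "timestamp": datetime.fromisoformat(ts),
            "objects": objects}


# Hypothetical log: package p2 holds items of two different orders.
log = [
    ev("Place Order", "2023-01-02T09:00:00", Order=["o1"], Item=["i1", "i2"]),
    ev("Place Order", "2023-01-03T08:00:00", Order=["o2"], Item=["i3"]),
    ev("Pack Item", "2023-01-03T10:00:00", Item=["i1"], Package=["p1"]),
    ev("Pack Item", "2023-01-03T11:00:00", Item=["i2", "i3"], Package=["p2"]),
    ev("Deliver Package", "2023-01-04T16:00:00", Package=["p1"], Route=["r1"]),
    ev("Deliver Package", "2023-01-06T15:00:00", Package=["p2"], Route=["r2"]),
]


def get_attribute(log, attribute, activity, object_type, object_id):
    """Values of `attribute` in events of `activity` that refer to the object."""
    return [e[attribute] for e in log
            if e["activity"] == activity
            and object_id in e["objects"].get(object_type, [])]


def get_objects(log, target_type, source_type, activity, source_ids=None):
    """Objects of target_type related to source_type objects in `activity` events."""
    found = set()
    for e in log:
        sources = set(e["objects"].get(source_type, []))
        if e["activity"] != activity or not sources:
            continue
        if source_ids is not None and not (sources & set(source_ids)):
            continue
        found |= set(e["objects"].get(target_type, []))
    # When source and target share a type, keep only the requested subset.
    if source_ids is not None and target_type == source_type:
        found &= set(source_ids)
    return found


def get_events(log, activity, object_type, object_ids):
    """Events of `activity` in which any of the given objects participates."""
    return [e for e in log
            if e["activity"] == activity
            and set(e["objects"].get(object_type, [])) & set(object_ids)]


def order_processing_time(log, order_id):
    start = min(get_attribute(log, "timestamp", "Place Order", "Order", order_id))
    items = get_objects(log, "Item", "Order", "Place Order", {order_id})
    packs = get_objects(log, "Package", "Item", "Pack Item", items)
    delivered = get_objects(log, "Package", "Package", "Deliver Package", packs)
    routes = get_objects(log, "Route", "Package", "Deliver Package", delivered)
    deliveries = get_events(log, "Deliver Package", "Route", routes)
    end = max(e["timestamp"] for e in deliveries)  # last delivery closes the order
    return end - start


print(order_processing_time(log, "o1"))
print(order_processing_time(log, "o2"))
```

In this toy log no delivery event refers to an order, so the end instant of each order can only be reached by walking through Item, Package and Route.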

Count Measures. A count measure measures the number of times something happens. This again refers to measuring something at a given instant, an event. For example, in the process of Fig. 1a, a count measure could be the number of packages delivered to fulfill an order, defined as: \(M_{count}\) = The number of times activity Deliver Package changes to state completed. As was the case with time measures, to calculate the number of packages of an order in an object-centric context, it is necessary to establish traceability between a reference object (Order) and another element, in this case an object (Package). In addition to basing a measure on the count of time instants, we could also be interested in counting elements associated with objects and their attributes, such as the number of items of type technology, where technology would be an attribute of the object Item. This is not possible with the original definition of a count measure. Therefore, we define Requirement 3: Extend the use of count measures to allow them to be applied, in addition to time instants, to objects and their attributes.
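A count over objects and their attributes, as called for by Requirement 3, could look like the following Python sketch. The function `count_objects`, the attribute name `category` (used here to hold the value technology) and the sample data are assumptions made for illustration.

```python
# Hypothetical object attributes and events.
objects = {
    "i1": {"type": "Item", "category": "technology"},
    "i2": {"type": "Item", "category": "books"},
    "i3": {"type": "Item", "category": "technology"},
}
events = [
    {"activity": "Place Order",
     "objects": {"Order": ["o1"], "Item": ["i1", "i2", "i3"]}},
    {"activity": "Place Order", "objects": {"Order": ["o2"], "Item": []}},
]


def count_objects(log, objects, reference_type, reference_id, counted_type,
                  activity, predicate=lambda attrs: True):
    """Count objects of counted_type related to the reference object in
    events of `activity`, optionally filtered by an attribute predicate."""
    ids = set()
    for e in log:
        if (e["activity"] == activity
                and reference_id in e["objects"].get(reference_type, [])):
            ids.update(e["objects"].get(counted_type, []))
    return sum(1 for oid in ids if predicate(objects[oid]))


all_items = count_objects(events, objects, "Order", "o1", "Item", "Place Order")
tech_items = count_objects(events, objects, "Order", "o1", "Item", "Place Order",
                           lambda attrs: attrs["category"] == "technology")
print(all_items, tech_items)
```

The reference object (Order) fixes what the count is computed for, while the predicate expresses the condition on the attribute of the counted object (Item).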

Data Measures. A data measure measures the value of a certain attribute included in the event log. This measure can make a selection based on a property of an object, for example, the invoice amount or the department in which the order is placed. In a traditional approach, it could be defined as:

\(M_{data} = \) The value of property department of an Order

In an object-centric context, the data measure should focus on object types and the attributes associated with them or with events, so it would be necessary to specify the element on which the data measure is applied. Therefore, Requirement 4: Specify the element on which a data measure is applied so that it can be calculated on object types and on attributes of objects or events.

Aggregated Measures. An aggregated measure uses an aggregation function (sum, minimum, maximum, average) to aggregate measures over several process instances, and the result can be grouped by the value of another measure. In a traditional context, and taking as a basis the count measure \(M_{count}\), an aggregated measure could be defined as:

\(M_{agg\_count}\) = The sum of the number of times Deliver Package changes to state completed and is grouped by delivery city of Order. In the above definition, the aggregation function is sum, the grouping property is delivery city that is an attribute of the object Order. As a result we would obtain, for example, {(150, London), (280, Madrid), ... }
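For the traditional context, this aggregated measure can be sketched in Python as shown below. The log rows, the case attribute table and the function names `m_count` and `m_agg_count` are illustrative assumptions; the figures produced are unrelated to the example values given above.

```python
from collections import Counter

# Hypothetical case attributes and (case, activity, state) log rows.
case_attributes = {
    "o1": {"delivery city": "London"},
    "o2": {"delivery city": "Madrid"},
    "o3": {"delivery city": "London"},
}
events = [
    ("o1", "Deliver Package", "completed"),
    ("o1", "Deliver Package", "completed"),
    ("o2", "Deliver Package", "completed"),
    ("o3", "Deliver Package", "active"),
    ("o3", "Deliver Package", "completed"),
    ("o3", "Place Order", "completed"),
]


def m_count(log, case_id, activity="Deliver Package", state="completed"):
    """Base count measure: times `activity` reaches `state` in one case."""
    return sum(1 for c, a, s in log if c == case_id and a == activity and s == state)


def m_agg_count(log, attrs, group_by="delivery city"):
    """Sum of the base count measure over all cases, grouped by a case attribute."""
    totals = Counter()
    for case_id, case_attrs in attrs.items():
        totals[case_attrs[group_by]] += m_count(log, case_id)
    return dict(totals)


result = m_agg_count(events, case_attributes)
print(result)
```

The aggregation itself does not look at events; it only combines values of the underlying count measure.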

For the aggregated measure no new requirements have been identified in the object-centric context. The measures used to calculate it, should follow the requirements that apply to them (e.g., Requirement 3 for the count measure).

Derived Measures. A derived measure defines a function over other measures. Following our example of the Order-to-Delivery process (Fig. 1a), a derived measure could be the percentage of paid orders. In a traditional context, with Order as the case notion, the measure would be defined as:

\(M_{der}\) = the function \(\frac{paid}{regs} * 100\), where paid is the measure defined as the sum of the number of times Receive Payment changes to state completed, regs is the measure defined as the sum of the number of times Place Order changes to state completed.
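As a worked instance of this function, the Python sketch below evaluates it on a made-up log with four registered orders, two of which are paid. The helper names `times_completed` and `percentage_paid` are hypothetical.

```python
# Hypothetical (case, activity, state) rows with Order as case notion.
events = [
    ("o1", "Place Order", "completed"),
    ("o1", "Receive Payment", "completed"),
    ("o2", "Place Order", "completed"),
    ("o3", "Place Order", "completed"),
    ("o3", "Receive Payment", "completed"),
    ("o4", "Place Order", "completed"),
]


def times_completed(log, activity):
    """Aggregated count: times `activity` changes to state completed."""
    return sum(1 for _case, a, s in log if a == activity and s == "completed")


def percentage_paid(log):
    """Derived measure paid / regs * 100; None when no order is registered."""
    regs = times_completed(log, "Place Order")
    paid = times_completed(log, "Receive Payment")
    return paid / regs * 100 if regs else None


pct = percentage_paid(events)
print(pct)
```

Here paid = 2 and regs = 4, so the derived measure evaluates to 50.0.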

As with aggregated measures, no new requirements have been identified for this measure in the object-centric context. A derived measure is defined over other measures by adding constraints (operations) and must support the definitions of those measures, taking into account the requirements defined for them.

Several Reference Objects. So far, in the object-centric context, we have discussed measures for which it is necessary to specify a reference object on which to pivot to perform the traceability that yields the desired information, for example, Order when we want to know the number of packages shipped for an order. However, to exploit the possibilities of an object-centric event log, it is also necessary to consider that a measure, for example an aggregated measure, may involve more than one object type as reference. For example, in a measure defined as the average time to deliver a package for each order, Package would be the reference object of a time measure, and an aggregation function would be applied taking Order as reference object. This leads to Requirement 5: A measure can involve several objects (case notions) as reference objects.

3.2 Extension of the PPINOT Metamodel for the Definition of PPIs

This subsection aims to show how the modeling requirements identified in the previous subsection can be integrated into the PPI metamodel.

The result of this integration is shown in Fig. 2 as a UML diagram. White classes represent original PPINOT concepts taken or adapted from [10]. Gray classes and associations with thicker border are new elements derived from the identified requirements. White classes with thicker border represent classes slightly modified to adapt to the requirements. The shaded texts represent new attributes included. The blue circles indicate the requirements (Rx) to which that element of the metamodel is related. To extend the PPINOT metamodel, in addition to the requirements identified in Sect. 3.1, we adopted as reference the definitions of the object-centric elements described in [1].

For Requirement 1, it was noted that, due to the multiple case notion in an object-centric context, it is necessary to indicate a reference object on which the measure is calculated. After analyzing all types of measures, we found that this only affects base measures, since aggregated and derived measures are not defined directly on data of executed process elements, but on other measures. Therefore, to cover this requirement, a referenceObject element was added and related to BaseMeasure (R1), so that, given a base measure bm, an object type ot, and an object \(o \in ot\), the base measure is calculated as \(bm(o) = mvalue\).

Requirement 2 points out the need to establish a relationship between different objects to trace them and determine time instants to make a measurement. This can be done through a sequence of operations, such as those defined during the analysis of the time measure described in Sect. 3.1: getAttribute, getObjects, getEvents. These operations can be seen together as a traceabilityFunction of a TimeMeasure in which the participating elements (source and destination) are specified (R2a).

Since in the original metamodel a TimeInstantCondition could only be applied to States of Activities and Objects of a process, this should be extended so that a condition can be applied to Activities, as before, but also to an ObjectType and an Attribute, both of events and objects (R2b). To provide more flexibility in the definition of the TimeMeasure, we included in it the attributes initialFrom and initialTo. They indicate whether the first or the last occurrence of the instance of an element should be taken into account for the calculation of the instants (R2c).

Requirement 3 again points out the need to extend the elements on which a measure can be applied to consider both object types and attributes of objects and events. The extension related to the count measure was addressed in (R2b) by associating the Condition to Element (R3), which in turn can be an ObjectType, Activity or Attribute. Thus, a CountMeasure \(M_{count}\) for the number of items in a registered order could be defined as \(M_{Count(Item)}\)(Place Order, completed), where the reference object is Item and the value for the measure calculation is taken when the Place Order activity is completed; another one, for the number of packages delivered, could be defined as \(M_{Count(Package)}\)(\(delivered_{Package}\)), where delivered is an attribute of Package.

With regard to Requirement 4, associated with the DataMeasure, slight changes were introduced. We specified in more detail the scope of application of the DataMeasure. In this case, to define a DataMeasure it is necessary to establish the DataContentSelection to indicate the part of the data to be obtained with the measure. This DataContentSelection is defined by means of an Attribute (R4), either an ObjectAttribute or an EventAttribute. For an ObjectAttribute, it is necessary to specify the ObjectType to which the attribute of interest belongs.

The last requirement, Requirement 5, is related to the possibility of involving several object types in the definition of a measure. To address this requirement, we included a new base measure called ObjectAggregatedMeasure. Its purpose is to be able to apply an aggregationFunction on a BaseMeasure, using a reference object different from that of the BaseMeasure. This is reflected through the aggregates association. The aggregationFunction is defined on a reference object (obj1), and the BaseMeasure is by definition also associated with an object (obj2) by means of the association referenceObject. The relationship between obj1 and obj2 is established through the traceabilityFunction. For example, the ObjectAggregatedMeasure would allow the definition of measures such as the sum of orders whose number of items is greater than 5, which can be expressed as \(M_{ObjectAggregated1} = SUM_{Order}(M_{Count(Item)}(Place\ Order, completed) > 5)\). In this example, SUM is the aggregationFunction applied on the Order object (obj1) and \(M_{Count(Item)}\) is the CountMeasure whose referenceObject is Item (obj2) and whose Condition is (Place Order, completed), that is, the Items associated with the Place Order event are taken into account. Since the measure implicitly specifies a condition representing an operation (greater than, >), a derived measure is being defined. If the mathematical operation were not necessary, this would not affect the relationship between the rest of the elements. The relationship between the Order object and Item is defined through the \(Order \sim Item\) traceabilityFunction, as described previously in this section.
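One way to read this example is sketched below in Python: a count measure with Item as reference object is evaluated per order and then aggregated over Order. The data, the function names `count_items` and `orders_with_more_items_than`, and the treatment of the comparison as an indicator that is summed are assumptions of this sketch rather than part of the metamodel.

```python
# Hypothetical Place Order events; item identifiers are generated for brevity.
events = [
    {"activity": "Place Order",
     "objects": {"Order": ["o1"], "Item": [f"i{n}" for n in range(1, 7)]}},   # 6 items
    {"activity": "Place Order",
     "objects": {"Order": ["o2"], "Item": ["i7", "i8"]}},                      # 2 items
    {"activity": "Place Order",
     "objects": {"Order": ["o3"], "Item": [f"i{n}" for n in range(9, 16)]}},  # 7 items
    {"activity": "Pack Item", "objects": {"Item": ["i1"], "Package": ["p1"]}},
]


def count_items(log, order_id):
    """Count measure on Item, traced to an order through Place Order events."""
    items = set()
    for e in log:
        if e["activity"] == "Place Order" and order_id in e["objects"].get("Order", []):
            items.update(e["objects"].get("Item", []))
    return len(items)


def orders_with_more_items_than(log, threshold=5):
    """Aggregate over Order: sum an indicator of count_items(order) > threshold."""
    orders = {o for e in log for o in e["objects"].get("Order", [])}
    return sum(1 for o in orders if count_items(log, o) > threshold)


big_orders = orders_with_more_items_than(events)
print(big_orders)
```

The Place Order events play the role of the \(Order \sim Item\) traceability in this toy log, since they are the only events that relate both object types.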

3.3 Modeling of Measures

Based on the extended metamodel presented in Fig. 2, the measures mentioned in Sect. 3.1 have been modeled. Due to space restrictions, they are not included in this paper, but are available online (see footnote 1).

4 Discussion

Throughout this paper, we have tried to answer the research questions posed in Sect. 1 regarding the definition of performance measures taking into account the characteristics derived from an object-centric context. We started by analyzing the way in which performance measures are defined in a traditional context, in order to identify the relevant characteristics that should be taken into account when defining measures in the object-centric context. To answer the question about the characteristics of performance measurement in an object-centric context (RQ1), we focused on how performance measures are defined in a traditional context using the PPINOT metamodel as a reference, and then tried to define the same measures in an object-centric context, if possible. As a result of this phase, we identified 5 requirements that vary between the traditional and the object-centric context. As mentioned in Sect. 3, some of them affect more than one performance measure definition. Among them we can highlight:

  • Take into account new attributes that must be considered in the measure definition. First, to specify the reference object on which the measure is defined (referenceObject), and second, to establish the relationship between process objects (traceabilityFunction) in case there is no direct relationship between them. For example, between Order and Package (Fig. 1b).

  • Extend and/or modify, as appropriate, the application scope of the measures, to take into account the elements that are part of an object-centric event log: ObjectTypes and Attributes of objects and events. This does not mean that the original condition criteria are eliminated, since it is still necessary to take into account the time instants where events occur, for example.

  • Allow the definition of a performance measure by reference to more than one object (object types).

The last point mentioned is closely related to the second question, on how performance measures can be defined in the object-centric context (RQ2). In general, all the measures proposed in the PPINOT metamodel for a traditional context can be defined in an object-centric context; however, adaptations must be made to include the requirements described in Sect. 3.1. In addition, since in an object-centric context it is useful to define measures taking into account several object types, it is necessary to introduce a new measure that takes into account the reference object on which any base measure is defined and that also allows aggregation functions to be applied to it taking another object as reference. Both this measure and the previous requirements were integrated into the PPINOT metamodel. In this work, we only address the modeling of performance measures from a conceptual point of view. However, one of the characteristics of the PPINOT metamodel is that it facilitates the operationalization of the automatic calculation of PPIs. Therefore, this analysis and initial proposal lay the foundations for the operationalization of customized performance measures in an object-centric context. Although in this work we rely on PPINOT, the results of the analysis could be used with other similar artifacts for the definition of process performance measures.

5 Conclusion and Future Work

In this paper, we present a novel approach for the definition of domain-specific performance measures in the object-centric context with the objective of optimizing the measurement of process performance. The analysis performed shows that the definition of measures in this context is not a trivial task since it is necessary to define and fulfill new requirements derived from the relationship of multiple objects and events. As future work, we propose to extend the set of test measures to identify new requirements (if any). Then, we propose to extend the formalization of the PPINOT metamodel so that performance measurement can be done automatically from an object-centric event log. We also plan to use it in a real context to evaluate its applicability.