Keywords

1 Introduction

A business process is an inter-related set of steps designed to transform inputs into outputs (goods or services). Understanding how a processes works (process analysis) is a key step in determining how the process can be improved (i.e. be changed so that it works somehow ‘better’). Process analysis then involves identifying performance metrics that allow point-in-time monitoring and tracking over time to assess how well a business process is meetings its proposed objectives. Such metrics may include various times associated with process execution, e.g. throughput time or idle time.

Shelf time is any idle-time period i.e. no activity is recorded on a case, where the duration of the idle-time exceeds some process-specific threshold and becomes somehow unacceptable (to process stakeholders). Clearly, periods of shelf time will (usually negatively) impact on individual case durations. While potentially significant at an individual case level, it is also important to be able to determine the prevalence and impact of shelf time across the entire corpus of process cases. Our shelf time analysis method focuses on identifying delays following the completion of activities and thus is useful in revealing activities that are responsible for causing process instance delays. We argue that the identification of such activities is important as they represent break points in a process instance beyond which it is not possible or practical to continue until some ‘blocking factor’ is resolved. We note that the existence of shelf time periods may be an indicator that a process is resource bound i.e. not enough capacity to deal with an accumulation of cases completing to the point where a shelf time period is observed, or that there is some external or un-recorded (in the event log) activity which is occurring, and on which the process depends. Consider, for instance, an insurance claims officer requesting a report from an independent medical examiner regarding the extent of the claimant’s injuries before determining a compensation offer. From the claims officer’s point of view, the (external) procedure at the medical examiner’s practice is opaque and the claims officer cannot proceed with the claim until the report is received. There is then shelf time, i.e. a break in the process, associated with the activity of requesting a report from the medical examiner.

An understanding of the root causes of shelf time periods provides insights useful to process stakeholders and analysts as input to process improvement/re-design. Questions that may be of interest in attempting to derive the root causes of shelf time periods include (i) are there activities frequently associated with shelf time periods? (ii) are cases with significant shelf time periods associated with particular resources? (iii) are interactions with particular third parties associated with shelf time periods?

In this paper we take a process-mining based approach to identifying and quantifying the effects of shelf time on case duration. The major contributions of this paper include (i) an approach for identifying and quantifying periods of shelf time in an event log triggered by an event activity, (ii) an analysis of a portfolio of claims of commercial CTP insurer to identify shelf time periods and triggering activities and (iii) a discussion of an extension of the approach to include identification of shelf time periods associated with other event attributes, e.g. the resource.

The remainder of this paper is organised as follows. In Sect. 2 we discuss some previous work related to idle-time analysis. In Sect. 3 we define the elements of our approach including events, event log, activities and resources and outline our algorithm for detecting shelf time periods (and associated triggering activities) in any case. In Sect. 4 we provide results from the application of our approach to real-life logs provided by a commercial CTP insurer and in Sect. 5 we reflect on the case study and provide some direction for future work in this area.

2 Related Work

Operations management is primarily concerned with efficiently controlling business processes in the production of goods or the delivery of services (the focus of the particular process). Its goal is the efficient use of resources in meeting customer requirements. In Operations Management, idle time is defined as (cycle time - processing time) where cycle time is the time between output of two flow units (outputs of the process, e.g. products or delivered services) and processing time is the actual time spent in each of the activities making up the process.

Idle time has long been of interest in process analysis in a variety of industries. In [2], the authors used simulation models to derive a set of variables useful in reducing doctor’s idle time in an outpatient setting. In [11] the authors use image processing-based methodology to automatically quantify the idle time of hydraulic excavators and in [5] the authors anlayse cycle time and idle time of draglines with a view to increasing efficient use of such capital intensive equipment.

Process mining, a branch of data science, aims at utilising historical, process-related information captured in so-called event logs to discover, monitor and improve processes [1]. Process mining is becoming more popular as evidenced by the growing number of case studies detailing successful application of analysis techniques [3, 4, 9, 10]. Process mining however, in common with other forms of data analysis, as is pointed out in [7, 8], is hampered by the overall data quality of the event log and the limited information frequently found in event logs, particularly those not generated by process-aware information systems. In [8] the authors refer to the common problem of not having exhaustive timestamp information recorded for events (i.e. having only a completed timestamp rather than scheduled, started and completed times).

In [6] the authors investigate the applicability of process mining approach to the semi-structured test processes of ASML (the leading manufacturer of wafer scanners in the world) with the aim of analysing idle time. Here the authors modify the original event log by applying an ‘inversion filter’ to the activities in the log such that revised activities represent the transition from one activity in a case to the next activity allowing the analysis of idle times instead of activity durations.

3 Formalisations

Definition 1

(Attribute, Event, Event Log). Let \(\mathscr {E}\) be the event universe, i.e.  the set of all possible event identifiers. Events may be characterised by various attributes, e.g. an event may belong to a particular case, have a timestamp, correspond to an activity, and can be executed by a particular person.

Let \( AN = \{a_1, a_2, ..., a_n\}\) be a set of all possible attribute names. For each attribute \(a_i \in AN \) (\(1 \le i \le n\)), \(\mathcal {D}_{a_i}\) is its domain, i.e. the set of all possible values for the attribute \(a_i\).

For any event \(e \in \mathscr {E}\) and an attribute name \(a \in AN \): \(\#_{a}(e) \in \mathcal {D}_a\) is the value of attribute named a for event e. If an event e does not have an attribute named a, then \(\#_{a}(e)\) = \(\perp \) (null value).

Let \(\mathcal {D}_{ id }\) be the set of event identifiers, \(\mathcal {D}_{ case }\) be the set of case identifiers, \(\mathcal {D}_{ act }\) be the set of activity names, and \(\mathcal {D}_{ time }\) be the set of possible timestamps, \(\mathcal {D}_{ res }\) be the set of resource identifiers. For each event \(e \in \mathscr {E}\), we define a number of standard attributes:

  • \(\#_{ id }(e) \in \mathcal {D}_{ id }\) is the event identifier of e;

  • \(\#_{ case }(e) \in \mathcal {D}_{ case }\) is the case identifier of e;

  • \(\#_{ act }(e) \in \mathcal {D}_{ act }\) is the activity name of e;

  • \(\#_{ start }(e) \in \mathcal {D}_{ time }\) is the starting time of e;

  • \(\#_{ complete }(e) \in \mathcal {D}_{ time }\) is the completion time of e; and

  • \(\#_{ res }(e) \in \mathcal {D}_{ res }\) is the resource who triggered the occurrence of e.

An event log \(\mathcal {L} \subseteq \mathscr {E}\) is a set of events. This definition of an event log allows the log to be viewed as a table, thus allowing the application of relational algebra to the log.

Definition 2

(Shelf Time Period). Let \(\mathcal {L} \subseteq \mathscr {E}\) be an event log, \( AN _\mathcal {L}\) be a set of attribute names found in \(\mathcal {L}\) and \(\mathcal {D}_{a}\) be the set of all possible values of \(a \in AN _\mathcal {L}\). Let \(\mathcal {D}_{ case }\) be the set of all case values in \(\mathcal {L}\) and \(\mathcal {D}_{ time }\) be the set of possible event timestamps in log \(\mathcal {L}\). Let \(\theta \) be the duration of a time window and \(\delta (t_1, t_2)\) give the difference between two times, \(t_1\) and \(t_2\), where \(t_1 \le t_2\).

A shelf time period is present in log \(\mathcal {L}\) if:

  • \(\exists e_i, e_j \in \mathcal {L} | \lnot \exists e_n \in \mathcal {L}, (\#_{id}(e_i) \ne \#_{id}(e_j) \ne \#_{id}(e_n)) \) \(\wedge \) \((\#_{case}(e_i) = \#_{case}(e_j) = \#_{case}(e_n)) \) \(\wedge \) \((\#_{complete}(e_n) > \#_{complete}(e_i))\) \(\wedge \) \((\#_{start}(e_n)\) \(< \#_{start}(e_j))\) \(\wedge \) \(\delta (\#_{complete}(e_i),\#_{start}(e_j)) > \theta \)

That is, a shelf time period is present in a log, if there exists events \(e_i\) and \(e_j\) such that there does not exist any other event \(e_n\) where \(e_i\), \(e_j\) and \(e_n\) are in the same case, and that \(e_n\) is never concurrent with either \(e_i\) or \(e_j\) and the time difference between the completion of \(e_i\) and the start of \(e_j\) exceeds some process-dependent value, \(\theta \). Note that if \(\theta \) is very small, then shelf time is the same as idle time.

Shelf time periods in the log may occur in a number of scenarios as shown in Fig. 1. In the illustration, the solid bars represent activities in a single case with (i) the length of the bar representing the duration of the activity, (ii) the horizontal alignment of the bars representing the relative timing of each activity in the case. This means that bars that align vertically on their left edges have a simultaneous start time, while bars that align vertically on their right edges have a simultaneous complete time. Shelf time periods may be bounded by the completion of a single event (marking the beginning of a shelf time period) and the start of a single event (marking the end of the shelf time period) as illustrated in scenario 1. Alternate scenarios allow for shelf time periods to begin with multiple events completing simultaneously or end with multiple events beginning simultaneously as shown in scenarios 2, 3 and 4.

Fig. 1.
figure 1

Shelf time scenarios

Definition 3

(Shelf Time Period Associated With a Given Activity). It is possible to filter shelf time periods to those that are triggered by the completion of a given activity.

Let \(\mathcal {L} \subseteq \mathscr {E}\) be an event log, \( AN _\mathcal {L}\) be a set of attribute names found in \(\mathcal {L}\) and \(\mathcal {D}_{a}\) be the set of all possible values of \(a \in AN _\mathcal {L}\). Let \(\mathcal {D}_{ case }\) be the set of all case values in \(\mathcal {L}\), \(\mathcal {D}_{ act }\) be the set of all activities in \(\mathcal {L}\) and \(\mathcal {D}_{ time }\) be the set of possible event timestamps in log \(\mathcal {L}\).

A shelf time period, triggered by a particular activity \({act}_x \in \mathcal {D}_{act}\) is present in log \(\mathcal {L}\) if:

  • \(\exists e_i, e_j \in \mathcal {L}, \#_{act}(e_i) = act_x | \lnot \exists e_n \in \mathcal {L}, (\#_{id}(e_i) \ne \#_{id}(e_j) \ne \#_{id}(e_n)) \) \(\wedge \) \((\#_{case}(e_i) = \#_{case}(e_j) = \#_{case}(e_n)) \) \(\wedge \) \((\#_{complete}(e_n) > \#_{complete}(e_i))\) \(\wedge \) \((\#_{start}(e_n)\) \(< \#_{start}(e_j))\) \(\wedge \) \(\delta (\#_{complete}(e_i),\#_{start}(e_j)) > \theta \)

Definition 4

(Shelf Time Period Associated With A Given Resource). It is possible to identify shelf time periods associated with a given resource. Here we consider that a resource may be assigned to a portfolio of concurrently active cases.

Let \(\mathcal {L} \subseteq \mathscr {E}\) be an event log, \( AN _\mathcal {L}\) be a set of attribute names found in \(\mathcal {L}\) and \(\mathcal {D}_{a}\) be the set of all possible values of \(a \in AN _\mathcal {L}\). Let \(\mathcal {D}_{ res }\) be the set of all resource identifiers in \(\mathcal {L}\) and \(\mathcal {D}_{ time }\) be the set of possible event start timestamps in log \(\mathcal {L}\).

A shelf time period, associated with a particular resource \({res}_i \in \mathcal {D}_{res}\) is present in log \(\mathcal {L}\) if:

  • \(\exists e_i, e_j \in \mathcal {L}, \#_{res}(e_i) = \#_{res}(e_j) = res_x | \lnot \exists e_n \in \mathcal {L}, (\#_{id}(e_i) \ne \#_{id}(e_j) \ne \#_{id}(e_n)) \) \(\wedge \) \((\#_{complete}(e_n) > \#_{complete}(e_i))\) \(\wedge \) \((\#_{start}(e_n)\) \(< \#_{start}(e_j))\) \(\wedge \) \(\delta (\#_{complete}(e_i),\#_{start}(e_j)) > \theta \)

3.1 Approach

Periods of shelf time associated with activities may be identified and the pervasiveness of shelf time in the event log may be determined using the following three step approach:

  1. 1.

    Populate a table, ST, containing events that are shelf time ‘triggers’, i.e. events that do not temporally overlap other events in the same case

  2. 2.

    For each event in ST, determine the temporally ‘next’ event in the case and determine the duration of the shelf time.

    1. (a)

      For each event \(e_i \in ST\), build a table of all events \(e_j\), in the same case, that start after \(e_i\) completes, i.e. \(\#_{complete}(e_i) < \#_{start}(e_j)\), and the calculate the difference \(\delta (\#_{complete}(e_i),\#_{start}(e_j))\)

      • \(BA \equiv \varPi _{ST.id, ST.case, ST.act, \mathcal {L}.id, \mathcal {L}.act, \delta (ST.complete, \mathcal {L}.start)}\) \(\quad \qquad (\sigma _{ST.case=\mathcal {L}.case \wedge \mathcal {L}.start > ST.complete}(ST \times \mathcal {L}))\)

      • \(\rho _{ST.id/startid, ST.case/case, ST.act/startact, \mathcal {L}.id/nextid, \mathcal {L}.act/nextact,}\) \(\quad \qquad _{\delta (ST.complete, \mathcal {L}.start)/ shelftime }(BA)\)

    2. (b)

      For each startid in BA, find the nextid with the minimum \( shelftime \), i.e. the temporally next event. Populate a table, \( ActivityShelf \), with only these events.

      • \( ActivityShelf \equiv \varPi _{case, startact, nextact, shelftime }(BA) -\) \(\quad \quad \quad \varPi _{x.case, x.startact, x.nextact, x. shelftime }\) \(\quad \qquad \quad (\sigma _{x.case=y.case \wedge x.startact = y.startact \wedge x. shelftime > y. shelftime }\) \(\qquad \qquad \quad (\rho _x(BA) \times \rho _y(BA)))\)

  3. 3.

    Aggregate \( ActivityShelf \) as required.

4 Case Study

The Compulsory Third Party (CTP) scheme operating in Queensland (Australia) provides motor vehicle owners and drivers an unlimited liability policy for personal injury caused through the use of the insured vehicle in incidents to which the governing legislation, the Motor Accident Insurance Act 1994 (the Act) applies. The Queensland CTP scheme is managed by the Motor Accident Insurance Commission (MAIC) and is underwritten by (currently four) licensed, commercial insurers. CTP premiums, collected as a component of vehicle registration, contribute to the respective insurers premium pool and are used to pay compensation to accident victims.

The Act lays out in detail the rights and obligations of the parties involved (claimant and insurer) in lodging and settling a compensation claim for injuries received as a result of a motor vehicle accident. The claimant must first notify the relevant insurer of their intention seek compensation (by lodging a standard Notification of Accident Claim form). The insurer will assess the claim to determine that it complies with the provisions of the Act. The insurer will then make determination as to whether it is liable for the claim, i.e. the insurer has accepted the application for insurance from the claimant. Following the liability decision, the claimant and the insurer will negotiate the agreed compensation (usually at a conference but negotiation may include litigation if the parties cannot come to agreement). Once agreed, formal settlement of the claim takes place and, after all monies are disbursed, the claim is finalised. A claim may exit the process at each of the Notification, Compliance and Liability phases. Reasons for exiting the claim process include the claim failing to comply with the provisions of the Act (through not containing all information relevant to the claim or not being submitted within prescribed timeframes) or the nominated insurer determining it is not liable for the claim (through the ‘at fault’ driver not holding a valid, current CTP insurance policy with the nominated insurer).

Once the insurer has accepted liability, the claim will progress to completion. Figure 2 shows the phased nature of the CTP claims management process.

Fig. 2.
figure 2

CTP claims management - value chain

For any of the scheme insurers, the injury-compensation claims process is complex involving negotiations between multiple parties (e.g. claimants, other insurers, law firms, health services providers, Centrelink, Workers Compensation, hospitals, police). While the Act prescribes maximum allowed periods for claims to reach certain milestones, CTP insurers nevertheless experience significant behavioural and performance variations in CTP claims processing affecting, in particular, claim durations. For instance, of the 2,535 settled claims in the dataset used for this study where the maximum injury severity was rated minimal, the duration from notification to settlement ranged from a minimum of 0 months to a maximum of 131 months (median duration = 19 months, mean duration = 21 months).

The CTP injury compensation claims process may be considered as a phased process marked by distinct reporting milestones. Each insurer is required to report to the MAIC when key milestone events (Notification, Compliance, Liability, Settlement, Finalisation) have occurred. A high-level process map is show in Fig. 3. The Act lays down maximum periods for determining whether the claim is compliant with the Act and the insurer(s) that is/are liable for the claim. (NB where more than one insurer is deemed liable for the claim, one insurer will be designated responsible for managing the claim.) Following the establishment of liability, the managing insurer will process the claim till finalisation. Following the liability stage, the progress of the claim is determined by factors such as (i) all parties agreeing that the injured person has reached a stage of maximum medical stability beyond which further recovery will not occur, (ii) the insurer and claimant agreeing a settlement offer, or (iii) mediation or litigation determining a settlement. In our case study, we considered 4,959 claims managed by one of the commercial CTP insurers comprising cases that were ‘open’ at some stage in the period 1-Jan-2012 to 17-Oct-2015 (3,446 ‘closed’ claims and 1,513 ‘open’ claims at time of data extract). The event log itself comprises 1,982,009 event records extracted from various components of the insurer’s claims management system including documents, notes, automated/system generated tasks, user initiated activities, records of changes to a claim’s compliance status and records of damages estimates generated at various points in a claims history. The event log also contained events representing CTP scheme milestone dates. Overall, there were 180 different activity codes.

Fig. 3.
figure 3

Reporting milestones in the CTP claims management process

An initial analysis of the event log revealed a pattern of ‘batch completion’ of assigned tasks by system users. That is, the insurer’s claims management system presents a user with a set of tasks ordered by due date. The user may select and mark as ‘completed’ one or more tasks at a time. Further, the insurer’s claims management system is a workflow management system which will generate tasks for users based on the current state of claims processing. More than one task may be generated at the same time. The batch completion and multi-task generation pose some problems in quantifying shelf time in any given claim.

Here we note that in the event log, the start time of an activity was the date/time on which the task was assigned to a user. The complete time for an activity was the date/time when the user marked the task as ‘completed’. NB. It was not possible to determine, from the data available, when the user first started working on the task. That is, it was not possible to determine the period between the date/time the task was assigned to the user and the date/time the user first started working on the task. Nor was it possible to determine whether the user worked continuously on the task or completed the task in installments.

Table 1. Shelf time instances and distribution by claim status. Periods of shelf time were deemed to be significant if they exceeded 340 h (approx 2 weeks).

Our initial analysis revealed that periods of shelf time were common in the claims under consideration. Table 1 shows the numbers of shelf time periods by claim status. It can be seen that across the entire corpus of claims, more than 60% of claims (3,057 of 4,959 claims) were affected by at least one significant shelf time period (\(\theta \) = 340 h). Figure 4 shows, for the 2,939 claims where total shelf time exceeded 680 h (1 month), the fraction of the claim comprising of shelf time.

Fig. 4.
figure 4

Shelf time as a fraction of claim duration

Table 2. Activities triggering shelf time periods

Table 2 shows the top 10 activities that most heavily impact on shelf time hours.

Perhaps unsurprisingly, the ‘General Follow Up Activity’ accounts for the largest block of shelf time in the log. Here the user would create this activity as reminder/alert to follow up (generally with some third party organisation such as medical services or Centrelink or Work Cover) a request for information. Other follow up type activities include ‘Rehab Follow Up’ and the ‘Uncoded’ activity code which is a collection of diary notes and general reminders to the user.

Table 3 shows where shelf time occurred in relation to the claim reporting milestone periods. It is apparent that most hours of shelf time occurred in the case-dependent phases of processing, i.e. post-Liability. (See Fig. 3.)

Table 3. Shelf time hours across claim reporting milestone phases

The case study findings showed that individual instances of shelf time (period greater than 2 weeks) occurred in 62% of all claims and that 59% of all claims experienced total shelf time of greater than 1 month. The technique was able to identify activities that triggered a period of shelf time and to quantify the total shelf time frequency and durations associated with each activity/trigger. The technique was also able to identify shelf time periods across the different phases (sub-processes) of the CTP insurance claim process which showed that, in general, the phase associated with most shelf time is the Post-Liability phase. We do note, however, some interesting observations including the large number of shelf time hours associated with the rm_ap_reviewassigndoccategory and CTP_90_011 activity codes in the Post-Finalisation phase. These will form the basis for further investigation in conjunction with the process stakeholder.

5 Conclusion

An understanding of shelf time (idle time that exceeds some process-dependent threshold) provide insights to process behaviour and acts as input to process improvement strategies. In this paper we have described a process mining-based technique suitable for identifying and quantifying shelf time in an event log. The technique takes an event log extracted from historical executions of a business process and requires each event in the log to have timestamps that can be used to represent the start and completion of the event. The essence of the shelf time identification technique is finding events which do not temporally ‘overlap’ other events in the same case in the log. The approach has been applied to a real-life event log extracted from a major, commercial, Queensland CTP insurer. Finally, the technique is robust enough to use event attributes other than the activity label that trigger shelf time periods. For instance, it would be possible to determine shelf time periods associated with the resource assigned to an event.