1 Introduction

Workflow is an integral part of healthcare delivery. In this context, workflow can be formally defined as “the sequence of steps involved in moving from the beginning to the end of a working process.” Building upon this definition, we can also define a working process as “a series of actions or operations conducing to an end.”

The ability to observe, instrument, and understand workflow provides critical information for a variety of applications, including but not limited to:

  • Enhancing the quality, safety, and outcomes of care delivery

  • Identifying opportunities to overcome barriers to technology adoption and adaptation in complex healthcare settings

  • Improving the efficiency and timeliness of clinical and translational research

The process of modeling and analyzing workflow is often executed through Time Motion Studies (TMS). TMS, alternatively referred to as “time-motion studies” or “time and motion studies”, are defined in the National Library of Medicine Medical Subject Headings (MeSH) system as “the observation and analysis of movements in a task with emphasis on the amount of time required to perform the task.” TMS methodologies originated as a business efficiency technique through the collective contributions of Frederick Taylor (Time Studies) (Taylor 1914) and Frank and Lillian Gilbreth (Motion Studies) (Baumgart and Neuhauser 2009).

The widespread use of TMS in the healthcare setting is a relatively recent development, and has proven to provide a valuable means for collecting quantitative workflow data in a broad spectrum of settings, ranging from evaluating the effectiveness of system implementations (Amusan et al. 2008) and assessment of costs (Schiller et al. 2008), to describing general workflow (Kloss et al. 2010) and utilization of time by clinicians (Kim et al. 2011). In clinical workflow studies, TMS gather quantitative workflow assessments specifically through continuous direct observation, which has been shown to be more accurate than work-sampling (Wirth et al. 1977) and self-reporting (Gordon et al. 2008; Ampt et al. 2007), and is increasingly being accepted as the “gold standard” for measuring and quantifying clinical workflow (Burke et al. 2000; Bratt et al. 1999). The general “design pattern” for the conduct of TMS is illustrated in Fig. 4.1 below:

Fig. 4.1

Overview of prototypical workflow study design pattern. In this pattern, the process begins with the identification of key characteristics that serve to define a workflow of interest. Such characteristics are then used to create workflow annotation (or codification) standards that enable the collection of constituent data during various observation types. Subsequently, workflow observations are conducted, and the data generated therein are codified per the preceding annotation standards. Such observations usually include temporal data concerning instances and durations of workflow related activities. In some studies, observations are iterative, or involve multiple observers, necessitating the assessment of inter-observer or inter-observation agreement. Finally, the results of the preceding steps are analyzed and reported on, often employing descriptive statistics, and key findings are “fed back” to inform future workflow studies or optimization efforts

2 Key Concepts and Definitions Surrounding Time Motion Study Methodologies

As a tool for obtaining quantitative assessments of clinical workflow, TMS have been adapted and used in the healthcare setting since the early twentieth century. Without a unifying standard, however, the definition and scope of TMS have shifted significantly. Although we agree with the definition provided by the Agency for Healthcare Research and Quality, “an observation method used to determine the timing and duration of tasks or procedures”, a recent review concluded that the term “TMS” had been used to describe “a broad spectrum of dissimilar methods whose only common factor is the capture and/or analysis of the duration of one or more events” (Lopetegui et al. 2014). In the literature, many studies are reported as TMS but instead used methods such as self-reports or the analysis of automatically generated timestamps. Moreover, among the studies that would properly be considered TMS, there is significant variability in how they are implemented and how their findings are reported, making aggregation of results difficult. Therefore, there is a need for researchers to properly categorize and rigorously define their methodologies. In a recent review (Lopetegui et al. 2014), we described four major classes of methods used in the literature currently classified as TMS, namely:

  1. Methods that produce time-motion data by external observers (external observation)

  2. Methods that produce time-motion data by the participants being studied (self-observation)

  3. Methods that produce time-motion data automatically by computerized systems (automated observation)

  4. Methods that lead to the creation of models and frameworks that can be used to support and/or enable the interpretation of data and findings generated during the course of TMS (model formulation)

Below, we provide a description of each of these methods and exemplary studies that have utilized them:

2.1 External Observation

In this type of study, dedicated external observers perform the task of collecting time-motion data. Data collection can be done asynchronously by having the observer analyze video recordings of the study participant’s behavior in the work environment, an approach also called “time-action analysis” (Minekus et al. 2013; van Oldenrijk et al. 2008). More often, it is conducted by having the observer directly shadow and observe the participant in real time.

Studies involving external observers mainly use two data collection methods: continuous observation and work sampling. In continuous observation, the external observer maintains attention on the study participant and continuously records the time taken to perform one or multiple tasks, meaning that the act of recording is triggered by an action performed by the participant. It is a useful approach for collecting data on non-centralized tasks, is well suited to capturing short tasks, and provides granular and detailed field data. However, this method is resource intensive, and it introduces opportunities for bias, as participants may feel disturbed by being watched. Sometimes, participants may also demonstrate improved performance when being observed: a phenomenon known as the Hawthorne effect.
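To make the structure of continuously observed data concrete, the following is a minimal sketch (in Python, with hypothetical task names and timestamps) of how an observer’s coded records might be represented and summed into durations; it illustrates the general idea rather than the data model of any specific TMS tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class TaskRecord:
    """One continuously observed task instance (illustrative structure)."""
    task: str        # coded task name per the study's annotation standard
    start: datetime  # moment the observer marked the task as starting
    end: datetime    # moment the observer marked the task as ending

    @property
    def duration(self) -> timedelta:
        return self.end - self.start

# Hypothetical observer log for one participant.
log = [
    TaskRecord("documentation", datetime(2020, 1, 1, 8, 0, 0), datetime(2020, 1, 1, 8, 4, 30)),
    TaskRecord("direct patient care", datetime(2020, 1, 1, 8, 4, 30), datetime(2020, 1, 1, 8, 12, 0)),
]

total = sum((r.duration for r in log), timedelta())
print(f"Observed time: {total}")  # -> Observed time: 0:12:00
```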

Unlike continuous observation, which measures the elapsed time for a task, work sampling identifies the task being performed at a given instant (Hakes and Whittington 2011), repeating the measure at predefined fixed or random intervals during the observation. It is premised on the repetitive nature of work, and assumes the probabilistic generalization of the sampling findings to describe how workers spend their time overall. Compared to continuous observation, a major benefit of work sampling is that the observer can work with multiple study participants during a single observation period. Further, work sampling has been reported as an efficient approach for studies designed to classify work activities into fewer categories. With more categories describing less frequent tasks, the required number of observations may increase substantially (Burke et al. 2000), thus losing the advantage afforded by this method. Strictly speaking, work sampling estimates the proportion of time spent on an activity based on observations conducted at random time points (Barnett 2008).
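The probabilistic reasoning behind work sampling can be illustrated with a small simulation. The sketch below (Python; the activity labels, shift length, and sample size are all hypothetical) estimates the proportion of time spent on one activity from randomly timed observations and attaches a simple binomial standard error to the estimate.

```python
import random

random.seed(42)

# Hypothetical "ground truth" timeline: each minute of an 8-hour shift is
# labeled with the activity actually being performed.
minutes = 8 * 60
timeline = ["documentation" if m % 10 < 3 else "other" for m in range(minutes)]
true_share = timeline.count("documentation") / minutes  # 0.30 by construction

# Work sampling: observe the activity at n randomly chosen instants.
n = 100
samples = [timeline[random.randrange(minutes)] for _ in range(n)]
p_hat = samples.count("documentation") / n

# Binomial standard error of the estimated proportion.
se = (p_hat * (1 - p_hat) / n) ** 0.5
print(f"true={true_share:.2f}  estimate={p_hat:.2f}  half-width(95%)={1.96 * se:.2f}")
```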

The temporality of the sampling methodology has been debated in the literature, with the conclusion that systematic work sampling often results in flawed and biased estimates and that random work sampling is a better approach (Oddone and Simel 1994), especially when assessing tasks that are performed periodically. However, one of the pioneering researchers of TMS argues that the reduction in bias provided by randomization is outweighed by the complexities of scheduling the observations, advocating in favor of fixed periodic intervals (Finkler et al. 1993). We observed this issue in our recent review: all work sampling studies involving external observers used a systematic fixed time interval, e.g., 1 min (Murden and Pintz 2003), 5 min (Deshpande et al. 2012), and so forth. One study used a much higher sampling frequency of every 15 s, which the authors referred to as the “Davis observation code” (Yawn et al. 2003). Under optimal circumstances, work sampling has been proposed as a useful and efficient methodology for analyzing how workers distribute their time across types of activities (Pelletier and Duffield 2003). This method, however, falls short for questions related to task durations, task occurrences, or workflow sequences. A highly cited paper concludes that work sampling may not provide an acceptably precise approximation of the results that could be obtained by continuous-observation time motion studies (Burke et al. 2000).
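The debate over systematic versus random sampling intervals can likewise be illustrated in code. The following sketch (Python, with a contrived periodic task and arbitrary parameters) shows how a fixed sampling interval that happens to align with a task’s cycle can grossly bias the estimated time share, whereas random sampling of the same size does not.

```python
import random

random.seed(0)

# Hypothetical periodic workflow: a 2-minute task recurs at the start of
# every 10-minute cycle (true share of time = 0.20).
minutes = 8 * 60
timeline = ["periodic task" if m % 10 < 2 else "other" for m in range(minutes)]

def share(sample_minutes):
    obs = [timeline[m] for m in sample_minutes]
    return obs.count("periodic task") / len(obs)

# Systematic sampling every 10 minutes aligns with the task's cycle,
# so the task is seen at every observation point.
systematic = range(0, minutes, 10)
# Random sampling with the same number of observations avoids the aliasing.
random_points = random.sample(range(minutes), len(systematic))

print("true share: 0.20")
print(f"systematic estimate: {share(systematic):.2f}")    # 1.00 (badly biased)
print(f"random estimate:     {share(random_points):.2f}") # ~0.20
```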

2.2 Self-Report

In this group of studies, time-related data are generated by the study participants themselves. Although self-report can be a low-cost means of measuring work activities, perceptual differences among the participants who self-report their data can lead to discrepancies in how activities are categorized (Keohane et al. 2008). Also, participants may either lie about what they are doing or change their normal routine in order to generate data that they believe to be more favorable (Burke et al. 2000). This shortcoming has been demonstrated outside TMS when comparing self-reported data and observational data in studies of dentists providing preventive services: self-reported frequencies consistently exceeded observed frequencies (Demko et al. 2008).

Self-reports are also considered unreliable because they tend to over-estimate clinicians’ contact time with patients and under-estimate their non-productive time, compared to work sampling using an external observer (Bratt et al. 1999). Anecdotally, one study comparing the number of duty-hour violations among residents found no difference between self-reports and computer-recorded timestamps (Todd et al. 2011); however, instead of reporting the agreement between the two sources of data, the authors compared whether a threshold of work hours was exceeded rather than the specific durations. This reinforces the need to account for inherent human biases in the design and selection of outcomes when using self-reports as the main source of research data.

Data collection methods used by studies in this group can first be classified as synchronous or asynchronous. Commonly used approaches on the asynchronous side of the spectrum include interviews, focus groups, and surveys. These methods directly solicit information from study participants regarding the time it takes them to perform different tasks and/or different steps of a process. Asynchronous self-report methods are considered limited due to their reliance on participants’ subjective account of their workflow and working conditions (Hauschild et al. 2011). It has been widely acknowledged that clinicians are poor estimators of measures commonly found in TMS, such as task durations. For example, in a study of physician recall of event durations in the operating room, self-reported survey responses over-estimated the durations by 30 min on average, with errors ranging from a few minutes up to 2 h, when compared to durations extracted from the surgery log (McCall et al. 2006).

Commonly used approaches on the synchronous side of the spectrum are active tracking and self-reported work sampling. In active tracking, study participants are asked to log time-motion data based on their work activities, either immediately after completing a task or at a later time (e.g., by the end of the work day). In contrast, self-reported work sampling involves repeated recording of work activities at pre-determined or random time points by study participants. As previously discussed, random work sampling is more commonly used (Yee et al. 2012), often facilitated by electronic devices that prompt participants at random intervals to record data. In a study that compared self-reported work sampling and traditional/external work sampling for measuring nursing tasks (Ampt et al. 2007), the self-reported method was found to be an unreliable means for obtaining an accurate reflection of the work tasks conducted by ward-based nurses. Also, nurses preferred the presence of an external observer, as recording activities while conducting clinical duties can be burdensome (Keohane et al. 2008). Despite these limitations, self-reported work sampling is easier to conduct and is more scalable at relatively low cost. Indeed, one of the largest TMS to date used the self-reported work sampling method to study nursing work across 36 hospitals (Hendrich et al. 2008).
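As a small illustration of the device-facilitated prompting just described, the sketch below (Python; the shift boundaries and number of prompts are arbitrary assumptions) generates a set of random reminder times across a shift, which is the kind of scheduling logic a self-reported work sampling tool might rely on.

```python
import random
from datetime import datetime, timedelta

random.seed(7)  # fixed seed so the schedule is reproducible for this example

# Hypothetical shift: 8 hours starting at 07:00, sampled with 20 random prompts.
shift_start = datetime(2020, 1, 1, 7, 0)
shift_minutes = 8 * 60
n_prompts = 20

# Draw distinct random minutes within the shift and convert them to clock times.
prompt_offsets = sorted(random.sample(range(shift_minutes), n_prompts))
prompt_times = [shift_start + timedelta(minutes=m) for m in prompt_offsets]

for t in prompt_times:
    print(t.strftime("%H:%M"))  # times at which the participant would be prompted
```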

2.3 Automated Observation

In this group of studies, timestamps and durations of tasks are captured automatically by sensors or computerized systems. Usually, the physical movement of study participants, or their interaction with clinical IT systems, triggers the recording of time-motion data, providing a rich “motion” dimension and precise “time” measurements. It is important to note that studies in this category do not include those that use computerized tools for external observers (e.g., a tablet PC with TMS research data collection software). Instead, in these studies, time-motion data are recorded automatically without the presence of an external observer and without any active involvement of study participants.

Automated time-motion data streams may come from a broad range of sources, including indoor or global positioning systems, accelerometers, electrodes, radio frequency identification (RFID), and clinical IT systems. From study participants’ perspective, this method provides a passive and non-intrusive means for capturing time-motion data while they perform their usual clinical tasks. Examples include location-tracking devices (e.g., RFID tags) that record events when the participant approaches sensors, time-stamped logs of interaction events within an electronic health record (EHR) system, and sensor movements on a laparoscopic surgery training module.

With the availability of such continuous event logs, researchers have better tools to determine the structure underlying the sequence of events, that is, a flowchart-like process model. Markov models and Hidden Markov Models have been commonly used to model workflows in the healthcare setting, including trauma resuscitations (Mache et al. 2008) and patient trajectories (Mache et al. 2010a); process mining techniques have also been employed to discover process models from event logs, check the conformance/deviation of particular event logs, and suggest changes to the process to enhance workflow (Mache et al. 2009).
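As a simplified illustration of this kind of modeling, the sketch below (Python; the activity labels and traces are invented, not drawn from the cited studies) estimates first-order Markov transition probabilities from a small event log, which is the basic building block behind the workflow models described above.

```python
from collections import Counter, defaultdict

# Hypothetical event log: one coded activity sequence per observed case.
event_log = [
    ["triage", "assessment", "order entry", "documentation"],
    ["triage", "assessment", "documentation"],
    ["triage", "order entry", "assessment", "documentation"],
]

# Count observed transitions between consecutive activities.
transitions = defaultdict(Counter)
for trace in event_log:
    for current, nxt in zip(trace, trace[1:]):
        transitions[current][nxt] += 1

# Normalize counts into first-order Markov transition probabilities.
for current, counts in transitions.items():
    total = sum(counts.values())
    probs = {nxt: round(c / total, 2) for nxt, c in counts.items()}
    print(current, "->", probs)
```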

Although timestamps recorded by motion sensors have been demonstrated to be a reliable source of data (Marjamaa et al. 2006), time-stamped logs from software usage need to be interpreted carefully. If the variable of interest is the duration of interactions with the software system (e.g., charting time), the log may constitute an accurate measure. However, if the variable of interest needs to be deduced from the computer-recorded timestamp as a proxy (e.g., how long it takes for a patient to transfer to another unit), the measure can become problematic. For example, a TMS conducted in an emergency department compared continuous observation results to timestamps extracted from the EHR, concluding that on average the EHR-based events were recorded 2 min before they actually took place (median; interquartile range, 31 min before to 3 min) (Gordon et al. 2008).
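The kind of timestamp-validity check described above can be expressed compactly. The following sketch (Python; the paired timestamps are entirely hypothetical and not the values reported in the cited study) computes the offsets between EHR-logged and directly observed event times and summarizes them with a median and interquartile range.

```python
from datetime import datetime
from statistics import median, quantiles

# Hypothetical paired timestamps for the same events: when the event was
# recorded in the EHR vs. when the observer saw it actually happen.
pairs = [
    (datetime(2020, 1, 1, 9, 5), datetime(2020, 1, 1, 9, 8)),    # EHR 3 min early
    (datetime(2020, 1, 1, 9, 40), datetime(2020, 1, 1, 9, 41)),  # EHR 1 min early
    (datetime(2020, 1, 1, 10, 2), datetime(2020, 1, 1, 10, 0)),  # EHR 2 min late
    (datetime(2020, 1, 1, 11, 15), datetime(2020, 1, 1, 11, 25)),# EHR 10 min early
]

# Offset in minutes; negative means the EHR timestamp precedes the observed event.
offsets = [(ehr - observed).total_seconds() / 60 for ehr, observed in pairs]
q1, q2, q3 = quantiles(offsets, n=4)
print(f"median offset: {median(offsets):.1f} min, IQR: {q1:.1f} to {q3:.1f} min")
```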

2.4 Model Formulation

In this final class of studies, the primary emphasis is not on conducting empirical investigation using TMS, but rather on the creation of conceptual frameworks or equivalent constructs that can support and enable the interpretation of the results of TMS. These include efforts to create models that define the major characteristics that can be measured or understood through TMS, such as actors, activities, and environmental features pertinent to a given workflow (Sittig and Singh 2010). Such efforts can also include studies that focus on the creation of taxonomies and nomenclatures, as well as quantitative metrics, that serve to assist in the aggregation and interpretation of multiple, complementary TMS (Yen et al. 2016; Lopetegui et al. 2013).

3 TMS Data Capture Tools

Since the early 2000s, several research teams have worked on building electronic data capture tools to facilitate the conduct of TMS. Among them, the most relevant contributions include:

  • Marc Overhage, Lisa Pizziferri, and Yi Zhang. Considered pioneers of TMS in studying clinical workflow, Overhage and his colleagues introduced the Palm Pocket Digital Assistant program in 2001 (Overhage et al. 2001). This tool incorporated a multi-level classification of clinical activities with which observers could label visible physical activities (e.g., talking on the phone) and then group them into conceptual categories (e.g., direct patient care). Pizziferri et al. further adapted Overhage et al.’s categorization schema by adding new tasks and categories, and created a Microsoft Access-based application that could be deployed on touchscreen tablet computers (Pizziferri et al. 2005). They also introduced the concept of a “primary task” to accommodate multitasking. Later, Zhang et al. adapted Pizziferri et al.’s tool by including a nursing activities taxonomy and requiring certain additional attributes to be captured, such as location, whom the activity served, position while performing the task (standing/sitting/walking), admission or discharge, and the clinical purpose of the activity (Zhang et al. 2011). They also extended the tool by adding the capability to record communication multitasking (when a clinician performs a clinical task while simultaneously communicating with others). Finally, they manually mapped the task list to the Omaha System, a comprehensive standardized practice and documentation taxonomy designed to describe client care in combined terms [problem + category + target + care description].

  • Philip Asaro. Asaro developed a Palm-based application for conducting TMS in an emergency department (ED) in 2003. His tool also included a categorization schema for tasks and allowed the simultaneous recording of two activities with independent timing. He also published a novel synchronized data capture method in 2004 to study patient flow (Asaro 2004), wherein multiple data collectors observed different providers using synchronized timestamps, allowing the reconstruction of the tasks/events of ED care for individual patients. Then, in 2008, he used the tool to evaluate the impact of a computerized prescriber order entry (CPOE) system on nursing documentation workflow (Asaro and Boxerman 2008).

  • Johanna Westbrook. In 2007, Westbrook and her colleagues developed a Pocket PC application that captured ten broad work task categories, the additional participants involved in a task, and the tools/equipment used to perform it. It also allows external observers to record concurrent tasks independently, and it incorporates a novel interruption module to record broken/resumed tasks as well as the ability to fix input errors. Westbrook et al. also pioneered the assessment of inter-observer reliability using agreement on the overall percentage of time spent in tasks (a minimal illustration of this style of agreement measure follows this list). Their method was named WOMBAT (Work Observation Method By Activity Timing) and has since been used in several studies (Ballermann et al. 2011; Westbrook et al. 2007, 2008, 2010; Westbrook and Woods 2009).

  • Stephanie Mache. In 2008, Mache et al. developed and evaluated a Pocket PC-based “computer-based medical work assessment program” (Mache et al. 2008). They generated a list of tasks that physicians commonly perform across different settings, and their application allows for the recording of primary and secondary tasks during multitasking events, as well as interruptions. In addition, they developed a new inter-observer reliability assessment method based on the timing and naming of tasks. Through the creation and piloting of new taxonomies for specific scenarios, this tool has been used repeatedly in German workflow studies of surgeons (Mache et al. 2010a), junior OB/GYNs (Kloss et al. 2010), junior gastroenterology physicians (Mache et al. 2009), pediatricians (Mache et al. 2010b), oncology residents (Mache et al. 2011), anesthesiologists (Hauschild et al. 2011), and emergency physicians (Mache et al. 2012).

  • Philip Payne. In 2012, Payne et al. introduced the Time Capture Tool (TimeCaT) (Lopetegui et al. 2012): a comprehensive, flexible, and user-centered web application designed to support data capture for TMS. This tool aimed for widespread adoption by a collaborative network of TMS researchers willing to contribute to the further development and standardization of formulations regarding multitasking, inter-observer reliability assessment, and taxonomy selection. The end goal of the project was to create standardized TMS methods and thus the ability to produce comparable results that can be readily aggregated to facilitate knowledge discovery. Ongoing efforts of this project include the development and validation of an inter-observer reliability scoring algorithm, the creation of an online clinical task ontology, and a quantitative workflow comparison method.
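To make the percentage-time style of inter-observer agreement mentioned above concrete, here is a minimal sketch (Python; the minute-by-minute codes are hypothetical, and the published WOMBAT and TimeCaT approaches are more sophisticated, aligning observations by timestamp rather than fixed bins).

```python
# Two observers' minute-by-minute codings of the same 10-minute observation
# period (hypothetical data, using short task codes).
observer_a = ["doc", "doc", "care", "care", "care", "phone", "doc", "doc", "care", "care"]
observer_b = ["doc", "care", "care", "care", "care", "phone", "doc", "doc", "doc", "care"]

# Proportion of time units on which the two observers coded the same task.
agreement = sum(a == b for a, b in zip(observer_a, observer_b)) / len(observer_a)
print(f"percent-time agreement: {agreement:.0%}")  # -> 80%
```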

Some of these tools are described in more depth in Chap. 12: Computer Tools for Recording Clinical Workflow Data.

4 Seminal Time Motion Studies in Healthcare

Building upon the concepts and definitions presented earlier in this chapter, in the following section, we summarize a set of seminal papers reporting significant TMS-based studies conducted in healthcare. As shown in Table 4.1, each of these papers is described in terms of the driving problem being investigated, the methods used, as well as intended outcomes or optimization objectives.

Table 4.1 Summary of seminal papers describing the use of time-motion studies in the health and life science domains, indicating the driving problem being investigated, the methods used, as well as intended outcomes or optimization objectives of those studies

5 Limitations and Future Directions

Nearly a century after the introduction of TMS to the healthcare arena, there is genuine interest in aggregating results from TMS to generate knowledge regarding healthcare workflow, efficiency, patient safety, and quality. There is also a growing interest in using aggregated TMS results to support decision making on the acquisition and implementation of health information technology (IT). Regrettably, existing attempts to aggregate results conclude that study comparison is very difficult due to the considerable variation in the design, conduct, and reporting of such studies (Zheng et al. 2011). Efforts to summarize findings across TMS are further challenged by the heterogeneity of activity categorizations and a lack of methodological standardization (Tipping et al. 2011).

First steps toward standardizing TMS include the work of Zheng et al. who, after analyzing a subset of 24 “time and motion studies” specifically assessing health IT implementations, proposed a checklist aimed at standardizing the reporting of such studies’ methods and results (Zheng et al. 2011). Methodological standardization has also been proposed by Patel et al., who introduced a framework for evaluating clinical cognitive activities in complex real-world environments that guides the characterization of activity patterns (Kannampallil et al. 2016). Although these efforts are important initial steps toward standardizing TMS, they do not address the persistent lack of common understanding concerning the definition of what is or is not a “time motion study”. Ultimately, a crucial step toward the standardization and validation of time motion studies in the healthcare domain involves establishing a common understanding of TMS, accompanied by a proper identification of the distinct techniques it encompasses and the aspects of the field that remain open and active areas of investigation. This chapter represents an initial attempt toward that goal.

Based on the current state-of-the-art practice of the design and execution of TMS, we believe that there are a number of future directions for the field that will serve to enhance or extend the scope and impact of the TMS methodologies. These directions include but are not limited to:

  • Leveraging sensor data to expand the scope/nature of TMS, so that automated observation methods can incorporate higher volumes of “streaming” data collected from a variety of instrumented artifacts in a given environment. Such use of sensor data could include the tracking of activities performed by individual clinicians, the utilization of technology-based tools, and the manipulation of physical environments. Leveraging such data will require the development of new TMS methodologies capable of dealing with data sources that exhibit variable volume, velocity, and variability (i.e., “big data”).

  • Creating continuous learning environments based on feedback from workflow studies, wherein the timeframe within which findings from TMS are provided back to the individuals being observed is shortened in order to support real-time or near-real-time decision making and workflow redesign. This could be made possible by using sensors to enable automated data collection, as well as by improving the computational and data analytics capabilities that support/enable automated interpretation, summarization, and visualization of such TMS data (e.g., disintermediating the analysis and reporting stages of TMS that adhere to the prototypical design pattern shown in Fig. 4.1).

  • Finally, if we are successful in leveraging sensor technologies and creating continuous learning environments, we will be able to deliver workflow-aware information at the point of care (e.g., contextual, just-in-time information). Such a paradigm shift would fulfill the primary promise of clinical informatics, which is to deliver the right information to the right person in the right format. Given the influence of clinical workflow on human cognition and decision making, an increasingly fine-grained understanding of such factors, afforded by TMS and novel data and analytics techniques, provides a basis for achieving this goal.

6 Conclusions

The original use of the term Time Motion Studies, which combines Taylor’s work focusing on “time” and the Gilbreths’ work on “motion” (Gilbreth 1914), refers to a method for improving efficiency and establishing employee productivity standards. In TMS, a task is broken into steps, and the sequence of movements or actions performed by study participants to accomplish those steps is observed to detect motion and to precisely measure the time taken for each movement or action. The extant literature on TMS includes a broad spectrum of distinct methodologies, including surveys, patient chart reviews, work sampling, and continuous observation. A commonality across these studies is the use of data generated via TMS to improve clinical workflow, with the ultimate objective of improving outcomes such as resource utilization, efficiency, safety, and patient health. As we look forward and envision the future of this stream of TMS-based research, our assessment of the current state of practice suggests the following improvement opportunities:

  • Enhancing and extending the methods for evaluating processes and outcomes associated with workflow studies;

  • Translating the results of workflow studies into data-driven interventions that could be delivered at the point of care and beyond; and

  • Improving the adoption and optimal use of technology in complex healthcare environments based on a better understanding of workflow-related inhibiting or enabling factors.

However, achieving these goals requires us to address several important gaps in knowledge and practice, such as:

  • Ensuring that the adoption and use of TMS methods become more widespread, and demonstrating their benefits in a variety of empirical settings and practitioner communities;

  • Creating a sustainable body of scholarly and applied work surrounding both methodological innovations and applied science relevant to TMS; and

  • Perhaps most importantly, ensuring that we use consistent language and nomenclature to describe all of these endeavors, such that a robust, applicable body of knowledge and best practices is created and maintained.