1 Introduction

Teams are a way of life in organizations. The military, the aviation and space industry, healthcare, corporations, and educational institutions all rely on teams today more than ever. Effective teamwork creates knowledge, minimizes errors, promotes innovation, saves lives, enhances productivity, increases job satisfaction, and ensures success. Teams, when deployed, trained, and led correctly, can be powerful. But ensuring that teams perform, learn, develop, and mature is not easy. In fact, it is complex and difficult. A key component to help with this is performance measurement: tools that measure teamwork. Thus we need to create these tools to accurately determine the strengths and weaknesses of the team. This is not an easy goal. We need valid, reliable, theory-driven practices that account for the dynamic nature of teams (Brannick & Prince, 1997; Langan-Fox, Wirth, Code, Langfield-Smith, & Wirth, 2001). This is a tall order, but much progress has been made, and this volume is a testament to that progress.

This chapter contributes to the volume by presenting a few insights and a picture of the research and practice on measuring teamwork over time. We will first provide some definitions to set the stage. We will next present some critical observations about measuring team performance. These observations are based primarily on the 30 years of experience of the first author at observing, measuring, and assessing team performance in various domains. We also rely on the literature to support these observations. Lastly, we will discuss some needs for developing future team-based measurement approaches.

2 Some Definitions

A team consists of two or more people who have defined roles and depend on each other to accomplish a shared goal (Salas, Dickinson, Converse, & Tannenbaum, 1992). In order to understand how teams work and subsequently perform, we have to understand how much the team knows, what skills they possess, and the overall attitude that they bring to the table; we refer to these elements as team competencies (Rosen et al., 2008).

The nature of teams is inherently complex, because individual workers are nested in teams, which are nested in organizations (Cannon-Bowers & Salas, 1997; Cannon-Bowers, Tannenbaum, Salas, & Volpe, 1995). With teams adding this dynamic layer of complexity, it is critical to slice apart and analyze what characteristics are embedded in the team, as well as the various factors (e.g., individual, team, and organizational factors) that contribute to team performance (Marks, Mathieu, & Zaccaro, 2001). The first step to understand team performance is to identify what characteristics the team possesses starting out. Examples of these inputs are individual motivation, attitudes, and personality traits (Driskell, Salas, & Hughes, 2010). Team-level inputs include power distribution, cohesion, and team resources (Marks et al., 2001). However, inputs are not limited to these characteristics. The type of task and how complicated it is also play a role. Next, we have to identify the processes, or the actions that occur when the team is working together to complete a task (LePine, Piccolo, Jackson, Mathieu, & Saul, 2008; Marks et al., 2001). Thus, it is apparent that teams are riddled with complexity, even at their nascent stages.

Though assessing team performance is challenging, we do it because team performance is linked to team effectiveness. Salas, Stagl, Burke, and Goodwin (2007) defined team effectiveness as the result of a judgment process whereby an output is compared to a subjective or objective standard. Essentially, the results of the team’s inputs and processes are evaluated. Therefore, to ensure accuracy, we must match the outcome with the correct methods of measurement (Rosen, Wildman, Salas, & Rayne, 2012). The team yields outcomes at the team, individual, and organizational levels. Team-level outcomes, such as coordination and communication, require the effort of all team members. Individual-level outcomes include a team member’s attitude toward the team, which is related to team performance. Organizational-level outcomes are the resulting products of the task and how the team impacts the overall organization. Before we move on, it is important to remember that individual changes in attitude, motivation, mental models, and task knowledge, skills, and attitudes (KSAs) can impact future team processes and performance outcomes, because individuals make up a team (Cannon-Bowers et al., 1995; Tannenbaum, Beard, & Salas, 1992). Taking all these factors into consideration, in order for us to improve performance assessment, we must adopt a multilevel approach (individual, team, and organizational) to understand all the elements contributing to the way team members work together and what they produce based on their actions. With all of these issues in mind, we will now present our observations (in no particular order).

3 Observations

3.1 Observation 1: We Know a Lot

Team performance measurement is not yet a perfect science. However, we have learned a great deal over the past 30 years, and we have amassed a robust body of literature on this area of measurement in an effort to address issues that researchers and practitioners face (Brannick & Prince, 1997; Cooke, Kiekel, & Helm, 2001; Kozlowski & Bell, 2003; Rosen et al., 2012; Wildman et al., 2012). Rosen and colleagues (2013) elucidated key components of team performance and provided helpful guidelines for assessment in the context of performance in healthcare settings. Kendall and Salas (2004) addressed methodological concerns by investigating reliability and validity issues impacting team performance metrics. Taking a finer lens to team processes, He, von Davier, Greiff, Steinhauer, and Borysewicz (2015) have made significant progress towards the development of assessments (e.g., the Programme for International Student Assessment [PISA]) that capitalize on current technology to capture team collaborative problem-solving abilities. Due to recent research efforts, the ability to objectively capture real-time performance is also on the horizon (Stevens, Galloway, Lamb, Steed, & Lamb, 2017). To summarize, we know a good deal about why, how, when, and what to measure, but gaps remain. We will say more on this later; for a more in-depth glimpse into team performance measurement advances, refer to Table 2.1.

Table 2.1 Sample of team performance measurement literature in the past 30 years

3.2 Observation 2: Context and Purpose of Measurement Matter

There is no “silver bullet” when creating a team performance measurement tool. We need to think about the context when creating all aspects of a measurement: who, how, and what is being used to conduct the evaluation. Team size, complexity of the task, physical environment of the task, task interdependence, and the amount of communication and interaction required to complete the task should also be considered (Salas, Burke, & Fowlkes, 2005).

The purpose of the performance measurement (i.e., team feedback) should determine what will be collected, and what needs to be collected should determine what kinds of resources are being used for the measurement (Meister, 1985). When choosing a team performance measurement, it is important to remember that all measures need adjustments and modifications in order to have a suitable quality for the required purpose (Salas et al., 2015). Targeting the idiosyncrasies within the team will give you a better idea of what modifications need to be made.

3.3 Observation 3: It Is Best to Triangulate

When it comes to measuring teamwork, it is nearly impossible to collect all of the necessary data from just one source. As noted by Dickinson and McIntyre (1997), “it surely takes a group or team of observers to obtain the necessary information to measure all instances of teamwork” (p. 37). There are a number of ways in which data can be collected. One can use self-report, peer assessments, observations, and objective outcomes. Using different types of data collection is optimal for getting the most data. It is best to use a combination of both qualitative and quantitative data. Subjective ratings are subject to bias; however, there are ways to reduce this bias. For example, observer ratings need to involve interrater reliability to make sure that the variable is being rated accurately from the beginning to the end (Rosen et al., 2012). We can do this by randomly selecting sessions for more than one rater to code and then comparing their ratings (Shrout & Fleiss, 1979). Also, different raters can focus on different areas based on their expertise. For example, supervisors can be used for summative assessments, while peers or subordinates can rate for ongoing or developmental evaluations.
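The double-coding check described above (randomly selecting sessions for more than one rater and then comparing their ratings) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the choice of ICC(2,1), which is one of the intraclass correlation forms defined by Shrout and Fleiss (1979), the 20% sampling fraction, the function names, and the ratings matrix are all assumptions made for the example.

```python
import random


def pick_double_coded(session_ids, fraction=0.2, seed=0):
    """Randomly choose the sessions that a second rater will also code.

    The fraction and the fixed seed are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    count = max(1, round(len(session_ids) * fraction))
    return sorted(rng.sample(list(session_ids), count))


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per session, one score per rater.
    """
    n = len(ratings)       # number of rated sessions
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                # between-sessions mean square
    jms = ss_cols / (k - 1)                # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative values only: 6 double-coded sessions (rows), 4 raters (columns).
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

sessions = [f"S{i:02d}" for i in range(1, 31)]
print(pick_double_coded(sessions))
print(f"ICC(2,1) = {icc_2_1(example_ratings):.2f}")
```

A low value, as in this example, would suggest that raters differ systematically in how they use the scale, which is the kind of drift that rater training and recalibration are meant to catch.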

Since teamwork is performed by individuals, it is also important to measure team performance at the individual level. We can achieve a more accurate evaluation of team performance when it is measured at multiple levels. Analysis at the individual level can pinpoint the members who effectively demonstrate teamwork skills (e.g., leadership, coordination, communication). Also, measuring both processes and outcomes can extend the amount of information you can learn about the team’s performance. Looking at processes can give you diagnostic information that addresses issues of development and can serve as a guide for feedback. Outcome measures, on the other hand, can provide you with “bottom line” performance. Making sure that you have a triangulation approach to collecting data can help ensure validity and address the limitations of the approaches when they are used alone. You do not want any potentially useful data to go unnoticed!

3.4 Observation 4: Team Size Matters

Teams come in all shapes and sizes. When it comes to performance, the size of the team can actually make a difference (Dyer, 1984; Sundstrom, De Meuse, & Futrell, 1990). Hackman (1987) suggested staffing teams with the fewest people necessary to perform the task. As more members are added to a group, both cohesion (McGrath, 1984) and group performance (Nieva, Fleishman, & Reick, 1978) tend to decline. The size of a team can be determined by the task at hand or the type of team (e.g., human-computer, distributed teams).

Larger teams run into issues of less flexibility and more differences within the team. More people mean more individual differences. These challenges also carry into the way the team’s performance is measured. Team performance measurement for large teams should include contingency planning, implicit coordination during task execution (i.e., shared mental models), information management, developed understanding of subteams, and an assessment of intra- and interteam cooperation. When conducting observations in complex team settings, raters should not observe more than two team members. This helps to avoid overlooking interactions (Dickinson & McIntyre, 1997).

3.5 Observation 5: Subject Matter Experts Can Assess Only Four or Five Constructs

Experts cannot assess or distinguish more than five team-based constructs. Measuring a construct requires subject matter experts (SMEs), individuals who have a strong understanding of the task setting and who must make judgments about different team-based constructs. There is a tendency for observers and practitioners alike to measure all they can measure, sometimes 12 to 14 constructs! Again, raters cannot distinguish these constructs; they all correlate in the end. Our experience is that raters should be trained to focus on only four or five constructs to avoid redundancy (Smith-Jentsch, Zeisig, Acton, & McPherson, 1998). When more than five related constructs are examined, the dimensions start to overlap and become more correlated with each other, making practical distinctions among teams almost impossible. In this case, less is better. Therefore, it is wise to select team-based constructs carefully and use only those that matter for team performance.
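One simple way to check whether rated constructs have begun to overlap is to correlate the construct scores across teams and flag pairs that move together. The Python sketch below assumes one score per team per construct; the construct names, the scores, and the 0.8 cutoff are hypothetical choices for illustration and are not values recommended in the literature cited here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def redundant_pairs(ratings, threshold=0.8):
    """Return construct pairs whose scores correlate at or above the threshold.

    ratings: dict mapping construct name -> list of scores, one per team.
    The threshold is an illustrative assumption.
    """
    flagged = []
    for a, b in combinations(sorted(ratings), 2):
        r = pearson(ratings[a], ratings[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 2)))
    return flagged


# Hypothetical SME ratings for five teams on three constructs.
team_ratings = {
    "communication": [1, 2, 3, 4, 5],
    "coordination": [2, 4, 6, 8, 10],
    "leadership": [3, 1, 4, 1, 5],
}

for first, second, r in redundant_pairs(team_ratings):
    print(f"{first} and {second} overlap (r = {r})")
```

Pairs flagged this way are candidates for merging or dropping, which is in keeping with the advice to retain only four or five constructs that raters can actually tell apart.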

3.6 Observation 6: It Is Best to Capture the ABCs—Attitudes, Behaviors, and Cognitions

It is best to capture representative attitudes, behaviors, and cognitions of teamwork. Teamwork has all of these elements. Noting Observation 5 above, it is best to choose one or two relevant ABCs to capture. Fortunately, an extensive body of research exists surrounding essential ABCs that promote effectiveness. This provides a clear outline of what measurement should capture. Recently, team orientation has been identified as a core attitudinal component of high performing teams (Salas, Sims, & Burke, 2005). Effective teams also promote a wide variety of behaviors such as communication, coordination, and cooperation, to name a few (Campion, Medsker, & Higgs, 1993; Kozlowski & Bell, 2003). For a more in-depth look at team behaviors, refer to Rousseau, Aubé, and Savoie (2006).

Regarding team cognition, shared mental models play an important role in ensuring that team members are on the same page. Successful development of shared mental models helps aggregate the knowledge of each member on the team to create a common understanding of what, how, and when the team needs to accomplish a goal or task (Mathieu, Heffner, Goodwin, Salas, & Cannon-Bowers, 2000). For further discussion of team cognition and its component parts, refer to DeChurch and Mesmer-Magnus (2010). Taken as a whole, capturing ABCs is critical for determining how to best measure a team and maximize performance outcomes.

Capturing attitude is commonly used for measuring team performance because it is easy and does not rely on many resources. Measuring attitude is as simple as having team members individually answer a set of items, using a Likert scale to express their feelings in regard to particular statements. Recently, we have also seen examples of attaining information signals by capturing facial expressions, gestures, posture, and periods of silence (Anders, Heinzel, Ethofer, & Haynes, 2011; Schippers, Roebroeck, Renken, Nanetti, & Keysers, 2010; Shockley, Santana, & Fowler, 2003; Stevens et al., 2017). We need to measure attitudes because they are associated with team performance (Hackman, 1990; Peterson, Mitchell, Thompson, & Burr, 2000). In regard to behaviors, these can easily be captured through observation. We will elaborate more on what behaviors need to be observed in Observation 7. As for team cognition (knowledge), it still remains a challenge to find a promising method to measure this construct, but it is important to measure because it affects performance (Liu, Hao, von Davier, Kyllonen, & Zapata-Rivera, 2015). In a methodological review, Cooke, Salas, Cannon-Bowers, and Stout (2000) explained that we need to go beyond typical assessments to understand the structure of team knowledge. Different aspects of measurement for this construct include elicitation method (e.g., self-report, eye tracking, communication analysis), team metric, and aggregation method. Nonetheless, there is a lot more to be done in regard to measuring cognition (e.g., Wildman et al., 2012).
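To make the Likert-based attitude measure and the idea of an aggregation method concrete, the Python sketch below averages each member's item responses and then summarizes the team by its mean and by the spread of member scores. The item responses, member labels, and the use of the mean and population standard deviation as the aggregation rule are assumptions for illustration; other aggregation rules may suit a given construct better.

```python
from statistics import mean, pstdev


def member_score(item_responses):
    """Average one member's Likert responses (e.g., 1 to 5) into a scale score."""
    return mean(item_responses)


def team_summary(team_responses):
    """Aggregate member scale scores to the team level.

    team_responses: dict mapping member label -> list of item responses.
    Returns the team mean and the spread of member scores; a small spread
    suggests that members hold similar attitudes.
    """
    scores = [member_score(items) for items in team_responses.values()]
    return {"team_mean": mean(scores), "team_spread": pstdev(scores)}


# Hypothetical responses from three members to a four-item scale.
responses = {
    "member_1": [4, 4, 4, 4],
    "member_2": [2, 2, 2, 2],
    "member_3": [3, 3, 3, 3],
}

summary = team_summary(responses)
print(summary)
```

Reporting the spread alongside the mean is a design choice: two teams with the same average can differ sharply in how much their members agree, and that difference is lost if only the mean is kept.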

3.7 Observation 7: Behavioral Markers Matter

Behavioral markers are paramount in performance measurement (Flin & Martin, 2001). Accurately capturing observable behaviors within a team is critical to assessing a team’s attributes. These markers should be studied in the context of the environment in which they are being applied. However, mapping constructs to the environment is only a part of the battle. Behavioral markers must be specific and the constructs of interest need to be clearly defined. We have already touched upon various widely used measurement tools that rely on observable team behavior in Observation 6, but a more granular lens must be used to establish what behaviors are of interest. To accurately execute this, time should be taken to methodically carry out the subsequent steps. First, we must establish the behaviors of interest. Next, we must systematically map constructs onto the behaviors. Additionally, we must clearly define the identified constructs. Finally, we must contextualize the behavioral markers by assessing them in the actual performance environment.

3.8 Observation 8: It’s All About the Constructs, Not the Method!

A primary issue surrounding constructs is the heightened emphasis placed on the method at the expense of unique traits present in the team. It is important to remember that all teams are not equal! Teams possess both explicit (e.g., observable behaviors such as verbal communication) and implicit qualities (e.g., unobservable processes such as shared mental models; Entin & Serfaty, 1999; Rosen et al., 2012). Due to the developmental nature of teams, certain phenomena (e.g., implicit qualities) emerge in teamwork that can be difficult to capture. Research has attempted to overcome this challenge by placing primary emphasis on the tools used to assess teamwork, but this can sacrifice important aspects of teamwork that influence performance. Most available tools are limited to assessing observable behaviors, but some of the team’s most important interactions are implicit and therefore difficult to capture. To illustrate this, in an operating room a patient goes into cardiac arrest; a nurse immediately hands the surgeon necessary tools while the anesthesiologist monitors the patient’s current condition and the surgeon attempts to stabilize the patient. This is a good example of a scenario in which implicit coordination is key to the success of the surgical team. Many of these actions need to take place in a matter of seconds; the actions are highly interdependent and do not require explicit communication. As you can imagine, it would be difficult to measure how aligned the team’s shared mental model was or how this impacted their ability to coordinate in a highly stressful situation.

Another challenge that centers on the constructs involved in measurement is the statistical method used in analysis. Though accurate and appropriate statistical analysis is critical to team assessment, it does not sufficiently capture performance all by itself. Many methods of analysis exist that establish the reliability and validity of constructs, but researchers should proceed with caution so as not to become completely reliant on these analyses. The environment and situation being assessed should also play a critical role, to ensure that empirical constructs translate to practical settings (Rosen et al., 2012).

Taking these factors into account when defining constructs is crucial to developing accurate and adaptable performance measures specific to the team. When it comes to teams, adaptability is key (Rosen et al., 2013) and should be reflected in the measurement process. Contextualization should, again, be taken into account, aligning constructs with team competencies to provide accurate construct definitions (Cannon-Bowers et al., 1995).

3.9 Observation 9: Measurement of Teamwork Is Not a “One-Stop Shop” Dynamic Phenomenon

Adding to the complexity of teamwork is the simultaneous need for multiple measurement methods that address the episodic nature of team processes. Teams do not run on fixed intervals; they accomplish different tasks at different times. Hence, it is important to recognize that there is no universal form of measurement that captures performance, but keen observation can be a powerful tool when selecting a form of assessment (Rosen et al., 2012).

Although it might be a labor-intensive process to obtain these data, there are new unobtrusive approaches that are promising for team performance measurements. The most popular approaches for observing behavior are event-based measurement, real-time assessment, classification schemes, coding, and behavioral rating scales.

Event-based measurement plays out a scenario where the training objectives are connected to what exactly needs to be assessed. This lets the assessor design events specific to the behaviors to be evaluated. Having control over the events enhances the measurement reliability. Two measurement tools that were developed using the event-based approach are targeted acceptable responses to generated events or tasks (TARGETS; Fowlkes, Lane, Salas, Franz, & Oser, 1994) and team dimensional training (TDT; Smith-Jentsch et al., 1998). One of the most common approaches uses behavioral rating scales such as the behaviorally anchored rating scales (BARS), introduced by Smith and Kendall (1963). Other rating scales include behavioral observation scales (BOS) and graphic rating scales (Latham & Wexley, 1977; Patterson, 1922).
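Event-based scoring can be pictured as a checklist: each scripted event is tied to the responses it is meant to elicit, and an observer marks which of those responses occurred. The Python sketch below computes the proportion of targeted responses that were observed and lists the ones that were missed. It is written in the spirit of the event-based approach and is not the published TARGETS or TDT procedure; the class, its fields, the events, and the behaviors are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptedEvent:
    name: str
    targeted: list                               # responses the event is designed to elicit
    observed: set = field(default_factory=set)   # responses the observer checked off


def hit_rate(events):
    """Proportion of targeted responses that were actually observed."""
    total = sum(len(e.targeted) for e in events)
    hits = sum(1 for e in events for r in e.targeted if r in e.observed)
    return hits / total if total else 0.0


def missed_responses(events):
    """Targeted responses that did not occur, grouped by event, for feedback."""
    return {e.name: [r for r in e.targeted if r not in e.observed] for e in events}


# Hypothetical scenario with two scripted events.
scenario = [
    ScriptedEvent(
        "engine warning light",
        ["announce warning", "consult checklist"],
        {"announce warning", "consult checklist"},
    ),
    ScriptedEvent(
        "conflicting radio call",
        ["request clarification", "cross-check with crew"],
        {"request clarification"},
    ),
]

print(hit_rate(scenario))
print(missed_responses(scenario))
```

Because the events and their targeted responses are fixed before the session, every team is scored against the same opportunities to perform, which is one reason control over events can improve reliability.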

For capturing performance, assessment tools should take on a multilevel perspective (e.g., individual, team, and organizational levels), to accommodate the changes that teams encounter through their life cycle (Rosen et al., 2012; Wildman et al., 2012). Performance should also be measured frequently through a variety of techniques to prevent method bias. However, a challenge this poses is the overuse of dimensions or measures. Frequently assessing a team can get in the way of team dynamics or otherwise alter the team’s normative behavior.

Unobtrusive measures are useful in situations where a team’s performance is constantly changing, because they do not disrupt the workflow of the team members. The electroencephalography (EEG) approach to capturing team performance has also shown promise in regard to being unobtrusive while allowing for the real-time assessment of behaviors (Stevens et al., 2017). Automated performance measures (e.g., sociometric badges and audio recording devices) have also shown promise with regard to being both unbiased and unobtrusive. Expounding further on the area of automation, PISA made strides towards capturing both cognitive and social aspects of collaborative problem solving through computer-based assessment (He et al., 2015). One caveat about this method is that automated performance measures are not “stand alone” measures. They still need to be coupled with nonautomated forms of measurement. However, this need for multiple measures holds true for many forms of performance assessment.

3.10 Observation 10: What Is Good for Science Is Not Necessarily Good for Practice

Bridging the gap between research and practice is a critical focus for assessing teamwork performance. This is a challenge because what is good for team research is not always what practitioners want. Researchers can assess many elements of teamwork performance in a controlled laboratory setting, but this freedom can cause researchers to lose sight of what is relevant for practice. Practitioners need tools that are unobtrusive, diagnostic, economical, and easy to use (Rosen et al., 2012). Researchers do not always take an approach that meets these needs. This disparity between research and practice is compounded by the inconsistencies that exist within the dimensions of theoretical teamwork models.

3.11 Observation 11: Don’t Ignore the Basics

It is important to go back to the basics to ensure good practice. The underlying premise behind successful measurement provides a sound foundation for future research and practice efforts. The basics illustrate the guiding principles, emerging trends, and considerations for team performance measurement. Outlining clear constructs that target the attitudes, behaviors, and cognitions pertinent to teamwork, while factoring in the context of the environment, lays the groundwork for effective performance measurement.

Great strides have been made in the area of teamwork performance measurement. An area that shows great promise in particular has been modeling and simulation (Fiore, Cuevas, Scielzo, & Salas, 2002; Hao, Liu, von Davier, & Kyllonen, 2015). However, more development is needed to maintain focus as we move forward. Some of the challenges that still remain are determining what to measure, developing reliable instruments that are diagnostic, and ensuring that these instruments can be implemented across the life span of the team, while placing a heavy emphasis on practicality. To ensure that new methods of assessment are grounded in a reliable and valid foundation, we must go “back to the basics.”

4 The Future

4.1 Observation 12: We Need Tools that Capture the ABCs of Teamwork Dynamically in Real Time that Are Pragmatic, Relevant, and Unobtrusive

This is the holy grail of team measurement and the next step for the field. Efforts have been made to reach this goal; see promising work in Table 2.2. Future research should aim at improving the effectiveness of team measurement, building on work such as that of Cooke (2015), who noted that measuring interactions can easily be done unobtrusively and that more unobtrusive measures are needed. Research also needs to acknowledge the advancement of technology and increased usage of online tools for assessment (Awwal, Griffin, & Scalise, 2015).

Table 2.2 Overview of observations on team performance measurements

5 Conclusion

It is evident that team performance measures are important throughout many industries, and since not all teams are created equal, it is important to tailor the measurement to the specific team. When a measurement system is developed, it should address the question: Why do we measure? This question requires a clear definition of the purpose of the measurement tool (von Davier & Halpin, 2013). The purpose behind measuring performance is to generate research, provide teams with feedback, develop team training, evaluate performance, and plan for the future.

During the development of measurement tools another question you need to answer is: What areas of performance should be captured? As previously described, to accurately assess performance, the team should be measured on multiple dimensions and the conceptual elements of the measure should be clearly defined. This leads into the temporal considerations of performance assessment: When should we measure? Teamwork should be assessed midway through the performance cycle as well as after the conclusion of the performance episode. This raises the question: Where should teamwork performance be measured? Teamwork should be measured both in the field through the use of unobtrusive measures as well as in a synthetic environment (Rosen et al., 2013). Lastly, the proper method of analysis should be selected: How should we measure performance? Teamwork should be captured through self-report measures, observation, simulations, and balanced scorecards (Rosen et al., 2013).