
1 Introduction

The U.S. Army’s Generalized Intelligent Framework for Tutoring (GIFT) is an adaptive instructional software architecture [1] that can be used to monitor behaviors and provide feedback based on the real-time performance of teams. While GIFT is primarily used in military training domains, it is also highly versatile in supporting education and training in a variety of domains (e.g., simulation-based team tasks, problem-solving or psychomotor tasks). GIFT is in a category of learning technologies called adaptive instructional systems (AISs), “computer-based systems that guide learning experiences by tailoring instruction and recommendations based on the goals, needs, and preferences of each learner [or team] in the context of domain learning objectives” [2].

GIFT is also compatible or interoperable with team training environments including simulations and serious games, such as Virtual Battle Space 3 (VBS3). VBS3 is one simulation in which teams train together in a shared synthetic environment that represents a geographic area or terrain archetype (e.g., jungle, mountains, desert). With GIFT and VBS3, Soldiers can practice infantry maneuvers and tactics while their performance is continuously assessed against a set of team performance objectives defined by the Intelligent Tutoring System (ITS) author.

The GIFT team functional resilience (TFR) project aims to research and develop usable methods and technology enhancements that support automated team assessments within GIFT where GIFT is coupled with a team training simulation or serious game. Team functional resilience is a complex team cognition precursor for teams whose members provide heterogeneous individual functions interdependently to achieve team performance goals [3]. The ability to continue the mission in the face of missing functionality, such as when one team member underperforms or is otherwise compromised, can be described as functional resilience. Automating assessment of team functional resilience will enable GIFT to give teams immediate, objective, and adaptive feedback.

GIFT includes a tool, GameMaster, that lets human observers view real-time performance of a team of learners as they train within the VBS3 world [4]. GameMaster shows the state of automated assessments, and also prompts observers to determine and record learner progress toward defined, non-automated competencies.

In addition to human observation with GameMaster, GIFT provides a framework of authorable conditions that automate assessment by monitoring activity within VBS3 on a second-by-second basis and interpreting observed behavior according to configurable conditions and rules. Adding automation to simulation-based training systems can enhance assessment by providing objective evidence to support assessments, weighting and prioritizing findings, and lightening the load on observers when critical events happen quickly or in different parts of the simulation. Automated assessments in GIFT can initiate real-time adaptive interventions (e.g., feedback or changes to scenario difficulty) or structure feedback for team-focused after-action review (AAR).

To automate assessment for TFR in a training simulation, we identified three primary research and development objectives:

  • Research and develop a methodology to measure and assess individual contributions to overall team performance, identifying functional compromise and recovery

  • Research and develop a scalable, reusable model of team performance that can be used in various contexts and for individuals of varying competencies

  • Research and develop a methodology to support dynamic assignment of team roles and responsibilities as the simulation state changes during training execution

As a result of automating team assessments with reusable definitions, data structures, and dynamic roles, GIFT can enhance human observations in team training. In the current state of development, automated assessments generate an adaptive AAR document. The document structures observed performance markers in relation to team skills so that instructors or team leaders can lead concrete, comprehensive discussions with learners. In the future, automated assessments in GIFT can also drive immediate feedback such as adaptive simulation changes. Additionally, if authored, simulation events and character behaviors can be adjusted in real time to adapt the challenge of the training and to target specific team members or relationships. The contributions described will make these adaptations more applicable in adaptive dynamic training simulations.

Next, we describe the GIFT architecture and how GIFT can stimulate low-adaptive simulations and serious games like VBS3 to provide more adaptive instructional interventions. Low-adaptive systems only accommodate differences in the learner’s in-situ performance during training and do not consider the impact of other factors (e.g., emotions, prior knowledge, goal-orientation, or motivation) that influence learning. Subsequent sections of this paper describe team assessment in theory and in practice, the research and development contributions of the GIFT TFR project to automated assessment capabilities, and recommended next steps for assessment of teams in adaptive instructional contexts. The implications of this research are focused on GIFT but are far reaching in their influence on the design of all instructional systems, adaptive or not. There is a growing focus on collaborative learning in pre-K, K-12, adult learning, and in military organizations. Collaborative learning is an instructional approach “in which two or more people learn or attempt to learn something together” [5]. This approach is important not just because of its effectiveness, but also because of its alignment with how people tend to work together in teams. Team-based tutoring systems should be easily deployed and used, reflecting how prevalent teams are in the workplace.

2 Overview of GIFT

As noted above, GIFT is an adaptive instructional architecture that supports authoring, real-time instructional management and experimentation processes. GIFT’s authoring capabilities include a drag-and-drop interface that uses configurable course objects (e.g., media, adaptive courseflow, conversation trees, authored branches and practice environments) [6]. Practice environments are ready-made interfaces to simulations like VBS, Unity-based environments, and compatible learning management systems (e.g., Blackboard, Canvas, edX). Courses built using GIFT can also be used to guide and support experimentation experiences, but the key functions of GIFT exist in its real-time instructional management processes, which are governed by a modular architecture (Fig. 1). GIFT’s tools allow content from different domains to be authored, so GIFT can support the creation of tutors that are not tied to one specific topic.

Fig. 1. GIFT Real-time Instructional Architecture [7]

GIFT uses data from sensors and other learner interactions to infer the states (e.g., learning, performance, emotions) of individual learners and teams. These learner states are used by the pedagogical module to select strategies and recommendations which are passed to the domain module where they are evaluated in context to determine the next actions by the tutor [8]. Context is the set of defined conditions under which the team performs their assigned tasks and includes the state of the environment and the team at any given time. It is important to note that the relevance of any AIS interventions is context-dependent. For example, feedback provided under one set of conditions may not be as relevant under another set of conditions. Artificial Intelligence (AI) plays a critical role in determining optimal interventions. Since it would be tedious to explicitly author an intervention for every possible condition of the team and the mission environment, AI is often used to weigh and infer the best available alternatives.

The gateway module provides a standard interface to facilitate the exchange of data with external simulations. This is of importance when the basis of automated assessment is interaction data of team members within an external simulation (e.g., VBS3). Next, we discuss details of sourcing measures of assessment and producing recommended interventions in both low-adaptive systems and AISs.

3 Overview of Team Dimensional Training

GIFT is being enhanced with experimental software to operationalize team assessment using Team Dimensional Training (TDT). First introduced to structure military team assessment and feedback, TDT provides a model of team process and performance that enables assessing teams or identifying the ways expert teams differ from less expert teams [9, 10]. A contribution of the work described herein is to show how TDT dimensions can be defined and assessed in a general manner within a software system. Operationalizing TDT enables GIFT to automate team assessments and provide feedback in the form of a more detailed structured AAR.

A full description of TDT is available in the above references. In summary, the TDT model describes antecedents of teamwork, such as team processes and attitudes that fall within four dimensions. The four dimensions derived from Salas’ 7 Cs [11] are:

  • Communication, or the clarity and form of delivering information

  • Information exchange, or choosing what information to share, when, and to whom

  • Supporting behavior, or helping another accomplish a task

  • Leadership, or the initiative of any team member to provide guidance to the team

In context of GIFT, the work to describe how each TDT dimension is expressed in a particular team task or domain typically requires subject matter expertise. However, connecting domain-specific assessments to TDT dimensions enables finding trends in team processing and identifying opportunities to improve underlying, process-level team function. Assessment that focuses only on specific observed behaviors might be hard for the team to generalize beyond the task at hand, but using those behaviors as input to assess the TDT dimensions gives the team a way to improve that can impact all team tasks.

Within TDT, team functional resilience may be considered to draw on all four dimensions. A critical dimension in team resilience is supporting behavior [12]. Individuals can make the team resilient to functional loss by preventing or reducing a loss, such as helping a person when their function becomes compromised, or by correcting the loss and taking over a function from a teammate. Furthermore, proper communication and information exchange help the team to recognize a threat to performance. Leadership is involved when individuals take steps before a compromise to avoid it, like appropriately dividing tasks or reminding each other of procedure.

Defining how to assess TDT dimensions is associated with a structured approach to observing and measuring trainee responses that is intended to increase inter-rater reliability for human instructors or observers of training. Targeted Acceptable Responses to Generated Events or Tasks (TARGETs) are a checklist of correct and possible incorrect trainee responses created with a task analysis in advance of training [13]. However, TARGETs require authoring effort to define, and effort during training execution to collect data [14]. Therefore, an opportunity exists to automate assessments with AI methods that build on TARGETs.
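As an illustration only (the class, event, and response names below are hypothetical, not GIFT’s TARGETs tooling), a TARGETs-style checklist can be encoded as data so that recorded trainee responses are scored automatically rather than ticked off by hand:

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a TARGETs-style checklist entry: an expected
// response to a generated event, plus known incorrect alternatives.
record TargetItem(String eventId, String expectedResponse, Set<String> incorrectResponses) {

    // Score one observed response: +1 correct, -1 known error, 0 otherwise.
    int score(String observedResponse) {
        if (expectedResponse.equals(observedResponse)) return 1;
        if (incorrectResponses.contains(observedResponse)) return -1;
        return 0;
    }
}

public class TargetsChecklistDemo {
    public static void main(String[] args) {
        List<TargetItem> checklist = List.of(
            new TargetItem("enemy-contact", "report-contact", Set.of("no-report", "advance-alone")),
            new TargetItem("casualty", "apply-first-aid", Set.of("ignore-casualty"))
        );
        // Observed responses keyed by event, as an observer or automation would record them.
        var observed = java.util.Map.of("enemy-contact", "report-contact", "casualty", "ignore-casualty");
        int total = checklist.stream()
            .mapToInt(t -> t.score(observed.getOrDefault(t.eventId(), "")))
            .sum();
        System.out.println("Checklist score: " + total);
    }
}
```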

4 Team Assessment in Practice

In this section, we discuss the general role of analysis, evaluation and assessment in adaptive instruction. We also describe operational concepts for both manual and automated adaptive team assessments to compare and contrast the current state of practice for training with our proposed and emerging state of practice. This provides both an understanding of the training process and highlights the importance of team assessment in the training process.

4.1 Analysis, Evaluation and Assessment in Adaptive Instruction

At this point, we review the role of analysis and evaluation in the training development, reuse, and sustainment process and the role of assessment in the real-time instruction process. Often the terms analysis, evaluation and assessment are used interchangeably, but they have different roles in the team instructional process. The ADDIE process model [15] describes five stages (analysis, design, development, implementation and evaluation) in the development and continuous improvement of instruction. The analysis stage within ADDIE focuses on understanding the goal of the team training and how it will improve the team’s capabilities, capacity, productivity and performance of one or more tasks [16]. Training increases the knowledge and skills needed to reach a required level of competence. Once the analysis stage is completed, the design process can begin and subsequently, the instructional system can be developed and deployed. The evaluation stage of the ADDIE process provides a structured method to improve the instruction.

The real-time part of the ADDIE process is the implementation stage [15]. In an adaptive instructional context, the implementation stage includes real-time management and execution of training. Training management refers to selecting and configuring training simulations and scenarios that meet learners’ needs. Training management motivates reusability in assessment design, so that assessment is not locked to a specific scenario and does not increase the cost of authoring scenario changes. During training execution, the team is exposed to an experience or scenario intended to exercise the required knowledge and skills according to a set of defined training objectives. Team performance may be assessed at various points in the training scenario though observation and then the instructor (human or computer) uses that assessment to intervene (e.g., provide feedback or change the difficulty of the task). Assessments may also be used to determine competency and recommend future training experiences.

Assessment refers to observing and measuring performance, then comparing it to a standard for learners given the simulated conditions [15]. Many instructional systems today assess performance in order to infer learners’ progress toward training objectives. However, in contrast to expert human instructors, most computer training systems only adapt the instruction based on triggers in the observed performance rather than based on the resulting inferences [17, 18]. Also, little consideration is given to the estimation of other individual differences (e.g., emotions) and their impact on desired training outcomes. Adaptive training systems offer an opportunity to use assessments instead of simple triggers in selecting interventions like immediate feedback messages, AAR, tailored simulation events, or training progression versus remediation.

Recapping, analysis is a pre-instructional process, evaluation is a post-instructional process and assessment is a real-time instructional process. Good assessment can inform training evaluation by quantifying training effectiveness in relation to the original analysis and design, targeting specific training objectives and learner audiences. Now that we understand these differences, our next step is to evaluate current manual team training concepts and then analyze how current team training concepts might be improved through automation.

4.2 Operational Process for Manual Team Assessment

As noted above, the real-time team training process involves immersion of the team in a relevant experience, assessment of progress toward defined training objectives, and selection of appropriate interventions to provide feedback and maintain engagement. Assessment in today’s military training systems is largely a manual process. For example, live training exercises typically depend on human observers who assess when or if a required action is completed and whether that action was timely. Checklists for mission scenarios are a common assessment tool, but completing the checklist is not always an accurate measure of performance or knowledge/skill acquisition, and requires a good deal of attention from the observer.

It is even more difficult for a human observer to assess training during a scenario in a simulated environment, as there are many different computers/views to observe. In order to move toward personalized assessment and feedback, ITS frameworks such as GIFT can be leveraged to incorporate real-time automated assessment into the simulations. For virtual simulations and serious games (e.g., VBS3), predetermined sequences of events are used to infer whether the individual or team successfully completed the training mission, but AARs are usually manually built for each domain by comparing the expected and actual events in the simulation. In other words, manual authoring processes provide data to support prescriptive assessments. This prescriptive process is limiting in that it is not always apparent to users other than the author (e.g., subject matter experts) how the objectives of the training were met by scenario conditions or trainee performance.

Compared to manual processes, automation of authoring processes may offer additional flexibility by making the relationship between learning objectives, scenario conditions and trainee behaviors more apparent than any prescriptive approach. Automation may also provide increased transparency to authors and learners by highlighting learner behaviors that were not explicitly identified as measures of assessment, but nonetheless influenced the team’s successful performance.

4.3 Operational Process for Automated Team Assessment

The goal of automating team assessment is not only to reduce the human resources required to conduct training exercises, but also to offer additional flexibility and evidentiary power for inferring team states. If the data is available to support automated team assessment, then the automated approach can be a significant improvement over any manual assessment process. Automated processes using AI also offer the advantage of inferring team states without a comprehensive set of data, generalizing to work in situations that have not been predetermined at design time. However, acquiring behavioral and physiological data to support team state inference is an important phase of the ADDIE analysis process.

Next, we discuss how the automated assessment process might work in an operational context. As noted, the essential real-time elements of team training include an immersive experience, rigorous and accurate assessment, and selection of interventions to optimize learning outcomes. In one-to-one tutoring processes, a human or computer-based tutor would test the learner’s knowledge of the domain and adapt the instruction to the capability of the learner with the goal of fulfilling a set of learning objectives.

In a team tutoring context, the adaptive tutor does the same, but the operational process for automated team assessment is multi-dimensional. The adaptive team tutor must be able to:

  • Understand the goal of the training and its associated objectives

  • Consider the competency, performance, learning and emotional states of individual members of the team to explain observations and inform training interventions

  • Understand the state of the training and recognize trainees’ behaviors as inputs to measurement and assessment

  • Infer taskwork states (progress toward training objectives) periodically throughout the training process and respond with appropriate interventions

  • Infer the competency of the team from the competency of the individual members

  • Infer teamwork states (e.g., cohesion, leadership) periodically throughout the training process and respond with appropriate interventions

  • Evaluate the accuracy of team and individual learner state classification and the efficacy of its intervention decisions in order to reinforce AIS learning with each experience

To maintain an accurate operational picture of the team and its members, the adaptive team tutor must be able to source data to support the AI methods employed in the automated assessment process. In the next section, we discuss how our project addresses or plans to address these data, AI and adaptive team tutoring requirements.

5 Research and Design Contributions to Automated Assessment

This section discusses the team assessment process and approaches to address three primary research and development challenges in automated team assessment:

  • Capturing how each individual on a team contributes to team assessments

  • Developing a scalable data structure to describe team performance that can be generalized across a variety of team tasks and domains (e.g., cognitive, psychomotor)

  • Assigning team roles and responsibilities dynamically during scenario execution

Together these contributions make automation more capable, scalable, and reusable for team assessments.

5.1 The Role of AI in Automated Assessment

First, we discuss approaches to automated assessment of teams. Automation that does not require AI might include defining metrics and bounds that quantify good or poor performance. These can make assessment more objective and reduce workload. However, they typically focus on performance outcomes, with the link to underlying reasons for the outcomes either missing or implicit (rather than transparently modeled and shown to users). They are also likely to be tied to the specifics of a scenario. If the definitions of good performance depend on context or on interactions between several factors, then simple assessments can be costly to create and difficult to update for reuse in new or changing scenarios.
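To make the contrast concrete, the following minimal sketch (invented metric names and thresholds) shows the metric-and-bounds style of automation: the constants classify an outcome, but they encode nothing about why the outcome occurred and are tied to one scenario’s specifics.

```java
// Hypothetical sketch of metric-and-bounds automation: scenario-specific
// constants classify an outcome without modeling the reasons behind it.
public class ThresholdAssessmentDemo {

    // These bounds are tied to one scenario; a new scenario, terrain, or team
    // size would typically require re-authoring them.
    static final double MAX_CLEAR_BUILDING_SECONDS = 90.0;
    static final int MAX_FRIENDLY_FIRE_INCIDENTS = 0;

    static String assess(double clearTimeSeconds, int friendlyFireIncidents) {
        boolean pass = clearTimeSeconds <= MAX_CLEAR_BUILDING_SECONDS
                && friendlyFireIncidents <= MAX_FRIENDLY_FIRE_INCIDENTS;
        return pass ? "At expectation" : "Below expectation";
    }

    public static void main(String[] args) {
        System.out.println(assess(75.0, 0));  // At expectation
        System.out.println(assess(120.0, 1)); // Below expectation
    }
}
```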

An alternative is to use AI methods to collect data and context from available sources and infer progress toward assigned objectives. AI is better able to handle states that were not explicitly defined, and AI is better able to infer states even with incomplete datasets. We include in the set of AI approaches both authored and machine-learned models of good behavior that can assess complex interactions or context that arise in team training settings.

Several capable approaches have been demonstrated in GIFT. One example demonstrated automated team assessment by applying the same structures used for an individual learner to a team, thus considering the team to be a single and separate entity for training purposes [19, 20]. Another example went in the opposite direction, creating numerous team entities to evaluate the interactions between all possible combinations of individuals [21, 22]. GIFT offers multiple ways to define and assess teamwork, and the contributions described here structure the assessments in a way intended to make them expressive without high authoring effort, and reusable across training domains.

We have previously described an initial implementation of the present work with a focus on operationalizing TDT in a specific team training scenario [3]. The initial implementation occurred within VBS3 and GIFT, and in it, a team of four infantry carried out a mission with injected events that could compromise team functions. We used TDT to assess team functional resilience and provide adaptive, prioritized support for AAR of the four contributing dimensions. We next present updated information about how AI is making team assessment more reusable for additional scenarios and larger teams.

5.2 Capturing Individual Contributions to Team Performance

One challenge in automated assessment is identifying individual contributions to team performance. Examples include identifying when individuals coordinate actions, divide work, fail to provide an expected function, and support each other by taking on a teammate’s functions. Compared to assessing team performance as if the team were a monolithic entity, tracking individual contributions requires more work to define all the details to observe during training. Flexible, reusable, scalable definitions of performance and the behavior markers of team competency (Fig. 2) can help reduce this workload. The definitions should be reusable on several levels – enabling changes in the specific scenario events and context, the simulation software, and the instructional domain. While ultimate generality is available in GIFT by making changes in the underlying code, generality of the tools available to non-technical users is also possible. Approaches to let instructors and other observers see how individuals contribute to team performance include identifying the high-level goals of team assessment, organizing and expressing them in ways instructors and other observers use, and reconfiguring parameters automatically without requiring user effort.

Fig. 2. Structure for reusable definitions of team competencies, measures, and observations.

Figure 2 shows four levels of abstraction (rows) that define team assessments in a way that aligns with instructor usage and that AI can reuse across scenarios. At the highest level of abstraction, team competencies from the TDT dimensions categorize all the available assessments and express expert knowledge of what factors the measures load on. Measures, in the second row, are domain-general in the sense that the same measure can be reused in many training domains, from infantry tactical training to workplace decision-making or government crisis response. In order to advance the goal of reusability, the measures are expressed in terms of individual contributions to the team. The structure of the team model (next section) allows domain-general descriptions of the individual contributions that contribute to the measure. For example, to help assess team functional resilience competencies, we would like to measure the time between one team member being compromised and another team member taking corrective action. Then, the required domain-specific knowledge can be filled in with model elements at the bottom row of the figure. For each training domain, we need to define the observations that indicate a compromised function and the observations that indicate corrective action took place.
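The sketch below illustrates the middle, domain-general layer under stated assumptions (the class, observation kinds, and member names are ours, not GIFT’s): the compromise-to-recovery measure is expressed purely in terms of two timestamped observations, and what counts as a “compromised function” or “corrective action” is left to the domain-specific definitions in the bottom row of Fig. 2.

```java
import java.time.Duration;
import java.time.Instant;

// Domain-general measure sketch: elapsed time between a compromised-function
// observation and the first corrective action by another team member.
// What counts as "compromise" or "correction" is supplied by domain-specific
// observation definitions, not by this class.
public class CompromiseRecoveryMeasure {

    record Observation(String teamMember, String kind, Instant when) {}

    /** Returns seconds from compromise to correction, or -1 if no correction was observed. */
    static long secondsToRecover(Observation compromise, Observation correction) {
        if (correction == null) return -1;
        return Duration.between(compromise.when(), correction.when()).getSeconds();
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T10:00:00Z");
        Observation compromise = new Observation("rifleman2", "FUNCTION_COMPROMISED", t0);
        Observation correction = new Observation("teamLeader", "CORRECTIVE_ACTION", t0.plusSeconds(42));
        System.out.println("Recovery time (s): " + secondsToRecover(compromise, correction));
    }
}
```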

The generality and reusability of the structure in Fig. 2 is enhanced by defining roles and responsibilities. Roles and responsibilities are concepts that can be reused at all levels of abstraction, tying into domain-specific definitions that capture the specific functionality a team needs in each training domain. They are modeled in our current implementation as follows.

Team members can each have multiple roles. Roles are flexible enough to describe facts about an individual’s job specialization, rank in a hierarchy, current knowledge, or actions performed. For example, during training a person who reports an event might be assigned the role of reporter. This example provides an opportunity to assess team functional resilience. If one team member was expected to report an event and another team member actually did, it is likely that we have evidence about the team’s resilience and about the two individual contributions to that resilience.

The ability to express expectations about individual behavior is supplied here by the concept of a responsibility. For example, the person who has knowledge of a critical event is defined at the domain-general level as responsible for reporting it within the team. Furthermore, the domain-specific details that determine who has knowledge of a critical event may be implemented using roles again. For example, an infantry person who takes enemy fire is in a role that is dynamically assigned during training (taking fire), and that role carries the responsibility to communicate information. Dynamic assignment of roles is described in the last part of this section.
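A minimal sketch of the role-to-responsibility mapping just described, with illustrative names rather than GIFT’s actual data structures: a dynamically assigned role such as taking fire carries a reporting responsibility, and comparing the expected member against the member who actually reported yields evidence about resilience.

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch of roles carrying responsibilities. A role such as
// "takingFire" can be assigned dynamically during training; the expert model
// maps it to the responsibility to report the event.
public class RolesAndResponsibilitiesDemo {

    // Domain-general expert knowledge: which responsibilities each role carries.
    static final Map<String, Set<String>> RESPONSIBILITIES_BY_ROLE = Map.of(
        "takingFire", Set.of("reportEnemyContact"),
        "teamLeader", Set.of("assignTasks", "reportEnemyContact")
    );

    /** Evidence of resilience: someone other than the expected member fulfilled the responsibility. */
    static boolean coveredByTeammate(String expectedMember, String actualReporter) {
        return actualReporter != null && !actualReporter.equals(expectedMember);
    }

    public static void main(String[] args) {
        // rifleman2 took fire and therefore holds the reporting responsibility...
        String expected = "rifleman2";
        // ...but the team leader actually made the report.
        String actual = "teamLeader";
        System.out.println("Responsibilities of takingFire: " + RESPONSIBILITIES_BY_ROLE.get("takingFire"));
        System.out.println("Function covered by a teammate: " + coveredByTeammate(expected, actual));
    }
}
```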

The four levels of abstraction in defining team assessments reduce the work required to determine individual contributions because three of the four levels are not specific to the instructional domain and can be reused. It is also possible to create opportunities for reuse in the fourth level, domain-specific definitions. These definitions are expressed in GIFT as a combination of Java code, which requires a software engineer to change, and configuration parameters, which can be changed by instructors or end users. In GIFT, parameterization is a best practice that reduces engineering workload for adding or changing scenarios.

In addition, the present work links some parameters at the domain-specific level to be filled with roles and responsibilities. As a result, values for those parameters can be set automatically by events during training and do not need to be configured by humans. The workload on instructors and end users is reduced, and the automated assessments are hypothesized to be reusable in more training scenarios.

In conclusion, training that only assesses team outcomes will miss diagnostic characteristics of how the overall performance was achieved. The same overall mission success means something different if one person carried the team than if the team effectively divided duties. Defining the importance of individual contributions to team performance and competency in a generalizable way, with a reusable structure, creates advantages when authoring assessments. Concrete feedback that includes objective examples during the post-training AAR also makes it possible to trace the individual behaviors that are markers of team states and team competencies.

5.3 Implementing a Scalable Model for Team Assessment

A second challenge is structuring a scalable model for assessing larger teams. The team assessment data structure should enable describing how any team members interact in pairs or larger groups. However, creating explicit data for every possible combination would create an exponential explosion in data size and sustainment effort. Team roles and responsibilities create a more scalable data structure to calculate and record team assessments. Model structures we introduce to GIFT efficiently define expert behavior and simulation events, enabling assessment at scale (Fig. 3).

Fig. 3. Roles and responsibilities inform how observations update team assessments.

The coloring of Fig. 3 suggests how the implementation aligns with the four levels in Fig. 2. Referring to Fig. 1, these components are implemented within the domain module of GIFT and operate on inputs from the training via the gateway module. Adding them makes team assessment efficient to code and configure.

First, the role manager defines what team roles exist and, for roles that change during training, how they can be recognized. The expert model links roles to responsibilities in the sense described above. As a result, the expert model informs the selection of what facts about the simulated world have value to monitor. During training execution, the role manager reads from the world model to update current role assignments.
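The following sketch (hypothetical class and fact names, not GIFT’s implementation) shows the role manager’s basic loop: read the facts currently held in the world model for a team member and assign or retract dynamic roles accordingly.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative role-manager sketch: dynamic roles are (re)assigned by reading
// the current facts held in a shared world model.
public class RoleManagerDemo {

    // World model facts per team member (e.g., "underFire", "nearCasualty").
    static Map<String, Set<String>> worldFacts = new HashMap<>();

    // Current dynamic role assignments per team member.
    static Map<String, Set<String>> roles = new HashMap<>();

    // Update roles for one member from the facts currently true about them.
    static void updateRoles(String member) {
        Set<String> facts = worldFacts.getOrDefault(member, Set.of());
        Set<String> assigned = roles.computeIfAbsent(member, m -> new HashSet<>());
        if (facts.contains("underFire")) assigned.add("takingFire"); else assigned.remove("takingFire");
        if (facts.contains("nearCasualty")) assigned.add("firstResponder"); else assigned.remove("firstResponder");
    }

    public static void main(String[] args) {
        worldFacts.put("rifleman2", new HashSet<>(Set.of("underFire")));
        updateRoles("rifleman2");
        System.out.println("rifleman2 roles: " + roles.get("rifleman2")); // [takingFire]
    }
}
```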

Second, the world state manager facilitates storing and sharing information about the training simulation. The world state manager defines what facts about the world are needed to make assessments. When those facts are updated in messages from the simulator, their current values are maintained in a shared world model. The world model has also been called a working memory because it defines a subset of facts that are salient and relates those facts to each other with semantic relationships. For example, if an enemy fires at an infantry team, the simulation will send a message with the spatial coordinates of the shot. The world model can transform the location into a relationship that has semantic meaning, such as the shot was close to a certain team member.

In addition to storing facts with semantic meaning, the world model is also the facility for adding context needed for assessment. When the enemy fires at a person and that person communicates the information, context helps assess the correctness of the behavior. For example, if this is the first shot or the first report of enemy fire, the assessment of team information sharing is likely to be more positive than if there has already been a prolonged firefight. So, another benefit of the shared world model is as a store for facts about past events and for connecting events, rather than operating directly on inputs from the simulator. The logic to record past events could be stored without a world model, such as within the code for specific assessments, but the world model enables encapsulating knowledge about the world separately from interpretation, such as assessment. As a result of the encapsulation, there is no duplication of code needed to layer new assessments on the world model and no updates across the codebase needed to change a definition such as the distance where a shot counts as “close.”
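A small sketch, under assumed names and thresholds, of the two world-model services described above: transforming raw shot coordinates into the semantic relation “shot close to a team member,” and keeping enough event history to judge whether a later report was the first report of contact.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative world-model sketch: raw simulator coordinates become semantic
// relations ("shot close to member"), and past events supply context
// ("was this the first report of enemy fire?").
public class WorldModelDemo {

    record Position(double x, double y) {
        double distanceTo(Position other) {
            return Math.hypot(x - other.x, y - other.y);
        }
    }

    // Single place to define what "close" means; assessments reuse it.
    static final double CLOSE_METERS = 25.0;

    static final List<String> eventHistory = new ArrayList<>();

    static boolean shotCloseTo(Position shot, Position member) {
        return shot.distanceTo(member) <= CLOSE_METERS;
    }

    // Context from history: a first report is typically assessed more positively
    // than a redundant report during a prolonged firefight.
    static boolean isFirstReportOfContact() {
        return !eventHistory.contains("ENEMY_CONTACT_REPORTED");
    }

    public static void main(String[] args) {
        Position shot = new Position(100, 200);
        Position rifleman2 = new Position(110, 205);
        System.out.println("Shot close to rifleman2: " + shotCloseTo(shot, rifleman2));
        System.out.println("First report of contact: " + isFirstReportOfContact());
        eventHistory.add("ENEMY_CONTACT_REPORTED");
        System.out.println("First report of contact now: " + isFirstReportOfContact());
    }
}
```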

Each world model contains information specific to the training domain under instruction. The design decision to impose a world model aids in organizing observable facts and structuring them as shared state. This implies that world models may be reusable across many training domains, while the conditions of the world may vary in each scenario to stimulate the team behaviors needed to achieve team learning objectives.

Finally, the query manager is added as a mechanism to interpret world state into assessments. The kinds of queries that are implemented focus on change over time and time relations between events, such as “during” and “after.” These queries carry out assessment by interpreting the world state, including who was assigned which roles, at any given time. The current focus on tracking time and changes over time enables assembling a timeline during AAR which points learners to recorded examples from their training.

The query manager can update the team model at different levels based on the defined measures. For example, a measure that records the time between one member’s functional loss and another member’s corrective action updates both of those individuals as well as the overall team. To specifically avoid the explosion of all possible relationships between individuals as teams grow larger, the team model is sparse and only records roles and relationships between roles that have measures defined.
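The flavor of these queries can be sketched as follows (illustrative names only): role-tagged, timestamped events are filtered with time relations such as “during,” and the matching events feed both the assessment update and the AAR timeline.

```java
import java.time.Instant;
import java.util.List;

// Illustrative query-manager sketch: time-relation queries ("during", "after")
// over recorded, role-tagged events support assessment and an AAR timeline.
public class QueryManagerDemo {

    record Interval(Instant start, Instant end) {
        boolean contains(Instant t) { return !t.isBefore(start) && !t.isAfter(end); }
    }

    record Event(Instant when, String member, String type) {}

    /** "During" query: events of a type that fall inside an interval such as a firefight. */
    static List<Event> during(List<Event> events, Interval interval, String type) {
        return events.stream()
                .filter(e -> e.type().equals(type) && interval.contains(e.when()))
                .toList();
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T10:00:00Z");
        Interval firefight = new Interval(t0, t0.plusSeconds(300));
        List<Event> timeline = List.of(
            new Event(t0.plusSeconds(20), "rifleman2", "REPORT_CONTACT"),
            new Event(t0.plusSeconds(400), "teamLeader", "REPORT_CONTACT")
        );
        // Which contact reports happened during the firefight? The result feeds
        // both the assessment and the AAR timeline of recorded examples.
        System.out.println(during(timeline, firefight, "REPORT_CONTACT"));
    }
}
```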

The key benefit of the models and components to manage them is the scalability for use with larger teams. The data structures also do not limit the complexity of scenarios and performance that may be observed, which typically need to be complex for effective team training. The design is also intended to increase code encapsulation and ease of encoding measures that are reusable because they work without recoding in more contexts. Finally, reusability across scenarios can be increased if these models are expressive enough to capture doctrine in a code library and automate assessments with reduced authoring needed. So, if future testing in new scenarios and instructional domains succeeds, the design of the scalable models will also increase reusability.

5.4 Assigning Team Roles and Responsibilities Dynamically

A third challenge is dynamic role assignment, allowing the roles and responsibilities in the team definitions and team model to change during the execution of training. Context-specific changes allow assessment to work flexibly when different learners can take on different functions within the team. Functional changes are important indicators of team competencies at a high level of abstraction which can therefore be used across many simulations. As examples, a team leader can assign team members to tasks with an optimal or suboptimal balance of responsibilities. Alternatively, as personnel recognize functions that might be missing in the team, their responses or divided attention can suggest characteristics of a team process like trust. Next, changing information within the simulation can help assess complex team cognitive competencies like information sharing and shared situational awareness. Finally, automated assessments also need dynamic roles for optimal reusability across scenarios. The assessments should avoid assumptions about which team member will have information or will take on tasks, in order to work in many possible paths through a simulation.

The baseline for team role assignment is configuring the assignments at the start of training, when the instructor requesting the training fills in all details (for example: this person is the highest ranking, this person has medic skills, this person is the one carrying the special equipment). Clearly there are limits to what instructors can assign in advance. For example, in the infantry a squad might have one person with medic skills, but all the team members have first aid skills and the choice of who needs to help an injured person before the medic can get there might depend on who is closest. Similarly, it is not natural to assign roles like which person will go through a door first and who will go second. In current practice, instructors do assign these roles when they configure training, but the training would be more flexible and require less instructor workload if that assignment could be automated.

The process of making role assignment dynamic and automating the assignment leverages the expert role model and the world state model, both described above. The expert model separates roles from responsibilities, but it also maps in the other direction so that responsibilities met can tell what roles a trainee is undertaking. As a result, the corrective actions that involve taking on a teammate’s function to keep the team going can be detected using the expert model. The other part of the equation is the world model. One of its key functions is to interpret raw data from the simulation environment, like trainee locations in the world, into semantically meaningful labels. The world state is therefore the place to look for information like who is closest to the injured person and who entered a door second.

Again, properly interpreting the meaningful facts that are added to the world state is specific to the training domain, but their definitions can allow for greater or lesser generality and reuse with proper implementation. For example, a rule to find the second person passing through a door might be hard-coded to one location, or it could define a configurable door location that works for any scenario in the same simulator with a door to pass through. The configurable door location furthermore could be set by the instructor at the start of training, or it could use automation to search the simulation world for all doors and update world state whenever anyone passes through any door. Configuring the door location is required in the current implementation state, but the increased automation that removes this need might be desirable in the future.
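To illustrate the trade-off, the sketch below (hypothetical names) parameterizes the door location rather than hard-coding it, so the same rule records entry order in any scenario that supplies a door position; discovering doors automatically would remove even that configuration step.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: a configurable door location (instead of a hard-coded
// one) lets the same rule record entry order in any scenario with a door.
public class DoorEntryRuleDemo {

    record Position(double x, double y) {
        double distanceTo(Position other) { return Math.hypot(x - other.x, y - other.y); }
    }

    // Configured per scenario (or, with more automation, discovered in the world).
    final Position doorLocation;
    final double doorRadiusMeters;
    final List<String> entryOrder = new ArrayList<>();

    DoorEntryRuleDemo(Position doorLocation, double doorRadiusMeters) {
        this.doorLocation = doorLocation;
        this.doorRadiusMeters = doorRadiusMeters;
    }

    // Called as trainee positions update; records who passed through the door, in order.
    void onPositionUpdate(String member, Position p) {
        if (p.distanceTo(doorLocation) <= doorRadiusMeters && !entryOrder.contains(member)) {
            entryOrder.add(member);
        }
    }

    String secondThroughDoor() {
        return entryOrder.size() >= 2 ? entryOrder.get(1) : null;
    }

    public static void main(String[] args) {
        DoorEntryRuleDemo rule = new DoorEntryRuleDemo(new Position(50, 50), 1.5);
        rule.onPositionUpdate("teamLeader", new Position(50.2, 50.1));
        rule.onPositionUpdate("rifleman2", new Position(49.8, 50.3));
        System.out.println("Second through the door: " + rule.secondThroughDoor());
    }
}
```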

Finally, should a truly reusable rule exist that records the second person to pass through any door in a simulation, the need for context in the world state becomes clear. Some doors might be benign areas back at base, where the order trainees enter can be recorded but does not make a difference to training. On the other hand, some doors exist in hostile territory and the infantry passing through must follow doctrinal procedures that define what they do once they are inside. The addition of context to interpret observed actions is one of the key contributions of the world model to automating team assessment.

A final consideration is the scalability of dynamic role assignment when the number of roles, team members, and facts about the simulated world increase. Constantly checking all possible assignments would probably overwhelm any computer. GIFT provides an event-based mechanism to make updates only about facts that changed from moment to moment during training. In addition, updating and rechecking the meaningful labels can be carried out efficiently as well. Rather than long lists of conditional (if-then) logic, a more efficient method for maintaining many such facts is the rete [23]. A rete algorithm implements efficient pattern matching to find the subset of facts that matter to a team assignment. Upcoming work is expected to implement rete to carry out dynamic team assignment in situations that call for real-time responsiveness, as opposed to after-action feedback that can be processed quickly once the full training run is complete.
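The sketch below is deliberately not a rete implementation (a production rule engine would supply the efficient pattern matching cited above); it only illustrates the event-based idea: assessments register interest in specific fact types and are re-evaluated when one of those facts changes, rather than rechecking every fact on every tick.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Simplified sketch of event-based updating (NOT a rete implementation):
// assessments subscribe to fact types and are re-evaluated only when a fact
// of that type changes, instead of polling every fact every tick.
public class EventBasedUpdateDemo {

    static final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    static void subscribe(String factType, Consumer<String> assessment) {
        subscribers.computeIfAbsent(factType, k -> new ArrayList<>()).add(assessment);
    }

    // Called when the simulation reports a change; only interested assessments run.
    static void factChanged(String factType, String value) {
        subscribers.getOrDefault(factType, List.of()).forEach(a -> a.accept(value));
    }

    public static void main(String[] args) {
        subscribe("memberPosition", v -> System.out.println("Re-evaluating door-entry rule for " + v));
        subscribe("shotFired", v -> System.out.println("Re-evaluating contact-report measure for " + v));
        factChanged("shotFired", "enemy -> rifleman2"); // only the contact-report measure runs
    }
}
```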

In conclusion, we described at a functional level the software changes that are enhancing automated team assessment in the GIFT adaptive training framework. The changes build on our initial introduction of assessing team functional resilience. The enhancements described here tend to increase the role of AI in supporting assessment. As a result, the assessments become more general and reusable in more training settings. The assessments also require less workload to define when authoring the training and when configuring it for execution with a particular team. With ongoing implementation work, it may be possible to create a library of team assessments that encode many doctrinal team behaviors and correctly assess them in a range of scenarios and settings.

6 Recommended Next Steps

We introduced the concept of adaptive team training using GIFT, but the principles discussed in this paper may be generalized to the design of other adaptive instructional software architectures. To enhance the efficacy and usefulness of adaptive instruction as a tool for team training, we recommend the following team competency research topics as logical next steps:

  • Research and develop standard methods to define team-based training scenarios

  • Research and develop standard methods to define and operationalize team competency in cognitive and psychomotor domains

  • Automate the evaluation process for determining the effectiveness of team tutoring interventions to enhance AIS decision-making with each new experience

  • Implement self-improving assessments and interventions that respond to effectiveness evaluations and suggest ways to improve specific training

  • Provide instructors and training authors with actionable information about team training effectiveness

  • Increase high-level control over training to make it as effective as possible, both when configuring scenarios and during real-time execution.

The implications of this research are focused on GIFT but may contribute broadly to the design of automation in instructional systems. One AIS goal is for collaborative learning support in education settings and team-based tutoring in training settings to be easily deployed and used, reflecting how prevalent teams are in the workplace. The design of automation, capturing individual contributions to team performance, and implementing AI approaches to increase generality and reusability are methods that can be used in GIFT and other technology systems to assess and act on team performance.