
1 Introduction

Driverless cars, autonomous drone delivery systems, collaborative robots as teammates, robotic concierges and hosts: these concepts are no longer science fiction, and they may be coming to a home or business near you very soon. The notion of autonomy dominates contemporary visions for the future. Veloso and colleagues [1] have outlined a number of potential avenues for robotic systems to support humanity. Robotic systems are envisioned to support the elderly in their homes with physical movement, decision making, and even companionship. They are expected to revolutionize transportation and delivery systems. They can support health care and connect doctors with patients at a distance. Backbreaking factory and warehouse work could be aided through the use of intelligent exoskeletons, as could individuals who have lost mobility due to medical conditions or injuries. Therapy and rehabilitation could be supported with robots. Customer service and, of course, entertainment are two other domains where robotic systems will likely make a huge impact on society, and in some instances they already are. While the technological possibilities are limited only by one’s imagination, there is one core element of each of the above examples that constrains the potential gains for future robotic systems: all of these systems will need, at some point, to interface with humans. This has led to a burgeoning of the domain of Human-Robot Interaction (HRI), which studies numerous facets of how humans and robots interact and how those interactions can be improved. One key challenge in this domain is how to foster appropriate levels of trust in robotic systems; that is, will humans accept these technologies or reject them?

Trust represents one’s willingness to be vulnerable to another entity in a situation where there is some risk and little ability to monitor the other [2]. The trust construct has been extended to technology to represent the “attitude that an agent will help achieve an individual’s goals in a situation characterized by uncertainty and vulnerability” [3, p. 54]. Thus, trust is relevant both for interactions with people and for interactions with intelligent agents such as robots. Chen and Barnes [4] define an agent as a technology that has autonomy, the ability to observe and act on its environment, and the ability and authority to direct actions toward goals. The current paper will use the term “future autonomy” to collectively represent robotic systems and agent-based technologies, the latter of which may have no physical embodiment. The key attribute of relevance for future autonomy in this context is that such systems will have both the capability and the authority to act in relevant operational scenarios.

Without a doubt, future autonomy is imminent, yet human trust will determine the effectiveness of these technologies as humans decide whether to use them and how to use them. Trust calibration, or maintaining appropriate levels of trust, is the critical factor in determining the effectiveness of future autonomy; that is, the key challenge is understanding when to trust and when to distrust technology. Miscalibrated trust can result in catastrophic errors when humans rely on technology that is faulty or error-prone. Several accidents have been blamed on human overreliance on automated systems, such as the crash of Turkish Airlines flight 1951 in 2009, when the flight crew relied on the autopilot after an instrumentation failure [5]. The inverse is also problematic, in that undertrust can be detrimental to performance when humans fail to use a reliable technology, as evidenced by the Costa Concordia cruise ship disaster that killed 32 people when the ship’s captain used manual navigation skills instead of a reliable automated navigation tool [5]. Appropriately calibrated trust is challenging because as technology gains in reliability humans tend to trust it more, and appropriately so. However, the performance costs of errors are most severe when a highly reliable system is given the highest level of autonomy and that system makes a mistake [6]. This is driven by the human tendency to reduce monitoring of highly reliable systems, which can make compensation, correction, and adaptation to novel demands more difficult when the technology fails. This paradox of automation has motivated the research community to examine the drivers (and detractors) of human trust in technology.

Trust has been a focal topic for researchers in the areas of automation [5] and robotics [7]. While a comprehensive review of this literature is beyond the scope of the current paper, the human-machine trust literature has identified a number of key trust antecedents including: performance [7], transparency [4, 8], perceived benefits of use [9], prior experiences with the system, including error types (i.e., false alarms and misses) and the timing of errors [5, 10], interactive styles (etiquette) [5], anthropomorphism [11], and individual differences such as one’s perfect automation schema [12], to name a few. Yet despite the burgeoning literature on human-machine trust, little field work has examined the trust barriers among real operators using real tools that have real consequences in the world (R3). One such study found that pilot trust of an automated safety system in fighter aircraft was driven by performance considerations (reliable performance and system behavior that does not interfere with the pilot’s ability to fly and fight), transparency, perceived benefits and a logical (compelling) rationale for why the technology was needed, and familiarity with the system’s behavior [13]. It is likely that these same dimensions will be important considerations for future autonomy. Further, as technology increases in both decision capability and authority, the decision-making capability and intent of the system are likely to become important trust considerations [14]. Thus, the antecedents of trust for systems that span a broader range of decision options should include more intent-based dimensions than those for automated systems whose decision authority and capability are confined to a narrower set of circumstances.

The Department of Defense (DoD) is heavily focused on technologies for autonomy. Although the domain of autonomy is quite broad, the notion of trust in autonomy is a persistent theme throughout much of the DoD Research Doctrine [15, 16]. Yet it is critical to contextualize the target domain when considering trust so that trust considerations have a focused technology as a trust referent. Thus, the current paper discusses pilot reactions to two future technologies: the Automatic Air Collision Avoidance System (AACAS) and an Autonomous Wingman (AW). AACAS has already undergone flight testing and is currently a more mature technology than the AW concept.

The Automatic Air Collision Avoidance System is part of the Air Force Research Laboratory’s Integrated Collision Avoidance Program, which seeks to integrate the already-fielded Automatic Ground Collision Avoidance System (AGCAS) with AACAS. AACAS was designed to mitigate mid-air collisions among fighters by calculating future aircraft trajectories of cooperative and non-cooperative aircraft and using a collision avoidance algorithm to determine whether an automatic maneuver is required to avoid a collision [17, 18]. Like prior systems such as AGCAS, AACAS must avoid interfering with the pilots [18]; by avoiding nuisance activations, the system should be viewed as more trustworthy and pilots should be more likely to trust it [13]. The AW is more of a future concept, but it would involve a robotic aircraft that serves as a subordinate to the flight lead. The AW would handle its own flight maneuvers but would be under the direct control of the flight lead to use as needed. Unlike current Remotely Piloted Aircraft (RPAs), the AW would not be remotely piloted but rather would be able to respond to higher-level commands from the flight lead. Relative to AACAS, the AW would be expected to handle a broader range of activities, whereas AACAS performs a single action: avoiding collisions with other aircraft.
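
To make the general concept concrete, the sketch below shows a minimal, purely illustrative trajectory-prediction and separation check in the spirit of the AACAS description above. It is not the AACAS algorithm: the constant-velocity state model, the 5-second horizon, and the 500 ft separation threshold are assumptions chosen for this sketch only.

```python
import numpy as np

# Illustrative sketch only: predict trajectories, then decide whether an
# automatic avoidance maneuver would be required. The state model, horizon,
# and separation threshold are assumptions, not AACAS design parameters.

SEPARATION_FT = 500.0   # assumed minimum allowable separation
HORIZON_S = 5.0         # assumed prediction horizon (seconds)
TIME_STEP_S = 0.1


def predict_positions(position_ft, velocity_fps, horizon_s, dt):
    """Propagate a simple constant-velocity trajectory over the horizon."""
    times = np.arange(0.0, horizon_s + dt, dt)
    return position_ft + np.outer(times, velocity_fps)


def avoidance_required(ownship_pos, ownship_vel, intruder_pos, intruder_vel):
    """Return True if predicted separation drops below the assumed threshold."""
    own_traj = predict_positions(ownship_pos, ownship_vel, HORIZON_S, TIME_STEP_S)
    intruder_traj = predict_positions(intruder_pos, intruder_vel, HORIZON_S, TIME_STEP_S)
    min_separation = np.min(np.linalg.norm(own_traj - intruder_traj, axis=1))
    return min_separation < SEPARATION_FT


# Example: two aircraft on converging headings (positions in ft, velocities in ft/s).
if __name__ == "__main__":
    own = (np.array([0.0, 0.0, 20000.0]), np.array([800.0, 0.0, 0.0]))
    other = (np.array([4000.0, 300.0, 20000.0]), np.array([-800.0, 0.0, 0.0]))
    print("Avoidance maneuver required:", avoidance_required(*own, *other))
```

An operational system would, of course, require richer trajectory models, sensor fusion across cooperative and non-cooperative aircraft, and nuisance-rejection criteria; the sketch is intended only to show the shape of the prediction-and-decision loop described above.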

2 Method

2.1 Participants

The participants were F-16 (N = 131) and F-22 (N = 35) pilots stationed at operational Air Force bases. Nine different F-16 units were visited, four of which were outside the continental United States. Two F-22 units were visited, both of which were within the continental United States. All of the pilots had, at a minimum, completed basic flight training and were operational pilots within the Air Force. The F-16 pilots had an average of 836 flight hours and the F-22 pilots averaged 372 flight hours.

2.2 Procedure

Semi-structured interviews were conducted in person at the F-16/F-22 units. The current data were collected as part of a larger set of interviews centered on trust of ground collision avoidance systems. All pilots were first given an informed consent document that described the study objectives. Following consent, the pilots were administered a structured interview focused on attitudes toward and experiences with the ground collision avoidance technologies already fielded on the F-16 and F-22. Following this set of questions, the pilots were given written descriptions of the two future technologies (AACAS and AW). After the pilots read the descriptions, a few questions were asked about their attitudes toward these systems. The current paper focuses on a subset of those data, in particular, responses to two questions: (1) In your opinion, what would be the biggest trust barrier with the AACAS system? and (2) In your opinion, what would be the biggest trust barrier with an autonomous wingman? Responses were recorded with digital recorders (with the pilots’ approval) for later transcription and analysis. The entire interview lasted on average between 20 and 30 min. Data were coded with the NVivo version 11 qualitative analysis software package. Note that each pilot was asked to provide the “biggest” trust barrier but could provide multiple trust barriers for each technology.
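
As a minimal illustration of the tabulation step behind Tables 1 and 2, the sketch below shows how coded responses could be aggregated into cluster frequencies. The actual coding was performed in NVivo 11 by the research team; the pilot identifiers, codes, and counts here are hypothetical and are not study data.

```python
from collections import Counter

# Hypothetical coded responses: (pilot_id, airframe, technology, trust_barrier_code)
coded_responses = [
    ("P001", "F-16", "AACAS", "reliability"),
    ("P001", "F-16", "AW", "workload"),
    ("P002", "F-16", "AACAS", "interference"),
    ("P002", "F-16", "AW", "no_human_decision_maker"),
    ("P003", "F-22", "AW", "workload"),
]


def cluster_frequencies(responses, airframe, technology):
    """Count how often each trust-barrier code appears for one airframe/technology."""
    codes = [code for _, frame, tech, code in responses
             if frame == airframe and tech == technology]
    return Counter(codes)


print(cluster_frequencies(coded_responses, "F-16", "AACAS"))
# Counter({'reliability': 1, 'interference': 1})
```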

3 Results

The relevant data clusters are reported in Tables 1 and 2 below. As shown in Table 1, the primary trust barriers reported by F-16 pilots for AACAS involved performance-related issues (e.g., reliability, connectivity issues, and concern about interference). The primary trust barriers reported by F-16 pilots for the AW included workload concerns, reliability, and the lack of a human decision maker. As shown in Table 2, the primary trust barriers reported by F-22 pilots for AACAS involved performance issues: concern about interference and the possibility that the system could create a tactical disadvantage in combat. The primary trust barriers reported by F-22 pilots for the AW involved concern about increased workload, the lack of a human decision maker, and the AW’s ability to adapt to novel constraints.

Table 1. Clusters and frequencies for F-16 pilots. Note: AACAS = Automatic Air Collision Avoidance System.
Table 2. Clusters and frequencies for F-22 pilots. Note: AACAS = Automatic Air Collision Avoidance System.

4 Discussion

The present paper examined trust barriers among operational pilots in relation to two forms of future autonomy within the Air Force, namely the AACAS and AW technologies. While both 4th and 5th generation fighter pilots served as the samples, the responses were fairly consistent between the two sets of pilots; therefore, the data will be discussed across both samples rather than by platform type (i.e., F-16/F-22). For AACAS, the pilots’ primary concerns revolved around performance issues. As with prior fielded automated systems on fighter aircraft [13, 17, 18], pilots were very concerned about interference. Pilots did not want AACAS preventing them from getting close enough to other aircraft for training, battle damage checks, or, most importantly, Basic Fighter Maneuvers (BFM). These interference concerns largely involved the system preventing the pilot from executing a desired maneuver, essentially reflecting concerns about false alarms. There were also concerns about the system causing harm by maneuvering one aircraft into another during the execution of an automated avoidance action in close formation. Pilots also reported concerns about the reliability of the system in general (given the complexity of the data links and algorithms required), as well as concerns about the data links between cooperative and non-cooperative aircraft. In this case, cooperative aircraft would be those with a similar AACAS system and sensing capability, and non-cooperative aircraft would be those without AACAS. Given the speeds and tactical requirements of operating a fighter aircraft, these concerns are logical, as pilots need to maintain a tactical edge on the battlefield. Pilots want a system that is highly reliable yet not prone to nuisance activations (i.e., false alarms, activating when an activation is not necessary). Consistent with the literature on trust of automation [3, 4, 6, 7], performance and reliability are significant drivers of trust in technologies like AACAS. Additionally, pilots reported that they could see value in AACAS if the reliability of the system were very high and interference could be eliminated or minimized. This value was noted mostly as a “last ditch” maneuver to avoid an otherwise imminent collision.

The reported trust barriers for the AW were somewhat broader than those for AACAS, and this may reflect two factors: (1) AACAS is a more mature technology than the AW, and as such the pilots’ trust concerns for the AW may be driven by an overall uncertainty associated with it, and (2) the AW is intended to operate within a broader array of situations, which creates greater complexity for trust evaluations. The pilots’ top concern was the expectation that the AW would add to an already high-workload environment. Operational fighter pilots work at a high operations tempo, and the flight, communication, and operational requirements create a high-workload situation. Adding the complexity of communicating with and “leading” an AW raised concerns about added workload that pilots do not want. As with AACAS, reliability was also an issue for pilots when considering the AW.

In contrast to AACAS, when pilots considered the AW they reported concerns about the lack of a human decision maker in the cockpit. Given the time-sensitive and dangerous situations that military personnel face, these concerns are logical. Specifically, there were concerns that the system would make the wrong decision when faced with a difficult situation. Herein, it would be useful to highlight the intent-based transparency of the AW to the pilots, as called for in general by [14]. By using intent-based transparency methods, the pilots and the AW would have more opportunities to establish shared intent, which is crucial in dynamic, morally contentious situations. Shared intent allows two or more entities to establish predictable behaviors and reactions to novel constraints. Shared intent is important in this context because pilots reported concerns about accountability for the AW. It is also important because pilots noted concerns about the AW’s ability to adapt to novel demands. The pilots seemed to want the AW to be able to think and respond “like a human”; however, that may not be the best approach for this human-machine team. A more fruitful approach may involve leveraging the strengths of the AW and building a flight lead-AW relationship in a way that maximizes the strengths and minimizes the weaknesses of each partner. This heterogeneous but synergistic approach could maximize the effectiveness of the human-autonomy team. Further, the pilots noted concerns about potential hacking of the AW and about the potential for the AW to physically “hurt” the pilot by running into her/him. Thus, while performance-related concerns were definitely present for the AW, similar to AACAS, the pilots also seemed to consider intent-based issues in relation to the AW. The identification of these trust barriers is important for researchers and designers to consider in the development and fielding of future autonomy. As with AACAS, the pilots reported a number of potential benefits of a system like the AW, including: risk reduction for pilots (i.e., fewer pilots in harm’s way), using the AW to engage particularly risky targets or operate in very risky situations, using the AW to carry additional assets such as weapons and sensors, and using the AW to jam surface-to-air missile batteries (i.e., to protect the pilot).

What follows is a series of recommendations for military organizations seeking to field future autonomy systems. First, performance-related issues will be a paramount concern among operators. Thus, military organizations are encouraged to use videos as a means to “show” the performance and reliability of the system. There are two potential ways in which videos can be incorporated. First, videos of operational performance should be shown to highlight both positive and negative exemplars of the system’s performance. The positive videos should boost trust, as demonstrated by a prior field study examining trust of the AGCAS system [19]. Yet care must be taken to avoid overtrust, as videos have the potential to generate high trust among individuals with little system experience, which could negatively impact trust calibration. These videos serve as operational evidence of the system’s performance. While videos of negative system performance may cause a decrease in trust, they are important for sharing stories among operators and will help operators understand the limits of the system. After all, a decrease in trust can be beneficial if it leads to a more accurate calibration of one’s trust. The second type of video might include test videos, which show the system in scripted scenarios that probe its limits. Such videos would be impossible (and unethical) to create in actual operations, so the test environment is the right opportunity to produce them. Anecdotally, following the interviews in the present study, most of the pilots had an opportunity to discuss AACAS with a subject matter expert (SME) on the system; when that SME showed the pilots a successful test video of AACAS in a close-proximity, high-speed pass, the effects on pilot trust were virtually immediate. In this case, “seeing is truly believing.”

Understanding the intent of these systems also emerged as an important theme from this research. Using intent-based transparency methods should help to foster shared intent between the human and the system [14]. This shared intent should, in turn, support predictability regarding how the system will behave in novel situations. If one understands the rules that govern the system’s behavior (i.e., goals, goal priorities, interactive styles, rules of engagement), then the system’s reaction to novel demands should be more predictable, or at least more understandable. Intent-based transparency could be established through education and joint human-machine training. The educational aspects could focus on the background and purpose of the system, why it was designed, how the system sets and prioritizes goals in changing contexts, and the rationale for its decision-making processes. More importantly, the human should engage in joint human-machine training to experience how the system reacts to novel demands. Herein, scenarios should be designed to stress the boundaries of the situation and maximize the range of potential decision options. Again, the interest is in building an understanding of the behavioral rules that govern the system’s behavior and in establishing some predictability of how the system executes those rules in various conditions. In this sense, repeated experience with the system reacting to the same or very similar circumstances is less valuable than a smaller number of encounters with novel stimuli that stress the system’s range of behavioral flexibility.
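
As a purely hypothetical illustration of what intent-based transparency might look like in practice, the sketch below shows one way an AW could expose its goal priorities and behavioral rules to the flight lead for pre-mission review. The goals, priorities, and rules are invented for this example and do not describe any fielded or planned system.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical intent profile an AW might surface to the flight lead in support
# of intent-based transparency [14]. All content below is illustrative.


@dataclass
class Goal:
    name: str
    priority: int       # lower number = higher priority
    rationale: str      # why the goal ranks where it does


@dataclass
class IntentProfile:
    goals: List[Goal] = field(default_factory=list)
    rules_of_engagement: List[str] = field(default_factory=list)

    def explain(self) -> str:
        """Produce a human-readable summary the flight lead could review pre-mission."""
        lines = ["Goal priorities:"]
        for g in sorted(self.goals, key=lambda goal: goal.priority):
            lines.append(f"  {g.priority}. {g.name}: {g.rationale}")
        lines.append("Rules of engagement:")
        lines.extend(f"  - {rule}" for rule in self.rules_of_engagement)
        return "\n".join(lines)


profile = IntentProfile(
    goals=[
        Goal("avoid_collisions", 1, "safety of flight overrides all mission goals"),
        Goal("comply_with_flight_lead_commands", 2, "flight lead retains decision authority"),
        Goal("execute_assigned_mission_tasks", 3, "pursued only when goals 1 and 2 are satisfied"),
    ],
    rules_of_engagement=[
        "Never release weapons without explicit flight-lead authorization.",
        "Report any degraded sensor or datalink state immediately.",
    ],
)
print(profile.explain())
```

Surfacing such a profile during education and joint training would give the flight lead a concrete basis for predicting how the AW prioritizes competing demands, which is the essence of the shared intent discussed above.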

The current research is not without limitations. One limitation is that the study sample was limited to military personnel. Non-military personnel may be more or less accepting of future autonomy than military personnel; for instance, autonomous cars are beginning to hit the market, and recent accidents have been blamed on overreliance on the technology, whereas military fighter pilots may be more prone to skepticism of new technologies. A second, related limitation is that only military technologies were considered. Technologies available on the commercial market may be perceived differently than military technologies. However, both AACAS and the AW fit the criteria for “R3” in that they are real technologies, with real operators, that have the potential for real consequences in the world. Finally, the current study involved qualitative data; future research on this topic might include experimental studies to pinpoint the impact of different trust factors on trust intentions and trust-based behavior.