
3.1 Overview of Research Project

The objective of this research was to develop an approach that would allow us to better understand the behavioral states inherent in observed behaviors. We pursued this through mathematical modeling of human behavior, using large-scale signal corpora to integrate data modeling with physical/cognitive modeling. The research also aimed to establish practical techniques for detecting excessive trust in automated systems, in order to prevent negative outcomes such as bank transfer fraud and traffic accidents.

Figure 3.1 illustrates the scope of the technologies involved in this project. The central objective of the project was to construct an Accompanying Intelligence which could assist users of automated and semi-automated systems by controlling their interactions. In addition to conventional artificial intelligence functions, i.e., environmental awareness, planning and operation, this accompanying intelligence needs to understand human behavior in terms of what a person likes to do and how. In order to understand human behavior, it is necessary to model it mathematically. We know that excessive trust results from over-acceptance of positive support, even when the user's goal conflicts with the intention of the accompanying intelligence. However, quantitative measurement of this phenomenon is difficult, which creates a need for research into cognitive modeling of excessive trust. Moreover, technologies used to detect excessive trust need to be tested under realistic conditions.

In this chapter, we discuss our research project and our results from four perspectives, as described below:

Section 3.2: Mathematical modeling of human behavior.

Section 3.3: Cognitive modeling of excessive trust.

Section 3.4: Evaluation of technologies which detect excessive trust.

Section 3.5: Detecting excessive trust in telephone fraud calls.

Fig. 3.1 Project overview

Fig. 3.2 Test vehicle equipped with a variety of sensors. The vehicle carries sensors for collecting driving behavior and biological data from drivers, as well as data on peripheral vehicles. For our passing behavior study, we recruited five instructors from a driving school and five ordinary drivers, and had each driver drive twice around a circular route that included expressways. For our project to model excessive trust, we collected data from more than forty passing events for each driver, for a total of about 1,000 lane change events

3.2 Mathematical Modeling of Human Behaviors

3.2.1 Large Scale Behavior Signal Corpora

With the coming acquisition and distribution of big data, behavior signal corpora will become crucial elements of real world research into human behavior. This research project is part of this data-centric understanding of human behavior, which we seek to lead by constructing a large corpus of behavior signals recorded in real world environments. We also wish to develop methods for collecting behavior signals in real world environments, as well as tag labeling methods for acquired data, and to make this data available to the public for research purposes.

In particular, we have been collecting multi-sensor vehicle driving behavior signals continuously for more than ten years. Using a vehicle equipped with a variety of sensors, e.g., sound, video, kinematics and driver physiology, we have collected data from over 1,000 real world drivers (Fig. 3.2). In addition, in connection with this study, we have collected driving behavior data during 1,000 highway driving passing (overtaking) events, since this represents a typical situation in which excessive trust in an automated system may occur. We developed tags for driver gaze behavior during passing events (Fig. 3.3) and subjective risk scores were assigned to each passing event by five driving safety evaluators, based on collected video footage.

Fig. 3.3 Labeling of gaze direction. We labeled driver gaze direction with tags, using the lane change driving data mentioned above. A gaze direction labeling tool was developed which can classify gaze direction into one of 10 categories based on video images of drivers' faces. We used the collected driving data and gaze label data to successfully model the relationship between driver gaze behavior and driving behavior

3.2.2 Mathematical Modeling of Human Behavior

In order to achieve a symbiotic relationship between human beings and automated systems, we first need to describe human behavior in relation to such a system mathematically, as a sequence of motions. Machine control theory has traditionally been used for this purpose. However, methods which can handle the wide variety of non-deterministic human behaviors that result from a range of internal states have not been sufficiently developed, and methods of identifying control parameters remain unclear. In this project, we use information technology to describe human behavior as a sequence of motions, and to develop data-centric, dynamic methods of constructing human behavior models.

To achieve this, we focus on piecewise ARX (PWARX) models and Gaussian Mixture Models (GMM), powerful mathematical models which can be used for representing the dynamics of human behavior. The technical details of applying these models to driver behavior modeling are described in [1, 2]. These two models are mathematically equivalent to each other when squared error is used as the optimization criterion. We therefore aim to develop both models in a complementary manner, and then evaluate them with experiments using a test vehicle. By using mathematical behavior models, we can predict the driving behavior unique to a particular driver, which makes it possible to optimize vehicle control for an individual driver. For example, the subjective risk level that a driver perceives in the surrounding environment can be estimated using a behavior model, and can then be used as an evaluation criterion in path planning problems.

In one of our previous studies, we confirmed the stability of such a human-in-the-loop system through an experiment using a small electric vehicle equipped with a longitudinal driving assistance system, which was controlled using the same type of individualized path planning method described above [3].

PWARX Modeling. In our project, PieceWise Auto Regressive systems with eXogenous input (PWARX) models are primarily used as a mathematical expression of decision/action situations, but the original model has two problems:

  1. It cannot simultaneously classify and estimate piecewise auto regressive systems;

  2. It is not robust to probabilistic variability in training data.

Therefore, we needed to improve the original model by incorporating probabilistic elements into the piece-wise ARX model. For the purpose of extracting human identity and behavioral characteristics from large volumes of data, we employed a probabilistic approach and proposed a Probability Weighted ARX (PrARX) model, which also includes a parameter estimation method [2, 4, 5].
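The idea of weighting ARX modes probabilistically can be illustrated with a small sketch. This is not the PrARX formulation of [2, 4, 5]; it assumes that each mode has a linear ARX predictor and that the mode probabilities come from a softmax over linear scores of the regressor vector. The function names, the two modes and all parameter values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def prarx_predict(phi, thetas, etas):
    """Blend the outputs of several ARX modes with soft mode probabilities.

    phi    : regressor vector (past outputs, inputs, constant term)
    thetas : one ARX parameter vector per mode
    etas   : one weighting parameter vector per mode (assumed softmax form)
    """
    weights = softmax([dot(eta, phi) for eta in etas])
    outputs = [dot(theta, phi) for theta in thetas]
    return sum(w * o for w, o in zip(weights, outputs)), weights

# Hypothetical example: phi = [previous output, exogenous input, 1].
thetas = [[0.9, 0.1, 0.0],   # mode 0
          [0.5, -0.2, 1.5]]  # mode 1
etas = [[0.0, 4.0, -2.0],    # mode 0 is favored when the input is large
        [0.0, 0.0, 0.0]]     # mode 1 serves as the reference mode
phi = [1.0, 2.0, 1.0]

y_hat, w = prarx_predict(phi, thetas, etas)
print("mode probabilities:", [round(x, 4) for x in w])
print("blended prediction:", round(y_hat, 4))
```

Because the mode probabilities are smooth functions of the regressor, mode assignment and parameter estimation can in principle be handled in one optimization problem rather than in separate classification and regression steps.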

Gaussian Mixture Modeling. The technical merit of the GMM formulation compared to PWARX is its compatibility with data-centric approaches, e.g., adaptation to new data and/or consistent model identification. In order to complement the PWARX implementation in this project, we developed two adaptations of GMMs for application to driving behavior signals. For the model adaptation problem, we developed a Maximum A Posteriori (MAP) adaptation of the distribution weights so that they correspond to the mode distribution while driving, e.g., frequent stops during city driving versus uninterrupted driving on highways [6]. A non-parametric Bayesian approach was also implemented to estimate the optimal model structure (number of distributions) for a given set of training data. This approach has proven to be very effective when building robust models using data collected under real world conditions. It also makes it possible to determine automatically which model structure to employ in response to changes in the environment. As a consequence, the range of models which can be applied is expanded.
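As a rough illustration of weight-only MAP adaptation, the sketch below re-estimates the mixture weights of a one-dimensional GMM from new data while keeping the means and variances fixed. It is not the method of [6]; the update rule (soft counts combined with a prior of strength `tau`), the two components and the speed values are all assumptions made for illustration.

```python
import math

def gauss_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def map_adapt_weights(data, weights, means, variances, tau=10.0):
    """Weight-only MAP adaptation of a 1-D GMM.

    Soft counts from the new data are combined with the prior weights;
    tau (hypothetical) controls how strongly the prior is trusted.
    """
    counts = [0.0] * len(weights)
    for x in data:
        likes = [w * gauss_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        for k, like in enumerate(likes):
            counts[k] += like / total  # responsibility of component k for x
    n = len(data)
    return [(c + tau * w) / (n + tau) for c, w in zip(counts, weights)]

# Hypothetical prior model of vehicle speed (km/h): "slow/stopped" and "cruising".
prior_w = [0.5, 0.5]
means = [5.0, 80.0]
variances = [25.0, 100.0]

# Made-up highway speed samples: mostly cruising, one slow sample.
highway = [78.0, 82.0, 85.0, 79.0, 81.0, 90.0, 76.0, 4.0]

adapted = map_adapt_weights(highway, prior_w, means, variances)
print("adapted weights:", [round(w, 3) for w in adapted])
```

With these made-up samples the weight of the cruising component rises above its prior value, which mirrors the city versus highway example in the text.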

Hitherto, it has been common, when we want to describe a driver’s behavior, to consider the driver as a signal source or as a controller who can decide outputs based on external environmental input information. However, there are also situations in which the impact of environmental information is primary, in which drivers respond according to their own risk evaluation function (i.e., each driver’s unique interpretation of the situational risk level). Therefore, if we can estimate how a driver’s behavior is controlled by these risk evaluation functions, it becomes possible to construct a robust driver model. Based on this observation, we studied collision avoidance maneuvers and examined driver risk perception of local environments as the evaluation function of a path decision problem, and attempted to estimate the evaluation function.

Using a non-parametric Bayesian method based on the Dirichlet process, the number of Gaussians in each GMM can be determined from the data, so that the model structure can be selected automatically as the environment changes. Figure 3.4 shows the distributed structure of the trained driver model. It can be confirmed that an appropriate number of Gaussians has been selected, and that appropriate shapes have been trained.

Fig. 3.4 GMM driver models trained by non-parametric Bayesian estimation based on Dirichlet process. Each GMM represents a probabilistic function of the vehicle velocity under the given following distance for each individual driver

Fig. 3.5 Analysis of internal driver state based on synchronicity of changes in driving situation and gaze behavior. In order to understand the relationship between changes in the driving situation and gaze behavior, we used a large-scale data corpus to analyze differences between driving situations in which drivers exhibited either concentration or distraction (such as searching for music while driving)

3.2.3 Integrating Visual Behavior into the Mathematical Driver Behavior Model

By considering human behavior as a holistic system of cognition, decision and action, we can model the process from cognition to decision as one that converts continuous signals from the outside world into a sequence of discrete internal states. Thus, we can formulate this as a conventional pattern recognition problem. This led us to a new approach to constructing driver behavior models, which considers the driver’s internal state by modeling the mutual relationship between environmental changes, gaze behavior and driving behavior. Using large amounts of driving behavior signals from our corpus, we investigated the proposed modeling method, which integrates driver gaze behavior (cognition of the environmental situation) and operational behavior (decision making and vehicle operation), and verified its effectiveness [7–9].

3.2.4 Analysis of Gaze Behavior in Relation to Environmental Change

Drivers' internal states are influenced by external stimulation, and the actions taken in response to this stimulation are manifested as reactions. Therefore, there should be a correlation between gaze behavior and changes in the driving environment. By characterizing the relationship between signals associated with gaze behavior and environmental change, we can detect differences in a driver’s internal state, such as concentration or distraction (Fig. 3.5).

When considering passing cars as external stimulation, it was confirmed that drivers in a normal operating state (concentrating on the surrounding environment) reacted more quickly to changes in the environment than distracted drivers (who were searching for music using a speech recognition interface). This was indicated by the visual fixation of focused drivers on surrounding vehicles, resulting in a significant difference in response time of about 500 ms. Utilizing this difference between focused and distracted driving, we employed maximum a posteriori probability estimation based on a Naive Bayes method, and achieved an improvement in the precision of detection of a driver’s internal state of approximately 20 % in comparison to a driver state identification method used in previous studies, namely anterior fixation gaze ratio [10, 11]. In addition, based on principal component analysis of gaze direction, it was also confirmed that gaze shifting patterns are synchronized with surrounding driving situations when drivers are in a state of concentration, but that gaze shift patterns are not in synchronization when drivers are distracted. On average, gaze movement in response to passing vehicles is delayed when drivers are distracted. These differences in gaze movement can be used to improve the accuracy of inattentive driving detection [12].
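The classification step can be sketched with a small Gaussian Naive Bayes classifier that picks the driver state with the highest posterior score. This is a minimal illustration, not the detector evaluated in [10, 11]: the two features (gaze reaction delay in seconds and forward gaze ratio), the Gaussian likelihoods and every training value are assumptions.

```python
import math

def fit_gaussian_nb(samples_by_class):
    """Estimate a log prior and per-feature mean/variance for each class."""
    total = sum(len(rows) for rows in samples_by_class.values())
    model = {}
    for label, rows in samples_by_class.items():
        n, dims = len(rows), len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        # A small floor keeps the variance strictly positive.
        variances = [sum((r[d] - means[d]) ** 2 for r in rows) / n + 1e-6
                     for d in range(dims)]
        model[label] = (math.log(n / total), means, variances)
    return model

def map_classify(model, x):
    """Return the class with the largest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for xi, m, v in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - 0.5 * (xi - m) ** 2 / v
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training samples: [gaze reaction delay (s), forward gaze ratio].
training = {
    "concentrated": [[0.6, 0.70], [0.7, 0.75], [0.5, 0.65], [0.8, 0.72]],
    "distracted":   [[1.1, 0.55], [1.3, 0.60], [1.2, 0.50], [1.0, 0.58]],
}
model = fit_gaussian_nb(training)

print(map_classify(model, [0.6, 0.70]))
print(map_classify(model, [1.2, 0.55]))
```

The Naive Bayes assumption treats the features as conditionally independent given the driver state, which keeps the number of parameters small when labeled data are limited.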

Fig. 3.6 Integrative modeling of gaze behavior and vehicle operation behavior. Integrative cognitive behavior modeling was used to describe the cognitive state of a driver, in relation to both the surrounding environment and to the driver’s gaze behavior, in a mathematical and integrative manner. In this model, gaze behavior and driving behavior are represented as a time series of discrete events, based on a Hidden Markov model. It was confirmed that dangerous vehicle operation behavior can be detected with a high degree of accuracy and also that an HMM can characterize safe and risky lane changes

Fig. 3.7 Detection results for risky driving using the integrated model. This figure shows the improvement in the risky driving detection rate achieved using the proposed integrated model, confirming that the proposed integration of driver gaze and behavior data greatly improved detection accuracy, compared with models using only gaze or only driving behavior. Thus, the degree of driver concentration can be quantified using gaze direction, vehicle behavior and driver operation behavior

3.2.5 Integrated Model of Gaze and Vehicle Operation Behavior

We then focused on lane change scenes in which both gaze behavior and vehicle operation behavior were crucial for safety, and evaluated the effectiveness of modeling the correlation between gaze behavior and vehicle operation behavior signals by attempting to detect risky lane changes. Our driving signal corpus of 1,000 lane changes was used for this experiment.

The correlation between signals was modeled using a Hidden Markov Model (HMM) which had 3–5 states, each of which was characterized by a multivariable, discrete symbol distribution, i.e., gaze direction, pedal operation, longitudinal and lateral acceleration, etc. Each variable was coded into 4–10 discrete symbols, as shown in Fig. 3.6. Two different HMMs were trained using two sets of training data, i.e., the 5 % safest and 5 % riskiest lane change events, and were labeled Safe HMM and Risky HMM, respectively.

We performed experiments to detect risky lane change events based on the ratio between the likelihoods calculated using the Safe and Risky HMMs. Figure 3.7 shows the results of these experiments. As shown in the figure, by integrating visual and operational behaviors we were able to improve risky lane change detection accuracy by 10 %, achieving 90 % accuracy with a false positive rate of 20 % [7–9]. Using this approach, the degree of correlation between driver concentration and driving operation can be quantified using a combination of visual and driving behavior signals. The technology developed here will be utilized for detecting excessive trust in Sect. 3.4.
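The likelihood ratio test between two HMMs can be sketched as follows. The sketch is deliberately simplified and all of its numbers are hypothetical: it uses two hidden states and a single discrete symbol stream (gaze only), whereas the models described above have 3–5 states and multivariable symbol distributions, and its parameters are set by hand rather than trained.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence (scaled forward algorithm)."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[p] * trans[p][s] for p in range(n_states))
                     * emit[s][obs[t]] for s in range(n_states)]
        scale = sum(alpha)
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like

# Hypothetical gaze symbols: 0 = forward, 1 = mirror, 2 = elsewhere.
SAFE = {
    "start": [0.8, 0.2],
    "trans": [[0.7, 0.3], [0.4, 0.6]],
    "emit": [[0.7, 0.25, 0.05], [0.2, 0.7, 0.1]],
}
RISKY = {
    "start": [0.8, 0.2],
    "trans": [[0.9, 0.1], [0.5, 0.5]],
    "emit": [[0.6, 0.05, 0.35], [0.3, 0.2, 0.5]],
}

def risky_log_ratio(obs):
    """Positive values mean the Risky model explains the sequence better."""
    return log_likelihood(obs, **RISKY) - log_likelihood(obs, **SAFE)

checked_mirrors = [0, 1, 1, 0, 1, 0]
looked_away = [0, 2, 2, 0, 2, 2]
for name, seq in (("checked_mirrors", checked_mirrors),
                  ("looked_away", looked_away)):
    ratio = risky_log_ratio(seq)
    print(name, "risky" if ratio > 0 else "safe")
```

The decision threshold of zero is also an assumption; shifting it trades detection rate against false positive rate.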

3.3 Cognitive Model of Excessive Trust

3.3.1 Misuse/Disuse of Automated Systems

In this project, excessive trust is defined based on misuse/disuse of automated systems. As shown in Fig. 3.8, misuse/disuse of a system is a function of the relative performance of the automated system compared to the performance of manual operation by the user. When an automated system is not used, even though its performance is superior to that achieved during manual operation, the automated system is disused. When an automated system is used, even though its performance is inferior to that achieved during manual operation, the automated system is misused. As part of this study, we developed a method of quantitatively analyzing the misuse/disuse of automated systems using a 3D plane, as shown in Fig. 3.9. The decision whether or not to use an automated system is made based on reliance on the system, which does not exactly correspond with the actual capabilities of either the automated system (\(C_{a}\)) or performance during manual operation (\(C_{m}\)). Therefore, in general, the discrepancy between use of the automated system and its actual capabilities can be measured by observing how much the user misuses or disuses the system (Fig. 3.10).
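The definitions of misuse and disuse given above can be written down directly. The following is a minimal sketch; the trial format (capability of the automated system, capability of manual operation, whether automation was used) and the capability values are hypothetical, and ties between the two capabilities are treated as appropriate use.

```python
def classify_use(c_auto, c_manual, used_auto):
    """Label one trial according to the misuse/disuse definition."""
    if used_auto and c_auto < c_manual:
        return "misuse"   # automation used although manual would do better
    if not used_auto and c_auto > c_manual:
        return "disuse"   # automation not used although it would do better
    return "appropriate"

def misuse_disuse_rates(trials):
    """Fraction of trials falling into each category."""
    counts = {"misuse": 0, "disuse": 0, "appropriate": 0}
    for c_auto, c_manual, used_auto in trials:
        counts[classify_use(c_auto, c_manual, used_auto)] += 1
    n = len(trials)
    return {label: count / n for label, count in counts.items()}

# Made-up trials: (C_a, C_m, automation used?)
trials = [
    (0.9, 0.6, True),
    (0.9, 0.6, False),
    (0.4, 0.7, True),
    (0.4, 0.7, False),
    (0.8, 0.5, False),
]
rates = misuse_disuse_rates(trials)
print(rates)
```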

Fig. 3.8 Misuse/disuse of automated systems

Fig. 3.9 Cognitive model of user switching between manual operation and an automated system, based on Gao and Lee [13]

Fig. 3.10 Characterization of misuse/disuse. The tendency to misuse/disuse an automated system can be characterized by changes in the frequency of using an automated system compared to manual operation, taking into account the given capacities of both the automated system \(C_{a}\) and manual operation \(C_{m}\), i.e., \(f(C_{m}, C_{a})\). It was found that operators who disuse automated systems are more sensitive to changes in manual performance, based on the observation that the slope with respect to \(C_{m}\), \(\partial f/\partial C_{m}\), is larger than that with respect to \(C_{a}\)

3.3.2 Measuring Misuse/Disuse

The basic scenario used to measure misuse/disuse of the system can be described as follows. The subject performs a line tracing task with the help of an automated drafting system. The subject can use either auto or manual mode to perform the task, but the accuracy of both the automated system and the manual drawing system is manipulated so that each sometimes fails to work properly. If the subject uses the assistance system effectively, the frequency of usage is expected to approach the frequency with which the system functions accurately. When a subject uses the automated system much less frequently than it functions accurately, the subject is considered to have a tendency to disuse the system. Results are characterized using a three-dimensional plane consisting of the frequency of auto-mode usage, user satisfaction with auto system functioning and user satisfaction with the results of manual drawing. In order to control manual performance, the system is designed to behave incorrectly at a given frequency even when in manual mode. Therefore, we can measure the misuse/disuse tendency as a function of both automatic and manual drawing performance. This basic experiment was performed using 200 subjects and the effectiveness of the scenario was confirmed [14].

As shown in Fig. 3.11, this experiment can be implemented using different platforms, including a real vehicle, a driving simulator, or a simple PC gaming interface.

Fig. 3.11 Multi-platform test environments. The test platforms, which differ in their degree of abstraction, were a real vehicle, a driving simulator (DS) and a video game. The experimental scenario was common to all of the systems, and all of the systems used a similar driving course. By using such a multi-platform test, it is possible to discuss the consistency of behavior across different systems. (a) Real car (real system). (b) DS (virtual system). (c) Game (laboratory system)

This misuse/disuse testing framework can be used to quantify the degree of excessive trust in an automated system. The results of this experiment clearly showed that, compared with subjects who have a tendency to misuse the automated system, subjects who disused the system were unable to evaluate the capabilities of the system properly. Disusers were more sensitive to changes in functioning capability when making decisions [15–17]. In addition, by modeling misuse/disuse tendencies with the proposed integrated cognitive architecture (ACT-R), we were able to confirm our hypothesis [18, 19].

3.4 Study Measuring Overreliance on an Automated Driving System

We then conducted a simulator study to quantify the degree of driver overreliance on an automated driving system by modeling the consistency of driver decision making and driver gaze behavior during automated driving [20].

3.4.1 Definition of Over-Reliance

Figure 3.12 shows our hypothetical model of driver over-reliance on an automated driving system. We assumed that a driver is more dependent on the automated system if he or she is less sensitive to the risk level of the surrounding environment when making decisions during automated driving than during conventional driving. In other words, over-reliance on a system can be represented by a decrease in the consistency of the decision making threshold during automated driving as compared to conventional driving.

Fig. 3.12 Degree of driver over-reliance on an automated system as represented by the gap in decision making ambiguity between conventional and automated driving

3.4.2 Experimental Conditions

3.4.2.1 Visual Behavior Based Situational Awareness

We focused on an automated driving system which requires drivers to monitor the surrounding environment and vehicle behavior in order to be ready to take control of the vehicle during critical situations. Since drivers are not required to operate the gas or brake pedals or steering wheel during automated driving, the degree of driver over-reliance on the automated system needs to be detected by observing behavior other than vehicle operation. In this study, we focused on using driver gaze behavior during lane changes to measure situational awareness. We analyzed the gaze behavior of fifteen drivers and observed their decision making process when changing lanes during conventional and automated driving.

3.4.2.2 Traffic Scenario

Fifteen subjects drove on a simulated, straight highway consisting of two lanes of traffic moving in the same direction using a driving simulator (Fig. 3.13). Each subject drove under conventional and automated driving conditions. They each made approximately 40 lane changes, both to the left and to the right, under each condition. Subjects were instructed to reach their destination as quickly as possible while driving close to the speed limit, by passing other vehicles ahead of them by making lane changes, if possible.

During automated driving, subjects could take their feet off the gas and brake pedals and their hands off the steering wheel, but were instructed to continue monitoring the roadway so that they could take control of the vehicle at any time, such as when automated control of the vehicle becomes risky. They could intervene in control of the vehicle by operating the pedals or steering wheel themselves if they felt there was any danger.

Fig. 3.13 Traffic scenario used in the driving simulator. Drivers under time pressure make lane changes into the faster moving lane. Then the faster moving lane gradually becomes congested and turns into the slower moving lane, and drivers move back into their original lane. The speed of the traffic flow alternates iteratively between lanes

3.4.3 Estimation of Driver’s Risk Tolerance When Making Decisions

We represented the driver’s risk tolerance when making decisions as a two-dimensional surface, estimated using logistic regression. Figure 3.14 shows an example of the surface obtained for a sample driver. The driver’s lane change decisions during conventional and automated driving are shown in the graphs on the left and right, respectively. Each dot in the graph shows the result of the driver’s decision whether or not to make a lane change, according to the risk level of the surrounding environment. The round and triangular dots overlap more widely during automated driving than during conventional driving, and the decision surface is more uniform during automated driving, indicating that the decision-making threshold became more ambiguous.
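The link between overlapping decisions and an ambiguous threshold can be made concrete with a small logistic regression sketch. Several simplifications are assumed here and none of them come from the study itself: risk is reduced to a single scalar in [0, 1], the model is fitted by plain gradient descent for a fixed number of steps, ambiguity is summarized by the mean log-loss of the fitted model, and the decision data are made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(risks, decisions, lr=0.5, steps=2000):
    """Fit P(change lane | risk) = sigmoid(b + w * (risk - mean risk))."""
    n = len(risks)
    center = sum(risks) / n
    b, w = 0.0, 0.0
    for _ in range(steps):
        grad_b = grad_w = 0.0
        for r, y in zip(risks, decisions):
            err = sigmoid(b + w * (r - center)) - y
            grad_b += err
            grad_w += err * (r - center)
        b -= lr * grad_b / n
        w -= lr * grad_w / n
    return b, w, center

def mean_log_loss(model, risks, decisions):
    """Average negative log-likelihood; larger values mean a fuzzier threshold."""
    b, w, center = model
    total = 0.0
    for r, y in zip(risks, decisions):
        p = sigmoid(b + w * (r - center))
        total -= math.log(p if y == 1 else 1.0 - p)
    return total / len(risks)

# Made-up data: 1 = lane change made, 0 = lane change not made.
risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
conventional = [1, 1, 1, 1, 0, 0, 0, 0]   # clean threshold around 0.5
automated = [1, 0, 1, 1, 0, 1, 0, 0]      # decisions overlap across risk levels

model_conv = fit_logistic(risks, conventional)
model_auto = fit_logistic(risks, automated)
loss_conv = mean_log_loss(model_conv, risks, conventional)
loss_auto = mean_log_loss(model_auto, risks, automated)
print("ambiguity increased:", loss_auto > loss_conv)
```

Because the made-up conventional decisions are perfectly separable, the fitted slope keeps growing with the number of steps; the fixed step count acts as an implicit regularizer in this sketch.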

Fig. 3.14 A driver’s decisions whether or not to make lane changes based on the risk level of the surrounding environment, with probability surface derived using logistic regression (Left: conventional driving. Right: automated driving)

3.4.4 Gaze Behavior During Lane Changes

Driver gaze direction was classified into one of five directions using the driver’s gaze coordinates on the screen of the simulator. Figure 3.15 shows an example of the gaze behavior of a driver during lane changes. The figure shows the relative frequencies of the five gaze directions accumulated at each point in time during right lane changes. The top and bottom graphs correspond to conventional and automated driving, respectively. We analyzed gaze direction over a 20 s window, from 10 s before to 10 s after the beginning of each lane change.

Fig. 3.15 Driver gaze behavior during right lane change (Top: conventional driving. Bottom: automated driving)

We can see from Fig. 3.15 that the proportion of the driver’s gaze directed in front of the vehicle decreased significantly during automated driving. The driver also looked to the right or into the right rear-view mirror more often when the vehicle was making automated right lane changes than when the driver made manual right lane changes. He also looked more to the left or into the left rear-view mirror when the vehicle was making automated left lane changes than when he made manual left lane changes.
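One simple way to turn such relative frequencies into a single deviation score is sketched below. The distance measure (total variation distance), the direction names and the gaze labels are assumptions chosen for illustration; the text does not specify how the deviation was computed.

```python
from collections import Counter

# Assumed names for the five gaze directions.
DIRECTIONS = ["front", "left", "right", "left_mirror", "right_mirror"]

def gaze_distribution(labels):
    """Relative frequency of each gaze direction in a list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return [counts[d] / n for d in DIRECTIONS]

def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Made-up gaze labels sampled during right lane changes.
conventional = ["front"] * 8 + ["right"] * 1 + ["right_mirror"] * 1
automated = ["front"] * 5 + ["right"] * 3 + ["right_mirror"] * 2

deviation = total_variation(gaze_distribution(conventional),
                            gaze_distribution(automated))
print("gaze deviation:", round(deviation, 3))
```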

3.4.5 Detection of Excessive Trust by Analyzing Relationship Between Driver Decision Making and Gaze Behavior

Figure 3.16 shows the relationship between increased inconsistency in decision making and deviation in gaze behavior during automated and conventional driving. The horizontal axis represents the increase in inconsistency of a driver’s lane change decisions during automated driving as compared with conventional driving. The vertical axis shows the deviation in driver gaze behavior during automated driving from gaze behavior during conventional driving. We can observe some correlations between these parameters. For example, this figure shows that drivers who showed a larger deviation in consistency between automated and conventional driving also tended to be less sensitive to risk factors in the surrounding environment. These data also indicate that driver over-reliance on an automated driving system could be detected by monitoring the deviation in driver gaze behavior between automated and conventional driving.

Fig. 3.16 Deviation in gaze behavior between automated and conventional driving in relation to increased inconsistency in lane change behavior

3.5 Detecting Excessive Trust in Telephone Fraud Calls

This method of detecting excessive trust is applicable to other problems as well. For example, “Ore-Ore” (“It’s me”) telephone fraud has become a serious problem in Japan. Victims receive a telephone call from a fraudster who pretends to be a family member (a grandson, for example) who is in some kind of trouble. Eventually, the victim begins to believe the fraudster, i.e., develops excessive trust in the situation: the “grandson” really is about to be arrested if the missing company funds are not recovered, so the victim transfers a large sum of money to save him. These crimes succeed because of excessive trust in the reality of the faked urgent situation, as well as excessive trust in the fraudster. In order to prevent this kind of fraud, we applied a different type of behavior model from the one used in driver modeling. We theorized that the conversational behavior of victims during fraud calls may change in two ways, namely, vocabulary and speech quality. Therefore, we used keyword spotting technology and developed a speech quality measurement technique that characterizes physical changes in the vocal cords, i.e., stiffness. To measure these physical changes, we used Power Difference in Sub-band Spectrum (PDSS), a measurement technique originally developed for speaker recognition [21], to measure the harmonic structure of vocal cord vibration. As shown in Fig. 3.17, stressed utterances recorded during simulated fraud call situations lack the regularity of harmonic structure in the higher frequency band which occurs in unstressed speech.

Fig. 3.17 Spectral characteristics of vocal cord vibration in normal and stressed utterances. The observed irregularity in the high frequency region can be quantified using PDSS

We conducted a field test of our “Ore-Ore fraud” detection system, which combined the above-mentioned speech quality measure with keyword spotting technology that counts special words used during fraud-related conversations, such as “transfer”, “police” and “lawsuit”. The test was carried out under real-world conditions with the help of the Okayama Prefectural Police and a bank. During the field test, no cases of telephone fraud were reported in Okayama Prefecture, whereas 4–8 cases are reported per month on average, a pronounced drop. Although we could not precisely confirm the effectiveness of the detection system, we could confirm that there were no false detections of fraud during the experiment.

3.6 Conclusion and Details of Ongoing Project

The purpose of this project was to develop an approach which would allow us to better understand behavioral states inherent in observed behaviors. In order to develop a mathematical model of driving behavior we developed two models, a PWARX and a GMM model. Furthermore, we extended the basic formulation of these models in order to improve their robustness and to make them more adaptable to new environments. Fully utilizing our large driving behavior signal corpus, we also analyzed and modeled the visual behavior of drivers during lane changes. We confirmed the effectiveness of the visual behavior model through experimental evaluation involving the detection of risky lane changes. Finally, in order to quantify excessive trust in automated systems, we conducted a simulator study which compared driver gaze behavior during lane changes under conventional and automated driving. Our experimental results showed that drivers who exhibited greater deviation in gaze behavior tended to be over-reliant on the automated driving system.

The driver behavior modeling technologies developed during this research will have a big impact on the safety and energy efficiency of intelligent vehicles. Our research on behavior during semi-automated (mixed-mode) driving is very important, particularly during the early stages of the commercialization of automated driving technologies. Nevertheless, since the achievements of these projects have only been verified under simulated or limited experimental conditions, the robustness of these technologies will need to be tested and improved in real-world situations. In 2014, the Ministry of Economy, Trade and Industry of Japan (METI) began a research project on next generation advanced driving assistance systems, in order to develop advanced safety technologies utilizing big data and mathematical driver models. The project focuses on building large signal corpora that contain a sufficient number of accidents/events, and then building a safe driver model using these corpora. The approach developed in the current research project is also used as the fundamental method of modeling driving behavior in the METI project. The biggest challenge the METI project faces is verifying how any developed technologies will scale up to incorporate big data that will include almost every possible real-world driving situation. To achieve that goal, the authors are studying technologies that can subdivide traffic situations into meaningful chunks, and generate huge but discrete representations of actual driving situations.