1 Introduction

The development and application of new knowledge and information technologies have enormous influence on the way people live, work and learn. In the law enforcement sector, analysts are constantly required to understand and make sense of huge amounts of often unstructured data. Sense-making in this context means that analysts have to find and interpret relevant facts by actively constructing a meaningful and functional representation of some aspects of the “whole picture”. Visual Analytics (VA) possesses the potential to support the analyst’s reasoning and sense-making processes.

This is the point where the European project VALCRI comes into play. Addressing the challenges facing today’s law enforcement agencies, the main aim of this project is to support analysts in their reasoning and sense-making processes by providing appropriate data analytics tools that apply the methods of visual analytics. One key focus of the project is on human issues, such as how to mitigate or avoid cognitive biases that might be caused by such automated systems, how sense-making occurs in this context, and how information and knowledge should be structured to support the human reasoning process.

In the course of this project, a visual analytics platform has been created that addresses the functional and thinking requirements of analysts [23, 24]. This platform consists of more than fifteen synchronized tools. Five of them are described in the following and three of them are depicted in Fig. 5.1.

  • The Search tool allows users to search for specific crime incidents or to filter them by geographical area, time frame and crime type (e.g. burglary). The result is made accessible through the various tools of the VALCRI platform.

  • The Time tool shows a line chart that indicates the number of crime incidents. The time frame can be changed interactively, in order to get either a more detailed view or an overview of the data. Similar to the Time tool, a Statistical process control tool (SPC-tool) shows standard deviations of the number of recorded crime incidents in this time frame. This allows the user to quickly spot statistical outliers which may indicate that something unusual happened.

  • The Location tool depicts crime incidents on an interactive map. Crime incidents are represented as single dots or, if a larger set of crimes (more than 200) is available in that area, as rectangles. In such cases, the size of the filled-out rectangle within a particular area indicates the number of crime incidents: the area with the highest number is completely filled and the other areas are filled relative to it. The map can be interactively zoomed in and out, which automatically changes the visual representation and synchronizes the other tools with the updated dataset selection.

  • The Bar Chart tool shows the number of crimes according to a classification scheme. Discrimination factors include crime types, districts and resolving state. According to such discriminators, the numbers of crimes are shown on a bar chart sorted by the number of crime incidents. Clicking on a particular bar limits the dataset and synchronizes the other tools accordingly.

  • The List tool presents a list of the currently selected crimes including their metadata. Details of the crime are shown including the involved subjects, the location, time information and full description.

Fig. 5.1 The Time, Location and Bar Chart tools of the VALCRI platform

Even if the support for sense-making with VA technologies is helpful and valuable, there is still a well-known problem of systematic errors, so-called cognitive biases, that might hinder analysts from drawing sound conclusions. Cognitive biases occur when imperfect knowledge, uncertainty, complexity and time constraints prevent people from making optimal decisions. In such situations, people often apply heuristics, which can be thought of as “rules of thumb”, when making decisions or when evaluating the value, importance and meaning of information. These heuristics are useful in many cases; however, they can lead to severe and systematic errors in judgments and decisions [15, 21]. In the context of law enforcement analysis, these “systematic errors” or cognitive biases can occur in every phase of the decision-making and reasoning process, for example in the form of discounting, misinterpreting, ignoring, rejecting or overlooking pieces of information.

A large number of cognitive biases have been suggested and described in the literature. However, in the course of the VALCRI project and related requirements analysis, a set of eight cognitive biases has been selected, based on their significance for the daily routines of analysts [13]. These cognitive biases are listed in Table 5.1.

Table 5.1 Relevant cognitive biases in the VALCRI project

This chapter focuses on the question of how to ensure that a VA platform mitigates cognitive biases, approached from three perspectives: (i) a theory-driven, (ii) an empirical and (iii) a data-driven perspective.

Mitigating cognitive biases means either reducing the probability that cognitive biases occur or, if they cannot be avoided, reducing their negative effects on decisions and judgments. A prerequisite for answering this question empirically, for example in the course of experimental summative evaluations, is measuring whether and to what extent a cognitive bias occurs. Operationalization refers to the process and outcome of making constructs that are not directly observable measurable. This would enable cognitive biases to be measured whilst a user interacts with a VA environment.

In the following section, we address some theory-driven approaches. Theory-driven refers to the fact that solely domain experts, in this case experts in the field of cognitive science or cognitive biases, address the question of how to avoid, mitigate or operationalize cognitive biases. In the first subsection, some examples of a-priori design principles are given, for example how visualizations should be designed or how data should be represented. It is followed by a subsection on how to systematically analyze the tools which constitute a VA platform and a subsection which describes how to measure cognitive biases “on the fly”, i.e. by identifying actions and interactions with the platform. The subsequent section deals with empirical approaches, such as behavioral observations of analysts and operationalizations of cognitive biases that enable us to carry out experimental studies. We call these approaches empirical because end-users, i.e. analysts, are required and their data, responses and evaluations are used for data analysis. Finally, the data-driven approach refers to statistical and data-mining methods that aim to identify patterns in a user’s interactions with the visual analytics platform that correlate with the presence or absence of cognitive biases.

2 Theory-Driven Approaches

This section describes three methods for cognitive bias detection and mitigation that are based on theoretical considerations and a literature review.

2.1 Design Recommendations

In the ideal case, visualizations are designed in such a way that they do not induce cognitive biases at all. For several reasons, this ideal case is hard to achieve. Visualizations are made to serve a specific purpose, for example to give an overview or to summarize data which could otherwise only be described in confusing tables or exhausting texts. Representations are less detailed, less complex or less manifold than the part of reality they aim to represent. Visualizations usually present a subset of a particular set of data; the more prototypical this subset, the easier it is for its recipients to generalize to the whole dataset. The selection of the subset and the way it is displayed, structured and visualized are the outcome of the human decision process of the visualization designer, and human decision processes are vulnerable to cognitive biases. Nevertheless, a small set of a-priori design principles on how to make good visualizations can help. At the least, there are recommendations on how to avoid some notable cognitive biases in visualizations, for example through the graphical layout of competing information [4] or through multiple views of the same information [13].

In the following, a simple example demonstrates how the above-mentioned selection process, as well as design decisions on how to display these pieces of information, might have an effect on recipients. One particular cognitive bias which has an impact on the selection process is called Selective Perception, and a particular cognitive bias which has an impact on how a certain visualization is interpreted is the so-called Framing Effect. Selective Perception refers to the effect that only a small part of reality is represented and is in the focus of one’s attention, a small part that is usually not representative of the whole. The Framing Effect is the tendency to draw different conclusions from the same information, depending on how that information is presented [22]. The data in the following charts (Fig. 5.2) are from the 2016 annual report of the police crime statistics of the Austrian Ministry of the Interior [3] and represent the overall numbers of recorded complaints. The two charts demonstrate the Framing Effect: they depict the same information with different aspect ratios. The chart on the left uses an aspect ratio of 3:5, while the chart on the right uses an aspect ratio of 4:3. The increase in recorded complaints from 2015 to 2016 looks more dramatic in the left chart than in the right one. For this reason, the American Psychological Association (APA) [1] recommends using a 4:3 aspect ratio for all histograms and bar graphs. The range of the scales can also have a large effect. For example, a comparison of Fig. 5.2 with Fig. 5.3 shows that the increase from 2015 to 2016 appears even less dramatic if the ordinate starts at 0. The APA suggests either starting all ordinates at 0 or clearly highlighting where this is not the case.

Fig. 5.2 The data from 2014 to 2016 in 3:5 format (left) and in 4:3 format (right)

Fig. 5.3 The data from 2014 to 2016 with an ordinate starting from 0 (left) and the data from 2007 to 2016 (right)

Figure 5.3 also indicates that the impression of a trend depends on the time frame, which is also an example of the Clustering Illusion. The chart on the right-hand side of Fig. 5.3 shows that the numbers are actually decreasing when comparing both halves of the 10-year period.
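The effect of aspect ratio and axis range can be made concrete with a small calculation. The following Python sketch computes the angle at which the same increase is drawn on screen for different chart shapes and ordinate ranges. The function name, the reading of 3:5 and 4:3 as width:height, and all numbers are assumptions chosen for illustration; they are not the values from the report [3].

```python
import math


def apparent_angle_deg(y_start, y_end, axis_min, axis_max, x_fraction, width, height):
    """Return the on-screen angle (in degrees) of a line segment in a chart.

    x_fraction is the share of the horizontal axis covered by the segment;
    width and height describe the plot area in the same arbitrary unit.
    """
    # Vertical extent on screen: data change relative to the axis range.
    dy = (y_end - y_start) / (axis_max - axis_min) * height
    # Horizontal extent on screen.
    dx = x_fraction * width
    return math.degrees(math.atan2(dy, dx))


# Hypothetical values for illustration only (NOT the figures from the report).
Y_2015, Y_2016 = 100.0, 110.0
X_FRACTION = 0.5  # 2015 to 2016 is one of two steps on a 2014-2016 axis

# Same data, truncated ordinate (90 to 115), two aspect ratios (width:height).
tall = apparent_angle_deg(Y_2015, Y_2016, 90.0, 115.0, X_FRACTION, width=3, height=5)
wide = apparent_angle_deg(Y_2015, Y_2016, 90.0, 115.0, X_FRACTION, width=4, height=3)
# Same data, 4:3 chart, ordinate starting at 0.
zero_based = apparent_angle_deg(Y_2015, Y_2016, 0.0, 115.0, X_FRACTION, width=4, height=3)

for label, angle in [("3:5", tall), ("4:3", wide), ("4:3, ordinate from 0", zero_based)]:
    print(f"{label}: {angle:.1f} degrees")
```

Under these assumptions the identical increase is drawn steepest in the 3:5 chart, flatter in the 4:3 chart, and flattest when the ordinate starts at 0, which mirrors the impressions described for Figs. 5.2 and 5.3.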

2.2 Systematic Tool Analysis

The systematic tool analysis aims to evaluate and improve VA environments and their tools with regard to their potential to avoid or mitigate cognitive biases. In a nutshell, this approach investigates each tool with respect to the various cognitive biases. In a first step, the tools of the platform are selected and briefly described, which includes the context in which they are used, their purpose, their input data and output format, etc. Then the tools are analyzed by domain experts such as cognitive psychologists or experts on cognitive biases. These experts have to evaluate whether, and to what extent, the tools either mitigate or facilitate different cognitive biases. Ideally, such an analysis is done for each cognitive bias separately.

Such a systematic investigation leads to a matrix, with tools as rows and cognitive biases as columns. For each cell, the investigator describes to what extent the respective tool mitigates or facilitates that particular cognitive bias. For example, the Location tool with its map view (see Fig. 5.1) may lead to Selective Perception if a specific area is heavily crowded with crime incidents, because this might draw the attention of the analysts away from other parts of the map. An example of a tool that has the potential to mitigate the Confirmation Bias is the Time tool, as it allows the user to change the time frame and thus the amount of crime data displayed. This results in the presentation of different perspectives and contexts of crime data, which can help to avoid the Confirmation Bias. The outcome of this method is an overview of the whole platform’s capabilities for mitigating cognitive biases and of the dangers of inducing them.
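As an illustration, such a matrix could be kept in a simple data structure. The following Python sketch is only one possible representation: the class and function names are hypothetical, only two of the eight biases are listed, and the map-based tool is assumed to be the Location tool. The two filled cells paraphrase the examples given above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Assessment:
    effect: str     # "mitigates" or "facilitates"
    rationale: str  # free-text judgment of the expert


TOOLS = ["Search", "Time", "Location", "Bar Chart", "List"]
BIASES = ["Selective Perception", "Confirmation Bias"]  # subset, for brevity

Matrix = Dict[str, Dict[str, Optional[Assessment]]]


def empty_matrix(tools: List[str], biases: List[str]) -> Matrix:
    """Create a tools-by-biases matrix in which no cell has been assessed yet."""
    return {tool: {bias: None for bias in biases} for tool in tools}


def open_cells(matrix: Matrix) -> List[Tuple[str, str]]:
    """List the (tool, bias) pairs that still need an expert assessment."""
    return [(tool, bias)
            for tool, row in matrix.items()
            for bias, cell in row.items()
            if cell is None]


def tools_with_effect(matrix: Matrix, bias: str, effect: str) -> List[str]:
    """List the tools assessed as having the given effect on the given bias."""
    return [tool for tool, row in matrix.items()
            if row[bias] is not None and row[bias].effect == effect]


matrix = empty_matrix(TOOLS, BIASES)
matrix["Location"]["Selective Perception"] = Assessment(
    "facilitates", "A crowded area on the map may draw attention away from other areas.")
matrix["Time"]["Confirmation Bias"] = Assessment(
    "mitigates", "Changing the time frame presents the data in different contexts.")

print(len(open_cells(matrix)), "cells still to be assessed")
```

Keeping unassessed cells explicit makes it easy to see which tool and bias combinations the experts still have to cover.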

In order to analyze the danger of cognitive biases and mitigation strategies of individual tools, we propose to follow the Delphi method [19]. Delphi is designed as a structured and systematic process to develop forecasting perspectives on future events by asking panel experts. Typically, this method is performed in two or more iterative rounds, whereby in each round, every expert evaluates the current state and provides additional input, which leads to an adapted and improved next version.

2.3 Process-Oriented Operationalization

The aim of the following approach is to identify and describe the users’ actions and interactions with the tools of the VALCRI platform in order to measure their tendencies towards cognitive biases. The effect of a particular cognitive bias can be predicted in certain well-defined decision tasks. However, in the case of an interactive VA platform, there is a wide range of potential behavioral manifestations, which makes it impossible to describe all the actions and interactions that occur when biased behavior takes place. The design recommendations and the systematic tool analysis described above can provide insights that help to identify the behavioral patterns related to cognitive biases. To demonstrate how this method works, we focus on the example of Selective Perception and briefly outline how it could occur when using the Search, List and Location tools of the VALCRI platform. As mentioned above, this cognitive bias is defined as being focused on a particular area of the information space whilst ignoring other pieces of information.

To detect this particular cognitive bias, a similarity measure can be computed between the keywords entered into the Search tool, between the documents and crime reports further examined via the List tool, or between the parameters of the visualizations of the Location tool. A high similarity among the keywords, the selected documents and the visualization parameters over a longer period of time is considered an indication that the user is focused on a particular area of the information space, i.e. an indication of Selective Perception.
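A minimal sketch of such an indicator for the Search tool is given below. It assumes that each query can be reduced to a set of keywords and uses the Jaccard index as the similarity measure; the text above does not prescribe a particular measure, and the threshold, the minimum number of queries and the example keywords are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_consecutive_similarity(queries):
    """Average similarity between each query and the one that follows it."""
    pairs = list(zip(queries, queries[1:]))
    if not pairs:
        return 0.0
    return sum(jaccard(p, q) for p, q in pairs) / len(pairs)


def indicates_narrow_focus(queries, threshold=0.6, min_queries=5):
    """Flag a sustained run of very similar queries.

    threshold and min_queries are illustrative assumptions, not validated values.
    """
    if len(queries) < min_queries:
        return False
    return mean_consecutive_similarity(queries) >= threshold


# Hypothetical query logs (keyword sets per search).
focused = [
    {"burglary", "district-a"},
    {"burglary", "district-a", "night"},
    {"burglary", "district-a", "night"},
    {"burglary", "district-a"},
    {"burglary", "district-a", "weekend"},
]
varied = [
    {"burglary", "district-a"},
    {"robbery", "district-b"},
    {"burglary", "district-c"},
    {"vehicle-theft", "district-a"},
    {"robbery", "district-a"},
]

print(indicates_narrow_focus(focused), indicates_narrow_focus(varied))
```

The same pattern could be applied to the documents opened in the List tool or to the map parameters of the Location tool, provided these can be encoded as comparable sets or vectors.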

In the context of VA, it is important to distinguish between different kinds of search modes, such as explorative, investigative, hypothesis-driven and question-driven searches. The validity of the operationalization of any cognitive bias can be improved by taking such contextual information into account. For example, in the case of a hypothesis-driven search, an analyst who is engaged in a small area of the information space should not be identified as being affected by Selective Perception. However, this does not mean that the user’s behavior is not influenced by other cognitive biases.

3 Empirical Approaches – Behavioral Observation and Outcome-Oriented Operationalization

This section presents two empirical methods for detecting cognitive biases.

3.1 Behavioral Observation

In the context of the VALCRI project, several behavioral observations have been carried out. In one study, nine experienced law-enforcement analysts worked on a task for around two hours, separately from each other. While working on the task, they were asked to “think aloud” about their reasoning, ideas and conceptions. Their activities were video and audio recorded and the screen activity was captured. The participating analysts’ task was to analyze a particular crime type in a city district over a given period of time, and the main question for them was whether more patrols should be sent to this city district. A qualitative interview was then carried out.

While working on the task, the participants were observed by at least one expert on cognitive biases who did not intervene during this exercise. The observer filled out a prepared form, indicating the time when a cognitive bias was observed, the tools that had been used by the analyst, and if necessary, further explanation on this observation in an open format. These observations were subsequently validated and enriched by two other experts who used the video and audio recordings.

On the one hand, the outcome of this exercise was a validation and enrichment of the systematic tool analysis described in Sect. 5.2.2, as well as the elaboration of new ideas for potential process-oriented indicators. On the other hand, in contrast to the purely theory-driven elaboration of the matrix of tools and cognitive biases, this exercise resulted in a mapping between sets of tools and cognitive biases. The reason for this is that for certain, often more complex, workflows and processes, the analysts used a combination of tools simultaneously.

An example would be the combination of the Time tool, the SPC tool and the Location tool when searching for “peaks in the noise”, for a certain area and period of time. In many cases, the search for such peaks was focused on the maximum values and quite often, the analysts were not trying to falsify their initial hypothesis (e.g. by checking also for other periods of time or other city districts). This particular work process often resulted in vastly overlapping combinations of some cognitive biases: the Confirmation Bias, the Framing Effect, the Base Rate Fallacy and the Clustering Illusion, i.e. these cognitive biases occurred often in parallel.

3.2 Outcome-Oriented Operationalization

3.2.1 Confirmation Bias

Despite the large number of cognitive biases mentioned in the literature, only a few methods, such as questionnaires or tests, have been suggested for their objective measurement. One example is the Selective Exposure Paradigm, which was proposed by Festinger [7] in the context of cognitive dissonance theory and later applied to elicit “confirmatory information search” [9]. Confirmatory information search is a main component of the Confirmation Bias. The Selective Exposure Paradigm is structured as follows: participants are confronted with a decision task and have to make an initial decision for one of two alternatives. Then the participants are exposed to various pieces of information that either confirm or disconfirm their initial decision. Half of the pieces of information are consistent with the initial decision (i.e., the selected alternative) and half of them are not. In some variants of the Selective Exposure Paradigm, the pieces of information are short headline-like statements and the participants also have to indicate whether or not they would like to read further (more detailed) information on each statement [10]. Confirmatory information search is observed when participants do not change their initial decision, even when confronted with a large number of disconfirming pieces of information, and when they are not interested in reading the detailed information.

Another aspect of the Confirmation Bias is Confirmatory Information Evaluation [8]. For each piece of information and statement, participants can be asked to what extent they consider this statement to be important and credible. Importance and assumed credibility are usually highly correlated with each other. Confirmatory Information Evaluation can be observed if the importance and credibility evaluations for consistent statements (i.e. statements that are in favor of the initial decision) are higher than for statements that are in favor of the alternative.

The values for Confirmatory Information Search and Confirmatory Information Evaluation can be interpreted as an individual’s baseline measurement of the Confirmation Bias when evaluating the visualization system.
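The two baseline values could be scored as in the following Python sketch. The scoring rules are assumptions made for illustration, not the scoring used in the VALCRI studies: the search index is taken as the share of consistent statements selected for further reading minus the share of inconsistent ones, and the evaluation index as the difference between the mean ratings of consistent and inconsistent statements. The rating scale and the example responses are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Statement:
    consistent: bool    # supports the participant's initial decision
    read_more: bool     # participant asked for the detailed information
    importance: float   # rating on a hypothetical numeric scale
    credibility: float  # rating on the same hypothetical scale


def search_index(statements: List[Statement]) -> float:
    """Share of consistent items selected minus share of inconsistent items selected."""
    consistent = [s for s in statements if s.consistent]
    inconsistent = [s for s in statements if not s.consistent]
    share_consistent = mean(1.0 if s.read_more else 0.0 for s in consistent)
    share_inconsistent = mean(1.0 if s.read_more else 0.0 for s in inconsistent)
    return share_consistent - share_inconsistent


def evaluation_index(statements: List[Statement]) -> float:
    """Mean rating of consistent items minus mean rating of inconsistent items."""
    def rating(s: Statement) -> float:
        return (s.importance + s.credibility) / 2.0

    consistent = mean(rating(s) for s in statements if s.consistent)
    inconsistent = mean(rating(s) for s in statements if not s.consistent)
    return consistent - inconsistent


# Hypothetical responses of one participant.
responses = [
    Statement(True, True, 6, 6),
    Statement(True, True, 5, 6),
    Statement(False, True, 3, 4),
    Statement(False, False, 2, 3),
]

print(search_index(responses), evaluation_index(responses))
```

In this sketch, positive values of either index point towards confirmatory behavior, and values near zero indicate a balanced treatment of consistent and inconsistent statements.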

3.2.2 Clustering Illusion

The Clustering Illusion is defined as the tendency to see patterns where no patterns exist [12]. This tendency can be observed, for example, when people interpret patterns or trends in random distributions. A very similar cognitive bias is the Gambler’s Fallacy, which refers to the belief that runs of one binary outcome will be balanced by the opposite outcome [2, p. 118]. In both cases, the fallacy is based on the belief that random events or data points follow some rules, trends or patterns, which, of course, they do not.

In the context of the VALCRI project, the following outcome-oriented operationalization of the Clustering Illusion has been applied: participants were confronted with examples, each a small dataset of 60 crime incidents, and were asked to make a decision on the basis of these examples. They used certain tools of the VALCRI platform, in particular the Location, Time and List tools. The Location tool indicated the spatial distribution of the crime incidents, the Time tool provided insight into their temporal distribution over different periods of time, and the List tool enabled the participants to look at some details of the incidents. The crime incidents had been randomly selected from a larger dataset and were located in two separate districts in the suburban areas of the city of Birmingham.

In the main study, four examples were provided to the participants. For each example, the participants had ten minutes to inspect the data by using the above-mentioned tools. Two examples were considered random and the remaining two had been constructed in such a way that there was a temporal increase over a period of six months and a local concentration within one of the city districts. Another independent variable in the main study was the extent to which the participants could interact with the data. In half of the examples, the participants were allowed to interact with the tools and to change the parameters of the visualizations. In the other half, participants were asked to use only the List tool and to keep the other tools, i.e. the visualizations in the narrower sense, as prepared by the evaluators. In the interactive condition, it was possible to inspect the data from different perspectives and, in principle, to falsify one’s own impressions of patterns or trends.

After inspecting the data, the participants were asked (i) to decide whether they would increase the police presence in city district A or in city district B, (ii) to rate the certainty of their decision, (iii) to state whether their decision was based on patterns and trends in the data, and if so, (iv) to justify their decision. The idea was to measure an individual’s tendency to see patterns where no patterns exist by means of the confidence ratings (ii) and the extent to which their decisions were based on data (iii) for the random examples. These individual tendencies can be taken into account as the baseline when evaluating the visualization quality of the VALCRI system with regard to the Clustering Illusion.
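How such a baseline could be derived from the answers is sketched below in Python. The sketch only summarizes the answers to the random examples and reports the two measures separately; how they would be combined is left open here. The record layout, the certainty scale from 0 to 1 and the example answers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Answer:
    example_is_random: bool  # True for the random examples, False for constructed ones
    certainty: float         # answer to (ii), hypothetical scale from 0 to 1
    based_on_patterns: bool  # answer to (iii)


def clustering_illusion_baseline(answers: List[Answer]) -> Tuple[float, float]:
    """Return (mean certainty, share of pattern-based decisions) for random examples only."""
    random_only = [a for a in answers if a.example_is_random]
    if not random_only:
        raise ValueError("at least one random example is required")
    mean_certainty = sum(a.certainty for a in random_only) / len(random_only)
    pattern_share = sum(1 for a in random_only if a.based_on_patterns) / len(random_only)
    return mean_certainty, pattern_share


# Hypothetical answers of one participant to the four examples.
answers = [
    Answer(example_is_random=True, certainty=0.8, based_on_patterns=True),
    Answer(example_is_random=True, certainty=0.4, based_on_patterns=False),
    Answer(example_is_random=False, certainty=0.9, based_on_patterns=True),
    Answer(example_is_random=False, certainty=0.7, based_on_patterns=True),
]

mean_certainty, pattern_share = clustering_illusion_baseline(answers)
print(mean_certainty, pattern_share)
```

High certainty and a high share of pattern-based decisions on the random examples would both point towards a stronger individual tendency to see patterns where none exist.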

4 Automatic Cognitive Bias Detection Approach

In this section, we briefly outline a method to automatically detect cognitive biases based on user interaction patterns. Even though this method could be regarded as an approach that can be applied to any cognitive bias, we focus here on the Confirmation Bias and the Clustering Illusion. In addition to the automated bias detection method, this section also outlines how a detected bias can be mitigated through feedback and prompts. This approach follows and extends the idea described by Nussbaumer et al. [18].

The starting point for the automatic cognitive bias detection is the operationalizations described in Sects. 5.2.3 and 5.3.2. These allow us to assess, in a controlled setting, whether a participant in such an experiment shows these cognitive biases. Based on this method, we propose a data-driven approach to detect cognitive biases by taking into account interaction data of users (log data of user actions). If a cognitive bias is detected (indicated by a high probability for the occurrence of a bias), then a prompt or visual feedback is provided to the user (see Fig. 5.4).

Fig. 5.4 The overall approach to integrating automatic bias detection into a visual analytics environment

The data-driven method is based on machine learning algorithms that automatically classify the user’s behavior in a visual analytics environment (VAE) based on the interactions with the tools of this environment. Participants in a study have to solve a criminal analysis task with the VAE. This task is embedded in a controlled experiment (e.g. the Selective Exposure Paradigm described in Sect. 5.3.2.1 or the Clustering Illusion study described in Sect. 5.3.2.2) so that it can be assessed whether their behavior is biased. Additionally, log data from their interaction with the VA tools are collected. From the experiment, it is known which interaction data come from biased and which from unbiased users, so that two groups can be created. These two groups form the basis for the classification of interaction data from users who did not participate in such an experiment. In this way, when a user makes use of the VA tools, interaction data are collected and it can be determined whether these data are more similar to those of a biased or of an unbiased user. For this classification, several machine learning methods are available, such as the Support Vector Machine algorithm [20] or clustering algorithms [6].
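The idea of comparing a new user’s interaction data with the two groups can be sketched in a few lines of Python. The sketch deliberately uses a simple nearest-centroid rule as a stand-in for the Support Vector Machine or clustering algorithms mentioned above. The two features per user (share of time spent on a single map region, time-frame changes per minute) and all numbers are hypothetical, and the resulting score is a relative distance measure, not a calibrated probability.

```python
import math
from typing import List, Sequence


def centroid(rows: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a group of feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def bias_score(features: Sequence[float],
               biased_centroid: Sequence[float],
               unbiased_centroid: Sequence[float]) -> float:
    """Score in [0, 1]; values above 0.5 mean 'closer to the biased group'."""
    d_biased = math.dist(features, biased_centroid)
    d_unbiased = math.dist(features, unbiased_centroid)
    if d_biased + d_unbiased == 0.0:
        return 0.5  # both centroids coincide with the observation
    return d_unbiased / (d_biased + d_unbiased)


def classify(features, biased_centroid, unbiased_centroid) -> str:
    score = bias_score(features, biased_centroid, unbiased_centroid)
    return "biased" if score > 0.5 else "unbiased"


# Hypothetical training data from a controlled experiment:
# [share of time on one map region, time-frame changes per minute]
biased_group = [[0.9, 0.2], [0.8, 0.4], [0.85, 0.3]]
unbiased_group = [[0.4, 1.5], [0.5, 1.2], [0.3, 1.8]]

c_biased = centroid(biased_group)
c_unbiased = centroid(unbiased_group)

# Interaction data of two new users who did not take part in the experiment.
print(classify([0.8, 0.5], c_biased, c_unbiased))
print(classify([0.45, 1.4], c_biased, c_unbiased))
```

In practice the features would have to be scaled to comparable ranges, and a trained classifier with a proper validation procedure would replace the distance rule.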

The method described above calculates probabilities for the occurrence of a cognitive bias. If such a probability is high, feedback could be provided to make the user aware that a cognitive bias might be involved in the thinking process. Such feedback can consist of visual cues that do not distract the user unduly but nevertheless catch the user’s attention.
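One way to keep such feedback unobtrusive is to combine a probability threshold with a minimum interval between prompts. The following Python sketch illustrates this under stated assumptions: the class name, the threshold of 0.8 and the interval of 300 seconds are hypothetical and would have to be tuned with real users.

```python
from typing import Optional


class FeedbackPolicy:
    """Decide when a bias prompt is shown (illustrative sketch only)."""

    def __init__(self, threshold: float = 0.8, min_interval_s: float = 300.0):
        self.threshold = threshold            # hypothetical probability threshold
        self.min_interval_s = min_interval_s  # hypothetical pause between prompts
        self._last_prompt_at: Optional[float] = None

    def should_prompt(self, probability: float, now_s: float) -> bool:
        """Return True if a prompt should be shown at time now_s (in seconds)."""
        if probability < self.threshold:
            return False
        recently_prompted = (
            self._last_prompt_at is not None
            and now_s - self._last_prompt_at < self.min_interval_s
        )
        if recently_prompted:
            return False  # avoid distracting the user with repeated prompts
        self._last_prompt_at = now_s
        return True


policy = FeedbackPolicy()
decisions = [
    policy.should_prompt(0.90, now_s=0.0),    # high probability, first prompt
    policy.should_prompt(0.95, now_s=60.0),   # still high, but too soon
    policy.should_prompt(0.50, now_s=400.0),  # probability below threshold
    policy.should_prompt(0.90, now_s=400.0),  # high again after the pause
]
print(decisions)
```

The prompt itself could then be rendered as a subtle visual cue in the affected tool rather than as a blocking dialog.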

5 Conclusion and Outlook

Overall, this chapter aims at providing new methods and knowledge for discovering, measuring, and mitigating cognitive biases in the context of VA. Though a vast body of literature exists that deals with cognitive biases, most of it treats cognitive biases on a theoretical level. The work presented in this chapter includes several steps towards devising methods for measuring and mitigating cognitive biases.

The methods we have elaborated extend the state of the art in measuring cognitive biases in several dimensions. Firstly, a new procedure to measure the Clustering Illusion has been developed. The results are promising, but the applied methodology should be improved: further analysis of the log data should be carried out to determine whether or not it contains typical patterns of participants who are more influenced by the Clustering Illusion. Secondly, the method of measuring cognitive biases through a classification of cognitive processes and assigning them in a structured observation constitutes a new approach in this field. This provides a basis for the operationalization of further cognitive biases. Thirdly, the data-driven approach outlines a method to detect cognitive biases based on user interactions with a VAE. All these methods outline new directions for measuring cognitive biases, consisting of empirical studies, expert-driven behavioral observations and automatic observations through a logging system. In order to avoid detrimental effects of cognitive biases altogether, new design recommendations have been elaborated. Though these design recommendations are based on existing ideas in the literature, the innovation lies in the translation of these ideas into the design of VA components. Furthermore, the systematic tool analysis provides a new approach to critically evaluating a VAE with regard to its potential to induce or mitigate cognitive biases. This analysis allows for formative and summative assessments of a VAE.

Data visualization is a type of communication and, just as in every communication process, the presented information can be misinterpreted by the receivers. One reason for such misunderstanding could be the presence of cognitive biases. In this chapter, we focused on a small set of cognitive biases which could occur in a VAE. In the future, the design and evaluation of visualization techniques should be influenced by a combination of data-driven and theory-driven methods. The basic principles of these approaches could be easily transferred to different VAEs and applied to other cognitive biases. Another important aspect is the context in which the visualization is used. Ignoring the context could lead to false classifications of biased and unbiased behavioral patterns.

Users require interactive interfaces and personalized visualization techniques. Emerging and innovative visualization techniques allow users to interact with datasets in new ways. Even though VA is a dynamic field of research, classical principles for detecting and mitigating cognitive biases have often been disregarded. To design informative visualizations with the least impact of cognitive biases, the cooperation of different fields of expertise is necessary.