6.1 Introduction

Major initiatives often have specific drivers and circumstances that bring them into being, provide the focus for investigation and analysis, and shape the subsequent applications that build on them. The Scientific Revolution of the sixteenth and seventeenth centuries was one such development. It was initiated in part by a new and systematic view of the natural world, articulated by Newton [1], in which nature operated by fixed laws that could be represented by mathematics, and in part by the availability of resources for experimentation and the development of the natural sciences.

The driver for the first electronic digital computers of the 1940s, ENIAC in the USA [2] and the stored-program EDSAC in the UK [3], was the need to perform numerical calculations much faster than the earlier mechanical means allowed. Bowden [4] captured the key attribute of this new invention: it could calculate much faster than a human could think. Valves (vacuum tubes) gave way to transistors, and the rest is history. Both the Scientific Revolution and the advent of the digital computer may be considered paradigm shifts according to the definition advanced by Kuhn [5].

The driver for Scientific Visualization, as discussed earlier, was the requirement of the National Science Foundation’s Division of Advanced Scientific Computing in the USA in 1987 for advice on how best to utilize supercomputers to meet users’ needs in a variety of scientific disciplines. Many of these were producing “fire-hoses” of data that far outstripped the computational capability to analyze them even with supercomputers. The report, Visualization in Scientific Computing (McCormick et al. [6]), was produced after consultation and described how such an initiative could be accomplished, and has provided the basis for much of the subsequent development in this field.

In a similar way, the driver for the development of visual analytics was the requirement of the US Department of Homeland Security in 2004 for better IT tools and facilities (Hennessy et al. [7]) for the protection of its national and international borders. Much data was already available in different forms and in different places, but it was also massive, complex, incomplete, dynamic, and uncertain. Networks could integrate databases, but facilities were needed to provide the best analyses of data that were often varying in real time. The National Visualization and Analytics Center (NVAC) [8] was therefore asked to define a research and development agenda to facilitate advanced analytical insight. After international consultation and two working group meetings, the report Illuminating the Path: The Research and Development Agenda for Visual Analytics (Thomas and Cook [9]) was produced. This report covered the following key areas:

  • The science of analytical reasoning.

  • Visual representations and interaction techniques.

  • Data representations and transformations.

  • Production, presentation, and dissemination of actionable information to decision-makers in a form understandable by them.

  • Moving research into practice.

Each of these areas produced a number of recommendations for the further work or action that was needed.

The initial requirements implicitly included the following prerequisites:

  • The ability to aggregate heterogeneous datasets.

  • The need to be able to interact effectively with the data.

  • Efficient and effective visualizations that could link into the human cognitive processes.

  • Incorporation of real-time data, whether autonomous or from direct human input.

  • Incorporation of portable, handheld devices for mobile analytics.

The McKinsey report (Henke et al. [10]) detailed the ways in which visual analytics is able to address the challenges of a data-driven world. Fisher et al. [11] and Hong et al. [12] reviewed aspects of visual analytics for big data.

6.2 Defining Visual Analytics

According to Thomas and Cook [9], visual analytics may be defined as follows:

the science of analytical reasoning facilitated by interactive visual interfaces. People use visual analytics tools and techniques to synthesize information and derive insight from massive, dynamic, ambiguous, and often conflicting data; detect the expected and discover the unexpected; provide timely, defensible, and understandable assessments; and communicate assessment effectively for action

Visual analytics is a multidisciplinary field that includes the following focus areas:

  • Analytical reasoning techniques that enable users to obtain deep insights that directly support assessment, planning, and decision making

  • Visual representations and interaction techniques that take advantage of the human eye’s broad bandwidth pathway into the mind to allow users to see, explore, and understand large amounts of information at once

  • Data representations and transformations that convert all types of conflicting and dynamic data in ways that support visualization and analysis

  • Techniques to support production, presentation, and dissemination of the results of an analysis to communicate information in the appropriate context to a variety of audiences

Visual analytics therefore represents a development “of the fields of information visualization and scientific visualization that focuses on analytical reasoning facilitated by interactive visual interfaces” [13]. This close coupling of human reasoning and cognitive ability with computer processing and display makes visual analytics suitable for large and complex problems that would be more difficult to address by other methods. It has therefore been closely associated with the analysis of big data.

This may be summarized as follows:

  • Scientific visualization deals with data that has a natural geometric structure (e.g., MRI data, wind flows).

  • Information visualization handles abstract data structures such as trees or graphs.

  • Visual analytics is especially concerned with coupling interactive visual representations with underlying analytical processes (e.g., statistical procedures, data mining techniques) such that high-level, complex activities can be effectively performed (e.g., sense making, reasoning, presentation, decision making) [13].

Keim et al. [14] formulated a visual analytics “mantra” to capture the overall process of the VA method: analyze first, show the important, zoom, filter and analyze further, details on demand.
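
As a concrete, deliberately simplified illustration, the steps of the mantra can be traced on a tabular dataset. The following sketch uses pandas; the synthetic data, the column names, and the z-score rule for what counts as “important” are assumptions made for this example, not part of Keim et al.’s formulation.

```python
# A minimal sketch of the visual analytics mantra on synthetic data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sensor": rng.integers(0, 100, 10_000),
    "value": rng.normal(50.0, 5.0, 10_000),
})

# 1. Analyze first: score every record before anything is shown.
df["anomaly"] = (df["value"] - df["value"].mean()).abs() / df["value"].std()

# 2. Show the important: surface only the most anomalous readings.
important = df.nlargest(20, "anomaly")
print(important[["sensor", "value", "anomaly"]])

# 3. Zoom, filter and analyze further: restrict to one suspect sensor.
suspect = important.iloc[0]["sensor"]
subset = df[df["sensor"] == suspect]

# 4. Details on demand: full records only when the analyst asks for them.
print(subset.describe())
```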

A key component of visual analytics, therefore, is the potential to magnify and augment human cognitive power and capability by means of:

  • increasing cognitive resources, such as by using a visual resource to expand human working memory,

  • reducing search, such as by representing a large amount of data in a small space,

  • enhancing the recognition of patterns, such as when information is organized in space by its time relationships,

  • supporting the easy perceptual inference of relationships that are otherwise more difficult to induce,

  • perceptual monitoring of a large number of potential events, and

  • providing a manipulable medium that, unlike static diagrams, enables the exploration of a space of parameter values [9].

Scholtz et al. [15] provide an introduction to visual analytics.

6.3 What Makes Visual Analytics Distinctive?

A possible objection to the foregoing analysis is that humans can be unreliable in performing specific tasks, and therefore can be unreliable when interacting with data. It would be safer to let the computer perform the numerical calculations and output the results, which could then be acted on by the human. In other words, keeping the human out of the loop could be the preferable option. This is certainly true for computing tasks that are generally automated, e.g., checking the credit card information for a transaction at a point-of-sale device against a database of valid cards. In this case, there are only two possible results: valid or invalid. Thus, in simple cases, when the information is numerical and quantitative, the analysis can be performed by the computer alone.

However, the available data often represent a situation about which there is some uncertainty. The computer analysis will have little knowledge or information about this external uncertainty, and therefore human guidance is needed on which part, or parts, of the data to analyze. In addition, if there is a time-critical element in the analysis, then human intervention may be mandatory. An example is the decision to be made at a frontier post on whether to let a particular cargo pass through, or whether it should be stopped and searched. The computer may have information about the company’s products and services in its database, along with its trading records and any previous customs infringements, but is unlikely to have specific information on this particular movement of goods. The human observer at the border will have additional local information, and may also have sensor data from devices used to detect illegal substances. The combination of these two sets of information is crucial to deciding what to do next. Clearly, such time-critical decisions also need to be optimal and as accurate as possible: searching more trucks and vehicles than necessary could lead to long traffic delays and adverse economic effects. In this case, keeping the human in the loop is key to the success of the overall evaluation and the efficiency of the process. An advantage of this approach is the judgement that the human can exercise, based on their evaluation of the overall situation and their experience of similar incidents in the past. A possible disadvantage of having the human in the loop is human error in inputting data from the particular local situation. One solution is to automate the sensor and vision data input, but human judgement may still be required in the overall analysis in order to reach an optimum decision.
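
The border-control example can be read as a simple human-in-the-loop protocol: the computer scores what it knows from its records, the human contributes local observations and sensor readings, and the final decision combines both. The sketch below is purely illustrative; the field names, weights, and thresholds are assumptions for the example, not a description of any deployed system.

```python
# A hedged sketch of a human-in-the-loop border decision. All field
# names, weights, and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CargoRecord:
    prior_infringements: int   # from the customs database
    trader_years_active: int   # from the company's trading records

def machine_score(rec: CargoRecord) -> float:
    """Risk contribution from what the computer already knows."""
    history_risk = min(rec.prior_infringements * 0.2, 0.6)
    novelty_risk = 0.2 if rec.trader_years_active < 2 else 0.0
    return history_risk + novelty_risk

def decide(rec: CargoRecord, sensor_alert: bool, officer_suspicion: float) -> str:
    """Fuse the database view with human and sensor input at the border.

    officer_suspicion encodes the human judgement in [0, 1]; a positive
    sensor reading forces a search regardless of the database view.
    """
    if sensor_alert:
        return "search"
    combined = 0.5 * machine_score(rec) + 0.5 * officer_suspicion
    return "search" if combined > 0.4 else "pass"

record = CargoRecord(prior_infringements=3, trader_years_active=10)
print(decide(record, sensor_alert=False, officer_suspicion=0.3))  # -> search
```

The design point is that neither input alone drives the routine case: the database view and the officer’s judgement are fused, while a positive sensor reading overrides both.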

This coupling of human reasoning and experience with computational analysis is the principal strength of visual analytics. It implies that the relationship between these two components needs to be fully understood, and numerous research projects are exploring this area in greater detail. In addition, the cognitive processes involved in human reasoning are the subject of ongoing research aimed at better understanding how humans process key information. Finally, the kinds of visual representations appropriate to accurate and efficient decision-making are also being explored in more detail. For example, a visualization chosen for the scientific analysis of a simulation may not be appropriate for a security application in which real-time data are being analyzed.

Visual analytics may be expressed diagrammatically as shown in Fig. 6.1.

Fig. 6.1 The visual analytics workflow. Based on D. A. Keim, J. Kohlhammer, G. P. Ellis, F. Mansmann: Mastering The Information Age—Solving Problems with Visual Analytics. Eurographics, 2010. Licensed under the Creative Commons Attribution-Share Alike 4.0 International license (https://commons.wikimedia.org/wiki/File:VisualAnalyticsWorkflow.svg)

Keim et al. [16] outline the challenges associated with analyzing large amounts of data. Yang and Wu [17] detail ten challenges associated with data mining research. Järvinen et al. [18] present the findings of a project by VTT, TKK, and Helsinki Institute of Information Technology on the concept of visual analytics and the state of visual analytics research and development, and its relevance for industrial and consumer applications in Finland.

6.4 Software for Visual Analytics

Following an earlier survey of commercial systems by Zhang et al. [19], a further analysis was performed by Behrisch et al. in 2018 [20]. Evaluation was based on a number of aspects including features, performance, usability, suitability for specific user groups, and ability to handle complex data types; possible future developments were also proposed. Developments in the software between the two surveys were noted, including the entry of new products into the marketplace. The visual analytics market was estimated to grow to $6.5 billion by 2022 [21] (from $2.2 billion in 2016), and it is therefore a highly competitive area, with many companies seeking to capitalize on the opportunities presented by increasing volumes of data and the growing adoption of visual analytics tools. A number of software and systems providers are seeking to augment their analytics products with Artificial Intelligence (AI) in order to provide enhanced capability and to increase their leverage in the marketplace.

Scholtz [22] emphasizes the importance of including the user and their intended application(s) in the evaluation of visual analytics software, rather than just comparing the functions available within the software, or their interfaces. There may be specific difficulties with the data in a particular application area, and therefore a good match is needed between the analytics software and the application. In other words, there can be a gap between the design of a visual analytics tool and its use in actual practice. User-centered evaluations of analytics software are essential in order to obtain an accurate view of their applicability to the domain of interest. Aspects of the user’s requirements include the tasks to be performed, the data (and metadata) to be used, the filtering processes to be applied to the data, and the thinking processes that users engage in when seeking to understand the meaning of the data. For example, for the latter aspect, it may take time for a user to understand the significance of a visualization when it changes. The cognitive processes of the human are often overlooked in the analysis of data and the operation of visual analytics software. Such user-centered evaluations may become even more important as new functions (such as AI) are added to future software. Scholtz [22] also includes visual analytics case studies adapted from the intelligence and human–computer interaction communities to illustrate the principles advanced.

Table 6.1 shows examples of visual analytics software.

Table 6.1 Examples of visual analytics software

According to Behrisch et al. [20], QlikView, Spotfire, PowerBI, and Tableau are the established key players in the field. A number of the products listed in Table 6.1 offer free downloads of older versions, or versions with reduced functionality, in order to give potential users an opportunity to try the software on their own data. Harger and Crossno [23] reviewed the open-source visual analytics toolkits available in 2012. Although open-source toolkits have largely been superseded by commercial companies offering free downloads, the methodology employed to compare them is still useful to consider, alongside the user-centered evaluation detailed by Scholtz [22].

6.5 National Visualization and Analytics Centers

6.5.1 Collaborative Research and Development

The current trend toward large research projects is exemplified by funding programs in the UK, the European Union, and the National Science Foundation (NSF). In the UK, the larger projects involve inter-institution collaborations where the combined research strength is believed to be greater than the sum of the parts [24]. In addition, collaboration is often required when interdisciplinary research is involved. In the European Union, many research and development projects involve collaborations between institutions and organizations across at least three European countries and usually involve industrial partners, especially where the outputs of the projects are expected to be new products or services.

In the USA, the National Science Foundation Science and Technology Centers (STC) program was created in 1987 to pursue foundational interdisciplinary research. The objective was to address increasing global competition and to develop innovative, interdisciplinary approaches in important areas of basic research. The first STCs were established in 1989 and more were added in 1991. The STC Program was administered through the Office of Science and Technology Infrastructure at NSF. One example was the following.

The Graphics and Visualization Center [25], established in 1991, was a consortium of research groups from five universities: Brown University, the California Institute of Technology (Caltech), Cornell University, the University of North Carolina at Chapel Hill, and the University of Utah. It conducted research in modeling, rendering, high-performance architectures, graphical interaction and communication, and scientific visualization.

6.5.2 USA—National Visualization and Analytics Centers

In order to implement the research and development agenda defined by Thomas and Cook [9], the Department of Homeland Security set up a network of Visualization and Analytics Centers [26] to advance the various functional areas. This network included University of Washington, Stanford University, Purdue University, Pennsylvania State University, Georgia Institute of Technology, and University of North Carolina at Charlotte [27].

Areas of interest to the network of Visualization and Analytics Centers included

  • Data wrangling and preparation,

  • Distributed storage architectures,

  • Advanced computational concepts,

  • Analytics and visualization,

  • Human-centered systems,

  • Decision support and business processes,

  • Privacy and security, and

  • Analytics for the Internet of Things and embedded systems.

The Visual Analytics for Command, Control and Interoperability Environments (VACCINE) center [28], centered on Purdue University, created methods, tools, and applications to analyze and manage large amounts of information for all mission areas of homeland security as efficiently as possible. It was established as a Center of Excellence by the Department of Homeland Security Science and Technology Directorate.

6.5.3 Canadian Network for Visual Analytics

In Canada, a related network for visual analytics (CANVAC) [29] was set up in Vancouver in 2012 to:

address the needs of a growing visual analytics (VA) research community in Canada by supporting the requirements of all VA stakeholders, i.e., researchers, developers and user organizations. CANVAC has the following goals:

  • To develop and assist in the development of visual analytics expertise in Canada.

  • To facilitate and promote research conducted in the field of visual analytics.

  • To support and promote education and training in visual analytics in academia and industry.

  • To promote and represent the Canadian visual analytics community internationally [29].

The founding members included the University of British Columbia, Simon Fraser University, and Dalhousie University; further participants included the University of Alberta, the University of Calgary, York University, OCAD University, and the University of New Brunswick. Industry partners included organizations utilizing visual analytics tools, as well as companies involved in aerospace, safety and security, health care, telecommunications, and transportation. Links are also maintained with NVAC in the USA, UKVAC in the UK, BRAVA in Brazil, and VisMaster in the EU.

CANVAC focused on visual analytics techniques to

  • Acquire and manage large amounts of data.

  • Visually explore and synthesize information.

  • Derive insight from massive, dynamic, ambiguous, and often conflicting data.

  • Provide assessments that are timely, defensible, and understandable.

  • Communicate assessments effectively to allow action.

Diagrammatic representations of these activities are also provided [30].

6.5.4 The Vancouver Institute for Visual Analytics

The Vancouver Institute for Visual Analytics (VIVA) [31] is a joint research institute of Simon Fraser University, the University of British Columbia, and the British Columbia Institute of Technology. In 2007, the Boeing Company provided an industrial research grant to Simon Fraser University and the University of British Columbia to study visual analytics. Its objectives were to disseminate visual analytics research results to government and business organizations and to detail how visual analytics might be used within Boeing. A further grant was provided in 2010 to establish VIVA. VIVA provides training courses, seminars, and a summer school in visual analytics to students, researchers, SMEs, and industrial employees, enabling these tools to be applied to current problems.

6.5.5 The Visual Analytics Research and Development Consortium of Canada

The Visual Analytics Research and Development Consortium of Canada (VARDEC) [32] was founded in 2013 in cooperation with Mitacs Inc., the Canadian Network for Visual Analytics (CANVAC), and The Boeing Company. Other founding Canadian member companies included nGrain (Canada) Corporation and Convergent Manufacturing Technologies. The objective of VARDEC is to develop and commercialize visual analytics products. It links major aerospace companies with Small and Medium Enterprises (SMEs) to migrate academic research and development into products and services. Industry-led projects in academia and research laboratories provide the basis for ongoing development.

VARDEC complies with the Government of Canada’s Industrial and Technological Benefits (ITB) agreements with international partners, such as The Boeing Company, by supporting investments in modern technology which can benefit the Canadian economy.

6.5.6 Brazilian Visual Analytics Initiative

The Brazilian Visual Analytics Initiative (BRAVA) [33] aims to leverage collaborative research in the field of VA and to promote networking between Brazilian and Canadian researchers.

6.5.7 European Union

VisMaster [34] was a European Coordination Action Project focused on the research discipline of visual analytics. Its objective was to address the challenge of increasing amounts of data and to enable them to be utilized effectively for technological progress and business success. It culminated in the publication Mastering the Information Age: Solving Problems with Visual Analytics (Keim et al. [35]).

6.5.8 UK

The University of Oxford e-Research Centre uses Visual Analytics for big data [36]. It regards visual analytics as based on the following assertions:

  • Statistical methods alone cannot convey an adequate amount of information for humans to make informed decisions; hence the need for visualization.

  • Algorithms alone cannot encode an adequate amount of human knowledge about relevant concepts, facts, and contexts; hence the need for interaction.

  • Visualization alone cannot effectively manage levels of detail about the data or prioritize different information in the data; hence the need for analysis and interaction.

  • Direct interaction with data alone is not scalable to the amount of data available; hence the need for analysis and visualization [36].

6.6 Sample Applications

Visual analytics is relevant to a variety of application domains, particularly those involving large datasets, real-time data, heterogeneous data, and data that are complex, ambiguous, or conflicting. These domains include the physical, biological, and medical sciences; security; climate and geological monitoring; and commerce. A brief outline of a number of applications is given here [37], while recognizing that static pictures cannot do justice to applications that are often interactive and in real time. These descriptions and illustrations are therefore only indicative.

Sun et al. [38] identified five categories of application as follows:

  • Space and time,

  • Multivariate,

  • Text,

  • Graph and network, and

  • Others.

These application categories were then related to the most appropriate steps in the visual analytics process: user interaction, analysis, and visual mapping.

6.6.1 Weather and Climate Monitoring

Monitoring weather and climate involves collecting large amounts of real-time data from remote sensors positioned at various points around the globe and in satellites. These data can be input into various climate models. A visual approach assists analysts in interpreting the data and gaining insight into the factors governing the climate and climate change.
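
As a toy illustration of this kind of pipeline, the sketch below bins simulated station readings into a coarse gridded summary of the sort a climate model or monitoring dashboard might consume. The station coordinates, readings, and cell size are synthetic assumptions made for the example.

```python
# Toy aggregation of simulated weather-station readings onto a grid,
# the kind of preprocessing that feeds climate models and dashboards.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
readings = pd.DataFrame({
    "lat": rng.uniform(-90, 90, 5_000),
    "lon": rng.uniform(-180, 180, 5_000),
    "temp_c": rng.normal(15, 10, 5_000),   # synthetic temperatures
})

# Bin stations into 10-degree grid cells and summarize each cell.
readings["lat_bin"] = (readings["lat"] // 10) * 10
readings["lon_bin"] = (readings["lon"] // 10) * 10
grid = readings.groupby(["lat_bin", "lon_bin"])["temp_c"].agg(["mean", "count"])

# Flag sparsely observed cells, where model input is least constrained.
print(grid[grid["count"] < 3])
```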

6.6.2 Visual Analysis of Social Media Data

Schreck and Keim [39] detail how visual analysis may be used in the area of social media.

6.7 Current Research and Development

Areas of current research and development may be exemplified to some extent by the publications at the annual IEEE visualization conference. In the 2018 conference [40], the papers were grouped into the following themes for the three components of the conference: Visual Analytics Science and Technology (VAST), Scientific Visualization (SciVis), and Information Visualization (InfoVis). These themes are detailed in Table 6.2.

Table 6.2 Principal themes in the research papers at IEEE Visualization 2018 (with overlapping areas indicated in italic text)

Table 6.2 indicates the variety of research and development that is being done nationally and internationally. A number of the above themes could appear in more than one column, but researchers who produced papers for review would normally come from one of the three constituent areas and would submit them to their preferred conference. However, when the themes are examined as a whole, it is clear that this is somewhat arbitrary. For example, many of the themes in SciVis and InfoVis would be relevant to the research in VAST. The areas of interaction and applications (the italic text in Table 6.2) are common to all three areas. Although perception and cognition are key aspects in the field of information visualization (i.e., how is a particular visual representation to be interpreted by a user?), they are just as relevant for visual analytics and scientific visualization, where visual images have to be interpreted at some stage in the analysis. The principal differences in the three areas would appear to be as follows. In visual analytics, a primary objective is to discover relationships and anomalies in datasets particularly when they are very large or in real time. In scientific visualization, the objective is to discover aspects in data representing physical processes or physical objects which result in greater understanding of the laws governing such processes. For information visualization, the objective is to understand how information is assimilated by humans so that it can be portrayed accurately and appropriately in various kinds of visual images from different application domains.

The top ten interaction challenges in extreme-scale visual analytics are outlined by Wong et al. [41, 42].

6.8 Has Visual Analytics Subsumed Information Visualization and Scientific Visualization?

As noted earlier, interaction and applications are functions common across all the areas of visualization. Thus, there are overlapping functions and goals among the three fields, and there is currently no general agreement on the boundaries between them. However, each area may be characterized as follows [9]:

  • Scientific visualization deals with data that has a natural geometric structure (e.g., MRI data, wind flows).

  • Information visualization handles abstract data structures such as trees or graphs.

  • Visual analytics is especially concerned with coupling interactive visual representations with underlying analytical processes (e.g., statistical procedures, data mining techniques) such that high-level, complex activities can be effectively performed (e.g., sense making, reasoning, decision making).

Visual analytics seeks to marry techniques from information visualization with techniques from computational transformation and analysis of data. Information visualization forms part of the direct interface between user and machine, amplifying human cognitive capabilities. These capabilities of information visualization, combined with computational data analysis, can be applied to analytic reasoning to support the sense-making process [9].
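
A minimal sketch of this marriage of computational analysis and visual representation follows, assuming scikit-learn and matplotlib are available: an automated data mining step (here k-means clustering, chosen only as a simple stand-in for the analytical process) produces structure that the visual encoding then exposes to the analyst. The data and parameters are synthetic.

```python
# Coupling an analytical process (clustering) with a visual
# representation: the core pattern of visual analytics.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Three synthetic blobs of 2-D points centered at 0, 3, and 6.
points = np.vstack([rng.normal(c, 0.4, (200, 2)) for c in (0, 3, 6)])

# Analytical step: automated clustering assigns a label to each point.
labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(points)

# Visual step: the computed structure drives the encoding (color).
plt.scatter(points[:, 0], points[:, 1], c=labels, s=8)
plt.title("Clusters computed by k-means, presented visually")
plt.show()
```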

Therefore, there is a degree of interdependence between visual analytics and the other two areas: scientific visualization for data where modeling is required, and information visualization for effective image representations and interaction paradigms that facilitate user sensemaking. From this perspective, visual analytics subsumes key aspects of the other two areas. Software companies appear to be using the single term “data visualization” to cover many areas without necessarily specifying any primary functions. Thus, users assessing and selecting software need to take into account the functions required, and should also perform user-centered evaluations as detailed earlier.

6.9 The Future

Heer [43] noted the following deficiencies with regard to current data visualizations and what needs to be done to improve them:

  • Many images are not designed according to perceptual principles.

  • Analysis needs to be augmented in the most productive ways to accomplish better decision-making.

  • Rankings of the visual perception of visual encodings (for comparing quantities, from least accurate to most accurate) indicate that position is the most effective representation; it can therefore be useful, for example, to show bar charts alongside an area map (see the sketch after this list).

  • User interfaces for data visualization need to be rethought: rather than offering a variety of charts for the user to select from manually, it is better to provide a chart automatically based on the data; the user can then drill down for more detail.

  • New end-user exploration tools are needed.

  • There needs to be a move from specification to exploration.

  • Show data variation, not design variation: many current images just show alternative designs for the visualization of the same data.

  • Users need tools to exercise skepticism and consider new questions.
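
The encoding-ranking point above can be demonstrated in a few lines: the same values are drawn once with a positional encoding (bar heights) and once with a color encoding. Small differences are easy to read from position and hard to read from color. This is an illustrative sketch with synthetic values, not a reproduction of the cited rankings.

```python
# The same values with a positional encoding vs. a color encoding.
import matplotlib.pyplot as plt
import numpy as np

values = np.array([0.96, 1.00, 0.93, 0.98, 0.91])
labels = ["A", "B", "C", "D", "E"]

fig, (ax_pos, ax_col) = plt.subplots(2, 1, figsize=(6, 4))

ax_pos.bar(labels, values)                     # position/length encoding
ax_pos.set_title("Positional encoding (accurate comparison)")

ax_col.imshow(values[np.newaxis, :], aspect="auto", cmap="viridis")
ax_col.set_xticks(range(len(labels)))          # color encoding only
ax_col.set_xticklabels(labels)
ax_col.set_yticks([])
ax_col.set_title("Color encoding (harder to compare)")

plt.tight_layout()
plt.show()
```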

Therefore, the way forward needs to be characterized by accomplishing transitions as follows:

  • From designers to decision-makers.

  • From specification to exploration.

  • From design variation to data variation.

6.10 Conclusions

This chapter has reviewed the development and advancement of visual analytics and its relationship to scientific visualization and information visualization. Its importance as a tool for the exploration and sensemaking of large datasets has been outlined. When comparing the wide variety of visual analytics products currently available, it is important to consider the functions available, their intended application domains, and how the software is going to be used; it is therefore advisable to perform a user-centered evaluation as detailed by Scholtz [22]. Current research and development in visual analytics has been reviewed and possible future directions outlined.

Further Reading

  • Kang, Y. and Stasko, J. Examining the Use of a Visual Analytics System for Sensemaking Tasks: Case Studies with Domain Experts, IEEE Transactions on Visualization and Computer Graphics, Vol 18, No 12, pp 2869–2878 IEEE, Los Alamitos, CA (2012). Online at—http://web.cse.ohio-state.edu/~machiraju.1/teaching/CSE5544/Visweek2012/vast/papers/kang.pdf

  • Ward, M., Grinstein, G., and Keim, D. Interactive Data Visualization: Foundations, Techniques, and Applications, 2nd edition, A.K. Peters/CRC Press, Boca Raton, FL (2015).

  • In addition, a number of the vendors who support visual analytics applications offer tutorials on visual analytics, set in the context of their own software.