
1 Introduction

We live in a world that is increasingly characterised by a fusion of physical and digital/virtual events. Today, contactless technology involving mid-air interactions (e.g. virtual reality, holograms, and volumetric displays) is being designed for application scenarios commonly found in daily life, such as in shops, hospitals, museums, and cars. Importantly, emerging digital technology is enabling interaction with digital worlds where the human senses are as important and prominent as they are in people’s daily life (Velasco and Obrist 2020). For example, multisensory technology is more connected to our body, emotions, actions, and biological responses in realistic scenarios that are no longer limited to audio-visual experiences but also include touch, smell, and taste (Cornelio et al. 2021).

However, mid-air technology is not well studied in the context of multisensory experience. Despite the increasing development of mid-air interactions (Vogiatzidakis and Koutsabasis 2018; Koutsabasis and Vogiatzidakis 2019) and particularly of mid-air haptics (Rakkolainen et al. 2020), we still lack a good understanding of the influence of this technology on human behaviour and experiences, in comparison with the understanding we currently have about physical touch (Cornelio et al. 2021). For instance, the crossmodal processing of mid-air touch with other senses is not well understood yet. Additionally, the impact of mid-air interaction on human behaviour, such as emotions, agency, and responsibility, remains unclear. Considering the growing development of mid-air technology and the importance of multisensory cues in both our daily life and our interaction with technology (Velasco and Obrist 2020), we need to gain a rich and integrated understanding of multisensory experiences for mid-air technology in order to design interfaces that support more realistic and emotionally engaging digital experiences.

In this chapter, I discuss opportunities of mid-air interactions in the context of multisensory experience (see Fig. 1). I particularly emphasise three areas of development: (1) mid-air tactile dimensions—in which I highlight the ability of our brain to associate information perceived from different senses, and I discuss the opportunities to exploit this ability to engage mid-air touch with other sensory modalities; (2) multisensory integration—in which I highlight the lack of studies involving mid-air touch in the broad literature of multisensory integration and discuss opportunities to advance our understanding of mid-air touch to the level at which we currently understand physical touch; and (3) agency and responsibility—in which I highlight how we live in a world that is increasingly automated and integrated with our body, and I discuss possibilities of using multisensory mid-air interactions to promote a feeling of control and responsibility in a world in which intelligent algorithms (e.g. autonomous systems and autocomplete predictors) assist us and influence our behaviour. Finally, I conclude with a reflection on how multisensory experiences can be included as part of future ethics guidelines for mid-air interactions. Readers are encouraged to consider how ultrasound haptics can become part of meaningful multisensory mid-air experiences that positively influence behaviour.

Fig. 1

Mid-air interaction that involves the human senses, behaviour, and experiences

2 Mid-air Touch in an Emerging Contactless World

The COVID-19 pandemic has demonstrated that touchless systems have the potential to significantly impact our interactions with technology in two relevant ways. First, unlike contact-based devices such as touchscreens, contactless activation (e.g. doors, taps, toilet flushes, payments) can provide a more hygienic solution for reducing the spread of pathogens. Second, physical distancing and national lockdowns have accelerated the shift towards digital experiences that enable us to interact with others remotely. The digitalisation and transformation of business and education practice have taken place over a matter of weeks, resulting in an increased human–computer symbiosis. However, these online experiences and activities often lack realism compared with their physical counterparts, as they only use limited sensory cues (mainly vision and audio). Mid-air technologies can significantly enhance such digital experiences, which are otherwise limited to being seen or heard (e.g. virtual tours through a museum), through the addition of a haptic component (e.g. haptic interactions or contactless haptic feedback). Whilst it has been argued that mid-air technologies can become ubiquitous in the future, as depicted in sci-fi movies (e.g. Minority Report), the current situation is accelerating the need for more meaningful digital experiences and making evident the advantages of touchless interactions in physical and virtual spaces.

These recent events provide a unique opportunity for research and development innovation around mid-air technology. There is a need to apply the principles of human–computer interaction (HCI) and multisensory experience to enhance interaction with contactless technologies (e.g. gesture recognition, body tracking, ultrasound haptic output) in order to, first, study how multiple senses can be engaged in mid-air interactions and thus design novel and more meaningful touchless paradigms and, second, apply the knowledge gained to inform emerging applications that support not only technical innovation but also societal responsibility in the context of an accelerated digital human-technology integration.

Current advances in mid-air technologies, however, have mostly focussed on software and hardware development, advancing engineering methods related to accuracy (Matsubayashi et al. 2019), recognition (Sridhar et al. 2015), and rendering (Long et al. 2014). Whilst recent research has explored mid-air technologies in the context of human perception (e.g. emotions (Obrist et al. 2015)), little is known about how these technologies influence human behaviour and how human perception can be exploited to improve the interaction with them. For example, ultrasound has enabled rich tactile sensations (e.g. 3D shapes in mid-air); however, several questions remain: how does the user perceive those shapes? Is it sufficient to display the 3D shape of a button for the user to perceive a button shape? How can that perception be exploited to engage other senses? And finally, how can this technology help society?

3 Opportunities for Multisensory Mid-air Haptics

Humans are equipped with multiple sensory channels to experience the world (Stein and Meredith 1993). Whilst Aristotle taught us that the world is experienced through five basic senses (sight, hearing, touch, smell, and taste), research in philosophy suggests that we have many more (anywhere between 22 and 33 senses) (Smith 2016). Examples include proprioception (the perception of spatial orientation), kinaesthesia (the sense of movement), and the sense of agency (the sense of control), amongst many others. Whilst there are increasing efforts to design digital experiences that go beyond audio-visual interactions, involving, for instance, smell (Maggioni et al. 2018) and taste (Narumi et al. 2011), mid-air technologies still lack a multisensory perspective. In the future of mid-air interaction, this perspective can change how mid-air technologies are studied so that they account for multisensory information.

In the next sections of this chapter, I describe three areas of development that consider human multisensory perception to advance the study of mid-air haptics. To do so, I focus on three main challenges of developing multisensory mid-air interactions. First, I describe how crossmodal correspondences could improve the experience of mid-air touch. Second, I outline opportunities to introduce mid-air touch to the study of multisensory integration. Finally, I discuss how this multisensory approach could benefit application scenarios that require a sense of agency in the interaction with autonomous systems. This chapter highlights the potential benefits of integrating ultrasound haptics into multisensory experience, from both a research and application perspective.

3.1 Challenge 1: Mid-air Tactile Dimensions

Mid-air haptic feedback produced by focussed ultrasound can be effectively rendered in the form of 3D shapes (Long et al. 2014) and textures (Frier et al. 2016) that can be felt by users without direct physical contact. For example, such technologies allow users to “touch and feel a hologram of a heart that is beating at the same rhythm as your own” (Romanus et al. 2019). However, despite the great levels of possible control over haptic patterns (Martinez Plasencia et al. 2020) and the ability to render complex shapes (Long et al. 2014), “mid-air haptic shapes do not appear to be easily identified” (Rutten et al. 2019). This difficulty in precisely identifying shapes and patterns produced by mid-air haptics may stem from the lack of other sensory cues (e.g. visual feedback) that help to confirm the perceived shape of an object or the texture of a tactile pattern. Adding extra sensory cues perceived through different channels could help the identification of haptic information (Ozkul et al. 2020; Freeman 2021), for instance, combining a sphere rendered on the hand with a visual display showing the same shape, or combining a rough texture pattern with a rough sound. However, using crossmodal associations, it may be possible to produce a multisensory experience from the haptic feedback alone.

The human brain has the ability to associate crossmodal information from different senses. This is well supported by the broad body of literature on crossmodal correspondences (CCs). CCs are defined as “a tendency for a sensory feature, or attribute, in one modality, to be matched (or associated) with a sensory feature in another sensory modality” (Spence and Parise 2012). These associations have been widely employed in design, marketing, and multisensory branding (Spence and Gallace 2011). For instance, it has been shown that the shape of a mug can influence coffee taste expectations (Van Doorn et al. 2017), that colours and sounds influence our perception of temperature (Velasco et al. 2013; Ho et al. 2014), and that our sensation of touch can be influenced by odours (Dematte et al. 2006). These CCs have not yet been explored for mid-air touch. For instance, it is unclear how people associate tactile patterns produced by focussed ultrasound on the skin with different sensory features such as sounds, smells, temperature, moisture, and emotions. This can be explored by building on prior studies in the literature. A rich variety of robust CCs have been demonstrated between various pairs of sensory modalities (Parise 2016). Particularly relevant to this chapter, a number of studies have found CCs involving tactile features such as heaviness, sharpness, thickness, angularity, and temperature that are associated with other sensory features.

For example, it has been shown that the perception of heaviness is associated with dark and low-pitched cues, whilst the perception of sharpness is associated with high-pitched sounds (Walker et al. 2017). Similarly, textures can be associated with adjectives referring not only to tactile features but also to the visual and auditory domains. For example, smooth textures are associated with adjectives such as “bright”, “quiet”, and “lightweight”, whilst rough textures are associated with adjectives such as “dim”, “loud”, and “heavy” (Etzi et al. 2016). Additionally, there is evidence suggesting that certain shapes can be associated with different temperature attributes (Van Doorn et al. 2017; Carvalho and Spence 2018).

These CCs are also common when referring to the chemical senses (smell and taste). Studies have shown that angular shapes are associated with sour tastes whilst rounded shapes are associated with sweet tastes (Salgado Montejo et al. 2015). Similarly, angular shapes have been found to be associated with lemon and pepper odours whilst rounded shapes are associated with raspberry and vanilla odours (Hanson-Vaux et al. 2013).

Emotions also play an important role in CCs when referring to tactile features. Research has shown that soft textures are associated with the positive emotion of happiness and rough textures with negative emotions such as fear, anger, and disgust (Iosifyan and Korolkova 2019). In another example, in the study by Etzi et al. (2016), smooth textures were associated with the labels “feminine” and “beautiful”, whereas rough textures were associated with the adjectives “masculine” and “ugly”.

In summary, by considering these previous findings about how haptic features are associated with other sensory features, developers could design for a particular intended user experience (e.g. a haptic pattern that is heavy, cold, and bright). There are specific situations in which a specific experience may be required. For example, a pleasant and warm haptic sensation could be suitable for a remote video call (e.g. a virtual handshake), whereas an unpleasant and cold experience might be required for a virtual horror game (e.g. a spider walking across your hand). Future work in this area consists of a series of studies to explore CCs between specific patterns of mid-air haptic feedback on users’ hands and different features, not only related to touch attributes (e.g. shapes) but also to multisensory features such as temperature, texture, and emotions (see Fig. 2).

Fig. 2

a Mid-air haptic feedback, b different haptic patterns on the user’s hand, c the associations of the haptic patterns with sensory features can produce a multisensory experience
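As a purely illustrative sketch of how such findings might eventually be operationalised, the short Python snippet below encodes a hypothetical lookup from intended experiential attributes to candidate mid-air haptic rendering parameters. It is not tied to any real haptics SDK; the parameter names and every preset value are assumptions chosen for illustration, not empirically validated correspondences.

```python
from dataclasses import dataclass


@dataclass
class HapticPattern:
    """Hypothetical mid-air haptic rendering parameters (illustrative only)."""
    path: str             # trajectory traced on the palm, e.g. "circle" or "zigzag"
    draw_freq_hz: float   # how fast the focal point traces the path
    modulation_hz: float  # amplitude-modulation frequency, felt as texture
    intensity: float      # normalised output intensity, 0..1


# Assumed CC-inspired presets mapping an intended experience to parameters.
# The values are placeholders; real presets would be derived from user studies.
CC_PRESETS = {
    "smooth/pleasant/warm": HapticPattern("circle", draw_freq_hz=5.0,
                                          modulation_hz=200.0, intensity=0.6),
    "rough/unpleasant/cold": HapticPattern("zigzag", draw_freq_hz=20.0,
                                           modulation_hz=40.0, intensity=1.0),
}


def pattern_for_experience(label: str) -> HapticPattern:
    """Return a candidate pattern for an intended experience, if one is defined."""
    return CC_PRESETS[label]


if __name__ == "__main__":
    print(pattern_for_experience("smooth/pleasant/warm"))
```

In practice, such presets would need to be grounded in, and validated against, the kind of CC studies outlined above before being applied to scenarios such as the virtual handshake or horror-game examples.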

The knowledge and findings from further research in this area could give insights into how designers can create more realistic and vivid experiences of touching “real” objects. This in turn could reduce computational demands (e.g. the high accuracy needed for rendering) by exploiting the power of human perception. That is, a better understanding of the capabilities and limits of human perception can lead to more effective interactions (Debarba et al. 2018).

Similarly, the generation of a large dataset of haptic patterns and their associations with sensory features could contribute not only to a better design of haptic experiences in HCI but also to the body of research on CCs in psychology and cognitive neuroscience. Such advances could lead to haptic designs that are significantly more visceral than those outlined in Chaps. “User Experience and Mid-Air Haptics: Applications, Methods, and Challenges” and “Ultrasound Haptic Feedback for Touchless User Interfaces: Design Patterns”.

3.2 Challenge 2: Multisensory Integration

Since experiences in the world can be ambiguous, our perceptual system collects cues from our different senses to resolve potential ambiguities. To do so, humans follow two general strategies: cue combination and cue integration (Ernst and Bülthoff 2004). Whilst cue combination accumulates information from different sensory sources to disambiguate uncertainty, cue integration merges information from different sensory sources to find the most reliable estimate by reducing variance as much as possible. In other words, since perception is inherently multisensory (Stein and Meredith 1993), when a single modality is insufficient for the brain to produce a robust estimate, information from multiple modalities can be combined or integrated for better estimation. Examples of these two phenomena are described below.

Cue combination: Imagine you are sitting in a stationary train carriage looking out of the window at another nearby train. The other train starts moving, and your brain then faces an ambiguous situation: is it you or the other train that is moving? In this uncertain situation, your brain will arrive at an answer (right or wrong) by combining multiple sensory cues. Vision alone may not be enough to solve this ambiguity, but if your brain combines information from your vision (the parallax motion seen outside), vestibular system (your perceived position in space), and proprioceptive system (feeling the motion of the train), this ambiguity can be easily resolved (Ernst and Bülthoff 2004).

Cue integration: In situations where the perceptual event involves more than one sensory estimate, cue integration is employed by our brain. For instance, in a size estimation task, you may use both vision and touch to judge the size of an object. But if you are simultaneously seeing and touching the object, is the perceived size determined by your vision, by your touch, or by something in-between? In this case, information from the different sensory modalities has to be integrated to determine a robust estimate (see Fig. 3). That is, the brain reduces the variance of the size estimate by weighting the influence of each modality (vision and touch) based on their reliability (Ernst and Banks 2002).

Fig. 3

Humans achieve robust perception by integrating information from different sensory sources to find the most reliable estimate by reducing its variance as much as possible
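For reference, the maximum-likelihood account behind Fig. 3 (Ernst and Banks 2002) can be written compactly as follows, with the visual and haptic size estimates weighted by their reliabilities (the inverse of their variances):

```latex
\hat{S}_{VH} = w_V \hat{S}_V + w_H \hat{S}_H,
\qquad
w_V = \frac{1/\sigma_V^2}{1/\sigma_V^2 + 1/\sigma_H^2},
\qquad
w_H = \frac{1/\sigma_H^2}{1/\sigma_V^2 + 1/\sigma_H^2},
\qquad
\sigma_{VH}^2 = \frac{\sigma_V^2\,\sigma_H^2}{\sigma_V^2 + \sigma_H^2} \le \min\left(\sigma_V^2, \sigma_H^2\right).
```

The combined variance is never larger than that of the more reliable single cue, which is what makes the integrated estimate robust in the sense described above.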

A wide range of research has been conducted to investigate how the human senses are integrated. A particular topic of interest is haptics, for example, visuo-haptic integration (Ernst and Banks 2002) and audio-haptic integration (Petrini et al. 2012), as well as the integration of haptics and smell (Castiello et al. 2006; Dematte et al. 2006).

However, despite the rapid development of mid-air technologies, efforts to study haptic integration have to date focussed exclusively on physical touch, and it is therefore unknown how mid-air touch is integrated with the other senses. For instance, we do not know whether the integration of vision, audio, or smell with mid-air touch is similar to what has been found with actual physical touch, as there are many factors that make physical and mid-air touch different (e.g. physical limits, force, ergonomics, instrumentation). Here, I see an opportunity to expand our knowledge of mid-air interaction by applying the principles of multisensory integration from psychology. Bridging this gap could open up a wide range of new studies exploring the integration of multiple senses with mid-air touch, using the technology recently developed in HCI and taking advantage of current knowledge in this area. For example, a number of studies have already provided insights that improve our understanding of mid-air haptic stimuli perception in terms of perceived strength (Frier et al. 2019), geometry identification (Hajas et al. 2020b), and tactile congruency (Pittera et al. 2019), providing compelling evidence of the capability of mid-air haptics to convey information (Hajas et al. 2020a; Paneva et al. 2020).

Future research into mid-air haptics should explore how tactile sensations created by focussed ultrasound are integrated with other senses in order to create more meaningful experiences and to reduce ambiguity in virtual and digital worlds. For instance, in a virtual world, you might feel a rounded shape rendered on your bare hand in mid-air via an ultrasound haptics device, but by using your sense of touch alone it would be very difficult to identify whether that shape is an onion, a peach, or a hedgehog (see Fig. 4). However, by integrating cues from other senses, this uncertainty is reduced, leading to a more vivid, real, and emotional experience. For example, the 3D shape and texture of a peach’s velvety skin can be felt using mid-air ultrasound, a VR headset can show its bright orange colour giving the perception of a juicy fruit, whilst a sweet smell is delivered to your nose (see Fig. 4b). This is an example of the new possibilities that can be achieved through multisensory technology but have, to date, been underexplored. With the accelerated digitisation of human interaction due to social distancing, multisensory digital experiences become increasingly relevant and compelling.

Fig. 4

Example of a virtual object identification task in VR integrating different senses: a rounded shape, smooth texture, and stinky smell; b rounded shape, velvety texture, and sweet smell; c irregular shape, spiky texture, and animal-like smell

This push towards mid-air touch integration can be pursued through a combination of VR simulations and computational methods. The development of fundamental principles involving mid-air touch integration could contribute not only to the design of experiences but also to the literature on sensory integration models and their comparison with previously proposed models involving physical interaction.

3.3 Challenge 3: Agency and Responsibility

One crucial experience that humans have whilst interacting with the world is the sense of agency (SoA), often referred to as the feeling of being in control (Kühn et al. 2013). Whilst recent multisensory technology and intelligent systems help users by making some computing tasks easier and more immersive, they might affect the SoA experienced by users, thus having implications for the feeling of control and responsibility felt whilst interacting with technology. Particularly for touchless systems, designing mid-air interactions that preserve the user’s SoA becomes a challenging task, as these interactions lack the typical characteristics of touching an object and might therefore be considered less robust than physical interactions. However, the SoA can be improved when the user receives appropriate multisensory cues. In the next sections of this chapter, I provide an overview of what the SoA is, why it is important in our interactions with technology, and how it can be increased with a multisensory approach.

3.3.1 The Sense of Agency

The SoA refers to the experience of being the initiator of one’s own voluntary actions and, through them, influencing the external world (Beck et al. 2017). Georgieff and Jeannerod (1998) defined this phenomenon as a “who” system that permits the identification of the agent of an action and thus differentiates the self from external agents. The SoA reflects the experience that links our free decisions (volition) to their external outcomes, a result of action-effect causality in which the match between the intended and actual result of an action produces a feeling of controlling the environment (Synofzik et al. 2013), such as when we press the light switch and perceive the light coming on (“I did that”). To experience a SoA, there must be an intention to produce an outcome, and then three conditions need to occur: (1) a voluntary body movement, (2) the execution of an action that aims at the outcome, and (3) the external outcome itself (see Fig. 5a). These conditions are present throughout our everyday life, as we constantly perform goal-directed motor actions and observe the consequences of those actions (Hommel 2017). This action-effect causality is particularly important in our agentive interactions with technology, as studied in HCI.

Fig. 5

Elements that compose the SoA in daily life tasks and in our interaction with technology

3.3.2 The Sense of Agency in HCI

When we interact with systems, actions are represented by user input commands, and outcomes are represented by system feedback. Input modalities serve to translate the user’s intentions into state changes within the system, whilst system feedback informs the user about the system’s current state (see Fig. 5b). In this interplay, the SoA is crucial to support a feeling of being in control. For instance, when we manipulate a user interface (e.g. on a computer or smartphone), we expect the system to respond to our input commands, as we want to feel that we are in charge of the interaction. If this stimulus–response interplay elicits a SoA, then the user will experience an instinctive feeling that “I am controlling this”.

Due to the ubiquity of our interaction with systems for work or leisure purposes, we usually do not think about our SoA during the interaction, and it may go unnoticed (Moore 2016). However, a clear example that highlights the importance of our SoA in HCI is when this experience is disrupted. When there is a mismatch between user expectations and the actual sensory feedback from the system, the user experiences a sudden interruption in the feeling of control. This can negatively affect acceptability (Berberian 2019) and usability (Winkler et al. 2020), e.g. poor game controllers may cause frustration (Miller and Mandryk 2016).

In summary, if a system does not support a SoA, the user might feel discouraged from using it (Limerick et al. 2015) and lose self-attribution of their actions’ outcomes. For this reason, the SoA is gaining increasing attention from the field of HCI. Developing interaction techniques that increase user’s SoA will provide the feeling of “I did that” as opposed to “the system did that”, thus supporting a feeling of being in control.

3.3.3 Supporting Agency through Multisensory Mid-air Haptics

The SoA has been suggested to “underpin the concept of responsibility in human societies” (Haggard 2017). Whilst mid-air interactions are becoming increasingly popular, a major challenge is how responsibility is shared between humans and touchless systems. That is, whilst causality and accidents have usually been attributed to human error, crucial actions are today delegated to computers in contexts involving people. A notable example of this delegation of agency is autonomous systems. Today, such systems are found in vehicles, machines, and aircraft and could potentially reduce the SoA, since increasing levels of automation can decrease the user’s feeling of being in control. This raises the question: “who is in control now?” (Berberian et al. 2012).

Given that mid-air technologies have been recently integrated into automotive applications (Hessam et al. 2017), home appliances (Van den Bogaert et al. 2019), and aviation interactions (Girdler and Georgiou 2020) (as depicted in Fig. 6), there are increasing opportunities to develop multisensory interactions for autonomous systems that preserve a user’s SoA.

Fig. 6

Example mid-air haptics in vehicles a, home appliances b, and aviation c

Particularly in driving scenarios, previous studies have employed haptics to make the driver aware of a semi-autonomous vehicle’s intentions by means of force feedback (Ros 2016), so that agency is shared between the user and the vehicle (i.e. the system is not fully in control but delegates an appropriate level of control to the operator). However, this has not yet been explored with mid-air interaction, despite the proven benefits of combining gestures with mid-air haptic feedback for in-vehicle tasks (Harrington et al. 2018), e.g. as discussed in Chap. “Ultrasound Mid-Air Tactile Feedback for Immersive Virtual Reality Interaction”. For example, by minimising the visual demand associated with touchscreens in vehicles, mid-air ultrasonic feedback reduces off-road glance time and the number of “overshoots” (Harrington et al. 2018; Large et al. 2019). This is because “employing haptics naturally reduces the need for vision as users are not required to visually confirm selection or activation”, promoting safer driving practices (Harrington et al. 2018).

More opportunities to improve the user’s SoA through mid-air interactions have been demonstrated in the literature. For example, the study by Cornelio et al. (2017) showed that both physical and touchless input commands produce a user’s SoA and also showed that mid-air haptic feedback produces a SoA comparable to that felt with typical vibrotactile feedback. Similarly, Evangelou et al. (2021) showed that mid-air haptic feedback can promote implicit SoA as well as protect against latency-induced reductions in the explicit judgements of agency.

Additionally, analogous studies have shown that the SoA can be modulated by sensory and emotional information not only related to the somatosensory channel (Beck et al. 2017; Borhani et al. 2017) but also in response to other sensory cues, for instance, showing increased SoA with positive pictures (Aarts et al. 2012) or pleasant smells (Cornelio et al. 2020), and decreased SoA with negative sounds (Yoshie and Haggard 2013; Borhani et al. 2017). This evidence suggests that sharing control between systems and operators could be aided by multisensory information. This is because the SoA arises from a combination of internal motoric signals and sensory evidence about our own actions and their effects (Moore et al. 2009).

Those previous findings could be used as a foundation for new studies exploring how multisensory mid-air interaction can improve the user’s SoA. One action plan is to link the study of agency modulation in psychology and neuroscience (e.g. using affective sensory cues) with the study of agency modulation in HCI (e.g. using mid-air haptics to increase the SoA), in order to introduce an approach to multisensory mid-air interactions featuring ultrasound haptic feedback.

Future work on mid-air haptic technologies integrated with autonomous systems should focus on how to include multisensory mid-air interactions (involving, e.g., the integration of various senses and crossmodal correspondences) in order to make operators more aware of the system’s intentions, actions, and outcomes. In other words, the aim is to exploit the benefits of mid-air technologies and multisensory experiences to share agency between the operator and the system.

In summary, although recent technology places the user in environments that are not fully real (e.g. virtual or augmented) and in which users’ actions are often influenced (e.g. autocompletion predictors) or even automated (e.g. autonomous driving), multisensory signals can help users to feel agency even when they are not the agent of the action or when their commands were not actually executed (Banakou and Slater 2014; Kokkinara et al. 2016).

Finally, advances in the development and robustness of novel interaction paradigms involving multisensory mid-air technologies can be integrated into future policy-making and legal systems, for example, by giving insights for crafting guidelines for autonomous driving that preserve moral responsibility and safe operation in a future where fully autonomous systems are ubiquitous.

Indeed, legal systems have already started crafting guidelines for autonomous vehicles (Beiker 2012) that preserve moral responsibility (De Freitas et al. 2021), as well as drafting theoretical foundations for the next generation of autonomous systems (Harel et al. 2020). In the future, mid-air interactions can be part of these efforts to promote responsibility during interaction.

4 Discussion, Conclusions, and Future Directions

Whereas humans experience the world through multiple senses, our interaction with technology is often limited to a reduced set of sensory channels. Today, we see increasing efforts to digitalise the senses in order to design more meaningful and emotionally loaded digital experiences (Velasco and Obrist 2020). Mid-air haptics produced by focussed ultrasound provides a step forward in the digitalisation of touch, enabling interaction paradigms previously only seen in sci-fi movies (Cwiek et al. 2021). For example, it is now possible to touch holograms (Kervegant et al. 2017; Frish et al. 2019) as well as to levitate objects (Marzo et al. 2015) and interact with them (Freeman et al. 2018; Martinez Plasencia et al. 2020). However, recent advances in mid-air technology have focussed on hardware and software development, and therefore little is known about how these technologies influence human behaviour and how designers can exploit our knowledge of human perception to improve interaction with such technologies.

In this chapter, I proposed three areas of development to advance mid-air haptics from a multisensory perspective. I first discussed how crossmodal correspondences can be used to design mid-air haptic patterns that create multisensory experiences, by exploiting the ability of the human brain to associate crossmodal information from different senses. Then, I outlined the lack of research around mid-air touch in the study of multisensory integration and highlighted the need for more research to advance our understanding of mid-air touch with respect to our current understanding of physical touch. Finally, I described a crucial experience in both our daily life and our interaction with technology—the sense of agency—and discussed how a multisensory approach to mid-air technologies can promote a sense of responsibility, particularly in the context of autonomous systems.

To achieve the multisensory approach suggested in this chapter, it is important to consider the methods used to measure user experiences. Fortunately, the literature provides different methods to quantify the extent to which people perceive a multisensory experience. For instance, in the case of CCs, visual analogue scales (VAS) are used as a measurement instrument to quantify a sensory attribute along a continuum of values (Cline et al. 1992). For example, the degree of association between sensory features usually ranges across a continuum from not at all to extremely (Crichton 2001). With this method, researchers have found CCs between different sensory modalities such as smells (e.g. lemon scent) and bodily features (e.g. thin body silhouettes) (Brianza et al. 2021) or between body attributes (e.g. hardness) and haptic metaphors (e.g. vibration patterns on the skin) (Tajadura-Jimenez et al. 2020). The opportunities for designing multisensory mid-air interactions can be extended by means of these associations, which can enrich a sensory attribute or amplify perceived sensory features.

In the case of multisensory integration, a large range of studies have provided computational methods to quantify the extent to which a person integrates different senses. One example is maximum-likelihood estimation (MLE), which is often employed to integrate different sources of sensory information when the goal is to produce the most reliable estimate (Kendall and Stuart 1979). More recently, sensing technologies and artificial intelligence (AI) techniques have been employed to digitally replicate how the human body integrates different senses (Zhu et al. 2020). These techniques have been considered a promising approach towards robotic sensing and perception (Tan et al. 2021). The efforts of digitalising the sense of touch in the literature give us growing opportunities to introduce a wide range of new studies to explore the integration of mid-air touch with other senses.
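As a minimal numerical sketch of the reliability-weighted (MLE) scheme just mentioned, the following Python snippet simulates the fusion of a visual and a haptic size estimate; the “true” size, noise levels, and variable names are illustrative assumptions rather than data from any study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumption: a "true" object size of 50 mm, with a visual estimate
# that is noisier than the haptic one on this simulated trial.
true_size = 50.0
sigma_v, sigma_h = 6.0, 3.0               # single-cue standard deviations (mm)
s_v = true_size + rng.normal(0, sigma_v)  # noisy visual estimate
s_h = true_size + rng.normal(0, sigma_h)  # noisy haptic estimate

# Reliability (inverse-variance) weights, as in the MLE model of cue integration.
w_v = (1 / sigma_v**2) / (1 / sigma_v**2 + 1 / sigma_h**2)
w_h = 1 - w_v

s_combined = w_v * s_v + w_h * s_h
sigma_combined = np.sqrt((sigma_v**2 * sigma_h**2) / (sigma_v**2 + sigma_h**2))

print(f"visual={s_v:.1f} mm, haptic={s_h:.1f} mm, combined={s_combined:.1f} mm")
print(f"combined SD={sigma_combined:.2f} mm (best single cue: {min(sigma_v, sigma_h):.2f} mm)")
```

In principle, the same inverse-variance weighting could be fitted to behavioural data from mid-air touch experiments to test whether ultrasound-delivered tactile cues are integrated as optimally as physical touch.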

In terms of quantifying the SoA, implicit methods can be found in the literature. For example, the intentional binding paradigm (Haggard et al. 2002) provides an implicit measure of the SoA by exploiting a relationship between the experience of agency and the perception of time. In this paradigm, the level of agency is assessed through perceived differences in time between voluntary actions and their resultant outcomes. Using this method, previous studies in HCI have shown evidence of the level of control of mid-air interactions (e.g. a mid-air button) in comparison with a physical interaction (e.g. a keyboard button) (Cornelio et al. 2017).
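To make that computation concrete, the sketch below shows one common way of deriving binding scores from perceived-time judgements in baseline (action alone, tone alone) and operant (action followed by a tone) conditions; all numbers are placeholder values for illustration and do not come from any study.

```python
from statistics import fmean

# Illustrative judgement errors (ms): perceived minus actual event time.
baseline_action = [-5, 2, -8, 0, 4]       # action judged on its own
baseline_tone = [30, 25, 40, 35, 28]      # tone judged on its own
operant_action = [15, 22, 18, 25, 20]     # action followed by a tone (e.g. 250 ms later)
operant_tone = [-40, -35, -50, -45, -38]  # tone preceded by the voluntary action

# Action binding: in the operant condition, actions are perceived later,
# i.e. shifted towards their outcome.
action_binding = fmean(operant_action) - fmean(baseline_action)

# Outcome binding: outcomes are perceived earlier, i.e. shifted towards the action.
outcome_binding = fmean(baseline_tone) - fmean(operant_tone)

# Overall binding: compression of the perceived action-outcome interval,
# taken as an implicit index of the sense of agency.
overall_binding = action_binding + outcome_binding

print(f"action binding  = {action_binding:.1f} ms")
print(f"outcome binding = {outcome_binding:.1f} ms")
print(f"overall binding = {overall_binding:.1f} ms")
```

Larger binding values are conventionally read as a stronger implicit SoA, which is how studies such as Cornelio et al. (2017) compared mid-air and physical input conditions.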

The methods described above can be used in the effort to introduce multisensory mid-air interactions. Whilst these methods have typically been applied to physical touch, a wide range of studies is still needed to establish for mid-air touch what we currently know about physical touch.

In the future of mid-air technology, a multisensory view can change how mid-air interactions are studied, designed, and put into practice. This view includes the integration of multisensory experiences for mid-air haptics, particularly focussing on the sense of touch but also studying its relation, association, and integration with other senses in order to convey more meaningful digital experiences to humans that in turn promote agency and responsibility. The new knowledge acquired in pursuit of this goal will generate underlying principles for the design of more solid application scenarios in the future, scenarios that take into consideration not only engineering advances but also human behaviour and perception, thus augmenting the capabilities of our digital social interaction. That is, we need to foster a new and inclusive ecosystem of multidisciplinary research around mid-air interactions involving psychology, neuroscience, and HCI, so that the impact of these technologies is not limited to hardware and software but also extends to society in the context of the accelerated digitisation of human experiences.

I particularly emphasise the importance of preserving a SoA in a world that is increasingly automated. Since the SoA arises through a combination of internal motoric signals and sensory evidence about our own actions and their effects (Moore et al. 2009), a multisensory view can make technology users significantly more aware of their actions and their consequences, thus promoting a feeling of responsibility. Emerging research is already examining how to improve the SoA in HCI, for example, by exploring motor actuation without diminishing the SoA (Kasahara et al. 2019), exploring appropriate levels of automation (Berberian et al. 2012), or exploring how the SoA can be improved through olfactory interfaces (Cornelio et al. 2020). Despite such efforts, it has been suggested that “the cognitive coupling between human and machine remains difficult to achieve” (Berberian 2019), so further research is needed.

Nonetheless, research on mid-air interactions can be included in this body of research exploring agency, responsibility, and the human sensory system, so that we design digital experiences in which users can see, hear, smell, touch, and taste just like they do in the real world. Future directions around multisensory integration can break from the conventional studies in mid-air technologies and thus help to achieve this goal.