
1 Introduction

Information and Communication Technologies (ICT) have developed in unexpected ways in recent decades. Changes in the physical environments, hardware, communication and transmission modes have allowed for the development of techniques for transmitting information with very high performance, maximizing speed and reducing data loss. These technologies have enabled high speed networks and the development of applications based on multimedia objects, in particular, interactive real-time applications (Balbinot et al. 2000).

These applications have manifested in our society in recent years in the form of video streamers and recorders, videotext, telephone-based voice systems, on-line services, information kiosks, ‘intelligent’ household appliances and multimedia systems (Jensen, 1998).

New media is “invading” the Internet every day. Images, videos, texts and animations are used to share any kind of human feeling (happiness, sadness, passion, hunger, pride, etc.). Also, the Internet is no longer open only to computers. Internet access is a feature of modern cellphones, domestic equipment, cars, and many other devices. Mobility and connectivity are now requirements for many daily activities.

This is changing the way that society deals with networked content. It is transforming users’ needs, actions and reactions. Figure 1 illustrates the new reality for billions of users today: there are many things to read and share, many ways to distribute information and many people to notice and react to it.

Fig. 1 The evolution of information on the Internet (or “Reliable Unreliability!”), a particular view by Neal Yamamoto (2013)

Digital convergence/divergence has many different perspectives and definitions. One of them takes “shapeshifting” into account. According to Rachel Hinman (2011):

Just like the Wonder Twins transforming into “the form of” a convenient animal/water configuration that will save the day, convergence is what enables experiences to shapeshift between different devices and environments.

So, thinking about convergence means that designers have to allow user experiences to move fluidly across multiple content types and devices. Hinman (2011) also proposed a set of convergence levels, which describe convergence in terms of activity convergence (what users do), media convergence (what users perceive) and technology convergence (what users experience).

Rather than producing an isolated experience, technology convergence provides the fluidity that allows the user to move across multiple devices. Combining media and technology convergence in this manner gives us the tools that make user perception and experience possible. Figure 2 shows how these convergence levels could be applied in practice. Netflix is a tool for media convergence that offers users entertainment activities, such as movies and additional information. Through the convergence of different technologies, Netflix delivers media content to users, who can use mobile devices, TV sets or cellphones to connect to Netflix services. The TV is still the social hub for entertainment, but other resources are easily connected to improve interaction and preserve individual participation.

Fig. 2 An example of convergence levels in practice: using Netflix (MailOnline, 2012)

For Lund (2011) this “toolbox” can be seen as a massively networked ecosystem that gives us increasing power to connect with others, to accelerate the growth of intelligence and to shape the world with our ideas.

Social media, the Internet of Things, the “quantified self”, robots, gestural user interfaces, and homebrew or Do-It-Yourself (DIY) technology are examples of elements that form part of this convergence toolbox, according to Lund (2011).

In the digital age, social media is a good way to think about and understand human beings. According to Mayfield (2008), “sharing ideas, cooperating and collaborating to create art, thinking and commerce, debate and discourse, finding people” are the main benefits of social media. These benefits make virtual things seem like natural things, which is why social media spreads so quickly.

Social media has traditionally been a term used to describe user-generated content that can be shared with others online. It can include blogs, wikis, social networks, and a variety of other platform types and applications. Over recent years, there has been a huge growth of social media networks and a corresponding increase in the number of taxonomies to classify them (Kaplan & Haenlein, 2010).

For Kaplan and Haenlein (2010), social media can be grouped into blogs, social networks, social multimedia sites, wikis, discussion forums, social bookmarking websites and location-based services. These tools are designed to eliminate friction and make communication more accessible for anyone using any device at any time. As the devices mediating communication become smaller and less intrusive, communication comes closer to the potential ideal of telepathy (Lund, 2011).

The Internet of Things is a simple name for a network of complex and sophisticated systems, usually systems built around objects with sensors attached. McKinsey et al., cited in Atzori, Iera, and Morabito (2010), described it as:

When objects can both sense the environment and communicate, they become tools for understanding complexity and responding to it swiftly. What’s revolutionary in all this is that these physical information systems are now beginning to be deployed, and some of them even work largely without human intervention.

The Internet of Things involves both objects and connectivity. In other words, it adds new tangibly interconnected elements to the online environment. So, Internet tools can control domestic equipment using information gathered online or from sensors (perhaps using weather forecasts, user food preferences or location information).
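As an illustration of this idea, the following minimal sketch shows how a rule might combine a sensor reading with online information to control a household heater. The field names, thresholds and the rule itself are assumptions made for this example and do not come from any particular product or from the works cited above.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Inputs an Internet tool might gather (all fields are illustrative)."""
    indoor_temp_c: float    # reading from a sensor attached to the object
    forecast_low_c: float   # lowest temperature in an online weather forecast
    user_is_home: bool      # derived from location information


def heater_should_run(ctx: Context, comfort_c: float = 20.0) -> bool:
    """Hypothetical rule: heat only when someone is home and it is cold now,
    or pre-heat slightly when a cold spell is forecast."""
    if not ctx.user_is_home:
        return False
    cold_now = ctx.indoor_temp_c < comfort_c
    # 5.0 and the 1.0 degree margin are assumed values chosen for the sketch.
    cold_soon = ctx.forecast_low_c < 5.0 and ctx.indoor_temp_c < comfort_c + 1.0
    return cold_now or cold_soon
```

The point of the sketch is only that the decision draws on both the object's own sensing and information gathered online, without human intervention at the moment of the decision.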

The term “Quantified Self” emerged in the USA to describe a collaboration of users and tool makers who share an interest in self-knowledge through self-tracking. This movement believes that technological tools can be used successfully for self-improvement by persistently monitoring and recording all facets of human life.

A recent BBC Future article (Weintraub, 2013) highlights that today it is easier than in the past to track everything, from diet to mood to sleep quality. Features like GPS, accelerometers, cameras, microphones and gyroscopes can record human activity, location and other vital statistics. The data from this constant monitoring can reveal the patterns and habits of individuals and also of the community. This constantly available information can be used to identify opportunities for health improvement or to prevent natural disasters. For Amy Robinson (Weintraub, 2013): “The insights that we could learn from having all this quantified self data available are almost unfathomable”.

In a recent ambitious project, the University of Zurich’s Artificial Intelligence Lab plans to create a robot in nine months; it is expected to be “born” in March 2013. This robot, called “Roboy”, is meant to take part in many daily activities and to interact with people. As noted in Mail Online (2012): “Roboy is the robotic boy set to help humans with everyday tasks”.

This kind of effort supports the hypothesis that one of the primary goals of robot development is communication. However, for humans the most natural communication medium is still other humans, and this communication happens through the human senses. For Kanda and Ishiguro (2013), the human brain does not react emotionally to artificial objects such as computers or smartphones in the same way as it reacts to an image of a human face. For this reason, there are many projects investigating and building humanoid solutions.

Thus, in general, robots are getting “smarter” and more functional. They are learning how to deal with our environment, using more sophisticated monitoring systems and sensors. Robots are being prepared for human interaction and communication as social elements of our daily lives.

Human communication has always used gestures, movements and expressions to support oral language. Certain gestures, such as a wave or a thumbs up, are so commonly used around the world that they are understood across different cultures and times. Natural Interaction is a way to apply this concept to user interfaces in computer systems.

Looking back briefly, we can trace the evolution of these interfaces from command-based languages, through graphical user interfaces (GUIs), to direct manipulation with the advent of pointing devices (the mouse). Then came the rise of touch screens, the use of cameras to analyze user actions and the creation of devices that allow us to use technology more easily. The Kinect, for example, enables us to think about the development of more sophisticated and natural user interfaces. Natural Interaction (NI) studies ways in which humans can interact through the five human senses, whether by gestures, voice commands, body expressions or the detection and identification of human body parts (Frati, 2011; Hewett et al., 2009; Rauterberg et al., 1996; Smith & Waterman, 1981; The Engineer, 2012).

The Do-It-Yourself (DIY) culture has been continuously articulated since the mid-1920s. Nowadays, DIY is evident across many disciplines, such as health, publishing, production and projects. DIY is strengthened by Internet tools for writing, editing, publishing and distributing content. The current DIY perspective uses these new sharing mechanisms to enable communities, creative efforts and social capital. For Kuznetsov and Paulo (2010), this accessibility and decentralization has enabled large communities to form around the transfer of DIY information, attracting individuals who are curious, passionate and/or heavily involved in DIY work. Lund (2011) added that DIY culture, combined with convergence trends, makes it possible to turn thoughts into things in an easier and faster way.

All the elements briefly described above share one common feature: the challenge goes beyond technological issues and has more to do with how we think about the use of technology.

2 Multisensory Interaction Design

Recently, the area of “Interaction Design” (Preece, Rogers, & Sharp, 2005) has become an active concept in the design of interactive systems. Increasingly, system design takes into account not only technical aspects but, above all, the appropriation of these technologies in users’ daily activities. Another way to analyze interaction design is to approach it as a space for communication between people and computer systems.

Interaction design integrates knowledge from different fields of study (cognitive engineering, ethnography, communication and psychology, among others). Another important point is that it offers a new understanding of the user-centered approach, moving from productivity, efficiency and usability toward empathy, fun, beauty, loyalty and user involvement.

For De Paula (2003), the concept of interaction design is closely related to users’ needs; the design of interactive technologies also has to encompass users’ feelings and thoughts.

Technology is now increasingly available, miniaturized, transparent and ubiquitous, and the challenge for designers is to make information and functionality more easily understood by users. A practical scenario for this situation is the cell phone. Cell phones are available to a large portion of the population, across different backgrounds, ages and social classes. Cell phone models are very diverse in color and shape; over time, however, the miniaturization process is evident. Increasingly, people use mobile devices with mobile Internet access to check emails, send messages to social networks or post videos. These activities are often facilitated by technology. Finally, the cell phone is present everywhere, providing information at any time and in different formats.

In this simple example, it is observed that:

  • Increasing functionality contrasts directly with the reduction of device sizes (miniaturization). Interacting with ever smaller interface elements challenges our physical capacity (the size of our hands and finger dexterity).

  • Increased functionality causes artifacts and their uses to overlap. For example, cell phones are commonly seen at football matches, replacing traditional battery-powered radios. The use of watches has been declining due to the recent habit of checking the time on the phone.

  • Ubiquitous connectivity adds a new dimension to our daily activities. Information is everywhere, at any time and on any device. This generates cognitive overload, which contrasts with the growing demand for its use.

Computing thus becomes ubiquitous, making greater use of our shared environments, which are in turn enriched with new possibilities for communication and interaction. This reality brings a series of challenges and new opportunities to the field of Human-Computer Interaction.

In the literature, the discussion of this scenario of multiplicity and overload has many ramifications. In this study, we start from the concept of distributed cognition.

The term “distributed cognition” has been explored in the literature since 1993 (Hutchins, 1995; Norman, 1993; Salomon, 1996). In 2000, James Hollan, Edwin Hutchins, and David Kirsh argued that distributed cognition provides a theoretical basis for an effective understanding of Human-Computer Interaction. In addition, distributed cognition is a fertile arena for discussing the design and evaluation of digital artifacts (Hollan, Hutchins, & Kirsh, 2000).

According to Hollan et al. (2000), in contrast to traditional theories, distributed cognition extends the range of cognitive possibilities because, instead of considering only the individual, it covers the interactions between people, systems and devices in the environment. It is important to understand that distributed cognition is closer to a “whole” perspective than to a particular type of cognition. One of the aspects evaluated is embodied cognition. From the point of view of distributed cognition, the organization of the human mind, in both its development and its operation, is an emergent property of interactions between internal and external resources. In this perspective, the human body and virtual environments play central rather than peripheral roles in the interaction. An example of this concept in practice is the daily multiplicity of digital information, which increasingly requires different methods of acquiring and displaying information (Hollan et al., 2000).

One approach to working with different modalities of interaction is the development of multimodal user interfaces (MMUI) (Sun, Chen, Shi, & Chung, 2006). Multimodal interfaces are centered on the user and allow the user to interact with a computer using their own natural styles of communication, such as speaking, writing, touch, gestures and gaze. A multimodal interface is easier to use because the interface is more natural and intuitive. Moreover, this type of interface has the potential to make complex technology more accessible to a wider range of users. However, developing such interfaces is also complex and requires the integration of different technologies so that different senses can be used effectively and simultaneously.

For Chang and Ishii (2006), the solution to digital information overload is the use of other modes of interaction: sensory interfaces. Sensory interfaces are digital augmentations of existing physical objects, obtained by adding sensory mappings. Designers of these interfaces are less concerned with designing new physical shapes to manipulate digital information than with expanding the expressive power of familiar, well-known artifacts. The interface design focuses on the sensory real world rather than suggesting new mappings for virtual world objects. As a consequence, the designer must address the physical and aesthetic limitations of existing devices. Finally, the design process for sensory interfaces depends on three aspects: understanding the senses, understanding the semantics of physical objects, and understanding the ritual use of the object (Chang & Ishii, 2006).

A more practical view of sensory interfaces is presented by Keith V. Nesbitt and Ian Hoskens (2008). In their work, a multisensory interface is proposed as a way to improve users’ information retrieval. One particular issue is sensory interaction, in which the sensing subject (the user) combines information and insight in a single direction. Sensory interaction is related to sensory perception. Our perceptions produce our experiences, and this is consistent with our senses. When perception and realization do not work together, the interaction is compromised. Moreover, enriching interaction with perceptual stimuli can facilitate and accelerate the understanding and execution of user tasks. An example is an application that uses the main screen and an additional screen for complementary interaction. This multisensory interaction can serve to increase user confidence and reduce perceived workload, since redundant information can help with knowledge retention (Nesbitt & Hoskens, 2008).

Another constant presence in mobile devices today is the touchscreen, or Touch Interaction. Touchscreens have usually been treated as binary input devices. More recently, work on detecting and processing additional dimensions of touch, including pressure, orientation, posture, and hand movements and configuration, has added new possibilities for interaction in this mode. Chris Harrison and Scott Hudson (2012) discuss the many facets of Touch Interaction. According to the authors, this new style of interaction allows a variety of interaction techniques and could be used to improve the user experience that these devices require today.

Multisensory Interaction Design (MID) should consider user interface design as an intersection of three main aspects: plurality, adaptability and cognitive ability, as shown in Fig. 3. Applying a pluralistic design means considering the “multi” inherent to the world of digital convergence. Multimedia, multiuser, multidevice and also multiuse are examples of aspects that should be considered. Even when an application is not designed from a “multi” perspective, it can be used that way: many sites designed for personal computer screens are accessed from smartphones today. Adaptability, or flexible interfaces, calls for a fluid design. In practice, it means that a user interface is capable of transforming itself. The design should consider profiling (user profiles), accessible interfaces (accessibility issues) and flexible features. Many complex software techniques exist to develop this kind of feature, but all of them should begin in the interaction design. Last but not least come the users and their cognitive ability. Cognitive ability concerns how all of this is absorbed by users. To address it, we have to think about users’ senses, perception capabilities and also emotions. For example, a user with a perceptual limitation will need some special feature in the user interface. This kind of assessment is certainly covered by accessibility work. However, accessibility does not consider users’ senses: a user with an impairment in one sense has other senses that are more sensitive. So, thinking about cognitive ability is a way to address all of the abilities involved in the user experience.

Fig. 3 Multisensory interaction design
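The adaptability and cognitive ability aspects can be illustrated with a small sketch in which an interface picks its output modalities from a user profile. The modality names, the mapping from modality to sense and the function name are all hypothetical; this is one possible reading of the idea under those assumptions, not a prescribed MID technique.

```python
# Assumed mapping from an output modality to the sense it relies on.
MODALITY_SENSE = {
    "screen_text": "sight",
    "speech": "hearing",
    "vibration": "touch",
}


def choose_modalities(available_on_device, impaired_senses):
    """Keep the device's modalities that do not depend on an impaired sense.

    `available_on_device` is an ordered list of modality names and
    `impaired_senses` is a set of sense names taken from the user profile.
    """
    return [m for m in available_on_device
            if MODALITY_SENSE[m] not in impaired_senses]
```

For a user whose profile lists impaired sight, a phone offering all three modalities would fall back to speech and vibration, which is a simple case of an interface transforming itself around the senses the user can rely on.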

Multisensory interaction design considers sensory interfaces, devices, media content and, above all, users’ senses. For any kind of media content, one or more of the user’s senses are used to interpret it. Dealing efficiently with multiple user senses is a key goal for designers. On the other hand, users have many possibilities for interaction and perception, which give them a full multimedia experience. For example, Fig. 4 presents the recent technology promoted by Google called Google Glass (Google, 2013). This case explores a multisensory and fully connected user. This “new” user is always connected to the Internet and is able to receive and send information. Media publishing is almost transparent to users. Information about the time or about a historical building can enrich the user’s reality. GPS functionality combined with other resources can immediately create a world without borders, adapting language or time zones.

Fig. 4 Multisensory interaction design in practice: Google Glass project (Google, 2013)

3 Multimedia User Experience

According to Tim Morris (2000), a multimedia system can be considered a computer system designed to play multimedia content, whether it is audio, image, video, graphics or simply text. From the perspective of users, multimedia means a combination of two or more continuous media being played in a time interval, usually audio and video. Integrating all these media on a computer allows us to use computing power to represent information interactively. According to Jensen (1998), interactivity is:

a measure of the potential ability of a media to allow the user to exercise influence over the content or form of the mediated communication.

Today, digital convergence combines multiple capabilities (mobility, hypertext, 3D, natural interaction) to provide users with an enriched interaction experience. This takes the multimedia experience beyond passive interaction, in which users merely receive multimedia objects.

For Liu (1999), a multimedia experience, by its very nature, includes different types of media content. For a seamless experience, each element’s timing should be coordinated with the timing of the other elements. A synchronization function is also required to deliver the multimedia experience.
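The kind of timing coordination described here can be sketched as a shared presentation clock on which every media element has a start time and a duration. The class and function names below are illustrative, and the 0.08 s tolerance is an assumed value rather than a figure taken from Liu (1999).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    start_s: float      # start on the shared presentation clock, in seconds
    duration_s: float   # how long the element plays


def active_at(elements, t_s):
    """Names of the elements that should be playing at clock time t_s."""
    return [e.name for e in elements
            if e.start_s <= t_s < e.start_s + e.duration_s]


def skew_s(a, b):
    """Start-time skew between two elements that are meant to play together."""
    return abs(a.start_s - b.start_s)


def in_sync(a, b, tolerance_s=0.08):
    """True when the skew stays within the (assumed) tolerance."""
    return skew_s(a, b) <= tolerance_s
```

A synchronization function in a real system would also have to correct drift during playback; the sketch only checks whether the planned timings of two elements are close enough to be perceived as simultaneous.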

The Universal Multimedia Experience (UME) concept introduces the notion that users should be able to have an experience anytime and anywhere. Thus, the user is the focus and the network is purely a vehicle for the content (Pereira & Burnett, 2003). Hudgeons and Lindley (2010) added that an interactive multimedia experience is committed to audience interaction, and that its specification includes an experience segment having a plurality of multimedia elements and their attributes.

So, a multimedia user experience is closely related to users’ senses. A multimedia experience can use two or more senses, individually or at the same time. Typically, a multimedia experience involves different types of media. The media can be presented individually or synchronized. Users can interact with the media and with other users, and can also coordinate how the media are displayed (multiple views). This view enables us to think about a surgery transmission as a multimedia experience in telemedicine. The question raised here is how we can recover this multimedia experience by making use of multiple media, synchronization, coordination (users’ views) and interaction.

For Nalin Sharda (2003), the technology used for creating, coding, storing and transmitting multimedia content has an important role to play in any multimedia experience. The media in a multimedia document are classified according to their temporal characteristics as dynamic (or continuous) media, such as video and audio, or static (or discrete) media, such as images and text (Hudgeons & Lindley, 2010). A hypermedia document is a multimedia document in which the relationship between the components, meaning its logical structure and presentation, is set based on the hypertext paradigm, with the caveat that in hypermedia documents the nodes contain information represented in different media (Sharda, 2003).
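A minimal sketch of these two definitions follows. The static/dynamic mapping uses only the examples given in the text, and the node names and link structure of the small hypermedia document are invented for illustration.

```python
from enum import Enum


class Temporal(Enum):
    DYNAMIC = "dynamic (continuous)"
    STATIC = "static (discrete)"


# Mapping built from the examples in the text; other media types would
# have to be added by hand.
MEDIA_CLASS = {
    "video": Temporal.DYNAMIC,
    "audio": Temporal.DYNAMIC,
    "image": Temporal.STATIC,
    "text": Temporal.STATIC,
}


def classify(media_type):
    """Temporal class of a media type such as 'video' or 'text'."""
    return MEDIA_CLASS[media_type.lower()]


# A hypothetical hypermedia document: hypertext-style nodes and links,
# where each node holds content in a different medium.
hyperdoc = {
    "intro": {"media": "text", "links": ["clip"]},
    "clip": {"media": "video", "links": ["still"]},
    "still": {"media": "image", "links": []},
}


def media_reachable(doc, start):
    """Follow links from `start` and report each visited node's temporal class."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.append(node)
        stack.extend(doc[node]["links"])
    return {n: classify(doc[n]["media"]) for n in seen}
```

Traversing the links from one node reaches both static and dynamic media, which is what distinguishes a hypermedia document from plain hypertext in the definition above.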

The incorporation of multiple views allows several static and dynamic media to be combined, along with their interactions and their presentation in different ways. In fact, creating user interfaces that deal with multisensory interaction makes the activity of design more complex. The designer has to understand human sensation and perception systems. Design still responds to users’ needs, but now also considers how content is absorbed by users’ senses. In the end, the multimedia user experience becomes a pluralistic experience that can be applied to any field of knowledge, according to the user’s cognitive ability. User experiences always enrich the design process, and multisensory interfaces are no exception, as discussed in the following subsections.

3.1 Telematic Dancing

In 1966, the show “9 Evenings: Theatre and Engineering”, organized by EAT (Experiments in Art and Technology), presented a series of performance art pieces that united artists and engineers and exploited technological resources for the theater (Augusto, 2004). Technological poetics in the 1950s were marked by early experiences with synthetic art by Abraham Palatnik. In the 1960s, the emergence of electroacoustic music on the initiative of Jorge Antunes and the entry of the computer into art through Waldemar Cordeiro marked the convergence between the digital and artistic worlds in Brazil (Machado, 2005).

The show “Versus” (see Fig. 5) consisted of a distributed multimedia experience, in which artists located in three distinct Brazilian cities (João Pessoa, Salvador and Brasilia) interacted through a high-definition video session. Dancers located in Salvador and Brasilia interacted in real time via video streams. The atmosphere created made it possible to feel as if everyone were in the same physical place. In parallel, musicians located in a third place (João Pessoa) generated audio from laptops and transmitted it simultaneously to Salvador and Brasilia (RNP, 2005). Along the same lines, the show “(In) Toque” connected two dancers in Rio de Janeiro, a dancer in Salvador, a DJ in Sao Paulo and a robot in Natal. The use of robots as part of the show opened up another level of user experience. For the audience, it was possible to observe the entanglement between synthetic and human bodies. For the artists, it was possible to design movements so that non-human participants (robots) could interact with the human ones (dancers). For the designers, it was possible to evaluate how users (audience and dancers) perceive and interact with robots (Murilo, 2008).

Fig. 5 Versus show scenes

The show “E-Pormundos Afeto” can be seen as an evolution of “Versus” and “(In) Toque” because it was managed by a distributed multimedia system called Arthron (Silva et al., 2011). “E-Pormundos Afeto” raised questions about the changes in our behavior and in our understanding of near and far, present and past, together and apart. Arthron was used to allow real-time switching of high-definition video. The show integrated distributed participants in two Brazilian cities (Fortaleza and Natal) and in Spain (Barcelona). In this multimedia experience, 3D models, natural interaction and video streams worked together with dancers, as we can see in Fig. 6 (UFBA, 2015).

Fig. 6 “E-Pormundos Afeto” show scenes

3.2 eHealth

Video-based applications are increasingly popular. An interesting area is telemedicine or eHealth applications, such as clinical sessions, second medical opinions, interactive training or surgery transmission. This scenario is characterized by the handling of multiple video streams. Other objects, such as clinical images, animations and video-based exams, can also be used to enrich the multimedia experience (Coury, Messina, Filho, & Simões, 2010; Silva et al., 2011).

Nowadays, digital content is widely used in clinical exams. For example, exams like x-rays, ultrasound and laparoscopy are generated in digital format. These exams (images or videos) are also used by surgeons during a medical procedure. Multimedia objects are part of medical scenarios and are used to enhance, or make possible, some clinical procedures. So, a live surgery transmission also requires support for integrating and reproducing other multimedia objects, such as medical images or 3D models. A telemedicine system for live surgeries should consider this plurality, which makes surgery a real multimedia experience.

Live surgery transmissions are useful in many medical fields, and some surgeons believe they have potential educational value (Gandsas, McIntire, Palli, & Park, 2002). In addition, connectivity support has expanded opportunities for providing a flexible, convenient and interactive form of continuing medical education (Curran & Fleet, 2005).

The idea of recording and playing multimedia experiences on Arthron was motivated by the wish to enrich medical students’ experience in their surgery classes. The real-time transmission of surgeries using Arthron at the University Hospital Lauro Wanderley has been a successful practice. In Fig. 7 we can observe students during a class using Arthron. In Fig. 7b, (a) slides illustrate the traditional mode of displaying content and (b) shows the real-time visualization. In addition, students can ask questions of the surgeon who is conducting the operation in the surgery room. It is also possible to switch video streams during the surgery transmission. Arthron’s main feature is to offer the user a simple interface for handling different media sources/streams simultaneously. The user can therefore remotely add and remove media streams, configure their presentation format and schedule their exhibition in time and space, as shown in Fig. 7a.

Fig. 7 Multimedia experience for the telemedicine scenario. In (a) we have the captured nodes (video streams) managed remotely by Arthron. In (b) we have the telemedicine room where students and the professor could interact in real time with the surgery room
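The four operations mentioned for media streams (add, remove, configure the presentation format, and schedule in time and space) can be pictured with the hypothetical sketch below. It is not Arthron's actual interface or data model; the class names, fields, layout labels and source addresses are assumptions made only to show how such operations could fit together.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    source: str              # illustrative address of a camera or encoder
    layout: str = "full"     # presentation format, e.g. "full" or "pip"
    start_s: float = 0.0     # when the stream should appear (time)
    screen: str = "main"     # where the stream should appear (space)


@dataclass
class Session:
    streams: dict = field(default_factory=dict)

    def add(self, name, source):
        self.streams[name] = Stream(source)

    def remove(self, name):
        self.streams.pop(name, None)

    def configure(self, name, layout):
        self.streams[name].layout = layout

    def schedule(self, name, start_s, screen):
        stream = self.streams[name]
        stream.start_s, stream.screen = start_s, screen

    def on_screen(self, screen, t_s):
        """Sorted names of the streams shown on `screen` at time t_s."""
        return sorted(name for name, s in self.streams.items()
                      if s.screen == screen and s.start_s <= t_s)
```

In a surgery class, for instance, an operator could add a room camera and a laparoscope feed, show the laparoscope as a picture-in-picture one minute into the session, and later remove the room camera, all without touching the capture equipment itself.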

An innovation of Arthron is the possibility of manipulating 3D objects, especially human anatomical structures, while viewing other streams, such as video. The addition of these 3D models is especially useful as a didactic resource for distance training and learning. Through this feature, the surgeon can show students, in an integrated way, live video together with 3D models that demonstrate the normal function of organs, tissues or structures of the human body. In Fig. 8b we present the integration of 3D models into the Arthron tool.

Fig. 8 Arthron user interfaces. In (a) the interface for video manipulation and in (b) the 3D models used in a surgery transmission with Arthron

4 Discussion

The interaction design of digital convergence resources is a new challenge for designers and users. Designers have to think about new solutions built from a plurality of media and devices. Users have to adapt the multiple senses required to perceive a digital world embedded in day-to-day life. Technology convergence enables the design of multisensory user interfaces capable of moving across multiple devices and media.

Multisensory Interaction Design (MID) is one approach to addressing the plurality inherent to the digital convergence domain. This approach considers user interface design through the aspects of plurality, adaptability and cognitive ability. A pluralistic design means considering multimedia, multiuser, multidevice and also multiuse features. Adaptability is closely related to profiling (user profiles), accessible interfaces (accessibility issues) and flexible features. Cognitive ability involves users’ capability of perception, that is, users’ senses. It means that the abilities involved in the user experience should be observed and represented in the interaction design.

The user experience in digital convergence is also a multimedia user experience. A multimedia experience can use two or more senses, individually or at the same time. Typically, a multimedia experience involves different types of media. The media can be presented individually or synchronized. Users can interact with the media and with other users, and can also coordinate how the media are displayed (multiple views). We discussed two scenarios of multimedia user experience: arts and technology, and eHealth. These scenarios show us how rich and multisensory a user experience can be. We used the same tool, Arthron, to explore different user senses in both cases. The first scenario (arts and technology) showed how to integrate high-quality video and audio streams with human movements (dancers) and robots. The eHealth scenario, in turn, brings in 3D objects and natural interaction as a complement to audio and video streams.

Therefore, digital convergence is an ongoing phenomenon that changes our reality in terms of media, devices and applications. Users are changing, too: their expectations, needs, profiles and use of perception are all in a process of transformation stimulated by digital convergence. Users’ senses, however, remain the same, and working with them is the emerging challenge. Multisensory interaction design can be a first effort toward understanding design for digital convergence and its integration with everyday applications.