1 Introduction

People exhibit intrinsic variations in their gesture articulations because gestures depend on both the person producing them and the specific context, social or cultural, in which they are produced. In his psycholinguistic studies of human discourse and the relationship between gesture and thought, McNeill [1] considers that “gestures are the spontaneous creations of individual speakers, unique and personal” and that gestures “reveal the idiosyncratic imagery of thought” (p. 1). Beyond cultural dependency, gestures are also deeply intertwined with speech, so the lexical and syntactic structures of language also affect the specific forms in which gestures are produced [2]. The user-dependent nature of gesture production has been reflected many times in previous work that analyzed users’ gesture preferences in conjunction with specific gesture sensing technology, such as interactive tabletop surfaces [3, 4], accelerated movements [5], and freehand gestures [6,7,8]. These studies, generally referred to as “gesture elicitation studies,” have shown that some level of consensus exists between users due to the similar conceptual models that users naturally seem to construct when thinking about common interactive tasks. However, these studies also pointed out many variations in users’ preferences for gesture commands, with probably the most important finding being that users prefer gesture commands different from those proposed by experienced designers [3].

For the specific case of multi-touch input, there are many degrees of freedom that can be independently controlled during gesture articulation, such as the number of fingers or the finger types touching the surface [9], single-handed or bimanual input [10, 11], variations in the number of strokes forming the gesture [12, 13], and the use of additional modalities accompanying finger touch input, such as sensing pressure [14] and various parts of the finger anatomy [15]. In their user-defined surface gestures study, Wobbrock et al. [4] captured this many-degrees-of-freedom aspect when noting that “surface gestures are versatile and highly varied – almost anything one can do with one’s hands could be a potential gesture” (p. 1083). Indeed, our recent work experimentally confirmed variation in multi-touch gesture articulation and reported the many ways in which people naturally introduce variation for surface gestures when not constrained by limitations imposed by the interface or by the recognizer’s ability to discriminate between gesture types [13]. At the same time, the versatility of multi-touch input makes prototyping multi-touch gesture recognizers a difficult task because, in many cases, “the programming of these [multi-touch] gestures remains an art” [16, p. 2875]. We believe that one cause of this recognition challenge is our limited understanding of variability in multi-touch gesture articulation, which affects not only recognition performance but also users’ expression possibilities in current multi-touch interfaces. For example, a better understanding of variability could benefit interface design beyond achieving high recognition performance, toward more fluent and expressive interactions able to exploit the explicit signals contained within the variability of the articulation [17], and could lead to more accurate multi-touch gesture recognizers [18]. In addition, a better understanding of variability can improve multi-user design in which users interact simultaneously or collaborate to define a gesture for a specific task.

In this chapter, we advocate the need for more in-depth user studies to better understand how users produce multi-touch gestures. Because gestures are versatile, we argue for designing interaction techniques that support many-to-one mappings between gestures and commands. In our research methodology, we think that a good starting point is to explore the variability of multi-touch gesture input from a user-centric perspective. We conducted a pair of experiments to investigate and understand how users produce multi-touch gestures, employing quantitative and qualitative methods to characterize the variability of multi-touch gesture articulation. In our previous work [13], we presented a first study of the variability of users’ multi-touch gesture articulations, derived a taxonomy of multi-touch gestures, and introduced the concept of atomic movement. Building on these results, we extend our previous experiment in this work to understand whether our findings are consistent and robust for a larger gesture set and new participants. We first report the number of variations users can propose when they are not constrained. We then provide an in-depth analysis to characterize in a comprehensive manner the strategies employed by users to produce different articulations for the same gesture type. Our results also include subjective and qualitative feedback informing about users’ multi-touch gesture preferences.

2 Related Work

Supporting users’ wide range of multi-touch gesture articulation behaviors has been previously noted as an important design criterion to deliver increased flexibility and a high-quality user experience [19]. This fact has led to a number of orthogonal design implications [4, 20, 21]. For example, principled approaches provide basic building blocks from which gesture commands are derived, such as gesture relaxation and reuse [22], fluidity of interaction techniques [23], and cooperative gestures [24]. Other researchers advocated assisting users in the process of learning multi-touch gestures during actual interaction by proposing gesture visualizations [20], dynamic guides [25], and multi-touch menus [9, 26]. At the same time, other user-centric approaches advocate enrolling users right from the early stages of gesture set design [4, 5, 27]. Such participatory studies revealed interesting findings on users’ behaviors in articulating gestures as well as on users’ conceptual models of gesture interaction. This previous work also recommends flexible design of gesture commands to accommodate variations in how users articulate gestures. Oh and Findlater [21] went further and investigated the feasibility of user-customizable gesture commands, with findings showing users focusing on familiar gestures and being influenced by misconceptions about the performance of gesture recognizers. Rekik et al. [28] examined users’ perceived difficulty of articulating multi-touch gestures.

To deal with users’ variations in multi-touch input, in previous work [13] we presented the first investigation toward understanding multi-touch gesture variability. We described a general taxonomy for understanding users’ gestures and for deriving implications of the variability of users’ gesture articulation. Our taxonomy was the result of a user-centric study giving new insights into the different possible articulations of unconstrained multi-touch gestures. In that study, we considered a small set of 8 gesture types and we explicitly fixed the number of variations that users were asked to produce for every gesture. While this was sufficiently sound to elicit users’ articulations, it also gave rise to further questions, which we address in this work. More specifically, we are interested in the following related questions: (1) how many variations a user would be able to propose; and (2) what strategies users adopt to articulate the same gesture type within unconstrained multi-touch input.

3 Spontaneous Gestures

We define spontaneous gestures as gestures produced by users under unconstrained articulation conditions, i.e., users have total freedom in creating such gestures without any instructions. This concept and definition allow us to capture the versatility of multi-touch gestures in a faithful manner with respect to users’ actual intentions. We believe that spontaneous gestures are important for multi-touch interfaces since they deliver a more pleasurable experience by not constraining users to conform to and follow specific articulation patterns. Spontaneous gestures enable us to understand how users actually transform a geometric gesture shape into a motor articulation plan. Such a fundamental understanding allows us to abstract away from existing gestures and to leverage existing multi-touch input by incorporating more general concepts related to users’ gesture articulation behavior, thus leading to more flexible and powerful interaction techniques.

In the following, we present the results of two experiments to understand spontaneous gestures. Our analysis was conducted in light of our previous work [13] in order to illustrate and confirm its predictive power for new people and new gesture types. We first recall the open-ended experiment in which we introduced the concept of atomic movement and established our taxonomy [13]. Then, we present our goal-oriented experiment, from which we report the strategies employed by users to create spontaneous gestures and discuss users’ gesture preferences.

4 Open-Ended Spontaneous Gestures

We report in this section the results of the first task of the experiment conducted in [13]. Our goal in that experiment was to observe and analyze users’ unconstrained multi-touch gestures. We asked 30 participants to produce as many gestures as came to their minds, such as gestures that were meaningful to them or gestures that they would use to interact with applications. In addition, participants were asked to describe the gestures they performed using the think-aloud protocol.

4.1 The Concept of Atomic Movements for Gesture Production

A recurrent observation regarding participants’ gesture input behavior was that participants grouped their fingers into unitary blocks that moved in a consistent manner. We found that the number of contact fingers did not impact their movements, as long as the fingers were close together. One interesting observation was that the notion of finger proximity is relative to the gesture type and to each user’s own frame of reference, and seems hard to define in an absolute and universal manner from a system point of view. This frame of reference can in fact be substantially scaled up or down from the performance of one gesture to another. However, it tends to stay constant and consistent over time and through the continuous movements composing the same gesture. For example, one participant used two hands simultaneously, with multiple fingers in contact with the surface, to draw a circle such that each hand drew half of the circle. The same participant used both hands simultaneously, moving from the top to the bottom of the surface, to denote that he was translating all images that were between his hands. For these two examples, the relative distance between the fingers composing the same movement is different: in the first example, it is the distance between the fingers of the same hand, but in the second example, it is the distance between the two hands, which can cover the entire surface width.

To explain these behaviors, we introduce the notion of “atomic movement,” which reflects users’ perception of the undividable role that a group of fingers plays when performing a gesture. From our observations, atomic movements mostly refer to the imaginary trail of a group of fingers. An atomic movement has an internal state that can change depending on hand shape, number of fingers, velocity, direction, etc. However, state changes do not alter the role an atomic movement plays in users’ minds and their primary intentions. Atomic movements are often mapped to global strokes in symbolic gestures, but they also capture more abstract movements implied globally by a whole set of fingers. In the particular case of users performing a symbolic gesture, users do not care about the trail of each individual finger; instead, they seem to view the atomic movement produced by a group of fingers as a single stroke, without consideration of the actual individual strokes produced by each finger. For more abstract multi-touch gestures, fingers’ atomic movements express a global meaning that users convey. In all cases, the stroke or trace of individual fingers considered separately is not an important issue from the user’s atomic movement perspective, which contrasts with the system perspective when processing and interpreting multi-touch input. From our observations, we distinguish between two classes of movements depending on whether (i) the multi-touch path corresponding to the fingers is stationary or (ii) the multi-touch path implies an embodied motion. As practical examples, a variable number of fingers from one or both hands moving together along the same path, or held stationary to delimit or point at a region on the surface, are among the most frequently observed atomic movements.
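
To make the notion concrete from a system perspective, the following minimal sketch groups concurrent touch traces into candidate atomic movements by spatial proximity and labels each group as stationary or moving. It is an illustration only, not the analysis pipeline used in our studies; the Trace structure, the proximity threshold, and the motion threshold are all assumptions, and, as discussed above, fixed thresholds can only approximate users’ relative frame of reference.

```python
# Hypothetical sketch: cluster concurrent touch traces into candidate atomic
# movements and label each cluster as stationary (Ref) or Motion.
# Thresholds are arbitrary placeholder values, not values from the study.
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]              # (x, y) in surface coordinates

@dataclass
class Trace:
    points: List[Point]                  # sampled path of one finger


def _distance(a: Point, b: Point) -> float:
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def group_into_atomic_movements(traces: List[Trace],
                                proximity: float = 80.0) -> List[List[Trace]]:
    """Greedily cluster traces whose starting points lie within `proximity`."""
    groups: List[List[Trace]] = []
    for trace in traces:
        for group in groups:
            if _distance(trace.points[0], group[0].points[0]) <= proximity:
                group.append(trace)
                break
        else:
            groups.append([trace])
    return groups


def classify_atomic_movement(group: List[Trace],
                             min_motion: float = 20.0) -> str:
    """Label a group 'Motion' if any finger travels noticeably, else 'Ref'."""
    travel = max(_distance(t.points[0], t.points[-1]) for t in group)
    return "Motion" if travel >= min_motion else "Ref"
```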

4.2 An Embodied Taxonomy of Multi-touch Gestures

To capture the space in which our participants produced gestures, we propose the multi-level layered taxonomy summarized in Table 3.1. The multiple levels of our taxonomy do not model separable attributes to be characterized individually. Instead, they represent the different aspects of a single unified dynamic mechanism employed by users in the production of a multi-touch gesture.

Table 3.1 A taxonomy of multi-touch gestures

At the highest level of our taxonomy, we model the fact that a multi-touch gesture emerges from the user’s understanding of the gesture path before touching the surface. From this perspective, an external observer can only try to guess the semantic concept hidden in the user’s gesture, since it might be the case that the gesture itself is not sufficient to fully reveal the user’s intention, an observation in accordance with previous studies [4, 29, 30]. From a neurological perspective, hands and fingers are controlled and coordinated by the human motor system to achieve a desired task. The physicality level thus captures the motor control allowing users to project the semantic level onto the interactive surface. Finally, the movement level is the practical result of the motor goal, expressed by hand and finger motions, from which the unitary blocks composing the gesture can be inferred.

The movement level is at the core of our model since it constitutes the interface between the user and the interactive surface. We propose to structure this level according to two generic classes built in a recursive manner. At the lowest level of the recursion, we find the class of gestures formed by an elementary atomic movement. An elementary atomic movement can be either stationary (Ref) or Motion, as discussed previously. The Compound class refers to the recursive composition of a set of atomic movements. It is expanded into two classes according to the lifetime and synchronicity of the composing atomic movements. The Parallel class refers to users making two or more different but synchronous parallel atomic movements. This class engages relative finger motions as well as two-handed symmetric and asymmetric interaction. The Sequential class refers to users performing a set of atomic movements, either parallel or elementary, holding and releasing hands or fingers on and from the surface in a discrete, iterative manner.
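
The recursive structure of the movement level can also be sketched as a small data type. The following Python sketch merely illustrates the Elementary/Compound recursion described above; the class and field names are ours and not part of the taxonomy itself.

```python
# Minimal sketch of the movement level as a recursive data type.
# Names are illustrative only; the taxonomy itself is summarized in Table 3.1.
from dataclasses import dataclass
from enum import Enum
from typing import List, Union

class ElementaryKind(Enum):
    REF = "stationary"        # fingers held in place (Ref)
    MOTION = "motion"         # fingers moving along a path

@dataclass
class Elementary:
    kind: ElementaryKind

@dataclass
class Parallel:
    parts: List["Movement"]   # atomic movements articulated at the same time

@dataclass
class Sequential:
    steps: List["Movement"]   # atomic movements entered one after the other

Movement = Union[Elementary, Parallel, Sequential]

# A "plus" sign drawn with one finger: two strokes entered in sequence.
plus = Sequential([Elementary(ElementaryKind.MOTION),
                   Elementary(ElementaryKind.MOTION)])

# A "heart" whose two symmetric halves are drawn at the same time.
heart = Parallel([Elementary(ElementaryKind.MOTION),
                  Elementary(ElementaryKind.MOTION)])
```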

5 Goal-Oriented Spontaneous Gestures

We conducted a second experiment to understand how users express variability in the gestures they produce. We had two main goals: (1) we were interested in how many variations a user would be able to propose, and (2) we were interested in observing how people express variability for the same gesture type and what strategies they employ to create different articulation patterns for the same gesture. We asked 16 new participants to create as many different articulation variations as they were able to for 22 gesture types (see Fig. 3.1), given the requirement that the executions were realistic for practical scenarios, i.e., easy to produce and reproduce later.

Fig. 3.1 The gesture dataset for the second experiment contains 22 gesture types: letters, geometric shapes (triangle, square, horizontal line, circle), symbols (five-point star, spiral, heart, zig-zag), and algebra symbols (step-down, asterisk, null, infinite)

5.1 Gesture Variations

Participants were instructed to propose as many articulation variations as possible for each gesture type. We collected 5,155 (= 1,031 \(\times\) 5) total samples for our set of 22 gesture types. In Fig. 3.2, we summarize the number of gesture variations produced for each gesture type. On average, our participants proposed 2.92 variations per gesture type (sd \(= 0.45\)), a result which we found to be in agreement with the findings of Oh and Findlater [21] for action gestures (mean 3.1, sd \(= 0.8\)). A Friedman test revealed a significant effect of gesture type on the number of variations (\(\chi ^2(21) = 84.41\), \(p < 0.001\)). The “star” and “spiral” gestures presented the lowest numbers of variations (1.68 and 2.19 variations on average). The gesture with the largest number of variations was “square” (3.56 on average), which our participants easily decomposed into individual strokes that were afterward combined in many ways in time and space. These first results suggest that the specific geometry of a gesture provides users with different affordances for how to articulate its shape. Likely, the mental representation of a gesture variation implies a particular type of articulation, which is tightly related to the gesture shape. We can also remark that for all gesture types except “star” and “spiral”, the maximum number of variations was between 4 and 7. This observation suggests that our choice of 4 variations for each gesture type in our first experiment could have been even larger for some gesture types and some users. The minimum number of variations was between 1 and 2. This result suggests that for some users and some gesture types, the number of gesture articulation variations can be limited, which can be explained by previous practice but also by the geometrical shape of the gesture.
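
For readers who wish to run this kind of analysis on their own data, the following sketch shows how such a Friedman test could be computed with SciPy over per-participant variation counts, one column per gesture type. The data below is randomly generated for illustration only and does not correspond to the values collected in our study.

```python
# Illustrative sketch (not the study's analysis script): Friedman test over
# per-participant variation counts, one condition per gesture type.
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
# Toy data: 16 participants x 22 gesture types, counts of proposed variations.
counts = rng.integers(1, 8, size=(16, 22))

# friedmanchisquare expects one sample (column) per condition (gesture type).
statistic, p_value = friedmanchisquare(*[counts[:, g] for g in range(22)])
print(f"chi2(21) = {statistic:.2f}, p = {p_value:.3f}")
```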

5.2 Strategies for Creating Different Gesture Articulation Variations

To better understand how participants produced different gesture articulation variations for the same gesture type, we report in this section the different strategies developed by our participants. Based on our observations and on participants’ comments, we arrived at the following strategies:

Fig. 3.2 Number of gesture variations produced for each gesture type. Boxes show min, max, median, and first and third quartiles computed with data from all participants

1. Vary the number of atomic movements. As highlighted in our taxonomy, a gesture can be composed of a variable number of atomic movements. To define a new articulation for the same gesture type, some participants varied the number of atomic movements composing the gesture. Most of them associated the maximum possible number of atomic movements with the number of direction changes in the gesture type (e.g., for the “square” gesture, there are four direction changes). Other participants proposed different gesture articulations by varying the way the set of atomic movements was produced. Two strategies were used: (1) changing the direction of some atomic movements composing the gesture articulation. For instance, an atomic movement representing a horizontal line may be created by moving the fingers from left to right or from right to left; and (2) changing the order of execution of the set of atomic movements composing the gesture articulation. For instance, the same gesture can be articulated using many atomic movements, and for the same atomic movements users may produce different orderings, e.g., there are 442 possible ways to draw a “square” using only sequential movements [31] (p. 273); see the sketch after this list.

2. Vary the synchronization of atomic movements. As we showed in our taxonomy, a gesture is composed of a set of atomic movements which can be entered in sequence (i.e., one atomic movement after the other, such as drawing the “plus” sign with one finger) or in parallel (i.e., multiple atomic movements articulated at the same time, e.g., using two fingers to draw two sides of a “heart” shape at the same time). To create a new articulation for the same gesture type, participants varied the synchronization of the atomic movements composing their gestures. However, not all gestures can be produced with hand movements in parallel; in fact, only gestures containing a symmetry can be performed with parallel atomic movements. Interestingly, wherever a symmetry was present, participants produced synchronous parallel atomic movements to create that part of the gesture (i.e., some atomic movements of the gesture were articulated one at a time and others in parallel, e.g., using two fingers at the same time to draw the two diagonal symmetric lines of a “triangle” shape and then one finger to draw the horizontal line).

3. Vary the number of hands. As highlighted in our taxonomy, a gesture can be performed using one hand or both hands. Interestingly, all participants varied the number of hands to articulate gestures. Most participants used one hand only when there was a single atomic movement to produce and used both hands when there were two atomic movements that could be entered in parallel. In addition, when using one-handed gestures, two additional strategies were observed: (1) changing the hand from the dominant to the non-dominant one, and (2) alternating hands to enter the sequence of atomic movements. However, these two strategies were rarely used by our participants.

4. Vary the number of fingers. For the same gesture articulation (i.e., the same number of atomic movements and hands with the same synchronization), we rarely observed participants varying the number of fingers to propose a different gesture articulation. This observation confirms that users rarely care about the number of fingers they use to produce multi-touch gestures [4, 13].
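
To give a rough feel for how quickly ordering and direction changes multiply (strategy 1 above), the following toy enumeration treats a “square” as four directed side strokes drawn sequentially and varies only stroke order and stroke direction, yielding \(4! \times 2^4 = 384\) variants. This simplified enumeration is ours; it does not reproduce the counting method behind the 442 articulations reported in [31].

```python
# Toy enumeration for strategy 1: sequential articulations of a "square"
# decomposed into its four sides, varying only stroke order and direction.
# This simplified model yields 4! x 2^4 = 384 variants; the count of 442
# in [31] follows a different enumeration not reproduced here.
from itertools import permutations, product

sides = ["top", "right", "bottom", "left"]

articulations = [
    tuple(zip(order, directions))
    for order in permutations(sides)
    for directions in product(["forward", "reverse"], repeat=len(sides))
]

print(len(articulations))   # 384
print(articulations[0])     # (('top', 'forward'), ('right', 'forward'), ...)
```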

6 Discussion

In this section, we present user preferences and qualitative data that capture users’ mental models as they articulate spontaneous gestures.

6.1 Users’ Preferences

The primary goal of our user study was to understand users’ unconstrained multi-touch gesture articulation behaviors and to analyze the features and degrees of freedom that users consider when proposing different variations for the same gesture type within multi-touch input. This was planned before running our experiment, in the form of a questionnaire that users filled in after completing the task. In fact, we preferred to ask participants about their preferences at the end of the experiment in order not to influence them during the experiment.

After completing the set of gestures, participants were asked to rate their satisfaction with their multi-touch performance on a 7-point Likert scale (1 strongly disagree, 7 strongly agree). Results showed that participants were satisfied with the set of gestures they proposed (median 6, stdev \(= 0.83\)). Three participants were extremely satisfied, and only one participant gave a score of 4, mentioning that he could have proposed other gestures by varying the number of fingers.

Fig. 3.3 Users’ preferences for articulating multi-touch gestures

We then asked participants to rate their preferences regarding the number of fingers, stroke synchronization, and one-handed versus bimanual input in gesture articulation; see Fig. 3.3. Interestingly, although bimanual parallel articulations were more represented in the second gestures performed by users than in their first gestures, our participants preferred bimanual to one-handed sequential gestures. This observation suggests that, with practice, people could develop different preferences for articulating gestures in terms of stroke synchronization.

6.2 Users’ Mental Models During Spontaneous Gesture Production

Throughout our experiments, we carefully observed the variations in how users articulate multi-touch gestures, and we recorded users’ qualitative feedback. We highlight such findings in this section.

1. Preference for multi-finger input. Overall, 13 out of 16 participants used more than one finger per stroke. Some participants were enthusiastic about touching the surface with many fingers at once and reported that they “feel more free and comfortable when using many fingers”, while one participant said he was “more comfortable with multiple fingers, since I feel like their movement is better controlled by my arm”. Although multiple fingers were preferred, participants did not really care about the exact number of fingers touching the surface. One participant stated: “one or multiple fingers is the same and has no effect on the stroke nor on gesture expressiveness... I try to see how can I decompose the gesture into multiple strokes and use both hands simultaneously for different strokes”. Also, it was often the case that some fingers disconnected from the surface for a short period of time during gesture articulation (e.g., starting to draw with three fingers, continuing with two, and finishing with three fingers again). For such cases, appropriate visual feedback might prove useful to show users what unintentionally happened during articulation.

2. One finger is for precise input. When participants employed one finger only, they explained that they did so to be more accurate. For example, one participant stated that “when the symbol is complicated, such as a five-point star or spiral, I prefer using one finger to be accurate”. Three participants regularly used one finger to enter gestures. Two reported that they conceptualized strokes simultaneously articulated by multiple fingers as being different, even though the movement was the same. Participants also made connections between single-finger gestures and pen input in many cases, e.g., “I use my finger like a pen”. This finding may have implications for future finger gesture designs, as we already know that finger and pen gestures are similar but also different in many aspects [32].

3. More fingers means more magnitude. Three participants felt they were drawing thicker strokes when employing more fingers. This finding may have implications for designing interaction techniques that exploit the number of fingers touching the surface beyond finger-count menus [9].

4. Symbol type influences multi-touch input. Two participants said they articulated letters just as they would write them with a pen, one stroke after another. However, they felt more creative for the other symbols. One participant commented that for the “null” gesture, she would like to draw it just as she had been taught at school: first the circle and then the line. Another participant was enthusiastic about touching the surface with both hands at once: “I wish we had been taught to use both hands simultaneously to write letters! It is faster, more precise, and easier”. Most participants considered that the number of strokes and their coordination in time depend on the gesture type. One participant said that “if the symbol can be drawn with only one stroke, I prefer to perform it with one stroke only”; two other participants said, “whenever there is a symmetry in the symbol, I prefer multiple simultaneous strokes”; and another participant said, “whenever I can decompose the symbol on multiple strokes where I can use my both hands to perform strokes simultaneously, I will do it”.

5. Gesture position, rotation, size, and speed can be sources of variation. One participant said that the position of the gesture on the surface, its size, rotation, and velocity represent sources of variation that he could have used to propose more gesture articulation variations. However, he did not resort to them for two reasons: (1) varying the number of hands and movement synchronicity over time is more specific and “intuitive” for multi-touch surfaces, and (2) velocity may be difficult to distinguish without any feedback from the surface.

7 Conclusion

The results presented in this chapter contribute toward a better understanding of spontaneous gestures. We now have a more precise idea of how users produce unconstrained multi-touch gestures. We also identified how many variations users are able to produce in general, by examining experimental results for a set of representative gesture types. Our findings are important in the context of proposing new interaction techniques that make use of the variability of user gestures. Further work will investigate more aspects of users’ multi-touch gesture production behavior in an attempt to reach a systematic understanding of multi-touch interaction with spontaneous gesture production patterns.