7.1 Introduction

The piano-style keyboard remains the most commonly used interface for many computer music tasks, but it is also notable for a different reason: it is the object of a persistent disconnect between musicians and computer scientists, whose differing approaches to understanding expressive keyboard performance have important implications for music computing and human-computer interaction generally.

Expression on the acoustic piano, considered from a mechanical perspective, seems straightforward: the speed of a key press determines the velocity with which a hammer strikes a string, which in turn determines nearly all acoustic features of a note. In the mechanical view, expression at the piano is a function of the velocity and timing of each key press (with secondary contributions from the pedals). Accordingly, digital keyboards are essentially discrete interfaces, sensing only the onset and release of each note, with a single velocity metric associated with each onset.

On the other hand, pianists often swear by a rich, multidimensional gestural vocabulary at the keyboard. To the pianist, key “touch” is a critical component of any expressive performance, and though pianists differ on the ideal approach, there is a consensus that expressive keyboard gesture goes well beyond mere key velocity. Consider pianist Alfred Brendel on emulating orchestral instruments at the piano (Berman 2000):

The sound of the oboe I achieve with rounded, hooked-under, and, as it were, bony fingers, in poco legato. The flute… whenever possible, I play every note with the help of a separate arm movement. The bassoon… the touch is finger-staccato. The noble, full, somewhat veiled, ‘romantic’ sound of the horn demands a loose arm and a flexible wrist.

Another common thread in piano pedagogy is the value of achieving a “singing” touch. Reginald Gerig, in summarising his historical survey of famous pianists’ techniques, writes, “the pianist with the perfect technique is also a singer, a first-rate vocalist! The singing voice is the ideal tonal model and aid to phrasing, breathing, and interpretation” (Gerig 2007, p. 520).

Perhaps symptomatic of these diverging views, very few pianists would choose even the most sophisticated digital piano over an acoustic grand of any quality. Pianist Boris Berman (2000) offers this advice to students: “Often overlooked is the need to work on an instrument that responds sufficiently to the nuances of touch. (No electronic keyboard will do, I’m afraid.)”

7.1.1 Quantifying Expressive Keyboard Touch

Our work seeks to reconcile mechanical and expressive views of piano performance by developing quantitative models of keyboard technique. We deliberately leave aside the question of how key touch affects the sound of the piano, instead focusing on the performers themselves. Topics of interest include:

  • How does expressive musical intent translate into physical motion at the keyboard?

  • Can we identify general relationships between musical character and key motion?

  • Which aspects of touch are common across performers, and which are part of an individual player’s style?

  • Can we use detailed measurements of a player’s finger motion to predict the musical qualities of a performance?

These questions have important implications for both musical performance and HCI. Like many forms of human-computer interaction, the notion of a “key press” is an abstraction which reduces a potentially complex series of motions into one or two discrete quantities. In this chapter, we will show how looking beyond this abstraction can reveal new details of a performer’s intentions; similar abstraction-breaking approaches can potentially yield insight into other computer interfaces.

7.2 Background

7.2.1 Measurement of Piano Performance

Numerical measurement of piano performance has a long history, detailed summaries of which can be found in Goebl et al. (2008) and Clarke (2004). The percussive nature of the piano action has encouraged models of expressive performance focused primarily on velocity and timing. Classical performances in particular are often evaluated by tempo and loudness deviations from a printed score (e.g. Repp 1996).

In the past decade the field of performance studies has flourished (Rink 2004), bringing with it an emphasis on the performing musician as equal partner with the composer in realising a musical product. However, even as attention has shifted to the unique role of the performer, a bias remains toward analyses of tempo and dynamics, which Rink suggests may be “because these lend themselves to more rigorous modelling than intractable parameters like colour and bodily gesture” (p. 38).

It is true that conventional analyses largely discard any sense of the performer’s physical execution beyond the resulting hammer strikes. Acoustically speaking, though, this approach has merit: in the 1920s, Otto Ortmann (1925) demonstrated that keys played percussively (in which a moving finger strikes the key) exhibit a different pattern of motion than those played non-percussively (when the finger begins at rest on the key surface), but Goebl et al. (2004) showed that apart from a characteristic noise of the finger striking the key, percussive and non-percussive notes of the same velocity are indistinguishable by listeners. A similar study by Suzuki (2007) showed very little spectral difference between tones played in each manner.

7.2.2 Beyond Velocity and Timing

If velocity and timing (along with pedal position) are sufficient to reproduce a piano performance with high accuracy, what then is the value of studying additional dimensions of performer motion? Doğantan-Dack (2011, p. 251) argues that the performer’s conception of a performance is inseparable from its physical execution:

I would indeed hypothesize that performers do not learn, represent and store rhythmic-melodic units without their accompanying gestural and expressive dimensions. As distinct from the listener’s experience and knowledge of such local musical forms, the performer, in order to be able to unfold the dynamic shape of the musical unit from beginning to end as in one single, unified impulse, needs a kind of continuous knowledge representation that is analogue and procedural rather than declarative.... The performer does not come to know the rhythmic-melodic forms they express in sound separately from the physical gestures and movements required to bring them about. Any gesture made to deliver a unit of music will inevitably unify the structure and expression, as well the biomechanical and affective components, which theory keeps apart.

From this perspective, measurements of piano mechanics alone will fail to capture important details of the performance’s original expressive conception. Measuring key touch as a continuous gestural process rather than a sequence of discrete events may better preserve some of these details. Of course, gesture measurement can be carried further still, even beyond the keyboard: for example, Castellano et al. (2008) found links between pianists’ emotional expression and their body and head movements. For our present purposes, measuring continuous motion at the keyboard provides an appropriate amount of expressive detail while retaining links to traditional methods of analysing piano performance.

Some authors have previously examined touch as a continuous entity. Parncutt and Troup (2002) discuss mechanical constraints in playing complex multi-note passages, and also examine the contribution of auxiliary contact noises (finger-key, key-keybed, hammer-string) to the perception of piano sound; the amplitude and timing of these noises often depends heavily on the type of touch used. Goebl and Bresin (2001), in examining the reproduction accuracy of MIDI-controlled acoustic pianos, contrast the continuous velocity profile of a human key press with its mechanical reproduction. On the commercial side, Bösendorfer CEUS pianos have the ability to record continuous key position (Goebl et al. 2008), but thus far few detailed studies have made use of this data.

7.3 Measuring Gesture Within a Key Press

To better understand the expressive dimensions of key touch, it is necessary to break the abstraction of a key press as a discrete event. To this end, we have developed a new hardware and software system which can be retrofitted to any piano to measure the continuous position of each key.

7.3.1 Optical Sensor Hardware

Our sensor system (McPherson and Kim 2010) is based on a modified Moog Piano Bar, a device which installs atop an acoustic piano keyboard to provide MIDI data. Internally, the Piano Bar uses optical reflectance sensors on the white keys and beam interruption sensors on the black keys to measure the position of each key (Fig. 7.1). The Piano Bar generates discrete key press and release events from this information, but we instead sample the raw sensor values to provide a continuous stream of position information. The position of each key is recorded 600 times per second with 12-bit resolution (closer to 10 bits in practice due to limited signal range).

Fig. 7.1 Optical reflectance and break-beam sensors measure continuous key position

7.3.2 Data Segmentation and Analysis

The 600 Hz sample rate is sufficient to capture several position values during the brief interval the key is in motion, recording not just its velocity but its shape (continuous time-position profile). Key press events can be extracted in real time from the position data stream by simple thresholding to identify the start of a press, followed by peak detection to identify the point of impact with the key bed. Once the start and end points of the press have been identified, higher-level features can be extracted, including (MIDI-like) velocity, peaks and troughs in the continuous velocity curve (indicating percussively-played notes), and the position of the key immediately following the key bed impact (which is proportional to the force exerted by the player). See McPherson and Kim (2011) for further details.
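The segmentation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from McPherson and Kim (2011): the function name `find_key_presses`, the `KeyPress` record, the normalised position scale (0 at rest, larger values deeper) and both threshold values are hypothetical. The peak detector simply follows the key until its position stops increasing, which would need smoothing on noisy sensor data.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 600  # sample rate stated in the text


@dataclass
class KeyPress:
    start: int            # first sample at or above the press threshold
    impact: int           # sample of maximum depth (assumed key-bed impact)
    mean_velocity: float  # average velocity over the press, position units/s
    peak_velocity: float  # largest sample-to-sample velocity in the press


def find_key_presses(position, press_threshold=0.1, release_threshold=0.05):
    """Segment a key-position stream into presses.

    Both thresholds are hypothetical values chosen for illustration.
    """
    presses = []
    i, n = 0, len(position)
    while i < n:
        if position[i] < press_threshold:
            i += 1
            continue
        start = i
        # Peak detection: follow the key until its position stops increasing.
        impact = start
        while impact + 1 < n and position[impact + 1] > position[impact]:
            impact += 1
        # Include the sample just before the threshold crossing, if any.
        begin = max(start - 1, 0)
        steps = [(position[k + 1] - position[k]) * SAMPLE_RATE_HZ
                 for k in range(begin, impact)]
        if steps:
            mean_velocity = sum(steps) / len(steps)
            presses.append(KeyPress(start, impact, mean_velocity, max(steps)))
        # Skip ahead until the key has been released.
        i = impact + 1
        while i < n and position[i] >= release_threshold:
            i += 1
    return presses
```

Keeping the per-sample velocity list rather than a single number is what makes shape features possible: a dip in `steps` partway through a press would be one crude indicator of a percussive stroke.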

Beyond measuring traditional key presses, continuous key position can identify fingers resting lightly on a key surface, changes in weight over the course of a long-held note, and details of the overlap between notes in a phrase. Section 7.5 will show that measurements of weight in particular may have important correlations with expressive musical intent.

7.4 Multidimensional Modelling of Key Touch

Our sensor system takes an important step toward reconciling pianists’ nuanced, multidimensional view of keyboard technique with the mechanical realities of the instrument. In recent work (McPherson and Kim 2011) we show that, regardless of whether different forms of key touch produce different sounds on the acoustic piano, pianists can and do vary the shapes of their key presses in multiple independent dimensions. Two user studies conducted on an acoustic piano with continuous position sensing support this result:

7.4.1 Study 1: Gesture and Intuition

Without being told the purpose of the study, subjects were asked to play a simple melody guided by nine different expressive cues (e.g. “very delicate, as if afraid of being heard”, “like flowing water”, “as if you’re angry at the piano”). Twelve features were selected to represent the shape of each key press, including key position and velocity measurements during the beginning, middle and end of the brief window during which the key is in motion.

If we accept the traditional notion that key presses reduce only to key velocity, then all 12 features should be linearly related. Instead, using principal component analysis, we demonstrated that six independent dimensions were required to represent 90% of the variance among all key presses, suggesting that pianists’ rich picture of key touch has a strong mechanical foundation.
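The reasoning behind this test can be illustrated with a small sketch. If every feature were a linear function of one underlying quantity, a single principal component would account for essentially all of the variance; counting the components needed to reach 90% therefore gives a rough measure of how many independent dimensions are present. The code below is an assumption-laden illustration rather than the analysis used in the study: the function names are hypothetical, and the eigenvalues are found with a plain cyclic Jacobi method so that only the standard library is needed.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of a list of equal-length feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
            for a in range(d)]


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    d = len(a)
    for _ in range(sweeps):
        off = sum(a[p][q] ** 2 for p in range(d) for q in range(p + 1, d))
        if off < tol:
            break
        for p in range(d):
            for q in range(p + 1, d):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(d):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(d):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(d)), reverse=True)


def components_needed(rows, fraction=0.9):
    """Number of principal components required to reach `fraction` of variance."""
    eig = [max(v, 0.0) for v in symmetric_eigenvalues(covariance_matrix(rows))]
    total = sum(eig)
    running = 0.0
    for count, value in enumerate(eig, start=1):
        running += value
        if running >= fraction * total:
            return count
    return len(eig)
```

Applied to a feature table in which every column is a multiple of one variable, `components_needed` reports a single component; adding an independent column of comparable variance raises the count.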

We further trained classifier systems to predict the expressive cue from key motion. We showed that classifiers trained on all 12 features performed on average 25% better than those trained on MIDI velocity alone, indicating that key motion in multiple dimensions correlates meaningfully with expressive intent. The detailed nature of this correlation is a primary topic of continuing investigation.

7.4.2 Study 2: Multidimensional Performance Accuracy

In pilot studies with professional pianists, we identified five dimensions of key motion for further investigation (Fig. 7.2):

Fig. 7.2 Five dimensions of a piano key press (Reprinted from McPherson and Kim (2011) with kind permission from ACM)

  • Velocity: Speed of the key in the traditional MIDI sense, related to hammer speed.

  • Percussiveness: Whether the finger is at rest or in motion when it strikes the key.

  • Rigidity: For percussively-played notes, whether the finger joints are rigid or loose when the finger-key collision occurs.

  • Weight: Force into the key-bed immediately after a press.

  • Depth: Whether a press reaches the key bed or stops midway through its range of motion.

Our main study evaluated whether subjects were able to accurately control each dimension, independently or in combination. Each dimension was divided into two or three discrete classes, and decision tree classifiers were trained using key presses performed by the investigators. Ten subjects (all novice or intermediate pianists; see footnote 1) played a series of isolated key presses, attempting to match particular target classes (Fig. 7.3). Subject accuracy is shown in Fig. 7.4; with the exception of rigidity, subjects were able to control each individual dimension with greater than 75% accuracy. Multidimensional accuracy was lower, but still significantly above random chance for each task.

Fig. 7.3 Multidimensional key touch testing environment

Fig. 7.4 Proportion of key-presses correctly classified, with 95% confidence intervals (Reprinted from McPherson and Kim (2011) with kind permission from ACM)

These results suggest that the keyboard can successfully be used as an interface for multidimensional gesture sensing, with the results potentially applicable to other mechanical button interfaces as well.
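As an aside on how accuracy figures of this kind can be read, the sketch below computes a 95% confidence interval for a proportion of correctly classified key presses and checks whether its lower bound clears the chance level. The chapter does not state which interval was used for Fig. 7.4; the Wilson score interval here is an assumption, and the function names and any counts passed to them are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


def above_chance(successes, trials, n_classes):
    """True if the interval's lower bound exceeds uniform guessing."""
    low, _ = wilson_interval(successes, trials)
    return low > 1.0 / n_classes
```

For a dimension divided into two classes the chance level is 0.5, and for three classes it is one third, so the same observed accuracy can be more or less convincing depending on how the dimension was divided.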

7.5 Towards a Model of Expressive Gesture

The studies in the previous section establish that touch can be analysed in multiple dimensions using continuous key position, and that pianists are able to control multiple dimensions both intuitively (Study 1) and deliberately (Study 2). Our ultimate goal is a numerical model relating expressive musical intent to multidimensional key touch. This is a challenging proposition, given the subjective and personal nature of musical expression. Our work in this area is ongoing, but we here present two initial case studies that may be indicative of broader patterns.

7.5.1 Touch in Beethoven’s Piano Sonata #4

We collected key touch measurements from four professional pianists’ performances of the second movement of Beethoven’s 4th piano sonata, op. 7. This movement (the opening of which is shown in Fig. 7.5) presents an interesting case study in expressive touch: the tempo is slow and the texture spare, employing long-sustaining notes and chords. The phrasing and the tempo marking largo, con gran espressione suggest continuity and intensity despite the slow tempo and soft dynamic level.

Fig. 7.5 Beethoven Piano Sonata #4, op. 7, mm. 1–7. Notes highlighted in red correspond to measurements in Fig. 7.6 (Color figure online)

Though each pianist’s interpretation differed, we observed some notable patterns of key touch in the opening measures that suggest a relationship between expressive intent and physical gesture.

7.5.1.1 Weight and Musical Intensity

Figure 7.6 shows one pianist’s performance of mm. 3–5. For clarity, only the top notes in the right hand are shown. In contrast to traditional MIDI representations, the pianist’s action over the entire course of each note is shown. Because a felt pad separates the key from the key bed, weight on a pressed key effects a slight change in position. Examining the pattern of weight over the course of each note suggests interesting musical correlations:

Fig. 7.6 Key position for Beethoven sonata #4, mm. 3–5, topmost line only. Vertical axis indicates deviation from key rest position in inches. Dashed line indicates (scaled) damper pedal position. Colours by pitch: A (magenta), B (green), C (blue), D (red), E (black) (Color figure online)

  1. The first note (corresponding to m. 3) has a rounded position profile indicating that the pianist increased the force into the key bed over the course of the note, rather than exerting maximum force at the time of impact. A similar effect can be seen in the last two notes of the passage (m. 5). Subjectively, we observed that these notes tended to be played with greater emphasis and continuity; the phrase in m. 5 in particular was typically played with a crescendo across the measure.

  2. The long note in m. 4 (shown in blue in Fig. 7.6), marked sforzando in the score, exhibits a particularly telling weight profile. After the initial impact, the pianist exerts an exaggerated force into the key which diminishes over the course of the note. This is a direct parallel to the typical musical shape of the passage, where the sf note marks the strong point of the phrase, with a diminuendo for the rest of the measure.

These observations indicate that force into the key bed may correlate with the intended intensity or direction of a musical phrase. Since the piano’s decay cannot be altered, conventional analyses typically do not consider the performer’s intended shape within a note; however, the body language of most pianists indicates that they shape phrases both across and within notes. Indeed, a recent study found that nearly half of keyboard players agreed with the statement “I think about musical shape when thinking about how to perform a single note” (Prior 2011).
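One crude way to summarise a weight profile of this kind is to fit a straight line to the key position while the note is held and look at the sign of its slope. The sketch below does this with ordinary least squares. It is an illustrative assumption rather than the analysis behind Fig. 7.6: the function name `weight_trend`, the tolerance value and the three labels are hypothetical, and a single slope cannot capture a rounded profile that rises and then falls.

```python
def weight_trend(held_positions, sample_rate_hz=600, flat_tolerance=0.001):
    """Least-squares slope of key depth over a held note.

    held_positions: key depth samples (e.g. inches) taken after key-bed impact.
    flat_tolerance: hypothetical slope (units per second) treated as steady.
    Returns (slope, label).
    """
    n = len(held_positions)
    if n < 2:
        raise ValueError("need at least two samples")
    times = [k / sample_rate_hz for k in range(n)]
    t_mean = sum(times) / n
    p_mean = sum(held_positions) / n
    num = sum((t - t_mean) * (p - p_mean)
              for t, p in zip(times, held_positions))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den
    if slope > flat_tolerance:
        label = "increasing"   # force into the key bed grows during the note
    elif slope < -flat_tolerance:
        label = "decreasing"   # force relaxes after the initial impact
    else:
        label = "steady"
    return slope, label
```

Under this summary, the first note in the passage would be expected to fall in the "increasing" class and the sforzando note in the "decreasing" class, though real recordings would need the pedal and release portions trimmed away first.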

7.5.1.2 Articulation and Touch

Figure 7.7 shows key velocity over time for the note marked sf in m. 4, highlighting the shape of the key onset itself. Each pianist was asked to play the entire phrase twice, the first time playing a “warm” sforzando, the second time a “sharp” sforzando. Such distinctions in articulation are common on string instruments, where they relate to the motion of the bow. Though their application on the piano is less obvious, pianists routinely employ similar vocabulary.

Fig. 7.7 Key velocity (in/s) versus time for the sforzando note in m. 4. Four players (two shown here) were asked to play a “warm” and a “sharp” sforzando, with most players demonstrating a more percussive touch on the latter

Our measurements showed that three of four pianists played the “sharp” sforzando with a more percussive stroke than the “warm” sforzando (see footnote 2). The fourth pianist played both notes identically. Peak key velocity tended to be similar in both versions, suggesting that the expressive descriptors influenced the shape of each pianist’s gesture independently of its overall speed.

7.5.2 Touch in Schubert’s Piano Sonata D. 960

The fourth movement of Schubert’s Piano Sonata in B-flat Major D. 960 opens with a curious marking: forte-piano, implying a note that begins strongly and immediately becomes soft (Fig. 7.8). This dynamic profile is impossible on the acoustic piano, yet the marking recurs frequently throughout the movement. We interviewed the four pianists in our study about their approach to this passage and collected continuous key position measurements, not only of Schubert’s fp marking, but also of several “re-compositions” substituting other dynamic markings: forte, forte with diminuendo, mezzo-forte, mezzo-forte with an accent, mezzo-forte with a sforzando, and piano.

Fig. 7.8 Schubert D. 960 movement IV opens with a forte-piano that has no direct mechanical realisation on the piano

Three of the four pianists indicated specific gestural approaches to playing forte-piano. To one pianist, the marking implied “a sharp attack, but without the follow-up weight.” Another interpreted it as “forte, [with] piano body language,” adding that in his teaching, body language is important and that he encourages students to “apply expression to non-keystroke kinds of events.” A third explained in more detail:

When I teach people, there are a number of different ways I tell them they can vary the tone color.... There’s the speed of the attack, there’s the weight of the attack, there’s the firmness of the fingers, and there’s how direct or how much of an angle you play at. So when I see an fp … I want a very fast attack and probably a shallow attack.

Given the consistency and specificity of the pianists’ responses, we expected to find a marked difference in key motion for notes played forte-piano compared to other dynamic markings. However, the numerical data is less clear-cut. Figure 7.9 shows key presses for the top G for all pianists, scored according to key velocity, percussiveness, maximum key depth and follow-up position (weight) several milliseconds after impact with the key bed. The data exhibits some clustering according to playing style, indicating at least moderate consistency across pianists and repetitions. Velocity shows clear differentiation among playing styles, but few other dimensions show any substantial, independent correlation. The results were similar for the bottom G, and did not change when considering only the three pianists who indicated a consciously different technique.

Fig. 7.9 Comparison of four features of the opening G key press in Schubert D. 960, labelled according to the initial dynamic marking. Plots reflect several repetitions by four pianists

One hypothesis for this result is that, though the pianists may perceive each marking differently, their physical execution is identical. Another is that the pianists do indeed use different body language for different dynamics, but that this is not manifested in different profiles of key motion. Either way, this excerpt demonstrates the limitations of key touch in encompassing all aspects of expression at the piano.

7.5.3 Discussion

The preceding examples lend initial support to the notion that expression on the piano extends beyond discrete metrics of key velocity, and that in certain cases, expressive intent has a direct effect on the profile of motion within a single key press. Further studies are needed to definitively establish the nature and scope of the expression-gesture link. In addition to studying larger numbers of performers, the use of narrowly-defined expressive tasks (for example, to emulate a carefully constructed audio example, or to play an excerpt emphasising specific emotional qualities) will help clarify the ways that key touch is shaped by expressive intent. Augmenting future studies with video motion capture could allow further exploration of the way body language reflects the expressive qualities of a performance.

7.6 Implications

7.6.1 Computationally Augmenting the Acoustic Piano

Piano touch and HCI naturally converge in the creation of new digital musical instruments. We have created a system of electromagnetic augmentation of the acoustic piano which uses information from the performer’s key touch to shape the sound of each note (McPherson 2010). The acoustic piano, for all its versatility, has a notable limitation: a note, once struck, cannot be further shaped by the performer before it is released. Our system (Fig. 7.10) uses electromagnets inside the instrument to induce the strings to vibrate independently of the hammer mechanism, allowing notes with no attack and indefinite sustain, as well as harmonics, pitch bends and new timbres.

Fig. 7.10 The magnetic resonator piano, an electronically-augmented acoustic piano. Electromagnets induce the strings to vibrate in response to continuous gestural input sensed from the motion of the piano keys

Continuous measurements of key touch are used to control signals to the electromagnets; continuous key position measurement also enables new types of gestures that have no meaning on the traditional keyboard (McPherson and Kim 2010). For example:

  • Slowly moving a key from its rest position to fully pressed creates a crescendo effect.

  • Exerting a heavy force on a pressed key elicits a timbre change by altering the harmonic content of the electromagnet waveform.

  • Periodic variations in key pressure generate vibrato on a note.

  • Lightly vibrating a key near its rest position creates a glissando up the harmonic series of the corresponding string.

  • Holding one key while lightly pressing the adjacent key bends the pitch of a note.

All of these effects are rendered acoustically by the strings and soundboard, promoting integration between traditional and extended piano sounds. Relationships between keyboard gestures and acoustic features are adjustable in software, allowing the instrument to serve as a laboratory for the mapping problem: how do we map the performer’s physical actions to sound in a way that is flexible and intuitive? (See footnote 3.) Studying the expressive aspects of piano touch, as in Sect. 7.5, may assist in developing mappings that build on the intuition of trained pianists.
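To make the idea of a software-adjustable mapping concrete, the sketch below turns a continuous key position into two control values: an amplitude that grows as the key travels from rest to the key bed (the crescendo gesture), and a brightness value driven by how far extra weight compresses the key beyond the key bed (the timbre gesture). It is a hypothetical illustration only and does not describe the mapping layer of the actual instrument: the function names, the normalised position scale, the `max_overpress` range and the linear curves are all assumptions.

```python
def clamp(value, low=0.0, high=1.0):
    """Limit value to the range [low, high]."""
    return max(low, min(high, value))


def map_key_to_sound(position, key_bed=1.0, max_overpress=0.1):
    """Map continuous key position to hypothetical control values.

    position:      0 at rest, key_bed when fully pressed; values above
                   key_bed represent felt compression from added weight.
    max_overpress: assumed extra travel corresponding to full brightness.
    """
    amplitude = clamp(position / key_bed)
    brightness = clamp((position - key_bed) / max_overpress)
    return {"amplitude": amplitude, "brightness": brightness}
```

Because the mapping is an ordinary function, swapping the linear curves for exponential or player-specific ones requires changing only this function, which is the sense in which such an instrument can act as a test bed for different mappings.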

In McPherson and Kim (2011), we showed that novice and intermediate pianists were able to control the volume of electromagnetically-induced notes by manipulating continuous key position, both for passages played at a constant dynamic level and passages containing crescendos within individual notes. Acoustic feedback appeared to be highly important for user accuracy: when the electromagnetic system was switched off and the pianists were asked to silently control the depth of each key press, they exhibited significantly lower consistency.

7.6.2 Broader Implications for HCI

Even beyond the creation of musical interfaces, piano touch can potentially offer lessons for HCI researchers.

7.6.2.1 The Value of Breaking Abstractions

Most input devices rely on abstractions: keyboards, mice, trackpads and touchscreens each capture a few salient dimensions of a more complex human motor process. In this chapter, we have shown how breaking down a similar abstraction at the piano keyboard can yield additional insight into the performer’s intention. We are interested not only in which keys are pressed and when, but how the keys are pressed. Despite the apparent irrelevance of many of the additional dimensions of motion to sound production at the piano, interviews and the pedagogical literature show that pianists feel strongly about them, with certain patterns appearing common across performers and others serving as hallmarks of personal style.

Correspondingly, HCI researchers may find value in considering the broader gestural parameters of common devices. Examples of such approaches include pressure-sensitive computer keyboards (Dietz et al. 2009) and mice (Cechanowicz et al. 2007) and touch-screen interaction which considers the orientation of the user’s finger in addition to traditional XY contact location (Wang et al. 2009). A potentially interesting area of exploration would employ such systems in common interaction scenarios without alerting the user to the presence of additional dimensions, looking for patterns in users’ gestures which could guide the development of future interfaces.

7.6.2.2 Interaction on an Intuitive Level

The piano can be considered as a human-machine interface whose parameters of interaction are quite different from normal computer systems. Playing a musical instrument involves a great deal of internalised, subconscious motor control, and correspondingly, expressive musical intent and physical gesture are connected on an intuitive level. Though piano technique reflects considerable specialised training, we believe that patterns of expression at the piano can potentially generalise to other gestural interfaces.

Designing interfaces to specifically capture expressive information is a challenge; even defining the nature of “expression” is no easy task. A notable recent example of such a system is EyesWeb (Camurri et al. 2004), which analyses motion profiles from video data to assess expressive intent. Our experience with piano technique suggests that expressive cues can be found in many places, and that bridging the qualitative and quantitative creates potential for new computer interfaces that permit intuitive, expressive interaction, with applications to the creative arts and beyond.