8.1 Introduction

Networked music and sound performance have become significant areas of interest to contemporary musicians, composers and audiences alike. The recent popularity of pop music artist Gotye’s YouTube Orchestra (Gotyemusic 2013) and the YouTube Symphony Orchestra (Symphony 2010) demonstrates how asynchronous and synchronous networked music performance has begun to capture the wider public imagination. However, networked music has a much longer history of technologists and musicians developing interfaces and collaborations on purpose-built low-latency, high-fidelity platforms that facilitate hitherto improbable meetings between musicians of diverse cultural and musical traditions. Perhaps because of the ad hoc nature of these collaborations, improvisation is a key feature of this style of performance. And, like much co-located improvised music, it challenges the traditional roles of the artist and audience, where the performative experience is often a shared participatory interaction between the musicians themselves. This shares many of the implications brought about by shifts in interactive digital media in which audiences have been transformed from viewers to participants, expounded in Chaps. 3 and 9 (“Evaluation and Experience in Art”, Candy 2014; “Mutual Engagement in Digitally Mediated Public Art”, Bryan Kinns 2014).

Gaining a greater understanding of interaction in a telematic collaborative context therefore requires a move away from established paradigms of co-located performance evaluation, to a practice-led model of assessing musicians’ distributed, formative and spontaneous creation of improvised music (Candy 2006). The term practice-led is used here to describe research with a primary focus on understanding “the evolution of new practices” (Candy 2011, p. 35), i.e. research arising from the needs and enquiries of practice, rather than with the sole intention of developing an artefact.

8.2 Background

The literatures of social semiotics, multimodality and cognitive linguistics all emphasise that it is our physical interactions with the world as discourses of social practice that provide us with a conceptual framework for interpreting the meaning of those interactions. This derives from their shared linguistic heritage but, more importantly, it underscores the nature of networked musical improvisation as an embodied social practice. The results of multimodal discourse analysis (MDA) of inter-cultural networked interaction demonstrate that each of these modes has something distinctive to tell us about the construction of meaning between geographically dispersed musicians of different cultures and musical traditions.

8.2.1 Melody and Timbre as Semiotic Resources

In this research, melody and timbre are viewed as “semiotic resources” that can describe “what you can ‘say’ with sound, and how you can interpret the things other people ‘say’ with sound” (Van Leeuwen 1999, p. 4). It is proposed that representation and interpretation in networked musical interaction originate as shared metaphors of experience, generated from sounds based on an understanding of how they are physically produced (Van Leeuwen 1999). In these studies, an examination of the experiential qualities of the musicians’ improvised interaction allows for an evaluation that maps musical instance to gesture and reflective experience. This affords an insight into the creative and cognitive aspects of networked engagement where “observations of art as experience provide the basis of evaluation” (Candy 2013). As network technology facilitates new fields for inter-cultural interaction in music and the digital arts, social semiotics can provide the necessary tools for the analysis and evaluation of networked experiential engagement across a range of disciplines.

The field of networked collaboration is a fast-growing area that spans a number of disciplines. Much work has already been achieved in computer-supported cooperative work (CSCW) research in developing frameworks with which to improve our understanding of dispersed collaboration, particularly in education and the workplace. However, many of the central theoretical perspectives of CSCW, such as symbolic interactionism, activity theory and distributed cognition, are not well suited to understanding the experiential and embodied characteristics of the physical production and interpretation of sound. We therefore argue that a social semiotic perspective best achieves an evaluation of interaction in networked improvised music by accounting for the “felt significance of sound” (Cumming 2000, p. 134) and its interpretation across cultures.

8.2.2 Networked Performance

A feature of networked musical performance is that our understanding of telematics is sometimes clouded by the technical and conceptual parameters in which it takes place. Performances can involve an array of instrumental, technical and network configurations drawing together musicians with little or no understanding of the distributed environments in which the performance is occurring. Interaction in networked improvisation is distinguished from interaction in co-located contexts because performers interact without the expressive signifiers of body language and facial expression that are present in co-located performance (same venue, shared space, visually and gesturally interactive). Dedicated low-latency (network delay) telematic interfaces require high network speeds, and currently do not support robust video streaming of collaborators on domestic connections. Web-based video streaming applications not only make high demands on available bandwidth, but even when employed on high-speed research networks, visual fidelity lags noticeably behind the audio. This appears to be of lesser importance to musical interaction than one might think. Caceras et al. (2008) found that networked musicians do not generally look at video streaming when they perform, and suggest that it “serves primarily the purpose of providing an experience for the audience” (Caceras et al. 2008, p. 63). Rather than being an essential component of networked interaction, visual streaming functions as a “material anchor” (Hutchins 2005, p. 1573) for conceptually bridging embodied located and dislocated experience.

Evaluation of networked musical interaction therefore requires us to look beyond the situated, visual and sonic ‘aesthetics’ of musical performance, in favour of the way in which dispersed musicians perceive and respond to representation and meaning in the “flow” of networked improvisatory performance (Csikszentmihalyi and Csikszentmihalyi 1993).

8.2.3 Multimodal Discourse Analysis

This research describes an analytical framework employing multimodal discourse analysis (MDA), which emerged in the 1980s and 1990s from the work of linguists Michael Halliday, Robert Hodge, Gunther Kress and Theo van Leeuwen, who proposed that “meaning is not only communicated through language but also through other semiotic modes” (Machin and Mayr 2012, p. 6). As a practice-led model, it provides a valuable instrument for the evaluation of interaction in networked musical improvisation wherein multimodal data (video, music and text) are “recontextualisations of social practices” (Van Leeuwen, quoted in Lindstrand 2010, p. 87), while acknowledging that “interpretation is also a semiotic action” (Kress and Van Leeuwen 2001, p. 40). The analytical framework also adopts ideas from the field of cognitive linguistics, which similarly focuses on “the relation of language structure to things outside language: cognitive principles and mechanisms not specific to language” (Klemmer 2010). The experiential and material qualities of different modes of discourse are foregrounded by the inherent parallels between MDA and cognitive linguistics, and also through the interpretation of those discourses.

Multimodality is crucial for developing an understanding of musical and experiential interaction in networked improvisation from different cultural perspectives, in which the same musical interaction may have more than one interpretation. Cultural nuance has sparked long-running debates in musical aesthetics, e.g. whether musical meaning resides within the formal structure of music itself, or is the result of “symbolisms depicting actions, character and emotion” (Meyer 1956, p. 2). The authors take Van Leeuwen’s position that music and sound are dynamic. A social semiotic perspective views this as representing the actions of people, rather than representing the objects or things themselves (Van Leeuwen 1999). This is particularly relevant in improvisation where, as Berliner (1994) argues, “the ideas that occur during a solo assume different forms of representation: sounds, physical gestures, visual displays, and verbalisations. Each potentially involves distinctive thought processes and distinctive qualities of mediation with the body” (Berliner 1994, p. 206). MDA facilitates the examination of sounds, gestures and verbalisations through music, video and text, foregrounding the creative and cognitive components of networked improvisational interaction.

8.3 The Framework

The analysis of music, video and text within Mills’ framework (presented in this chapter) can therefore capture insights that might otherwise be missed by examining one of these expressive modes in isolation, and it provides for a more thorough evaluation of musicians’ interactive experiences. It also demonstrates recognition of the “affective” force of sound that, as Coker (1972) argues, “activates emotional patterns of behaviour” (Coker 1972, p. 39). Our understanding and hence interpretation of experiential meaning in music is conceptually structured by embodied patterns in melody, rhythm, pitch, tonality and timbre that act as semiotic resources analogous to the physical production and vocalisation of speech acts. As Van Leeuwen argues, “the dividing line between speech, music, and other sounds is very thin. Many of the same kinds of things can be done verbally, musically or by means of ‘noises’” (Van Leeuwen 1999, p. 92). In other words, representation and meaning are viewed as emerging from our understanding of the physical experience of producing patterns of speech. This occurs in melody through our experiential understanding of what we physically have to do to produce a type of sound with our voice and body, for instance, speaking or singing in a low voice and increasing vocal effort to raise the pitch. As Van Leeuwen points out, “how people (composers, musicians, professional interpreters, audiences) interpret and experience this pattern, their experiences are likely to be in the same broad area” (Van Leeuwen 1999, p. 94). In this sense, the experience of force, of moving our bodies in motion, or of standing upright conceptually structures our understanding of musical interaction through schematic relationships, e.g. relating physical effort to the production of high or low pitch ranges and the associated metaphorical perception of excitement or relaxation. It should be stressed that we are not claiming that musicians consciously think in, and of, these terms, but rather that they result from their verbalised perception of interaction (Fig. 8.1).

Fig. 8.1 Illustrates the interrelationship of multimodal discourse analysis and social semiotics in analysing and evaluating experiential meaning in networked improvisation

It emerges that conceptual metaphor is key to understanding networked musicians’ patterns of experience. In this light, “metaphor is not merely a matter of language, it is a matter of conceptual structure […] it involves all the natural dimensions of our experience, including aspects of our sense experiences: colour, shape, texture, sound etc. These dimensions structure not only mundane experiences but aesthetic experience as well” (Lakoff and Johnson 1980, p. 235). While recognising cultural distinctions, metaphor plays an integral role in the examination of cross-cultural networked musicians’ experiences.

8.4 Application of the Framework to the Case Studies

Here, we demonstrate the analytical techniques of the framework and how the analysis and evaluation of melodic and timbral interaction between expert cross-cultural musicians is achieved. The term ‘cross-cultural’ musician is used in this context to denote cultural heritage, rather than to imply a musical practice. Case study examples are used to illustrate how the analysis of music, instrumental gesture and the musicians’ reflective experiences can lead to an evaluation of the strategies that musicians develop to navigate dislocated and unfamiliar musical terrain.

To examine musical and cognitive interaction, it is necessary to listen to and observe the participants improvising from geographically dispersed locations, and to ask them to reflect on their experiences. Reflective Video Cued Recall (VCR) (Omodei and McLennan 1994; Raingruber 2003) procedures were utilised, in which musicians were played a video recording of their performance and asked to stop the video and verbalise their experience as they recalled their interaction.

As a first step, the analysis focuses on the cognitive experiences of one Australian musician in relation to several musicians of Asian and European cultural heritage. This provides a lens through which to view the interaction that is subsequently cross-referenced with the reflective experiences of the other participating musicians.

While the researchers were able to observe the focus musician and conduct the VCR session immediately after each improvisation, for logistical reasons this was not possible with the international networked musicians. Review was achieved by transferring the audio-visual data via a file transfer application immediately after the performance and then uploading it to a private YouTube channel within a 24–48 h period following the session. The VCR was then conducted via the Google Hangouts application, which allows YouTube clips to be stopped and started in real time, so that the participant and researcher could stop the video where necessary. The VCR audio was then recorded with the QuickTime application for later transcription, and the process of identifying musical interaction began. Where necessary, translators were also present in the recording of the VCR data.

8.4.1 Parameters of Interaction in Melody and Timbre

The process of identifying parameters of interaction in melody and timbre requires the examination of four specific components of music and sound derived from the analysis, and guided by Mills’ extensive experience in the field of networked improvisation. They are:

  • Musical initiation: what forms of musical motif or sound are used to begin an improvisation or to initiate new sections within an established improvisation session (melody, rhythm, harmony, timbre); combinations of instruments.

  • Motivic development: how does melodic or timbral interaction evolve; the ways in which melody and qualities of sound are employed by networked musicians, and what musical forms do their responses take, e.g. melodic, timbral, rhythmic.

  • Harmonic development: how is tonality established, which instrument and musician/s initiate it, and how do other musicians respond.

  • Timbre: what qualities of sound are being used (instruments, approaches to using qualities of sound); passages in which timbre is predominant in interaction; which instrument and musician/s initiate it, and how do other musicians respond to it.

8.4.2 Data Collection

The video and audio recordings of the case study performances and VCR transcripts provided a very rich source of data and were invaluable for drawing relationships between instances of musical interaction, performative gestures and what the musicians verbalised about their experiences of the interaction. From a multimodal perspective, it was also necessary to view the data sets together. This presented a challenge: to listen to the musical improvisation, observe the musicians’ gestures and read their reflective comments in such a way that each could be viewed in relation to the others without having to switch between data sets. This was achieved in a two-step process of compiling the individual video recordings of each musician’s performance into multiscreen clips, and then identifying instances of musical interaction, related performative gestures and musicians’ reflective comments. These were then entered into a data table. Figure 8.2 shows screenshots of multiscreen videos of each case study performance.

Fig. 8.2 Multiscreen video clips of three case studies featuring dispersed musicians improvising in the telematic interface, eJamming

The data table for each case study contains a chronological development of the melodic and timbral attributes of the improvisatory interaction along with other related components such as sequentiality and simultaneity (call and response), motivic exchange and development, texture, etc. It also documents associated gestures involved with the production or manipulation of sound, as well as the musicians’ reflective comments about their perception of the interaction at these given points. This provides a global view of the interaction as it evolved over a 40-min period (Table 8.1).

Table 8.1 This data table is indicative of the dataset developed from the coding of the session, and also includes excerpts of interview transcripts obtained from the video-cued recall session. The data table is abridged due to space requirements, but time period 5:08-6:28 (marked in bold) is an unabridged example of the type of data that is obtained for every time period listed. Case study participants are MH guitar, ST ney. Present at the video-cued recall session were RM researcher, OT translator and AT Persian musicologist
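As an illustration only, and not a description of the tooling used in this research, one time period of such a data table might be sketched in Python as follows. The class and field names (`CodedEvent`, `TimePeriod`, `mode`, `note`) are hypothetical, the entries paraphrase moments described in Sect. 8.4.3, and attaching the reflective comment to 5:38 is an assumption about where the recall was cued.

```python
from dataclasses import dataclass, field
from typing import List


def to_seconds(timestamp: str) -> int:
    """Convert an 'm:ss' timeline position into whole seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


@dataclass
class CodedEvent:
    """One coded instance in the data table (hypothetical schema)."""
    time: str       # position on the session timeline, 'm:ss'
    musician: str   # participant code: 'MH' (guitar) or 'ST' (ney)
    mode: str       # 'music', 'gesture' or 'reflection'
    note: str       # analyst's description or VCR transcript excerpt


@dataclass
class TimePeriod:
    """A time period of the table, holding its coded events in order."""
    start: str
    end: str
    events: List[CodedEvent] = field(default_factory=list)

    def contains(self, timestamp: str) -> bool:
        """True if the timeline position falls inside this period."""
        return to_seconds(self.start) <= to_seconds(timestamp) <= to_seconds(self.end)

    def by_mode(self, mode: str) -> List[CodedEvent]:
        """Return only the events coded in one mode."""
        return [e for e in self.events if e.mode == mode]


# Illustrative entries for the period 5:08-6:28 (paraphrased, not the real table).
period = TimePeriod("5:08", "6:28", [
    CodedEvent("5:08", "MH", "gesture",
               "places guitar pick on the table for a softer picking sound"),
    CodedEvent("5:38", "ST", "music",
               "breathy, vibrato-like ney timbre with legato, longer phrasing"),
    CodedEvent("5:38", "ST", "reflection",
               "I was trying to be atmospheric and that was the feeling "
               "that I was getting from the guitar player"),
    CodedEvent("6:25", "MH", "gesture",
               "adjusts the level of delay on the effects unit"),
])

# View one mode at a time without losing its place on the shared timeline.
for event in period.by_mode("gesture"):
    print(event.time, event.musician, event.note)
```

Keeping all three modes on one timeline, rather than in separate files, mirrors the aim stated above of viewing music, gesture and reflective comment in relation to each other without switching between data sets.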

8.4.3 Analysis

The analysis began by examining salient instances of improvised melodic interaction in the multiscreen video clips. This was then cross-referenced with related information from the data table such as the melodic or timbral nature of exchanges, associated gestures and the musicians’ verbal reflections on these instances. The data table also provided additional information such as patterns of meter (rhythmic pulse) and harmonic development that, as semiotic resources, contribute to representation and meaning in the interaction. The VCR transcripts yielded much information about how the musicians perceived their interaction at these specific points, and it is only by drawing relationships between all of these interactive components that a thorough analysis could be carried out.

The selected example, as illustrated in the table above, is an 8-min section beginning in the opening minutes of case study II, and features Iranian ney player Sina Taghavi (ST), and guitarist and focus musician Michael Hanlon (MH). It was performed from separate locations at the University of Technology, Sydney. The musicians did not know each other or meet beforehand. While geographic distance is not a factor, the study is designed to emulate the circumstances in which a collaboration of this nature takes place. ST had arrived in Australia from Iran in the previous 12-month period and for the purposes of this research fulfilled the criteria of an expert Persian musician.

The example includes the instrumental warm-up as the musicians started to interact while final line checks were completed. It then traces the developing interaction throughout the improvisation and includes a more detailed examination of interaction between 5:00 and 8:10 on the timeline.

An overview of the entire 8-min section revealed that the melodic interaction between ney and guitar began in a tentative call and response (sequential) pattern, which increasingly resulted in overlapping (simultaneity). The interaction in this section mirrored the entire 40-min improvisation in that the melodic interaction based itself around small interval ranges and repeated melodic lines that emerged at different sections of the improvisation, sometimes transposed or modulated to different keys. Within the first 30 s of the improvisation, the guitarist played an ascending conjunct melodic line that was then imitated by the ney player. This call and response melodic imitation acted as a meeting point for both musicians, who commented on it within the first 3 min of their video cued recall session.

Guitarist MH,

I was just trying to feel it here and see where he was at and trying not to play too much. So it was a kind of floating about looking for notes

Ney player ST translated by an interpreter,

It was a new experience with the guitarist, it was a strange feeling and environment, some moments he would feel really close and some moments he felt really far away.

What the musicians were expressing was their perception of each other’s presence in the musical space. MH’s expression, “I was just trying to feel it here and see where he was at”, and as ST comments, “some moments he would feel really close and some moments he felt really far away” are indicative of the adjustments that the two networked musicians are making to interact in the telematic and non-visual encounter. Indeed, it is their concept of embodied co-located musical interaction that is structuring their creative engagement in networked interaction as applicable to the MUSICAL LANDSCAPE metaphor (Johnson 2007). This early encounter also illustrated an evolving familiarisation between both participants as the melodic dialogue develops from sequential to simultaneous interaction.

There were some apparent differences in intonation (tuning) that occurred and then diminished over time. While the ney in this study is tuned to an equal tempered E, the scales that it uses combine tetrachords (containing quarter tone intervals that form a perfect fourth) in the upper and lower registers. Explaining how this was occurring, Iranian tar player and musicologist AT, who was present in the VCR session, states, “the higher tetrachords exist on guitar, but the lower ones don’t […], which is why you can hear it as being out of tune”. This formed an important part of the adjustment that both musicians made as they attenuated their playing in the early stages of the improvisation.

As the interaction moved from call and response (adjacency pairs) to increasingly overlapping playing, or, as Tannen (1992) would argue, from “report talk” to “rapport talk” (cf. Van Leeuwen 1999, p. 68), the improvisation became more fluid, indicating the growing musical relationship between the two musicians.

The data table illustrates how the harmonic base that underscored the melodic interaction moved between A minor, E minor and C# minor for the duration of this whole section.

At 4:53 MH reached over to his effects unit to change and lengthen the reverb setting, and at 5:08 he placed his guitar pick on the table to achieve a softer string picking sound for the next section of the interaction. These gestures set the timbre of the guitar sound, which implicitly suggested the atmosphere for the following section. MH then began plucking slow descending ostinato (repeating) arpeggio lines in A minor that created a base for ST to play the ney over. This chordal and melodic interaction was initiated in small (conjunct) intervals that, as the interaction developed, became wider and more expressive. At 5:38 ST manipulated the timbre of the ney through a combination of breathiness, trembling lip movements and shaking of the instrument, which created an intimate, vibrato-like sound from the instrument. He also articulated the notes with a legato (gentle attack) and longer durational phrasing. What can be extrapolated from this is that the guitarist’s soft timbral (reverbed) descending tonal patterns triggered a response from the ney player, who then emulated similar timbral qualities using the techniques described above. Asked about his finger and lip gestures, ST (ney) states,

I was trying to be atmospheric and that was the feeling that I was getting from the guitar player, so I wanted to create an atmosphere.

As ST comments, he was able to perceive the guitar changing timbre from a sharp or more percussive plectrum sound to softer finger plucking, combined with descending ostinato patterns, which he used to develop his response.

These combinations of amplitude, timbre, descending pitch contours, note articulation and duration are well-established parameters of communication of emotion in music, and have been rigorously defined from empirical studies of “cue utilization in performers communication of emotion in music” (Juslin and Sloboda 2010, p. 463). While studies of the communication of emotion in music to date are based on co-located music performance, it is argued that these same attributes are paramount to representation and meaning in networked improvised music performance. In the absence of visual cues, they become important signifiers for networked musicians to communicate and respond to in networked interaction.

Demonstrating another variation of this at 6:25, the improvisation came to a brief resting point, and MH performed a gesture of adjusting the level of delay on the guitar from his effects unit. ST then initiated the following interaction on the ney by moving up a register and beginning this next section by replaying a melody that emerged in the opening few seconds of the improvisation at 0:02. MH responded to this on guitar by voicing a higher pitched, wider range melody by increasing the pitch range of the accompanying ostinato chord pattern in E minor. This section of interaction then concluded by returning to the tonic at 7:40 as the players dropped back down to the lower octave in an imitative call and response on the same 3-note ascending melodic pattern that they started the segment with.

8.4.4 Findings

A summary of the analysis in this research reveals that networked musicians comprehend improvisatory interaction through a blend of metaphorically structured perception and embodied auditory imagination. They perceive significance in patterns of sound as a gestalt, which then forms the imaginative structure on which they base their collaborative approaches. This was illustrated in a number of examples where musicians refer to the height, depth, or motion of another musician’s sound, and the ways in which this influenced their musical responses, such as playing “underneath” a perceived height in timbre, or to “play catch up” with a faster rhythm cycle by marking the pulse patterns. The musicians’ longer-term strategies developed through iterative stages of the interaction as the result of recalling previous musical (melodic, timbral and rhythmic) events. Repetition of melodic motifs was used to create a sense of form, or to seed new musical material from sections of musical deconstruction. Instrumental gestures such as breath, lip and finger positions were also employed to elicit timbral variation, or attenuate differences in intonation.

The key approaches that emerged from both studies are as follows:

  • Extended note durations in adjacency pairs (call and response), which focussed on the timbral nature of sound in the early stages of the improvisation. This provided a basis on which musicians felt able to begin to contribute to the interaction.

  • Simultaneous (layered) interaction developed as harmonic accompaniment and emergent melodic motifs. This occurred as musicians’ familiarity with each other developed.

  • Rhythmic and melodic repetition occurred in later stages of the improvisation where musicians often recalled musical events that emerged in the beginning of the improvisation. This was done to augment, or refocus the improvisation.

  • Rising and falling harmonic and melodic progressions signifying building tension and climax, which then transitioned to release and relaxation leading to a deconstruction of musical material returning to extended note durations with a focus on the timbral nature of sound.

8.5 Conclusions

While these approaches may share similarities with those that musicians use in co-located improvisatory scenarios, without the signifiers of presence (eye contact, facial expression and body language), they illustrate the efficacy of experiential metaphor in replacing these communication mechanisms in the minds of the musicians. The result is that while interactive approaches may be similar, the pervasiveness of metaphor in comprehending interaction enables an outcome for networked musicians in which they have an opportunity to learn and develop their practice with other musicians whom they would otherwise be unlikely ever to have met.

For practitioners and researchers alike, the networked musical experience remains an elusive concept, and by its nature engenders more visceral verbal accounts in which metaphor is most often called upon to describe the experience. In this sense experiential metaphor provides a scaffold for an evaluation of adaptability in networked interaction, and as in Candy’s criteria for evaluation (Candy 2014, p. 41), “purposeful” strategies behind the manipulation of sound parameters become the primary criteria for assessing musicians’ approaches to their interaction.

The evaluation of collaborative interaction through a social semiotic perspective and metaphorically structured perception, as outlined in Mills’ framework, makes the framework potentially applicable not only to other tele-collaborative domains but also to a variety of digitally mediated interaction. It illustrates an interdisciplinary approach to gathering and assessing data that requires it to account for actions of practice augmented by the qualitative analysis of reflective experience. While this is a feature of many of the approaches in this book, it is the interpretation of these two components through image schematic structures that provides artists and researchers with an additional tool for interpreting reflective experience in their specific discipline.