1 Introduction

Creative engagement emerges from creative activities when a user is engaged in an active, reflective, and constructive cognitive process in pursuing a creative outcome with an interactive system [3, 17, 18]. It transforms a user’s role from consumer or spectator to contributor [19, 50], and encourages ‘autotelic’ creative activities and ‘meaningful’ actions [23, 26], consequently making the experience a ‘memorable’, ‘sustainable’, and ‘intrinsically rewarding’ one, rather than simply a ‘pretty’ one [3, 7, 13, 18]. Music making is a key form of creativity [6] and is a rich domain in which to explore the design of interactive support for creative engagement, especially when designing digital musical instruments in which support for creative engagement could be built into the instrument itself.

Whilst creative engagement empowers users with creative skills and confidence by providing them with new experience, propelling them in a direction they would previously never think of, encouraging them to integrate these strategies into future work and daily life, and contributing to users’ mood and life positively [1, 13, 14, 15, 31, 54], it can be rather challenging for novices to achieve creative engagement.

Amateurs of an interactive activity, which we consider to be those who take part in the interaction for pleasure and with no intention to become professionals, may quickly run out of ideas and typically have insufficient experience and skills to find new ideas, and so are easily discouraged after the first few endeavors [37]. Creative prompts, stimuli, and other suggestions can help support novices in extending and deepening their creative engagement. Examples include targeted information [15], expert patterns [32], and unexpected and valuable content that they might not have otherwise considered or even come across in the digital environment [39, 41]. Indeed, studies have shown that novices demonstrated more creative engagement when given support to kick-start their creative activities in digital painting [1], and performed better when provided with information about rule violations in digital film making [15]. Recent studies have revealed that visual stimuli can positively prompt creative performance since they provide ‘potential cues, analogy-sources or other similes’ for inspiration [29, 51], which can divert one’s attention [22], increase the rarity and non-obviousness of ideas [4] and prompt problem reinterpretation and reconstruction [25]. To further explore the design of visual stimuli which support creative engagement, this paper examines the effect of providing visual stimuli to support creative engagement with digital musical interfaces.

1.1 The multimodality of digital musical interfaces

Musical instruments are inherently multimodal, traditionally combining physical input modalities with auditory output of music along with haptic feedback to the musician. With the advent of real-time computer generated audio, digital musical interfaces have taken advantage of the multitude of input and output modalities offered by modern multimodal computing and often offer a far wider range of combinations of input and output modalities than traditional instruments. Interested readers are referred to the field of New Interfaces for Musical Expression for a glimpse of the dazzling array of combinations of modalities in contemporary multimodal digital musical interfaces [28]. For example, a digital musical instrument might map some physical input such as button or sensor activation to auditory output [55] and additionally offer visual output and input modalities. Output visuals are additional outputs which often act as a synchronized display that enhances physical interaction and the understanding of the music [8]. Typical examples are audiovisual interfaces that generate concurrent audio and visuals in real-time, creating a rich audio-visual experience [12, 16]. Visuals can also form part of the input control interface, acting as an input to determine the content of the music, e.g. an interactive bio-inspired sonification tool to convert images into music [44] or a sketch-based musical interface to synthesize sound based on parameters of the sketches [52].

Expanding the range of modalities of digital musical interfaces to include visual modalities has been shown to reinforce the players’ physical interactions, supplement auditory cues, increase the dynamics, richness, robustness and reliability of an interface, elicit joy, surprise and delight, enhance performance and engagement levels, allow new perspectives on performance to be found, and lead ideas to move in different musical directions [27, 42, 53, 57].

In addition to being potential input and output modalities of digital musical interfaces, visual stimuli have a third role as they can be used to inspire musical creativity. As with other fields where visuals have been shown to stimulate creativity, studies of the music-making process of professional and novice musicians have shown that visual materials are an essential tool for facilitating creativity [2, 20]. That is, the visual modality may influence the musical output by stimulating the user’s inspiration and essentially act as an indirect input to the music. This paper explores this third role of visuals in digital musical interfaces by examining the forms of visual stimuli that are effective for supporting creative engagement and so hopes to inform the future design of digital musical interfaces whose multimodal interaction includes visual representations which support creative engagement.

2 Research question

Research on the effects of the form of visual stimuli on creative performance and creative experience is inconclusive. For example, exposure to familiar or literal solutions as stimuli has been found to block the generation of unusual and original solutions [9, 21]. This happens mainly because a person consciously or unconsciously becomes attached to existing solutions and starts to repeat key attributes or features of the examples. In contrast, more obscure analogies and abstract, remote, partial, incomplete or between-domain stimuli have been found to be inspirational stimuli due to their abstract nature [11, 29, 34]. However, Goucher pointed out that, based on neuroimaging analyses of the creative process, inspirational stimuli of any kind, both near and far from the problem space, can activate temporal brain regions in association with semantic word processing, word concept recognition, and memory, and enhance the fluency of idea generation [25]. These inconsistent findings on the effect of the level of abstraction of visual stimuli on creativity motivate the first research question of this paper: whether visual stimuli in different forms (abstract or literal) have a different impact on creative engagement in the process of music creation.

Research on visual stimuli has mostly been undertaken in domains that work predominantly with visual language, such as graphic design, product design, and sketching. A secondary objective of the research reported in this paper is to explore whether abstract visuals are useful inspirational stimuli in creative processes which involve less visual activities – music creation in our case.

In music making, graphical scores (GS) have been proposed as an open tool in visual form for both trained and inexperienced musicians to create music [49]. In this way GSs are an additional visual modality in a multimodal music system, offering an influence on the musical production. A GS is a series of idiosyncratic or personal visual representations that convey various dimensions of sound information required to perform a piece of music [47]. Unlike conventional staff based musical notations, which stipulate the symbols and their content, GSs are non-instructional, open-ended, and ambiguous. GSs encourage performers to interpret and decide what to play in their own way, which allows performers to consider themselves as improvisers [40, 47]. The dynamic nature of GSs means that they are widely utilized in live music performances as a complementary creativity support tool for improvisational music making [36, 38, 46]. Prior knowledge of a GS may help to guide a performer’s interpretation, but the risk is that the performers may be restricted to the specified meaning of symbols, limiting the degree of creative freedom and resulting in derivative outcomes. Indeed, the idiosyncratic nature of GSs versus traditional notation and their freedom of interpretation suggest that these may be key to their use as prompts for inspiration, raising the question of whether being informed about the meaning of a creative stimulus reduces its value in creative engagement.

Finally, it is worth noting that existing studies of GS have mostly examined the music making process of professionals and valued the resultant artistic outcome above the creative experience. Open questions remain about the impact of GSs and their explanation on novices’ creative processes. In contrast to professional musicians’ processes, novices’ creative processes are typically valued primarily for their experience rather than the quality of the final artistic output. These points motivate the second research question of this paper: whether or not providing information about a GS will support a novice’s creative engagement.

The research discussed above suggests a positive relationship between abstract visual stimuli and creativity, and that the freedom of interpretation of GS can effectively support creativity [39, 40]. Following these arguments this paper hypothesizes that creative engagement will be greater when novices play with an abstract GS and without information about the GS as captured in the following hypotheses:

H1: GS with abstract symbols will better support novices’ creative engagement than those with literal symbols.

H2: Playing without information about GS will better support novices’ creative engagement than playing with information.

3 Study design

To explore the hypotheses presented in the previous section an interactive prototype referred to as MTBox was designed, built, and used in a controlled experiment with novices as described in this section.

Fig. 1 MTBox - prototype used in the study

Fig. 2 Participant interacting with MTBox

Table 1 Musical ideas in graphical score

Fig. 3 Graphical score with literal symbols (Gliteral)

Fig. 4 Graphical score with abstract symbols (Gabstract)

3.1 Prototype: MTBox

MTBox (Fig. 1) is a multi-modal interface, with physical input modality of press-buttons and a rotary dial, and output modalities of audio display and visual display on the top of the box (as seen in Fig. 2). As a user interacts with the physical interface, real-time audio output is played and a representation of the music being made is displayed on the visual timeline display on the top of MTBox (see two versions of the timeline in Figs. 5 and 6). In this way there is a tight connection between the physical input modality and the audio and visual output modalities. The timeline display also displays creative stimuli drawn in the form of graphical scores shown at the top of Figs. 5 and 6. These are used as a prompt for user creativity, or even as an input to human creativity. These creative stimuli are not connected to the input modality or output modalities.

Following the paradigm of tangible music interfaces [58], the physical interface of MTBox was intentionally designed to be different to conventional musical input devices and metaphors such as piano keys or DMI grid controllers. This is in order to remove any preconceptions about the instruments and to reduce non-musicians’ typical nervousness about playing with conventional instruments.

MTBox is a cube with each side of length 20 cm. Fig. 2 shows a participant using the box. The hardware was built for a previous study to explore the influence of task motivation and user interface mode on novices’ creative engagement [57]. For the study reported in this paper the mode of interaction and the graphical interface were modified to create a more playful and interactive music making interface and to include a graphical score. Pressing a button on the vertical side of MTBox triggers a pre-recorded audio sample. The front and back buttons trigger long samples to play in a loop. To increase autonomy and expressiveness, the left and right buttons trigger eight short samples. The short samples are one beat long and are played only once when triggered. Two sets of short samples consisting of percussion and piano notes were provided. Pressing button B2 on top of MTBox switches between the two sets of short samples, allowing participants to produce rhythmic patterns.

On the timeline display, there are 24 tracks to record in real-time the interactions for each sample individually. Short samples are represented by dots, long samples by lines. The length of the lines increases with time when triggered. The buttons on the top of MTBox provide control over the timeline interface. Knob R4 scrolls the timeline forward or backward. Button B3 controls the playback point of the timeline. When pressed, the timeline jumps to the specified point and starts playback. Button B5 resets the scrolling timeline to the current playback point. Button B1 erases all records forward from the current point on the timeline.
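The timeline behaviour described above can be illustrated with a small data-structure sketch. This is not the MTBox implementation: the class and method names (`Event`, `Timeline`, `record`, `jump_and_play`, `erase_forward`) are hypothetical, time is measured in beats, and two details are assumptions made for illustration, namely that "the current point" is the playback point and that a looping sample which straddles that point is truncated rather than deleted.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    track: int     # 0..23, one track per sample (assumed indexing)
    start: float   # beat at which the sample was triggered
    length: float  # 0 for a short sample (dot), > 0 for a looping sample (line)


@dataclass
class Timeline:
    n_tracks: int = 24
    playhead: float = 0.0
    events: list = field(default_factory=list)

    def record(self, track, start, length=0.0):
        """Store one interaction on its own track."""
        if not 0 <= track < self.n_tracks:
            raise ValueError("track out of range")
        self.events.append(Event(track, start, length))

    def jump_and_play(self, beat):
        """Button B3 (sketch): move the playback point to the given beat."""
        self.playhead = beat

    def erase_forward(self):
        """Button B1 (sketch): drop everything from the playback point onward."""
        kept = []
        for e in self.events:
            if e.start >= self.playhead:
                continue  # starts at or after the playback point: erased
            # Assumption: a line running past the playback point is cut there.
            end = min(e.start + e.length, self.playhead)
            kept.append(Event(e.track, e.start, end - e.start))
        self.events = kept


tl = Timeline()
tl.record(0, 0.0, 8.0)  # looping sample drawn as a line
tl.record(5, 2.0)       # short sample drawn as a dot
tl.record(5, 6.0)
tl.jump_and_play(4.0)
tl.erase_forward()
print([(e.track, e.start, e.length) for e in tl.events])
```

Keeping one event list per interaction, rather than a rendered image, makes operations such as erasing forward a simple filter over the recorded events.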

3.2 Graphical scores

To develop a set of musical ideas that could provide musical inspiration to non-musicians, three musicians were asked to create a piece of music with MTBox. Musical ideas and patterns of short samples, e.g., triggering three samples together or one by one and shifting between two samples, were extracted from their playing records. Table 1 displays these musical ideas. Two sets of graphical symbols were designed to convey musical ideas, following mapping and coding strategies [47]. The literal Graphical Score (GS) was designed based on the visual records of samples drawn on the timeline interface. As shown in Fig. 3, the lines and dots on the literal GS represent looping samples and short samples respectively. The abstract GS was designed with shapes that are not directly related to those shown in the timeline. As presented in Fig. 4, rectangles denote looping samples, while circles and lines indicate short samples. A list of symbols in a predetermined order was displayed at the top of the timeline interface, producing two versions of the timeline interface: Gliteral (Fig. 5) and Gabstract (Fig. 6). In MTBox the GS moves gradually from right to left at the same speed as the timeline.

3.3 Independent variables

Two independent variables were manipulated:

  • Abstract and literal GS, within-subjects factor (repeated). Participants interacted with the two GSs in turn. To eliminate the influence of the sequence of exposure to prototypes, the order of the GSs was counterbalanced.

  • The presence or absence of GS information, between-subjects factor (non-repeated). There were two groups of participants. Participants in group 2 were informed of the meanings of the symbols while those in group 1 were not.
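The resulting 2 × 2 mixed design can be sketched as an assignment routine. The study does not describe how participants were allocated, so the function below is only one plausible scheme: the name `assign_conditions`, the fixed random seed, and the equal cell sizes are all assumptions made for illustration.

```python
import random
from collections import Counter


def assign_conditions(participant_ids, seed=0):
    """Map each participant to (information group, GS presentation order)."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Between-subjects factor crossed with the two counterbalanced orders.
    cells = [
        ("group 1 (no GS information)", ("Gliteral", "Gabstract")),
        ("group 1 (no GS information)", ("Gabstract", "Gliteral")),
        ("group 2 (GS information)", ("Gliteral", "Gabstract")),
        ("group 2 (GS information)", ("Gabstract", "Gliteral")),
    ]
    # Dealing shuffled participants round-robin keeps the cells balanced.
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}


plan = assign_conditions(range(1, 25))
print(Counter(plan.values()))
```

With 24 participants this yields six participants per cell, so each information group sees each GS order equally often.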

Fig. 5 Timeline with literal graphical scores (Gliteral)

Fig. 6 Timeline with abstract graphical scores (Gabstract)

3.4 Participants and procedure

Twenty-four participants who perceived themselves as novice music makers were recruited via poster and email (12 men, 12 women). These were a mix of undergraduates, postgraduate students, and non-students. Specifically, 16 participants majored in computer science, four majored in design, and four undertook research related to interaction. None of them had experience in the performing arts. Thirteen participants were aged 18–25 years, ten aged 26–35, and one aged 36–45. No participants reported being color-blind. Participants signed a consent form and were informed that they could leave at any time. Each participant completed four sessions of the experiment and received 10 pounds sterling (GBP) as compensation.

Session 1: Guided learning with Gno. In a pilot study not reported here, two novices reported that they were easily disoriented without fully learning the box. To enable a fuller understanding of MTBox, a prototype without GS (Gno) was presented. Participants were guided through the interactions of MTBox, including the button functions, long and short loops, and timeline interface. The researcher demonstrated the interactions in response to questions until there were no more questions, at which point it was assumed that participants understood how to interact with the prototype.

Session 2: Exploration task with Gno. Participants were encouraged to explore Gno in their own ways. No specific outcome was required. From this session onward, the researcher sat at the corner of the room in case participants needed any help, but unlike session 1 did not proactively assist the participants. Participants were reminded of the time after 10 minutes and were informed that they could now move on to the next session or continue with the current session if they wished.

Session 3: Creation task with one prototype. One of the two prototypes was presented. For participants in Group 1, GS was introduced as a tool to provide inspiration during playing. Only participants in group 2 were informed of the meaning of each symbol in the GS. Participants were encouraged to create a piece of music, the length and genre of which were not specified. They were also told that no judgment would be made about the quality of the final output and were specifically reminded that they were not asked to follow the GS, but rather to use it as supplementary material for creation. Participants were reminded of the time after 10 minutes and could continue to the next session if they wished. Afterwards, they were asked to fill out the questionnaire.

Session 4: Creation task with the other prototype. The other prototype was introduced. The procedures were the same as those presented in session 3. Afterwards, participants were asked to fill out the questionnaire. A short interview was then conducted with participants to understand their creative process.

Table 2 Facets of creative engagement measured with the agreement questionnaire according to PCA

3.5 Data collection

Five forms of data were collected in the study as described in this section.

3.5.1 Agreement questionnaire

To better understand the participants’ perceptions of the creative process, an agreement questionnaire was designed based on the evaluation of creativity support tools [10, 48] and user engagement [43, 57], focusing on facets such as perceived aesthetics and interpretation of GS, satisfaction with the results, and perceived creativity. The questionnaire consists of 11 statements addressing these facets, which can be found in Table 2. Participants were asked to rate their agreement with each statement on a 7-point Likert scale from 1 (strongly disagree) to 7 (strongly agree). This questionnaire was presented to participants after playing with both Gliteral and Gabstract in session 3 and session 4 respectively.

Table 3 Significant results of questionnaires

In order to compare the subjects’ perceived creativity with different versions of GS, in session 2 after exploration with Gno, participants were asked to rate their perceived creativity in the creation of a piece of music (Q0: I was creative in the exploration with the music).

3.5.2 Choice questionnaire

The choice questionnaire was presented at the end of the experiment. It comprised three questions: 1) How important is the GS? (single choice): very important, moderately important, neutral, slightly important, or not at all important; 2) When is the GS more helpful? (multiple choice): all the time, once I got the brief, during learning process, during music idea generation, or when I do not know what to do; and 3) How did the GS help your creative process? (multiple choice): activated related musical ideas in memory, gave examples to follow, provided ideas on sample combinations, provided inspiration on music structure, and others.

3.5.3 Comparative questionnaire

A comparative questionnaire was presented at the end of the experiment, asking participants to choose which of the two GSs best fit each statement. Each statement represents one of the seven factors of creative engagement [57], as shown in Table 4.

3.5.4 Creative results

While participants were interacting with Gno, Gliteral and Gabstract, the dynamic changes on the timeline interface were screen recorded using QuickTime Player. Screenshots of the timeline interface were stitched together to form a record of the creative result, which contains all the music fragments the participant made during the creation process.

3.5.5 Interview

A semi-structured interview collected subjective feedback from each participant after playing with Gliteral and Gabstract. Participants were asked to describe their creative process, how they interpreted the graphical scores, how the graphical scores affected their playing, and how they utilized the graphical scores. After they finished playing with the prototypes, the participants were asked to describe the differences in the playing experience between the two versions (if any), which one they preferred, and which was more inspiring, as well as the reasons for their choices. The questions were not asked in a fixed sequence, but rather were used to direct the flow of conversation whilst ensuring that all questions were addressed in the interview.

4 Data analysis & results

4.1 Agreement questionnaire

A Principal Component Analysis (PCA) on participant feedback indicates that three independent facets were measured by the agreement questionnaire: inspirational support, result satisfaction, and effort of interpretation. A loading cutoff of 0.71 was selected because, according to Comrey and Lee’s criteria (1992, as cited in [43]), a loading of 0.71 or greater indicates roughly 50% overlapping variance between variable and factor (0.71² ≈ 0.50). The internal consistency of the three components identified by the PCA is low (Cronbach’s Alpha < 0.001), demonstrating that each component was distinct, with no significant overlap. Table 2 lists the resulting number of items, item loadings, and the amount and percentage of variance explained by each component, grouped in terms of the three independent facets, as outlined below.

Inspirational Support accounted for 39% of the variance and consisted of four items. Item loadings on this component ranged from 0.847 to 0.896. These items are related to participants’ perceptions of how inspiring the GS is, the ease of exploring new ideas, usage frequency and the ability to perform various outcomes. The internal consistency of items was excellent (Cronbach’s Alpha = 0.914).

Results Satisfaction accounted for 20% of the variance and consisted of three items. Item loadings on this component ranged from 0.886 to 0.907. These items are associated with participants’ satisfaction with the result, perceived quality of the results and creativity. The internal consistency of items was excellent (Cronbach’s Alpha = 0.895).

Effort of Interpretation accounted for 11% of the variance and consisted of two items. Item loadings on this component ranged from −0.723 to 0.781. These items were related to participants’ perceptions of ease of interpretation and development of own interpretation. The internal consistency of items was good (Cronbach’s Alpha = 0.614).
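For readers unfamiliar with the statistic, Cronbach’s Alpha for a component can be computed from the item scores as sketched below. The function name `cronbach_alpha` and the example scores are hypothetical and are not the study data. Note also that an item loading negatively on its component would normally be reverse-scored (8 minus the score on a 7-point scale) before this calculation; whether that was done here is not stated.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, aligned by respondent."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("every item needs one score per respondent")
    # Total score of each respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for two items answered by four respondents.
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```

Alpha approaches 1 when the items rise and fall together across respondents and drops towards 0 when they vary independently.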

Table 4 Results of comparative questionnaire (significant results are shown in bold)

A scale value of each component was calculated by averaging the items that make up the component. This value can be understood as the degree of agreement with each component. The higher the value, the more the component was agreed on by the participants. To analyse these data further, nonparametric tests were adopted as the data were not normally distributed. A Mann-Whitney test suggested that there was a significant difference between groups for the component effort of interpretation (\(Z = -3.524, p < 0.001\)). The agreement for effort of interpretation in the group playing without information (M = 5.208) was higher than that of the group playing with information (M = 3.917). A Wilcoxon signed ranks test suggested that there was a significant difference in effort of interpretation for different versions of GS (\(Z = -2.458, p = 0.014\)). Participants found Gliteral (M = 4.146) easier to interpret than Gabstract (M = 4.979).
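The between-groups comparison can be illustrated with a minimal Mann-Whitney sketch using the normal approximation with a tie correction, which is one common way a Z value of this kind is obtained. The statistical software and settings used in the study are not stated, so this is an assumption, no continuity correction is applied, and the helper names and sample values below are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U of the first sample, Z, two-sided p) by normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Variance of U, reduced for tied scores (common with Likert data).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 0.0, 1.0  # all scores identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, z, p


# Hypothetical scale values for two small groups (not the study data).
print(mann_whitney([1, 2, 3], [4, 5, 6]))
```

A negative Z indicates that the first sample tends to have the lower ranks, so the sign of a reported Z depends on which group is entered first.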

Further analysis was conducted to compare how participant agreement differed for questionnaire questions Q1, Q7 and Q0/Q11. A Wilcoxon signed ranks test suggested that participants’ agreement on perceived creativity with music (Q0/Q11) was significantly different between Gliteral and Gno (\(Z = -2.749, p = 0.006\)), as well as between Gabstract and Gno (\(Z = -3.252, p = 0.001\)). The perceived creativity with Gliteral (M = 4.542) and with Gabstract (M = 4.583) was higher than that with Gno (M = 3.000). In the group which was not informed of GS details, there was a significant difference for GS aesthetics (Q1) (\(Z = -2.146, p = 0.032\)). Participants agreed more on the aesthetics of Gabstract (M = 5.567) than that of Gliteral (M = 4.250). All significant results are listed in Table 3.

4.2 Choices questionnaire

To analyse the choices questionnaire, a chi-square test for cross-tabulation was applied. Between the two groups of participants, there was a significant difference on the choice ‘Give examples to follow’ in the question “How did the graphical score help you?” (df = 1, p = 0.041). In the group playing without GS information, four participants ticked this statement, whereas nine participants in the group playing with GS information ticked it. However, no statistical significance was found for either Q1 or Q2 between groups of participants, indicating that playing with or without GS information did not have any influence on participants’ choice of how much and when the graphical score was important.
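The cross-tabulation result can be checked with a short sketch. It assumes the 24 participants were split evenly into two groups of 12 (the group sizes are not stated explicitly above) and uses a Pearson chi-square without continuity correction; the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: without / with GS information; columns: ticked / did not tick.
chi2, p = chi_square_2x2(4, 8, 9, 3)
print(round(chi2, 3), round(p, 3))
```

Under these assumptions the statistic is about 4.196 with p of about 0.041, which is consistent with the reported value.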

4.3 Comparative questionnaire

Table 4 presents the results of the comparative questionnaire, with significant differences of a chi-square test in bold. In the group playing without information about the GS, significantly more participants rated that they enjoyed it more with Gabstract than with Gliteral (\(\chi^2 = 10.667, p = 0.001\)). In the group playing with information about the GS, significantly more participants rated that they felt more frustrated with Gabstract than Gliteral (\(\chi^2 = 10.667, p = 0.001\)). Meanwhile, significantly more participants rated that they felt Gliteral helped them to get more inspiration (\(\chi^2 = 6.000, p = 0.014\)).

4.4 Results assessment

The quantity and diversity of creative results can be employed to evaluate the success of a creation process and the effect of a creative stimulus [25, 51]. As it is difficult for novices to control the overall structure of a musical composition [6], and given that creative output can be regarded as a temporary position within a creative experience [23], the analysis of participants’ creative results did not evaluate the broad musical structure. Instead the analysis of the final musical structure concentrated on the smallest unit of creative content – musical patterns – which are regarded as a string of beats with similar track combinations which appear frequently within the musical structure [6].

Two assessors were invited to look for music patterns in the participants’ creative results. Both assessors followed the same procedure and selection criteria:

  1. Look for similar combinations of track records on the timeline.

  2. Attempt to incorporate compositions on different tracks into the same pattern.

  3. When a combination of the same shape appears twice or more, it is considered as a pattern.

  4. When there are subtle changes in the combination, such as adding a beat but not influencing the overall structure, it should still be classified as the same pattern.

  5. When there are subtle changes in the combination, but the change is repeated twice or more, it is classified as a new pattern.
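The assessors worked by eye, but the core of criterion 3 (a combination that appears twice or more counts as a pattern) can be sketched computationally. The representation below is an assumption made for illustration: each beat is a set of active track numbers, patterns have a fixed length in beats, and only exact repeats are counted, so the tolerance for subtle changes in criteria 4 and 5 is not modelled. The function name `candidate_patterns` is hypothetical.

```python
from collections import Counter


def candidate_patterns(beats, length):
    """Return windows of `length` beats that occur at least twice, with counts."""
    windows = [
        tuple(beats[i:i + length]) for i in range(len(beats) - length + 1)
    ]
    counts = Counter(windows)
    # Keep repeated windows, skipping stretches in which nothing is played.
    return {w: c for w, c in counts.items() if c >= 2 and any(w)}


# Hypothetical recording: tracks 0, 1 and 0+2 played in turn, twice, then silence.
beats = [frozenset(b) for b in ({0}, {1}, {0, 2}, {0}, {1}, {0, 2}, set())]
for pattern, count in candidate_patterns(beats, 3).items():
    print([sorted(b) for b in pattern], count)
```

Because the windows overlap, a long run of identical beats would be counted several times, which is one reason a human judgement step such as the one described here remains necessary.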

Each assessor conducted three rounds of pattern exploration independently. After each round of exploration, the assessors discussed the pattern criteria and the controversial patterns with the author. Finally, the two assessors exchanged the patterns for cross-reference and discussed them together with the author in order to reach an agreement on the final patterns in each creative process. The number of pattern types and the total number of patterns were recorded. Internal consistency of the number of pattern types identified by the two assessors was satisfactory (Cronbach Alpha coefficient = .892), as was that of the total number of patterns (Cronbach Alpha coefficient = .923), suggesting the raters were following the same criteria to look for patterns and the data was reliable. Fig. 7 presents an example of patterns extracted from one participant.

Fig. 7 Interaction between GS style and information

Table 5 Significant results of results assessment

Nonparametric tests were adopted to analyse the number of pattern types as the data were not normally distributed. A Wilcoxon signed ranks test suggested that GS style caused significant differences in the number of pattern types and the number of patterns produced by the participants, as presented in Table 5. Participants generated more types of patterns and made more patterns with Gabstract (M = 6.456) and with Gliteral (M = 4.917) than with Gno (M = 3.318). When playing with Gabstract, participants made more types of patterns and more total patterns than with Gliteral. A Mann-Whitney test suggested that the number of pattern types in the two information groups was significantly different when playing with Gliteral. Participants who were informed of the GS details generated more types of patterns (M = 5.196) than those who were not informed (M = 4.525).

4.5 Interview analysis

A bottom-up inductive thematic analysis was conducted on the interview transcripts [30]. Two researchers experienced in thematic analysis went through the transcripts independently three times and iteratively coded the sentences with preliminary themes. Then, the researchers discussed the themes together and combined similar preliminary themes to create categories of themes. Below are themes identified in the interview transcripts along with representative quotes from participants.

4.5.1 Interest trigger

The interview feedback suggested that both GSs implicitly triggered participants’ interest to respond, either by trying to make sense of its meaning, testing the results, or setting a creative goal. Four participants from both groups asked questions such as ‘what does this mean, how could I interpret that?’ or ‘can I actually do that?’ and started to make sense of the symbols. Two participants reported that their motivation to explore more of the box was triggered, e.g., ‘I tried to do more things that I probably wouldn’t have done instinctively’.

Despite the abstractness of the symbols in Gabstract, two participants mentioned that Gabstract triggered their interest. In the presence of Gabstract, one participant was willing to accept the challenge of creating more complex music. Although no participants reported that clues concerning the composition of the music were obtained from the symbols, one participant took them as a reminder of ‘being creative’ and ‘taking care of the structure of the piece.’

4.5.2 Intuitive aid

Generally, the GS was regarded as an ‘intuitive aid’ and an ‘interesting tool’ as it provided examples for participants to learn ‘how to play chunk’. In addition, participants reported that they became less lost in the presence of the GSs. Participants tended to look for solutions or better sound ideas from GSs when they ‘don’t know what to do next’, ‘get stuck’, ‘get repetition’, ‘messed up’ with sound, or were not satisfied with what they were creating. The ideas included ‘combination of different samples’, rhythmic patterns that can ‘be translated to sound sequence’, musical structure that included ‘where to plug in the drums’, as well as musical ideas such as how to ‘mix’, ‘what to use’, ‘when to start or stop’, and ‘how to finish’. Two participants reported that they had difficulty remembering a sound and its corresponding button, but that GSs helped them to recall the sound based on the color and shape.

4.5.3 Idea catalyst

The GSs also played a vital role in helping participants to develop their own ideas and to come up with more ideas on their own. Participants reported that they began by following the score until they ‘got into it a little bit’ and started concentrating on their own explorations. Participants also reported that they started by reproducing the ideas suggested by the symbols, and then ‘from that idea I developed something else’. It seems that it was when they started modifying the ideas interpreted from the GSs that they started to create their own ideas, e.g., ‘maybe I can blend something like this’. They also asked themselves questions such as ‘what can I fill in when seeing the symbol?’, and then tried to create musical ideas following that.

4.5.4 Loose impression

With Gabstract, eight participants reported that they did not develop a ‘one-to-one mapping’ between sound and graphic elements, or a specific interpretation of each symbol. Two even reported that they ‘didn’t really understand what it meant’. Instead, they developed a ‘loose impression’ or a ‘feeling’ from Gabstract when they occasionally caught a glimpse of it. While most participants recognized that Gliteral was designed based on the timeline, participants reported a variety of interpretations of Gabstract. Symbols were taken as ‘a reminder of taking care of general structure and of being creative’, or as an indication of timing and ‘key points’. Additionally, symbol size was mapped to sample length, e.g., ‘I added a loop sample when seeing a big shape’.

As for participants in the group informed of GS details, four of these participants mentioned they put in more effort to ‘remember’ the meaning of the symbols. However, it was difficult and confusing for them to ‘remember’ what the symbols represent according to the information given.

4.5.5 Graphic style

Gliteral and Gabstract created different impressions on participants. Gliteral was described as clear, linear, logical, specific, straightforward, simple, systematic, organized, oppressive, softer, and not providing useful information, while Gabstract was described as abstract, representative, relaxed, open, having no right or wrong, visually pleasant, having more things to find, more interesting, complex, aggressive, confusing, and making no sense.

Whether the symbols created a positive impression is vital as it triggered participants’ willingness to try something different, and influenced their attitudes and approaches towards the creative activity. Four participants expressed their appreciation of the visual style of Gabstract, and even those who thought Gabstract was too abstract to interpret found it ‘visually pleasant’.

5 Discussion

Participants reported higher perceived creativity with both abstract and literal Graphical Scores than when playing without GS. With both Gabstract and Gliteral, participants generated and played with more types and numbers of patterns than with Gno. The obtained results support the claim that visual stimuli of any kind are useful in supporting creativity [25]. Additionally, results of the thematic analysis suggested that both forms of visual stimuli worked as an interest trigger, an intuitive aid, and an idea catalyst. The results suggest that the additional visual stimuli triggered participants’ interest to play at the beginning of the creative process, provided inspiration when they encountered barriers, and facilitated the development of musical ideas during the creative process. We suggest that visual stimuli contribute to the evolution of engagement, supporting novices’ music making in different stages from immediate engagement to sustained engagement and eventually to creative engagement [7].

Hypothesis H1 (graphical score with abstract symbols will better support creative engagement) is supported by the results that show that participants performed more types and numbers of patterns when playing with Gabstract than with Gliteral. Despite the fact that more effort needed to be invested in developing an interpretation of Gabstract, the satisfaction with the results was not affected. Moreover, the interview feedback related to loose impression suggested that abstract symbols allowed greater space for interpretation, which encouraged more explorations. However, the advantages of Gabstract depended largely on whether participants were informed of the design. For the participants who played without any information, the enjoyment with Gabstract was superior to Gliteral, and the graphics of Gabstract were aesthetically more appealing than those of Gliteral. For the group of participants who were informed of the symbol meanings, Gabstract was regarded as more frustrating and less useful. This is possibly because giving information concerning Gabstract confused participants and impeded them in developing a loose impression of the symbols, blocking access to associative thinking and leading to an unfavorable impact on participants’ creative experience. Although the participants may have put more effort into interpreting Gabstract when given no information, the loose impression increased their enjoyment of the creative process and encouraged them to explore more, consequently allowing the music created to be more individual.

Hypothesis H2 (playing without information about graphical score will better support creative engagement) is not supported by the findings. According to the results, playing without GS information had both disadvantages and advantages, depending largely on the GS style. In the case of Gabstract, not giving information can be beneficial as participants found the GS with abstract symbols to be more enjoyable and more aesthetically pleasing. In contrast, being informed about the design of abstract stimuli could seriously frustrate participants during the creative process. For Gliteral, on the other hand, more participants in the group who were informed about GS details agreed that they got more inspiration from the GS. In addition, these participants performed more types of patterns than those who were not informed of GS details. Generally, when information about the GS was given, less effort was required to interpret both literal and abstract visual stimuli. More participants in the group playing with information agreed that the GS helped the creative process by offering examples to follow. Combining participants’ comments on the literal GS, we suggest that providing information about the GS made it more straightforward for participants to understand the musical ideas which were suggested by the GS. It provided examples for participants to follow and learn from, becoming an intuitive aid to catalyze the development of ideas in situations where participants do not know what to do. For visual stimuli that are more open and less direct, providing information about the score seemed to increase its complexity and limit participants’ autonomy to develop their own understandings, which led to a negative impact on creative engagement.

In terms of the creative outcomes, it is worth noting that there is a discrepancy between the participants’ subjective satisfaction and the objective analysis. No evidence was found in the questionnaires to suggest that the GS had any effect on reported satisfaction with creative outcomes. However, the results assessment found more patterns were generated with Gabstract, implying an advantage of Gabstract in improving creative performance. This inconsistency suggests that there does not necessarily exist a direct correlation between subjective feelings of satisfaction with results and objective measures of creative outcome. This is similar to the conclusion that subjective ratings of creativity may diverge significantly from more objective measures [35]. Therefore, qualitative and quantitative assessments are likely to be complementary measures of the complex multi-dimensional constructs of creative engagement.

Fig. 8 Creative Engagement with Visual Stimuli

5.1 How visual stimuli support creative engagement

We propose a descriptive model shown in Fig. 8 to explain how different forms of visual stimuli (abstract or literal) and information about the stimuli support creative engagement based on the framework of creative engagement proposed by [7]. We suggest that both forms of visual stimuli helped to draw the novices’ attention, trigger their interests and guide them into a sense-making process, thus engaging them at the very first stage of interaction (immediate engagement). However, in our view, the graphic style and information given about the visual stimuli affected the development of interpretation. Literal visual stimuli with information about the visuals appeared to guide individuals to learn what is specified, leading to a prescribed interpretation of the visuals being established. In contrast, with its variation of symbolic shapes, the abstract graphic representations may be associated by individuals with different dynamics of the music, such as tempo, volume, timing, etc. This associative thinking, more of an unfocused process that activates different memory locations in the brain and helps enlarge the source context, results in an open interpretation.

Once the interpretation is developed, novices interact further and respond to more visual stimuli (sustained engagement). Working with the prescribed interpretation, we suggest that novices tend to follow the examples of musical ideas (e.g. combinations of samples and rhythmic patterns) suggested in the visual stimuli and reproduce them. This helps novices achieve a creative breakthrough from nothing to something, which is especially important for individuals who do not have any ideas about what to do. Working with the open interpretation, in contrast, the diverse interpretations may elicit various potential musical ideas, and we suggest that these encourage participants to make more explorations and implementations. We see this as an iterative process during which new potential ideas may be discovered and implemented, with more explorations to follow. As a result, the tones and phrases generated tend to be more individual and have more musical dynamics.

As the interaction progresses further, novices may start to evaluate and improve the previous ideas or even create new ideas gradually (creative engagement). They evaluate and filter ideas, sifting out unsatisfactory ones and then modifying them into new ones. By this point, the modified ideas become the user’s creative output. This is similar to the process of transfer and transformation, during which people engage reflectively and creatively trying to conceive novel and original ideas [24]. In this way, visual stimuli provide ‘performative agency’ that can initiate people’s efforts to pursue a creative outcome [5].

The two paths by which visual stimuli and information about them influence creative engagement complement each other. We see prescribed interpretation (literal visual stimuli with information about the stimuli) as being helpful for novices to achieve a breakthrough from nothing, but the disadvantage is that the inspirational sources are limited and can easily become rigid and depleted after a period of time, even leading to fixation. The advantage of the open interpretation (abstract visual stimuli with no information) is that it may provoke more individual associations, thus eliciting more diverse creative inspirations and more variety in the creative output. Therefore, in practical applications, a balance of the two forms of visual stimuli and the information given about them should be considered.

5.2 Implications for designing visual stimuli

The results of our study suggest that the creative benefits of visual stimuli do apply in the context of music. To extend the results beyond the current study, more general implications for the design of visual stimuli to support novices in creative engagement are discussed below.

  1. Providing autonomy for interpretation of the visual stimuli. Visual stimuli should be designed to encourage users to take the stimuli as a supplementary source of inspiration rather than an instruction. We suggest avoiding the disclosure of the design intent of the visual stimuli. Without the pressure of following visual stimuli or interpreting them in a certain way, visual stimuli can evoke more associative thinking. This also helps to avoid distracting users from their main creative task.

  2. Providing abstract but not too abstract visual stimuli. Abstract visual stimuli are visual representations that are less associated with current tasks. These encourage users to develop a variety of interpretations, which triggers indirect associations in memory and provokes inspiration. It is necessary to avoid overly abstract visual stimuli that take effort to interpret, as they may distract users from the main task. However, it is also necessary to avoid overly simple visual stimuli that users find too oppressive, or that cause them to feel directed.

  3. Adaptively integrating both literal and abstract forms of visual stimuli. Literal visual stimuli are helpful to kick-start the initial stage of the creative process. They trigger users’ interest to respond and provide examples to follow. By imitating examples, novices can quickly learn from them and start to develop their own ideas. In order to continuously engage the user creatively, visual stimuli need to be adaptively changed into more abstract forms over time and in response to user behaviour. This is because visual stimuli that are too literal tend to limit the imagination and lead to constrained creative results.
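As a rough illustration of the third implication, the sketch below shows one way an interface might decide how abstract the displayed stimuli should be at a given moment. It is not part of the study or of any system described here: the `SessionState` fields, the `abstraction_level` function, the warm-up and ramp durations, the use of repeated patterns as a proxy for fixation, and the equal weighting of time and behaviour are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of a novice's session (all fields are assumptions)."""
    elapsed_s: float        # seconds since the session started
    distinct_patterns: int  # distinct patterns the user has played so far
    repeated_plays: int     # plays that repeated an already-used pattern


def abstraction_level(state: SessionState,
                      warmup_s: float = 60.0,
                      ramp_s: float = 240.0) -> float:
    """Return a value in [0, 1]: 0 = fully literal stimuli, 1 = fully abstract.

    Time component: stay literal during a warm-up period, then ramp linearly.
    Behaviour component: shift toward abstract when the user keeps repeating
    patterns, used here as an assumed proxy for fixation on literal examples.
    """
    time_part = min(max((state.elapsed_s - warmup_s) / ramp_s, 0.0), 1.0)
    total = state.distinct_patterns + state.repeated_plays
    repeat_part = state.repeated_plays / total if total else 0.0
    # Equal weighting of the two components is an arbitrary illustrative choice.
    return min(1.0, 0.5 * time_part + 0.5 * repeat_part)
```

A display layer could call `abstraction_level` periodically and choose, or blend between, literal and abstract symbol sets according to the returned value. Suitable thresholds and weights would need to be determined empirically.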

5.3 Limitations and future works

To avoid the empirical grouping affecting the objectivity of the thematic analysis, the interview transcripts were analysed anonymously and ungrouped. As a result, the thematic analysis provided little useful evidence of differences between information groups. Future work should seek to explore more qualitative feedback from participants on the provision of information about the scores.

For the purpose of this study, within-subject comparisons were conducted with the aim of obtaining an in-depth understanding of the differences between forms of visual representations. To balance the time and participants’ patience and willingness to be creative, no baseline condition was included in the present study. The analysis compared the perceived creativity between Gabstract and Gno and between Gliteral and Gno. Although the results of the different analyses are consistent, the different tasks in the three sessions may lower the credibility of the comparison results. In future studies, a formal baseline condition should be included and different experimental sessions should be conducted at longer intervals.

It is possible that both versions of visual stimuli were at the abstract end of the abstract-literal continuum, or that specific variations in shape or size were responsible for the results. Attempts were made to avoid subtle variations of GS design: 1) The symbols in Gliteral were designed based on the timeline; 2) The guided learning and exploration sessions aimed to help participants become familiar with the timeline interface. However, potential pitfalls related to the GS design might have affected the results. In future studies, the level of abstraction of design should be more carefully considered.

Whilst the Gliteral symbols were designed based on the timeline, the Gabstract symbols were designed intuitively by a graphic designer with the primary design criterion being that they should not be directly related to the shapes on the timeline. No consideration was explicitly given to the evocative nature of the Gabstract symbol design. Future work should explore how the shape, colour, and visual complexity of the Gabstract symbols evoke certain associations in people. For example, it could explore the cross-modal correspondence of visual brightness to expectations of auditory pitch [33], or the expectation of the kind of sound based on shape, such as the bouba/kiki effect in which visual features of shapes have been shown to consistently map to certain kinds of sounds [45]. Such future research would also be closely related to the idea of graphic-evoked mental imagery [56] in which incorporating what is imagined from the graphics into the composition can positively enhance young children’s music compositional creativity (ibid). Given the temporal nature of music there may also be implicit associations between the abstract shapes and the musical gestures captured in the music they represent. Whilst the abstract-literal design distinction was made purely in terms of the visual form of the shapes designed, future work should explore possible associations between visual form and musical gesture.

Finally, according to one participant, a potential disadvantage of visual stimuli is that they can lead to distraction. Although the GS was reported as being useful in stages such as exploration and ideation by most participants, it could be distracting in stages that require focused thinking. Further studies still need to be performed to understand how to adapt visual stimuli to different creative stages.

6 Conclusion

To conclude, the study presented in this paper examined the effects of visual stimuli in supporting novices to achieve creative engagement while interacting with multimodal digital music systems. It took graphical scores as a form of creative stimuli, exploring whether the graphical style and the presence or absence of explanation had a different impact on creative engagement. The results suggest that abstract visuals were effective external stimuli to guide novices to find inspiration and to support creative engagement. However, providing information about the visual stimuli had both advantages and disadvantages, depending largely on the form of the visual stimuli. In contrast to previous work, which mainly focused on the creative process of professionals, this work shows that novices’ creative processes can also benefit from visual stimuli. Moreover, the study findings also contribute to practices in the field of New Interfaces for Musical Expression, suggesting that visuals can be designed to be creative stimuli within digital musical interfaces themselves. Finally, by undertaking our study in the multimodal domain of music making, our results demonstrate the broad applicability of visual stimuli as creativity support tools.