
1 Introduction

Using current Artificial Intelligence (AI) techniques, an attempt was made to create, with a computer, a melody that people could appreciate [1]. One of the fundamental aspects in this respect derives from the concept of “musical expressiveness” [2]: it is an added value that makes a melody pleasant and interesting to listen to.

Musical expressiveness is closely tied to the concept of musical interpretation that the performer develops during his/her own performance [3]. The sounds are not all played with the same intensity: the various elements inherent to the musical phrase are highlighted with different sonorities (forte, piano, crescendo, diminuendo, accents, sforzato…). This interpretation, though, is not defined by the performer based on personal taste: in some cases the indications on the dynamics are already specified in the score, but in other cases they are absent. In the latter case, the musician does not leave the dynamics to chance, but performs on the basis of an analysis of the score [4, 5]: it is actually in the score itself that the “hidden” indications of the composer may be identified. This explains why different performers can arrive at a similar interpretation of the same piece.

The dynamics of a piece may be built by analyzing the harmonic functions (derived from the theory of Functional Harmony) that the composer used when writing a phrase [6]: these functions make it possible to highlight, with different intensities, fundamental elements of a musical phrase such as a cadence, an ostinato or a change of tonality (modulation) [7].

This article presents an algorithm able to investigate the musical expressiveness of a piece by reading its score at the symbolic level. Instead of modeling expressiveness manually, the algorithm identifies the harmonic functions and, based on them, provides indications for the dynamics by means of a graphic representation. The concept of expressiveness is therefore analyzed from the standpoint of musical dynamics and not from the interpretative standpoint, which involves elements such as accelerando, ritardando, rubato and so on; moreover, rhythm is not taken into consideration.

This paper is structured as follows. We start by reviewing background and related work in Sect. 2. The theory of Functional Harmony is described in Sect. 3. The Harmonic Operator is described in Sect. 4. We discuss the methods and initial results in Sect. 5. Section 6 contains the conclusions.

2 Background and Related Works

Musical expressiveness is an important research theme within the context of artificial intelligence and it has been studied from various perspectives.

Approaches to this problem have been based on statistical analysis [8, 9], on mathematical models [10] or on analysis by synthesis [11, 12, 13]. They are usually empirical methods, which produce results expressed as numbers and are therefore easy to analyze. In all these approaches the algorithm is created by a person who conceived a mathematical model able to capture the expressive elements of a performance.

Another interesting approach is the one based on inductive rule learning [14, 15, 16, 17]: instead of manually creating a model for the recognition of the elements related to musical expressiveness, the computer must discover these elements automatically by learning rules from the data.

Each of these studies provided important contributions to research thanks to the different perspectives adopted: all of them formalized their point of observation mathematically.

This article presents an algorithm that, drawing inspiration from the preceding studies, has the objective of identifying the musical dynamics of a melody. Unlike the aforementioned studies, which are based on the analysis of a performance, the algorithm tries to define the musical expressiveness on the basis of a musical grammar reflected in functional harmony. The algorithm created for this purpose has the task of reading a given melody at the symbolic level (this is why scores in MIDI format, without any indication of dynamics, were used); of identifying the harmonic structures (through a melody segmentation process) and the corresponding harmonic functions (see Sect. 3); and, finally, of rendering in graphic format a diagram of the musical dynamics to apply to the melody.

The effectiveness of the method was tested by analyzing piano pieces of the 18th and 19th centuries; the results were compared with the same scores as revised by renowned musicians.

These results make it possible to apply the method to algorithms for the automatic generation of tonal melodies, so as to render them pleasant and interesting.

3 Functional Harmony

In Functional Harmony the objective, on the one hand, is to identify the “intrinsic sound value” that a sound, a chord or a succession of chords assumes with respect to a certain reference system [18]: the tonality. For instance, the D-F#-A chord represents the tonic chord in the tonal system of D major, and the dominant chord with respect to the tonal system of G major or G minor.

On the other hand, Functional Harmony highlights the capacity of a sound, chord or succession of chords to establish organic relations with other sounds, chords or successions of chords of the same tonal system [19, 20].

From this it follows that functional harmony rests on the following fundamental principles.

  • The chords are made up of single sounds of the tonal system to which they belong (Fig. 1).

    Fig. 1. Chords on the degrees of the G major scale.

  • The degrees of the scale (and therefore of a tonal system) belong to specific harmonic functions: the harmonic function of tonic (T) (I degree), of subdominant (S) (IV degree) and of dominant (D) (V degree) [18]. The three harmonic functions of the I, IV and V degrees are called main functions because they are linked by a relation based on the interval of a perfect 5th separating the keynotes of the three corresponding chords. The chords built on the remaining degrees of the scale are considered “representatives” of the I, IV and V degrees, with which they share an affinity of the third (two sounds in common, since the 3rd is the interval that regulates the distance between the respective keynotes), and they carry secondary harmonic functions (Fig. 2).

    Fig. 2. Degrees of the scale grouped per harmonic function.

  • Every degree of the scale has its own resolution tendency based on its harmonic function [18, 20]. It follows that every chord has a harmonic function of relaxation on the tonal center (T), of tension towards that center (D), or of breakaway from it (S) (Fig. 3).

    Fig. 3. Resolution tendency of the chords based on the harmonic function.

It is important to note that the resolution of the chords need not follow the diagram of Fig. 3 [18]: a tonic function may resolve towards a dominant function without necessarily passing through a subdominant function, and a subdominant function may resolve to a tonic function without passing through a dominant function.

The identification of these harmonic functions allows the performer to form an image of the character of the piece and thus to better define the range of dynamics.

4 Functional Harmony Operator

The identification of the harmonic functions, indispensable to describe the dynamics of a score, is performed by the Functional Harmony Operator (FHO), created for this purpose.

Initially, the score of a musical piece in MIDI format is read. This protocol keeps the various voices of the melody divided into levels and identifies the pitch and the duration of every sound in numerical form.
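The paper does not include its implementation; as a rough illustration of this reading step, the following Python sketch (assuming the third-party `mido` library, one melodic voice per MIDI track, and a placeholder file name) collects the pitch and duration of every note, voice by voice.

```python
# Minimal sketch: read a MIDI file and collect, for every track (voice),
# the sequence of (pitch, duration_in_ticks) pairs.
from mido import MidiFile

def read_voices(path):
    midi = MidiFile(path)
    voices = []
    for track in midi.tracks:
        notes = []           # (pitch, duration in ticks) for this voice
        onset = {}           # pitch -> absolute tick of its note_on
        now = 0              # absolute time in ticks
        for msg in track:
            now += msg.time  # delta times accumulate into absolute time
            if msg.type == 'note_on' and msg.velocity > 0:
                onset[msg.note] = now
            elif msg.type in ('note_off', 'note_on'):  # note_on with velocity 0 acts as note_off
                if msg.note in onset:
                    notes.append((msg.note, now - onset.pop(msg.note)))
        if notes:
            voices.append(notes)
    return voices

# Usage (hypothetical file): voices = read_voices('mendelssohn_op19_n6.mid')
```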

A monodic score may therefore be represented as a sequence Sm of N notes ni indexed by their order of appearance i:

$$ S_{m} = (n_{i})_{i \in [0, N-1]} $$

A polyphonic score may be considered as the overlapping of two or more monodic sequences Sm1, Sm2, …, which may be represented by an x × y matrix P(k):

$$ P(k) = \begin{pmatrix} p_{1,1}(k) & p_{1,2}(k) & \ldots & p_{1,y}(k) \\ p_{2,1}(k) & p_{2,2}(k) & \ldots & p_{2,y}(k) \\ p_{3,1}(k) & p_{3,2}(k) & \ldots & p_{3,y}(k) \\ p_{4,1}(k) & p_{4,2}(k) & \ldots & p_{4,y}(k) \\ \vdots & \vdots & \ddots & \vdots \\ p_{x,1}(k) & p_{x,2}(k) & \ldots & p_{x,y}(k) \end{pmatrix} $$

where x represents the number of voices (or levels) and y the number of rhythmic movements in the musical piece [21]. Figure 4 shows a polyphonic musical segment (4 voices or levels) and the corresponding matrix. Every sound is identified by a number: the encoding follows the piano keyboard (the instrument with the widest range), assigning the value 1 to the lowest note (A) and an increasing value to every subsequent sound (A# = 2, B = 3, C = 4, etc.). Every sound also has its own duration, which is defined by taking the shortest duration existing in the piece as the unit and calculating the other durations proportionally [22]. In the example shown in Fig. 4 the shortest duration is the quaver, which assumes the value 1; the crotchet therefore assumes the value 2.

Fig. 4. Score representation matrix.
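As an illustration of how such a matrix could be filled, the sketch below (an assumption-laden illustration, not the published code) quantizes every voice to the shortest duration in the piece and writes the sounding pitch number into each rhythmic movement, with 0 marking a rest.

```python
def score_matrix(voices):
    """Build the x-by-y matrix P: one row per voice, one column per rhythmic movement.

    `voices` is a list of (pitch, duration) sequences; durations are assumed to be
    integer multiples of the shortest duration in the piece, which becomes the
    unit value 1 as in Fig. 4. Rests are encoded as 0.
    """
    unit = min(dur for voice in voices for _, dur in voice)
    width = max(sum(dur // unit for _, dur in voice) for voice in voices)
    matrix = []
    for voice in voices:
        row, col = [0] * width, 0
        for pitch, dur in voice:
            steps = dur // unit
            row[col:col + steps] = [pitch] * steps    # sustain the pitch over its movements
            col += steps
        matrix.append(row)
    return matrix

# Two voices, crotchet = 2, quaver = 1 (the paper's proportional values);
# pitch numbers follow the encoding described above.
# score_matrix([[(28, 2), (30, 1), (31, 1)], [(16, 2), (19, 2)]])
# -> [[28, 28, 30, 31], [16, 16, 19, 19]]
```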

The next stage of the score reading entails the recognition of the tonalities, in order to identify the tonal systems to which the sounds belong (see Sect. 3). This is done by identifying the characteristic notes, i.e. the notes that distinguish the two tonalities: the one being left and the one to which the music modulates [19, 23].
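A minimal sketch of the characteristic-note test, assuming MIDI-style pitch numbers (C = 0 modulo 12) and major keys only, could flag any note whose pitch class falls outside the current scale as a candidate signal of modulation; the paper's full procedure for choosing the new tonality is not reproduced here.

```python
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]              # semitone offsets of the major scale

def scale_pitch_classes(tonic_pc):
    """Pitch classes (0-11) of the major scale built on `tonic_pc`."""
    return {(tonic_pc + step) % 12 for step in MAJOR_STEPS}

def characteristic_notes(pitches, tonic_pc):
    """Indices of the notes whose pitch class is foreign to the current tonal system."""
    scale = scale_pitch_classes(tonic_pc)
    return [i for i, p in enumerate(pitches) if p % 12 not in scale]

# Example: in G major (tonic_pc = 7) the C# (pitch 73) is a characteristic
# note and typically announces a modulation to D major.
# characteristic_notes([67, 71, 74, 73], 7)  ->  [3]
```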

At this point, for every rhythmic movement, the sounds that make up a chord are identified (keeping in mind that the sounds must be separated from each other by intervals of a third) and the sounds that are extraneous to the harmony are eliminated (Fig. 5) [24].

Fig. 5. Simplification of the score by eliminating the notes extraneous to the harmony.
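One possible way to carry out this step, sketched below under the simplifying assumption that the chord is the largest subset of pitch classes that can be ordered as a stack of major or minor thirds, is a small brute-force search over the few sounds of a movement; the remaining sounds are reported as extraneous.

```python
from itertools import permutations

def tertian_chord(pitches):
    """Split the sounds of one rhythmic movement into a chord stacked in
    thirds and the pitch classes extraneous to that harmony."""
    pcs = sorted({p % 12 for p in pitches})

    def stacked_in_thirds(order):
        # consecutive chord members must lie a minor or major third apart
        return all((b - a) % 12 in (3, 4) for a, b in zip(order, order[1:]))

    best = ()
    for size in range(len(pcs), 0, -1):          # prefer the largest tertian subset
        for order in permutations(pcs, size):
            if stacked_in_thirds(order):
                best = order
                break
        if best:
            break
    chord = set(best)
    return chord, set(pcs) - chord

# Example: G-B-D with a passing C -> chord {2, 7, 11} (G major triad),
# extraneous {0} (the C).
# tertian_chord([55, 59, 62, 60])
```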

Finally, for each chord FHO identifies the harmonic function (T, S, D) associated with it (based on the pertaining tonal system), so that every movement carries a functional indication to be used in the definition of the dynamics (Fig. 6) [24, 25].

Fig. 6. Tonal functions of the chords.
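The assignment of a function to a chord can be illustrated as follows: given the chord root and the tonic of the current (major) tonal system, the scale degree is computed and mapped to T, S or D according to the grouping of Fig. 2. This is a sketch of the idea, not the authors' implementation.

```python
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]              # semitone offsets of the degrees I..VII

# Grouping of the degrees per harmonic function (Fig. 2):
# T <- I, III, VI    S <- IV, II    D <- V, VII
DEGREE_TO_FUNCTION = {1: 'T', 3: 'T', 6: 'T',
                      4: 'S', 2: 'S',
                      5: 'D', 7: 'D'}

def harmonic_function(root_pc, tonic_pc):
    """Map a chord root (pitch class 0-11) to T, S or D in the major key on `tonic_pc`."""
    offset = (root_pc - tonic_pc) % 12
    if offset not in MAJOR_STEPS:
        return None                               # root foreign to the tonal system
    degree = MAJOR_STEPS.index(offset) + 1        # 1 = I, ..., 7 = VII
    return DEGREE_TO_FUNCTION[degree]

# Example in G major (tonic_pc = 7): a chord rooted on D (pc 2) is the V degree.
# harmonic_function(2, 7)  ->  'D'
```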

5 Obtained Results

The developed algorithm has the objective of defining the dynamics (expressiveness) of a musical piece considering only the harmonic functions contained in it. As such, the system may be applied only to tonal pieces, and it may be an important aid to systems for the automatic creation of tonal melodies: currently one of the most interesting topics of artificial intelligence applied to music.

No input parameters are necessary for the elaboration: the algorithm reads the initial musical notes and automatically defines the first tonal system; after that, with every change of the characteristic notes it defines a new tonal system. This score segmentation process (used to identify the modulations) imposes no limit on the dimensions of the score representation matrix, which is instead dimensioned at every reading of the piece on the basis of its intrinsic characteristics (number of voices, number of sounds, number of movements or rhythmic divisions).

The initial tests were carried out on a set of specifically selected musical segments of various lengths, in order to verify the validity of the analysis. Then entire scores were taken into consideration and compared with scores already revised by renowned musicians, in order to verify the validity of the method.

The results of the analysis are shown in an isometric diagram which compares the harmonic function of every single chord with the function of rest, T, which represents the chord of the reference tonal system and with respect to which comparisons and classifications may be made: without this reference, a single chord would provide no information.

Table 1 shows the three harmonic functions (T, S, D), each identified by an interval of numeric values that depends on the number of degrees it contains: function T contains the degrees I, III and VI (values 1, 2 and 3), function S contains the degrees IV and II (values 4 and 5) and, finally, function D contains the degrees V and VII (values 6 and 7).

Table 1. Information value of every single chord.
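A direct encoding of Table 1, shown below as an illustrative sketch, assigns each scale degree its information value and pairs it with the constant tonic reference (value 1) used for the comparison.

```python
# Numeric information value of every degree, as in Table 1:
# T: I = 1, III = 2, VI = 3    S: IV = 4, II = 5    D: V = 6, VII = 7
DEGREE_VALUE = {1: 1, 3: 2, 6: 3,     # tonic group
                4: 4, 2: 5,           # subdominant group
                5: 6, 7: 7}           # dominant group

def information_profile(degrees):
    """Pair the value of each chord (given by scale degree) with the tonic reference (1)."""
    return [(DEGREE_VALUE[d], 1) for d in degrees]

# Example, degrees I-IV-V-I:
# information_profile([1, 4, 5, 1])  ->  [(1, 1), (4, 1), (6, 1), (1, 1)]
```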

Figure 7 proposes a musical segment with the corresponding graphic analysis.

Fig. 7. Mendelssohn, “Songs without Words”, op. 19 n. 6: functional analysis. (Color figure online)

In the upper part of the diagram, colors represent the information brackets of the harmonic functions of the single chords, while the lower part of the diagram shows the information of the single tonic chord (against which the comparison is made). The color representing a certain bracket takes a darker or a lighter tone depending on whether the value of a harmonic function expresses a greater (D) or a lesser (S) tension. If the chord has a T function, the diagram contains a column having only the color of the corresponding information bracket (green). If, instead, the chord has an S or D function, the color within the column fades from the color of the preceding value to the color of the value representing the tonal function. The larger the color difference, the greater the musical tension expressed in the phrase and, therefore, the higher the sound intensity. Following the gradualness of the color fade-out it is possible to create:

  • a “crescendo”, when the transition goes from green (T) to orange (S) or to red (D); this effect is represented by the graphic symbol <, the length of which varies with the duration of the fade-out;

  • a “diminuendo”, when the transition goes from red (D) or orange (S) to green (T); this effect is represented by the graphic symbol >, the length of which varies with the duration of the fade-out;

  • an “accento” or “appoggiato”, when a T–D–T succession occurs (green-red-green): the “accento” or “appoggiato” falls on the harmonic function D (see Fig. 7, bar 3, second movement); this effect is represented by the graphic symbol “-”.

On the basis of this information the algorithm then proposes the dynamics to apply to the piece (see Fig. 7); a sketch of this mapping is given below.
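The sketch below illustrates how such a mapping from successive harmonic functions to dynamic indications might look; the '=' mark for an unchanged function and the tension ordering T < S < D are assumptions added for the illustration.

```python
def dynamics_marks(functions):
    """Translate a sequence of harmonic functions ('T', 'S', 'D') into
    crescendo '<', diminuendo '>' and accent '-' indications, following
    the transition rules described above."""
    tension = {'T': 0, 'S': 1, 'D': 2}
    marks = []
    for i in range(1, len(functions)):
        prev, curr = functions[i - 1], functions[i]
        nxt = functions[i + 1] if i + 1 < len(functions) else None
        if prev == 'T' and curr == 'D' and nxt == 'T':
            marks.append('-')                     # accento / appoggiato on the D chord
        elif tension[curr] > tension[prev]:
            marks.append('<')                     # growing tension -> crescendo
        elif tension[curr] < tension[prev]:
            marks.append('>')                     # releasing tension -> diminuendo
        else:
            marks.append('=')                     # same function -> keep the dynamics
    return marks

# Example: T S D T D T  ->  ['<', '<', '>', '-', '>']
# dynamics_marks(['T', 'S', 'D', 'T', 'D', 'T'])
```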

6 Conclusions and Discussion

This article presented an algorithm for designing the musical expressiveness of a tonal melody generated by computer. The model is based on the concept of “harmonic function” derived from the theory of De la Motte, which allows drawing a graphic diagram representing the sound intensity and, therefore, the evolution of the musical dynamics (or musical expressiveness). The proposed method thus observes both the musical grammar related to the succession of harmonies and the syntax of the musical phrase.

The results allow the method to be used not only to make a computer-generated melody more expressive, but also to annotate existing music. In other words, the method may have implications for musical analysis and, therefore, represent a means of support for teaching: a tool to stimulate the recovery of not-fully-acquired abilities, or simply a tool for consultation and for supporting the teacher's explanations.

Musical expressiveness remains an interesting topic for scientific investigation and for technology research. A future study might target a combined analysis of the harmonic structure and of the rhythm of the computer-generated melody, in order to give greater coherence to the musical phrase.