
1 Introduction

One of the main objectives in the field of artificial intelligence (AI) is to develop systems able to reproduce intelligence and human behavior. The machine is not expected to have the same cognitive abilities as humans, or to be aware of what it is doing, but only to solve problems, including difficult ones, efficiently and optimally in specific fields of action. Therefore, the purpose of the studies carried out in the field of AI is not to replace human beings in all their capacities, but to support and improve human intelligence in certain specific fields, an improvement that may rest on the computing power of computers.

AI research has generally developed in the areas of multi-agent systems, machine learning, natural language processing, planning, robotics and vision, and the web.

Music is one of the research fields that AI has so far reached only in part.

The themes of interest in this field are mainly the recovery of paper music scores, the recovery and preservation of audio media, the design and realization of music databases, protection models for the musical cultural heritage, models for the distribution and enjoyment of music, and models for the segmentation of the score.

So far there have been only a few attempts in the field of musical composition: composing music is an art, and it is a difficult task even for human beings. When a composer writes a musical piece, he has an idea and an intention, and he draws on his creativity. A musical piece is a multi-dimensional space with different interdependent levels: duration of sounds, musical phrases, vertical and horizontal sonorities, dynamics, articulations, and so on.

Automating the task of music composition therefore turns out to be rather difficult, if not impossible.

This article presents an algorithm that generates a musical idea to assist the composer, on the basis of a self-learning system centered on the concept of “Functional Harmony”.

This paper is structured as follows. We start by reviewing background and related work in Sect. 2. The theory of Functional Harmony is described in Sect. 3. Markov chains are described in Sect. 4. We discuss the methods and initial results in Sect. 5. Section 6 contains the conclusions.

2 Background and Related Work

Compared with other areas, music has so far been little explored as far as AI is concerned.

Several studies have been carried out on computer-aided musical analysis or the processing of already-written music texts, yet there have been only a few studies in the field of automatic composition.

A first interesting attempt is the work of Cambouropoulos [1], which, starting from the concept of causality, uses Markov chains (a principle also used in subsequent studies [2, 3]) as a tool to help the generation of a musical idea.

Another interesting analysis is the work of D. Cope: Experiments in Musical Intelligence (EMI) [4]. It is corpus-driven and adopts techniques of pattern matching, musical recombinancy, and augmented transition networks, a technique commonly used in natural language processing.

The concept at the basis of this study is the recombination of musical phrases found in existing compositions. As a result, the music produced is not only pleasant but also imitates the style of a composer.

Following the logic of Cope’s work, a different system known as the Automated Composer of Style-Sensitive Music (ACSSM) [5] was developed.

Like EMI, the output music is produced by reconstructing the deconstructed music segments. Compared to EMI, the techniques used for deconstruction and reconstruction of ACSSM are improved by adopting structures proposed in a preference rule based theory called A Generative Theory of Tonal Music [6], which models the unconscious intuitions perceived by music listeners.

Specifically, the lengths of segments depend on the grouping structure of the original music; this technique turns out to perform better than the discretely sizing technique used in EMI. ACSSM also considers the metrical aspect of music by attempting to imitate the metrical structure of the original music in the output music.

Other music generators include Vox Populi [5], which is an interactive system for music composition, based on genetic algorithms, and Band-OUT-of-a-Box [7, 8], which is an interactive real-time improviser based on several machine learning techniques, including clustering and Markov chains.

This article presents an algorithm, inspired by preceding works, that has the objective of generating a new “musical idea”, i.e., a sequence of notes which, as a whole, form an idea for the composer.

Though simple, the new musical idea must stem from a well-defined compositional logic that is not formalized beforehand but is built automatically and gradually by analyzing the harmonic structure of existing musical compositions (of tonal style and by different authors).

The main task of the algorithm created for this purpose is to read not merely the harmonic structure of every single movement in different musical compositions, but the “harmonic function” carried by every single movement, as specified by De la Motte’s theory of functional harmony.

Using the Markov process, the algorithm can progressively improve the quality of the musical ideas as it reads more musical compositions.

An interesting aspect of our system, which already emerged in EMI, is that the new “musical idea” should not only be appreciable but also, when the compositions read are by the same author, imitate that author’s style.

3 Functional Harmony

In the functional theory [9], the goal is to identify in a sound, a chord or a chord succession, the “intrinsic sonorous value” assumed, compared to a specific reference system polarized in a center, or the capacity to establish organic relations with other sounds, chords or chord successions of the same system.

The functional theory tends to go beyond the sonorous event as it manifests itself, to interpret what lies behind that which appears in a particular instant, to seize the meaning, the “role” that it covers in comparison to other events that come before and after it, therefore the “function” that it performs in the context within which it is immersed.

In particular, as far as the chord is concerned, the functional theory tends to investigate, beyond what it represents by itself in comparison to a certain reference system [10] (for instance, the chord G-B-D, compared to the tonal system and the tonality of C Major, is the dominant chord), the harmonic function performed, that is, the organic relation established with the chord that comes before and the chord that comes after it.

The pillars of the functional theory are the harmonic functions of tonic (T), sub-dominant (S) and dominant (D), that Riemann was the first to identify as the foundation and pivot of any type of chord succession, hypothesizing in the connection I-IV-V-I (Fig. 1) the archetype of the tonal harmony and the model which any type of chord concatenation should be traced back to (Fig. 2).

Fig. 1. Archetype of the tonal harmony according to Riemann

Fig. 2. Riemann’s analysis of Beethoven’s Piano Sonata No. 1, Op. 2

It follows that all the chords will have a harmonic function of relaxation or of tonal center T, or of tension towards such center D, or of breakaway from it S.

The three harmonic functions of the I, IV and V degrees are termed main functions because they are linked by a relation based on the interval of the perfect 5th that separates the keynotes of the three corresponding chords. The chords on the remaining degrees of the scale (II, III, VI and VII) are considered “representatives” of the I, IV and V degrees, with which they have an affinity of the third (two sounds in common, because the 3rd is the interval that separates the respective keynotes), and they carry the secondary harmonic functions.

Figure 3 illustrates the sequence of the degrees of the C major scale with their related harmonic function, in which it is easy to notice the functional correlation among the different degrees.

Fig. 3. Harmonic functions of the degrees of the scale

Based on the above considerations it is possible to infer how a musical phrase is built on the basis of the sounds belonging to the chords of certain degrees of the scale that follow one another according to the diagram in Fig. 1. To this end, it is important to point out that generally:

  1. the function of Tonic (T) goes towards a function of Subdominant (S) that can be represented by the IV degree (S) or by the II degree (Sp) of the scale;

  2. the function of Subdominant (S) goes towards a function of Dominant (D) that can be represented by the V degree (D), by the III degree (Dp) or by the VII degree (D7);

  3. the function of Dominant (D) goes towards a function of Tonic (T) that can be represented by the I degree (T) or by the VI degree (Tp).
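The three rules above can be sketched in a few lines of Python. This is only an illustrative reading of the list, not the authors' implementation: the table and helper names are hypothetical, and the choice to treat a repeated function as permitted is an assumption.

```python
# Degree-to-function table read off the three rules above (major scale).
DEGREE_TO_FUNCTION = {
    "I": "T", "II": "Sp", "III": "Dp", "IV": "S",
    "V": "D", "VI": "Tp", "VII": "D7",
}

# Main function represented by each label (hypothetical helper table).
MAIN_FUNCTION = {
    "T": "T", "Tp": "T",
    "S": "S", "Sp": "S",
    "D": "D", "Dp": "D", "D7": "D",
}

# General direction of the functions: T -> S -> D -> T.
USUAL_NEXT = {"T": "S", "S": "D", "D": "T"}


def functions_of(degrees):
    """Translate a list of scale degrees into harmonic-function labels."""
    return [DEGREE_TO_FUNCTION[d] for d in degrees]


def follows_general_rule(degrees):
    """True if every change of main function follows T -> S -> D -> T.

    Assumption: staying on the same main function is always allowed.
    """
    mains = [MAIN_FUNCTION[f] for f in functions_of(degrees)]
    return all(a == b or USUAL_NEXT[a] == b for a, b in zip(mains, mains[1:]))
```

For example, `functions_of(["I", "IV", "V", "I"])` gives the archetype T, S, D, T, while the progression I-V-I is flagged as departing from the general rule.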

The adverb “generally” was used to describe the direction of the tonal functions because the composer has a certain degree of freedom of writing that in some cases allows him to disregard the provisions of musical grammar: for instance, the function of Tonic (T) might go directly towards the function of Dominant (D).

The algorithm developed on the basis of the theory of Functional Harmony represents, therefore, a support to the composer: it can create and propose a new musical idea that the composer may modify by enriching it with the melodic figurations.

Musical grammar provides the composer with a series of tools allowing him to vary, within the same musical piece, an already presented melodic line, by inserting notes which are extraneous to harmony. The sounds of a melodic line, in fact, may belong to the harmonic construction or may be extraneous to it. The former sounds, which fall in the chordal components, are called real, while the latter sounds, which belong to the horizontal dimension, take the name of melodic figurations (passing tones, turns or escape tones).

They are complementary additional elements of the basic melodic material that lean directly or indirectly on real notes and also resolve on them. The use of melodic figurations, therefore, allows greater freedom of the melody, bestowing upon it a better profile (see Fig. 4).

Fig. 4. Representation of the melody in Fig. 2 without the melodic figurations

4 Hidden Markov Model (HMM)

A Markov chain is a stochastic process characterized by the Markov property.

It is a mathematical tool in which the probability that a certain future event occurs depends only on the current state [11]. Let \(X_{n} = i\) be the current state and \(X_{n + k} = j\) the state after k steps, with i, j belonging to the set of states.

The conditional probability \(P\left[ X_{n + k} = j \mid X_{n} = i \right]\) is called a transition probability in k steps of the Markov chain [7].

The probability of transitioning from state i to state j in k steps of a homogeneous chain is indicated by Eq. (1):

$$p_{i,j}(k) = P\left[ X_{n + k} = j \mid X_{n} = i \right]$$
(1)

By labeling the states as 1, 2, …, n we can summarize all the transition probabilities \(p_{i,j}(k)\) in a matrix P(k) of dimension n × n, where the entry in the ith row and the jth column is the transition probability from state i to state j in k steps:

$$P(k) = \left( \begin{array}{ccccc} p_{1,1}(k) & \ldots & p_{1,j}(k) & \ldots & p_{1,n}(k) \\ p_{2,1}(k) & \ldots & p_{2,j}(k) & \ldots & p_{2,n}(k) \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ p_{i,1}(k) & \ldots & p_{i,j}(k) & \ldots & p_{i,n}(k) \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ p_{n,1}(k) & \ldots & p_{n,j}(k) & \ldots & p_{n,n}(k) \\ \end{array} \right)$$

The matrix P(k) for k = 1, known as the transition matrix, plays a fundamental role in the theory of Markov chains: it represents the probabilities of transitioning to the next step [12].

It is possible to represent the transition matrix P by means of a graph called a transition diagram [13]. Its nodes represent the single states, while the arcs, oriented and labeled with probabilities, indicate the possible transitions [14]. For instance, considering the matrix

$$P = \left( \begin{array}{ccc} 0.7 & 0.1 & 0.2 \\ 0.1 & 0.6 & 0.3 \\ 0.3 & 0.3 & 0.4 \\ \end{array} \right)$$

it has the following corresponding diagram (Fig. 5):

Fig. 5. State-transitions in a Hidden Markov model
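The three-state example can be checked numerically. The sketch below verifies that every row of the matrix sums to 1 and computes k-step probabilities, relying on the standard result that for a homogeneous chain the k-step matrix is the k-th power of the one-step matrix. The function names are illustrative only.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def k_step(p, k):
    """Return the k-step transition matrix of a homogeneous chain."""
    n = len(p)
    # Start from the identity matrix (zero steps) and multiply k times.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, p)
    return result


# One-step transition matrix of the three-state example.
P = [
    [0.7, 0.1, 0.2],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
]

P2 = k_step(P, 2)  # two-step transition probabilities
```

The first row of `P2` works out to approximately 0.56, 0.19 and 0.25, so a chain that starts in the first state is back in it after two steps with probability 0.56.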

The problem of classifying sequences may be solved by calculating the probability that a single sequence s is emitted by the model M, P(s|M), where the sum runs over all possible state paths π of the model. Formally (2):

$$P\left( s \mid M \right) = \sum_{\pi} P\left( s, \pi \mid M \right)$$
(2)
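Equation (2) can be illustrated on a toy model. The sketch below computes P(s|M) twice: by summing over every state path π, exactly as Eq. (2) is written, and with the standard forward algorithm, which gives the same value without enumerating paths. All numbers are hypothetical: the state labels T, S, D attached to the example matrix, the start distribution and the emission probabilities are assumptions made only for illustration.

```python
from itertools import product

STATES = ["T", "S", "D"]

# All numbers below are hypothetical and serve only as an illustration.
START = {"T": 1.0, "S": 0.0, "D": 0.0}  # assumed: the chain starts on T
TRANS = {
    "T": {"T": 0.7, "S": 0.1, "D": 0.2},
    "S": {"T": 0.1, "S": 0.6, "D": 0.3},
    "D": {"T": 0.3, "S": 0.3, "D": 0.4},
}
EMIT = {  # assumed probability of observing a note in each state
    "T": {"C": 0.6, "F": 0.1, "G": 0.3},
    "S": {"C": 0.3, "F": 0.6, "G": 0.1},
    "D": {"C": 0.1, "F": 0.1, "G": 0.8},
}


def path_probability(seq, path):
    """Joint probability P(s, pi | M) of a sequence and one state path."""
    prob = START[path[0]] * EMIT[path[0]][seq[0]]
    for t in range(1, len(seq)):
        prob *= TRANS[path[t - 1]][path[t]] * EMIT[path[t]][seq[t]]
    return prob


def likelihood_by_enumeration(seq):
    """P(s | M) as the sum over all state paths, as in Eq. (2)."""
    return sum(path_probability(seq, path)
               for path in product(STATES, repeat=len(seq)))


def likelihood_forward(seq):
    """P(s | M) computed with the forward algorithm."""
    alpha = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for symbol in seq[1:]:
        alpha = {s: EMIT[s][symbol] * sum(alpha[r] * TRANS[r][s] for r in STATES)
                 for s in STATES}
    return sum(alpha.values())
```

Enumeration grows exponentially with the length of the sequence, which is why the forward recursion is the usual choice in practice.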

The designed algorithm uses a matrix of the transitions to construct a compositional logic able to create a musical idea [11]: the matrix represents the probabilities for a type of harmonic function to resolve to another type of harmonic function (Fig. 6).

Fig. 6. Matrix of the transitions of the harmonic functions

A first and main task of the algorithm is to read music compositions in MIDI format (by different authors and from different periods), recognize the harmonic functions of the different musical degrees [15] and update the matrix of transitions. By reading an ever larger number of music compositions, the algorithm will be able to propose ever more pleasant musical ideas. This is because reading the music compositions changes not only the probabilities of transition but also the individual “state-transitions” (T, S, D): if a new harmonic function is identified (for instance the function Sp), it is automatically inserted in the matrix as a new “state”, generating new transition probabilities.
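The update step can be sketched as follows, assuming that MIDI parsing and the recognition of harmonic functions have already produced one list of function labels per composition (that stage is outside this sketch). The class name and its methods are hypothetical. New labels such as Sp become new states the first time they are seen.

```python
from collections import defaultdict


class TransitionMatrix:
    """Transition counts between harmonic functions, grown on demand."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.states = []  # states in order of first appearance

    def update(self, functions):
        """Add the transitions found in one analysed composition."""
        for f in functions:
            if f not in self.states:
                self.states.append(f)  # a new function becomes a new state
        for current, following in zip(functions, functions[1:]):
            self.counts[current][following] += 1

    def probabilities(self):
        """Row-normalised transition probabilities over all known states."""
        result = {}
        for a in self.states:
            total = sum(self.counts[a].values())
            result[a] = {b: (self.counts[a].get(b, 0) / total if total else 0.0)
                         for b in self.states}
        return result
```

A short usage example with two made-up sequences:

```python
m = TransitionMatrix()
m.update(["T", "S", "D", "T"])
m.update(["T", "Sp", "D", "T"])   # Sp is added as a fourth state
print(m.probabilities()["T"])
```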

5 The Results Obtained

The model of analysis set forth in this article was verified by implementing an algorithm whose structure takes into consideration every aspect described above. The algorithm sets no limit on the dimensions of the transition matrix; on the contrary, the matrix is resized automatically every time an existing composition is read, based on the characteristics of that composition.

The algorithm has the objective of proposing a new tonal musical idea as a source of inspiration for a new composition. As such, the new idea will have the typical characteristic of a musical phrase, i.e. it will not contain modulations (passage from one tonality to another) and the first harmonic function will be the tonic.

As far as the rhythmic structure is concerned, the new idea is meant to represent a harmonic structure (as in the example in Fig. 4) that the composer may refine at a later stage by inserting melodic figurations. Hence, one function, and therefore a single sound, corresponds to every movement, which gives the melody a homorhythmic character (every movement has the same duration).

The only parameters required as input for the elaboration are:

  1. the musical tempo, i.e. the time signature (a fraction always placed at the beginning of the staff next to the key, which indicates the total number of movements that each beat must contain and determines the sequence of the accents inside the beat);

  2. the number of beats (that the new idea must have).

On the basis of these parameters it is possible to define the total number of movements that will compose the new idea and, on the basis of the transition percentages derived from the transition matrix, to determine for every function the number of times it will have to be repeated within the idea.
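The first of these computations is a simple product. A minimal sketch, assuming the time signature is passed as a string such as "3/4" and that every movement counted by the numerator receives one function (the function name is hypothetical):

```python
def total_movements(time_signature, beats):
    """Number of movements in an idea of `beats` beats.

    Assumption: the numerator of the time signature is the number of
    movements per beat, each of which carries one harmonic function.
    """
    numerator, _denominator = (int(part) for part in time_signature.split("/"))
    return numerator * beats


print(total_movements("3/4", 4))  # 12
```

With a 3/4 signature and four beats this gives the 12 movements used in the example of this section.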

An example of a musical idea in 3/4 with four beats, generated after reading only three music compositions by different authors and from different periods, is illustrated below (Fig. 7):

Fig. 7. Example of functional analysis and the related transition matrix updated after the reading of every music composition

  • Theme of the melody “Ah, vous dirai-je Maman”, KV 265;

  • Moment musical No. 3 in F minor by Schubert;

  • Song without words No. 9 by Mendelssohn.

For simplicity’s sake in the demonstration and in order to better exploit the efficiency of the method, only the three main harmonic functions (T, S and D) were taken into consideration and all the secondary harmonic functions were ascribed to them (Sp counts as S, Dp counts as D, and so on).

Furthermore, and also solely for demonstrative purposes, only the first four beats of every composition were taken into consideration. This choice is independent of the decision to create a four-beat phrase.

By means of the last transition matrix the algorithm determines the transition percentage from one state to the other (Fig. 8a) and therefore, on the basis of the total number of movements required by the new musical idea (in this example there are 12, because 4 beats of 3 movements are required), the number of times every function may occur (Fig. 8b).

Fig. 8. Representation of the transition percentages from one state to the other (a) and of the number of occurrences of every function within the musical idea (b)
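The text does not spell out how the percentages of Fig. 8a are turned into the occurrence counts of Fig. 8b. One plausible reading, offered here purely as an assumption, is to give each function a share of the movements proportional to how often transitions arrive at it, rounding with the largest-remainder method so that the counts add up exactly. The function name and the sample counts are hypothetical and are not the values of Fig. 8.

```python
def occurrences(transition_counts, total):
    """Split `total` movements among functions in proportion to arrivals.

    Assumption: the share of a function is the fraction of all observed
    transitions that end on it; this is not confirmed by the source.
    """
    incoming = {}
    for row in transition_counts.values():
        for target, count in row.items():
            incoming[target] = incoming.get(target, 0) + count
    grand_total = sum(incoming.values())
    exact = {f: total * c / grand_total for f, c in incoming.items()}
    result = {f: int(x) for f, x in exact.items()}
    # Hand out the movements lost to rounding, largest remainder first.
    leftover = total - sum(result.values())
    by_remainder = sorted(exact, key=lambda f: exact[f] - result[f], reverse=True)
    for f in by_remainder[:leftover]:
        result[f] += 1
    return result


# Hypothetical transition counts, for illustration only.
sample = {"T": {"T": 1, "S": 1, "D": 1}, "S": {"D": 1}, "D": {"T": 2}}
print(occurrences(sample, 12))  # {'T': 6, 'S': 2, 'D': 4}
```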

The results of Fig. 8b represent the basis for the random generation of the harmonic functions of the new idea. It follows immediately that there is no unique combination of harmonic functions; on the contrary, many different combinations may be obtained. The only element common to all these combinations is that the first harmonic function is always the tonic, since tonal music compositions begin on the Tonic chord, which is representative of the main tonality.
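A minimal sketch of this random generation step, assuming the occurrence counts are available as a dictionary (the counts in the usage line are hypothetical): the tonic is fixed in first position and the remaining functions are shuffled. The authors' generator may apply further constraints that are not described in the text.

```python
import random


def random_idea(occurrence_counts, rng=None):
    """Return a random ordering of the functions that starts on the tonic."""
    rng = rng or random.Random()
    if occurrence_counts.get("T", 0) < 1:
        raise ValueError("at least one tonic is needed to open the idea")
    # Expand the counts into a pool, e.g. {"T": 2, "D": 1} -> T, T, D.
    pool = [f for f, n in occurrence_counts.items() for _ in range(n)]
    pool.remove("T")      # reserve one tonic for the first movement
    rng.shuffle(pool)     # every ordering of the rest is equally likely
    return ["T"] + pool


idea = random_idea({"T": 6, "S": 2, "D": 4}, random.Random(0))
```

Passing a seeded `random.Random` makes a run repeatable; omitting it yields a different combination on each call, which matches the remark that many combinations are possible.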

In Fig. 9 below there is a representation of one of the possible combinations of harmonic functions and some possible examples of melodies, defined according to the rules of traditional harmony: every harmonic function is determined by the structure of the chord from which it derives and the chord is formed (fundamentally) by three sounds at a third distance one from the other.

Fig. 9. Example of functional structure and related possible melodies

In the example in Fig. 9 the main tonality is C major and therefore the harmonic functions will be represented by the following sounds:

  1. T (representative of the first degree): C, E, G;

  2. S (representative of the fourth degree): F, A, C;

  3. D (representative of the fifth degree): G, B, D.
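The mapping above is enough to turn a sequence of functions into a candidate melody. The sketch below picks one chord tone at random for every movement; it is only an illustration with hypothetical names, and it ignores voice leading and the other rules of traditional harmony that shape the melodies of Fig. 9.

```python
import random

# Chord tones of the three main functions in C major, as listed above.
CHORD_TONES = {
    "T": ["C", "E", "G"],
    "S": ["F", "A", "C"],
    "D": ["G", "B", "D"],
}


def melody_for(functions, rng=None):
    """Choose one chord tone per movement for a list of functions."""
    rng = rng or random.Random()
    return [rng.choice(CHORD_TONES[f]) for f in functions]


functions = ["T", "S", "D", "T"]
melody = melody_for(functions, random.Random(1))
```

Since each function offers three sounds, a 12-movement idea already admits 3 to the power of 12 different melodies before any melodic figuration is added.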

In this case, too, it is easy to understand how the presence in the transition matrix of the secondary harmonic functions may generate more appreciable melodies thanks to the presence of different combinations of sounds.

An example of how the third melody of Fig. 9 may be modified by the composer by using melodic figurations is given in Fig. 10.

Fig. 10. Example of a melody

6 Conclusions

This article examined the use of the Markov process as a mathematical means of encoding information about the progression of the harmonic functions (as described by De la Motte) of the musical material. This mathematical process may be an efficient tool, used under the guidance of music theory, to formulate computer-elaborated models for the purposes of classical music composition.

The work presents several improvement opportunities. First of all, the possibility to consider, when reading harmonic structures in different music compositions, not only the functions of Tonic, Subdominant and Dominant, but also their correlated functions: the functions on degrees II, III, VI and VII.

Second of all, the possibility to incorporate in the proposed method the concept of “Cadence” which is very important on the compositional level for the definition of the musical phrase.

The tools presented in this article, developed on the basis of specifically musical objectives, are in no way meant to be a system for the composition of a musical piece. Rather, they represent a means of support to didactic activity: a useful tool for specific in-depth analysis, for stimulating the recovery of abilities that are not entirely acquired, or simply for consultation and support of the lecturer’s explanation.