
1 Introduction

Voice-leading is the art and science of connecting chords to one another. In the 20th century, Lewin [7] introduced Neo-Riemannian theory, which is based on the idea of connecting chords according to some definition of harmonic proximity [2]. This notion of harmonic proximity requires, in turn, a notion of distance between chords. It is natural, then, to introduce some mathematical formalism to address the question of how to measure the distance between two chords; see the work of Tymoczko [8,9,10], Hall and Tymoczko [6], and Derfler [3], to name but a few.

This work focuses on the pedagogical aspects of voice-leading in jazz music, a style where voice-leading is also an important feature. We would like to provide composers with a tool to understand and write voice-leadings by following criteria that are at the same time systematic and musically meaningful. On the mathematical side, we offer the musician a minimal but meaningful mathematical formalization of voice-leadings so that musical concepts are still recognizable in the formalization. The structure of this paper is as follows. We start by introducing some definitions, which will help build the formal framework (the musical universe). In Sect. 3, we study metric spaces in music and introduce the nabla distance, which is the distance to measure the size of a voice-leading. Section 4 contains the pedagogical applications of the nabla methodology.

2 The Musical Universe

We begin by defining the space of frequencies. In principle, it would be enough for our purposes to consider the set of audible frequencies, say, the interval \((20, 2\cdot 10^4)\), when measured in Hz. However, for completeness we will consider the space of frequencies \(\varPhi \) as the real line (it is closed under product and sum of frequencies). Let x and y be two pitches described by their frequencies. We write \(x\sim y\) if and only if \(x=2^k\cdot y\), for some integer k. Recall that two pitches are an octave apart when the quotient of the higher frequency to the lower is 2. This relation identifies all the pitches that are any number of octaves apart as a single pitch class.

From now on, we assume equal temperament. Given a fixed pitch class [k], we define the circle of fifths \(PC_k/{\sim }\) as the set \(PC_k/{\sim }\,=\,\left\{ [k], \left[ k\cdot 2^{\frac{7}{12}}\right] , \left[ k\cdot 2^{\frac{14}{12}}\right] , \left[ k\cdot 2^{\frac{21}{12}}\right] ,\ldots , \left[ k\cdot 2^{\frac{77}{12}}\right] \right\} \). This definition is illustrated in Fig. 1. The pitch class of A is chosen as the base, and the circle of fifths is built up from it by repeatedly multiplying the previous pitch by \(2^{\frac{7}{12}}\), the frequency ratio of a fifth in equal temperament.

Fig. 1. The circle of fifths
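The construction of the circle of fifths can be reproduced numerically. The following is a minimal sketch, assuming A = 440 Hz as the base frequency k and sharps-only note names; the function names `reduce_to_octave` and `circle_of_fifths` are illustrative and not part of the formalism above.

```python
import math

# Pitch-class names indexed by semitones above A (sharps only, an assumption).
NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def reduce_to_octave(freq, base):
    """Representative of the class [freq] lying in the octave [base, 2*base)."""
    k = math.floor(math.log2(freq / base))
    return freq / 2 ** k

def circle_of_fifths(base=440.0):
    """Return (name, frequency) pairs for [k], [k*2^(7/12)], ..., [k*2^(77/12)]."""
    circle = []
    for step in range(12):
        freq = base * 2 ** (7 * step / 12)       # stack `step` fifths on the base
        name = NAMES[(7 * step) % 12]            # semitones above A, modulo the octave
        circle.append((name, round(reduce_to_octave(freq, base), 2)))
    return circle

print([name for name, _ in circle_of_fifths()])
# ['A', 'E', 'B', 'F#', 'C#', 'G#', 'D#', 'A#', 'F', 'C', 'G', 'D']
```

Twelve steps exhaust all twelve pitch classes because 7 and 12 are coprime.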

A chord X(q) is a subset of the pitch classes in \(PC_k/{\sim }\). In Western tonal music, some chords are described by a root and a quality. A chord is an unordered collection of pitches. When we introduce the root and the quality, the pitches are then ordered. The root is the lowest pitch in the chord whereas the quality refers to labels given to chords. For example, a dominant seventh chord on C is the chord composed of C-E-G-B\(\flat \), in that order. The root of this chord is C and the quality is dominant seventh. This label tells us that the first three notes form a major triad and that B must be flattened so that there is a minor seventh between C and B\(\flat \). The quality of a chord is indicated by several symbols (m or lowercase for minor chords, + for augmented chords, etc.).

A chord progression is a sequence of chords. As such, chords in a progression are presented in a given order, which is the order they appear in time. A suitable way to deal with chord progressions is by considering the matrix of classes. If \(P\in \mathcal{M}_{m\times n}(PC_k/{\sim })\) is a chord progression of length n, then each chord is a vector of m notes and there are n chords in the progression. We can arrange the notes of the chord progression in a matrix as follows.

$$\begin{aligned} P=\left( \begin{array}{ccc} [\theta _{11}] &{} \ldots &{} [\theta _{1n}] \\ \vdots &{} \ddots &{} \vdots \\ {}[\theta _{m1}] &{} \ldots &{} [\theta _{mn}] \\ \end{array} \right) \end{aligned}$$

To fix ideas, let us consider the 2-note chord progression {E, C} to {F, E}, which from now on will be notated as {E, C}\(\Longrightarrow \){F, E}. Its matrix representation is \(P=\left( \begin{array}{cc} [E] &{}[F] \\ {}[C] &{}[E] \\ \end{array} \right) \).

Let \(\varPhi ^+\) be the set of positive frequencies. A voicing or a voice-leading of a chord is a mapping \(V_{X(q)}\) from \(\mathcal{M}_{m\times n}(PC_k/{\sim })\) to \(\mathcal{M}_{m\times 1}(\varPhi ^+)\). The mapping takes a given class to a note. Indeed,

$$\begin{aligned} V_{X(q)} \left( \left( \begin{array}{c} [\theta _{1j}] \\ \vdots \\ {}[\theta _{mj}] \\ \end{array} \right) \right) = \left( \begin{array}{c} \phi _{1j} \\ \vdots \\ {}\phi _{mj} \\ \end{array} \right) \end{aligned}$$

where \(\phi _{ij}\in [\theta _{ij}]\), for \(i=1,\ldots , m\) and some j in \(\{1,\ldots , n\}\). Continuing with the previous example, a voice-leading for the first chord of the progression could be (among other possibilities) \( V_{X(q)} \left( \left( \begin{array}{c} [C] \\ {}[E] \\ \end{array} \right) \right) = \left( \begin{array}{c} C4 \\ E4\\ \end{array} \right) \). For ease of reading, we will notate the frequencies by their standard names instead of their numerical values. Therefore, we will write A4 instead of 440 Hz.
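The correspondence between note names and frequencies can be made explicit. The sketch below assumes 12-tone equal temperament, A4 = 440 Hz, and scientific pitch notation (the octave number changes at C); the helpers `frequency` and `voicing` are hypothetical names introduced only for illustration.

```python
# Semitones above C for each pitch class (sharps only, an assumption).
SEMITONES_FROM_C = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def frequency(pitch_class, octave):
    """Frequency in Hz of a note such as ('A', 4), taking A4 = 440 Hz."""
    semitones_from_a4 = SEMITONES_FROM_C[pitch_class] - 9 + 12 * (octave - 4)
    return 440.0 * 2 ** (semitones_from_a4 / 12)

def voicing(pitch_classes, octaves):
    """Pick one representative frequency in each class: a column of the matrix."""
    return [frequency(pc, o) for pc, o in zip(pitch_classes, octaves)]

print(frequency("A", 4))                                   # 440.0
print([round(f, 2) for f in voicing(["C", "E"], [4, 4])])  # [261.63, 329.63]
```

Choosing a different list of octaves yields a different voicing of the same chord.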

An arrangement of a chord progression is the mapping defining which notes of the chords are chosen for the voice-leading. Formally, it is a mapping \(A_{C^\sim }: \mathcal{M}_{m\times n}(PC_k/{\sim })\rightarrow \mathcal{M}_{m\times n}(\varPhi ^+)\) written as

$$\begin{aligned} A_{C^\sim } \left( \left( \begin{array}{ccc} [\theta _{11}] &{} \ldots &{} [\theta _{1n}] \\ \vdots &{} \ddots &{} \vdots \\ {}[\theta _{m1}] &{} \ldots &{} [\theta _{mn}] \\ \end{array} \right) \right) = \left( \begin{array}{ccc} \phi _{11} &{} \ldots &{} \phi _{1n} \\ \vdots &{} \ddots &{} \vdots \\ {}\phi _{m1} &{} \ldots &{} \phi _{mn} \\ \end{array} \right) , \end{aligned}$$

where \(\phi _{ij}\in [\theta _{ij}]\), for \(i=1,\ldots , m\) and \(j=1,\ldots , n\). From now on, arrangements will be notated as \((\phi _1,\ldots , \phi _n)\longrightarrow (\phi ^{\prime }_1,\ldots , \phi ^{\prime }_n)\), that is, as bijections between sequences of notes; compare this notation to that of chord progressions above.

For the chord progression {E, C}\(\Longrightarrow \){D, G}, \(A_{C^\sim }\) could take on the form, among others, of \( A_{C^\sim } \left( \left( \begin{array}{cc} [C] &{} [G]\\ {}[E] &{} [D]\\ \end{array} \right) \right) = \left( \begin{array}{cc} C4 &{} G4 \\ E4 &{} D4 \\ \end{array} \right) \).

3 The Nabla Distance

It is possible to endow the musical space with a metric. The idea is to measure the distance between two notes and what follows is just a formalization of what our ears do in a natural way all the time; see [4] for more information on the cognitive aspects of music. We define a metric \(\varDelta : (\varPhi ^+)^2\rightarrow \mathbb {R}\) as an integral.

$$\varDelta (\alpha , \beta )=\left| \int _{\alpha }^{\beta }\frac{\varOmega }{\phi }d\phi \right| ,$$

where \(\alpha \) and \(\beta \) are frequencies and \(\varOmega \) is a constant such that \(\left| \int _{1}^{2}\frac{\varOmega }{\phi }d\phi \right| =12\); see [1] for a relationship between this constant and the definition of cents. By working out the integral above, this distance can be expressed as \(\varDelta (\alpha , \beta )=\left| \varOmega \ln \left( \frac{\alpha }{\beta }\right) \right| \). The value of the constant is \(\varOmega =12\cdot \left| \log _2(e)\right| \), which indicates that the octave is divided into 12 equal half-tones. This \(\varDelta \) function does hold the three properties of a metric, namely: positivity, \(\varDelta (\alpha , \beta )\ge 0\); symmetry, \(\varDelta (\alpha , \beta )= \varDelta (\beta , \alpha )\); and the triangle inequality \(\varDelta (\alpha , \beta )\le \varDelta (\alpha , \gamma )+\varDelta (\gamma , \beta )\).
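Since \(\varOmega \ln 2 = 12\), the distance can equivalently be written as \(12\left| \log _2(\alpha /\beta )\right| \), that is, the number of equal-tempered half-tones between the two frequencies. A minimal numerical sketch follows; the names `OMEGA` and `delta` are illustrative, and the metric properties are only spot-checked on sample values, not proved.

```python
import math

OMEGA = 12 * math.log2(math.e)   # equals 12 / ln 2

def delta(alpha, beta):
    """Distance in half-tones between two positive frequencies."""
    return abs(OMEGA * math.log(alpha / beta))

a4, e5, a5 = 440.0, 440.0 * 2 ** (7 / 12), 880.0
print(round(delta(a4, a5), 6))   # 12.0  (an octave)
print(round(delta(a4, e5), 6))   # 7.0   (a fifth)

# Spot-check of symmetry and the triangle inequality on these three notes.
assert math.isclose(delta(a4, e5), delta(e5, a4))
assert delta(a4, a5) <= delta(a4, e5) + delta(e5, a5) + 1e-9
```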

The pair \((\varPhi ^+, \varDelta )\) is called the musical metric space. This metric can be extended to the spaces of pitch classes by just taking the minimum of the elements in each pitch class. For two classes \([\theta ], [\tau ]\) in \(PC_k/{\sim }\), we have \(\tilde{\varDelta }([\theta ], [\tau ]) = \text {min}\left\{ \varDelta (\alpha , \beta ) \,|\, \alpha \in [\theta ], \beta \in [\tau ]\right\} \). See the work [5] of Forte for more information on distance functions. For example, \(\varDelta (C5, E4)=8\) and \(\varDelta (C4, E5)=16\), but \(\tilde{\varDelta }([C], [E]) = \text {min}\left\{ \varDelta (\alpha , \beta ) \,|\, \alpha \in [C], \beta \in [E]\right\} =4\). Notice that the maximum value the distance \(\tilde{\varDelta }\) can take is 6.
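The examples above can be checked with integer arithmetic, because in equal temperament \(\varDelta \) between two notes is simply their difference in half-tones. The sketch below assumes notes are given as (pitch class, octave) pairs with the octave number changing at C; `note_distance` and `pc_distance` are illustrative names.

```python
# Semitone index of each pitch class above C (sharps only, an assumption).
PC = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
      "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def note_distance(a, b):
    """Delta between two notes given as (pitch class, octave), in half-tones."""
    (pa, oa), (pb, ob) = a, b
    return abs((PC[pa] + 12 * oa) - (PC[pb] + 12 * ob))

def pc_distance(a, b):
    """Delta-tilde: minimum of Delta over all octave representatives."""
    d = abs(PC[a] - PC[b])
    return min(d, 12 - d)

print(note_distance(("C", 5), ("E", 4)))               # 8
print(note_distance(("C", 4), ("E", 5)))               # 16
print(pc_distance("C", "E"))                           # 4
print(max(pc_distance(a, b) for a in PC for b in PC))  # 6
```

The maximum of 6 is attained by pitch classes a tritone apart, such as C and F\(\sharp \).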

Let \(P\in \mathcal{M}_{m\times n}(PC_k/{\sim })\) be a chord progression such that \(P=([p_{ij}])\), for \(i=1,\ldots , m\) and \(j=1,\ldots , n\). Consider \(\sigma \), an element in the symmetric group \(\mathcal{S}_m\) defined over the set of indices \(\{1, 2,\ldots , m\}\). Then, we define E(P), the extension of P, as those matrices \(B=(b_{ij})\) in \(\mathcal{M}_{m\times n}(PC_k/{\sim })\) such that the following two conditions hold: (1) For some values of j, \([p_{ij}]=[b_{ij}]\), for all \(i=1,\ldots , m\); (2) For the rest of values of j, \([p_{ij}]=[b_{\sigma _k(i)j}]\), for all \(i=1,\ldots , m\), where \(\sigma _k\) is a permutation in \(\mathcal{S}_m\).

These conditions state that each column in B is either the same as the corresponding column in P or a permutation of it. E(P) is the set of such matrices. Consider now the matrix associated to the chord progression {E, C}\(\Longrightarrow \){F, A}. Then, the extension of P is

$$\begin{aligned} E(P)=\left\{ \left( \begin{array}{cc} [C] &{} [A] \\ {}[E] &{} [F] \\ \end{array} \right) , \left( \begin{array}{cc} [C] &{} [F] \\ {}[E] &{} [A] \\ \end{array} \right) , \left( \begin{array}{cc} [E] &{} [A] \\ {}[C] &{} [F] \\ \end{array} \right) , \left( \begin{array}{cc} [E] &{} [F] \\ {}[C] &{} [A] \\ \end{array} \right) \right\} \end{aligned}$$
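The set E(P) can be enumerated directly by permuting the entries within each column. The following sketch represents a progression as a tuple of rows (one row per voice); the function name `extension` is illustrative, and duplicate matrices arising from repeated pitch classes in a chord are collapsed.

```python
from itertools import permutations, product

def extension(P):
    """All matrices obtained by permuting the entries within each column of P."""
    columns = list(zip(*P))                                   # transpose: one tuple per chord
    choices = [sorted(set(permutations(col))) for col in columns]
    # Pick one arrangement per column, then transpose back to rows (voices).
    return [tuple(zip(*cols)) for cols in product(*choices)]

P = (("E", "F"),
     ("C", "A"))
for B in extension(P):
    print(B)
print(len(extension(P)))   # 4
```

With m distinct pitch classes per chord, this enumeration produces \((m!)^{n}\) matrices, which agrees with the four matrices listed above for m = n = 2.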

Next, we need to define the distance that a voice travels through a given chord progression. We will use the symbol \(\tilde{\nabla }\) to define the distance of a chord progression P. Then, \(\tilde{\nabla }(P)\) is defined as follows: \(\tilde{\nabla }(P)=\sum _{i=1}^{m}\sum _{j=1}^{n-1}\tilde{\varDelta }([\theta _{ij}], [\theta _{i(j+1)}])\). The value of \(\tilde{\nabla }(P)\) is the sum of all the distances between consecutive notes of a voice over all voices in the chord progression.

The operator nabla can also be defined for the set E(P) as follows: \(\tilde{\nabla }(E(P)) = \left\{ \tilde{\nabla }(B) \,|\, B\in E(P) \right\} \). Notice that \(\tilde{\nabla }(P)\) is a real value and \(\tilde{\nabla }(E(P))\) a set of values. Let us compute \(\tilde{\nabla }(P)\) for the chord progression {E, C}\(\Longrightarrow \){F, A}. Indeed, \(\tilde{\nabla }(P)=\sum _{i=1}^{m}\sum _{j=1}^{n-1}\tilde{\varDelta }([\theta _{ij}], [\theta _{i(j+1)}]) = \tilde{\varDelta }([E], [F])+\tilde{\varDelta }([C], [A])= 1\,+\,3\,=\,4 \). Actually, we do not need to consider all the matrices in E(P) to compute \(\tilde{\nabla }(E(P))\). It is enough to choose those where the first column is not rearranged. The nabla distances of these matrices in E(P) are

$$\begin{aligned} \tilde{\nabla }\left( \left( \begin{array}{cc} [C] &{} [A] \\ {}[E] &{} [F] \\ \end{array} \right) \right) =1+3=4,\quad \tilde{\nabla }\left( \left( \begin{array}{cc} [C] &{} [F] \\ {}[E] &{} [A] \\ \end{array} \right) \right) =5+5=10. \end{aligned}$$

The nabla value of the extension of P is \(\tilde{\nabla }(E(P)) = \left\{ \tilde{\nabla }(B) \,|\, B\in E(P) \right\} =\left\{ 4, 10 \right\} \).
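These values can be verified mechanically. The sketch below is a direct, brute-force transcription of the definitions of \(\tilde{\nabla }(P)\) and \(\tilde{\nabla }(E(P))\); the function names are illustrative and pitch classes are written with sharps only.

```python
from itertools import permutations, product

PC = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
      "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def pc_distance(a, b):
    """Delta-tilde between two pitch classes, in half-tones (0 to 6)."""
    d = abs(PC[a] - PC[b])
    return min(d, 12 - d)

def nabla(P):
    """Sum of distances between consecutive notes, over all voices (rows)."""
    return sum(pc_distance(row[j], row[j + 1])
               for row in P for j in range(len(row) - 1))

def nabla_of_extension(P):
    """Set of nabla values over every column-wise rearrangement of P."""
    columns = list(zip(*P))
    return {nabla(tuple(zip(*cols)))
            for cols in product(*(permutations(c) for c in columns))}

P = (("E", "F"),
     ("C", "A"))
print(nabla(P))                                   # 4
print(sorted(nabla_of_extension(P)))              # [4, 10]
print(nabla(P) == min(nabla_of_extension(P)))     # True
```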

A chord progression is said to be optimal if \(\tilde{\nabla }(P) = \text {min}\left\{ \tilde{\nabla }(E(P)) \right\} \). In our example, the chord progression {E, C}\(\Longrightarrow \){F, A} is optimal, as the nabla distance attains its minimum at that progression.

Analogously, the nabla distance can be defined for arrangements; it will be notated by \(\nabla \) (without tilde). If \(A=(\phi _{ij}) \in \mathcal{M}_{m\times n}(\varPhi ^+)\) is an arrangement, then the formal definition of \(\nabla \) is \(\nabla (A)=\sum _{i=1}^{m}\sum _{j=1}^{n-1}\varDelta (\phi _{ij}, \phi _{i(j+1)})\).

An arrangement A is said to be optimal if \(\nabla (A)=\tilde{\nabla }(P_A)\), where \(P_A\) is the chord progression associated to A. Let us consider two arrangements associated to the chord progression {E, C}\(\Longrightarrow \){F, E}, say, \(A_1\):(E4, C4)\(\longrightarrow \)(F4, E4) and \(A_2\):(E4, C4)\(\longrightarrow \)(F5, E5). Let us find which one is optimal by computing their nabla distances. We have \(\nabla (A_1)=\varDelta (E4, F4) + \varDelta (C4, E4) = 1+4=5\) and \(\nabla (A_2)=\varDelta (E4, F5) + \varDelta (C4, E5) = 13 + 16 = 29\). Therefore, the first arrangement is the optimal one.
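The comparison of the two arrangements can be reproduced as follows. This is a sketch under the assumption of equal temperament, so that \(\varDelta \) between notes equals their difference in half-tones; notes are written as (pitch class, octave) pairs, and the names `nabla`, `nabla_tilde` and `is_optimal` are illustrative.

```python
PC = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
      "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def delta(a, b):
    """Half-tone distance between two notes given as (pitch class, octave)."""
    return abs((PC[a[0]] + 12 * a[1]) - (PC[b[0]] + 12 * b[1]))

def delta_tilde(a, b):
    """Half-tone distance between two pitch classes."""
    d = abs(PC[a] - PC[b])
    return min(d, 12 - d)

def nabla(A):
    """Nabla of an arrangement: rows are voices, entries are notes."""
    return sum(delta(r[j], r[j + 1]) for r in A for j in range(len(r) - 1))

def nabla_tilde(A):
    """Nabla-tilde of the chord progression P_A obtained by dropping octaves."""
    return sum(delta_tilde(r[j][0], r[j + 1][0])
               for r in A for j in range(len(r) - 1))

def is_optimal(A):
    return nabla(A) == nabla_tilde(A)

A1 = [(("E", 4), ("F", 4)), (("C", 4), ("E", 4))]
A2 = [(("E", 4), ("F", 5)), (("C", 4), ("E", 5))]
print(nabla(A1), is_optimal(A1))   # 5 True
print(nabla(A2), is_optimal(A2))   # 29 False
```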

Let us work out a larger example, with three voices and three chords in the progression. In the example below, we have removed the square brackets to simplify the notation, as it is clear we are speaking of pitch classes. Since the extension of P is composed of all permutations within the columns of P, we can apply a sequence of permutations (the \(\sigma \)’s below) to obtain a sequence of chord progressions reaching the minimum value.

$$\begin{aligned} P&=\left( \begin{array}{ccc} A &{} D &{} G\\ F &{} B &{} E\\ D &{} G &{} C\\ \end{array} \right) \xrightarrow {\sigma _1: D\leftrightarrow G} P_1=\left( \begin{array}{ccc} A &{} G &{} G\\ F &{} B &{} E\\ D &{} D &{} C\\ \end{array} \right) \xrightarrow {\sigma _2: G\leftrightarrow B} P_2=\left( \begin{array}{ccc} A &{} B &{} G\\ F &{} G &{} E\\ D &{} D &{} C\\ \end{array} \right) \\&\qquad \quad \tilde{\nabla }(P)=31 \qquad \qquad \qquad \quad \tilde{\nabla }(P_1)=15 \qquad \qquad \qquad \quad \tilde{\nabla }(P_2)=13 \\ \\&P_2=\left( \begin{array}{ccc} A &{} B &{} G\\ F &{} G &{} E\\ D &{} D &{} C\\ \end{array} \right) \xrightarrow {\sigma _3: G\leftrightarrow E} P_3=\left( \begin{array}{ccc} A &{} B &{} E\\ F &{} G &{} G\\ D &{} D &{} C\\ \end{array} \right) \xrightarrow {\sigma _4: C\leftrightarrow E} P_4=\left( \begin{array}{ccc} A &{} B &{} C\\ F &{} G &{} G\\ D &{} D &{} E\\ \end{array} \right) \\&\quad \,\, \tilde{\nabla }(P_2)=13 \qquad \quad \quad \qquad \quad \tilde{\nabla }(P_3)=11 \qquad \quad \quad \qquad \quad \, \tilde{\nabla }(P_4)=7 \\ \end{aligned}$$

In this case, \(P_4\) is the chord progression with minimum \(\tilde{\nabla }\) distance.
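The five nabla values in this chain can be recomputed directly from the definition. In the sketch below each progression is written as three strings, one per voice, which works here only because every pitch class in the example is a natural note with a one-letter name.

```python
PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def pc_distance(a, b):
    """Delta-tilde between two pitch classes, in half-tones."""
    d = abs(PC[a] - PC[b])
    return min(d, 12 - d)

def nabla(P):
    """Sum of distances between consecutive notes over all voices (rows)."""
    return sum(pc_distance(row[j], row[j + 1])
               for row in P for j in range(len(row) - 1))

chain = {
    "P":  ["ADG", "FBE", "DGC"],
    "P1": ["AGG", "FBE", "DDC"],
    "P2": ["ABG", "FGE", "DDC"],
    "P3": ["ABE", "FGG", "DDC"],
    "P4": ["ABC", "FGG", "DDE"],
}
print({name: nabla(P) for name, P in chain.items()})
# {'P': 31, 'P1': 15, 'P2': 13, 'P3': 11, 'P4': 7}
```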

Let us now discuss how to obtain the chord progression of minimum value. Assume we have a chord progression \(P=([\theta _{ij}])\), where \(i=1, \ldots , m\) and \(j=1, \ldots , n\). Each transition from a chord to the next can be thought of as a bijection between two sets of cardinality m. By elementary combinatorics, there are m! bijections per transition, so the total number of sequences of bijections is \((m!)^{n-1}\). If we assume that the number of voices is constant, then the size of E(P) is exponential in n, so constructing the whole set E(P) is not practical. It is more interesting to design an algorithm that finds the optimal chord progression through a set of operations performed on P. The example above might suggest that a possible algorithm would take the shortest distances between pitch classes and build the optimal chord progression from them (Fig. 2).

Fig. 2. The nabla application.

Alas, the previous statement is not true in general, and the following small example disproves it. Consider the chord progression {C, E, G}\(\Longrightarrow \){B, F\(\sharp \), A}. If we look for the shortest distances between the individual notes, we obtain the bijection \(C\leftrightarrow B, G\leftrightarrow F\sharp , E\leftrightarrow A\), whose nabla distance is \(1\,+\,1\,+\,5\,=\,7\). Notice that the choice of \(E\leftrightarrow A\) is forced by the choices of the previous notes. However, the bijection \(C\leftrightarrow B, E\leftrightarrow F\sharp , G\leftrightarrow A\) gives a smaller nabla distance: \(1+2+2=5\).
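The counterexample can be checked by comparing a greedy pairing with an exhaustive search over all bijections. The sketch below is not the algorithm of the application described in this paper; it is an illustration under two assumptions: the pitch classes within each chord are distinct, and the number of voices m is small enough for an m! search. It also relies on the observation that the cost of each transition depends only on the bijection chosen between two consecutive chords, so the minimum over E(P) is the sum of the per-transition minima.

```python
from itertools import permutations

PC = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
      "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def pc_distance(a, b):
    d = abs(PC[a] - PC[b])
    return min(d, 12 - d)

def greedy_cost(X, Y):
    """Repeatedly pair the closest still-unmatched notes of chords X and Y."""
    pairs = sorted((pc_distance(x, y), x, y) for x in X for y in Y)
    used_x, used_y, total = set(), set(), 0
    for d, x, y in pairs:
        if x not in used_x and y not in used_y:
            used_x.add(x)
            used_y.add(y)
            total += d
    return total

def optimal_cost(X, Y):
    """Minimum total distance over all bijections from X to Y (brute force)."""
    return min(sum(pc_distance(x, y) for x, y in zip(X, perm))
               for perm in permutations(Y))

def minimum_nabla(chords):
    """Minimum of nabla over E(P): transitions can be optimized independently."""
    return sum(optimal_cost(a, b) for a, b in zip(chords, chords[1:]))

X, Y = ("C", "E", "G"), ("B", "F#", "A")
print(greedy_cost(X, Y))    # 7
print(optimal_cost(X, Y))   # 5
print(minimum_nabla([("A", "F", "D"), ("D", "B", "G"), ("G", "E", "C")]))  # 7
```

Finding the best bijection for one transition is an instance of the assignment problem, so for larger m the brute-force search could be replaced by a polynomial-time method such as the Hungarian algorithm.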

4 The Nabla Application

In this section we show how to use the idea of the nabla distance to teach voice-leading in jazz music. Voice-leading, contrary to what one might expect, is common in jazz music and is part of proper performance practice. In order to help the interested musician understand and use the nabla approach to part-writing, we wrote an application, the \(\nabla \) app, which, from a sequence of chords input by the user, computes the optimal chord progression. The application is already available on the Apple store and the interface is in Spanish, although it will be translated into English very soon. This application can be used to illustrate concepts of mathematical theory in the classroom. It may help music students familiarize themselves with mathematical formalization (all the concepts found in Sect. 2). Also, it allows the teacher to take a hands-on approach to part-writing in jazz or classical music.