1 Introduction

Among mathematical explorations of music, few topics are as tantalizing as voice leading, which embodies the rich interplay between melody and harmony, and epitomizes the transcendent relationship between individual and ensemble. Insofar as voice leading is a specific means by which one musical situation changes to another, and describing change with precision is well served by mathematics, voice leading is a natural candidate for abstract mathematical modeling. Voice leading has been modeled as mathematical functions between sets in [10], as groups or groupoids of transformations on a graph or lattice in [5] and [14], as transformations between pitch-class sets in [11], and as multisets of ordered pairs drawn from specified multisets in [17]. All of the above models are discrete, focusing exclusively on the starting and ending situations, and treating the mediating process between them as an “instantaneous event” [12]. In contrast, recent geometrical and topological models for voice leading [3, 4, 12, 17] incorporate the mediating process in the form of continuous paths between states.

To mathematicians, the use of continuous paths to study relationships among geometrical and topological objects is familiar territory: much of algebraic topology depends on careful development of the notion of a “path,” which enables precise definition of important topological invariants, such as the fundamental group and groupoid of a space. Hence, the invocation of paths opens a wide realm of possibilities for the mathematical study of voice leading (Footnote 1). Some of these possibilities have already been explored, with significant results: in the orbifold geometry of chords in [4] and elsewhere, and in the theory of gestures developed by Mazzola and Andreatta in [12]. It is the goal of this paper to make more explicit connections between the mathematics of paths and continuous models for voice leading. One immediate payoff of this effort is greater precision in modeling concatenation of voice leadings, which enables resolution of certain voice crossing ambiguities. Of more general benefit is deeper reconciliation among geometric, transformational, and gestural approaches.

2 Precise Paths, Path Composition, and Homotopy

We now extend the notion of a “path” beyond that of a “generalized line segment” [18]. All of the mathematical material in this section and the next is standard for a first course in algebraic topology (e.g., [13]). We presuppose familiarity with topological spaces and continuous functions between them.

Definition 1

Let X be a topological space. A path in X is a continuous function \(p:[a,b]\rightarrow X\), where \([a,b]\) is a closed interval in the real numbers \(\mathbb {R}\). The initial point of p is p(a) and the terminal point of p is p(b).

Note that a path in X is an example of a gesture [12]. We will assume the parameter set of a path is the unit interval \(I=[0,1]\); this does not really pose any restriction since any closed interval can be shifted and scaled to coincide with I. We include the following for later reference:

Definition 2

A topological space X is said to be path connected if a path from \(x_0\) to \(x_1\) exists for any \(x_0\) and \(x_1\) in X.

Even with the parameter set restricted to I, it should be clear that there is infinite variability among paths between any pair of points. Allowing such variability has musical relevance: taking \(X=\mathbb {R}\) as pitch space for a single voice, and the parameter t as time, path variability corresponds in music to the variety of ways a voice can move (continuously) from one pitch to another (or even depart from and return to the same pitch): it could slide smoothly and linearly over the entire allotted time interval (as in Vers le blanc [3]); it could remain on the starting pitch until a very small subinterval of time, during which the pitch slides quickly enough to the ending pitch to be perceived as an instantaneous change; it could move in quick stair-step fashion through the notes of a scale, as in a fast run played by a woodwind instrument; or it could engage in some combination of these, as in the clarinet solo opening Gershwin’s Rhapsody in Blue (Fig. 1). Path variability, perhaps at the discretion of the performer, is acknowledged at least as early as Fux [8], in giving “diminution” as the reason for prohibition of direct motion from an imperfect consonance into a perfect consonance (see Fig. 2; Footnote 2).

Fig. 1. Clarinet solo in opening measures of Gershwin’s Rhapsody in Blue. Most performances convert the last fifth or so of the glissando into a smooth portamento; for voice leading purposes the glissando is a path from F3 to B\(\flat \)5.

Fig. 2. Excerpt from Josquin’s Ave Maria, showing only the alto and bass parts. On the word “Dei” the bass voice executes a written-out diminution; for voice leading purposes it is a path from F3 down to C3.

Another standard mathematical operation on paths, concatenation or composition, also has musical relevance. For example, in the first three measures of the Josquin excerpt shown in Fig. 2, we might want to equate the two-voice path from (G4, C3) to (G4, G3) followed by the path from (G4, G3) to (A4, F3) with the single path from (G4, C3) to (A4, F3). In another example from [18], pp. 81–82, paths in two-note chord space are decomposed into concatenations of pure parallel and pure contrary paths. (These examples illustrate a problem of ambiguity in modeling voice leadings using continuous paths. Specifically, in [18] and other sources, a voice leading modeled by a path that touches a mirror singularity corresponding to two voices sounding the same pitch class is assumed to contain a voice crossing; however, the alto and bass in the Josquin excerpt sound the same pitch class in the second measure, thereby touching a mirror singularity, but they do not cross. Later, we will see how the ambiguity can be resolved by means of orbifold paths.) We model concatenation of voice leadings by path composition, defined as follows.

Definition 3

If \(p:I\rightarrow X\) and \(q:I\rightarrow X\) are two paths such that \(p(1)=q(0)\), the composition \(p*q\) of p and q is the path given by p(2t) for \(0\le t\le 1/2\) and \(q(2t-1)\) for \(1/2<t\le 1\).

Note that, in order for the composition \(p*q\) to be defined, the terminal point of p must coincide with the initial point of q, so that composition of arbitrary ordered pairs of paths is not necessarily defined, which means composition fails to be a binary operation on the set of paths in X.
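As a minimal sketch of Definition 3, the following Python fragment composes two paths in \(X=\mathbb {R}\) (one-voice pitch space). The names Path and compose, the tolerance parameter, and the MIDI-style pitch numbers (C3 = 48, G3 = 55, F3 = 53) are illustrative assumptions and not part of the source; the bass line loosely follows the first three measures of the Josquin excerpt.

```python
from typing import Callable

# Assumption: a path in X = R is modeled as a function on the unit interval [0, 1].
Path = Callable[[float], float]


def compose(p: Path, q: Path, tol: float = 1e-9) -> Path:
    """Return the composition p*q of Definition 3, or raise if p(1) != q(0)."""
    if abs(p(1.0) - q(0.0)) > tol:
        raise ValueError("paths are not composable: p(1) differs from q(0)")

    def pq(t: float) -> float:
        # First half of the interval traverses p, second half traverses q.
        return p(2 * t) if t <= 0.5 else q(2 * t - 1)

    return pq


# Hypothetical bass line in semitones: C3 -> G3, then G3 -> F3.
c3_to_g3: Path = lambda t: 48 + 7 * t
g3_to_f3: Path = lambda t: 55 - 2 * t
c3_to_f3 = compose(c3_to_g3, g3_to_f3)
```

Calling compose(g3_to_f3, c3_to_g3) raises an error because F3 is not C3, which is the failure of composition to be a binary operation noted above.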

Having allowed for path variability in our mathematical model, we now introduce a mathematically precise way of ignoring path differences that are unimportant from a voice leading point of view. Specifically, we define an equivalence relation on paths using the idea of continuous deformation. The motivation for invoking this particular equivalence is twofold: first, it agrees with notions of equivalence for voice leadings already in use (perhaps implicitly) in existing literature; and second, it is a standard tool in mathematics, whose invocation enables deployment of some extremely powerful ideas.

Definition 4

Let p and q be two paths in X such that \(p(0)=q(0)=x\) and \(p(1)=q(1)=y\) (i.e., the initial and terminal points are x and y, respectively, for both paths). A homotopy rel end points from p to q is a continuous function \(F:I\times I\rightarrow X\) such that \(F(s,0)=p(s)\) and \(F(s,1)=q(s)\) for \(0\le s\le 1\), and \(F(0,t)=x\) and \(F(1,t)=y\) for \(0\le t\le 1\). The paths p and q are called homotopic rel end points if such an F exists; in this case we write \(p\simeq q\).

Note that for each fixed value of t, the function \(s\mapsto F(s,t)\) is a path from x to y; \(t=0\) gives p and \(t=1\) gives q, so F is a continuously varying family of paths from x to y between p and q. Also note that, in our setting, all homotopies will be rel end points, so until further notice we will refer simply to homotopy and assume it is rel end points. We note also that in the context of gesture theory, a homotopy is an example of a hypergesture [12]. For a musical example of homotopy, consider the trill at the beginning of the Gershwin excerpt (Fig. 1). There is a continuous family of variations in which the pitch variation in the trill decreases continuously from a whole tone through a semitone, to a slight vibrato, to no variation at all; hence the path of the trilled note is homotopic to that of a non-trilled note.

It is a standard exercise in algebraic topology to prove that homotopy is an equivalence relation, and that path composition is well-defined for homotopy classes of paths. That is to say, if p and q are composable paths and we denote their homotopy classes by [p] and [q], then for \(p^\prime \simeq p\) and \(q^\prime \simeq q\) we have \([p*q]=[p^\prime *q^\prime ]\), so we can define \([p]*[q]\) unambiguously as \([p*q]\). Another standard exercise is to prove that composition of homotopy classes of paths is associative; that is, \(([p]*[q])*[r]=[p]*([q]*[r])\) for all p, q, and r for which the indicated compositions are defined. Furthermore, if for \(x\in X\) we denote by \(1_x\) the constant path at x, then we have \([1_x]*[p]=[p]\) for any path p with initial point x, and \([p]*[1_y]=[p]\) for any path p with terminal point y. Finally, if p is a path from x to y, and we define \(p^{-1}\) by \(p^{-1}(t)=p(1-t)\) for \(0\le t\le 1\) (so \(p^{-1}\) is the path from y to x obtained by traversing p backwards), then \([p]*[p^{-1}]=[1_x]\) and \([p^{-1}]*[p]=[1_y]\). We can now give our model for voice leading.

Definition 5

Let C be a path connected chord space, and let \(c_1\) and \(c_2\) be chords in C. A voice leading in C from \(c_1\) to \(c_2\) is a homotopy class of paths in C with initial point \(c_1\) and terminal point \(c_2\).

We note that, under this definition, if we were to replace the bass part in the third measure of the Josquin excerpt shown in Fig. 2 with the single note F3 held for the entire measure (i.e., remove the diminution), the voice leading from the start of the third measure to the start of the fourth measure would be unchanged. Equating the two versions is very much in the spirit of [8]; in using one version to justify prohibition of the other, it is clear that Fux intended to consider the two versions identical in some sense, at least from a voice leading point of view. The equivalence relation of homotopy makes the identification precise.

3 The Fundamental Group and Covering Spaces

By now it should be clear that, but for the fact that arbitrary ordered pairs of paths need not be composable, homotopy classes of paths under path composition satisfy all the defining axioms for a group. The usual expedient at this juncture is to choose a specific member \(x_0\) of X, called a base point, and consider only those paths whose initial and terminal points are \(x_0\) (loops at \(x_0\)). Such paths are all composable with one another, and the set of their homotopy classes does indeed form a group, called the fundamental group of the pair \((X, x_0)\), and denoted \(\pi _1(X, x_0)\). If X is path connected, then it can be shown that \(\pi _1(X, x_0)\) and \(\pi _1(X, x_1)\) are isomorphic, so it is common to dispense with explicit reference to the base point, and write simply \(\pi _1(X)\). The fundamental group is functorial in the sense that a given continuous mapping between topological spaces induces a specific homomorphism between their fundamental groups. Because it allows one to unleash the computational power of algebra in the study of topology, the mathematical importance of the fundamental group is hard to overestimate.

For our first example of a fundamental group, consider n-dimensional Euclidean space \(\mathbb {R}^n\). Choose an arbitrary base point \(x_0\in \mathbb {R}^n\), and an arbitrary loop p at \(x_0\), so that \(p:I\rightarrow \mathbb {R}^n\), \(p(0)=p(1)=x_0\), and p is continuous. The mapping \(H:I\times I\rightarrow \mathbb {R}^n\) given by \(H(t,s)=x_0+(1-s)(p(t)-x_0)\) gives a homotopy from p to the constant path at \(x_0\), so there is only one homotopy class of loops at \(x_0\); hence \(\pi _1(\mathbb {R}^n)\) is the trivial group consisting of only one element. A path-connected space with trivial fundamental group is said to be simply connected. It is a straightforward exercise to prove that, in a simply connected space, any two paths having the same initial and terminal points are (path) homotopic.
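The straight-line homotopy H can be checked numerically in the case \(n=1\). The sketch below applies it to a trill-like loop in pitch space; the function names, the pitch value 53.0 for the base point, and the trill shape are illustrative assumptions rather than anything taken from the source.

```python
import math
from typing import Callable


def trill(t: float, x0: float = 53.0, width: float = 2.0, cycles: int = 4) -> float:
    """A hypothetical loop at x0 in R: rises to x0 + width and falls back, `cycles` times."""
    return x0 + width * math.sin(math.pi * cycles * t) ** 2


def straight_line_homotopy(p: Callable[[float], float], x0: float):
    """Return H(t, s) = x0 + (1 - s) * (p(t) - x0), with t the path parameter."""

    def H(t: float, s: float) -> float:
        return x0 + (1 - s) * (p(t) - x0)

    return H


H = straight_line_homotopy(trill, 53.0)
# H(t, 0) follows the trill, H(t, 1) is the constant path at x0,
# and H(0, s), H(1, s) stay at x0 (up to rounding) for every s.
```

At intermediate values of s the trill width shrinks by the factor \(1-s\), which mirrors the narrowing trill described for the Gershwin excerpt.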

Let us pause to consider the musical relevance of the preceding paragraph. In [18], Tymoczko uses Euclidean n-space \(\mathbb {R}^n\) to model n-voice music (ordered pitch space). He defines a voice leading in pitch space to be an equivalence class of pairs of elements of (n-voice) ordered pitch space under the uniform operation of permutation. A representative of such a voice leading is simply an ordered pair of points in \(\mathbb {R}^n\). There is no need to specify a path, because the voice leading is completely determined by the starting and ending points. In agreement with this is the fact that in \(\mathbb {R}^n\) all paths from one specified point to another are homotopic to each other. If a path is needed, any path between the specified endpoints will do; for example, Tymoczko in [18] chooses the straight line segment for its geometrical properties.

We now introduce covering spaces. These are of high importance both mathematically and musically, because they mediate in a very precise, direct way between continuous and discrete models. Recall that every topological space is axiomatically equipped with a collection of “open sets” that enable precise definition of continuity, and a “homeomorphism” is a bijective, bicontinuous function. In Definitions 6 and 7 below, E and B are topological spaces, “map” is synonymous with “function,” and “surjective” means every element of B is the image of at least one element of E under p.

Definition 6

Let \(p:E\rightarrow B\) be a continuous surjective map. The open set U of B is said to be evenly covered by p if the inverse image \(p^{-1}(U)\) can be written as the union of disjoint open sets \(V_\alpha \) in E such that for each \(\alpha \), the restriction of p to \(V_\alpha \) is a homeomorphism of \(V_\alpha \) onto U.

The usual visual image associated with the above definition is a disjoint (possibly infinite) collection of copies of U floating above U; p maps all of the copies of U in E down to U itself in B. In the next definition, a “neighborhood” of a point is an open set containing that point.

Definition 7

Let \(p:E\rightarrow B\) be continuous and surjective. If every point b of B has a neighborhood U that is evenly covered by p, then p is called a covering map, and E is said to be a covering space of B.

The classic first (nontrivial) example of a covering space one sees is the function p from the real line \(\mathbb {R}\) to the unit circle \(S^1\) given by \(p(x)=(\cos 2\pi x, \sin 2\pi x)\). This function also models an important musical example; namely, that of octave equivalence. If we take the unit of length in \(\mathbb {R}\) to be one octave, then p is the precise function that maps pitch space to pitch-class space. If we choose the base point in pitch-class space \(S^1\) to be the pitch class C, then its inverse image under p (called the fiber over C) will consist of all the particular pitches whose pitch class is C; it is an infinite, discrete subspace of pitch space \(\mathbb {R}\). The situation is illustrated in Fig. 3.
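A small numerical sketch of this covering map follows, with pitch measured in octaves above a reference C. The function names and the tolerance are assumptions made for illustration.

```python
import math
from typing import Tuple


def covering_map(x: float) -> Tuple[float, float]:
    """p(x) = (cos 2*pi*x, sin 2*pi*x): pitch (in octaves) to pitch class on the circle."""
    return (math.cos(2 * math.pi * x), math.sin(2 * math.pi * x))


def in_same_fiber(x: float, y: float, tol: float = 1e-9) -> bool:
    """True when x and y map to the same point of the circle, up to rounding."""
    px, py = covering_map(x), covering_map(y)
    return abs(px[0] - py[0]) < tol and abs(px[1] - py[1]) < tol


# Every whole number of octaves lies in the fiber over C (taken here as x = 0).
fiber_over_c_sample = [k for k in range(-3, 4) if in_same_fiber(0.0, float(k))]
```

Pitches that differ by a non-integer amount, such as 0 and 0.5 (a tritone apart), land on different points of the circle.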

In topological parlance, if \(p:E\rightarrow B\) and \(f:X\rightarrow B\) are (continuous) functions, and \(\tilde{f}:X\rightarrow E\) is such that \(p\circ \tilde{f}=f\), then \(\tilde{f}\) is called a lifting of f. One of the most important facts about covering spaces concerns the existence and uniqueness of liftings of paths and path homotopies:

Lemma 1

If \(p:E\rightarrow B\) is a covering map, and \(p(e_0)=b_0\), then any path in B with initial point \(b_0\) has a unique lifting to a path in E with initial point \(e_0\), and any path homotopy of a path in B with initial point \(b_0\) has a unique lifting to a path homotopy in E of a path with initial point \(e_0\).

Fig. 3. The octave equivalence covering map from pitch space to pitch-class space, with pitch class C and part of the fiber over it shown.

The significance of the above lemma is that it can be used to define a correspondence between homotopy classes of loops at \(b_0\) in B (that is, elements of \(\pi _1(B, b_0)\)) and elements of the fiber \(p^{-1}(b_0)\). In fact, this correspondence can be viewed as a group action of \(\pi _1(B, b_0)\) on the fiber, which can be extended to the entire covering space E, so that we obtain a homomorphism of \(\pi _1(B, b_0)\) into the group of covering transformations (i.e., fiber-preserving homeomorphisms) of E. If E is simply connected, then this homomorphism is an isomorphism and E is called the universal cover of B; more generally, if E is path connected and the covering is regular, the group of covering transformations is a quotient of \(\pi _1(B, b_0)\).

For topologists, one of the immediate payoffs of the situation described above is that the problem of finding \(\pi _1(B, b_0)\) is converted to finding the group of covering transformations of the universal cover of B (if it exists). In the case of the covering \(p:\mathbb {R}\rightarrow S^1\), \(\mathbb {R}\) is known to be simply connected, and the group of covering transformations is readily seen to be the group of translations by integer amounts, so \(\pi _1(S^1)\) is the additive group \(\mathbb {Z}\) of integers. Elements of \(\pi _1(S^1)\) correspond to the (signed) number of times a path winds around the circle \(S^1\) before returning to the base point. Musically, if we model pitch-class voice leadings as homotopy classes of paths in pitch-class space, this result means that the pitch-class voice leadings of a one-note chord to itself are in one-to-one correspondence with the integers, interpreted as leaps up or down by some (whole) number of octaves. This set agrees with (one-voice) pitch-class voice leadings from a pitch class to itself as defined in [18]. In particular, as in [18] (“the specific path matters!”) our model distinguishes between paths that travel up or down, and between paths that travel different numbers of octaves. Note, however, that we have obtained more than just an enumerative correspondence: we have made an explicit transition (via path liftings) from a continuous model for voice leadings (paths) to a discrete one (integers), and the discrete model comes equipped with a group structure. Moreover, the group is none other than the group whose action on pitch space \(\mathbb {R}\) gave us octave equivalence (and hence pitch-class space \(S^1\)) in the first place. The ability to identify the set of voice leadings of a chord to itself in some chord space with the set of symmetry operations by which the chord space is defined can be extended beyond the particular example of one-note chords under octave equivalence, but to do so we will need to introduce additional mathematical machinery.
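The passage from paths to integers can be sketched for sampled paths. The fragment below lifts a sequence of pitch classes to pitch space and reads off the winding number of a loop. It is a discrete approximation of Lemma 1, not a proof of it: it assumes pitch is measured in semitones (so the fiber spacing is 12 rather than 1), that consecutive samples are less than a tritone apart, and all function names are hypothetical.

```python
from typing import List, Sequence

PERIOD = 12  # assumption: semitone units, so one octave is 12


def lift(pc_path: Sequence[int], start: int) -> List[int]:
    """Lift a sampled pitch-class path to pitch space, beginning at `start`."""
    if (start - pc_path[0]) % PERIOD != 0:
        raise ValueError("start must lie in the fiber over pc_path[0]")
    lifted = [start]
    for pc in pc_path[1:]:
        # Choose the representative of pc nearest to the previous lifted pitch.
        step = (pc - lifted[-1]) % PERIOD
        if step > PERIOD // 2:
            step -= PERIOD
        lifted.append(lifted[-1] + step)
    return lifted


def winding_number(pc_loop: Sequence[int], start: int = 60) -> int:
    """Signed number of octaves traversed by a sampled loop in pitch-class space."""
    if pc_loop[0] != pc_loop[-1]:
        raise ValueError("not a loop: first and last pitch classes differ")
    lifted = lift(pc_loop, start)
    return (lifted[-1] - lifted[0]) // PERIOD


up = list(range(12)) + [0]                # chromatic scale ascending one octave
down = [0] + list(range(11, -1, -1))      # chromatic scale descending one octave
trill_loop = [0, 2, 0, 2, 0]              # leaves and returns without winding
```

With these samples, up, down, and trill_loop have winding numbers 1, -1, and 0 respectively, matching the interpretation of integers as leaps by whole numbers of octaves.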

4 The Orbifold Fundamental Group

If B is an orbifold, its ordinary fundamental group \(\pi _1(B)\) does not account for the behavior of paths at orbifold singularities. For example, the fundamental group of the Möbius band (two-note chord space in [18]) is the additive group of integers (Footnote 3), but this misses the orbifold structure by which the boundary acts as a mirror for paths that are projections of straight line segments in \(\mathbb {R}^2\), a feature that is essential for the chord space voice leading model [18]. To capture the additional structure algebraically, we need to make use of the orbifold fundamental group, an extension of the ordinary fundamental group. Note that, unlike the ordinary fundamental group and ordinary covering spaces, the material on orbifolds in this section is not typically covered in a first course on algebraic topology. The classic reference is [16]; other useful references are [6] and [7].

Before presenting a definition of the orbifold fundamental group, we provide some mathematical background on manifolds, group actions, and orbifolds. A manifold is a space that looks locally like \(\mathbb {R}^n\). The precise definition uses the language of local models (charts and atlases, see [15] or [9]); we omit it for brevity. A group \(\varGamma \) acts on a space X if there is a function \(\varGamma \times X\rightarrow X\), with the image of an ordered pair \((\alpha , x)\) denoted by \(\alpha \cdot x\), which is compatible with the group structure of \(\varGamma \) in the following sense: first, for all \(\alpha ,\beta \in \varGamma \) and \(x\in X\), we have \((\alpha \beta )\cdot x=\alpha \cdot (\beta \cdot x)\), and second, if 1 denotes the identity element of \(\varGamma \), we have \(1\cdot x=x\). Two examples of group actions that are of particular importance to us are \(\mathbb {Z}\times \mathbb {R}\rightarrow \mathbb {R}\) by \(k\cdot x=x+k\) (translation in \(\mathbb {R}\) by integer amounts, musically interpretable as transposition in pitch space by a whole number of octaves when the unit of length is one octave), and \(\varSigma _n\times \mathbb {R}^n\rightarrow \mathbb {R}^n\) by \(\sigma \cdot (x_1,x_2,\ldots ,x_n)=(x_{\sigma (1)},x_{\sigma (2)},\ldots ,x_{\sigma (n)})\), where \(\varSigma _n\) is the group of permutations of n items and \(\sigma \in \varSigma _n\) (musically interpretable as the permutation operation applied to ordered sequences of pitches). The orbit of a point \(x\in X\) under a given action of \(\varGamma \) on X is the set \(\{\alpha \cdot x\mid \alpha \in \varGamma \}\); the quotient space obtained by collapsing each orbit to a point is denoted \(X/\varGamma \). Note, for example, that the orbit of \(0\in \mathbb {R}\) under the translation action of \(\mathbb {Z}\) is \(\mathbb {Z}\) itself, and that the orbit of a point \((x_1,x_2,\ldots ,x_n)\in \mathbb {R}^n\) under the permutation action described above consists of n! points if the coordinates of the point are all distinct, but fewer if some of the coordinates are the same; in the extreme case the orbit has just one point if all the coordinates are the same. If \(x\in X\), the isotropy group of x is the subgroup \(\varGamma _x\) of \(\varGamma \) defined by \(\varGamma _x=\{\alpha \in \varGamma \mid \alpha \cdot x=x\}\). Under the translation action of \(\mathbb {Z}\) on \(\mathbb {R}\), every point of \(\mathbb {R}\) has trivial isotropy group; on the other hand, under the permutation action a point \((x,x,\ldots ,x)\) on the diagonal of \(\mathbb {R}^n\) has the whole group \(\varSigma _n\) as its isotropy group.
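The orbit and isotropy computations for the permutation action can be verified by brute force for small n. The following sketch uses integer pitch values purely for illustration; the function names are assumptions, and permutations are represented as tuples of zero-based indices.

```python
from itertools import permutations
from math import factorial
from typing import List, Set, Tuple

Point = Tuple[int, ...]
Perm = Tuple[int, ...]


def act(sigma: Perm, x: Point) -> Point:
    """Permutation action: the i-th coordinate of the result is x[sigma[i]]."""
    return tuple(x[i] for i in sigma)


def orbit(x: Point) -> Set[Point]:
    """All distinct reorderings of x."""
    return {act(sigma, x) for sigma in permutations(range(len(x)))}


def isotropy_group(x: Point) -> List[Perm]:
    """All permutations that leave x unchanged."""
    return [sigma for sigma in permutations(range(len(x))) if act(sigma, x) == x]


distinct = (0, 4, 7)    # all coordinates different
doubled = (0, 0, 7)     # one repeated coordinate
diagonal = (5, 5, 5)    # all coordinates equal
```

For each of the three points the orbit size times the isotropy group size equals \(3!=6\), as the orbit-stabilizer relation predicts.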

An orbifold is a generalization of a manifold in which quotient spaces \(\mathbb {R}^n/\varGamma \) replace \(\mathbb {R}^n\) as the local model, where \(\varGamma \) is either finite or acts properly (i.e., with finite isotropy). The precise, general definition is complicated [16]. Fortunately, all of the orbifolds that concern us belong to the tractable family of developable orbifolds: global quotients of manifolds by discrete groups acting properly.

The definition we will give of the orbifold fundamental group is due to William Thurston [6]. The main idea of the definition is to generalize covering space theory to orbifolds. First, the definition of a covering map is modified such that the mapping from each sheet to an evenly covered open set is allowed to be a quotient map corresponding to a group action, rather than a homeomorphism, yielding the definition of an orbifold covering projection. Next, the definitions of universal cover and covering transformations are modified accordingly. Finally, rather than identifying the group of covering transformations of the universal cover with the fundamental group via a theorem as before, the identification is used to define the orbifold fundamental group:

Definition 8

Let Q be a connected orbifold. The orbifold fundamental group of Q, denoted \(\pi _1^{orb}(Q)\), is the group of covering transformations of the universal orbifold cover \(p:\tilde{Q}\rightarrow Q\).

For the developable orbifolds of interest to us musically, the universal orbifold cover is \(\mathbb {R}^n\), and the orbifold fundamental group coincides with the group \(\varGamma \) whose action on \(\mathbb {R}^n\) defines the orbifold. In the case of developable orbifolds, the relationship between the orbifold fundamental group and the ordinary fundamental group is captured neatly in a theorem of Armstrong ([1, 7]):

Theorem 1

Let \(\varGamma \) act properly by homeomorphisms on a connected, simply connected, locally compact metric space X, and let \(\varGamma ^\prime \) be the normal subgroup of \(\varGamma \) generated by the elements which have fixed points in X. Then the fundamental group of the orbit space \(X/\varGamma \) is isomorphic to the factor group \(\varGamma /\varGamma ^\prime \).

For example, in the case of the two-note chord space \(C^2=\mathbb {R}^2/\varGamma \), where the action of \(\varGamma \) combines both octave equivalence and permutational equivalence, \(\varGamma \) is isomorphic to the semidirect product \((\mathbb {Z}\times \mathbb {Z})\rtimes \varSigma _2\). In this case the subgroup \(\varGamma ^\prime \) is the normal subgroup of \(\varGamma \) containing elements of the form \((k,-k)\cdot \sigma \) with \(k\in \mathbb {Z}\) and \(\sigma \in \varSigma _2\); this is isomorphic to the semidirect product \(\mathbb {Z}\rtimes \varSigma _2\). The quotient \(\varGamma /\varGamma ^\prime \) is isomorphic to \(\mathbb {Z}\) and is the fundamental group of the Möbius band, which is the underlying topological space of \(\mathbb {R}^2/\varGamma \) [18]. Note that when \(\sigma =\tau \) is nontrivial (swap coordinates), the fixed point set of \((k,-k)\cdot \tau \) in \(\mathbb {R}^2\) is the line \(y=x+k\); note also that the orbifold quotient \(\mathbb {R}^2/\varGamma ^\prime \) is the strip between two consecutive such lines and has underlying space homeomorphic to \([0,1]\times \mathbb {R}\).
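One way to see the isomorphism \(\varGamma /\varGamma ^\prime \cong \mathbb {Z}\) concretely is sketched below; the map d is introduced here only for illustration and is not part of the source notation. Writing elements of \(\varGamma \) as \((m,n)\cdot \sigma \), define

\[ d:\varGamma \rightarrow \mathbb {Z},\qquad d\bigl((m,n)\cdot \sigma \bigr)=m+n. \]

The translation part of a product of two elements of \(\varGamma \) is the sum of one translation and a possibly swapped copy of the other, and swapping coordinates does not change the sum of the coordinates. Hence d is a homomorphism, and it is surjective because \(d\bigl((k,0)\cdot 1\bigr)=k\). Its kernel is

\[ \ker d=\{(k,-k)\cdot \sigma \mid k\in \mathbb {Z},\ \sigma \in \varSigma _2\}=\varGamma ^\prime , \]

so the first isomorphism theorem gives \(\varGamma /\varGamma ^\prime \cong \mathbb {Z}\). Musically, d records the net number of octaves by which the two voices together ascend.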

As in the case of ordinary covering spaces, if \(Q=\mathbb {R}^n/\varGamma \) is a developable orbifold, and we choose a basepoint \(x_0\) of Q that has trivial isotropy group \(\varGamma _{x_0}=1\), then the elements of \(\pi _1^{orb}(Q,x_0)\) are in one-to-one correspondence with homotopy classes of (ordinary) paths from a specified point \(\tilde{x}_0\) in the fiber of the basepoint \(x_0\) of Q to any point \(\tilde{x}\) in that fiber (including \(\tilde{x}_0\) itself), and since \(\mathbb {R}^n\) is simply connected, there is only one such homotopy class for a given pair \((\tilde{x}_0, \tilde{x})\). Hence the elements of \(\pi _1^{orb}(Q,x_0)\) are in one-to-one correspondence with line segments in \(\mathbb {R}^n\) from \(\tilde{x}_0\) to points of the fiber (including the constant line segment at \(\tilde{x}_0\)). If Q is a chord space, the projections of such line segments are the generalized line segments in Q from \(x_0\) to itself, which are the voice leadings from \(x_0\) to itself [4]. If, however, the basepoint \(x_0\) has non-trivial isotropy group \(\varGamma _{x_0}\) (as would happen in a chord space if \(x_0\) were on the boundary, corresponding to a chord with one or more pitch classes occurring more than once), then there is not a one-to-one correspondence between elements of \(\pi _1^{orb}(Q,x_0)\) and line segments in \(\mathbb {R}^n\) from \(\tilde{x}_0\) to points of the fiber of \(x_0\). Rather, for every such line segment there is a distinct member of \(\pi _1^{orb}(Q,x_0)\) for each member of \(\varGamma _{x_0}\).

We now examine what all this means musically. Denote the orbifold of n-note chord space as defined in [18] by \(C^n\). Recall \(C^n=\mathbb {R}^n/\varGamma \), where \(\varGamma \) is the group of transformations corresponding to octave equivalence and permutational equivalence. In this case, \(\varGamma \) is the wreath product of \(\mathbb {Z}\) with the symmetric group \(\varSigma _n\), or equivalently the semidirect product \(\mathbb {Z}^n\rtimes \varSigma _n\). From Thurston’s definition we obtain that \(\pi _1^{orb}(C^n)=\varGamma \). Hence, if we are modeling voice leadings as homotopy classes of (orbifold) paths in \(C^n\), the set of voice leadings from a given n-voice chord to itself can be identified with \(\varGamma \). For \(n=2\), we can make the identification explicit as follows. Since in this case \(\varGamma =(\mathbb {Z}\times \mathbb {Z})\rtimes \varSigma _2\), any element of \(\varGamma \) can be written uniquely as a product \((m, n)\cdot \sigma \) where m and n are integers, and \(\sigma \) is either the nontrivial element \(\tau \) (swap coordinates) or the trivial element 1 (do not swap) of \(\varSigma _2\). Musically, this means that any voice leading from a two-note chord to itself can be represented by transposing the first voice by m octaves (up or down, depending on whether m is positive or negative), the second voice by n octaves, and then possibly swapping the two voices (\(\sigma \)).

Fig. 4. A sequence of paths in chord space representing the Josquin excerpt in Fig. 2.

For a particular musical example, consider the Josquin excerpt in Fig. 2 again. The four-measure passage shown begins and ends on the same chord, so it represents an element of \(\pi _1^{orb}(C^2)=\varGamma =(\mathbb {Z}\times \mathbb {Z})\rtimes \varSigma _2\). Moreover, neither voice changes octave, and the voices do not swap, so the four-measure sequence represents the identity element \((0,0)\cdot 1\) of \(\pi _1^{orb}(C^2)\). A sequence of directed line segments in \(C^2\) representing the passage is shown in Fig. 4 (the segments wrap around between the right and left sides as described in [18]). It is important to note that, unlike in the case of ordinary (non-orbifold) coverings, for a given loop in \(C^2\) and specified lifting of the base point, there can be more than one lifting to \(\mathbb {R}^2\). Specifically, in the Josquin excerpt, suppose the bass and alto switch parts (remaining in their respective octaves) in the second measure, when both parts are on G (i.e., at the point where the loop in \(C^2\) contacts the boundary). In this case the loop traced in \(C^2\) is unchanged, but the element of \(\pi _1^{orb}(C^2)\) represented is now \((-1,1)\cdot \tau \ne (0,0)\cdot 1\), so the two voice leadings differ.

This example illustrates the need for \(\pi _1^{orb}(C^2)\) as opposed to \(\pi _1(C^2)\); the latter is not sensitive to the difference between the two voice leadings. In general, an orbifold path carries more data than the (ordinary) path to which it projects in the underlying quotient space; the additional data distinguishes between multiple liftings. Details on defining orbifold paths can be found in [7].
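The group \(\varGamma =(\mathbb {Z}\times \mathbb {Z})\rtimes \varSigma _2\) is small enough to model directly. The sketch below assumes the convention suggested by the description of \((m,n)\cdot \sigma \) given earlier (transpose the two voices, then possibly swap), assumes the chord is ordered as (alto, bass) with pitch in octaves above C0, and uses the sum \(m+n\) as a stand-in for the image of an element in the ordinary fundamental group \(\mathbb {Z}\). The class name, method names, and this choice of coordinates are all hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple

Chord = Tuple[Fraction, Fraction]


@dataclass(frozen=True)
class Gamma2:
    """An element (m, n).sigma: shift voices by m and n octaves, then swap if `swap`."""
    m: int
    n: int
    swap: bool

    def act(self, chord: Chord) -> Chord:
        x, y = chord[0] + self.m, chord[1] + self.n
        return (y, x) if self.swap else (x, y)

    def after(self, first: "Gamma2") -> "Gamma2":
        """The element that applies `first` and then `self`."""
        # If `first` swaps, self's shifts land on the opposite original voices.
        am, an = (self.n, self.m) if first.swap else (self.m, self.n)
        return Gamma2(first.m + am, first.n + an, self.swap != first.swap)

    def degree(self) -> int:
        """Net octave shift m + n; an assumed model of the image in Gamma/Gamma'."""
        return self.m + self.n


identity = Gamma2(0, 0, False)
crossed = Gamma2(-1, 1, True)               # the element (-1, 1).tau from the text
g4_c3: Chord = (Fraction(55, 12), Fraction(3))   # (G4, C3) in octaves above C0
```

Under these assumptions crossed differs from identity and sends (G4, C3) to (C4, G3), a different point of the same fiber, yet its degree is 0, so the quotient by \(\varGamma ^\prime \) cannot tell the two apart.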

5 The Fundamental Groupoid

Most voice leadings do not begin and end at the same chord, just as most paths in a space are not loops. The usual generalization of a group to use in such a situation (Footnote 4) is a groupoid. A groupoid is typically defined using category theory [2]. A category \(\mathcal{C}\) consists of a class \(\mathrm{ob}(\mathcal{C})\) of objects and, for each x, y in \(\mathrm{ob}(\mathcal{C})\), a set \(\mathcal{C}(x,y)\) of morphisms in \(\mathcal{C}\) from x to y. For each triple \((x,y,z)\) of objects, there is an associative composition function \(*:\mathcal{C}(x,y)\times \mathcal{C}(y,z)\rightarrow \mathcal{C}(x,z)\), and for each object x there is an identity morphism \(1_x\in \mathcal{C}(x,x)\) such that for \(g\in \mathcal{C}(w,x)\) we have \(g*1_x=g\) and for \(f\in \mathcal{C}(x,y)\) we have \(1_x*f=f\) (Footnote 5). A morphism \(f\in \mathcal{C}(x,y)\) is called an isomorphism if there exists a morphism \(f^{-1}\in \mathcal{C}(y,x)\) such that \(f*f^{-1}=1_x\) and \(f^{-1}*f=1_y\). If \(\mathrm{ob}(\mathcal{C})\) is a set (as opposed to a proper class), then \(\mathcal{C}\) is called a small category.

Definition 9

A groupoid is a small category in which every morphism is an isomorphism.

Let X be a topological space. The category \(\pi X\) whose objects are the points of X, and for which \(\pi X(x,y)\) consists of the homotopy classes of paths from x to y, forms a groupoid, called the fundamental groupoid of X. If C is a chord space, then voice leadings in C coincide with the morphisms of \(\pi C\).

As in the case of the fundamental group, if C is an orbifold, we need to make a distinction between the ordinary fundamental groupoid \(\pi C\), for which the morphisms are homotopy classes of ordinary paths, and the orbifold fundamental groupoid \(\pi ^{orb}C\), for which the morphisms will be homotopy classes of orbifold paths. Defining classes of orbifold paths in the general case is complicated, but for a developable orbifold \(Q=\mathbb {R}^n/\varGamma \), and a given pair of points x, y in Q with trivial isotropy groups, the members of \(\pi ^{orb}Q(x,y)\) are in one-to-one correspondence with homotopy classes of (ordinary) paths in \(\mathbb {R}^n\) from a particular member \(\tilde{x}_0\in \mathbb {R}^n\) of the fiber over x to members of the fiber over y. Again, since \(\mathbb {R}^n\) is simply connected, there is only one such path class for a given pair \((\tilde{x}_0,\tilde{y})\) of points in \(\mathbb {R}^n\). If Q is a chord space, and we take the line segment in \(\mathbb {R}^n\) from \(\tilde{x}_0\) to \(\tilde{y}\) as a representative path, its projection in Q is a generalized line segment. If x or y has non-trivial isotropy group, then there are multiple orbifold path classes corresponding to each generalized line segment. If both isotropy groups are non-trivial, then accounting for the multiplicity is complicated, but if just one (say \(\varGamma _x\)) is non-trivial, then the orbifold path classes corresponding to a generalized line segment themselves correspond to members of \(\varGamma _x\).

Returning to the Josquin example, note that there are two orbifold path classes corresponding to the generalized line segment from GG to AF, since the isotropy group of GG is \(\varSigma _2\). The multiplicity is necessary for composition of voice leadings to be well defined; clearly we must distinguish the voice leadings represented by \((G,C)\mathop {\longrightarrow }\limits ^{(2,5)} (A,F)\) and \((G,C)\mathop {\longrightarrow }\limits ^{(-2,9)} (F,A)\), but to do so we must distinguish those represented by \((G,G)\mathop {\longrightarrow }\limits ^{(2,-2)} (A,F)\) and \((G,G)\mathop {\longrightarrow }\limits ^{(-2,2)} (F,A)\).
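The partial composition in this example can be sketched in code. The fragment below is a simplification: it works in ordered pitch-class space (so it accounts for octave equivalence but not for the permutation part of \(\varGamma \)), measures displacements in semitones, and represents a voice leading by its source chord together with one signed displacement per voice. The class and method names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VoiceLeading:
    source: Tuple[int, ...]   # ordered pitch classes, 0..11
    moves: Tuple[int, ...]    # signed semitone displacement of each voice

    @property
    def target(self) -> Tuple[int, ...]:
        return tuple((pc + d) % 12 for pc, d in zip(self.source, self.moves))

    def then(self, other: "VoiceLeading") -> "VoiceLeading":
        """Composition self * other, defined only when the endpoints match."""
        if self.target != other.source:
            raise ValueError("not composable: target and source differ")
        total = tuple(a + b for a, b in zip(self.moves, other.moves))
        return VoiceLeading(self.source, total)

    def inverse(self) -> "VoiceLeading":
        return VoiceLeading(self.target, tuple(-d for d in self.moves))


G, C, A, F = 7, 0, 9, 5
opening = VoiceLeading((G, C), (0, 7))      # (G, C) -> (G, G)
straight = VoiceLeading((G, G), (2, -2))    # (G, G) -> (A, F)
crossed = VoiceLeading((G, G), (-2, 2))     # (G, G) -> (F, A)
```

Composing opening with straight yields displacements (2, 5), while composing it with crossed yields (-2, 9), so the two continuations out of (G, G) must be kept distinct for composition to be well defined.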