Keywords

1 Introduction

Repetitions are a fundamental structuring principle in many musical styles. They guide the listener in their experience of a musical piece through creating listening experiences, and facilitate the recall process [25, p. 228 ff.]. As such, the study of repetition is an important research topic in many fields of music research, and computational methods enable researchers to study musical repetitions quantitatively in large music collections. Musical pattern finding is important in several areas of music research. In Music Information Retrieval, repetitions have been used as indicators of musical segmentation [21], or to find themes or choruses in large databases [49]. In Music Analysis, analytical approaches based on repetition, for instance Réti’s motivic analysis [52], have been formalized and evaluated by developing a computer model [5]. In Folk Music Research, computational discovery of shared patterns between folk song variants offers the potential to detect moments of stability, i.e. melodic elements that change relatively little through the process of oral transmission [58]. Hypotheses on memory, recall and transmission of melodies can be tested on large databases using musical pattern finding [26].

As we show in this article, there are many different kinds of repetition that different researchers investigate: large repeated structures, such as themes, chorusses, or stanzas; smaller repeated units, such as motifs; but also building blocks of improvized music, such as formulae or licks. For these different purposes, and for different genres, the authors of the discussed studies have formalized repetition in different ways. What may be considered as a variation or as musically unrelated depends on a great number of factors, factors which yet need to be understood [57].

A computational method to find musical repetitions can contribute to an understanding of principles of repetition and variation. Using computational methods, researchers can model and test knowledge on the cognitive principles underlying musical repetition. Cognitive and computational models can cross-pollinate each other in such an overarching research question, as argued, for instance, by Honing [22, 23]. In this paper, we explore computational methods for finding repeated musical patterns, which we will refer to as musical pattern finding.

Currently, the knowledge of musical pattern finding is dispersed across different fields of music research. Miscellaneous studies present various approaches to the problem, but there is no systematic comparison of the proposed methods yet. Moreover, the influence of music representation, and filtering of algorithmic results on the success of musical pattern finding is currently unknown. This lack of comparative assessment is further complicated by the lack of a sound methodology for the evaluation of musical pattern finding.

This paper provides a comprehensive overview, review and discussion of the field of musical pattern finding. We present the essence of assorted studies, and we proceed to clarify the relationships between different methods, proposing a taxonomy of musical pattern finding approaches, and discussing various studies according to the criteria of music representation, filtering, reference data, and evaluation. Furthermore, we identify current challenges of musical pattern finding and conclude with steps to overcome these challenges.

2 State of Knowledge on Musical Pattern Finding

We conducted a comprehensive literature survey on musical pattern finding. In this section, we provide an overview of the various studies. First, we discuss the methods that the surveyed studies used, introducing a taxonomy within which the different approaches can be placed. We continue by considering different forms of music representation, different methods of filtering algorithmic results, and close with an overview of reference data and evaluation methods used in the studies.

In this survey, our focus lies on studies using symbolic music representations. Several studies work with audio representations [9, 44], but we will not discuss the problems related to transforming audio representations for musical pattern finding. For this, and other methods for the audio domain, we refer the reader to the overview by Klapuri [28].

Fig. 1.
figure 1

A schematic representation of a taxonomy for musical pattern finding methods. For each category, relevant studies are listed.

We present our overview of studies on musical pattern finding in Table 1. The tables’ columns refer to the study, naming the first author and year (please refer to the references for the full list of authors). The second column mentions the goals of the authors in exploring musical pattern finding. We list the musical pieces that each study used. Further categories, which we will discuss in more detail below, are the method the studies employed, the music representation that was used, what kind of filtering of the algorithmic results was performed, the reference of musical patterns against which the results were compared, represented by first author and year, and finally, which evaluation methods were applied. Empty cells denote that the category in question is not applicable to a particular study.

Table 1. Overview of research on musical pattern finding in music

2.1 Methods

As can be gleaned from the various goals of the studies presented in Table 1, the interest for musical pattern finding is diverse, spanning different music research disciplines and musical genres. This might be the reason why various authors present their algorithms without stating how their method relates to most of the other research on musical pattern finding.

To assess the relationship of different methods, we propose a taxonomy in which the various methods can be placed in perspective, which is represented in Fig. 1. Below, we will explain the distinctions within the taxonomy.

Pattern Discovery or Pattern Matching. Musical pattern discovery, on the one hand, aims at the identification of motifs, themes, and other musical structures in a piece of music, or between related pieces of music (intra-, and inter-opus discovery). Typically, algorithms applied for pattern discovery do not presuppose prior knowledge of possible candidates.

Musical pattern matching, on the other hand, strives to recognize pre-defined patterns in a musical piece, or a corpus of music [53]. Applications of pattern matching include the identification of fugue subjects and counter-subjects in Bach fugues [19] or variations of themes, as in some works by Mozart [18], among others.

In terms of the applied techniques, there is an overlap of pattern matching with classification problems in music retrieval, which aim at the identification of similar musical pieces in a large database (e.g. [24, 33]). However, in this paper, we focus on the problem of identifying a musical subsequence in a larger piece of music.

In Fig. 1, musical pattern discovery algorithms are presented on the left; musical pattern matching approaches can be found on the right side of the diagram.

String-Based or Geometric Methods. A piece of music can be represented by a series - or string - of tokens. For instance, the notes of a melody could be represented by a string of pitches (A; G; A; D), as MIDI note numbers (57; 55; 57; 50), or as tokens representing both pitch and duration of the note ((A, 1.5), (G, 0.5),(A, 1.0), (G, 1.0)). These multiple possibilities will be further discussed in the section on music representation below.

One approach to musical pattern finding is to search for identical subsequences of tokens in a string representation of a melody or multiple melodies. This approach has been derived from techniques developed within Computational Biology to compare gene sequences. Gusfield [20] provides a thorough overview of these techniques.

In geometric methods, a melody is considered as a shape in an n-dimensional space. Repeated patterns are then identified as (near-)identical shapes. Geometric methods are especially interesting for polyphonic music, as they can deal more efficiently with note events occurring at the same time [45, p. 328]. Meredith’s Structure Induction Algorithms (SIA) [45] order the pitch/onset points lexicographically, and search for vectors of pitch and time relationships which repeat elsewhere in a musical piece. This concept, used by Meredith and Collins [12, 45] for pattern discovery, has been applied by Lemström et al. [39] for pattern matching.

The musical pattern discovery approaches by Buteau and Vipperman [6] and Szeto and Wong [56] rely on a similar conceptualisation of music as n-dimensional shapes. In the latter study, the geometric relationships are represented as nodes and edges in graphs [56]. Laaksonen [34], Laitinen and Lemström [35], Lubiw and Tanur [42] and Romming and Selfridge-Field [54] represent musical pieces as point sets of onset times and pitch, which can be shifted in the time or pitch domain, which is handled through substitution weights in an alignment algorithm.

Use of Indexing Structures. Within the string-based methods we can define two categories: namely, whether or not indexing structures are used. Such indexing structures typically are a graph of tree-like structure, in which the repetitions contained in a string are represented, and can therefore be efficiently found. We will first present those studies that do not use indexing structures for the music pattern discovery, then those which do use them.

The simplest string-based approach to finding repeated patterns in a melody \(M\) consists of sliding all possible patterns \(P\) past \(M\), and recording all found matches. This approach is taken by Nieto and Farbod [47]. There are some extensions of this simple approach, which skip some comparison steps between \(P\) and \(M\) without missing any relevant patterns. One of these extensions, the algorithm by Knuth, Morris and Pratt [31], has been applied by Rolland [53] for musical pattern discovery.

Yet another approach is Crochemore’s set partitioning method [16], which recursively splits the melody \(M\) into sets of repeating tokens. Cambouropoulos [7] used this method to find maximally repeating patterns (i.e. repeated patterns which cannot be extended left or right and still be identical) in musical pieces. Karydis et al. [27] refine the set-partitioning approach to find only the longest patterns for each musical piece, with the intuition that these correspond most closely to musical themes.

Meek and Birmingham’s [44] algorithm transforms all possible patterns up to a maximal pattern length to keys in a radix 26 system (representing 12 intervals up or down, unison, and a 0 for the end of the string). After a series of transformations, which consolidate shorter into longer patterns, identical patterns are encoded by the same numerical keys.

There are number of studies which do use indexing structures [8, 9, 15, 30, 36, 37]. Knopke and Jürgensen [30] use suffix arrays representing phrases of Palestrina masses. Conklin and Anagnostopoulou [15] use a sequence tree to represent search spaces of patterns in Cretan folk songs, which is pruned based on the patterns’ frequency of occurrence. Lemström and Laine compare suffix tries to the more compressed form of suffix trees on a varied music corpus [38]. Lartillot [36] employs a pattern tree to model musical pieces; what is special about this indexing structure is that it allows for cyclic structures within the graph, which can capture repetitions of short patterns, forming building blocks within larger repeated structures.

Exact or Approximate Matching. Next to searching for exact matches, approximate matching is also of great interest to musical pattern finding. Rhythmic, melodic and many other conceivable variations are likely to occur, such as the insertion of ornamentations during a repetition, the speeding up or slowing down of a musical sequence, deviations in pitch, or transpositions.

Several ways to define approximate matching for musical pattern finding have been proposed. Approximate matching algorithms typically distinguish between approximate matches and irrelevant patterns based on a threshold on a given similarity measure. This can be the number of allowed mismatches, referred to as k-mismatch, the Hausdorff-distance, which refers to the Euclidian distance between sets of points [54], or the length of the longest common subsequence [40]. If musical pieces are represented as sequences of numbers (e.g. pitch values), it is also possible to define a threshold on the difference between values of two compared sequences – \(\delta \)-matching – or on the sum of all differences between the values in two compared strings – \(\gamma \)-matching – as suggested by Clifford and Iliopoulos [10]. However, to our knowledge, \(\delta \)-matching and \(\gamma \)-matching have not been used for musical pattern finding to this date.

Furthermore, the threshold can also be defined as a maximum amount of edit operations in an alignment algorithm (also known as edit distance or Levenshtein distance) [41]. This way, also strings of different length, or strings containing gaps in relation to each other, can be considered as approximate matches. Giraud et al. [19] use edit distance as a similarity measure for Bach fugues. For the comparison of Mozart themes and variations, Giraud and Cambouropoulos [18] extend the alignment metric with another operation, fragmentation. Fragmentation was introduced by Mongeau and Sankoff [46] so that musical embellishments cause smaller edit distances between patterns than with the classical definition of edit distance.

In a generalisation for patterns repeated on different time scales, alignment algorithms have also been developed further so as to consider patterns shifted by time constants or ratios. This approach, generally referred to as dynamic time warping, has been applied, among others, by Laitinen and Lemström [35] to a database of music from different genres, and by Laaksonen [34] in order to find augmentations and diminuations of fugue subjects.

To our knowledge, only Rolland [53] applied approximate matching to musical pattern discovery, using the Levenshtein distance to compare a pattern with a match candidate. Implicitly, however, approximate matching can also be achieved through more abstract music representations, as Cambouropoulos et al. [8] point out. We will discuss this in more detail below.

2.2 Music Representation

There are different musical dimensions to be considered for comparisons of musical patterns: rhythm, pitch, but also dynamics, timbre, and many more. Moreover, there are different conceivable abstraction levels at which to represent these dimensions: in terms of absolute values; in terms of categories or classes; in terms of contours indicating only the direction of change; among others. This is closely related to Conklin’s [13] notion of musical viewpoints.

A glance at the music representation column reveals that the majority of the studies on musical pattern finding use pitch or pitch intervals as the music representation, in some of the studies this is combined with rhythmic representations such as note onset or duration.

Chou et al. [9] extract a more abstract representation from the musical surface: they represent polyphonic musical pieces as chords, which are automatically extracted from simultaneous pitches, even though the paper does not report the results of such an approach. Similarly, Arzt et al. [1] match patterns with a symbolic fingerprint that is derived from pitch values sampled at specific time intervals.

Three studies [8, 37, 53] suggest multiple music representations. Rolland [53] allows the users of his FlExPat software to switch between different music representations, but he does not report how this influences the results of his musical pattern discovery algorithm. Cambouropoulos et al. [8] suggest to compare a pitch interval representation with a more abstract step-leap representation, but results of these two representations are not discussed by the authors. Lee and Chen [37] state that pitch and duration can be either represented in independent suffix trees, or in a combined suffix tree which contains tuples of pitch and duration values. They do not find either approach satisfying, so they suggest two new indexing structures, Twin Suffix Trees, and Grid-Twin Suffix Trees, as possible alternatives. They do not report any results produced by these different representations.

2.3 Filtering

A frequently described problem in musical pattern discovery is the great amount of algorithmically discovered patterns as compared to the patterns that would be considered relevant by a human analyst. For the task of computer-aided motivic analysis, Marsden recently observed that “... the mathematical and computational approaches find many more motives and many more relationships between fragments than traditional motivic analysis.” [43].

Therefore, most of the presented studies employ a filtering step, which is supposed to separate the wheat from the chaff. One approach is to consider only relatively long patterns. Cambouropoulos [7] filters according to pattern lengths. Likewise, Collins’ compactness trawling to refine the result of Structure Induction Algorithms [12] is based on such a filtering, in which patterns of adjacent notes are preferred over patterns with many intervening notes.

Another frequently employed filtering method is based on the assumption that patterns which occur more often might also be considered as more important by human analysts, and is applied in several studies [7, 53].

Conklin and Anagnostopoulou [15] are also interested in a pattern’s frequency of occurrence, but weigh it against its frequency in a collection of contrasting music pieces, the anticorpus. This process is designed to favour patterns that are characteristic for the analyzed piece. Conversely, also patterns which are characteristically not represented in specific pieces or genres can be interesting for music researchers [14].

Nieto and Farbod filter according to Gestalt rules during the search process, which means that pattern candidates containing relatively long notes or rests, or relatively large intervals will be rejected [47].

Lartillot’s detection of maximally specific patterns [36] filters for patterns which occur in different musical domains: a pattern which repeats the rhythm as well as the pitch of another pattern is considered as more specific than one that repeats in one of these domains alone.

It is also possible to employ several filtering parameters, and adjust their relative weights using an optimisation algorithm. Meek and Birmingham [44] take this approach. They let the algorithm select those filtering parameters which give the best agreement between the discovered patterns and Barlow and Morgenstern’s Dictionary of Musical Themes [2].

Lemström et al. [39] apply seven different filtering methods to their pattern matching results: these filters are based on the relationships between the points in the query patterns (intra-pattern vectors). The least frequent intra-pattern vectors are considered the best candidates for meaningful matches.

2.4 Reference Data

Some of the presented studies use annotated musical patterns to evaluate the results of their musical pattern discovery algorithm, which we will denote as reference data. Such reference data ranges from overviews of frequently used licks in jazz improvisation [48] to themes in Western art music [2]. This reference data is typically assembled by domain specialists, annotating what they consider the most relevant patterns of the analyzed music collection.

Such reference data can serve as a ground truth to evaluate musical pattern discovery: if there is a good agreement between an algorithm and a reference, the algorithm emulates human judgements on relevant patterns, and might therefore be more useful for tasks such as assisting music analysis for this genre.

2.5 Evaluation

Many studies discussed here evaluate qualitatively: the discovered patterns are scrutinized. Typically, selected examples are presented to the reader in this case. If there is no reference against which the results can be compared, the researchers’ and readers’ judgements form the post hoc reference.

For available reference data, several researchers evaluate the discovered patterns quantitatively by counting the number of agreements between algorithm and reference data [12, 18, 19, 44, 47, 56].

Sometimes, the reference data takes the form of a segmentation. In this case, the starting and end positions of algorithmically discovered patterns can be compared with segmentation boundaries [6, 7].

In a number of the studies [34, 35, 37], the researchers aimed for fast solutions, which make the algorithms more interesting for practical use. Therefore, computation speed is used as an evaluation metric in these cases. This does not give an indication of the usefulness of the automatically found patterns, however.

Several quantitative evaluation measures have been suggested for musical pattern finding, but there seems to be no common methodology among the reported studies. This is one of the points to which we will turn in the next sections, in which we will discuss the current challenges in the field of musical pattern discovery.

3 Challenges of Musical Pattern Finding

We have now observed that in various studies, different approaches to musical pattern discovery have been proposed. There are some common challenges which can be derived from these observations, which will be discussed below, following the same order as the previous section.

3.1 Methods

The approaches to musical pattern finding presented above all aim to find specific kinds of patterns, in specific genres and for specific applications. This is a good starting point, as a restriction to one part of the research field enables researchers to formulate more concrete questions, and to interpret results more easily.

However, it is crucial to know how different methods compare to each other, and to answer questions such as the following: how many relevant and irrelevant patterns does each algorithm find in relation to given reference data? Are there specific advantages of a string-based over a geometric approach for a given pattern finding task, or vice versa? In short: which musical pattern finding method performs best as compared against a given set of reference data?

3.2 Music Representation

As of yet, there is no systematic analysis of the influence of the music representation on musical pattern finding. The presented studies use various representations, but they do not directly evaluate the influence of the music representation on the musical pattern discovery results.

This is another challenge of musical pattern finding: do more abstract representations lead to more irrelevant discovered patterns? Or is some abstraction desired for the description of some musical parameters? Which musical dimensions are important for a given musical genre? Which music representation approximates best the judgements on repeating patterns of a human listener, as manifested in reference data?

3.3 Filtering

We have also seen manifold filtering criteria applied in various studies. Again, the influence of different filtering parameters on the quality of musical pattern finding has not yet been systematically investigated.

Are the most frequent patterns the ones human listeners would judge to be the most relevant? Or is it, conversely, just the unique patterns which capture our attention? Which filtering criteria match those of a human annotator of a given reference annotation most closely?

Insights on the influence of filtering are not only needed to improve musical pattern finding methods; filtering criteria of a computational method can also serve as a model of the criteria human listeners apply when they isolate themes, motifs, or other salient musical patterns in musical pieces.

3.4 Reference Data

As the preceding paragraphs show, there is a great need for references of human judgements on repeated patterns in music.

The literature overview pointed out a number of resources which can be used as reference data for musical pattern finding. As this reference data is very diverse, and highly subjective, matching the results of a computational method closely to one set of reference data does not necessarily imply that the whole problem of musical pattern finding has been solved.

This is another challenge of musical pattern finding: will an algorithm that performs well on finding licks in jazz improvisation also perform well for finding themes in Western art music? What exactly is the data contained in a reference annotation?

3.5 Evaluation

Further advances in repeated melodic pattern finding research hinges on a good evaluation of different methods. Many of the above studies report some successful results, yet how do we know these results are not only the grains the proverbial blind hen happens to find?

One very essential approach is of course a qualitative evaluation, in which discovered patterns are compared to those in reference data. However, for large corpora this is unfeasible, and it is hard to report the relative success of a method using qualitative evaluation, unless the lengthy analysis would be reported.

We showed above that some studies use quantitative measures, derived from the number of agreements between algorithmically found patterns and patterns in reference data. Yet the definition of an agreement is problematic: do we only consider patterns which match those in a reference exactly, or should there also be a (penalized) score for partial matches?

Collins et al. suggest to allow a certain number of differences between algorithmically discovered or matched, and reference patterns [12]. A window around the start and end position of a pattern, allowing patterns to be counted as matches which are slightly shorter or longer than the reference patterns, might be another worthwhile approach. Recently, Collins [11] suggested a cardinality score as a similarity measure, which expresses the amount of the overlap between patterns. From this, several potentially interesting measures can be derived (see [11] for details).

So far, it is not clear whether some of these evaluation measures might be too tolerant, including too many patterns as agreements between algorithm and reference, or too strict, and register almost no patterns as matches between algorithm and reference. In both extremes, the performances of different methods, music representations and filtering techniques would probably be hard to assess.

Finding an effective, unified methodology of evaluation represents another challenge to the field of musical pattern finding.

4 Perspectives of Musical Pattern Finding

From the above mentioned challenges, we conclude important perspectives for future research on musical pattern finding. These perspectives relate to the influence different methods, different music representations and different filtering techniques have on the results of musical pattern discovery, what we can learn from annotated patterns in reference data, and which evaluation measures should be used.

4.1 Comparison of Methods

In order to assess different methods for repeated melodic pattern finding, it is indispensable to compare them in a more direct way, even if they have been designed to find specific kinds of patterns, or to find patterns in a specific genre. We suggested in our overview a taxonomy of different approaches to pattern finding, which helps to put the many different methods in perspective.

Ideally, next to the qualitative comparison of results, also a quantitative comparison should take place, which necessitates that methods be tested on the same corpus, and compared against the same reference data. The recently introduced MIREX track on musical pattern discovery [11] is an important step into this direction. Similar incentives to compare approaches in musical pattern matching would be highly desirable.

4.2 Comparison of Music Representations

We have noted the importance of understanding the influence of music representation on musical pattern finding. Future research can amend this gap of knowledge by systematically comparing results from different representations to reference data. This way, we can learn more about the musical dimensions, abstraction levels and concepts of similarity that human analysts rely on for finding musical patterns.

Results from experimental studies (e.g. [17, 29, 55]) on music perception and recall should also be taken into account for choosing music representations. They provide theories which can be employed and tested by musical pattern finding. The comparison of musical pattern finding using different music representations with human annotations and with perceptual theories will generate insights which can feed back into research on similarity and variation in music theory and music cognition.

4.3 Comparison of Filtering Techniques

The systematic investigation of filtering techniques is also marked out as a fruitful direction of future research. By applying different filtering criteria and comparing the resulting patterns to reference data, we can better understand what conditions a pattern needs to fulfill in order to be considered relevant by human analysts.

Moreover, insights from research on long-term musical salience [4] can lead to models of how human listeners filter musical patterns according to salience or relevance. Musical pattern finding can benefit from and contribute to this research area in music cognition.

4.4 Reference Data

We have shown that there is a need for reference data of musical patterns annotated by human listeners. At the same time, the available reference data is a challenge in itself, as different references are based on subjective judgements of human analysts who annotated different kinds of patterns, working in different genres.

Researchers in musical pattern finding can treat their algorithms as models of human analysis, and through relative successes and failures of these models, we can understand better which criteria underlie human judgements on repeated musical patterns.

Therefore, another perspective of this research area is the study of reference data itself by comparison to computational results. Such a study will promote knowledge on the concept of musical repetition, as applied by different human listeners in different genres. Such knowledge is important for various disciplines of music research interested in the nature of musical repetitions.

4.5 Towards an Evaluation Methodology

We infer that the best way to quantitatively evaluate musical pattern finding algorithms consists of a combination of the several proposed measures. In combination, these different evaluation measures should give a reasonable impression of the respective successes of different approaches.

Many of the presented studies have performed a qualitative analysis of selected patterns. This should remain an indispensable evaluation step: for musical pattern discovery, patterns which quantitatively correspond to reference data exceptionally well, but also those which correspond exceptionally badly should be investigated qualitatively. In musical pattern matching, those patterns should be investigated which are found by the algorithm, but have not been annotated by human listeners, and those which have been annotated, but were overseen by the algorithm. Only with such an accompanying qualitative analysis can the behaviour of a pattern finding algorithm be truly understood.

Quantitative and qualitative methods can also be combined in a learning process, by which quantitative measures are fitted to qualitative judgements. This method has been applied to improve the automatic generation of musical patterns by Pearce and Wiggins [50]. A use of qualitative judgements in combination with different quantitative measures will not only contribute to the improvement of musical pattern finding, but will also greatly benefit related research in other fields, e.g. music cognition, musicology and music theory, as notions of what constitutes a repetition are quantified.

5 Conclusion

Our literature overview has shown that between the many approaches to musical pattern finding, there are important challenges left unaddressed. There is a need to investigate how methods developed for a specific goal generalize to other tasks within musical pattern finding, and how different methods perform on the same reference data. Moreover, the influence of different music representations and filtering on the results of musical pattern finding algorithms is not yet unravelled, which demands further research. The different idiosyncrasies of reference annotations, and their semantics are poorly understood. Finally, no standardized evaluation measure for musical pattern finding has yet been established, which we suggest to overcome by combining various quantitative measures with a qualitative evaluation.

As stated in the introduction, much is to be gained for diverse music research disciplines from musical pattern finding. Some successes have already been achieved, which makes further research in this field an intriguing effort to pursue. This thorough investigation of the state of the art in musical pattern finding, its challenges and perspectives, should help to mark out the field in which further fruitful research can take place.