Abstract
A cover version is an alternative rendition of a previously recorded song. Because a cover may differ from the original song in timbre, tempo, structure, key, arrangement, or language of the vocals, automatically identifying cover songs in a given music collection is a difficult task. The music information retrieval (MIR) community has paid much attention to this task in recent years, and many approaches have been proposed. This chapter comprehensively summarizes the work done in cover song identification and covers the background related to this area of research. The most promising strategies are reviewed and qualitatively compared under a common framework, and their evaluation methodologies are critically assessed. A discussion of the remaining open issues and future lines of research closes the chapter.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Serrà, J., Gómez, E., Herrera, P. (2010). Audio Cover Song Identification and Similarity: Background, Approaches, Evaluation, and Beyond. In: Raś, Z.W., Wieczorkowska, A.A. (eds) Advances in Music Information Retrieval. Studies in Computational Intelligence, vol 274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11674-2_14
DOI: https://doi.org/10.1007/978-3-642-11674-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11673-5
Online ISBN: 978-3-642-11674-2
eBook Packages: Engineering, Engineering (R0)