Abstract
This work proposes a multimodal approach to percussion music transcription from audio and video recordings. It is part of an ongoing research effort to develop tools for computer-aided analysis of Candombe drumming, a popular Afro-rooted rhythm from Uruguay. Several signal processing techniques are applied to automatically extract meaningful information from each source. For the video stream, this involves detecting certain relevant objects in the scene. The temporal location of events is obtained from the audio signal, and this information drives the processing of both modalities. The detected events are then classified by combining the information from each source in a feature-level fusion scheme. The experiments conducted yield promising results that show the advantages of the proposed method.
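As an illustration of what feature-level fusion means in this setting, the following minimal Python sketch concatenates per-event audio and video feature vectors and classifies the fused vectors. It is not the authors' implementation: the function names, the toy feature values, the integer class labels, and the nearest-centroid classifier are all assumptions chosen to keep the example short.

```python
import numpy as np


def fuse_features(audio_feats, video_feats):
    """Feature-level (early) fusion: concatenate the per-event feature vectors."""
    audio_feats = np.asarray(audio_feats, dtype=float)
    video_feats = np.asarray(video_feats, dtype=float)
    if audio_feats.shape[0] != video_feats.shape[0]:
        raise ValueError("both modalities must describe the same events")
    return np.hstack([audio_feats, video_feats])


def fit_centroids(X, y):
    """Compute one mean vector per class (hypothetical stand-in classifier)."""
    y = np.asarray(y)
    classes = np.unique(y)
    centroids = np.vstack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(X, classes, centroids):
    """Assign each fused vector to the class with the nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


# Toy data (assumed): four detected events, two audio features and one
# video feature per event, and two hypothetical stroke classes 0 and 1.
audio_train = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
video_train = [[0.0], [0.1], [1.0], [0.9]]
labels = [0, 0, 1, 1]

X_train = fuse_features(audio_train, video_train)
classes, centroids = fit_centroids(X_train, labels)

# A new event is described by both modalities and classified after fusion.
X_new = fuse_features([[0.9, 0.1]], [[0.05]])
predicted = predict(X_new, classes, centroids)
```

The point of fusing at the feature level is that a single classifier sees both modalities at once, as opposed to training one classifier per modality and combining their decisions afterwards.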
This work was supported by funding agencies CSIC and ANII from Uruguay.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Marenco, B., Fuentes, M., Lanzaro, F., Rocamora, M., Gómez, A. (2015). A Multimodal Approach for Percussion Music Transcription from Audio and Video. In: Pardo, A., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2015. Lecture Notes in Computer Science, vol 9423. Springer, Cham. https://doi.org/10.1007/978-3-319-25751-8_12
DOI: https://doi.org/10.1007/978-3-319-25751-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25750-1
Online ISBN: 978-3-319-25751-8
eBook Packages: Computer Science, Computer Science (R0)