Abstract
Automatic speech recognition (ASR) technology can be integrated into an information retrieval process to enable search over multimedia content. To ensure adequate retrieval performance, however, it is necessary to assess the quality of the recognition phase, especially in speaker-independent and domain-independent environments. This paper introduces a methodology for evaluating different speech recognition systems in several scenarios. It also considers the creation of new corpora of different types (broadcast news, interviews, etc.), especially in languages other than English that are not widely addressed by the speech community.
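The abstract does not say which metric is used to judge recognition quality. As an illustration only, the sketch below computes word error rate (WER), a widely used ASR measure defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example, not part of the paper's methodology.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level Levenshtein distance / reference length.

    Assumes whitespace tokenization and no text normalization
    (case, punctuation), which a real evaluation would usually add.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is dropped by the hypothetical recognizer.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. Comparing systems across corpora, as the abstract proposes, would also require applying the same text normalization to every transcript before scoring.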
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
González, M., Moreno, J., Martínez, J.L., Martínez, P. (2013). An Illustrated Methodology for Evaluating ASR Systems. In: Detyniecki, M., García-Serrano, A., Nürnberger, A., Stober, S. (eds) Adaptive Multimedia Retrieval. Large-Scale Multimedia Retrieval and Evaluation. AMR 2011. Lecture Notes in Computer Science, vol 7836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37425-8_3
DOI: https://doi.org/10.1007/978-3-642-37425-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37424-1
Online ISBN: 978-3-642-37425-8
eBook Packages: Computer Science (R0)