Abstract
Mobile service robots in human environments need versatile abilities to perceive and interact with their surroundings. Spoken language is a natural way to interact with a robot in general and to instruct it in particular. However, most existing speech recognition systems suffer from the high environmental noise present in the target domain, and adapting them to reach the desired accuracy requires in-depth knowledge of the underlying theory. We propose and evaluate an architecture for a robust, speaker-independent speech recognition system that uses off-the-shelf technology and simple additional methods. We first use close speech detection to segment closed utterances, which eases the recognition process. By further combining a finite state grammar (FSG) based decoder with an N-gram based decoder, we reduce false positive recognitions while achieving high accuracy.
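The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual algorithms, so everything here is an assumption made for illustration: the fixed energy threshold, the frame counts, the word-overlap rule, and the function names `segment_utterances` and `accept_command` are all hypothetical. The decoder hypotheses are passed in as plain strings, so no particular speech toolkit is implied.

```python
from typing import List, Optional, Sequence, Tuple


def segment_utterances(
    frame_energies: Sequence[float],
    threshold: float,
    min_speech_frames: int = 3,
    max_silence_frames: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges (end exclusive) of closed utterances.

    Assumption: a frame counts as speech when its energy reaches a fixed
    threshold. An utterance is closed once more than `max_silence_frames`
    quiet frames follow it; very short bursts are discarded as noise.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    silence = 0
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i          # first loud frame opens an utterance
            silence = 0
        elif start is not None:
            silence += 1
            if silence > max_silence_frames:
                end = i - silence + 1   # index after the last loud frame
                if end - start >= min_speech_frames:
                    segments.append((start, end))
                start, silence = None, 0
    if start is not None:               # input ended inside an utterance
        end = len(frame_energies) - silence
        if end - start >= min_speech_frames:
            segments.append((start, end))
    return segments


def accept_command(
    fsg_hypothesis: Optional[str],
    ngram_hypothesis: str,
    min_overlap: float = 0.8,
) -> Optional[str]:
    """Accept the FSG hypothesis only if the N-gram hypothesis supports it.

    Assumption: "support" is the fraction of FSG words that also occur in
    the N-gram hypothesis. This is a stand-in rule, not the paper's method.
    """
    fsg_words = (fsg_hypothesis or "").lower().split()
    if not fsg_words:
        return None
    ngram_words = set(ngram_hypothesis.lower().split())
    matched = sum(1 for word in fsg_words if word in ngram_words)
    if matched / len(fsg_words) >= min_overlap:
        return fsg_hypothesis
    return None   # treated as a likely false positive
```

The design intent in this sketch is that the restrictive grammar decoder proposes a command, while the less constrained N-gram decoder acts as a cross-check: if the two disagree strongly, the utterance was probably not a valid command and is rejected.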
Keywords
- Linear Discriminant Analysis
- Speech Recognition
- Language Model
- False Recognition
- Speech Recognition System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Doostdar, M., Schiffer, S., Lakemeyer, G. (2009). A Robust Speech Recognition System for Service-Robotics Applications. In: Iocchi, L., Matsubara, H., Weitzenfeld, A., Zhou, C. (eds) RoboCup 2008: Robot Soccer World Cup XII. RoboCup 2008. Lecture Notes in Computer Science, vol 5399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02921-9_1
DOI: https://doi.org/10.1007/978-3-642-02921-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02920-2
Online ISBN: 978-3-642-02921-9
eBook Packages: Computer Science (R0)