Abstract
In this work, we propose combining two modalities, handwriting and speech, to build a mathematical expression recognition system. Starting from two sub-systems, one per modality, we explore several fusion methods for resolving the ambiguities that naturally arise when each modality is processed independently. Results on the HAMEX bimodal database show an improvement over a mono-modal system.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Medjkoune, S., Mouchere, H., Petitrenaud, S., Viard-Gaudin, C. (2013). Multimodal Mathematical Expressions Recognition: Case of Speech and Handwriting. In: Kurosu, M. (eds) Human-Computer Interaction. Interaction Modalities and Techniques. HCI 2013. Lecture Notes in Computer Science, vol 8007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39330-3_9
DOI: https://doi.org/10.1007/978-3-642-39330-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39329-7
Online ISBN: 978-3-642-39330-3
eBook Packages: Computer Science, Computer Science (R0)