Abstract
Laryngealizations are irregular voiced portions of speech that can have morpho-syntactic functions and can disturb the automatic computation of F0. This paper describes two methods for detecting them automatically. The first is a Gaussian classifier using spectral and cepstral features, which achieved a recognition rate of 80% at a false alarm rate of 8%. The second is a “non-standard” alternative: an artificial neural network (ANN) performs non-linear inverse filtering of the speech signal, and the inversely filtered signal is used directly as input to a second ANN trained to detect laryngealizations. In preliminary experiments this second method obtained a recognition rate of 65% (12% false alarms).
This work was supported by the German Ministry for Research and Technology (BMFT) in the joint research project ASL/VERBMOBIL and by the Deutsche Forschungsgemeinschaft (DFG). Only the authors are responsible for the contents.
Experiments with learn≠test (speaker-independent, leave-one-out) showed the same tendencies.
Voice source signals (figure 49.1) were measured with a laryngograph.
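The abstract reports each detector as a pair of figures: a recognition rate and a false alarm rate. The following is a minimal sketch of how a frame-wise Gaussian classifier and these two rates could be computed. It is not the authors' implementation. The diagonal covariance, the class labels, the toy two-dimensional features, and the exact definitions of the two rates (detected laryngealized frames over all laryngealized frames, and falsely flagged frames over all other frames) are assumptions made for illustration; the chapter itself may define them differently.

```python
import math

def fit_gaussian(frames):
    """Estimate mean and variance per feature dimension for one class.
    Assumption: diagonal covariance (the source does not state the form)."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so that log() and division stay well defined.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(frame, model):
    """Log density of one feature vector under a diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))

def classify(frame, models, log_priors):
    """Pick the class with the highest log posterior (up to a constant)."""
    return max(models,
               key=lambda c: log_likelihood(frame, models[c]) + log_priors[c])

def detection_rates(truth, predicted, target="laryngealized"):
    """Return (recognition rate, false alarm rate) for the target class.
    Assumes both the target class and at least one other class occur."""
    pairs = list(zip(truth, predicted))
    hits = sum(t == target and p == target for t, p in pairs)
    false_alarms = sum(t != target and p == target for t, p in pairs)
    n_target = sum(t == target for t in truth)
    n_other = len(truth) - n_target
    return hits / n_target, false_alarms / n_other

# Toy training data: hypothetical 2-dimensional feature vectors per class.
models = {
    "regular": fit_gaussian([[0.0, 0.0], [0.2, -0.1], [-0.1, 0.1]]),
    "laryngealized": fit_gaussian([[5.0, 5.0], [5.2, 4.9], [4.9, 5.1]]),
}
log_priors = {"regular": math.log(0.5), "laryngealized": math.log(0.5)}

# Rates for a hand-made labelling: 1 of 2 targets found, 1 of 4 others flagged.
truth = ["laryngealized", "laryngealized",
         "regular", "regular", "regular", "regular"]
predicted = ["laryngealized", "regular",
             "laryngealized", "regular", "regular", "regular"]
print(detection_rates(truth, predicted))  # → (0.5, 0.25)
```

Reporting both rates together matters because either one can be improved trivially at the expense of the other, for example by shifting the class priors.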
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
Cite this paper
Kießling, A., Kompe, R., Niemann, H., Nöth, E., Batliner, A. (1995). Voice Source State as a Source of Information in Speech Recognition: Detection of Laryngealizations. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_49
DOI: https://doi.org/10.1007/978-3-642-57745-1_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-63344-7
Online ISBN: 978-3-642-57745-1
eBook Packages: Springer Book Archive