Abstract
The recognition component of the SUNDIAL system has been developed jointly by Logica Cambridge, Erlangen University, Cselt, Daimler-Benz Ulm, Cap-Gemini Innovation and Politecnico di Torino. It acts as an Acoustic Front-End, performing the feature extraction and acoustic-phonetic decoding stages. For the feature extraction stage, several speech processing algorithms were tested and compared by means of RSA (Recognition Sensitivity Analysis) [13], in terms of both recognition performance and speaker sensitivity. The recogniser was intended for use over the telephone network, so the problems of high dynamic range and spectral drift of the signal were addressed: energy and cepstrum normalisation procedures were introduced to improve robustness in real telecommunications environments. The acoustic-phonetic stage was based on hidden Markov model (HMM) technology with Discrete (Italian), Continuous (all languages) and Semi-Continuous (German) paradigms. Speech units were selected on a phoneme basis with context dependency. The SUNDIAL recogniser has been tested with both read and spontaneous speech. For a typical continuous-speech, speaker-independent task based on read input, recognition performance was about 80% Word Accuracy over a 1000-word vocabulary, without linguistic constraints. The recognition component has been embedded in real-time prototypes built on the overall SUNDIAL architecture, which also includes linguistic processing, dialogue management, access to the information system, message generation and text-to-speech synthesis. Four prototypes are also being tested with spontaneous dialogues obtained from naive speakers in two application environments: flight enquiry and reservation in English and French, and train information access in German and Italian.
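The abstract does not say which energy and cepstrum normalisation procedures were used. As an illustration only, the sketch below shows one common form of each: per-utterance cepstral mean subtraction, which cancels a fixed channel response because such a response appears as a constant offset in the cepstral domain, and peak normalisation of frame log-energies, which removes differences in overall signal level. The function names and the per-utterance scope are assumptions, not details taken from the chapter.

```python
from typing import List, Sequence


def cepstral_mean_normalise(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean from every cepstral coefficient.

    Hypothetical helper: a fixed channel adds the same offset to every
    frame, so removing the mean removes that offset.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[k] for frame in frames) / n for k in range(dim)]
    return [[frame[k] - means[k] for k in range(dim)] for frame in frames]


def normalise_log_energy(log_energy: Sequence[float]) -> List[float]:
    """Shift frame log-energies so that the loudest frame sits at 0.

    Hypothetical helper: makes the energy feature independent of the
    overall level of the call.
    """
    if not log_energy:
        return []
    peak = max(log_energy)
    return [e - peak for e in log_energy]


# Two cepstral frames, then the same frames seen through a channel that
# adds a constant offset to each coefficient.
clean = [[1.0, 2.0], [3.0, 6.0]]
shifted = [[c0 + 0.5, c1 - 0.25] for c0, c1 in clean]

clean_norm = cepstral_mean_normalise(clean)
shifted_norm = cepstral_mean_normalise(shifted)
energy_norm = normalise_log_energy([-3.0, -1.0, -2.0])
```

In this toy case `clean_norm` and `shifted_norm` are identical, which is the property that makes mean subtraction attractive for telephone speech. A real-time system would more likely use a running estimate of the mean and peak rather than waiting for the whole utterance.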
References
P. Baggia, C. Rullent “Partial parsing as a Robust Parsing Strategy” ICASSP’93, Minneapolis (USA), 1993, pp. 123–126
A. Ciaramella, D. Clementino, R. Pacifici “Real-Time Speaker-Independent Large-Vocabulary CDHMM-based Continuous Telephonic Speech Recognizer” ICSLP’ 92, pp. 89–92, Banff (Canada), October 1992
F. Class, A. Kaltenmeier, P. Regel-Brietzmann. “Fast Speaker Adaptation combined with Soft Vector Quantization in an HMM Speech Recognition System”, ICASSP 92, Vol. I, pp. 461–464, S. Francisco (USA), March 1992
L. Fissore, P. Laface, G. Micca, G. Sperto. “Channel Adaptation for a Continuous Speech Recognizer”, ICSLP 92, pp. 1495–1498, Banff (Canada), October 1992
L. Fissore, E. Giachin, P. Laface, G. Micca “Selection of Speech Units for a Speaker-Independent CSR Task” Eurospeech ’91, pp. 1389–1392, Genova (Italy), 1991
L. Fissore, P. Laface, P. Massafra, F. Ravera “Analysis and Improvement of the Partial Distance Search Algorithm” ICASSP’93, pp. 315–318, Minneapolis (USA), 1993
E. Gerbino, M. Danieli “Managing Dialogue in a Continuous Speech Understanding System” Eurospeech ’93, Berlin (Germany), 1993
P. Heisterkamp, S. McGlashan, N. J. Youd “Dialogue semantics for an oral dialogue system” ICSLP, pp. 643–646, Banff (Canada), 12-16 October 1992
G.T. Niedermair “Linguistic modelling in the context of oral dialogue” ICSLP, pp. 635–638, Banff (Canada), 1992
J. Peckham. “Speech understanding and dialogue over the telephone: an overview of progress in the SUNDIAL project”, Proc. DARPA Workshop, pp. 14–27, Pacific Grove (CA), 1991
E.G. Schukat-Talamazzini, H. Niemann, W. Eckert, T. Kuhn, S. Rieck “Acoustic Modelling of subword units in the ISADORA speech recognizer” ICASSP ’92, pp. 577–580, S. Francisco (USA), 1992
E.G. Schukat-Talamazzini, M. Bielecki, H. Niemann, T. Kuhn, S. Rieck “A Non-Metrical Space Search Algorithm for Fast Gaussian Vector Quantization” ICASSP’ 93, pp. 688–691, Minneapolis (USA), 1993
T.J. Thomas, J. Peckham, E. Frangoulis. “A Determination of the Sensitivity of Speech Recognisers to Speaker Variability” ICASSP 89, pp. 544–548, Glasgow, Scotland, UK, May 1989
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Charpentier, F., Micca, G., Schukat-Talamazzini, E., Thomas, T. (1995). The Recognition Component of the SUNDIAL Project. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_53
DOI: https://doi.org/10.1007/978-3-642-57745-1_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-63344-7
Online ISBN: 978-3-642-57745-1
eBook Packages: Springer Book Archive