The Recognition Component of the SUNDIAL Project

Charpentier, F.; Micca, G.; Schukat-Talamazzini, E.; Thomas, T.

doi:10.1007/978-3-642-57745-1_53

Part of the book series: NATO ASI Series (NATO ASI F, volume 147)

Abstract

The recognition component of the SUNDIAL system has been developed jointly by Logica Cambridge, Erlangen University, Cselt, Daimler-Benz Ulm, Cap-Gemini Innovation and Politecnico di Torino; the device acts as an Acoustic Front-End, performing the feature extraction and the acoustic-phonetic decoding stages. For the feature extraction stage, several speech processing algorithms were tested and compared by means of RSA (Recognition Sensitivity Analysis) [13], in terms of both recognition performance and speaker sensitivity. The recogniser was intended to be used over the telephone network; therefore, the problems of high dynamic range and spectral drift in the signal were addressed. To this end, energy and cepstrum normalisation procedures were introduced to improve robustness in real telecommunication (TLC) environments. The acoustic-phonetic stage was based on HMM technology with Discrete (Italian), Continuous (all languages) and Semi-Continuous (German) paradigms. Speech units were selected on a phoneme basis with context dependency. Tests on the SUNDIAL recogniser have been performed with both read and spontaneous speech. Recognition performance for a typical continuous-speech, speaker-independent task, based on read input, scored about 80% Word Accuracy over a 1000-word vocabulary, without linguistic constraints. The recognition component has been embedded in real-time prototypes built on the overall SUNDIAL architecture, which also includes linguistic processing, dialogue management, access to the information system, message generation and text-to-speech synthesis functions. Four prototypes are also being tested with spontaneous dialogues obtained from naive speakers, in two different application environments: access to flight enquiry and reservation in English and French, and train information access in German and Italian.
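
The abstract reports Word Accuracy without defining it. As an illustration only, the sketch below assumes the commonly used definition, (N − S − D − I) / N, where N is the number of reference words and S, D and I are the substitutions, deletions and insertions in a minimum-edit-distance alignment of the recognised word string against the reference. The function name `word_accuracy` and the example sentences are hypothetical; nothing here is taken from the SUNDIAL scoring tools.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word Accuracy = (N - S - D - I) / N under a minimum-edit-distance alignment.

    Assumption: whitespace tokenisation and unit cost for every error type.
    The result can be negative when the hypothesis contains many insertions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    errors = prev[-1]  # S + D + I for the best alignment
    return (len(ref) - errors) / len(ref)


if __name__ == "__main__":
    # Hypothetical flight-enquiry utterance: one substitution, one deletion.
    ref = "show flights from london to paris"
    hyp = "show flight from london paris"
    print(f"{word_accuracy(ref, hyp):.3f}")
```

Because insertions are counted as errors, this measure is stricter than the simple percentage of reference words recognised correctly.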

References

[1] P. Baggia, C. Rullent. “Partial Parsing as a Robust Parsing Strategy”, ICASSP ’93, pp. 123–126, Minneapolis (USA), 1993
[2] A. Ciaramella, D. Clementino, R. Pacifici. “Real-Time Speaker-Independent Large-Vocabulary CDHMM-based Continuous Telephonic Speech Recognizer”, ICSLP ’92, pp. 89–92, Banff (Canada), October 1992
[3] F. Class, A. Kaltenmeier, P. Regel-Brietzmann. “Fast Speaker Adaptation combined with Soft Vector Quantization in an HMM Speech Recognition System”, ICASSP ’92, Vol. I, pp. 461–464, S. Francisco (USA), March 1992
[4] L. Fissore, P. Laface, G. Micca, G. Sperto. “Channel Adaptation for a Continuous Speech Recognizer”, ICSLP ’92, pp. 1495–1498, Banff (Canada), October 1992
[5] L. Fissore, E. Giachin, P. Laface, G. Micca. “Selection of Speech Units for a Speaker-Independent CSR Task”, Eurospeech ’91, pp. 1389–1392, Genova (Italy), 1991
[6] L. Fissore, P. Laface, P. Massafra, F. Ravera. “Analysis and Improvement of the Partial Distance Search Algorithm”, ICASSP ’93, pp. 315–318, Minneapolis (USA), 1993
[7] E. Gerbino, M. Danieli. “Managing Dialogue in a Continuous Speech Understanding System”, Eurospeech ’93, Berlin (Germany), 1993
[8] P. Heisterkamp, S. McGlashan, N. J. Youd. “Dialogue semantics for an oral dialogue system”, ICSLP ’92, pp. 643–646, Banff (Canada), 12–16 October 1992
[9] G.T. Niedermair. “Linguistic modelling in the context of oral dialogue”, ICSLP ’92, pp. 635–638, Banff (Canada), 1992
[10] J. Peckham. “Speech understanding and dialogue over the telephone: an overview of progress in the SUNDIAL project”, Proc. DARPA Workshop, pp. 14–27, Pacific Grove (CA), 1991
[11] E.G. Schukat-Talamazzini, H. Niemann, W. Eckert, T. Kuhn, S. Rieck. “Acoustic Modelling of subword units in the ISADORA speech recognizer”, ICASSP ’92, pp. 577–580, S. Francisco (USA), 1992
[12] E.G. Schukat-Talamazzini, M. Bielecki, H. Niemann, T. Kuhn, S. Rieck. “A Non-Metrical Space Search Algorithm for Fast Gaussian Vector Quantization”, ICASSP ’93, pp. 688–691, Minneapolis (USA), 1993
[13] T.J. Thomas, J. Peckham, E. Frangoulis. “A Determination of the Sensitivity of Speech Recognisers to Speaker Variability”, ICASSP ’89, pp. 544–548, Glasgow, Scotland, UK, May 1989

Author information

Authors and Affiliations

Logica, Cambridge, UK
T. Thomas
Cselt, Turin, Italy
G. Micca
Erlangen University, Germany
E. Schukat-Talamazzini
Cap-Gemini Innovation, Paris, France
F. Charpentier

Editor information

Editors and Affiliations

Department of Electronics and Technology of Computers Faculty of Sciences, University of Granada, E-18071, Granada, Spain
Antonio J. Rubio Ayuso & Juan M. López Soler

Cite this paper

Charpentier, F., Micca, G., Schukat-Talamazzini, E., Thomas, T. (1995). The Recognition Component of the SUNDIAL Project. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_53

DOI: https://doi.org/10.1007/978-3-642-57745-1_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-63344-7
Online ISBN: 978-3-642-57745-1
eBook Packages: Springer Book Archive
