The Recognition Component of the SUNDIAL Project

  • Conference paper
Speech Recognition and Coding

Part of the book series: NATO ASI Series ((NATO ASI F,volume 147))

Abstract

The recognition component of the SUNDIAL system has been developed jointly by Logica Cambridge, Erlangen University, Cselt, Daimler-Benz Ulm, Cap-Gemini Innovation and Politecnico di Torino. It acts as an acoustic front-end, performing the feature extraction and acoustic-phonetic decoding stages. For the feature extraction stage, several speech processing algorithms were tested and compared by means of RSA (Recognition Sensitivity Analysis) [13], in terms of both recognition performance and speaker sensitivity. The recogniser was intended for use over the telephone network, so the problems of high dynamic range and spectral drift of the signal had to be addressed. To this end, energy and cepstrum normalisation procedures were introduced to improve robustness in real telecommunications (TLC) environments. The acoustic-phonetic stage was based on HMM technology with Discrete (Italian), Continuous (all languages) and Semi-Continuous (German) paradigms. Speech units were selected on a phoneme basis with context dependency. The SUNDIAL recogniser has been tested with both read and spontaneous speech. For a typical continuous-speech, speaker-independent task based on read input, recognition performance was about 80% Word Accuracy over a 1000-word vocabulary, without linguistic constraints. The recognition component has been embedded in real-time prototypes built on the overall SUNDIAL architecture, which also includes linguistic processing, dialogue management, access to the information system, message generation and text-to-speech synthesis. Four prototypes are also being tested with spontaneous dialogues obtained from naive speakers in two application environments: flight enquiry and reservation in English and French, and train information access in German and Italian.
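
The abstract does not say which energy and cepstrum normalisation procedures were used. As a rough illustration only, the sketch below assumes two common choices: per-utterance cepstral mean subtraction, which cancels a fixed channel offset in the cepstral domain, and shifting log-energies so that the utterance peak sits at zero, which reduces the effect of level differences between calls. The function names and the list-of-frames representation are hypothetical and are not taken from the SUNDIAL system.

```python
from typing import List

Frame = List[float]  # one cepstral vector per analysis frame (assumed layout)


def normalise_cepstrum(frames: List[Frame]) -> List[Frame]:
    """Subtract the per-utterance mean from every cepstral coefficient.

    Assumption: a stationary telephone channel adds a constant offset to
    each coefficient, so removing the utterance mean cancels it.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - means[k] for k in range(dim)] for f in frames]


def normalise_energy(log_energy: List[float]) -> List[float]:
    """Shift log-energies so that the loudest frame has value 0.

    Assumption: only the energy contour relative to the utterance peak
    matters, not the absolute signal level.
    """
    if not log_energy:
        return []
    peak = max(log_energy)
    return [e - peak for e in log_energy]


if __name__ == "__main__":
    clean = [[1.0, 2.0], [3.0, 6.0]]
    # Same utterance seen through a hypothetical channel with a fixed offset.
    shifted = [[c0 + 10.0, c1 - 5.0] for c0, c1 in clean]
    print(normalise_cepstrum(clean) == normalise_cepstrum(shifted))
    print(normalise_energy([-30.0, -12.0, -20.0]))
```

In this toy case the clean and channel-shifted utterances map to the same normalised features, which is the property that makes this kind of normalisation attractive for telephone speech.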

References

  1. P. Baggia, C. Rullent, “Partial parsing as a Robust Parsing Strategy”, ICASSP ’93, pp. 123–126, Minneapolis (USA), 1993

  2. A. Ciaramella, D. Clementino, R. Pacifici, “Real-Time Speaker-Independent Large-Vocabulary CDHMM-based Continuous Telephonic Speech Recognizer”, ICSLP ’92, pp. 89–92, Banff (Canada), October 1992

  3. F. Class, A. Kaltenmeier, P. Regel-Brietzmann, “Fast Speaker Adaptation combined with Soft Vector Quantization in an HMM Speech Recognition System”, ICASSP ’92, Vol. I, pp. 461–464, San Francisco (USA), March 1992

  4. L. Fissore, P. Laface, G. Micca, G. Sperto, “Channel Adaptation for a Continuous Speech Recognizer”, ICSLP ’92, pp. 1495–1498, Banff (Canada), October 1992

  5. L. Fissore, E. Giachin, P. Laface, G. Micca, “Selection of Speech Units for a Speaker-Independent CSR Task”, Eurospeech ’91, pp. 1389–1392, Genova (Italy), 1991

  6. L. Fissore, P. Laface, P. Massafra, F. Ravera, “Analysis and Improvement of the Partial Distance Search Algorithm”, ICASSP ’93, pp. 315–318, Minneapolis (USA), 1993

  7. E. Gerbino, M. Danieli, “Managing Dialogue in a Continuous Speech Understanding System”, Eurospeech ’93, Berlin (Germany), 1993

  8. P. Heisterkamp, S. McGlashan, N. J. Youd, “Dialogue semantics for an oral dialogue system”, ICSLP ’92, pp. 643–646, Banff (Canada), 12–16 October 1992

  9. G.T. Niedermair, “Linguistic modelling in the context of oral dialogue”, ICSLP ’92, pp. 635–638, Banff (Canada), 1992

  10. J. Peckham, “Speech understanding and dialogue over the telephone: an overview of progress in the SUNDIAL project”, Proc. DARPA Workshop, pp. 14–27, Pacific Grove (CA), 1991

  11. E.G. Schukat-Talamazzini, H. Niemann, W. Eckert, T. Kuhn, S. Rieck, “Acoustic Modelling of subword units in the ISADORA speech recognizer”, ICASSP ’92, pp. 577–580, San Francisco (USA), 1992

  12. E.G. Schukat-Talamazzini, M. Bielecki, H. Niemann, T. Kuhn, S. Rieck, “A Non-Metrical Space Search Algorithm for Fast Gaussian Vector Quantization”, ICASSP ’93, pp. 688–691, Minneapolis (USA), 1993

  13. T.J. Thomas, J. Peckham, E. Frangoulis, “A Determination of the Sensitivity of Speech Recognisers to Speaker Variability”, ICASSP ’89, pp. 544–548, Glasgow, Scotland (UK), May 1989

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Charpentier, F., Micca, G., Schukat-Talamazzini, E., Thomas, T. (1995). The Recognition Component of the SUNDIAL Project. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_53

  • DOI: https://doi.org/10.1007/978-3-642-57745-1_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-63344-7

  • Online ISBN: 978-3-642-57745-1

  • eBook Packages: Springer Book Archive
