DIMEx100: A New Phonetic and Speech Corpus for Mexican Spanish

  • Conference paper
Advances in Artificial Intelligence – IBERAMIA 2004 (IBERAMIA 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3315)

Abstract

This paper presents DIMEx100, a phonetic and speech corpus for Mexican Spanish. We discuss both the linguistic motivation and the computational tools employed in the design, collection, and transcription of the corpus. The phonetic transcription methodology is based on recent empirical studies that propose a new basic set of allophones and phonological rules for the dialect of central Mexico. These phonological rules have been implemented in a visualization tool that provides the expected phonetic representation of a text, together with a default temporal alignment between the spoken corpus and its phonetic representation. The tools are also used to compute the properties of the corpus and to compare these figures with previous work.
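As a rough illustration of the two ideas in the abstract (rules that map text to an expected phonetic representation, and a default temporal alignment), the following Python sketch may help. It is not the authors' tool: the rule tables, the phone symbols, and the function names `transcribe` and `default_alignment` are hypothetical, and the equal-duration alignment is only an assumption about what a "default" alignment could look like. The handful of rules shown (silent "h", "c" before "e" or "i" read as "s", "z" read as "s", and so on) are common textbook facts about Mexican Spanish spelling, not the allophone set or rule set proposed in the paper.

```python
# Toy letter-to-phone rules. These are illustrative only and are NOT the
# allophone set or phonological rules described in the paper.
DIGRAPHS = {"ch": "tS", "ll": "Z", "qu": "k"}   # hypothetical phone symbols
SINGLE = {"h": "", "v": "b", "z": "s"}          # "h" is silent; "z" is read as "s"


def transcribe(word):
    """Return a list of phone symbols for one orthographic word."""
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # Two-letter spellings are checked before single letters.
            phones.append(DIGRAPHS[pair])
            i += 2
            continue
        ch = word[i]
        if ch == "c":
            # Context-dependent rule: "c" before "e" or "i" is "s", else "k".
            nxt = word[i + 1:i + 2]
            phone = "s" if nxt in ("e", "i") else "k"
        else:
            phone = SINGLE.get(ch, ch)
        if phone:  # skip silent letters
            phones.append(phone)
        i += 1
    return phones


def default_alignment(phones, duration):
    """Assumed default: give every phone an equal share of the utterance.

    Returns (phone, start_seconds, end_seconds) tuples.
    """
    if not phones:
        return []
    step = duration / len(phones)
    return [(p, i * step, (i + 1) * step) for i, p in enumerate(phones)]


if __name__ == "__main__":
    phones = transcribe("hacer")
    for phone, start, end in default_alignment(phones, 1.0):
        print(f"{phone}\t{start:.2f}\t{end:.2f}")
```

In a workflow like the one the abstract describes, an equal-share alignment of this kind would only be a starting point that a human transcriber then corrects against the audio.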

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pineda, L.A., Pineda, L.V., Cuétara, J., Castellanos, H., López, I. (2004). DIMEx100: A New Phonetic and Speech Corpus for Mexican Spanish. In: Lemaître, C., Reyes, C.A., González, J.A. (eds) Advances in Artificial Intelligence – IBERAMIA 2004. IBERAMIA 2004. Lecture Notes in Computer Science, vol 3315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30498-2_97

  • DOI: https://doi.org/10.1007/978-3-540-30498-2_97

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23806-5

  • Online ISBN: 978-3-540-30498-2

  • eBook Packages: Springer Book Archive
