A Paradigmatic Path for Statistical Content Analysis Using an Integrated Package of Textual Data Treatment

  • Conference paper
Classification and Data Analysis

Abstract

This paper sketches the different phases of text treatment and links them both to some lexical characteristics of the analysed corpus and to multidimensional techniques useful for its statistical content analysis. Our proposal aims to keep intact the system of meanings present in the corpus and to improve the degree of monosemy of words. In this way, a corpus vocabulary of mixed units of analysis is obtained.

Copyright information

© 1999 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Bolasco, S., Morrone, A., Baiocchi, F. (1999). A Paradigmatic Path for Statistical Content Analysis Using an Integrated Package of Textual Data Treatment. In: Vichi, M., Opitz, O. (eds) Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60126-2_30

  • DOI: https://doi.org/10.1007/978-3-642-60126-2_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65633-3

  • Online ISBN: 978-3-642-60126-2

  • eBook Packages: Springer Book Archive
