Measuring Linguistic Complexity: Introducing a New Categorial Metric

Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018)

Part of the book series: Studies in Computational Intelligence (SCI, volume 860)

Abstract

This paper provides a computable quantitative measure that accounts for the difficulty humans experience in processing sentences: why is one sentence harder to parse than another? Why is one reading of a sentence easier than another? We take for granted psycholinguistic results on human processing complexity, such as those of Gibson. We define a new metric that uses Categorial Proof Nets to correctly model Gibson’s account in his Dependency Locality Theory. The proposed metric correctly predicts performance phenomena such as structures with embedded pronouns, garden paths, unacceptable center embeddings, the preference for lower attachment, and the acceptability of passive paraphrases. Our proposal moves closer to modern computational psycholinguistic theories, and it opens the door to including semantic complexity, thanks to the straightforward syntax-semantics interface in categorial grammars.

Notes

  1. These linguistic phenomena are described in detail in Sects. 4 and 5.

  2. Here we are stricter than in other articles: we allow \(\otimes\) neither of positive formulas nor of negative formulas, because we use only the \(\mathbin{\backslash}\) and \(\mathbin{/}\) symbols in categories (and not \(\otimes\)). Only combining heterogeneous polarities guarantees that a positive formula is a category and that a negative formula is the negation of a category.

  3. This list is redundant: for instance, intuitionism plus acyclicity implies connectedness.

  4. The same procedure can show the increasing complexity of examples (1a)–(1c) by drawing the relevant proof nets. We omit this here owing to space limitations and because these examples are simpler than the running examples.

  5. Following Lambek [12], we have assigned the category \(S / (np \backslash S)\) to the pronoun I. Note that assigning np instead, which is not a type-shifted category, would not change our numeric analysis at all.

  6. It is worth mentioning that DLT-based complexity profiling cannot account for two linguistic phenomena: Multiple-Quantifier Sentences and Heavy Noun-Phrase Shift. For more details on the problem with Multiple-Quantifier Sentences and a possible solution, see [20, Chaps. 5 and 7].

References

  1. Chomsky, N.: Aspects of the Theory of Syntax. MIT Press, Cambridge (1965)

  2. Moot, R., Retoré, C.: Natural language semantics and computability. J. Log. Lang. Inf. (to appear; preliminary version: arXiv:1605.04122)

  3. Johnson, M.E.: Proof nets and the complexity of processing center-embedded constructions. In: Retoré, C. (ed.) Special Issue on Recent Advances in Logical and Algebraic Approaches to Grammar. J. Log. Lang. Inf. 7(4), 433–447. Kluwer (1998)

  4. Morrill, G.: Incremental processing and acceptability. Comput. Linguist. 26(3), 319–338 (2000)

  5. Gibson, E.: The dependency locality theory: a distance-based theory of linguistic complexity. In: Image, Language, Brain, pp. 95–126 (2000)

  6. Bever, T.G.: The cognitive basis for linguistic structures. In: Cognition and the Development of Language (1970)

  7. Gibson, E., Thomas, J.: The processing complexity of English center-embedded and self-embedded structures. In: University of Massachusetts (ed.) The proceedings of the North-Eastern Linguistic Society 1996 (1996)

  8. Kimball, J.: Seven principles of surface structure parsing in natural language. Cognition 2(1), 15–47 (1973)

  9. Gibson, E.A.F.: A computational theory of human linguistic processing: memory limitations and processing breakdown. Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA (1991)

  10. Gibson, E.: Linguistic complexity: locality of syntactic dependencies. Cognition 68(1), 1–76 (1998)

  11. Chomsky, N.: Some Concepts and Consequences of the Theory of Government and Binding, vol. 6. MIT Press (1982)

  12. Lambek, J.: The mathematics of sentence structure. Am. Math. Mon. 65(3), 154–170 (1958)

  13. Moot, R., Retoré, C.: The logic of categorial grammars: a deductive account of natural language syntax and semantics, LNCS, vol. 6850. Springer (2012). http://www.springer.com/computer/theoretical+computer+science/book/978-3-642-31554-1

  14. Girard, J.Y.: Linear logic. Theor. Comput. Sci. 50, 1–102 (1987)

  15. Retoré, C.: Calcul de Lambek et logique linéaire. Traitement Automatique des Langues 37(2), 39–70 (1996)

  16. Roorda, D.: Proof nets for Lambek calculus. Logic and Computation 2(2), 211–233 (1992)

  17. Warren, T., Gibson, E.: The effects of discourse status on intuitive complexity: Implications for quantifying distance in a locality-based theory of linguistic complexity. In: Poster presented at the Twelfth CUNY Sentence Processing Conference, New York (1999)

  18. Moot, R.: Proof nets for linguistic analysis. Ph.D. thesis, UIL-OTS, Universiteit Utrecht (2002)

  19. Gibson, E., Ko, K.: An integration-based theory of computational resources in sentence comprehension. In: Fourth Architectures and Mechanisms in Language Processing Conference, University of Freiburg, Germany (1998)

  20. Mirzapour, M.: Modeling preferences for ambiguous utterance interpretations. (Modélisation de préférences pour l’interprétation d’énoncés ambigus). Ph.D. thesis, University of Montpellier, France (2018). https://tel.archives-ouvertes.fr/tel-01908642

  21. Chomsky, N.: Aspects of the Theory of Syntax, vol. 11. MIT Press (2014)

  22. Chatzikyriakidis, S., Pasquali, F., Retoré, C.: From logical and linguistic generics to Hilbert’s tau and epsilon quantifiers. IfCoLog J. Log. Their Appl. 4(2), 231–255 (2017)

  23. Hilbert, D.: Die logischen Grundlagen der Mathematik. Mathematische Annalen 88(1), 151–165 (1922)

  24. Vasishth, S., et al.: Quantifying processing difficulty in human language processing. In Rama Kant Agnihotri and Tista Bagchi (2005)

  25. Lewis, R.L., Vasishth, S.: An activation-based model of sentence processing as skilled memory retrieval. Cogn. Sci. 29(3), 375–419 (2005)

  26. Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S., Lebiere, C., Qin, Y.: An integrated theory of the mind. Psychol. Rev. 111(4), 1036 (2004)

  27. Catta, D., Mirzapour, M.: Quantifier scoping and semantic preferences. In: Proceedings of the Computing Natural Language Inference Workshop (2017)

  28. Mirzapour, M.: Finding missing categories in incomplete utterances. In: 24e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), p. 149

  29. Lafourcade, M., Mery, B., Mirzapour, M., Moot, R., Retoré, C.: Collecting weighted coercions from crowd-sourced lexical data for compositional semantic analysis. In: New Frontiers in Artificial Intelligence, JSAI-isAI 2017. Lecture Notes in Computer Science, vol. 10838, pp. 214–230. Springer (2018)

  30. Cooper-Martin, E.: Measures of cognitive effort. Mark. Lett. 5(1), 43–56 (1994)

  31. Engonopulos, N., Sayeed, A., Demberg, V.: Language and cognitive load in a dual task environment. In: Proceedings of the Annual Meeting of the Cognitive Science Society, vol. 35 (2013)

  32. Perfetti, C.A.: Reviewed work: Linguistic Complexity and Text Comprehension: Readability Issues Reconsidered by Alice Davison, Georgia M. Green. Language 65(3), 643–646 (1989)

  33. Kramsch, C.: Why is everyone so excited about complexity theory in applied linguistics. Mélanges Crapel 33, 10–24 (2012)

  34. Blache, P.: A computational model for linguistic complexity. In: Proceedings of the first International Conference on Linguistics, Biology and Computer Science (2011)

  35. Blache, P.: Evaluating language complexity in context: new parameters for a constraint-based model. In: CSLP-11, Workshop on Constraint Solving and Language Processing (2011)

Acknowledgements

We are grateful to Philippe Blache for an insightful discussion at our lab and for the inspiration we drew from his papers [34, 35]. We also thank our colleague Richard Moot for his numerous valuable comments on this work.

Author information

Corresponding author

Correspondence to Mehdi Mirzapour.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this chapter

Mirzapour, M., Prost, J.-P., Retoré, C. (2020). Measuring Linguistic Complexity: Introducing a New Categorial Metric. In: Loukanova, R. (eds) Logic and Algorithms in Computational Linguistics 2018 (LACompLing2018). Studies in Computational Intelligence, vol 860. Springer, Cham. https://doi.org/10.1007/978-3-030-30077-7_5
