Abstract
This paper describes a number of criteria for archivable documentation of grammars of natural languages, extending the work of Bird and Simons’ “Seven dimensions of portability for language documentation and description.” We then describe a system for writing and testing morphological and phonological grammars of languages, a system which satisfies most of these criteria (where it does not, we discuss plans to extend the system).
The core of this system is based on an XML schema which allows grammars to be written in a stable and linguistically-based formalism, a formalism which is independent of any particular parsing engine. This core system also includes a converter program, analogous to a programming language compiler, which translates grammars written in this format, plus a dictionary, into the programming language of a suitable parsing engine (currently the Stuttgart Finite State Tools). The paper describes some of the decisions which went into the design of the formalism; for example, the decision to aim for observational adequacy, rather than descriptive adequacy. We draw out the implications of this decision in several areas, particularly in the treatment of morphological reduplication.
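The compiler analogy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the XML element and attribute names, the toy dictionary, and the `STEM`/`SUFFIX` output notation are hypothetical, and they reproduce neither the paper's actual schema nor the syntax of the Stuttgart Finite State Tools. The sketch shows only the separation between an engine-independent grammar description and a back end that emits engine-specific code.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar fragment. The element and attribute names are
# invented for this sketch and are NOT the schema described in the paper.
GRAMMAR_XML = """
<grammar lang="xx">
  <affix id="PL" type="suffix" form="s" gloss="+Pl"/>
  <affix id="PAST" type="suffix" form="ed" gloss="+Past"/>
</grammar>
"""

# Toy dictionary mapping stems to parts of speech (also hypothetical).
DICTIONARY = {"cat": "N", "walk": "V"}


def parse_grammar(xml_text):
    """Read the engine-independent grammar into plain Python dicts."""
    root = ET.fromstring(xml_text)
    return [dict(affix.attrib) for affix in root.iter("affix")]


def emit_toy_backend(affixes, dictionary):
    """Translate grammar plus dictionary into a made-up target notation.

    A real converter would emit code for an actual parsing engine; this
    notation exists only to show where the engine-specific step sits.
    """
    lines = [f"STEM {stem} {pos}" for stem, pos in sorted(dictionary.items())]
    lines += [f"SUFFIX {a['form']} {a['gloss']}" for a in affixes]
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_toy_backend(parse_grammar(GRAMMAR_XML), DICTIONARY))
```

Under this design, supporting a different parsing engine would mean writing another emit function, while the grammar file itself stays unchanged.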
We have used this system to produce formal grammars of Bangla, Urdu, Pashto, and Persian (Farsi), and we have derived parsers from those formal grammars. In the future we expect to implement similar grammars of other languages, including Dhivehi, Swahili, and Somali. In further work (briefly described in this paper), we have embedded formal grammars produced in this core system into traditional descriptive grammars of several of these languages. These descriptive grammars serve to document the formal grammars, and also provide automatically extractable test cases for the parser.
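As a rough illustration of how examples in a descriptive grammar could double as parser test cases, the following sketch extracts form and analysis pairs from a small XML document and checks them against a parser. The markup (`example`, `form`, `analysis`), the sample words, and the stand-in `toy_parse` function are all hypothetical; the paper's actual document format and parser interface may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptive-grammar fragment; the tag names are assumptions.
DESCRIPTIVE_GRAMMAR = """
<section>
  <para>Plural nouns take a suffix.</para>
  <example><form>cats</form><analysis>cat+N+Pl</analysis></example>
  <example><form>dogs</form><analysis>dog+N+Pl</analysis></example>
</section>
"""


def extract_test_cases(xml_text):
    """Collect (word form, expected analysis) pairs from example elements."""
    root = ET.fromstring(xml_text)
    return [(ex.findtext("form"), ex.findtext("analysis"))
            for ex in root.iter("example")]


def run_tests(cases, parse):
    """Return (form, expected, actual) for every case the parser gets wrong."""
    failures = []
    for form, expected in cases:
        actual = parse(form)
        if actual != expected:
            failures.append((form, expected, actual))
    return failures


def toy_parse(form):
    """Stand-in for a real parser: treats a final 's' as a noun plural."""
    if form.endswith("s"):
        return form[:-1] + "+N+Pl"
    return form + "+N"


if __name__ == "__main__":
    cases = extract_test_cases(DESCRIPTIVE_GRAMMAR)
    print(run_tests(cases, toy_parse))
```

An empty failure list means every documented example parses as the prose claims, so the description and the formal grammar can be kept in step.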
References
Beesley, K.R., Karttunen, L.: Finite State Morphology. University of Chicago Press, Chicago (2003)
Bird, S., Simons, G.: Seven dimensions of portability for language documentation and description. Language 79(3), 557–582 (2003)
Blevins, J.: A reconsideration of Yokuts vowels. International Journal of American Linguistics 70(1), 33–51 (2004)
Burnard, L., Bauman, S.: TEI P5: Guidelines for electronic text encoding and interchange (2013)
Chomsky, N.: Aspects of the Theory of Syntax. MIT Press, Cambridge (1965)
David, A., Maxwell, M.: Joint grammar development by linguists and computer scientists. In: IJCNLP, pp. 27–34. The Association for Computational Linguistics (2008)
Dieterman, J.I.: Secondary palatalization in Isthmus Mixe: a phonetic and phonological account. SIL International, Dallas (2008), http://www.sil.org/silepubs/Pubs/50951/50951_DietermanJ_Mixe_Palatalization.pdf
Halle, M.: Prolegomena to a theory of word formation. Linguistic Inquiry 4, 3–16 (1973)
Halle, M., Mohanan, K.P.: Segmental phonology of Modern English. Linguistic Inquiry 16(1), 57–116 (1985)
Hankamer, J.: Finite state morphology and left to right phonology. In: Proceedings of the Fifth West Coast Conference on Formal Linguistics. pp. 29–34 (1986)
Harris, J.W.: Two theories of non-automatic morphophonological alternations. Language: Journal of the Linguistic Society of America 54, 41–60 (1978)
Harris, Z.: Yokuts structure and Newman’s grammar. International Journal of American Linguistics 10, 196–211 (1944)
ISO TC37: Language resource management — Feature structures — Part 1: Feature structure representation (2006)
ISO TC37: Language resource management — Lexical markup framework, LMF (2008)
ISO TC37: Language resource management — Feature structures — Part 2: Feature system declaration (2011)
Karttunen, L.: The insufficiency of paper-and-pencil linguistics: the case of Finnish prosody. In: Kaplan, R.M., Butt, M., Dalrymple, M., King, T.H. (eds.) Intelligent Linguistic Architectures: Variations on Themes, pp. 287–300. CSLI Publications, Stanford (2006)
Knuth, D.E.: Literate Programming. Center for the Study of Language and Information, Stanford (1992)
Marantz, A.: Re reduplication. Linguistic Inquiry 13, 435–482 (1982)
Maxwell, M.: Electronic grammars and reproducible research. In: Nordoff, S., Poggeman, K.-L.G. (eds.) Electronic Grammaticography, pp. 207–235. University of Hawaii Press (2012)
Maxwell, M.: A Grammar Formalism for Computational Morphology (forthcoming)
Maxwell, M., David, A.: Interoperable grammars. In: Webster, J., Ide, N., Fang, A.C. (eds.) First International Conference on Global Interoperability for Language Resources (ICGL 2008), Hong Kong, pp. 155–162 (2008), http://hdl.handle.net/1903/11611
Newman, S.: The Yokuts Language of California. Viking Fund, New York (1944)
Rice, C., Blaho, S. (eds.): Modeling ungrammaticality in Optimality Theory. Advances in Optimality Theory. Equinox Press, London (2009)
Schmid, H.: A programming language for finite state transducers. In: Yli-Jyrä, A., Karttunen, L., Karhumäki, J. (eds.) FSMNLP 2005. LNCS (LNAI), vol. 4002, pp. 308–309. Springer, Heidelberg (2006)
Walsh, N.: DocBook 5: The Definitive Guide. O’Reilly, Sebastopol, California (2011), http://www.docbook.org/
Weber, D.J., Black, H.A., McConnel, S.R.: AMPLE: A Tool for Exploring Morphology. Summer Institute of Linguistics, Dallas (1988)
Weigel, W.F.: The interaction of theory and description: The Yokuts canon. Talk presented at the Annual Meeting of the Society for the Study of the Indigenous Languages of the Americas (2002)
Weigel, W.F.: Yowlumne in the Twentieth Century. Ph.D. thesis, University of California, Berkeley (2005)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maxwell, M. (2013). A System for Archivable Grammar Documentation. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2013. Communications in Computer and Information Science, vol 380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40486-3_5
DOI: https://doi.org/10.1007/978-3-642-40486-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40485-6
Online ISBN: 978-3-642-40486-3
eBook Packages: Computer Science (R0)