Abstract
In this chapter, we briefly characterize what we consider the best and most common structure that empirical corpus-linguistic papers can and should have. We first introduce the four major parts of a corpus-linguistic paper: “Introduction”, “Methods”, “Results”, and “Discussion”. Because the nature of corpus data and corpus techniques makes two of these sections very field-specific, we then focus on the “Methods” and “Results” sections of a typical quantitative corpus-linguistic paper. Our recommendations span the research cycle, from describing the data to analyzing the dataset and reporting the results of statistical tests.
Notes
1. This is also a means of bringing credit and recognition to all those involved in corpus compilation.
2. See Gries (in press) for more information about how to carry out the tasks of retrieval and annotation discussed above.
References
American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.
Berez-Kroeker, A., Gawne, L., Kung, S., et al. (2017). Reproducible research in linguistics: A position statement on data citation and attribution in our field. Linguistics, 56(1), 1–18.
BNC Consortium. (2001). The British National Corpus, version 2 (BNC World). Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/. Accessed 30 August 2019.
Branco, A., Cohen, K. B., Vossen, P., Ide, N., & Calzolari, N. (2017). Replicability and reproducibility of research results for human language technology: Introducing an LRE special section. Language Resources and Evaluation, 51(1), 1–5.
Cleveland, W., & McGill, R. (1985). Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716), 828–833.
Fox, J. (2003). Effect displays in R for generalised linear models. Journal of Statistical Software, 8(15), 1–27.
Fox, J., & Hong, J. (2009). Effect displays in R for multinomial and proportional-odds logit models: Extensions to the effects package. Journal of Statistical Software, 32(1), 1–24.
Fuoli, M., & Hommerberg, C. (2015). Optimising transparency, reliability and replicability: Annotation principles and inter-coder agreement in the quantification of evaluation expressions. Corpora, 10(3), 315–349.
Gries, S. Th. (2013). Statistics for linguistics with R (2nd rev. & ext. ed.). Boston/New York: De Gruyter Mouton.
Gries, S. Th. (2016a). Variationist analysis: Variability due to random effects and autocorrelation. In P. Baker & J. A. Egbert (Eds.), Triangulating methodological approaches in corpus linguistic research (pp. 108–123). New York: Routledge, Taylor and Francis.
Gries, S. Th. (2016b). Quantitative corpus linguistics with R (2nd rev. & ext. ed.). New York & London: Routledge, Taylor & Francis Group.
Gries, S. Th. (in press). Managing synchronic corpus data with the British National Corpus (BNC). In A. L. Berez-Kroeker, B. McDonnell, E. Koller, & L. Collister (Eds.), MIT open handbook of linguistic data management. Cambridge, MA: The MIT Press.
Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. Berlin/New York: Springer.
Loewen, S., & Plonsky, L. (2015). An A-Z of applied linguistics research methods. New York: Palgrave.
Marsden, E., Mackey, A., & Plonsky, L. (2016). The IRIS repository: Advancing research practice and methodology. In A. Mackey & E. Marsden (Eds.), Advancing methodology and practice: The IRIS repository of instruments for research into second languages (pp. 1–21). New York: Routledge.
Paquot, M., & Plonsky, L. (2017). Quantitative research methods and study quality in learner corpus research. International Journal of Learner Corpus Research, 3(1), 61–94.
Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35(4), 655–687.
Porte, G. (2012). Replication research in applied linguistics. Cambridge: Cambridge University Press.
Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. Proceedings of international conference on new methods in language processing, Manchester, UK.
Spooren, W., & Degand, L. (2010). Coding coherence relations: Reliability and validity. Corpus Linguistics and Linguistic Theory, 6(2), 241–266.
Tufte, E. (2001). The visual display of quantitative information (2nd ed.). Cheshire, CT: Graphics Press.
Wilkinson, L., & The Task Force on Statistical Inference. (1999). Statistical methods in psychology journals. American Psychologist, 54(8), 594–604.
Wulff, S., Gries, S. Th., & Lester, N. A. (2018). Optional that in complementation by German and Spanish learners: Where and how German and Spanish learners differ from native speakers. In A. Tyler, L. Huang, & H. Jan (Eds.), What does applied cognitive linguistics look like? Answers from the L2 classroom and SLA studies (pp. 97–118). Berlin & Boston: De Gruyter Mouton.
Zuur, A. F., Ieno, E. N., & Elphick, C. S. (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1(1), 3–14.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Gries, S.T., Paquot, M. (2020). Writing up a Corpus-Linguistic Paper. In: Paquot, M., Gries, S.T. (eds) A Practical Handbook of Corpus Linguistics. Springer, Cham. https://doi.org/10.1007/978-3-030-46216-1_26
DOI: https://doi.org/10.1007/978-3-030-46216-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46215-4
Online ISBN: 978-3-030-46216-1
eBook Packages: Religion and Philosophy, Philosophy and Religion (R0)