Klavan, J.; Divjak, D. (2016)
Publisher: De Gruyter
Languages: English
Types: Article
Usage-based linguistics abounds with studies that use statistical classification models to analyse either textual corpus data or behavioral experimental data. Yet, before we can draw conclusions from statistical models of empirical data that we can feed back into cognitive linguistic theory, we need to assess whether the text-based models are cognitively plausible and whether the behavior-based models are linguistically accurate. In this paper, we review four case studies that evaluate statistical classification models of richly annotated linguistic data by explicitly comparing the performance of a corpus-based model to the behavior of native speakers. The data come from four different languages (Arabic, English, Estonian, and Russian) and pertain to both lexical as well as syntactic near-synonymy. We show that behavioral evidence is needed in order to fine-tune and improve statistical models built on data from a corpus. We argue that methodological pluralism and triangulation are the keys for a cognitively realistic linguistic theory.