Languages: English
Types: Doctoral thesis
Subjects: P1
Theoretical arguments based on the "poverty of the stimulus" have denied a priori the possibility that abstract linguistic representations can be learned inductively from exposure to the environment, given that the linguistic input available to the child is both underdetermined and degenerate. I reassess such learnability arguments by exploring a) the type and amount of statistical information implicitly available in the input in the form of distributional and phonological cues; b) psychologically plausible inductive mechanisms for constraining the search space; c) the nature of linguistic representations, algebraic or statistical. To do so I use three methodologies: experimental procedures, linguistic analyses based on large corpora of naturally occurring speech and text, and computational models implemented in computer simulations.

In Chapters 1, 2, and 5, I argue that long-distance structural dependencies - traditionally hard to explain with simple distributional analyses based on n-gram statistics - can indeed be learned associatively provided the amount of intervening material is either highly variable or invariant (the Variability effect). In Chapter 3, I show that simple associative mechanisms instantiated in Simple Recurrent Networks can replicate the experimental findings under the same conditions of variability. Chapter 4 presents successes and limits of such results across perceptual modalities (visual vs. auditory) and perceptual presentation (temporal vs. sequential), as well as the impact of long and short training procedures. In Chapter 5, I show that generalisation to abstract categories from stimuli framed in non-adjacent dependencies is also modulated by the Variability effect. In Chapter 6, I show that the putative separation of algebraic and statistical styles of computation based on successful speech segmentation versus unsuccessful generalisation experiments (as published in a recent Science paper) is premature and is the effect of a preference for phonological properties of the input. In Chapter 7, computer simulations of learning irregular constructions suggest that it is possible to learn from positive evidence alone, despite Gold's celebrated arguments on the unlearnability of natural languages. Evolutionary simulations in Chapter 8 show that irregularities in natural languages can emerge from full regularity and remain stable across generations of simulated agents. In Chapter 9 I conclude that the brain may be endowed with a powerful statistical device for detecting structure, generalising, segmenting speech, and recovering from overgeneralisations. The experimental and computational evidence gathered here suggests that statistical language learning is more powerful than heretofore acknowledged in the current literature.
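As a toy illustration of the kind of distributional computation the abstract appeals to (not code from the thesis), the following Python sketch segments a syllable stream by positing word boundaries wherever forward transitional probability dips, in the spirit of Saffran, Newport & Aslin (1996). The mini-lexicon, stream, and threshold are invented for the example.

```python
# Toy illustration (not from the thesis): statistical speech segmentation
# by forward transitional probabilities. The syllable inventory, stream,
# and threshold below are invented for the example.
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next | current) from adjacent-pair counts."""
    pairs = Counter(zip(syllables, syllables[1:]))
    firsts = Counter(syllables[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

def segment(syllables, threshold=0.9):
    """Posit a word boundary wherever transitional probability dips below threshold."""
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Two "words" (tupiro, golabu) concatenated in varied order: within-word
# transitional probabilities are 1.0, cross-boundary ones are lower.
lexicon = [["tu", "pi", "ro"], ["go", "la", "bu"]]
order = [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
stream = [s for i in order for s in lexicon[i]]
print(segment(stream))
```

With a uniform n-gram statistic like this, dips in transitional probability recover the word boundaries even though the stream contains no pauses; the Variability effect discussed above concerns what happens when such adjacent statistics are insufficient and dependencies span variable intervening material.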

    • Braine, M. (1987). What is learned in acquiring word-classes: A step toward an acquisition theory. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Lawrence Erlbaum.
    • Brighton, H. (2002). Compositional syntax from cultural transmission. Artificial Life, 8(1): 25-54.
    • Broeder, P., & Murre, J. (2000). Models of language acquisition: Inductive and deductive approaches. Oxford: Oxford University Press.
    • Brooks, P., & Tomasello, M. (1999). How young children constrain their argument structure constructions. Language, 75, 720-738.
    • Brooks, P., & Tomasello, M. (1999). Young children learn to produce passives with nonce verbs. Developmental Psychology, 35, 29-44.
    • Cartwright, T.A., & Brent, M.R. (1997). Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition, 63, 121-170.
    • Cassidy, K. W., & Kelly, M. H. (1991). Phonological information for grammatical category assignments. Journal of Memory and Language, 30, 348-369.
    • Cassidy, K. W., & Kelly, M. H. (2001). Children's use of phonology to infer grammatical class in vocabulary learning. Psychonomic Bulletin and Review, 8, 519-523.
    • Chater, N., & Vitányi, P. (2001). A simplicity principle for language learning: Re-evaluating what can be learned from positive evidence. Manuscript submitted for publication.
    • Chater, N. (1996). Reconciling simplicity and likelihood principles in perceptual organization. Psychological Review, 103, 566-581.
    • Chater, N. (1999). The search for simplicity: A fundamental cognitive principle? Quarterly Journal of Experimental Psychology, 52A, 273-302.
    • Childers, J., & Tomasello, M. (2001). The role of pronouns in young children's acquisition of the English transitive construction. Developmental Psychology, 37, 739-748.
    • Brooks, P., Tomasello, M., Lewis, L., & Dodson, K. (1999). Children's overgeneralization of fixed transitivity verbs: The entrenchment hypothesis. Child Development, 70, 1325-1337.
    • Chomsky, N. (1955). The Logical Structure of Linguistic Theory. Manuscript, Harvard University. Published by Plenum Press, New York and London, 1973.
    • Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton & Co.
    • Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
    • Chomsky, N. (1980). Rules and representations. Cambridge, MA: MIT Press.
    • Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
    • Christiansen, M. H., & Chater, N. (1994). Generalization and connectionist language learning. Mind and Language, 9, 273-287.
    • Christiansen, M. H., & Chater, N. (1999). Toward a connectionist model of recursion in human linguistic performance. Cognitive Science, 23, 157-205.
    • Christiansen, M.H., Conway, C.M., & Curtin, S. (2000). A connectionist single-mechanism account of rule-like behavior in infancy. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates.
    • Cleeremans, A., Servan-Schreiber, D., & McClelland, J.L. (1989). Finite state automata and simple recurrent networks. Neural Computation, 1, 372-381.
    • Content, A., Mousty, P., & Radeau, M. (1990). Brulex. Une base de données lexicales informatisée pour le français écrit et parlé. L'Année Psychologique, 90, 551-566.
    • Conway, C.M., & Christiansen, M. H. (2002a). Sequential Learning through Touch, Vision and Audition. Paper presented at the 24th Annual Conference of the Cognitive Science Society, Fairfax, VA.
    • Conway, C.M., & Christiansen, M. H. (2002b). Modality constrained statistical learning of spatial, spatiotemporal, and temporal input. Poster presented at the 43rd Annual Meeting of the Psychonomic Society, Kansas City, KS.
    • Culicover, P. (2000). Syntactic nuts. Oxford: Oxford University Press.
    • Culicover, P. W. (1995). Adaptive learning and concrete minimalism. Proceedings of GALA 95.
    • Daugherty, K., & Seidenberg, M. (1992). Rules or connections? The past tense revisited. Annual Conference of the Cognitive Science Society, 14, 259-264.
    • Dienes, Z. (1992). Connectionist and memory-array models of artificial grammar learning. Cognitive Science, 23, 53-82.
    • Dowman, M. (2000) Addressing the Learnability of Verb Subcategorizations with Bayesian Inference. In Gleitman, L. R. & Joshi, A. K. (Eds.) Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society. Mahwah, New Jersey, USA: Lawrence Erlbaum Associates.
    • Dulany, D.E., Carlson, R.A., & Dewey, G.I. (1984). A case of syntactical learning and judgement: How conscious and how abstract? Journal of Experimental Psychology: General, 113, 541-555.
    • Elman, J.L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71-99.
    • Elman, J.L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
    • Fiser, J., & Aslin, R.N. (2002). Statistical learning of higher-order temporal structure from visual shape-sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 458-467.
    • Flake, G.W. (1998). The computational beauty of nature. Cambridge, MA: MIT Press.
    • Frigo, L., & McDonald, K. L. (1998). Properties of phonological markers that affect the acquisition of gender-like subclasses. Journal of Memory and Language, 39, 218-245.
    • Gell-Mann, M. (1995). The quark and the jaguar: Adventures in the simple and the complex. New York: W.H. Freeman.
    • Gibson, E.J. (1991). An Odyssey in Learning and Perception. Cambridge, MA: MIT Press.
    • Gleitman, L. R., Gleitman, H., Landau, B. & Wanner, E. (1988). Where learning begins: Initial representations for language learning. In F.J. Newmeyer (Ed.), Linguistics: The Cambridge survey, Vol. 3, pp. 150--193.
    • Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447-474.
    • Goldberg, A. (1999). The emergence of the semantics of argument structure constructions. In B. MacWhinney (Ed.), The emergence of language.
    • Goldberg, A. (2003). Constructions: A new theoretical approach to language. Trends in Cognitive Sciences, 7, 219-224.
    • Goldsmith, J. (2001). Unsupervised Learning of the Morphology of a Natural Language. Computational Linguistics, 27(2): 153-198.
    • Gomez, R. (2002). Variability and detection of invariant structure. Psychological Science, 13, 431-436.
    • Gomez, R.L., & Gerken, L. A. (1999). Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge. Cognition, 70, 109-135.
    • Gomez, R.L., & Gerken, L. A. (2000). Infant artificial language learning and language acquisition. Trends in Cognitive Sciences, 4, 178-186.
    • Hahn, U., & Chater, N. (1998). Similarity and rules: Distinct? Exhaustive? Empirically distinguishable? Cognition, 65, 197-230.
    • Hahn, U., & Nakisa, R.C. (2000). German inflection: Single or dual route? Cognitive Psychology, 41, 313-360.
    • Hauser, M., Chomsky, N., & Fitch, W.T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569-1579.
    • Holender, D. (1986). Semantic activation without conscious identification in dichotic listening, parafoveal vision, and visual masking: A survey and appraisal. Behavioral and Brain Sciences, 9, 1-23.
    • Horning, J.J. (1969). A study of grammatical inference. PhD Thesis, Stanford University.
    • Hurford, J. (2000). The emergence of syntax. In C. Knight, M. Studdert-Kennedy, & J. Hurford (Eds.), The Evolutionary Emergence of Language: Social function and the origins of linguistic form, pp. 219-230. Cambridge University Press.
    • Hochberg, J., & McAlister, E. (1953). A quantitative approach to figural goodness. Journal of Experimental Psychology, 46, 361-364.
    • Jusczyk, P.W. (1999). How infants begin to extract words from speech. Trends in Cognitive Sciences, 3, 323-328.
    • Kelly, M.H. (1992). Using sound to solve syntactic problems: The role of phonology in grammatical category assignments. Psychological Review, 99, 349-364.
    • Kirby, S. (2001). Spontaneous evolution of linguistic structure: An iterated learning model of the emergence of regularity and irregularity. IEEE Transactions on Evolutionary Computation, 5(2), 102-110.
    • Kiss, G. (1973). Grammatical word classes: A learning process and its simulation. Psychology of Learning and Motivation, 7: 1-41.
    • Klavans, J.L., & Resnik, P. (1996). The Balancing Act: Combining Symbolic and Statistical Approaches to Language. Cambridge, MA: MIT Press.
    • Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems in Information Transmission, 1, 1-7.
    • Levin, B. (1993). English verb classes and alternations. Chicago: The University of Chicago Press.
    • Li, M., & Vitányi, P. (1997). An introduction to Kolmogorov complexity and its applications (2nd edition). Berlin: Springer Verlag.
    • Lord, C. (1979). Don't you fall me down: Children's generalizations regarding cause and transitivity. Papers and Reports on Child Language Development, 17. Stanford, CA: Stanford University Department of Linguistics.
    • Luce, R.D. (1963). Detection and recognition. In R.D. Luce, R.R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology. New York: Wiley.
    • MacKay, D.J.C., (1992). Information-based objective functions for active data selection. Neural Computation, 4,589-603.
    • MacWhinney, B. (1987). The Competition Model. In B. MacWhinney (Ed.), Mechanisms of language acquisition. Hillsdale, NJ: Lawrence Erlbaum.
    • MacWhinney, B. (1989). Competition and lexical categorization. In R. Corrigan, F. Eckman, & M. Noonan (Eds.), Linguistic categorization, pp. 195-242. New York: Benjamins.
    • MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). London: Lawrence Erlbaum.
    • Manning, C., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
    • Maratsos, M. P., & Chalkley, M.A. (1980). The internal language of children's syntax: The ontogenesis and representation of syntactic categories. In K. E. Nelson (Ed.), Children's Language, Volume 2, pp. 127-214. New York: Gardner Press.
    • Marchman, V., & Bates, E. (1994). Continuity in lexical and morphological development: A test of the critical mass hypothesis. Journal of Child Language, 21(2), 331-366.
    • Marcus, G.F. (1999). Do infants learn grammar with algebra or statistics? Science, 284, 436-437.
    • Marcus, G.F. (2001). The Algebraic Mind: Integrating Connectionism and Cognitive Science. Cambridge, MA: MIT Press.
    • Marcus, G.F., & Berent, I. (2003). Are there limits to statistical learning? Science, 300, 52-53.
    • Marcus, G.F., Vijayan, S., Bandi Rao, S., & Vishton, P.M. (1999). Rule learning by seven-month-old infants. Science, 283, 77-80.
    • McClelland, J.L. (1998). Connectionist models and Bayesian inference. In M. Oaksford & N. Chater (Eds.), Rational models of cognition. Oxford: Oxford University Press.
    • McClelland, J.L., & Plaut, D.C. (1999). Does generalisation in infant learning implicate abstract algebra-like rules? Trends in Cognitive Sciences, 3, 166-168.
    • Miller, G.A. (1967). Project Grammarama. In The Psychology of Communication: Seven Essays. Baltimore: Penguin.
    • Mintz, T. H. (2002). Category induction from distributional cues in an artificial language. Memory & Cognition, 30, 678-686.
    • Mintz, T.H., Newport, E., & Bever, T. G. (1995). Distributional regularities in speech to young children. In Proceedings of NELS, 25, 43-54.
    • Mintz, T.H., Newport, E.L., & Bever, T.G. (2002). The distributional structure of grammatical categories in speech to young children. Cognitive Science, 26, 393-424.
    • Monaghan, P., Chater, N., & Christiansen, M.H. (submitted). The differential contribution of phonological and distributional cues in grammatical categorisation.
    • Morgan, J.L., & Newport, E. (1981). The role of constituent structure in the induction of an artificial language. Journal of Verbal Learning and Verbal Behavior, 20, 67-85.
    • Newport, E., & Aslin, R.N. (2000). Innately constrained learning: Blending old and new approaches to language acquisition. In S.C. Howell, A. Fish, & T. Keith-Lucas (Eds.), Proceedings of the 24th Annual Boston University Conference on Language Development.
    • Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal data selection. Psychological Review, 101(4), 608-631.
    • Onnis, L., Roberts, M., & Chater, N. (2002). Simplicity: A cure for overgeneralizations in language acquisition? In W.D. Gray & C.D. Schunn (Eds.), Proceedings of the 24th Annual Conference of the Cognitive Science Society. London: LEA.
    • Onnis, L., Christiansen, M. H., Chater, N., & Gomez, R. (2003). Reduction of uncertainty in human sequential learning: Evidence from artificial grammar learning. Proceedings of the 25th Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates.
    • Peña, M., Bonatti, L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298, 604-607.
    • Perruchet, P., & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264-275.
    • Pine, J.M., & Lieven, E.V.M. (1997). Slot-and-frame patterns and the development of the determiner category. Applied Psycholinguistics, 18, 123-138.
    • Pinker, S. (1989). Learnability and Cognition: The Acquisition of Argument Structure. Cambridge, MA: MIT Press.
    • Pinker, S. (1995). The language instinct. Harmondsworth: Penguin.
    • Pinker, S. (1999). Words and rules: The ingredients of language. New York: Basic Books.
    • Pizzuto, E., & Caselli, M.C. (1992). The acquisition of Italian morphology: Implications for models of language development. Journal of Child Language, 19, 491-557.
    • Plaut, D.C., McClelland, J.L., Seidenberg, M.S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56-115.
    • Plunkett, K., & Juola, P. (1999). A connectionist model of English past tense and plural morphology. Cognitive Science, 23(4), 463-490.
    • Pothos, E., & Chater, N. (2002). A simplicity principle in unsupervised human categorization. Cognitive Science, 26, 303-343.
    • Prasada, S., & Pinker, S. (1993). Generalizations of regular and irregular morphology. Language and Cognitive Processes, 8, 1-56.
    • Quinlan, J.R., & Rivest, R. (1989). Inferring decision trees using the minimum description length principle. Information and Computation, 80, 227-248.
    • Reber, A. S. (1969). Transfer of syntactic structure in synthetic languages. Journal of Experimental Psychology, 81, 115-119.
    • Redington, M., & Chater, N. (2002). Knowledge representation and transfer in artificial grammar learning (AGL). In R. M. French & A. Cleeremans (Eds.), Implicit learning and consciousness: An empirical, philosophical, and computational consensus in the making. Psychology Press.
    • Redington, M., Chater, N., and Finch, S. (1998). Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science, 22(4): 425-469.
    • Rissanen, J. (1987). Stochastic complexity. Journal of the Royal Statistical Society, Series B, 49, 223-239.
    • Rissanen, J. (1989). Stochastic complexity and statistical inquiry. Singapore: World Scientific.
    • Rumelhart, D.E., & McClelland, J.L. (1986). On learning the past tense of English verbs. In J.L. McClelland, D.E. Rumelhart and the PDP Research Group Parallel distributed processing: Explorations in the microstructure of cognition. Vol 2: Psychological and biological models, pp.216-271.
    • Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.
    • Saffran, J.R., Aslin, R.N., & Newport, E.L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.
    • Saffran, J.R., Johnson, E.K., Aslin, R.N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.
    • Schvaneveldt, R.W., & Gomez, R.L. (1998). Attention and probabilistic sequence learning. Psychological Research, 61, 175-190.
    • Seidenberg, M. S., & McClelland, J.L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
    • Seidenberg, M. S., MacDonald, M.C., & Saffran, J.R. (2002). Does grammar start where statistics stop? Science, 298, 553-554.
    • Seidenberg, M.S., MacDonald, M.C., & Saffran, J.R. (2003). Response to Marcus and Berent. Science, 300, 53.
    • Servan-Schreiber, D., Cleeremans, A. & McClelland, J.L. (1991). Graded State Machines: The representation of temporal contingencies in simple recurrent networks. Machine Learning, 7,161-193.
    • Servan-Schreiber, E., & Anderson, J.R. (1990). Learning artificial grammars with competitive chunking. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16,592-608.
    • Shannon, C. E. (1951). Prediction and entropy of printed English. Bell System Technical Journal, 30, 50-64.
    • Shannon, C.E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423 and 623-656.
    • Smith, K. H. (1966). Grammatical intrusions in the recall of structured letter pairs: Mediated transfer or position learning? Journal of Experimental Psychology, 72, 580-588.
    • Stolcke, A. (1994). Bayesian learning of probabilistic language models. PhD thesis, University of California, Berkeley.
    • Teal, T.K. and Taylor, C.E. (2000). Effects of Compression on Language Evolution. Artificial Life, 6 (2): 129-143.
    • Tomasello, M. (2000). First steps towards a usage-based theory of language acquisition. Cognitive Linguistics, 11, 61-82.
    • Tomasello, M. (2000). The item-based nature of children's early syntactic development. Trends in Cognitive Sciences, 4, 156-163.
    • Tomasello, M. (2003). Constructing a Language: A Usage-Based Theory of Language Acquisition. Harvard University Press.
    • Van der Helm, P.A., & Leeuwenberg, E.L.J. (1996). Goodness of visual regularities: A nontransformational approach. Psychological Review, 103(3), 429-456.
    • Vokey, J.R., & Brooks, L.R. (1992). Salience of item knowledge in learning artificial grammar. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 328-344.
    • Wallace, C.S., & Freeman, P.R. (1987). Estimation and inference by compact coding. Journal of the Royal Statistical Society, Series B, 49(3), 240-265.
    • Wolff, J. (1982). Language acquisition, data compression and generalization. Language & Communication, 2, 57-89.
    • Wolff, J. (1991). Towards a Theory of Cognition and Computing. Chichester: Ellis Horwood.
    • Zipf, G. K. (1949). Human behavior and the principle of least effort. Reading, MA: Addison-Wesley.
    • Zuidema, W. (2003). How the poverty of the stimulus solves the poverty of the stimulus. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15. Cambridge, MA: MIT Press.
