         Testing the Universal Grammar Hypothesis 
         1. Introduction 
           Perhaps the single most controversial claim in linguistic theory is that children learning their 
         native language face an induction problem, or in other words, that the available input 
         underspecifies the adult state. This induction problem is known by many names: the “Poverty of 
         the Stimulus” (e.g., Chomsky 1980a, Chomsky 1980b, Lightfoot 1989, Crain 1991), the “Logical 
         Problem of Language Acquisition” (e.g., Baker 1981, Hornstein & Lightfoot 1981), and “Plato’s 
         Problem” (e.g., Chomsky 1988, Dresher 2003). Regardless of the name, it all boils down to the 
         same claim: the data generally available to young children are compatible with multiple 
         hypotheses, or perhaps more correctly, data necessary to rule out the incorrect hypotheses are 
         either not available at all, or not available in sufficient quantity (Lightfoot 1982, Legate & Yang 
         2002, among many others).  
           The Universal Grammar (UG) hypothesis was introduced as a solution to this problem 
         (Chomsky 1957/1975, 1965). The logic of the UG hypothesis is straightforward: if the necessary 
         evidence for choosing the correct linguistic hypothesis is unavailable in the input, then children 
         must bring some internal bias to the language learning problem (Chomsky 1981, Hornstein & 
         Lightfoot 1981, Legate & Yang 2002, among many others). While the necessity of some kind of 
         bias is generally granted by even the most ardent critics of the UG hypothesis (e.g., Pullum & 
         Scholz 2002, Regier & Gahl 2004), the nature of the necessary biases is the subject of 
         considerable debate.  First, there is the question of what cognitive objects the bias operates over.  
         A bias might operate over the representations the child considers as hypotheses (e.g., parameters 
         of linguistic variation (Chomsky 1981)), the data the child learns from (e.g., only unambiguous 
         data (Fodor 1998)), or the learning algorithm the child uses to alter belief in competing 
         hypotheses (e.g., trigger-based learning (Gibson & Wexler 1994, Niyogi & Berwick 1996)).  
         Second, there is the question of whether the necessary bias is specific to language learning (i.e., 
         domain-specific) or applies generally to any kind of cognitive learning (domain-general). UG is 
         usually proposed as a collection of domain-specific biases ranging over both the representations 
         that children consider and the data that children learn from, but this is far from logically 
         necessary (e.g., Chomsky 1971, 1981, Kimball 1973, Baker 1978, Gordon 1986, Lightfoot 1991). 
           Recent debates about the UG hypothesis have tended to focus on two broad questions.  The 
         first concerns the existence of the induction problem (e.g., Sampson 1989, 1999, Pullum & 
         Scholz 2002, MacWhinney 2004, Tomasello 2004), which is, of course, the motivation for the 
         UG hypothesis. Until recently, the claim that children’s input lacks sufficient evidence for 
         successful language learning has been based on the intuitions of linguists rather than on large-
         scale empirical analyses of child-directed speech. However, without quantifiable evidence for 
         induction problems, there is no need for the UG hypothesis at all. In fact, Pullum and Scholz 
         (2002) have claimed just that: using data from the Wall Street Journal corpus (Linguistic Data 
         Consortium 1993) and the CHILDES database (MacWhinney 2000), they argue that there is no 
         evidence for an induction problem for several well-known linguistic phenomena in English such 
         as anaphoric one (Baker 1978) and yes-no questions involving complex subjects (Chomsky 
         1971). Even granting the existence of an induction problem for a given linguistic phenomenon, 
         the second broad question follows directly: what is the nature of the prior knowledge necessary to 
         solve that problem? More specifically, is the knowledge innate or derived from prior learning?  Is 
         the knowledge domain-specific or domain-general?  One could imagine any or all of the possible 
         combinations being applicable to various aspects of the linguistic system: some knowledge may 
         be innate and domain-general, some innate and domain-specific, some derived from domain-
         general knowledge acquired previously, and some derived from domain-specific knowledge 
         acquired previously. With the proliferation of possible types of prior knowledge, it is not clear 
         that a single type will be sufficient to solve all of the induction problems in language learning. In 
         fact, Tomasello (2004) takes this one step further: he argues that the proliferation of specific 
         suggestions for that prior knowledge in the theoretical literature has rendered the UG hypothesis 
         untestable through standard scientific falsification. He contends that it will not be possible to 
         evaluate the UG hypothesis until it is broken down into specific hypotheses about biases with 
         respect to specific linguistic phenomena.  
           The project we propose here aims to address both of these questions directly, and in the 
         process lay out a concrete methodology for testing the UG hypothesis that is similar in spirit to 
         what both the critics of the UG hypothesis (e.g., Pullum and Scholz 2002 and Tomasello 2004) 
         and the supporters of the UG hypothesis (e.g., Chomsky 1957/1975, Crain and Pietroski 2002) 
         propose. Utilizing techniques recently made possible through advances in technology, and 
         combining aspects of theoretical, experimental, and computational linguistics, it is now feasible 
         to perform several quantitative tasks relevant to evaluating the UG hypothesis with respect to the 
         issues discussed above. We can search reasonably large corpora of both adult and child-directed 
         speech for relevant linguistic structures; we can precisely measure the adult knowledge state 
         children eventually attain using psycholinguistic techniques from experimental syntax; and we 
         can implement sophisticated probabilistic learning models (specifically Bayesian models) capable 
         of operating over the structured representations postulated by linguistic theory.  With these 
         techniques in hand, we plan to investigate the existence of the induction problem by examining 
         both the realistic data used as input by children (available through resources such as CHILDES 
         (MacWhinney 2000)) and the knowledge state achieved by adults for complex linguistic 
         phenomena such as syntactic islands (e.g., the experiments in Sprouse 2007). We will then 
         implement Bayesian learning models to test whether unbiased learners can reach the adult 
         knowledge state given the data available.  If unbiased learners cannot do this, then we can 
         conclude that the induction problem does indeed exist for that phenomenon and that children 
         require learning biases to succeed.  We can then identify what kind of biases lead to acquisition 
         success by incorporating different types of learning biases into the models (as is done, for 
         example, for learning anaphoric one in Pearl & Lidz (submitted)). The biases implemented may 
         be domain-general in nature (e.g., Regier & Gahl 2004, Perfors, Tenenbaum, & Regier 2006, 
         Pearl & Lidz submitted) or domain-specific (Sakas & Fodor 2001, Pearl & Weinberg 2007, Pearl 
         2008, submitted, Pearl & Lidz submitted). Crucially, because the Bayesian modeling framework 
         allows us to accommodate biases of many kinds, from choosing the smallest hypothesis 
         consistent with the data (Tenenbaum & Griffiths, 2001) to restricting the input to certain clauses 
         (Lightfoot 1991, Pearl & Weinberg  2007) to constraining the representations under consideration 
         via parameters (Chomsky 1981), we will be able to both reduce the UG hypothesis to smaller 
         specific hypotheses and evaluate the necessity of those hypotheses for successful learning (for 
         instance, as advocated by Tomasello (2004)).  
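           As an illustration only, and not the models proposed here, the following minimal Python 
         sketch shows how two of the bias types mentioned above can be expressed in a Bayesian 
         learner. The toy hypothesis space, the data labels, and the function name posterior are 
         hypothetical. The likelihood assumes that each datum is sampled uniformly from the set of 
         data types a hypothesis generates, which yields a preference for the smallest hypothesis 
         consistent with the data; a representational bias is modeled, again as an assumption, by a 
         prior that assigns zero probability to an excluded hypothesis. 

```python
from fractions import Fraction


def posterior(hypotheses, prior, data):
    """Return P(h | data) for each hypothesis via Bayes' rule.

    hypotheses: dict mapping a name to the set of data types it generates
    prior:      dict mapping a name to a prior weight (need not sum to 1)
    data:       list of observed data types
    """
    scores = {}
    for name, extension in hypotheses.items():
        likelihood = Fraction(1)
        for datum in data:
            # Assumed sampling model: each datum is drawn uniformly from the
            # hypothesis's extension, so smaller consistent hypotheses score
            # higher, and incompatible hypotheses score zero.
            likelihood *= Fraction(1, len(extension)) if datum in extension else 0
        scores[name] = Fraction(prior[name]) * likelihood
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no hypothesis is compatible with the data")
    return {name: score / total for name, score in scores.items()}


# Toy hypothesis space (hypothetical): the subset language is nested in the superset.
hypotheses = {"subset": {"a", "b"}, "superset": {"a", "b", "c", "d"}}
data = ["a", "b", "a"]  # every observed datum is compatible with both hypotheses

# Learner with a uniform prior over the two hypotheses.
unbiased = posterior(hypotheses, {"subset": 1, "superset": 1}, data)
# Learner whose prior excludes the subset hypothesis outright.
biased = posterior(hypotheses, {"subset": 0, "superset": 1}, data)
print(unbiased)
print(biased)
```

           With a uniform prior, the three ambiguous data points already shift most of the posterior 
         toward the smaller hypothesis; with the restricted prior, the same data leave the learner 
         with only the hypothesis that the prior allows. 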
            
         2. Accurate measures of the primary data 
           The first step of our investigation is to assess the input that is actually available to children 
         for various linguistic phenomena.  Since the debate regarding the induction problem and the 
         necessity of UG hinges on the state of children’s input, occurrence facts about child input should 
         not be based on the intuitions of linguists (an idea advocated extensively in Pullum & Scholz 
         (2002), for instance).  This is particularly true now that corpora of child-directed speech are freely 
         available, such as CHILDES (MacWhinney 2000). Notably, however, the corpora available are 
         rarely marked with all the information of interest to a linguist focused on complex syntactic and 
         semantic phenomena, which are primarily the locus of the induction problem debate (Crain & 
         Pietroski 2002, Legate & Yang 2002, Pullum & Scholz 2002, Lidz, Waxman, & Freedman 2003, 
         Reali & Christiansen 2004, Regier & Gahl 2004, Kam et al. 2005, Perfors, Tenenbaum, & Regier 
         2006, Foraker et al. 2007, Pearl & Lidz submitted, among many others). While some corpora may 
         contain morphological information or part-of-speech identification, most are simply transcripts of 
         child-directed speech.  We propose to annotate several available child corpora in the CHILDES 
         database syntactically (using, for example, the features in Government and Binding Theory 
         (Chomsky, 1981)) via a two-step process.  The output of this process will be fully formed 
            hierarchical structures, so that formal analyses from theoretical linguistics can be easily adopted 
            as biases in the models we later build (see sections 4 and 5 for details). First, we will use a freely 
            available phrase-structure parser (such as the Charniak parser[1]) to generate a first-pass syntactic 
            analysis.  Then, we will evaluate the resulting syntactic trees by hand (with the help of 
            undergraduate research assistants), correcting when necessary, to ensure the accuracy of the 
            structures generated.  We intend to make the final parsed corpora available through CHILDES for 
            other language researchers to use. 
              In addition, we propose to investigate adult corpora of conversational speech (such as those 
            available through TalkBank (http://www.talkbank.org)) in order to compare the differences 
            between adult and child-directed speech for various linguistic phenomena.  Often, child-directed 
            speech corpora are relatively sparse compared to available adult speech corpora, especially if 
            syntactic annotation is desired, which has led much of the corpus-based linguistic research to rely 
            on adult-directed speech (e.g., Pullum & Scholz (2002)).  Yet, it is a common (and quite 
            reasonable) argument that child-directed speech may differ quite significantly from adult speech 
            (see, for example, discussion in Legate & Yang (2002)).  Given that recent probabilistic learning 
            models are sensitive to the relative frequencies of various data (e.g., Foraker et al. 2007), it seems 
            only prudent to ask, for a given linguistic phenomenon, if the data frequencies do differ. It may 
            turn out for some linguistic phenomena that the relative frequencies do not vary much between 
            the speech directed at, say, three-year-olds and the speech directed at adults.  This would then 
            suggest that adult speech corpora may indeed be a reasonable estimate of children’s input for 
            some phenomena, particularly complex syntactic and semantic interpretation phenomena that are 
            acquired later in development (e.g., negative polarity items like ‘any’, the interpretation of 
            connectives such as ‘or’, and binding theory phenomena, as discussed in Crain & Pietroski 
            (2002)).  Given the abundance of adult-directed conversational speech, such a scenario would 
            provide a far richer source of data from which children’s input could be estimated. However, 
            should child-directed and adult-directed speech frequencies differ, it will be crucial to this project 
            to determine not only if, but also in what way they differ, so as to correctly evaluate both our own 
            models and those potentially offered by others. 
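               One simple way to frame the comparison described above is as a test on a 2x2 table of 
            counts: utterances that do or do not contain a given structure, in each of the two corpus 
            types. The Python sketch below is only an illustration under that assumption. The function 
            names and the counts are hypothetical, and the Pearson chi-square statistic is one possible 
            choice of test, not a method the project commits to. 

```python
def relative_frequency(count, total):
    """Proportion of utterances in a corpus that contain the structure."""
    return count / total


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for whether a structure's rate differs between two corpora.

    Assumes the structure occurs at least once overall and is absent from at
    least one utterance overall, so that no expected count is zero.
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand_total = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [count_a + count_b, grand_total - count_a - count_b]
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Hypothetical counts: utterances containing the structure / all utterances.
child_directed = (20, 100)
adult_directed = (10, 100)

print(relative_frequency(*child_directed), relative_frequency(*adult_directed))
statistic = chi_square_2x2(*child_directed, *adult_directed)
print(round(statistic, 4))
```

               The statistic would then be compared against the chi-square distribution with one degree 
            of freedom. For the rare structures at issue in the induction problem debate, expected counts 
            may be too small for this approximation, in which case an exact test would be more 
            appropriate. 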
              Like the child-directed speech, much conversational adult-directed speech is not annotated 
            with syntactic information.  The process we propose to use to generate annotated adult-directed 
            speech corpora is identical to the process for generating the annotated child-directed speech, 
            involving a first-pass annotation by a freely available parser and subsequent human evaluation of 
            the generated annotation.  We intend to make the annotated corpora available to the research 
            community either through TalkBank (http://www.talkbank.org) or the Linguistic Data 
            Consortium (http://www.ldc.upenn.edu/), a common repository for electronic corpora. 
               
            3. Accurate measures of the adult state 
              The second step of our investigation is to assess the adult knowledge state children eventually 
            attain. It almost goes without saying that acceptability judgments form the primary measure of the 
            adult grammar in the field of theoretical syntax; therefore, acceptability judgments are the logical 
            choice for a quantifiable measure of the adult state. There are at least three reasons for the 
            predominance of acceptability judgments in the study of adult grammars.  First, acceptability 
            judgments can be provided with little effort from the subject (Schutze 1996, Cowart 1997). 
            Second, these judgments are highly reliable across speakers of the same language (Cowart 1997, 
            Keller 2000, Sprouse 2007). Third, these judgments are a robust proxy for grammaticality 
            (Chomsky 1965, Schutze 1996, Cowart 1997, and many others). Paradoxically, the very 
            properties that have made acceptability judgments such a valuable data source for theoretical 
            syntacticians have also served to undermine general confidence in that data.  First, because 
         judgments are available to any native speaker, linguists have tended to use their own judgments 
         rather than those of naïve consultants (Christiansen and Edelman 2003). Second, because 
         judgments are generally reliable across speakers, linguists have tended to use single data points 
         rather than samples (Bresnan 2007, Cowart 1997).  Third, because judgment tasks are often 
         designed as a choice between grammatical and ungrammatical, until recently relatively little 
         research has been done on the gradience inherent to acceptability judgments, and the factors that 
         might be causing or influencing that gradience (Keller 2000, Sorace and Keller 2005). 

            [1] Available through Brown University (ftp://ftp.cs.brown.edu/pub/nlparser/). 

           In response to these concerns, several linguists have developed a set of formal methodologies, 
         which have collectively come to be known as experimental syntax, for collecting acceptability 
         judgments. While the details vary from experiment to experiment, experimental syntax 
         methodologies all have at least four components in common (Featherston 2007, Sprouse 2007).  
         First, judgments are collected from a sample of naïve consultants, usually at least 10 and ideally 
         more than 20, to ensure that judgments generalize to the broader population.  Second, consultants 
         are presented with a variety of sentences for any given structure under investigation, to ensure that 
         the judgments generalize across lexical items.  Third, consultants are presented with a formal 
         task, such as a Likert scale task or the Magnitude Estimation task (Stevens 1957, Bard et al. 
         1996), to help ensure that relative acceptability data are not lost to categorical responses.  Fourth, 
         data are analyzed using standard behavioral statistics.  For this project, we will use experimental 
         syntax techniques to measure the relative acceptability of structures in the adult grammar for 
         comparison to the relative frequencies of those structures in the child-directed speech corpora and 
         adult conversational speech corpora.  
            Experimental syntax methodologies have advantages over previous informal collection 
         techniques too numerous to mention here (see Schutze 1996, Cowart 1997, Keller 2000, 
         Featherston 2007, and Sprouse 2007 for discussion). However, given the nature of this project (in 
         particular, the comparison between relative frequencies and acceptability judgments), two of 
         these advantages bear mention. First, experimental syntax has introduced rating tasks, such as 
         magnitude estimation (Stevens 1957), that provide a more precise measure of relative 
         acceptability than previous informal collection tasks. Most informal collection tasks involved 
         binary rating scales such as yes/no or limited, discrete rating scales such as 5- or 7-point Likert 
         scales.  All of these limited scales can result in a loss of information to categorization (Bard et al. 
         1996). In contrast, magnitude estimation places no predefined restriction on the response scale: 
         subjects may use the entire positive number line for their responses, thus eliminating the 
         categorization problem. Bard et al. (1996) demonstrated that given such freedom, subjects 
         routinely distinguish more than 7 levels of acceptability. Furthermore, Sprouse (submitted b) has 
         demonstrated that subjects’ responses in magnitude estimation tasks are highly robust across 
         samples, even with minor variations to the experimental design (such as modifying the modulus 
         sentence). Taken together, these facts suggest that newer rating tasks such as magnitude 
         estimation will provide more detailed data regarding the adult grammar. 
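            Because magnitude estimation responses are open-ended positive numbers, they are 
         typically normalized before analysis. The Python sketch below shows one common recipe, 
         offered here as an assumption and not as the procedure used in the studies cited above: 
         each response is divided by the rating assigned to the modulus sentence, log-transformed, 
         and then converted to a z-score within a participant. The function name and the example 
         ratings are hypothetical. 

```python
import math
from statistics import mean, stdev


def normalize_ratings(raw_ratings, modulus_rating):
    """Normalize one participant's magnitude estimation responses.

    raw_ratings:    positive numbers the participant assigned to test sentences
    modulus_rating: the positive number assigned to the modulus sentence
    Needs at least two ratings that are not all identical, since the sample
    standard deviation is used as the scaling factor.
    """
    # Express each rating as a ratio to the modulus, on a log scale.
    logs = [math.log(rating / modulus_rating) for rating in raw_ratings]
    center, spread = mean(logs), stdev(logs)
    # z-score within the participant so that scales are comparable across people.
    return [(value - center) / spread for value in logs]


# Hypothetical participant: modulus rated 100, three test sentences rated below.
z_scores = normalize_ratings([50, 100, 200], modulus_rating=100)
print([round(z, 3) for z in z_scores])
```

            In this example the three ratings stand in equal ratios to one another, so after the log 
         transformation they are equally spaced, which is the sense in which the transformation 
         treats the responses as ratio judgments. 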
            Second, experimental syntax has also introduced the principles of factorial experimental 
         design, which has enabled the investigation of contributions from factors that are traditionally 
         outside the domain of syntactic theory, but that may still have an effect on both acceptability 
         judgments and (crucially) relative frequencies. For example, Sprouse (2008) and Sprouse 
         (submitted a) both demonstrate that the acceptability of wh-movement dependencies is affected 
         by the distance of the dependency (see also Frazier (1989) and Phillips et al. (2005)).  
         Specifically, shorter wh-movement dependencies (1) are significantly more acceptable than longer wh-movement 
         dependencies (2) despite the fact that syntactic theories predict both structures to be categorically 
         grammatical. 
          
         (1) Jack hoped that you knew who the giant would chase. 
         (2) Jack knew who you hoped that the giant would chase.