1 Introduction

Despite unresolved differences as to whether language competence is innate (e.g. Chomsky 1981, 1995; Pinker 1984) or developed through the interaction of human cognition with the input (e.g. Tomasello 2003), it is now agreed that exposure to language at the very least improves language learning outcomes. Yet given the relatively recent emergence of input-driven accounts of language acquisition, a great deal of research is still needed to ascertain the role of exposure in language development. Studying bilingual children and their language acquisition can contribute greatly to this debate: in bilingual environments two languages are in constant competition for the input space, which creates optimal testing grounds for any input-based hypotheses. To date, contexts of bilingual exposure have revealed links between the amount of input and the rates of lexical development in both languages (e.g. Pearson et al. 1997; Hoff et al. 2012; Thordardottir 2014). They have also, to some extent, shown links between the amount of input and the pace of grammatical acquisition in preschool children (Barrena et al. 2008; Thordardottir 2014) but not always in older children and not across all grammatical domains (see e.g. Unsworth 2014; Gathercole and Thomas 2009). The aim of this study is to examine the relationship between two types of input and the respective grammars at their onset in order to ask why the links between input and grammar do not always appear straightforward.

The theoretical framework followed in this study is rooted in the usage-based theory which sees the essence of language in its symbolic dimension while its structure is viewed as being merely derivative (Barlow and Kemmer 2000; Croft 2001; Tomasello 2000, 2003). The communicative focus of this model is reflected in the term usage-based, ‘one in which the speaker’s linguistic system is fundamentally grounded in ‘usage events’: instances of a speaker’s producing and understanding language’ (Barlow and Kemmer 2000, viii). The central tenet of this model is that variation in language acquisition can be explained by the variety of ways in which children’s learning mechanisms respond to the properties of idiosyncratic input received in individual languages (Tomasello 2003). Such learning mechanisms include intention-reading and pattern-finding skills which are domain-general in that they support not only language acquisition but also general cognitive development. Crucially, however, the work of these mechanisms is secondary to spontaneous language use: it is only with cumulative exposure to, and use of language that the child observes regularities between concrete linguistic constructions, and ultimately builds abstract representations around them (Tomasello 2000, 2003; Croft 2001).

Croft’s Radical Construction Grammar (RCG) (2001) is of particular interest to this study as this usage-based framework departs most radically from any syntactic models which assume that the child is genetically endowed with modularly specialised language (e.g. Chomsky 1981, 1995). This radical departure extends earlier attempts to apply the notions of ‘constructions’ to some language which does not lend itself to syntactic analysis (e.g. Fillmore et al. 1988) by postulating that all language, from words to most abstract rules, can be analysed as constructions. The decision to analyse words on a par with more complex constructions can be justified by the occurrence of words such as [the X-er, the Y-er]: although they are used as independent lexical items, they include bound morphemes in their syntactic representation and so could also be viewed as one-word constructions (Croft 2001, p. 17). The central hypothesis of this model is thus that constructions are the primitive units of any such representation while the primitive syntactic categories are non-existent. As Fig. 1 shows, constructions here are seen as pairs of grammatical form and meaning in a unit whose primary function is symbolic.

Fig. 1 The symbolic structure of construction (Croft 2001, p. 18)

Croft’s model (2001) is indeed a far cry from syntactic universality and from earlier rule-based models of language acquisition which see language as a universal property of the human mind (Chomsky 1981, 1995; Pinker 1994). Croft (2001) argues that (1) the emergent categories are construction specific; (2) constructions are language specific and so (3) all formal properties of grammar are language-specific. Here the only universal is the holistic conceptualisation of highly particular situation types and the conceptual relationships among them, resulting from the shared judgment of similarity among all language speakers (Croft 2007, 2010) who ‘may linguistically group similar situation types in any way (...) as long as similarity is respected’ (2010, p. 13). In departing radically from other frameworks, Croft’s model (2001) also puts a new perspective on bilingual first language acquisition: it seems to predict that the bilingual child will generate separate mechanisms for coping with two different types of input and that she will develop categories which are specific to each language but not shared between the languages, at least not initially.

The main question asked in this study is how a bilingual child, who hears two languages from birth, builds grammatical representations early on and in what way input frequency, the key aspect of the usage-based theory, plays a role in this type of acquisition. The grammatical representations of interest in this study are noun inflections as they represent the radical types of constructions included in Croft’s model (2001) where a single word includes a bound morpheme in syntactic representation. Noun inflections are also among the first signs of grammatical acquisition, preceding verb morphology (Slobin 1966; Zarębina 1965).

This study relies on Bybee’s (2001) distinction between token and type frequency. The token frequency of a word or phrase is the number of times that a particular linguistic entity comes up in speech: for example, the word ‘broke’ occurs in a spoken corpus 66 times per million words while the word ‘damaged’ occurs only five times, giving the former a higher token frequency (Kučera 1982, as cited by Bybee 2001, p. 10). Type frequency, on the other hand, is the dictionary frequency of a given pattern, which determines the creation of slots in strings and categorization: the higher the type frequency of an element appearing within a given slot, the greater the chance that the child will learn to apply this element productively to any new similar items (Bybee 2001, p. 14). For example, where –er acts as a constant element and X or Y fill the slot, the more types of nouns the child hears which end in –er, the sooner she is expected to learn to apply this schema productively to any new nouns. The notion of frequency is likely to capture well the differences between languages such as Polish and English, which are examined in this study: although both are fusional, the Polish inflection system, with verb conjugations and noun, adjective and numeral declensions, is considerably more complex than the now diminished English inflection system. Owing to this, these two sources of input can help attribute the emerging pattern-finding skills more reliably to the individual languages without running the risk of interaction from early on.
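The distinction between token and type frequency can be illustrated with a minimal Python sketch. The toy list of word forms and the helper name `frequency_profile` are hypothetical and are not drawn from Sadie’s data; the suffix test is a crude orthographic stand-in for a proper morphological analysis.

```python
from collections import Counter


def frequency_profile(word_forms, suffix):
    """Return (token_frequency, type_frequency) for forms ending in `suffix`.

    Token frequency: how many times such forms occur in the sample.
    Type frequency: how many distinct forms carry the suffix.
    """
    counts = Counter(w for w in word_forms if w.endswith(suffix))
    token_frequency = sum(counts.values())
    type_frequency = len(counts)
    return token_frequency, type_frequency


# Hypothetical toy sample of child-directed word forms (illustration only).
sample = ["kaczki", "kaczki", "kaczki", "babci", "piłki", "mama", "mama", "tata"]

tokens, types = frequency_profile(sample, "i")
print(tokens, types)  # 5 tokens spread over 3 types
```

In this toy sample the –i ending has a token frequency of 5 but a type frequency of only 3, which shows how the two measures can diverge for the same marking.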

2 Inflection in Polish and English

The English noun inflection is more rudimentary but there are still three orthographical markings left on most regular nouns in addition to the Ø marking in singular default contexts which can inform the current discussion: the singular genitive -‘s (e.g. mummy’s), the plural –s (e.g. mummies), and the plural genitive –s’ (e.g. mummies’) (CIDE 1995). Although they tend to be realised in the same way in speech through the addition of the same one of the three phonological variants /s/, /z/ or /iz/, in this case /mʌmiz/ (CIDE 1995), in this study they are referred to as singular genitive, plural default and plural genitive to reflect their function.

Compared to this, the Polish system of inflection is relatively complex with Polish nouns categorised according to case, number and gender (Bańko 2009). There are three genders in Polish: masculine, feminine and neuter. Case has seven types which are usually presented in the following order: nominative, genitive, dative, accusative, instrumental, locative and vocative (Bańko 2009). Polish nouns follow over 50 different inflection paradigms and often one marking maps across many grammatical contexts. Table 1 below shows three common inflection paradigms used with Polish nouns (adapted from SWJP 1996).

Table 1 Three common inflection paradigms used with Polish nouns

Monolingual Polish children are reported to acquire all seven singular case markings as well as the nominative and accusative plural markings before their second birthday (Smoczyńska 1985; Dąbrowska and Szczerbiński 2006), but initially their use is not fully productive and around the age of two some unfamiliar words are often left uninflected (Dąbrowska 2005). Longitudinally, the first forms to emerge in Polish monolingual children tend to be in the singular nominative default case (Smoczyńska 1985). The accusative forms emerge soon after; they tend to be followed by the vocative forms and then the genitive (Zarębina 1965; Smoczyńska 1985). Dąbrowska and Szczerbiński (2006) link this commonly observed order of acquisition to exposure by showing exactly how frequent these different case markings are in the input of one monolingual child called Marysia (from the Szuman data available on CHILDES) and converting them into percentages, which suggests that acquisition of nominal inflections relies on morphological contrasts from early on. Initially, this approach will be replicated in this paper for both Polish and English, although there are reasons to believe that grammatical acquisition is linked to phonological rather than abstract contrasts (Bybee 2001). As can be seen in Table 2 below, typical input data can predict the order of acquisition of most but not all cases in Marysia’s acquisition, but this discrepancy may be eliminated in my study if the input and output of one and the same child are compared.

Table 2 Case markings in the input (Dąbrowska and Szczerbiński 2006)

In terms of English, the first noun forms to develop in monolingual acquisition are observed around the second birthday in the singular default form, a preference which is often explained by their simpler phonological shape (Brown 1973; Keshavarz and Ingram 2002). They are followed by the plural default, singular genitive and lastly plural genitive (Brown 1973). Indeed, children master the pronunciation of the sibilants /s/ and /z/ relatively late, which could potentially explain why they may omit the –s marking even once they have started to use it with some nouns. Errors of omission, however, should be seen as separate from the mechanisms delaying the acquisition of a particular case. De Houwer (2009), for example, argues that if children were guided by phonological simplicity, they would never attempt to produce more complex grammatical structures. Therefore, the argument of phonological accessibility is called into question here by the strength of one which predicts the order of acquisition of noun forms by their frequency in the input.

3 Methodology

Case study methodology has been chosen here as looking at one child, and therefore only one ‘cognitive filter’, helps to attribute the outcomes more reliably to the given input and eliminate confounding factors which could come into play in any multi-case research. The protagonist of this study is Sadie, the first-born and normally developing child of the researcher, who presents a case of bilingual first language acquisition (BFLA). Diary data recorded over the period of nearly 18 months between the ages of 0;10.10 and 2;03.22 were used in this study to document every 30-min slot for each of the 7 days of the week as representative of input in a particular language (De Houwer and Bornstein 2003). In her second year of life, Sadie’s linguistic input was divided between 65% English and 35% Polish (Gaskins 2017). In qualitative terms, however, Sadie’s English input was much richer than that in Polish. She lived in London, attended an English-speaking nursery, and heard English at home from her father, while Polish was heard only from her mother, whose command of English did not go unnoticed by the child. This imbalance of input is reflected in Sadie’s lexical outcomes: at the age of 2;02 Sadie had 74% English (292) and 26% Polish (103) words at her disposal (Gaskins 2017). Moreover, when she was recorded on video addressed solely in Polish, she used on average 90% English and 10% Polish word tokens (Gaskins 2017).

The data on the child’s emerging inflection come from 30 half-hour audio video clips recorded in three contexts between the ages 1;10.16-2;5.11: two monolingual contexts where Sadie was addressed in English by her father, or in Polish by her mother, and a bilingual context where she was addressed in both languages with both parents present. Parental language use is captured through the monolingual recordings. All video recorded data are transcribed using CHAT tools and analysed by means of CLAN freq and kwal commands. In this study, words with emerging inflections counted are only those which (a) had previously emerged in one form but (b) now appear in another form and (c) are used as such productively rather than in an act of imitation. If a word is modelled directly before the child’s turn in one form, e.g. kaczka (English: duck) but then is used by the child in another, e.g. kaczki (duck+INFL), its use is also counted as productive in that form. Excluded are any amalgams which are items acquired first in a form other than the singular default (MacWhinney 2014): they are treated as uninflected as they are the only forms available to the child. Further to this, diary data provide additional examples of relevant constructions.
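The counting criteria above can be sketched as a simple decision procedure. This is only one possible reading of criteria (a) to (c) and of the amalgam exclusion, not the procedure actually run in CLAN; the function name, its parameters and the English amalgam example are hypothetical.

```python
def counts_as_emerging_inflection(form, earlier_forms, default_form, preceding_model=None):
    """Decide whether a child's word form counts as an emerging inflection.

    form            -- the form the child has just produced
    earlier_forms   -- forms of this word the child produced before, in order of emergence
    default_form    -- the singular default form of the word
    preceding_model -- the form an adult used directly before the child's turn, if any
    """
    # (a) The word must already have emerged in some form.
    if not earlier_forms:
        return False
    # Amalgam exclusion: a word first acquired in a non-default form is treated as uninflected.
    if earlier_forms[0] != default_form:
        return False
    # (b) The word must now appear in a form other than the one it first emerged in.
    if form == earlier_forms[0]:
        return False
    # (c) Repeating the form that was modelled directly beforehand is imitation.
    if preceding_model == form:
        return False
    return True


# Example from the text: kaczka is modelled, the child responds with kaczki.
print(counts_as_emerging_inflection("kaczki", ["kaczka"], "kaczka", preceding_model="kaczka"))
```

Under this reading, a form that differs from the adult model still counts as productive, whereas a direct repetition of the modelled form does not.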

4 Results

4.1 Sadie’s Productions in English

Sadie’s acquisition of English inflection does not appear to be delayed by the presence of two languages in the input. In fact, Sadie’s inflections emerge relatively early compared to some monolingual children. In a study of three children acquiring English in America, for example, Brown (1973) reports that some use of plural and possessive inflection was evident at what he refers to as stage II (28–35 months). By comparison, Sadie already attempts to inflect nouns at Brown’s stage I (15–30 months): the first plural noun (eyes) was recorded at 1;08.18 (approx. 21 months), with the next two words added 6 days later (shoes and bubbles), followed by two instances of pluralisation recorded over a month later at 1;09.25 (flowers and boots). Markings in singular genitive contexts emerged 2 months after their plural counterparts, with the first instance (tata’s turn: English daddy’s turn) recorded at 1;10.13. Importantly, despite the initial sporadic use of inflected nouns and a very inconsistent application of these markings to the nouns in relevant contexts until the end of the data collection period, it is clear that number oppositions emerged before case oppositions.

4.2 Sadie’s English Inflections in the Light of Input

Input data from paternal speech show that in Sadie’s input by far the most commonly heard word form among nouns was the singular default form (77% of noun types) which is also the first form to emerge in Sadie’s acquisition. Beyond this, the second most frequent form in the input was that of regular plural nouns (recorded with 21% noun types). This corresponds with the order of acquisition recorded in the diary: Sadie used a contrastive marking for the first time at 1;08.24 to denote plurality. Lastly, the least frequent word form in the input was that of nouns in singular genitive (2% noun types) and Sadie attempted using it the latest (1;10.13). There were no nouns in plural genitive recorded in the input or the child’s speech. Owing to insufficient phonological contrast between the cases, it is clear that phonological features could not have played a role in the emergence of English case markings and the child must have been guided purely by their functionality in common usage.

4.3 Sadie’s Productions in Polish

Compared to English, Sadie’s acquisition of the Polish nominal inflection was delayed which is in line with studies of grammatical delay of the ‘minority’ language in cases of imbalanced exposure (e.g. Hoff et al. 2012; Paradis et al. 2014; Thordardottir 2014). Sadie first attempted Polish inflections at 1;11.05 which is about 6 months behind the monolingual schedule (Dąbrowska 2001, 2005). Consequently, with only six inflected words recorded in the diary by the age of 2, Sadie’s use of inflection is also far from monolingual children’s productivity with these forms around the second birthday (Dąbrowska and Szczerbiński 2006). Although at times Sadie’s productivity appears higher, this is likely because it is inflated by high levels of accuracy of use among the singular nominative forms. For example, Sadie was often involved in naming games and asked ‘co to jest?’ (what’s this?) which requires the use of the nominative forms and so provides an untrue reflection of the child’s ability to apply the emerging grammar. Beyond the accurate recall of nominative forms in nominative contexts, accurate productions of other nominal markings are rare: only 21 out of 60 inflected words were used in their accurate forms which suggests that their accurate use may have been coincidental. It becomes clear that in the analysis it would be more informative (a) to exclude the singular nominative contexts from the analysis and (b) to shift the analysis from correctly used markings to error patterns in all other contexts instead.

There are two striking error patterns which emerge from the analysis of inflections in the diary and on video. The first pattern is represented by 24 tokens of nouns ‘defaulting’ to the singular nominative case in contexts which require the use of another form. These include three word types: one token of the masculine noun dom [house], 20 tokens of the masculine tata [daddy] and three tokens of the feminine mama [mummy]. This is in line with monolingual children, who tend to revert most frequently to the singular nominative forms (Smoczyńska 1985). The second error pattern, and the one which becomes the focus of this study, is represented by 37 tokens of attempted inflection across six word types: dzidzia [baby], babcia [nanny], kaczka [duck], truskawka [strawberry], piłka [ball], and but [shoe]. The first five of these words are feminine and the last is masculine, which is similar to monolingual children, who first attempt and master inflection on masculine and feminine as opposed to neuter nouns (Dąbrowska and Szczerbiński 2006). Of all these tokens, 36 feminine nouns default to an –i marking (Table 3) and the only masculine noun defaults to an –a marking, but it is impossible to say which case category the child defaults to: the –a marking is only ever used in singular genitive while the –i marking is representative of more than one case.

Table 3 Inflection paradigms for the nouns targeted first in the use of inflection

Initially, there is some suspicion that this ‘default’ case could be indeed the singular genitive as all the nouns produced by Sadie take these markings in this particular case. This suspicion stems from the earlier reports of ‘the curious case of the genitive’: the overuse of the genitive is not uncommon in monolingual children though it is usually observed in contexts which require the use of the dative (Dąbrowska 2001; Dąbrowska and Szczerbiński 2006). However, this suspicion must be dismissed as the vast majority of these nouns share morphological patterns (i.e. they are feminine nouns which default to an -i marking) and as such represent the same small ‘gang’ (Bybee 2001). Therefore, it is speculated here that the second error pattern discussed above may be a sign of the child relying on the most salient phonological features of the Polish nominal inflection system. This means that the original research question will need to be rephrased to account for the development of a certain phonological marking rather than the acquisition of a particular case.

Reliance on an idiosyncratic strategy is quite likely considering the differences between Sadie’s patterns of acquisition and those of her monolingual peers. For example, Sadie starts her inflections with what appears to be plural nominative (‘truskawki’ (strawberry+INF), ‘kaczki’ (duck+INF) and ‘piłki’ (ball+INF)), followed by singular vocative (‘mamo’ (mummy+INF) and ‘tato’ (daddy+INF)) and singular genitive (‘buta’ (shoe+INF), ‘dzidzi’ (baby+INF) and ‘taty’ (daddy+INF)). Meanwhile, Polish children tend to start with the singular accusative (which is missing from Sadie’s data), followed by vocative and genitive (Zarębina 1965; Smoczyńska 1985) before they move on to plural markings. Also atypical is the observation that in Sadie’s acquisition, some nominative forms, such as mama (mummy), tata (daddy), and but (shoe), continue to be used alongside their newly emerging inflected variants. However, in the case of other nouns, the inflected variants completely replace the nominative forms, with the child ceasing to use them altogether, which suggests a certain ‘regression’ in acquisition. Among them are all the nouns Sadie can say in Polish which default to the –i marking, including truskawki (strawberry+INF), babci (nanny+INF), kaczki (duck+INF), piłki (ball+INF) and dzidzi (baby+INF). For example, the word babcia (nanny) emerged at 1;07.29 and was used in its nominative form until the emergence of its variant babci (nanny+INF) at 1;11.10. Diary data show that at the time of emergence of the inflected form, both forms were used interchangeably for some time but then the –i form took over completely. Thereafter, whether asked to name a person (nominative), to say who is missing (genitive), or to indicate the recipient of action (dative), Sadie would always say ‘babci’ (nanny+INF), as a default.

4.4 Sadie’s Polish Inflections in the Light of Input: Case Frequencies

Data for Polish show that in maternal input the five most commonly used forms were: singular nominative (27% types, 44% tokens), singular accusative (24% types, 22% tokens), singular genitive (12% types, 7% tokens), plural nominative (9% types, 6% tokens) and plural genitive (4% types, 6% tokens). This corresponds with the child’s productions only in that the most prevalent group in the input (singular nominative) is also the one to emerge first in acquisition. Despite high numbers of singular accusative inflections in the input and their early emergence in monolingual acquisition, there are no such markings whatsoever recorded in Sadie’s data. Instead, the less frequent singular genitive and plural nominative, or at least the –i marking often shared by them, are favoured from early on. Thus the number of nouns recorded in a particular case, regardless of gender, does not appear to predict accurately the order of acquisition of individual cases. It is indeed more likely that, faced with limited input in Polish, the child prefers to rely on more easily perceptible phonological contrasts rather than abstract contrasts between individual cases. The question which now needs to be addressed is whether the –i marking, the first sign of inflection among Sadie’s feminine nouns, has the highest type frequency in the child’s input, as precisely such frequency is expected to facilitate the emergence of grammar (Bybee 2001).

4.5 The Type Frequency of the –i marking

Maternal input data show that the –i marking was heard only on 10% of all noun types and 13% of all feminine noun types. By far the most commonly heard marking within the group of feminine nouns was the singular accusative –ę: it was heard on 30% of noun types, which is more frequent than the singular nominative –a (26%). Thus if type frequency were a factor, singular accusative forms should have emerged before the –i marked forms. However, a closer look at the data shows that individual token frequencies of the nouns from the feminine ‘gang’ could potentially explain the salience of the –i marking. Although the words truskawka [strawberry] and piłka [ball] are altogether missing from maternal input captured on video, the –i marking on all the remaining words from the group is the most frequent marking heard after the nominative: e.g. kaczka [duck] was heard eight times while kaczki [duck+INF] four times, babcia [nanny] was heard 32 times while babci [nanny+INF] 14 times, and dzidzia [baby] was heard 14 times while dzidzi [baby+INF] only three times in the input. Considering the striking similarity of the error patterns, as well as their affiliation with a particular ‘gang’ of feminine nouns, it could be argued that Sadie learnt through analogy, as all the words she attempted in the inflected forms were uttered relatively close together and at a time of intensive exposure to the Polish language, which would have increased the salience of the relevant word forms. She produced the word truskawki [strawberry+INF] for the first time at 1;11.05, which was followed by the use of the word babci [nanny+INF] at 1;11.10, and the word kaczki [duck+INF] at 1;11.13, all during the holiday in Poland.
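The token counts reported above can be turned into relative shares with a short Python sketch. The helper name `i_form_share` is an assumption made for illustration, and the share is computed over the nominative and –i tokens only, since counts for the other forms of these nouns are not reported here.

```python
# Token counts reported in the text for maternal input captured on video:
# noun -> (nominative tokens, -i form tokens)
input_counts = {
    "kaczka": (8, 4),
    "babcia": (32, 14),
    "dzidzia": (14, 3),
}


def i_form_share(nominative_tokens, i_tokens):
    """Share of -i tokens among the nominative and -i tokens of one noun."""
    return i_tokens / (nominative_tokens + i_tokens)


for noun, (nominative, i_form) in input_counts.items():
    print(f"{noun}: {i_form_share(nominative, i_form):.0%}")
# kaczka: 33%
# babcia: 30%
# dzidzia: 18%
```

On these figures the –i form makes up between roughly a fifth and a third of the nominative and –i tokens heard for each noun.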

It is also possible that the salience of the –i marking was reinforced by its grammatical versatility. When the frequency of the –i marking is considered from the point of view of distribution on all nouns that the child heard, the –i marking is indeed the one which overlaps the most, with a 16% capacity to support a range of grammatical contexts (see Table 4). This is the greatest capacity among all the markings on the most commonly heard feminine nouns and marginally higher than the –i marking on masculine nouns (15%), the second most prevalent group in the input. Following type frequency in the acquisition of grammar is a sophisticated strategy: as overlapping markings have a greater potential to apply to various grammatical contexts, they give the child a greater chance of being understood. While it is impossible to say with any certainty how the child would realise that the same form has similar functions, and is therefore more ‘useful’ than others, this realisation must have its origins in situations where the same form is used to denote strikingly different entities. For example, at 2;01.02 Sadie’s mother was recorded on video as saying ‘szukasz drugiej kaczki?’ [are you looking for another duck?] where the word kaczki [duck+INF] was used to refer to a single entity in a genitive context and then she said ‘to są dwie kaczki’ [these are two ducks] where the same word form denoted a plural entity. While learning through analogy seems to provide a sufficient explanation for the acquisition of these first inflected forms, I argue that the salience of the same word forms used close together to refer to contrastive functions could have helped in the early acquisition of these particular inflections. The attractiveness of such overlapping or so-called ‘promiscuous’ forms has also been documented with reference to children making pronoun case errors, in particular overgeneralising ‘me’ as in ‘Me do it!’ (Tanz 1974). It was suggested that ‘me’ occurs in a wider range of constructions, including direct object, the object of a preposition and the answer to questions, which means children overgeneralise it more readily than ‘I’, which is restricted to nominative contexts (Tanz 1974).
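The idea that one form can serve several grammatical contexts can be made concrete with a small Python sketch that counts how many paradigm cells each form of kaczka (duck) fills. The paradigm below is the standard declension of this noun rather than a reproduction of Table 4, and the cell count is a simpler measure than the 16% capacity reported above, which is calculated across all nouns in the input; the function name `contexts_per_form` is hypothetical.

```python
from collections import defaultdict

# Standard paradigm of the feminine noun kaczka (duck): (case, number) -> form.
KACZKA = {
    ("nominative", "sg"): "kaczka",
    ("genitive", "sg"): "kaczki",
    ("dative", "sg"): "kaczce",
    ("accusative", "sg"): "kaczkę",
    ("instrumental", "sg"): "kaczką",
    ("locative", "sg"): "kaczce",
    ("vocative", "sg"): "kaczko",
    ("nominative", "pl"): "kaczki",
    ("genitive", "pl"): "kaczek",
    ("dative", "pl"): "kaczkom",
    ("accusative", "pl"): "kaczki",
    ("instrumental", "pl"): "kaczkami",
    ("locative", "pl"): "kaczkach",
    ("vocative", "pl"): "kaczki",
}


def contexts_per_form(paradigm):
    """Group paradigm cells by surface form to expose overlapping markings."""
    contexts = defaultdict(list)
    for cell, form in paradigm.items():
        contexts[form].append(cell)
    return dict(contexts)


coverage = contexts_per_form(KACZKA)
most_versatile = max(coverage, key=lambda form: len(coverage[form]))
print(most_versatile, len(coverage[most_versatile]), "of", len(KACZKA), "cells")
# kaczki 4 of 14 cells
```

For this noun the –i form covers the singular genitive together with the plural nominative, accusative and vocative, which matches the two maternal utterances quoted above, where kaczki is first singular genitive and then plural nominative.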

Table 4 The proportions of case markings that the actual nouns from the input can take and an example of how –i overlaps across contexts of use on the noun kaczka (duck)

4.6 Why Does the –i Marking Push Out the Selected Nominative Forms?

One last question is why the –i marking came to replace the singular nominative case: was this a case of unlearning? Sadie’s data show that the default nominative forms were initially used across all grammatical contexts rather than being applied correctly and consistently to relevant nominative contexts, so the constructivist claims of unlearning remain unjustified. Although regression remains a possibility in acquisition, in this particular case it is more likely that Sadie’s language use reflects a shift from learning through imitation towards being able to manipulate various grammatical aspects, such as number, gender and case, allowing comparisons between individual words as well as whole word groups. Earlier models of language acquisition have explained this apparent ‘regression’ in acquisition in terms of disparity between the linguistic behaviour and the actual linguistic competence. Karmiloff-Smith (1985) refers to it as ‘behavioural regression’ and attributes it to representational progression, arguing that it provides a clue to reorganisation of the stored representations. This view could help to explain why, in Sadie’s acquisition, the apparent ‘unlearning’ of the nominative forms might have been simply a sign of coming to terms with complex input.

5 Conclusions

Overall, Sadie’s acquisition of nominal inflection is consistent with the predictions of RCG in that the emerging pattern-finding skills do indeed appear to be language specific from the outset. However, while the concept of frequency is accurate in predicting language outcomes, its realisation is different in the case of two languages which are typologically different. In the case of English, nominal markings emerge in the order predicted by the frequencies of morphological groups in the input, with the bare forms followed by the plural default and then singular genitive, and the plural genitive absent from the input as well as the output captured in the recordings. However, as the three emergent markings present no phonological contrasts, it is argued that their order of emergence is governed purely by their functionality in the English language. In the case of Polish, Sadie’s minority language, the concept of frequency has altogether different implications for the acquisition of inflection. Although Sadie first acquires the singular nominative markings, which occur on the highest number of noun tokens, this strategy is disregarded when she is faced with a more complex system of inflection. From there on, the child starts to draw analogies between noun exemplars characterised by more easily perceptible phonological differences rather than abstract differences between individual cases. She also appears sensitive to the exceptional functional capacity of the –i marking. In fact, the –i marking is so attractive that it starts being overgeneralised across all grammatical contexts for the relevant nouns. This is explained not in terms of regression but instead in terms of the child’s developing ability to manipulate multiple aspects of the language used.

Findings from this study, albeit limited to a single case, help to understand why the notion of frequency cannot always be interpreted (a) in the same way for different types of languages, especially if they do not occupy comparable space in the input, and (b) in the same way at different stages of acquisition. It would appear that grammatical acquisition depends on contrasts available in the given language, which calls for a more dynamic approach to frequency as a factor facilitating language development.