1 Introduction

The mental lexicon of many languages is organized by morphology, which relates structural and semantic constituents within words and paradigmatic information (Haspelmath & Sims, 2010; Marslen-Wilson, 2007; Paterson et al., 2011). Therefore, morphology has long served as a central arena for testing hypotheses and theories about language acquisition and development (Ravid, 2019). Many accounts of morphological learning seek to explain children’s path from a limited beginning, fraught with omission and commission inflectional errors, to full and automatic command of grammatical paradigms (Boloh & Ibernon, 2013; Clahsen & Fleischhauer, 2014; Lignos & Yang, 2016; Marcus et al., 1992; McCauley & Christiansen, 2019; Tatsumi & Pine, 2016). Fewer studies have examined the acquisition of derivational morphology in lexical development, although children are learning derivational patterns as early as in their third year of life (Clark, 1993, 2016; Schipke & Kauschke, 2011), especially in languages with rich morphological systems (Vainio et al., 2018). In morphology-rich languages such as Hebrew, where many grammatical and lexical notions are encoded in word-internal structures (Berman, 1987; Ravid, 2003; Schwarzwald, 2002), studying the emergence of morphological organization in the lexicon is paramount.

The mind, and the mental lexicon within it, are often modeled from a network-based perspective (Elman, 2009, 2011; Den Hartigh Ruud et al., 2016; Kenett et al., 2011; Siew et al., 2019; Siew & Vitevitch, 2020). Network science is a theoretical and methodological framework that enables a unique understanding of human cognition: it provides the mathematical tools for explaining the structure of complex cognitive systems and the influence of that structure on cognitive processes (Castro & Siew, 2020). Specifically, Cognitive Network Science is used to examine the structural properties of cognitive systems with the advantage of being straightforward, with no hidden (“black box”) layers (Siew et al., 2019).

Studies have shown that a network-based cognitive representational system is plausible, promising, and valid (Steyvers & Tenenbaum, 2005). Generally, cognitive networks are composed of entities or events holding some type of relation, emphasizing the role of representing both entities and relations. For example, semantically related words can be captured by a network, allowing for the examination of the semantic structure of the mental lexicon (Beckage et al., 2011), as do phonologically similar words (Siew & Vitevitch, 2020), or syntactically related words (Ibbotson et al., 2019). The flexibility of network representations of cognitive phenomena makes it a very useful tool, since it can capture a large variety of relations and processes, providing well defined measures for assessing structure and effects. Moreover, structural changes in the network are a good model of the dynamics of cognitive systems. Thus, Siew et al. (2019) have shown that network science can provide novel insights on the complexity of cognitive systems and on systematic processes. Cognitive network science provides quantification tools for modeling both cognitive structure and cognitive processes, relating them to one another and accounting for their interactions in producing behavior. Importantly, network analysis is useful due to the variety of measures it provides, ranging from, for example, the description of single nodes, through the quantification of relative positions, to the characterization of the structure as a whole (Siew et al., 2019).

A main feature of the generalizable nature of cognitive network science is the ability to define the components of the network at several different levels. Focusing on language, relations between words are argued to take the form of links between nodes in a network in various forms: associative, semantic, morphological, phonological, or phonetic (Elman, 2011; Sporns, 2002; Stella et al., 2017; Steyvers & Tenenbaum, 2005). Givón (2005) explicitly notes that “[t]he most plausible assumption one can make, given the facts of language, cognition and neurology, is that conceptual/semantic meaning is represented in the mind/brain as a network of nodes and connections” (p. 69). Crucially for our developmental inquiry, which assumes a network-based cognitive representation of language, learning is viewed as constantly updating connection weights between nodes and their properties, based on experience (Bybee & McClelland, 2005; Kapatsinski, 2018).

In accounting for language development, network analysis has the advantage of flexibility and generality: networks are specifically suitable for representing a structure that undergoes dynamic topographical changes (Costa et al., 2007). The present study focuses on a special kind of network from two perspectives – one being the language level, the other being the network type. At the language level, we are interested in the morphological-paradigmatic relationships that occur within and between words, yielding a morphological system. Specifically, we are interested in the categories of the Semitic Root, binyan verb patterns, and morphological families, as detailed below. Regarding network type, we focus here on bipartite networks. Bipartite networks represent interactions between two distinct types of entities, such as plants and their pollinators. Such networks do not display interaction within each type, but rather between the two types (Beckett, 2016). Specifically, in our case, roots cannot interact with roots to create verbs, and neither can patterns with patterns. Rather, a Hebrew verb is the result of a specific interaction between two types of non-linear morphological constructs: a root and a pattern. We show in the current paper that this interaction is best captured through a bipartite network.

We account for the emergence, growth and evolution of morphological complexity in the typically Semitic domain of verbs, inherently characterized by morphological networks of root and pattern affixation (Boudelaa & Marslen-Wilson, 2005; McCarthy, 1981). We characterize the Hebrew non-linear morphological relations as a single-layer bipartite network in which one type of morpheme (the Hebrew root) is linked to another type of morpheme (the Hebrew verb pattern), together constituting a wordform. Thus, our network is a representation of the morphological aspect of the mental lexicon, in which the entities (i.e., network nodes) are morphemes, and relations between them (i.e., network links) represent words. The main motivation of the current study derives from the fact that Hebrew verbs participate in two kinds of networks, as presented below – root-based (Sect. 1.2) and binyan-based (Sect. 1.3). These networks make it possible for language learners to forge reliable paradigmatic relationships between verbs with shared components, so that morphology as a system emerges from usage (Abbot-Smith & Tomasello, 2006; Ackerman & Malouf, 2013; McCauley & Christiansen, 2019).

1.1 Non-linear affixation in Hebrew verbs

Our interest in the current paper lies in the ontogenetic evolution of root-pattern networks within verbs, the prototypical habitat of Hebrew root and pattern systematicity (Kastner, 2019; Laks, 2018). Verb morphology is among the earliest systems to be acquired by Hebrew-speaking children (Armon-Lotem & Berman, 2003; Berman, 1985a). The early command of Hebrew verb systematicity is largely due to the central role of verbs in acquisition. Verbs constitute the “architectural centerpiece” of grammar for language learners (Hirsh-Pasek & Golinkoff, 2006) as lexical items expressing the critical semantic relationship between people, actions, and objects (Merriman & Tomasello, 2014; Smiley & Huttenlocher, 1995); by underscoring the temporal facets of events (Berman, 1985b; Hirsh-Pasek & Golinkoff, 2006; Timberlake, 2007); and by determining the argument structure of the clause (Peter et al., 2015; Rispoli, 2014; Wonnacott et al., 2008).

Side by side with these universal cues, the specific morphological properties of the Hebrew verb system also drive its early acquisition. Ample psycholinguistic evidence points to non-linear affixation of Semitic roots and patterns as a major systematic device organizing the Hebrew content-word lexicon (Bolozky, 1999; Deutsch & Kuperman, 2019; Frost et al., 1997; Moscoso del Prado Martín et al., 2005; Ravid, 2003; Schwarzwald, 2000, 2002; Velan et al., 2005). Verbs constitute the hallmark of non-linear root and pattern systematicity. Noninflected verb stems all result from the affixation of two sub-lexical morphological primes: the Semitic root and a set of seven verb patterns known as binyanim (literally ‘buildings’), traditionally termed Qal, Nif’al, Hif’il, Huf’al, Pi’el, Pu’al, and Hitpa’el respectively (Berman, 1987; Bolozky, 1999, 2012; Ravid, 2006; Schwarzwald, 2006). Roots constitute the consonantal skeletons of verbs, while binyan patterns provide their basic morpho-phonology, including the location of root radical slots, specific pattern vowels, pattern prefixes and suffixes, and stress assignment sites (Bolozky, 1997; Ravid, 2020).

Basic Hebrew verb morphology is typically acquired between the ages of two and three years, initially learned by toddlers as conveying modality, aspectual, and tense distinctions within the same binyan (Ashkenazi et al., 2016; Berman, 1985a; Lustigman, 2013; Ravid et al., 2016). In the following preschool years, verb morphology becomes the first derivational system to develop in Hebrew, linking different verbs with shared morphemes and creating morphological verb families (Ashkenazi et al., 2020; Berman, 1993a,b; Levie et al., 2020). This developing complexity in verb morphology is at the center of the current study.

In the Hebraist tradition, the abstract notion of verb lemma refers to the unique combination of a specific root with a specific binyan: nirdam ‘fall asleep’ in Nif’al is one verb lemma, hirdim ‘cause to sleep’ is another distinct lemma, despite sharing the same root r-d-m. And tipel ‘take care of’ and sider ‘make orderly’ are two distinct verb lemmas, despite sharing the same binyan pattern Pi’el (Berman, 1993a,b; Schwarzwald, 1981). However, in accounting for the usage and early development of the verb system, we need to delve deeper into Hebrew verb morphology, capturing roots and binyan verb patterns as two networks that dynamically interact with each other across developmental time, hence best modelled by a single-layer bipartite network.

In the current study we focus on modeling the morphological level of language representation, aiming to show that the network model of cognitive representation captures the interconnected nature of morphological relations within the Hebrew verb lexicon – wordforms and their parts – in the same representational model. In our case, the morphological constructs of roots and patterns are nodes in a network, while actual wordforms are links between these nodes. Both nodes and links can be studied simultaneously, quantifying the relations between parts (morphemes) and wholes (words).

1.2 Root-based networks

Many studies point to the Semitic root as the most accessible Hebrew morpheme in spoken and written language development and usage (Ben-Zvi & Levie, 2016; Moscoso del Prado Martín et al., 2005; Deutsch & Kuperman, 2019; Gillis & Ravid, 2006; Ravid & Bar-On, 2005; Schiff et al., 2012), including contexts of language disability or environmental deprivation (Levie et al., 2017, 2019; Ravid & Schiff, 2006; Schiff & Ravid, 2007). Indeed, young Hebrew-speaking children demonstrate an early ability to extract roots from familiar words and use them in novel forms (Berman, 1985a, 2000, 2012; Ravid, 2003). While a root is not a verb, it functions as a shared consonantal skeleton that most often, especially in the speech addressed to young children and in their own speech, conveys the shared lexical meaning between verbs—e.g., r-d-m in nirdam ‘fall asleep’ and hirdim ‘make sleep’ (Levie et al., 2020). Therefore, roots are key in Hebrew morpho-lexical development.

1.2.1 Derivational families

One focus of this study is thus the emergence and growing complexity of root-based networks in the Hebrew verb lexicon, that is, clusters of verbs with different patterns sharing a single root. One manifestation of this network structure is canonical derivational verb families, where verbs with different binyan patterns are based on a single shared root (Bolozky, 1999; Ravid, 2020). For example, consider the following two root-based verb families: (1) lamad ‘learn’ (in the Qal binyan verb pattern), nilmad ‘be learned’ (in Nif’al), limed ‘teach’ (Pi’el), and hitlamed ‘apprentice’ (Hitpa’el) – all sharing root l-m-d ‘learn’; and (2) lavash ‘wear’ (Qal), nilbash ‘be worn’ (Nif’al), hilbish ‘put on’ (Hif’il), hulbash ‘be put on’ (Huf’al), and hitlabesh ‘dress oneself’ (Hitpa’el) – all sharing root l-b-s̆ ‘wear’. In principle, given the seven binyan patterns, root-based families can have up to seven members. However, participating in a derivational, semi-productive system, root-based families are of different sizes, and are fraught with unpredictability: empty cells, redundancy, semantic and structural inconsistencies (Berman, 1987; see Footnote 1). A recent study of Hebrew verb development across childhood to adulthood (Levie et al., 2020) reveals that most (70–80%) verbs heard and produced by young Hebrew-speaking children are in fact singletons; that is, they have no derivational root-based verb siblings in the same corpus, as demonstrated by shiker ‘lie’. Around age 3, about 25% of the verbs produced or heard by children eventually organize into small, two-member families, e.g., saraf ‘burn,Tr’ / nisraf ‘burn,Int’. It is only in later childhood and adolescence, and especially with the advent of linguistic literacy (Ravid, 2012), that Hebrew users start producing larger root-based families in their discourse. The increase in number, size and complexity of root-based verb families is a clear indicator of a growing verb lexicon (Levie et al., 2019; Ravid et al., 2016).
The expansion of the verb lexicon into more numerous, larger families in older and more literate speakers most probably also enhances the perception of root-based organization in root-sharing nouns and adjectives, as demonstrated in the following example of the full derivational family based on root g-d-l ‘grow’: gadal ‘grow’, higdil ‘enlarge’, hugdal ‘be enlarged’, gidel ‘raise’, gudal ‘be raised’, gadol ‘big’, megudal ‘physically grown’, megadel ‘grower’, magdélet ‘enlarging glass’, gódel ‘size’, gdila ‘growing’, hagdala ‘enlarging’, gidul ‘growth’, gdula ‘eminence’, and gadlut ‘greatness’ (Levie et al., 2017). In sum, an important feature of Hebrew verbs is their organization in networks of root-related derivational families which grow more complex and interact with more systems as language users grow older and more literate.
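The notions of singleton and root-based family size described above can be made concrete with a short sketch. It is only an illustration: the toy lemma list is assembled from verbs mentioned in this paper, the root strings for sider and tipel (s-d-r, t-p-l) are assumed transliterations, and the function name `family_sizes` is hypothetical rather than part of any published analysis pipeline.

```python
from collections import Counter, defaultdict

# Toy corpus of verb lemmas as (root, binyan) pairs.
# The roots s-d-r and t-p-l are assumed transliterations for sider and tipel.
lemmas = [
    ("l-m-d", "Qal"), ("l-m-d", "Nif'al"), ("l-m-d", "Pi'el"), ("l-m-d", "Hitpa'el"),
    ("r-d-m", "Nif'al"), ("r-d-m", "Hif'il"),
    ("s-d-r", "Pi'el"),
    ("t-p-l", "Pi'el"),
]


def family_sizes(lemmas):
    """Map each root to the number of distinct binyanim it occurs with."""
    families = defaultdict(set)
    for root, binyan in lemmas:
        families[root].add(binyan)
    return {root: len(binyanim) for root, binyanim in families.items()}


sizes = family_sizes(lemmas)

# How many roots form families of size 1 (singletons), 2, 3, and so on.
size_distribution = Counter(sizes.values())

# Proportion of roots that are singletons within this toy corpus.
singleton_share = size_distribution[1] / len(sizes)
```

Because singleton status is defined relative to a given corpus, the same root may count as a singleton in one sample and as part of a larger family in another.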

1.2.2 Inflectional families

Roots do not only relate verbs across the different verb patterns as derivational networks, but also play an important role in relating the temporal verb stems within each binyan in inflectional paradigms. Each of the seven entities termed binyanim consists of a phonologically unique bundle of five temporal patterns – past tense, present tense, future tense, imperative and infinitive forms – as depicted in Table 1 (see Footnote 2). For example, CaCaC, CoCeC and li-CCoC serve as the respective past, present and infinitive inflections of the pattern Qal. When combined with root k-t-b ‘write’, they yield the stems katav ‘wrote’, kotev ‘writes/writing’ and li-xtov ‘to-write’, respectively. In the same way, hiCCiC, maCCiC, yaCCiC and le-haCCiC serve as the respective past, present, future and infinitive patterns of Hif’il, combining with k-t-b to respectively yield hixtiv ‘dictated’, maxtiv ‘dictates/dictating’, yaxtiv ‘will dictate’ and le-haxtiv ‘to-dictate’. This means that temporal shifts within the same binyan paradigm also require the use of the same root with different patterns (see Footnote 3). Recent research (Ashkenazi et al., 2016, 2020; Ravid et al., 2016) indicates that young Hebrew-speaking children initially learn to manipulate roots and patterns in the inflectional shifts across the temporal stems in the paradigm of a verb lemma in a single binyan (most often the ubiquitous Qal), where semantic coherence of roots is highest. This is in fact the launching pad of non-linear formation in the verb system. It is only later on, at school age, that verb lemmas in different binyanim sharing the same root – i.e., derivational families – enrich the young verb lexicon (Levie et al., 2020).
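The combination of a root with a temporal pattern can be sketched as slot filling: each C in the pattern is replaced, in order, by the next root radical. The sketch below is a deliberate simplification and not a model of Hebrew phonology. It yields underlying forms only and does not apply surface alternations such as spirantization, so k-t-b with CaCaC comes out as "katab" rather than surface katav. The function name `interdigitate` is hypothetical.

```python
def interdigitate(root, pattern):
    """Fill the C slots of a pattern with the radicals of a hyphenated root.

    Returns an underlying form; surface phonology is not modeled.
    """
    radicals = root.split("-")
    if pattern.count("C") != len(radicals):
        raise ValueError("pattern slots and root radicals do not match")
    result = []
    position = 0
    for ch in pattern:
        if ch == "C":
            # Consume the next radical for each consonant slot.
            result.append(radicals[position])
            position += 1
        else:
            # Pattern vowels, prefixes and hyphens are copied unchanged.
            result.append(ch)
    return "".join(result)


qal_past = interdigitate("g-d-l", "CaCaC")        # gadal 'grow'
hifil_past = interdigitate("g-d-l", "hiCCiC")     # higdil 'enlarge'
piel_present = interdigitate("l-m-d", "meCaCeC")  # melamed 'teaching'
```

Roots such as g-d-l and l-m-d are used here because their radicals surface unchanged in these patterns, so the output matches the forms cited in the text.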

Table 1 The seven binyan paradigms as sets of temporal patterns

Noting the root-and-pattern structure in verb morphology has two consequences for accounts of Hebrew morphological acquisition. On the one hand, this is a facilitating property of the system, so that for the learning child, root-based relations in the verb system are not confined to derivation, and can be construed by attending to the root-pattern temporal bases within the same binyan (Ashkenazi et al., 2016, 2020). This is important, given the fact that the Qal pattern, which occupies about 80% of the verb tokens heard or produced by children up to three years of age, has the most phonologically distinct temporal patterns, a boost to the transparency-aided acquisition of root and pattern structure (Ravid, 2019). But on the other hand, this means that instead of eventually acquiring the morphology of seven binyan patterns, Hebrew speaking children are actually faced with 31 binyan-specific temporal patterns (Table 1) that need to be learned (see Footnote 2). While some temporal patterns are phonologically similar (e.g., the temporal paradigm of Hitpa’el), others display more phonological distinctions (e.g., the temporal paradigms of Qal, as noted above, and Nif’al).

Lexical expansion drives and promotes the formation of root-based networks in young learners. Given the prominence of the root morpheme in the Hebrew lexicon, this process is critical in the acquisition of verb morphology (Berman, 1987; Ravid, 2003). The larger, more numerous and variegated root-based verb networks (both temporal and derivational) in the lexicon of the language learner – the more complex, productive and abstract the organization of the lexical network relying on roots (Levie et al., 2020).

1.3 Pattern-based networks

In addition to root-based systematicity, a second network organizing the Hebrew verb lexicon is pattern-based, where verbs with different roots share the same binyan pattern. For example, the verbs higbir ‘make stronger’, higdil ‘make bigger’, histir ‘hide,Tr’ and hiklit ‘record’ all share the Hif’il pattern, with each based on a different root. This organization is no less critical for morphological learning than root-based networks, from two different points of view. From a morpho-phonological perspective, the formation of pattern-based networks highlights the shared vocalic structure of verbs. If we take into account the more specific notion of binyan-temporal pattern described above, this network will include learning all five temporal stem forms unique to each (non-passive) binyan and relating them to each other to form the abstract binyan category. To illustrate the central role of this network, think about noting the formal resemblance of verbs sharing the meCaCeC present-tense Pi’el pattern (e.g., medaber ‘talking’, meshaker ‘lying’, melamed ‘teaching’), the similarity of their temporal semantics, and their relation to other Pi’el patterns such as past-tense CiCeC in diber ‘talked’, shiker ‘lied’, and limed ‘taught’ respectively. Evidence of errors from toddlers and young children acquiring the binyan-temporal system indicates that it takes time and linguistic experience for this knowledge to crystallize towards the beginning of elementary school (Berman, 1982; Ravid, 1995).

Pattern-based networks are central to verb acquisition from a second point of view, as they underscore the syntactic-semantic functions typically associated with the binyan system. While high-frequency Qal has both transitive (e.g., shalax ‘send’) and intransitive (avad ‘work’) verbs, the other (non-passive) members of the system display two systematic tendencies: Hif’il and Pi’el mostly express high transitivity and causativity (hilbish ‘dress,Tr’, kipel ‘fold’), whereas Nif’al and Hitpa’el mainly express low transitivity, middle voice, and inchoativity (nivhal ‘get scared’, hitmale ‘fill up,Int.’) (Berman, 1993a,b; Kastner, 2019). Therefore, morpho-lexical knowledge of pattern-based derivational paradigms is central in gaining command of Hebrew syntactic constructions and argument structure.

Recently, evidence has been accumulating that the seven binyan patterns in fact consist of two semi-redundant sub-systems (Dattner et al., 2021; Levie et al., 2019; Ravid, 2020), each expressing the same set of transitivity functions and syntactic relations. Sub-system I – Qal, Nif’al, Hif’il, and Huf’al – has the most verb types and is used most frequently (Ravid et al., 2016), while sub-system II – consisting of Pi’el, Pu’al and Hitpa’el – has been extremely productive since the revival of Modern Hebrew (Bolozky, 2007; Schwarzwald, 2002). This classification has historical motivations (Sivan, 1976), and is also currently grounded in morpho-phonological similarity (Schwarzwald, 1996) and derivational affinity (Bolozky, 2007). The developmental analyses in Levie et al. (2020) show that this dual system is a highly efficient platform for expanding the verb lexicon across development. It enables the early learning of binyan forms and functions and root linkage via small networks of verbs within the same sub-system (usually the older, more frequent sub-system I), efficiently organizing lexical knowledge into categories that support the emergence of basic syntactic relations. Consider, for example, the derivational root family based on root k-n-s, where sub-system I contains a low transitivity Nif’al verb – nixnas ‘enter’, and a causative Hif’il verb – hixnis ‘bring in’, as well as its passive form huxnas ‘be brought in’; whereas sub-system II has again a causative verb, this time in Pi’el – kines ‘assemble,Tr’, its passive form kunas ‘be assembled’, and again a low-transitivity verb, this time in Hitpa’el – hitkanes ‘assemble,Int’.
Older speakers/writers gain command of the subtle differences expressed by specific verbs sharing patterns with similar functions across the two systems, creating semi-productive (i.e., minor or less generalizable), weak links of the type discussed in Landauer and Dumais (1997), which organize the binyan system in its lexically and morphologically rich adult form.

1.4 Research goals

A great deal of research into the nature of the morphological organization of the mental lexicon concerns modeling productivity and comprehension (e.g., Baayen, 2007, 2009; Deutsch & Meir, 2011; Lõo et al., 2018; Moscoso del Prado Martín et al., 2004; Plag, 2006; Plag & Baayen, 2009). This line of research has yielded several crucial findings regarding the role of frequency and information-theoretic measures in modeling morphological cognitive representation. For example, whole-word family size and paradigm entropy were found to affect processing, recognition and production, as well as productivity. These studies highlight the benefits of modeling the mental lexicon in information-theoretic perspective. For example, Moscoso del Prado Martín et al. (2005) show that this approach leads to understanding processing and morphological representation in typologically different languages.

The model proposed in the present paper is embedded in such representation as it encompasses both morphemes and wordforms in a single network (as first delineated in Dattner et al., 2021; Levie et al., 2019). This investigation is in line with the paradigmatic view of morphology (i.e., a Word and Paradigm perspective), which considers wordforms as “representing types of configurations of elements and whole surface word forms as elements in a network of related word forms” (Ackerman & Malouf, 2013, p. 431), as well as with the information-theoretic, word and paradigm framework (Blevins, 2014, 2016, 2013). While we recognize the importance assigned to whole (complex) wordforms in our model, we emphasize the psychological reality of the sub-lexical, morphological constructs of roots and patterns (Deutsch & Kuperman, 2019; Frost et al., 1997, 2000; Moscoso del Prado Martín et al., 2005; cf. Bat-El, 2017). Specifically, we do so by proposing a model encompassing both the morphemes and the wordform simultaneously. In this model, morphemes are taken to be nodes in a network, and wordforms to be links between nodes. The paradigm of the link (that is, of the wordform) is composed of its immediate nodes, as well as of the links going out from each of its nodes: words that share a pattern on the one hand, and words that share a root, on the other.
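The notion of a link's paradigm can be illustrated with a minimal sketch, assuming links are stored as (root, pattern) pairs. The example links are verb temporal lemmas of the kind listed in Sect. 2.2.1; the function name `paradigm` and the data layout are illustrative assumptions, not the authors' implementation.

```python
# Each link (wordform) connects one root node to one pattern node.
links = {
    ("k-t-b", "Qal.Future"),
    ("k-t-b", "Qal.Past"),
    ("k-t-b", "Hif'il.Future"),
    ("s-p-r", "Qal.Past"),
    ("s-p-r", "Pi'el.Past"),
}


def paradigm(link, links):
    """Return the other links leaving the two nodes of a given link.

    The first set holds words sharing the root, the second holds words
    sharing the pattern.
    """
    root, pattern = link
    same_root = {other for other in links if other[0] == root and other != link}
    same_pattern = {other for other in links if other[1] == pattern and other != link}
    return same_root, same_pattern


same_root, same_pattern = paradigm(("k-t-b", "Qal.Past"), links)
```

For the link k-t-b + Qal.Past, the root-sharing set contains the two other k-t-b lemmas and the pattern-sharing set contains s-p-r + Qal.Past, so a single wordform is tied to both kinds of morphological family at once.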

We further argue that the Hebrew verb system is a dynamic network of roots and patterns best captured by Ackerman and Malouf’s (2013) designation of a “systemic organization underlying the surface patterns” (p. 435). Taking a system-level perspective, and given the two types of morphological verb families described above, we claim that a word’s paradigm is not an isolated entity within the lexicon. Rather, the lexicon is a continuum in the sense that each cell in the paradigm is related to other words’ paradigms as well, in an interpredictable manner (Blevins, 2016). This is represented in our model through the continuing relations emerging between roots, patterns, and wordforms. With regard to development, we show that network-based measures have meaningful implications, and that the emergence of paradigms (i.e., morphological categories) can be described in terms of network structure.

Note, however, that we do not propose a model for language production or comprehension. Rather, we propose a model for the emergence of the system’s structure. The system structure, we argue, is composed of both parts and wholes (i.e., morphemes and words). That is, while processing can be accounted for without relying on specific representations for sub-words, morphemic constructs (Baayen, 2009; Blevins et al., 2016), the resulting emergent structure, we argue, includes representation for roots and patterns in Hebrew (Deutsch & Kuperman, 2019; Frost, 2012; Moscoso del Prado Martín et al., 2005). And while the question of whether this structure might have an effect on processing is still under investigation, recently it was shown to have an effect on acquisition (that is, the emergence of the system) through systematic adaptation between child speech and child directed speech (Dattner et al., 2021).

Against this background, the present study aims to model the dynamic nature of the development of the morphological systems described above: root-based and binyan-based families, with focus on the emergence of the two sub-systems. To achieve our objectives, we adopt a dynamic systems approach to our data, using Network Analysis.

2 Data and method

The analyses were carried out on ten sub-corpora with a total of 458,828 words: nine corpora of spoken language in the modes of Infant Directed Speech (IDS), Child Directed Speech (CDS), Child Speech addressed to caretakers (CS), and children’s spontaneous Peer Talk (PT); and one corpus of texts of children’s books. All participants in the spoken language corpora were typically developing, native monolingual Hebrew speakers from mid-high SES background. The children’s texts were written (or translated) by native Hebrew speaking authors. A detailed description of each sub-corpus is provided below.

2.1 Composition of the database

2.1.1 Infant directed speech (IDS)

One dyad of mother and female infant was recorded for a total of 4 hours at four points of time – 3, 6, 9 and 12 months – in natural interaction, yielding a corpus of 4,906 word tokens with 1,569 verb tokens (Peleg, 2013).

2.1.2 Child directed speech (CDS) and child speech (CS)

Two dyads, a boy and (mostly) his mother, and a girl and (mostly) her mother (toddlers aged 1;8–2;2) were densely recorded for six months, three times a week, one hour each time, in natural spontaneous interactions during mealtime, bath time, and play time. The 97 hours of interaction were transcribed and coded, yielding 299,461 word tokens in parental CDS and 72,086 word tokens in CS, respectively containing 54,810 verb tokens in CDS, and 7,706 verb tokens in CS (Ashkenazi, 2015).

2.1.3 Peer talk of children aged 2–8 years

Six groups of children between the ages of 2–8 years, three triads in each age group, were recorded in 30-minute long conversations during spontaneous play (Zwiling, 2009). The two youngest groups of children were 2- and 2;6-years old respectively, followed by three consecutive groups of 3-, 4- and 5-year olds, and a group of 7-year olds. All conversations were compiled into a total of 9 hours of transcribed recordings of all age groups, altogether yielding 32,991 word tokens with 6,073 verb tokens.

2.1.4 Children’s books

This sub-corpus, containing 49,384 word tokens with 10,943 verb tokens, was based on children’s storybooks targeting toddlers and preschoolers, which were originally composed or translated by expert native speaking writers of Israeli children’s literature; and school texts, primarily narratives, for beginner readers in 1st and 2nd grades (ages 6–7 years), composed in Hebrew by child education experts (Grunwald, 2014).

2.1.5 Sampling the data

In order to provide a developmental argument from the differently compiled data sets, we sampled our data according to average number of verb tokens per minute. The shortest recording session in our database is within the peer talk corpus, comprising 1.5 hours of speech for each age group. Accordingly, we sampled the equivalent of 1.5 non-consecutive hours of speech from each corpus, based on the total number of tokens and the total number of hours of recordings for that corpus. For example, the IDS data consists of 4 hours of recordings, which yielded 1,569 verb tokens. Consequently, the network is analyzed on a sample of 588 tokens (1.5 × (1569/4)=588). The children’s books corpus was sampled based on the speech rate in the CDS corpus, assuming parents are the main readers of these books. We account for the ten networks as ten points in a dynamically evolving network, analyzing the development of network measures as obtained in each instance of network. We focus on the measures detailed below.
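The sampling arithmetic can be written out as a small helper. This is a sketch under stated assumptions: the function name `sample_size` is hypothetical, and rounding to the nearest whole token is assumed because the text reports 588 for the IDS corpus.

```python
def sample_size(verb_tokens, recorded_hours, target_hours=1.5):
    """Number of verb tokens equivalent to target_hours of speech.

    Scales the corpus-wide verb-token rate to the target duration and
    rounds to the nearest whole token (rounding is an assumption).
    """
    tokens_per_hour = verb_tokens / recorded_hours
    return round(target_hours * tokens_per_hour)


# IDS corpus: 1,569 verb tokens in 4 hours of recordings.
ids_sample = sample_size(1569, 4)
```

With the IDS figures given above, the helper returns 588, matching the sample size stated in the text.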

2.2 Method

2.2.1 Network components: Nodes and links

The present study models the morphological system of Hebrew verbs as a bipartite network, in which the nodes belong to two mutually exclusive types, and links exist only between a node of type A and a node of type B (but not within types). That is, we do not model a network of words that are related to each other as a function of some similarity measure or on the basis of shared functions. Rather, the networks in the current study are composed of the following three morphological entities:

  1. Root - the Semitic consonantal construct, e.g., k-t-b, s-p-r or r-d-m. The root is the first node type in the bipartite network.

  2. Binyan-specific temporal pattern, as in the following examples (see Table 1 for the full array of temporal patterns across the binyan system):

    • CoCeC is the Qal present tense pattern;

    • le-hiCaCeC is the Nif’al infinitive pattern;

    • hiCCiC is the Hif’il past tense pattern;

    • and yeCaCeC is the Pi’el future tense pattern.

    The pattern is the second node type in the bipartite network.

  3. Verb temporal lemma, the noninflected (see Footnote 4) combination of a root and a binyan-temporal pattern, as in the following examples. Note that verb temporal lemmas are not specified for person, number, or gender.

    • k-t-b + Qal.Future, yielding the verb temporal lemma ‘will write’;

    • k-t-b + Qal.Past, yielding the verb temporal lemma ‘wrote’;

    • k-t-b + Hif’il.Future, yielding the verb temporal lemma ‘will dictate’;

    • s-p-r + Qal.Past, yielding the verb temporal lemma ‘counted’;

    • s-p-r + Pi’el.Past, yielding the verb temporal lemma ‘told’.

    Verb temporal lemmas constitute the links in the bipartite network.
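As a minimal sketch of this data structure, the five example lemmas listed above can be stored as (root, pattern) pairs, from which the two node types and the links follow directly. The variable names are illustrative only and do not come from the study.

```python
# Each verb temporal lemma is a (root, binyan-temporal pattern) pair, i.e., a link.
lemmas = [
    ("k-t-b", "Qal.Future"),
    ("k-t-b", "Qal.Past"),
    ("k-t-b", "Hif'il.Future"),
    ("s-p-r", "Qal.Past"),
    ("s-p-r", "Pi'el.Past"),
]

roots = sorted({root for root, _ in lemmas})            # first node type
patterns = sorted({pattern for _, pattern in lemmas})   # second node type

# Adjacency sets: a link only ever joins a root to a pattern,
# never two roots or two patterns.
neighbors = {node: set() for node in roots + patterns}
for root, pattern in lemmas:
    neighbors[root].add(pattern)
    neighbors[pattern].add(root)

print(len(roots), len(patterns), len(lemmas))
print(sorted(neighbors["Qal.Past"]))
```

In this toy example, Qal.Past is shared by both roots, which is the kind of pattern-based family the networks are meant to capture.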

2.2.2 Network measures: Motivating network analysis as a methodological framework for analyzing morphological development

Network analysis uses a variety of measures that shed light on different aspects of the data (Brandes & Erlebach, 2005; Kolaczyk, 2009; Siew et al., 2019). The present paper focuses on four network measures, highlighting their relevance in modeling the cognitive representation of the morphological networks in the verb lexicon. We employ two measures related to the nodes of the network – Degree Centrality and Eigenvector Centrality – corresponding to individual morphological constructs; and two measures related to the network as a whole – Density and Modular Structure – corresponding to relations among constructs.

To exemplify these measures, consider Table 2, delineating a mock-up corpus consisting of four verb lemmas, composed of three root types and three past tense pattern types. This corpus can be represented as a network, as shown in Fig. 1.

Fig. 1 Mock-up network: Nodes and links, based on the data in Table 2

Table 2 Mock-up corpus: Roots, patterns, and verb lemmas

Table 3 is the network’s adjacency matrix, representing the network in a mathematical form: a value of 1 represents a link between two nodes (i.e., a wordform found in the network), and a value of 0 indicates that there is no link between the corresponding nodes.

Table 3 Adjacency matrix representation of mock-up network in Fig. 1

Nodes can be measured for their Centrality in the network from different perspectives. We focus here on two points of view: Degree Centrality, and Eigenvector Centrality.

The Degree Centrality measure corresponds to the number of links a node has with other nodes in a network. For example, taking a vector to be an arrangement of numbers, the vector v of the root s-d-r in the mock-up network’s adjacency matrix in Table 3 is as follows:

$$ v_{s-d-r} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix} $$

That is, within the context of our mock-up network, the root s-d-r can be represented as the arrangement of numbers [0 0 0 1 1 0]. Accordingly, the Degree Centrality value of s-d-r is the sum of the entries of its vector, namely 2. Indeed, the root is linked to two other nodes, as seen in Fig. 1: the pattern Pi’el, yielding the verb sider ‘arrange’, and the pattern Hitpa’el, yielding the verb histader ‘arrange,Int’. A node with a high degree value represents a higher linkage level of the corresponding construct, as it participates in more language events. The Degree Centrality measure can also reveal hubs within the network. In our case, we define hubs as those nodes that have a degree value which is higher than that of 95% of the nodes in the data (Hwang et al., 2012; Oldham & Fornito, 2019; see Footnote 5). We hypothesize that the number of hubs will increase with age, representing networks with more high-linkage sites. In morphological terms, a low-linkage network with a low number of hubs has a skewed distribution of roots and patterns, such that usage is probabilistically restricted to a small set of verbs. On the other hand, a highly linked network with a higher number of hubs has a more even distribution of roots and patterns, representing a more balanced, versatile verb lexicon.
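The degree computation can be sketched on a reconstruction of the mock-up adjacency matrix. Only the s-d-r row and the overall degree values are given in the text; the node order (three roots followed by three patterns), the placeholder names `root_2` and `root_3`, and the choice of which placeholder root links to Hitpa’el are assumptions made for illustration.

```python
nodes = ["s-d-r", "root_2", "root_3", "Hitpa'el", "Pi'el", "Qal"]

adjacency = [
    [0, 0, 0, 1, 1, 0],  # s-d-r: the vector given in the text
    [0, 0, 0, 1, 0, 0],  # root_2 (placeholder), assumed linked to Hitpa'el
    [0, 0, 0, 0, 0, 1],  # root_3 (placeholder), assumed linked to Qal
    [1, 1, 0, 0, 0, 0],  # Hitpa'el
    [1, 0, 0, 0, 0, 0],  # Pi'el
    [0, 0, 1, 0, 0, 0],  # Qal
]

# Degree Centrality of a node is the sum of its row in the adjacency matrix.
degree = {name: sum(row) for name, row in zip(nodes, adjacency)}
print(degree)
```

Under these assumptions the matrix has four links and six nodes, and only s-d-r and Hitpa’el have a degree of 2.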

A second centrality measure used here is the Eigenvector Centrality of particular nodes (roots or patterns in our case). Eigenvectors of a matrix are special vectors (a vector being an arrangement of numbers, as seen above) whose direction is preserved when they are multiplied by that matrix: they are only stretched or shrunk by a constant factor, the eigenvalue. Due to this stability, eigenvectors are used to reduce dimensionality in multidimensional data, finding the principal components of the data. While a principal component is built on the actual data, it represents the data with fewer dimensions, enabling its description from a new, and simpler, perspective. In network analysis, Eigenvector Centrality measures the importance of a node by assigning it the importance of its neighbors. It is an iterative computation starting with multiplying the network’s adjacency matrix (e.g., Table 3) by a vector composed of the nodes’ degree values. In our mock-up network (Fig. 1), this vector is [2 1 1 2 1 1], indicating that two nodes (root s-d-r and pattern Hitpa’el) are linked to two nodes each, and four nodes are linked to one other node each. In this way, each node is assigned a value based on the values of its neighbors, yielding a new centrality vector. This process is iterated (multiplying the network’s matrix by the resulting vector) until an equilibrium is reached in which all the values grow by the same factor from one iteration to the next. This equilibrium is reached at the matrix’s principal eigenvector, that is, the eigenvector associated with its largest eigenvalue, which stays on its own span under multiplication by the matrix. The entries of this vector assign a new centrality value to each node, namely its Eigenvector Centrality.
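The iteration described above can be sketched in plain Python as a power iteration. The adjacency matrix is a reconstruction of the mock-up network: the placeholder roots `root_2` and `root_3` and their links are assumptions, since only the s-d-r row and the degree values are given in the text. One further assumption is that each step adds the node's own previous value (iterating with the adjacency matrix plus the identity), which leaves the eigenvectors unchanged but keeps the iteration from oscillating on a bipartite graph; the study's own software may proceed differently.

```python
nodes = ["s-d-r", "root_2", "root_3", "Hitpa'el", "Pi'el", "Qal"]
adjacency = [
    [0, 0, 0, 1, 1, 0],  # s-d-r
    [0, 0, 0, 1, 0, 0],  # root_2 (placeholder)
    [0, 0, 0, 0, 0, 1],  # root_3 (placeholder)
    [1, 1, 0, 0, 0, 0],  # Hitpa'el
    [1, 0, 0, 0, 0, 0],  # Pi'el
    [0, 0, 1, 0, 0, 0],  # Qal
]


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I), started from the degree vector."""
    n = len(adj)
    x = [float(sum(row)) for row in adj]  # degree values as starting vector
    for _ in range(iterations):
        # New value = own previous value + sum of the neighbors' values.
        nxt = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        x = [v / norm for v in nxt]  # rescale so the values stay bounded
    return x


centrality = dict(zip(nodes, eigenvector_centrality(adjacency)))
for name, score in centrality.items():
    print(f"{name:10s} {score:.3f}")
```

Under these assumptions, s-d-r and Hitpa’el receive the highest score, Pi’el a lower positive one, and Qal a score that shrinks toward zero, even though Pi’el and Qal have the same degree.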

Eigenvector Centrality is taken here to reflect a node’s importance (Bonacich, 2007; Lohmann et al., 2010; Oldham et al., 2019). A node with high Eigenvector Centrality is linked to many other nodes that, in turn, are linked to many other nodes. In non-directed networks, as in the present study, such nodes are said to be in a central, prominent position. Thus, in our case, a binyan-temporal pattern that is linked to many roots that are linked to other binyan-temporal patterns has high Eigenvector Centrality. Since this measure quantifies the significance of a node relative to other nodes in the network, it can reveal those morphological constructs that act as centers of gravity, and changes in the centrality of a particular construct can be measured during development. We hypothesize that node centrality will change through development in a dynamic, non-linear manner, reflecting changes in discourse circumstances. Crucially, these changes are not a matter of mere frequency, but rather of the frequency of links with other frequent nodes.

Figure 2 provides a comparison of the construct-specific measures of Degree Centrality and Eigenvector Centrality as they are projected on the mock-up network presented above. Note that while two nodes may have the same degree value (e.g., the past tense patterns of Qal and Pi’el in Fig. 2(b)), they might be attributed with a different Eigenvector Centrality value due to their respective links with other nodes in the network (as seen in Fig. 2(c)).

Fig. 2 Mock-up network: Two types of node measures, comparing degree (b) and eigenvector centrality (c), relative to a non-measured network (a)

The importance of (and difference between) the two measures is highlighted in Dattner et al. (2021), a longitudinal study of dense recordings, aiming at revealing patterns of adaptation between CDS (Child Directed Speech) and CS (Child Speech) morphological networks. Dattner et al. (2021) showed that CDS Eigenvector Centrality levels were affected by the Degree Centrality levels of the CS network in the antecedent recording, and that the Degree Centrality levels of a CS morphological network were (marginally) affected by the Eigenvector Centrality of both the antecedent and the corresponding CDS networks.

The third measure illustrated here is the Density of the network, quantifying fulfilled links between nodes (Wasserman & Faust, 1994). Network Density is calculated as:

$$ d = \frac{2E}{N(N-1)}$$

where E is the number of links in the network (edges), and N is the number of nodes. Given the data in Table 2, our mock-up network has four links (i.e., four temporal verb lemmas) and six nodes (i.e., three root types and three pattern types). The mock-up network’s density is thus (2 × 4)/(6 × (6 − 1)) = 8/30 ≈ 0.267. Network Density in the current paper is calculated on simplified versions of the networks, that is, after collapsing multiple links (edges) between the same pair of nodes into a single link, since density is ill-defined for graphs with multiple edges (Wasserman & Faust, 1994).
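A minimal sketch of the density formula, including the simplification step that collapses repeated links, is given here. The function name and the link list are illustrative; `root_2` and `root_3` are placeholders for the two mock-up roots not named in the text, and the repeated link is added only to show the simplification.

```python
def network_density(links, n_nodes):
    """Density d = 2E / (N(N - 1)), computed after collapsing repeated links."""
    unique_links = {frozenset(link) for link in links}  # simplify the graph
    n_edges = len(unique_links)
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


links = [
    ("s-d-r", "Pi'el"),
    ("s-d-r", "Hitpa'el"),
    ("s-d-r", "Hitpa'el"),   # repeated token of the same lemma, collapsed
    ("root_2", "Hitpa'el"),
    ("root_3", "Qal"),
]
print(round(network_density(links, 6), 3))
```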

Network Density measures the proportion of observed links relative to the maximum number of possible links: A network can have only so many links (Wasserman & Faust, 1994); the closer the density is to one, the more possible links are actually manifested. A fully dense network is said to be complete. In the current context, Network Density is regarded as representing the growth potential of the system: A dense network has low growth potential to the extent that it has exhausted its potential to form new links given its current state (Levie et al., 2019). Conversely, a sparse network means that the pool from which one can choose how to verbalize experience is not exhausted, and new forms (i.e., new links between existing, unconnected roots and patterns) are available for use.

The conceptualization of Network Density as representing growth potential is related to measures of productivity proposed in the literature (Plag, 2006), and specifically, to Potential Productivity as defined in Baayen (2009). The potential productivity of a morphological rule is related to the ratio of hapax legomena of the rule and its total number of tokens, and is a measure for the growth rate of the rule. For example, a new word incorporating the Dutch nominal feminine agent suffix -STER is easier to think of than a neologism with the unmarked agent noun suffix -ER. This is so even though -ER is the unmarked, more productive suffix (Baayen, 2009). The reason is that -ER has already been used with most of the verbs available to it, whereas the less productive -STER has not, and can therefore be extended more easily to new contexts. Adopting Baayen’s (2009) metaphor, consider a company that has a large share of the market (i.e., a very productive morphological rule in a sense), but hardly any future buyers left, since all of the potential buyers have already bought the company’s product. Such a company (or a rule) has a low Potential Productivity, to the extent that the market is saturated. That is, while a category or a rule might be frequent and productive, it may still be in danger of not encompassing new tokens. Conversely, a category or a morphological rule that has a low risk of saturation has greater Potential Productivity. Importantly, looking at Potential Productivity we may conclude that productivity can be “a self-defeating process” (Baayen, 2009): a rule that is already used very frequently may have less potential for further expansion.

Extending our perspective from the productivity of categories and rules towards quantifying the productivity of morphological systems, the focus of the current endeavor, we propose the network-based measure of Density to estimate a system’s growth potential. This measure does not concern the potential productivity of a particular category, but rather the growth potential of the network as a whole, namely, the morphological system within development. Thus, in addition to type and token frequencies, we also consider the ratio between actual and possible links within the data, indicating whether a morphological system has the potential to grow based on its current status. As a metaphor, consider a network of highways between cities. One way to develop the system (i.e., make it bigger) is to build more cities (network nodes), and to pave new roads leading to these new cities (network links). Another way to develop the system, given its current state, is to pave roads between existing cities that are not yet connected. In this case, we can use the current system and enlarge it by linking unconnected nodes. The principal process to enlarge a system in which many of the cities are linked by roads (a dense system) is to build new cities. Conversely, a system in which many of the cities are not yet connected (a sparse system) can be enlarged by paving new roads, connecting existing cities (in addition to building new cities). That is, a dense system is an exhausted system, in the sense that its growth potential based on its current status is fulfilled. However, a sparse system is unexhausted, in the sense that its potential to grow based on its current status is higher and yet to be fulfilled.

Leaving the metaphor and going back to morphology, we quantify the growth potential of a morphological system using its Network Density measure, such that high density indicates lower growth potential (given the current network status), and vice versa. A network with a high growth potential allows the speaker to verbalize more (and new) fine grained aspects of events in the world by using a root or a pattern in a large variety of contexts and circumstances. Conversely, an exhausted network limits the speaker to the verb forms she has already used, and to the contexts and circumstances she has already verbalized.

The final measure we use here, inherent to the analysis of morphological development as dynamic networks, is Community Structure and Modularity (Estrada, 2009). Communities within a network (also known as clusters or modules) are groups of nodes that are densely interconnected, forming a dense subgraph. Modularity is a statistical measure that calculates the difference between the observed fraction of intra-community edges and the fraction expected in an equivalent random graph, i.e., a null model. Community Structure and Modularity are used together here since the optimal community partition of a network can be found by searching for the partition that maximizes modularity (Beckett, 2016), especially in bipartite networks such as the present case, in which links connect nodes of two different types (Pesantez-Cabrera & Kalyanaraman, 2016). Community detection algorithms identify clusters of nodes that are more likely to be linked among themselves than with other nodes in the network, and modularity evaluates the network structure in terms of separate subsets, forming modules, or compartments.
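To make the "observed minus expected" idea concrete, the following sketch computes modularity for a given partition using the standard Newman-Girvan form for undirected, unweighted graphs. This is an assumption made for illustration: the bipartite variants cited above use a different null model, and the study's own computation may differ. The edge list is the reconstructed mock-up network, with `root_2` and `root_3` as placeholder names.

```python
def modularity(edges, community_of):
    """Newman-Girvan modularity: sum over communities of
    (fraction of links inside the community) - (expected fraction)."""
    m = len(edges)
    internal = {}    # number of links with both ends in the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


edges = [
    ("s-d-r", "Pi'el"),
    ("s-d-r", "Hitpa'el"),
    ("root_2", "Hitpa'el"),
    ("root_3", "Qal"),
]
two_groups = {"s-d-r": 0, "root_2": 0, "Hitpa'el": 0, "Pi'el": 0,
              "root_3": 1, "Qal": 1}
one_group = {node: 0 for node in two_groups}

print(modularity(edges, two_groups))  # partition into two communities
print(modularity(edges, one_group))   # everything in one community
```

Splitting the network into its two disconnected parts yields a higher score than lumping all nodes together, which is the sense in which a partition can be selected by maximizing modularity.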

Aiming to assess the emergence of morphological categories as communities within a network, rather than the transition of information, we adopt the Louvain community detection algorithm. The Louvain algorithm, also known as the Multilevel algorithm, seeks to maximize modularity by merging nodes into communities (Smith et al., 2020). It differs from the Walktrap and Infomap community detection methods in that it does not assume that nodes within a community are likely to be connected by shorter random walks. From a systematic-morphological perspective, we find the modularity maximization algorithm more suitable for accounting for the emergence of coherent morphological categories.

We take communities within a morphological network to represent emergent morphological categories, that is, sets of related roots and binyan-temporal patterns that act as references for verbalization. These morphological constructs are the formal pole of a form-function pair. Thus, a network with a large number of small morphological communities constitutes a dichotomous, less productive conceptual space: if a set of roots is exclusively linked to one pattern, a conceptualization that demands the link of one of these roots with another pattern is harder to achieve. That is, we assume that roots or patterns that are part of a small, modular community will have a low probability of being linked to roots or patterns from other communities (Benedek et al., 2017; Kenett et al., 2014). Thus, high modularity and a structure of small communities mean low system-wide productivity.

2.2.3 Hypotheses

Previous work shows that changes in network structure represent morphological development. Specifically, network density has been shown to decrease with age, to differentiate children from high vs. low socio-economic status, and to be related to Degree and Eigenvector Centrality distribution (Dattner et al., 2021; Levie et al., 2019). In the current paper we use Network Density as a proxy for morphological development, hypothesizing that the growth potential of the morphological system will affect its productivity in terms of network structure. Furthermore, we hypothesize that young networks will show a low number of network hubs, a highly modular structure, and a large number of small communities. Conversely, older networks are hypothesized to consist of more network hubs, and a few but large communities, indicating the consolidation of abstract morphological categories and systematic relations between roots and patterns. Regarding Eigenvector Centrality, we hypothesize that changes will occur through development in the centrality of particular roots and patterns relative to the other roots and patterns in the system. As presented above, these hypotheses are related to the paradigmatic conceptualization of morphology (Ackerman & Malouf, 2013; Blevins, 2016), and specifically to the role of morphological family size in the organization of morphological knowledge (Deutsch & Kuperman, 2019; Moscoso del Prado Martín et al., 2004).

3 Results and discussion

Table 4 provides the number of nodes and links in each network. Recall that in the present case, links correspond to temporal verb lemma tokens, and nodes correspond to root and pattern types. For example, a token of the root l-m-d and the pattern Pi’el-past tense constitutes one link yielding limed ‘taught’, while a token of l-m-d and the pattern Pi’el-present tense constitutes another link, yielding melamed ‘teaches’. Networks represent a 1.5 hour sample from each age group, based on the number of verb lemmas as detailed above. That is, the number of network links was determined by the sample, while the number of nodes is a factor of the particular network structure.

Table 4 Network sizes per 1.5 recording hours

Node numbers increase with age, meaning that more types of roots and binyan patterns are used in a time slot of 1.5 hours. Note, however, that while the IDS and CDS corpora consist of speech by adults, they have fewer nodes than the 5;0–6;0 and 7;0–8;0 Peer talk (PT) corpora. The number of links per 1.5 hours, corresponding to temporal verb lemma tokens, clearly increases with age in two trends: one rising trend for the output data (CS–8;0), and another rising trend for the input data (IDS–Children’s books). The clear division of the data into two trends underscores the difference between the PT corpora, on the one hand, and the parental corpora, on the other, with IDS having fewer tokens than the young children (2;6–3;0) engaged in Peer talk.

Figure 3 shows the ten networks in a consecutive manner: First the output of children, from the youngest corpus (CS, 1;8–2;2, Fig. 3a) through the PT recordings between the ages of 2;0–8;0 (Figs. 3b–3g). Then, the input to children is shown, starting from the IDS (0;3–1;0, Fig. 3h), followed by the CDS (1;8–2;2, Fig. 3i), and finally, the children’s books network (Fig. 3j). Root and binyan-pattern type frequencies are graphically represented by the number of nodes (light blue nodes representing roots, and red nodes representing binyan-patterns; color figure online). Lemma type frequency is represented by the number of unique links. Lemma token frequency is not graphically represented in the present networks, for better readability. The size of the nodes represents Eigenvector Centrality, as described above: a node that is linked to many nodes that are themselves linked to many nodes gains centrality in the network.

Fig. 3 Root and pattern networks

Several visual observations can be pointed out by examining the data points in Fig. 3. First, these networks show the fundamental organization of the Hebrew verb lexicon into binyan-pattern-based families as links between a single pattern node and several root nodes. For example, Fig. 3b shows that the pattern signifying Qal-present tense is linked to more than ten different root nodes as early as in the youngest PT group. Second, these figures portray the growth in the number of root-based families through development, depicted as links between a single root node and several pattern-nodes. This can be seen, for example, in Fig. 3g, where a single root node is linked to Qal-infinitive, Hitpa’el-past, Hitpa’el-present, Hitpa’el-future, and Hitpa’el-infinitive. Third, the networks in Fig. 3 demonstrate that the Hebrew verb lexicon is inherently organized into two sub-systems of verb patterns. Sub-system I (Qal-Nif’al-Hif’il-Huf’al) is shown to be more frequent than sub-system II (Pi’el-Pu’al-Hitpa’el) in all ten networks, and both sub-systems show more linkage within themselves than between systems. Fourth, and relatedly, more connections between the two sub-systems are formed with age, as shown by single roots that are linked to patterns which belong to two sub-systems, representing two (or more) verb lemmas that share a root but belong to patterns from different sub-systems. Such links barely exist in the youngest group (Fig. 3a), they increase a bit more in the four through eight year old PT (linking Qal and Hitpa’el, for example, Fig. 3g), and are much more frequent in the adult CDS and children’s books networks (Figs. 3i and 3j).

3.1 Network Density (growth potential of the network)

Figure 4 plots the density score for each network in the data from two perspectives. Network Density differs significantly between the groups (\(X^{2} = 260.59, df = 9, p < 0.0001\)). Density decreases with age (Fig. 4a), concurring with previous reports and confirming our hypothesis that older networks are sparser and have a higher growth potential, in the sense that fewer of the possible morphological links within the verb lexicon have been exhausted. Crucially, Fig. 4b shows that network density is not a function of network size. Thus, in the following analyses, we use network density as a quantitative proxy for morphological development in modeling the development of network measures (Footnote 6).

Fig. 4 Density distribution: Age group vs. network size

3.2 Degree centrality distribution

Figure 5 presents Degree Centrality distributions, comparing attested networks (emerging from morphology) with corresponding random networks (with no organizing principle, built according to the same number of nodes and links as their corresponding attested networks). Degree Centrality distribution is clearly different between attested and random networks: attested networks have a highly skewed distribution, while the corresponding random networks show a normal distribution of Degree Centrality values. These differences suggest that Degree Centrality values in the study’s networks are inherently related to the structure of the Hebrew verbal morphology system.
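A random baseline of this kind can be sketched as follows. The text specifies only that the random networks match the attested ones in the number of nodes and links; drawing links uniformly from all possible node pairs of a simple graph, ignoring the root/pattern split, is an assumption of this sketch, as are the function names and the example sizes (50 nodes, 120 links), which are arbitrary.

```python
import random
from collections import Counter
from itertools import combinations


def random_network(n_nodes, n_links, seed=0):
    """Draw `n_links` distinct links uniformly from all possible node pairs."""
    rng = random.Random(seed)
    possible = list(combinations(range(n_nodes), 2))
    return rng.sample(possible, n_links)


def degree_distribution(n_nodes, edges):
    """Map each degree value to the number of nodes that have it."""
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree)


edges = random_network(n_nodes=50, n_links=120, seed=1)
distribution = degree_distribution(50, edges)
print(sorted(distribution.items()))
```

Comparing such a distribution with that of an attested network of the same size is one way to check whether skewness stems from the organizing principle rather than from size alone.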

Fig. 5 Degree Centrality distribution across networks: Attested vs. random networks

Given the unique structure of the study networks regarding Degree Centrality distribution shown in Fig. 5, we move on to examine the difference between the two morphological constructs constituting the networks’ nodes (roots and patterns), following the bipartite characteristics of the current networks. Figure 6 presents Degree Centrality distributions for roots and patterns separately.

Fig. 6 Degree centrality distribution, across networks and construct type (roots and patterns). Rhombuses mark the mean

Two findings are highlighted in Fig. 6. First, Degree Centrality distribution is different between roots and patterns: While root values remain relatively stable, pattern degrees seem to change with age in two trends, one for children’s output (increasing from the CS network to the 7;0–8;0 network), and another trend for the children’s input (increasing from the IDS to the children’s books). Thus, in order to account for changes in Degree Centrality values, we should only consider pattern degree levels. Second, distributions of pattern Degree Centrality are highly skewed throughout the data. Thus, in order to assess the development of Degree Centrality levels as representing pattern linkage, we account for log-transformed Degree.

3.2.1 Pattern Degree Centrality

We fitted a linear model (estimated using OLS) to predict log-transformed Degree Centrality with Network Density (Table 5; Footnote 7). The model explains a statistically significant but small proportion of variance (\(R^{2} = 0.04, F(1, 215) = 8.63, p = 0.004\), adj. \(R^{2} = 0.03\)). Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using the Wald approximation.

Table 5 Patterns Degree Centrality (log-transformed) model

Table 5 shows that the effect of Network Density on Degree Centrality level is statistically significant and negative (β = −40.78, 95% CI [−68.14, −13.41], t(215) = −2.94, p = 0.004). This is depicted in Fig. 7: As Network Density rises, Degree Centrality (log-transformed) is predicted to be significantly lower. That is, networks with high growth potential (i.e., low density) are predicted to have more roots linked to each pattern compared with networks with low growth potential.

Fig. 7 Patterns Degree Centrality (log-transformed) predicted by network density

3.2.2 Network hubs

Another point of view made available using the Degree Centrality measure concerns the development of network hubs. Hubs are heavy weight, high Degree Centrality nodes, through which most information tends to flow. In our case, network hubs indicate highly linked nodes, i.e., roots linked to many patterns, and patterns linked to many roots (either in terms of type or token frequency). These nodes constitute highly predictable means of verbalizing experience. A network with a low number of hubs has a low productivity, in the sense that the probability to produce new links (i.e., verb lemmas linking a root to a pattern) is higher for high degree nodes than for low degree nodes. Conversely, a network with a high number of hubs is more productive in the sense that the verbalization of experience (i.e., the links between form and function) can be carried out through a larger, less limited number of nodes, producing a more varied lexicon.

We consider a node whose Degree Centrality is higher than that of 95% of the total nodes in the network to be a hub (Footnote 8). To assess the effect of age on the number of hubs, we compared the number of nodes in the top five percent of each network with that of a random network built on the same number of nodes and links, but with no organizing principle (i.e., morphology in our case). Figure 8 portrays the number of hubs in each attested network (solid line) against the number of hubs in each respective random network (dashed line). Figure 8 shows that while the random networks do not display any clear trend, the number of hubs in the attested networks rises with age. That is, given a morphological organizing principle, networks become more productive with age.
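The hub criterion can be sketched as a simple count over degree values. The function name `find_hubs`, the handling of ties and of the exact 95% boundary, and the toy degree values are all assumptions made for illustration; they are not taken from the study's data or code.

```python
from bisect import bisect_left


def find_hubs(degree, share=95):
    """Return nodes whose degree is strictly higher than that of
    at least `share` percent of all nodes (boundary handling assumed)."""
    n = len(degree)
    values = sorted(degree.values())
    hubs = []
    for node, k in degree.items():
        lower = bisect_left(values, k)  # number of nodes with lower degree
        if lower * 100 >= share * n:    # integer arithmetic avoids float issues
            hubs.append(node)
    return hubs


# Toy network: 39 low-degree roots and one highly linked pattern (invented values)
toy_degree = {f"root_{i}": 1 for i in range(39)}
toy_degree["Qal.Present"] = 12
print(find_hubs(toy_degree))
```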

Fig. 8 Number of network hubs: nodes with Degree Centrality higher than the 95th percentile (Attested vs. random networks; color figure online)

The rising productivity realized as increasing network hubs can also be attested looking at the specific constructs functioning as hubs throughout the networks. Table 6 lists the hubs in each network, showing that the variety of hubs is increasing in the sense that hubs are extended in an accumulative manner: those roots and patterns that function as hubs in the young networks also function as hubs in the older networks, and older networks have more, rather than different, hubs. For example, the patterns qal-present, qal-past, qal-future and qal-imperative function as hubs in almost every network, and so do the roots r-c-y, ‘want’ and b-w-ʔ, ‘come’. However, the pattern hif’il starts to function as a hub only in the 3;0–4;0 group, and the root ʕ-s-y, ‘do’ in the 4;0–5;0 group. Finally, the children’s books network has all the prior hubs, together with new ones, such as the roots ʔ-m-r, ‘say’ and š-ʔ-l, ‘ask’, and the pattern pi’el.

Table 6 Network hubs: specific constructs

We fitted a Poisson model (on count data, estimated using ML) to predict the number of hubs with Network Density (Table 7). The model’s explanatory power is substantial (Nagelkerke’s \(R^{2} = 0.94\)). Standardized parameters were obtained by fitting the model on a standardized version of the dataset.

Table 7 Network hubs model

Table 7 shows that the effect of Network Density on the number of network hubs is statistically significant and negative (β = −48.54, 95% CI [−82.78, −16.28], p = 0.004). This is depicted in Fig. 9: As Network Density rises, the number of hubs is predicted to be significantly lower. That is, networks with high growth potential (i.e., low density) are predicted to have more highly linked roots and patterns, compared with networks with low growth potential. Recall that highly linked roots and patterns are roots that are linked to many patterns, and patterns that are linked to many roots, each forming a different verb lemma.

Fig. 9 Number of network hubs, predicted by Network Density (Poisson regression)

3.3 Eigenvector Centrality

Figure 10 depicts the distribution of Eigenvector Centrality scores across networks and construct type (roots vs. patterns). As is evident, mean centrality scores do not show a clear trend (nor two clear trends, as seen above) throughout the data, making it impossible to infer development from the mean. Consequently, we look at qualitative differences in the distribution of nodes’ Eigenvector Centrality. While most of the nodes in each network have a very low centrality score, the variance of centrality within each network differs significantly across networks (Kruskal-Wallis \(X^{2} = 62.56, df = 9, p < 0.0001\)).

Fig. 10 Eigenvector Centrality distribution, across networks and construct type (roots and patterns). Rhombuses mark the mean

Looking at Fig. 10 we can see that the networks of the two to three-year olds (PT) have either very low centrality nodes or very high centrality nodes. The picture is different in the older peer talk groups, as well as in the Child Speech directed at adults (CS) and in the input (IDS, CDS, and children’s books). These networks are more variegated, with more nodes in the mid-range of centrality. These findings suggest that specific nodes might show different centrality scores across development and discourse types (i.e., input, output, peer talk, output directed at adults, and spoken input vs. written input).

Given the vast number of different roots, and the specificity of roots relative to semantics and context, we focus on changes in patterns’ centrality as representing more general morphological tools that are event-construal based. Most patterns show almost no variance across development in terms of centrality scores. Nevertheless, we can detect several changes. First, the centrality of most patterns is low and remains low throughout development. Second, changes of centrality take place mostly within the old sub-system, and more specifically, within the Qal and Hif’il paradigms. To demonstrate these changes, Fig. 11 shows the Eigenvector Centrality scores of those binyan-temporal patterns that show clear changes across development.

Fig. 11 Changes in the centrality of binyan-temporal patterns

Figure 11 shows that Hif’il-future and Qal-future gain centrality in the CDS network; the centrality of Qal-imperative peaks in the IDS, with high values in the CS and CDS networks as well. Qal-past and Qal-present are a mirror image of one another: the past tense pattern is more central in the peer talk of the 2;0–2;6 group and in the children’s books, whereas the present tense pattern is more central in the other age groups (excluding the IDS). That is, we can see a trade-off in the centrality of Qal-past and Qal-present, such that when one gains centrality in the network, the other loses it.

3.4 Community structure

Figure 12 is a complex figure summarizing network community structure and modularity. It compares attested and random networks, and shows the number of communities, the mean community size and standard deviation of each network, and the number of compartments for each attested network and its respective random network.

Fig. 12 Network community structure (Louvain) and modularity: attested vs. random networks. (a) The X axis marks the number of communities for each network, and community size is marked by the height of each segment. (b) Location on the Y axis marks the number of communities; point size depicts the mean community size; point hue stands for the standard deviation of community size (high SD = red, low SD = yellow; color figure online). Triangles mark the number of compartments (isolated communities)

Network community structure as revealed in Fig. 12 can tell us several things. First, the mean number of communities is not significantly different between attested and random networks (t(17.88) = 0.37, p = 0.714). However, looking at changes through development, we can see that the number of communities in the attested networks decreases with age in two trends, one for the output and another for the input, while the number of communities in the random networks increases with age, corresponding to network size. Moreover, the number of compartments (isolated communities; marked by triangles in Fig. 12b) decreases with age in the attested networks in two trends, while staying stable in the corresponding random networks.
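The fractional degrees of freedom reported above (17.88) are what an unequal-variance (Welch-type) t-test yields. The text does not name the test, so this reading is an assumption. Under that assumption, the statistic and its degrees of freedom can be sketched as follows; the community counts are invented for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical community counts for ten attested and ten random networks
attested = [12, 9, 7, 6, 5, 5, 11, 10, 4, 8]
random_nets = [6, 7, 7, 8, 8, 9, 9, 9, 8, 7]

t, df = welch_t(attested, random_nets)
print(f"t({df:.2f}) = {t:.2f}")
```

With two samples of ten networks each, the Welch degrees of freedom always fall between 9 and 18, which is compatible with the reported value.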

The second point is related to the dispersion of community size across development (captured by the mean and standard deviation in Fig. 12b, by point size and shade respectively), comparing attested and random networks. Most random networks show a fairly even distribution of community sizes, resulting in a relatively low standard deviation. Conversely, dispersion increases with age in the attested networks. These differences between corresponding attested and random networks suggest that observed changes in network structure are not a factor of network size, but rather of network age, input vs. output, and modality.

In order to assess development in network structure relative to community size we fitted a linear model (estimated using OLS) to predict Community Size with Network Density (Table 8). The model explains a statistically significant and moderate proportion of variance (\(R^{2} = 0.26, F(1, 90) = 31.58, p < .001\)). Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using the Wald approximation.
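As a minimal sketch of the kind of model described above, the following fits a one-predictor OLS regression in plain Python and derives the standardized slope. The density and community-size values are invented for illustration (they are not the study's data), and the function name `ols_simple` is hypothetical.

```python
from math import sqrt
from statistics import mean

def ols_simple(x, y):
    """One-predictor OLS: return slope, intercept, R^2 and standardized slope."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    # Slope obtained when both variables are z-scored before fitting
    std_slope = slope * sqrt(sxx / syy)
    return slope, intercept, r2, std_slope

# Hypothetical values: network density (predictor) and community size (outcome)
density = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
community_size = [60, 48, 50, 30, 28, 12]

slope, intercept, r2, std_slope = ols_simple(density, community_size)
print(f"slope = {slope:.2f}, R^2 = {r2:.2f}, standardized slope = {std_slope:.2f}")
```

In a model with a single predictor, the standardized slope equals Pearson's r, so its square equals \(R^{2}\). The values reported here are consistent with this relation: \((-0.51)^{2} \approx 0.26\).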

Table 8 Network community size model

Table 8 shows that the effect of Network Density on Community Size is statistically significant and negative (β = −960.17, 95% CI [−1299.62, −620.72], t(90) = −5.62, p < .001; Std. β = −0.51, 95% CI [−0.69, −0.33]). This is depicted in Fig. 13: As Network Density rises, Community Size is predicted to be significantly lower. That is, networks with high growth potential are predicted to have larger communities, compared with networks with low growth potential.

Fig. 13 Community size predicted by network density

Figure 14 illustrates the differences in community structure by projecting communities on each (attested) network of the present data. Communities are represented by colored areas, and links between communities are marked in red (color figure online). Figure 14 shows that young children’s networks are composed of many unconnected communities. The networks of the mid-range children (peer talk 4;0–8;0) show fewer communities, indicating higher similarity between roots and between binyan patterns. Note that the IDS network looks more like the young children’s networks, with more unconnected compartments. Finally, the CDS and the children’s books networks are composed of fewer and larger communities, with more links between communities as well.

Fig. 14 Community development

In sum, Figs. 12–14 and the model reported in Table 8 show that the community structure of the attested networks changes significantly through development, following two trends in our database: one for the output networks, and another for the input networks. Young networks (CS, 2;0–2;6 PT) have a large number of small communities. The mid-range children’s output (2;6–3;0, 3;0–4;0 PT) shows a smaller number of communities, which are larger than those of the youngest groups and unevenly distributed, with two or three large communities dominating the network (Fig. 12a). The older children (4;0–5;0, 5;0–6;0, 7;0–8;0 PT) show a small number of communities as well, but each is larger, with relatively high dispersion. The two spoken input networks (IDS and CDS) are closer to the young networks in terms of number of communities, but they resemble older networks in terms of community size dispersion, with a few large dominating communities and many smaller ones. Finally, the written book input has a small number of communities, one very large and most of the others relatively large, showing a highly dispersed network.

4 General discussion

Arguing for the importance of all of the network’s constituents (nodes and links), we present a hybrid model that assumes both a morphemic, root-and-pattern based lexical representation and a word-based representation (cf. Bat-El, 2017): a semi-redundant cognitive representation of morphology that includes both parts and wholes, and the relations between all parts and wholes. Thus, the current paper presents evidence that Hebrew verb learning is a two-path journey, in which verbs are learned both as lexical items and as part of a morpho-syntactic system (Levie et al., 2020; Ravid, 2019). Based on this approach, the present paper set out with two goals. First, we aimed to harness the benefits of network analysis to explain morphological links in the verb lexicon of Hebrew. Second, we aimed to account for the dynamic nature of language development as age-related changes in the topography of morphological networks. This corpus-based analysis was carried out on nine different transcribed and coded corpora of spontaneous speech produced by native Hebrew-speaking toddlers, children and adults, and one corpus of written, literate Hebrew in children’s storybooks – altogether comprising over 450,000 word tokens. These corpora differed along three important discourse characteristics: in their communicative setting as dyadic adult-child or triadic peer interactions; in their target audience – infant, child or adult; and in their language mode – spoken or written. The main variable of interest in the present study was the children’s age, both as speech producers and as recipients of child-directed speech from peers and from adults.

To reach the study goals, all verbs in the study corpora were identified and coded in three ways to represent our morphological variables: Semitic roots (e.g., s-t-m ‘clog’), binyan-temporal patterns (e.g., le-hiCaCeC, the infinitive pattern of Nif’al), and verb lemmas, each of which is a unique combination of a particular root and a particular temporal pattern (e.g., nistam ‘get clogged’). With roots and binyan-temporal patterns as nodes and verb lemmas as links, a model of the morphological organization of the Hebrew verb lexicon emerged, comprising two kinds of networks simultaneously: (i) root-based networks, both derivational, across different binyanim, and inflectional, across temporal patterns within the same binyan; and (ii) pattern-based networks, with a focus on the two sub-systems in the binyan verb system. In the following passages we summarize our findings and discuss their interpretations.

In general perspective, the Hebrew verb lexicon is shown to be morphologically organized in root- and pattern-based families (Levie et al., 2020), with verb patterns organized in two sub-systems – sub-system I, comprising Qal, Nif’al, Hif’il, and, in older language users, also Huf’al (Ravid & Vered, 2017); and sub-system II, comprising Pi’el, Hitpa’el, and, in older language users, also Pu’al (Ravid, 2019). This organization, as represented in a unique network structure of roots and patterns, evidently emerges early on, suggesting that it is an inherent characteristic of the Hebrew morpho-lexical system. In developmental perspective, which can be thought of as filling the cells of a paradigm (Ackerman & Malouf, 2013), the current paper underscores the benefits of analyzing dynamic changes based on root-pattern linkages in a network: the more links between roots and patterns in a network, the more new links the network can accommodate.

The network analyses presented in the current paper also offer a typologically oriented view on the systematic nature of root and pattern affixation. Specifically, as every Hebrew verb is morphologically complex, network representation allows access to both the nodes – the decomposed constituents – and the links – the complex word. Derivation is often regarded through the lens of a simplex base and a complex derived word. However, in Hebrew, derivationally complex words do not necessarily follow this simplex to complex path. Rather, it is a relation between two sub-lexical components – a root and a pattern – that derives the actual wordform. Thus, roots and patterns are not typically derivational, nor fully inflectional constructs in the traditional non-Semitic sense (Bybee, 1985; Dressler, 2005). They in fact participate in a bipartite system.

We have found that root- and pattern-based families grow larger with age, and links between the sub-systems increase with age. Sub-system I is more frequent and earlier to emerge, whereas sub-system II is less frequent and emerges later on. For a time slot equivalent to 1.5 hours of speech, network node numbers increase with age, indicating an increase in root and pattern types and tokens. Network Density levels significantly decrease with age, indicating that morphological growth potential increases with development. Young children talking to their peers show a low growth potential: Many of their roots and binyan patterns are already linked, and new verbalizations have fewer chances of forming new links within the network. Therefore, regarding the distribution of Network Density, we can conclude that adult speech will have a larger effect on the expansion of the verb lexicon than speech directed at peers.

Using Network Density as a proxy for speaker’s (morphological) age, we modeled morphological development as increasing system-wide productivity, focusing on (i) Degree Centrality as representing linkage between roots and patterns; (ii) number of Network Hubs as representing network productivity, assuming that the probability of producing new links (verb lemmas) is higher for hub nodes than for non-hub nodes; and (iii) network Community Structure, as representing the emergence and consolidation of morphological categories. Results showed that as networks become sparser (having low density scores), Degree, Hubs, and Community Size concurrently increase. Interestingly, when young children talk to adults, their network connectivity and centrality show some similarity to adult speech and children’s storybooks, while differing from the speech of young children engaged in peer talk. That is, looking at the two-path journey of Hebrew verb learning discussed in Levie et al. (2020) from a system-wide, network-based perspective, the current paper shows that morphological development can be conceptualized as increasing system-level productivity.

Three more findings complement this picture. First, the distribution of nodes’ Eigenvector Centrality values differed across the study groups: The young children (2;0–4;0) engaged in peer talk had nodes of either very low or very high centrality, while the CS network, as well as the PT networks of children older than 4;0, showed more Eigenvector Centrality variation, employing more mid-range centrality nodes. Moreover, we found that specific patterns within sub-system I behave differently across development, gaining or losing centrality relative to other patterns. Finally, networks were shown to be less compartmentalized with age, with young networks having a large number of small communities, while older networks had fewer and larger communities. This last point has important implications.

The development of community structure within the morphological verb network suggests that categories within the lexicon are dynamic and changing. A root that is used only within one small community is limited in terms of verbalizing different aspects of its core meaning. A large community, on the other hand, grouping together many roots and binyan patterns, allows the possibility of linking a small set of schematic patterns in various discursive scenarios. As we show here, the verb lexicon is characterized by a process of starting-small categorization (Elman, 1993), moving from many small categories in early stages of development to a smaller number of large categories later on. Crucially, the less compartmentalized the network, the more productive it can be, because traveling from one compartment to another is almost impossible. In terms of morphological links, this means that roots that are linked to one pattern within a compartment may not be linked to other patterns in other compartments. Moreover, note that roots may be linked to more than one pattern in derivational or inflectional families. This means that the semantic concept coded by a single root can be manifested in two schematic event structures. Crucially, it means that other roots that are linked to one of these patterns can be linked to the other pattern as well. This opens the gate to novel root-pattern links, to the expansion of the network, and to the diversification of the category in a “rich-get-richer” situation: the more links between roots and patterns in a network, the more new links it can accommodate.
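The effect of compartmentalization can be illustrated with a toy root-pattern network. The sketch below assumes that a compartment corresponds to a connected component of the network (our reading of “isolated communities”, not a definition taken from the analysis pipeline). The links are illustrative only and are not drawn from the corpora; the function name `compartments` is hypothetical.

```python
from collections import defaultdict, deque

def compartments(links):
    """Group the nodes of a root-pattern network into connected components."""
    adj = defaultdict(set)
    for root, pattern in links:
        r, p = ("root", root), ("pattern", pattern)  # keep the two node types apart
        adj[r].add(p)
        adj[p].add(r)
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, part = deque([start]), set()
        while queue:  # breadth-first search from an unvisited node
            node = queue.popleft()
            part.add(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        parts.append(part)
    return parts

# Toy links (verb lemmas) between roots and binyan-temporal patterns
links = [
    ("n-p-l", "Qal-future"),
    ("n-p-l", "Hif'il-present"),
    ("s-t-m", "Nif'al-past"),
]
before = len(compartments(links))
# One additional lemma linking s-t-m to an already used pattern merges the compartments
after = len(compartments(links + [("s-t-m", "Qal-future")]))
print(before, after)
```

In this toy case a single new link joins the two compartments, after which every pattern in the network is reachable from every root.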

One general conclusion arising from these results underscores the morphological nature of the networks of Hebrew verbs, as they portray a picture of how many roots are linked to how many patterns in each age group. These findings have semantic implications: in Hebrew, a verb token is a triple link between a root that stands for a core semantic concept, a binyan that stands for a schematic event structure, and a temporal pattern that stands for specific reference to time and/or modality. Dynamic changes of network structure through development can thus teach us about the emergence of semantic root-based families and schematic pattern-based families, as categories within the mental lexicon. This general note leads us to four perspectives on the results of the network analyses: (i) demonstrating age effects; (ii) showing the significance of communicative setting; (iii) highlighting the critical value of literacy; and (iv) providing evidence for the importance of looking at morphological development in a system-wide perspective.

4.1 Age related results

One major factor characterizing the 10 corpora in the current study was the age of the participants using verbs in their discourse. Obviously, there was a stark contrast between toddlers (the CS talk in dyadic interaction and the youngest groups engaged in PT) and adults (CDS in dyadic interaction and children’s storybook texts). But the corpora also hosted more graded age group differences in the PT of older preschool children. These age differences were apparent across all network measures, with a repeated two-trend slope: one for the children’s output, ranging from the CS (1;8–2;2) to the oldest children engaged in peer talk (7;0–8;0); and another for the children’s input, ranging from the IDS (targeting infants aged 0;3–1;0) to the children’s books. In the comparable context of 1.5 hours, there was an increase with age in the numbers of nodes – i.e., root and binyan pattern types – as well as of link numbers, that is, verb lemma types and tokens. In terms of Network Density, networks were showing decreased Density corresponding with speaker age, that is, more potential for growth in older networks. In terms of linkage between roots and patterns, while networks grew in size with age, it is their morphological structure (functioning as an organizing principle) that explains their increase in connectivity, as revealed by comparing attested networks with corresponding random networks. In terms of Eigenvector Centrality, networks of young children in peer talk showed a dichotomy between very low and very high Eigenvector Centrality, while older networks were mostly more variegated. And in terms of Community Structure, younger networks had numerous but small communities, whereas older networks had few yet large communities. 
Importantly, these developmental changes were mediated by the effects of communicative context, that is, the nature of the discourse event: dyadic interaction between an adult and an infant or toddler/child, versus triadic peer interaction between children of the same age without the intervention of adults.

4.2 Communicative settings

The communicative settings of the different corpora proved significant in the current study, depending on the type of interaction – dyadic, with child addressing adult or adult addressing child; or triadic peer talk by children. One important difference between children in interaction with adults versus children addressing their peers is that children in peer interactions are tasked with expressing intentions and meanings to their interlocutors (Forrester & Cherrington, 2009) without receiving elaborated, rich adult feedback facilitating linguistic communication (Blum-Kulka et al., 2010; Schuele, 2010). This difference is manifested specifically in the distribution of nodes’ Eigenvector Centrality, showing that the very young CS network is more similar to the older peer talk networks than to the younger peer talk networks, which are closer to it in age. The networks of the younger peer talk groups show fewer mid-range centrality nodes, indicating high repetitiveness. Such repetitiveness may reflect a juvenile device for maintaining topic coherence in conversation, a task that is handled by adults in dyadic interaction.

But the most striking finding is the fact that adults talking to young children have morphological networks with high growth potential, while young children talking to peers do not; and that this growth potential predicts other network measures. Growth potential is related to the expansion of the network by forming new links between existing unconnected nodes, as a root node can potentially be linked to more than one pattern node, and a pattern node can potentially be linked to more than one root node. Each network presented in the current study is in fact a snapshot of one static state within a dynamic continuum. In this case, the growth potential of a network is a prediction of the network’s structure in its next state(s): either within a single conversation, or within a longer range of lexical development. If, within a morphological network, most of the nodes are already linked to one another, then this network’s next state would probably be rather similar to the current state. If, however, many of the nodes are not already linked in the current state of a network, the following state has a higher probability of linking unconnected nodes, thus expanding the network.
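The relation between density and growth potential can be sketched with two toy networks. The density formula is not restated in this section, so the sketch assumes the standard bipartite definition: the number of attested root-pattern links divided by the number of possible root-pattern pairs. Node labels and link lists are hypothetical.

```python
def bipartite_density(links):
    """Attested root-pattern links divided by all possible root-pattern pairs."""
    unique = set(links)
    roots = {root for root, _ in unique}
    patterns = {pattern for _, pattern in unique}
    return len(unique) / (len(roots) * len(patterns))

def unlinked_pairs(links):
    """Root-pattern pairs that are not yet linked (room for new lemmas)."""
    unique = set(links)
    roots = {root for root, _ in unique}
    patterns = {pattern for _, pattern in unique}
    return len(roots) * len(patterns) - len(unique)

# Small, nearly saturated network: 2 roots x 2 patterns, 3 links
dense = [("r1", "p1"), ("r1", "p2"), ("r2", "p1")]
# Larger, sparser network: 4 roots x 4 patterns, 5 links
sparse = [("r1", "p1"), ("r2", "p2"), ("r3", "p3"), ("r4", "p4"), ("r1", "p2")]

for name, net in (("dense", dense), ("sparse", sparse)):
    print(name, bipartite_density(net), unlinked_pairs(net))
```

In the dense toy network only one root-pattern pair remains unlinked, so its next state can differ from the current one by at most one link between existing nodes. The sparse network leaves eleven pairs open.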

In terms of the Hebrew verb lexicon, a network with high growth potential means that the next states of the network may accommodate new lemmas or wordforms to refer to subtle aspects of similar situations (this, of course, does not include the insertion of new nodes into the network). Consider for example the distribution of the root n-p-l ‘fall’. The root node n-p-l can potentially be linked to 20 different binyan-temporal pattern nodes: four binyan patterns, five temporal inflections each (Levie et al., 2020). For example, yipol ‘will-fall’ (Qal, future tense); mapil ‘drops/dropping, Tr’ (Hif’il, present tense); or hitnapel ‘pounced on’ (Hitpa’el, past tense). However, these 20 binyan-temporal patterns do not occur in all age groups in the present sample. For example, the 2;0–2;6 PT sample had only 13 of these patterns, while the children’s books sample had 15. That is, based on this sample, young children speaking to young children can use the root n-p-l only with a subset of 13 out of 20 potential patterns. On the other hand, a child listening to the stories in the children’s books corpus can potentially be exposed to the root n-p-l in a larger subset of 15 out of 20 patterns. But note that while the 2;0–2;6 PT corpus has 13 out of the 20 potential patterns that are relevant to the root n-p-l, three of them are already linked to this root. That is, this network can be expanded, in its following states, from the three links it currently has to thirteen links. However, the children’s books network manifests only two links out of the potential fifteen it has in its current state, allowing it a larger range of expansion in its following states.
This means that, based on the status of the current networks, the children speaking to each other in the young PT settings are limited in their ability to expand their n-p-l network in its following states by the use of different patterns (standing for different event schemes) or different inflections (standing for tense-aspect pragmatic scenarios), compared with the parent reading a story to their child.
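The arithmetic of the n-p-l example can be made explicit. The counts below are those given in the text (3 of 13 available patterns linked in the 2;0–2;6 PT network, 2 of 15 in the children’s books network); the function name `expansion_room` is hypothetical.

```python
def expansion_room(linked, available):
    """Return (links still possible, share of available patterns already linked)."""
    return available - linked, linked / available

young_pt = expansion_room(linked=3, available=13)  # 2;0-2;6 peer talk
books = expansion_room(linked=2, available=15)     # children's books

for name, (room, share) in (("2;0-2;6 PT", young_pt), ("children's books", books)):
    print(f"{name}: {room} further links possible, {share:.0%} already realized")
```

On these counts, the peer talk network can add at most ten further n-p-l links without new pattern nodes, whereas the books network can add thirteen, and a smaller share of its available patterns is already realized.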

While this example focused on a single root and its growth potential (as expressed by the density of the network), the findings reported above concerning Degree Centrality, Eigenvector Centrality and Community Structure indicate that this is the case in general: Most of the measures showed a two-trend slope, one for the output networks, and the other for the input networks, highlighting the parallel growth in morphological productivity constituting the child’s linguistic environment. The communicative settings have an effect on the structure of the network, such that peer talk is significantly different from caregivers talking or reading to their children, and to some extent, also from children talking to their caregivers. Peer talk thus shows less morphological complexity in the verb lexicon than the interaction between children and adults that is characterized by elaborations and enrichment.

4.3 The importance of books as input to children

A finding that stands out in this study is the considerable structural difference between the children’s books network, on the one hand, and all other networks of both input and output, on the other. The book corpus can be grouped with the IDS and CDS corpora, since they all constitute child-directed input. It can also lend itself to a grouping that includes the 5;0–8;0 PT groups in our data, as this is the age range of the children’s books audience. And indeed, the children’s books corpus shows similarity to both these groups in terms of network structure, resembling the most complex qualities of each group. It had a mean Degree Centrality value similar to that of the CDS corpus, but a higher number of hubs. And in terms of Community Structure, the children’s books network was more similar to the networks of the older PT corpora than to the networks of the parents talking to their young children: The children’s books network (like the older PT networks) was composed of a relatively low number of large communities. That is, from a bird’s eye view, our findings indicate that the verb lexicon of the (only) written corpus in our data was more complex and linguistically richer than most of the lexicons of the other corpora, underscoring the critical value of exposing children to the language of narratives written by experts for children (Aram & Levin, 2014; Hutton et al., 2017a,b; Sénéchal et al., 2008).

4.4 Conclusions: dynamic morphological development as increasing system-level productivity

The results reported in the present paper suggest that morphological productivity can be measured on a systematic level, and that morphological development can be explained and modeled based on such a macro level. In this sense, the productivity of a morphological system is related to three network measures: The Density of the network, representing the system’s growth potential; network hubs, representing constructs’ linkage potential; and Community Structure, representing emergent morphological categories.

Approaching productivity from a systematic perspective, we veer slightly away from how morphological productivity has been addressed to date in the literature. In this respect, several productivity measures are relevant to the present paper: potential productivity (e.g., Aronoff, 1976; Baayen, 2009), whole word frequency (e.g., Balling & Baayen, 2012), and specifically, the family-size effect (e.g., Moscoso del Prado Martín et al., 2004). Common to these accounts is the focus on paradigmatic productivity. For example, Deutsch and Kuperman (2019) show that words belonging to larger nominal word-pattern or root families (i.e., words whose nominal word-patterns or roots were shared by a larger number of other words) demonstrated shorter lexical-decision latencies and higher accuracy, suggesting that both root and nominal word-pattern families provide paradigmatic support to their members in Hebrew. That is, morphological productivity can explain psycholinguistic phenomena in the realm of language processing and production when defined at the paradigm level (given the Word and Paradigm framework; Baayen et al., 2011; Blevins, 2016; Plag, 2006; Plag et al., 1999; Tomaschek et al., 2021).

Common to all the results reported here is the finding that changes in the morphological system of the Hebrew verb lexicon through time cannot be narrowed down to a linear growth in number of types and tokens. Rather, the system’s structure dynamically changes with time and discourse type as underscored by the comparisons with the respective random networks. The role of different components within the network is not uniform: For example, specific binyan-patterns are more important at one point in development than at others, as shown by changes in Eigenvector Centrality over time which render the system as a whole skewed towards these members. And links between the elements are also changing, creating dense sub-networks. These sub-networks were defined as communities, and were shown to qualitatively change through time.

These important structural features resemble networks with attractor basins to some extent (Spivey, 2008). In order to understand the concept of attractor basins, we need to consider that a dynamic network can represent a potential state space – for example, the possible words suitable to construe a given scenario. An attractor basin links a given set of initial conditions to its corresponding final state (Daza et al., 2016), which is the most probable state (within the space) for the system to settle on. For example, it can be the most probable word to be chosen to construe the specific scenario in the particular discourse. Moreover, in dynamic networks that represent a state space, changes over time in the activity within the network produce trajectories through the state space. When many trajectories end in a similar region, an attractor basin is formed.

Given these definitions, the important nodes within the (quasi-dynamic) networks in the present paper can be thought of as attractor basins, towards which the network is more likely to lean. Thus, the structure of the network matters: A morphological network with a relatively high number of important nodes that are not interconnected will result in a repetitive lexicon, since each time a concept is about to be uttered, the system is more likely to settle on a state it has already settled on. On the other hand, a morphological network with a high number of interconnected important nodes makes the final state of a speech event less predictable, since the network can settle on many states through different, almost equally probable trajectories. This is a more productive system. The present paper shows that these changes characterize the developmental path of the morphological system of the Hebrew verb lexicon through time and across discourse types. That is, at the system level (rather than at a specific paradigmatic level), our network-based model of morphological development suggests that development consists of gaining larger inter-related communities, a system-wide balanced distribution of central constructs, a high number of hubs, and low density. These characteristics can be summed up to represent system-level productivity.