Introduction

Over the past decades, research has substantially increased our understanding of children’s early literacy development and instructional practices for supporting such development (National Early Literacy Panel, 2008; Piasta, 2016). Concurrently, research has also increasingly attended to the important role of educators in supporting children’s early literacy development, as they select and enact instructional practices in their classrooms (e.g., Bratsch-Hines, Burchinal, Peisner-Feinberg, & Franco, 2019; Gerde, Bingham, & Pendergast, 2015; Pelatti, Piasta, Justice, & O’Connell, 2014; Schachter, 2017; Silverman & Crandell, 2010; Zucker, Justice, & Piasta, 2009).

In the present study, we examined educators’ knowledge as related to emergent literacy instructional practices. This focus is grounded in theoretical frameworks positing knowledge as a key determinant of classroom practices, which in turn influence children’s learning (Desimone, 2009; Shulman, 1987), and is substantiated by an emerging body of empirical work linking knowledge to classroom literacy practices, as reviewed below. The current study extends this literature by considering the nature of associations between educators’ knowledge and practices in early childhood classrooms (i.e., preschool classrooms serving children ages 3-5 years). Specifically, we applied generalized additive modeling (GAM) to more flexibly model these associations and surmount potential methodological limitations in prior work. In doing so, we respond to conjectures that associations may not be linear in nature (e.g., Moats, 2009a) and may instead be better characterized by thresholds, such that a certain level of knowledge must be reached in order to see associations with classroom practices, or by plateaus, such that associations are no longer evident after reaching a certain level of knowledge. Both linear and non-linear associations have important implications for educator preparation and professional development as well as continued research on this topic.

Prior work concerning associations between educators’ knowledge and practice

In 1994, Moats published a seminal study in which she argued that knowledge of oral and written language structures is essential for teaching literacy. Specifically, she contended that educators must have strong content knowledge of both phonology and orthography in order to support children’s phonological awareness development and early literacy learning. This position has been echoed in ensuing years and is also reflected in standards and position statements concerning what literacy educators should know and be able to do (Brady & Moats, 1997; International Dyslexia Association, 2010; Joshi, Washburn, & Kahn-Horwitz, 2016; Moats, 2009a; Snow, Griffin, & Burns, 2005; Washburn & Mulcahy, 2014). As such, a large number of studies have continued to examine preservice and inservice educators’ language- and literacy-related content knowledge (e.g., Binks-Cantrell, Washburn, Joshi, & Hougen, 2012; Cunningham, Perry, Stanovich, & Stanovich, 2004; McCutchen, Green, Abbott, & Sanders, 2009; McCutchen, Harry, et al., 2002; Moats & Foorman, 2003; Spear-Swerling & Cheesman, 2012; Spear-Swerling & Zibulsky, 2014; Washburn, Joshi, & Binks-Cantrell, 2011a, 2011b; Washburn & Mulcahy, 2019).

Inherent in calls for attending to educators’ language- and literacy-related content knowledge is the notion that educators’ knowledge is related to their selection and implementation of classroom literacy practices (Moats, 2009a; Piasta, Park, Farley, Justice, & O’Connell, 2019; Reutzel et al., 2011). This premise has been investigated in a handful of studies to date, with mixed results. McCutchen, Harry, et al. (2002) examined associations between kindergarten through grade 2 educators’ content knowledge and their observed practices during the first 15 min of literacy instruction. Content knowledge was significantly and positively correlated with use of explicit phonological awareness instruction, with no associations between knowledge and other aspects of observed practice noted. Cirino, Pollard-Durodola, Foorman, Carlson, and Francis (2007) examined kindergarten educators’ content knowledge for teaching literacy in English and Spanish to dual-language learners as related to aspects of their observed classroom practices. Content knowledge for teaching literacy in Spanish, but not English, was positively associated with children’s engagement in literacy instruction; neither was associated with educators’ use of small-group literacy instruction. Piasta, Connor, Fishman, and Morrison (2009) examined grade 1 educators’ content knowledge and the observed amount of explicit decoding instruction provided to children in their classrooms but found no associations between these variables.

Spear-Swerling and Zibulsky (2014) examined associations between the content knowledge of kindergarten through grade 5 educators and the amount of time that these educators would devote to specific literacy activities during a typical, 2-hr literacy block. Educators’ content knowledge specific to phonology and orthography was significantly and positively related to their self-reported quantities of instruction pertaining to letters, phonological awareness, and phonics. Puliatte and Ehri (2018) examined grade 2 and grade 3 educators’ content knowledge and self-reported use of research-based spelling practices (i.e., best practices for teaching spelling such as using a pretest-teach-posttest format), use of research-based spelling strategies (i.e., strategies that apply phonological, orthographic, and morphological knowledge to spell words such as chunking), and time spent teaching spelling. Content knowledge was significantly and positively associated with use of research-based spelling strategies but not use of research-based spelling practices or time spent teaching spelling. Most recently, Piasta, Park, et al. (2019) reported significant, positive correlations between early childhood educators’ content knowledge and their classroom emergent literacy practices.

Results from professional development studies also provide mixed findings concerning associations between educators’ language- and literacy-related content knowledge and classroom practices. In these studies, researchers attempt to increase educators’ content knowledge via professional development and, in turn, determine whether such increases affect practice. For example, McCutchen, Abbott, et al. (2002) provided language- and literacy-focused professional development to kindergarten and grade 1 educators, during which these educators demonstrated gains in content knowledge. Following the professional development, educators spent significantly more time providing explicit phonological and orthographic instruction compared to educators in a control condition, suggesting a potential causal relation between educators’ knowledge and classroom practices. Similar professional development provided to grade 3 through grade 5 educators also resulted in content knowledge gains; these educators spent significantly more time providing vocabulary instruction, but no other types of literacy instruction, than controls (McCutchen et al., 2009).

In early childhood, Neuman and colleagues attempted to increase inservice educators’ language- and literacy-related knowledge via a college course and/or individualized coaching (Neuman & Cunningham, 2009; Neuman & Wright, 2010). Educators randomly assigned to experience the professional development made only modest gains in knowledge, which did not differ from those made by those assigned to a control condition, although those whose professional development included coaching demonstrated significantly higher quality classroom language and literacy practices. In contrast, Hindman and Wasik found that early childhood educators participating in their language- and literacy-focused professional development, which also included coaching, not only exhibited significantly greater gains in content knowledge compared to those in the control condition (Hindman & Wasik, 2011) but also significantly higher quality classroom instructional practices and literacy environments (Wasik & Hindman, 2011). Moreover, the content knowledge of early childhood educators who experienced the professional development was positively associated with the general quality of their classroom instructional practices (Hindman & Wasik, 2011).

Nature of associations between educators’ knowledge and practice

Notably, in most of these prior studies, researchers have analyzed linear associations between educators’ language- and literacy-related content knowledge and their classroom practices using typical correlational or linear regression techniques. To our knowledge, only one study to date has considered more complex associations. Schachter, Spear, Piasta, Justice, and Logan (2016) used quantile regression to examine associations between early childhood educators’ knowledge and the amounts of emergent literacy instruction that they provided to children. The quantile regression approach had two particular affordances. First, similar to other studies (e.g., McCutchen et al., 2009; Spear-Swerling & Zibulsky, 2014), Schachter et al. found that many educators provided little literacy instruction in certain domains, resulting in classroom data that were positively skewed. Unlike traditional linear regression, quantile regression makes no distributional assumptions and was therefore appropriate for analyzing classroom practice outcomes that were non-normally distributed (Petscher & Logan, 2014). Second, the quantile regression approach provides estimates of linear associations at multiple points along the distribution of the outcome. As such, Schachter et al. were able to discern, for instance, that content knowledge was significantly and positively associated with amount of language instruction but only for those educators who provided high amounts of such instruction (i.e., at or above the 0.75 quantile). Thus, although not directly tested, the results of Schachter et al. suggest that associations between educators’ content knowledge and practice may be non-linear.

Non-linear associations could help explain the inconsistent findings in the literature. Such associations might be masked when analyzing only linear associations, and linear estimates may inaccurately represent the nature and magnitude of associations between educators’ knowledge and practices. Moreover, estimates may be more sample-dependent in the presence of non-linear associations. For example, if knowledge and practice are more strongly associated at a particular end of the distribution, studies whose samples tend to reflect that portion of the distribution are more likely to show stronger associations whereas other samples would be more likely to show weaker or no associations. In general, understanding the functional form of associations between knowledge and practice, whether linear or non-linear, is critical for ensuring that analytic assumptions are met, analyses are applied correctly, and results are interpreted accurately.

There are also important conceptual reasons for better understanding the nature of associations between language- and literacy-related content knowledge and classroom practices. As professionals, educators are expected to have greater levels of specialized knowledge than those outside the profession, and many states require preservice educators to demonstrate their specialized content knowledge via subject-specific licensure exams (see Reutzel et al., 2011). Passing such exams signifies that the candidate meets or exceeds some minimal baseline of knowledge. This implies a threshold of knowledge that is necessary to engage effectively in educational practice. Likewise, designation as a reading specialist implies meeting a higher threshold of knowledge. However, empirical research to date has not considered whether such thresholds of knowledge exist as related to educators’ practice or, ultimately, as related to children’s literacy learning (Moats, 2009a, 2009b; Reutzel et al., 2011). Similarly, the findings of many research studies characterize the language- and literacy-related content knowledge of preservice educators, inservice educators, and even university teacher educators as less than ideal (e.g., Binks-Cantrell et al., 2012; Cunningham et al., 2004; Joshi et al., 2009; McCutchen, Harry, et al., 2002; Moats & Foorman, 2003; Schachter et al., 2016; Washburn et al., 2011a, 2011b), but research has not empirically established whether a threshold of knowledge needs to be reached or what such a threshold might be.

Conversely, it is possible that, after achieving a sufficient level of content knowledge, more knowledge is not associated with better practice (i.e., the association plateaus). Content knowledge, for instance, may be a prerequisite for basic use of quality classroom literacy practices but, after achieving a certain level, knowing how to enact this content knowledge during instruction may be more influential (Reutzel et al., 2011). Finally, associations between educators’ content knowledge and classroom practices may indeed be linear, as has been assumed in the literature, such that educator preparation and professional development programs should seek to continually increase knowledge, regardless of participants’ initial knowledge levels, as a means of further improving practice.

The current study

In the current study, we explored the nature of associations between early childhood educators’ language- and literacy-related content knowledge and the quality and quantity of their classroom emergent literacy practices. We directly tested whether the functional forms of these associations were linear or non-linear in nature. Although we allowed for all possible functional forms, we specifically considered three competing hypotheses: (1) educators’ content knowledge must reach a certain level (i.e., threshold) before demonstrating positive associations with practice, (2) educators’ knowledge is only positively associated with practice until a certain level (i.e., plateau), and (3) educators’ knowledge is positively and consistently (i.e., linearly) associated with practice, regardless of knowledge level. In doing so, we contribute to the extant literature by responding to calls to further our conceptual understanding of these associations (Moats, 2009a; Reutzel et al., 2011), addressing potential methodological issues, and demonstrating an analytic technique that may be used in continued work to examine these associations with other measures and other samples.

Method

Participants

This study involved data from 437 early childhood educators who taught in classrooms serving children 3 to 5 years of age. All were participating in a larger project evaluating their professional development experiences (see Piasta, Farley, et al., 2019; Piasta et al., 2017), and all educators who completed a knowledge measure at the start of the study were included. Educators in the current study averaged 42 years old (SD = 10.63) and were predominantly female (96%) with 11 years of experience (SD = 7.78). They were primarily Caucasian (80%) or African American (16%; 3% unreported). The most common credential was a Bachelor’s Degree (31%); the remainder held a high school diploma or GED (16%), Associate’s Degree (20%), or Master’s Degree or higher (26%; 7% unreported). Many held certification to teach young children (70%), and 28% worked in early childhood special education classrooms. Educators’ programs were located in urban (26%), suburban (29%), and rural (35%) locations (10% unreported); most were school- or center-based (50% and 38%) with some home-based (4%; 8% unreported). The average class size was 18.11 children (SD = 7.22), with an average educator:child ratio of 1:13. Educators reported using a variety of curricula: 18% used a language- and literacy-specific curriculum; of those not listing a language- and literacy-specific curriculum, 68% used a global curriculum targeting multiple learning domains (e.g., Creative Curriculum), 5% reported not using a curriculum, and 9% did not provide a response when asked what curriculum they used.

Procedures

Educators completed a knowledge survey in the fall, and trained research assistants visited their classrooms to conduct one-day classroom observations in the fall and spring (i.e., two observations total). Observations occurred on days that educators described as typical (i.e., no field trips or special programs) and lasted for the full instructional day, as designated by the educator or program administrator (Mduration = 82.21 min, SD = 56.16). Research assistants video recorded all observations using two video cameras; one video camera with a wide-angle lens was positioned on a tripod to capture as much classroom activity as possible whereas another camera was held by the research assistant and moved throughout the classroom to capture activities. Both videos were synced and coded simultaneously in the lab to measure the quantity and quality of classroom emergent literacy practices, as further described below.

Measures

Educators’ content knowledge

Educators completed surveys in the fall that included the Teacher Knowledge Assessment Survey (Cunningham, Zibulsky, & Callahan, 2009). This is an adaptation of the widely used Moats (1994) survey, used to measure early childhood educators’ knowledge of English oral and written language structures. The measure includes 19 multiple-choice and short-answer items; 11 primarily focus on knowledge related to phonological awareness (e.g., counting or manipulating speech sounds, counting syllables) and 8 on connecting speech and print (e.g., phonetic regularity, sounds associated with specific graphemes). The number of correct responses is tallied to generate the score used in analyses. Internal consistency (Cronbach’s α) for the current sample was 0.76.
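The scoring and internal-consistency computation described above can be illustrated with a minimal sketch. The response matrix below is hypothetical (five respondents, four items scored 0/1) rather than the 19-item survey data, and the function name is an assumption introduced for illustration only.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(item_scores[0])
    # Sample variance of each item across respondents
    item_vars = [
        variance([resp[i] for resp in item_scores]) for i in range(n_items)
    ]
    # Sample variance of the tallied total scores
    totals = [sum(resp) for resp in item_scores]
    total_var = variance(totals)
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 1 = correct, 0 = incorrect
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
total_scores = [sum(resp) for resp in responses]  # number correct per respondent
alpha = cronbach_alpha(responses)
print(total_scores, round(alpha, 2))
```

The total score is simply the count of correct responses, and alpha compares the summed item variances with the variance of those totals.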

Quality of emergent literacy classroom practices

Research staff coded videotaped observations in the lab using the language and literacy subscales of the Teacher Behavior Rating Scale (see Assel, Landry, & Swank, 2008). These subscales capture the quality of emergent literacy classroom practices with respect to book reading, print and letter knowledge, phonological awareness, written expression, and oral language and are coded when considering the classroom as a whole (see Landry, Anthony, Swank, & Monseque-Bailey, 2009; Lonigan et al., 2015; Nguyen, Jenkins, & Auger Whitaker, 2018 for examples of the TBRS as used in the extant literature). Prior to coding, research staff reviewed the coding manual and associated protocols, completed a 2-day training, observed master coders, practiced coding against master-coded observations, and met developer-established benchmarks for interrater agreement. During coding, research staff viewed video from both video cameras simultaneously. Individual items are rated on a 4-point scale, with higher scores representing higher quality, and averaged to derive subscale scores (e.g., phonological awareness, print and letter knowledge) for a given observation. Subscale scores are averaged to derive an overall quality score. In addition, we averaged scores across fall and spring observations in order to reflect the quality of emergent literacy classroom practices on any typical day, similar to Lonigan et al. (2015). Interrater reliability, measured by intraclass correlations (ICCs) for a randomly selected 20% of observations that were double coded, ranged from 0.88 to 0.99 across subscales. The ICC for the overall quality score was 0.95.

Quantity of emergent literacy classroom practices

Research staff also coded videotaped observations in the lab using the Individualizing Student Instruction coding scheme, as adapted for early childhood classrooms (Connor, Morrison, et al., 2009; Pelatti et al., 2014; see also Connor, Piasta, et al., 2009; Piasta, Farley et al., 2019; Puranik, Al Otaiba, Sidler, & Greulich, 2014 for additional examples of using the ISI coding scheme). This coding scheme documents the amount of time, in min:sec, that individual children spend in emergent literacy activities relating to any of nine mutually exclusive domains: phonological awareness, alphabet knowledge, print and text concepts, word identification, text reading, oral language, vocabulary, comprehension, or writing. All activities lasting at least 15 s are captured, regardless of grouping (e.g., whole class, individual child) or whether the activity involves adults or only children. Coding is completed for a subset of individual children in a given classroom, with amounts of time averaged across children enrolled in the same classroom to approximate the typical time spent in different activities. In the current study, we selected five children per classroom for whom we completed this coding.

Prior to coding, research staff reviewed the coding manual and associated protocols, viewed a narrated PowerPoint training, scored at least 80% correct on four training quizzes, observed master coders, practiced coding against master-coded observations, and met a benchmark of no ICCs lower than 0.70 for any individual code when independently coding a series of three master-coded observations. During coding, research staff viewed video from both video cameras simultaneously and used Noldus Observer Pro software to indicate the start/stop times and domain for each activity in which selected children participated. The total amount of time per domain (e.g., phonological awareness, print and text concepts) is calculated for each individual child, and time across domains is summed to determine the overall quantity (in min) of emergent literacy classroom practices that an individual child experienced. We averaged the quantity scores across children enrolled in the same classroom to aggregate data to the classroom level and obtain a representation of what a typical child in a given classroom experiences. Similar to quality scores, we also averaged across the fall and spring observations to derive scores reflecting the quantity of emergent literacy practices on any typical day. ICCs ranged from 0.73 to 0.99 across domains for a randomly selected 20% of observations that were double coded; the ICC for the overall quantity score was 0.92.
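The aggregation steps described above (sum across domains within child, average across children within classroom, average across fall and spring) can be sketched as follows. The data structure, function names, and minute values are hypothetical and serve only to make the order of operations concrete; two children and two domains are shown for brevity.

```python
from statistics import mean


def child_total(domain_minutes):
    """Overall quantity for one child: minutes summed across coded domains."""
    return sum(domain_minutes.values())


def classroom_quantity(observation):
    """Classroom-level quantity: child totals averaged across selected children."""
    return mean([child_total(child) for child in observation])


def typical_day_quantity(fall, spring):
    """Average of the fall and spring classroom-level quantities."""
    return mean([classroom_quantity(fall), classroom_quantity(spring)])


# Hypothetical minutes per domain for two children in one classroom
fall = [
    {"phonological awareness": 2.0, "oral language": 10.0},
    {"phonological awareness": 4.0, "oral language": 8.0},
]
spring = [
    {"phonological awareness": 1.0, "oral language": 5.0},
    {"phonological awareness": 3.0, "oral language": 7.0},
]
print(typical_day_quantity(fall, spring))
```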

Results

Descriptive statistics are presented in Table 1. Our primary variables of interest were educators’ knowledge and the overall quality and quantity of their emergent literacy classroom practices. However, because the knowledge measure heavily emphasized educators’ understandings of phonological awareness and connections between speech and print (e.g., grapheme-phoneme relations), we also considered the quality and quantity of classroom practices specific to phonological awareness and print and letter knowledge. To measure the quantity of print and letter knowledge practices, which was not an original domain in the Individualizing Student Instruction coding scheme, we combined the amount of time spent in activities within the domains of alphabet knowledge, print and text concepts, and word identification. Below, we describe results of preliminary analyses as well as our main analyses; the latter considered both the nature (i.e., functional form) and strength of associations between educators’ knowledge and classroom practices.

Table 1 Descriptive statistics for educators’ content knowledge and classroom emergent literacy practice variables

Preliminary analyses

Prior to our main analyses, we conducted a number of preliminary analyses. First, we examined distributions for our variables of interest. Educators’ knowledge was normally distributed, and educators demonstrated almost the full range of possible scores (observed range = 3 to 19). The overall quality of classroom emergent literacy practices variable was normally distributed. However, those variables specific to the quality of phonological awareness and print and letter knowledge practices were non-normally distributed. Moreover, all quantity variables were count data and positively skewed. Second, we examined distributions of the classroom practice variables, as our dependent variables, for potential outliers. We identified and corrected potential univariate outliers following recommendations from Tabachnick and Fidell (2007) or, in the case of the quantity variables, recommendations from Hubert and Vandervieren (2008) which utilize a robust measure of skewness and thus more accurately identify potential outliers when data are skewed. Additionally, we considered a variety of indices (Mahalanobis distance, leverage criteria, studentized residual, DFFITS, covariance ratio; see Cousineau & Chartier, 2010) to identify multivariate outliers. Approximately 6% of the sample (28 cases) were identified by one of these indices. We thus conducted all analyses both with the full sample and with these 28 cases removed; given that the results were the same, we present only those results utilizing the full sample (results excluding multivariate outliers available from the second author upon request). Finally, we examined patterns of missing data. Although educator knowledge data were complete, one educator had missing classroom practice data, and missing data rates ranged from 0% to 10% on educator and program characteristics included as covariates in analyses. 
Separate variance t-tests suggested that data could be considered as missing at random, given systematic associations between missingness and other variables in the dataset (Graham, 2012). We used multiple imputation to handle these missing data (Graham, 2012). This method replaces missing values with plausible values based on the available data for each participant and the overall distributions of variables. Multiple datasets are created with imputed values (we imputed 40 datasets using SAS 9.4). Analyses are conducted on each imputed dataset, and final estimates are derived by pooling across all imputations using Rubin’s rules.
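As a minimal sketch of the pooling step, Rubin’s rules combine a parameter estimate and its sampling variance across imputed datasets as shown below. The three estimates and variances are hypothetical values chosen for illustration (the analyses themselves used 40 imputations), and the function name is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across imputed datasets using Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, sqrt(total)


# Hypothetical results from three imputed datasets
pooled_estimate, pooled_se = pool_rubin(
    [0.10, 0.12, 0.14], [0.0004, 0.0004, 0.0004]
)
print(pooled_estimate, pooled_se)
```

The pooled standard error exceeds the within-imputation standard error whenever estimates differ across imputations, which is how the uncertainty due to missing data enters the final inference.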

Nature of associations between knowledge and practice

We used generalized additive modeling (GAM) to explore the nature of associations between educators’ knowledge and the quality and quantity of their classroom practices. GAM is a flexible extension of generalized linear modeling (GLM). The latter framework assumes a constant linear association between variables. GAM, in contrast, is a non-parametric approach in which no a priori assumptions are made regarding the functional form of associations between variables. Rather, GAM allows the strength of associations to vary across the distributions and empirically determines the forms of associations between predictors and dependent variables. For this reason, GAM has been recently utilized in other educational research seeking to identify thresholds or plateaus in associations between variables (Le, Schaack, & Setodji, 2015; Setodji, Le, & Schaack, 2013; Setodji, Schaack, & Le, 2018). Non-linear associations are identified through significance testing, in which the fit of models allowing for non-linearity is compared to that of models specifying linear associations, as well as inspection of smoothed plots depicting results from fitted models.

We used Stata v14 to conduct our GAM analyses, with educators’ knowledge serving as the predictor and quality and quantity of classroom emergent literacy practices as the dependent variables. For each dependent variable, we conducted the GAM analysis twice. First, we included educators’ knowledge as the sole predictor in the model, to explore the nature of univariate associations between knowledge and practices. Second, we re-estimated models after including educator and program characteristics as covariates, to explore whether results differed once these variables were controlled. Covariates included educators’ levels of education, years of teaching experience, teaching certification, general versus special education context, program location, center- versus home-based program, educator:child ratio, number of children with individualized education plans in the classroom, number of children considered dual language learners in the classroom, and curriculum. We selected these covariates because of either theoretical or empirically documented associations with educators’ knowledge or classroom practices (e.g., Cunningham et al., 2004; Downer et al., 2012; Farley, Piasta, Dogucu, & O’Connell, 2017; Gordon & Chase-Lansdale, 2001; Jenkins et al., 2018; Justice, Mashburn, Hamre, & Pianta, 2008; Manning, Wong, Fleming, & Garvis, 2019; Phillips & Morse, 2011; Spear-Swerling & Zibulsky, 2014). Notably, Stata does not currently support GAM analysis with multiple imputed datasets; we therefore followed the recommendation of Gugiu (2018) and selected the one imputed dataset that exhibited the largest amount of covariability for use in GAM analyses. GAM results did not differ when covariates were and were not included; thus, we present the results from our original analyses (without covariates) only.

Results from GAM analyses are presented in the Figures and Table 2. The smoothed functions depicted in the Figures show the fitted associations between educators’ knowledge, on the x axis, and the non-linear smoothed contribution of educators’ knowledge to the quality or quantity of classroom emergent literacy practices, on the y axis; the shading represents the confidence intervals for the estimated associations. Table 2 provides the gain parameters estimated via the GAM analyses and associated statistical tests. The gain parameter is the deviance increase when the model was fit using a linear function rather than a smoothing (non-linear) function; larger gain values indicate worsening of fit when a linear association is estimated. Whether the fit is significantly worse when using the linear function is determined by the associated statistical test; if p < 0.05, the GAM results are interpreted as evidencing non-linear associations between the predictor and dependent variables.
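To make the logic of the gain parameter concrete, the following sketch compares the fit of a linear model with that of a more flexible model on simulated data. It is not the smoothing-based GAM estimated in Stata: a single hinge term at a hypothetical knot stands in for the smoother, ordinary least squares is used throughout, and the residual sum of squares serves as the (unscaled Gaussian) deviance. All function names, the knot location, and the data are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(design, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def gain(x, y, knot):
    """Increase in deviance when the flexible (hinge) term is dropped."""
    linear = [[1.0, xi] for xi in x]
    flexible = [[1.0, xi, max(0.0, xi - knot)] for xi in x]
    return rss(linear, y) - rss(flexible, y)


# Simulated knowledge scores spanning the observed range (3 to 19)
scores = [float(s) for s in range(3, 20)]
linear_outcome = [1.0 + 0.1 * s for s in scores]
threshold_outcome = [1.5 + 0.2 * max(0.0, s - 8.0) for s in scores]

print(gain(scores, linear_outcome, knot=8.0))     # near zero: linear fit suffices
print(gain(scores, threshold_outcome, knot=8.0))  # clearly positive: linear fit is worse
```

A gain near zero means the linear model loses essentially nothing relative to the flexible model, which is the pattern that supports a linear interpretation; a large gain relative to its reference distribution would instead point to non-linearity.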

Table 2 Results of generalized additive modeling and regression analyses

Results for quality of emergent literacy practices

We first examined the results concerning the nature of associations between educators’ knowledge and the quality of their classroom emergent literacy practices. Beginning with the overall quality of classroom practices (top of Fig. 1), visual inspection showed that increases in educators’ knowledge appeared to be related to increases in their quality of classroom practices. Also, the association between educators’ knowledge and the overall quality of classroom practices appeared to have a stronger positive slope for those educators scoring above 7 or 8 on the knowledge measure, perhaps indicating a threshold at this point although (a) the change in slope was not particularly pronounced and (b) there was considerable variability in associations for educators scoring below 7 or 8 on the knowledge measure, as evidenced by the wide confidence intervals. When we considered the statistical output from the GAM analysis (Table 2), the results did not show evidence of non-linearity in the associations between educators’ knowledge and the overall quality of their classroom practices. Specifically, the gain parameter was not significant (p = 0.869); thus, the form of the association between educators’ knowledge and the quality of their classroom practices was adequately and most parsimoniously described as linear.

Fig. 1 Fitted associations between educators’ content knowledge and quality of classroom emergent literacy practices [overall quality = top; phonological awareness (PA) quality = middle; print and letter knowledge (PLK) quality = bottom] based on generalized additive modeling results

The pattern of results was similar when considering the nature of associations between educators’ knowledge and the quality of their classroom phonological awareness and print and letter knowledge practices. Although, for instance, the association with quality of phonological awareness practices appears to turn negative for educators with the very highest levels of knowledge (middle of Fig. 1), neither this nor the association with print and letter knowledge practices exhibited non-linearity per the GAM analyses (Table 2). The plots in the middle and bottom of Fig. 1 imply a positive association between educators’ knowledge and the quality of these classroom practices.

Results for quantity of emergent literacy practices

Visual inspection of the plot depicting the association between educators’ knowledge and the quantity of classroom emergent literacy practices (top of Fig. 2) showed a generally positive association that shifted to a negative slope for educators scoring above 17 or 18 on the knowledge measure; notably, associations for educators with these high scores showed variability, as evidenced by the wide confidence intervals. Overall, the pattern was similar for the association between educators’ knowledge and the quantity of classroom phonological awareness practices (middle of Fig. 2), although the smoothed plot also showed a negative association for educators scoring between 11 and 14. In contrast, the association between educators’ knowledge and the quantity of their classroom print and letter knowledge practices appeared to be negative for educators scoring below 8 or 9 on the knowledge measure, flat or slightly positive for educators scoring between 9 and 15, and slightly negative for those scoring above 15 (bottom of Fig. 2). As in the other plots, confidence intervals around these associations were the widest for educators with the lowest and highest knowledge scores. Statistical analyses of these associations, however, did not provide evidence of non-linearity between educators’ knowledge and any of the quantity of classroom practices variables (Table 2). Thus, in all cases, the GAM results supported linear rather than non-linear associations.

Fig. 2. Fitted associations between educators’ content knowledge and quantity of classroom emergent literacy practices [overall quantity = top; phonological awareness (PA) quantity = middle; print and letter knowledge (PLK) quantity = bottom] based on generalized additive modeling results

Estimates of associations between knowledge and practice

GAM tests the nature of associations between predictor and dependent variables, in terms of whether non-linear associations exist. Following GAM analyses, one typically conducts additional analyses to estimate the association between the variables, utilizing analyses that appropriately account for the now-identified functional form of the association (e.g., using piecewise regression to account for instances in which the slope or form changes, using quadratic or cubic terms to model curvature). We thus followed our GAM analyses with regression analyses to estimate the linear associations between educators’ knowledge and the quality and quantity of their classroom emergent literacy practices. Traditional regression analysis was appropriate for modeling the association with the overall quality of classroom practices, but the non-normal distributions of the other dependent variables required alternative analyses. For the quality of classroom phonological awareness and print and letter knowledge practices, we used median regression because this semiparametric approach makes no assumptions about parametric or residual distributions (Cameron & Trivedi, 2009). For the quantity of classroom practices variables, we used negative binomial regression given that these represented count data (i.e., number of minutes). We used Stata v14 to conduct all regression analyses and pool results across the 40 imputed datasets. We again conducted analyses both with and without covariates; both sets of results are presented in Table 2 given slight differences in findings. Results of the regression analyses showed that educators’ knowledge was significantly and positively associated with all quality of classroom practice variables, both when estimated in univariate models and after controlling for covariates. Educators’ knowledge was also significantly and positively associated with the overall quantity of classroom practices and the quantity of classroom phonological awareness practices when estimated in univariate models; these associations were not significantly different from zero after controlling for covariates. Educators’ knowledge was not significantly associated with the quantity of classroom print and letter knowledge practices.
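Pooling across imputed datasets is commonly done with Rubin's rules, which is what Stata's `mi estimate` applies. The text does not name the pooling method, so treating it as Rubin's rules is an assumption. The sketch below shows that arithmetic for a single coefficient: the pooled estimate is the mean of the per-dataset estimates, and the pooled variance adds the average within-dataset variance to the between-dataset variance inflated by (1 + 1/m). The function name `pool_rubin` and the numbers are invented for illustration and are not values from Table 2.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], std_errors: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from >= 2 datasets")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Invented coefficients and standard errors from three hypothetical imputations.
example_estimates = [0.10, 0.12, 0.08]
example_std_errors = [0.05, 0.05, 0.05]
pooled_estimate, pooled_se = pool_rubin(example_estimates, example_std_errors)
print(f"pooled estimate {pooled_estimate:.4f}, pooled SE {pooled_se:.4f}")
```

The pooled standard error is never smaller than the root of the average within-dataset variance, because disagreement among the imputed datasets adds uncertainty about the missing values to the ordinary sampling uncertainty.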

Discussion

The results of this study support our third hypothesis and suggest that educators’ content knowledge is positively and linearly associated with their classroom emergent literacy practices. As such, these results support prior work that applied linear modeling techniques (e.g., Hindman & Wasik, 2011; McCutchen, Harry, et al., 2002b; Piasta et al., 2009; Puliatte & Ehri, 2018) and expand this work to demonstrate positive associations between early childhood educators’ literacy content knowledge and both the observed quality and quantity of their classroom emergent literacy practices. Notably, past studies of associations between content knowledge and classroom practices have often relied on measures of the latter that reflect quantity (e.g., extent to which educators used explicit and/or research-based literacy instructional strategies, amount of time providing literacy instruction; Cirino et al., 2007; McCutchen, Harry, et al., 2002b; Piasta et al., 2009; Spear-Swerling & Zibulsky, 2014), although some have measured non-literacy-specific instructional quality (Cirino et al., 2007; Hindman & Wasik, 2011) or utilized composites that include both quantity and quality (e.g., Piasta, Park, et al., 2019). Emerging work suggests, rather unsurprisingly, that both quality and quantity contribute to children’s early literacy learning (Auger, Farkas, Burchinal, Duncan, & Vandell, 2014; Burchinal, Zaslow, & Tarullo, 2016). Thus, it is an important finding that educators’ content knowledge is positively related to both of these aspects.

Although the findings uphold assumptions of linear associations between knowledge and classroom practices, it was nonetheless important to explore the functional form of these associations. To our knowledge, no prior work has empirically considered the possibilities of thresholds or plateaus in these associations despite explicit and implicit calls to attend to these (e.g., Moats, 2009a; Reutzel et al., 2011). Moreover, such non-linearity is conceptually plausible and would have serious methodological implications for future research, making it important to directly test these possibilities. Thus, affirming linear associations between educators’ knowledge and classroom practices is an important contribution. The current work also highlights the potential of GAM for testing whether thresholds, plateaus, or other functional forms characterize associations between two variables. As such, GAM may be an important methodology for continued work on this topic, including considering the nature of these associations when using other knowledge measures, classroom practice measures, and samples.

Our results show that, with a few caveats that we discuss below, higher levels of content knowledge are associated with better classroom emergent literacy practices; no thresholds or plateaus were evident. Assuming replication in future work, these results have notable implications for preservice educator preparation and inservice professional development. One implication concerns the level of literacy content knowledge required to receive initial or specialist teaching certification. Our results do not support a specific threshold of content knowledge that delineates novices from experts and imply that the knowledge levels used to demarcate these distinctions may need to be criterion-referenced rather than data-based (Reutzel et al., 2011). Notably, however, our sample included only inservice educators, and future research should explore the possibilities of knowledge thresholds in samples of preservice educators. A second implication concerns our finding that there does not appear to be a point at which greater knowledge is no longer associated with better practice. Although only correlational in nature and in need of future experimental testing, the lack of plateau implies that continued gains in knowledge, as through professional development, may yield continued improvements to practice. This aligns with requirements that educators participate in ongoing professional development. Likewise, this finding implies that we should continue to advance educators’ literacy content knowledge, regardless of their current knowledge levels. Accordingly, preservice coursework and inservice professional development need to be differentiated and responsive to educators’ range of knowledge (Weber-Mayrer, Piasta, & Pelatti, 2015) and require instructors who themselves have deep language and literacy knowledge such that they can appropriately meet these varying needs (Binks-Cantrell et al., 2012).

We note that, contrary to findings for all other practice measures, we did not find associations between educators’ content knowledge and the quantity of classroom print and letter knowledge practices. Given previous research suggesting that there must first be sufficient quantity of instruction in order to detect an association with educators’ knowledge (Schachter et al., 2016), we examined the distribution and found that the duration of print and letter knowledge practices was at least one minute for the majority of educators (72%); although this variable was positively skewed (which was accommodated in analyses), the distribution seemed sufficient to detect an association, particularly when compared to the distribution for quantity of phonological awareness practices.

This consequently raises the question as to why we found a positive association for phonological awareness but not print and letter knowledge. It is possible that print and letter knowledge practices are ubiquitous in early childhood classrooms because educators understand the importance of such activities for children’s emergent literacy development (Diamond & Powell, 2011; O’Leary, Cockburn, Powell, & Diamond, 2010). Extensive research emphasizes the importance of alphabet instruction (National Early Literacy Panel, 2008; Piasta, Petscher, & Justice, 2012; Piasta & Wagner, 2010), and basic implementation of such activities may require little specialized knowledge and be relatively easily integrated into the instructional day. However, simply because these practices are commonly used does not ensure that they are implemented well. Thus, it is reasonable to find a positive linear association between educators’ content knowledge and the quality but not quantity of print and letter knowledge practices. Conversely, phonological awareness practices may require much more specialized and detailed knowledge (Cunningham, Etter, Platas, Wheeler, & Campbell, 2015); accordingly, fewer educators may have requisite knowledge for providing such instruction, even for short amounts of time. This could explain why, although phonological awareness activities were less frequent than print and letter knowledge activities, we found a positive association between educators’ content knowledge and the quantity of phonological awareness practices but not print and letter knowledge practices.

Finally, in light of our results, we must revisit our suggestion that non-linearity in associations between educators’ content knowledge and classroom practices might explain mixed findings in the extant literature; this does not appear to be a viable explanation. The present study provides some insight into these mixed findings, and there are several other potential explanations aside from non-linearity. For example, it is possible that previous findings were dependent on how practice was measured in each study (e.g., self-report, quantity, quality). Self-reported quality or quantity is more susceptible to error and bias. Some measures of quantity may emphasize common practices, such as those involving print and letter knowledge, that are ubiquitous and thus do not show associations with knowledge. Past quality measures have often not been specific to literacy, and the accuracy of quality ratings may inherently depend on observing sufficient quantities of instruction. Moreover, including or excluding covariates could lead to mixed results. This possibility is exemplified in the current results, as only univariate associations between knowledge and quantity of instruction were evident; there is likely substantial shared variance between knowledge and covariates such as education level, teaching experience, certification, and classroom context. Other methodological challenges, such as small sample sizes, limited statistical power, and non-normal distributions of practice variables might also contribute to mixed findings across studies; a benefit of the current study was a large sample of early childhood educators such that even small associations could be detected.

Limitations, future directions, and conclusion

The current study had many strengths, including the use of GAM for considering non-linear associations, observed classroom practice measures that assessed both quality and quantity of emergent literacy practices, and a large, diverse sample of early childhood educators. However, a number of limitations are of note. First, commensurate with the body of work on this topic, we focused on educators’ literacy content knowledge; yet, other types of educator knowledge are likely relevant and necessary to consider in relation to practice (Reutzel et al., 2011). Second, the results of the current study might be specific to the content knowledge measure that we used, although our measure was similar to those widely utilized in the literature. Notably, both our measure and those commonly used in other studies emphasize knowledge of phonology and orthography (cf. Phillips, Oliver, Tabulda, Wood, & Funari, 2019; Washburn & Mulcahy, 2019), and future research should attend to educators’ knowledge relevant to supporting children’s language and other meaning-focused skills. Third, our results also might be specific to this sample of early childhood educators. Although we are reasonably confident in our results as these apply to this population, given that the sample reflected the diversity of U.S. early childhood educators, exhibited nearly the full range of knowledge scores, and demonstrated a normal distribution on the content knowledge measure, future research should replicate this work with preservice educators and those serving older children. Fourth, our results are also specific to the ways that we measured classroom practices, and alternative measures of observed practice may yield different results.

Fifth, although the sample was of sufficient size to have adequate representation of educators scoring in the tail ends of the distribution on the content knowledge measure, confidence intervals were widest for educators with the lowest or highest knowledge scores. It is unclear whether this reflects a property of the content knowledge measure, with measurement less reliable at the extremes of the distribution, or if this is a true representation of variability in associations; psychometric work will be important as knowledge measures continue to be developed and refined. Sixth, the scope of this study was limited to associations between knowledge and practice. However, analogous work examining the nature of associations between knowledge and children’s learning is a critical next step. Although we did not find thresholds of educator knowledge as related to practice, it is reasonable to test whether a particular level of knowledge is necessary to realize effects on children’s outcomes (Moats, 2009b; Moats & Foorman, 2003).

In sum, this work provides important insights into the nature of associations between educators’ literacy content knowledge and their classroom emergent literacy practices. Its findings support previous work and highlight not only the importance of content knowledge for practice but also GAM as a valuable tool to inform continued research on this topic.