Purpose of the study

The federal goal of improving the quality of teaching (U.S. Department of Education, 2002b) has made it critical that there be means of identifying more and less effective teachers. Assessment of teachers’ knowledge in the content area in which they teach is one such means. In reading, relatively little research has been done to develop and validate measures of teachers’ knowledge about early reading. Without evidence of the theoretical and psychometric dependability of such tests and without evidence that performance on such tests is related to gains students make in reading, it is difficult to assess efforts to improve teacher quality, including programs of professional development that schools and districts might offer their teachers. Of critical importance are studies designed to determine whether performance on a test of teachers’ knowledge accounts for students’ gains in reading achievement. This is the purpose of our study, focused specifically on teachers’ knowledge about early reading in high poverty schools.

Arguably, understanding methods that are effective in improving teachers’ knowledge is most important for high poverty schools, as there is evidence that the least qualified teachers are found in the most needy schools (Darling-Hammond, 2004; Peske & Haycock, 2006). The No Child Left Behind Act of 2001 authorizes funding for a federal initiative called Reading First, which is designed to help schools characterized by high poverty and low achievement make progress in improving students’ reading performance (U.S. Department of Education, 2002b). The present study was carried out in the context of the Reading First initiative in Michigan. The participants were early elementary teachers in Reading First schools, and the test of teacher knowledge was designed to measure the content emphasized by the professional development program used in Reading First schools in 2004–2005. Our results, therefore, may be limited in their generalizability given the unique context of Reading First in Michigan. However, these results should contribute to our understanding and discussions about the nature of teachers’ knowledge in reading and the promise of such tests to predict student growth in reading achievement.

Knowledge about early reading

Over the past 20 years, a number of research studies have used measures of teachers’ knowledge about reading to study teachers’ learning (e.g., effects of professional development or comparison of preservice and inservice teachers) (Bos, Mather, Dickson, Podhajski, & Chard, 2001; Bos, Mather, Narr, & Babur, 1999; Mather, Bos, & Babur, 2001; McCutchen, Abbott et al., 2002; McCutchen, Harry et al., 2002; Moats, 1994; Moats & Foorman, 2003). The content of these measures has largely been determined through logical analysis (i.e., determining what teachers need to know by analysis of what their students need to learn). Performance on these measures is often used to assess changes in knowledge or differences in knowledge held by different teacher groups. When used for such purposes, these tests are assumed to be valid measures of teachers’ knowledge about early reading.

Many of the measures of teachers’ knowledge about reading used in recent studies have adopted or adapted a survey of language knowledge developed by Moats (1994) (e.g., Bos et al., 2001; McCutchen & Berninger, 1999). In her survey, Moats placed emphasis on the knowledge of oral and written language, including phonology, morphology, phonics, and orthography (Moats, 1994; Moats & Foorman, 2003). In several studies, researchers have considered one other aspect of teachers’ knowledge that might be related to effectiveness of reading instruction—knowledge about children’s literature (e.g., Cunningham, Perry, Stanovich, & Stanovich, 2004; McCutchen, Harry et al., 2002). However, results have not shown this aspect of early reading knowledge to be closely related to various indices of effective instruction or linguistic knowledge.

Moats’ survey and adaptations of it have focused more on content knowledge about early reading than on application of that knowledge in the classroom. For example, teachers might be asked to indicate the number of phonemes in certain words (e.g., how many phonemes are in shut or stripe?) but not to demonstrate how they would teach students to segment words into phonemes. Unlike researchers studying teacher knowledge in other content areas, reading researchers have not developed and tested measures of pedagogical content knowledge, which would be likely to reflect the ways that content knowledge about reading is utilized in instruction (Shulman, 1987). Although pedagogical content knowledge might be a better index of the effectiveness of teachers’ instruction than content knowledge (Snow, Griffin, & Burns, 2005), it is nonetheless important to determine the relationship between content knowledge and students’ gains in reading. For example, it remains an empirical question whether teachers’ content knowledge is the bottleneck in producing high quality literacy instruction. Although prior research has demonstrated associations between content knowledge and student learning, these studies have not examined whether teachers with higher content knowledge produce greater growth in their students’ achievement. This distinction is important because a cross-sectional correlation between knowledge and achievement could be produced by selection effects, whereby more knowledgeable teachers gravitate toward schools with better performing students.

Even though surveys of teachers’ knowledge have generally not been subjected to rigorous tests of validity, the studies referenced above have contributed to our knowledge in several important ways. One contribution is descriptive information about kinds of knowledge that teachers might need—e.g., knowledge about language or about children’s literature. Another contribution is evidence that the measures are sensitive to teachers’ learning. For example, in Foorman and Moats (2004), results indicated that teachers with high attendance at the professional development courses performed better on the teacher knowledge survey than those who attended some or none of the sessions. However, further efforts to establish the validity of such teacher knowledge measures must focus on the extent to which such knowledge contributes to students’ progress in learning to read.

Teachers’ knowledge about reading and professional development

Researchers interested in the development of programs for professional development in reading have used measures of teachers’ knowledge to study the effectiveness of these programs. In some cases, these studies also provided indirect evidence that teachers’ knowledge plays a role in effective instruction. For example, Bos and her colleagues (1999) studied changes in reading knowledge of teachers who did and did not participate in a program of professional development in reading. Using a survey adapted from Moats (1994) to assess teachers’ knowledge, they found that participating teachers made significant gains on the survey administered before and after the program; another group of teachers who did not participate in the program did not make significant gains. While comparisons of teaching practices and students’ gains in reading were carried out for the two groups of teachers, the researchers did not examine the relation of teachers’ knowledge about reading and gains in reading made by their students.

McCutchen, Abbott et al. (2002) used a similar study design and analytic approach. These researchers studied the impact of professional development on instructional practices and students’ gains in reading for a group of early elementary teachers. The researchers built hierarchical linear models to examine gains in students’ reading achievement. These models compared either teachers who did and did not participate in the professional development program or teachers who were or were not high quality implementers. Here, too, a parallel analysis using performance on the survey of teachers’ knowledge to examine students’ reading achievement was not conducted.

Only one study has examined the extent to which teachers’ knowledge of early reading accounted for variance in students’ gains in reading achievement. This analysis was reported in a study of the effects of professional development and teachers’ practices on student outcomes carried out by Moats and Foorman (2003; see also Foorman & Moats, 2004). This 4-year longitudinal study involved chronically low-performing schools with high-poverty populations in Houston and the District of Columbia. The teacher knowledge survey, adapted from Moats (1994), assessed knowledge of speech sounds, morphology, phonological patterns, and orthographic rules. In an attempt to base the survey more on teachers’ performance, they added questions pertaining to a running record of oral reading errors and a writing protocol. The test was given to third- and fourth-grade teachers before and after the professional development. Findings indicated low but statistically significant relations among teachers’ knowledge, their effectiveness in the classroom (assessed by observations), and students’ end-of-year reading achievement. However, no controls for prior achievement were included in this analysis. In a regression analysis examining the variables that contributed to students’ year-end scores on the Woodcock-Johnson Revised Broad Reading cluster, teachers’ knowledge made a small contribution but interacted significantly with site, indicating a significant effect in Houston but not in the District of Columbia (Foorman & Moats, 2004). Although the results suggest that teachers’ knowledge and students’ end-of-year reading achievement were related, the analysis would need to focus on students’ gains in reading in order to determine whether teachers’ knowledge contributed to improved reading on the part of the students.

The research literature discussed above includes studies in which surveys of teachers’ knowledge about reading are used to assess the impact of a professional development program on teacher learning. Clearly, this is an important step in validating measures of teachers’ knowledge about reading. However, these studies have methodological weaknesses. None have employed experimental or quasi-experimental designs that controlled for variables that might have influenced either instruction or student achievement. Most did not include adequate controls for socio-demographic information about students or classroom and school characteristics. These omissions limit the extent to which inferences can be drawn about whether teachers’ knowledge explains variation in instruction or students’ achievement. For example, an association of students’ reading achievement with teachers’ knowledge could be attributable to socio-demographic characteristics of the student population (i.e., students from higher socio-economic backgrounds are found in schools and classrooms with more knowledgeable teachers).

Another concern is that studies using measures of teachers’ knowledge about reading usually have not reported psychometric analyses other than internal consistency reliability. Untested assumptions about the nature of the measure (for example, an assumption of uni-dimensionality of test content) might lead to unfounded interpretations of test results. Furthermore, the absence of information about the domains or types of knowledge measured by different teachers’ knowledge tests makes it impossible to determine whether inconsistent findings of the research to date are attributable to a true lack of effects for teachers’ knowledge or to poor measurement of the construct. The importance of psychometric analyses is highlighted by Phelps and Schilling (2004), whose analyses indicated that teachers’ reading knowledge was multidimensional. In the present study, therefore, we did not simply assume unidimensionality of our test of knowledge, since an unwarranted assumption of this kind can lead to uninterpretable or misleading findings when examining the effects of teachers’ knowledge on student achievement.

Studying teachers’ knowledge in the context of Reading First

The No Child Left Behind Act of 2001 (U.S. Department of Education, 2002b) provides support for improving the quality of teaching of early elementary reading in schools with high poverty and high levels of underachievement in reading. The Reading First legislation and guidance require that states provide high quality professional development to teachers as a means of improving their reading instruction (U.S. Department of Education, 2002a). The expectation is that the resulting improvements in reading instruction will lead to improvements in students’ reading achievement. From the outset of Michigan’s Reading First program, professional development was based on a program called Language Essentials for Teachers of Reading and Spelling (LETRS) (Moats, 2003). LETRS is based on Moats’ earlier (1994) view that teachers need to improve their knowledge about basic language processes. She has argued that teachers must be knowledgeable about reading development, reading difficulties, and research-based instruction; further, they must have an understanding of how to put this knowledge to work in the classroom (Moats, 2003). The LETRS program provides a foundation in current research on language and learning to read. Although there are opportunities in the program to analyze texts or samples of students’ spelling, the program does not explicitly link the content of the professional development to teachers’ current instruction, especially the required curriculum.

The program of professional development in Michigan’s Reading First schools closely followed the lessons in the nine modules of LETRS. Expert trainers were hired to train the literacy coaches in the Reading First schools, and these coaches, assisted by Reading First facilitators (reading experts who each oversaw implementation of Reading First in about five schools), provided the professional development for the teachers. However, the contexts of the instructional sessions varied by district and school, as did the scheduling of the nine seminars through the 2003–2004 school year. For this reason, the teachers’ knowledge measure was administered as part of surveys completed by the teachers three times during the 2004–2005 school year (fall, winter, and spring).

Teachers also received professional development to assist them in using the new comprehensive instructional program that was part of their Reading First program (each district selected one of five comprehensive programs). The state Reading First plan required districts to provide training for teachers in the use of the comprehensive program. Because teachers attended two types of professional development (LETRS and comprehensive program training), both may have influenced teachers’ learning about reading. In this study, we make no claims about where the teachers acquired knowledge about reading that they might have used on the test of reading knowledge. Instead, our purpose is to study the extent to which this knowledge accounts for variance in students’ reading achievement, when controlled for students’ socio-demographic characteristics and prior reading achievement and for teachers’ professional and personal characteristics.

Method

Reading First schools and data collection procedures

In 2004–2005, 112 elementary schools participated in the state’s Reading First initiative. However, because our data analyses were carried out separately for grades 1 through 3 and because the range of grade levels varied by school (some included one or two grade levels, some included five or six), the number of schools was different for each of the grade-level analyses. The average school size was 357 students (SD = 130). The mean student–teacher ratio was 22 (SD = 4). Thirty-six percent of fourth-grade students in the schools included in this study did not meet state proficiency standards in reading in the year the school became a Reading First school.

Data for this study were from the second and third years of data collection of the Evaluation of Reading First in Michigan (2003–2004 and 2004–2005). Teacher data were taken from Teacher’s Quest, a self-administered questionnaire that the teachers were required to complete three times a year (fall, winter, and spring). Reading First facilitators administered Teacher’s Quest at staff or grade-level meetings in each school. Teachers worked independently to fill out the questionnaire. Each administration of Teacher’s Quest in 2004–2005 contained one of three parts of a test of reading knowledge called Language and Reading Concepts (LRC). In addition, a self-report of information about the teacher (e.g., education, previous teaching experience) was included in the fall administration.

Student data included performance on two subtests of the Iowa Tests of Basic Skills (ITBS) from the spring of 2004 and 2005 and socio-demographic information drawn from the state’s Single Record Student Database (SRSD). The data sources for teachers and students are described below.

Student data collection

ITBS Word Analysis and Reading Comprehension subtests

Because the study was concerned with the acquisition of students’ early literacy skills, we chose to use students’ performance on two subtests of ITBS to characterize their reading achievement: Word Analysis and Reading Comprehension. Word Analysis involves identifying and matching sounds and spelling elements of words. Reading Comprehension involves selecting responses to questions based on sentences or short passages. As reported by the publisher, the reliability (computed with Kuder–Richardson Formula 20) for each subtest for grades 1, 2, and 3 (presented in that order) is as follows: Word Analysis: .85, .85, .85; Reading Comprehension: .91, .90, .91. The measure used for the study is the developmental standard score (SS), defined as “a number that describes the student’s location on an achievement continuum” (Hoover, Dunbar, & Frisbie, 2003). According to the scale and norm information reported in the ITBS test manual, the median SS is 150 for first graders, 168 for second graders, and 185 for third graders.
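
For readers unfamiliar with Kuder–Richardson Formula 20, the following minimal sketch (in Python, using simulated dichotomous item scores rather than the publisher’s norming data) illustrates how the coefficient is computed from a matrix of 0/1 item responses.

```python
import numpy as np

def kr20(responses: np.ndarray) -> float:
    """Kuder-Richardson Formula 20 for a matrix of 0/1 item scores
    (rows = examinees, columns = items)."""
    k = responses.shape[1]                          # number of items
    p = responses.mean(axis=0)                      # proportion correct per item
    q = 1.0 - p
    total_var = responses.sum(axis=1).var(ddof=1)   # variance of examinees' total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)

# Simulated data: 200 examinees with a latent ability, 30 dichotomous items
rng = np.random.default_rng(0)
theta = rng.normal(0, 1, size=(200, 1))             # examinee ability
b = rng.normal(0, 1, size=(1, 30))                  # item difficulty
prob = 1.0 / (1.0 + np.exp(-(theta - b)))           # probability of a correct response
scores = (rng.random((200, 30)) < prob).astype(int)
print(round(kr20(scores), 2))                       # roughly .8 with these simulated settings
```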

Each spring, classroom teachers administered ITBS reading subtests to their students, with the assistance of the literacy coach and other staff in the school. Students included in the data analyses were those who were taught by a teacher who participated in the assessment of teachers’ knowledge and who had ITBS test results for the spring of 2004 (when they were in grades K-2) and the spring of 2005 (when they were in grades 1–3). In each of the six analyses (two outcomes for each of the three grades), more than 2,700 students were included. Table 1 provides the specific number of students included in each analysis as well as descriptive statistics for the eligible students.

Table 1 Descriptive statistics for student- and teacher-level covariates

Socio-demographic characteristics

Background characteristics of students were collected from the State of Michigan’s Single Record Student Database (SRSD). Data from SRSD were used by Michigan schools to report student race, gender, limited English proficiency, disability status, and eligibility for free or reduced price lunch for NCLB. Compared to the nationally representative sample from the Early Childhood Longitudinal Study (ECLS), schools in Michigan’s Reading First program serve a greater proportion of poor and minority students. For example, while only about 32% of students in the nationally representative ECLS sample received free or reduced price lunch, approximately 72% of the students in the Reading First schools in this study did so. Similarly, while 55% of the students in the ECLS data set were white, the percentage of white students within each grade level in Reading First schools was at or below 38%.

Participating teachers

In 2004–2005, 977 teachers completed the three parts of a test of reading knowledge called Language and Reading Concepts (LRC) administered in the fall, winter, and spring Teacher’s Quest. In these administrations, teachers also provided descriptive information about themselves.

Language and reading concepts (LRC)

The measure of reading knowledge for this study was the composite score of the three parts of LRC, administered in the fall, winter, and spring. As noted earlier, LRC was aligned with the LETRS professional development program in Michigan from 2003 to 2005 (Moats, 2003). Of the nine modules in LETRS, the content of Module 1 (an introduction to LETRS) and Module 8 (Assessment for Prevention and Early Intervention) was not included in the test. Most items were developed from the content of Modules 2, 3, 4, 5, and 6, as these map specifically onto the five required components of reading instruction in the Reading First legislation: phonemic awareness, phonics, fluency, vocabulary, and reading comprehension. LRC is made up of 56 items (20 each in parts A and B and 16 in part C); these are shown in Appendix A.

Procedures for estimating scores for teacher knowledge of reading

Responses from the three LRC administrations were combined to form a single measure (Footnote 1). Teachers with missing data due to skipped items or a missed administration were included in the analysis, with missing responses coded as not presented. Full information factor analysis (Thissen & Wainer, 2001), conducted to investigate the dimensionality of LRC, indicated that LRC is best fit by a single factor and that all items can therefore be scored as a single measure. Two-parameter item response theory (IRT) models (Hambleton, Swaminathan, & Rogers, 1991), implemented in the software program BILOG (Mislevy & Bock, 1997), were then used to investigate scale properties and to score participants.

LRC had a moderate to high IRT reliability of .88. The test information curve indicates that LRC provided sufficient information to reliably distinguish participants with abilities ranging from 4 standard deviations below the mean to approximately 1.75 standard deviations above the mean (see Fig. 1). The LRC measure provides sufficient information to distinguish with high levels of reliability among all but the relatively small proportion of very knowledgeable participants at the top end of the ability spectrum.

Fig. 1 Test information curve for the Language and Reading Concepts (LRC) assessment
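
As a rough illustration of the computation behind the test information curve in Fig. 1, the sketch below (in Python, with made-up item parameters rather than the BILOG estimates for LRC) evaluates the two-parameter logistic item response function and sums item information across items.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL probability of a correct response: P = 1 / (1 + exp(-a(theta - b)))."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def test_information(theta, a, b):
    """Test information: sum over items of a_i^2 * P_i(theta) * (1 - P_i(theta))."""
    p = p_correct(theta[:, None], a[None, :], b[None, :])
    return (a[None, :] ** 2 * p * (1.0 - p)).sum(axis=1)

# Made-up parameters for a 56-item test (not the estimated LRC item parameters)
rng = np.random.default_rng(1)
a = rng.uniform(0.8, 2.0, size=56)     # discriminations
b = rng.normal(-1.0, 1.0, size=56)     # difficulties, centered below the mean ability
theta = np.linspace(-4, 4, 161)        # ability grid in standard deviation units
info = test_information(theta, a, b)

# Conditional reliability at a given ability is roughly info / (info + 1),
# so the region of high information is where examinees are measured reliably.
print(round(float(info.max()), 1), round(float(theta[info.argmax()]), 2))
```

A plot of info against theta would have the same general shape as the curve reported for LRC, peaking in the region of the ability scale where the test measures most precisely.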

Overall, the analysis indicates that LRC is a sound scale measure. There are no prominent issues of dimensionality. Further, participants are measured reliably across the majority of the ability spectrum.

Teacher information

Teacher-level covariates were obtained from the fall administration of Teacher’s Quest. These allow for statistical controls for teachers’ knowledge, race, background, and training. Table 1 provides descriptive statistics for the 258 first grade teachers (in 102 schools), 242 second grade teachers (in 97 schools), and 247 third grade teachers (in 99 schools) included in the statistical analyses (Footnote 2). Both the classroom teacher’s gender and number of years of teaching experience were omitted because preliminary analyses showed no relationship to students’ improvement in reading.

Relation between teachers’ knowledge and teacher quality proxies

In an effort to understand teachers’ acquisition of knowledge about reading, we examined the relation of their teaching credentials and previous educational and teaching experience to their performance on the knowledge measure. We found that 61% had a master’s degree, 61% had permanent certification, 96% had been teaching for more than 3 years, and 94% were female. Teachers also indicated whether they had completed other programs of professional development in reading (e.g., Orton–Gillingham; Four Blocks); the number of programs completed by each teacher was counted, on the assumption that the amount of previous professional development might affect their overall knowledge about reading. The results showed that 57% of the teachers had completed two or more training programs. Table 2 presents these teacher quality proxies broken down by a categorical variable created from our continuous measure of teachers’ knowledge on LRC. Teachers were coded as low, middle, or high knowledge teachers, where the low group consisted of teachers scoring in the bottom quartile on the knowledge measure, the high group consisted of the top quartile, and the middle group consisted of the middle two quartiles. For each grade level, there was no significant association between the level of teachers’ knowledge and their professional attainments or experiences.

Table 2 Percent of teachers at each grade level with a master’s degree, permanent certification, and two or more professional trainings
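
To make the categorization and the association tests concrete, the following sketch (in Python; the data frame, column names, and values are all hypothetical) splits a continuous knowledge score at the quartiles into low, middle, and high groups and runs a chi-squared test of association with a dichotomous credential.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

rng = np.random.default_rng(2)
teachers = pd.DataFrame({
    "lrc_score": rng.normal(0, 1, 250),    # IRT-scaled knowledge score (hypothetical)
    "masters": rng.random(250) < 0.61,     # has a master's degree (hypothetical)
})

# Low = bottom quartile, middle = middle two quartiles, high = top quartile
teachers["knowledge_group"] = pd.qcut(
    teachers["lrc_score"], q=[0, .25, .75, 1.0], labels=["low", "middle", "high"]
)

# Chi-squared test of association between knowledge group and the credential
table = pd.crosstab(teachers["knowledge_group"], teachers["masters"])
chi2, p, dof, _ = chi2_contingency(table)
print(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```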

Statistical models

To answer our research question concerning the contribution of teachers’ knowledge about reading to improvements in students’ reading achievement, we used hierarchical linear modeling as our data analytic strategy. The outcome variable in each of our grade-level analyses is one of two status measures of student achievement on ITBS in the spring of 2005 (Word Analysis or Reading Comprehension). However, student scores at any particular time point are not independent of a number of influences. To account for the nested nature of the data, we used a three-level hierarchical linear model (HLM) where students were nested within teachers and teachers were nested within schools. In addition, to account for various student- and teacher-level influences on achievement, we included a number of covariates in our models.

At level-1 of these models, students’ achievement was modeled as randomly varying around the mean achievement of students in a classroom within a given school, and as a function of a number of student characteristics, including an aligned measure of students’ prior achievement, the gender of the student, the race of the student, and whether or not the student was disabled, limited English proficient, or eligible for free or reduced price lunch. In the models, the effects of these student-level characteristics were grand-mean centered (Footnote 3) and treated as fixed effects. The general form of the level-1 regression equation was as follows:

$$ Y_{ijk} = \pi_{0jk} + \sum_{p=1}^{P} \pi_{pjk} a_{pijk} + e_{ijk} $$
(1)

where \( Y_{ijk} \) is the ITBS scale score for student i who had teacher j in school k, \( \pi_{0jk} \) is the mean student outcome for teacher j in school k, \( a_{pijk} \) are the student covariates (e.g., prior achievement, free or reduced price lunch eligibility) that predict achievement status, \( \pi_{pjk} \) are the corresponding level-1 regression coefficients that indicate the strength and direction of association between each covariate and achievement, and \( e_{ijk} \) is a random effect assumed normally distributed with a mean of 0 and variance \( \sigma^{2} \).

At level-2 of the HLM model, we hypothesize that achievement outcomes in classrooms of different teachers within the same school vary randomly around school means, and are a function of several teacher characteristics, including teachers’ knowledge of reading, gender, race, the number of professional development training sessions they attended, and whether or not the teacher has a master’s degree or permanent certification. These covariates are treated as fixed effects in the model.

Primarily, we were interested in the association between our measure of teachers’ knowledge and classroom achievement, adjusting for all of the student- and teacher-level covariates. Thus, the general model of the level-2 intercept equation took the following form:

$$ \pi_{0jk} = \beta_{00k} + \sum_{q=1}^{Q} \beta_{0qk} X_{qjk} + r_{0jk} $$
(2)

where \( \beta_{00k} \) is the average achievement in school k, \( X_{qjk} \) are the teacher/classroom characteristics described earlier (e.g., teacher knowledge, master’s degree, etc.), \( \beta_{0qk} \) are the corresponding level-2 coefficients that represent the strength and direction of association between each teacher/classroom characteristic and the mean student outcome for teacher j in school k, and \( r_{0jk} \) is the random effect of \( \pi_{0jk} \) assumed to be normally distributed with a mean of 0 and a variance \( \tau_{\pi} \).

At level-3 of the HLM models, we were primarily concerned with accounting for the nested structure of the data. No covariates are included at the school-level. Therefore, the level-3 intercept equation was as follows:

$$ \beta_{00k} = \gamma_{000} + u_{00k} $$
(3)

where \( \gamma_{000} \) is the grand mean of achievement for all students, and \( u_{00k} \) is the random effect of \( \beta_{00k} \).
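
Readers who wish to see the structure of such a three-level model in code may find the following sketch useful. It is written in Python with statsmodels rather than dedicated HLM software, represents the teacher-level random intercepts as variance components nested within the school grouping factor, and uses hypothetical file and column names; grand-mean centering and missing-data handling are omitted for brevity.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per student; all column names below are hypothetical:
#   itbs_2005, itbs_2004   -> outcome and prior-achievement scale scores
#   female, disabled, lep, frl -> student covariates (0/1)
#   lrc, masters, perm_cert    -> teacher covariates
#   teacher_id, school_id      -> grouping identifiers
df = pd.read_csv("students.csv")

model = smf.mixedlm(
    "itbs_2005 ~ itbs_2004 + female + disabled + lep + frl"
    " + lrc + masters + perm_cert",
    data=df,
    groups="school_id",                           # level-3 random intercept (schools)
    re_formula="1",
    vc_formula={"teacher": "0 + C(teacher_id)"},  # level-2 random intercept (teachers in schools)
)
result = model.fit(reml=True)
print(result.summary())
```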

Results

The research question focused on the influence of teachers’ knowledge about reading on students’ improvement in reading. In order to examine this influence, we began by assuming a linear relationship; that is, greater amounts of knowledge translated into greater amounts of student learning as measured by the ITBS Word Analysis and Reading Comprehension subtests at the first-, second-, and third-grade levels. We also examined the influence of teacher knowledge by categorizing our continuous measure of knowledge into low, middle, and high knowledge groups (where the low group consisted of the bottom quartile, the high group consisted of the top quartile, and the middle group consisted of the middle two quartiles). Thus, the effects of teachers’ knowledge in each grade were examined using both continuous (rank of raw scores) and categorical (low, middle, and high) characterizations of teachers’ knowledge. The two sets of analyses provided very similar results. Below, for ease of communication, we present only the results from the categorized teachers’ knowledge variable.

Descriptive results from teacher and student measures

Because we were interested in examining covariate-adjusted models of student achievement, adjusting for students’ prior achievement, we began by examining the 2004 and 2005 student achievement test results for teachers grouped on the basis of performance on the knowledge measure. First graders’ reading achievement was reasonably close to the median scale score of 150, as reported by the test publisher. By third grade, however, students in the Reading First schools had fallen further behind the median score published by ITBS. The mean for Reading First students was 180.41 in Reading Comprehension and 178.18 in Word Analysis, compared with the median score of 185 from the test publisher.

Figure 1 in Appendix B shows graphical depictions of our raw data in first grade. These figures showed almost identical pre- and post-achievement scores for students in the low, middle and high categories of teacher knowledge. This is true of students’ scores on both Reading Comprehension and Word Analysis. Thus, simple examination of the raw data did not indicate a direct association between the categories of teacher knowledge and improved student outcomes in first grade.

Figure 2 in Appendix B reveals similar findings for students in second grade. Once again these figures showed essentially no differences in the slopes of the lines defined by the pre-post achievement tests. There appeared to be little difference in the amount students gained on ITBS, based on their placement in a low, middle, or high knowledge teacher’s classroom.

Finally, Fig. 3 in Appendix B shows a slightly different picture. In the third grade, the pre-post achievement slopes for the three groups of teachers did appear to be somewhat different on both reading subtests.

Even though these results seemed to indicate an advantage for third grade students in a classroom where the teacher had a high level of knowledge, it remained to be seen whether these differences were statistically significant. In addition, because we were interested in the unique contribution of teachers’ knowledge to student achievement, it also remained to be seen if the apparent differences remained after adjusting for student and teacher characteristics in our statistical models.

Teacher-level results of statistical models

The results presented in Table 3 show the model-based estimates for the effects of teachers’ knowledge on students’ achievement adjusting for all student and teacher covariates. These results confirmed our initial look at the raw data, where there did not appear to be any differences between low, middle, and high knowledge teachers in the first and second grades, and only modest differences in third grade in Reading Comprehension.

Table 3 Results of statistical analyses

Thus, our HLM models revealed no statistically significant effects of teacher knowledge (at p < .05) at any of the three grade levels. This was true of comparisons between middle and low knowledge teachers as well as comparisons between high and low knowledge teachers. As shown in Table 3 (and consistent with the figures constructed from the raw data), some differences between high and low knowledge teachers did emerge in the third grade: the contrast between high knowledge and low knowledge teachers was marginally significant (p < .10) for Reading Comprehension. The Word Analysis outcome, however, even in the third grade, failed to reach even marginal significance when adjusting for the student and teacher covariates in our model.

It should be noted here that our findings are robust to some traditional concerns about regression analyses. First, multicollinearity of the teacher-level predictors is not a source of concern in these models. Consider, for example, the aforementioned results showing the lack of association between teachers’ knowledge of reading and teachers’ professional attainments and experiences. Chi-squared tests showed no significant association between the categories of knowledge and (a) having a master’s degree, (b) having permanent certification, or (c) having more than one training, at any grade level. Thus, the addition of these variables did not alter the relationship (i.e., did not alter the coefficient or standard error) between teachers’ knowledge and improvements in students’ achievement. Second, there is sufficient power in the data to reliably detect effect sizes as low as .12 (Footnote 4). In fact, the effect size of the marginally significant finding in third grade between teachers with high versus low knowledge on Reading Comprehension achievement is about .13 (Footnote 5). This indicates that there is sufficient power to detect even very small effects with these data. Finally, the results from the HLM analysis confirmed the results from the raw data: there was little association between levels of knowledge and students’ improvements in reading achievement in first and second grades, and only a modest relationship for Reading Comprehension in third grade.
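
The effect-size and power calculations themselves are documented in the original footnotes and are not reproduced here; as a hedged illustration of one common convention, a standardized effect size for a group contrast can be obtained by dividing the covariate-adjusted coefficient by the standard deviation of the outcome. The values below are placeholders, not the study’s estimates.

```python
# Placeholder values only; the study's actual coefficient and outcome SD are not shown here.
# The point is the arithmetic: effect size = adjusted coefficient / SD(outcome).
high_vs_low_coef = 2.6   # hypothetical adjusted high- vs. low-knowledge contrast (scale-score points)
outcome_sd = 20.0        # hypothetical SD of the ITBS developmental standard score
print(round(high_vs_low_coef / outcome_sd, 2))   # -> 0.13
```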

In addition to examining the effect of our measure of teacher knowledge, we also sought to examine whether proxies for teacher quality predicted higher student achievement. Surprisingly few of the measures of teacher quality reached statistical significance. In fact, there were only 3 significant findings out of a potential of 18 (the three proxies of master’s degree, certification, and number of professional trainings crossed with the six models presented in Table 3). Moreover, these proxies did not show a consistent relationship to achievement in the covariate-adjusted models. Although 3 out of 18 might suggest some evidence of a relation between teacher quality and students’ gains in reading, the teacher quality variables that accounted for significant variance were not the same across grade levels and subtests. In the first grade, students gained fewer points on the Reading Comprehension scale when their teacher did not have a master’s degree. In addition, students gained fewer points when their teacher did not have permanent certification in two of the six models: in second grade on the Word Analysis scale and in third grade on the Reading Comprehension scale (both p < .05).

One potential reason for the lack of direct relationships between teacher covariates and students’ achievement in these models may be the limited amount of variance located between teachers within schools. Because so much of the variance in achievement lies between students, there was little between-classroom variance in achievement available to be explained in these models. Across the six different models, only 7–11% of the variance was between classrooms within schools. However, the models also did not explain much of the between-classroom variance that existed. Depending on the outcome and grade level, the addition of all the teacher-level covariates explained between 5% and 22% of the teacher-level variance. This was not surprising given the lack of significant findings for the teacher covariates.
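
To show how these proportions can be read off the fitted variance components, the following sketch (in Python, with placeholder variance components rather than the study’s estimates) computes the share of variance at each level of an unconditional three-level model and the share of teacher-level variance explained once teacher covariates are added.

```python
def variance_shares(sigma2_student: float, tau_teacher: float, tau_school: float) -> dict:
    """Proportion of total variance at each level of a three-level model."""
    total = sigma2_student + tau_teacher + tau_school
    return {
        "between students": sigma2_student / total,
        "between teachers within schools": tau_teacher / total,
        "between schools": tau_school / total,
    }

# Placeholder variance components from a hypothetical unconditional model
print(variance_shares(sigma2_student=320.0, tau_teacher=30.0, tau_school=50.0))
# -> teacher share of 7.5%, within the 7-11% range reported above

# Proportion of teacher-level variance explained by adding the teacher covariates
tau_teacher_null, tau_teacher_full = 30.0, 25.0
print(round((tau_teacher_null - tau_teacher_full) / tau_teacher_null, 2))   # -> 0.17
```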

Student-level results of statistical models

In contrast, the models explained a fair amount of between-student variance. This was because the models controlled for student demographics that were highly related to their achievement status, even after adjusting for prior achievement. Thus, in nearly all of the models, females were shown to score higher than males, white students scored higher than black students, and disabled students and students eligible for free or reduced price lunch scored lower than their more advantaged peers. These findings are consistent with prior large-scale research in education (e.g., Borman, Hewes, Overman, & Brown, 2002) and confirm the role student demographics play in predicting achievement outcomes and the importance of having them in statistical models as controls.

Discussion

We began this investigation with an interest in determining the extent to which early elementary teachers’ performance on a test of reading knowledge accounted for improvements in their students’ word reading and reading comprehension achievement. The study was carried out within the context of Reading First, where professional development is required so that teachers might improve their knowledge about how to teach early reading (Moats & Foorman, 2003). Although researchers investigating teachers’ knowledge about early reading have reported significant associations of teachers’ performance on a measure of reading knowledge and their participation in professional development, they have not sought to validate these measures by looking at the extent to which such knowledge contributed to gains in reading made by students in the teachers’ classrooms. This was the purpose of the present study.

We start this section by discussing the results and then turn our attention to two factors that might account for the weak association between teachers’ knowledge and students’ reading improvement: (a) the concept of teachers’ knowledge on which Language and Reading Concepts (LRC) was based and (b) the alignment of the knowledge measure, the reading curriculum, teachers’ practices, and student tests of reading achievement.

Effects of knowledge of early reading on students’ improvement in reading

Our initial analyses indicated that improvements in reading for first through third graders did not differ for teachers who performed at high, medium, and low levels on the LRC test composite. Then, when we controlled for student and teacher socio-demographic variables and teachers’ educational and professional attainments, we found no statistically significant effects of teachers’ knowledge on first and second graders’ improvement in Word Analysis and Reading Comprehension or third graders’ improvement in Word Analysis; only third graders’ improvement on Reading Comprehension was marginally significant. There is no particular reason to expect that teachers’ knowledge on LRC should be more influential in third grade than in first or second grade, especially since the LRC most directly measures the linguistic foundation for reading that children generally acquire in the earliest grades of elementary school.

In the statistical models, teacher variables explained a small amount of the variance among teachers (5–22%). In contrast, these models explained a sizable portion of the variance among students. Thus, certain student socio-demographic variables appeared to be highly related to improvement in reading. For example, students with disabilities and students who qualified for free or reduced lunch made less improvement than their peers for all three grades and on both reading subtests. Despite the fact that these student-level findings are not unlike those reported by other researchers (e.g., Borman et al., 2002), they alone do not explain why teachers’ performance on the LRC measure did not account for classroom level differences in students’ reading achievement.

One possibility is that these findings stem from the context of our study in Reading First schools, in particular from the difficulty of improving reading achievement for students in schools with chronic patterns of underachievement. However, the results show that students were learning to read, and their initially low scores meant that there was ample room for improvement. Therefore, we looked elsewhere for explanations of the lack of an effect for teachers’ reading knowledge. Explanations might come from two major sources: the measure of knowledge about reading and the loose alignment of the curriculum and the student reading assessment.

LRC as a test of knowledge about early reading

One possible reason for the findings of the study is shortcomings of the LRC measure. The psychometric characteristics were not a contributing factor. The test was a sound measure, one that can reliably distinguish the vast majority of teachers across the ability spectrum. Furthermore, the items loaded on a single factor, and thus the results are not subject to threats of multidimensionality. Finally, the test was designed to capture the specific content emphasized by the LETRS program and therefore appears to validly represent the professional development that teachers should have received. On the other hand, the content of the measure might not capture the knowledge that teachers use to teach reading to their students; this might explain why performance on the measure was not related to students’ gains in reading over the year.

The content of LRC is a source of concern. One reason is the type of knowledge that the items were designed to assess. LRC items largely focused on knowledge that stands apart from, rather than being situated in the context of, early reading instruction, that is, content knowledge. Like the LETRS program from which it was constructed, LRC covered the major components of reading instruction in the early elementary years, but it still placed a heavy emphasis on the linguistic foundation for understanding reading. Participants were asked, for example, to identify prefixes in words, give an example of an expository text, or select a method of instruction that would help students recall details of narrative texts. They were not asked to use this knowledge in considering the decisions and activities that arise in the act of teaching reading. While a number of researchers have accepted Moats’ view (1994) that knowledge of phonology and orthography is especially critical for teachers of early reading, support for this proposal has come primarily from significant associations among performance on measures of teachers’ linguistic knowledge, teachers’ practice, and students’ reading achievement. An issue that still needs to be addressed is whether the content of measures such as LRC adequately samples the knowledge about reading that teachers use when teaching reading.

If we believe that teachers’ instructional practices are most likely to be associated with students’ progress in learning to read, then we need to recognize that the knowledge teachers draw on to teach reading is likely to come from different professional opportunities and experiences. Thus, there was the possibility that other indices of teachers’ professional preparation to teach reading would be related to students’ reading gains, even if performance on LRC was not. The results of other studies (e.g., Darling-Hammond & Youngs, 2002) led us to expect to find associations between teachers’ knowledge and proxies for other sources of their learning—specifically, earning a master’s degree, holding permanent certification, and attending professional trainings. However, associations of the teachers’ knowledge on LRC and professional attainment measures were not statistically significant in our study. This lack of association is consistent with findings from other studies of early reading (e.g., Cunningham, Perry, Stanovich, & Stanovich, 2004). One possible explanation for these findings is that teachers’ knowledge about teaching reading consists of an amalgamation of principles, procedures, and practices acquired from different sources and not clearly associated with one source (e.g., the LETRS program). Another explanation is that master’s degree programs vary widely in the content and formats for learning about reading, so that attainment of this degree does not signal acquisition of a particular kind or depth of knowledge about early reading.

Content coverage of LRC is a related issue, since other important forms of knowledge used in teaching elementary reading might not have been measured by LRC. Furthermore, how new knowledge (e.g., attained from LETRS) is integrated into teachers’ practice is also a critical factor in producing improved outcomes. In their studies of teachers’ content knowledge, Shulman and his colleagues theorized that “pedagogical content knowledge emerges and grows as teachers transform their content knowledge for the purposes of teaching” (Wilson, Shulman, & Richert, 1987, p. 118). Such integration of knowledge, in theory, might be influenced by any of the professional sources of learning described above, as well as by the work of teaching reading itself.

Other researchers have speculated on the contribution of additional domains of reading knowledge. In particular, several studies have included measures of teachers’ knowledge about children’s literature, but this domain of knowledge has generally not been found to contribute significantly to the effectiveness of reading instruction (e.g., Bos et al., 2001; Cunningham et al., 2004). More promising are current research projects that are exploring tests of pedagogical content knowledge in reading. One example comes from a preliminary report by Hapgood, Palincsar, Kucan, Gelpi-Lomangino, and Khasnabis (2005), who devised a measure called Comprehension and Learning from Text Survey to investigate teachers’ pedagogical content knowledge as it relates to teachers’ understanding and use of informational text as they work with students. An important unanswered question is whether teachers who have a solid foundational knowledge about reading differ in how they use this knowledge in planning and carrying out reading lessons.

Alignment of curriculum and assessment

One other factor that might have contributed to the lack of a significant association of teachers’ knowledge and students’ improvement in reading is the loose relation between the reading curriculum and the assessment of students’ learning from exposure to that curriculum. In Reading First schools in Michigan, the curriculum is governed by state grade-level benchmarks and the particular comprehensive reading program adopted by the district. ITBS reading subtests are not specifically aligned with the content or methods of instruction embedded in the comprehensive programs. As Shavelson, Webb, and Burstein (1986) showed some years ago, to yield valid results, assessment of students’ learning should be tightly linked to the content and learning activities to which they are exposed. Since ITBS reading subtests were not linked to the curriculum, it is possible that teachers’ knowledge contributed to gains in student reading performance that were poorly represented by the ITBS subtest scores.

Although there are no tests that specifically reflect reading instruction in Reading First schools and districts in Michigan, we had available data from students’ performance on subtests of a classroom-based measure of reading called Dynamic Indicators of Basic Early Literacy Skills (DIBELS) (http://dibels.uoregon.edu). Teachers use information from performance on such subtests as Nonsense Word Fluency (assesses fluency of decoding) and Oral Reading Fluency (assesses accuracy and rate of passage reading) to make instructional decisions—and for this reason, performance on DIBELS might be more closely aligned to the instruction students receive in their classrooms than ITBS. In analyses of the DIBELS data, we used gains across the year as our outcome measure; the same student and teacher covariates were used in the statistical models. Our results were very similar to those we have reported for the ITBS subtests: teachers’ knowledge did not significantly contribute to students’ gains in reading.

These analyses using DIBELS as an outcome suggest that just changing the student outcome measure does not clearly indicate the reason for the lack of significant relation between teachers’ knowledge and students’ gains in reading. DIBELS might be more closely related to teachers’ reading instruction than ITBS, but there is no reason to think that DIBELS is more closely related to the content of LRC. Missing in our study is a measure of teachers’ instructional practices. This is unfortunate because one might expect that differences in classroom instruction would mediate the relation of teachers’ knowledge and students’ gains. Certainly, inclusion of a measure of teaching practice would facilitate study of the importance of alignment in the validation of measures of teachers’ knowledge.

Further directions: a summary

In discussing the results of the study, we have suggested two major directions for further research. First, further investigations of the construct of teachers’ knowledge about reading need to be carried out. We are convinced that linguistic knowledge and an understanding of how reading develops are critical to teachers’ understanding of the job of teaching children to read. As Snow et al. (2005) indicated, such knowledge probably gives teachers a way to talk with one another and a way to understand and evaluate teaching methods and the problems students have learning to read. However, a measure of content knowledge cannot assess whether teachers are able to make effective use of that knowledge in their teaching practice. One way to address this shortcoming would be to design measures of teachers’ knowledge about early reading that focus on knowledge about reading as it is used in effective instruction. Another possibility is building a structural equation model to determine how teachers’ knowledge contributes to instructional practices, and how both of these contribute to students’ gains in reading.

The second area is alignment of the content of LRC, the reading curriculum, and measures of students’ learning. Lack of alignment of these factors might well have contributed to the lack of significant findings in our study. Further, we also see a critical role for studies that look at the alignment of teachers’ knowledge and their instructional practices. A first step in addressing this problem might be efforts to craft more specific measures of knowledge of content and practice in a particular area of early reading (e.g., first-grade decoding); researchers might then be able to determine whether performance on such a measure, along with data collected to document the first-grade teachers’ teaching of decoding, was related to their students’ gains in decoding over a year.

Results from studies designed to explore these two areas might help us understand how to measure the knowledge of effective teachers of early reading. With advances in understanding the measurement of teachers’ knowledge about reading, we will be in a better position to design projects that aim to improve programs of teacher preparation in reading and professional development for elementary teachers, particularly in high poverty, low achieving schools.