Introduction

Arguments for educating students about the “nature of science” (NOS) have appeared at least as far back as 1854 (Matthews 2012). Many benefits have been put forward for accurately teaching and understanding what science is, how science works, the epistemological and ontological foundations of science, and how society impacts and reacts to science (Matthews 1994; Driver et al. 1996; McComas et al. 1998; Zeidler et al. 2002; Sadler 2004; Rudolph 2007; Mitchell 2009). These benefits include, but are not limited to: improved teaching of science, increased student interest and engagement in science and science learning, deeper understanding of particular science ideas, better understanding of socioscientific issues, and perhaps more appropriate decision-making regarding policy matters in a complex world.

Much is understood about effective NOS teaching and learning, but while the phrase nature of science is widely recognized by science teachers, accurate and effective NOS instruction is still not widespread (Lakin and Wellington 1994; Lederman 2007). Teachers’ NOS implementation practices are subject to many complex and interrelated factors (Abd-El-Khalick et al. 1998; Brickhouse 1990; Clough 2006; Duschl and Wright 1989; Herman, Clough and Olson 2013; Hodson 1993; Lakin and Wellington 1994; Lederman 1992; Lederman and Zeidler 1987). For instance, teachers’ NOS implementation may be influenced by constraints (e.g., classroom management, pressure from cooperating teachers) (Abd-El-Khalick et al. 1998; Bell et al. 2000); teachers’ intentions, goals, and perceptions of students (Lederman 1999); teachers’ views of the NOS, pedagogy, and perceived teaching outcomes (Abd-El-Khalick et al. 1998; Bell et al. 2000; Lakin and Wellington 1994; Schwartz and Lederman 2002); teachers’ subject matter and NOS understanding; and NOS PCK (Schwartz and Lederman 2002). Additional factors that appear to impact NOS instructional decisions include: teachers’ perceptions regarding the value of NOS for teaching, learning and socioscientific decision-making (i.e., NOS utility value); sense of personal responsibility to teach the NOS; views about how people learn; self-reflection abilities; participation in informal support networks with those who share similar views about teaching and learning; and strategies for coping with teaching constraints (Herman et al. 2011).

Furthermore, Clough (1997, 2006, 2011) and Khishfe and Abd-El-Khalick (2002) have argued that effective NOS instruction ought to be tightly and seamlessly connected to everyday science instruction (e.g., inquiry activities, instructional science media, interactive presentations of science content) and assessment. While premeditated NOS instruction is undeniably important, Herman et al. (2013) note that opportunities to accurately and effectively address the NOS often arise unexpectedly and more frequently in the midst of effective teaching practices more broadly. Clough (2006) argued that science educators should not be surprised if effective NOS teaching is linked to effective general pedagogical practices because what is characterized as effective NOS pedagogy is largely congruent with effective science teaching more generally. He writes:

Teachers’ ideas regarding the purposes of schooling, science education goals, how students learn, effective teaching, classroom management, as well as real and perceived institutional constraints affect what is taught and how it is conveyed. Planning and implementing effective lessons are complex acts, and this applies equally to traditional science content as well as accurately conveying the NOS (p. 465).

Determining what factors influence NOS teaching practice is difficult because confounding variables may be at play. For instance, Abd-El-Khalick and Lederman (2000) argued that NOS understanding is necessary but insufficient, and thus should be assessed when determining whether other factors (e.g., NOS utility value or institutional constraints) impact NOS implementation. In other words, a teacher may not overtly teach the NOS claiming he/she does not have time or does not value it as highly as science content objectives, but perhaps he/she does not understand NOS well enough to teach it. Thus, NOS understanding is commonly assessed when examining other factors that may influence NOS implementation (e.g., Schwartz and Lederman 2002). This study examines the role of another variable—general reform-based science teaching practices (GRBSTPs)—that may be confounding NOS research results. That is, the extent that teachers implement GRBSTPs (e.g., implementing inquiry laboratories and other activities that require student decision-making, asking thought-provoking extended-answer questions, using students’ ideas and scaffolding them to desired understandings) to create conceptually demanding and nurturing learning environments (Clough et al. 2009) may impact the number and quality of opportunities available for effectively addressing the NOS. The NOS research community has made many claims about factors that influence NOS teaching practice (e.g., Bell et al. 2000; Schwartz and Lederman 2002; Abd-El-Khalick et al. 1998) without empirically assessing the impact of teachers’ GRBSTPs, which may overshadow other variables and be another “necessary but insufficient” condition for NOS instruction. While particular GRBSTPs may not guarantee that the NOS will be effectively and consistently taught, they may act as a “gatekeeper” that impacts the level and quality of teachers’ NOS instruction. Little empirical research exists regarding this plausible link. 
Schwartz and Lederman (2002) allude to pedagogical practices in their case study of two teachers’ NOS practices, but did not assess those practices separately from NOS pedagogy. Understanding how GRBSTPs may relate to NOS teaching practices may have important implications for the design of science teacher education programs and professional development experiences that seek to promote effective and consistent NOS instruction.

Research Question

This study is designed to address the following question: What link, if any, exists between secondary science teachers’ NOS teaching practices and their GRBSTPs?

Methodology

Study Participants and Research Context

The study presented here follows from prior research (Herman et al. 2013) that investigated thirteen secondary science teachers’ NOS instructional practices 2–5 years after having completed an extensive and demanding science teacher education program. Twelve of those thirteen teachers overtly taught the NOS, and nine of the thirteen did so at moderate to high levels. Table 1 provides information regarding the thirteen secondary science teachers, four females and nine males, who participated in this study. Participants were in their second to fourteenth year of professional teaching in schools, and none were currently teaching in schools that expected or encouraged attention to the NOS in science classes.

Table 1 Study participant information

Our participants were purposefully selected based on the criterion that they completed the same science teacher education program (Patton 2002). From spring/summer 2005 through spring/summer 2008, sixty-two individuals completed the same science teacher education program (Table 2) at a large Midwestern university in the USA. Twenty-one of these graduates either never taught or taught out of state, and one could not be located. Of the remaining forty program graduates, thirteen agreed to participate in our study. We have no evidence that these thirteen participants stand out from the larger pool of program graduates.

Table 2 Science teacher education program structure, sequence, credits, and contact hours

Twelve participants completed the licensure program: ten as part of the post-baccalaureate MAT degree and two as undergraduates (Table 2). The thirteenth participant was in his 14th year of teaching, but only his second year since completing an M.S. degree that also included the Nature of Science and Science Education, Advanced Pedagogy in Science Education, and Restructuring Science Activities courses.

The science teacher education program that participants completed is designed to prepare science teachers who understand and employ reform-based practices (NRC 1996; AAAS 1990, 1993) based on the best available educational research implemented in a holistic and synergistic manner (Clough et al. 2009). Preparing teachers to effectively teach the NOS in a manner that is congruent with contemporary science education research (Clough 2006, 2011; Abd-El-Khalick and Lederman 2000; Khishfe and Abd-El-Khalick 2002; Lederman 1992, 1999) is just one important goal of this program.

The NOS course in this program, taught by the second author, rejected NOS tenets in favor of exploring the NOS through questions (Clough 2007) that convey the complexity and contextual nature of NOS ideas. For example, rather than teaching a tenet that science is “empirically based (based on and/or derived from observations of the natural world)” (Lederman 1999), the instructor raised and addressed questions such as “To what extent is scientific knowledge empirically based (based on and/or derived from observations of the natural world)?” and “In what ways is it not always based on and/or derived from observations of the natural world?” The instructor, drawing from NOS research and his prior secondary science classroom experience teaching the NOS, promoted and modeled effective NOS pedagogical practices (Clough 1997, 2006, 2011). The role that accurate and effective NOS instruction plays in science literacy and effective science teaching was consistently emphasized. The instructor stressed that accurate and effective NOS instruction should be consistently implemented, most often in the context of science content instruction. Reflecting this, for the course final examination, students restructured a preexisting science unit so that the NOS was accurately and effectively conveyed in the context of the science content being taught.

Data Collection

This investigation utilized a mixed-methods approach entailing naturalistic inquiry and assigning quantitative ratings to qualitative descriptions of each participant’s teaching practices (Patton 2002). Classroom observation data and instructional artifacts (e.g., lesson plans, worksheets, activities, assigned readings, quizzes and examinations) were collected throughout the fall 2009 semester to determine the teachers’ GRBSTPs and NOS implementation practices. All participating teachers with the exception of one were observed a minimum of three times. The remaining teacher was observed twice. All but two of the teachers taught more than one science discipline, and/or multiple levels of difficulty within a science discipline (e.g., chemistry, biology, and A.P. biology). With only a few exceptions, each teacher was consistently observed teaching the same science course. The particular course was selected in consultation with the teacher.

During observations, extensive field notes were taken regarding the participants’ practice and related classroom artifacts and materials (e.g., laboratory supplies and equipment). Furthermore, instructional artifacts used in observed courses were collected biweekly from each participant. The number of artifacts submitted varied greatly because some participants had a few long-term assignments (e.g., term papers and multi-day inquiry activities) that deeply addressed several science ideas, while others had many artifacts requiring little time (e.g., “bell ringers” and worksheets). Because of this difference, rather than focusing on the number of artifacts collected, artifacts were analyzed to determine consistency and depth of instruction about NOS and science content ideas over the course of the study and to provide triangulation with observations.

Instrumentation and Data Analysis

The teachers’ NOS teaching practices were rated using the Nature of Science Classroom Observation Protocol (NOS-COP), which took into account each teacher’s observed NOS teaching practices and NOS-related teaching artifacts (Herman et al. 2013). NOS-COP categories A–C gauge the extent to which inquiry, historical and contemporary examples of science, or other implicit opportunities for addressing the NOS are present in lessons and artifacts. NOS-COP categories D–I represent the extent to which an observation or artifact reflects effective NOS instruction (e.g., accurate and explicit referral to the NOS, requiring students to reflect upon the NOS, and teaching the NOS across a variety of contexts). For the purposes of this study (linking GRBSTPs to implementation levels of effective and explicit NOS instruction), only the participants’ scores on NOS-COP categories D–I are reported here.

NOS-COP categories D–I are measured through scores ranging from 1 to 5. A score of 1 on a NOS-COP category means that lessons or artifacts were not reflective of NOS instruction outlined in contemporary science education literature, whereas a score of 5 means that they were extremely reflective of NOS instruction outlined in contemporary science education literature. A rating of not applicable (N/A) appears in the event that an observed lesson or set of artifacts did not possess substantial evidence to provide a rating for a particular sub-item. NOS-COP categories D through I measure the extent that the NOS was implemented by the teacher and were averaged for each teaching observation. These individual lesson scores were then averaged to develop a mean NOS observation implementation score for each teacher across all observed lessons. A mean NOS artifact implementation score was also calculated by averaging NOS-COP categories D through I for each participant’s NOS-related artifacts as a whole.
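The averaging procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used in the study: the function names, the data layout, and the sample ratings are hypothetical, and dropping N/A ratings before averaging is an assumption, since the text does not state how N/A entries enter the mean.

```python
from statistics import mean

# NOS-COP categories reported in this study (each rated 1-5, or N/A).
CATEGORIES = ("D", "E", "F", "G", "H", "I")


def lesson_score(ratings):
    """Average the D-I ratings for one observed lesson.

    `ratings` maps a category letter to an int rating, or None for N/A.
    Assumption: N/A entries are dropped before averaging.
    """
    scored = [ratings[c] for c in CATEGORIES if ratings.get(c) is not None]
    if not scored:
        return None  # nothing ratable in this lesson
    return mean(scored)


def teacher_observation_score(lessons):
    """Mean NOS observation implementation score across a teacher's lessons."""
    scores = [s for s in (lesson_score(r) for r in lessons) if s is not None]
    return mean(scores) if scores else None


# Hypothetical ratings for three observed lessons (not data from the study).
example_lessons = [
    {"D": 4, "E": 4, "F": 3, "G": 5, "H": 4, "I": 4},
    {"D": 3, "E": None, "F": 3, "G": 4, "H": 3, "I": 2},
    {"D": 5, "E": 4, "F": 4, "G": 4, "H": 5, "I": 2},
]
print(teacher_observation_score(example_lessons))
```

The same `lesson_score` function could be applied once to a teacher's artifact ratings as a whole to obtain the artifact implementation score, again under the assumptions stated above.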

The Local Systemic Change (LSC) Classroom Observation Protocol (LSC-COP) was used to assess each teacher’s observed GRBSTPs and teaching artifacts over the course of this study (Horizon Research Institute (HRI) 2006; Krathwohl 1998). This instrument is extensively used in science education research and is a rating tool for assessing the extent to which science lessons are congruent with standards for reform-based teaching (NRC 1996). For instance, the LSC-COP measures the extent to which teachers’ lessons take into account students’ prior knowledge; encourage active investigation, participation, and collaboration; address developmentally appropriate and meaningful content; and have a climate conducive to learning. A synthesis rating is given for each of the four LSC-COP dimensions (lesson design, implementation, science content, and classroom culture). Synthesis ratings are then considered when creating a capsule rating for the teacher’s practice. Capsule ratings are overarching judgments of the teacher’s practice as a whole, not simply an average of the LSC-COP dimensions. That is, capsule ratings take into account all available information about a lesson, framed by the individual LSC-COP dimension ratings, to illustrate the quality and impact of a lesson in encouraging students to deeply learn science. Table 3 summarizes the key features of capsule ratings.

Table 3 LSC-COP capsule ratings and scores used in this study

Capsule ratings range from ineffective instruction to exemplary instruction congruent with reform documents and science education research (NRC 1996; AAAS 1990, 1993; Clough et al. 2009). For a detailed description of the LSC-COP, see http://www.horizon-research.com/instruments/. While the LSC-COP is typically used to evaluate observed lessons, we also used the LSC-COP to assess the extent to which each teacher’s collection of lesson artifacts used over the course of this study was congruent with reform-based practices.

The first author conducted simultaneous NOS-COP and LSC-COP evaluations immediately after observing each participant’s lessons and submission of instructional artifacts. Observations were conducted in person, and extensive field notes were taken that described the lessons and learning environment. Several steps were taken to ensure the reliability, validity, and transparency of the first author’s evaluations. First, this author was trained to 95% proficiency with the LSC-COP by the third author, who was taught to use this instrument by its developers. Second, illustrative exemplars congruent with the general descriptors on the NOS-COP and LSC-COP coding scheme were derived early in the study through the first author repeatedly analyzing, coding, and cross-comparing data sources from participating teachers. This process was repeated until the NOS-COP and LSC-COP category scores and general descriptors were accurately matched with a transparent exemplar. Once exemplars were selected, all three authors independently reviewed and cross-compared the exemplars, and then came to full agreement that the exemplars characterized the NOS-COP and LSC-COP scores and general descriptors of NOS and GRBSTP (see “Appendix 2” and the full NOS-COP instrument in Herman et al. 2013). After the exemplars were determined, each teacher’s NOS-COP and LSC-COP ratings were verified through re-analyzing classroom observation notes and artifacts using the NOS-COP and LSC-COP and their corresponding exemplars. Third, the second and third authors periodically verified the validity of the first author’s NOS-COP and LSC-COP lesson and instructional artifact ratings through coding random selections of artifacts and field notes. These periodic “spot checks” ensured that all three authors were in full agreement about how each participant’s teaching was assessed.

Because the LSC-COP and NOS-COP both assess matters of instruction, a concern may arise that they are measuring significantly overlapping constructs. That is, one might expect that a lesson earning a high LSC-COP capsule rating would always earn a high NOS-COP rating and vice versa. However, that is not the case. Any observed lesson and lesson artifacts might earn a high LSC-COP capsule rating, but not accurately or effectively teach the NOS, thus earning a low NOS-COP rating. Alternatively, any observed lesson and lesson artifacts might earn a low or medium LSC-COP capsule rating, but earn a moderate or high NOS-COP rating, respectively. For instance, accurate NOS ideas could be presented solely through a lecture that had little, if any, student involvement. Such a lesson would earn a very low LSC-COP capsule rating. However, if that lecture overtly put forth accurate NOS ideas, perhaps even in a variety of contexts (e.g., a black-box example, a laboratory demonstration, and a historical incident in science), the lesson would, based on the NOS-COP (Herman et al. 2013), receive at best a moderate (e.g., “3”) rating.

After each teacher’s LSC-COP scores were determined, we grouped teachers based on NOS implementation levels. We then cross-compared the NOS implementation levels to determine patterns of GRBSTPs. This was completed by organizing and plotting the participants’ LSC-COP scores in comparison with increasing NOS implementation levels.

Determination of NOS Understanding

Prior research has established that NOS understanding impacts NOS implementation practices (Lederman 2007). Therefore, we determined each teacher’s NOS understanding so that the findings of this study are not confounded by this important variable. Each participating teacher’s overall NOS understanding was determined using items from the Student Understanding of Science and Scientific Inquiry (SUSSI) questionnaire, modified to include four additional NOS constructs, and procedures outlined in Liang et al. (2008). Specifically, each participant was rated as “informed,” “transitional,” or “naïve” based on the percentage of the ten SUSSI NOS constructs for which they provided both Likert and qualitative responses that demonstrated congruence with the consensus views about the NOS described in contemporary NOS literature (e.g., Smith et al. 1997; McComas 1998; Eflin et al. 1999; Abd-El-Khalick 2012). An “informed” rating was awarded to a teacher when at least seventy percent of the SUSSI constructs were responded to with Likert and qualitative responses that were fully congruent with the consensus views about the NOS. Alternatively, a “naïve” rating was awarded to a teacher when at least seventy percent of the SUSSI constructs were responded to with Likert and qualitative responses that were fully incongruent with the consensus views about the NOS. A rating of “transitional” was awarded to teachers with a percentage of responses falling outside of the parameters required for “informed” and “naïve” ratings (e.g., providing combinations of naïve and informed Likert and qualitative responses for forty percent or more of the SUSSI NOS constructs). Herman et al. (2011) provide a more extensive explanation regarding the determination of participants’ NOS understanding.
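The rating rule described above can be expressed as a short Python sketch. It is offered only as an illustration under stated assumptions: the function name and the per-construct codes ("informed", "naive", "mixed") are hypothetical labels for responses that are fully congruent, fully incongruent, or neither, and the sketch does not model how a participant with too few responses is deemed unclassifiable.

```python
from collections import Counter


def sussi_rating(construct_codes, threshold=0.70):
    """Classify overall NOS understanding from per-construct codes.

    `construct_codes` holds one code per SUSSI construct:
      "informed" - Likert and qualitative responses fully congruent
      "naive"    - Likert and qualitative responses fully incongruent
      "mixed"    - anything else (hypothetical label, an assumption)
    """
    total = len(construct_codes)
    if total == 0:
        raise ValueError("at least one coded construct is required")
    counts = Counter(construct_codes)
    if counts["informed"] / total >= threshold:
        return "informed"
    if counts["naive"] / total >= threshold:
        return "naive"
    return "transitional"


# Hypothetical coding of ten constructs (not data from the study).
codes = ["informed"] * 6 + ["mixed"] * 3 + ["naive"]
print(sussi_rating(codes))
```

With six of ten constructs coded as informed, neither seventy-percent threshold is met, so this hypothetical participant would be rated transitional.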

Six of the thirteen participant teachers’ summative views of the NOS were informed, and six were transitional. One teacher’s NOS understanding was unclassifiable because he completed an insufficient number of SUSSI responses (Table 1). Having no evidence that any of the participants possessed naïve NOS understanding permits us to examine the link between NOS teaching practices and GRBSTPs with reduced concerns that lack of NOS understanding is confounding our study’s findings.

Study Limitations

This study seeks a rich understanding of teachers’ practices and conceptual understanding. Consistent with other studies that are based in this qualitative tradition, generalizability is limited. Teachers who participated in this study completed a teacher preparation program that has features uncommon to most programs, and what occurs in the program has an impact on teachers’ practices (Herman et al. 2013). Thus, findings of this study should be considered a first step toward understanding how science teachers more broadly implement GRBSTPs and NOS instruction.

Findings

Based on classroom observations and artifacts (e.g., lesson plans, handouts, tests), each participant’s NOS implementation was classified into one of three levels (1 to <2.3 = low; 2.3 to <3.6 = medium; and ≥3.6 = high) using the Nature of Science Classroom Observation Protocol (NOS-COP) appearing in “Appendix 1”. Four of the participants implemented the NOS at a high level, five did so at a medium level and three did so at a low level. The remaining study participant did implicitly address the NOS in a few artifacts and lecture statements (e.g., magazine articles and lecture statements that tacitly alluded to the relationship between science and society), but he did not accurately or effectively teach the NOS in any noteworthy manner (Table 1). Teachers with informed and transitional NOS understanding were found at all three NOS implementation levels.
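The three implementation levels amount to a simple threshold rule on the mean NOS-COP score. A minimal Python sketch follows; the function name is hypothetical, and the sample scores are illustrative values rather than participants' data.

```python
def nos_implementation_level(score):
    """Map a mean NOS-COP score (categories D-I, scale 1-5) to a level.

    Thresholds follow the text: 1 to <2.3 low, 2.3 to <3.6 medium, >=3.6 high.
    """
    if not 1 <= score <= 5:
        raise ValueError("mean NOS-COP scores range from 1 to 5")
    if score < 2.3:
        return "low"
    if score < 3.6:
        return "medium"
    return "high"


# Illustrative scores only (not data from the study).
for s in (1.8, 2.3, 3.5, 3.6):
    print(s, nos_implementation_level(s))
```

Note that the boundaries are half-open, so a score of exactly 2.3 counts as medium and a score of exactly 3.6 counts as high.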

Participants’ NOS-COP ratings (NOS instruction implementation level) and LSC-COP capsule ratings (level of GRBSTPs) for observed lessons and artifacts are presented in Fig. 1 and Table 4, respectively. All twelve observed lessons that scored as high NOS implementation (NOS-COP rating ≥3.6) also scored at or above six on the LSC-COP. All but one of the thirteen observed lessons that scored as medium NOS implementation (NOS-COP rating 2.3 to <3.6) earned an LSC-COP rating of at least 5 (i.e., 3-high) with the remaining one medium NOS implementation lesson rated 3 (3-low) on the LSC-COP. Fourteen observed lessons demonstrated low NOS implementation (NOS-COP rating <2.3), and of these fourteen lessons, twelve scored no higher than 3 on the LSC-COP.

Fig. 1 Study participants’ NOS-COP and LSC-COP (capsule) lesson observation ratings

Table 4 Study participants’ NOS-COP and LSC-COP artifact scores

Observed lessons that earned a high LSC-COP capsule rating (≥6) did not always score high on the NOS-COP. For instance, several of John’s, Sharon’s, and Mark’s lessons earned a high LSC-COP capsule rating, but their NOS instruction was generally at the moderate level. Furthermore, both Isaac and Mary had a low NOS implementation lesson that scored moderately (5 and 4) on LSC-COP capsule rating. Thus, based on observed lessons’ LSC-COP and NOS-COP ratings, high-quality GRBSTPs appear to be associated with, but do not ensure, effective classroom NOS implementation.

Similarly, analysis of participants’ instructional artifacts presented in Table 4 shows that higher NOS implementers had the highest LSC-COP artifact scores, followed by medium implementers’ LSC-COP artifact scores, followed then by low implementers’ LSC-COP artifact scores.

Descriptions of high, medium, and low NOS implementers’ GRBSTPs appear below. “Appendix 2” provides additional examples of participants’ observed classroom teaching practices organized according to their LSC-COP capsule ratings. While not including all lessons observed in this study, “Appendix 2” conveys the general trends within LSC-COP and NOS implementation rating classes, provides transparency regarding LSC-COP rating decisions, and assists in visualizing what classroom instruction “looks like” at each LSC-COP rating level.

High NOS Implementers’ GRBSTPs

High NOS implementers’ lessons and artifacts show that their instruction was conducive for student learning. High NOS implementers’ instruction was clearly focused on students’ thinking and meaning-making. For example, Luke, Andrew, John, and Matthew were often observed teaching science through inquiry (e.g., beginning instruction with concrete experiences and asking questions that effectively scaffold students from their experiences and ideas to more accurate understanding). The concrete experiences and ubiquitous teacher questioning promoted a highly interactive and mentally engaging learning environment. This was evident when these teachers were focused on conveying science content and when they focused on NOS ideas. For instance, Luke began a mid-unit lesson on land formations by leading a class discussion about how constructing valid scientific knowledge, through gathering empirical evidence, is like building a brick wall. In this discussion, he addressed inductive and deductive reasoning, and built on this in an inquiry activity. Students viewed several images of valleys and were asked to speculate about how they were formed. Through discussion and scaffolded questions, Luke helped the students understand that the valleys were glacially carved. Luke then asked his students to speculate on why Paul Bunyan dragging his axe would not be considered a scientifically valid explanation for valley formation. While doing so, Luke drew students’ attention to a decontextualized NOS black-box activity the students had completed a few months earlier, and how they could not test for supernatural causes (e.g., gremlins) to explain how the black box worked. The students responded that Paul Bunyan is not a naturalistic explanation that science would use to explain glacially carved valleys. 
(In a post-observation interview, Luke indicated that he had not planned for the Paul Bunyan example, but saw the opportunity and realized he could address how supernatural explanations are outside of science.) Luke then led a discussion focused on the scientific explanation for glacially carved valleys and erosion, and finished the lesson with a preplanned writing activity about the differences between scientific laws and theories in the context of these geology concepts. Artifacts from Luke’s classroom illustrate that the students were required to apply the NOS and science content ideas addressed in this lesson as the unit continued. For instance, later in the unit, the students completed a short story containing embedded NOS questions (e.g., about the nature of scientific investigations, evidence, and the lack of a scientific method) about Charles Lyell’s investigation of Niagara Falls erosion. The students also completed a reading assignment about the evolution of plate tectonics theory containing highly contextualized NOS questions related to the tentativeness of scientific knowledge, the role of evidence in developing scientific knowledge, and the nature of scientific theories.

Twelve of the thirteen lessons conducted by high NOS implementers resembled this lesson by Luke and were rated at or above “6” for LSC-COP capsule ratings (Fig. 1; see “Appendix 2” for LSC-COP descriptions). Furthermore, ten of these lessons were rated ≥3.6 on the NOS-COP for implementing highly effective NOS instruction (e.g., high degrees of NOS accuracy, consistent explicit/reflective referrals to the NOS, and teaching the NOS across varied contexts).

High NOS implementers’ instructional artifacts, much like their lessons, exhibited high degrees of reform-based practices while also affording great attention to the NOS. All high NOS implementers’ artifacts were rated at or above a “6” for LSC-COP ratings, and their NOS-related artifacts were rated at or above 3.7 for NOS-COP ratings. Notably, all high NOS implementers’ artifacts consistently reflected effective reform-based practices while seamlessly integrating inquiry, science content, and the NOS. Specifically, high implementers’ instructional artifacts were developmentally appropriate, drew from students’ prior knowledge, and provided students with foundational concrete experiences for learning about abstract science and NOS ideas. Furthermore, high NOS implementers’ artifacts included science readings and inquiry activities coupled with scaffolding questions and discussion prompts that required students to reflect upon targeted science and NOS ideas. For instance, one of Andrew’s artifacts had students develop and conduct an investigation where they would apply their newly developed conceptions about density. Similarly, one of John’s artifacts had students read a historical account of Galileo’s falling body investigations and reflect upon several NOS ideas. Similar to other high implementers’ artifacts, these artifacts contained several questions (e.g., What is your research question? How will you test the idea? What do the data mean to you? How do you know this is the case?) that guided students to think critically about scientific processes and scientific content and presented opportunities for addressing relevant NOS ideas. Additionally, these artifacts contained several questions that caused students to directly reflect upon important NOS ideas (e.g., In what ways does the reading demonstrate the durability and tentative NOS? How does the reading distinguish between science as it is conveyed in the media and science as conveyed through historical accounts?), and how these NOS ideas relate to the students’ classroom experiences (e.g., In what sense did your investigations of accelerating objects accurately portray the account you read about Galileo’s investigations?).

While high NOS implementers’ observed lessons and instructional artifacts show that some of their NOS instruction was clearly preplanned, their GRBSTPs, as reflected in their LSC-COP ratings, oftentimes created unforeseen opportunities to effectively teach NOS ideas. That is, even when not planning for NOS instruction, these teachers’ GRBSTPs afforded opportunities for raising NOS ideas. For instance, “Appendix 2” provides an example lesson where Luke’s instruction about classification schemes resulted in an unforeseen discussion about the NOS. In this discussion, Luke confronted his students’ misconceptions, primarily through scaffolding questions, about the extent that science ideas are invented and/or discovered. In short, high NOS implementers employed effective reform-based science teaching practices in teaching science content and the NOS, but these reform-based practices also provided many unplanned and unforeseen NOS instructional opportunities. High NOS implementers often seized on these opportunities in the act of teaching.

Medium NOS Implementers’ GRBSTPs

Carey, Peter, Mark, Isaac, and Sharon comprise the group of medium NOS implementers. Figure 1 presents the LSC-COP scores for these teachers’ fourteen observed lessons. Of these fourteen lessons, only five (three by Sharon and two by Mark) earned LSC-COP scores equal to those of the high NOS implementers. Specifically, Sharon’s and Mark’s science instruction took into account students’ prior experiences and progressed from ideas and representations that students could concretely conceptualize to those that were more abstract. Furthermore, both participants asked questions that effectively helped students make connections between concepts. However, when Sharon and Mark implemented the NOS in lessons, they sometimes struggled to deeply draw their students’ attention to those NOS ideas that were readily apparent. Specifically, Sharon struggled to ask scaffolding questions that caused students to deeply contemplate and link NOS ideas in various contexts. Mark tended to place more emphasis on the nature of modeling in science instead of NOS ideas more relevant to the science ideas being taught.

The remaining nine lessons conducted by medium NOS implementers scored from 3 to 5 on the LSC-COP primarily because of problems with the lesson design and/or implementation. In many of the lessons rated as 5, the teachers employed an interactive presentation approach (presentation of information punctuated by questions and discussion), but made overly simplistic statements and asked superficial questions. Questions that would mentally engage students, help them make important connections, and assess students’ thinking were absent. Outside of Sharon’s and Mark’s lessons, medium NOS implementers’ lessons were noticeably less organized than those of high NOS implementers. This was evidenced by the presence of rough transitions that impeded effective instruction. For instance, in the two lessons conducted by Isaac and Carey rated as 3, design and implementation issues resulted in minor classroom management problems that impeded these teachers’ abilities to effectively teach. In his second observed lesson, for example, Isaac clearly struggled to conduct an interactive discussion pertaining to the rationale behind using percentages to standardize data in science. Consequently, the students became confused and off-task behavior ensued. Out of frustration, Isaac resorted to lecturing about percentages.

Medium NOS implementers’ instructional artifacts were similar to their observed lessons in regard to the extent to which they exhibited reform-based practices. With the exception of Sharon, whose artifacts resembled those of the high NOS implementers and received an LSC-COP rating of 6, medium NOS implementers’ teaching artifacts received LSC-COP capsule ratings of 4–5 (Table 4). Several of these teachers’ artifacts appeared to moderately impede their students’ abilities to conduct their own inquiries and apply concepts learned to other areas of science, other disciplines, or real-life situations. For instance, Carey’s artifacts included inquiry-based activities (e.g., canning activity described in “Appendix 2”), but also “cookbook” activities, multiple choice worksheets and tests, and textbook readings.

Low NOS Implementers’ GRBSTPs

Low NOS implementers’ LSC-COP capsule ratings were much lower than those of high NOS implementers and at best matched the lower range of medium NOS implementers. Ten of the twelve lessons observed among low NOS implementers received LSC-COP capsule ratings ranging from 1 to 3. Common to these lessons were ineffective science teaching practices such as presenting overwhelming amounts of factual and often developmentally inappropriate information through lectures. In addition, lessons largely consisted of cookbook activities, textbook assignments, and/or worksheets. For instance, in a lesson receiving an LSC-COP capsule rating of 2, Thomas implemented a cookbook laboratory where students used Punnett squares to predict phenotypes from a monohybrid cross of two heterozygous parents. Students were then required to simulate this cross by repetitively drawing, with replacement, from two bags containing 20 red and 20 white beans. Like many cookbook laboratories, this activity had students complete multiple recipe-like steps and prestructured data tables with little meaningful mental engagement. When preparing students for the activity, Thomas stifled the potential for inquiry by lecturing to students about how to complete the laboratory and obtain the desired results with little difficulty. Throughout the activity, Thomas spent much time at his desk, and his interaction with students mostly directed them to laboratory procedures so the activity would be completed by the end of the class period.

In two instances, low NOS implementers’ lessons demonstrated somewhat effective science teaching practices. For example, one of Mary’s lessons earned a 5 LSC-COP capsule rating because it initially required students to organize images of protists into groups according to morphological similarities and differences. At the conclusion of the lesson, students had come to consensus regarding the categories (animal-like and plant-like) to which the protists belonged. Students were asked questions that explicitly drew their attention to important conceptual ideas. Mary continued this lesson on the following day, but with ineffectual practices (e.g., lecturing).

Low implementers’ artifacts were similar to their observed lessons regarding the extent to which they exhibited reform-based science teaching practices (LSC-COP capsule ratings of 1–3). These participants’ instructional artifacts, as a whole, would severely impede students’ abilities to conduct their own inquiries, express their thinking, and apply and generalize concepts learned to other areas. Low NOS implementers’ instructional artifacts almost exclusively consisted of cookbook activities and assignments and assessments that required little thinking and conceptual understanding (e.g., fill in the blank worksheets, multiple choice exams).

Discussion

In our review of prior literature, we noted that teachers’ NOS implementation practices may be influenced by a variety of factors including: instructional constraints; intentions, goals, and perceptions of students; views of the NOS, pedagogy, and perceived teaching outcomes; understanding of science content, the NOS, and NOS PCK; perceptions regarding the value of NOS for teaching, learning and socioscientific decision-making (i.e., NOS utility value); sense of personal responsibility to teach the NOS; views about how people learn; self-reflection abilities; participation in informal support networks with those who share similar views about teaching and learning; and strategies for coping with instructional constraints.

The study reported here adds to this extensive list another factor impacting teachers’ NOS implementation practices. Our thirteen participants’ level of NOS implementation (reflected in their NOS-COP rating) is associated with the extent to which they also implemented GRBSTPs (reflected in their LSC-COP rating). This empirical association appears to stem from two important aspects of GRBSTPs.

First, teachers who employ the GRBSTPs of inquiry and require extensive student decision-making have far more opportunities than other teachers for purposeful NOS instruction. Importantly, among our study’s high and medium NOS implementers, much of the NOS instruction that was observed was not planned for. Rather, high and medium NOS implementers often seized on NOS teaching opportunities when they unexpectedly arose in the context of inquiry and student decision-making. Planning for NOS instruction is important, but teaching science through inquiry and requiring student decision-making create both planned and unplanned opportunities for teaching the NOS. High NOS implementers in this study accomplished much unanticipated NOS instruction during spontaneous moments that occurred while enacting the above-identified GRBSTPs.

Second, GRBSTPs include the use of questions that effectively assist students in meaning-making, scaffolding them from their initial thinking to desired understandings. Teachers who have developed this cognitively demanding teaching practice are in a much better position to ask questions that effectively draw students’ attention to NOS ideas in a manner that has them meaningfully think about those ideas. Teachers who generally struggle with questioning will likely also struggle to effectively teach the NOS. This is because they will be far less proficient at uncovering students’ thinking regarding the NOS, less effective at asking questions that overtly draw students’ attention to NOS ideas in a manner that demands meaningful thinking and reflection, and less able to effectively scaffold students from their initial NOS ideas and reasoning to more thoughtful, defensible, and robust NOS understanding.

In summary, implementing inquiry laboratories and other activities that require student decision-making appear to be the GRBSTPs most important for creating opportunities for accurate NOS instruction. Asking thought-provoking extended-answer questions and playing off students’ ideas in ways that scaffold them to desired understandings appear to be the most important GRBSTPs for seizing on opportunities to effectively teach the NOS. Implementing inquiry experiences and other activities that require considerable student decision-making and teachers’ proficiency at asking highly effective questions together are important “tools” for NOS implementation efforts whether purposely planned for or arising unexpectedly in the act of teaching a lesson. These tools also make accurately and effectively teaching the NOS a far more natural part of everyday instruction.

Returning to an expression we used in the introduction, inquiry and other instructional activities that demand student decision-making along with teachers’ proficiency at asking effective questions appear to be GRBSTPs that function as a “gatekeeper,” substantially influencing the level and quality of NOS instruction. Other “gatekeepers” are also at play. For instance, while the GRBSTPs we have identified establish far more favorable conditions for accurate and effective NOS instruction, they do not ensure that such instruction will take place. Accurate and effective NOS instruction also demands that teachers accurately understand the NOS and value such understanding as a goal for science education (Lederman 2007). The GRBSTPs identified in this study appear to play a crucial, but insufficient, role in accurate and effective NOS instruction. This is exemplified in Sharon’s and Mark’s teaching. Both taught lessons receiving LSC-COP ratings that were equivalent to those of high NOS implementers. However, while their GRBSTPs appeared equivalent to those of high NOS implementers, their NOS implementation was at the medium level. Examining their NOS understanding, Sharon was transitional and Mark was unclassifiable. Perhaps their NOS implementation was lower because, despite possessing solid GRBSTPs, they lacked NOS content understanding. Thus, both GRBSTPs and NOS understanding appear necessary, yet are still insufficient for high NOS implementation. That is, robust NOS instruction may very well demand that teachers, at the very least, value NOS, accurately understand it, and be proficient at the GRBSTPs identified in this study. In this light, the science education community’s struggle to promote accurate and effective NOS instruction is understandable.

The findings of this initial foray into the association between GRBSTP and NOS implementation suggest the need for additional studies designed to further test this relationship. If our findings are corroborated, then future research regarding teachers’ NOS implementation practices should establish and report study participants’ general science teaching practices, just as is now the case with establishing participants’ NOS content understanding. If participants’ general science teaching practices are not empirically established, then efforts to accurately identify other factors that impact NOS implementation may be confounded. For instance, imagine a study that claims the value teachers place on NOS instruction impacts NOS implementation, but that neglects to account for the teachers’ general teaching practices. In such a hypothetical study, one could not tell whether deficiencies in NOS implementation were truly due to low regard for NOS instruction, or whether such deficiencies were due to broader issues with the teachers’ general pedagogy. Any study that claims an understanding of NOS pedagogy impacted NOS instruction, but does not account for teachers’ general teaching practices, faces the same problem. Without determining general pedagogical practices, claiming that other variables are responsible for NOS implementation practices is problematic. As NOS research more precisely targets factors that impact NOS teaching practice, care must be taken to ensure that confounding variables are made explicit and assessed. While claims about the impact of general pedagogy on NOS instruction are mentioned in NOS literature, NOS research that actually assesses general science teaching practices is rare. NOS implementation studies typically report on participants’ NOS understanding because teachers cannot accurately convey what they do not understand. Likewise, we cannot expect high levels of effective NOS instruction if teachers do not employ highly effective general science teaching practices.

Our findings have important implications for science teacher education efforts directed at promoting accurate, effective, and robust NOS implementation. First, such efforts demand significant attention not only to NOS teaching and learning, but also to the GRBSTPs empirically supported in our study. Efforts that primarily target one or the other are unlikely to promote the level of accurate and effective NOS teaching practices observed among our study’s high and medium NOS implementers. A second implication is to overtly draw teachers’ attention to the synergistic relationship between GRBSTPs and NOS instruction. The science teacher education program (see Table 2) our participants completed devoted considerable time and effort to promoting effective general science teaching practices (Clough et al. 2009), NOS content, and effective NOS pedagogy (Clough and Olson 2004). The sequence of three required science methods courses (four for graduate licensure students), the required Nature of Science and Science Education course, and the elective Restructuring Science Activities course, which most program graduates complete, is far more extensive and demanding than typical science teacher education programs. Whether such intensive efforts can be widely adopted for both preservice and inservice teachers remains a challenge for the science teacher education community.

Our study could be interpreted as an argument that science teacher education efforts to promote accurate and robust NOS instruction should perhaps be directed only at experienced inservice science teachers who have mastered key GRBSTPs. We strongly urge a different interpretation that better accounts for our findings and previous research literature. First, that twelve of our thirteen participants purposely and overtly taught about the NOS, and that nine of the thirteen did so at medium to high levels, is attributable to what they experienced in their secondary science teacher education program (Herman et al. 2013). Second, the NOS misconceptions that preservice teachers possess will only be exacerbated by delaying their NOS instruction and thus be more resistant to change. Third, attention to accurate and effective NOS instruction is contrary to “expectations held of science and science teaching in schools, not only by teachers and pupils but also those perceived as being held by parents and society” (Lakin and Wellington 1994, p. 186). Because constraints to NOS instruction permeate schools and society, waiting for teachers to develop highly effective GRBSTPs may also result in those same teachers developing attitudes significantly at odds with NOS instruction. Finally, while teaching experience certainly plays a role in mastering GRBSTPs and robust NOS instruction, it does not alone result in those practices. Our view is that preservice science teacher education programs are crucial for introducing and developing in preservice teachers GRBSTPs, NOS understanding, and fervent rationales for teaching and learning the NOS that set teachers on a trajectory that will, with support, lead to mastering GRBSTPs and high-level NOS implementation.