
Our work has a long-term focus on describing and explaining the development of students’ disciplinary learning histories. We seek to better understand how and why students enter and become fluent in forms of thinking and knowing that are particular to the scientific and mathematical disciplines. What resources for and barriers to these specialized forms of thinking do young children bring? How can educators best build upon those resources without creating breaks in students’ sense making? Specifically, for the past dozen or so years we have been conducting classroom-based research on the origins and development of model-based reasoning in mathematics and science (Lehrer & Schauble, 2005). Like many complex forms of disciplinary thinking, this one does not develop over the short term. Instead, it is probably best conceived as a life-long enterprise, one that the youngest students can enter in some form, but that remains both central and challenging even in the practice of professional scientists. A pressing question for us is how classroom episodes, lessons, months, and years of instruction cumulate in a repertoire of models and a propensity to engage in what Hestenes (1992) calls the “modeling game.”

Before readers dive into the chapters to consider this question for themselves, we begin by describing the larger context within which this research was conducted and next move on to explain what we were trying to achieve educationally – both over the long term (i.e., the entire collaborative enterprise) and the relatively short term of this particular study.

The Context

It goes without saying that the conduct of inquiry about learning in any classroom is negotiated within a larger institutional system that has its own trajectory of interaction and learning. Saxe and Esmonde (2005) have suggested that a comprehensive view of development entails consideration of change at microgenetic, ontogenetic, and sociogenetic levels. Microgenetic change was the focus of the investigation reported here – that is, we were investigating conceptual change and the means to support it in a circumscribed curricular landscape. Because we are concerned with education, we focus on microgenetic arcs that may extend for weeks or months, although of course, one could focus on change at the scale of minutes, or even seconds. The second form of change, ontogenesis, traces trajectories of individual development over a more prolonged period of time. For us, the relevant time scale is measured in years: We worked with teachers to establish threads of teaching around a few core concepts and practices in sciences and mathematics. Together, teams of researcher-teacher colleagues engaged in multiple cycles of design and revision. Teachers worked in cross-grade teams, collecting samples of student work and developing cases of student learning to inform the wider teacher community (see Lehrer & Schauble, 2002 for a sample of teacher work in the data modeling domain featured in this volume). Over time, the work in this teaching community was transformative – a kind of change in the culture of teaching that Saxe and Esmonde (2005) call sociogenetic. Hence, new work, such as the work undertaken in this study, was negotiated in light of both existing and robust teacher practices and also, existing ontogenetic trajectories for the participating students.

Although these forms of change were indeed operating (see Gamoran et al., 2003, for independent documentation), they did not always operate smoothly. As usual in extended school-based work of this kind, there were a number of disruptive influences, including broader macroeconomic trends that were impacting the district. During most of our time there (and continuing today), the district was one of the fastest growing in the state. As a result, some of the participating students were long-term residents of the district, while others were newcomers and therefore were being inducted into classroom practices that were initially unfamiliar to them. These factors were a source of both variability and unanticipated contingency at the classroom level. In addition, as the district grew, so, too, did the need for more teachers. New teachers were constantly coming into the teaching community, and the rapid expansion generated wide variability in classroom practices, despite the fact that teachers endorsed similar curricular goals and tasks. The teacher who participated in the current study was a relative newcomer to the group, although he had worked with us to conduct a design study the previous academic year.

The educational design was longitudinal and purposive in character. We wanted to both identify and build on young students’ resources for modeling in mathematics and science (Lehrer & Schauble, 2004, 2006). Moreover, we had a commitment to finding ways to help younger students and novices find easy access into these ideas, but also a corresponding commitment to continually up the ante for students, so that increased challenge and explanatory power were continually being forged. Over the many years of this research, we developed reasonably elaborated notions of what we hoped to accomplish at each grade. The focus was on development, so that at every grade, children’s mathematics built on the mathematical ideas that had previously been put in place and also needed to support the modeling approaches to science that we were investigating. Many of these mathematical and scientific ideas are not typically taught to elementary school children at all, but followed from research and our own conceptual analysis of what would best support the long-term development of student disciplinary knowledge. Administrators permitted this latitude in instruction because the statewide test scores in mathematics, especially in the classrooms in which we worked most intensively, continued to show yearly improvement (Lehrer & Schauble, 2004). The superintendent often dropped in during our professional development sessions (we held them in the district’s administrative center) and decided on the basis of his observations that much of the improvement in these scores could be attributed to the activity of this professional teaching community. Innovation was also held accountable through students’ performance on yearly state tests. This accountability was important to members of the school board, especially those who ran on a “core knowledge” platform.

What was the new work intended to contribute? In this instance, we were seeking a capstone to ontogenetic trajectories established (in the ideal) for measurement and for data. The measurement trajectory began with fundamental ideas of measure in length, area, and volume during the primary grades and then progressed to include ideas of error and distribution in the later elementary years (Lehrer, 2003). Our rationale was that measurement is an important mathematical system in its own right and moreover, plays a critical role in our approach to modeling. Developing a measure of an aspect of a natural system requires developing a more thorough understanding of it. The work with data focused on developing representational competence and on enhancing intuitions and representations of variability (see Lehrer & Schauble, 2002, for descriptions of work with teachers to support their ability to teach along this trajectory). Instead of skimming over a wide variety of science topics at the surface level, our students were building deep and cumulative knowledge within bounded domains by posing questions, developing measures, and building, testing, revising, and critiquing models of the natural world.

The Educational Design

Although learning cannot be considered an instance of kinematics, except in the hearts of the most die-hard epistemic realist, nonetheless, images of trajectory are useful for anticipating the scope and sequence of instruction. We had in mind an end-point where students would be able to represent natural variability with the mathematics of distribution and chance, and to employ this emerging capacity to envision growth in a new way – as change in populations. Our hope was that this more complex sense of growth would complement their resources, developed in earlier grades, for representing change in individual organisms as rates of growth. In other words, we sought to provide mathematical resources that would make it possible for students to make the difficult shift from thinking about organisms to population thinking. With that overall goal in mind, we envisioned five phases of instruction.

First Phase: Purposes and Measures

In all investigations, we aimed to underscore the tight relations between the questions that students posed about a natural system and the attributes and measures that could be generated as data in service of these questions. In this investigation, the teacher engaged the group in the generation of questions that focused on the effects of different amounts of light and fertilizer on the growth of plants (these were Wisconsin Fast Plants®, which grow in 40 days and thus are well suited for classroom investigations). Students predicted that both light and fertilizer would make the plants grow taller and perhaps affect other measures, such as their “width,” as well. Collectively, students designed experiments with contrasting conditions, regarding height as the most prominent of the dependent measures. They recorded these measures (along with others) as the plants grew.

A second goal of the initial phase of instruction was to engage students in reasoning about the means and methods of measure: What is meant by “height”? Should the height of the plant include the roots? Suppose the plant leans as it grows or develops multiple branches? What is a good unit of measure? Thinking through and achieving consensus on these questions helped students appreciate the interpretation of their measures and consider the degree of trust that they had in them. For some students in the room, this was a familiar kind of argument, but for relative newcomers to the classrooms of participating teachers, it was not. After resolving these issues, students recorded heights of plants grown under different conditions throughout their life cycle, keeping data in the form of simple records of their own design. We made no effort to impose any particular structure on the measurements. Students also developed methods of measure for other attributes (e.g., width, number of leaves, seedpods), but we settled on height for introducing concepts of distribution, the focus of the second phase of instruction, because it seemed most prominent to the students.

Second Phase: From Difference to Structure

In this portion of the instruction we intended to support a transition from students viewing the collection of plant heights on any particular day of growth as merely different, toward apprehending a structure (a distribution) regulating these differences. Students were keenly aware of natural variation, that is, that the collection of plants was not of a uniform height at any point in the life cycle. We asked students to design displays that would illustrate a “typical” height and spread of the plant heights at a single day (the 19th day of growth), and as we describe below, these are the segments to which the Allerton participants devoted most of their attention. We hoped to contrast case-based views with aggregate views of the same data, so that students could come to see the “shape of the data” as reflecting choices about what to represent in the data, and how. diSessa (2004) refers to this sense of appreciation of the consequences and implications of different choices of data display as meta-representational competence. The inherent tension in the design was that students had never considered viewing plants collectively, and so for them, the value of this form of analysis was not transparent. Yet many of the students in this class had generated distributions of repeated measures during the previous year (Petrosino, Lehrer, & Schauble, 2003), and so we conjectured that these earlier experiences would serve as resources for the current enterprise. We have since learned that considering variability as produced by random error or by some other “natural” random process raises very different challenges and affordances (Lehrer & Schauble, 2007; Konold & Lehrer, 2008).

Students worked in small groups to create their displays. During this small group activity we wanted to elicit students’ thinking and to promote, in at least one or two groups, displays that would treat the data as aggregated, rather than as a mere collection of discrete cases. In other words, we hoped to provide opportunities for students to develop a firmer coordination between their knowledge of and natural focus on individual cases (my plant and its unique qualities), with a sense of the aggregate, or the data itself as an object of attention (Lehrer & Romberg, 1996). After completing their display, each group handed it off to another group who were asked, in turn, to interpret what the authors were trying to communicate and to evaluate their success at showing “typicality” and “spread.” This activity structure, which we tend to use repeatedly, creates a sense of audience for the students’ work and thus highlights the communication demands of data displays. It also prompts conversation about design trade-offs, that is, how a particular data display highlights some features of the data and suppresses others (children refer to this as “showing” and “hiding” features of the data). These critique sessions often provide a strong press toward revision and ultimately, toward identifying effective ways of solving representational problems that eventually come to be accepted as classroom conventions (Lehrer & Schauble, 1994). The initial or invention phase provokes a great deal of variability, which is then pruned during interpretation and critique sessions, eventually resulting in agreed-upon solutions to representational problems. We hoped that there would be sufficient variability in the displays to make conversation about similarities and differences mathematically productive. 
For us, mathematically productive meant coming to see how the design of a display resulted in the ensuing shape of the data, and also, how “hills,” “holes,” and related features notable in a frequency graph result from the interplay between counts of cases and how one defines the corresponding interval. This knowledge is a precursor to the more conventional construct of the density of a distribution. As students began to think about aggregate, we aimed to tie this aggregate to the mathematically important idea of data generated by a repeated process, by asking students what would happen to the aggregate “if we grew them again.”
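
The interplay between counts of cases and interval definition can be sketched in a few lines of code. The example below is only an illustration and was not part of the classroom work: the heights are hypothetical (they are not the class's data), and the function names `frequency_table` and `show` are assumptions introduced here. It tallies the same values with two interval widths so that the resulting "hills" and "holes" can be compared.

```python
from collections import Counter

# Hypothetical plant heights in mm (NOT the class's data), for illustration only.
heights = [30, 55, 72, 98, 105, 120, 131, 138, 142, 147, 150, 153, 155,
           158, 160, 161, 163, 165, 166, 168, 169, 172, 175, 181, 190, 205]

def frequency_table(values, width):
    """Count how many values fall in each interval [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    # Keep empty intervals in the table so that "holes" remain visible.
    return {start: counts.get(start, 0)
            for start in range(lowest, highest + width, width)}

def show(values, width):
    """Print a text frequency display, one row per interval."""
    for start, n in frequency_table(values, width).items():
        print(f"{start:3d}-{start + width - 1:3d} | " + "x" * n)

show(heights, 10)   # narrow intervals: several holes and a lumpy hill
print()
show(heights, 50)   # wide intervals: the holes vanish into one broad hill
```

With the narrow intervals, empty rows appear between the shortest plants; with the wide intervals, the same cases merge into a single mound. The data have not changed, only the choice of interval.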

This task was repeated in a modified form for other days of the plants’ life cycle. Again, we explicitly focused on how choices made by designers influenced the shape of the data display. Which senses of shape afforded easy comparisons of the same sample at different points in the life cycle? One strategy we employed was to build on students’ emerging, idiosyncratic partitions of the data (e.g., thirds of the distribution). We felt that student talk about “regions” of the data showed that they were in the process of making the transition from an emphasis on collections of individual cases to aggregate structure (an example of this kind of conversation is included in one of the analyzed video clips). We used these divisions of the data, proposed by students, as a path for introducing conventional divisions, especially quartiles. Students referred to the dividers as “hinges” and the width of the quartiles as “doors.” Our goal was to relate changes in the shape of the distribution to conventional representations, such as the box plot. For example, when the distribution of plants became more “normal,” students noticed that the middle doors “shrank,” and we asked them to account for this shrinkage by relating it to the shape of the distribution of the data (expressed by relative frequencies).
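
The relation between "hinges," "doors," and the shape of the data can be illustrated with a short sketch. It is an assumption-laden example rather than anything used in the classroom: both data sets are hypothetical, the function name `doors` is introduced here, and the quartiles come from Python's `statistics.quantiles` with the `inclusive` method, which is only one of several conventions for locating quartiles.

```python
import statistics

def doors(values):
    """Return the widths of the four quartile intervals (the "doors")."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    hinges = [min(values), q1, q2, q3, max(values)]
    return [round(right - left, 1) for left, right in zip(hinges, hinges[1:])]

# Hypothetical heights in mm (NOT the class's data).
spread_out = [30, 50, 70, 90, 110, 130, 150, 170, 190]     # evenly spread
clumped = [30, 100, 105, 108, 110, 112, 115, 120, 190]     # center clump

print(doors(spread_out))   # four doors of equal width
print(doors(clumped))      # the two middle doors shrink
```

Each door holds about a quarter of the cases, so a narrow door marks a region where the cases are packed closely together. That is the link between the shrinking middle doors and the relative frequencies in the center of the distribution.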

Third Phase: Coming to See the Sample as Varying

As students worked on ways to structure variation as distribution, their teacher again asked them to consider what would happen if they grew the plants again, but this time tying the image of repeated process more explicitly to chance. The aim was to invoke an image of a (random) repeated process, with a sampling distribution as a way of characterizing the likely outcomes of these repetitions. Students initially explored this question with random sub-samples of their classroom data (in effect, treating it as a population). They placed cards containing the heights of their plants in large envelopes and drew random samples, pasting the results of their sampling as frequency displays on the walls of the classroom. These displays made sample-to-sample variability quite visible, and the students readily attributed this variability to chance. Sampling without replacement was motivated by our conversations with Patrick Thompson (2000, February, personal communication), who proposed that students tend to conceive of samples as parts of populations. We therefore conjectured that students would find it sensible to literally construct parts (samples) and then to examine their relationships to the whole.
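
The card-drawing activity can be mimicked in code as a rough sketch. The envelope contents below are hypothetical heights generated for illustration (not the class's data), the sample size of 10 is an arbitrary choice, and the name `draw_cards` is an assumption introduced here. Sampling without replacement is done with `random.sample`, so no card can be drawn twice within one sample.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so repeated runs match

# Hypothetical envelope of 63 cards, one height in mm per card (NOT the class's data).
envelope = [round(rng.triangular(30, 250, 160)) for _ in range(63)]

def draw_cards(cards, k, rng):
    """Draw k cards without replacement, like pulling cards from the envelope."""
    return rng.sample(cards, k)

# Three samples from the same envelope, summarized side by side.
for trial in range(1, 4):
    sample = draw_cards(envelope, 10, rng)
    print(f"sample {trial}: median {statistics.median(sample)} mm, "
          f"range {min(sample)}-{max(sample)} mm")
```

Because every sample is a literal part of the same envelope, any differences among the printed summaries reflect the luck of the draw alone.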

We next stretched the metaphor further by introducing sampling with replacement as a model for “growing again.” Using a computer program developed by Andrea diSessa (a prototype written in the Boxer programming environment), we varied the sample size and number of samples employed to look at the shapes of the resulting distributions of statistics (means, medians). Selection of these statistics was motivated by employing them as ways of representing the tendency observed by students for the “middle” to be recovered from sample to sample. (The notion of “middle” was an interesting opportunity to contrast the probabilities of recovery of any single case being sampled, that is, 1/n, to that of the event class defined by the center clump or by other regions of the data; see Lehrer & Schauble, 2004.) The computer tool aggregated the results of the simulation into ordered intervals and plotted them as histograms. During these sampling experiments students generated explanations of what they were seeing and tested their explanations by conducting additional investigations. For example, some students proposed that small samples of 2, compared to larger samples, would increase the sampling variability of the mean or median, because “bad luck” might easily lead to including an extreme value that would skew the results away from the center clump.
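
The students' conjecture about samples of 2 can be explored with a minimal simulation. This sketch is not the Boxer program described above. It is a stand-in written in Python, with a hypothetical population of heights (not the class's data), an arbitrary choice of 1,000 samples per condition, and a function name (`resample`) introduced here for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so repeated runs match

# Hypothetical population of 63 plant heights in mm (NOT the class's data).
population = [round(rng.triangular(30, 250, 160)) for _ in range(63)]

def resample(data, sample_size, n_samples, stat, rng):
    """Draw n_samples samples with replacement; return one statistic per sample."""
    return [stat(rng.choices(data, k=sample_size)) for _ in range(n_samples)]

# Compare how much the sample median varies for small and larger samples.
spread = {}
for size in (2, 20):
    medians = resample(population, size, 1000, statistics.median, rng)
    spread[size] = statistics.pstdev(medians)
    print(f"sample size {size:2d}: medians vary with SD of about {spread[size]:.1f} mm")
```

Passing `statistics.mean` instead of `statistics.median` gives the corresponding comparison for means. In either case the statistic from samples of 2 should scatter more widely than the statistic from samples of 20, which matches the students' "bad luck" explanation.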

Fourth Phase: Distributions as Signatures of Growth Processes

We next sought to repurpose these concepts about distribution and chance to a new question: How did the distribution of the plant heights change over time, and what might account for this change? The goal was to promote distributions as signatures of growth processes (Gould, 1996). When growth processes change, so, too, do distributions of the population of plants.

Fifth Phase: Reconsidering Experiment

In the final phase students revisited their initial conjectures about the effects of light and fertilizer against the background of their emerging understanding of distribution and sampling. They contrasted samples of plants grown under conditions of low light or high fertilizer to a larger sample of plants grown under standard conditions. We reminded students about “growing again” to connect these contrasts to images of repetition and thus inference about effect to sampling (e.g., what might be expected if we grew them again vs. what had happened under known conditions of light and fertilizer). We knew that students’ conjectures about the effects of fertilizer would not be supported by these data – counter to their expectations, Fast Plants do not grow taller if they are given extra fertilizer, although their canopies grow wider. Because we knew that the data would disconfirm a favored theory, we anticipated that this context would be especially productive of data-based argument. We were aware of the literature about preadolescents’ tendencies to base their arguments primarily on beliefs about the way the world is (Kuhn, 1989) and felt that if students were asked to reason about data that clearly did not support their favored beliefs, they would be more likely to engage beliefs and evidence as separable dimensions of consideration. Essentially, we asked students to invent a method of comparison that would work to resolve differences about the prospective effects of light and fertilizer. Our intention was to ground inference in sampling variability, but to do so without the mechanisms of formal inference, such as the confidence interval.

The Classroom Data

The video segments on which this book is based represent our very first attempts to help students think about natural variation of populations (in this case, of plants) via deep understanding of seminal ideas about distribution. Consistent with our interest in cumulating knowledge within domains across grades, this was not the first investigation of plants that these students had undertaken. As first graders they had grown flowering bulbs of various species and had investigated changes in the heights of the bulbs over their lifecycles. These investigations served as a context for students to employ their developing understanding of measurement. In the third grade, students studied changing rates of growth by constructing piecewise linear graphs of the heights of individual plants. In response to a teacher’s challenge to find a way to “draw one line that shows how all our plants grew together,” they proposed a line that connected the midrange of the distributions of plant heights at each day of measure. They then held an extended debate about whether this solution was legitimate. Because the line intersected some points that represented the height of none of the class’s plants, the argument focused on how a value could be considered typical if it did not include any of the cases being described. Next students constructed rectangular pyramids and cylinders out of paper to test a conjecture about the changes in the volumes of the plant canopies. With mild disappointment, students noted that their conjecture (that the volumes of the plants would increase in constant proportion) was not correct, but we were impressed with both the question and the models proposed to test it. 
In the fourth grade, students grew plants in “crowded” and “uncrowded” conditions and compared the resulting distributions of height by eye to determine whether there were discernible differences in plant height, width, volume, number of leaves, and seedpods as a result of these two conditions of growth. The lessons analyzed in the workshop focused on our initial attempts, working with a collaborating fifth-grade teacher, MR, to develop the underlying conceptual understanding of distribution that could appropriately guide inferences of this kind.

Therefore, the video sent to the analysts represents our first attempt to work out these ideas in a classroom. Although we had, of course, a general idea of where we wanted to go, the details were being manufactured in the process, and every day was capped with a meeting between the research team and the classroom teacher to retune or revise our plans in progress. Those who consult the transcript will note that the chapter authors were in the classroom each day along with Christopher Hartmann, a research assistant (we have included below a brief summary of each of the video clips referenced by the chapter authors, along with our own abbreviated comments about these events). In addition to planning next instructional steps, our role in the classroom was to document student conversation and learning and to talk with small table groups of children as they solved problems and conducted investigations. Given the developmental status of this work, we acknowledge openly that the teaching in this segment is often shaky, sometimes even clumsy. Subsequently there have been many replications with (considerable) revision of this introduction to variability, and students in Wisconsin, Arizona, and Nashville have participated in iterations of the instruction. We would not want readers to assume that we are claiming this material as an example of excellent practice. Instead, in this piece, students, teacher, and researchers alike are struggling to create an innovation, to understand relations between teaching and learning, and to characterize what seems robust about each. Many mistakes of many kinds were made along the way. The discerning reader will readily perceive that he or she is witnessing an instructional design in the making, which may be part of what makes these data interesting.

Transcripts for the twelve video clips analyzed during the Allerton workshop can be found in Appendix B in the back of this volume. (The transcription conventions are described in Appendix A.) The events captured in these excerpts occurred during the second phase of the instructional unit described earlier. The video was collected in a fifth-grade classroom in which students discussed plant growth and development and then grew Wisconsin Fast Plants® under different conditions of light and fertilizer. The data described in the video excerpts were collected on the 19th day of plant growth, or approximately halfway through the plants’ life cycle. The video excerpts open on the 26th day following the planting of the seeds, with students being asked to design representations of the data they had collected from their plants. At that point, students were still engaged in the process of collecting data, but were structuring data already recorded. On the next day of instruction (Day 27 of the plants’ growth), students exchanged representations and began making whole-class presentations, which continued into the following day (Day 28 of plant growth).

Excerpt 1: Introduction of a Data Representation Challenge (Day 26)

MR introduced the task that students will be working on over the next few days. Students were asked to invent and compare data displays, considering what different displays reveal and hide. This emphasis on representation is a hallmark of the program that RL and LS are introducing in the school. The students’ measurements have been collected. The values are presented on a flip chart (see Figs. 2.1 and 4.1). The task, which will be carried out in “table groups” (semi-permanent, small groups of students who sit together at a table), is to find a way to represent the data so that the display shows the typical height of a Fast Plant at Day 19 of growth and also, how spread out the data are (see Fig. 22.3). MR says, “If you could answer these questions by the end of today, you’ve done pretty well” [0:02:10].

Fig. 2.1 The converted Wisconsin Fast Plants® height measurements from Day 19

Excerpt 2: Getting Started (Day 26)

Group 1 begins their design process. RL has agreed to serve as “recorder” for the group. Caleb, Kent, and Garett have an extended exchange about where to start plotting their data. Should they begin the chart at 30 mm, the shortest plant, or at 0 (see Fig. 15.4)? Kent complains, “It’s just all kind of weird starting from 30” [0:09:09]. Caleb agrees, “It doesn’t make any sense to start at zero number when they’re not even up there” (i.e., there is no plant that is 0 mm tall).

Excerpt 3: How Should We Look? (Day 26)

Group 2 begins to design their representation without direct adult input (see Fig. 12.4). The students sit in a circle around a small table. On the table is a large, single sheet of oversized graph paper on which they are to produce their design. A focus of the discussion appears to be how to fit all the measurements onto the piece of paper, whether they should be listed along the short side or the long side of the piece of paper, and whether the numbers will fit if the graph is scaled so that one square accounts for 3, 5, or 10 of the values. One obstacle to the collaborative design work is that, given their positions at the table, only one student has a view of the worksheet as it will be seen when the design is complete. The others must rotate or invert their views to align with the view of the person sitting on the side of the table closest to the bottom of the graph. This leads to differences in opinion about what will serve as the top/bottom, right/left of the designed representation. Different proposals are warranted by appeal to what will fit on the sheet, what will be neat, and how one conventionally constructs a graph, etc. Students attempt to reconcile the proposals by inviting one member (Jewel) to move to the position of another (Wally) in order to better understand his position. The topics of “typicality” and “spreadoutness” are never taken up explicitly.

Excerpt 4: An Adult “Assists” (Day 26)

This excerpt captures Group 3 designing their representation with the assistance of LS (see Figs. 10.1, 10.2, and 12.3). As in Excerpt 3, much of the discussion focuses on finding a way to fit the data from the flip chart onto the provided piece of paper. Again, this entails disputes about how to order the values, lots of counting, and some negotiation with respect to the nature of the task at hand. For example, Jasmine counts the number of squares across the side (35) and bottom (22) of the paper and then multiplies them, presumably with the intent of putting one value into each square. Later, she proposes, “…we could just show the odd numbers, maybe.” Interestingly, the problem of orienting the graph on the paper that proved to be contentious in Group 2’s design work does not arise as a problem here. In her pointing and counting activities, LS implicitly adopts an orientation for the emerging representation and the students accept this without question. Again, the topics of “typicality” and “spreadoutness” are not taken up explicitly. The students are told to “use their sense” and to think about “what it is that we want to show.” “Frequency (charts)” are referenced by LS in the context of displaying frequency in a range. To our taste, this is one of the examples of clumsy teaching that we mentioned above. LS is being far too directive here, but her explicitness is provoked by the goal of ensuring that at least one group will create a display that shows the shape of the data when relative frequencies are represented. In subsequent iterations, we have learned that we do not need to be this explicit – this aspect of the instructional design (inventing and comparing displays) is sufficiently robust.

Excerpt 5: Group 2 Explains Their Progress (Day 26)

Group 2 discusses their partially completed graph, described earlier in Excerpt 3, with RL (see Figs. 4.2 and 12.5). Rich says, “I’m not sure I understand the graph that you made.” Plant heights are displayed on the left (from Jewel’s perspective), and across the bottom are 63 elements representing the ordinal position of each data value in the table. Anneke points out that there is no need to label the plant numbers across the bottom of the graph (Plant 1, Plant 2, etc.), as April wishes to do. “Well, it doesn’t matter. ‘Cause you know there’s a plant there.” RL asks how the emerging graph answers their question (see Fig. 22.4). April’s reply suggests that she is not concerned with answering MR’s two questions, but rather, with making something that meets her criteria for a graph: “But…but that’s the way a line graph normally is.” RL asks, “Did anyone say it has to be a line graph?” He leaves the group for a moment, saying, “Well you gotta kinda figure out what you’re tryin’ to figure out” [0:41:44]. He returns a few minutes later and comments, “What I think you did very nicely here was create some way of arranging your information from smallest to largest. That’s a good start. Now you have to think about how you’re gonna show each of the values.” At this point Wally interrupts to say that he would prefer to work on a “stem-and-leaf” graph (see Fig. 2.2). Rich suggests that the group split, with Wally pursuing his plan and the girls pursuing theirs. MR later (and not transcribed) visits the table. Jewel asks if making a stem-and-leaf graph would count as “organizing the data.”

Fig. 2.2

“Stem-and-leaf” display constructed by a student who disagreed with the case-value consensus of his group (Group 2)

Excerpt 6: Group 5 Describes Their Approach to the Task to RL (Day 26)

Janet, Rene, Malcolm, and Kurt (Group 5) are tentatively recording their first data point on their chart (see Fig. 10.3). RL asks them, “Do you need two dimensions to show how spread out they are, or could you do it with one?” Janet replies that one of the axes is needed to record each of the plants (“in alphabetical order”) and the other to record the values of the plant heights. RL tries in vain to cue their memory of frequency graphs by reminding them of the data displays they had created the year before, when they displayed the heights of rocket launches. “Suppose the data were not about plant heights, but they were how high the rocket went?” Janet replies, “You’d still use it to show the different heights the rockets went.” She continues, “And this would be the first rocket, because it’s important to see which one it was.” RL responds gently, “Well, the rockets were all sent up at the same time, right?” (The reference by RL is to a repeated measure context the earlier year in which multiple individuals measured the height at apogee of a single rocket.)

Excerpt 7: Suppose We Grew the Plants Again? (Day 27)

MR has passed out all the displays so that each table group is holding a display authored by one of the other groups. He has explained that each table group will be asked to orally present the display of another, interpreting what the display shows and commenting both on strengths and weaknesses of the display. Group 3 (Tyler, Edith, Kendall, and Jasmine) is attempting to make sense of the graph developed by April, Anneke, and Jewel (see Fig. 2.3). Edith notes that this graph is very similar to the one that they made, so “I don’t think they need to change anything at all.” CH, a research assistant on the project, asks what would be likely to happen if we planted another 63 plants. Edith notes that in the original data, 11 out of 63 plants fell in the 160–169 mm range, and Kendall says that he would expect to get “somewhere around that number. It could be more, it could be less.” CH asks how many plants they would expect to observe from 160 to 169 mm if they planted only 20 plants the next time around, instead of the original 63. Kendall suggests dividing 11 by 63 to find the percentage of the original distribution in that range. Tyler calculates that this would be 17.4% and remarks that “…It’s the biggest amount that we will get for any of ‘em” (i.e., the 160–169 bin holds the largest percentage of the plants). Part of this exchange can be seen in Fig. 22.5. CH asks, “So does that help you talk about what the typical height would be?” We consider this discussion significant, as it demonstrates students thinking about relationships between regions of the data and the entire batch. They apparently grasp that if they “grew the plants again,” the structure of the original distribution would be a good source for deciding what to expect in the new distribution of plants. This is a frequentist view of chance that we value, because it emphasizes the role of repeated process in the interpretation of chance.
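The proportional reasoning in this exchange can be written out as a short sketch. The counts (11 plants in the range, 63 plants in all, 20 plants in the imagined replanting) come from the excerpt; the function name `expected_in_range` is a hypothetical helper introduced only for illustration. Computed exactly, the share is about 17.5%, so Tyler's 17.4% appears to be a truncated figure.

```python
def expected_in_range(count_in_range, total_plants, new_planting_size):
    """Scale an observed share of the data to a planting of a different size.

    Assumes the new planting would have roughly the same distribution
    as the original one, which is the students' working assumption.
    """
    share = count_in_range / total_plants
    return share, share * new_planting_size

# 11 of the original 63 plants fell in the 160-169 mm range.
share, expected = expected_in_range(11, 63, 20)
print(f"share of plants in the 160-169 mm range: {share:.1%}")
print(f"expected count in a planting of 20: {expected:.1f}")
```

On these numbers, a replanting of 20 would be expected to put three or four plants in the 160–169 mm range, in line with Kendall's remark that the count "could be more, it could be less."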

Fig. 2.3

Graph produced by April, Jewel and Anneke (Group 2) on Day 26 (Excerpts 3 and 5)

Excerpt 8: Grouping or “Binning” the Data (Day 27)

This is one of the whole group presentations. Rene and Janet from Group 5 present the graph designed by Group 1. The Group 1 graph is simply a list of the values, in order, across the bottom of the paper. Students apparently ran out of room and simply inserted the remaining values further up on the page. The exchange involves a misreading by Janet of Group 1’s computed mean value. Next Anneke, Jewel, and April discuss the graph designed by Group 3. MR asks students to compare the ways that the different displays group the data. He introduces the term “bin” as a special word that refers to these groups of data. We considered it critical for students to grasp the notion that data could be grouped this way, and further, to understand that changing the bin size changes the shape of the distribution. “Shape of the distribution” will be a central theme in the instruction that follows over the next several weeks. Students eventually come to understand that the shape of the data supports interpretation, and moreover, that the data representing plant heights changes its shape in a predictable manner over the life cycle of the population of plants.
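The claim that changing the bin size changes the shape of the distribution can be illustrated with a minimal sketch. The heights below are hypothetical values invented for this example, not the class's measurements, and `bin_counts` is an illustrative helper rather than any tool used in the study.

```python
from collections import Counter

# Hypothetical plant heights in mm (invented for illustration only).
heights = [132, 148, 151, 155, 158, 160, 161, 163,
           164, 166, 167, 169, 172, 175, 181, 190]

def bin_counts(values, bin_width):
    """Count how many values fall in each bin of the given width.

    Each bin is labeled by its lower edge, so with bin_width=10 the
    value 157 lands in the bin labeled 150 (covering 150-159).
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Print a simple text display for two different bin widths.
for width in (10, 20):
    print(f"bin width {width} mm")
    for lower, count in bin_counts(heights, width).items():
        print(f"  {lower}-{lower + width - 1}: {'X' * count}")
```

With the narrower bins the display has seven columns and a single tall peak; with the wider bins the same values collapse into four columns and the tails are absorbed into the neighbors of the peak.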

Excerpt 9: Showing Spread of the Data (Day 28)

Group 1 is at the board (see Fig. 19.3) presenting a representation made by Group 5. Actually, Group 5 made two. Rene and Janet produced the display shown in Fig. 2.4 and Kurt and Malcolm made the graph shown in Fig. 2.6. Rene and Janet’s display is not a graph, but rather, a list of all the plant heights in order, starting from the top left of the paper and continuing to the lower right. Just before the fragment begins, Garrett, who is at the board presenting their representation, is critical of how the girls made their “graph.” He argues that his own display (not shown), which lists the values in order across the bottom of the page, does a better job of showing “spreadoutness.” He points out that Rene and Janet’s display uses up the entire page and does not emphasize the length of the string of values from beginning to end, because it continues over several lines. Rene and Janet rise to the defense of their representation and explain that it is better than Group 1’s graph (which they presented earlier), because they included an annotation indicating their answers for typicality and spread. Garrett and Kent retort that these qualities were to have been made visible in the representation itself, and without the annotation they would be “clueless.”

Fig. 2.4

Rene and Janet’s (Group 5) tabular representation of the data produced on Day 26 (Excerpt 9)

From our perspective, neither of these displays is going to reveal the shape of the data. MR, therefore, proposes a thought experiment – what if the highest value were 555, instead of 255 (Fig. 15.3)? Which of the graphs on the board would best show this change in spread? Kerri eventually comes to the board and identifies Group 3’s graph, a frequency distribution (see Fig. 2.5). “Well,” she says, “I think that probably this graph, because they still leave some spaces there…. you can really see how spread out it is…and you can see …how much space is there between it” [0:12:48]. Ian adds, “It’s not just the numbers that we actually measured that are in between, but all of the numbers” [0:13:41]. It is common for students to omit bins that have no observed values. Doing so hides the “holes” in the data and provides a misleading picture of the “spread.”
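The point about empty bins can be made concrete with a small sketch. Only the outlier of 255 mm is taken from the text; the remaining heights are hypothetical, and `binned_display` is an illustrative helper name, not something used in the classroom.

```python
def binned_display(values, bin_width, keep_empty_bins):
    """Return (lower_edge, count) pairs, optionally including empty bins."""
    counts = {}
    for v in values:
        lower = (v // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    if keep_empty_bins:
        # Walk every bin from the lowest to the highest observed one,
        # reporting a count of zero where nothing was measured.
        first, last = min(counts), max(counts)
        return [(lower, counts.get(lower, 0))
                for lower in range(first, last + bin_width, bin_width)]
    return sorted(counts.items())

# Hypothetical heights in mm; only the 255 mm outlier comes from the text.
heights = [152, 158, 161, 163, 166, 168, 171, 174, 183, 255]

for keep in (False, True):
    print("empty bins kept" if keep else "empty bins omitted")
    for lower, count in binned_display(heights, 10, keep):
        print(f"  {lower:>3}: {'X' * count}")
```

When empty bins are omitted, the 250 bin sits directly beside the 180 bin and the outlier looks like a near neighbor; when they are kept, six empty rows separate the two, which is the "space" that Kerri points to.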

Fig. 2.5

The 10-bin graph produced by Group 3 (Tyler, Edith, Kendall and Jasmine) on Day 26 (Excerpt 4) and discussed on Day 28 (Excerpts 9 and 12)

Fig. 2.6

Kurt and Malcolm’s (Group 5) graph discussed in Excerpt 9 that introduces the notion of scale

Excerpt 10: What Is a Good Representation? (Day 28)

In this Excerpt we see a contest between two different criteria for what counts as an admirable representation. On the one hand, students are impressed by solutions that are clever or original, even if they are arcane. Competing with this value is MR’s continued insistence that the display should allow readers to easily interpret typicality and spread. Ian, Kerri, and Cindy (Group 4) present a graph developed by Group 6 (Fig. 2.7). This rather unusual graph orders the hundreds and tens places along the Y axis (13, 14, 15, etc.) and the ones places along the X axis (1, 2, 3, etc.). So, to identify 157, for example, one would locate the 15 on the Y axis and move over 7 on the X axis. Ian begins his description of this graph by commenting, “It’s a little bit confusing.” MR asks, “What about it helps you see that the numbers are spread and what a typical plant would be?” When they are asked to point to a “typical” fast plant, Ian and Kerri produce a series of points and gestures. A dispute ensues in which they take different positions with respect to how “typicality” could be read off this representation. Kerri explains how the graph is intended to be read, and the class is clearly impressed. Erica remarks, “You guys are so cool!” MR asks Erica, “What makes it easy to see what’s typical?” Erica admits, “It’s kind of hard to see that.”
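The way Group 6's graph locates a value can be sketched as a simple place-value split. This is our reading of the description above, not a reconstruction of the students' procedure, and `grid_position` is a hypothetical name.

```python
def grid_position(height):
    """Split a height into (row, column) for a tens-by-ones grid.

    The row is the hundreds-and-tens part (157 -> 15) and the column
    is the ones digit (157 -> 7), following the description above.
    """
    row, column = divmod(height, 10)
    return row, column

print(grid_position(157))  # (15, 7)
```

Under this layout, heights that share a tens range land in the same row.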

Fig. 2.7

Group 6’s (Michael, Debbie, Kay and Jacki) graph of the Wisconsin Fast Plants® height data presented to the class by Group 4 (Ian, Kerri and Cindy) on Day 28 (Excerpt 10)

Excerpt 11: Another Clever (But Opaque) Solution (Day 28)

Group 6 shares a graph designed by Group 4, Ian, Cindy, and Kerri. This graph, like Group 6’s, is a design extravaganza. It displays the median of the distribution at the top middle of the page, poised on a set of carefully drawn stairs. Other values descend from the median on the stairs, although it is not clear that the display preserves interval (see Fig. 2.8). Kerri is asked by another student about her source of inspiration for this design. She replies, “We were thinking about different graphs that we could make,” confirming our impression that she was focusing on originality of design, rather than how well the graph showed typicality and spread. One of the students remarks, “I don’t think I’ve ever seen one like that before.” Another student points out, “This one is kind of hard to read. When you first look at it, you think, ‘What the heck did you do?’”

Fig. 2.8

The “stair graph” produced by Group 4 (Ian, Kerri and Cindy) and presented by Group 6 (Michael, Debbie, Kay, Jacki) on Day 28 (Excerpt 11)

Excerpt 12: A “Typical Region” of the Graph (Day 28)

Group 3 (Tyler, Edith, Kendall, and Jasmine) present the graph designed by Group 2 (April, Jewel, and Anneke).2 Note that Group 2 had “shared” the graph produced by Group 3 on the previous day [Day 27: 0:26:00–0:30:55]. Tyler begins by pointing out that Group 2’s graph (see Fig. 2.3) looks very much like the frequency graph that his group had created: “This is a bin graph. This basically was the exact same as ours” (see Fig. 2.5). Kendall recounts how, in their earlier conversation with CH, they had concluded that a typical plant was most likely one whose height fell in the highest column on the graph (see Fig. 22.6). He adds, “…and we found the percent. It was about 17 point something.” MR asks for clarification: “Are you saying 17% of all the numbers fall in here?” Kendall went on to explain that they did not think that 17% of the data was a sufficiently large region to feel confident that they had captured the typical value. This concern was probably sparked by his earlier conversation with CH about what values they would be most likely to see once again if they grew another set of plants. Kendall explained that his group therefore added to their consideration the columns immediately surrounding the highest column of values. “So then we tried adding all these, then we get 22 numbers there. Then we got 34%, and we sort of thought that was more like…” Tyler finished his thought: “So out of these 3 (columns) were the typical area.” MR re-voiced, “You’re saying that same thing, 34% of the Fast Plants fall somewhere in this area? So you’re saying if you grow a Fast Plant, would you say, Kendall, chances are good that it would be between 150 and 170 because that’s where 30% of all the stuff was?” Then MR pointed to the outlier plant that grew to 255 mm and asked, “What about this one? What’s the odds of your Fast Plants growing 255 mm? Would you say that’s pretty good?” Tyler replied, “I would say it’s one out of 63… That’s 1.5%.” This discussion reconfirmed our belief that the students were starting to develop a sense of the shape and regions of the data. The distinction between case (1 out of 63) and aggregate (34%) and the ability to coordinate these two perspectives is something we value from a disciplinary perspective. Moreover, students were beginning to explore chance as embedded in repeated process (the notion of “growing again”), which we later focused on explicitly with a series of sampling experiments.
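The contrast between aggregate and case can be checked with a line of arithmetic, using only the counts quoted in the excerpt (22 values in the three central columns, 1 outlier, 63 plants in all). The students' reported figures of 34% and 1.5% appear to be truncated rather than rounded; that reading is an assumption on our part.

```latex
\[
\underbrace{\frac{22}{63} \approx 0.349}_{\text{aggregate: the ``typical area''}}
\qquad\text{versus}\qquad
\underbrace{\frac{1}{63} \approx 0.016}_{\text{case: the 255 mm plant}}
\]
```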

Coda

The previous description is quite detailed. We provide this level of specificity because we want readers to understand that we had a very particular set of goals in mind, goals that built cumulatively and systematically on the conceptual achievements that students had made in earlier grades. Our interest was not generally in whether children would find something to do with the data, or even if they would find something sensible to do. For our purposes, we were interested in constructing a pathway for moving toward these ideas of distribution. As we pursue educational designs we attempt to walk a fine line between educational romanticism and over-prescriptiveness. Children, of course, are endlessly inventive and have an impressive ability to make sense of situations, but carefully orchestrated assistance must be marshaled to keep those resources developing along ways that are valued in the disciplines. It is equally important that a teacher’s (or researcher’s) ambitions for students’ disciplinary knowledge and reasoning not over-ride or outstrip the ways of thinking and the sense-making that students bring to tasks and situations. In short, our purposes in this work were explicitly educative. We wanted students to encounter and consider a particular sequence of ordered ideas, even though at all times we were prepared to take detours or even to reroute the path based on what we were learning. As instruction progressed, our detours were more frequent. For example, although we initially intended to emphasize sample-to-sample variability, students’ use of the tool developed by diSessa prompted more in-depth excursions into sampling distributions of statistics (e.g., the median of a population) with varying sample sizes from different instances of “doing it again,” ranging from comparatively few repetitions to many. We emphasize this readiness to take conceptual detours, because we believe that our sense of developmental progression is more emergent and contingent than might be suggested by our earlier description.

Notes

  1. See Table 23.1 for a concise summary of where each of these excerpts was referenced within the chapters. The excerpts come from three consecutive days at the beginning of a section within an extended unit in which we were just developing the concepts of distribution and chance.

  2. Wally, the fourth member of Group 2, presented separately.