The graph in Fig. 1 displays the reaction times of 231 visitors to the Museum of Science in Boston, Massachusetts. Looking at this graph, we might notice that:

  • The average reaction time is around a quarter of a second.

  • The majority of reaction times were between 0.2 and 0.3 s.

  • The distribution is skewed to the right because there is an effective minimum time but no real maximum.

Fig. 1

Reaction times (in seconds) to a visual stimulus of 231 visitors to the Museum of Science in Boston, Massachusetts

These three characteristics are not properties of any of the individual reaction times. Rather, they are properties of the aggregate, or collection as a whole. Furthermore, to regard this collection as giving us any information about people’s reaction times more generally, we would need to view this aggregate as a representative sample of a larger aggregate, a population.
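To make concrete the sense in which these are properties of the collection rather than of any one value, the following minimal sketch computes all three from a small batch of reaction times. The ten values are hypothetical and chosen only for illustration; they are not the museum data, and comparing the mean with the median is just one rough indicator of right skew.

```python
from statistics import mean, median

# Hypothetical reaction times in seconds (an assumption for illustration,
# NOT the 231 museum observations shown in Fig. 1).
times = [0.19, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.28, 0.33, 0.45]

# Center of the batch: no single value "has" an average.
avg = mean(times)
mid = median(times)

# Share of the batch falling between 0.2 and 0.3 s.
share_in_band = sum(0.2 <= t <= 0.3 for t in times) / len(times)

# A long right tail pulls the mean above the median.
right_skew_hint = avg > mid

print(f"mean={avg:.3f}  median={mid:.3f}  share in 0.2-0.3 s={share_in_band:.0%}")
```

None of the three quantities can be read off an individual entry of `times`; each exists only once the values are treated as a single batch.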

Perceiving, describing, and generalizing from aggregate features of data is what statistics is primarily about (Fisher, 1990/1925; Moore, 1990). Yet composing individual data values into an aggregate does not come easily to students. Cobb (1999) noted that many of the middle school students he and his colleagues worked with initially perceived graphs of data simply as “collections of points.” Thus, rather than attending to features of the frequency distribution in Fig. 1 (how the values cluster and spread over their range), students might describe the data by identifying the individuals with the fastest or slowest reaction times or by locating themselves within the distribution. In earlier research, Hancock, Kaput, and Goldsmith (1992) reported an intervention in which they had encouraged students (ages 8–15) to attend to aggregate qualities such as clusters and spread in analyzing data. The students nevertheless continued to home in on “individual cases and sometimes had difficulty looking beyond the particulars of a single case to a generalized picture of the group” (p. 354). Mokros and Russell (1995) likewise observed that for many students, “a representative value had no meaning because the data set was, for them, only the values of the numbers” (p. 35). Stressing the conceptual nature of this problem, Hancock et al. (1992) concluded that “to think about the aggregate, the aggregate must be ‘constructed’” (p. 355).

These and more recent studies have both pointed to the importance in statistical reasoning of perceiving data as an aggregate (Bakker & Gravemeijer, 2004) and documented the challenge many students face, and successfully navigate, in making the transition from reasoning about cases to “constructing” and reasoning about aggregates (Ben-Zvi, 2004; Lehrer & Schauble, 2004). In this article, we analyze statements of students from three different sources to explore possible building blocks of the idea of data as aggregate and to speculate on how young students go about putting these ideas together.

Various researchers, beginning at least as early as Bertin (1967/1983), have offered analyses of different types of questions that data displays can be used to address. Perhaps the best-known framework is Curcio’s (1987) triad: “read the data, read between the data, and read beyond the data.” These categories were elaborated in Friel, Curcio, and Bright (2001) as “extracting information from the data,” “finding relationships in the data,” and “moving beyond the data.” Our approach in analyzing student statements about graphs can be considered orthogonal to this prior research. We suggest that students can perceive what is referred to in the above categories as “the data” in fundamentally different ways. For each of these different ways of perceiving the data, students could in theory pose questions at each of Curcio’s three levels.

Readers may interpret what we refer to as an aggregate as being synonymous with the more familiar statistical term, distribution. For our purposes, however, there is an important distinction to be made between the form of a graphical display (e.g., a frequency distribution) and the way in which that form is perceived, as indicated by the sorts of questions it is used to address. As we will show, students can create frequency distributions but then not use them to formulate statements about the group as a whole. We will claim that this is sometimes because students do not perceive the data in the distribution as a single group. Thus what a statistician like Chris Wild (2006, p. 11) typically sees in a frequency distribution (an aggregate) is quite different from what many novices are able to see (a bunch of individual cases). The variability in the data values, a feature that is highlighted by displaying them as a frequency distribution, is undoubtedly a big part of the conceptual challenge of coming to see them as an aggregate.

Recently, researchers have been investigating the development of the “idea” of distribution, and our sense is that they mean something different from, or more than, what we mean by aggregate. They have tended to look at the development and coordination of statistical reasoning about features of distributions such as location, spread, and shape (see Prodromou & Pratt, 2006; Reading & Canada, 2011; Reading & Reid, 2006). We view the idea of aggregate as necessary but not sufficient, for example, for using a median and IQR to characterize a distribution.

1 Methods and data sources

To exemplify various types of reasoning about data, we analyzed student statements from three primary sources.

1.1 Elementary school: teacher-written case studies

Our primary data source was 28 case studies that are included in the Developing Mathematical Ideas (DMI) Working with Data Casebook (Russell, Schifter, & Bastable, 2002); we designate references to this casebook as WwDC throughout the text. These cases describe classroom episodes from Massachusetts public school teachers in grades K-5. The teachers were participating in a teacher-development project and had completed several homework assignments that involved conducting data-analysis activities in their own classrooms. The case studies, which describe and reflect on these classroom experiences, include: (a) records of student dialogue, (b) copies of students’ data representations and written reports, (c) teacher accounts of classroom activities and discussion, (d) teacher interpretations of student statistical reasoning, and (e) teacher reflections on pedagogy. For the purposes of this analysis, we focused on (a) teachers’ records of classroom dialogue and (b) the data representations students produced.

The case studies provide a rich source of information about how younger students view and work with data. Part of this richness comes from the range of grades and activities they cover. Also, rather than being short, isolated instructional interventions, the cases describe classroom experiences embedded in sequences of instruction. Furthermore, they have been selected by teachers as particularly revealing episodes.

Our initial motivation for analyzing these case studies was to provide information to teachers (as summarized in Konold & Higgins, 2002) that would help them recognize different ways in which their students thought about statistics. As we looked at and discussed these case studies as a group, the conjecture emerged that there were a few fundamentally different perspectives that students applied. Once a coding framework had stabilized, we applied it to two other data sources, as described below.

1.2 Middle school: individual interviews

Eleven eighth grade students in a public school in Nashville, Tennessee were individually interviewed at the end of a 14-week teaching experiment. The teaching experiment, which focused on reasoning about bivariate data, was designed and conducted by Paul Cobb and associates at Vanderbilt University (see Cobb, McClain, & Gravemeijer, 2003). As part of the instruction, students used scatterplots to perceive and describe linear and non-linear trends in bivariate data. The interviews lasted about 1 h and followed a structured set of questions. The interviewer posed a problem that required a student to interpret bivariate data similar to the data they had been reasoning about during the teaching experiment. The interviews were videotaped and transcribed.

1.3 High school: pair interviews

Konold, Pollatsek, Well, and Gagnon (1997) interviewed two pairs of high school seniors who had just completed a yearlong statistics course at Holyoke High School in Holyoke, Massachusetts. During the interview, the students explored a data set that included a variety of information obtained from 154 students at their school, producing graphs to answer various questions. The students who were interviewed were familiar with these data as well as with the data analysis software they were using. The interviews were videotaped and transcribed.

2 Different perspectives on data

Based on an analysis of data from these three sources, we identified four general perspectives that students use in working with data. To get a quick sense of these perspectives, consider the frequency graph in Fig. 2. It displays the favorite colors of six hypothetical students, where each circle represents the color preference of a different student. Many data activities conducted in the early elementary grades involve collecting and graphing data of this sort (see, e.g., WwDC, Case 7). Students might conduct an in-class survey using a simple question (e.g., What is your favorite color?) and then consider what they learn from the data (e.g., What does this graph tell us?).

Fig. 2

Graph of the favorite colors of six hypothetical students

Figure 3 classifies examples of four types of responses students might give in summarizing the graph in Fig. 2. These include inscribing (see Konold & Lehrer, 2008) or interpreting data as:

  • pointers to the larger event from which the data came

  • case values that provide information about the value of some attribute for each individual case

  • classifiers that give information about the frequency of cases with a particular attribute value

  • an aggregate that is perceived as a unity with emergent properties such as shape and center

Fig. 3

Depiction of four different perspectives students might take in summarizing information displayed in Fig. 2. A gray scale is intended to suggest the three different colors, blue (darkest), green (lightest), and red (middle value). The figures in the column labeled “Data Structure” depict how data from the point of view of each perspective might be mentally represented. With the data as pointer perspective, there is not a clear distinction made between the data and the real-world event (hence the porous boundary). In an aggregate perspective, each type of data value occupies a well-defined space in the mental representation such that new data types (e.g., a favorite color of purple) can be quickly incorporated (blank divisions within the rectangle) and existing types can be combined to form composite events (“green” and “blue” into “not red”)

These perspectives attend to different questions that we can ask of data, and they organize the data into different fundamental units. As stated above, a statistical perspective focuses on the entire batch of data, on the characteristics of the data set as a whole. Thus the functional perceptual unit in a statistical perspective is the aggregate. In contrast, treating data as classifiers involves viewing cases with similar values as a unit (e.g., all the students whose favorite color is red). Treating data as case values involves taking the individual data elements as the perceptual unit, focusing on characteristics of individual cases. When treating data as pointers, there is no obvious perceptual unit. Rather, the data serve as reminders of the larger event from which the data came.
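The difference in perceptual unit can be sketched as three kinds of query on the same records. The sketch below uses the favorite-color example of Fig. 2; the student names and the split between blue and green are assumptions made for illustration (the text fixes only that three of the six students chose red). The pointer perspective has no counterpart here, since it does not operate on the recorded values at all.

```python
from collections import Counter

# Hypothetical records for the six students in Fig. 2. The names and the
# blue/green split are assumptions; only "three chose red" comes from the text.
favorites = {
    "Ana": "red", "Ben": "blue", "Cal": "red",
    "Dee": "green", "Eli": "blue", "Fay": "red",
}

# Data as case value: the unit is one case and its value.
case_value = favorites["Dee"]

# Data as classifier: the unit is the set of cases sharing a value.
counts = Counter(favorites.values())
red_count = counts["red"]
mode_color, mode_count = counts.most_common(1)[0]

# Data as aggregate: the unit is the whole batch, so frequencies become
# shares of the group and categories can be merged into composite events.
n = len(favorites)
proportions = {color: k / n for color, k in counts.items()}
not_red = proportions["blue"] + proportions["green"]

print(case_value, red_count, mode_color, proportions, not_red)
```

The classifier queries return counts for one category at a time, whereas the aggregate queries require the total `n`, which is what ties the separate categories into a single group.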

We regard these four perspectives as forming a loose hierarchy of sorts (from data as pointer to data as aggregate) where a higher level subsumes or encapsulates lower ones. Higher levels reorganize units in the lower level into a new perceptual unit (see Fig. 3). We should stress, however, that different contexts or questions may call for or cue different views of data even by the same student. And there are times during analysis when experts might regard data in ways that are akin to the case-value and classifier perspectives. For example, in inspecting a skewed distribution of income values, a statistician might quickly alternate between considering the approximate center and shape of the distribution and attending to individual data points in the tails (outliers), searching for more information about these individual cases. Note, however, that regarding an individual case as an outlier involves locating it with respect to the other values in the distribution and thus requires a coordination of aggregate and individual perspectives (cf. Ben-Zvi & Arcavi, 2001).

Because each of these perspectives is useful depending on the question at hand, we view them not as levels or perspectives to graduate from but rather as perspectives to coordinate and master. But as this statement implies, we frequently observe students who appear unable, for example, to perceive a batch of data as an aggregate even when a particular question or purpose seems to call for this view. Furthermore, some students seem inclined to view data from only one particular perspective. This inclination then influences, and perhaps constrains, the types of questions they ask, the plots they generate or prefer, the interpretations they give to notions such as the average, the methods they use to compare groups, and the conclusions they draw from the data. To the extent that our descriptions of these various perspectives are valid, they offer potential insights into how students are reasoning about data. They also may provide a basis for designing interventions that encourage the development and coordination of more aggregated perspectives. In the remainder of the article, we elaborate our description of these perspectives and illustrate them with statements and graphs made by students. In using work from a particular student as an example of reasoning from a perspective, we will not assume that that student was incapable of applying a higher-level view. The student work we had access to did not permit this type of analysis because in general it did not include examples of a particular student’s responses over a range of problems. But more to the point, we view the framework we propose not as a tool for classifying students into different reasoning types, but rather as a window on how students might be reasoning when they offer a particular interpretation of data at a specific point in time. 
In analyzing these relatively isolated statements, we asked ourselves, “How is this student currently regarding these data such that this student’s statements and actions in this context make sense to him or her?”

2.1 Data as pointer

In this perspective, the data and the event from which they came are not clearly differentiated. We see this perspective mostly among very young students who have collected data themselves. In these instances, students treat data records much as they might a photograph of last summer’s vacation—as an image that helps bring to mind whatever was salient about that event. The photo might remind one person of relaxing on the beach, while for another it might bring to mind the terrible sunburn she got. When students view data as a pointer, the data represent the whole event from which the data were generated. This orientation is reminiscent of Vygotsky’s (1978) account of how young children, when asked to draw an object placed in front of them, will depict “not what they see but what they know” (p. 112). Watson (2009) reports similar findings in her study of students’ understanding of variability and expectation in distributions. Given information about average daily temperatures in their city and asked to draw a graph, many younger students drew pictures depicting prototypical seasonal scenes such as trees blowing in the wind. Using a SOLO model, Watson characterized these as “idiosyncratic” (prestructural) responses.

In Case 7 of WwDC, a kindergarten class produced a representation like the one in Fig. 4 showing the frequency of the favorite colors of class members (p. 37). The students in the class came from backgrounds representing eight different language groups, and this data activity followed one in which students had learned the Chinese names for some colors.

Fig. 4

Favorite colors of students in a kindergarten class

As the teacher recorded new values on the board, many of the students focused on numerical information in the display, making comments such as “That’s two for blue,” or “Only one person likes black.” The next day, when the teacher asked, “What does this chart tell us?”, it appears that for many of the students the graph was now a more general pointer to the entire event of the previous day. Some of the students’ responses included:

  • My shirt is blue.

  • We know everyone’s name.

  • We learned English and Chinese colors.

When reasoning from the data-as-pointer perspective, students often mention things not explicitly represented in the display they are interpreting. In this respect, there is no specific perceptual unit they are focusing on (as indicated in Fig. 3 by the question mark). Our interpretation is that these students were not regarding the display of data as a model of aspects of an event. Rather they were viewing the graph as a general rendering of that event. This orientation fits well with young students’ tendencies to construct iconic data representations in which they depict considerable detail about the events they observe.

In turning observations into data, we of necessity narrow our focus, encoding only selected information. A biologist studying the health of a salmon population may record only the weight and length of each sampled fish, ignoring a host of other information about each individual that potentially could be recorded (e.g., see Roth, 2005). For students viewing data as a pointer, the data records do not restrict the view to attributes of interest but rather serve to trigger a host of potential memories or associated facts related to the observed event.

2.2 Data as case value

This perspective entails associating a value with an individual case. Unlike using data as a pointer to an entire event or context, this perspective acknowledges that the data encode only particular aspects of an event. The perceptual unit in this perspective is the individual case, for example, a single person in a survey of students’ favorite colors (see Fig. 4). This perspective is often evident in young children who tend to focus on the identity of each individual piece of data, especially data values belonging to them. For example, one of the students, in interpreting the graph in Fig. 4, responded, “My favorite color is red” (WwDC, p. 38).

Prototypical questions based on this perceptual unit include determining the value of particular cases (“How tall is Henry?”) or the case identity of salient values (“Who is tallest?” “Who is shortest?”). Figure 5 shows a representation similar to one that a class of third and fourth graders posted on the board (WwDC, Case 12). The graph shows how long each of their families had lived in town. To summarize the data, one pair of students wrote “The longest someone lived in our town is 37 years. The shortest time…is 0 years.”

Fig. 5

Stacked “dot plot” of the number of years 23 students’ families had lived in their town. Focusing on individual cases, we may note that the longest a family has lived in town is 37 years (data as case value). Observing that four families have lived in town for 3 years, we shift the perceptual unit to cases of the same value (data as classifier). To perceive that about half of these families have lived in town less than 10 years entails viewing the entire data set as the unit (data as aggregate)

Ordering the data, which students often do spontaneously, makes it easy to locate these extreme values. The fact that the data values were ordered in Fig. 5 perhaps made the extreme values more salient.

When asked to graph numeric data, younger students frequently make the kind of plot shown in Fig. 6. In this “case-value” plot made by fifth graders to show the tail lengths of 24 cats, each case is represented by a bar whose length corresponds to the value of that attribute. Note that this type of display leaves ample space for labeling each case with its case identifier, here the cat’s name.

Fig. 6

A portion of a graph made by two fifth graders at Fort River Elementary School, Amherst, MA. The entire graph spanned 3 pages. The students represented each of 24 cats with a bar whose height corresponded to the length (in inches) of the cat’s tail. The bars are labeled with abbreviated forms of the cats’ names and include tail length at the top. Students were graphing different attributes of the cats, including body length, weight, and tail length, to use as a basis for making recommendations to a make-believe business that wanted to design one-size-fits-all cat clothing

Operating from a case-value perspective, students use a graph much as they might a phone book—to locate a particular case and read off its value. The case-value plot in Fig. 6 makes this particularly easy. Indeed, case-value plots are perhaps the most common type of graph in newspapers where the bars are often ordered alphabetically by case name. Wainer (2001) pointed out the limitation of this convention for ordering the bars, noting that

Since we are almost never interested in seeing Alabama first, it is astonishing how often data displays are prepared in which alphabetical order is the organizing principle of choice. The only reason I can think of is that …[it] is easy and obvious. (p. 43)

However, if one’s purpose is to locate a value for a particular case (e.g., Alabama’s per capita student expenditure), then this ordering principle makes good sense. Many students new to the study of data analysis appear to believe that associating a case with its value is what graphs are used for, and thus may make or prefer data representations suited to this purpose. Indeed, a majority of the items we found on high-stakes tests that purported to assess statistical reasoning test this skill (Konold & Khalil, 2003). The students who made the graph in Fig. 6 later commented that it was not very good for showing what a “typical” tail length was. This realization indicates how students can move between viewpoints, perhaps as a result of being asked questions that require different views.

We see the viewpoint about the primacy of the case-value perspective expressed by Val, an eighth grade student interviewed at the Nashville site. The interviewer had described a study designed to investigate how time spent brushing teeth affects plaque levels. After Val had read a short description of how and what data were collected, the interviewer asked:

I: How do you think the researchers organized the data?

Val: They probably organized it in a graph because most adults like graphs instead of charts. But I would probably do it in a chart just so it’d be easier to read, so you can take a specific person and know just about that person without having to know about all the other people, and you can compare them really easily.

By “chart” we think Val meant a table of values. As Val implied, this focus on individual values allows reading off individual cases and easy comparison of cases.

The interviewer then gave Val several representations of the data which showed the brushing time and plaque level after brushing for 48 people. The statistics unit Val had just completed had focused exclusively on how to use scatterplots to reason about these types of data, and a scatterplot was one of the options offered her (see Fig. 7).

Fig. 7

Scatterplot showing the relationship between time (in seconds) spent brushing and percent of plaque remaining on teeth for a sample of 48 people. These hypothetical data were used in a post-instruction interview of eighth grade students who were participating in the teaching experiment at the Nashville site (Cobb, McClain, & Gravemeijer, 2003)

But she, along with several of the students who were interviewed, preferred the representation in Table 1.

Table 1 Part of the table of values of the tooth brushing data used in the interviews at the Nashville site

One advantage Val saw in this tabular display is that she could “see the actual numbers” and “look down and find out which person has the least amount of plaque really easily.” As she later explained, the display in Table 1 allows identification of exact values.

Val: … because no matter if you gave it to me or somebody else, they would always see that…[Angela] would have 21 [seconds brushing] and 61 [percent plaque]. They would know those numbers, and those numbers wouldn’t change no matter who you gave it to. But if you gave somebody just like this [Fig. 7], I could say that …he had 30 something, and another person would say that he had 40 something, because there’s no marks, no divisions.

In summary, students who use the case-value perspective do see data as a model or simplification of a real-world event. The perspective focuses, however, not on features of the data set as a whole but on attribute values of individual cases. When taking this perspective, students prefer data representations that make it easy to identify and rank individual cases and to read off values accurately. This perspective is well suited to the goal of determining where a single case of interest falls in a distribution (“How do I compare with other students in my class?”). It is also a perspective that fits with the apparent objectives of many current curricula and assessment instruments (Konold & Khalil, 2003), which aim to develop and test students’ abilities to decode basic graphic elements, what Curcio (1987) has called “reading the data.”

2.3 Data as classifier

Viewing data as classifiers entails combining individual cases of the same, or similar, value into a new unit—a category or type. In our example in Fig. 2, we can regard all of the students whose favorite color is red as a group, noting for example that there are three such students, or that in this sample red was selected more than any other color.

Clustering cases together to quantify them requires putting aside the fact that those cases differ in other respects—these three students may be different genders, ages, and heights. Questions that come to the fore in this perspective concern the frequency of cases of a particular value (e.g., How many students like red best? What is the most popular favorite color?). Particularly salient in the data-as-classifier perspective is the value with the most cases—the mode. Consistent with the observations of many elementary teachers, a teacher in the WwDC (p. 64) described her third and fourth graders as “heading straight for the mode,” regarding it not so much as a summary of group performance but rather as the “winning” outcome: “It felt to me like they were engaged in a race-to-the-top kind of board game — whichever value has the most xs at the end is the winner” (WwDC, p. 100).

The WwDC contains many examples of students employing this perspective. For example, students in a fourth grade class, who had little previous experience working with data, conducted a survey about the number of people in their families (Case 3, p. 12). Three students made the stacked dot plot in Fig. 8.

Fig. 8

Graph showing the family sizes of each student in the class, with written summaries below the graph, WwDC, p. 15. The fourth graders who made this graph used stick figures to indicate cases (a student/family) and a zero to clearly indicate values with zero frequencies. Gender is indicated with long hair (girls) and baseball caps (boys)

On their plot they wrote, “Most people in 5 and 6 have a lot. Most people in 9, 11, 12, 18, 7, 4 have a less.” In making these observations, these students attended not to the values of single cases, but rather to the frequencies of occurrence of types of values. Furthermore, their treatment of these categories (families with 11 members, families with 7 members) suggests that they were regarding them as nominal rather than numeric in nature as indicated by their unordered list of values that have “less.” When the teacher asked them to explain their graph, Jacob said, “Well three kids in the class have 5 people in their family and five kids have 6. That’s more than the other numbers.” The teacher then asked, “What about the numbers that you say have less?” Tyrone responded, “At all of those places, there are only one or two kids with that many people in their family.” All of their responses indicated that they were focused on the frequencies of same-valued data types. They gave no evidence of attending to aggregate features such as the overall shape of the distribution or to relative densities of values (e.g., most of the families are bunched up on the lower end).

The Holyoke High School students who were interviewed in the study by Konold et al. (1997) were remarkably consistent in viewing the data about students in their school from the classifier perspective. For example, two students, whom we will refer to as R and P, wanted to produce a display that summarized grade levels of students in the data set. Working with the data analysis software they had been using during the course, they first produced a table of descriptive statistics, which included the median and mean grades. They were not satisfied with this table.

R: No, that doesn’t give us what we want. Isn’t there one [a display type] that give us…

I: [Interviewer] What do you want to get?

R: We want to separate them, you know like how many are of 12, how many of 8, how many of 11? How many… [the display in Table 2 appeared]. Yeah, there we go.

Table 2 Frequency table showing the number (and proportion) of surveyed students at grades 8–12

On this as well as most occasions during the interview, these students searched among the program’s menu options for a workable display. Regardless of their question or the type of variables involved, the display they finally settled on was almost always a frequency table. Note that the moment Table 2 appeared on screen, R sensed they had what they wanted.

Asked to summarize the information in Table 2, R read off all of the frequencies from the table.

R: That says we have one person from 8th grade, 10 from 9th grade, 6 from 10th grade, 31 from 11th grade and most of them are from 12th grade, which is 106 people.

Their attention seemed clearly focused on the frequency of various data types, in this case grade levels. During the yearlong statistics course, students had practiced making and interpreting a number of data displays, including histograms, box plots, and scatterplots. However, during the interview these students used frequency tables to answer nearly every question they investigated. They preferred a display that permitted reading of exact values, as we saw with the case-value perspective, and easy identification and counting of responses of a particular type—treating data as classifiers.
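The contrast between reading off raw frequencies and treating them as shares of one group can be illustrated with the counts R reported. In the sketch below, the counts come from R's statement above; the rounding to three decimals and the grouping of grades 11 and 12 are our choices for illustration, not something the students did, and the proportions are computed here rather than copied from Table 2.

```python
# Counts by grade level as R read them off the frequency table.
counts = {8: 1, 9: 10, 10: 6, 11: 31, 12: 106}

# Classifier view: each grade level is its own tally.
largest_grade = max(counts, key=counts.get)

# Aggregate view: the tallies are parts of a single group of students.
total = sum(counts.values())
proportions = {grade: round(k / total, 3) for grade, k in counts.items()}

# A composite category (grades 11 and 12 together), expressed as a share.
upper_grades_share = round((counts[11] + counts[12]) / total, 3)

print(total, largest_grade, proportions, upper_grades_share)
```

The total of 154 matches the number of students in the data set, and dividing by it is the step that turns a list of separate tallies into a description of the group as a whole.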

In addition, students often attempted to extract this same information from displays in which it is not available. For example, in the process of exploring a question about curfew and study time, R and P produced the box plot display shown in Fig. 9.

Fig. 9

Box plots of weekly homework hours for a sample of Holyoke High School students with (yes) and without (no) curfews. Students who are viewing data as either case values or classifiers find this type of display difficult to interpret

P: That doesn’t help.

R: I know [Laughing].

I: […] Why doesn’t that help?

P: Because, it’s, like, confusing. We don’t know what the hours are, how many hours…

R: […] Like how many students studied 10 h on “no” [curfew], and how many students studied 10 h on “yes” [curfew].

In summary, when viewing data as classifiers, students focus on the frequency of like-valued cases to answer such questions as what outcome type is the most frequent or to compare frequencies of various outcome types. Students taking this perspective choose data displays that make it easy for them to accurately read both case values and their frequencies. When taking this perspective, students tend to view different response categories as independent of one another and thus do not attend to how frequency may be changing systematically over outcome type (i.e., to how frequency distributions are shaped). And because they do not think about the collection of categories as constituting a unity, they tend to reason about “raw” frequencies rather than relative frequencies of category types.
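The contrast between raw and relative frequencies can be illustrated with the grade counts R read from Table 2. The following is a minimal Python sketch and not part of the original study. The variable names are illustrative, and the proportions are computed here from R's reported counts rather than copied from the table.

```python
# Grade-level counts as R read them from Table 2 (grade -> number of students).
grade_counts = {8: 1, 9: 10, 10: 6, 11: 31, 12: 106}

# Classifier view: compare raw frequencies of separate categories.
most_frequent_grade = max(grade_counts, key=grade_counts.get)

# Aggregate view: treat the categories as parts of one whole.
total = sum(grade_counts.values())
relative = {grade: count / total for grade, count in grade_counts.items()}

print(f"Most frequent grade: {most_frequent_grade}")
for grade, share in relative.items():
    print(f"Grade {grade}: {grade_counts[grade]:3d} students, {share:.1%} of {total}")
```

The first computation needs only the separate counts. The second requires the total, that is, treating the five categories as one collection.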

2.4 Conflicts between case-value and classifier perspectives

As they share and discuss their work in the classroom, we frequently see students working to distinguish between the case-value and classifier perspectives. In the excerpt below (from WwDC Case 9, p. 47), two kindergarten students were looking at a representation of their responses to the question, “Do you like to work on a computer?” (See Fig. 10). The chart had emerged as each student clipped a clothespin on one of the possible responses to the question. The teacher asked:

Fig. 10

Results of a survey in a kindergarten class to the question “Do you like to work on a computer?” Each student responded to the question by attaching a clothespin to the appropriate chart area. The elaborated responses were invented by students who felt that their view was not adequately captured by a “yes” or “no”

Teacher: So do you think someone else could tell something about us from our survey?

Rhea: Yes, they know most of us like computers a lot.

Amanda: Not me, I said no.

Melinda: Me either, I said I never played before.

Rhea: I said most of us!

By her response we conclude that Rhea was attending to the large number of “Super-Dupers.” Many younger students use the term “most” to mean “more than any other.” Thus we conclude from her response that she was working from the perspective of data as classifier. Amanda and Melinda protested that Rhea’s summary did not take into account their particular responses. Taking a view of data as case value, they were attending to their individual data values. From this perspective, they apparently interpreted Rhea’s assertion as ignoring their contributions to the plot. Rhea responded with some frustration, emphasizing that she had said “most,” which from her perspective takes into account the fact that not all the students like computers a lot.

With attributes coded as integers (e.g., family size), students can find it particularly challenging to translate between the case-value and classifier perspectives. Because in this instance both value and frequency are integers, it becomes easy to confuse the value of a case (a family of 4) with the number of cases of that type (5 families with 4 members). We see this confusion at play in a discussion among three fourth grade students who were describing to one another the graphs they each had made of the size of the families of their classmates (WwDC Case 14, p. 83). Kenny had made a case-value plot like the one shown in Fig. 11 using a bar (composed of individual xs) to represent the size of each of the 12 students’ families. Thus the length of each bar corresponded to the number of members in a particular student’s family. His partners, Cara and Tim, had both plotted the same data using stacked dot plots. A stack of xs in each of their graphs represented a given number of families of a particular size. Kenny tried to explain his plot to his partners.

Fig. 11

Kenny’s case-value plot showing the size of 12 families (family #3 has 5 members). The x axis probably indicates the order in which Kenny obtained the data

Kenny: The first family had 12 [members], the second family had 8, and the third family had 5. I put Xs for the number of people in the family.

Tim: That looks like too many.

Cara: It is. You have to put the amount of people [family size] under the line.

Tim: How many families goes on top of the line.

Kenny: Huh?

As their teacher observed, Cara and Tim saw Kenny’s plot not as an alternative way to display the data, but as a mistake to correct. Interpreting the xs in Kenny’s plot as standing for each student in the class, Tim saw “too many” students (families). Likewise Kenny was unable during this exchange to understand the objections of his partners.

The potential for confusing case-value and frequency graphs of integer data is heightened by the fact that traditionally both have been referred to as “bar graphs” (see Konold & Higgins, 2003, p. 200). Making the difference between these two plot types an explicit part of instruction could help students distinguish them. Furthermore, Cobb (1999) claims that case-value plots are easier for students to interpret and that by first introducing students to them we can provide a suitable grounding for frequency plots (see Konold & Higgins, 2003, pp. 200–201, for an illustration of the sequence of graphs recommended by Cobb, 1999).
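The difference between the two plot types can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of the students' graphs: only the first three family sizes (12, 8, 5) come from Kenny's description, the other nine values are invented, and the function names are hypothetical.

```python
from collections import Counter

# First three sizes come from Kenny's description; the remaining nine are
# invented for illustration and are NOT the classroom data.
family_sizes = [12, 8, 5, 4, 4, 6, 3, 5, 4, 7, 5, 4]


def case_value_rows(sizes):
    """One row per family: the number of xs is that family's size."""
    return [f"family {i:2d} | " + "x" * size for i, size in enumerate(sizes, start=1)]


def frequency_rows(sizes):
    """One row per family size: the number of xs is how many families have it."""
    counts = Counter(sizes)
    return [f"size {size:2d} | " + "x" * counts[size] for size in sorted(counts)]


print("\n".join(case_value_rows(family_sizes)))
print()
print("\n".join(frequency_rows(family_sizes)))
```

In the first rendering each x stands for a family member, and in the second each x stands for a family. The same 12 families therefore produce many more marks in the case-value plot, which is consistent with Tim's impression of “too many.”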

2.5 Data as aggregate

When viewing data as an aggregate, the perceptual unit is the entire distribution of values. In focusing on the distribution, one attends to emergent features not evident in any of the individual data values. These features include the general shape of the frequency distribution, how spread out the cases are, and where in the distribution cases tend to be concentrated. Using aggregate reasoning, one might describe the relative number of cases in various parts of the distribution, using either percentages or quantitative descriptors such as “majority.” With numeric attributes, one might summarize group features with measures of center, of spread, or of shape. For students applying this perspective, questions that come to the fore include whether two or more groups differ on some aggregate measure or whether two attributes are related to one another.

In the third column of Fig. 3, we depict the data structure of an aggregate perspective not only as bounded, but also as internally structured. Each different data type occupies a well-defined space in the mental representation. Effectively, the different color types are joined together in a unity, a dimension called color which comprises a number of mutually exclusive values. In this sense, a classification system is more than a device for placing name tags on cases (Bowker & Star, 1999). It is a system for dimensionalizing them — placing values in relation to one another. It is this structure that allows us quickly to add a case we have not yet observed (e.g., a favorite color of purple) and to make quantitative comparisons between various case types, to say for example that half of the students have red as a favorite color, or that “the majority of families have lived in town less than 10 years” (see Fig. 5). To put these objects into such a relation to one another requires that we regard them as part of a homogeneous group despite the fact that they differ not only with regard to other attributes that we are ignoring (boys who are all 6 ft tall nevertheless differ in other respects), but also with respect to the common attribute we are currently examining (individuals vary on height).

A few students in the interviews at the Nashville site appeared to use aggregate reasoning. For example, in analyzing the relationship between calories and energy level, Susanne said that she preferred a scatterplot display to the other displays because it offered a “general view” as compared to the “hardcore facts” of the ordered table of values. Her interpretation of the scatterplot included a description of the trend for the full range of the data (an aggregate feature) and a statement relating that trend to the problem context:

Susanne: I would use …[Fig. 12] because, you know, it shows around the middle part, where the calories are, like, steady at that spot. It shows that the people who are in the area are like up [at high energy] while the people who are extremely low in calories or extremely high vary, and were like, decreasing [in energy level].

Fig. 12

Scatterplot showing the relationship between calories consumed and self-reports of energy level for 50 football players. These hypothetical data were used in a post instruction interview of eighth grade students who were participating in the teaching experiment at the Nashville site (Cobb, McClain, & Gravemeijer, 2003)

She not only described the general trend, but noted differences she perceived in variability around that trend, describing the “middle part” of the scatterplot as “steady” compared to the extremes.

On the whole, however, instances of aggregate reasoning are relatively rare in the three data sources of this study. This observation accords with reports from numerous researchers concerning the challenges students face in learning to reason about statistical aggregates (e.g., Bakker & Gravemeijer, 2004; Ben-Zvi & Amir, 2005; Ben-Zvi & Arcavi, 2001; Hancock et al., 1992; Konold & Higgins, 2003; Konold et al., 1997; Mokros & Russell, 1995). In the WwDC, we do see evidence that teachers are aware of the fact that their students are not yet perceiving or talking about aggregates, and are trying to help promote that perspective. For example, in the fourth grade discussion of family size referred to earlier (WwDC, Case 3), one group created a graph similar to the one shown in Fig. 8. The teacher reflected, “Their line plot looked very similar to that of the previous group [see Fig. 8] … However, I noticed that they … had used terms such as typical … and range.” Accordingly, she asked:

Teacher: I noticed that you wrote 5 or 6 were typical numbers of family members for this class. Can someone say something about why you chose these numbers?

Inez: There were more people with 5 or 6 in their families than the other numbers.

This last statement may have meant “More than half of the families have 5-6,” which would be an aggregate statement about the collection as a whole. But consistent with a classifier perspective, she may have meant, “Families with 5 or 6 are more frequent than families of other specific sizes.” After making sure students were aware of how many total data values there were, the teacher probed in a way that led another student to offer an aggregate summary:

Teacher: How many students in this class had either 5 or 6 people in their family? (I was met with many responses of 7.) If 7 out of 14 students have 5 or 6 family members, how else can we say that? Is there a fraction we could use to describe this?

Denise: I know! One half of the class has 5 or 6.

In a similar episode (WwDC Case 12, pp. 63–68), a teacher helped her combined third/fourth grade class to take an aggregate perspective. She did this by building on students’ tendencies to identify modal clumps (Konold et al., 2002) and by encouraging students to consider what they meant by phrases such as “a lot” and “most.” The teacher arranged the students in pairs and asked them to write a description of the data displayed in Fig. 5. She heard Anna say that 3 years had “a lot.”

Teacher: What do you mean by “a lot”?

Anna: It’s the only one that has four, and all the rest have less than that.

Teacher: Does that mean that it has a lot?

Teacher: [Reflecting]. This seemed to make her stop and think, and I decided I wanted to push her a bit.

Teacher: How many xs are up there altogether?

Anna: 23.

Teacher: How many of them are at 3?

Anna: Oh, it’s only four. That’s not really a lot, is it?

In the class discussion that followed, Anna explained to the class how even though 3 years did have the most families, that was not “a lot” when compared to the total number of families. The following exchange occurred when the teacher again asked students to summarize the data. Anna was now able to apply the same reasoning she had used in evaluating “a lot” to provide a more precise meaning for “most.”

Kevin: Most of the xs are between 0 and 6.

Teacher: How did you decide that?

Kevin: It’s the biggest clump.

Teacher: How many xs are in that clump?

Anna: There’s eleven. That’s almost half.

Teacher: Almost half of what?

Anna: Well there’s 23 altogether, and it’s almost half of that.…

In both instances, the teacher apparently saw an opportunity to support students in taking what they noticed about individual values and relating it to the data set as a whole. Part of what prompted these learning episodes was the teachers’ being sensitive to, and inquisitive about, the meanings that their students attached to words such as “most.”
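The part-to-whole comparisons the teachers prompted in these two episodes can be written out explicitly. The Python sketch below uses only the counts reported in the exchanges above. The helper name `share` and the variable names are illustrative, and reading “most” as “more than half” is an assumption made for the example.

```python
from fractions import Fraction


def share(part, whole):
    """Return part/whole as an exact fraction of the whole data set."""
    return Fraction(part, whole)


# Counts taken from the two classroom episodes above.
typical_family = share(7, 14)   # students with 5 or 6 family members
anna_a_lot = share(4, 23)       # xs at 3 years
kevin_clump = share(11, 23)     # xs in the clump Kevin called "most"

for label, value in [("5 or 6 members", typical_family),
                     ("3 years", anna_a_lot),
                     ("biggest clump", kevin_clump)]:
    is_majority = value > Fraction(1, 2)
    print(f"{label}: {value} = {float(value):.0%}, majority: {is_majority}")
```

None of the three subsets exceeds one half of its data set, so under this reading none of them is literally “most.” This is the conclusion Anna reached once she related the counts to the total.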

Statisticians use averages such as the mean and median as one way to characterize a distribution of values, to locate the approximate center of the distribution. Even before they take a unit or course in statistics, young students have already encountered statistical terms such as “average” and have incorporated them into their everyday discourse (Gal, Rothschild, & Wagner, 1990; Watson & Moritz, 2000). It is important that the teacher does not assume, however, that students who are using the terminology of aggregates are perceiving and describing aggregate features. As an example, we show below how students apply the terminology of averages in ways that are more consistent with case-value or classifier perspectives.

2.6 Averages from the viewpoint of the case-value and classifier perspectives

In the WwDC Case 22, a third grade teacher asked her class, “What would you say is the average height of kids in our room?” Brita answered, “It’s me. I think I am average.” To further explore this question, the teacher had the students line up according to their height. Looking at this physical graph, other students used the term average in the same way Brita had—as a feature of particular individuals in the line-up.

Phoebe: I think I’m taller than average because I notice that on the playground.

Brita: I was right. Sam is average, and I’m average too. We are the same.

Tiffany: I’m average too.

Katie: I’m not average. I’m shorter.

We see this adjectival usage of average as consistent with the case-value perspective. To say that Sam is of “average” height is to characterize this single case, and not necessarily the group of which the case is a part. Certainly to make this attribution, one must attend to where Sam is located in the distribution of heights. In the same way, to identify a particular case as the “smallest” or “largest” requires locating that case with respect to the other cases in the group. Having the students line up in order makes it relatively easy to locate these salient cases. However, it seems clear that the aim of making these attributions—“of average height,” “shorter than average”—is not to characterize the whole data set but rather to assign another feature, albeit a comparative one, to a single case. To Sam’s list of features (“male,” “3rd grader,” “42 in. tall”) is added yet another—“of average height.” In the same way, Katie applies to herself the attribute “shorter.”

Watson and Moritz (2000) interviewed Australian students in grades 3 to 9 to explore their understanding of averages. Many of these students seemed to hold this perspective on what averages were. For example, a grade three student who was asked “Where have you heard ‘average’?” responded, “My stepbrother plays cricket, and he usually gets an average amount of runs.” A grade seven student who was asked the same question responded “Average is like … the middle standard…. So, if you said the television program was average, it means that it wasn’t very good and it wasn’t very bad.”

Given this case-value perspective of averages, it makes sense that some students reject average values that do not correspond to the value of at least one case. For example, in WwDC Case 25 (p. 148) third grader Robbie had made two attempts at blowing a Styrofoam cylinder as far as he could, recording the values 152 and 186 cm. The class was discussing whether he should use the average of these two values (169 cm) to represent how far he could blow the cylinder. Robbie would have none of this. “But I didn’t get 169 as one of my distances,” he objected. “It’s a lie!”
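Robbie's objection can be restated numerically: the mean of his two attempts is a value he never recorded. A minimal Python sketch, with illustrative variable names, makes the point.

```python
from statistics import mean

# Robbie's two recorded distances, in centimeters.
distances_cm = [152, 186]

average_cm = mean(distances_cm)
is_observed = average_cm in distances_cm

print(f"Average: {average_cm} cm; matches a recorded attempt: {is_observed}")
```

From a case-value perspective the check on `is_observed` is what matters. From an aggregate perspective the mean summarizes both attempts whether or not it coincides with either one.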

Another commonly accepted view of the average is consistent with the classifier perspective. This is the view of the average as a characteristic of either the most frequently occurring cases (the mode) or of a subset of cases in the middle of a distribution, what Konold et al. (2002) have termed a “modal clump.” We think this is the perspective that predisposes many young students to focus in particular on the mode, considering it “the end-all way to describe what’s typical in a set of data” (WwDC Case 12). Here, the attribution of “average” or “typical” is applied to a subset of cases with the same, or close to the same, value. Several researchers (e.g., Hammerman & Rubin, 2004; Makar & Confrey, 2004) have reported students’ tendencies to regard numeric data as comprising three groups: middle, low, and high. A student in the Bakker and Gravemeijer (2004) study, for example, summarized a graph of student heights with the observation that “You have smaller ones, taller ones, and about average” (p. 159). Partitioning a distribution in this way sets the stage for applying the attributes “low,” “high,” and “average” to the cases in the corresponding partitions.

In both the classifier and case-value interpretations of average, the term “average” is used as an adjective to describe a characteristic of a particular case or subset of cases. “This person (group of people) who is (are) 65 in. tall has the characteristic ‘average.’” The people are average, of course, due to their central location in the distribution and/or because their value is the most commonly occurring. In both instances, students are using “average” as a label which they apply to one or perhaps several of the cases in the group, but certainly not to the entire group: some people are “average,” others are below or above average or even “outliers.”

In the aggregate sense, average is used as a noun. The claim is “This group has an average of x.” In this sense, average is not a label applied to a single case or subset but rather is a measure that applies to the entire group. The whole group is not average but rather has a particular average value. (See Bakker, 2004 and Bakker, 2007 for a similar analysis of both adjectival and noun notions of averages and spreads.) Because an average in this sense applies to the whole group, we can use it to represent the group and to compare one group to another. It is interesting in this regard to note that many researchers have reported that even though many students know how to compute and use averages with single groups, few of them use averages to compare two groups (Bright & Friel, 1998; Hammerman & Rubin, 2004; Hancock et al., 1992; Jones et al., 1999; Konold et al., 1997; Watson & Moritz, 1999). Our analysis suggests a possible explanation for this result, which is similar to an argument made in Konold and Pollatsek (2002). If it is the case that some students’ averages are descriptors of only one or a subset of cases in a group, then it would make sense that they would not use these “averages” to make a comparison between two groups.

3 Summary

Given that statistics is fundamentally about the behavior of aggregates, the overarching question in statistics education is how we can foster students’ abilities to perceive and reason about aggregates. The question of how to do this with younger students has been taken up by a number of researchers including Bakker (2004), Bakker and Gravemeijer (2004), Cobb (1999), Cobb et al. (2003), English (2012), Konold and Pollatsek (2002), Lehrer and Schauble (2004), and Lehrer, Schauble, Carpenter, and Penner (2000). In this article, we have not focused on this question per se but rather have proposed a number of perspectives, in addition to viewing data as an aggregate, that we see students using as they reason about data. Our hope is that by differentiating these perspectives and suggesting what purposes they serve, we will be better able to understand and help shape the statistical reasoning of our students.

The fact that we so rarely see elementary students using the aggregate perspective is perhaps not surprising given what we know about the historical development and application of the idea. Stigler (1999), for example, claims that coming to regard different observations as belonging to a single group historically proved to be a major barrier to the wide-scale application of statistical methods to both the social and physical sciences:

The first conceptual barrier in the application of probability and statistical methods in the physical sciences had been the combination of observations; so it was with the social sciences. Before a set of observations, be they sightings of a star, readings on a pressure gauge, or price ratios, could be combined to produce a single number, they had to be grouped together as homogeneous, or their individual identities could not be submerged in the overall result without loss of information. This proved to be particularly difficult in the social sciences, where each observation brought with it a distinctive case history, an individuality that set it apart in a way that star sightings or pressure readings were not. … If it were felt necessary to take all (or even many) of these [distinct case histories] into account, the reliability of the combined result collapsed and it became a mere curiosity, carrying no weight in intellectual discourse. Others had combined …[the individual cases], but they had not succeeded in investing the result with authority. (pp. 73–74).

One might conclude from our article that treating data as an aggregate is beyond the abilities of younger students. It is important to keep in mind, however, that sustained efforts to teach statistics in the elementary grades are still comparatively recent, and that we still have much to learn about how to engender statistical reasoning. At the same time, we reiterate that we do not regard an aggregate perspective as the way to look at data. It would be a serious mistake for educators to see moving students quickly past interpreting data as pointers, case values, and classifiers as the major instructional goal. As we mentioned, many of the questions young students have about data involve locating themselves within distributions, identifying the highest and lowest values and the most frequently occurring outcome. Because of this, the case-value and classifier perspectives are well suited to the interests of many young students. They serve well the purposes students have in mind when they collect and analyze data. And we maintain that choices about how to collect and analyze data should be based on one’s questions and purposes. The perspectives one takes on data should serve one’s questions rather than the other way around. Furthermore, the case-value and classifier perspectives highlight important features of data that are essential components of an aggregate perspective. A case-value perspective, for example, makes salient the fact that individuals vary, an awareness that is too often neglected in both our curriculum (Biehler, 1994; Shaughnessy, Moritz, & Reading, 1999) and our public discourse (Gould, 1996). These perspectives are the critical building blocks of an aggregate perspective. Supporting this claim, Ben-Zvi (2004), in his study of seventh graders learning to interpret time-series graphs, observed that the initial focus on cases “sometimes restrained the students from seeing globally, but in other occasions it served as a basis upon which the students started to see globally” (p. 140).

As Lehrer and Schauble (2004) point out, by focusing on case values, students stay connected to the meaning of the data, to what the graph is about and to the information it encodes. In fact, many of the difficulties students can experience in interpreting graphs result from them losing this meaning. Focusing on the values of individual cases is a way to establish and maintain this connection.