1 Introduction

Generalizing has rightly received much attention as a core aspect of algebraic reasoning (e.g., Cooper & Warren, 2011; Kieran, 2007; Mason, 1996). Symbolizing a generalization, however, is no less significant (Kaput, 2008). The act of symbolizing derives its importance as the means by which one compresses multiple instances into the unitary form of a single statement that symbolizes the multiplicity (Kaput, Blanton, & Moreno, 2008). From this perspective, generalizing has been described as the “act of creating that symbolic object” (p. 20). To be clear, the symbolic object might be represented through either conventional or non-conventional forms. Indeed, it is possible—and productive—to symbolize and reason symbolically through non-conventional forms such as one’s natural language (Radford, 2011). However, few would disagree that algebraic reasoning ultimately involves reasoning with perhaps the most ubiquitous cultural artifact of algebra—the conventional symbol system based on variable notation (Kaput, 2008; Kline, 1972).Footnote 1 It is reasonable, then, to consider how students come to understand and use this critical tool to reason algebraically.

1.1 Perspectives on the role of variable and variable notation in school mathematics

Re-conceptualizing algebra as a Kindergarten–Grade 12 (K–12) strand of thinking in the United States (US) (e.g., National Governors Association Center for Best Practices [NGA] and Council of Chief State School Officers [CCSSO], 2010) has led to introducing variable and variable notation in grades earlier than previously thought possible.Footnote 2 However, there are different perspectives on when and how to do this. Part of the hesitation in introducing this notation prior to middle grades lies, perhaps, in strict interpretations of Piaget’s formal stages of development and the concern that premature formalisms (Piaget, 1964) might lead students to meaningless actions on symbols. Instead, some scholars emphasize that younger students should use those non-conventional systems whose meanings are already available to them—particularly natural language and drawings—to symbolize variables (Radford, 2011; Resnick, 1982).

The concern of introducing premature formalisms gains some traction from the well-documented struggles adolescents have with the concept of variable and the use of variable notation (e.g., Knuth, Alibali, McNeil, Weinberg, & Stephens, 2011; Küchemann, 1981). One might extrapolate from this that these difficulties would only intensify with younger children. However, recent research raises questions as to whether the struggles adolescents face are as tightly connected to age as previously believed. Some researchers have pointed out that, by middle grades, students have already developed ways of thinking about letters in linguistic contexts that are far removed from algebra—and mathematics. Yet, they are then expected to build a new understanding of these literal symbols as a way to notate variable quantities in algebra (e.g., Braddon, Hall, & Taylor, 1993). This suggests that difficulties with variable notation may be less about age and the premature use of formalisms and more about conflicts the notation creates with the experiences and understandings students already have with the use of literal symbols in non-mathematical contexts (McNeil, Weinberg, Hattikudar, Stephens, Asquith, Knuth, & Alibali, 2010). Furthermore, others have found that well-designed curricular materials for middle grades, along with good instruction, can help students develop conceptions of literal symbols as representing variables, where those conceptions do not necessarily exhibit commonly observed difficulties such as object/quantity confusion (Knuth, Alibali, McNeil, Weinberg, & Stephens, 2005; MacGregor & Stacey, 1997; McNeil et al., 2010). This, in fact, is well aligned with Arcavi’s (2005) call for the nurturing of symbol sense in school algebra—characterized as similar to number sense in the area of school arithmetic—by providing “supportive instructional practices” (p. 45).

1.2 Research on elementary grades children’s understandings of variable and variable notation

In spite of the struggles with variable and variable notation documented among adolescents, recent early algebra research suggests children in elementary grades (Grades K–5) have some facility with these concepts. Fujii and Stephens (2008) report that children in Grades 2–3 could engage in “quasi-variable thinking” by which they reasoned with general quantities, instantiated by numbers, prior to their use of variable notation. For example, students could offer general explanations for why an equation such as 78 – 49 + 49 = 78 is always true by interpreting 49 as a “quasi-variable” for a generalized number rather than seeing 49 for its numerical value. Studies have also shown that children in elementary grades can use variable notation to represent arithmetic properties, as well as relationships between non-specified, continuous quantities (e.g., length), and can reason with these representations in their symbolic forms (Carpenter, Franke, & Levi, 2003; Dougherty, 2008). Indeed, Carpenter et al. (2003) argue that the transition from natural language to symbolic notation is not so difficult if students already perceive and understand the variable quantities being represented.

We are interested here in children’s understandings of variable and variable notation within the context of functions. Our interest in functional thinking derives from the seminal role functions can play as a unifying strand across the K–12 curriculum (Freudenthal, 1982; Schwartz, 1990) and its promise as an entryway into early algebraic thinking (Carraher & Schliemann, 2007).

Studies suggest that elementary-aged students can account for relationships between two quantities simultaneously, understand input–output rules, and identify correspondence relationships (Stephens, Ellis, Blanton, & Brizuela, 2017). More specifically, studies find that elementary-aged children can use variable notation to symbolize functional relationships and may actually choose variable notation over non-conventional forms to represent these generalizations (e.g., Schliemann, Carraher, & Brizuela, 2007). For example, a recent study examining the impacts of an early algebra intervention in third grade found that students showed significant improvement in their ability to symbolize variable quantities, algebraic expressions, and functional relationships using variable notation and to reason with symbolized quantities to solve linear equations (Blanton, Stephens, Knuth, Gardiner, Isler, & Kim, 2015). The same study found that students were more likely to correctly use variable notation than natural language to represent functional relationships.

Perhaps even more striking, studies suggest that children as early as Grades 1–2 (ages 6–7) can generalize functional relationships and represent these relationships with variable notation (e.g., Blanton, Brizuela, Gardiner, Sawrey, & Newman-Owens, 2015; Brizuela, Blanton, Sawrey, Newman-Owens, & Gardiner, 2015; Cooper & Warren, 2011; Moss & McNab, 2011). Moreover, children’s use of different representations such as function tables can serve to mediate and support their thinking (Brizuela & Earnest, 2008).

1.3 Research goal

These studies suggest that more systematic research concerning how young children understand and represent variable quantities associated with functions is warranted. We report here on a study of how first-grade children came to understand variable and variable notation as they explored functional relationships. We focused on first grade students because we anticipated that their lack of formal schooling—and lack of experiences with literal symbols—might help us flesh out boundary points in how young children’s thinking about variable and variable notation emerges. Our study is framed around the following research question:

What levels of understanding about variable and variable notation do first-grade children exhibit as they explore functional relationships between two quantities?

We use a learning trajectories approach (e.g., Barrett & Battista, 2014; Maloney, Confrey, & Nguyen, 2014) as the research paradigm for identifying first-grade children’s thinking about concepts of variable and variable notation in functional relationships. Learning trajectories have become an important paradigm in educational research (Clements & Sarama, 2004; Simon, 1995) and are increasingly endorsed for their potential to inform the design of coherent standards, curricula, assessment, and instruction (Daro, Mosher, & Corcoran, 2011). They provide here a theoretical framework for identifying progressions in children’s thinking. In particular, we take a learning trajectory to include three essential components (Clements & Sarama, 2004): (a) learning goals, (b) an instructional sequence, and (c) a developmental progression that specifies increasingly sophisticated levels of thinking children exhibit as they progress through the instructional sequence. Our goal is to begin to build a “systematic, detailed description of the likely progression of children’s reasoning” (Confrey, Maloney, & Nguyen, 2014, p. xiii) around the concepts of variable and variable notation.

2 Research design

Our research design centers on the use of classroom teaching experiments [CTEs], defined as a series of teaching episodes and individual interviews over an extended length of time (Cobb & Steffe, 1983). Student activity during individual interviews was our primary focus (see also Szilágyi, Clements, & Sarama, 2013) as a method by which we might more carefully flesh out the nature of children’s thinking developed in our instructional sequence. In what follows, we briefly describe our participants, our instructional sequence, the implementation of the study, and our data analysis.

2.1 Participants

Participants in this study were from two first-grade classrooms that served as the sites for our CTEs, one classroom from each of two elementary schools (Schools “A” and “B”) in the Northeastern part of the US. The schools were selected because they represented distinct demographics (summarized in Table 1).

Table 1 Demographics for the two participating schools

2.2 Design of our instructional sequence

We designed and piloted lessons for the instructional sequence based on tasks used productively in previous research on children’s functional thinking (e.g., Blanton, Stephens, et al., 2015; Carraher, Schliemann, Brizuela, & Earnest, 2006). Lessons focused on functions of the type y = mx and y = x + b.Footnote 3 Function tasks centered on correspondence relationships, where the focus is on the relation between two sets and on identifying an explicit rule (Confrey & Smith, 1995). Values for m and b were chosen to be arithmetically meaningful for first-grade students. For example, functional relationships that might formally be represented as y = 2x were used to leverage children’s experiences with counting by two’s and doubling.

The instructional sequence was designed as a set of two four-week instructional cycles (Cycles 1 and 2). Each cycle included eight 30–40 min lessons (16 lessons total), with two lessons taught during each week of instruction. Each lesson was based on one task. Lessons in Cycle 1 focused on functional relationships of the form y = mx. Lessons in Cycle 2 focused on functional relationships of the form y = x + b.
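The two function types addressed in the cycles can be sketched as input–output tables of the kind used in correspondence tasks. This is only an illustrative sketch, not material from the study: the function names are hypothetical, the values m = 2 (doubling) and b = 1 (a one-foot hat) follow examples mentioned in the text, and the input range 1 to 5 is an assumption.

```python
def cycle1_rule(x, m=2):
    """Cycle 1 function type, y = m*x (m = 2 models doubling)."""
    return m * x


def cycle2_rule(x, b=1):
    """Cycle 2 function type, y = x + b (b = 1 models a one-foot hat)."""
    return x + b


def function_table(rule, inputs):
    """Pair each input with its output under the given rule."""
    return [(x, rule(x)) for x in inputs]


# Hypothetical input values; the study's actual task values are not shown here.
doubling_table = function_table(cycle1_rule, range(1, 6))
hat_table = function_table(cycle2_rule, range(1, 6))

for x, y in doubling_table:
    print(f"y = 2x:    x = {x}, y = {y}")
for x, y in hat_table:
    print(f"y = x + 1: x = {x}, y = {y}")
```

Each row of a table is one instance of the relationship; the explicit rule (y = 2x or y = x + 1) is the single statement that summarizes all rows.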

2.3 Implementation: Lessons and individual interviews comprising the CTEs

Lessons were taught by one member of the project team at each research site and were observed and videotaped by other team members. The project team met weekly to discuss observations from the lessons regarding children’s thinking about concepts addressed in the function tasks and to discuss possible revisions for subsequent tasks.

Semi-clinical, 30-min, individual interviews were conducted before Cycle 1, between Cycles 1 and 2, and after Cycle 2 using a subset of students from the CTEs. The interviews provided a way to more closely detail individual students’ thinking from the CTEs. During each interview, we asked the student to solve a task similar to those used in the CTE lessons and to describe his or her thinking aloud.

Students selected for interviews were from the upper 30% of the class in terms of their understanding of arithmetic and ability to talk about their thinking.Footnote 4 We also selected students who were not within the top 30% academically but whom we viewed as able to verbalize their thinking well. From these groups of students, 10 were selected as our interview cohort for the study.

Table 2 summarizes the sequencing of interviews and lessons and provides a sample task focus for the function type addressed. Table 3 summarizes the core areas addressed in an interview and provides selected questions from these core areas.Footnote 5

Table 2 Sequence and focus of lessons and interviews
Table 3 Selected interview protocol

Finally, variable notation was introduced intentionally in both classroom instruction and interviews as a way to represent variable quantities. In particular, variable notation was introduced in the context of representing children’s rules for functional relationships. For example, once students had constructed a general rule in words for the relationship between someone’s height and their height when wearing a one-foot hat (e.g., “whatever their height is, just add one more and we will know the total height of the person while wearing a hat”), we introduced variable notation as a way to represent variables (e.g., b could be used to represent a person’s height, m a person’s height while wearing the one-foot hat). Eventually, we were able to simply ask how they would express a relationship using variable notation.

2.4 Data analysis

The primary data source for this study was videotaped individual interviews and students’ written work produced during the interviews (e.g., Szilágyi et al., 2013).Footnote 6 We used a grounded theory approach (Strauss & Corbin, 1990) whereby data analysis occurred in conjunction with the development of a (local) theory regarding a progression in children’s thinking about variable and variable notation. In particular, our analysis focused on identifying progressively more sophisticated levels in children’s thinking about these concepts as they advanced through our instructional sequence. In our analysis we looked for qualitative profiles across our data that could represent levels of thinking about variable and variable notation that would constitute a developmental progression.

During formal data analysis, one member of the project team transcribed pre-, mid-, and post-interview video data for one first-grade student and flagged any instances in the data related to variable or variable notation. Flagged data were then analyzed line-by-line, and theoretical memos (Glaser, 1998) were constructed to characterize the incidents in which students used variable notation or discussed their understanding of variable quantities, how they represented these quantities, or what the notation represented. The memos consisted of descriptions of students’ thinking, a possible interpretation of that thinking, and associated transcripts that served as evidence for the descriptions.

Pre-, mid-, and post-interview data from a second first-grade student were analyzed in a similar fashion and results from both analyses were compared to refine the memos. Memos were then sorted (Glaser, 2002) according to similar types of thinking and given a preliminary descriptive code that reflected the thinking within the memo group. These categories were then organized based on their level of sophistication in children’s thinking and an analysis of the more canonical understandings of the mathematics attempted (Battista, 2004). The resulting progression served as an emerging model of the levels of sophistication in children’s thinking.

The full project team then analyzed the progression and supporting memos and transcript data for the two students to refine the levels. Using the revised model, two members of the project team independently coded the full set of first-grade interview data to further refine earlier analyses. To facilitate this, the remaining transcripts from interview video data were reviewed and flagged when any conversation addressed variable or variable notation. The flagged units of conversation were then coded according to the levels of thinking exhibited. Independent coding decisions were compared to determine agreement and resolve disagreements. If a discrepancy was not resolved, the full project team reviewed the transcripts and codes to reach a resolution. Throughout this process, existing codes were refined or new codes were created to reflect findings from the analysis. The constant comparison and refinement of levels of the progression continued until no new findings emerged in the data.

3 Results

In what follows, we describe levels of sophistication in children’s thinking about variable and variable notation in functional relationships. It is important to keep in mind that levels are not only intended to convey students’ conceptualizations, but also what students “can and cannot do” (Battista, 2004, p. 187). Moreover, the developmental progression derived from an instructional sequence represents only a possible progression in thinking (e.g., Barrett & Battista, 2014), and students may skip levels in a progression or revert to lower levels when the instructional or environmental setting changes (Clements & Sarama, 2014; Filloy, Rojano, & Puig, 2008).

3.1 Level 1: Pre-variable/Pre-symbolic

Children exhibiting characteristics of this level were “pre-variable” in that they did not yet perceive a variable quantity in a mathematical context. Moreover, they were “pre-symbolic” in that they did not use any symbols—literal or non-literal—to represent such a quantity. In other words, at this level we would describe the concept of variable and a symbolic system for notating it as outside of the child’s conceptual field, where we take a conceptual field to include concepts of which the child has some awareness and can perceive within a (mathematical) situation. A concept that is outside of the child’s conceptual field is one that the child cannot perceive and, hence, cannot use to imagine or create a mathematical scenario. By “pre-symbolic,” we do not mean that the child has not had any experiences with symbols (by first grade, children have had many experiences with literal symbols) but that the child cannot use literal symbols to symbolize variable quantities.

If a child does not recognize a variable as a viable mathematical construct (i.e., his or her thinking is “pre-variable”), we would not expect the child to use some set of inscriptions, such as literal symbols or other non-conventional forms, to symbolize the variable. In other words, we came to understand that the child must perceive a variable quantity in a mathematical situation before he or she has a need to symbolize it in some way. At the same time, a child can have experiences with components of a symbolic system that can potentially serve the purpose of symbolization in a mathematical context, but for which such symbolizing has not yet occurred.

As this suggests, the concept of variable and variable notation can co-emerge in children’s thinking. For example, a child can have experiences with the inscription that some cultures use to symbolize the operation of addition (‘+’) that are not related to addition in the child’s thinking (e.g., a child might see and repeatedly press the symbol ‘+’ on her mother’s phone). Similarly, a child can begin to develop an understanding of the operation of addition by joining sets of objects, but without knowing that the inscription ‘+’ is sometimes used to symbolize this operation. At some point, however, these experiences coalesce, and the child comes to understand that the operation of addition can be symbolized by ‘+’. We are interested here in how that process occurs with variable and variable notation, that is, the particular nexus in which children begin to use variable notation as a way to symbolize variable quantities.

We observed pre-variable thinking in children’s responses to problem situations before variable notation was introduced (that is, when children were also “pre-symbolic,” as we use the term here). The following episode occurred during Levon’s pre-interview, in which the interviewer explored with him the identity relationship between an unknown number of dogs and the number of noses on those dogs. As the excerpt opens, the interviewer has just asked Levon how many noses there would be for any number of dogs:

  1. Levon: Three?
  2. Interviewer: Okay. So, what if you didn’t know how many dogs there were?
  3. Levon: You have to guess.
  4. Interviewer: You have to guess. Is there a way to show how many dogs there are if you don’t know how many there are?
  5. Levon: Um, yes.
  6. Interviewer: How?
  7. Levon: You could see.

When asked to consider a situation that involved an unknown number of dogs, Levon gave two related approaches that suggest he did not yet recognize a variable quantity as an object to be mathematized in the problem. First, when asked how many noses there would be for any number of dogs, he responded, “Three,” then explained, “You have to guess.” When asked further how he might “show how many” dogs there would be if he did not know the number of dogs, he suggested that “you could see” (hence, count) the unknown number of dogs. Levon’s approach was to guess or count the number of dogs (or noses) in order to find a numerical value. That is, his conception of a quantity required that it be fixed and knowable. This is quite logical given that his experiences—both in and out of school—had likely involved only operating on known quantities to find an unknown, but never mathematizing an unknown (variable) quantity. Because his experiences until this point likely had not included quantities whose values were unknown and could not be found (or did not need to be), it is reasonable that he did not conceptualize the quantity as an “unknown.” Moreover, he had no logical necessity for constructing a symbolic system to represent a construct—an unknown quantity—that he did not yet perceive. As such, his thinking remained pre-symbolic.

Pre-variable and pre-symbolic thinking sometimes persisted after problem scenarios with both variables and variable notation were introduced in our CTEs. That is, for children whose thinking was characteristic of Level 1, the concepts of variable and variable notation had not yet sufficiently developed—even after the introduction of these concepts—into a recognition of variable quantities and literal symbols as a tool for representing them. Kaput, Blanton, and Moreno (2008) discuss how symbol and referent can be experienced as separate by children as their understandings coalesce towards more conventional ways of thinking. This coalescence, which they characterize as a socially mediated process of symbolization through which children’s thinking about symbol and referent is iteratively transformed, captures for us here the process of children coming to understand both variable and variable notation as a system for representing variables. We would characterize this transformation for children in our study as being initiated in the social plane (Vygotsky, 1978) through our introduction in lessons and interviews of problem scenarios that involved variable quantities and literal symbols as a way to represent them.

The following excerpt from Miah’s mid-interview illustrates this persistence of pre-variable/pre-symbolic thinking. At this point, Miah has participated in Cycle 1, where variable quantities and the use of letters as a way to represent them had been introduced. In this excerpt, the task was to examine a relationship between a person’s (unknown) height and his height when wearing a one-foot hat:

  8. Interviewer: What if I said someone was y feet tall [writes y on the student’s paper]? What do you think about that?
  9. Miah: Hmm…. Just measure them [shrugs].
  10. Interviewer: Okay. But could that [circles y] stand for any…
  11. Miah: No.
  12. Interviewer: …thing? No? What does it stand for?
  13. Miah: Nothing.
  14. Interviewer: Nothing. It doesn’t stand for anything.
  15. Miah: No.…

It might be argued that, in Levon’s episode (lines 1–7), the interviewer’s references to “how many” or “show” prompted Levon to count. However, Miah’s thinking—consistent with Levon’s—suggests that this was not necessarily the case. That is, Miah, too, objects to both the use of a letter to symbolize someone’s unknown height (evidenced by her remark that y “doesn’t stand for anything”) and to the fact that someone’s height could be an unknown (“just measure them”). Thus, while a letter had been interjected by the interviewer as a way to represent a person’s unknown height, Miah’s thinking about the symbol at this point did not include that it might represent a variable (line 14), nor did she recognize an unknown height as a variable quantity to be mathematized (line 9). That is, her thinking remained pre-variable and pre-symbolic.

For a child such as Levon or Miah whose thinking is characteristic of Level 1, a reasonable move when encountering a variable quantity in a mathematical situation would be to find a numerical value for the unknown quantity by guessing or counting. Miah does this by proposing that the person’s height—which the interviewer refers to as y—be measured. We suggest that Miah did not conceptualize the person’s height as a variable quantity, so it is reasonable that, for her, y could not “stand for anything.” That is, the referent did not exist, so the letter could not symbolize it.

3.2 Level 2: Pre-variable/letters as labels or as representing objects

Children whose thinking was characteristic of Level 2 viewed a letter as a label or as representing an object, not a quantity. In this, they tacitly accepted the use of a letter as representing something other than a variable quantity. In particular, they used letters as a label for the name of an object, such as using y to represent a person’s name, or as a way to represent tangible objects, such as the person whose height was unknown. While we might still characterize this as “pre-symbolic” in the way this term is used here, we found there to be a cognitive shift from Level 1 thinking in that children at Level 2 were trying to take up the idea of variable notation, but exhibited confusion about the referent being symbolized.

We infer from this that children’s thinking was still “pre-variable.” That is, they still did not perceive variables as concepts to be mathematized in the problem situation. Thus, they used letters to represent objects, such as a person (arguably a more visible referent than a quantity) and not the quantity itself.

The following excerpt, which occurred during Jada’s mid-interview, illustrates this thinking. The excerpt opens after the interviewer has reminded Jada that she (the interviewer) used y to represent “how tall someone is” and y + 1 to represent “how tall that person is with the hat.”

  16. Interviewer: Does that make sense or not?
  17. Jada: y plus....
  18. Interviewer: It’s y plus one ’cause one is [the height of] the hat. Does that make sense or no?
  19. Jada: It does because, because, so the person y, let’s just pretend their name is y.
  20. Interviewer: Yeah.
  21. Jada: Then the person’s name is y. So, the hat is one-foot tall, so it does make sense because the person’s name is y and....

Jada paused for about 13 s, then added, “I would have to measure them.” We maintain that Jada did not perceive the variable in the situation because she ultimately abandoned her search to explain why the representations suggested by the interviewer (y and y + 1) made sense as a way to represent someone’s unknown height. In other words, Jada interpreted the quantity—a person’s height—as something to be found by measuring, not symbolized as an unknown. Moreover, variable notation had been introduced to Jada prior to this interview during Cycle 1. Jada’s assignment of y as the name of—or, label for—the person whose height was unknown (“let’s just pretend their name is y”) suggests that she had taken up the use of a literal symbol and recognized that it was intended to symbolize something unknown. However, what it seemed to symbolize for Jada was a person’s unknown name, not a mathematical unknown.

This type of thinking is well documented among adolescents (e.g., Booth, 1988; McNeil et al., 2010). What is less understood is why it occurs in young children as well. Jada’s thinking about variable and variable notation at this point is consistent with our earlier claim that it is counterintuitive for a child to accept the use of an inscription to symbolize an object the child does not yet perceive in the problem situation. That is, we contend that Jada did not see y as symbolizing a variable quantity because she did not yet perceive the quantity as something to be mathematized. However, the person whose name is unknown did exist for Jada in some sense. That is, she could mentally visualize such a person and imagine letting this person’s (unknown) name be denoted by the letter y. It is reasonable in her thinking that the person (i.e., object) was the likely referent to be symbolized. Thus, while she accepted the literal-symbolic notation—that is, the use of a letter to notate something—she did not take it up as variable notation, that is, as a means to symbolically represent a variable quantity.

3.3 Level 3: Letters as representing variables with fixed, deterministic values

Children’s thinking about variable quantities at Level 3 reflected a fundamental shift in that they now recognized a variable quantity in the problem situation and saw a literal symbol as representing that variable. However, even though the function tasks addressed in our study were based on the role of variable as a varying unknown, children whose thinking was characteristic of Level 3 viewed variable as an unknown with a fixed value that could be logically determined by a process external to their control. The process was external in that they could not find the value of the unknown through their own actions, such as randomly guessing or physically counting or measuring. Instead, they viewed the value of the variable as deterministically linked to the letter used to symbolize the variable quantity and its ordinal position in the alphabet. That is, they now recognized a variable quantity in the problem situation, and the letter used to represent it also provided the mechanism for finding its value. For example, if the variable was symbolized with the letter d, then in the child’s view its value must have been 4. In this sense, we describe this “fixed” view of a variable quantity as deterministic because its value was bound by the symbol representing it. However, their understanding of variable notation was incomplete because they did not yet see that the choice of symbol was arbitrary and thus the symbol itself could not be linked to the value of the variable.

This type of thinking has been characterized as a misconception with regard to adolescents’ understanding of variable (MacGregor & Stacey, 1997; McNeil et al., 2010). For young children, however, this might be seen as an emerging conception that reflects their attempts to make sense of a new notational system. Before children begin formal schooling, they develop an understanding of an alphabetic system whose structure might naturally be brought to bear on their construction of a new symbolic system such as variable notation. As Radford (2001) notes, examples such as this support the view that students bring meanings for symbols constructed in other domains to bear on the ways they make sense of variable notation.

The following episode, which illustrates this type of thinking, occurred in the pre-interview with Rebecca when she was first introduced to the use of variable notation to represent a relationship between an unknown number of dogs and the number of noses on the dogs:

22 Interviewer: I’m going to use a letter. I’m going to use W. I don’t know how many dogs I have in my back yard. I have W dogs. So how many noses do I have?

23 Rebecca: Ohm…. (Rebecca looks down at her fingers and starts counting audibly).

24 Interviewer: Tell me what you’re doing.

25 Rebecca: Counting the alphabet to see how many letters are, are lower, I mean lesser, that’s how much.

26 Interviewer: OK, so go ahead. You do that.

27 Rebecca: (Sings the alphabet song and counts.) Wait, there are 24 letters in the alphabet?

28 Interviewer: There are 26 letters in the alphabet.

29 Rebecca: (Counts backwards from the letter Z.) Twenty-six, twenty-five, twenty-four, twenty-two, twenty-one. Twenty-one dogs.

30 Interviewer: OK, so you think I have 21 dogs because W is the twenty-first letter of the alphabet?

31 Rebecca: Yeah.

First, given that Rebecca took up the interviewer’s question regarding the number of dog noses, we infer that she now accepted that there could be an unknown number of dogs in the backyard and the convention that the unknown number of dogs could be represented by a letter. That is, she perceived the variable in the problem situation and she accepted the use of a letter to represent it. However, as with children’s thinking in previous levels, she still sought a mechanism for finding the (fixed) value of the variable quantity. She determined the number of dogs, symbolized by W, to be 21 by looking for its ordinal position in the alphabet (see Footnote 7). In essence, this “alphabet strategy” worked as a system for decoding the value of a variable.

An important difference between this type of thinking and that of previous levels is that the mechanism for finding this value was contained within the alpha-mathematical system in which Rebecca worked and was external to her own actions, such as guessing and measuring. That is, this new strategy for finding the value of the literal symbol was based on an ordering of letters that represented an authoritative norm that was not her own. This established ordering of the alphabet seemed to lend legitimacy to Rebecca’s thinking that the value of W must have been 21. In the limited, alpha-centric world of a first-grader, this was a logical choice for how to make sense of W.
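Read as a procedure, the “alphabet strategy” is a decoding rule that maps a letter to its ordinal position in the alphabet. The following is a minimal sketch of that rule in Python, offered only as an illustration; the function name alphabet_strategy is our own label and does not come from the study. Note that a strict application of the rule assigns 23 to W, whereas Rebecca’s backward count in line 29 ended at 21.

```python
import string


def alphabet_strategy(letter: str) -> int:
    """Return the 1-based position of a single letter in the English alphabet.

    Hypothetical helper illustrating the "alphabet strategy": the value of
    the variable is read off from the letter that symbolizes it.
    """
    if len(letter) != 1 or letter.upper() not in string.ascii_uppercase:
        raise ValueError("expected a single letter A-Z")
    return string.ascii_uppercase.index(letter.upper()) + 1


print(alphabet_strategy("d"))  # 4, the example given for Level 3 thinking
print(alphabet_strategy("W"))  # 23
```

The sketch makes the deterministic character of Level 3 thinking visible: the output depends only on the symbol chosen, not on the quantity it stands for.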

As the interview continued, Rebecca represented the number of dog noses by writing W in the second column of her function table, next to the W representing the number of dogs in the first column (see Fig. 1). She then proceeded to write the number “21” beside W in the second column, explaining: “For that much dogs, and then over here I’m going to write how many numbers. Twenty-one.” That she used both W and 21 as representations of the number of dog noses suggests to us that she saw W as representing a quantity whose value was 21. Moreover, constraining the value of W to 21 suggests that she did not, at this point, view W as a varying unknown. That is, she held a view of variable as a fixed unknown whose value was specific and could be found using established norms (such as by following the order of the alphabet).

Fig. 1 Rebecca’s written work for the pre-interview task. Reprinted with permission from Journal for Research in Mathematics Education, copyright 2015, by the National Council of Teachers of Mathematics. All rights reserved

In this sense, children whose thinking was at this level interpreted a variable that actually functioned as a varying unknown quantity as a fixed unknown. This interpretation might be qualitatively distinct from a (correct) view of variable as a fixed unknown in situations where it actually functions as such (e.g., the role of x in the equation 3 + x = 12). That is, even though the value of a quantity in a functional relationship could be any number that satisfies the relationship, children whose thinking was characteristic of Level 3 viewed the value of the quantity as fixed.

3.4 Level 4: Letters as representing variables with fixed but arbitrarily chosen values

As with Level 3, children whose thinking was at Level 4 viewed a variable as a fixed unknown. However, an important distinction of Level 4 is that the unknown had a single, fixed value that could be randomly chosen. That is, they no longer viewed the letter symbolizing the variable as determining the “fixed” value of the variable. Instead, children conceptualized a letter as representing any number, but interpreted “any number” as any fixed number that could be arbitrarily chosen and was not bound by a mathematical relationship or some deterministic choice such as the “alphabet strategy”. The pre-interview with Rebecca, in which the task focused on exploring the identity relationship between an unknown number of dogs and the corresponding number of dog noses, illustrates this idea:

36 Interviewer: How many noses would I have if I had U number of dogs?

37 Rebecca: Twelve.

38 Interviewer: Ok, twelve. I could have twelve.

39 Rebecca: You could have twelve.

40 Interviewer: Why could I have twelve?

41 Rebecca: Because it’s any number.

First, Rebecca’s choice of “12” as the value of U was not based on the ordinal position of U in the alphabet. We note, however, that this type of thinking seems similar to that exhibited in Levels 1 and 2 in which children randomly chose a value for the variable by guessing. We think a critical distinction here is that, at earlier levels, children did not perceive a variable quantity, so their act of assigning a numerical value did not involve reasoning about either variable notation (as a symbol of) or a variable (as a referent to be symbolized or a quantity to be mathematized). At Level 4, however, students recognized the variable quantity in the problem situation, but viewed that quantity as having a value that was fixed and arbitrarily chosen. For example, when the interviewer later asked Rebecca, “What if we had P noses, how many dogs would we have?”, Rebecca responded, “Ohm…ninety-one.” In this, Rebecca seemed to be taking up the notion of “any number” by choosing the number of dog noses (91) in a way that suggested she was free to choose any (fixed) number she wanted. She was not bound, as she was in Level 3 thinking, by a strategy that produced the only value that P could possibly have been (e.g., by using the “alphabet strategy” to find the value of P). In other words, children’s thinking at Level 4 seemed to follow this chain of reasoning: (1) There is a quantity whose value I do not know; (2) The value of that quantity can be any randomly chosen number; (3) Once the value is chosen, the quantity cannot have another value.

We see this as a different view of “fixed” than is reflected, for example, in the value of x in the equation 3x + 6 = 12. In an equation such as this, there is only a single, valid choice for x (x = 2) and there is an underlying logic based on algebraic syntax that allows one to determine the fixed value of x. In contrast, children’s thinking at Level 4 allowed for the value of the unknown to be any number (that is, the value did not have to satisfy a mathematical constraint such as an equation), but once that number was chosen, it was fixed.

3.5 Level 5: Letters as representing variables that are varying unknowns

At Level 5, children conceptualized a variable as a varying unknown and a literal symbol as representing a varying unknown. Recall that during Rebecca’s pre-interview, she had initially said there would be 12 noses for U dogs (lines 36–41). She then wrote U in the second column of her function table to correspond to U dogs and wrote ‘12’ beside the U in the second column (see Rebecca’s written work in Fig. 1). At that point in her (Level 4) thinking, Rebecca’s notion of “any number” seemed to be that she could select any value and assign that value to the variable, but once the value for the variable was decided, it was fixed. We infer this partly because when asked about the number of noses for U dogs, her response was simply “twelve” (line 37). If she had conceptualized the variable as a varying unknown, we would expect her to make some claim suggesting that 12 was one of a possible set of values for U. Rebecca and the interviewer had the following exchange later in the interview:

42 Interviewer: So, if I wasn’t asking you for a number at all, we were just sticking to letters, we would say the relationship would be, if you have U dogs, you would have U noses?

43 Rebecca: Yeah.

44 Interviewer: How come?

45 Rebecca: Because it, uhm, because it could be any, uhm, numbers of the alphabet could be any number.

46 Interviewer: OK. Letters could be any number (clarifying that by “numbers” of the alphabet, Rebecca actually meant “letters” of the alphabet)?

47 Rebecca: It could be any number. Like twelve, two, one, zero.

Rebecca then proceeded to affirm that U could mean other numbers such as 33, and “even a hundred.” Line 47 is the first point during the pre-interview at which Rebecca suggests that the value of the quantity could vary, indicating to us that her thinking is transitioning to a view of variable as a varying quantity, an appropriate conceptualization of variable in a functions context. We do not claim that her understanding of variable is robust at this point. Indeed, as we saw in subsequent interviews with Rebecca and other participants, and as others have observed elsewhere (e.g., Clements & Sarama, 2014), students often revert to lower levels in a progression when faced with a new task.

It would be useful, then, to examine children’s understandings of variable and variable notation later in the CTEs. The following episode occurred during the post-interview with Jackson, where the task (the Train Problem) was to explore a relationship between the number of stops a train makes and the number of train cars it has, assuming that the train picks up two cars at every stop and the engine is not counted. As the following episode opens, the interviewer asks Jackson, “Suppose we didn’t know how many stops the train made.”

48 Jackson: So it would be a number, A.

49 Interviewer: Tell me what A is representing.

50 Jackson: Like, the number of stops he was at, like 20, 40, 60 – any stop.

51 Interviewer: So I’m glad you said that. You said A could be 20, 40, 60, so A could be anything?

52 Jackson: Yeah.

53 Interviewer: Do we need to know what it is?

54 Jackson: No.

Jackson viewed the value of the variable represented by A as varying. More importantly—like Rebecca’s thinking earlier—he did not need to assign a specific value to the variable. We see sophistication in Jackson’s thinking about variable in that the variable could exist in an unresolved, indeterminate state. This use of a literal symbol reflects what has been described elsewhere as an algebraic use of letters in that one does not need to immediately assign “concrete meaning” to it (Vlassis, 2002, p. 354, as cited in Hackenberg & Lee, 2015).

3.6 Level 6: Letters representing variables as mathematical objects

Children whose thinking was characteristic of Level 6 not only recognized a variable as a varying unknown and represented it with a literal symbol, they were also able to use variables to represent functional relationships (see also Blanton, Brizuela, et al., 2015). Furthermore, they were able to act on variables, represented with letters, as mathematical objects.

The following excerpt is taken from Rebecca’s post-interview with the Train Problem. As the excerpt opens, Rebecca has just completed a function table (see Fig. 2):

Fig. 2 Rebecca’s written work on the post-interview task. Reprinted with permission from Journal for Research in Mathematics Education, copyright 2015, by the National Council of Teachers of Mathematics. All rights reserved

55 Interviewer: So what if your train made 100 stops? How many cars would [it] have?

56 Rebecca: Two hundred?

57 Interviewer: OK. How did you get that?

58 Rebecca: Because you just double it. Because 1 + 1 = 2, 2 + 2 = 4, 3 + 3 = 6, and 4 + 4 = 8 (she writes these equations beside the corresponding values in her table).

59 Interviewer: Great. I like the way you did that. What if you didn’t know how many stops your train made?

60 Rebecca: You could use a variable.

When asked what letter she would use, Rebecca stated R because “that’s the first letter of my name.” Rebecca then used V to represent the total number of cars the train had after R stops and characterized the relationship between the two variables as R + R = V. Although the choice of R had personal meaning for Rebecca, the choice of V did not, suggesting to us that she understood the choice of letter to be arbitrary. Rebecca further explained that “R represents how many stops the car, the train, makes” and that V represents “how many carts (cars) he has.”

This suggests that Rebecca both perceived the variable quantities in the problem scenario and knew they could be represented with a literal symbol. She understood the meaning of the notation she used within the problem context, and she could operate on symbolized quantities (R, V) to produce a rule depicting a relationship (R + R = V).

It is not trivial for a 6-year-old to construct a representation in which it appears that letters are being “added” together and that the result is equivalent to another letter. Thus, while she might understand that R and V symbolized quantities, a reasonable question is what sense she made of the rule she constructed. Rebecca explained her interpretation of her symbolic rule as follows: “Whatever number how many stops it made, if you doubled it, that’s how many cars it would have.”

Finally, not only was Rebecca able to articulate a rule and interpret its meaning within the problem context, she also seemed to intuitively understand the boundaries of that rule and how perturbations in the problem situation might be reflected in its components. When asked earlier in the interview whether her rule, R + R = V, would always work, Rebecca suggested that if the engine were to be counted, it would change her rule. When later asked how her rule might change if she now counted the engine, Rebecca described that “you can just add a plus one” and symbolized the new relationship as “+1 R + R = V” and later as “+1 + R + R = V.” In other words, we maintain that the rule and its constituent parts—including variables—were objects themselves that she could transform or operate on to produce a (new) relationship.
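For readers who find it helpful to see Rebecca’s two rules as a function, the following is a minimal sketch in Python. It is our own illustration and was not part of the study; the names cars_after_stops and count_engine are hypothetical. The sketch encodes R + R = V for the original Train Problem and the revised rule, +1 + R + R = V, for the case in which the engine is counted.

```python
def cars_after_stops(stops: int, count_engine: bool = False) -> int:
    """Number of train cars after a given number of stops.

    Hypothetical helper: the train picks up two cars at every stop.
    """
    cars = stops + stops  # Rebecca's rule: R + R = V
    if count_engine:
        cars += 1  # her revised rule when the engine is counted: +1 + R + R = V
    return cars


# The values Rebecca recorded in her table, plus the 100-stop case (lines 55-58).
for r in (1, 2, 3, 4, 100):
    print(r, cars_after_stops(r))
```

The point of the sketch is that the rule holds for any number of stops without a specific value ever being assigned to R, and that changing the problem situation changes a component of the rule rather than the variables themselves.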

4 Discussion

We make several observations here about the levels of sophistication in the progression we observed in children’s thinking. First, there was a shift in children’s thinking about symbolic notation between Levels 1 and 2 from not knowing to use a letter to symbolize a variable quantity, to the interpretation of a letter as representing something that was not known but not inherently mathematical (such as the name of an unknown person). This conceptualization of symbolic notation seemed consistent with children’s thinking about variables across Levels 1 and 2, which held that unknown quantities could not exist—that is, they were not yet recognized by the child as objects to be mathematized in the problem situation—and, therefore, there was no logical necessity to symbolize them. At the same time, by Level 2 children recognized that a letter might be used to represent something, although not yet a variable. That is, children at Levels 1 and 2 were beginning to appropriate the use of literal symbols for symbolic notation, but not variable notation.

If we frame progressions in children’s thinking from a process-to-object lens (Sfard, 1991), we might characterize the mental activity in Levels 1 and 2 as a process of interiorization in that concept formation had been initiated. However, children at these levels were still operating with familiar actions such as counting or measuring in order to assign a value to an unknown (e.g., a person’s height). In this, they did not conceptualize the variable quantity in a problem situation as indeterminate.

Because children at Levels 1 and 2 did not seem to perceive a variable quantity in a problem situation, it seems reasonable that any reference to a quantity whose value was unknown would be countered by assigning a numerical value to that unknown. In such cases, children proposed a mechanism for finding the value (e.g., measuring someone’s height, counting the number of dogs). In this, children mathematized the problem in a way that treated a variable quantity as a known quantity.

The thinking exhibited in Levels 1 and 2 suggests that not perceiving a variable quantity might constrain the development of a symbolic system to represent that quantity (in contrast to the situation where children might recognize a variable quantity in a situation, but not yet have a means to symbolize it). Conventional wisdom has argued that children are not yet “ready” to use variable notation. We wonder, instead, if the problem lies not in children’s readiness to use variable notation, but in their lack of experiences with mathematical situations that involve mathematizing variable quantities that could motivate the need for children to construct such a system. As McNeil et al. (2010) suggest, “The process of generating representations may help students learn to use symbols meaningfully to represent unknown quantities” (p. 632). In other words, if children were routinely provided with experiences to mathematize unknown quantities, would variable notation arise more naturally as a logical necessity for representing their thinking?

Levels 3–6 suggest that young children can learn to correctly mathematize and reason with variable quantities. In Levels 3–5 children’s thinking about variable and variable notation began to condense (Sfard, 1991) in that they began to perceive the variable quantity in the problem situation and represent this new construct through variable notation. An important characteristic of this shift in their thinking was the shift from a view of the value of a variable as being fixed and determined (Level 3), to a view of the value of the variable as being any one, fixed value from an implicit range of values (Level 4). This seemed to set the stage for a shift from a conception of variable as a fixed unknown (Levels 3 and 4) to that of a varying unknown (Level 5). In our view, what condensed across these levels was the conceptualization of the variable as a varying unknown—an important objective for functional thinking—as well as the use of letters as a set of inscriptions that could serve as variable notation. Finally, we suggest that Level 6 reflected a reification of variable and variable notation in which children could mathematize unknown quantities and act on these quantities as objects in and of themselves, or even combine them with other symbols (operations, numerals) to represent functional relationships.

We also observed that the levels in first-graders’ thinking are not unlike those identified in adolescents’ thinking, although our analysis was conducted independently of frameworks characterizing adolescents’ thinking. Küchemann (1981) found, for example, that 14-year-olds sometimes assign a numerical value to a letter from the outset of a task, a characteristic that we find consistent with Level 1 in our study. He further found that students sometimes see a letter as representing an object and not a quantity (Level 2), as representing “a specific but unknown number” (p. 104) (Level 4), as “being able to take several values rather than just one” (p. 104) (Level 5), or as representing a “range of unspecified values” (p. 104) for which a relationship between two such sets of values exists (Level 6).

The meaning of these parallels is worth considering. We recognize, for instance, that some children in our study, like adolescents in other research, took up symbolic notation early on as a way to represent an object they viewed to be the unknown (e.g., a person). This misalignment between symbol and referent seemed rooted in a logical attempt by children to make sense of what the literal symbol might represent in the absence of recognizing a variable quantity as part of the mathematical situation. It leads to what has been characterized as object/quantity confusion, which has been interpreted in terms of an inherent difficulty with a symbolic system (Lucariello & Tine, 2011) and one’s developmental readiness (Küchemann, 1981) to engage with such a system.

We suggest, however, that the root of this issue might not be in the child’s lack of ability to use a symbolic system in and of itself, but in the fact that he or she does not yet perceive the unknown quantity in the problem situation to be mathematized. As such, we suggest that the difficulty students—even adolescents—exhibit might be more closely attributed to the lack of experiences children are given to mathematize situations that involve variable quantities than to developmental constraints that cast them as not ready to use literal symbols. We found that once children recognized the variable quantity in the problem, they were able to advance in their thinking about variable and variable notation in quite sophisticated ways.

5 Conclusion

In this study, we characterize a possible progression in young children’s thinking about variable and variable notation. Because of the lack of research on young children’s thinking in this area, this study represents only an initial phase that should be followed by studies in which the progression proposed here is validated through quantitative methods, or by studies that explore the mechanisms by which children progress through the levels posited here (see, e.g., Szilágyi et al., 2013).

Our purpose has not been to quantify how many children exhibited thinking at a particular level. Instead, it has been to look closely at children’s thinking as they encountered variable and variable notation through functional thinking tasks and to identify levels of sophistication that emerged in their thinking in response to our instructional sequence. However, we do not want to lose sight of the extent to which some children in this study were able to understand variable and variable notation. It is noteworthy that any of the 6-year-old participants exhibited thinking characteristic of Level 6. In light of this, we suggest that adolescents’ difficulties with variable and variable notation do not necessarily portend that younger children will have even greater difficulties with these concepts. In fact, the opposite might be true. That is not to say that non-conventional representations (e.g., natural language) do not also play an important role in children’s activity of symbolizing—we think they do. We suggest, however, that providing long-term, sustained experiences with variable and variable notation from the start of formal schooling might be an avenue by which we can address the misconceptions that adolescents exhibit. As Arcavi (2005) suggests in relation to the development of symbol sense, in order to foster symbol sense one needs “learning materials and classroom practices that…support the building of the patience needed for learning in general, and more precisely the capability of accepting partial understandings” (p. 47). The partial understandings—and, ultimately, sophisticated thinking—exhibited by children in this study underscore the potential of investing in building “pieces” of knowledge (DiSessa, 1993) that can become robust over time.

We recognize, however, that some view the use of variable notation in lower elementary grades as controversial or even unnecessary. While we strongly support that children should use their own natural language to reason about algebraic situations, we equally support offering them other representations—such as variable notation—as a way to represent and reason with their ideas. We wonder how children would cope with symbolic and highly complex written language if they were not exposed to squiggles to which we ascribe alphabetic meaning as “letters” long before they start formal schooling. We wonder how children might fare in their acquisition of written language if they were not given experiences until middle grades to begin combining these letters in sequences to form words, then sentences, through an (arbitrary) system of rules and syntax that govern these actions and sometimes exhibit little logic. If this were the case, it is likely that the development of written language, too, would be littered with tales of students’ misconceptions and difficulties. We hope that results of this study can provide insights for designing mathematical experiences that can nurture children’s thinking about variable and variable notation and ultimately shift perceptions that algebraic language is beyond the grasp of young children.