Introduction

As the 21st century progresses, it will become increasingly important that all citizens be prepared to weigh scientific arguments about such dilemmas as what one should do to optimize water quality and achieve sustainability (Lee et al. 2012; Osborne 2010). The corpus of arguments about any one scientific problem can contain diametrically opposed claims and come from seemingly well-qualified persons (Duschl 2008). This is not a new phenomenon, nor is it necessarily an unproductive one, as can be seen in the debates about the structure of DNA (Crick 1974). To be able to engage meaningfully with competing claims about scientific phenomena, it is critical to be able to judge the relative success of the persuasive argumentation in leading the audience to accept the claim as valid. Making this more complicated, potential solutions to many scientific dilemmas have important social implications (such problems are referred to hereafter as socioscientific issues). Central to judging the success of arguments in support of a solution to a socioscientific issue is the ability to assess the credibility of the evidence and premises used to support the claims (Britt et al. 2014; Perelman and Olbrechts-Tyteca 1958). When engaging in this assessment, individuals refer to their epistemic beliefs about the nature of knowledge and knowing (Bråten et al. 2014; Mason and Scirica 2006; Nussbaum et al. 2008).

Engaging with socioscientific issues with appropriate support has been shown to lead to improved argumentation abilities among K-12 students (Evagorou and Osborne 2013; Tal and Kedmi 2006). But epistemic beliefs are key to this process, as having less sophisticated epistemic beliefs can lead middle school students to engage in suboptimal argumentation, including poor assessment of the credibility of evidence (Kyza 2009). Much of what is known about the relationship between epistemic beliefs and argumentation was gathered (a) in laboratory experiments or studies where students were presented with written cases involving scientific dilemmas and asked questions, (b) in very brief studies, or (c) among university students. Some evidence has emerged that engaging in sustained argumentation about socioscientific issues while supported by scaffolds can lead elementary students to develop more sophisticated epistemic beliefs (Ryu and Sandoval 2012). But further work is needed to understand the relationship between epistemic beliefs and how middle school students assess the credibility of data, make sense of data and evidence, and address a socioscientific issue during a sustained problem-based learning unit. In this study, we use an ethnomethodological approach to examine the order underlying how middle school students conduct scientific inquiry during a problem-based learning unit. In the next section, we review relevant literature and present the research questions. We then describe the methodology, present case studies of five small groups, conduct a cross-case analysis, and discuss the results in light of the literature.

Literature review

STEM education goals

For many years, STEM education researchers have been sounding the alarm that the STEM education pipeline is broken, leading many students to be ill-prepared to pursue higher levels of STEM education and ultimately enter the STEM workforce (Allen-Ramdial and Campbell 2014; Gray and Albert 2013; Tyson et al. 2007). Many of these authors propose changes to K-12 STEM curricula and instructional strategies to encourage students who would otherwise end up in a non-STEM career to persist in the STEM pipeline. While pursuing the goals of the STEM pipeline movement (e.g., ensuring the broadest possible talent pool in STEM) is important, it is also critical to develop the abilities and propensity of students who will not pursue STEM careers to engage with locally relevant scientific issues (Feinstein et al. 2013). This includes (a) abilities to engage with locally authentic STEM problems and formulate questions that science can address, (b) abilities to judge the credibility of claims made about scientific problems, and (c) interest in scientific issues (Feinstein et al. 2013). In this way, such students can become competent outsiders who can engage with scientific problems happening in their own communities, sort out valid claims about the problems, and understand the implications of the problem for themselves and others (Feinstein et al. 2013). A promising way to develop these competencies is by having K-12 students address locally authentic, socioscientific issues in science class with appropriate support (Belland et al. 2015a; Khishfe 2014; Tal and Kedmi 2006).

Socioscientific issues

Socioscientific issues are defined as ill-structured problems that cannot be addressed solely from a scientific perspective. Rather, the social, ethical, and political implications of different problem aspects and solutions need to be considered alongside scientific knowledge, principles, and processes (Sadler et al. 2007). For example, students investigating their local river’s water quality need to consider the competing interests of different people who live and work near the river, other people in the greater watershed, as well as flora and fauna (Belland et al. 2015a). As each stakeholder could potentially have different desires and concerns pertaining to the problem, it is important that students take these into consideration (Lee and Grace 2010; Tal and Kedmi 2006).

Being ill-structured, socioscientific issues do not have a single solution or a single solution path (Jonassen 2011). And given that they involve multiple, often conflicting, stakeholders and multiple ways to gauge the success of a solution, solutions to socioscientific issues can only be judged on the basis of the extent to which the solution was justified acceptably through persuasive argumentation (Çalik and Coll 2012; Jonassen 2000).

Skills and knowledge required when addressing socioscientific issues

Persuasive argumentation can be defined from the perspective of the arguer and the audience. From the perspective of the arguer, persuasive argumentation is designed to lead an audience to accept the validity of claims by connecting evidence to claims via premises (Eemeren et al. 2014; Perelman and Olbrechts-Tyteca 1958). From the perspective of the audience, persuasive argumentation can be defined as evaluating the foundedness of claims by weighing the appropriateness of the premises used in the argument, as well as their work in linking provided evidence to the claim. The promotion of persuasive argumentation skill in science is desirable for several reasons. First, in the real world, science involves very few well-structured problems that have only one correct solution and one correct solution path (Jonassen 2000; Nersessian 2008). Rather, scientific problems are most often ill-structured. One can only judge the sufficiency of solutions to ill-structured problems through evaluation of the persuasive arguments advanced in support of the solutions (Jonassen and Kim 2010; Kuhn 2015). Thus, argumentation is a core scientific process: through argumentation, scientists refine ideas as they design and conduct studies, present results, and work toward publication (Ford 2012; Osborne 2010). Second, argumentation is a strategy that students can use to address scientific problems: it requires that ideas be justified through evidence and progressively refined through revisions to justifications (Jonassen 2011). Third, through argumentation, students can see that science is not a basket of facts as they generate and engage with competing claims (Erduran and Jiménez-Aleixandre 2008; Osborne 2010). Middle school students often struggle to evaluate (Kuhn et al. 2013) and to create (Belland et al. 2008; Berland and Reiser 2011; Nippold and Ward-Lonergan 2010) evidence-based arguments. A possible reason is that K-12 students simply are not given enough opportunities to argue in science class (Osborne 2010).

Central to persuasive argumentation are epistemic beliefs, defined as individuals’ beliefs about the natures of knowledge (i.e., how certain or simple it is) and knowing (i.e., the source of knowledge and how claims/evidence can be justified) (Buehl 2008; Hofer 2006). From a developmental perspective, epistemic beliefs can be categorized as absolutist, multiplist, and evaluativist (Greene et al. 2008). For example, with respect to a given phenomenon, absolutists believe that experts in the corresponding field know the cause; multiplists believe that experts cannot know for certain; evaluativists are skeptical that experts know the cause for certain, but believe that experts know more than non-experts (Kuhn et al. 2010). Epistemic beliefs come into play when arguers select evidence and premises to be used in arguments (Bråten et al. 2011; Chinn et al. 2011; Kuhn et al. 2013; Mason and Scirica 2006), and when the argument audience evaluates the appropriateness of evidence and premises used in an argument (Kuhn et al. 2013; Richter and Schmid 2010). For example, absolutists may argue that climate change is not caused by human activity because Expert X said so, using the unstated premise that experts know the answer and thus one only needs to ask an expert for an answer to a scientific question. Multiplists would base their argument on other evidence. Similarly, an absolutist who listens to the above argument would likely find it reasonable.

Epistemic beliefs also influence scientific investigation through students’ formation of epistemic aims, defined as what one hopes to accomplish (e.g., true beliefs) through investigation, and epistemic values, defined as one’s perception of the value of the goal (Chinn et al. 2011). Furthermore, epistemic beliefs may have a reciprocal relation with self-regulated learning: that is, students with more sophisticated epistemic beliefs may be more likely to engage successfully in self-regulated learning, and engaging in self-regulated learning may cause students to see that it is possible to arrive at multiple answers to the same questions (Greene et al. 2010; Muis 2007; Strømsø and Bråten 2010).

Instructional support needed when engaging with socioscientific issues

Problem-based learning

Problem-based learning (PBL) is one way to structure investigations of socioscientific issues. In PBL, students confront an ill-structured problem (e.g., a socioscientific issue), and then work in small groups to define and address learning issues, synthesize found information and apply it to the problem, develop a solution, and build an evidence-based argument in support of the solution (Belland et al. 2008; Savery 2006). Self-direction of learning is central to success in PBL (Hung 2011; Loyens et al. 2008). Furthermore, students cannot succeed without adequate support of cognition, which can be provided by the teacher or technology tools (Hmelo-Silver et al. 2007; van de Pol et al. 2010).

Scaffolding

To engage in the complex reasoning required when addressing SSIs during PBL, students need scaffolding, originally defined as one-to-one support that stretched children’s abilities so as to afford problem solving performance and skill gain (Wood et al. 1976). For scaffolding to be successful, students and scaffolders need to have a shared understanding of what successful performance of the target skill would look like (Mahardale and Lee 2013; Wood et al. 1976). Scaffolding involves the following strategies: “(a) enlisting student interest, (b) controlling frustration, (c) providing feedback, (d) indicating important task/problem elements to consider, (e) modeling expert processes, and (f) questioning” (Belland 2014, p. 507). Scaffolding can support students during PBL by (1) providing guides of what they need to consider during problem solving, (2) helping them manage and monitor problem solving processes, (3) helping them make use of tools and resources, and (4) providing learning strategies (Hannafin et al. 1999). With the support of scaffolds, students can engage more effectively in PBL and gain problem-solving skills (Hmelo-Silver et al. 2007; Schmidt et al. 2011).

One-to-one scaffolding

One-to-one scaffolding is closest to the original definition of scaffolding, and is defined as one teacher working with one student to dynamically assess performance and provide just the right amount of scaffolding support (van de Pol et al. 2010). This means that when seeing that a student is unnecessarily struggling, the teacher can add scaffolding, and when seeing that a student is performing very well, scaffolding can be faded. One of the strongest influences on learning, one-to-one scaffolding leads to an average effect size of 0.79 standard deviations according to a recent meta-analysis (VanLehn 2011).

Computer-based scaffolding

Soon after the introduction of the scaffolding metaphor, researchers began to question whether computer tools could also fulfill the scaffolding function (Hawkins and Pea 1987). Such support can serve to structure and problematize students’ investigations (Reiser 2004). Computer-based scaffolding is more scalable, but less effective at customization, than one-to-one scaffolding (Belland 2016). Among computer-based scaffolding types, scaffolding embedded in intelligent tutoring systems tends to entail the most extensive customization. But a recent Bayesian Network meta-analysis of computer-based scaffolding in STEM education indicated that pre-post gains on cognitive outcomes were strongest when scaffolding had no customization (Walker et al. 2016). According to a recent pilot meta-analysis, computer-based scaffolding led to an average effect size of 0.53 standard deviations (Belland et al. 2015b).
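
To unpack what such figures mean: an effect size of this kind is a standardized mean difference (e.g., Cohen’s d), the difference between the treatment and control group means divided by the pooled standard deviation. A minimal sketch with hypothetical posttest scores:

```python
import statistics

def cohens_d(treatment, control):
    """Standardized mean difference (Cohen's d) between two groups."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = statistics.stdev(treatment), statistics.stdev(control)
    # Pooled standard deviation across the two groups
    pooled_sd = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd

# Hypothetical posttest scores for scaffolded and unscaffolded students
print(round(cohens_d([78, 85, 90, 72, 88], [70, 75, 80, 68, 77]), 2))
```

On this scale, the reported 0.53 means the average scaffolded student outscored the average control student by about half a standard deviation.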

Computer-based scaffolding can supplement one-to-one scaffolding, thereby increasing and extending the effectiveness of one-to-one scaffolding (McNeill and Krajcik 2009). Specifically, each scaffolding type can compensate for the weaknesses of the other as part of a distributed scaffolding system (Puntambekar and Kolodner 2005; Tabak 2004).

Studying group work in PBL

Studying the cognition of individual PBL group members in isolation to understand how they went about solving a problem in a group is inadvisable for two reasons: (1) social norms, shared history, and culture combine to influence thought processes (Luria 1976; Vygotsky 1962), and (2) the ways in which students engage with scaffolds, tools, and other individuals depend on their goals and the meaning they attach to these scaffolds, tools, and other individuals (Akhras and Self 2002; Belland and Drake 2013; Gibson 1986). Thus, it is important to adopt the perspective of PBL groups during research. This cannot be accomplished by viewing actors as individuals and their interactions as the product of the sum of their individual characteristics (Francis and Hester 2004). One can adopt the perspective of PBL group members by applying the theoretical framework of ethnomethodology, according to which there is order in the everyday interactions among people, and it is the goal of social science to uncover the methods by which groups of individuals achieve that order in interaction amongst themselves (Garfinkel 1967, 2002). Ethnomethodologists examine interactions holistically, looking for how different phenomena influence new behavior and phenomena (Francis and Hester 2004). One way this can be done is through conversation analysis—a close examination of the social order evidenced in how participants structure a conversation (Day and Kjaerbeck 2013).

Using the lens of ethnomethodology can help researchers uncover important drivers of the functioning of groups. For example, using an ethnomethodological lens led Belland et al. (2009) to discover that each member of a mainstreamed small group engaged in PBL in a middle school filled a unique role and contributed equally to group success. Ethnomethodology helped indicate that when theories were generated and evaluated in a PBL session in the medical school context, the initial reaction (e.g., silence, laughter, or agreement) of groupmates to the theory led to its rejection or further exploration (Glenn et al. 1999). Lähteenmäki (2005) studied the supervision process of physiotherapy students, and found that students’ individual therapy approaches emerged from reflection on their supervisors’ modeling and instructional approach. And using an ethnomethodological lens helped Çakir et al. (2009) determine how group members interconnected and built off of each other’s ideas in knowledge building. In short, adopting an ethnomethodological perspective helps researchers consider group functioning in a holistic manner, rather than as a sum of each group member’s individual efforts.

Research questions

  1. How do middle school students’ approaches to judging the credibility of evidence vary when provided/not provided argumentation scaffolds during problem-based learning?

  2. How do middle school students’ approaches to making sense of data and evidence vary when provided/not provided argumentation scaffolds during problem-based learning?

  3. How do middle school students’ approaches to addressing the central problem vary when provided/not provided argumentation scaffolds during problem-based learning?

Method

Setting and participants

The setting was a small middle school (grades 6–8) in a rural valley in the Intermountain West of the USA, where 48% of the student body received free or reduced-price lunch. Sixty-nine 7th grade students in three class sections participated in the overall study from which this dataset emerged. The teacher formed small groups that consisted of students who represented a range of abilities and genders, and who the teacher thought would work well together. We used typical case sampling (Merriam and Tisdell 2016) to select from among these groups five small groups of 3–4 students each—three from the experimental condition, and two from the control.

The 3-week unit focused on the water quality of the local river (Dale River; all names were changed). Students collected water quality data (e.g., dissolved oxygen) at three points in the valley, analyzed trends, researched information online and through other strategies (e.g., interviews), and argued what should be done to optimize water quality. Class activity details (see Procedures section) were the same in each condition, except that groups in the experimental condition had access to the Connection Log as described below. Each group represented a stakeholder such as hunters or common citizens. Each period lasted 50 min, and there were 5 class periods per week.
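
To give a concrete sense of the kind of analysis students performed, the sketch below compares each site’s dissolved oxygen reading against a standard and checks for a downstream trend; all values, and the standard itself, are hypothetical:

```python
# Hypothetical dissolved oxygen readings (mg/L); Site 1 is upstream, Site 3 downstream
dissolved_oxygen = {"Site 1": 9.1, "Site 2": 8.2, "Site 3": 6.4}
DO_STANDARD_MIN = 6.5  # hypothetical state minimum for rivers

for site, value in dissolved_oxygen.items():
    status = "meets" if value >= DO_STANDARD_MIN else "violates"
    print(f"{site}: {value} mg/L ({status} the standard)")

# Trend check down the river: falling dissolved oxygen suggests degrading water quality
if dissolved_oxygen["Site 3"] < dissolved_oxygen["Site 1"]:
    print("Dissolved oxygen drops downstream; something between the sites may be degrading quality")
```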

Design

Experimental condition

Two class sections were randomly assigned to use the Connection Log computer-based scaffolds designed to help middle school students address ill-structured problems and create evidence-based arguments in PBL. The Connection Log was designed to help students engage in productive group work and learn the process and norms of argumentation. This is done through the use of question prompts arranged in a process map, support for consensus-building, and support in considering criteria by which arguments are judged.

The Connection Log consists of five stages—Define Problem, Determine Needed Information, Find and Organize Information, Make Claim, and Link Evidence to Claim. These stages are iterative. For example, in the process of finding and interpreting information, students may realize that their problem definition needs to be revised.

Each stage has several questions to which students articulate answers individually. For example, Step 1 of the Make Claim stage reads:

  • Here is the information you thought was useful and the subcategories you made for that information.

  • Do you know enough about the problem to make a claim?

  • Your claims should be about either:

    • What is happening

    • Why it is a problem

    • What can be done to improve the situation

  • In this step, you will use this information to create claims about what is going on, why your stakeholders care about it (or why they don’t), and what you might do to solve the problem.

These answers are sent to a database, and can be read by group members in later stages/steps, where they come to consensus on these same questions.
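
As a sketch of this flow (an illustration under assumptions, not the actual Connection Log implementation), individual answers could be stored as records keyed by group, stage, and step, so that groupmates can retrieve them during later consensus-building:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    group_id: str
    stage: str   # e.g., "Make Claim"
    step: int
    author: str
    answer: str

db: list[Entry] = []  # in-memory stand-in for the database

def submit(entry: Entry) -> None:
    """Record one student's individual answer."""
    db.append(entry)

def answers_for_consensus(group_id: str, stage: str, step: int) -> list[str]:
    """Retrieve all groupmates' answers for a later consensus step."""
    return [e.answer for e in db
            if (e.group_id, e.stage, e.step) == (group_id, stage, step)]

# Hypothetical usage with names from Group E1
submit(Entry("E1", "Make Claim", 1, "Danny", "The river is polluted; stoneflies are scarce."))
submit(Entry("E1", "Make Claim", 1, "Stephanie", "Turbidity rises from Site 1 to Site 3."))
print(answers_for_consensus("E1", "Make Claim", 1))
```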

The version of the Connection Log used in this study, as well as an earlier version, led to increased argument evaluation abilities among lower-achieving (Belland et al. 2015a; Belland et al. 2011) and average-achieving (Belland 2010) middle school students. Students from different groups used the Connection Log for different reasons corresponding to differing challenges (Belland et al. 2015a; Belland et al. 2011; Belland 2010).

Control condition

Students in the control condition completed the same activities as those in the experimental condition, except that they never were introduced to or used the Connection Log. For example, when students in the experimental group used the Connection Log to guide their work, students in the control condition could ask the teacher questions, brainstorm among themselves, or look at the board where the teacher wrote what students should be working on that day (corresponding to major stages of the Connection Log).

Data collection

Videotaped classroom interactions

Each group in this study was videotaped during the entire unit. All dialogue was transcribed verbatim.

Prompted, retrospective interviews

Thirty-minute interviews were conducted with each small group. A unique prompting video consisting of clips representing typical episodes was used in each interview. This promoted students’ recollections of what they did and thought, and why. The interview guide focused on (a) sources of confusion, (b) how and why students performed certain tasks, (c) strategies, and (d) what students think it means to prove something. Asking students about these topics helped us investigate how students judged the credibility of evidence (RQ1), made sense of data (RQ2), and addressed the central problem (RQ3). Example questions include “What do you think it means to really prove something?” and “In the segment we just watched, what were you doing? Why?”

Database information

What group members in the experimental condition wrote in response to Connection Log prompts was retrieved from the database.

Log data

Log data indicated how long experimental students spent on each page of the Connection Log, where they came from, and where they went afterwards.
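
Such logs can be reduced to page visits with dwell times and transitions; a minimal sketch, assuming a hypothetical format of timestamped page-view events:

```python
from datetime import datetime

# Hypothetical timestamped page-view events for one student
events = [
    ("2011-05-02 10:03:12", "Define Problem / Step 2"),
    ("2011-05-02 10:05:47", "Determine Needed Information / Step 1"),
    ("2011-05-02 10:11:03", "Define Problem / Step 2"),
]

# Each consecutive pair yields: the page visited, time spent there, and where the student went next
for (t1, page), (t2, next_page) in zip(events, events[1:]):
    seconds = (datetime.fromisoformat(t2) - datetime.fromisoformat(t1)).total_seconds()
    print(f"{page}: {seconds:.0f} s, then went to {next_page}")
```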

Computer documents

We collected all documents (e.g., spreadsheets) that students created during the unit and saved on their computers.

Procedures

On Day 1, a guest speaker described the history of the Dale River. On Day 2, the teacher explained the problem, what students would be doing, and concepts students needed to understand (e.g., turbidity and stakeholder). On Day 3, groups were assigned unique stakeholders such as common citizens and recreationalists. Students went to various locations along the river to collect water quality data (e.g., nitrates) on Day 4. On Days 5–15, students worked in small groups to (a) compare water quality data to standards, (b) identify trends, (c) determine and find needed information, (d) develop a problem solution from their stakeholder perspective, and (e) develop an oral argument that supported the solution. After unit end, the small groups were interviewed.

Data analysis

Theoretical framework

Data were analyzed from an ethnomethodological framework, according to which the goal of research is to uncover the order underlying human behavior (Garfinkel 1967). A fundamental assumption of ethnomethodology is that when individuals work together in a group, each individual does not act in a random way; rather, he/she negotiates actions with groupmates, and in the course of groupwork, a social order emerges that underlies the group members’ actions (Garfinkel 2002). Uncovering the social order that underlies the groupwork helps one to understand why group members do what they do. This in turn helps one to understand the goals of group members, which interrelate with epistemic beliefs. One can uncover such order by examining closely how individuals interact with each other in a group, including how they use language in such interactions (Francis and Hester 2004; Garfinkel 2002).

Process

The first two authors used open and axial coding to develop the coding scheme. Top-level codes related to group process and who/what directed new activity, as well as how students defined the problem, determined information to find, developed the argument/solution, interpreted water quality data, interpreted information found online, and searched for and recorded information. Using these codes helped us establish how students went about analyzing and evaluating data, as well as working together to address the problem. Five researchers then applied the coding scheme to the data (one researcher per group) to identify themes, which were verified by examining evidence across data sources (Glaser and Strauss 1967). Next, we created visual displays of coding results (Miles et al. 2013). Transcript excerpts in which a theme was illustrated were reviewed, and the excerpts that (a) best represented the typical interaction approach, and/or (b) revealed the personalities of group members were analyzed via conversation analysis (ten Have 2007). Conversation analysis helped us to delve deeper into the methods by which the group members negotiated actions and created a social order.

Validity and reliability

The analysis processes helped us draw conclusions, which we verified through triangulation across data sources and methods, as well as searching for disconfirming evidence. Validity was further established through prolonged engagement in the field, thick, rich description, and weekly peer debriefing throughout the analysis process (Miles et al. 2013; Morse 2015; Tracy 2010).

Results

See Table 1 for an overview of the results for each group. Greater detail is presented in the sections that follow.

Table 1 Summary of results

Group E1: experimental condition

Group overview

Danny, Stephanie, Adam, and Ethan formed Group E1, and represented the Environmental Protection Agency (EPA). As scribe, Stephanie assigned tasks. Adam completed tasks, served as a buffer between Danny and the rest of the group, and helped Ethan the most of all group members. Danny had the clearest understanding of the water quality testing data, and did the most to push his group in the direction of a solution.

How they judged the credibility of evidence

Group E1 members never explicitly addressed the credibility of the water quality testing data. They also did not address the credibility of websites in oral discussions. Some evidence indicates that Danny was concerned with where he found information, as he noted to the teacher on Day 9 that he found how the number of stoneflies reflects a river’s water quality on a fisherman’s website. However, Danny’s groupmates often simply noted having found information “on the Internet” or “on Google.” Furthermore, few Group E1 members mentioned the need to verify information by finding it in other data sources. Danny did in this passage from Day 14:

Danny:

The stoneflies [insect that is especially sensitive to water pollution] shortage is caused by polluted water, and if there aren’t any stoneflies, obviously it means the water is polluted. And the Dale River does have a shortage of stoneflies so it means it is polluted. So that is our evidence baking in there, it’s definitely polluted

Researcher 3:

What else do you have on the sites beside just stoneflies?

Danny:

A lack of fish… If you don’t have stoneflies, you don’t have fish, which supports the evidence that there aren’t any stoneflies or mayflies

Here one sees evidence that Danny in many ways used triangulation to verify the credibility of information. Data indicated that there was a lack of fish, and so Danny concluded that there was congruence between what the two data sources said. Elsewhere, Danny noted that it is crucial that the EPA ensure that their evidence is solid to enforce laws and policies.

How and why they made sense of data and evidence

The students interpreted water quality data using the Connection Log, and never discussed the data orally. They referred to their problem definition, assignments, and found information in the Connection Log.

In the interview, Danny noted that his groupmates were always telling him “you don’t have evidence,” and he initially wanted to change topics because he thought it would take too much time to find evidence that stoneflies need healthy water to survive. But eventually he learned that it was not hard to find substantiating evidence.

Early in the unit, Adam asked Stephanie if he could copy and paste found information. Stephanie shrugged her shoulders. Adam, Stephanie and Danny appeared to summarize found information. Ethan was often lost, and rarely engaged with information.

How and why they addressed the central problem

Group E1 members addressed the problem through an iterative process of defining the problem, examining water quality data, searching for information on the Internet, and practicing their presentation. When they encountered challenges, they were twice as likely to ask groupmates as the teacher for help, though even groupmates were asked only a little over once per day. They defined the problem and determined information to find largely by articulating information in response to the Connection Log prompts, and discussing such responses. In contrast to many other groups, Group E1 never asked the teacher what information they should find. Groupmates noted and addressed holes in ideas as they were articulated. The Connection Log played a big part in the articulation of the group’s argument. They did not often ask the teacher to evaluate their argument. During the unit, Danny noted that they could build a “recycling plant,” which in the interview was clarified as a water treatment plant. However, Danny was not sure of the efficacy of this strategy, noting, “It’s gonna be pretty hard to get out the existing pollution in there as it is fluid.”

In the presentation, Danny noted that they would increase fines for polluting and littering, and hire local people to clean up the river. Stephanie added that they could create a canal along the river where cows could drink so that they did not muddy the water.

Group E2: experimental condition

Group overview

Consisting of Andrew (scribe), Sean, Jessica and Keisha, Group E2 represented Farmers and Ranchers of Madison. Considered the most knowledgeable person in the group, Andrew was often turned to for help on what to do next and how to use the Connection Log. When Andrew worked alone, Sean, Jessica and Keisha often worked together as a sub-group, discussing ideas and what to do next. Sean was vocal and a tone setter. To ask for help, Jessica often exclaimed that she was confused. Keisha often kept Jessica on track, and would stand up for Jessica when Sean would point out Jessica’s mistakes.

How they judged the credibility of evidence

The group understood how to identify outliers and compare data to standards, as in this episode from Day 5:

Sean:

Um, on water temperature, in Celsius, it says 3.1. And then, in water temperature, Fahrenheit, it says 55.7

Mr. Thomas:

So why, what’s wrong with that?

Sean:

Well, the rest of them say, like, 12 and 13
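
The arithmetic behind Sean’s judgment: 55.7 °F converts to about 13.2 °C, in line with the 12–13 °C readings at the other sites, so the recorded 3.1 °C is almost certainly a recording error. A minimal sketch of such a cross-scale consistency check (all values except Site 2’s come from hypothetical data, not the transcript):

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

# (recorded Celsius, recorded Fahrenheit); only Site 2's pair comes from the transcript
readings = {"Site 1": (12.6, 54.7), "Site 2": (3.1, 55.7), "Site 3": (13.1, 55.6)}

for site, (celsius, fahrenheit) in readings.items():
    expected = f_to_c(fahrenheit)
    # Flag any reading where the two scales disagree by more than one degree
    if abs(celsius - expected) > 1.0:
        print(f"{site}: recorded {celsius} C, but {fahrenheit} F is {expected:.1f} C; likely a recording error")
```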

Furthermore, group members sometimes established credibility by indicating the source of information they found. For example, on Day 13, Sean searched for information on whether anything in the water can hurt farm animals:

Researcher 2:

Did you find out if there is anything in the water that can hurt the animals?

Sean:

(Pointing to entry in the Connection Log that read “There can be plastic bags that [animals] eat and can suffocate them”) That one’s mine

Researcher 2:

Awesome, good job. Where did you find the evidence that says that?

Sean:

I found it on MadisonFarmBureau.com

How and why they made sense of data and evidence

As Sean examined the data, he often thought aloud to himself as he compared and contrasted information. Sean’s groupmates often then reexamined the data.

Through the middle section of the unit, all group members attempted to identify evidence for their assigned topics (e.g., turbidity, nitrates) by following the Connection Log’s process flow. Jessica often said that water quality readings were “bad,” but did not elaborate on what “bad” meant. She often sought help from the teacher, and fell behind the rest of the group. Her groupmates appeared to understand better the meaning of trends in the collected data—that something(s) in between Site 1 and Site 3 was degrading the water quality. Sean thought that the culprit was trash, as he noted having found tires and other trash in the river.

As the unit progressed, Sean decided that the data really did not indicate any problems from the farmer perspective, as he noted that “there are no reports of cows dying” or of crops being impacted. Thus, Sean suggested stating that there was “no problem” in the river.

How and why they addressed the central problem

Group E2 seemed to grasp the idea that water quality in the river might pose a problem for farmers and ranchers of Madison. Their discussion indicated that they needed to examine field data and other information to identify a problem, assign evidence and create a strong presentation.

Sean’s position that there was no problem of significance in the river put him at odds with his groupmates. For example, Keisha concluded that if river water were used for irrigation, chemicals (e.g., phosphates) in the river might impact crops both positively and negatively.

In the presentation, Andrew, Jessica and Keisha demonstrated that they considered the cause and effect of poor water quality and its impact on farmers and ranchers. All three presentation segments made attempts to connect the field research to the problems farmers and ranchers might have. Keisha suggested that farmers should rotate their crops, which would allow them to use less fertilizer, thus potentially improving water quality.

Group E3: experimental condition

Group overview

This group included Jenny, Rachel, Josh, and Cole, and represented common citizens of Greenville. Josh helped his groupmates get back on track when they struggled to focus. Rachel was the most passive member. Cole also played a pivotal role in Group E3. Cole often engaged in off-task behavior, but he contributed to the group’s discussion due to his ability to search for and analyze relevant information. Jenny often appeared to be a group leader, but sometimes instigated off-topic behavior. She was the group member who was most concerned with defining the problem from their stakeholder perspective.

How they judged the credibility of evidence

At first, Group E3 members accepted the water quality data in their entirety. However, through use of the Connection Log, team members gradually distinguished between good and bad data. For example, Cole noted:

Test site 2… had like 172 point something [degrees Celsius]… and… [the] other ones had like 18 point something, 19 point something, so we estimated it was either 17 point something or they just messed up.

How they made sense of data and evidence

By Day 5, the group members began to see trends in the data, as evidenced by this quote from Rachel: “At Site 1 [we] caught a bunch of stoneflies and then the people down the river only caught one…Site 1 had more phosphates and nitrates…Turbidity from Site 1 to Site 3 changed big time [got much higher].” Their understanding evolved while articulating the problem definition, information, and solutions in the Connection Log. As team leader, Josh often found errors in the data and helped his team identify the principal water quality problem, as in this example from Day 6: “All we need to worry about are stoneflies…why aren’t there very many stoneflies?” Through Josh’s guidance, Rachel and Jenny found that the level of nitrates and the water temperature were not problematic.

How and why they addressed the central problem

Through their work in the Connection Log, Group E3 members understood how to interpret water quality data, but they still struggled to define the problem from their stakeholder perspective. At first, they relied on fragmentary evidence, but discussing through the Connection Log helped them move towards deeper engagement with evidence, as they noted during the interview:

Rachel:

The Connection Log… slowed us down and made us think a little more and like go a little… more detailed with it

Jenny:

Give more directions. So that kids could know what to do, so that they wouldn’t have to get that much help from the people who are doing it and so that they could hear about it on the Connection Log

Using the data and the Connection Log, group members identified problems and considered the impact on their stakeholder. Moreover, one of the researchers helped them reexamine the problem in this example from Day 12:

Researcher 3:

Ok. So what’s the problem for common citizens in the Dale River?

Josh:

The water’s too dirty to use

Jenny:

Drinking water and why kids can’t play in it

Researcher 3:

Ok. So what do you mean by dirty?… Garbage?

Cole:

You can’t have fun in the river without getting hurt, sick

Josh:

Some of the farmers’ pesticides can poison the water so you get sick

Researcher 3:

That goes right along with pollution. So is…anything else…causing problems for the common citizens? We have garbage, we have pollution

Cole:

And sewage

Josh stated that if garbage and pesticides get into the river, the water quality will be poor. He also explained how pollution affects common citizens more logically than he had at the beginning of the unit.

In the presentation, group members described the problem from the perspective of common citizens and proposed to raise money to purchase equipment to clean the river.

Group C1: control condition

Group overview

Dave was group leader and scribe. When on task, he assigned tasks to groupmates; however, when the latter asked about their tasks, he avoided answering. Derek worked well, but would get distracted very easily. Mike was often quiet, but would talk within the group and completed his assigned tasks. Sara’s role was as worker. She asked her groupmates if she did not know how to proceed, tried to keep the group on task, and was the only one who tried to involve Jason, who often seemed too tired or disinterested to contribute.

How they judged the credibility of evidence

Group members largely just accepted any found information, including articles not relevant to Monroe farmers. The group quoted Wikipedia often, perhaps because it was often the first result to appear in a web search.

The group judged the reliability of evidence only during the interview, when they were asked how they would support a petition to a senator to stop global warming:

Researcher 2:

If you send them a link to Wikipedia do you think that would be good?

Dave:

Probably not

Researcher 2:

Why not?

Dave:

Cause that’s only one thing

Researcher 2:

So if you have multiple websites that said that global warming was a problem that would be good for them

Dave:

They’d probably need more…More sources of like telling you and like problems that you found

How and why they made sense of data and evidence

They relied on their own interpretation until the teacher came. Sara often initiated contact with the teacher, and then Dave offered one- or two-word responses to the teacher’s questions, as in this example from Day 3:

Mr. Thomas:

What else do they put on them?

Sara:

Um…

Dave:

Fertilizer

Mr. Thomas:

Fertilizer, and why do they do that?

Dave:

To help it grow

Mr. Thomas:

Okay. Would that affect the river at all, if it got into it?

Dave:

Yeah

Mr. Thomas:

How does it get in there?

Dave:

When they irrigate it comes down [washes into the river]

In addition, the group often discussed information that did not relate to the central problem. For example, Mike noted that nitrates treat chest pain: “I got on Yahoo answers and it says it’s [nitrates] for chest pain.” The group just accepted this and moved on. Although this information did not make it into their final presentation, the group never discussed whether it was relevant.

The group also classified things as either Good or Bad, as in this discussion from Day 7:

Mike:

Nitrates, yeah. It was zero [at Site 1]. At Site 3 it was point one, at Site 2 it was point two. So is um…nitrate good or bad?

Sara:

Um…I dunno, it’s your job. You should figure that out

Mike:

Well is it good or bad? Cause if it…Hey are nitrates good or bad?

Dave:

I dunno

Derek:

Bad

They oversimplified the problem by labeling water quality indicators as simply Good or Bad. For example, they noted that a pH of 6 was “really acidic” and thus “bad.” While a pH of 6 is only slightly below the standard for rivers in the region (6.5), the group inflated the problem. Yet pH was later dropped in favor of other issues. They also did not consider how water quality indicators changed along the course of the river.

How and why they addressed the central problem

In the presentation, nitrates were not labeled as good or bad, but the group discussed negative effects of nitrates on people:

Mike:

[If] the nitrates get into your blood cells, they can slow it down and give you a disease called methengeral; [Methaemoglobinemia] I think that’s what it’s called. And it can slow the path to the brain that carries the oxygen so you will basically die… and it will poison the babies

While presenting, the group abandoned their stakeholder position as Monroe farmers. Nitrates may kill fish, but the group did not establish a connection between “methengeral” and farmers. They proposed solutions to keep the water clean such as planting grass, reducing litter, and reducing paved surfaces next to the river. But other than briefly mentioning that irrigation pipes could be affected by bacteria, the presentation did not relate to a farming perspective.

Group C2: control condition

Group overview

Group C2 consisted of Zach, Pat, Rob, and Christian, and represented common citizens. As scribe, Zach assigned tasks. Rob was often confused about how to find information. Other group members helped him by (a) providing direction for online searching, (b) evaluating information Rob found, and (c) providing potential evidence. Christian functioned mostly as information seeker and Pat as a problem solver.

How they judged the credibility of evidence

Teacher prompting helped the students identify outliers. For example, Rob noted, “Wait, the data of water temperature—I think it’s supposed to be 12.8. They wrote 21.” Zach wrote in his notes, “The water temperature was about the same all the way down the river; it was about…12.8 Celsius.”

When searching for information online, the students seldom considered credibility. For example, when Rob used information from a personal blog as his evidence, the teacher pointed to his computer and said, “That just looks like somebody’s opinion.” Rob replied, “Well, it somewhat is someone’s opinion, but it kind of is also true.” Zach seemed to understand the need to examine generalizability of information. After he found a water quality standard from a neighboring state, he asked the teacher whether it applies to Madison.

How and why they made sense of data and evidence

At unit start, group members were confused about how to identify patterns in the data. With help from the teacher, they compared data across the three test sites and noticed that the amount of insects decreased as the river went downstream. Thus, they noted, “There wasn’t as much pollution [near the source of the river]…as there was down the stream.” They now understood how to identify patterns in the data and concluded, “Turbidity is gotten [sic] worse over the route of the river.”

Although they identified patterns in the water quality data, they saw this as unconnected to addressing the unit problem. They soon began searching for information online to address the unit problem. For example, instead of examining phosphate levels in the collected data, Zach and Pat thought phosphates might be a problem because they found online that high levels of phosphate can affect water quality. Although the river was not tested for parasites, Christian and Rob thought parasites were a problem since the teacher told them to search for how parasites in the water can affect common citizens. After reviewing found information, Christian said the river’s water was “terrible” and they “should never drink any of it.” In the interview, when asked why they did not refer much to the water quality data, Zach said, “Um, I think we forgot about it actually. I think we were more focused on it affects us how, and stuff like that.”

How and why they addressed the central problem

Having ignored the water quality testing data, Group C2 members attempted to address the unit problem through online searching. On Day 11, Christian found a webpage about pollution in U.S. drinking water, which stated, “in 2009 drinking water started getting polluted [and] the pollution is increased.” Pat also found another piece of information on the same webpage—“big cities tend to rely on rivers or lakes as the source of drinking water.” Thus, Pat proposed to use more groundwater for drinking water because “surface water is up above where it can get contaminated more, like from trash” and “ground water gets filtered through the soil and rock.” During the presentation, he proposed “putting in more wells so we can start getting the ground water out and drinking that.”

On the last day of the unit, the teacher asked the students whether there was a water quality problem. Zach noted that turbidity increased along the river and then realized that turbidity was problematic. Zach pointed out that sediment in the river comes from runoff, so Pat proposed to build a fence on the mountain to prevent dirt and trash from getting into the water. Zach and Rob questioned the applicability of this solution, which led Pat to search for more information. Based on the additional information Pat found, the group agreed that building a silt fence would solve the turbidity problem and the cost was acceptable.

Cross case analysis

See Table 2 for a summary of the primary goals of each group, and whether the group adhered to its stakeholder position. Members of Groups E1 and E3 (both experimental condition) appeared to hold as goals to engage with (a) evidence that they and other 7th graders collected, noting what it said about trends along the course of the river, and (b) evidence from other sources (e.g., websites). Both groups interpreted information and addressed the problem from their stakeholder positions. Three of the four members of Group E2 (experimental condition) appeared to hold goals similar to those of Group E1 and E3 members: they considered trends in the data, and what may have caused changes, but largely not from their stakeholder perspective. The remaining member considered the data from his stakeholder perspective, but then abandoned it when he could not find any reports of cows dying or crops being adversely impacted. Members of Group C2 (control condition) appeared simply to desire to find answers online about whether the river was polluted. These students considered trends in water quality data early on, but then promptly forgot about them in favor of online research on how water quality “affects us…and stuff like that.” They also read about some rivers somewhere having water quality problems, and interpreted that as meaning that their local river had those same problems. They finally included something about a water quality indicator at the urging of the teacher on the last day of the unit. Members of Group C1 (control condition) appeared to hold as their goal simply labeling different aspects (e.g., nitrate levels) of the river as good or bad, using Internet information to tell them whether these aspects were good or bad. They also largely abandoned their stakeholder position.

Table 2 Cross-case analysis

Discussion

Approaches of the groups

Rather than aim to generalize findings of a case study from the sample (i.e., case) to the population, case study researchers should pull from case studies constructs and the relationships among them in a process of analytic generalization (Yin 2013). In so doing, one can connect the network of constructs emanating from the target case study with extant theory, resulting in an enhanced ability to explain and predict phenomena. We found preliminary evidence of a difference between groups in approach to addressing the central problem. Members of two (Groups E1 and E3) of the three groups in the experimental condition displayed evidence of a goal to engage with and synthesize information from multiple sources, including data that they and their classmates collected and information from experts. In short, these students displayed evidence of an epistemic aim of “acquiring true, justified beliefs” (Chinn et al. 2011, p. 147). Consistent with a more sophisticated epistemology, the students viewed experts as not having all of the answers (Bråten et al. 2013; Greene et al. 2008) and saw justification as relying on reliability/validity (Strømsø et al. 2011), multiple sources (Bråten et al. 2011; Mason et al. 2011), and coherence (Barzilai and Zohar 2012; Kienhues et al. 2011). However, members of the third experimental group (Group E2) exhibited epistemic vices in abandoning their stakeholder perspective (three members) or demonstrating a need for quick learning (one member) (Chinn et al. 2011; Hofer and Pintrich 1997). Three members considered water quality data as pertinent to addressing the problem but interpreted the data from an environmentalist perspective rather than that of farmers; the remaining member largely ignored the data when he did not find reports of cows dying or crops being impacted.

Members of Groups C1 and C2 (control condition) demonstrated an absolutist stance and a complete appeal to authority, respectively, neither of which aligns with sophisticated epistemic beliefs (Buehl 2008; Limón 2006). Indeed, it is contrary to good science to believe that a single piece of data can answer a dichotomous question of whether the water quality of a river is good or bad (Nersessian 2008). It is also problematic to believe that experts have all the answers, and that one simply needs to consult experts to find out answers (Mason et al. 2006; Muis 2007), especially when the experts do not have any knowledge of the river in question. This reflects the epistemic aim of acquiring minimally justified beliefs (Chinn et al. 2011).

When considering differences in goals, it is important to note that these groups were selected as typical cases. In a study based on a different part of the same dataset, we found no difference between conditions in argument evaluation ability at unit start (Belland et al. 2015a). The idea that the Connection Log led students to adopt a more epistemologically sophisticated approach is tentative, but it is interesting and warrants further research.

One possible reason for the differences in approaches may have to do with the extent to which students regulated their own learning or relied on the teacher to do so. Members of Groups C1 and C2 largely relied on the teacher to tell them what to do. Jessica (Group E2) also did this. Members of Groups E1 and E3 were much better able to regulate their own learning, by their own admission largely due to support from the Connection Log. Some evidence indicates that by regulating one’s own learning, one can move to a more epistemologically sophisticated inquiry approach (Greene et al. 2010; Muis 2007). Defining learning issues and regulating one’s own learning is also an important prerequisite to lifelong learning—an important goal that the integration of socioscientific issues seeks to promote (Kolstø 2001; Loyens et al. 2008).

It is also possible that the Connection Log led students in the experimental condition to question each other’s ideas more than students in the control condition. Questioning ideas, especially those of experts, is key to developing more sophisticated epistemological beliefs (Ikuenobe 2001). Middle school aged students may need to be actively encouraged to evaluate each other’s ideas and engage in argumentation (Kuhn and Udell 2007). The Connection Log does that by asking students to (a) read and discuss their groupmates’ input, and come to consensus on the topic at hand, and (b) prepare an argument in support of their problem solution. Furthermore, the scaffold gives students criteria by which to judge sources.

None of the groups questioned the credibility of collected data and Internet sources to the extent that would be desirable. It takes time to transition from absolutist to more sophisticated epistemic beliefs (Sandoval 2005). It is important to consider how to design scaffolding such that epistemological stances can transition faster and more effectively (Duschl 2008).

Implications for research

Sophisticated epistemic beliefs are critical to success in PBL, in particular to evaluating and synthesizing information, and making and engaging with others’ claims (Sahin 2009; Schmidt et al. 2011; Strømsø and Bråten 2010). But middle school students often espouse unsophisticated epistemic beliefs (Kuhn et al. 2013), and this can lead them to evaluate evidence (Britt et al. 2014; Nicolaidou et al. 2011) and arguments (Papadouris and Constantinou 2014) based on irrelevant criteria. This is clearly problematic for effective engagement with PBL problems. The natural question is how to lead middle school students to adopt more sophisticated epistemic beliefs. Unfortunately, while the research base indicates that epistemic beliefs are crucial to addressing ill-structured problems, it does not indicate the most effective methods for helping middle school students develop more sophisticated epistemic beliefs. If one assumes that holding sophisticated epistemic beliefs is a prerequisite to starting a PBL unit, then such beliefs should be fostered before students engage in PBL, for example, using a direct instruction approach. But the idea of direct instruction about sophisticated epistemic beliefs is antithetical to the very idea of sophisticated epistemic beliefs. After all, if students end up believing without questioning when told that it is important to question knowledge claims, it could be said that they are in fact exhibiting unsophisticated epistemic beliefs in the service of developing more sophisticated epistemic beliefs—a confusing and illogical causal path.

There is some evidence that, by engaging in argumentation, students can begin to develop more sophisticated epistemic beliefs and acquire and deploy norms of argumentation when evaluating arguments (Kuhn et al. 2013; Osborne et al. 2013). This makes sense because the very nature of argumentation privileges the hallmarks of sophisticated epistemic beliefs (e.g., justification through reference to multiple sources, the importance of questioning knowledge claims). But much of this prior research involved very brief interventions and relatively well-structured problems that did not require much in the way of self-directed learning. In these cases, students often could only make one of two claims (e.g., that the death penalty is appropriate or inappropriate in a particular situation) along with evidence in support of the claim. There was no need for the collection and analysis of data, and the creation of claims on that basis. Thus, one could question whether students could both develop sophisticated epistemic beliefs and have a strong understanding of how such beliefs can be applied when evaluating claims and addressing problems. Evidence from this study implies that by engaging in PBL aided by appropriate support, middle school students can develop sophisticated epistemic beliefs and can use them in the process of evaluating the credibility of evidence and addressing the central PBL problem.

Implications for scaffold design

In the article that laid the groundwork for the use of the scaffolding metaphor in education, Wood et al. (1976) noted that effective scaffolders use theories of the task and of the tutee to generate scaffolding messages. This sort of dynamic generation of scaffolding is a hallmark of effective teacher scaffolding, but unrealistic when it comes to computer-based scaffolding supporting ill-structured problem solving (Belland et al. 2011). Rather, the design of computer-based scaffolding needs to happen before students use it. To do so, one can construct several different models of what one is trying to achieve and how to achieve it—process, situation, affordance, and motives models (Akhras and Self 2002; Belland and Drake 2013). This gives one a lens through which to examine how results from this study can inform scaffold design. In this study, members of student groups who used the Connection Log displayed sophisticated epistemic beliefs, while their control counterparts displayed unsophisticated epistemic beliefs. The Connection Log did not overtly teach epistemology. Taking a step back from the immediate results of this study, one can think about how the Connection Log was designed. A key step in the design was the construction of the process model—a description of the activity system in which the target students were meant to engage (Belland and Drake 2013). In it, we considered the goal that students be able to engage in problem solving and argumentation in the manner of professional scientists. Specifically, the goal was that through engagement with the Connection Log, students would engage with a cultural tool and gradually internalize scientific practices such as looking for evidence in multiple places (Leont’ev 1974; Luria 1976). Professional scientists maintain a healthy level of skepticism when engaging with evidence and arguments, and so fostering such skepticism was a goal of the scaffold. But the strategy for helping students learn to do this was not direct instruction or an overt focus on skepticism, but rather engaging students in the types of activities, aided by the types of tools and processes, that scientists would employ when addressing problems and engaging in argumentation. In this way, more sophisticated epistemic beliefs may be the cognitive residue (Salomon et al. 1991) of the Connection Log. This makes sense because the Connection Log prompted students to gather information from multiple sources. More research is needed to ascertain whether the differences in approaches between the experimental and control groups are evidence of cognitive residue.

Computer-based scaffolding is a term that encompasses interventions that employ a wide range of strategies. Such variations in strategies can in large part be attributed to differences in the theoretical frameworks that guide scaffolding development (Belland 2016). For example, scaffolding that is driven by the Adaptive Character of Thought-Rational (ACT-R) framework (Anderson et al. 1997) is at a much smaller grain size and is designed to meet different goals (e.g., help learners develop production rules) than scaffolding that is driven by cultural-historical activity theory (e.g., get learners to engage in actions in a similar manner to members of the target culture) (Leont’ev 1974; Luria 1976). This may mean that argumentation scaffolding guided by theoretical frameworks other than cultural-historical activity theory would not likely impact students’ epistemic beliefs. This can only be ascertained through reference to empirical data.

Limitations and suggestions for future research

This study is exploratory, and as such, it can only provide tentative conclusions about differences in approaches between groups. We did not directly measure students’ epistemic beliefs except through questions in the interview. Using computer-based learning environment features to collect data on students’ self-regulated learning and epistemic cognition, and on the extent to which these influence each other, is a fruitful avenue for future research, and may lead to more generalizable conclusions (Greene et al. 2010). Furthermore, differences in approaches to (a) judging the credibility of evidence, (b) making sense of data and evidence, and (c) addressing the problem are not the only constructs of interest when examining how students’ approaches to solving authentic problems vary based on being given argumentation scaffolds or not. For example, it is important to consider how students define the problem space, self-direct their learning, and collaborate. It is also important to consider the additional support/structure that can be provided due to the nature of PBL implementation (Hung 2011). But there is only so much that can be addressed in one study. Future research can build on the findings of this study to further understand the nature of problem solving in PBL.

The student population of the participating school was largely homogeneous. Thus not all findings may apply to students from other settings and backgrounds. However, qualitative research does not aim at generalizability to the greater population, but rather further unpacking of theories and associated constructs (Yin 2013).

Due to microphone difficulties, audio from Days 3 and 5 was not clear at times. Therefore, we could have missed dialogue that could have led to an enhanced understanding of group processes or dynamics. On Day 3, students prepared for collecting water quality data, and on Day 5, they took their first look at the collected water quality data.

The unit was suspended for 1 week in the middle due to the state’s hunting season, when about half of the students were absent. Some video was missing for Group C2 (Days 5 and 7) and Group E1 (Day 11). We asked students about those days in the interview.

The Connection Log had some technical issues, which confused some students.

The Connection Log was not designed with the explicit objective of epistemic cognition support. Furthermore, the study design is not such that specific design features can be isolated that may have led to enhanced epistemic cognition. Effort is underway to embed greater support for epistemic cognition in the Connection Log. To examine if these changes lead to more sophisticated epistemic cognition, further case studies from an epistemological standpoint would be warranted, as this approach allows for a holistic view of how students construct social order in a group. But the addition of think-aloud protocols would help further document how students think about approaching evidence and the problem (Ferguson et al. 2012). Furthermore, it is important to not only measure epistemic variables, but also to relate this to variables such as self-regulated learning (Bråten et al. 2011; Muis 2007) and motivation (Chinn et al. 2011; Hofer and Pintrich 1997; Muis 2007). As it often takes much time to develop sophisticated epistemic beliefs (Greene et al. 2008), it would be worthwhile to conduct longitudinal studies along these lines.