Introduction

The ability to think critically and reflectively about text is essential for academic success (Kuhn 2005). In today’s schools, there is a pressing need to help students read beyond a text’s surface (Chang-Wells and Wells 1993). However, instructional approaches for promoting higher level comprehension of text are not well established. In this article, we define higher level comprehension as a way of thinking about text that goes beyond a literal understanding.

Classroom discourse, in Cazden’s (2001) terms “the language of learning”, plays a fundamental role in the development of reading comprehension. Since the 1970s, research has identified a number of productive classroom discussion approaches that appear to be effective in promoting students’ reading comprehension and high-level responses to text in elementary, middle and high school settings (Applebee et al. 2003; Murphy et al. 2009; Nystrand 1997, 2006). Collaborative Reasoning (CR) (Anderson et al. 2001) is a peer-led small-group discussion approach that promotes critical thinking and reasoning about text in American classrooms (Murphy et al. 2009). In the present study, we tried out CR in fourth-grade classrooms in China over a period of eight weeks and investigated the impact of CR on Chinese children’s reading comprehension and on teachers’ adaptation and learning in CR.

Previous research has shown that English-speaking students produce more productive talk in CR discussions than in conventional discussions or recitations (Anderson et al. 2001). CR promotes Spanish-speaking English language learners’ (ELLs) reading comprehension, as assessed by a sentence verification task that requires limited inference (Zhang et al. 2013). However, it remains unknown whether CR promotes students’ higher-level reading comprehension. Reading instruction in elementary classrooms in China predominantly follows a teacher initiation, student response, teacher evaluation pattern (IRE; Mehan 1979). Such a regimen may be efficient for teaching factual knowledge and basic skills, but does little to promote children’s reasoning, critical thinking, and high-level comprehension.

Although the feasibility and effectiveness of CR for Chinese students have been explored in several elementary classrooms (Dong et al. 2008, 2009), the discussions were facilitated by young researchers rather than regular classroom teachers. Power and authority differentials inherent in Chinese teacher-student relationships may not have fully played out. The question remains, therefore, whether regular Chinese teachers and their students can re-negotiate their classroom roles and alter the discourse pattern so that teachers become facilitators and students become active discussants with one another. This study is the first in China to examine the impact of CR as implemented by ‘real’ teachers.

Collaborative reasoning and reading comprehension

CR is a peer-led small-group discussion approach. In CR, students read a text containing an unresolved issue with multiple, competing points of view. Then, in heterogeneous groups of five to eight, students deliberate the Big Question raised by the text. Reasoned argumentation is expected in which students take public positions on the Big Question, support positions with reasons and evidence, evaluate and respond to one another’s arguments, and challenge one another when they disagree. Students are responsible for managing their discussions, negotiating turn-taking in an orderly fashion. The teacher’s role is to scaffold student argumentation and social skills from the side.

Research spanning two decades shows that CR promotes American children’s thinking, language acquisition, and social skills (e.g., Anderson et al. 2001; Chinn et al. 2001; Li et al. 2007; Wu et al. 2013). To date, however, CR appears to be neutral with regard to performance on standardized reading tests that assess comprehension of brief passages accompanied by multiple-choice questions, although it may promote higher level text comprehension. In a study with fifth-grade Spanish-speaking ELLs, Zhang et al. (2013) used an alternative reading comprehension measure, the Sentence Verification Technique (SVT; Royer and Carlo 1991). The SVT requires a yes-or-no judgment about whether test sentences have the same meaning as sentences in a previously read passage. Results showed that the CR students, especially students with higher English proficiency, performed significantly better than no-CR students on the SVT test after participating in eight CR discussions.

Reading comprehension is an active meaning-making process in which readers simultaneously extract information and construct interpretations (Afflerbach and Cho 2009; McKeown et al. 2009). According to Kintsch’s construction-integration model of text comprehension, there are at least two levels of text comprehension: the surface level and a deeper level, or situation model (Kintsch 1988; van Dijk and Kintsch 1983). Surface level comprehension means literal understanding of specific words and syntactic structures. The situation model is a deeper conceptual depiction of what the text is about: a mental representation of people, states, settings, actions, and events that are either explicitly mentioned or inferred from text (Graesser et al. 1994; Kintsch and Rawson 2005). Constructed-response questions are well suited for assessing aspects of high-level comprehension that require readers to distill meanings, check the plausibility of those meanings, create connections between text and prior knowledge and personal experience, and reason about, interpret, and evaluate text.

In typical elementary classrooms in China, reading instruction relies heavily on textbooks to ensure that the National Curriculum Guide is followed. The prevailing mode of reading instruction in China is teacher-directed whole-class instruction (Wu et al. 1999). It stands to reason that participation in CR discussions would enhance a deeper level or situation model of text comprehension in Chinese children. First, CR calls for deeper reading, extended responses to central issues, and reflective and open-minded thinking, which rarely occur in typical reading lessons in China. Unlike ordinary story reading, in which mastery of details and facts is emphasized, CR requires students to read stories more deeply in order to be better prepared for participation in discussions.

Second, unlike most available texts in Chinese textbooks or extracurricular material, which tend to be one-sided, the set of texts selected for CR discussions in this study contains ample consideration of both sides of issues. CR discussions with these carefully selected texts enable students to develop an argument schema (Reznitskaya et al. 2008). According to argument schema theory, an argument schema is a mental representation of the structure of a sound argument consisting of a claim, explanation, evidence, counterargument, and rebuttal, which is developed through socialization into argumentative discourse in a collective setting (Reznitskaya et al. 2008). Lin et al. (2014) proposed that the argument schema assists a reader in monitoring and regulating the formation of a situation model. Once the argument schema is activated, it provides a structural framework for readers to analyze and integrate various argument components in the text so as to produce better and deeper understanding. The proposal was supported by empirical findings that the argument scaffolding group generated more knowledge-based inferences than did the no scaffolding group (Lin et al. 2014).

Teacher adaptation to collaborative reasoning

Previous research on CR has primarily focused on how the open-format discourse pattern influences student social and cognitive outcomes. Limited research has investigated teachers’ adaptation to CR discussions, with the exception of a case study by Nguyen-Jahiel et al. (2007), who documented the challenges a veteran American elementary school teacher faced in implementing CR discussions in her fourth-grade classroom. Although the transformation from the teacher-directed, typical initiation-response-evaluation (IRE) pattern of interaction to student-centered open-format discussions was anything but smooth, the teacher and her students were able to adjust their respective roles and acquire a new way of managing their discourse.

To date, little is understood about effective teacher professional development approaches for student-centered text-based classroom discussions and about how teachers apply new discussion approaches in classrooms. A few teacher professional development instruments have been developed to train teachers in reading comprehension instruction, including the Comprehension and Learning from Text Survey (CoLTS; Kucan et al. 2011) and the Video Viewing Task (VVT; Kucan et al. 2009). Kucan et al. (2009) focused on teacher learning across a 4-day professional development institute using the VVT. Analyses of teachers’ comments and reflections revealed a shift in teacher stance, from an initial descriptive perspective to a more analytical one, and a shift from noting explicit to implicit features of participation and interaction during discussions. The teacher transformation from recitation to dialogic instruction was incremental and needed guided support (Beck et al. 1996; Kong and Pearson 2003).

In a large-scale teacher survey study, Zhao et al. (2009) found that Chinese teachers spent a lot of time in training and research activities under the new curriculum reform policies, but had low satisfaction with such professional development activities. The limitations of professional development activities for teachers in China include deviation of teaching content from practical needs, lack of follow-up guidance, limited competence of experts, and ineffective training methods. What teachers need most is knowledge and skills that are close to their teaching practice (Li et al. 2013).

The present study

This study aimed to investigate how Chinese teachers and students adapt to CR, and the effects of CR on Chinese children’s reading comprehension and on teachers’ learning and practices in reading instruction. The first goal of the study was to investigate the impact of CR on children’s reading comprehension. In the present study, constructed-response (both short and extended) questions in two well-known reading tests, the National Assessment of Educational Progress (NAEP) and the Progress in International Reading Literacy Study (PIRLS), were used to examine the effects of CR on high-level comprehension of Chinese children. These two tests were chosen because no standardized reading comprehension measure is available in China and commercialized, standardized multiple-choice questions alone did not meet the purpose of the study. In each test, four target cognitive processes, representing different levels of reading comprehension, are assessed. NAEP 1992–2007 assesses four aspects of reading: forming general understanding; developing interpretations; making reader/text connections; and examining text content and structure (NCES report 2012). Four similar cognitive processes are assessed in PIRLS: focusing on and retrieving explicitly stated information; making straightforward inferences; interpreting and integrating ideas and information; and examining and evaluating content, language, and textual elements (Mullis et al. 2006). In both NAEP and PIRLS reading tests, the latter two cognitive processes are considered higher level reading comprehension.

The second goal of the study was to investigate whether Chinese teachers and students can successfully adapt to CR discussions, and the impact of CR on teachers’ learning and teaching practices in reading instruction. Given the sharp contrast between the expectations of successful CR implementation and Chinese teachers’ traditional professional development and teaching experiences, it is of interest to see how teachers alter their roles and whether the CR experiences affect their practices in reading instruction.

Three research questions guided the current study:

  1. To what extent does CR impact Chinese children’s reading comprehension, especially higher level comprehension?

  2. Can Chinese teachers and their students re-negotiate their classroom roles and alter the discourse pattern?

  3. To what extent does CR impact Chinese teachers’ professional learning and practices in reading instruction?

Method

Participants

The participants were 106 fourth-grade students from four classrooms in an elementary school serving working-class families in Beijing, China. Two classrooms (n = 56) were randomly assigned to implement CR while the other two classrooms (n = 50) served as controls. Two homeroom teachers were chosen randomly as CR teachers. One was Mrs. Chang (pseudonym), who had a bachelor’s degree in elementary education and had taught for 4 years at the time of the study. There were 27 students, 14 boys and 13 girls, in her class. She was a motivating young teacher. The other CR teacher, Mrs. Kai (pseudonym), had taught for 20 years and was regarded as a master teacher. There were 29 students, 15 boys and 14 girls, in her class.

Measures and procedure

Pretests

A baseline reading lesson was videotaped in each of the two CR classrooms. The teachers were asked not to alter anything they would normally do in daily reading lessons. As a reading comprehension pretest, students completed the Grade 4 reading test from NAEP 2005. The passage was ‘How the Brazilian Beetles Got Their Coats’ (Bennett 1995). There were 3 multiple-choice questions and 6 constructed-response items.

CR training and implementation

After the pretest, the four participating teachers attended a 2-day workshop on moderating CR discussions and were trained by senior CR researchers who managed the CR research program in the United States. They were first introduced to the theoretical framework of CR and its research background. Then they were trained to use instructional moves (Clark et al. 2003) designed to facilitate students’ independent thinking and self-management of discussions. Teachers watched video clips of exemplary CR discussions, considered how to improve problematic discussions, and practiced preparing an argument outline to facilitate a discussion. Teachers were randomly assigned to the CR and control conditions after the teacher training. The control teachers were told not to implement CR and were not provided with the CR stories and the manual of CR discussion until the end of the project.

In the following 8 weeks, students in the two experimental classrooms engaged in one CR discussion per week. Each class was divided into four discussion groups, with 6–8 students per group. Each group was a cross-section of the class in terms of gender, reading ability, and talkativeness. Students in the two control classrooms continued their regular reading lessons while the students in the CR group participated in discussions. As required by the principal, the two CR teachers wrote a weekly reflection after the debriefing with the CR researchers during the CR implementation.

Eight stories successfully used in CR discussions in American classrooms were chosen by two researchers who have extensive background in CR and Chinese reading instruction. The stories were translated by two graduate students fluent in Chinese and English. These translated stories were interesting and relevant to Chinese children’s life experience, and appropriate for their reading level as determined by Chinese elementary school teachers. The stories describe dilemmas faced by the characters, creating contexts for consideration of topics such as friendship, fairness, honesty and integrity, winning and losing, and public policy. The stories and the order of discussions were identical in the two CR classrooms.

Local CR researchers provided follow-up on-site coaching for the CR teachers. Three participant observers were present during CR discussions to videotape, observe, take field notes, and provide feedback when necessary. The length of discussions ranged from 12 to 30 min, with an average of 24 min. Two discussion videos were accidentally lost, so 62 videos were available for transcription and analysis. The transcribed corpus totaled 1508 min and 5580 speaking turns.

Posttests

After the intervention, all students completed two reading comprehension posttests: a translated subtest from the NAEP 2002 Grade 4 reading test and a Chinese subtest from the PIRLS (2006) Taiwan Grade 4 reading test (PIRLS 2006 Taiwan Report). The NAEP items were publicly available and the NAEP passages were translated with permission of the publisher. The PIRLS Chinese passage and items were publicly available from the Taiwan PIRLS website.

The NAEP test contained the passage, The Box in the Barn (Conner 1988), 5 multiple-choice items, and 7 constructed-response items. The PIRLS reading comprehension test comprised the passage, An Unbelievable Night (Hohler 2003), 7 multiple-choice items and 5 constructed-response items. Both texts afforded reading for literary experience. Scoring of the constructed-response items in the NAEP and PIRLS tests followed the publicly available scoring guides. Each multiple-choice question is worth one point. Constructed-response questions are worth one, two or three points, depending on the depth of understanding and the extent of textual support the question requires. Scoring guides for the NAEP pre- and posttests were translated. Two raters practiced scoring answers collected in a pilot study, and reached consensus on how to apply the scoring guide. Then the two raters independently scored all constructed-response items. The raters were blind to treatment condition and had no information about the children. The average inter-rater reliability for the NAEP pretest, the NAEP posttest, and the PIRLS posttest was .92, .91, and .94, respectively. Discrepancies were resolved in discussion.

After the posttests, the two CR teachers were interviewed individually about the extent to which CR had impacted their reading instruction practice.

Data analysis

To answer the first research question (the impact of CR on reading comprehension), analysis of covariance (ANCOVA) was used first, and examples of student responses to the PIRLS constructed-response items were presented to illustrate basic and high-level comprehension. To address the second research question (teacher and student adaptation to CR and discourse change), we analyzed classroom video data of pre-intervention or baseline reading lessons and CR discussions. To gain insight into the impact of CR on teacher learning and practice (the third research question), we also analyzed CR teachers’ weekly reflections and post-intervention interviews. The primary researcher read teachers’ weekly reflections and transcriptions of post-intervention interviews multiple times and determined the key concepts related to teachers’ learning and adaptation to CR. The key concepts of CR teachers’ professional learning included the feelings they had, the adjustments they made deliberately, the ways they interacted with students, and the changes they made in their reading instruction practice. Four themes or developmental stages of CR teachers’ adaptation were identified; these themes or developmental stages were then described and interpreted.

Results

Effects of CR on children’s reading comprehension

Table 1 summarizes performance on the pre- and post-reading tests. ANOVA showed no significant difference between CR students and control students on the NAEP reading pretest total scores, F (1, 104) = 2.58, p = .11, \(\eta_{p}^{2} = .02\), the multiple-choice item scores, F (1, 104) = 2.77, p = .10, \(\eta_{p}^{2} = .03\), or the constructed-response item scores, F (1, 104) = 1.58, p = .21, \(\eta_{p}^{2} = .02\). Thus the two groups can be regarded as comparable in initial reading comprehension.

Table 1 Means and standard deviations of reading pretest and posttest scores

Treating the NAEP posttest performance as the dependent variable and the NAEP pretest total as a covariate, ANCOVA analyses showed no significant overall intervention effect on the NAEP posttest total, F (1, 103) = 1.63, p = .21, \(\eta_{p}^{2} = .02\), or on the constructed-response items, F (1, 103) = .05, p = .82, \(\eta_{p}^{2} = .00\). Unexpectedly, the control group performed significantly better than the CR group on the NAEP multiple-choice items, F (1, 103) = 7.13, p = .01, \(\eta_{p}^{2} = .07\).
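The logic of an ANCOVA of this form (one two-level condition factor, one pretest covariate) can be illustrated with a minimal sketch. This is not the authors' analysis script: the function names and the toy scores are hypothetical, the condition is assumed to be coded 0/1, and the regression slopes are assumed to be homogeneous across conditions. The sketch removes the covariate from both the posttest and the condition code, then tests the remaining condition effect with n − 3 error degrees of freedom (103 when n = 106).

```python
from statistics import mean

def residualize(values, covariate):
    """Residuals of `values` after a simple linear regression on `covariate`."""
    mx, my = mean(covariate), mean(values)
    sxx = sum((x - mx) ** 2 for x in covariate)
    sxy = sum((x - mx) * (y - my) for x, y in zip(covariate, values))
    slope = sxy / sxx
    return [(y - my) - slope * (x - mx) for x, y in zip(covariate, values)]

def ancova_one_covariate(post, pre, group):
    """ANCOVA for a 0/1 group factor with one covariate (homogeneous slopes assumed).

    Returns (F, df_effect, df_error, partial_eta_squared).
    """
    n = len(post)
    ry = residualize(post, pre)                        # posttest adjusted for pretest
    rg = residualize([float(g) for g in group], pre)   # group code adjusted for pretest
    sgg = sum(g * g for g in rg)
    sgy = sum(g * y for g, y in zip(rg, ry))
    ss_effect = sgy ** 2 / sgg                         # variance explained by group beyond pretest
    ss_error = sum(y * y for y in ry) - ss_effect      # residual variance of the full model
    df_effect, df_error = 1, n - 3                     # intercept, covariate, and group are estimated
    f_stat = ss_effect / (ss_error / df_error)
    eta_p2 = ss_effect / (ss_effect + ss_error)
    return f_stat, df_effect, df_error, eta_p2

# Hypothetical toy scores, for illustration only (not data from this study).
pre = [1, 2, 3, 4, 1, 2, 3, 4]
group = [0, 0, 0, 0, 1, 1, 1, 1]
post = [1.1, 1.9, 3.2, 3.8, 3.0, 4.1, 4.9, 6.0]
f_stat, df1, df2, eta_p2 = ancova_one_covariate(post, pre, group)
print(f"F({df1}, {df2}) = {f_stat:.2f}, partial eta squared = {eta_p2:.2f}")
```

In practice a statistics package would also report the p value and check the homogeneity-of-slopes assumption, which this sketch leaves out.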

The CR group, however, performed significantly better than the control group on the PIRLS reading posttest total scores, F (1, 103) = 4.78, p = .03, \(\eta_{p}^{2} = .04\), and on the PIRLS constructed-response items, F (1, 103) = 5.54, p = .02, \(\eta_{p}^{2} = .05\). According to recommendations by Cohen (1988), the values of partial eta squared were low to medium. No significant difference between the CR group and the control group was found on the PIRLS multiple-choice items, F (1, 103) = .51, p = .48, \(\eta_{p}^{2} = .01\).
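For readers who want to relate the reported F values to the effect sizes, partial eta squared for a single effect can be recovered from F and its degrees of freedom as F·df_effect / (F·df_effect + df_error). The short sketch below applies this identity to the PIRLS total score. The function names are illustrative, and the cut-offs of .01, .06, and .14 are the benchmarks commonly attributed to Cohen (1988), used here as an assumption rather than taken from the article. Because the published F values are rounded, recomputed effect sizes can differ from the reported ones in the last digit.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

def cohen_label(eta_p2):
    """Label an effect with the commonly cited .01 / .06 / .14 benchmarks (assumed here)."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"

# PIRLS posttest total reported above: F(1, 103) = 4.78
eta = partial_eta_squared(4.78, 1, 103)
print(round(eta, 2), cohen_label(eta))
```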

To better understand the effects of CR on different levels of reading comprehension, PIRLS scores were further broken down into four aspects of reading and ANCOVA analyses were performed on the scores from each subcategory. Results indicated that the CR group performed significantly better in evaluating content, language, and textual elements, F (1, 103) = 4.81, p = .03, \(\eta_{p}^{2} = .05\). There was a non-significant trend favoring CR on interpreting and integrating ideas and information, F (1, 103) = 2.92, p = .09, \(\eta_{p}^{2} = .03\). No significant difference between the two conditions was found on retrieving explicitly stated information, F (1, 103) = .84, p = .36, \(\eta_{p}^{2} = .01\), or making straightforward inferences, F (1, 103) = .67, p = .42, \(\eta_{p}^{2} = .01\). The scoring rubric and examples of student responses to the PIRLS constructed-response items are presented in Table 2. The results provide evidence that CR supports high-level comprehension, but not basic and literal comprehension, among Chinese children. The NAEP scores could not be analyzed in a similar fashion, because no item parameters were available for Chinese children, owing to the small sample size and the multidimensional IRT techniques required to calculate NAEP scale scores.

Table 2 Sample responses to PIRLS constructed items

Moving from recitation to CR: discourse analyses

To compare and contrast the classroom discourse patterns before and during CR discussions, discourse features (e.g., length, turns, characters, and fluency) in the transcripts of baseline reading lessons and subsequent CR discussions in the two CR classes were analyzed. The contrast provides context for readers to understand typical reading lessons in China and illustrates student and teacher change from traditional reading lessons to CR. The two 40-min baseline reading lessons and 62 CR discussions (1508 min in total and about 24 min for each discussion) in the two classes were aggregated.

Table 3 shows large changes in the number of speaking turns and in utterance length in Chinese characters. CR increased student talk and decreased both teacher talk and teacher control of topic. The percentages of student speaking turns and characters during CR (82 and 78 %) almost doubled compared to the baseline discussions (48 and 38 %). The percentage of teacher talk in speaking turns and characters dropped from 52 and 62 % in baseline discussions to 18 and 22 % in CR, respectively. In the baseline discussions, teacher and student turns were roughly even (52 vs. 48 %); however, the number of characters in teacher talk was much greater than in student talk (62 vs. 38 %), suggesting a dominant teacher role.

Table 3 Teacher and student speaking turns and Chinese characters in baseline reading lessons and CR discussions

In CR discussions, the percentages of student talk in speaking turns and characters were about four times those of teacher talk. Mean length of student turns in CR discussions (49 characters per turn) was about twice as high as in the baseline discussions (25 characters per turn). Student talk in characters per minute in CR discussions (149.5) was more than double that in the baseline discussions (62.5). These patterns were similar in both CR classrooms. The results suggest that students have more opportunities to express their thoughts in elaborated, complex, and fluid language in CR discussions, in comparison to the shorter and simpler responses in conventional reading lessons.
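The discourse indices reported here (share of speaking turns, share of characters, characters per turn, and characters per minute) are simple ratios over a coded transcript. The sketch below shows one way they could be computed. It is illustrative only: the function name, the (role, utterance) transcript format, and the decision to count every non-whitespace character are assumptions, since the article does not describe the counting rules in that detail.

```python
from collections import defaultdict

def discourse_summary(turns, minutes):
    """Summarize talk by speaker role.

    turns:   list of (role, utterance) pairs, e.g. ("teacher", "...") or ("student", "...")
    minutes: duration of the transcribed session in minutes
    """
    n_turns = defaultdict(int)
    n_chars = defaultdict(int)
    for role, utterance in turns:
        n_turns[role] += 1
        # Assumption: every non-whitespace character counts toward utterance length.
        n_chars[role] += sum(1 for ch in utterance if not ch.isspace())

    total_turns = sum(n_turns.values())
    total_chars = sum(n_chars.values())
    return {
        role: {
            "turn_share": 100 * n_turns[role] / total_turns,   # % of speaking turns
            "char_share": 100 * n_chars[role] / total_chars,   # % of characters
            "chars_per_turn": n_chars[role] / n_turns[role],   # mean turn length
            "chars_per_minute": n_chars[role] / minutes,       # talk rate
        }
        for role in n_turns
    }

# Hypothetical three-turn transcript lasting two minutes.
example = [("teacher", "abcd"), ("student", "ab cdef"), ("student", "abcdef")]
for role, stats in discourse_summary(example, minutes=2).items():
    print(role, stats)
```

Separating share of turns from share of characters matters because, as in the baseline lessons, turns can be evenly split while one party still does most of the talking.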

To illustrate the transformation in the discourse of Chinese teachers and students from conventional lessons to the peer-led open format of CR, and how CR discussions support children’s high-level comprehension, three representative excerpts from Mrs. Chang’s class are presented.

In Mrs. Chang’s baseline reading lesson, students read a translated text Nature’s Way (Blumenthal 1990). It discusses a trip of the author, seven travelers, and a guide to the Galapagos Islands in search of nests of Pacific green sea turtles. Mrs. Chang had the whole class read aloud the text, then went through the text sentence-by-sentence to ensure student understanding. The 3-min excerpt below starts from reading the fourth paragraph.

1 Teacher:

OK, sit down. The little turtle, less than 3 kg, just popped its head out and saw a fierce seabird. Can you read it aloud? Who can read aloud the scene we saw? “When the baby turtle… [read from the text]”. Qi, can you try?

2 Qi:

[read from the text] When the baby turtle was hesitating, the mockingbird suddenly came closer to the nest and began pecking at the turtle’s head with its sharp beak, trying to pull it onto the beach.

3 Teacher:

Good. Sit down, please. How urgent the situation was! Can anyone else read it? Jun? Try it.

4 Jun:

[Read the same sentence from the text again] When the baby turtle was hesitating, the mockingbird suddenly came closer to the nest and began pecking at the turtle’s head with its sharp beak, trying to pull it onto the beach.

5 Teacher:

What does “suddenly came closer” mean?

6 Students:

It means flying over suddenly.

7 Teacher:

Suddenly flying over, a mockingbird. Who can read it, read it out, come on, all have a try. Yi, try to read it.

8 Yi:

[Read the same sentence the third time from the text] When…when…when the baby turtle was hesitating, the mockingbird suddenly came closer to the nest and began pecking at the turtle’s head with its sharp beak, trying to pull it onto the beach.

9 Teacher:

Good. When we see this situation, what shall we do? Liu, can you have a try?

10 Liu:

[Read from the text] My companions and I were shocked. “Aren’t you going to do something?” a voice said.

11 Teacher:

How did he say that to the guide? Can you read it aloud, Zhang?

12 Zhang:

[Read the same sentence from the text again] My companions and I were shocked. “Aren’t you going to do something?” a voice said.

13 Teacher:

How urgent the situation was! Such a little turtle will be eaten by the violent seabird! Who can read it aloud again? Ning?

14 Ning:

[Read the same sentence from the text the third time] My companions and I were shocked. “Aren’t you going to do something?” a voice said.

15 Teacher:

If this happens right in front of you, what would you think? How would you feel? Who would like to say something? Qiang?

16 Qiang:

I will go save it.

17 Teacher:

How would you feel?

18 Qiang:

Nervous.

19 Teacher:

Can you read it aloud? Look at the screen and read it.

[Three more students repeated reading the same sentence]

During the 3-min run of the reading lesson, Mrs. Chang kept calling on individual students to read the same two sentences aloud. The first sentence was read 3 times while the second sentence was repeated 6 times. The 40-min lesson required no inferences from students and posed no open-ended problems for them to solve. In turn 15, Mrs. Chang tried to elicit extended student responses and personal connections: “If this happens right in front of you, what would you think? How would you feel?” But her questions obviously prescribed certain answers (“go save it,” “nervous”). Throughout the lesson, the teacher was clearly the only authority in the classroom and her feedback to students was simple and superficial (“good”, “okay”).

Based on our observation, Mrs. Chang’s questions in this lesson were more frequent (teacher and students shared speaking turns 1:1) than in typical reading lessons in her class. However, most questions served to nominate students to read aloud or recall text information, or required limited inference, while the rest of the students were left out of the conversation. Students were not given opportunities to formulate questions, reason about the text, or express their understanding in extended language. The discourse remained monologic rather than dialogic.

To contrast with the baseline reading lessons, we present excerpts from two CR discussions of one group in Mrs. Chang’s class. We judge that the excerpts, taken from the third discussion and the seventh discussion, are representative of the entire series of discussions. The third discussion was targeted because it was a major turning point for the students and their teacher. We selected the seventh discussion because it illustrates how skillful students had become after six discussions.

The excerpt below is taken from the first 5 min of the third CR discussion. There are seven students in this group (Qi, Cheng, Yu, Pu, Liu, Zhou, and Han). Students discussed the story A Trip to the Zoo (Reznitskaya and Clark 2001). In the story, Lily is excited about the upcoming field trip to the zoo, whereas Anna decides not to make the trip because she is worried that zoos are not good places for wild animals to live. The big question is: Are zoos good places for animals?

1 Teacher:

Today we will have another discussion. Please look at me. You did a very good job last time, but there were some problems. You guys have tried to set your own goals, right? This time I set two goals for you, and I hope you can set goals for your own group like this next time, OK? I have two goals for this discussion. First, everyone should participate in the discussion. Try to integrate the text information and your own ideas, and find supporting evidence from the story to express your opinions. Second, try to respond to others’ opinions. Let others give their opinions and try to communicate and exchange ideas with them. We will debrief later, OK? After reading this story, I have a big question: Are zoos good places for the animals? Think carefully, and if you think yes, then put your thumb up; if you think no, put your thumb down. We vote at the same time, one, two, three, ready? Go. (Zhou and Yu put their thumbs up, the others five down.) OK, now you have different opinions. Let’s talk.

2 Qi:

I do not think zoos are good places for animals.

3 Cheng:

I agree with Qi, because zoos may save some extinct animals, but the living space in zoos is limited. It is hard for animals to feel like home.

4 Yu:

I agree with Qi because the space in the zoos is limited and animals are not free.

5 Qi:

I also agree with Yu’s opinion, because it says in the story, [Read from the text] “I guess animals are kidnapped from their homes and are brought to zoos, Lily answered”. Lily answered like this because she thinks the hunters go to the animals’ habitats and disturb the animals. So I think that zoos are not good places for animals.

6 Pu:

I agree with Qi because animals lose their freedom once they are brought to the zoos.

7 Liu:

I agree with Pu. I’d like to add to that. [Read from the text] Lily thinks a moment and says, yes, those animals have already lost their habitats, for example, orangutans in Madagascar, tigers in India.

8 Qi:

I agree with Liu, too. [Read from the text] Anna also said, “These people put the animals in cages so that they can be appreciated by us.”

9 Zhou:

I do not agree with Qi because it says, because it says in the story that, zoos can save some endangered animals indeed.

10 Han:

Zhou, I do not agree with you because in the story it also says that [Read from the text] “Once animals are used to being fed and taken care of, they will lose their hunting instinct.”

11 Zhou:

Han, I do not agree with you because it says in the story that [Read from the text] “Zoos provide a safe home for animals, so that they don’t have to worry about food and being attacked by other animals, or even hiding in the dark from a gunshot.” So I think that the animals like staying in zoos.

12 Teacher:

Excuse me, “animals like zoos.” But what is the big question? “Are zoos good places for animals?”

13 Zhou:

I think it is a good place.

14 Teacher:

We should talk about the big question and not go off topic. Also I think you did very well, you were able to find the evidence from the text, but I felt you were a little too serious. Phrases like “I think someone’s opinion…”, “It says in the story…” are good. Maybe you can say something like this, “I think zoos are good places for the animals because it says in the story that ‘if there is no zoo to save the animals, they may be extinct’. So I think that zoos are good places because the rare and endangered animals can survive so that people can appreciate them”. (You should) not only read from the text, but also talk about what you think about the text, and your opinions will be much better. Have a try, OK?

Mrs. Chang started out by setting two goals for the discussion. She encouraged students to integrate text information with personal ideas when reasoning and to use supporting evidence from the story to back up their claims. She talked very little in this discussion. In fact, in the 5-min excerpt above, she took only 3 speaking turns while there were 11 student turns. As the teacher minimized her talk, students produced longer utterances and expressed increasingly complex reasoning. At Turn #12, when the discussion drifted away from the big question, Mrs. Chang refocused it by reminding students of the question.
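The turn counts reported for this excerpt can be checked with a simple tally. The sketch below is illustrative only and is not the coding procedure used in the study: the function name `turn_shares` and the list `SPEAKERS` are hypothetical, and the list simply records the speaker of each of the 14 turns in the excerpt, treating the opening turn as the teacher's.

```python
from collections import Counter

# Speaker of each turn (1-14) in the first excerpt, in order.
# Assumption: Turn 1 (the opening instructions) is the teacher's turn.
SPEAKERS = [
    "Teacher", "Qi", "Cheng", "Yu", "Qi", "Pu", "Liu",
    "Qi", "Zhou", "Han", "Zhou", "Teacher", "Zhou", "Teacher",
]


def turn_shares(speakers, teacher_label="Teacher"):
    """Return {role: (turn count, share of all turns)} for teacher and students."""
    if not speakers:
        raise ValueError("at least one turn is required")
    counts = Counter(
        "teacher" if name == teacher_label else "student" for name in speakers
    )
    total = len(speakers)
    return {
        role: (counts[role], counts[role] / total)
        for role in ("teacher", "student")
    }


if __name__ == "__main__":
    for role, (count, share) in turn_shares(SPEAKERS).items():
        print(f"{role}: {count} of {len(SPEAKERS)} turns ({share:.0%})")
```

Under these assumptions, the tally gives 3 teacher turns and 11 student turns, so students held roughly 79% of the turns in the excerpt. Measures such as mean turn length would additionally require the original Chinese transcript, which is not reproduced here.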

In this excerpt, students had assumed responsibility for turn management and exercised interpretive authority. Students frequently used textual evidence to support their claims or challenge others’ points of view (Turns #5, #7, #8, #10, and #11). Student use of evidence, however, consisted mostly of reading directly from the text and was rarely integrated with a personal understanding of the issue. At Turn #14, Mrs. Chang modeled how to integrate textual information with personal ideas to reason about text and elaborate opinions. By our standards, this was a full-fledged CR discussion. The only limitation was that students tended to overuse phrases like “agree” and “disagree” to express their thoughts and had not yet developed a variety of argument stratagems to co-construct arguments.

The second excerpt is taken from the middle 5 min of the same group’s seventh discussion, on Oliver Button Is a Sissy (dePaola 1979). In this story, a boy named Oliver Button liked to dance but didn’t like “boy things” such as ball games. He took part in a talent show and didn’t win first prize, although his dancing teacher, his father, and his mother were all proud of him. The next morning Oliver Button didn’t want to go to school. The Big Question is: Should Oliver Button continue dancing?

1 Yu:

I think Oliver should learn to dance. Just now, Han said that, if Oliver does not keep practicing basketball, football and all kinds of ball games, his skills will be worse. Han, I wanted to ask you, why do poor ball skills matter? If he is not good at ball games, but he dances well, he can be a star later and will be adored.

2 Zhou:

Yu, I agree with you. If Oliver learns to dance, he will be happy dancing on the stage, but he will be sad at school.

3 Cheng:

Qi, what do you think about Zhou’s opinions?

4 Qi:

Zhou, I agree with you. I also had a similar experience, but I was not being teased. I learned painting before. When I started the painting class, I could not paint well, and it was not possible to win the first prize. I must learn the basics, but the more you practice, the better you are. Finally, I was able to paint better than others in the class.

5 Han:

Qi, I do not agree with you because you said that painting skills can get better and better, if he learns to play ball, his skills can be better and better, too. [Students had heated side conversations].

6 Teacher:

Take it easy, take it easy. You are in opposite positions. Your high enthusiasm is great, but one person at a time, take turns and do not interrupt each other, okay?

7 Cheng:

Han, I do not agree with you. Although you said that he may learn well, he is now being teased by others. If he continues to play ball, he will not be popular. It is better for him to make good friends with those in the dance class, so that he can not only enjoy happiness from his friends, but also the happiness of his own (dance).

8 Liu:

Isn’t it OK to play with girls?

9 Han:

Liu, I do not agree with you, because you said that “isn’t it OK to play with girls?” From this picture it shows that a lot of boys are laughing at him. If a boy relies on the help of girls, he may feel shamed. If he plays with girls, other people will laugh at him even more.

10 Cheng:

But I have heard a saying, “Follow your own course, and let people talk.” At first, I did not understand this saying. Later, I figured it out, that is, people should have their own goals. No matter what obstacles you may face and how other people may treat you, you must be persistent in your faith and you will then be successful.

This was a lively CR discussion. Students were so excited that Mrs. Chang had to remind them not to interrupt each other in Turn #6. The excerpt illustrates co-construction of arguments by students. Students responded to and built upon each other’s contributions. As shown in Turn #3, an emergent leader, Cheng, asked Qi: “What do you think about Zhou’s opinions?” This high degree of uptake signifies a free-flowing discussion and is an important characteristic of productive discussions (Soter et al. 2008).

Students freely expressed conflicting perspectives and engaged critically but constructively with each other’s ideas, with justification for counterarguments (opposing arguments) and rebuttals (responses to the counterarguments and justifications of the original position), as shown in Turns #1, #5, #7, and #9. Support for their claims included textual evidence (Turn #9), personal connection (Turn #4), world knowledge (citing a proverb in Turn #10), and prediction (Turns #1, #2, #5, #7, and #9).

Teachers’ reflections during and after CR discussions

To explore the effects of CR discussions on teachers’ professional learning, the weekly reflections and post-intervention interviews were analyzed. Four clear developmental stages emerged from CR teachers’ weekly reflections: being enthusiastic but totally lost in the first CR discussion, learning how to use CR deliberately, becoming more confident participants in CR discussions after seeing positive student changes, and applying CR in their regular reading instruction practice.

In the first few CR discussions, when relinquishing control over turn-taking and topics, both CR teachers felt lost in transitioning from “being the authority to the supporting facilitator in the classroom”, according to one of the CR teachers, Mrs. Kai. When the Chinese teachers were required to give up control of the classroom, they faced a new and challenging way of teaching and learning. Nor was it an easy task for students to assume independent control of the topics and the social aspects of discussion. When students began to state their views, teachers felt that they could not keep up with them because student perspectives diverged from the traditionally “correct” answers. Teachers felt a loss of control over student talk.

After feeling lost at first, the CR teachers began to adjust their roles by deliberately studying CR philosophies and scaffolding strategies. As observed by research assistants, Mrs. Chang read the CR manual carefully and began to apply the instructional moves literally to facilitate the second round of discussions. Coached by the CR researchers, the teachers began to prompt students to use reasoned arguments to support their thinking and to foster student independence. For example, CR researchers provided feedback about when and how to challenge students at a teachable moment and to encourage counterarguments and rebuttals with prompts such as “What is your viewpoint to respond to someone’s opinions or reasons or evidences?” In her weekly reflection after the second CR discussion, Mrs. Chang wrote, “When I found that students used pronouns without clear referents, began to repeat themselves, and presented several rounds of single-sided arguments, I used several instructional moves, such as summing up, asking for clarifications, thinking out loud to facilitate the discussion.”

Coached by the CR researchers, the CR teachers realized the importance of preparing argument outlines and laying out the reasoned arguments on both sides of the issue prior to the CR discussion. As a result, they became more active listeners and were able to use appropriate instructional moves more effectively. As Mrs. Kai wrote in her reflection after the third discussion, “According to the big question, I found relevant two-sides evidence from the text, and listed in a table form in order to record the students’ statements.” “And also I prepared some points, e.g., how to challenge students when they have the same opinions.”

Students’ improvement over time in holding independent and productive discussions reinforced the teachers’ sense of competence and efficacy as facilitators. In her weekly reflection after five CR discussions, Mrs. Chang wrote, “During all four group discussions today, I smiled a lot, which made children more relaxed, and they began to smile and even laugh, too… Discussions today went a little beyond my expectation because they did NOT merely read from the text, but expressed their opinions by making personal connections and drawing information from prior knowledge. From the bottom of my heart, what I believe the biggest change in students after CR discussions is that students improve their logical thinking and reasoning, and reading ability.”

According to the post-intervention interviews, after the CR project both teachers tried to use CR in their regular reading instruction, which was evident in four respects. First, teachers were more open to students’ alternative perspectives. Below is an excerpt from Mrs. Chang’s interview after the posttests:

“After the CR training, I can understand the various perspectives of students’ wonderful thinking process.” “When I said little, took one step back, let the students themselves control and participate in discussions, they presented the process of thoughts, which I could not have imagined before participating in CR discussions.”

Second, teachers came to value the quality of thinking more than factual knowledge. Mrs. Chang said, “Now I focus on how to train students to think from alternative viewpoints or to learn knowledge in different ways rather than whether the students could remember certain knowledge. If students do not understand the text, I could play a leading role to help the student improve comprehension independently.”

Third, teachers renegotiated their role in order to promote deep reasoning and encourage free-flowing student interaction. As Mrs. Kai said, “I think that the teachers’ role is to help children clarify their thinking and cultivate children to justify their opinions with well-grounded evidence and reasons.” Mrs. Chang expressed a similar view: “CR helps me to consider students’ presentations in class from more perspectives when I prepare lessons, which means that the anticipated viewpoints would be broader.”

Fourth, teachers benefited from the open participation in CR and learned new ways to improve student communication skills: active listening and expressing ideas freely. Mrs. Chang said, “Because CR can make some silent students more active so that I could pay more attention to them. CR provides me more methods to teach students how to listen to others patiently and express their viewpoints bravely.”

To recapitulate, both teachers perceived benefits of the CR experience in their reading instruction. The CR teachers became engaged in considering alternative perspectives rather than remaining the authority in interpreting texts. In their teaching practice, they began to cultivate students’ independent thinking rather than reinforce the memorization of factual knowledge.

Discussion

This study provides mixed evidence that CR promotes the reading comprehension of Chinese children. CR students did significantly better on PIRLS constructed-response questions that required integrating and evaluating information, but no better on PIRLS multiple-choice items, and significantly worse on NAEP multiple-choice items calling for information retrieval and simple inference. It appears that CR supports children’s high-level comprehension, but not basic comprehension.

The apparent moderate positive impact of CR on the high-level comprehension of Chinese children may be explained by Kintsch’s construction-integration model (Kintsch 1988; Kintsch and Rawson 2005) and argument schema theory (Reznitskaya et al. 2008). According to the construction-integration model, text comprehension is accomplished by two cognitive processes: construction and integration. The construction process refers to identifying the meanings of words and sentences, and the integration process refers to connecting that information to long-term memory and building a situation model. According to argument schema theory, engagement in dialogic interactions promotes the development of an argument schema. Unlike conventional Chinese reading lessons, which emphasize information retrieval and limited inference, CR focuses on a major dilemma facing a story character, a consideration of reasons for different courses of action, and an appeal to the text for evidence and for interpretive context. As shown in the two excerpts from CR discussions, students began to display high-level cognitive processes as they wrestled with challenging, multifaceted issues. Students used textual evidence, sought personal connections, made predictions, gave elaborated explanations, and considered alternative perspectives. Our thesis is that, over time, students developed or further articulated an argument schema, an abstract mental representation of a well-formed argument and the understanding that explanations and justifications must be comprehensive, internally consistent, and well supported (Reznitskaya et al. 2008).

Integrating Kintsch’s construction-integration model and argument schema theory, Lin et al. (2014) proposed a two-level model of text comprehension. At the first level, a situation model is constructed by integrating text-based information and knowledge-based inferences. At the second level, readers evaluate the situation model in terms of an argument schema; conversely, a better situation model may also enhance the quality of arguments. Students who read a text as an argument evaluate the validity of the text, which in turn facilitates the construction of a better situation model. This may explain why positive effects were found on the constructed-response items that require readers to integrate and evaluate information.

But CR students did significantly worse than control students on the NAEP multiple-choice items. It is highly unusual to find that an intervention has positive effects on one measure of comprehension but negative effects on another. We think the explanation for this unexpected result is that, whereas during CR discussions the main points and specific details in stories are usually brought out spontaneously, story information is not covered as systematically as in the typical Chinese reading lesson. As the excerpt from the baseline reading lesson illustrates, recitations over stories are exceptionally thorough in Chinese classrooms. Students undoubtedly learn to pay close attention to story detail and to try to get the author’s meaning exactly. They learn to take what may be called the literalist stance, which contrasts with the critical/analytic stance students are hypothesized to take as they participate in CR. When reading a story from a critical/analytic stance, in anticipation of addressing a moral dilemma or a policy question, not every story detail is important, and the author’s exact meaning on every point may not matter. This probably explains why the control students performed significantly better than CR students on the multiple-choice items in the NAEP posttest.

A significant intervention effect was found on the constructed-response items of PIRLS, but not on the constructed-response items of NAEP. NAEP constructed-response questions like “Why did Jason think everyone would be angry with him when they found the puppy missing?” focus on limited inference and making reader/text connections, narrowing the scope of student responses. Students tended to generate short and unelaborated responses to the NAEP constructed-response questions (average number of words per item = 27). Except for PIRLS items that assessed retrieval of explicitly stated information, on which there was no difference between conditions, PIRLS constructed-response items seem to require extended thinking, and students gave longer, more elaborated responses (average number of words per item = 32). There was a significant difference in response length between the NAEP and PIRLS items, t(55) = 5.35, p < .001.

It is worth noting that the PIRLS test, on which moderate positive effects of CR were found, was developed and standardized for international comparative studies of reading and has previously been used in Chinese-speaking societies, such as Hong Kong and Taiwan (Mullis et al. 2007). Passages and items were carefully written to take into account student interest, cultural appropriateness, readability, fairness, and sensitivity to gender, racial, and ethnic backgrounds. Special attention was paid to the translation process to ensure that the original meaning and style were conveyed. A strict translation procedure was followed, including submitting a cultural adaptation form to the PIRLS reading development group (PIRLS 2006 Taiwan Report). Thus, PIRLS reading tests may have greater reliability and validity in assessing the reading of Chinese children. The inter-rater reliability reported for the Taiwan PIRLS 2006 test was .95 (.78–1.00) (PIRLS 2006 Taiwan Report). The translation of the NAEP reading test, developed for American children, may be less culturally appropriate for Chinese students and, in any event, has not been through a process of item selection and refinement with Chinese students.

Although we have tried to explain why CR had mixed effects on children’s reading comprehension, we acknowledge that the measurement of reading comprehension is a limitation of the present study, and further research is needed to clarify the impact that CR has on the reading comprehension of Chinese children.

Consistent with previous research (Chinn et al. 2001), discourse analyses of the baseline discussions and CR discussions show that CR increases student talk and decreases teacher talk. The percentage of student turns, the mean length of each student turn, and the number of characters per minute were much greater in CR discussions than in baseline discussions. Because, as Au and Mason (1981) argue, a higher rate of student talk is a proximal indicator of student learning, these results suggest that students learn from the elaborated, complex, and fluid talk in CR. Indicators of cognitive processes (use of evidence, inference, prediction, and elaborated explanation) are evident in student talk in CR. The findings can be explained by the Balance of Rights Hypothesis (Au and Mason 1981): higher levels of productive student behavior are more probable if there is a balance between the interactional rights of the teacher and children.

Based on the teachers’ reflections during and after CR discussions, it is clear that CR had a positive impact on their teaching practices. Teachers experienced four developmental stages in the adaptation process: being lost in the beginning, actively learning and practicing new strategies, gradually gaining confidence, and applying CR in subsequent teaching practice. They became more open-minded in planning reading lessons and embraced multiple interpretations of texts and different perspectives in student thinking.

Chinese teachers are traditionally regarded as the authorities who transmit knowledge, moral values, and desired behavior. Teachers are accustomed to having exclusive control over class content and processes, and to being the absolute arbiters of the correct answers to questions. Thus, it is particularly interesting to see the Chinese teachers’ developmental stages in the course of successfully implementing CR. Furthermore, Opfer and Pedder (2011) observed that there is no real elaboration of the effect of classroom discussions on professional development in the extant literature. The current study helps fill this gap by documenting the process of teachers’ professional development using both classroom discourse data and teacher reflections.

Given the limited follow-up data, it is less clear to what extent the CR experience had a lasting impact on the CR teachers’ regular teaching practices after the CR project was over. A study exploring the relationship between teachers’ beliefs about learning and teaching and their participation in continuing professional development showed that the more student oriented and subject matter oriented a teacher’s profile is, the higher his or her participation in continuing professional development (De Vries et al. 2014). It remains an open question whether the CR teachers will participate more in continuing professional development and apply innovative pedagogies in their teaching practices.

In the present study, CR was introduced with a level of school support beyond the ordinary. The school principal expected the two CR teachers to make the project an exemplary school initiative and, for instance, required each teacher to write weekly reflections during the period of CR implementation. In our experience, such strong administrative support is rare in schools in China. On the one hand, teachers were held accountable for a high-fidelity implementation of CR. On the other hand, the generalizability of the findings is compromised, and it remains to be seen whether CR is feasible in a normal Chinese school environment.

Consistent with previous CR research in China, students made a fast and smooth adaptation to the new discussion mode, exhibited high engagement, and were able to manage the discussions themselves (Dong et al. 2008, 2009). Although both teachers successfully transitioned from recitation to CR, they found the adjustment challenging, especially in the beginning. Good rapport between researchers and teachers and ongoing coaching and support are critical for teachers to successfully implement new pedagogies.

To summarize, despite the small sample size and short duration of the CR intervention, the present study provides evidence that CR supports high-level comprehension, although apparently not basic comprehension, in Chinese children. However, the evidence is mixed, as the positive impact of CR was found on the PIRLS extended-response items but not on the NAEP items. Assuming the positive findings can be replicated consistently in a larger-scale study, the tentative conclusions are twofold. First, given quality CR training and ongoing administrative support, Chinese teachers can successfully shift roles, facilitate stimulating CR discussions, and benefit from CR in their beliefs about and practice of reading instruction. Second, when provided with proper reading materials and given enough latitude for thinking and reasoning, Chinese students can have small-group discussions as productive as those observed in American classrooms. CR discussions provide opportunities for Chinese students to engage in independent and critical thinking, and for Chinese schools to move beyond the ‘orderly but lifeless classrooms’ where students routinely “recall what someone else thought, rather than articulate, examine, elaborate, or revise what they themselves thought” (Nystrand 1997, p. 3).