Introduction

From early childhood, we are exposed to narratives through a variety of informal channels, such as parental shared storybook reading, theatre, performance, television, cinema, and so on. This has made storytelling one of the privileged genres through which children are introduced to learning in formal education. Besides contributing to the development of many academic skills such as critical thinking, listening, comprehension, recall and vocabulary, storytelling is key to the development of children’s ability to communicate effectively with others (Tannen 1980). Therefore, it is particularly important to study how storytelling can be nurtured and developed from the early years of schooling.

One of the ways in which storytelling can be improved is by encouraging children to make stories together, as this is an activity in which children naturally engage (Devescovi and Baumgartner 1993). Peer collaboration has also been shown to increase motivation, and to provide an ideal platform for children to express and question each other's ideas, to propose alternatives, to request explanations and to provide these through a common language (Webb 1992). Moreover, when mediated by productive discussion, collaboration has been found to promote reflection and elaboration (Barron 2003; Chi 2009; Roschelle and Teasley 1995) in many domains. Specifically, it has been shown that children as young as five can produce better stories in dyads than individually (Hayes and Casey 2002). However, Devescovi and Baumgartner (1993) found that children only benefitted from collaboration when they engaged with each other through productive discussion while making their stories together. Therefore, it is important that children’s interactions are effectively promoted through appropriate scaffolding. This study investigates how children’s storytelling can be improved by encouraging their engagement in productive interaction during collaborative storytelling.

The challenge of telling a good story

A good story includes enough information to enable a listener to make sense of its characters and the events in which they are involved. This means including plot-driving elements such as settings, initiating events (i.e. a problem to be addressed), one or more characters’ reaction (i.e., their intention to address the problem), their attempt(s) at solving the problem, and a final (positive or negative) resolution of the problem (Rumelhart 1975). These are referred to as referential elements of a story (Stein and Glenn 1979). However, a good story is also one where these elements are expressed in such a way that a desired effect (of interest, enjoyment, appreciation, etc.) is attained in the listener (Peterson and McCabe 1997). This gives the story a ‘flavour’, and can be achieved, for example, through lexical choice, representation of characters’ internal states, repetition, climax building, formulaic expressions, and other expressive devices which ultimately make a story worth attending to. These are defined as evaluative elements (Stein and Glenn 1979).

Given that a good story is a complex product where multiple elements come into play, it is hardly surprising that the ability to tell good stories is only gradually acquired by children. Research has shown a clear pattern of development in children’s storytelling skills. Specifically, it has been found that around the age of six children begin to occasionally tell well structured stories with multiple, interlinked episodes revolving around a central initiating event and culminating in a resolution (Peterson and McCabe 1997). Similarly, although children as young as three are able to include evaluative devices such as formulaic endings to their stories, it is not until around the age of six that children start to use evaluative elements consistently (Bamberg and Damrad-Frye 1991; Ukrainetz et al. 2005).

The challenge of telling a good story together

The importance of discussion during collaboration has been stressed by many researchers. When building on each other's contributions, learners reflect on the subject at hand, and this leads to a richer and deeper understanding (Barron 2003; Chi 2009; Roschelle and Teasley 1995). Discussing ideas in this way has also been shown to increase motivation and task focus (Brown and Palincsar 1989; Salomon and Globerson 1989). Chi (2009) argues that a specific type of discussion is most conducive to good collaborative learning; she defines this as ‘interactive discussion’, in which learners articulate and elaborate ideas for each other as well as building coherently on each other's ideas.

However, in order to be able to reflect and build on each other's contributions, learners need a shared understanding of their collaborative product (Dillenbourg and Traum 2006). This can be difficult to achieve when learners are still developing their ability to articulate ideas for others and to request clarifications from others. Research has shown that although children as young as five are aware that their listener may not know everything they know, it takes years of practice for them to develop the ability to articulate messages that their audience can understand, and to detect ambiguity in others’ messages (Lloyd et al. 1995; Whitehurst and Sonnenschein 1981).

When making a story together, it is important that children articulate ideas for each other (Ananny 2002; Tartaro and Cassell 2008). This makes shared understanding possible, which in turn allows one contribution to build coherently on another in the story. One possible way to support children’s interactive discussion lies in the use of external artefacts, as it has been argued that the co-construction of shared representations facilitates collaboration because of their explicit and visible nature (Scaife and Rogers 1996). When representations such as drawings are constructed during collaboration, an external trace is created which provides a platform for the represented ideas to be discussed and elaborated upon (Dillenbourg and Traum 2006), and this may foster better collaborative outcomes (Schwartz 1995). This is further facilitated when the co-constructed representations are persistent, as they allow collaborators to review and discuss contributions (Anderson et al. 2004; Roschelle and Teasley 1995).

However, over-reliance on shared context can hinder the establishment of shared understanding (Krauss and Fussell 1991). When collaborators produce a message about something originating from a shared context, they may wrongly assume that they share the same understanding of the ideas represented in that context. This phenomenon has been called ‘consensus bias’: a speaker assumes that an ambiguous message is sufficient for a listener to comprehend its meaning, while the listener does not realise that their interpretation of the message is discordant with the one intended by the speaker.

This is likely to be especially true when the shared context has been constructed together, as collaborators might assume that the co-construction was based on a shared understanding of the represented ideas. Moreover, providing opportunities for co-construction does not necessarily mean that learners will automatically exploit them by engaging in productive collaboration (O’Connor et al. 2005; Suthers 2006). Indeed, recent research has shown that learners working together do not tend to engage in interactive discussion autonomously, and ultimately do not benefit from the co-construction (De Westelinck et al. 2005; Munneke et al. 2003; Prangsma et al. 2008).

Therefore, although co-constructing representations can provide an anchoring point for collaborators to articulate and elaborate their ideas together, additional support might be needed in order to promote collaborators’ engagement in interactive discussion. This is particularly true with children, as their skills as communicators are still developing.

Approaches to scaffolding children’s interactive discussion

Given the importance of engagement in productive verbal interaction for the achievement of a coherent and elaborate collaborative product, exploring how to facilitate interactive discussion has been a focus of existing research on collaborative learning.

Webb (1992) argued for the importance of providing a social environment where contributions are encouraged and critically discussed. She suggests that peer groups might provide the best conditions for this type of productive, interactive discussion to occur, as discussions are less likely to be dominated by one more knowledgeable or authoritative individual, and because peers tend to share a common language and explain ideas to each other in a way that others can relate to (Soller 2001; Webb 1992).

However, simply placing children in peer groups and asking them to collaborate will not necessarily lead to interactive discussion. One productive approach is that of collaborative scripting, where the goal is to foster collaborative learning by shaping the way in which learners interact with one another (O’Donnell and Dansereau 1992). Collaborative scripts typically specify sequences of activities, often involving roles for different individuals to play (Kobbe et al. 2007).

There are many forms of scripting, but the one implemented in this study is a form of reciprocal scripting, where learners alternate between playing different roles, supported by a set of prompts to help them in those roles (Dillenbourg and Jermann 2006). Reciprocal scripts aim to facilitate learners’ engagement in discussion and reflection. Well-established reciprocal scripts include the Reciprocal Teaching (Brown and Palincsar 1989) and Paired Reading (Yarrow and Topping 2001) methods. Both encourage learners to prompt each other to explain their understanding of presented or co-constructed material through questioning, clarifying, discussing and summarising.

The approach taken in this study draws from a similar method, the Guided Reciprocal Peer Questioning (GRPQ) script, which uses question prompts to elicit articulation and elaboration. In the GRPQ script, question prompts are provided for pairs of students to use while they alternate between playing the role of the ‘questioner’ and that of the ‘explainer’ in learning about presented learning material (King 1999). Typically, two types of question prompts are provided: ‘Review’ questions are designed to encourage learners to restate the content of the presented material (through definitions, descriptions, explanations, etc.), while ‘Thinking’ questions are designed to encourage children to go beyond the material as explicitly presented to make connections and inferences. The latter were found to benefit learning about the presented material more than Review questions (King 1999).

Whilst the Reciprocal Teaching and Paired Reading methods have been criticised for consisting of a highly specified set of steps through which instruction takes place (Salomon and Globerson 1989; Dillenbourg 2002), the GRPQ script allows learners more freedom to formulate their own questions based on the question stems provided, thus leaving a broader space for independent and generative thinking. Moreover, whilst other methods were designed to support expert–novice interaction or relied heavily on teachers’ modelling of the method, the GRPQ script is designed to support peer learning with minimal modelling from a teacher or instructor (King 1999).

The effectiveness of this method has been demonstrated in numerous studies with learners from fourth grade through to higher education, showing that when they used the question prompts, students provided more explanations and justifications for their reasoning, and ultimately gained a better shared understanding of the learning material (King 1999; King and Rosenshine 1993). In the light of these findings, the GRPQ script was employed in this study to investigate its potential to encourage children’s engagement in discussion about their collaborative stories.

Although some examples exist in the literature where the benefits of scripts are evaluated through a qualitative approach (Pozzi 2011), and some through a mixed approach (Rummel et al. 2009), the majority of studies on scripts take a primarily quantitative approach (Dillenbourg and Traum 2006; Weinberger et al. 2005). More specifically, the GRPQ method has historically been evaluated through experimental, quantitative methods (King 1993, 1999). For these reasons, an experimental, hypothesis-driven, quantitative approach was taken in this study, where the effects of two tasks (unprompted and prompted story-making) were systematically compared. Specifically, a within-subjects design was employed, which provided enough power for statistical analysis despite the limited sample size. Finally, despite recognising the value of qualitative approaches to the analysis of stories (Ananny and Cassell 2001; Robertson et al. 1998), the stories in this study were assessed using quantitative coding schemes, as this was deemed most appropriate to the experimental approach and matched the type of analysis carried out in previous GRPQ studies.

Study hypotheses

This study examined the potential benefits of encouraging children to articulate their story ideas for each other. It was predicted that encouraging children to ask each other questions about their collaborative story would lead to greater interactive discussion during story-making. This, in turn, would promote better storytelling.

  • Hypothesis 1 predicted that in the prompts condition children would engage in more interactive discussion and as a result would tell better stories (i.e., ones that were longer, referentially more complex, evaluatively richer and more coherent) than in the no prompts condition.

  • Hypothesis 2 predicted that the children would continue to engage in interactive discussion once this support was withdrawn. Accordingly, when children made stories without prompts, those who had been given the prompts script first would tell significantly better stories than the children given the prompts script second.

Method

Design

The study employed a within-subjects design, with each pair of children telling two stories (Monkey and Frog). The order in which the two conditions (prompts and no prompts) were administered and the order in which the two stories were presented were both counterbalanced. Eight pairs were given the prompts script first (four Monkey first) and ten pairs were given the prompts script second (five Monkey first).

Task

The task involved pairs of children making a story together using a drawing storytelling application called KidPad (Benford et al. 2000) and then telling their story to two schoolmates. The children were presented with a picture story on the computer, and asked to construct simple representations over the presented pictures. Providing a picture story for the children to base their stories on not only offered them a source of inspiration, but also presented the important methodological benefit of ensuring that the children’s stories were more easily comparable, because they recounted the same core set of events (Bamberg and Damrad-Frye 1991).

Materials

The two stories selected for this study were chosen for their appropriateness to the age group and their appeal, and were matched as closely as possible in structure. Thus, ten pictures from the book Frog, Where Are You? (Mayer 1969) and ten from the book Monkey Puzzle (Donaldson and Scheffler 2000) were uploaded into KidPad to create a story sequence, with both sequences depicting the story of a protagonist who has lost someone or something and engages in a number of attempts to find them. For the practice task, pictures from the Tiny Planet website were used.

For the prompts script, an easel was set up showing the question prompts (Table 1). Some of the words were in red (shown in italics in this text) in order to draw the children’s attention to the important words in each question, i.e. the setting and the characters’ internal and external states. A “Why?” question stem was also provided in a separate column, to show that it could be asked as a follow-up to any of the questions in the left-hand column.

Table 1 The question prompts

The question prompts were designed to encourage children to discuss key aspects of their story. For example, encouraging questioning about the story characters (e.g., their physical appearance and goals) and the place of the story was aimed at improving referential complexity in children’s collaborative stories. The questions about characters’ affective and epistemic states and the “Why?” question stem were expected to encourage discussion about the character’s internal states, and causality, with the aim of promoting evaluative richness in the children’s collaborative storytelling.

Some of the questions provided were aimed at encouraging children to articulate the content of the presented pictures (review questions), while others were aimed at encouraging children to go beyond the presented pictures by making elaborations and inferences (thinking questions). Given the established benefits of using thinking questions to build on review questions, more thinking questions were used overall (King 1999).

Finally, some story aspects, such as characters’ actions and behaviours, were not included in the set of question prompts. This was motivated by the desire to not overwhelm the children with too many questions, but also to leave them free to construct their own questions. Too much task structuring has been found to constrain learners’ ability to elaborate and create new knowledge through productive discussion (Salomon and Globerson 1989).

Participants and grouping

Forty-six children aged 6 to 7 years were recruited from two Year Two classes in a local primary school. Ten children were randomly allocated to the ‘audience’ role. The remaining 18 boys and 18 girls were storytellers (age range = 6;0–7;5, mean age = 6;9). These children were paired according to their personal preferences and attitudes towards working together (gathered through informal conversations with their teacher) as well as their similar verbal abilities (measured by the Vocabulary and Similarities sections of the Wechsler Abbreviated Scale of Intelligence). Gender was not a factor in allocation.

Procedure

The study was carried out in a quiet room in the school, where a laptop running KidPad with the picture stories was set up, together with the question prompts (in the prompts condition only), and two camcorders recording the children’s interactions with each other and on the computer.

Before the story-making sessions, the experimenter spent around 30 min illustrating the KidPad features, demonstrating how to use them during story-making, and ensuring that each child had the opportunity to practise with the application. The pairs of children were instructed to take turns at working on a story picture each, but the specific instructions differed according to whether the children were in the no prompts or the prompts condition. In the prompts condition, the children were told that once one child had finished their drawing, the other child would ask at least one question from the set of questions provided, and that the child who had made the drawing would have to try and answer those questions as well as they could before the pair could switch roles. In the no prompts condition, children were simply instructed to take turns at drawing on one story picture each, and to prepare to tell their story to a peer audience.

Finally, at the end of each story-making session, the children were asked to tell their story to two of their school mates from the ten children who had been selected to act as ‘audience’.

Measures: Story making

As this study focused on how the children’s story-making discussion could be influenced by the prompting intervention, and on the potential benefits of encouraging interactive discussion on the children’s collaborative storytelling, both the story-making process and the storytelling outcome were analysed.

The story related questions were identified in the story-making transcripts and coded by type. On those (quite rare) occasions where a turn included more than one question, each question was coded separately on the basis of mutually exclusive categories.

Two types of question codes were used. First, each question was coded as a review or a thinking question; Table 1 shows how the different questions were coded. Moreover, all “Why” questions were coded as thinking questions, as they encouraged children to elaborate on their story ideas by providing motivations for them.

Second, the questions were coded as given or invented. Given questions were those that reproduced the provided prompts more or less verbatim (see Materials). To see whether children would spontaneously ask the same questions as those provided in the prompts condition, questions in the no prompts condition were also coded as given when they corresponded to ones in the prompts script. Invented questions were those the children produced themselves, which did not reproduce the prompts, such as “What is [the character] doing?”. Questions that started with the provided question stem “Why?” were also coded as invented, as the children were free to fill the rest of the question with any content they liked.

The children’s answers to the questions asked by their partner were coded according to whether they provided a review or a thinking answer. When a question did not receive an answer, the turn following the question was coded as no answer.

Figure 1 illustrates an example of a drawing in a prompted story-making session, and it is followed by a transcript of the children’s discussion related to that drawing (Table 2), with an indication of the speaker, what they say, and the coding of their questions and answers.

Fig. 1

A screenshot of a drawing made in KidPad during a prompted story-making session

Table 2 Transcript from a prompted story-making session (Valerie and Jim)

Figure 2 illustrates an example of a drawing in an unprompted story-making session, and it is followed by a transcript of the children’s discussion related to that drawing (Table 3), with an indication of the speaker, what they say, and the coding of their questions and answers.

Fig. 2

A screenshot of a drawing made in KidPad during an unprompted story-making session

Table 3 Transcript from an unprompted story-making session (Tim and Elaine)

Measures: Story telling

The children’s collaborative storytelling sessions were transcribed and rated according to the length of the stories, their referential complexity, evaluative richness, and coherence. Appendix I illustrates how a story from the sample was coded for referential complexity, evaluative richness and coherence.

Referential complexity

This measure was aimed at capturing the extent to which the plot-driving information contained in the pictures was included in the children’s stories. Based on a widely acknowledged and established approach (Bamberg and Damrad-Frye 1991; Stein and Glenn 1979), the coding scheme for referential complexity included the categories in Table 4.

Table 4 The referential complexity categories

A scoring system was developed which assigned a point for each of these elements. A total of 12 points could be obtained for the Monkey story and 13 for the Frog story, so for the purposes of comparison the children’s scores were normalised. All stories were coded and the codes were tested for inter-rater reliability. A second coder (blind to condition) coded six of these stories, and inter-rater agreement was deemed acceptable for both the Frog (Kappa = .85, p < .001) and the Monkey stories (Kappa = .89, p < .001).
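As a rough illustration (the function name is my own, not from the paper), normalising raw scores against the different maxima of the two stories might look like:

```python
def normalise(raw_score, max_points):
    """Convert a raw referential-complexity score to a percentage
    of the maximum obtainable for that story, so that Monkey
    (max 12) and Frog (max 13) scores are directly comparable."""
    return 100 * raw_score / max_points

# A raw score of 6 on the Monkey story and 6.5 on the Frog story
# both normalise to 50.0.
```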

Evaluative richness

The stories were segmented into units of analysis to ensure that each story part was assigned one category only, and that every part of the stories was coded. The segmentation unit was the proposition, i.e., “the smallest unit of meaning that can be put in predicate-argument form (with a verb operating on a noun)” (Harley 2008, p. 379). This choice reflects the practice reported in studies on children’s storytelling abilities (Bamberg and Damrad-Frye 1991; Peterson and McCabe 1983). The proposition was deemed to be of a fine enough granularity to capture story richness, unlike coarser linguistic segmentations such as T-units (Hunt 1970), which would often include more than one category under the following coding scheme.

To code for evaluative richness, the schemes proposed in the existing literature (Bamberg and Damrad-Frye 1991; Peterson and McCabe 1983; Ukrainetz et al. 2005) were combined into a comprehensive coding scheme (Table 5).

Table 5 The evaluative richness categories and examples

All stories were coded according to the above scheme. A second coder (blind to condition) coded six of these stories, and inter-rater agreement was acceptable for both the Frog stories (Kappa = .93, p < .001) and the Monkey stories (Kappa = .90, p < .001).
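Cohen's kappa, used here for inter-rater reliability, corrects the observed agreement between two coders for the agreement expected by chance. A minimal, self-contained sketch of the calculation (a generic implementation for illustration, not the software used in the study):

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # Observed agreement: proportion of items coded identically.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: chance overlap of the two coders' marginals.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)
```

Perfect agreement yields 1.0, while agreement no better than chance yields 0.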

Coherence

This measure was designed in order to capture the extent to which the children built on each other's contributions in their storytelling. As this was a measure of collaboration, consecutive turns were used as a unit of analysis. Each turn was considered with respect to whether it contained an idea expressed in the previous turn, for example by repeating it or extending it (Tager-Flusberg and Anderson 1991; Tartaro and Cassell 2008).

A turn was considered to repeat the previous turn’s idea if the content was the same, apart from minor differences such as the use of synonyms, as in the following example:

Child A: “I have lost my mummy!”, said the monkey.

Child B: “I have lost my mum!”, said the monkey.

A turn was considered to be an extension of the previous turn’s idea if it added details to the previous idea, whilst not radically changing it, as in the following examples:

  1.

    Child A: The sun was shining.

    Child B: The sun peaked over the clouds.

  2.

    Child A: And the monkey said “That ain’t my mum: that’s my dad!”

    Child B: “Even better”, said the monkey.

Once each turn was coded according to whether it built on the previous turn (either by repeating or extending it) or not, the total number of coherent turns was computed and normalised against the total number of turns in the story. All stories were coded using this scheme. A second coder (blind to condition) coded six of these stories, and inter-rater agreement was acceptable for the Frog stories (Kappa = .80, p < .001) and the Monkey stories (Kappa = .73, p < .001).
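Under this scheme the coherence score reduces to a simple proportion of coherent turns. A minimal sketch (the function name and code labels are illustrative assumptions, not taken from the study):

```python
def coherence_score(turn_codes):
    """Proportion of turns coded as building on the previous turn.

    `turn_codes` is one label per storytelling turn, e.g. 'repeat'
    or 'extend' for coherent turns and 'new' for turns that do not
    build on the preceding one.
    """
    coherent = sum(code in ("repeat", "extend") for code in turn_codes)
    return coherent / len(turn_codes)
```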

Results

For all the statistical analyses reported below, parametric tests were used when the data met the requirements of normality, homogeneity of variance and covariance; when the data failed to meet these requirements, non-parametric tests were used instead. This is particularly important in the light of the relatively small sample employed in this study. Before the two hypotheses of the study could be tested, a manipulation check was performed to verify that the children used the question prompts when asked to, and that these were followed by relevant answers, i.e., that they engaged in interactive discussion during story-making.

Story-making

The general prediction about story-making was that when the children were given the prompts script, they would engage in more and better interactive discussion than without it. If given first, these benefits would be maintained during the subsequent no prompts story. Specifically, these predictions were expected to be true for the extent to which children asked each other questions and gave each other answers.

Wilcoxon signed rank tests showed that the script increased the number of questions asked (prompts script first group: Z = 2.31, p = .02, r = .55; prompts script second group: Z = 2.80, p = .005, r = .66), thus supporting Hypothesis 1. Hypothesis 2 was explored with two Mann–Whitney tests, which revealed a significant difference in the number of questions asked during the no prompts condition (U = 1, p < .001, r = .81): children in the prompts script first group asked significantly more questions than children in the prompts script second group. No significant difference was found in the number of questions asked during the prompts task (U = 39.5, p = .97, r = .01). Data are shown in Table 6. These results show that the benefits of the prompts on interactive discussion were maintained once this type of scaffolding was withdrawn.
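Effect sizes for such non-parametric tests are commonly derived from the standardised test statistic as r = Z/√N, where N is the number of observations; as a hedged illustration (a generic convention, not the authors' code):

```python
import math

def r_from_z(z, n_observations):
    """Effect size r for a Wilcoxon or Mann-Whitney test,
    using the common convention r = Z / sqrt(N)."""
    return z / math.sqrt(n_observations)

# e.g. a Z of 2.0 over 16 observations gives r = 0.5,
# conventionally read as a large effect.
```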

Table 6 The total number of questions asked by order and prompting

A 2 × 2 mixed ANOVA explored whether children asked a significantly greater percentage of thinking questions (as a proportion of the total number of questions) when given the prompts. No significant main effect of order was found (F(1, 16) = 0.48, MSE = 623.03, p = .50, ηp² = .03). A significant main effect of prompting was found (F(1, 16) = 33.06, MSE = 163.12, p < .001, ηp² = .67), with significantly more thinking questions asked with the script than without it. This supports the prediction that the prompts script would benefit interactive discussion through the use of thinking questions. However, no significant interaction between prompting and order was found (F(1, 16) = 2.85, MSE = 163.12, p = .11, ηp² = .15), so this benefit was not maintained once the script was withdrawn. Data are shown in Fig. 3.

Fig. 3

Percentage of thinking questions asked by order and prompting

Analysis of the percentage of given questions (as a proportion of the total number of questions) (Fig. 4) showed no significant main effect of order (F(1, 16) = 0.66, MSE = 871.51, p = .43, ηp² = .04). Although a significant main effect of prompting was found (F(1, 16) = 9.11, MSE = 382.91, p = .01, ηp² = .36), it was in the opposite direction to the one predicted: significantly more given questions were asked without the prompts script than with it. There was no significant interaction between prompting and order (F(1, 16) = 1.01, MSE = 382.91, p = .33, ηp² = .06).

Fig. 4

Percentage of given questions asked by order and prompting

Storytelling

The general prediction about storytelling was that when the children were given the prompts script, they would tell better collaborative stories (Hypothesis 1), and that these benefits would be maintained during the subsequent no prompts story (Hypothesis 2). Specifically, these predictions were expected to be true for the stories’ length, referential complexity, evaluative richness and coherence.

Number of words

A 2 × 2 mixed ANOVA showed no significant main effect of order (F(1, 16) = 1.36, MSE = 33498.81, p = .26, ηp² = .08) or prompting (F(1, 16) = 1.25, MSE = 12262.30, p = .28, ηp² = .07). However, a significant interaction between prompting and order was found (F(1, 16) = 7.53, MSE = 12262.30, p = .01, ηp² = .32). Post-hoc comparisons using the Bonferroni adjustment for multiple comparisons showed a significant difference between the two groups during the no prompts story: the children who were given the prompts script first told significantly longer stories than those given the prompts script second (mean difference = 173.58, p = .009). When the children were given the prompts task, no significant difference was found between the two groups (mean difference = 30.2, p = .72). Furthermore, children in the prompts script second group told significantly longer stories during the prompts story than the no prompts story (mean difference = 143.4, p = .01). No significant difference was found between the two stories in the prompts script first group, suggesting that the children’s performance was maintained (mean difference = 60.38, p = .29). Data are shown in Fig. 5. These results support the prediction that prompting would lead to longer collaborative stories (Hypothesis 1) and that the benefits would be maintained once this type of scaffolding was withdrawn (Hypothesis 2). Finally, the relationship between the total number of questions asked and the story length was explored. A Pearson correlation test showed a significant positive correlation between the total number of questions asked and the number of story words (r = .63, p = .01).
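Pearson's r, used here to relate the number of questions to story length, can be computed directly from the paired scores; a generic, self-contained sketch for illustration (not the authors' analysis code):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Covariance term over the deviations from each mean...
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    # ...scaled by the product of the two standard-deviation terms.
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A perfectly linear increasing relationship gives r = 1, a decreasing one r = -1.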

Fig. 5 Total number of story words by order and prompting

Referential complexity

A 2 × 2 mixed ANOVA showed no significant main effect of order (F(1, 16) = 0.004, MSE = 246.93, p = .95, ηp² = .001) or prompting (F(1, 16) = 0.04, MSE = 119.79, p = .85, ηp² = .002), and no significant interaction between prompting and order (F(1, 16) = 2.42, MSE = 119.79, p = .14, ηp² = .13) (Fig. 6). There was no significant correlation between the total number of questions asked and story referential complexity (r = .25, p = .34).

Fig. 6 Mean referential complexity score by order and prompting

Evaluative richness

A 2 × 2 mixed ANOVA showed no significant main effect of order (F(1, 16) = 0.46, MSE = 305.64, p = .51, ηp² = .03) or prompting (F(1, 16) = 6.60, MSE = 179.87, p = .10, ηp² = .16). However, there was a significant interaction between prompting and order (F(1, 16) = 7.34, MSE = 179.87, p = .02, ηp² = .31). Bonferroni post-hoc comparisons showed significant differences between the two groups during the no prompts story: the children who were given the prompts script first scored significantly higher than those who had the prompts script second (mean difference = 16.15, p = .03). When the children were given the prompts script, no significant difference was found between the groups (mean difference = 8.23). Further, children who were given the prompts script second scored significantly higher for the prompts story than for the no prompts story (mean difference = 20, p = .004). No significant difference was found between the stories of the children who were given the script first (mean difference = 4.38). Therefore, both Hypotheses 1 and 2 were supported (Fig. 7). Finally, there was a significant positive correlation between the total number of questions asked and story evaluative richness (r = .57, p = .02).

Fig. 7 Mean evaluative richness score by order and prompting

Story coherence

A 2 × 2 mixed ANOVA showed no significant main effect of order (F(1, 16) = 0.03, MSE = 212.39, p = .87, ηp² = .002). However, a significant main effect of prompting was found (F(1, 16) = 4.68, MSE = 169.99, p = .05, ηp² = .23): the stories told with the prompts script were significantly more coherent than those told without it. No significant interaction was found between prompting and order (F(1, 16) = 0.41, MSE = 169.99, p = .53, ηp² = .03). The data are shown in Fig. 8. These results support the prediction that prompting would lead to more coherent collaborative stories (Hypothesis 1); however, the prediction that the benefits would be maintained once this type of scaffolding was withdrawn (Hypothesis 2) was not supported. Finally, a Pearson correlation test showed no significant correlation between the total number of questions asked and story coherence (r = .24, p = .36).

Fig. 8 Mean story coherence score by order and prompting
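The question–story correlations reported in this section are Pearson product-moment correlations between each dyad's total number of questions and its story score. A self-contained sketch of the computation, using made-up toy data rather than the study's:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical dyad data: total questions asked vs. story length in words
questions = [4, 9, 12, 20, 35]
words = [120, 150, 300, 380, 650]
print(round(pearson_r(questions, words), 2))  # 0.99
```

A value near +1 indicates that dyads asking more questions also told longer stories, the pattern the study reports (r = .63) for story length.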

Discussion

The first question asked in this study was whether encouraging children to engage in interactive discussion while making stories together would lead to better collaborative storytelling (Hypothesis 1). It was hoped that requiring children to ask each other questions from a set of provided question prompts would lead to more interactive discussion. This, in turn, was predicted to lead children to reflect more on their own and each other's ideas, enabling them to produce better stories, i.e., stories that were longer, referentially more complex and evaluatively richer. Engagement in interactive discussion was also predicted to facilitate a shared understanding of each other's story ideas, enabling children to build more coherently on each other's contributions during storytelling.

In order to achieve these benefits, the children needed to use the question prompts. Initial analysis showed that when the children were given the reciprocal prompting script, they asked significantly more questions than when they were not so supported: the median number of questions asked with the script was 35, compared with a median of 9 without it. As there were ten pictures, the children were required to ask each other at least ten questions during the question prompting task. However, it was clear that they did not limit themselves to the minimum of one question per turn, suggesting that they engaged with the task. This might have been due to the script's design, which gave the children the freedom to choose which questions to ask and even to invent their own. Given that some scripts have been criticised for failing to support learners' elaboration and reflection because of their rigidity (Salomon and Globerson 1989; Dillenbourg 2002), the GRPQ script provides a good compromise between structure and flexibility.

Moreover, to establish whether the requirement to use the question prompts constrained the children's own ways of interacting with each other through questions, the proportion of asked questions that were drawn from the provided prompts was compared across conditions. The results showed that the children did not limit themselves to asking the provided questions, but also invented a large number of their own. Interestingly, proportionally fewer of the provided questions were asked with the prompts script than without it. This suggests that providing a set of question prompts and requiring children to ask each other questions encouraged them to invent their own questions more than they would have in the absence of this support.

The provided question prompts included a greater proportion of thinking questions, as these have been shown to promote greater levels of reflection and elaboration than review questions (King 1999). Therefore, it was hoped that proportionally more thinking questions would be asked under conditions of reciprocal prompting. Indeed this was found to be true, suggesting that the intervention successfully promoted elaboration of the presented pictures through interactive discussion.

Finally, the answers given by the children to these questions were analysed. It was found that most questions were answered (87% during the no prompts story and 89% during the prompts story) and that the number and type of answers provided followed the same pattern that was found for the number and types of questions asked. Having established that the children did engage in interactive discussion during story making, Hypothesis 1 about the value of encouraging interactive discussion for collaborative storytelling could then be tested.

Question prompting did promote the production of longer stories. Contrary to the prediction, however, this was not because of an increase in the referential complexity of the stories, as the children told stories that were equally complex in both conditions. It can be argued that children at this age are simply not able to tell stories with more referential elements than this (scoring as they did over 60% in both conditions). For example, they may have found it hard to account for both main characters (i.e., the monkey and the boy) and the helpers (i.e., the butterfly and the dog), and focused on just one instead. However, it can also be argued that this type of scaffold was not as effective at promoting referential complexity as it was at promoting evaluative richness. The children were provided with more thinking than review questions, and they did ask more thinking than review questions. As thinking questions were mainly about evaluative aspects of the story (i.e., aspects which went beyond the plot-driving events illustrated in the pictures), a greater part of the story-making discussion would presumably have been dedicated to evaluative aspects of the story than to referential ones. This suggests that a stronger emphasis on the plot-driving events in the question prompts might have produced referentially more complex stories.

However, the stories did show increased evaluative richness after question prompting. For example, with the prompts the children no longer limited themselves to listing all the episodes in which the monkey and the butterfly fail to find the monkey's mum, as they did in the following extract from the no prompts task:

G: The butterfly said 'Is this your mum?', but the monkey said 'No, this is an elephant'.

M: The butterfly said 'Is this your mum?'. 'This is it. No, that is a snake'

G: The butterfly said 'Is this your mum?', but the monkey said 'No, that's a spider'.

M: 'Is this your mum?'. 'No, that's a parrot'.

G: The butterfly said 'Is this your mum?', but the monkey said 'No, that's a bat'.

Gina and Martin

Instead, with the prompts script, the children included good story introductions (example 1), rich descriptions of story settings (example 2), characters' physical appearance (example 3), and characters' emotional states and motivations (example 4).

(1)

M: Once upon a time there was a monkey. He couldn't find his mum.

J: He went to see his friend, the butterfly. And he asked her if she could help.

Matthew and Jenna

(2)

J: It was sunny and the clouds came out. The flowers were growing.

Matthew and Jenna

(3)

E: It's a big animal that lives in the jungle. And it is really big and it eats lots and lots of grass.

Emily and Thomas

(4)

T: The butterfly was feeling very worried, now, because the snake might eat the monkey.

E: Just then, they climbed up a big tree.

Emily and Thomas

This increase in evaluative richness is likely explained by the children asking a proportionally greater number of thinking questions, which were specifically aimed at encouraging them to elaborate on what was illustrated in the pictures provided. This is further supported by the significant positive correlation found between the number of questions asked during story making and the collaborative stories' evaluative richness: the more questions were asked, the more evaluatively rich the children's collaborative stories were.

Finally, the children built on each other's contributions more coherently than they did when unsupported by question prompts. This suggests that encouraging children to articulate and discuss each other’s ideas might have facilitated shared understanding. This, in turn, made it possible for children to build coherently on each other's ideas. However, the correlation between number of questions asked and stories’ coherence was not significant. This was unexpected, as the prompting script was shown to benefit stories’ coherence.

The other question addressed in this study was whether it would be possible to withdraw the questioning support whilst still maintaining the potential benefits of the prompting for interactive discussion. It was hoped that the benefits of the script would be maintained once the scaffolding was no longer present (Hypothesis 2).

First, it was found that the children continued to engage in interactive discussion during story making through questions and answers even after the reciprocal prompting support was withdrawn. This suggests that the children had internalised the reciprocal questioning script. After experiencing the reciprocal questioning, the children continued to tell stories that were both longer and evaluatively richer than those produced by children who had not yet been exposed to the scaffold. Together, these findings suggest that continuing to engage in interactive discussion during story making promoted the production of richer evaluative elements, which in turn made the stories longer.

However, these benefits were not maintained for referential complexity or coherence. As referential complexity was not enhanced by scripting, there was nothing to maintain. The coherence result, however, was disappointing, as the children had told significantly more coherent stories while exposed to the reciprocal questioning scaffold. It is possible that a longer exposure to the question prompts would have produced a sustained effect on coherence once the scaffold was withdrawn. It is also possible that, although the reciprocal questioning helped to establish a shared understanding, the increased complexity arising from the articulation of several ideas made it harder for the children to maintain coherence.

These results suggest that encouraging children to articulate each other's story ideas through question prompts might not be sufficient to achieve story coherence. Although the children articulated their story ideas for each other, and therefore achieved a better shared understanding of their collaborative stories, they might still have disagreed with each other's ideas. This could have led to a lack of coherence in spite of the increased amount of discussion. Therefore, a specific set of prompts, or social scripts (Kobbe et al. 2007), might need to be designed to directly encourage children's engagement with each other's ideas beyond the simple request for articulation, including the critiquing and negotiation of ideas, which could benefit coherence in their collaborative storytelling.

Conclusions

This study examined whether using reciprocal questioning to encourage children to engage in interactive discussion during story making would benefit their collaborative storytelling, and whether these benefits could be maintained once the scaffolding was no longer present.

The results showed that the GRPQ script was a successful way to scaffold children’s collaborative storytelling: while they were making their stories, the children engaged in interactive discussion through the question prompts, and this benefited the quality of their collaborative storytelling on many measures. Moreover, these benefits were mostly maintained once the prompt support was withdrawn.

Given the developmental literature showing that six- to seven-year-old children are only beginning to tell stories which can be understood and appreciated by a naive audience, and do not always do so consistently (Bamberg and Damrad-Frye 1991; Peterson and McCabe 1983), these findings are encouraging, as they suggest that promoting children's engagement in interactive discussion during story making can benefit their collaborative storytelling.

The findings on the effectiveness of this scripting approach add to the evidence on the value of the GRPQ script (King 1999; King and Rosenshine 1993) by showing that even young children can benefit from this scripting method. Moreover, this study provides evidence that these benefits can be maintained even after the scripting prompts are withdrawn. It also showed that specific questions can be devised to tailor the GRPQ script to the storytelling domain, that children are able to use these productively, and that this applies to production as well as comprehension. It is important to note that the flexibility of this type of scripting method increased the ecological validity of the task by providing space for the children to engage with it in a meaningful way. The GRPQ script was shown to provide a balanced form of scaffolding, in which learners are allowed enough flexibility to choose what questions to ask and when. This is all the more important in an open-ended task such as storytelling, where there are no right or wrong answers and the elaboration of ideas is paramount to the quality of the collaborative outcome (Salomon and Globerson 1989; Cohen 1994).

Finally, the findings on the children's engagement in interactive discussion during story making are promising with respect to the broader literature on collaboration, which suggests that even older learners can find it difficult to engage in interactive discussion when not explicitly prompted to do so (Barron 2003; Webb 1992), even when supported through co-constructive activities (de Westelinck et al. 2005; Munneke et al. 2003; Prangsma et al. 2008).

Future work includes exploring whether a more sustained exposure to the GRPQ script can benefit the coherence of children's collaborative storytelling once this scaffold is withdrawn. A longitudinal approach could also be taken, testing whether the benefits of this scripting approach are maintained in the long term once the support is withdrawn. Larger sample sizes would also benefit statistical analyses such as the one taken in this study. A desirable extension to this work would be to explore how the question prompts could be extended from a micro- to a macro-scripting context (Dillenbourg and Jermann 2006), where prompting is orchestrated by a teacher in a whole-classroom environment. The increased complexity of this context might open up opportunities for reflection on the role of the teacher in flexibly managing the work of several pairs of children making stories together, and on how this might be facilitated by technology. For example, following the tradition of research on computer-supported scripting (Baker and Lund 1997; Robertson et al. 1998; Soller 2001), the question prompts might be integrated into a computer environment offering teachers an additional tool to help them orchestrate classroom learning. As technology gradually evolves to support tools sophisticated enough to afford flexible scripting (Yu 2009), this appears to be a productive opportunity to further investigate the value of flexible scripting in real-world pedagogical contexts.