1 Introduction

Argumentation is one of the central topics in science education, and interest in it has increased strongly in recent years (Footnote 1). Its importance results, on the one hand, from the central role of argumentation at the heart of scientific inquiry (Driver et al. 2000; von Aufschnaiter et al. 2008). On the other hand, abilities and knowledge in the area of argumentation are connected with more general education goals: "Thus, a pedagogical emphasis on argumentation is consistent with general education goals that seek to equip students with capacities for reasoning about problems and issues, be they practical, pragmatic, moral, and/or theoretical" (Jiménez-Aleixandre et al. 2000, p. 757). Promoting argumentative abilities is therefore not only a prerequisite of doing science but also indispensable for critical thinking (Kuhn et al. 2005) and for the capability of students to engage appropriately in complex decision making (Osborne et al. 2004), e.g. in the context of socioscientific issues (Footnote 2).

Jiménez-Aleixandre and Erduran (2008) summarize the potential of argumentation for science education in the following aspects:

  1. “Supporting the access to the cognitive and metacognitive processes characterising expert performance and enabling modelling for students […].

  2. Supporting the development of communicative competences and particularly critical thinking […].

  3. Supporting the achievement of scientific literacy and empowering of students to talk and to write the languages of science […].

  4. Supporting the enculturation into the practices of the scientific culture and the development of epistemic criteria for knowledge evaluation […].

  5. Supporting the development of reasoning, particularly the choice of theories or positions based on rational criteria […]” (Jiménez-Aleixandre and Erduran 2008, p. 5).

The complex interrelation of these aspects reflects the versatility and interdisciplinarity of the topic (Bricker and Bell 2008) and is responsible for the fact that there is no uniform picture of what exactly is understood by argumentation. The aspects mentioned above also concern the question of how qualities of arguments and argumentative processes in the classroom can be analysed, evaluated and finally fostered (Sampson and Clark 2008). In line with the different conceptualizations in this domain (Footnote 3), instructional approaches to promoting the argumentative abilities of students also differ considerably.

To analyse the structure and quality of arguments, different domain-general approaches (evaluating the quality of arguments without reference to the scientific domain concerned) and domain-specific approaches (evaluating the quality of arguments with reference to the scientific domain concerned) have been developed (Sampson and Clark 2008). A framework for the structural analysis of arguments, still prominent today, was proposed by Toulmin (1958, 2003) and has been widely adopted in science education research. Modifications and extensions of approaches to analysing argumentation can also be found, which aim to deal with methodological problems or to broaden the analytical range (Footnote 4). In their study, Zohar and Nemet (2002) focus on the way students justify their arguments; Kelly and Chen (1999) and Kelly and Takao (2002) take into account the epistemic level of students’ claims, whereas Lawson’s approach (2003) is closely related to the scheme of hypothetico-deductive reasoning.

This diversity of approaches points to some theoretical and methodological problems relating to the vagueness of the existing methods of analysis. Many frameworks are limited in that they describe only parts of argumentative processes. Some focus on domain-specific aspects, e.g. the epistemic level of a claim in a well-defined geographical context (cf. Kelly and Takao 2002). Others concentrate on domain-general, structural aspects of arguments, or are used in research primarily in this way (Toulmin 1958, 2003). As a result, the quality of arguments is frequently underdetermined. For example, a student’s statements may contain all the relevant features of an argument according to Toulmin and thus would have to be judged as appropriate, although their content might be wrong with respect to the scientific domain concerned (Sampson and Clark 2008, p. 452). There is a lack of differentiated approaches that connect domain-general and domain-specific criteria to analyse arguments in detail. Only in this way would it be possible to take into account both structural aspects and the empirical data of the relevant scientific domain and thus to derive a substantiated and multi-faceted evaluation of argument quality. Moreover, several methodological difficulties emerge in the context of Toulmin’s framework for the structural analysis of arguments. Some of the categories specified by Toulmin are considered not to be sufficiently selective, so that problems of reliability and validity occur (Erduran 2008). As a result, many authors use Toulmin’s categories as a starting point for their analysis and refer to him, but also tend to adapt the evaluation scheme to their data in order to improve selectivity, for example through the inclusion of subcategories (cf. Kelly et al. 1998).

In what follows, a model-based theory of argumentation will be presented to deal with the above-mentioned problems. This framework refers to the model-based view in the philosophy of science (Giere 1988, 1992, 1999, 2004) and to insights derived from it that adequately describe and make comprehensible the process of scientific investigation (Giere 2001; Giere et al. 2006). The inclusion of results from cognitive science concerning the role of mental modelling in processes of argumentation and problem solving (Footnote 5) will substantiate this framework and apply it in the context of science education to pursue the following goals.

A model-based approach is suitable to explain the multi-faceted nature of argumentation in practical experience, which stays underdetermined if arguments are analysed structurally as “claim-like components” (Sampson and Clark 2008, p. 464). From the perspective of the arguer, argumentation also concerns several interdependent pieces of information, e.g. if different aspects of a broader topic are to be discussed or preconditions for valid conclusions are questioned. This is comprehensible from a model-based perspective, because arguments have to be considered in the context of more complex models or clusters of models, which explain real phenomena. In this sense, arguments underline the plausibility of parts of these models, which have to be evaluated critically in a process of argumentation. Other authors share the understanding of arguing as a form of critical evaluation (van Eemeren and Grootendorst 2004). Furthermore, a model-based approach offers a clear point of reference to determine the quality of arguments, because not only structural, domain-general, considerations have to be taken into account, which is often seen as an important limitation (e.g. Driver et al. 2000). The quality of arguments as statements about the plausibility of (parts of) models can be determined depending on the context in reference to relevant data of the concerned scientific domain. In the sense of an analytical approach for science education, this marks a relevant simplification and broadening in order to evaluate the quality of arguments and argumentation processes. After demonstrating the contributions of the philosophy of science and cognitive science for a model-based theory of argumentation, a revision of this approach for the context of science education will be presented.

2 Theoretical Background: A Model-based Framework of Arguments and Argumentation

Before exploring the topic of argumentation, the units of analysis have to be clarified. What counts as an argument, and how can different arguments be separated from each other? “A major issue in the study of argumentation in either written or verbal data is the unit of analysis. What becomes of the boundary markers of the data where arguments begin and end? Decisions have to be made regarding how the data will be split and subsequently how the chunks will be categorized and interpreted” (Erduran 2008, p. 57). Additionally, it has to be pointed out whether the analysis of such units suffices to represent and assess all relevant competences and qualities in the context of argumentation. Following Kuhn and other authors (Footnote 6), a preliminary distinction between argument as product and argumentation as process will be made and further differentiated in the course of this article.

Although this separation makes sense from a model-based perspective, both aspects are closely connected: "The terms argument and argumentation reflect the two senses in which the term argument is used, as both product and process. An individual constructs an argument to support a claim. The dialogic process in which two or more people engage in debate of opposing claims can be referred to as argumentation or argumentative discourse to distinguish it from argument as product. Nonetheless, implicit in argument as product is the advancement of a claim in a framework of evidence and counterclaims that is characteristic of argumentative discourse" (Kuhn and Franklin 2006, p. 979). Before describing a model-based theory of argumentation for use in the context of science education, it has to be shown in which way such a perspective is supported by approaches in the philosophy of science. Additionally, reference will be made to cognitive science to describe the relevant mental capacities involved in argumentation. It is important to make clear to what degree model-based perspectives from two different scientific domains (philosophy of science and cognitive science) can be compared, where such a comparison might give fruitful insights into the fundamentals of argumentation, and where its limits become obvious.

2.1 The Model-based View in the Philosophy of Science

For some decades now, a model-based understanding of science has become increasingly significant (Develaki 2007; Koponen 2007). Before outlining its relevance in the context of a theory of argumentation, a few general notes have to be made to explain its background.

According to the statement view, broadly accepted in the philosophy of science in the twentieth century, scientific theories are axiomatic systems of theoretical statements: "In the statement view, which was basically developed by logical empiricism, the axioms of a theory are essentially understood as immediate descriptions of the real world and as statements of universal scope, verifiable through logical and empirical proofs, the experience of the senses is thus posited both source and ultimate verification of scientific knowledge" (Develaki 2007, p. 726). Despite its broad acceptance in the first half of the twentieth century, several theoretical problems in the context of the statement view led to severe criticism. Since the 1970s, the model-based view as an alternative philosophical approach has attracted a lot of interest from contemporary philosophers (Develaki 2007, p. 728). The model-based view (mbv) in the philosophy of science (van Fraassen 1980a; Giere 1988) is based on the semantic view (or non-statement view) of theories (Suppe 2000). Giere is one of the most prominent proponents of a realistic mbv. This position seems to be a plausible foundation for science education (Matthews 1997, 2007; for a critical review of current realistic versions of the mbv cf. Koponen 2007), and is theoretically closely oriented towards processes of scientific practice (Giere 1988, 1999, 2004), including elements of cognitive science: “The cognitive model of science focuses on how scientists work and communicate […] and highlights the semantic aspects of theories: their goal is not to reach truth but to make sense of the world” (Hacking 1983: in Izquierdo-Aymerich and Aduriz-Bravo 2003, p. 31). This refers to an understanding of reasoning as mental modelling in cognitive science (Koponen 2007, p. 752; cf. Sect. 2.2).

For Giere, models of different degrees of abstraction are used in science to represent aspects of reality (cf. Fig. 1; Giere 1988, 1992, 2004, 2010). In line with the non-statement view of scientific theories (e.g. Suppe 1989, 2000), models are not to be understood as linguistic entities, i.e. as statements whose connection to reality is described in terms of truth or falsehood. They are intermediate representational entities, whereas statements can only be used in the sense of definitions (Giere 1999, p. 73). The relation between models and reality is one of similarity, which can be analysed empirically through scientific investigation. This allows a clear demarcation from relativistic epistemological positions, and consequently more adequate models can be generated: “The question for a model is how well it “fits” various real-world systems one is trying to represent. One can admit that no model fits the world perfectly in all respects while insisting that, for specified real-world-systems, some models clearly fit better than others. The better fitting model may represent more aspects of the real world or fit in some aspects more accurately, or both” (Giere 1999, p. 93). Such an understanding of science overcomes the central criticism of the correspondence theory of truth (e.g. van Fraassen 1980b) directed at earlier, more naïve realistic positions, because here it is “[…] a conception of realism that is not dependent in any important way on the concept of truth as direct correspondence between a statement and reality. With intended irony, I call this view ‘constructive realism’. Models are human constructs, but some may provide a better fit with the world than others, and be known to do so” (Giere 1992, p. 97).

Fig. 1 Epistemic levels in the mbv (according to: Giere 2010)

In this way, a problematic notion of truth, which has been the basis for seeing science as a system of axioms and, for example, for formulating laws of nature, is abandoned. Giere’s position could be regarded as a synthesis of scientific realism and cognitive constructivism (Grandy 1997), because the relation between models and reality can be determined more closely and the models themselves are results and constituent parts of cognitive processes. The mbv understands science as a process of generating models by continuously evaluating their similarity to reality. If, for example, only one model is able to explain the results of an actual experiment, an agreement between the data obtained and the predictions of the model is an indicator of the better fit of this model compared with plausible rivals (cf. Giere 1999, p. 75): “One possibility is to define science as a process of constructing predictive conceptual models. This definition unites both the processes and products of science, and identifies model building as a superordinate process skill. Within this framework, the purpose of research is to produce models which represent consistent, predictive relationships” (Gilbert 1991, p. 73).

2.1.1 The Hierarchical Structure of a Model-based View in Science

A model-based perspective on arguments provides another aspect that can fruitfully be implemented in the context of science education. Different elements (e.g. claims) that occur in the course of the argumentative process can be associated with different epistemic levels. This has been acknowledged as a promising extension of the focus of the analysis (cf. Kelly and Takao 2002). The mbv allows several possible gradations (e.g. Fig. 1). This representation has to be understood as exemplary; depending on the underlying theoretical foundation and the concrete research question, modifications are possible.

Giere describes “Principled Models” (Fig. 1) as abstract models like “laws of nature”, which have to be specified through additional conditions into more concrete, representational models: “By adding conditions and constraints to the principled models one can generate families of representational models that can be used to represent things in the world” (Giere 2010, p. 270). These models can then be related to reality: “How does one connect abstract models to specific real physical systems? This requires at least two processes which I call “interpretation” and “identification” (Giere 1988, pp. 74–76). For interpretation, elements of an abstract principled model are provided with general physical interpretations such as “mass,” “position,” and “velocity.” Such interpretations are already present in the statements that characterize the principled models. Scientists do not begin with an “uninterpreted” formalism and then “add” interpretations. For identification, elements of a representational model are identified (or coordinated) with elements of a real system” (Giere 2010, p. 271). Hypotheses specify and connect representational models with phenomena or models of data. Their degree of generalisation defines the specific or more general real world system (or model of real world data) to which the representational models refer. The fit of these specified representational models can be assessed in comparison with data or models of data (Giere 2010, p. 271; for different hierarchies cf. e.g. Halloun 2007).

Fig. 2 Steps in analysing studies that include theoretical hypotheses. Step 1 (Real world): the focus of the study (aspect of the real world) has to be identified. Step 2 (Model): a model (whose fit with the real world is to be assessed) has to be chosen. Step 3 (Prediction): predictions based on the focused model and the experimental setup have to be identified that give insights into the data that should be obtained if there is a good fit between the model and the real world aspects in focus. Step 4 (Data): the data that have been obtained involving the real world aspects have to be identified. Step 5: Is there negative evidence? Step 6: Is there positive evidence? (according to: Giere et al. 2006, pp. 34–36)

Following Portides (2007) in differentiating the theory/experiment relation and the notions of approximation and idealisation, one should emphasize that abstract theoretical models are idealisations and do not directly represent physical systems. Only in a concretised (de-idealised) form does he acknowledge that they acquire a capacity of representation, which includes combining the theory-derived description with empirical laws and other auxiliaries. Portides therefore characterises this as an approximation of the corresponding physical system: “Theoretical models refer to a class of ideal-types whose empirical content is supplied when they are used in the construction of a model c [concretized model; note from the author] by de-idealising them, thus changing the reference to actualisable physical systems and thus meaningfully employing the notion of approximation” (Portides 2007, p. 720). These concretised models are therefore the most relevant ones in the context of classroom argumentation, as they can be related to (models of) empirical data. Among the concretised models, hierarchies and clusters of models also occur, which represent interconnected real world phenomena well and make differentiated model construction necessary in the context of science education as well: “It is thus the hierarchical organisation of models which makes it possible to use them in different levels of abstraction and generality and as a basis for general descriptions as needed in different practical situations” (Koponen 2007, p. 758). As Portides concludes, discussing processes of idealisation and approximation in science education in the course of model construction might help students to gain insights into the theory–experiment relation and to achieve a better understanding of the nature of science (Portides 2007, p. 721).

As the course of argumentation will be presented mainly as relating models to empirical data in the following sections, one more aspect has to be added: the philosophical underpinning of this procedure, as a variant of the Duhem-Quine thesis, still needs to be outlined in more detail. Accordingly, Koponen states: “[…] there is a mutual fitting of theoretical models to empirical results, as well as models of empirical results to theoretical models. Moreover, in the latter, not only are results idealised but the experiments themselves are often changed, and the way the phenomena are produced is altered” (Koponen 2007, p. 764). Based on the theoretical background outlined so far, a model-based understanding of arguments and argumentation will be presented in the next section.

2.1.2 A Model-based Understanding of Arguments

In the context of a model-based approach, the goal of argumentation is the evaluation of models to determine their “fit”. Processes of argumentation are thus to be described as sequences of presenting and verifying the similarity of one or multiple models to reality through empirical data. The question emerges of how the similarity between models and reality can be determined. According to Giere, this can be done by deducing predictions from the model, which are then related to existing data: “A “hypothesis” is a claim (statement) that a fully interpreted and specified model fits a particular real system more or less well” (Giere 2010, p. 271). From this point of view, arguments can be understood in the following way:

Arguments are indicators for or against the fitting of a model according to its logical coherence or in comparison to empirical data.

The better a model represents the focused aspects of reality, the better the arguments that can be found for its appropriateness. The underlying process of understanding and construction is presented following Giere (Fig. 2). The results of empirical investigations are data or models of data. They can be related to the constructed model by comparing them with predictions deduced from the model. A high similarity between a model and reality results in a high agreement between these predictions and existing data. If the correspondence of model predictions and data is only marginal (Fig. 2, Step 5: No), the cause is not necessarily an inadequate model; it may also lie in wrong auxiliary assumptions, which are always involved in processes of scientific investigation. This is one version of the Duhem-Quine thesis, and Giere proposes to deal with it by weakening the possible conclusion and speaking only of preliminary evidence of an insufficient model fit (Giere 2001).

The same applies to the correspondence of predictions and data (Step 6: No). Here it can also be relevant to clarify whether the data could also be explained through alternative models and their predictions (Giere 2001, p. 31). The statement by Osborne et al. (2004), “Thus, supporting scientific argument in the classroom requires that relevant evidence be provided to students if arguments of better quality are to be constructed and evaluated”, is not only compatible with the model-based perspective; it may even be that one should speak not of arguments but of models that have to be constructed and evaluated. Arguments are the reasons to assume or to reject a similarity between models and reality. In what follows, it has to be shown how this is connected to the process of argumentation.
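The evaluation logic described in the two preceding paragraphs can be illustrated with a minimal sketch. It is a simplified reading of the text, not Giere's own formulation: the function name, the two boolean inputs and the three verdict labels are hypothetical and introduced only for illustration. In particular, treating the case in which a rival model explains the data equally well as "inconclusive" is an assumption made here to keep the example small.

```python
from enum import Enum


class Verdict(Enum):
    # Hypothetical labels; hedged as "preliminary" because wrong auxiliary
    # assumptions (Duhem-Quine) can never be fully excluded.
    PRELIMINARY_EVIDENCE_AGAINST_FIT = "preliminary evidence against model fit"
    PRELIMINARY_EVIDENCE_FOR_FIT = "preliminary evidence for model fit"
    INCONCLUSIVE = "inconclusive"


def evaluate_model_fit(prediction_matches_data: bool,
                       rival_model_explains_data: bool) -> Verdict:
    """Sketch of the comparison between model predictions and data."""
    if not prediction_matches_data:
        # Predictions and data disagree: only preliminary evidence of an
        # insufficient fit, since auxiliary assumptions may be at fault.
        return Verdict.PRELIMINARY_EVIDENCE_AGAINST_FIT
    if rival_model_explains_data:
        # Agreement alone does not favour the model if a plausible rival
        # predicts the same data (assumption: treated as inconclusive).
        return Verdict.INCONCLUSIVE
    # Agreement that no plausible rival accounts for.
    return Verdict.PRELIMINARY_EVIDENCE_FOR_FIT


if __name__ == "__main__":
    print(evaluate_model_fit(True, False).value)
```

In this reading, an argument corresponds to one such comparison, whereas argumentation would repeat the comparison for several rival models and data sets.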

2.1.3 A Model-based Understanding of the Process of Argumentation

From the perspective of the mbv, the evaluation of the model fit represented above (Fig. 2) can be understood as argumentation. This process may take place in different forms e.g. as an individual or joint evaluation of one or more models. Therefore, in reference to the model-based view, argumentation can be described in the following way:

Argumentation is a process of critical evaluation of models in the sense of verifying the appropriateness of one or multiple rival models according to their logical coherence and the available, empirical data.

This is comparable to the understanding of other authors concerning the goal of argumentation: “It should be born in mind, however, that the primary aim of a critical discussion is not to maximize agreement but to test contested standpoints as critically as possible by means of a systematic critical discussion of whether or not they are tenable” (van Eemeren and Grootendorst 2004, p. 188).

Having compared the fit of rival models, a decision-theoretical problem finally emerges. In the light of the existing data, it has to be decided whether a model is adequate or which of several rival models is more adequate. But this goes beyond the actual process of argumentation, because normative criteria of the scientific domain concerned also have to be taken into account. For example, it has to be decided whether the available arguments are sufficient to accept or to reject models on well-founded grounds. This depends, among other things, on the criteria which decide whether data have to be considered relevant, which relationships between model predictions and data are legitimate, and which structure of a target model is acceptable. From a model-based perspective, these premises can also be described as models of normative criteria of the scientific enterprise, which in a way constitute the framework of scientific investigation in a concrete domain. These criteria might reflect both normative assumptions and proven value in past practice in the scientific domain concerned, and they are pragmatically retained until better ones become evident. In the sense of the above-mentioned Duhem-Quine thesis, it is neither realistic nor practicable always to question every aspect of every model involved or to demonstrate in which sense each one is sufficiently well founded. This means that although some of the underlying normative models could be found to be inadequate, in processes of argumentation the critical evaluation can and has to be limited to the relevant models in focus, which mostly include the conceptual target models and possibly some background models (Giere et al. 2006). These background models of normative criteria in science are always relevant for the educational context, because in argumentation the focus might switch from the target models in question to the validity of underlying inferences. This happens, for example, when students do not share the same understanding of the criteria assumed to be relevant for the context. They might then consider it necessary to negotiate these criteria before the process of model evaluation/argumentation concerning the conceptual target model can continue. In sum, this is why model evaluation on different levels can occur and be analysed in processes of argumentation: both conceptual target models and models of underlying scientific practice might be concerned. For reasons of simplicity and coherence, these background models and the question of how they influence decision making in model evaluation processes for or against conceptual models will not be discussed in detail.

As noted earlier, the presented model-based view in the philosophy of science is based on the concrete procedures of scientific inquiry which means it is connected to the cognitive processes that are relevant in this context. In what follows, the presentation of results in cognitive science aims at legitimating this reference. Furthermore, the mental model approach in cognitive science is introduced as a second theoretical foundation of a model-based theory of argumentation. In consequence, the model-based perspective not only allows describing products and processes in the context of argumentation but also the competences arguers have to exhibit in order to participate successfully.

2.2 A Model-based Theory of Reasoning and Cognition

Since the early 1980s, the model-based approach in cognitive science, which describes reasoning and cognition in terms of mental modelling, has been broadly established (Johnson-Laird 1983, 2006; Nersessian 2002, 2008a). Following this account, models are very generally described as representations of aspects of reality, which are constructed to fulfil specific purposes in the reasoning process. “A “mental model” is a structural, behavioral, or functional analog representation of a real-world or imaginary situation, event, or process. It is analog in that it preserves constraints inherent in what is represented” (Nersessian 2008a, p. 93). For example, in problem-solving tasks mental modelling helps to simulate strategies in order to test their efficiency: “There is a growing literature in psychology and neuroscience that investigates the hypothesis that the human cognitive system possesses the ability for mental animation or simulation in problem-solving tasks” (Nersessian 2008a, p. 112). This seems to be a very general advantage of model-based cognition: “Modern advocates of mental modelling also speculate that the origins of the capacity lie in a general capacity to simulate possible ways of maneuvering within the physical environment. Since it would be highly adaptive to possess the ability to anticipate the environment and potential outcomes of actions, many organisms should possess the capacity for simulation” (Nersessian 2008a, p. 107).

The quality of a model can be deduced by comparing the goal of the reasoning process with the appropriateness of the model for reaching this goal: “A satisfactory model is one that exemplifies features relevant to the epistemic goals of the problem solver. Through the models the reasoner is able to grasp insights and gain understanding, and is warranted in pursuing where the inferential outcomes deriving from the model might lead with regard to the target phenomena” (Nersessian 2008a, p. 157).

2.2.1 Intra- and Interpsychological Argumentation

In describing processes of argumentation, it is illuminating to differentiate not only between arguments and argumentation but also between intra- and interpsychological argumentation (Garcia-Mila and Andersen 2008). Interpsychological argumentation describes the process of argumentation between two or more persons who exchange and critically evaluate arguments for different standpoints or models. Intrapsychological argumentation refers to the internal process a person undergoes in order to evaluate the fit of one or more models. Insights from cognitive science concerning the general human capacity for mental modelling, which generates predictions based on existing data, make these two foci comparable (Nersessian 2002, 2008a; for differentiations of the current debate see also Schaeken et al. 2007). In this sense, the underlying cognitive mechanisms of intra- and interpersonal argumentation are comparable. Interpersonal argumentation is simply marked by a higher degree of complexity, because a greater number of rival models is likely to be introduced by others, which have to be considered and evaluated by the arguer in the course of the argumentation. Moreover, interpersonal argumentation demands linguistic abilities to make the generated arguments for or against models explicit. “Most often, scientific questions are posed by means of two, or sometimes three or four, competing theories. The process is one of debate, with individuals typically playing advocacy roles. To participate, an individual scientist must analyze the evidence and its bearing on the different theories as a means of argument to the scientific community in support of his or her view. Equally important, this analyzing and weighing process of argument is, in interiorized form, almost certainly an important part of what goes on in the private thought of the individual scientist. Scientists are well aware that explicitly justified arguments are needed to convince the scientific community, and they become accustomed to thinking in such terms” (Kuhn 1993, p. 321). Basic principles of intra- and interpersonal argumentation are relevant for science: “[…] argumentation involves a set of core processes: the coordination of theory, evidence, and methodology that are common in the internal dialogic argumentation involved in scientific reasoning and the external dialogic argumentation involved in science discursive practices, both of them essential in science learning” (Garcia-Mila and Andersen 2008, p. 40). Nevertheless, explicit discourse is important for promoting argumentative competencies in science education, because students are consequently obliged to take a clear position and to defend it against criticism or rival models (Kuhn and Franklin 2006; Kuhn and Udell 2003).

2.2.2 Comparing Scientific Model Construction and Mental Modelling

Although some parallels between mental modelling and the characteristics of the model-based view (mbv) in the philosophy of science outlined above are evident, there are at least several differences of degree. This is for example the case for the evaluation of the quality of subjective mental models and scientific models (for this differentiation see Seel 2006). Thus, a distinction has to be made between models that are individually satisfactory according to subjective criteria and scientifically adequate models. This is a precondition for appropriately evaluating the quality of models in the context of the classroom. Mental models are reduced in complexity as a consequence of the limited cognitive capacities of humans, e.g. of working memory. This is why in some cases wrong conclusions are predictable (Johnson-Laird 2006). Furthermore, it has to be pointed out that consensus models exist in an explicit form, whereas mental models exist only as internal and sometimes implicit constructions. So if statements of students are to be analysed, the quality of the underlying mental models can only be deduced indirectly.

The model-based view in the philosophy of science, based on cognitive mechanisms of reasoning, naturally exhibits similarities with the theory of mental modelling in cognitive science, which refer to the basic principles underlying argumentation respectively model evaluation: “How does this notion of mental modelling as simulative reasoning relate to the exemplars of reasoning practices in the sciences? My account casts the specific conceptual changes as arising from iterative processes of constructing, manipulating, evaluating, and revising analog models to satisfy constraints” (Nersessian 2008a, p. 128). Such an understanding complies with actual perspectives of the nature of science, according to which scientific thinking and everyday thinking only differ in their degree of systematicity: "Scientific knowledge differs from other kinds of knowledge, especially from everyday knowledge, by its higher degree of systematicity" (Hoyningen-Huene 2008).

Of course, one has to be careful in comparing theoretical approaches from different scientific domains. This becomes evident in regarding some of the open questions concerning mental modelling in cognitive science, e.g. its role in conceptual and procedural cognition, for long-term memorization (Baguley and Payne 2000) or its connectedness to unconscious and emotional cognitive processes (e.g. Johnson-Laird 2009). Nevertheless, in sum, this approach has proved to be fruitful and applicable to multiple topics (e.g. in the context of subjectivity and consciousness, c.f.: Metzinger 2003). Thus, a model-based theory of argumentation is broadly grounded on an understanding of reasoning and cognition in cognitive science, in which context arguments, closely related to the fit of models, and argumentations are to be described in relation to critical evaluations of model fit. In what follows, in undertaking conceptual and theoretical refinements and extensions, this approach will be adapted to develop a model-based theory of argumentation useful for and applicable in science education.

2.3 A Model-based Understanding of Science Education

According to the perspective presented here, arguments are understood as indicators of model fit, and argumentation is an intra- or interpsychological process of model evaluation that serves to discard inadequate models and to establish more adequate ones. It follows that arguments and argumentations always have to be regarded in relation to the models and the processes of model evolution they are connected with. This includes both the phases of model evaluation, which are central to the analysis of argumentation, and the steps of model modification that improve the fit of the models. The latter may concern their logical coherence or their appropriateness in comparison to the available data. Consequently, in order to describe and evaluate argumentation in the classroom context, the whole process of model evolution has to be reconstructed so that the phases of argumentation, i.e. of model evaluation, can be identified and analysed in detail.

Therefore, it is referred to the approach of Clement and Rea-Ramirez (2008) who also reconstruct and structure instructional sequences in the classroom from a model-based perspective. Their methodology and descriptive terms can be adapted and then used to successfully analyse argumentation. Such a description of educational and argumentative processes is based on normative decisions about the relevant focus of the analysis. Having introduced in general the perspective of classroom practice as model development, the goal dimensions of argumentation will be presented in more detail. This will help to clarify the before-mentioned normative underpinnings. Later, the theoretical and methodological consequences for the concrete analysis of argumentations will be worked out.

According to Rea-Ramirez et al. (2008), learning processes in the classroom can be described as the development of models in order to reach a target model, which might represent the goal of the particular lesson or unit. Other authors differentiate the structure of target models in more detail, e.g. by considering concepts as elementary building blocks of models and by defining dimensions for the construction of concepts in the context of modelling (Halloun 2007, p. 671). Starting from their preconceptions, i.e. their initial mental models, students have to develop a model close to the target model, undergoing steps of intermediate model construction. Model development in the classroom might therefore be characterised by repeated steps of (1) model construction, (2) model evaluation and (3) model modification. If considerable problems occur in the course of model evaluation, e.g. in comparing a model with existing data, a new model has to be generated. ‘Minor problems’ could possibly be compensated for through modifications. The target model has to be determined with respect to the general surrounding conditions (developmental level of students, relevant standards, scientific consensus model, etc.; for a more detailed modelling learning cycle cf. also Halloun 2007, p. 658).
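The cycle of construction, evaluation and modification described in this paragraph can be illustrated with a minimal sketch. It is not part of the approach of Rea-Ramirez et al. (2008) or Halloun (2007); it rests on strong simplifying assumptions that are introduced here only for illustration: a model is encoded as the set of aspects it claims, data as the set of observed aspects, a 'problem' is any aspect on which the two disagree, and the threshold `major` that separates considerable from minor problems is arbitrary. The helper names `construct_from_data` and `repair_one` are hypothetical.

```python
from typing import Callable, FrozenSet, List, Tuple

# Assumed encoding: a model is the set of aspects it claims,
# the data are the set of aspects actually observed.
Model = FrozenSet[str]


def mismatches(model: Model, data: Model) -> Model:
    """Model evaluation: aspects on which model and data disagree."""
    return model ^ data


def develop(
    initial: Model,
    data: Model,
    construct: Callable[[Model], Model],
    modify: Callable[[Model, Model], Model],
    major: int = 3,        # assumed threshold for 'considerable problems'
    max_cycles: int = 10,  # guard against endless cycling
) -> List[Tuple[str, Model]]:
    """Repeat construction, evaluation and modification; return a log of steps."""
    model = initial
    log = [("construction", model)]
    for _ in range(max_cycles):
        problems = mismatches(model, data)      # (2) model evaluation
        if not problems:
            break                               # model fits the data
        if len(problems) >= major:
            model = construct(data)             # (1) new model construction
            log.append(("construction", model))
        else:
            model = modify(model, problems)     # (3) model modification
            log.append(("modification", model))
    return log


def construct_from_data(data: Model) -> Model:
    """Hypothetical constructor: a fresh model covering two observed aspects."""
    return frozenset(sorted(data)[:2])


def repair_one(model: Model, problems: Model) -> Model:
    """Hypothetical modifier: resolve one problem (add or drop one aspect)."""
    return model ^ {min(problems)}


data = frozenset("ABCD")
log = develop(frozenset("XYZ"), data, construct_from_data, repair_one)
for kind, model in log:
    print(kind, sorted(model))
```

In this toy run the initial model disagrees with the data on every aspect, so a new model is constructed; the remaining minor problems are then resolved by two modifications.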

It is only consequent to describe educational practice based on results of cognitive psychology concerning mental modelling, because human reasoning and cognition can in general be adequately described this way. The possibility to also describe in detail processes of conceptual change on such a theoretical background (Nersessian 2008b), is related to this notion. No matter how the concrete learning environment is structured, it will initiate mental modelling on the students’ side. The goal is of course to facilitate processes of mental modelling that result in the construction of a model similar to the target model determined by the teacher. The question how to foster such model construction is not focused upon here. Nevertheless, a model-based perspective on instructional practice has already been introduced in science education (Núñez Oviedo and Clement 2003; Silva 2007; Tamayo and Sanmartí 2007; Clement 2008; Clement and Rea-Ramirez 2008).

The presented approach clarifies the understanding of argumentative processes: argumentation takes place in the course and as part of superordinated model evolution. Argumentation is a phase of model evaluation, in which arguments for the fit of considered models are to be checked critically. Before outlining ways of assessing the described processes of argumentation, it has to be specified which aspects are relevant in this context and should be assessed. This will be realised in presenting a prescriptive model of argumentative competence.

2.4 A Model of Model-based Argumentative Competence

Considering the model-based understanding of argumentation presented in the previous sections and the concretization of the involved argumentative processes, the question arises, which competences have to be regarded as a precondition of adequate arguing. Hodson (1992) offers an adequate reference as a starting point in distinguishing between three goal dimensions of science education: “doing science” describes the specific strategies and methods in science, “learning about science” the understanding of the nature of science and “learning science” refers to the relevance of scientific knowledge. According to these general goals of science education, in the context of a prescriptive competence model of argumentation, it is differentiated here between ‘arguing’ concerning the procedural competence, ‘understanding of arguments and argumentation’ as the superordinated understanding of the goals and structures of argumentative processes and ‘knowledge of (aspects of) models and data’ as the knowledge of relevant models and data as points of references in argumentation (Fig. 3).

Fig. 3 Prescriptive model of argumentative competence

These dimensions will be explained and substantiated in more detail from the model-based perspective. Furthermore, some of the abilities typically associated with argumentation will be related to these dimensions. Because processes of argumentation are to be understood as certain phases in model evolution, the abilities connected with model evolution have to be taken into account as well when the separate dimensions are sketched out. For a detailed examination of the competencies relevant in the context of models in science education, a general model of model competence by Meisert (2008), which likewise follows Hodson (1992), can be consulted; the scheme presented here (Fig. 3) refers to it.

2.4.1 Knowing of (Aspects of) Models and Data

The dimension ‘knowing of (aspects of) models and data’ (Table 1) has a special role in the competence model. Generating adequate arguments means connecting models and empirical data, and this presupposes knowledge of both. Argumentation is possible only if models (or parts of them), the corresponding relevant data and rival models are known. It has to be pointed out that this dimension is not named ‘knowing of arguments’ because the generation of arguments is itself part of the dimension ‘arguing’. From the model-based perspective, ‘knowing of (aspects of) models and data’ simply reflects that arguing can only be realised on the basis of information about models and data. The better a certain model and the corresponding data are known, the better arguments for its plausibility can be deduced. Part of this is knowledge about an adequate logical structure of models, which has to be ensured independently of existing data because criticism could also refer to this aspect.

Table 1 Categories in the dimension ‘Knowing of (aspects of) models and data’

This is a precondition for adequately arguing in the course of an argumentative process for a certain model.

The above-presented classification is supported by actual results concerning the relevance of preexisting knowledge for argumentation (von Aufschnaiter et al. 2008). An insufficient knowledge of models or data therefore does not automatically indicate low procedural competencies in arguing. It is conceivable that novices who possess merely naïve initial mental models are able to gain insights in the inappropriateness of their initial models through evaluative steps in the course of an argumentation. It follows also that the knowledge about models and empirical data might vary considerably during the process of argumentation and the model evolution it is part of. This might possibly be a way out of the actual debate about reasons for inadequate argumentations, which are seen by some authors partly in deficient preconceptions, partly in deficient procedural abilities (Koslowski 1996; Sadler and Donelly 2006).

The knowledge of aspects of models and data is not only regarded in reference to one possible, explaining target model, but also concerning possibly relevant background models, which might be relevant (c.f. Sect. 2.1.3) and introduced as premises and the basis of scientific criteria in the general process of argumentation. Questioning ungrounded premises of target and background models might be the goal in argumentative discourse to weaken a rival position (Felton and Kuhn 2001). This is especially relevant if, in the course of an argumentation, the focus is switched from a target to a background model, e.g. in case of a dissent concerning the mentioned specific normative criteria. In discussions of novices, these background models can not be taken to be implicitly shared as it is often the case for experts in a scientific domain. So these models also have to be negotiated and evaluated (Kelly et al. 1998).

2.4.2 Arguing (as Model Evaluation)

The dimension ‘arguing’ (Table 2) includes the procedural competencies related to argumentation, with argumentation understood as intra- or interpsychological model evaluation (mostly embedded in superordinate processes of model evolution). This evaluation is carried out not only against empirical data but also with regard to the question of which logical structure of a model has to be considered accurate. The response to opposing positions/rival models is also related to this dimension: "In producing justified claims, several alternative theories must be coordinated with evidence in order to choose the evidence that best fits the justification of one of the theories" (Garcia-Mila and Andersen 2008, p. 35). From a model-based view, the process described here by Garcia-Mila and Andersen can be adequately understood as comparative model evaluation, a central aspect of argumentation (Table 2: 2b). Although this aspect is connected to general model evaluation (Table 2: 2a), it is noted separately, because comparison demands higher abilities: the appropriateness of a model has to be tested and assessed relative to other models.

Table 2 Categories in the dimension “arguing”

The ability to criticize models, to implement expressed model criticism, is also captured by these aspects (Table 2: 2a and b). Decisions for or against models are not part of this dimension. They clearly belong to processes of model evolution, when it has to be judged which rival model exhibits a greater fit, but including broader and more complex aspects, they are not part of the process of model evaluation/argumentation itself.

2.4.3 Understanding of Arguments and Argumentation

The dimension ‘understanding of arguments and argumentation’ (Table 3) takes into account the relevance of metacognitive abilities (Kuhn 1993, 2000, 2001; Kuhn and Dean 2004) for processes of argumentation (for a differentiation of the understanding of goals and strategies see also: Kuhn and Pearsall 1998).

Table 3 Categories in the dimension “Understanding of arguments and argumentation”

First of all, in the context of argumentation, it is important for the involved persons to recognize the goals of the process. This implies the understanding of arguments as reasons of model fit (Table 3: 3b). This is the basis of a substantiated understanding of argumentation as a process of critical evaluation and evolution of adequate models (Table 3: 3b). In concrete praxis, part of this is also the ability of the arguers to recognize on which level the dissent is situated. On the one hand, this might make necessary an evaluation of rival alternative models to explain the regarded phenomenon, on the other hand the premises of the process of knowledge construction and verification might not be shared. The latter would make it necessary to evaluate first of all these preconditions in the sense of rival models of normative scientific criteria (Table 3: 3b).

Such an understanding of the competencies involved in argumentative processes has to be further adapted and differentiated as an analytic scheme according to the concrete context. In studies in science education, for example, this differentiation depends upon the intended focus of the particular research question. Understanding argumentation as the evaluation of model fit can also be considered an important part of a differentiated nature of science concept that is based on non-naïve realism (Halloun 2007, p. 657; Matthews 2007, p. 651). But this aspect is not under discussion here.

In what follows, first an analytic scheme for the dimension of ‘arguing’ will be exemplified. The theoretical reflections presented so far explain why this scheme also has to evaluate processes of model evolution and insights of the arguers involved to describe argumentation. To satisfy this demand, the method of structuring model evolution processes in the classroom originally developed by Clement and Rea-Ramirez (2008) will be extended to represent and analyse argumentation. This methodological approach allows describing processes of model evolution and involved argumentation in its complexity.

3 Describing and Evaluating Argumentation in Science Education from a Model-based Perspective

The approach of Clement and Rea-Ramirez (2008) of structuring processes of model development in the classroom allows identification of which goal is pursued and which processes are undergone in the context of an argumentation. From the perspective of a model-based theory of argumentation, this goal does not consist in defending a singular claim but in validating the plausibility of models. To fulfil this goal, the process of model development has to be assessed during the whole course of the argumentative process, eventually including modified models that are presented by the involved arguers. This might appear to be a quite extensive procedure but has to be acknowledged to be necessary in the context of other analytic approaches as well (Kelly et al. 1998, p. 857). According to Clement and Rea-Ramirez (2008) a differentiation between initial model, intermediate model and target model seems to be promising to describe the different developmental degrees of the models on the arguers’ side. Furthermore, in this way in a following step of assessment, the processes of argumentation can be analysed to finally differentiate where they merge into phases of (new) model construction or modification, which are not to be considered as a part of the core process of argumentation.

Based on the past considerations, it is suggested to differentiate between the model level, the subject level, the data level and the process level in analysing argumentation processes (c.f. Clement and Rea-Ramirez 2008). Additionally, it is possible to add the instructional level in the context of science education because it could be interesting and elucidating to analyse argumentation processes in the classroom in relation to different instructional strategies (e.g. if the target model is directly introduced or has to be developed only based on empirical data). For reasons of simplification, this last aspect will not be discussed here.

The model level gives information about which models are constructed during the course of the model evolution and thus are available for argumentative evaluation. Both the models presented by the arguers as well as the arguments connected with them (in the sense of reasons for model fit) are open to change during the argumentation. Hence, this level also contains a temporal component. The point in time when different models are advocated should thus be captured in the form of a progressing process to mirror the developmental character (e.g. initial model at the beginning, intermediate model in the middle and target model of the students at the end of the argumentation). The representation of the advocated models to different moments in time indirectly involves the argumentative steps of model modification and construction having been undertaken by the arguers. Changed aspects of models between two points in time refer, for example, to a model modification. Because this is not the focus of the analysis of argumentation here, these steps of model modification and construction neither have to be examined any further nor to be worked out in detail on the process level. This way of analyzing the involved procedures also allows outlining every model that is temporarily in the focus of the discussion even if a background model (e.g. normative criteria of scientific methods) is concerned.

It is described on the subject level who at which moment participates in the argumentation; at the data level the existing data are represented, which constitutes an important point of reference to substantiate the appropriateness of models (in the sense of a fit between predictions derived from the model and data). On this level, possible changes of data are integrated, because this also influences the possibilities of constructing valid arguments.

The process level allows the description of the processes of model development and thus, in particular, a differentiation between argumentative steps of model evaluation and other aspects of model evolution such as model construction and modification. In the course of the argumentation/model evaluation, different types of arguments can be found. On the one hand, relations between an advocated or rival model and empirical data can be considered as reasons for model fit. On the other hand, arguments can refer to the logical coherence of the models. This may also concern the relation of several models to each other. Decisions for or against a certain model are not part of the analysis, because they go beyond the actual argumentative process of model evaluation. The result is the following scheme (Table 4) of analytic levels related to argumentation.

Table 4 Levels of analysis of arguments and argumentation from a model-based perspective
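For readers who wish to code transcripts along these levels, the four levels could be held in a small data structure such as the following sketch. The field names, the step labels ("evaluation", "comparison", "construction", "modification") and the encoding of models and data as sets of aspects are assumptions made here for illustration; they are not taken from Table 4 or from Clement and Rea-Ramirez (2008).

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple

# Assumed labels: only these step kinds count as argumentation (model evaluation);
# construction and modification belong to model evolution but not to its core.
ARGUMENTATIVE = {"evaluation", "comparison"}


@dataclass(frozen=True)
class Step:
    """One coded step on the process level."""
    time: str                 # point in time, e.g. "t1"
    arguer: str               # who acts (subject level)
    kind: str                 # "evaluation", "comparison", "construction" or "modification"
    models: Tuple[str, ...]   # model(s) the step refers to


@dataclass
class Analysis:
    # Model level: (time, arguer) -> aspects covered by the advocated model
    models: Dict[Tuple[str, str], FrozenSet[str]] = field(default_factory=dict)
    # Data level: time -> aspects of the data available at that time
    data: Dict[str, FrozenSet[str]] = field(default_factory=dict)
    # Process level: coded steps in chronological order
    steps: List[Step] = field(default_factory=list)

    def arguers(self) -> List[str]:
        """Subject level, derived from the model level."""
        return sorted({arguer for _, arguer in self.models})

    def argumentative_steps(self) -> List[Step]:
        """Steps of model evaluation, i.e. the argumentation proper."""
        return [s for s in self.steps if s.kind in ARGUMENTATIVE]


analysis = Analysis(
    models={("t1", "S1"): frozenset("AB"), ("t1", "S2"): frozenset("A")},
    data={"t1": frozenset("ABCDE")},
    steps=[
        Step("t1", "S1", "evaluation", ("M2",)),
        Step("t1", "S2", "modification", ("M2",)),
    ],
)
print(analysis.arguers())
print(analysis.argumentative_steps())
```

Keeping construction and modification steps in the record while filtering them out for the analysis mirrors the distinction drawn on the process level.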

Referring to the analysis of argumentation presented so far, in what follows, necessary preconditions of a model-based analysis of argumentation will be clarified, before a concrete example will be regarded later.

3.1 Preliminary Considerations and Proceedings for Model-based Analysis of Argumentation

The starting point of the analysis of argument structures and argumentative processes is the clarification of some basic questions which shall be presented while distinguishing the concerned levels of analysis. A precondition of every analysis is the transcription of the argumentation process as a preparing step. The statements e.g. of students in the transcripts are to be considered as explicited arguments and models, which are supposed to give hints concerning the structure of the students’ mental models. For reasons of simplification this methodological introduction on the model level will contain no differentiation between target and background models, although such a distinction would be possible and helpful to analyse argumentation processes in the classroom.

3.1.1 Model Level

An instructional sequence in a scientific course often consists of several steps or cycles of model construction explained above (Sect. 2.3). Mostly the teacher offers a succession of relevant data to help students to evaluate, develop and differentiate their actual models of the concerned phenomenon. It follows that a structural analysis of the arguments based on the model-based approach is possible without considering the target models determined by the teacher. For example, if students are discussing the quality of their models according to the data available, the number of arguments used to link models and data could be assessed. But to evaluate the quality of the arguments and thus of the argumentative process in relation to the related scientific domain, the target models the teacher intends to foster (and he/she determines as reachable based on the given data) have to be determined. This allows them not only to verify if certain argumentative processes, like the link of data to certain aspects of a model, occur in the course of an argumentation, but also to judge if this link is qualitatively correct. As the learning environment normally allows several steps of model development, the quality of the argumentative processes has to be compared to normative target models also in these intermediated sequences. All these steps have to be related to corresponding normative target models as students are restricted to argue for a model they are able to construct based on the data actually given, deviating variably from complex scientific models for the phenomenon.

Therefore, (1) the determination and visualisation of the target model is proposed first (cf. Sect. 3.2.1), followed by (2) the analysis of argumentation, i.e. the structuring and visualising of the initial, intermediate and target models of the participants in relation to the determined target model(s) and its relevant aspects (cf. Sect. 3.2.1). Step 2 is additionally proposed to facilitate the analysis of highly complex cases that might not be analysable from plain text/transcriptions alone. These steps provide an overview of the different focuses of the process of model development and make the later assessment of argumentative steps on the process level possible. The determination of such target models is a normal step in structuring learning environments and requires an answer to the question of which performance can be achieved by the arguers (e.g. students) given the existing data, their preconceptions, their developmental level and their shared normative criteria of science.

3.1.2 Subject Level

To present the complex steps of model development clearly, it is proposed here to structure and visualise them by distinguishing between the arguers concerned, i.e. to outline the model development separately for every person involved.

3.1.3 Data Level

To decide whether in the course of the argumentation valid arguments are used and models are correctly evaluated, the existing data has to be structured and visualised schematically, too. This also allows assessing the undertaken inferences between empirical data and (aspects of) models qualitatively. Part of this is to mirror eventually occurring changes in data, if e.g. new information is available and causes new conclusions concerning the appropriateness of models.

3.1.4 Process Level: The Quality of the Argumentation Process

This level focuses on the analysis of the process of model evaluation/argumentation. Every procedural step that can be considered as a reason for or against an adequate model fit is regarded as an argument. It is separated into the analysis of singular models and of several models in comparison to each other. Such a comparison of models exhibits a higher degree of complexity. It is therefore worthwhile to separately evaluate such statements which make this comparison explicit. In relation to the initially defined target models and the existing data, the correctness of the argumentative steps can be determined qualitatively. This judgement, in accordance with the concerned scientific domain, is represented by the second dimension “knowing of (aspects of) model and data” of the prescriptive competence model. The analysis of the process gives insights into how detailed it is argued for or against one’s own or rival models and how substantiated the argumentation is. Furthermore it allows inferences concerning the degree of the comparison of one’s own and rival models of the students and their ability and will to integrate criticism into their argumentation. Depending on the focus of the analysis, the described processes can be assessed separately or as a whole.

The above-mentioned possibilities of structuring, visualisation and interpretation will be exemplified in the following section. The developed scheme of analysis will be introduced and applied on a general, abstract example. At first, model, subject and data level will be visualised, before the process level of the argumentation will be added in a later step. This differentiated analysis helps to clarify the relation between argumentation and model evolution and the methodology presented so far.

3.2 Exemplified Analysis of Argument Quality

Following Clement and Rea-Ramirez (2008), structuring and visualisation are considered a fruitful and simplifying aid to the analysis of argumentation. A short and concrete example will introduce this analytic procedure. The procedure already requires a few processing steps (transcription, analysis of model development, determination of target models, etc.), which have been mentioned earlier and will not be discussed in more detail.

3.2.1 Exemplary Analysis of an Argumentation: Part 1–Model, Subject and Data Level

After presenting the first three levels of analysis (models, subjects and data; c.f. Table 5) the process level will be added in a second Table (Sect. 3.2.2).

Table 5 Model, subject and data level of analysis of arguments and argumentation

Two persons, Student 1 and Student 2 (S1 and S2; c.f. Fig. 4), each defend one model (M1 and M2) to explain a certain phenomenon. The different geometrical figures are used to differentiate the processes of model development of the two persons involved. At different time indexes (t1–t3) the models represent different aspects of reality (data) which are here abstractly represented with different letters. In comparison to the existing data, an increase in the fit of both models can only be considered between t1 and t2, where M1 keeps a higher general fit according to the data. At t3 of the argumentation a decision has been made in favour of M1, alternative models have been refuted. Background models are not part of the argumentation. The data also do not change in the course of this example (level D).

Fig. 4 Structuring and visualisation of the model, subject and data level from a model-based perspective (t point in time in a chronological order, S student, M actual model, A, B, C, etc. aspects of empirical data representable with the model)

Additionally, the fit of the model that has been developed and accepted in the course of the argumentation can be compared to the fit of the target model (TM; Fig. 5) determined previously to instructional parameters e.g. by a teacher. It ideally corresponds to the demands the students have been considered to be obligated and able to cope with (c.f. TM). The model developed and shared by the participants in this concrete example does not contain every aspect of the target model (Fig. 5).

Fig. 5 Comparison of the developed model (M1) and the determined target model (TM)

This abstract example shows how to analyse model, subject and data level to evaluate argumentations from a model-based perspective. In the next section, the argumentative processes are added in the analysis.
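The comparison of models, data and target model in this example can be made tangible with a short sketch. All concrete values are invented for illustration, since the aspects shown in Figs. 4 and 5 are not reproduced in the text: the data are assumed to comprise aspects A to E, the target model A to D, and 'fit' is assumed to be the share of data aspects a model represents. This measure is a simplification introduced here, not a definition from the model-based approach.

```python
from fractions import Fraction

# Invented values: five data aspects, constant over time (level D does not change).
DATA = frozenset("ABCDE")
# Invented target model (TM) determined beforehand, e.g. by a teacher.
TARGET = frozenset("ABCD")

# Model advocated by each student at each point in time (invented aspects).
# At t3, S2 has given up M2 and shares M1.
models = {
    "S1": {"t1": frozenset("AB"), "t2": frozenset("ABC"), "t3": frozenset("ABC")},
    "S2": {"t1": frozenset("A"), "t2": frozenset("AD"), "t3": frozenset("ABC")},
}


def fit(model: frozenset, data: frozenset) -> Fraction:
    """Assumed fit measure: share of the data aspects represented by the model."""
    return Fraction(len(model & data), len(data))


fits = {
    student: {t: fit(model, DATA) for t, model in by_time.items()}
    for student, by_time in models.items()
}

# Aspects of the target model that the shared final model does not contain.
missing_from_target = TARGET - models["S1"]["t3"]

for student, by_time in fits.items():
    print(student, {t: str(value) for t, value in by_time.items()})
print("not reached:", sorted(missing_from_target))
```

With these invented values the fit of both models rises between t1 and t2, M1 keeps the higher fit, and the model shared at t3 still lacks one aspect of the target model.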

3.2.2 Exemplary Analysis of an Argumentation: Part 2–Process Level

The structured levels of model evolution are connected to several steps of model evaluation, which can be assessed as arguments if they are made explicit by the arguers in the course of the argumentation and therefore could have been transcribed. The following Table 6 contains some central argumentative processes, but can also be modified or extended according to the concrete research interest. For example, the aspect how exactly one of the arguers reacts towards criticism directed against his model is not included here, although it could be relevant in the context of an analysis. In general, the analysis presented here has to be considered as a simplified version of what is theoretically possible based on a model-based approach.

Table 6 Process level of analysis of arguments and argumentation from a model-based perspective

In the corresponding figure (Fig. 6) different argumentative processes take place and are visualised. At t1 both persons criticise aspects of the other model according to the available data. As a consequence, S1 and S2 review their model and the modified version is represented at t2. At t2, S1 and S2 compare the appropriateness of their models (M1 and M2). S2 concludes that the fit of the model of S1 is higher according to the data and therefore shares this model at t3. This example contains argumentative processes of model criticism based on available data (t1) and the comparison of the fit of models (t2). A decision for an alternative model can also be stated (S2 shares M1 at the end) but, as mentioned earlier, is not relevant for the analysis of the argumentation as it goes beyond its central aspects.

Fig. 6

Structuring and visualisation of the process level of argumentation from a model-based perspective (t point in time in a chronological order, S student, M actual model, A, B, C, etc. = aspects of empirical data representable with the model)

Adding the process level to the structure presented above (Fig. 4) results in a more complete scheme of a structured and visualised process of argumentation:

The presented short sequence could, for example, be evaluated using Table 7, which allows assessment not only of the nature of the argumentative steps but also of whether these steps are to be considered qualitatively correct or wrong according to the data (and thus the concerned scientific domain). The corresponding cells have to be completed with numbers (not crosses) indicating how often correct or wrong inferences and connections have been made in the relevant period of time (e.g. t1; cf. Fig. 6). As an example, the argumentative steps visualised above (Fig. 6) are evaluated as “correct”. It has to be remembered that the decision whether an argumentative step is considered wrong or correct (Table 7) cannot be verified by the reader because no concrete data or model is given in the abstract example. More concrete analyses will follow later (Sect. 4).

Table 7 Exemplary analytic scheme for argumentation processes

In this example, the table indicates for both participants the same number of correct argumentative inferences. This might be astonishing, because the model of S2 differed more considerably compared to the target model at the beginning than the model of S1, but here the proposed scheme only evaluates their competences of ‘arguing’ (although other focuses are possible within this framework). According to this dimension, both demonstrated a comparable level of ‘arguing’, not influenced by the structure of the initial model that has been exhibited by each of them in the transcripts.

Now that the methodological principles of a model-based theory of argumentation have been sketched out, the goal of the next sections is to compare this approach with others to highlight its advantages in the context of science education. A theory of argumentation does not only have to be coherent and comprehensible, it also has to prove its value against other frameworks that are already in common use in this well-established field of research. Therefore, it will mainly be compared to the structural understanding of arguments promoted by Toulmin (1958, 2003), which is still very popular in the domain of science education. The goal is to outline some advantages of a model-based understanding of argumentation compared to Toulmin’s framework.

4 The Application of the Model-based Analytic Scheme

4.1 Comparing a Model-based Account of Argumentation with Toulmin’s Framework

A short comparison with the established analytic scheme of Toulmin (1958, 2003) is aimed at giving some evidence for the potential of a model-based theory of argumentation in relation to other approaches. After a short discussion of the restrictions in argumentation analysis using the Toulmin scheme in comparison to the model-based alternative, both approaches will be used to evaluate a short argumentative sequence as a practical example to outline the differences between their analytical focus and depth.

4.1.1 The Structural Analysis of Argumentation According to Toulmin

Following Toulmin (cf. Fig. 7, Toulmin 1958, 2003), an argument is constituted of an interconnected set of a claim (C), data (D), warrants to connect claim and data (“since W”), backings (B) to substantiate the warrants and rebuttals (R) which indicate under which circumstances the stated claim would not hold. Qualifiers (Q) describe the strength of the inferences and how universally they can be applied and are valid: “More specifically, a claim is an assertion put forward publicly for general acceptance. Data and warrants are the specific facts relied on to support a given claim. Backings are generalizations making explicit the body of experience relied on to establish the trustworthiness of the ways of arguing applied in any particular case. Rebuttals are the extraordinary or exceptional circumstances that might undermine the force of the supporting arguments. Toulmin further considers the role of qualifiers as phrases that show what kind of degree of reliance is to be placed on the conclusions, given the arguments available to support them” (Erduran 2008, p. 57).

Fig. 7

Toulmin’s analytical scheme (according to Toulmin 2003, p. 97)

In recent years, this scheme has been broadly used for the analysis of argumentation and has allowed insights into the structure of argumentative processes. However, in the course of its application, several methodological and theoretical problems have become evident. In what follows, these problems will be discussed and revisited from a model-based perspective of argumentation.

4.1.2 Focus and Discriminatory Power in Analysing Argumentation Processes

The Toulmin scheme is often used to assess relevant data in the context of argumentation. Classifying the different elements of an argument according to this understanding can generate problems of selectivity.Footnote 7 This is why authors using this scheme often reduce the number of categories they apply. Erduran et al. (2004) concentrate essentially on the existence of rebuttals and Kelly et al. (1998) focus on warrants: “One complication encountered by researchers in applying Toulmin’s framework, however, involves reliably distinguishing between claims, data, warrants, and backings because the comments made by students can often be classified into multiple categories” (Sampson and Clark 2008, p. 451). Zohar and Nemet (2002) have also merged data, warrant and backing into one single category in order to make the coding more manageable. From the perspective promoted here, these problems essentially occur because an understanding of arguments as “claim-like components” (Sampson and Clark 2008, p. 464) is somewhat oversimplified. It is therefore not always transparent which single statements represent the goal of the argumentation, “the point the student was trying to make” (Kelly et al. 1998, p. 856), and which simply have a substantiating or justifying character. These difficulties also mean that the Toulmin scheme is barely applicable to longer argumentative structures: “Furthermore, when arguments are longer, as is the case when students are writing a journal article or a position paper, statements may serve as a new claim (thus requiring support) or as a warrant for a preexisting claim […]. As a result, a researcher’s personal perspectives about what should count as a warrant, claim, or data will often influence how he or she codes a comment using this analytic framework. This type of bias typically has an adverse effect on interrater reliability and has caused some researchers to question the usefulness of this framework for studying arguments generated by students in the context of science […]” (Sampson and Clark 2008, p. 451).

In focusing on models as more complex constructs instead of singular claims, a model-based theory of argumentation not only avoids some of the stated problems of differentiation, it sometimes also makes it possible to explain why these problems occur in the context of Toulmin’s framework. If argumentation is understood as a process of evaluation of models, then, for example, it has primarily to be clarified to which model the statements made refer. It is possible that different statements describe several aspects of one and the same model, which in the course of the argumentation are consecutively evaluated according to the existing data. However, it is also possible that different statements describe different explanatory models, which contradict each other and have to be evaluated in comparison. In this sense, longer argumentative sequences (e.g. texts) can be understood as the presentation of the evidence and reasons why a certain model is more ‘fit’ than another and thus can also be analysed based on this perspective.

In addition to statements about concrete target models, or their relation to the data, it might be the case that in the course of an argumentation the general criteria of what is plausible and which conclusions are legitimate are discussed. So far, this has been referred to by the notion of background models which might be focused on by the arguers. Such general justificatory processes are insufficiently described as singular statements like warrants or backings that are questioned. This is the case, because different statements subsumed in these categories often belong to very different domains and levels. Kelly et al. (1998) found in their study that the category of warrants is multi-faceted and therefore separated it into three different ones. For example, in case of the interpretation of an experiment, novices do not only discuss the results but can also question the methodology used and the criteria applied to generate the data. Regarding experts, this general knowledge relevant for the specific scientific domain can be considered to be implicitly shared and therefore will only rarely be discussed or negotiated in the same way.

At this point, the advantage of a model-based description compared to Toulmin’s framework can be highlighted: From the perspective of Toulmin’s framework, the validity of the results of an experiment would be justified by referring to single statements and thus would be inadequately reduced. From a model-based perspective, experimentation as a scientific method has to be considered a complex construct, which can be substantiated with data. It is a certain model of experimentation, and not a detached single claim, that an arguer explicitly or, in case of doubt, implicitly refers to and that can consequently be critically evaluated. Another advantage of the model-based approach compared to Toulmin’s framework is the possibility of a broader focus. In addition to arguments as rather static constructs, a model-based theory of argumentation allows a more detailed analysis of the processes which take place in the course of a critical evaluation of models. This extension to the process level to include the time-dependent changes is rather difficult with Toulmin’s framework and possible only by using special adaptations (Erduran et al. 2004) that are not a central part of the approach.

4.1.3 Criteria of Quality for Arguments and Argumentations

The majority of studies in the context of argumentation based on Toulmin’s framework focus on domain-general structural aspects of arguments. A closer connection with domain-specific contents as criteria of quality is largely non-existent, although Toulmin considered this to be important (Toulmin 1958, 2003). The reason might be the lack of any concrete specification for such a connection: “Unfortunately, because the majority of the research using Toulmin’s argument framework has focused on the field-invariant features of an argument, we know very little about how well arguments constructed by students adhere to the criteria shared by the scientific community for judging quality. For example, do students incorporate evidence that is valid and reliable as data in their argument? Do students attempt to coordinate their claim with all available data or just the data that support their particular viewpoint? Answers to these types of questions can provide valuable insights into students’ understanding of what counts as a quality argument in science” (Sampson and Clark 2008, p. 452).

The advantage of a model-based approach is that a great number of relevant structures and processes in the context of argumentation can be described and evaluated. A possible methodology has been presented earlier (Sect. 3): Arguments as reasons for model fit can be evaluated according to actual data; the quality of argumentation as a critical evaluation of models depends on how adequately the argumentative steps are undertaken in the course of time. Points of reference to determine the quality are both scientific consensus models and background models concerning relevant scientific criteria. Contrary to the analysis of arguments according to Toulmin, the evaluation of the structure and the content of arguments is connected more closely. Therefore, structural categories alone are not sufficient to classify arguments as being of high quality: “In addition, research relying on standard Toulmin frameworks has generally provided less insight in terms of other issues of justification and content. For example, although the sample student argument would be considered relatively strong structurally according to most Toulmin-based frameworks, the content is inaccurate from a scientific perspective” (Sampson and Clark 2008, p. 452). The qualitative classification of arguments is undertaken not only through considering their compatibility with the existing data but also in relation to their logical coherence in comparison to rival models.

The Toulmin framework has also been criticized for only evaluating the simple existence of the different elements but ignoring their connectedness and their logical structure respectively: “Similarly, a standard application of Toulmin’s framework does not include an assessment of the logical structure and coherence of the justification beyond the presence or absence of data, warrants, and backings. Hence, all that matters is their presence or absence regardless of accuracy or relevance. As a result, those interested in examining the content of an argument must supplement this framework with other measures because it does not take into account the accuracy of the components from a scientific perspective or even if the argument, as a whole, makes sense” (Sampson and Clark 2008, p. 452). In addition to this concrete point of reference, because of its theoretical basics (Sect. 2) a model-based theory allows one to attribute statements made in the course of an argumentation to different epistemic levels. This has been considered to be a useful goal of assessment by other authors (Kelly and Takao 2002). In this way, statements relating to data, hypotheses or models can easily be differentiated. In the end, the focus of the analysis depends on the interest and the scientific goals of the researcher. In general, because of its multiple descriptional levels, a model-based theory of argumentation allows a broad understanding and the possibility of several foci. In the next section, as an example, some of the differences in analytic approach discussed here will be demonstrated in relation to a short argumentative sequence.

4.1.4 Comparing the Analysis of Arguments According to Toulmin’s Framework and the Model-Based Perspective

As a basis for argumentative analysis, the following example refers to a short argumentative sequence found in a study concerning students’ understanding of the structure of electric circuits (Kelly et al. 1998). The authors used the Toulmin model to analyse the discourse. After a short summary of the goal of that study, the authors’ analysis according to Toulmin will be presented.

In the second and third week of a longer investigation, students had to answer questions concerning unknown components of given electric circuits: “This assessment presents students with six ‘mystery boxes’, each of which contains one of five things: two batteries, a wire, a bulb, a battery and a bulb, or nothing. The students were provided with batteries, bulbs and wires to construct electric circuits of their choosing. By constructing electrical circuits with the mystery boxes and examining bulb brightness, students could determine the contents of the boxes. When students thought they knew the contents of a particular box, they recorded what they thought was in the box, the circuit that led them to their conclusion, and wrote out how this circuit helped them reach this conclusion. Scores were assigned based on students’ answers and circuit drawings” (Kelly et al. 1998, p. 855).

What follows is the above-described problem-solving sequence as it was analysed by Kelly et al. The figure (Fig. 8) shows different circuits that had been presented to the students (the exact significance of every symbol is not relevant here) and, in the “message unit” area, their statements. On the right-hand side of the figure, the analysis according to Toulmin is shown. A focus of Kelly et al. (1998) was the classification of different warrants which, according to their results, can be attributed to three different dimensions. Furthermore, they adapted the analytic scheme in some respects as a consequence of a first inspection of their data. The extract of their analysis can nevertheless be used to compare the central features of both approaches.

Fig. 8

The analysis of a sequence of argumentation according to Toulmin (Kelly et al. 1998, p. 865)

4.1.5 Analysis According to Toulmin

In the presented example (Fig. 8), students discuss what each of the ‘mystery boxes’ in the electric circuits contains based on observations in their experiments. The statement of Betty “but it can’t be a wire” is categorized as a claim based on the different circuits (data; c.1-c.3). The following statement of Dana adds a justification (warrant: “when we connected 2 batteries to C it still didn’t light”). This warrant is in turn understood to be the empirical basis (data) for the following claim (“there’s nothing inside”). The final conclusion of Dana that, if a bulb were part of the mystery box, the other visible bulb would still have to light up a little bit, is categorized as the next claim, for which the warrant of the previous argument serves as a hypothetical data basis. Here some of the earlier outlined difficulties (Sect. 4.1.2, 4.1.3) become evident: on the one hand, the categories mentioned by Toulmin are not sufficiently selective compared to each other. In connection with different claims, certain statements have different functions, so the evaluation can become unclear. On the other hand, statements that are classified into the same category can have a very different character (according to the scheme used by Kelly et al. (1998), e.g. empirical and hypothetical warrants).
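The selectivity problem described above can be made concrete with a minimal sketch. The data structure, the shortened statement wording and the labels "claim 1" and "claim 2" are illustrative assumptions rather than the coding format used by Kelly et al. (1998); only the first two arguments of the sequence are encoded. The sketch simply lists which Toulmin role each statement receives and flags statements that appear in more than one role.

```python
from collections import defaultdict

# Hypothetical encoding of the Toulmin codes described in the text.
# Each tuple: (statement, Toulmin role, argument in which it has that role).
codes = [
    ("circuits c.1-c.3", "data", "claim 1"),
    ("it can't be a wire", "claim", "claim 1"),
    ("2 batteries connected to C, still no light", "warrant", "claim 1"),
    ("2 batteries connected to C, still no light", "data", "claim 2"),
    ("there's nothing inside", "claim", "claim 2"),
]

# Collect every role a statement takes across the coded arguments.
roles_by_statement = defaultdict(set)
for statement, role, _argument in codes:
    roles_by_statement[statement].add(role)

# Statements coded under more than one category are the ambiguous ones.
ambiguous = {s: sorted(r) for s, r in roles_by_statement.items() if len(r) > 1}
print(ambiguous)
```

Under these assumptions, only Dana's observation about the two batteries is flagged, because it is coded once as a warrant and once as data.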

In the following analysis, based on the model-based approach, the problems of selectivity are avoided through a broader understanding of arguments as indications for the appropriateness of models. Furthermore, this approach will also allow a more differentiated analysis of warrants.

4.1.6 Analysis According to the Model-based Approach

As described before (Sect. 2), argumentation is understood as the evaluative phases of more complex model development. Concerning the example presented here, the general course of model development has to be reconstructed before the detailed processes of argumentation can be worked out. From a model-based perspective, the example is about the evaluation of different models which have to describe the ‘mystery boxes’ as components of electric circuits in the most adequate way. A gradual development of the explanatory model takes place, which at the beginning does not predict a cable in the box (Betty, line 202–203; Dana, line 204–206) and which is then extended through the evaluation of existing data. In this case it is the fact that, although two batteries are used, the light in the remaining circuit has not turned on. Such a model is confirmed through the insight that a second light bulb can also be excluded as the contents of the ‘mystery box’. The reason is that it is understood here that in such a case the visible bulb would nevertheless have to light up (l. 213–214: “would have to light a little bit”). The last statement has to be considered as another model prediction deduced from the actual model and based on the available data. This correspondence constitutes an additional argument for the appropriateness of the model. From a model-based perspective, this short sequence can be understood as a collaborative model development in which different argumentative steps of critical model evaluation can be found. This is briefly structured and visualised in the following figure (Fig. 9).

Fig. 9

The analysis of the argumentative sequence from a model-based perspective

The key argumentative processes in this example concern at first the evaluation of models of the ‘mystery box’ as a component of the circuit which does not contain any wire; this is concluded after having connected two batteries. The implicit background of this insight is the model prediction that the bulb would not light up when two additional batteries are connected if the ‘mystery box’ contains no wire. Then, there is a further step of model development when Dana modifies the actual model based on the data and expects the mystery box to be empty (at t2). This model is also evaluated by referring to the data. The insight that the bulb cannot light up when two batteries are connected, and that a closed circuit cannot be assumed, was sufficient to generate a model that implies the ‘mystery box’ to be empty. The following statements eventually clarify how this additional insight came about: Dana compares her conception of an empty ‘mystery box’, and the hypothesised consequences, with the alternative case of a box including a bulb. She realizes the latter could not be the case because the second, visible bulb would still have to light up a little, which she did not find. From a model-based perspective, she evaluates two rival models through a comparison and deduces predictions from each model to see which corresponds better to the data. Having presented the central argumentative processes, it can now be determined which elements in the concrete example have to be considered as arguments in the sense of reasons for model fit. In the figure (Fig. 9) arguments are labelled with A1 and A2 and they represent relations between model predictions and data. A1 is an argument for model 1 (M1) and model 2 (M2), A2 is an argument against model 3 (M3).
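The comparison of rival models through their predictions can be sketched as follows. This is a minimal illustration under stated assumptions: the prediction labels ("lights", "dark", "lights a little") and the function name `evaluate` are hypothetical, and M1 ("c ≠ wire") is represented indirectly by treating "c = wire" as a candidate that the data refute.

```python
# Hypothetical predictions for the visible bulb once two batteries are
# connected, one per candidate content of the mystery box (assumed labels).
predictions = {
    "c = wire": "lights",
    "c = nothing": "dark",
    "c = bulb": "lights a little",
}

# The observation reported by the students: the bulb did not light.
observation = "dark"


def evaluate(predictions, observation):
    """Split candidate models into those matching the data and those refuted."""
    consistent = [m for m, p in predictions.items() if p == observation]
    refuted = [m for m, p in predictions.items() if p != observation]
    return consistent, refuted


consistent, refuted = evaluate(predictions, observation)
print("consistent with data:", consistent)
print("refuted by data:", refuted)
```

In this reading, the match between the "c = nothing" prediction and the observation corresponds to A1 (which also supports "c ≠ wire"), while the mismatch for "c = bulb" corresponds to A2.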

4.1.7 Comparison of Both Analytic Approaches

Central differences between both analytical approaches referring to the above-mentioned example will be outlined in this section. Firstly, the model-based approach allows a simpler description of the goals of the argumentation: Through model generation and development, an adequate explanation of the contents of the ‘mystery box’ is to be found. The different argumentative steps of the students revolve around the adequacy of this reconstruction. If the analysis is based on the Toulmin scheme, there is no differentiation between explanation and argument. An argument then always contains a stating part (claim) which is based on certain data. Adding other elements (e.g. warrants and backings) completes the structure of an argument. This understanding of an argument is too narrow because the goals of the argumentation (whether arguments for or against a specific explanation are evaluated) and the arguments themselves (the reasons why a certain explanation or model has to be preferred compared to another) are not differentiated.

In the presented argumentation sequence (Fig. 9) this is evident concerning the coding of the fourth claim. Whereas a claim had previously always been an explanation about the content of the ‘mystery box’ (c ≠ wire, c = nothing) substantiated by an observation, this categorization is reversed in the context of the fourth claim. Now, a possible explanation for the content of the mystery box is coded as data (if c = bulb) and the claim is a hypothetical assumption (“bulb in c3 would have to light a little bit”). At the same time, the goal of the discourse has not changed. As has been done from the model-based perspective, it is more conclusive to assume an evaluative process referring to a rival model, described by the statement “if c = bulb” and wrongly coded as data. Based on this model, the prediction can be made that the connection of two batteries should cause the bulb to light, if the model is correct. In fact, this prediction is not explicitly stated, but it is the basis for the statement which follows next, that this model can also be refuted: “even if it had a bulb in there, it would have to light a little bit”. In this way, the alternative explanation of Dana, which she finally abandons, has a hypothetical character and could be understood as a claim much more than as data, as it was categorized by Kelly et al. (1998).

A differentiation between the goal of the argumentation and the concrete arguments supporting certain explanations is necessary in the analysis of argumentation. The difference is important to the model-based theory, because argumentations can always be understood as one important part of superordinate processes of model development. Consequences for the aforementioned methodological problems of selectivity in the context of Toulmin’s framework also follow from this.

From a model-based perspective, statements are not only analysed structurally but also according to their relation to a single or several models; they may have several clearly identifiable functions in the context of the models. It is, for example, conceivable that specific data serve as a basis for different models. For instance, the information “when we connected 2 batteries to C it still didn’t light” is a valid argument for both a model of “c ≠ wire” and a model of “c = nothing”.

Furthermore, the complexity of Toulmin’s category ‘warrants’ stated by Kelly et al. (1998) can be explained based on a model-based approach. The warrant “when we connected 2 batteries to C it still didn’t light” is on the one hand the connection of a model (c = nothing) with data. In this sense, from a model-based perspective, it is an argument for the appropriateness of the model “c = nothing”. This model allows the prediction that even the connection of two batteries to the circuit would not light up the bulb. This statement has not been made explicit in the example, so the warrant can be considered as an empirically based argument for the model “c = nothing”. The categorisation of another warrant as hypothetical is more complex, because this classification would be quite different in a model-theoretic perspective, as mentioned before. In the context of a model-based approach, statements can have a hypothetical character if the arguers deduce, based on the model, which phenomena (observations, experimental results etc.) are expected if the model is adequate. Referring to the available data allows validation of such hypothetical predictions.

The aspects mentioned here concerning the focus of the analysis of argumentation and the problem of methodological selectivity refer to the analytical depth of a model-based theory of argumentation. The possibility of establishing a connection to the concerned scientific domain to evaluate the arguments qualitatively is only one advantage of the breadth of the model-based perspective, which promises sufficient flexibility, and also priorities of focus, for analysing argumentation in the context of science education. Depending on the specific needs of the research question, aspects can be added or left out in adapting the analytic scheme and the analysis (cf. Sect. 3.2.2). Two analytical approaches have now been introduced and compared with reference to a simple argumentative sequence in order to point out the advantages of the model-based approach. A more complex argumentative sequence will be analysed in the next section. In addition, the analytical scheme will be applied to evaluate the argumentative processes and their qualities.

4.2 Analysing and Evaluating an Argumentation According to the Model-based Framework

After the introduction of the model-based analysis of argumentation in the previous section, it will now be applied to a more complex argumentative sequence, including the structural and qualitative evaluation (cf. Sect. 3.2.2).

4.2.1 Describing and Analysing the Argumentative Steps

A hypothetical discussion between two students in a physics class about the causes of ametropia is taken as an example. As in a realistic learning environment, the formation of the image in the human eye and the processes of accommodation have to be considered as the relevant prior knowledge of the students.

In this example, both the structure of learning environments in the form of different sequences of model development and evaluation and the relevant argumentative processes come together. The analysis is separated into three parts according to the structure of a realistic school science lesson. Between part 1 and 2 (Table 8), the students have the possibility to carry out practical experiments with an optical bench in order to diversify and develop their theoretical models about the causes of ametropia. An optical bench is a piece of equipment used for simple model experiments in the classroom context. Components such as light sources, lenses and screens can be shifted along a long steel rail. This experimental tool for testing effects is a necessary instructional step, as the students typically generate narrow hypotheses about the phenomenon of ametropia, focussing plausibly on their knowledge of the process of accommodation. Correspondingly, the experimental tool is suitable for the learning process as it initiates explorative experimentation fostering the development of alternative hypotheses. The optical bench then helps to develop more hypothetical models to explain ametropia. Between part 2 and 3 of the analysis, the students are provided with additional data which allow them to evaluate empirically the hypothetical models developed so far.

Table 8 The analysis of the argumentative sequence from a model-based perspective (Part 1 and 2)

At the beginning (part 1/t1; cf. Table 8) the students refer to their prior knowledge in order to explain the phenomenon of ametropia they have been confronted with. Their mental model of the human eye (mofe) includes its capacity to accommodate in order to focus on objects at different distances. The students suppose this process to be deficient in people who need glasses to see objects sharply. No additional data to refer to is presented here, so there are no argumentative processes to analyse. Part 1 is nevertheless an important preliminary sequence for understanding the following discussion. Due to instructional input via the work with an optical bench, the number of available hypothetical models to explain the phenomenon has been augmented (part 2/t2). On the one hand, by changing the lenses in an optical bench, the sharpness of the picture on the screen can be modified (M1). On the other hand, increasing the distance between the lens and the screen affects the sharpness (M2), too. Student 2 argues with a hypothetical argument (A1) in favour of M1, Student 1 argues with the other, plausible theoretical model (A2) in favour of M2. On the basis of model-based empirical and therefore hypothetical data, additional information is needed to evaluate which of the possible models is empirically plausible.

A second instructional input offers additional data for the students to continue the argumentative process and to discuss which of the models of the eye is more plausible (Table 9). In part 3 of the argumentation (t3), student 1 refers to the available data (info 1) to evaluate his model empirically (A3). Moreover, the data allow further alteration of the model to explain the reasons for ametropia via different forms of eyeball. This is why the modified model is renamed here (M2’). The data not only allow an additional step of model construction, but also an empirical evaluation of some aspects. Student 2 refers to another aspect of the empirical data available (info 2). It is important to point out here that the argument used (A3) is empirically incorrect with regard to the available data. Student 1 then compares the two models available (M1 and M2’) to exclude M1 based on the available information. This description shows to what extent different aspects of argumentative processes (hypothetical arguments, empirical arguments, models of phenomena, data, comparison of models, etc.) can be analysed based on a model-based approach. In the next section, the analytic scheme proposed earlier (Sect. 3.2.2) will be used to evaluate the argumentative processes and their quality.

Table 9 The analysis of the argumentative sequence from a model-based perspective (Part 3)

4.2.2 Evaluating the Argumentative Steps

The following analytical scheme will be used to assess the argumentation described above. Because no arguments can be found at t1, only parts 2 and 3 are evaluated (Table 10).

Table 10 Exemplary analytic scheme applied to evaluate argumentative sequence

According to this simplified analytic scheme, Student 1 generates three correct arguments in the course of the argumentation: two in favour of his own model and a third that criticises a rival model. Student 2 generates one correct argument and one argument that is wrong in terms of quality. The latter establishes a relationship between an aspect of the model and empirical data and is thus structurally comparable to the other arguments, but the relationship it draws is wrong with regard to content. The structural function of an argument is captured by the dimensions in the rows, whereas the division of the columns (into wrong or right) captures this content-related judgment. Depending on the concrete requirements of the research question, the row and column parameters can of course be differentiated further in order to modify and adapt the analytic scheme. Moreover, the weighting of the concrete results, e.g. the overall significance of correct or wrong argumentative steps for assessing the students’ argumentative competence, has to be determined in relation to the normative framework used by the researchers concerned.
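The tallying logic behind such a scheme can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the analytic scheme itself: the role labels ("supports own model", "criticises rival model") are hypothetical stand-ins for the row dimensions of Table 10, the `Step` record and the function names are invented for illustration, and the assignment of Student 2's incorrect argument to M1 is an assumption based on the description above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One argumentative step, coded along a structural and a content axis."""
    student: str
    part: str      # phase of the discussion: "t2" or "t3"
    role: str      # structural function (hypothetical row label)
    model: str     # model the step refers to
    correct: bool  # content-related judgment (column: right / wrong)


# Coding of the sequence as described in the text (parts 2 and 3 only).
STEPS = [
    Step("Student 2", "t2", "supports own model", "M1", True),
    Step("Student 1", "t2", "supports own model", "M2", True),
    Step("Student 1", "t3", "supports own model", "M2'", True),
    # Assumption: Student 2's empirically incorrect argument targets M1.
    Step("Student 2", "t3", "supports own model", "M1", False),
    Step("Student 1", "t3", "criticises rival model", "M1", True),
]


def tally(steps):
    """Count steps per (student, structural role, right/wrong) cell."""
    counts = Counter()
    for s in steps:
        counts[(s.student, s.role, "right" if s.correct else "wrong")] += 1
    return counts


def summary(steps):
    """Collapse the roles and count right and wrong steps per student."""
    out = {}
    for s in steps:
        record = out.setdefault(s.student, {"right": 0, "wrong": 0})
        record["right" if s.correct else "wrong"] += 1
    return out


if __name__ == "__main__":
    for cell, n in sorted(tally(STEPS).items()):
        print(cell, n)
    print(summary(STEPS))
```

Keeping the structural role and the correctness judgment as separate fields mirrors the separation of rows and columns in the scheme, so either axis can be refined without touching the other. Any weighting of the resulting counts would still have to be chosen with respect to the normative framework of the study.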

In sum, the model-based approach presented here offers a strong and broadly applicable tool to understand, describe and evaluate argumentation in science education.

5 Conclusion

The model-based theory represents a suitable theoretical framework for describing arguments and argumentation referring to the similarity between models and empirical data as the central reference for model evaluation.

As has been shown, this analytical method has several advantages:

Firstly, in the course of the analysis, the model-based approach allows argumentative units to be separated in relation to the models being evaluated; models can therefore serve as an adequate point of reference for defining arguments in both structural and content-related terms.

Secondly, as this approach allows the differentiation of content-related qualities, it can be implemented successfully in analysing the quality of argumentation within learning processes, characterised as successive steps of model evolution. This theoretical differentiation, together with the understanding of argumentation as an evaluative process embedded in a superordinate process of model development, allows the complex processes to be described in more detail. It also produces fewer problems of selectivity in the understanding and the methodological evaluation of argumentation. It is therefore possible to identify to which model, and especially to which aspect of this model, an argument refers and which data is taken into account as its empirical foundation. Moreover, different structural dimensions can be distinguished, depending on the role of the argumentative step in the process of model evaluation.

Thirdly, the quality of arguments can also be assessed in relation to the scientific domain concerned. As explained above, prescriptively determined relevant target models allow the assessment of the quality of argumentative steps in the course of the model evaluation. Connecting specific features of arguments to the relevant discipline is an important aspect, and one that causes problems in the context of other analytical approaches (e.g. Toulmin’s framework).

Finally, the approach presented here is based on a broad theoretical background in the philosophy of science, in cognitive science and in emerging model-based instructional frameworks in science education.

Several aspects that could only be mentioned briefly in this article can be addressed in future research from the chosen perspective. For example, one might ask how argumentative competences can be fostered in the classroom through adequate instructional strategies.Footnote 8 It has been indicated that, in this context, a model-based theory could adequately promote the relevant argumentative competences in students. Such considerations could be a next step in working out a model-based theory of argumentation.