1 Introduction

Computational simulations have progressively become integral to contemporary scientific research. A query for the term “computer simulation” in reputable scientific journals' databases (e.g., Journal of Biological Chemistry, Nature, PNAS, Journal of the American Chemical Society) reveals a substantial increase in its usage over the past decade. Processes and phenomena such as the evolution of the universe, galaxy formation, molecular structure, modeling of cancerous tissues, and climatic conditions represent some of the applications facilitated by computational simulations across various disciplines.

The design, integration into scientific activities, and generation of information about the world through simulations introduce novel approaches to understanding phenomena (Winsberg, 2009). Discussions of these approaches are pertinent to the didactics of science because they foster new perspectives on the methods scientists use to investigate the phenomena they study. They also influence the perceptions of science and scientists that students develop in the classroom. Hence, the inclusion of metatheoretical [Footnote 1] discussions on computational simulations, aimed at enhancing science education, appears to be a fresh and promising agenda within the discipline (Seoane, 2018).

We consider this work agenda to be integrated into model-based didactics of science [Footnote 2] (Adúriz-Bravo, 2013; Ariza, 2015; Ariza et al., 2016, 2020; Chamizo, 2013; Justi, 2006) because computational simulations are a special class of representational models (Vallverdú, 2014) that function as theoretical and experimental research devices (Morrison & Morgan, 2010) and account for: 1) systems of the world through their imitation (Durán, 2021), 2) non-material objects, or 3) semi-material objects [Footnote 3] (Morgan, 2003).

Simultaneously, within the field of didactics, a specific category of computational simulations has been developed to enhance science education. Although simulations have been the focus of didactic studies examining their impact on science learning (Rodriguez et al., 2013; Mijares-Almanza et al., 2017; Haryadi & Pujiastuti, 2020; Rahmawati et al., 2022), few studies delve into the didactic characteristics of these simulations (López et al., 2016; Smetana & Bell, 2012; Velasco & Buteler, 2017). Scarcer still are studies that subject science education computational simulations (CSSE) to metatheoretical reflection (Seoane et al., 2015). Consequently, the primary aim of this contribution is to review the most relevant scopes and limitations reported in didactic literature concerning CSSE. A second objective of the article is to present arguments contributing to the metatheoretical characterization of CSSE. We contend that both objectives can benefit science teaching and contribute to the initial and ongoing development of science teachers.

To achieve these goals, the article navigates three complementary paths. Initially, it embarks on a descriptive review (Carrasco, 2009) of selected contributions and challenges revealed by didactic research concerning the impact of simulations on science learning. This review identifies the scope and limitations emerging from their implementation in the classroom.

Subsequently, we argue that although the review allows us to discern didactic characteristics of CSSE, performing metatheoretical analyses linked to the didactics of science requires a second-order framework associated with the nature of science (Adúriz-Bravo, 2002; Adúriz-Bravo et al., 2011). This framework justifies the introduction of metatheoretical discussions about simulations in the discipline. Finally, we present reflections from philosophers of science on models and simulations, as their dynamics in scientific activity have prominently featured in current research. Employing such a metatheoretical framework, we identify metatheoretical characteristics of CSSE.

2 Computational Simulations in Science Learning. Contributions and Challenges of Didactic Research

For several decades, science educators have undertaken numerous investigations to comprehend the advantages, limitations, and potential challenges posed by computer simulations in science education across various academic levels (cf. Almasri, 2022; Rahmawati et al., 2022). Specifically, discussions about their impact on science learning have prompted considerations of their role in students’ development. Numerous researchers concur that simulations not only enhance and facilitate conceptual understanding but also provide an initial approach to laboratory procedures (Almasri, 2022; Balamuralithara & Woods, 2009; Mellado et al., 2013; Rahmawati et al., 2022).

Similarly, it is argued that simulations have inherent limitations as they cannot fully substitute for physical laboratories. One of the reasons cited is their inability to generate the tacit knowledge essential for developing procedural skills, limiting the acquisition of human experiences (such as feelings and sensations) in connection to real-world phenomena (Hodson, 1994; Saputri, 2021).

Mijares-Almanza et al. (2017) asserted that conducting experiments initially through simulations, followed by hands-on practice in the chemistry laboratory, resulted in fewer errors in procedural performance. A significant reduction in reagent waste was evident compared to prior practices, leading to a notable decrease in environmental impact. While this conclusion is specifically tied to the development of procedural skills, it underscores the role of simulations as a bridge that facilitates students' preparation for laboratory work (Balamuralithara & Woods, 2009; Mellado et al., 2013).

Other authors contend that the use of simulations allows a substantial reduction in the interference of concrete experiences by adopting idealized conditions, thereby enhancing the understanding of the models involved (Hodson, 1994). Such simulations offer greater flexibility, time efficiency, control, and simplification of experiences compared to physical laboratory practices (Smetana & Bell, 2012). Simultaneously, simulations that afford a certain degree of freedom in manipulating instruments enable students to construct experimental designs (Ashe & Yaron, 2013). This approach facilitates safe learning from mistakes, encourages reflection on processes, and enables students to determine whether the procedure employed was the most suitable for addressing the studied phenomenon (Jaakkola & Nurmi, 2008; Zacharia & Constantinou, 2008).

Regarding their limitations, various authors contend that simulations constrain the human aspects of laboratory experiences, including curiosity, emotionality (Balamuralithara & Woods, 2009), sensory exploration, the development of tacit knowledge (Hodson, 1994), and social interaction. Moreover, when attempting to represent the microscopic world iconographically in order to facilitate students' comprehension, simulations may distort theoretical scientific concepts such as atomic and molecular orbitals (Scerri, 2000) [Footnote 4] and chemical activity (Martín Sanabria & Garay Garay, 2020).

Other studies have explored the impact of simulations on fostering scientific thinking processes. Haryadi and Pujiastuti (2020) concluded that the experimental group, taught through a participatory didactic design with simulations, produced more robust arguments about concepts like "temperature" and "heat" compared to the control group taught through traditional methods. The authors attribute the success of the experimental group to factors like motivation, interactivity, and exploration, aligning with the results of other research where simulations guided by situated didactic designs supported the understanding of scientific representations (Jaakkola & Nurmi, 2008; Zacharia & Constantinou, 2008; Rodriguez et al., 2013; Rahmawati et al., 2022; Almasri, 2022).

Recently, amid the global health crisis, experimental processes had to adapt to digital resources, with simulations emerging as a means to replicate real laboratory experiences. Across various educational disciplines, there has been a concerted effort to promote experimentation through remote, mobile, and digital or everyday laboratories (Castro-Maldonado et al., 2020). An exploratory and descriptive study revealed that simulations were extensively used in secondary education (over 50%) and higher education (over 55%) by 91 teachers from different Latin American countries (Moya et al., 2021).

During the pandemic, Jeffery et al. (2022) conducted research designing a laboratory learning environment for x-ray fluorescence spectroscopy and ion chromatography. This interface allowed users to freely interact with different modules, covering topics such as sample preparation, biosafety protocols, instrumental equipment, data collection, and extraction, among others. Students perceived the environment as high-quality, enabling unlimited use. Some students reported reduced anxiety about learning experimental techniques, emphasizing that the simulation provided a digital approach to laboratory instruments. While the results were highly favorable, the authors emphasize that digital resources (environments, simulations, etc.) complement but do not replace physical laboratory activities.

From the previous lines, it is possible to extract a series of characteristics of the simulations that allow their projection in the science classroom:

  • Simulations facilitate conceptual understanding of scientific models (Hodson, 1994; Jaakkola & Nurmi, 2008; Zacharia & Constantinou, 2008).

  • They enable the development of experimental setups (Ashe & Yaron, 2013).

  • They provide interactivity in the learning of scientific models (Ashe & Yaron, 2013; Castro-Maldonado et al., 2020).

  • They provide qualitative and quantitative data on simulated systems (Balamuralithara & Woods, 2009; Jeffery et al., 2022).

  • They facilitate the preparation of students for physical laboratory work (Almasri, 2022; Balamuralithara & Woods, 2009; Mellado et al., 2013; Rahmawati et al., 2022).

Similarly, some limitations in their application are identified:

  • They are not substitutes for physical laboratories (Jeffery et al., 2022).

  • They do not allow the development of procedural skills (Hodson, 1994; López et al., 2016; Velasco & Buteler, 2017).

  • They can generate distorted views of scientific concepts (Scerri, 2000; Martín Sanabria & Garay Garay, 2020).

  • They grant simplified and idealized representations of the simulated system (Smetana & Bell, 2012).

  • They restrict the human aspects of laboratory experiences (Hodson, 1994; Saputri, 2021).

While the aforementioned characteristics outline the attributes simulations bring to the classroom, it is essential to delve into the definition of a computational simulation and explore its metatheoretical characteristics. This exploration aims to enhance our understanding of its essence and the perspectives held by science teachers. The metatheoretical aspects in question fall within the domain of philosophy of science (POS). However, in the didactics of science, these metatheoretical considerations align with what is commonly referred to as the “nature of science” (NOS). This established line of research within didactic studies offers reflections and metatheoretical positions of educational significance, shedding light on how science is taught and its modes of representation in the classroom.

3 About the Nature of Science

The analyses conducted within the so-called meta-sciences (including history of science, philosophy of science, sociology of science, and more recently, psychology of science and anthropology of science) have been the subject of study in didactics due to their substantial value for teaching (Adúriz-Bravo, 2007; Ariza et al., 2016). These second-order analyses serve as the foundation for constructing the “nature of science”, a research line that focuses on its own denomination as an explicit and implicit subject in science curricula (Matthews, 2012; Allchin, 2011; Adúriz-Bravo et al., 2011; Adúriz-Bravo & Ariza, 2012). In addition, it is meant to be learned, discussed, and reflected upon by both teachers and students (Adúriz-Bravo, 2007).

In this context, the NOS is presented as a form of meta-knowledge with educational significance (Acevedo et al., 2005; Adúriz-Bravo & Ariza, 2013). It explores various aspects, including the ways in which science produces knowledge, its differentiation from other forms of knowledge, its reciprocal influence with society and culture, its evolutionary changes, and the structures and forms involved in knowledge construction (Adúriz-Bravo, 2007; Ariza, 2015). Some innovative perspectives on NOS encompass the different dimensions in which science functions as a human activity and as a process of knowledge-building (Adúriz-Bravo, 2005; Adúriz-Bravo et al., 2011; Allchin, 2011; Irzik & Nola, 2011; Matthews, 2012).

As teachers play a fundamental role in shaping the perspectives of NOS in the classroom, it is pertinent to consider the various purposes that NOS serves in both initial and ongoing teacher education. These include fostering critical reflection on science with a contextual, conceptual, pragmatic, and ethical sense (Ariza, 2015); promoting a sociocultural awareness of science enriched by diverse human factors (Acevedo et al., 2005); and enhancing the teaching and learning of science by leveraging the wealth of resources that NOS provides for teaching practices (Adúriz-Bravo, 2007).

While NOS, as a research line, encompasses a variety of approaches [Footnote 5] that teachers should be acquainted with and promote (Adúriz-Bravo, 2007), and is rooted in diverse phases or frameworks of the philosophy of science (cf. Adúriz-Bravo et al., 2011), for the purposes of this paper, we assert that the proposals of the so-called semantic view of theories robustly contribute to understanding the dynamics, processes, and products of scientific activity (Ariza, 2015, 2021, 2022; Ariza et al., 2016).

Simultaneously, we will consider the reflections of different philosophers of science, representing various schools of thought, who have pondered on computational simulations. In this context, we will present some reflections that will guide the discussion towards a metatheoretical characterization of computational simulations of science (CSS), particularly in the realm of science education (CSSE).

4 Models from the Semantic Conception and Simulations from other Philosophical Schools of Science

The semantic conception of theories cannot be unambiguously characterized, particularly when considering the contributions of Ronald Giere, Bas van Fraassen, Frederick Suppe, and metatheoretical structuralism. The diversity of perspectives on this conception becomes evident when one examines the exclusions of specific proposals, such as the structuralist proposal according to Suppe (1989), and the omission of Ronald Giere's proposal regarding the broader notion of “model” as a structure of a particular type, as outlined by Frigg and Hartmann (2020) and Frigg and Nguyen (2020).

In this context, we will approach the semantic conception from the perspective proposed by Ariza et al. (2016) and Lorenzano (2003), who define this conception as a semanticist family sharing fundamental aspects. Within this family, specific semanticist approaches can be identified, including the proposals of Giere, van Fraassen, Suppe, and the structuralist metatheory.

These authors of the semantic conception of theories argue that the most important construct for identifying and characterizing a scientific theory is the class, collection, population, or set of its models [Footnote 6] (Ariza et al., 2016; Lorenzano, 2003). Such models are formulated with the intention of “accounting” for a certain fraction of the world [Footnote 7] and representing it in a way that fits the available data or empirical evidence.

Some contemporary philosophers also posit that models exist in an intermediary space between theory and the aspects of the world they seek to represent (Giere, 1988; Morrison & Morgan, 2010; van Fraassen, 2008). The connections that models establish with these fractions of the world are defined by a linguistic act known as a “theoretical hypothesis”, which asserts the existence of a link (matching to some extent and with specific aspects) that is testable between the model and the world (Ariza et al., 2016; Giere, 1988). The way of understanding the relationship between models and such portions of the world differs (sometimes significantly) among different authors of the semantic conception: some authors understand it as a similarity (Giere, 1988), as a subsumption (Balzer et al., 1987; Moulines, 2006), as a homomorphism (van Fraassen, 1980), among others.

Due to the diversity of perspectives within the semantic conception, some argue that models are dependent on theory, while others posit that they are partially autonomous entities (Lombardi, 2010). Among the latter, it is suggested that models possess local and specific knowledge that doesn’t solely derive from the theory or its connection with the world (Lombardi, 2010). This characteristic of partially autonomous models has also been ascribed to simulations by philosophers of science from other schools (Morrison & Morgan, 2010; Winsberg, 2003).

In this context, simulation is conceptualized as a distinct type of model (Lenhard & Carrier, 2017), also referred to as a simulation model (Durán, 2020), implemented on a computer and characterized by its unique construction methodology (Durán, 2021).

On the one hand, when embedded in a computer, simulations virtually imitate processes in the world (Hartmann, 1996). They function as theoretical and experimental research instruments (Morgan, 2003; Morrison & Morgan, 2010). In addition, they offer a form of experimentation (Fox Keller, 2003; Sismondo, 1999; Shubik, 1960) on semi-material or non-material objects (Morgan, 2003). The ways in which they do this can be through: 1) the exploration of the properties of dynamic models of a mathematical nature (Hartmann, 1996); 2) the execution of integrated models by modifying their parameters and variables (Shannon, 1998).

Vallverdú (2014) explains that “non-material” experiments in science refer to computational experiments performed in computer systems [Footnote 8]. Among them are experiments with computational simulations that follow a pre-established, fixed process and are allowed to run their course (in silico) [Footnote 9], and experiments with computational simulations that allow internal models to be perturbed in the middle of the execution (in virtuo) [Footnote 10].
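The distinction between in silico and in virtuo experiments can be illustrated with a minimal sketch. The toy growth rule, the function `run_simulation`, and the `perturb` callback are hypothetical names introduced here for illustration only; they are not taken from Vallverdú (2014) or from any actual simulation package.

```python
from typing import Callable, Optional


def run_simulation(
    steps: int,
    rate: float,
    x0: float = 1.0,
    perturb: Optional[Callable[[int, float], float]] = None,
) -> float:
    """Iterate the toy rule x <- x * (1 + rate) for a number of steps.

    If `perturb` is given, it may change the rate while the run is in
    progress (the "in virtuo" case); otherwise the process is fixed.
    """
    x = x0
    for step in range(steps):
        if perturb is not None:
            rate = perturb(step, rate)  # intervention during execution
        x = x * (1.0 + rate)
    return x


# In silico: a pre-established process that is left to run its course.
in_silico = run_simulation(steps=10, rate=0.1)


# In virtuo: the experimenter lowers the rate from step 5 onward.
def lower_rate_midway(step: int, rate: float) -> float:
    return 0.05 if step >= 5 else rate


in_virtuo = run_simulation(steps=10, rate=0.1, perturb=lower_rate_midway)
print(in_silico, in_virtuo)
```

The only design difference between the two runs is whether the internal model can be altered once execution has started, which is the feature the text uses to separate the two kinds of experiment.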

In the process of constructing a simulation, a combination of models, pseudonumbers, incompatible models, and graphical representations is employed (Vallverdú, 2014). Frequently, idealizations and assumptions are introduced (Humphreys, 2004), and graphical techniques are utilized to convert the output into graphs and videos (Winsberg, 2003). Techniques involving the restructuring and grouping of multiple models into a simulation model are also applied (Durán, 2020). In the midst of the design process, hypotheses regarding the scopes and limitations are formulated to enable inferences from the simulated system to the ‘real’ world (Durán, 2021). Similarly, simulations yield idealized results akin to the outcomes of a modeling process.

On the other hand, Vallverdú (2014) states that the design of simulations involves a process of transformation of semantic elements into syntactic elements for their translation into codes of a computational nature, and after the experiment, the meaning is interpreted from the results obtained. Although the way simulations are constructed differs from other scientific modeling processes (Humphreys, 2004), the use of computational systems that allow the integration and implementation of several representative components and other inferences seems to be a notable difference between simulations and models (Durán, 2020; Vallverdú, 2014). In addition, some simulations contain a diversity of models and inferences that represent a part of a complex system (Durán, 2021).

Among the simulations developed for scientific research are those involving discretization techniques, which transform differential equations into computable algorithms executable on a computer (Durán, 2021). Upon receiving the results, the researcher's decision-making regarding the reliability of the findings is informed by contrasting skills with previous experiences and theory. If the data are comprehended and deemed reliable, explanatory links to real-world phenomena are constructed (Durán, 2021).
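As a hedged illustration of what a discretization technique does, the sketch below turns the differential equation dy/dt = -k·y into a step-by-step algorithm using the forward Euler method. The choice of equation, the Euler scheme, and the names `euler_decay`, `k`, and `dt` are assumptions made for this example; Durán (2021) does not prescribe a particular scheme.

```python
import math


def euler_decay(y0: float, k: float, dt: float, steps: int) -> float:
    """Approximate the solution of dy/dt = -k * y with forward Euler steps."""
    y = y0
    for _ in range(steps):
        # Replace the continuous derivative by a finite difference:
        # y(t + dt) is approximated by y(t) + dt * (-k * y(t)).
        y = y + dt * (-k * y)
    return y


# Integrate from t = 0 to t = 1 with 1000 small steps.
approx = euler_decay(y0=1.0, k=1.0, dt=0.001, steps=1000)
exact = math.exp(-1.0)  # analytical solution y(1) = exp(-k * t)
print(approx, exact, abs(approx - exact))
```

Comparing the computed value with the known analytical solution is one simple way in which a researcher can judge the reliability of the output, in the spirit of the contrast with theory mentioned above.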

These considerations lead various philosophers to view the epistemological status of simulations as comparable to other modeling processes and scientific experiments (Durán, 2021; Vallverdú, 2014), or as occupying a similar mediating position between theory and the world. Although scientific computational simulations facilitate research in the natural and social world, we find it pertinent to specifically analyze the characteristics of CSSE. These include their unique design methodologies and the types of scientific models they engage with.

5 What are the Characteristics of CSSE?

As stated at the end of the previous section, CSSE present a particular design that allows them to be differentiated from CSS. Differences between CSS and CSSE can be linked to their application objectives: scientific for the former (depth and explanatory character) and educational for the latter (representativeness and educational character). In addition, Clark et al. (2016) postulate a series of basic principles of multimedia design to maintain quality and favor learning:

  • Contiguity: show graphics and text simultaneously.

  • Coherence: avoid distracting elements.

  • Redundancy: avoid introducing repeated information.

  • Example principle: demonstration of the solution to a problem.

Other aspects emphasized in the design of CSSE include the form of representation [Footnote 11] (iconographic and linguistic) of scientific knowledge (Cankaya & Kuzu, 2009); the didactic (Ashe & Yaron, 2013), psychological (Clark et al., 2016), pedagogical, and epistemological principles in which the simulation is inscribed; the intended audience; and the degree of usability [Footnote 12] (Adams et al., 2008), among others.

An exemplary case of the use of the aforementioned aspects is shown by Ashe and Yaron (2013) for the design of a computational simulation of the energy landscape concept for the ChemCollective project. The energy landscape concept is expressed by a graphical representation of the potential energy of a molecule as a function of its reaction coordinates (Ashe & Yaron, 2013). Such a representation allows understanding how the potential energy changes as the spatial configuration of the molecule changes (Jansen, 2014). When the potential energy is the lowest, the most stable configuration of the molecule is represented.

The simulation of Ashe and Yaron (2013) is based on analogies that allow the comparison of two structures: a base known to the students and an unknown target. The chosen target is the energy landscape, and the features selected from it are the stable, metastable, and activated states of the molecule 1,2-dichloroethylene. As a base structure, they selected a cardboard box that rests on a platform in different ways.

The analogical correspondence of the conformational state of the molecule is the orientation of the box. Thus, in the stable state, the box rests on the longer face; in the metastable state, the box rests on the shorter face; and in the activated state, the box rests on a corner. The energy state of the molecule corresponds to the center of mass of the box and the reaction coordinate corresponds to the rotation angle of the box. The correspondence between the analogical and scientific representation is shown in Fig. 1. The scientific representation is shown by spheres and bars for the molecule.

Fig. 1 Correspondence between scientific and analog representation and the use of some of the principles. Adapted from Ashe and Yaron (2013)
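The box analogy can be made concrete with a small geometric sketch. It assumes a two-dimensional rectangular cross-section tipped about one bottom corner, with side lengths chosen arbitrarily; the function `com_height` and the dimensions are hypothetical and are not taken from the simulation of Ashe and Yaron (2013). The height of the center of mass plays the role of the energy, and the rotation angle plays the role of the reaction coordinate.

```python
import math


def com_height(theta_deg: float, long_side: float, short_side: float) -> float:
    """Height of the center of a rectangle tipped by theta about a bottom corner.

    At theta = 0 the rectangle rests on its long side; at theta = 90 degrees
    it rests on its short side.
    """
    theta = math.radians(theta_deg)
    return 0.5 * (long_side * math.sin(theta) + short_side * math.cos(theta))


LONG, SHORT = 2.0, 1.0  # arbitrary illustrative dimensions

stable = com_height(0.0, LONG, SHORT)       # resting on the longer face
metastable = com_height(90.0, LONG, SHORT)  # resting on the shorter face
# The maximum height occurs when the diagonal is vertical (balanced on a corner).
theta_top = math.degrees(math.atan2(LONG, SHORT))
activated = com_height(theta_top, LONG, SHORT)

# A coarse "energy landscape": height versus rotation angle.
landscape = [(angle, round(com_height(angle, LONG, SHORT), 3)) for angle in range(0, 91, 15)]
print(stable, metastable, activated)
print(landscape)
```

Under these assumptions the three orientations are ordered as in the analogy: the lowest center of mass for the stable state, an intermediate one for the metastable state, and the highest one for the activated state.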

Based on this analog correspondence (Fig. 1), Ashe and Yaron (2013) designed an introductory simulation (see Fig. 2) that displays to the user: 1) an analog representation (a box resting on a platform), 2) a scientific representation (parts of the energy landscape graph), and 3) a menu of buttons (including two buttons labeled “Kick” and two scroll bars).

Fig. 2 Graphical interface and the constituent parts of the simulation (adapted from Ashe & Yaron, 2013)

In this simulation, it is possible to identify the basic principles associated with CSSE (and differences with respect to CSS). In particular, scientific and analogical representations are presented simultaneously (contiguity). Ashe and Yaron (2013) also mention that, in designing the introductory simulation, they kept the number of representations to a minimum and avoided excess information so that students could understand it easily (simplicity and redundancy). Finally, they avoided decorative elements that could divert students' attention from the fundamental content (coherence).

However, the simulation presented by Ashe and Yaron (2013) exhibits some limitations. For instance, it assumes that the box must always rest on the platform, which represents an idealized version of the expected behavior of a real-world box. Additionally, the analogies employed in the simulation may lead to inadequate or distorted mental models of scientific representations when used outside their intended context. Furthermore, the simulation is designed for a single user to carry out actions, thus limiting cooperative work opportunities. However, within its intended scope, the dynamic and interactive nature of the simulation can make it appealing to students, capturing their attention and focusing it on the educational activity. Moreover, it offers a more memorable and engaging learning experience compared to relying solely on equations and graphical representations. By dynamically representing abstract scientific concepts such as reaction coordinates and potential energy, the simulation encourages students to develop more appropriate mental models than those derived solely from discursive explanations or static images.

Other CSSE use representations of laboratory instruments, materials, equipment, and substances. Some of these simulations are sequential and allow some interaction with their elements (Amrita-CDAC, 2024; Pearson, 2024), and others are less restrictive and allow a greater degree of freedom, which favors the design of virtual experiments (Carnegie Mellon University NSDL, 2024a; University of Colorado Boulder, 2024). This particular type of CSSE is defined as “virtual laboratories” since it attempts to resemble physical laboratories (Carnegie Mellon University NSDL, 2024a). Research in science education has identified that virtual laboratories allow an improvement in knowledge comprehension processes and an enhancement in students’ prior preparation for the experimental design of a physical laboratory (Finkelstein et al., 2005; Zacharia & Constantinou, 2008).

Since the construction methodology and the educational purposes are distinctive features of CSSE, it is worth asking whether simulations of this kind operate on particular types of scientific models (Shannon, 1998). To this end, we will draw on van Fraassen’s (2008) semantic view and metatheoretical structuralism (Balzer et al., 1987). We will argue that CSSE that allow quantitative data to be obtained make use of theoretical models and interrelated data models [Footnote 13], which have been translated into a particular programming language and which serve as inputs for the simulated experimentation. Thus, data models are considered abstract structures that are representations of phenomena [Footnote 14] (van Fraassen, 2008), in the sense that they numerically describe certain selected results of specific measurement procedures on phenomena. Likewise, such data models (or intended applications of theories) can be subsumed by isomorphism relations with (a certain class of) substructures of some theoretical model of a class of models of a scientific theory (Balzer et al., 1987). Such a “class of models” is defined by the principles or laws of the theory (van Fraassen, 1989).

In this sense, we characterize CSSE as entities composed in part of: 1) data models; 2) theoretical models (understood as mathematical structures that establish relationships and functions on entities); 3) the representation of phenomena through data models; and 4) relationships of subsumption and isomorphism (between data models and theoretical models). To support this statement, we will analyze the virtual laboratory called “determine the concentration of acetic acid in vinegar” of the ChemCollective [Footnote 15] project (Carnegie Mellon University NSDL, 2024b). The objective is to show the different theoretical models and data models of weak acid-strong base equilibrium found within the simulation, as well as the relationships between them, by means of the graphical representations that can be constructed and that account for the simulated transformation of the chemical system in equilibrium under recognizable theoretical models.

5.1 Analysis of CSSE: Determine the Concentration of Acetic Acid in Vinegar

The objective of the virtual laboratory is to determine the concentration of acetic acid present in a vinegar solution [Footnote 16]. For this, the laboratory provides a standardized 0.110 M sodium hydroxide (NaOH) solution, the acid–base indicator phenolphthalein, distilled water, the vinegar solution, and instruments such as an Erlenmeyer flask, a volumetric flask, a burette, and a pipette. This virtual laboratory allows interaction with the scientific representations of instruments, substances, and equipment, which makes it possible to design virtual experimental setups (see Fig. 3) [Footnote 17]. Likewise, dragging one element (reagent or laboratory material) over another with the mouse allows actions such as suction, transfer, addition, pouring, and heating, among others.

Fig. 3 Graphical interface and constituent parts of the virtual laboratory “determine the concentration of acetic acid in vinegar”. Adapted from Carnegie Mellon University NSDL (2024b)

When carrying out the experimental practice, the first instruction [Footnote 18] reveals the need to use the dilution model:

$${C}_{1} \cdot {V}_{1} = {C}_{2} \cdot {V}_{2}$$

Model 1

Dilution of a solution.
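A short sketch shows how Model 1 can be applied to plan a 1/10 dilution. The stock concentration and the flask volume below are hypothetical values chosen only for illustration, and `volume_of_stock` is a helper name introduced here; none of them comes from the virtual laboratory.

```python
def volume_of_stock(c_stock: float, c_target: float, v_target: float) -> float:
    """Solve C1 * V1 = C2 * V2 for V1, the volume of stock solution needed."""
    if c_stock <= 0:
        raise ValueError("stock concentration must be positive")
    return c_target * v_target / c_stock


c_vinegar = 0.80             # mol/L, hypothetical stock concentration
c_diluted = c_vinegar / 10   # 1/10 dilution
v_flask_ml = 100.0           # mL, hypothetical volumetric flask size

v_vinegar_ml = volume_of_stock(c_vinegar, c_diluted, v_flask_ml)
print(f"Pipette {v_vinegar_ml:.1f} mL of vinegar and dilute to {v_flask_ml:.0f} mL")
```

Because the dilution factor is a ratio, the volume of vinegar required does not depend on the actual (unknown) concentration of the stock, which is why the dilution can be prepared before the titration is performed.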

After having carried out the dilution (1/10), we take a 10 mL aliquot and add a few drops of phenolphthalein (0.5 mL). Subsequently, we pour a quantity of the titrant into a 50 mL burette and carry out the acid–base titration with the colorimetric and potentiometric methods [Footnote 19], obtaining the following data (Table 1):

Table 1 Data produced (simulated by the potentiometric and colorimetric procedure) in the virtual laboratory of Carnegie Mellon University & NSDL (2024b)

The models of obtained pH data displayed by the virtual laboratory can be associated with the underlying models involved in the four regions (before addition of the titrant, before the equivalence point, at the equivalence point and after the equivalence point) of the acid–base titrations. In that sense, the theoretical models associated with each region of the titration are presented.

Before the addition of the titrant, the system is only a vinegar solution and can be represented (the data can be subsumed) in the form of a weak acid equilibrium with the acid dissociation constant of the analyte, \(K_a\).

$$K_a = \frac{{x}^{2}}{\left[HA\right] - x}$$

Model 2

Ideal chemical equilibrium of a weak acid.

After solving the equilibrium and interpreting x as the concentration of hydronium ions, x is substituted into the ideal pH model (Martín Sanabria & Garay Garay, 2020).

$$pH= -log\left[{H}^{+}\right]$$

Model 3

Ideal hydrogen potential.
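Models 2 and 3 can be chained in a short Python sketch. We assume a textbook value of Ka = 1.8 × 10⁻⁵ for acetic acid at 25 °C and a hypothetical analyte concentration; neither value is taken from the virtual laboratory, and the function name is ours.

```python
import math

KA_ACETIC = 1.8e-5  # assumed textbook Ka of acetic acid at 25 °C


def weak_acid_ph(c_acid: float, ka: float = KA_ACETIC) -> float:
    """Initial pH of a weak acid solution (Models 2 and 3).

    Model 2, Ka = x^2 / ([HA] - x), is rearranged to
    x^2 + Ka*x - Ka*[HA] = 0 and the positive root is kept.
    """
    if c_acid <= 0:
        raise ValueError("concentration must be positive")
    x = (-ka + math.sqrt(ka * ka + 4.0 * ka * c_acid)) / 2.0  # [H+]
    return -math.log10(x)  # Model 3


# Hypothetical diluted vinegar with [HA] = 0.080 M
print(f"pH = {weak_acid_ph(0.080):.2f}")
```

Solving the quadratic exactly avoids the usual approximation [HA] − x ≈ [HA], although for this kind of system both routes give nearly the same value.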

Before the equivalence point, when adding x number of moles of the titrant, the number of moles of the acid and conjugate base after the reaction must be determined (see reaction 1). With the data of moles and total volume the concentration of acetic acid and acetate ion can be calculated:

$${NaOH}_{(aq)}+{C{H}_{3}COOH}_{(aq)} \leftrightarrow {C{H}_{3}COONa}_{(aq)}+{{H}_{2}O}_{(l)}$$

Reaction 1

Titration equation.

Subsequently, the Henderson-Hasselbalch model is used for such a system and the pH in the second region is determined.

$$pH= {pK}_{a}+log\frac{\left[{A}^{-}\right]}{\left[HA\right]}$$

Model 4

Henderson-Hasselbalch.
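The second region can be sketched as follows. The initial moles of acid are inferred from the reported equivalence volume (7.23 mL of 0.110 M NaOH); the Ka value and the function name are assumptions introduced for illustration.

```python
import math

KA_ACETIC = 1.8e-5  # assumed textbook Ka of acetic acid at 25 °C


def buffer_region_ph(n_acid_initial: float, n_base_added: float,
                     v_total_l: float, ka: float = KA_ACETIC) -> float:
    """pH before the equivalence point via Henderson-Hasselbalch (Model 4)."""
    if not 0.0 < n_base_added < n_acid_initial:
        raise ValueError("valid only between the start and the equivalence point")
    conc_ha = (n_acid_initial - n_base_added) / v_total_l  # remaining acetic acid
    conc_a = n_base_added / v_total_l                      # acetate formed (Reaction 1)
    return -math.log10(ka) + math.log10(conc_a / conc_ha)


n_acid = 0.110 * 7.23e-3  # mol of acid in the aliquot, from the equivalence volume
# Half-way to the equivalence point (3.615 mL of titrant added to a 10 mL aliquot)
ph_half = buffer_region_ph(n_acid, n_acid / 2.0, v_total_l=0.010 + 3.615e-3)
print(f"pH at half-equivalence = {ph_half:.2f}")
```

At half-equivalence the two concentrations are equal, so the sketch returns the pKa of the acid; the total volume cancels in the ratio but is kept to mirror the procedure described above.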

The equivalence point was reached after adding 7.23 mL of titrant, when the system turned pale pink at a pH of 8.69. At this point there are "only" moles of the conjugate base (acetate ion) in the total volume of the system, so the acetate ion undergoes hydrolysis and establishes its own equilibrium (see model 5) through the reaction:

$$C{H}_{3}CO{O}_{(aq)}^{-}+{H}_{2}{O}_{(l)}\leftrightarrow {CH}_{3}COO{H}_{(aq)}+{OH}_{(aq)}^{-}$$

Hydrolysis of acetate ion.

$$Kh= \frac{{x}^{2}}{\left[{A}^{-}\right]-x}$$

Model 5

Ideal chemical equilibrium of conjugate base hydrolysis.

When solved, the hydroxyl concentration is found and thus the hydroxyl potential model is used.

$$pOH= -log\left[{OH}^{-}\right]$$

Model 6

Ideal hydroxyl potential.

The pOH result is then incorporated into the pH model containing the 25 °C constant of the ionic product of water in logarithmic form (\(-log{K}_{w}\)), to determine the ideal hydrogen potential of the equivalence point.

$$pH=14-pOH$$

Model 7

Relationship between ideal hydrogen, hydroxyl, and ionic product potentials of water.
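Models 5 to 7 can be combined in a minimal Python sketch. We assume Kh = Kw/Ka with textbook constants at 25 °C, and we neglect the indicator volume when computing the total volume; these choices, like the function name, are ours and are not documented features of the virtual laboratory.

```python
import math

KA_ACETIC = 1.8e-5  # assumed textbook Ka of acetic acid at 25 °C
KW = 1.0e-14        # ionic product of water at 25 °C


def equivalence_point_ph(n_acid_initial: float, v_total_l: float,
                         ka: float = KA_ACETIC) -> float:
    """pH at the equivalence point from acetate hydrolysis (Models 5, 6 and 7)."""
    c_base = n_acid_initial / v_total_l  # [A-]: all the acid is now acetate
    kh = KW / ka                         # hydrolysis constant (assumed Kh = Kw/Ka)
    # Model 5: Kh = x^2 / ([A-] - x)  ->  x^2 + Kh*x - Kh*[A-] = 0
    x = (-kh + math.sqrt(kh * kh + 4.0 * kh * c_base)) / 2.0  # [OH-]
    poh = -math.log10(x)                 # Model 6
    return -math.log10(KW) - poh         # Model 7: pH = 14 - pOH


n_acid = 0.110 * 7.23e-3                 # mol, from the reported equivalence volume
ph_eq = equivalence_point_ph(n_acid, v_total_l=(10.0 + 7.23) / 1000.0)
print(f"pH at equivalence = {ph_eq:.2f}")
```

Under these assumptions the result should land close to the pH of 8.69 reported by the simulation; small differences would be expected from the constants and the total volume actually used by the virtual laboratory.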

After the equivalence point, the pOH is determined from the excess moles of hydroxyls from the titrant and their respective concentration by the following relation:

$$\left[{OH}^{-}\right]= \frac{\text{mol in excess}}{\text{total volume}}$$

Data model 1

Hydroxyl ion concentration.

Models 6 and 7 are then applied to obtain the pH.
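The fourth region can be sketched in the same way. The titrant volume of 8.00 mL used in the example is a hypothetical reading, not a value from Table 1, and the indicator volume is again neglected.

```python
import math


def post_equivalence_ph(n_acid_initial: float, c_titrant: float,
                        v_titrant_l: float, v_analyte_l: float) -> float:
    """pH after the equivalence point from the excess hydroxide of the titrant."""
    n_excess = c_titrant * v_titrant_l - n_acid_initial  # mol of OH- in excess
    if n_excess <= 0:
        raise ValueError("valid only after the equivalence point")
    conc_oh = n_excess / (v_analyte_l + v_titrant_l)     # Data model 1
    poh = -math.log10(conc_oh)                           # Model 6
    return 14.0 - poh                                    # Model 7 (25 °C)


n_acid = 0.110 * 7.23e-3  # mol, from the reported equivalence volume
# Hypothetical reading: 8.00 mL of 0.110 M NaOH added to the 10 mL aliquot
print(f"pH = {post_equivalence_ph(n_acid, 0.110, 8.00e-3, 10.0e-3):.2f}")
```

The contribution of acetate hydrolysis to the hydroxide concentration is ignored here, which is the usual simplification once the strong base is in excess.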

Each region and its respective theoretical models represent a different stage of the transformation process of the acid–base system under study. Constructing the graph of pH vs. titrant volume (see Fig. 4) makes evident the relationships among the data models involved in the four regions of the weak acid-strong base titration, once they have been subsumed under the empirical substructures of the theoretical models that represent them:

Fig. 4 Representation of the transformation process of the acetic acid-sodium hydroxide system, the most representative data models used by regions and characterized under the theoretical models

Furthermore, the data produced make it possible to determine the volume of titrant at the equivalence point by constructing the first-derivative graph (\(\frac{\Delta pH}{\Delta V}\) vs. mL of NaOH) (see Fig. 5) using the corresponding model:

Fig. 5 First derivative method to determine the volume of titrant at the equivalence point

$$\frac{\Delta pH}{\Delta V }= \frac{{pH}_{n+1}-p{H}_{n}}{{V}_{n+1}-{V}_{n}}$$

Model 8

Variation of the pH of the analyte with respect to the volume variation of the titrant.
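Model 8 amounts to a finite difference between consecutive readings, which can be sketched as follows. The readings below are invented for illustration and are not the data of Table 1; assigning each slope to the midpoint of its volume interval is a common convention that we assume here.

```python
def first_derivative(volumes, phs):
    """Return (midpoint volume, ΔpH/ΔV) for each pair of consecutive readings (Model 8)."""
    if len(volumes) != len(phs) or len(volumes) < 2:
        raise ValueError("need two sequences of equal length with at least two readings")
    points = []
    for i in range(len(volumes) - 1):
        dv = volumes[i + 1] - volumes[i]
        if dv <= 0:
            raise ValueError("volumes must be strictly increasing")
        slope = (phs[i + 1] - phs[i]) / dv
        points.append(((volumes[i] + volumes[i + 1]) / 2.0, slope))
    return points


def estimate_equivalence_volume(volumes, phs):
    """Volume at which ΔpH/ΔV is largest, taken as the equivalence point."""
    return max(first_derivative(volumes, phs), key=lambda point: point[1])[0]


# Hypothetical readings (mL of NaOH, pH); NOT the values of Table 1
volumes_ml = [6.0, 6.5, 7.0, 7.2, 7.3, 7.5, 8.0]
ph_values = [5.4, 5.7, 6.3, 7.0, 10.2, 10.9, 11.6]
print(f"Estimated equivalence volume: {estimate_equivalence_volume(volumes_ml, ph_values):.2f} mL")
```

The precision of the estimate is limited by the spacing of the readings, so smaller titrant increments near the colour change give a sharper peak.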

Finally, to meet the CSSE objective, we take the titrantFootnote 20 volume at the equivalence point shown in Fig. 4 (7.23 mL) and apply the model associated with the determination of the analyte concentration.

$${N}_{analyte}*{V}_{analyte}={N}_{titrant}*{V}_{peq}$$

Model 9

Determination of analyte concentration.

By applying model 9, the concentration of acetic acid present in the dilution can be determined and then multiplied by a factor of 10 to find the concentration in the vinegar solution.
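The final calculation can be illustrated with the values reported above (0.110 M NaOH, 7.23 mL at the equivalence point, a 10 mL aliquot and a 1/10 dilution). The sketch assumes that, for a monoprotic acid titrated with NaOH, normality and molarity coincide; the function name is ours.

```python
def analyte_concentration(n_titrant: float, v_peq: float, v_analyte: float) -> float:
    """Solve N_analyte * V_analyte = N_titrant * V_peq (Model 9) for N_analyte."""
    if v_analyte <= 0:
        raise ValueError("analyte volume must be positive")
    return n_titrant * v_peq / v_analyte


DILUTION_FACTOR = 10  # the aliquot comes from a 1/10 dilution of the vinegar

c_diluted = analyte_concentration(n_titrant=0.110, v_peq=7.23, v_analyte=10.0)
c_vinegar = c_diluted * DILUTION_FACTOR
print(f"Diluted solution: {c_diluted:.4f} M")
print(f"Vinegar solution: {c_vinegar:.3f} M")
```

With these inputs the diluted solution is about 0.0795 M and the vinegar solution about 0.795 M in acetic acid.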

The analyzed simulation presents a series of ideal data models connected to each other across the four regions of the weak acid-strong base equilibrium and subsumed under the empirical substructures of the theoretical models, which allows the results shown by the virtual laboratory to be interpreted. Thus, the virtual laboratory provides information on the weak acid-strong base system it imitates (acetic acid-sodium hydroxide), understood as data models that are sequentially subsumed under theoretical models in order to “account” for such phenomena. In this sense, CSSE are instruments or composite structures that allow performing experiments on models (Shannon, 1998), in this particular case, on data models (van Fraassen, 2008) and their relationships with the empirical substructures of theoretical models (Ariza et al., 2016; Balzer et al., 1987) that have been translated into the formalism of a programming language (Durán, 2020). We consider that the CSSE analyzed could be categorized among the in silico experiments of Vallverdú (2014), since it follows a set of pre-established and fixed steps. It is worth mentioning that the CSSE analyzed relies on constants at 25 °C, such as the Ka of the acid and Kw. The population of models presented belongs to the class of theoretical models of (acid–base) chemical equilibrium and can be defined (identified) in part by the law of mass action and the principles of Le Châtelier and van't Hoff.

In this sense, and in the educational context, it is desirable that teachers first carry out school scientific modeling processes around chemical equilibrium theory and the Brønsted-Lowry acid–base theory (Martín Sanabria & Garay Garay, 2020). On this basis, the use of computer simulation can contribute to the understanding of the transformation process of the acid–base chemical system in equilibrium under the fundamental notions of the analysis methods (colorimetric and potentiometric) used in the experiment. Such simulated analysis methods produce data models that are represented by the titration curve and characterized under the theoretical models of each titration region.

Under the guidance of teachers, students are expected to construct the titration curve and the first derivative graph to determine the equivalence point of the acid–base system. It is recommended to initially use computer simulation, followed by physical laboratory practice, to enhance the development of procedural skills and conceptual understanding. In this regard, CSSE serves as a procedural bridge by facilitating students' initial understanding of laboratory procedures, allowing them to simulate experimental designs before conducting hands-on experiments. It also aids in familiarizing students with the instruments, materials, equipment, and substances involved in the practice beforehand.

Similarly, CSSE serves as a conceptual bridge by supporting teachers’ explanations and predictions regarding the data models produced throughout the experiment. The presented metatheoretical elucidation enables teachers to utilize simulation for explaining the ideal theoretical models involved within the simulation and in each region of the titration, as well as generating predictive dynamics concerning the data that will be produced in the virtual laboratory under specific experiment conditions (ambient temperature at 25 °C and 1 atm pressure).

6 Conclusions

In this paper we review and reflect on some of the didactic and metatheoretical characteristics of CSSE. The metatheoretical analysis of a virtual laboratory allows us to affirm that CSSE using quantitative data have associated data models that come from simulated measurement systems, are related to each other, are substantially linked to the phenomena studied, and can be subsumed under (empirical substructures of the) theoretical models. In this sense, the use of data models, theoretical models, graphical and mathematical representations, measurement simulations (to obtain data models) and definitions of initial conditions in simulated scientific practices turns CSSE into tools for the construction of scientific knowledge in schools that allow the representation of world phenomena in virtual spaces. For this reason, and by integrating such didactic and metatheoretical characteristics, CSSE can be positioned as bridges that facilitate the transition towards the understanding of theoretical models and the phenomena that these models seek to explain, through their implementation in school activities. Similarly, it is possible to establish differences between CSSE and CSS, mainly in the design process. We consider that the present contribution allows us to expand and update the didactic and philosophical notions about the production of scientific knowledge and those associated with the use of CSS or CSSE.