1 Introduction

Students’ learning processes for acquiring advanced and abstract scientific knowledge are complicated, with cognitive and social learning playing crucial roles [1,2,3]. Such learning processes have been discussed from viewpoints based on complex dynamic systems [3,4,5], where the cognitive and social aspects of learning [1, 6] are seen as equally important. In such a learning process, two issues are key. First, learners’ mental models and explanatory schemes are strongly context dependent [2, 3], giving rise to varied but robust outcomes within a given context [4, 5]. Second, social learning may significantly boost learning even in cases where only indirect effects operate, through constant peer-to-peer comparisons which reinforce students’ self-efficacy [6] or mutual appreciation [7].

Here, an agent-based model is introduced for exploring the social and cognitive aspects of teaching-learning processes, referred to briefly as the sociocognitive aspects of learning. The target system to be modelled is a five-person group whose learning task is to learn a tiered system of explanatory schemes that explain a set of observed phenomena. Only a few possible explanatory schemes of different levels of sophistication are available, corresponding to some well-known and extensively studied cases of learning scientific knowledge [4, 5]. The basic assumptions in modelling such a teaching-learning process are that the process is affected by: (1) the context of learning and its design, (2) students’ cognitive abilities and proficiencies, and (3) social interactions. These three sociocognitive aspects and how they are idealised are discussed in more detail in what follows. The teaching-learning task and the corresponding explanatory schemes are modelled as an epistemic landscape [8, 9], while the cognitive dynamics of learning is described as the agent’s exploration of the epistemic landscape. Social interactions, on the other hand, are modelled by using an agent-based model of how agents’ proficiencies develop solely through mutual comparisons of their proficiencies [10].

2 Models of Knowledge and Learning

The knowledge systems that are the target of learning here are systems of tiered knowledge schemes [4, 5]. A concrete example of such a system consists of explanatory schemes describing the behaviour of simple DC-circuits, where from five to seven explanatory schemes can be discerned [4, 5]. Consequently, a three-tiered system consisting of five explanatory schemes \(m_{1}-m_{5}\) is assumed here. The details of the tiered systems and how they correspond to real learning tasks are explained elsewhere [5].

Each scheme \(m_{1}-m_{5}\) can be associated with a utility function \(u_k\), with \(k=1, \dots 5\), which provides an abstract representation of the likelihood that scheme \(m_k\) provides an explanation. The utility \(u_k\) depends on two external (exogenous) variables \(\epsilon \) and \(\kappa \). The first variable \(\epsilon \in [0,1]\) is the relative number of explained features (i.e. explanans) contained in tasks. The value \(\epsilon =1\) describes the explanandum, where all features are explained and the explanans becomes equal to the explanandum. The second variable is the proficiency \(\kappa \in [0,1]\), which describes the proficiency required from a learner to use a given scheme \(m_k\) in providing explanations. The value \(\kappa = 1\) denotes full mastery in using the highest-level schemes [10].

Explanatory schemes have different utilities in different situations of explanation. Whether a given scheme is used is assumed to depend on its utility in a given context or situation and on the proficiency of the user, with higher level schemes requiring higher proficiency. The tiered system of explanatory schemes can be described by constructing a corresponding manifold of utility functions, called an epistemic landscape (see refs. [8, 9] and the references therein). The system of utility functions is modelled here as a set of Gaussian functions in a two-dimensional space \((\epsilon , \kappa )\) spanned by the explanans \(\epsilon \in [0,1]\) and proficiency \(\kappa \in [0,1]\), in the form

$$\begin{aligned} u_{k} (\epsilon ,\kappa ) = \mathrm{exp}[- (\frac{1}{2(1-\rho ^2)} ( \frac{(\epsilon -\epsilon _k)^2}{2 w_{\epsilon }^2}+\frac{(\kappa -\kappa _k)^2}{2 w_{\kappa }^2}) +2 \rho \frac{(\epsilon -\epsilon _k)(\kappa -\kappa _k)}{w_{\epsilon }w_{\kappa } })] \end{aligned}$$
(1)

where \(\epsilon _k\) and \(\kappa _k\) define the maximum, with \(\epsilon _{k+1} > \epsilon _k\) and \(\kappa _{k+1} > \kappa _k \) corresponding to the tiering of schemes \(m_k\). The allowed variation in utility is governed by \(w_{\epsilon }\) and \(w_{\kappa }\), respectively, while \(\rho \) controls the (positive) correlation between proficiency and explanans, taken here to be only moderate with \(\rho =0.20\).
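As an illustration, the utility of Eq. (1) can be evaluated numerically. The following is a minimal sketch that follows the equation term by term as it is printed above; the peak positions `PEAKS` and the widths `W_EPS`, `W_KAPPA` are hypothetical placeholder values chosen only to respect the tiering \(\epsilon _{k+1} > \epsilon _k\), \(\kappa _{k+1} > \kappa _k \), since the text does not list the values actually used.

```python
import math

RHO = 0.20  # correlation parameter quoted in the text

def utility(eps, kappa, eps_k, kappa_k, w_eps, w_kappa, rho=RHO):
    """Evaluate u_k(eps, kappa), following Eq. (1) literally as printed."""
    d_eps = eps - eps_k
    d_kappa = kappa - kappa_k
    # Sum of the two squared-deviation terms inside the inner parentheses.
    quad = d_eps**2 / (2.0 * w_eps**2) + d_kappa**2 / (2.0 * w_kappa**2)
    # Cross term coupling explanans and proficiency (sign as printed).
    cross = 2.0 * rho * d_eps * d_kappa / (w_eps * w_kappa)
    return math.exp(-(quad / (2.0 * (1.0 - rho**2)) + cross))

# Hypothetical peak positions (eps_k, kappa_k) and widths: assumptions,
# not values taken from the paper.
PEAKS = [(0.1, 0.1), (0.3, 0.3), (0.5, 0.5), (0.7, 0.7), (0.9, 0.9)]
W_EPS = W_KAPPA = 0.1

def landscape(eps, kappa):
    """Return [u_1, ..., u_5] at the point (eps, kappa)."""
    return [utility(eps, kappa, e, k, W_EPS, W_KAPPA) for e, k in PEAKS]
```

Each utility equals 1 at its own maximum \((\epsilon _k, \kappa _k)\) and decays away from it, so `landscape(0.5, 0.5)` is dominated by the third scheme under these placeholder values.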

The fact that explanatory schemes contain similar elements means that learning one scheme may help or hinder learning a closely related scheme. Such entanglement of the explanatory schemes \(m_{1}-m_{5}\) is here described at an idealised, generative level, by using an entanglement factor which modifies the schemes so that the utility functions \(\tilde{u}_{k}\) affected by entanglement are given by (compare with ref. [11])

$$\begin{aligned} \tilde{u}_{k} = u_{k}(1+\varDelta _{k} \varTheta ), \, \, \, \, \, \mathrm{where} \, \, \sum _{k} \varDelta _{k}=0 \end{aligned}$$
(2)

where \(\varTheta =\sum _{k} u_{k} \cos {[ (\pi R_{k})/(2 \lambda ) ]}\) with \( R_{k} =[(\epsilon -\epsilon _{k})^2+(\kappa -\kappa _{k})^2]^{1/2}\) models the effects of entanglement. The parameter \(\lambda \) is roughly related to the number of combinatorial factors responsible for the entanglement and thus affects the number of intermediate maxima in the entangled landscape between maxima in the non-entangled landscape. The entanglement factors \(\varDelta _{k}\) for the utility functions are defined as \(\varDelta _{1}=A_{1,2} + A_{1,3}\), \(\varDelta _{2}=- A_{1,2} + A_{2,3} \), \(\varDelta _{3}=- A_{1,3} - A_{2,3} + A_{3,4} + A_{3,5} \), \(\varDelta _{4}= -A_{3,4} - A_{3,5} + A_{4,5} \) and \(\varDelta _{5}= -A_{4,5} \), where \(A_{k,k'} = A_{0} \sqrt{u_{k} u_{k'}}\). The entanglement factors sum to zero, so that they only redistribute the probability mass. The three different epistemic landscapes A-C studied here are shown in Fig. 1.
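The entanglement construction of Eq. (2) can be sketched in a few lines. The code below is an illustrative sketch only: the amplitude `A0` is a hypothetical value (the text does not state \(A_0\)), and the peak positions are passed in by the caller. It computes the five entanglement factors from the pairwise couplings \(A_{k,k'}\), the factor \(\varTheta \), and the entangled utilities \(\tilde{u}_{k}\).

```python
import math

A0 = 0.5      # hypothetical coupling amplitude A_0 (not given in the text)
LAMBDA = 3.0  # lambda = 3 corresponds to landscape B in Fig. 1

def entanglement_factors(u, a0=A0):
    """Return [Delta_1, ..., Delta_5] for utilities u = [u_1, ..., u_5]."""
    def a(i, j):
        # A_{i,j} = A_0 * sqrt(u_i * u_j), with 1-based scheme indices.
        return a0 * math.sqrt(u[i - 1] * u[j - 1])
    return [
        a(1, 2) + a(1, 3),
        -a(1, 2) + a(2, 3),
        -a(1, 3) - a(2, 3) + a(3, 4) + a(3, 5),
        -a(3, 4) - a(3, 5) + a(4, 5),
        -a(4, 5),
    ]

def theta(u, eps, kappa, peaks, lam=LAMBDA):
    """Theta = sum_k u_k * cos(pi * R_k / (2 * lambda))."""
    total = 0.0
    for u_k, (eps_k, kappa_k) in zip(u, peaks):
        r_k = math.hypot(eps - eps_k, kappa - kappa_k)  # distance R_k
        total += u_k * math.cos(math.pi * r_k / (2.0 * lam))
    return total

def entangled(u, eps, kappa, peaks, a0=A0, lam=LAMBDA):
    """Entangled utilities u_k * (1 + Delta_k * Theta) of Eq. (2)."""
    th = theta(u, eps, kappa, peaks, lam)
    deltas = entanglement_factors(u, a0)
    return [u_k * (1.0 + d_k * th) for u_k, d_k in zip(u, deltas)]
```

Because every coupling \(A_{k,k'}\) enters once with a plus sign and once with a minus sign, the returned factors cancel pairwise, which is the zero-sum condition stated in Eq. (2).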

Fig. 1. The epistemic landscape in two-dimensional space spanned by explanans \(\epsilon \) and proficiency \(\kappa \) and consisting of utilities \(u_{1}\) (orange), \(u_{2}\) (blue), \(u_{3}\) (green), \(u_{4}\) (purple) and \(u_{5}\) (red). The three different landscape models shown are: A (no entanglement), B (\(\lambda =3\)) and C (\(\lambda =5\)). (Color figure online)

Cognitive learning is described by a probabilistic learning model (PLM), where the most probable scheme is selected through comparison of utilities \(\tilde{u}_{k}\), so that the selection of a given explanatory scheme \(m_k\) follows a simple canonical probability distribution [10,11,12]

$$\begin{aligned} P(m_{k}) = \left[ \ 1+\sum _{j \ne k} \exp \left[ \ -\beta ( \tilde{u}_{k}-\tilde{u}_{j}) \ \right] \ \right] ^{-1} \end{aligned}$$
(3)

with utilities \(\tilde{u}_{k}\) given by Eq. (2). The parameter \(\beta \) determines the noise-level of selection and is termed in what follows the confidence of choice. In what follows, only high confidence choices with \(\beta =10\) are considered, corresponding in practice to all choices with \(\beta \gg 1\).
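Dividing the numerator and denominator of Eq. (3) by \(\exp (\beta \tilde{u}_{k})\) shows that it is algebraically the same as the familiar softmax form \(P(m_k) = \exp (\beta \tilde{u}_{k}) / \sum _{j} \exp (\beta \tilde{u}_{j})\), so the probabilities sum to one over the five schemes. The sketch below evaluates both forms for comparison; the function names are illustrative and the utility values in the usage note are made up.

```python
import math

BETA = 10.0  # confidence of choice used in the text

def selection_probabilities(u_tilde, beta=BETA):
    """P(m_k) for every scheme, written exactly in the form of Eq. (3)."""
    probs = []
    for k, u_k in enumerate(u_tilde):
        s = sum(math.exp(-beta * (u_k - u_j))
                for j, u_j in enumerate(u_tilde) if j != k)
        probs.append(1.0 / (1.0 + s))
    return probs

def softmax(u_tilde, beta=BETA):
    """Equivalent softmax form; subtracting the maximum avoids overflow."""
    m = max(u_tilde)
    weights = [math.exp(beta * (u - m)) for u in u_tilde]
    z = sum(weights)
    return [w / z for w in weights]
```

For example, `selection_probabilities([0.1, 0.4, 0.9, 0.3, 0.05])` assigns almost all of the probability to the third scheme, which illustrates why \(\beta =10\) is described as a high-confidence choice.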

The foraging on the epistemic landscape now consists of \(\tau \) attempts to find the best explaining scheme. In practice, the number of attempts is chosen to be \(\tau _\mathrm{MAX} = 15 \times 12\), corresponding to 15 attempts for each key feature. This is enough to reach stable final states and stable learning outcome attractors (LOAs, introduced below) in the simulations. At each instant when the value of \(\tau \) is increased by one event, it is decided:

  1. which scheme \(m_{k}\) becomes selected

  2. what is the explanans provided by \(m_{k}\)

  3. how proficiency \(\kappa \) changes as guided by \(m_{k}\)

Each of these three steps is characterised by a set of probabilities, and the outcome is selected using the “roulette wheel” method [13]. In this method a discrete set of possible outcomes \(k=1,2, \ldots , N\) with probabilities \(\varPi _ k\) is arranged cumulatively, with cumulative probability \(\varPhi _{k}=(\sum _{j' \le k} \varPi _{j'} ) / \sum _{j' \le N} \varPi _{j'}\). The outcome k is selected if a random number \(r* \in [0,1]\) falls in the slot \(\varPhi _{k-1}< r* < \varPhi _{k}\), where \(\varPhi _{0}=0\). In case (1) the probabilities \(\varPi _{k}\) are given by Eq. (3) with \(k=1, \ldots , 5\) for all five possible choices. In cases (2) and (3) \(\varPi _{k}\) is given by the marginal probability distributions \(U_{\epsilon } (k=\epsilon *)=\int u_{k} (\epsilon *,\kappa ) d \kappa \) and \(U_{\kappa } (k=\kappa *)=\int u_{k} (\epsilon ,\kappa *) d \epsilon \), respectively, where \(\epsilon *\) and \(\kappa *\) are discretised to \(k \in [1,100]\) discrete bins. The values of \(\epsilon *\) and \(\kappa *\) sampled from the marginal distributions \(U_{\epsilon }\) and \(U_{\kappa }\) represent the agent’s new attempted explanans and proficiency, which may be larger or smaller than the initial ones. However, the agent is not assumed to change its state independently of its current state. Instead, the change of state depends on how the agent’s attempted new state at \(\tau +1\) is related to its initial state at \(\tau \). The realised changes are calculated from a discretised evolution equation for explanans \(\epsilon \) and proficiency \(\kappa \) as follows

$$\begin{aligned} \epsilon _{\tau +1}\leftarrow & {} \epsilon _{\tau } + \delta \epsilon \end{aligned}$$
(4)
$$\begin{aligned} \kappa _{\tau +1}\leftarrow & {} \kappa _{\tau } + \mu \, \delta \kappa \, [4 \kappa _{\tau } (1-\kappa _{\tau })] \end{aligned}$$
(5)
Table 1. Changes \(\delta \epsilon \) and \(\delta \kappa \) in explanans and proficiency, respectively, to be used in the evolution equations for the agent’s state changes in Eqs. (4)–(5). The initial values are \(\epsilon \) and \(\kappa \), and the new attempted values (sampled from the marginal distributions \(U_{\epsilon }\) and \(U_{\kappa }\)) are \(\epsilon *\) and \(\kappa *\).

The changes \(\delta \epsilon \) and \(\delta \kappa \), including their signs, depend on the current state of the agent and on the attempted new state, as shown in Table 1. In the evolution equation for \(\kappa \), Eq. (5), the parameter \(\mu \) describes the memory effect and the term \(4 \kappa (1-\kappa )\) takes into account the cognitive limits in changes of proficiency; together they lead to logistic evolution of the proficiency [10]. Regarding the explanans, the above rule means that the utility function decides how much the agent manages to explain at a given stage \(\tau \) of the exploration (or foraging), given its current state, i.e. its proficiency \(\kappa \) and adopted explanatory scheme \(m_k\). Regarding proficiency, the above rules implement the idea that if evolution is in the direction of stronger explanations, then the proficiency \(\kappa \) increases, but if it is in the direction of weaker explanations, corresponding to failure, then the proficiency \(\kappa \) decreases. Such cognitive dynamics can also be interpreted as a “hill climbing” type of exploration of an epistemic landscape [8]. The parameter \(\mu \) controls the strength of the memory of success or failure. In principle it can be different for success and failure, but in what follows, for want of better information, we discuss only the case of equal memory for success and failure.
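The two generic building blocks of a foraging step, roulette-wheel selection and the proficiency update of Eq. (5), can be sketched as follows. This is a minimal illustration rather than the simulation code: the function names are hypothetical, and the change \(\delta \kappa \) is passed in as an argument because the rules of Table 1 that determine it are not reproduced here.

```python
import random

MU = 0.05  # memory parameter value quoted in the Results section

def roulette_select(weights, r=None):
    """Return the 0-based index whose cumulative slot contains r.

    `weights` are unnormalised probabilities Pi_k; `r` is a number in
    [0, 1), drawn at random when not supplied.
    """
    if r is None:
        r = random.random()
    total = sum(weights)
    cumulative = 0.0
    for k, w in enumerate(weights):
        cumulative += w
        if r < cumulative / total:  # r falls below Phi_k, so slot k is hit
            return k
    return len(weights) - 1  # guard against rounding when r is close to 1

def update_proficiency(kappa, delta_kappa, mu=MU):
    """Eq. (5): kappa <- kappa + mu * delta_kappa * 4 * kappa * (1 - kappa)."""
    return kappa + mu * delta_kappa * 4.0 * kappa * (1.0 - kappa)
```

The factor \(4 \kappa (1-\kappa )\) vanishes at \(\kappa =0\) and \(\kappa =1\) and equals one at \(\kappa =0.5\), so changes are largest at intermediate proficiency and proficiency stays within \([0,1]\) for small steps.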

Proficiency is here not considered a fixed property but one that depends on peer-to-peer comparison and appraisals between peers [10] (see also refs. [6, 7]). The dynamic equations for the proficiency are thus assumed to follow a bounded confidence model [10, 14, 15]. In that model, the changes in proficiency due to interaction between agents q and \(q'\), holding explanatory schemes \(m_k\) and \(m_{k'}\) with proficiencies \(\kappa \) and \(\kappa '\) and explanans \(\epsilon \) and \(\epsilon '\), respectively, are given by

$$\begin{aligned} \kappa\leftarrow & {} \kappa + \gamma \, J_{q,q'} (\kappa '-\kappa ) [4 \, \kappa (1-\kappa )] \end{aligned}$$
(6)
$$\begin{aligned} \kappa '\leftarrow & {} \kappa ' + \gamma \, J_{q,q'} (\kappa -\kappa ') [4 \, \kappa ' (1-\kappa ')] \end{aligned}$$
(7)

where \(J_{q,q'}=\mathrm{exp} \left[ \ - (\sqrt{(k'/5)} \, \kappa ' -\sqrt{(k/5)} \, \kappa )^2/(2 \sigma ^2) \right] \ \mathrm{exp} \left[ \ -( \epsilon ' - \epsilon )^2/(2 \sigma ^2) \right] \) is the propagator for the change (compare with refs. [14, 15]). The width \(\sigma \) of the Gaussian function is related to the agents’ tolerance to diversity in proficiency (referred to simply as the diversity in what follows). In the simulations \(\gamma =0.15\), chosen to represent moderate sensitivity, is kept fixed, and only the parameter \(\sigma \) is changed. The output variables of the simulations are the agents’ proficiencies and the relative number density \(n_{k} (\epsilon ,\kappa )\) of the adopted explanatory scheme \(m_k\) in the space spanned by proficiency \(\kappa \) and explanans \(\epsilon \). Because \(\kappa \) evolves during the simulations, this eventually leads to accumulation of scheme choices, seen as peaked values of \(n_{k} (\epsilon ,\kappa )\) in certain regions of the \((\epsilon , \kappa )\)-space. These regions are called Learning Outcome Attractors, or LOAs, in what follows.
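A single pairwise interaction according to Eqs. (6)–(7) can be sketched as below. The representation of an agent as a `(k, kappa, eps)` tuple and the function names are assumptions made for illustration. The sketch also assumes that both agents are updated simultaneously from their pre-interaction proficiencies, which the equations suggest but the text does not state explicitly.

```python
import math

GAMMA = 0.15  # strength of social learning used in the simulations

def propagator(k, kappa, eps, k2, kappa2, eps2, sigma):
    """J_{q,q'}: product of two Gaussians of width sigma."""
    d_prof = math.sqrt(k2 / 5.0) * kappa2 - math.sqrt(k / 5.0) * kappa
    d_eps = eps2 - eps
    return (math.exp(-d_prof**2 / (2.0 * sigma**2))
            * math.exp(-d_eps**2 / (2.0 * sigma**2)))

def social_update(agent, peer, sigma, gamma=GAMMA):
    """Apply Eqs. (6)-(7) once; returns the new (kappa, kappa').

    `agent` and `peer` are (k, kappa, eps) tuples, with k the index of
    the adopted scheme (1..5). Both updates use the old proficiencies.
    """
    k, kappa, eps = agent
    k2, kappa2, eps2 = peer
    j = propagator(k, kappa, eps, k2, kappa2, eps2, sigma)
    new_kappa = kappa + gamma * j * (kappa2 - kappa) * 4.0 * kappa * (1.0 - kappa)
    new_kappa2 = kappa2 + gamma * j * (kappa - kappa2) * 4.0 * kappa2 * (1.0 - kappa2)
    return new_kappa, new_kappa2
```

For two agents holding the same scheme and explanans but different proficiencies, a larger \(\sigma \) gives a larger propagator and hence a larger mutual shift, which is how tolerance to diversity enters the dynamics.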

The LOAs and their evolution during the simulations, as exploration of the epistemic landscape increases with increasing \(\tau \), provide very detailed information on the evolution of the agents’ states. A more compact measure is obtained by integrating the density, which gives the total (relative) number density \(N_k\) of a given explanatory scheme \(m_k\) in the form

$$\begin{aligned} N_{k} = N_{0}^{-1} \int n_{k} (\epsilon , \kappa ) d \epsilon d\kappa , \end{aligned}$$
(8)

with the normalisation \(N_0\) chosen so that \(\sum _{k} N_{k} =1\). The total number density \(N_{k}\) is then used to track the learning process.
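On a discretised \((\epsilon , \kappa )\) grid the integral of Eq. (8) reduces to a sum over cells. The following sketch assumes, for illustration only, that the densities are stored as nested lists `n[k][i][j]` on a uniform grid, so that the constant cell area cancels in the normalisation.

```python
def total_number_densities(n):
    """Return [N_1, ..., N_K] normalised so that they sum to one.

    n[k][i][j] is the density of scheme k in grid cell (i, j).
    """
    raw = [sum(sum(row) for row in grid) for grid in n]
    n0 = sum(raw)  # normalisation N_0
    if n0 == 0:
        raise ValueError("no scheme selections recorded")
    return [x / n0 for x in raw]
```

For instance, two schemes with summed densities 2 and 6 give \(N_1 = 0.25\) and \(N_2 = 0.75\).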

3 Results

The dynamic systems model, which describes learning as foraging for knowledge on an epistemic landscape, leads to the formation of robust learning outcome attractors (LOAs), where learning paths accumulate. The formation of the LOAs is determined by the interplay of learning by foraging for knowledge on the epistemic landscape and social learning. Here, the focus is on social learning and on the effects that the entangled and overlapping components of explanatory schemes have on learning. In order to isolate the effects of social learning and entanglement, we keep the parameters \(\beta \), \(\mu \) and \(\gamma \) fixed, corresponding to high confidence (\(\beta \) = 10) in the selection of explanatory schemes, low cognitive learning (\(\mu = 0.05\)) and moderate sensitivity to social learning (\(\gamma =0.15\)). In addition, we study only one type of cohort of learners, in which all the learners have low initial proficiency \(0.05< \kappa < 0.25\). This cohort is the most interesting one and shows the most nontrivial behaviour in regard to learning, thus best revealing the effects of social learning.

The learning outcome attractors (LOAs) resulting from cognitive learning and social learning are shown in Fig. 2 for epistemic landscape C, for three diversities \(\sigma \) = 0.08, 0.10 and 0.14 and for an increasing number of exploration attempts \(\tau \) = 0.05, 0.15, 0.40 and 1.00. The results are shown as density distributions \(n_{k} (\epsilon ,\kappa )\) of preferred explanatory schemes, with \(\tau =1\) corresponding to the end of the learning sequence. The shift towards more advanced schemes during learning, as \(\tau \) increases from \(\tau =0.15\) (little exploration) to \(\tau =1\) (exploration to near saturation), is particularly clear when the density \(n_{k} (\epsilon ,\kappa )\) of selected explanatory schemes in the \((\epsilon , \kappa )\)-space is examined.

Fig. 2. The Learning Outcome Attractors (LOAs) for epistemic landscape C. The LOAs are recognised as peaked regions in number density distribution \(n_k (\epsilon ,\kappa )\) for schemes \(m_k\), shown as: \(n_{1}\) (orange), \(n_{2}\) (blue), \(n_{3}\) (green), \(n_{4}\) (purple) and \(n_{5}\) (red). The results are shown at different stages of evolution and for different values of diversity \(\sigma \), as indicated in panels. Only densities \(n_k > 0.1\) are shown. The darker/lighter shade indicates positive/negative gradients of \(n_k\). (Color figure online)

The results in Fig. 2 show that increasing the tolerance to diversity \(\sigma \) in social learning significantly improves the outcomes of learning. Interactions with more competent peers, although they do not directly or proportionally increase the agent’s proficiency, increase the rate of growth of proficiency. In all these cases the LOAs are located roughly in those areas of the \((\epsilon ,\kappa )\)-space where the epistemic landscape has peak values, but the details of the formation of LOAs depend on diversity and entanglement. In practice this would mean that very different learning outcomes are observed depending on how extensively learners explore the tasks (described by \(\tau \)) and how tolerant they are to their peers’ diversity in proficiency (described by \(\sigma \)). For shallow exploration (low values of \(\tau \)) and low tolerance to diversity, i.e. high homophily (low values of \(\sigma \)), learning outcomes may appear better than in cases where diversity is high. However, when chances for exploration and thus for interaction are increased (increasing value of \(\tau \)), learning outcomes become better for cases where tolerance to diversity is high; given enough time for explorations and interactions, interactions with peers are always beneficial, even in the absence of a bias to learn from more competent peers. This is an outcome of how exploration of the epistemic landscape, its structure, and social learning are interconnected. In practice, it means interconnections between task structure (designed to advance learning) and collaborative learning, where learners communicate with their peers. It is noteworthy that the advantageous effect of competent peers persists even if the strength \(\gamma \) of social learning is fixed and only the diversity \(\sigma \) changes.

Fig. 3. The Learning Outcome Attractors (LOAs) at the intermediate stage of evolution (\(\tau =0.40\)) compared for epistemic landscape A (no entanglement), B (entangled with \(\lambda =3\)) and C (entangled with \(\lambda =5\)), from left to right. The LOAs are recognised as peaked regions in number density distribution \(n_k (\epsilon ,\kappa )\) for schemes \(m_k\), shown as: \(n_{1}\) (orange), \(n_{2}\) (blue), \(n_{3}\) (green), \(n_{4}\) (purple) and \(n_{5}\) (red). The results are shown for diversity \(\sigma =0.10\) (upper panels) and \(\sigma =0.14\) (lower panels), as indicated in panels. Only densities \(n_k > 0.1\) are shown. The darker/lighter shade indicates positive/negative gradients of \(n_k\). (Color figure online)

Fig. 4. The Learning Outcome Attractors (LOAs) at the final stage of evolution (\(\tau =1.00\)) compared for epistemic landscape A (no entanglement), B (entangled with \(\lambda =3\)) and C (entangled with \(\lambda =5\)), from left to right. The LOAs are recognised as peaked regions in number density distribution \(n_k (\epsilon ,\kappa )\) for schemes \(m_k\), shown as: \(n_{1}\) (orange), \(n_{2}\) (blue), \(n_{3}\) (green), \(n_{4}\) (purple) and \(n_{5}\) (red). The results are shown for diversity \(\sigma =0.10\) (upper panels) and \(\sigma =0.14\) (lower panels), as indicated in panels. Only densities \(n_k > 0.1\) are shown. The darker/lighter shade indicates positive/negative gradients of \(n_k\). (Color figure online)

The effect of entanglement on LOAs is shown in Fig. 3 for models A (no entanglement), B (entangled with \(\lambda =3\)) and C (entangled with \(\lambda =5\)) at an intermediate stage of evolution, and in Fig. 4 for the final states. The effect of entanglement is detectable, although it is clearly weaker than the effect of diversity. In Fig. 3 we see that for intermediate diversity the non-entangled landscape A leads to the formation of very sharply defined LOAs, and even the LOA corresponding to scheme \(m_4\) and high proficiency is formed. In the entangled landscapes B and C, equally strong LOAs corresponding to \(m_4\) and \(m_5\) at high proficiency emerge only when high explanans values \(\epsilon \) are reached. In practice, this means that if only low values of \(\epsilon \) are considered (corresponding to relatively simple situations and only a moderate number of observations to be explained), the non-entangled model A provides better learning outcomes than the entangled cases B and C. In all cases, however, diversity improves learning. Interestingly, the highly entangled model C with high diversity \(\sigma =0.14\) (Fig. 3, lower right) leads to the emergence of high proficiency LOAs corresponding to scheme \(m_5\) already at intermediate stages of evolution and already for intermediate explanans with \(\epsilon > 0.3\). This shows that the appearance of states due to entanglement may help a very effective shift from lower level explanatory schemes to higher level ones, if social learning facilitates this shift.

The interplay of entanglement and diversity in social learning is, however, rather complicated. When the final, nearly stabilised LOAs corresponding to \(\tau \) = 1 are examined, as shown in Fig. 4, we observe that while diversity \(\sigma =0.10\) again clearly leads to better learning outcomes for entangled (B and C) than for non-entangled landscapes (A), this is no longer the case when diversity is high, with \(\sigma =0.14\). Now, again, we see that the non-entangled model A gives rise to sharply defined high proficiency LOAs for \(m_5\) and for \(m_4\). In fact, if learning outcomes only for \(\epsilon <0.7\) are examined (corresponding roughly to tasks I-III and omitting task IV), it appears that A outperforms B and C. The advantages of the entangled condition contained in B and C become evident only for \(\epsilon >0.7\), i.e. only when complex enough tasks are involved.

In all cases we observe the strong presence of an LOA corresponding to \(m_3\). This LOA is present and clearly visible also at the final stages for high diversity cases. Interestingly, the LOA corresponding to \(m_3\) persists even when the LOA corresponding to \(m_4\) begins to diminish, apparently feeding the LOA corresponding to \(m_5\). This phenomenon, where at the final stages and especially for high diversity the LOAs of \(m_3\) and \(m_5\) are the most persistent ones, shows that symmetric, non-biased and zero-average social learning combined with non-biased, zero-average cognitive learning leads to the polarisation of learning outcomes. Many agents successfully reach the highest, high proficiency LOAs, but some agents remain stuck at low LOAs with low proficiencies. This is, of course, an outcome of the bounded confidence type model, in which certain proficiency states, once formed, remain stable. This phenomenon corresponds to the resistance of learners to giving up their strong adherence to certain low level explanatory schemes, irrespective of the fact that these schemes explain only a part of the observations encountered in the given task (see e.g. [4, 5]).

To get a more comprehensive picture of how social learning and differently entangled epistemic landscapes affect the agents’ learning, we need to condense the information contained in Figs. 2, 3 and 4. For this purpose, we use the total number density \(N_k\) of adoption of scheme \(m_k\). The total number density \(N_k\) compresses the information of how a given explanatory scheme is learned into a single number, but no longer provides the information on explanans and proficiency contained in the LOAs. The results for different diversities from \(\sigma =0.08\) to 0.18, for landscapes A, B and C and for \(\tau \in [0,1]\), are shown in Fig. 5.

Fig. 5. The total number density \(N_k\) for adoption of schemes \(m_k\) for epistemic landscape A (no entanglement), B (entangled with \(\lambda =3\)) and C (entangled with \(\lambda =5\)), from left to right. Total number densities \(N_k\) for schemes \(m_k\) are shown as: \(N_{1}\) (orange), \(N_{2}\) (blue), \(N_{3}\) (green), \(N_{4}\) (purple) and \(N_{5}\) (red). Results are shown for the complete evolution over \(\tau \in [0,1]\) and for diversity \(\sigma =0.08, 0.10, 0.14\) and 0.18. (Color figure online)

The results in Fig. 5 show that if one focuses only on simple tasks (tasks I-III, corresponding roughly to \(\epsilon < 0.7\)) with low and moderate diversity (\(\sigma =\) 0.08 and 0.10), the non-entangled landscape A produces the best learning outcomes and scheme \(m_3\) is rapidly learned. Only when the task becomes more demanding (\(\epsilon >0.7\)), or when diversity increases (\(\sigma >0.10\)), do the entangled landscapes B and C become more advantageous for learning. On the other hand, the best learning outcomes are reached for entangled landscapes and high diversity \(\sigma > 0.14\). In examining the results, it should be borne in mind that in all cases the memory \(\mu \) of cognitive learning and the strength \(\gamma \) of social learning have been kept constant. Also, the variation in probability mass is kept unbiased, with zero-average variation, in the same way as the tolerance to diversity in social learning, which is also unbiased with zero-average variation.

The learning models of cognitive and social learning which produce these results are highly idealised, but, nevertheless, they show how delicately the learning outcomes depend on the interplay between the task (as described by epistemic landscapes) and the peer-to-peer interactions contained in social learning, and on the extent and duration (parameter \(\tau \)) of exploration of the possible explanations (parameter \(\epsilon \)). It is evident that all the factors discussed here (cognitive learning, entanglement of different explanatory schemes, social learning, and tolerance to diversity) affect the learning outcomes not only separately and independently but together as a whole system.

4 Discussion and Conclusions

The process of learning scientific knowledge from the dynamic systems viewpoint [4] is here discussed in terms of the Probabilistic Learning Model (PLM) for the cognitive effects of learning and in terms of a social learning model (SLM) for the effects of social interactions in learning [10]. The model of learning based on PLM and SLM is a sociocognitive model of learning, which considers some very primary features of a student’s learning process on the levels of individual cognition and the sociodynamics of learning. The model is an idealised description of the learning process, based on the assumptions that: (1) the teaching-learning sequence can be described as an epistemic landscape; (2) the only relevant cognitive property that characterises the learner and changes during the teaching-learning sequence is the learner’s proficiency in using the given scheme, enhanced/weakened by success/failure; and (3) social interaction either increases or decreases proficiency independently of cognitive abilities. In the present model, social learning is thus indirect instead of being a direct transfer of knowledge. This agrees with the view that social learning very often seems to operate through the indirect effect of increasing the learners’ self-efficacy and their feelings of competence [6, 7].

The model of the knowledge system studied here is a tiered system of explanatory schemes, which is a generic description of certain well-known, empirically studied learning situations (see e.g. [4, 5] and references therein). Sociocognitive dynamics, as it is implemented in the model, leads to the formation of dynamically robust preferences for certain explanatory schemes, which explain a set of evidence contained in the learning task designed to facilitate targeted learning. Adopting and using such explanatory schemes requires appropriate proficiency from a learner. Thus, each explanatory scheme is characterised by what it explains (explanans) and by the proficiency it requires from a learner. Robust learning outcomes can then be conceptualised as Learning Outcome Attractors (LOAs) corresponding to these schemes, located in the space spanned by explanans and proficiency. These LOAs are essentially outcomes of the interplay between the design of the learning task, learners’ cognitive dynamics and social dynamics.

The development and implementation of the model share many similarities with decision models and opinion dynamics models. The model is basically an agent-based model (ABM), where agents have an internal state characterised by the adopted explanatory scheme and proficiency. Both these features evolve during the simulations. The selection of interaction between agents is based on a bounded confidence type model. A similar type of social interaction has recently been proposed for modelling social interaction in task-centred collaboration [14]. The current model, in its use of epistemic landscapes, also has many similarities with simulation models designed for discovery of knowledge [8, 9].

In the present model, the effects of social learning can account for a considerable part of successful learning and be comparable to cases where the memory effects (i.e. cognitive effects) in learning are high (see ref. [10]). Interestingly, even though the effect of social learning depends on the difference between proficiencies, with positive and negative changes having similar effects, and even though the probability of such events carries no bias towards positive effects, the effective outcome favours advancement in learning. This is due to the fact that the epistemic landscape itself, by design, is biased towards advancement; a positive change in proficiency matters more than an equally strong negative change, and a learning bias emerges. This, of course, is due to the fact that the learning tasks which the epistemic landscape is intended to model are deliberately designed to support learning. The tolerance to diversity of peers’ proficiencies thus allows learners to benefit occasionally from interaction with more competent peers, and such cases are magnified by the improved foraging capacity they produce; the occasional encounter with more competent peers opens new vistas for cognitive development. The social effect is thus not restricted to adoption of peers’ best choices but, more importantly, drives the agent’s own cognitive proficiency. This result, although based on an idealised model, suggests that tolerance to diversity of peers in social learning, when learning tasks are appropriately designed, is always beneficial for growth of proficiency, because it opens learning possibilities which may significantly enhance learning.

The results presented here also have implications for research into learning. In the picture presented here, learning appears as a dynamic, continuous change from one dynamically and contextually determined learning outcome to another, rather than a switch from one static, cognitively determined and context independent state to another. Research settings that aim to detect such continuous evolution and the context dependence of learning outcomes should pay attention to the effects of alternation between contexts and to how interleaving of different contexts affects the learning outcomes. However, the complexity of the situation makes the mapping of the model parameters onto empirically testable research settings challenging. While proficiency can be mapped to success in providing correct answers in the given task (see e.g. [4, 5]), the tolerance to diversity, although in principle possible to operationalise in empirically accessible form, would require novel types of reliable self-evaluation reports. Developing research settings which are appropriate for exploring rich variations of learning outcomes related to context dependence, and how it interacts with learning dynamics, therefore remains a challenge. In meeting such challenges and in advancing research on sociocognitive learning, agent-based simulations may prove to be an invaluable tool. One advantage of agent-based modelling as it is presented here is that its conceptualisation of learning is designed to be close to the conceptualisations of social and collaborative learning phenomena now emerging in educational research on learning and instruction (see e.g. [16] and references therein). Complementing such research with agent-based modelling may eventually open fruitful new ways to model learning phenomena and to find new empirical approaches for studying complex sociocognitive learning phenomena.