1 Introduction

Our paper occupies a new space between the learning sciences, educational research and interaction studies of in vivo classroom teaching and learning activities in higher education in the Asia-Pacific region. As the title of this volume suggests, higher education is currently being reshaped and directed along pathways that conform to the diktat of global neo-liberal capitalism and its economic and political ideology and praxis (see Smith, 2016, Chap. 16 in this volume). There is much talk in the learning sciences and in educational research about the ‘learning society’, ‘lifelong learning’, ‘learning relations’, and so on. Some scholars have now begun to examine these educational developments in relation to Michel Foucault’s later work on governmentality (Fejes & Nicoll, 2008). Governmentality refers to the array of rationalities, practices, technologies, and values through which people engage with institutions and their practices and, in doing so, take part in processes that produce particular forms of conduct, particular kinds of social relations, particular value systems, and particular forms of selfhood. The reforms of teaching and learning alluded to in the title of this volume can be understood in the context of recent neoliberal practices of governmentality in higher education. Such an understanding opens up the possibility of researching the nexus of socio-cultural, institutional, and individual factors currently working through higher education in order to constitute individuals as learning subjects. As other chapters in this volume show in various ways, the institutional practices of management, testing, assessment, and so on are embedded in a larger-scale socio-cultural matrix of cultural affordances, discourses, forms of knowledge, measurement and monitoring techniques, and learning technologies that are organized to produce the effects of a specific mode of social production: that of the learning self (Gu, 2016, Chap. 4 of this volume; Tran, 2016, Chap. 5 of this volume).

Learning is based on particular assessment regimes that seek to accredit students with competencies, knowledge and skills that students have attained (Joughin & Hughes, 2016, Chap. 15; Nakano, Ng, & Ueda, 2016, Chap. 17; Ng, 2016, Chap. 6, all in this volume). In this context, it is important to develop empirical research methodology and theory that yield understandings about how students learn in addition to what they learn. Much educational research, including, for example, learning analytics (Knight et al., 2014), uses macro-level theoretical constructs and analytical procedures that are not sensitive to the subtleties of real-time embodied interactivity between persons and the subtle ways in which agents attune to the learning situation. Our chapter, by contrast, is grounded in micro-analytical techniques that can yield insights about how real persons (learners and teachers) interact with the affordances of the learning situation in real time. Our approach does not argue for a one-size-fits-all methodology that can serve to analyse the many different learning situations that exist in higher education. We recognize that the development of the skills of scaffolding and self-scaffolding and the role of teaching in this development are very different across different domains (Fox, 2016, Chap. 8 in this volume). With reference to a detailed analysis of a single episode of learning, our more modest goal is to propose a new integration of theoretical perspectives and analytical methods and techniques that show how the real-time interactivity of learners and teachers with the affordances of the learning situation requires knowledge of how to recognize and avoid error (Bickhard, 2001). The dialectic of teaching and learning involves regulation of this interactivity. This is so in two senses: (1) (self)-regulation of the interactivity between teachers and learners and the learning environment; and (2) regulation of the processes of microgenetic construction of new learning.
Unlike recently predominant social constructivist approaches to classroom interaction that are founded on the socio-discursive construction of positive knowledge that is ‘encoded’ in conventional forms, our approach is a naturalistically grounded one that requires abandoning many of the assumptions and formalisms of these approaches so that real progress can be achieved in developing a process model of learning and teaching based on a naturalistic epistemology that addresses the open system dynamics in which learning and teaching take place.

From the moment they come out of the womb and begin the process of becoming persons (Ross, 2007), humans begin to participate in and to learn in Distributed Cognitive Systems, which are a distinctive hallmark of the extended human ecology (Steffensen, 2011; Thibault, 2011). Persons create and sustain their learning and teaching trajectories on the basis of complex, dialogically coordinated relational and affective dynamics that cannot be reduced to technical skills or to the properties and characteristics of the technologies used.

Drawing on recent theoretical developments in distributed cognition, distributed language, and multimodal interactivity, we undertake a “thick” empirical description of video-recorded data from a pilot study of students’ interactivity in tutorials in conjunction with a course taught in the Faculty of Business and Economics at the University of Melbourne. Using the techniques of Multimodal Event Analysis (MMEA), we investigate how participants’ multimodal interactivity with the changing affordance arrays of these learning episodes is not only a form of action, but also a form of publicly enacted thinking when persons are coupled both to each other and to external resources as they engage in problem-solving and other cognitive tasks.

MMEA will be applied to the learning activities in which students and tutors participate, with a specific focus on those affordances and predispositions to learning which enable learners to select and focus on cognitively salient aspects of the task in ways that promote effective learning. MMEA yields valuable micro-analytical insights on how learners develop different trajectories of learning and different learning strategies and practices. In other words, ‘local’ and ‘global’ factors are integrated and oriented to in diverse ways by learners throughout the development of their learning trajectories.

Our main focus is on how culturally saturated interactivity and its effective utilisation in the classroom guide and shape learning. Interactivity is not the same as “interaction”, as commonly understood in discourse-analytical and social interaction approaches. Interaction tends to rely on theoretical abstracta like the exchange of “shared” meanings between persons, shared codes, and abstract systems that mediate and make possible interaction. Interactivity is more concrete: it is situated and embodied. It affords the manipulation and reshaping of the learning task through very natural, intuitive ways in which our bodies engage with material affordances, artefacts, and tools in the physical and social environments. Activities such as touching, moving, pointing, visual scanning, talking, writing, reading, and auditory prompts and cues of various kinds are coordinated in our interactive engagements with technologies of learning. In other words, learning is grounded in and extends the natural interactivity of human bodies, i.e. our natural sense of ‘being there’ (Clark, 1997). By the same token, interactivity and therefore learning in the here-and-now is constrained and enabled by non-local and hence virtual cultural resources deriving from cultural-historical traditions that can be evoked in situated interactivity and which perfuse it with meaning and sense (Thibault, 2011, 2012).

We therefore place the emphasis on the culturally saturated nature of human interactivity and on the learning trajectories that participants co-construct through their dialogically coordinated interactivity. Much more is at stake here than issues of managerial efficiency or cost effectiveness. Instead, we seek to show how interactivity can enhance our theoretical understanding of human learning and teaching. In doing so, we will emphasise that learning, which is ubiquitous in human interactivity, is a values-realizing mode of behaviour (Hodges, 2007a, 2007b). This presumes a heterarchy of diverse and shifting values that shape and guide teaching and learning along their trajectories, rather than hierarchically ordered goal states that pre-determine the learning trajectory in top-down fashion.

2 Distributed Cognition and Learning

Learning is a context-sensitive and adaptive process in which the learner must solve problems that are often ill-defined or underspecified. The learner must therefore engage in ongoing processes of interactivity with the learning environment that provide the learner with information which it can use to modify its own future interactions. It is in this way that learners progressively hone and refine initial, poorly defined problem spaces into ones with enough structure to guide the construction of a solution. Learning involves processes that both provide heuristic action guidance and improve the learning system’s capacity to guide action (Christensen & Hooker, 2000). Heuristic action guidance (‘scaffolding’) may, crucially, involve dialogically coordinated processes whereby one agent provides heuristic guidance to another agent’s learning activity. However, learning is also a self-directed process whereby learners generate “high order anticipative structure that improve self-direction” (Christensen & Hooker, 2000, p. 7).

Our account of learning thus focuses on the interactive processes that shape learning and the heuristic guidance of learning. A core principle that informs our discussion is the ecological and situated embeddedness of learning: learners and their learning are not independent of their environment. Instead, the learner’s brain is embedded in an interacting body and this body-brain system in turn is embedded in a complex, culturally saturated environment. Learning is a dynamic, time-extended and organised mode of interactivity in a complex environment. Learners are embodied agents who must learn to harness and deploy their bodily capacities and interactive processes in order to achieve the goal of overall or global system autonomy. Autonomy, as Christensen and Hooker (2000, p. 9) argue, requires that all the system’s processes be interrelated in order to focus on the autonomy of the learning system as a whole.

Learning, like many cognitive processes, encompasses processes and activities of many different kinds. However, learning processes involving problem solving, interpretation, evaluation, decision-making, and so on, all seem to have a number of features in common. They are all dynamic and enactive processes that are not readily explainable in terms of mental states, representations, or static contents. Cognitive processes, according to classical models of cognition, are brain-bound and individual-centred processes: cognition takes place in the head of the individual. Clark’s (1997, 2008, 2013) Extended Mind Hypothesis (EMH) and Hutchins’s (1995a, 2014) theory of Distributed Cognition (DC) have challenged this view and articulated alternatives.

Clark’s extended mind thesis includes external cognitive resources so that the concept of mind is extended beyond the individual organism by its coupling to extra-somatic resources that enhance and upgrade cognitive performance, for example, by the use of digital technologies such as iPads, smart phones, computers, Google Earth, and so on.

Hutchins points out that the term Distributed Cognition does not refer to a kind of cognition, but is a perspective on all cognition: the working assumption is that “all instances of cognition can be seen as emerging from distributed processes” (2014, p. 36). The important and interesting question in this perspective is not whether cognition is distributed or not distributed, or whether it is sometimes distributed or always is (Hutchins, 2014). The question is rather, on any scale of investigation of cognitive processes, what the component processes of the cognitive system are, what the relations between them are, and how cognitive processes arise through the interactions among the components. Hutchins (1995a) showed how the cognition involved in navigating a boat into port is embedded in social institutions and practices without which the cognitive processes required to bring the boat safely into port could not take place. The technologies and artefacts with which persons couple in order to accomplish these cognitive processes may be seen as “external aids” (Luria, 1973, pp. 30–31) that provide situational support to internal processes of neural circuitry building or they may be seen as more fundamentally and deeply constitutive of these cognitive processes.

As we shall see below, it is a formidable challenge to model the different facets of the distributed relations that are involved. We identify three kinds of relations that are relevant to this goal: (1) how the organisation of the learning system as a whole responds to and interrelates in a global way the various normative constraints on the system so that it attains and maintains in time its autonomy; (2) the organisation of the various component processes that form a Distributed Cognitive System; and (3) the creation and maintenance of a coherent self-directed learning/action trajectory in response to multiple constraints and across multiple timescales. Consider for example a University tutorial setting in which a student is required to solve a mathematical problem using regression analysis. In this thought experiment, the following hypothetical scenarios may be entertained:

  1. The student alone solves the problem ‘in her head’;

  2. The student solves the problem in concert with the heuristic guidance provided by other members of the tutorial group (tutor, fellow students);

  3. The students together solve the problem without any explicit guidance from the tutor.

Let us take Scenario 1. In this scenario, our brilliant student only apparently performs all of the needed intellectual work in her head. She would not be able to perform the cognitive task without accessing and interacting with the mathematical and linguistic tools and practices that exist in some sense ‘out there’ in the culture or in some functional subcomponent of the culture where these resources, the expertise for using them, and the practices in which this expertise is embedded are located, stored, maintained and revised over time in the form of the texts, technologies, institutional knowledge and practices, and expertise that together constitute a body of academic knowledge, its history and traditions.

A University tutorial setting is a specific, albeit selective, embodiment or actualization on a particular occasion of these institutionalised meanings and practices. No one, whether student or expert, can think up all of this on his or her own. Instead, individuals neurally and bodily couple to and interact with meanings and procedures that derive from the longer, slower time scales of the academic discipline, the culture, including its traditions of literacy and numeracy, and the norms and values associated with these. Following Hutchins on this point, we would say that the cognition emerges through the coupling of these time scales that is enacted in and through the student’s problem solving activity. The student interacts with and is entrained to the dynamics of the cultural affordances of her ecosocial niche, defined as the array of affordances that constitute her world (Thibault, 2014). The ability to go solo in the solving of a complex problem is the outcome of a long apprenticeship in the development of the ability to entrain, through what we refer to as ‘deliberate practice’, one’s own neural and bodily dynamics to the dynamics of increasingly complex scales that extend cognition into the realm of the virtual cultural entities created by language, mathematics, and other second-order cultural constructs in increasingly distal realms beyond here-and-now interactivity.

Scenario 2 shows a different kind of distribution of cognitive dynamics. The student concerts her own learning with that of her fellow students as well as the tutor, who all provide different kinds of heuristic guidance. There is dialogically coordinated interactivity between, for example, the student and the tutor. The tutor prepared his lesson a week earlier in a discussion group with his fellow tutors. The questions and answers were prepared in advance by the course lecturer and made available as text on the worksheet which the students and tutor handle and refer to during the tutorial. Moreover, the questions (and answers) on the worksheet draw upon well-defined models and have a well-defined location in the learning topologies.

There is in this scenario a very different distribution of component processes across scales. In this case, the participants in the tutorial engage in here-and-now interactivity both with each other and with relevant artefacts in the situation (the text of the worksheet, writing on the whiteboard). The participants must adjust and entrain their real-time bodily and neural dynamics to each other and to the artefacts they couple with at the same time that they also learn to orient to, to entrain to and thus to anticipate the dynamics of increasingly distal time scales as the learning objects emerge through a microgenetic process of small-scale selection and variation on established patterns (Bickhard & Campbell, 1996). In the situation, it is the tutor who is best able to locate the current learning in relation to successful prior learning constructions as well as to evaluate how near or far the student’s current efforts are with respect to those prior constructions in the overall learning topology. These prior products of successful learning are collective entities that are now the constitutive elements of a system of institutionalised cognitive processes and products that are stored and maintained in collective social practices, texts, digital technologies, and highly specialised semantic, visual, mathematical and other patterns and relations characteristic of the domain-specific functional processes of a highly specialised socio-cognitive domain.

The two scenarios described here show very different distributions of components and processes. And yet, we saw that it is entirely feasible to view both from the perspective of distributed cognition. What is interesting to us is the different kinds of relations among the components and how the learning arises from the interactions among these. We see too that in spite of appearances to the contrary, the student in the first scenario is unlikely to be doing cognition exclusively in the head. Instead, her enhanced abilities depend on and would be impossible without her entrainment to and self-scaffolding by the cultural dynamics of the collective cognitive processes and products that constitute the institutional history of a particular body of knowledge. Individuals couple with and learn to entrain their neural and bodily dynamics to the distal dynamics of these virtual cultural constructs as they become increasingly skilled practitioners in the high-order cognitive processes of the socio-cognitive domain in question.

3 Distributed Cognitive Systems, Multimodal Interactivity, and the Learner-Environment Interaction System

Humans live in and have constructed a unique extended ecology that is defined by our inter-connectedness: with other persons, with artefacts, with social institutions, with technologies. By means of these resources, humans integrate their activities to shared cultural patterns and in doing so they coordinate their activities across times and places. In recent years, both the biological and cognitive sciences have demonstrated an increased sensitivity to the fact that human learning and thinking are not purely private and internal processes of individuals. They reflect inter-individual dynamics that are shaped by human culture. This realisation has also cast doubt on the traditional academic distinction between ‘cognition’, seen as taking place within persons, and ‘communication’, seen as occurring between persons.

Perceiving, acting, thinking, learning, decision making, moving, doing things with artefacts, and language are all shaped by the norms and values of what Goffman (1983) called the interaction order. Moreover, perceiving, acting, thinking, etc. are not outcomes of exclusively individual-centred processes. Instead, they have the capacity to affect both other persons and aspects of their environments. What is often somewhat loosely called ‘communication’ is in fact a socially organised way of co-ordinating thinking, feeling, perception, and action between persons. On the traditional view, ‘cognition’ and ‘communication’ were seen as separate areas of study within very different research traditions. This view is now seen as less tenable. Cognition also routinely occurs between persons and between persons and their artefacts in culturally rich environments. Humans are born into and learn to exploit Distributed Cognitive Systems (DCSs). It is our participation in DCSs after birth that enables us to become persons. Humans are ‘ecologically special’ (Ross, 2007) precisely because of this fact. The human ecology depends on culturally saturated DCSs to a far greater extent than the ecologies of other species.

A DCS consists of a network of persons who interact with each other and with relevant artefacts and technologies in order to perform cognitive and learning tasks that could not be achieved by any of the components of the DCS on their own (Clark, 1997, 2008; Hutchins, 1995a). A DCS thus has cognitive properties that are irreducible to the properties of its component parts. Cognition is distributed between brains, bodies, and aspects of the physical, technological, and cultural worlds of persons (Clark, 1997).

A growing body of evidence shows that interactivity, not abstract symbol manipulation, internal representations or information processing centred on the internal mental processes of the individual, is the key to human learning and intelligence. Text-based literacies mediated by abstract social and semiotic codes have privileged pedagogies that abstract away from this basic fact. Humans learn best in situations that promote rich, culturally saturated interactivity when they engage with and manipulate external artefacts to solve learning tasks and cognitive problems in often complex settings such as aircraft cockpits, the interpretation of fMRI brain scans by brain scientists, and medical simulations involving senior doctors and trainee doctors (Alač & Hutchins, 2004; Clark, 1997, 2008; Hutchins, 1995a, 1995b, 2010; Kirsh, 1995a; Steffensen, Thibault, & Cowley, 2010).

Multimodal interactivity and the forms of coupling of agents to their environments that it enables are not reducible to low-level perceptual-motor skills, but are central to higher-order cognitive operations in complex environments requiring expert knowledge. Moreover, a long tradition of work in experimental psychology (Koffka, 1910, 1935; Luchins, 1942; Vallée-Tourangeau, Euden, & Hearn, 2011) shows that what Koffka (1910) identified as latente Einstellung (‘latent attitude’), or experience-based predispositions to learning, can influence learning negatively and therefore can guide learning in inefficient ways that delay (Kirsh, 1986) or frustrate desired outcomes (Kirsh, 1995a, 1995b; Vallée-Tourangeau et al., 2011). Building on Koffka’s insight, the experimental work of Luchins and Vallée-Tourangeau et al. points to the potential of interactivity to diminish negative predispositions towards learning.

Unlike learning based on text-based models or mental simulation inside the individual’s head, we conceptualise the Learner-Environment Interaction System (LEIS) as a rich, dynamical multimodal environment consisting of manipulable artefacts which provide a changing array of affordances and possibilities of perception and action. The affordances and possibilities of the LEIS attract and shape attention and action. In doing so, they constrain action, knowledge and cognition in ways that seem more likely to promote positive learning experiences and outcomes. Interactivity with artefacts, tools, and technologies in the physical and cultural environment of the LEIS enables learners to segment and identify the features of that environment so that they develop more effective learning strategies. Interactivity, through visual scanning, haptic manipulation and exploration, sound, and movement, enables learners to manipulate and re-organise the physical aspects of the learning task such that active exploration and manipulation of physical artefacts gives rise to new perceptions. In turn, these can transform the learning task.

As mentioned above, activity is not directed to a single goal that determines the activity (hierarchical model), but is embedded in and is constrained by multiple values that agents orient to and seek to realize (heterarchical model) (Hodges, 2007a). In the distributed view, there are multiple organising principles around activity, not just task or goal. The learning task thus becomes a changing, dynamical multimodal configuration or affordance layout that reveals new affordances during the learner’s time-extended interactions with the learning task. In this sense, interactivity involves sense-saturated coordination that contributes to human action, cognition and learning (Kirsh, 1997; Steffensen, 2013; Thibault, 2011, 2014; Vallée-Tourangeau et al., 2011).

According to the classical view, interactivity consists of a sequential unfolding of static state transitions: action > reaction > action (see Kirsh, 1997 for critical discussion). The agent acts on the environment, the environment reacts and the agent acts again in response to the reaction from the environment. On this view, the agent formulates a desired goal state and acts to obtain the desired goal. A more dynamical conception of this view is also possible. According to the dynamical view, action is a continuous response to feedback from the environment. Moreover, the environment is seen as external to the agent who acts on it. Both the static and dynamic views are based on transitions between static states. However, interactivity is much richer and more complex than this. Interactivity is best characterised as follows:

  1. Interactivity involves multiple functions, not only pragmatic ones, including (a) exploratory activity that generates information; (b) the continual re-structuring of the affordance layout through this activity; (c) probing the environment for solutions (Cowley & Nash, 2013); (d) epistemic action (Kirsh & Maglio, 1994); (e) heuristic guidance; (f) guiding and shaping attention; (g) coordinating with resources/affordances; (h) creating and responding to reminders (Kirsh, 1997); (i) maintaining the environment in an optimum condition; (j) anticipating the future development of the trajectory; (k) responding and adjusting to normative signals;

  2. Interactivity is guided and shaped by a heterarchy of multiple and fluctuating values that continuously modulate its trajectory rather than aiming for a single final goal state as in a command hierarchy;

  3. Interactivity is the sense-saturated coordination of agents both with each other and with the affordances and artefacts of their social and cultural worlds (Steffensen, 2011);

  4. Interactivity couples agents and environment in a unified Agent/Learner-Environment Interaction System rather than seeing the environment as external to the agent: the environment is in part the outcome of agents’ interactivity and is not therefore independent of either the agent or the interactivity that couples agent and environment in the LEIS;

  5. The coupling of agents to their environments is spread across a diversity of time scales;

  6. Interactivity prompts and shapes agents’ learning: agents learn to exploit and be guided by the dynamics of their interactivity.

4 Interactivity, Microgenesis and Learning: Multimodal Event Analysis (MMEA)

In the episode to be analysed below, three students participate in a problem-solving exercise together with their tutor. The problem is a discrete probability distribution that includes Poisson and hypergeometric probability distribution components. The students are working on Questions a. and b. in the worksheet (see Appendix 1 for these two questions and Appendix 2 for the solutions).
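For readers less familiar with the two distribution families named here, the following minimal Python sketch shows how their probability mass functions can be computed. It is an illustration only: the function names and all numerical values are assumptions chosen for the example and are not taken from the worksheet questions in Appendix 1.

```python
from math import comb, exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean `lam`."""
    return exp(-lam) * lam**k / factorial(k)


def hypergeom_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(X = k) successes in `draws` draws made without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


# Hypothetical parameter values, NOT those of the tutorial worksheet.
lam = 2.0
p_none = poisson_pmf(0, lam)          # probability of zero events
p_at_least_one = 1.0 - p_none         # complement rule

# Hypothetical: 10 items, 3 of interest, 2 drawn, exactly 1 of interest.
p_one_of_two = hypergeom_pmf(1, population=10, successes=3, draws=2)

print(round(p_at_least_one, 4), round(p_one_of_two, 4))
```

The Poisson function models counts of events at a fixed average rate, whereas the hypergeometric function models sampling without replacement from a finite population; a question that combines the two requires the learner to decide which model fits which part of the task.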

Following the seminal work on extended mind and distributed cognition by Clark (1997, 2008, 2013) and Hutchins (1995a, 1995b, 2010), respectively, analysts have tended to stress the individual problem solver and his or her interactivity with environmental affordances such as tools, technologies, artefacts, and so on. However, problem-solving is also very often a dialogically coordinated form of interactivity involving other persons in addition to the technological and artefactual character of the external affordances that have the capacity to extend human cognitive processes beyond the individual person (Cowley & Nash, 2013; Thibault, 2011). In ways that are clearly crucial to teaching and learning, this is also true of the other persons with whom the learner interacts in the learning environment.

Three students and the tutor participate in the problem-solving exercise. Two of the students talk whilst the third remains silent except for a brief exchange with the tutor at the end. Student 1 in the transcription below is the student who is charged with responding to Question b above. Our analysis is of the transcribed episode presented in Figs. 9.1, 9.2, 9.3 and 9.4.

Fig. 9.1 Multimodal event analysis: phase 1: the problem

Fig. 9.2 Multimodal event analysis: phase 2: exploring the problem space

Fig. 9.3 Multimodal event analysis: phase 3: insight dawns

Fig. 9.4 Multimodal event analysis: phase 4: the wrap up

4.1 The Transcription

The transcription featured in Figs. 9.1, 9.2, 9.3 and 9.4 is of a 02.42.000 s. sample of an extended learning trajectory that was video-recorded. Timings were obtained by means of the multimodal language analysis program Elan 4.1.2. The transcribed episode begins 38 s. after the beginning of the video recording. The start time of each Phase according to Elan is shown in the left-most column of the transcription. Other time details are indicated when required. These too are from Elan. In the transcription, an utterance unit designates a single pulse of synchronized bodily activity that is coordinated with other persons or with other aspects of the situation. In the transcription, a line refers to a complete utterance unit and may in fact extend over more than one line of printed text. An utterance unit is of variable duration and includes some synchronization of variables such as body movement, deictic points, gaze, gesture, head nods, posture, and speech, taken as a single unit of whole-body sense-making. The verbal component of an utterance unit in any given line is indicated in bold italics. Other bodily events (gaze, gesture, etc.) are indicated in normal font. The use of the ‘+’ sign indicates that one event is concurrent with some other in the same utterance unit. The numbered lines refer to the utterance units of the Tutor and Students 1, 2 and 3. In the transcription, the following abbreviations are used: S1 = Student 1; S2 = Student 2; S3 = Student 3; T = Tutor; WB = white board; WS = work sheet. The use of square brackets […] serves to indicate that the utterance activity of one participant is concurrent with and/or overlapping with the utterance activity of some other participant. In such cases, the concurrent utterance is placed in the same line of the transcription.
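One way to make these transcription conventions concrete is to treat each utterance unit as a small record. The Python sketch below is offered only as an illustration under stated assumptions: the class name, field names, rendering format, and sample content are hypothetical and are not part of the Elan annotation or of the transcription in Figs. 9.1, 9.2, 9.3 and 9.4.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Abbreviations as listed in the transcription conventions.
ABBREVIATIONS = {
    "S1": "Student 1", "S2": "Student 2", "S3": "Student 3",
    "T": "Tutor", "WB": "white board", "WS": "work sheet",
}


@dataclass
class UtteranceUnit:
    """Hypothetical record for one numbered line of the transcription."""
    line: int
    participant: str                      # "S1", "S2", "S3" or "T"
    verbal: str = ""                      # bold italics in the transcription
    bodily: List[str] = field(default_factory=list)   # gaze, gesture, etc.
    concurrent: Optional["UtteranceUnit"] = None       # overlapping unit

    def events(self) -> str:
        # '+' marks events that are concurrent within one utterance unit.
        parts = ([self.verbal] if self.verbal else []) + self.bodily
        return " + ".join(parts)

    def render(self) -> str:
        text = f"{self.line}. {self.participant}: {self.events()}"
        if self.concurrent is not None:
            # Square brackets mark overlap with another participant,
            # placed in the same line of the transcription.
            other = self.concurrent
            text += f" [{other.participant}: {other.events()}]"
        return text


# Invented sample content, NOT taken from the recorded episode.
tutor = UtteranceUnit(line=1, participant="T", bodily=["head nod"])
unit = UtteranceUnit(
    line=1, participant="S1", verbal="(speech)",
    bodily=["gaze at WB", "deictic point to WB"], concurrent=tutor,
)
print(unit.render())
```

Keeping speech and bodily events in one record reflects the definition of the utterance unit as a single unit of whole-body sense-making, rather than treating gesture and gaze as annotations added to speech.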

Each line of the transcription is correlated with the MMEA displayed in Figs. 9.1, 9.2, 9.3, and 9.4. Figures 9.1, 9.2, 9.3, and 9.4 are stills from the video recording annotated with reference to each student’s utterance activity. In the detailed analysis in Sect. 9.5, specific events are cross-referenced with reference to both the line in which they occur in the transcription and the relevant Figure of the MMEA. For example, ‘Line 1, Fig. 9.1’ refers to Line 1 of the orthographic transcription and to the MMEA shown in Fig. 9.1. Figures 9.1 and 9.2 additionally show screen shots of text written on the whiteboard concurrently with the activity referred to in that Figure.

The four participants are in a semi-enclosed area that is partitioned off from another group of students with which the same tutor is working concurrently. The male Tutor and Students 2 and 3 are seated while Student 1, who is the main problem-solver in the transcribed episode, is standing before the whiteboard. Student 2 is seated to the Tutor’s immediate left. Student 3 is seated further to the Tutor’s left in the foreground of the screen shots.

The interactive event is embedded in, coupled to, and also co-constructs a local micro-ecology consisting of ecologically salient architectural spaces, objects, surfaces, etc. and what these afford the participants (Thibault, 2008, pp. 318–320). For reasons of space, we cannot discuss this in detail (see Thibault, 2008, pp. 318–320, for a detailed account). The physical environment of the learning situation is itself a mediator and enabler of the interactivity; it is not a neutral physical setting. The physical environment is saturated with cultural meanings and values. Surfaces such as the whiteboard, objects such as pens, the handheld worksheets, the seating arrangements and the locations of the tutor and students all play their role in affording certain kinds of interactive relations between the participants and the physical environment in which the episode takes place.

5 Learning as Microgenetic Constructive Process

Learning is a form of growth (Brown, 2005, p. 206) that occurs in particular spatio-temporal circumstances. Our focus on interactivity has the potential to show that learning is dependent on the microgenetic individuation of acts of cognition. Microgenesis is a micro-level constructive process on small time scales; it is constructive in the sense that it sets up conditions in the system (e.g., the learner) that did not previously exist (Bickhard & Campbell, 1996, p. 129). Microgenesis is a continuous and ubiquitous feature of the system: it happens whenever the system functions (Bickhard & Campbell, 1996, p. 129), for example, during the system’s real-time interactivity with its environment. Microgenetic processes are small-scale shifts in the manner of the system’s function that set up new constructions. Learning is dependent on prior microgenetic construction of the system and modifies it (Bickhard & Campbell, 1996, p. 129; Smith, 1991, p. 205).

The term ‘microgenesis’ was first proposed by Werner (1956, 1957). However, the concept goes back to the work of Sander in the 1920s and 1930s (Sander, 1932). The term Aktualgenese (‘genetic actualization’), from which Werner derived the term ‘microgenesis’, referred to the developmental unfolding, or actualization, on very small time scales of a thought, percept, action, or utterance. This entails a process of differentiation across diverse levels or strata of organisation as the initial vague potential is dynamically unfolded as a fully actualized form. Microgenesis is a dynamical process that refers to the development on a brief time scale of a percept, thought, gesture, vocalisation, etc. It is a developmental process in which the final outcome of an experience, the finished product, is already embodied in the early stages of its development. The final product of experience is thematised as a ‘figure’ that is developed and stabilised through dynamic processes of unfolding and differentiation. As Rosenthal (2004, p. 222) points out, microgenesis takes place in relation to a thematic field that is given from the outset no matter how poorly defined or undifferentiated it may be. Microgenesis is, then, a movement from potential to actual, a process of actualization across different strata of neural and bodily organisation, rather than a sequential chain of causes and effects (Brown, 2005, pp. 222–224). This transformative and distributed movement across levels of neural and bodily organisation has emergent properties: antecedent stages of the final product, beneath the surface, leave their trace in the final product and actively shape it (Brown, 1991, p. 57).

In our analysis, students and tutor together engage in a process of microgenetic schema construction (Werner & Kaplan, 1984/1963). For narrative purposes, our analysis divides the transcribed episode into four macro-phases, as follows: (1) The problem; (2) Exploring the problem space; (3) Insight dawns; and (4) The wrap up. The detailed analysis of these four macro-phases with reference to the transcription in Figs. 9.1, 9.2, 9.3 and 9.4 now follows.

5.1 Phase 1: The Problem

In Fig. 9.1, the Tutor invokes the course lecturer’s pre-prepared written answers and on that basis invites Student 1 to derive the required explanation. The Tutor’s invocation of the course lecturer’s answers grounds the new problem-solving trial that the student is about to engage with in the context of a previous and already successful learning dynamics. The Tutor thus prompts Student 1 to locate the new problem-solving heuristic in relation to the old learning dynamic so that both the new trial and the prior one are located within the same overall learning topology. The student is required to attempt a new trial that is seen as topologically near to the previously successful one. Learning can be seen here as the setting up of new trial dynamics on the basis of a heuristically guided rather than blind variation-and-selection dynamic. It is therefore important that the previously tried and successful dynamic is readily retrievable from the same overall learning topology so that it can guide the microgenetic process of new learning (Bickhard & Campbell, 1996, p. 143). New learning trials are variations on old, successful ones.

In Line 5, Fig. 9.1, the Tutor’s utterance ‘is there maybe the remember how he wrote all his answers there’s a way to better explain that’ locates the new trial dynamic that Student 1 is about to engage in close to the old and successful microgenesis of the course lecturer’s answers. The mental process verb ‘remember’ (Footnote 1) in the Tutor’s imperative utterance functions to evoke an explicit memory of the earlier process and to redintegrate it to the current learning situation. Moreover, the attributive clause ‘there’s a better way to explain that’ articulates a positive evaluation of the lecturer’s answers and thereby functions as a selective mechanism that stabilizes the old (previous) dynamic as one which was successful and worth retaining. The Tutor’s comments do not add new content. Instead, they serve to locate the new trial as being near to the old, successful learning construction; the new trial is seen as being topologically close to the old one in the overall learning space. The old trial is thus established by the Tutor as the background context in relation to which the new learning trial must be developed and differentiated. The trick then is to access the information available in the prior problem-solving heuristic and to make it functionally available for the new trial.

5.2 Phase 2: Exploring the Problem Space

Student 1’s response (Line 7, Fig. 9.2) begins with a long apparent hesitation in the form of the syllable ‘ah’ that is characterised by a pronounced and continuous slide in pitch starting at 300.2 Hz and moving to a low of 209.5 Hz and lasting 1.300 s. She then utters the phrase ‘the model has’, which continues the downward pitch movement from 234.4 Hz to 195 Hz. This part of the utterance lasts 1.460 s. Figure 9.5 shows the Praat analysis of the vocal dynamics described here.

Fig. 9.5

Praat analysis of Student 1’s vocalization: ‘ah the model’; temporal extent of S1’s utterance indicated by the blue arrow
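The pitch movement reported for the utterance in Line 7, Fig. 9.2 can be restated in perceptually more comparable terms. The short Python sketch below is an illustration added for that purpose and is not the Praat procedure used in the study: the function names are hypothetical, the semitone conversion is the standard 12 × log2 frequency ratio, and the slope assumes, as a simplification, a steady glide between the reported start and end frequencies.

```python
import math


def semitone_change(f_start_hz: float, f_end_hz: float) -> float:
    """Pitch interval in semitones; negative values indicate a fall."""
    return 12.0 * math.log2(f_end_hz / f_start_hz)


def mean_slope_hz_per_s(f_start_hz: float, f_end_hz: float, duration_s: float) -> float:
    """Average rate of pitch change, assuming a steady glide (a simplification)."""
    return (f_end_hz - f_start_hz) / duration_s


# Start frequency, end frequency (Hz) and duration (s) as reported in the text.
segments = {
    "ah": (300.2, 209.5, 1.300),
    "the model has": (234.4, 195.0, 1.460),
}

for label, (f0, f1, dur) in segments.items():
    print(f"{label!r}: {semitone_change(f0, f1):.2f} semitones, "
          f"{mean_slope_hz_per_s(f0, f1, dur):.1f} Hz/s")
```

On the reported figures, the fall on ‘ah’ amounts to roughly six semitones and the fall on ‘the model has’ to roughly three, which is consistent with the description of a pronounced and continuing downward slide.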

Overall, this utterance, lasting a total of 2.766 s, is characterised by the slowing down of the speaker’s voice and the progressive fall in pitch noted above while her gaze is directed at the text she has just written on the white board (Fig. 9.1). At the end of this phase, Student 1 switches her gaze to the question sheet (Appendix 1) she is holding in her right hand. It would be easy to read the specific properties of this initial phase of Student 1’s response, as described here, as some kind of hesitation phenomenon while the student gathers her thoughts, so to speak. This view is not without merit. However, we further suggest that the student registers the indeterminate nature of her current attempt to correctly locate the new learning problem in the overall microgenetic learning space. Micro-temporal dynamical properties of vocalizations, body movements, gaze, gesture, etc. can thus be seen as the initial stage of a process of individuation of a temporally extended act of learning that is progressively unfolded in microgenesis.

An analysis which focuses exclusively on the verbal aspects of the student’s performance and thus ignores the micro-temporal or pico-scale (Footnote 2) bodily dynamics of the flow of cognitive activity effectively conceals the real nature of the learning process (Brown, 2005, p. 206). As Brown (2005) points out, this process is a morphogenetic one of the derivation of a cognitive act as it unfolds in microgenesis. We argue that at this initial stage of the unfolding microgenetic trajectory, the process is characterised by microgenetic indeterminacy or destabilization: the student is uncertain as to which way to direct the current microgenetic trajectory within the overall learning space.

Bickhard and Campbell (1996, p. 145) point out that microgenesis is a dynamical space in its own right. It is characterised by its own dynamics; processes of destabilisation and restabilisation are not extraneous to the microgenetic process, but are intrinsic to it. The stabilization of microgenetic construction is essential for learning to occur (Smith, 1991, p. 206). In turn, this leads to increasing automatization through the repetition with small variation of successful microgenetic constructions. These processes of stabilization, destabilization and automatization are the means by which microgenetic constructive change necessary for new learning occurs. Microgenetic destabilizations are regions of indeterminacy whereas stabilization corresponds to well-defined organisations of microgenetic process dynamics. Student 1 is engaged in the process of learning a new mathematical skill. The trick is to avoid those regions of the larger dynamic learning space of strategies and approaches that don’t work. The Tutor has already provided a vital clue (Line 5, Fig. 9.1) as to the correct region of the overall learning space that is relevant to the solving of the problem and, implicitly, of those regions to be avoided. This further implies that the Tutor has already learned those regions of intrinsic microgenetic instability that have the potential to constitute “implicit vicariant guides to the avoidance of constructive, microgenetic, error” (Bickhard, 2001, p. 205).

The current learning trial is, for the student, a differentiation of microgenetic process out of more holistic and undifferentiated processes. The explanatory power of “the model” is presented as a further differentiation of the lecturer’s answers that the Tutor invokes (Fig. 9.1). The Tutor’s follow-up question (Line 8, Fig. 9.2) ‘like what does what does .89 actually mean?’ has a monitoring function. The Tutor seeks to nudge the student towards a more stable region of the learning space; at the same time, the Tutor’s question is itself influenced by the student’s prior activity, such that the microgenetic dynamics of the Tutor’s monitoring question is influenced by the student’s microgenetic process. More specifically, the Tutor’s question differentiates one class of dynamics of the student’s monitored process from other possible classes and in ways that seek to stabilize the student’s destabilized microgenetic dynamics. That is, the Tutor’s question is a selection constraint that operates in favour of the stabilization of the microgenetic process and against its destabilization as the Tutor seeks to guide the student away from the possibility of error (destabilization) and hence towards the region of greater stability. The Tutor in effect catches the student before she moves into a region of greater microgenetic instability. This requires on the part of the Tutor sensitivity to the significance of the pico-scale bodily dynamics referred to above. In catching the student in this initial zone of instability, the Tutor nudges her towards a region of greater stability when he asks the student to attend to the meaning of “.89”.

In asking the student to consider the meaning of .89, the Tutor is providing a further piece of an emerging normative differentiation that potentially will enable the student to learn more about (1) what the problem is; and (2) how to solve the problem. Whereas (1) is a matter of construction, (2) is a matter of interaction. The Tutor’s question invites Student 1 to construct an anticipatory model of the interaction process (Christensen & Hooker, 2000, p. 20). Her response, ‘means there’s 89 percent the model … the model has an 89 per cent chance of accurately predicting the salary’ (Line 9, Fig. 9.2), constructs an anticipatory model of the interactive process of predicting the salary. The interaction process is concerned with the development of a specific prediction tool. However, Student 1’s grasp of the prediction tool at this point remains defective. The Tutor provides normative feedback (‘close’, Line 10, Fig. 9.2; and ‘89 % was on the right track’, Line 12, Fig. 9.2) that provides Student 1 with information that enables her to better hone and identify the problem in her own prediction technique. This information, in turn, enables the student to form and evaluate anticipations about the prediction process.

In Line 14 (‘89 % chance’), Student 1 takes a further step in the honing of the prediction technique. However, it is Student 2 who picks up on the anticipatory model construction process in Lines 13 and 16: is it … 89 % change in Y change in salary [Tutor: yep] change in explain Y change in salary [inaudible]. Student 2’s contribution further refines the anticipation model. Again, the normative feedback from the Tutor (‘yep’) shows that Student 2 further refines and more precisely defines the prediction technique. Student 2 has more effectively picked up on and responded to emergent context-sensitive clues that have arisen in the course of the interaction. Specifically, she has picked up on the significance of the Tutor’s question in Line 8 about the meaning of .89. Accordingly, her learning at this point is more self-directed than is Student 1’s. Her ability to pick up on context-sensitive information allows her to provide in Line 16 a more articulated profiling of the prediction technique. In turn, this changes the anticipatory modelling, which in turn induces a change in the information that the learner becomes sensitive to.

The generation of a temporally extended learning trajectory also entails the anticipatory modelling of the future development of that trajectory. It is important to be able to anticipatively modulate the interaction flow so that the learning trajectory and its future development are coordinated with appropriate learning outcomes (Christensen & Hooker, 2000). The consciously accessible products of the microgenetic process thus constitute interactive affordances that serve to guide and modulate the further development of the learning trajectory because of their capacity to indicate possibilities of further interaction in that environment. The Tutor’s head nods, his laconic ‘yep’ and ‘yeah’, his hand gestures, and his extended comment at the end provide evaluative feedback as to the success of Student 2’s learning trajectory. Specifically, they provide normative evaluators that indicate success or failure and thus provide the student’s learning with feedback that enables her to adjust and direct the trajectory more effectively in order that the learner can stay adaptive. Importantly, these evaluators have the potential to bring about changes in the learner that in turn change the way the learner will interact with the relevant environment. The learner thus learns to track a complex matrix of interaction processes and environmental organisation that is spread across diverse time and place scales and is, moreover, continuously evaluated by a heterarchy of norms and values (Hodges, 2007a; see Sect. 9.6).

The exchange between Student 2 and the Tutor is nested within Student 1’s attempt to solve the problem. Table 9.1 sets out the pico-scale bodily dynamics together with timing of the exchange between Student 2 and the Tutor.

Table 9.1 Pico-scale analysis of Student 2’s microgenetic construction in Phase 2

The close synchronization of the pico-scale bodily dynamics of the two speakers again shows the importance of fine-grained context-sensitive information that is not amenable to discourse-analytical techniques. The Tutor’s rapid series of head nods together with ‘yep’ constitute a prosody that responds to and is synchronous with the entire duration of Student 2’s utterance in Line 16.

As noted above, learning problems are not always clearly defined. Learning systems are often required to transform vague problems into more specific ones (Christensen & Hooker, 2000, p. 31). It is important, therefore, to understand how people learn things that, initially, lack clear definition. An explanation based in algorithms cannot provide a satisfactory resolution of this problem because of the explicit, encoded nature of the problem and solution states in such accounts (see above). The resolution of this problem lies in showing how microgenetic constructive processes effect and enable the transformation from initially vague, ill-defined problem spaces to more specified, better-defined ones. Microgenesis shows how learning and hence cognitive capacity progressively emerge as the learning system exploits the interactivity of the learner-environment system to enhance and increase its differentiation-making powers and to enhance its capacity to adapt through its interactivity. The interactivity between learner and environment is the driver and shaper of cognitive processes.

5.3 Phase 3: The Dawning of Insight

Phase 3 is characterised by the key insight that a change in the x variables is a crucial part of the solution to the problem. Phase 3 begins with Student 1’s utterance so what? (Line 20, Fig. 9.3) as she shifts her attention from the Tutor back to the white board and points to the text of Question B1 written there. In doing so, she constructs a link between what the Tutor said in Line 19, Fig. 9.2 and Question B1 previously written on the white board. Student 1’s utterance prompts the Tutor in Line 21, Fig. 9.3 to expand on the notion of ‘change in variability’, which he had introduced in Line 19, Fig. 9.2. In Line 21, Fig. 9.3, the Tutor introduces two crucial elements: 89 % of the variability and I don’t think change is necessarily wrong. The second of these two elements is explicitly normative. The Tutor builds a link to Student 2’s attempts to formulate the role of change (Line 16, Fig. 9.2). In Line 19, Fig. 9.2, he builds on Student 2’s utterance when he adds the crucial factor change in variability.

The normative element that is articulated in Line 21, Fig. 9.3 is presented as a personal opinion of the Tutor by its framing clause (I don’t think …) that frames the clausal proposition change is not necessarily wrong. The normativity of the Tutor’s utterance serves to focus on the ensuing flow of the interaction. It constitutes a microgenetic anticipation of future interactive flow (Bickhard & Campbell, 1996) by normatively anticipating and hence constraining the possible future development of the students’ learning trajectories. The point is that microgenetic anticipation (the setting up of the local conditions for the further development of the interaction) can be correct or incorrect, true or false, right or wrong, etc. (Bickhard & Campbell, 1996).

Lines 22–29, Fig. 9.3 are a direct outcome of this set up. Students 1 and 2 concur with the Tutor in Line 22, Fig. 9.3 with their near simultaneous uttering of yeah. Both signal that they understand the normative implications of the Tutor’s prior statement. Lines 22–29, Fig. 9.3 constitute, in our view, a process of successful microgenetic consolidation of the new thematic content introduced by the Tutor in Line 21, Fig. 9.3. As the transcription reveals, Lines 22–29, Fig. 9.3 illustrate how Student 2 and the Tutor jointly articulate small fragments of and variations on this new material, which unfolds as a choral-like interweaving of the voices of these two participants. This is shown by the convergence of their voice dynamics: tempo, rhythm, and volume register a reflective style of joint communion that contrasts with the vocally more prominent voice dynamics of Student 1 in Lines 25 and 29, Fig. 9.3 (can be explained; explained by the model; explained by the model?). The closely synchronized voice dynamics of Student 2 and the Tutor modulate and accommodate each other to a shared trajectory that is constrained by the same or very similar higher-order parameters in the form of the verbal patterns that are evoked. In other words, the thematic content that is activated as small variations on a convergent theme sets the parameters for the attunement of the two participants to each other’s vocal (and other bodily) dynamics as they engage in this act of joint thinking together.

These higher-order parameters do not hover above the participants; instead, they too are perceived aspects of the vocal and other bodily dynamics of the two speakers. They are the most explicit layer of multiple layers of differentiation of the unfolding microgenetic process. As cultural-semantic patterns, they are longer, slower processes emanating from cultural timescales that the two participants entrain to. They set the parameters for faster, smaller bodily and neural events. In setting the parameters for these faster, smaller processes, these cultural-semantic patterns are anticipative in ways that are manifested as the functional coherence of the two agents during their brief moment of mutual attunement. The self-organising dynamics of the dialogical interaction between the two agents is oriented to interactive success. The voice and other bodily dynamics of the two participants, together with the higher-order cultural-semantic parameters that are set, tend to induce a recruitment of bodily and neural dynamics of the two agents to an overall convergence that briefly stabilizes as a common learning trajectory in Lines 22–28, Fig. 9.3. The significance of the Tutor’s normative anticipation of the solution in Line 21, Fig. 9.3 (see discussion above) lies in the fact that it attempts to and is successful in recruiting the future development of the learning trajectory to the normatively anticipated interaction outcome that is made explicit by Student 2 in Line 31, Fig. 9.3.

Student 1’s contrasting voice dynamics, including the rising intonation of her interrogative utterance by the model? (Line 29, Fig. 9.3), strike a different melody, so to speak, that is not attuned to the unfolding insight that Student 2 and the Tutor develop together. She publicly addresses the whole group whereas Student 2 and the Tutor engage in a parallel act of thinking together that concludes in Line 28, Fig. 9.3. In Line 30, Fig. 9.3, the Tutor, in response to Student 1’s question in Line 29, Fig. 9.3, invites Student 2 to articulate to the group the insight that remains incipient in the dialogue that occurred between Student 2 and the Tutor in parallel to Student 1’s efforts to solve the problem on the white board.

In Line 31, Fig. 9.3, Student 2 illustrates a ‘tip of the tongue’ experience as she searches for the correct choice of term in response to the Tutor’s invitation in Line 30, Fig. 9.3 that she make explicit to the group the insight they had previously developed together more implicitly. The initial part of Student 2’s utterance (right can be explained by) echoes Student 1’s prior attempts in Lines 25 and 29, Fig. 9.3 to derive an explanation. However, it is also an important modification of Student 1’s efforts. The ‘tip of the tongue’ experience mentioned above is manifested by the repetition of the definite article ‘the’, the syllabic lengthening of the third occurrence of ‘the’, which is synchronised with a rapid twirling movement of the pencil which she is holding in her raised right hand, and the ensuing pause of 700 ms (7 deciseconds) prior to her uttering of the crucial element ‘x variables’ (‘change in x variables’), which had been anticipated by the Tutor in Line 21, Fig. 9.3. The rapid twirling movement of the pencil she is holding has no inherent meaning. This movement is schematized (Werner & Kaplan, 1963) so that it serves to anticipate the not-yet-verbalised meaning ‘x variables’, which is the crucial insight here. It is not difficult to see that the rapid twirling movement of the pencil is schematized to serve this function at this point in the unfolding microgenetic process: the rapid pencil movement thus signifies the meaning ‘variability’ before it is verbalised 700 ms later. Table 9.2 presents the pico-scale dynamics of Student 2’s ‘tip of the tongue’ utterance in Line 31, Fig. 9.3.

Table 9.2 The microgenesis of insight in Student 2’s utterance in Line 31, Phase 3

In Line 31, Fig. 9.3, Student 2 searches for the semantic category that remains incipient in and yet anticipated by the prior development of the discussion from Line 21, Fig. 9.3 to this point. As the analysis of Phase 3 shows, Student 2 (unlike Student 1) has hit upon the correct category. In Line 31, Fig. 9.3, she struggles momentarily to specify it. The result is the objectification and stabilization (x variables, change in x variables; Line 31, Fig. 9.3) of the normatively appropriate meaning construction as the multiple potentialities of the situation are articulated as a single more focal meaning in the public learning space (Draguns, 1991, p. 298). The repetition of the definite article ‘the’, along with the other factors mentioned above (Line 31, Fig. 9.3), evidences a microgenetic transition from a physiognomic mode of understanding to an objectified one that is adapted to the public reality of the tutorial (Werner & Kaplan, 1984/1963).

Phase 3 concludes with the Tutor providing normative feedback to the contributions of both Student 1 and Student 2, respectively. In Line 32, Fig. 9.3, he enacts a general reorientation of his body posture as he shifts from the posture he adopted in Line 31, Fig. 9.3 while attending to Student 2’s utterance to the new posture he adopts in Line 32, Fig. 9.3 (see transcription for the details) while directing his feedback to Student 1 (yeah maybe you mixed out the x variables). In doing so, he creates a retroactive thematic tie to what Student 2 had said in Line 31, Fig. 9.3. In Line 33, Fig. 9.3, he then switches his attention to Student 2 and provides further normative feedback to her (yeah) to indicate her successful negotiation of the interaction outcome that was normatively anticipated in Line 21, Fig. 9.3.

5.4 Phase 4: The Wrap Up

Phase 4 is one of consolidation. In Line 34, Fig. 9.4, Student 2 prompts the Tutor to provide further clarification. Initially, he does so by further mention of the x variables (Line 35, Fig. 9.4). The conjunction so links this mention back to the previous discussion in a semantic relationship of consequentiality. He then invites Student 2 to elaborate further on the insight she had articulated in Line 31, Fig. 9.3. At the same time, his pointing to the text on the white board creates a further link between the two (Line 35, Fig. 9.4). In Line 37, Fig. 9.4, the Tutor builds a link between the two constructs 89 % of the variability in the expected salary and the x variable whether they’re male or not.

Student 1’s responses in Line 37, Fig. 9.4 (ah ok right) and in Line 38 (ok so by the features of the x) elicit both confirmation and normative feedback from the Tutor in Line 39 (essentially the features of the x; that’s good). The rapid up-down arm gesture that is co-synchronous with essentially (Line 39, Fig. 9.4) schematizes in a holistic and imagistic way (McNeill & Duncan, 2000) the meaning that is verbalised by the lexeme essentially to evaluate the significance of the features of the x. The utterance, comprising the gesture-verbal complex described above, thus distils the essential and important point at this stage in the discovery of the solution to the problem posed in Phase 1. It is distilled as an objectified mathematical truth in the transition from the physiognomic mode of understanding articulated by the gesture to an objectified verbal-mathematical one. This transition is further reinforced in Line 43, Fig. 9.4 when the Tutor addresses Student 3, who has not spoken at all during the entire recorded episode.

In Line 43, Fig. 9.4, the Tutor seeks to establish that Student 3 has understood the point of the discussion. Again, he uses the modal evaluator essentially to locate the statement it’s just a sentence you have to memorize in an objectified verbal-mathematical domain of scientific truths. Moreover, it is something you ‘have to memorize’, i.e., it is posited by the tutor as an obligation enforced by the conventions of the discipline that one must commit to memory and about which the learner has no choice. The essential content of the tutorial is at the end distilled as a prospective memory: later recall (of the relevant construct) is thus constrained not by first-order perceptual data of the kind that constrains our memories of past experiences, but by the second-order semantic patterns articulated in the tutorial and the mathematical conventions that these give voice to. The outcome is the invocation of a prospective memory that is construed as an obligation that one passively receives from the outside rather than as a thought that one freely entertains and develops. It is a semantic construct that one must replicate in future memory and therefore inculcate as a habit that one can intentionally orient to and reproduce when required. Memory as distinct from thought stabilizes learning experiences as categories that can be reproduced when required whereas thought is exploratory and oriented to change and innovation (Brown, 2005, p. 542). The flux of exploratory thought that has characterised the episode as a whole is thus stabilized at the end as a learning construct on which attention can be focused in future recall. The new learning construct is a result of microgenetic constructive effort that brings about a change in the initial learning topology. It therefore takes its place in the overall topology so that future microgenetic constructive processes can take place.

Pace the strong influence of social constructivist thinking on educational theories in recent decades, we contend that an analytical focus on abstract verbal patterns, as in discourse-analytical approaches to classroom interaction, fails to account for the grounding of human cognition and learning in our embodiment. It is by means of our embodiment that we couple with both local and non-local resources in the Distributed Learning Systems in which the activities of teaching and learning occur. Learning is indeed a constructive process rather than a passive input of information obtained from the external environment. However, we argue that learning is best viewed as an unfolding microgenetic construction process that starts from a primordial matrix of ill-defined affective, imagistic, ideational and other elements that are channelled along a microgenetic trajectory until their final articulation as a fully formed end product in consciousness. The microgenetic theories of Brown, Werner and others prove to be a fertile starting point for developing new understandings of human learning as a values-realizing activity that is shaped and guided by the culturally-saturated interactivity in which it is embedded.

Learning, then, is a microgenetic constructive process that transforms the learning space. From a microgenetic perspective, there is a clear difference between already-learned knowledge and still-to-be-learned knowledge (Bickhard & Campbell, 1996, p. 144). As Bickhard and Campbell (1996, p. 144) point out, the microgenetic process can already construct the previously learned knowledge; it cannot yet construct the knowledge that still needs to be learned. Learning is not a matter of pre-constructed structures, schemas or rules that are stored in memory. Learning is a process which is both constructive and transformative. It is constructive in the sense that it modifies the microgenetic process so that it is able to prepare or set up those structures or resources when needed. As the interaction between the tutor and students illustrates, successful construction takes place in microgenesis as a process of small additions to and variations on already available constructions (see Table 9.1). Table 9.1 shows how Student 2’s microgenetic constructive effort proceeds as a series of rhythmic pulses on very small time scales (Buzsáki, 2006). Moreover, these pulses are interactionally synchronised with those of the Tutor. It is the interactivity between Student 2 and the Tutor that enables learning progressively to be actualised. Student 2 uses the mutual shaping of their interactivity to build up through the microgenetic process small additions to and variations on prior constructions. In the first instance, she builds on Student 1’s immediately prior (and incomplete) attempt. She also builds on the tutor’s invocation of the lecturer’s own answers.

We have seen how, in the learning situation analysed here, learners establish a vague, ill-defined background meaning or context. In the microgenetic process, they construct variations on this background context. According to microgenetic theory, affect and meaning “are processed much earlier than the conscious experience of the stimulus” (Kurian, 1991, p. 83; see also Brown, 1988, pp. 46–47). For example, Brown (1988, p. 47) points out that the phonological representation of a word becomes a conscious perception only after the meaning is already understood. Discourse-analytical transcription practices focus on the final products of microgenesis that are available to conscious perception and hence to transcription. The underlying microgenesis is accordingly frozen and reified. The linguistic pattern that we have learned to detect and to use in the stimulus flux of bodily activity, including phonetic gestures, is the outcome of a temporally unfolding microgenetic process of progressive actualization through a series of stages or strata of neural and bodily organisation that precedes the conscious experience of a cognitive act. Libet’s (1985) experimental work on brain processes in conscious experience and volitional acts has shown that a readiness potential precedes the conscious intent to act by about 350 ms. The brain microgenetically prepares or sets up such actions before conscious awareness kicks in (see also Wegner, 2002; Wegner & Sparrow, 2007). Consciousness is the outcome of this process, not its initial cause. Consciousness is a kind of rear-view mirror that looks back on the outcome of a series of prior real-time neuronal processes that actualize the unfolding microgenetic process (see also Harnad, 1982).

The consciously accessible outcomes of the microgenetic process in real-time learning situations are the visible, audible and otherwise perceptible bodily movements (e.g., vocalisations, head nods, gaze, hand gestures) that briefly crystallise before decaying and giving way to the next pulse of the microgenetic process. These consciously accessible percepts are perceived and assessed in relation to the established background meaning and are perceived to operate on this and to modify it. By the same token, these consciously accessible percepts afford anticipatory modelling of the future development of the unfolding learning trajectory. Usually, words and other external media are taken to be representations or expressions of either inner (mental) or outer (environmental) processes. This is in accordance with the encodingist assumptions of the perceive-plan-act view of cognition. On this view, a linguistic utterance, for instance, is a representation of some mental or world-side event for the speaker. The speaker-hearer uses that representation to infer some possible course of action.

An alternative and more plausible view that does away with the encodingist assumptions that have characterised virtually all accounts of representation has been proposed by Bickhard (e.g., 1998). Rather than constituting representations of the prior unfolding microgenetic processes, the observable percepts provide interactive indications as to what the current environment affords for the agents’ (learners’) further interactions with that environment. Learning is not a process of assembling the various components of a discourse to come up with the required meaning. Instead, it is a microgenetic constructive process of creating a dynamic, internal state that is coherent and constrained moment-to-moment (Schweiger, 1991, p. 99) by the modifying influence of further constructive effort in the form of small additions to and variations on the learning process. The detection of past and present actualities in the environment and attunement to their affordances sets up the microgenetic processes that afford further interactive potential in that environment (see Bickhard & Campbell, 1996, p. 113). Words don’t ‘represent’ these actualities. Instead, they differentiate the current environment in ways that afford further interactive potential with that environment.

As the pico-scale analysis in Table 9.1 shows, Student 2 is faced with the problem of generating an extended action trajectory that will produce the desired outcome. She must shape and modulate her action trajectory to solve the problem to hand. Therefore, she must manage and direct the interaction process in ways that extend and enrich the management horizon (Christensen & Hooker, 2000, p. 15; Werner, 1957) so as to encompass and integrate into her trajectory both local and global factors that span a diversity of time and place scales. These factors include the immediate situation of what is said and done by the Tutor and Student 1. They also include the course lectures which were invoked at the beginning. They include the written text on the whiteboard, the printed work sheet with the problems and the lecturer’s solutions, the various artefacts and resources of the online learning management system (LMS) and the longer-term history of the discipline, its theories and methods, etc. Figure 9.6 models the time scales that are integrated in the learning situation analysed above.

Fig. 9.6 Time scales of the tutorial session

Figure 9.6 displays the diverse time scales that the students must integrate into their learning trajectory. These are summarised as follows:

Beyond the Tutorial Session

  1. The discipline, its culture and traditions;

  2. LMS resources and affordances (enduring artefacts);

  3. The course lectures (semester length);

  4. Preparatory actions: tutor’s planning session the week before the recorded tutorial sessions;

Within the Tutorial Session

  5. The Learning Project: values-realizing activities framed by institutional norms and values;

  6. Problem-solving activities: what is said and done by students and tutors;

  7. Pico-scale bodily coordination: synchronization of and attunement to values-biasing bodily dynamics.

Some of the affordances of the LMS and online resources include Narrated PowerPoint, Narrated Excel and LiveScribe pencasts. LiveScribe and Narrated PowerPoint enable students to track and monitor the tutor’s voice-over and integrate the tutor’s voice with his/her reading of written text and visuals. The tight Learner-Artefact coupling integrates real-time interactivity with the lecture timescale. Narrated PowerPoint, Narrated Excel and LiveScribe pencasts combine information and interactivity landscapes (Kirsh, 1997). They are examples of how the learning environment can be filled with artefacts that facilitate and enhance coordination between learner and task across place and time scales. These resources also illustrate the importance of setting up resources that make the learning task easy to track. This means less planning and more coordination.

6 Conclusion: Learning, Values-Realizing Interactivity and Microgenesis

According to Gibson’s ecological theory of perception, values and potential meanings in the form of environmental information are external rather than internal to the animal. They are objectively available in the environment of the animal; they can be exploited and used if the animal is disposed to engage in effortful exploratory interactivity. They do not depend on the internal needs of the animal or an act of perception of the animal (Gibson, 1986/1979, p. 139). They are not phenomena of experience that are constructed by the categories of mind in order to make sense of meaningless sensory input. The affordances of the environment are objective facts, not subjective constructions of the mind (Reed, 1996, p. 101).

The affordances of environmental objects, events, etc. are not therefore the result of the observer’s subjective interpretation of these objects and events. Objects and events afford what they afford because of what they are. The affordances of environmental objects and events are invariants that are objectively available in the environment. The observer only needs to make the necessary effort to perceive and attend to the affordance. In other words, the environment of the observer is replete with potential meaning and value that the observer can discover, exploit and refine through effortful exploratory interactivity. Meaning and value are ecological in this sense.

In the human case, the human environment is a culturally saturated one, i.e., saturated with potential meaning (information) and value that human agents learn to detect, sensitize to, and refine in the course of their learning and development. Information and value are available in the environment of the individual and the social group though they also require that the individual make the effort to enter into an interactive relationship with the meanings and values of the environment (Reed, 1996, p. 101). As Reed further points out (1996, p. 101), the nature and intensity of these efforts can vary according to the biological needs and the developmental experiences of the animal.

Linguistic utterances, other persons, and the affordances of the LEIS are all replete with both value and potential meaning in the form of information that the learner becomes sensitized to and gradually refines in the course of their interactivity with these objects and events. These learning objects and events are in fact not simple invariants, but complex combinations of invariants, or compound invariants (Gibson, 1986/1979, p. 141), that structure the information made available to learners in the learning environment. For example, linguistic utterances covary with aspects of situation, both real and virtual, such that the covarying relation between an utterance and the particular aspects of the situation that are relevant to the understanding of the utterance forms a probabilistic combination of invariants that specify complex and often very subtle information structures in the environment of the learner. Learning involves the honing and refinement of one’s attunement to the subtle and complex information structures of these combinations (Bolles, 1975). Another example is the compound invariant formed by combinations of visual, spatial, auditory, and linguistic invariants in multimodal artefacts such as the Narrated PowerPoint, Narrated Excel and LiveScribe pencasts discussed in Sect. 9.5. Cultural artefacts and events, typically, are complex compounds of invariants, or invariants of invariants. They, nonetheless, form complex cultural units that are not reducible to combinations or associations of elementary sensations (Gibson, 1986/1979, p. 141).

Through learning, we sensitize and attune more and more to these complex variables through the development of what Runeson (1977) calls “smart” perceptual mechanisms that are responsible for more advanced information pick up. Perception of these complex variables is directly related to adaptive and flexible behaviour. Gibson argues that: “Even in the classical terminology, it could be argued that when a number of stimuli are completely covariant, when they always go together, they constitute a single ‘stimulus’” (1986/1979, p. 141). Gibson’s theory of direct perception is concerned, above all, with the perceivable physical world. According to Gibson’s theory, there is a one-to-one relation between a given environmental feature and a corresponding pattern of information in the ambient optical, auditory, chemical, etc. array. The theory needs adjusting in order better to account for the complex phenomena of culturally saturated human cognition and meaning-making (Thibault, 2014). Not all perception is direct: patterns of information in the array may carry information about some environmental feature that is not necessarily present, but which is evoked by the pattern in the array as a form of virtual perception (Thibault, 2014). In such cases, the relationship is probabilistic, not one-to-one.

Learning is a motivated and effortful enterprise. Reed (1996, p. 102) distinguishes between two sets of motivations: “… every animal will thus evolve a set of motivations to use important affordances of its niche; in order to use these affordances, there will also have to be a set of motivations to hunt for information specifying these affordances. The first set of motivations I call the species’ effort after values and the second, its effort after meaning.” This applies to all animal species in different ways. The animal is motivated to seek out and use the affordances of its environment because these affordances are important for the animal in some way: for its survival, for its learning and development, and so on. In other words, the affordances of a given species have value for that species. In seeking out the affordances of its environment, the animal engages in values-seeking and values-realizing interactivity. By the same token, the animal is motivated to detect and make use of the information that specifies the affordances of its environment. In detecting and making use of this information, the animal engages in meaningful action that has the capacity to transform and to extend perception, action, awareness, and understanding. Meaning-making is this process of information detection and use that modifies the animal’s relationship to its environment; it is not, we argue, a construction of the mind that is projected onto a flux of meaningless elementary sensations or an amorphous flux that requires interpretative enrichment by means of internal representations, mental schemas stored in memory or linguistic codes.

Reed’s distinction between the two kinds of motivation shows that the effort after meaning is always framed, shaped and guided by the effort after values. Hodges (2007a, 2007b) builds on the earlier insights of Gibson and Reed to show that values are both enablements of and constraints on interactivity. In acting and perceiving, we realize values and in so doing we attune to and sensitize to the potentialities of the environment. Hodges (2007a) proposed the notion of values-realizing action in order to show that our interactivity with our environment is nested within a complex values-realizing dynamics. Interactivity is not so much guided by overarching and hierarchical goal-states towards which the agent strives and which control the action in top-down fashion. Instead, values are multiple and heterarchical (Hodges, 2007a, 2007b).

Interactivity is shaped, guided and informed by a fluid and fluctuating heterarchy of diverse values such that different values may modulate and guide the activity and therefore may come to the fore at different moments throughout the time-extended development of the action trajectory of the agent. In the development of their learning trajectories, learners orient to and coordinate with both other persons and with the affordances of the learning environment. In doing so, they also orient to and are guided by a fluctuating heterarchy of values that inform and shape the learning situation. From the perspectives of the tutor and the students, respectively, some of the values heterarchy that informed the learning episode analysed above may be summarized as follows:

The Tutor

  • Planning lessons that make effective use of resources;

  • Developing awareness of students’ understandings and needs;

  • Providing effective, context-sensitive scaffolds;

  • Making students become aware of relevant patterns and how to use them.

The Students

  • Developing problem-solving skills;

  • Cooperative, dialogical learning;

  • Helping each other to detect relevant patterns;

  • Attuning to others;

  • Self- and other-scaffolding.

Learning is a constructive and values-realizing process that involves various kinds of selection pressures. Constructions only survive if they prove successful and thus are retained as valuable. Once they are retained, they serve to provide heuristic guidance to future constructive effort in the overall learning topology (Bickhard & Campbell, 1996). Construction therefore takes place in the context of previously constructed forms of organisation. As the interaction between the tutor and students illustrates, successful construction takes place in microgenesis as a process of small modifications of and variations on already available constructions.

Moreover, we have seen that the interactive success or failure of learning trials depends on (1) how they are evaluated in relation to the norms of the situation (is it appropriate, relevant, etc.?); and (2) whether the given action is supported by the environment in which it occurs. Regularities in the behaviour of persons serve as standards which persons use to evaluate others’ behaviours in ways informed by their own perspectives and experience. We accordingly give that behaviour value and meaning. Persons-in-interaction align to and are constrained by norms that shape the interaction itself and its regularities. As Goffman (1983) showed, many of these constraints give interacting bodies and cultural artefacts value and meaning for the selves in interaction.

Gibson (1986/1979, p. 141) pointed out that both affordances and the information to specify affordances face two ways—to the environment and the observer. Rather than the opposition of the phenomenal world of meaning (mind) and meaningless matter, as in the psychophysical dualism characteristic of mainstream theories of perception and cognition, this means that “the information to specify the utilities of the environment is accompanied by information to specify the observer himself, his body, legs, hands, and mouth. This is only to reemphasize that exteroception is accompanied by proprioception—that to perceive the world is to coperceive oneself. This is wholly inconsistent with dualism in any form, either mind-matter dualism or mind-body dualism. The awareness of the world and of one’s complementary relations to the world are not separable” (Gibson, 1986/1979, p. 141).

The feeling of the realness of the world that one encounters is promoted as a mode of sensory experience through one’s exploratory and always embodied contact with the objects and events of the world. The complementarity of one’s relations to the world also means that one is part of a community of similar organisms who experience the world in like ways through the complementarity of exteroception and proprioception. The complementarity of exteroception and proprioception constitutes the core of a unitary action-perception cycle. Exteroception and proprioception are the two poles of awareness of this cycle. The learner actualizes perception through its exploratory interactivity with its environment. The values-realizing quest for affordances is a quest for objects of interest in which the perceptual world is articulated by feelings in objects of interest (Brown, 2005, p. 140).

As Brown points out, it is the objective existence of objects in the world and their temporal extensibility that define their value, not a subjective human feeling that is projected onto objective external objects (2005, p. 134). The value of things is “planted deeply in the nature of things and their evolutionary histories—in other words, that culture enhances or elaborates what is nascent in basic entities” (Brown, 2005, pp. 129–130). Gibson’s theory of affordances demonstrates, as Brown puts it, that “value brings the objectivity of the external world into relation with human emotion and conceptuality” (Brown, 2005, p. 128). The process of the actualization of the perceptual world is one of microgenesis. The microgenetic process that gives rise to a pattern of actualization of a brain state or a perception consists of “a succession of (probably) rhythmic phases ordered from earlier to later, unfolding in a fraction of a second” (Brown, 2005, p. 142). Brown (1988, p. 312) makes the following pertinent observation:

According to the microgenetic concept, objects are not “out there” in the world waiting for acts to engage them but have to be constructed in parallel with developing actions. Although there are differences between action and perception, there are deep inner similarities. Early stages in object formation provide the contextual background from which objects develop and persist abstractly as levels of conceptual or symbolic content within the object itself. Similarly, early stages in action elaborate the instinctual and affective bases that drive the action forward to its goal. Act and object also undergo a similar development. The “zeroing in” on target movements in the specification of an action has its correlate in the featural modelling of object form. Both act and object are analysed into finer units. The exteriorization of a target movement and its effectuation on extrapersonal objects correspond with the realization of an external object field. Act and object exteriorize together. A world of real objects and the effects of actions in that world are part of the same microgenetic end point. The deception that a movement is voluntary or willed by the self as an agent corresponds to the deception that we are independent of our own objects. The increasing passivity and then final detachment of an object representation mirror the activity of an action and the realization of an intentional attitude to movements directed toward those object representations.

Brown (1988) draws attention to action as a form of microgenetic constructive effort. Objects, including virtual objects, are actualized together with the actions that are directed towards them. Brown writes of “object representations” whereas we would rather say that action progressively differentiates the object. The microgenetic constructive process is a developmental process that progressively hones and differentiates the object towards which it is directed. This includes both real objects that are directly perceived and virtual ones indirectly perceived in memory, imagination etc. Gibson’s theory shows that action-perception cycles are a form of exploratory activity whereby agents learn progressively to differentiate and hence to fine-tune their attunement to their environment.

Affordances constitute the ecological niche of an animal. The cultural affordances of the human world constitute the niche of the extended human ecology (Steffensen, 2013) that spans diverse time and space scales and is populated by an increasingly large number of virtual cultural entities and processes. Human beings begin their lives outside the womb by perceiving the affordances of other persons and what they tell us about how others relate to the world. In this way, we learn of the diversity of points of view from which the affordances of the environment can be perceived. As the analysis above shows, the learning environment may consist of affordances that are available for all perceivers albeit from different points of observation, both actual and virtual, in the environment. In learning to perceive the common affordance of, for example, .89, the students learn to perceive the values of things not only from their own perspective but also from the perspective of others (see Gibson, 1986/1979, p. 141). In this way, values-realizing interactivity enables them to enact the microgenetic constructive effort whereby they are guided to detect and to make appropriate use of the information that specifies the affordances of .89 in the learning task such that perception and awareness of the learning task are modified. It is only when learners perceive the values of things for others as well as for themselves that learning takes place.

Our contribution to the discussion around the global reform of teaching and learning in higher education is to propose new theoretical constructs and analytical methods for understanding real-time teaching and learning in higher education. The domain-specific competencies, knowledges, and skills required for the scaffolding and self-scaffolding of learning and for the identification of error will look very different across different learning domains. However, (self)-scaffolding is a general principle (Bickhard, 2001) that can be empirically investigated in different domains and the differences and similarities across domains can thus be established. Teaching-learning is a dialogical encounter between teacher and learner that entails an ontological commitment to the joint processes of attending to and observing that take place when teachers lead novices out into the world rather than simply cramming their heads with ‘knowledge’ (Ingold, 2014: 388). A first step towards understanding these processes is the development of theoretically well-guided models of teaching and learning that can provide guidance to the recursive microgenetic construction processes that take place whenever teaching and learning occur.