Decades of functional neuroimaging studies of the cerebellum indicate that the cerebellum, and not the cerebral cortex, is predominant in the sequential-analytic brain functions behind the establishment and optimization of human cognitive, social, and technological capacities (e.g., [1,2,3,4,5,6,7,8,9,10]). This article further elucidates the predominance of the cerebellum by (1) describing details of the evolution of the human cerebellum during one and a half million years of advances in cerebellar models learned during the sequential requirements of stone-tool making, and (2) describing how these models are sent to and blended in working memory in the cerebral cortex to both enter and advance consciousness.

The Evolution of the Prominence of the Cerebellum and the Rise of Homo sapiens within the Evolution of Language and Composite Stone-Tool Making

Leading anthropologist Stanley Ambrose [11] proposed that language evolved approximately 300,000 years ago (ya) within the cultural context of composite tool making:

…composite tools are conjunctions of at least three techno-units, involving the assembly of a handle or shaft, a stone insert, and binding materials. … Conjunctive technologies are hierarchical and involve nonrepetitive* fine hand motor control to fit components to each other. Assembling techno-units in different configurations produces functionally different tools. This is formally analogous to grammatical language, because hierarchical assemblies of sounds produce meaningful phrases and sentences, and changing word order changes meaning. Speech and composite tool manufacture involve sequences of nonrepetitive fine motor control and both are controlled by adjacent areas of the inferior left frontal lobe. A composite tool may be analogous to a sentence, but explaining how to make one is the equivalent of a recipe or short story. If composite tool manufacture and grammatical language coevolved; 300 ka, then Neanderthals and modern humans could speak. The acquisition and modification of each component of a composite tool involve planned sequences of actions that can be performed at different times and places, such as flaking a stone point, cutting and shaping a wooden shaft, and collecting and processing binding materials. The complex problem solving and planning demanded by composite tool manufacture may have influenced the evolution of the frontal lobe.(pp. 1751-1752)

*As will be shown later in this article, such nonrepetitive processes in the cerebral cortex are based upon earlier-evolved repetitive, and thus cerebellum-driven, generalization of working memory-controlled predictive hand movements and searches. These processes were acquired during the earlier Acheulean eras and were related to learning through repetition in childhood, ongoing observation, and trial-and-error manipulation. That is, the earliest language evolved within the repetitive cultural context that provided the basis of composite tool making (see, for example, Kirby [12] and Vandervert [9]).

Stone-Tools, Cumulative Culture and the Rise of Homo sapiens

Anthropologist Dietrich Stout and neurologist Erin Hecht [13] described the cumulative culture that set Homo sapiens apart from all others of the genus Homo, and from other species. At the beginning of their article titled, “Evolutionary Neuroscience of Cumulative Culture,” Stout and Hecht provided an insightful description of the constant accumulation of culture among our species:

Modern humans live in a culturally constructed niche of artificial landscapes, structures, artifacts, skills, practices, and beliefs accumulated over generations and beyond the ability of any one individual to recreate in a lifetime. Like the air we breathe, this cumulative cultural matrix [italics added] is so immersive that it is easy to forget it is there. However, this is the medium through which we grow, act, and think, and it exerts profound influences on human life across a range of behavioral, developmental, and evolutionary scales. How did our species find itself in this remarkable situation? [italics added] (2017, p. 7861)

To answer this question, Stout and Hecht went on to examine the possible role of “uniquely human psychological specializations for “high-fidelity” social learning [e.g., theory of mind (ToM), imitation], which have enabled the lossless “ratchet-effect” of cultural accumulation to supplant biology as humanity’s primary mode of adaptation” (p. 7861).

How did cumulative culture become the critical mark of Homo sapiens? From a purely scientific perspective, how are we to find general principles of brain evolution that can explain the evolution of this cumulative culture of Homo sapiens? Stout and Hecht [13] argued that the solution to these difficult questions was to be found in the evolution of stone-tool making:

One solution [explanation] is to seek inspiration from the archaeological record of human evolution. As the name implies, this early “Paleolithic” evidence is dominated by stone tools. These artifacts are valuable, not only because they endure but because they provide prolific and fine-grained evidence of behavioral changes across a critical evolutionary interval during which hominin brains tripled in volume [italics added] to assume their modern proportions. Stone tools were key components of premodern subsistence and survival strategies and likely helped to shape the very course of this evolution. (p. 7862)

Purpose

In this article we agree with Stout and Hecht [13] that stone-tool making is indeed the best way to understand why and how the hominid brain tripled in volume and set Homo sapiens apart with their development of cumulative culture. However, we propose that the key evolutionary selection pressure on the brain during the tens of thousands of generations of stone-tool making was the profound level of repetition of the detailed movement, mental, and social-cognitive processes involved in stone-tool making, which engage the brain’s cerebral cortex but, more importantly, engage the cerebellum [1, 4, 14,15,16,17]. This view is strongly supported by the following findings from brain-imaging research: (1) within the framework of cerebellar sequence detection of all movement, social-cognitive interaction, and emotion [18], cerebellar internal modeling (modeling of processes going on internal to the cerebral cortex) leads to (2) the optimization of attentional control toward prediction [1, 16], (3) related emotional control [19], (4) behavioral, mental, and social automaticity [4, 14,15,16], and (5) cerebellum-driven generalization and blending of thought in working memory and skill [4, 20]. Within this view of a predominantly cerebellum-driven, stone-tool based origin and continuance of cumulative culture [7, 21, 22], we further propose that the tens of thousands of generations of repetitive social-cognitive prediction and emotional control processes involving the cerebellum can account directly for the three- to fourfold increase in the size of the cerebellum in the last million years [23, 24]. The later stages of this increase in size of the cerebellum would have been commensurate with the evolution of stone-tool making involving composite tools and the rise of Homo sapiens approximately 300,000 years ago (see [11, 25] and the evolution of stone-tool making in Fig. 1).

Fig. 1

The brain processes related to the planning, location and preparation of composite tools require the engagement of many brain regions, including the prefrontal cortex, which is well developed in humans in comparison with other primates [26]. However, as will be seen in this article, prefrontal processes involving this planning and other problem-solving processes require generalization of models learned in the cerebellum and the blending of these cerebellar models in the prefrontal and other areas of the cerebral cortex. Beginning notably with late Acheulean tool making, the human cerebellum increased three- to fourfold in size over the last million years [23, 24]. This dramatic increase in the size of the cerebellum was the result of natural selection across thousands of generations of the repetition of progressively planned working memory processes and related fine bodily movements associated with the evolution of stone-tool making. Important detailed evolutionary processes derived from combined anthropological and neurological studies point toward the critical roles of the cerebellum. We argue that cerebellum evolution has been instrumental in the rise of Homo sapiens because of the motor/cognitive/affective predictive functions of the cerebellum and its major contribution to sensorimotor learning. In this view, these processes led to the cerebellum-driven unconscious anticipatory control of attention toward prospective goals, including those related to the intertwined advancement of tools, cognitive-social processes and art. Put succinctly, the natural selection of increased cerebellum functions occurred within evolving cultural contexts, notably, in its earliest beginnings, those related to the development of composite tools. Largely as a result of these optimizing processes of cerebellum evolution, Homo sapiens emerged about 300,000 years ago [11, 25].

Cerebellar Generalization and Blending of Attention in Movement and Thought

In addition to the foregoing control of attention toward highly articulated prediction, cerebellar optimization toward goals includes generalization of solutions to problems. To accomplish this, cerebellar internal models learn dynamics equivalent to (1) those of the skeletomuscular system and (2) those driving social and mental systems [4, 14, 15]. Ito [4] explains this concept as follows:

It is important to note that what is learnt in these [dynamics] models is the dynamics or inverse dynamics, not the individual trajectory actually practiced. The simulation study of Kawato et al. [20] demonstrated that after practice of a particular trajectory [which would be coded within internal models of the cerebellum] a robot will form trajectories in any directions accurately and smoothly [thus constituting generalization of the intended trajectory]. I propose the term ‘dynamics learning’ for expressing this manner of learning. The cerebellar circuitry retains ‘dynamics memory’ (either inverse or not) but not memory of individual trajectories. (pp. 448-449)

Kawato, Furukawa and Suzuki’s [20] findings (which are cited in the above quote) strongly suggest the critical role of the cerebellum in the evolutionary transition-by-generalization from inner vocalization to inner speech and language:

Once the [cerebrocerebellar] neural network model learned some movement, it could control quite different and faster movements [italics added]. So, the present model is totally different from previous “table look-up” learning…because of its capability of generalizing learned movements. The reason is because the present model learns the dynamics and inverse-dynamics of a control object instead of a specific motor command for a specific movement pattern. (p. 182)

It is important here to note that Ito [4, 14, 16, 27] argued that this dynamics learning in cerebellar internal models would apply to sequences of both movement and mental processes (see Footnote 1). This cerebellar dynamics learning (generalization toward quite different and faster movements and mental processes) is applied later in this article to the relationship of stone-tool making to the early emergence of silent inner vocalization and on toward language. See Alderson-Day and Fernyhough [29] and Vandervert [21, 30] for discussions of inner vocalization and inner speech.
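The contrast drawn above between dynamics learning and "table look-up" learning can be illustrated with a minimal sketch. It is not the network model of Kawato et al. [20]. It assumes a hypothetical one-dimensional "limb" whose required force is a linear function of acceleration and velocity, and the names `MASS`, `DAMPING`, `plant_force`, `trajectory`, and `fit_inverse_dynamics` are invented for illustration. A learner that fits the inverse dynamics from one practiced movement can then command a different and faster movement, whereas a table of memorized commands cannot.

```python
import math

# Hypothetical 1-D "limb": force = MASS * acceleration + DAMPING * velocity.
# These constants are illustrative assumptions, not values from the literature.
MASS, DAMPING = 2.0, 0.5

def plant_force(velocity, acceleration):
    """True inverse dynamics of the toy limb (unknown to the learner)."""
    return MASS * acceleration + DAMPING * velocity

def trajectory(amplitude, frequency, n=200, duration=2.0):
    """Sampled (time, velocity, acceleration) for x(t) = A * sin(2*pi*f*t)."""
    w = 2.0 * math.pi * frequency
    samples = []
    for i in range(n):
        t = duration * i / n
        v = amplitude * w * math.cos(w * t)
        a = -amplitude * w * w * math.sin(w * t)
        samples.append((t, v, a))
    return samples

def fit_inverse_dynamics(samples):
    """Least-squares fit of force = m*a + b*v via the 2x2 normal equations."""
    saa = sav = svv = saf = svf = 0.0
    for _, v, a in samples:
        f = plant_force(v, a)  # force observed while practicing
        saa += a * a
        sav += a * v
        svv += v * v
        saf += a * f
        svf += v * f
    det = saa * svv - sav * sav
    m = (saf * svv - svf * sav) / det
    b = (svf * saa - saf * sav) / det
    return m, b

practiced = trajectory(amplitude=1.0, frequency=0.5)
novel = trajectory(amplitude=0.5, frequency=1.5)  # different and faster

m_hat, b_hat = fit_inverse_dynamics(practiced)         # "dynamics memory"
table = [plant_force(v, a) for _, v, a in practiced]   # memorized commands

def max_error(predict):
    """Largest command error on the novel movement."""
    return max(abs(predict(i, v, a) - plant_force(v, a))
               for i, (_, v, a) in enumerate(novel))

model_error = max_error(lambda i, v, a: m_hat * a + b_hat * v)
table_error = max_error(lambda i, v, a: table[i])
print(f"dynamics model error: {model_error:.6f}")
print(f"look-up table error:  {table_error:.6f}")
```

In this toy setting the fitted model reproduces the commands for the unpracticed movement almost exactly, while replaying the stored commands does not. This is the sense of generalization described in the quotations, under the stated simplifying assumptions.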

Blending

Within this context, it is proposed that Ito’s [4, 14, 15] conception of cerebellar dynamics learning provides the basis for cerebellar internal model blending as found by Imamizu and colleagues [31,32,33]. Specifically, the multiple directionality (generalization) of internal models described by Ito would allow the interfacing and linkage of internal models so that, through repetition, they may be blended in various ways with other sequences of skill in inner vocalization/speech in working memory and in movement [21, 30]. Imamizu and his colleagues’ work thus provides valuable insight into how, through cerebellar blending, central executive control in working memory might “elaborate” toward blended (or, as Ito [4] suggested, generalized) solutions toward increased skill in both inner vocalization/speech and repetitive reductive chipping (knapping) during Lower Paleolithic stone-tool making. This would apply especially in extended cases where the young stone-tool maker is struggling with the development of new, highly complex movements and new internally vocalized conceptions and their related internal sound representations. It is suggested that such internally vocalized conceptions and their related internal speech sound representations, upon generalization/blending, would have begun during the Lower Paleolithic Oldowan and Acheulean periods, and thus would have been the earliest basis for the selection toward language, a selection that occurred well before the advent of composite tools.

The Silent Origin and Transmission of Cumulative Culture and the Silent Rise of Homo sapiens

Within the framework of Stout and Hecht’s [13] foregoing introduction to arguments for their stone-tool approach to brain evolution, we can now turn to their excellent, well-articulated description of the mental, emotional, and social processes that are involved in stone-tool making. Here we can unpack the equally well-articulated amount of cerebellum involvement in what can be referred to as the “silent” origin and accumulation of culture and the rise of Homo sapiens.

The Silent Origin of the Control of the Focus and Duration of Attention in the Cerebellum

Before moving forward, it is important to understand that the cerebellum silently learns the automatic control of attention for all tasks. This control of attention would of course be fundamental to the control of all movement, mental, social and emotional processes related to stone-tool making. In this regard, as discussed in [8], Akshoomoff, Courchesne, and Townsend [1] provided classic brain-imaging evidence of the cerebellum’s key role in the learning of unconscious, anticipatory control of attention:

The cerebellum is a master computational system that adjusts responsiveness in a variety of networks to obtain a prescribed goal [34]. These networks include those thought to be involved in declarative memory, working memory, attention [as in [35] working memory model, this would be the attentional control of the central executive], arousal, affect, language, speech, homeostasis, and sensory modulation as well as motor control. This may require the cerebellum to implement a succession of precisely timed and selected changes in the pattern or level of neural activity in these diverse networks [It would do this by learning cerebellar internal models which would implement such changes.]. We hypothesized that the cerebellum does this by encoding (“learning”) temporally ordered sequences [italics added] of multi-dimensional information about external and internal events (effector, sensory, affective, mental, autonomic), and, as similar sequences [italics added] of external and internal events unfold, they elicit a readout of the full sequence [italics added] in advance of the real-time events. This readout is sent to and alters, in advance [italics added], the state of each motor, sensory, autonomic, attentional, memory, or affective system [italics added] which, according to the previous “learning” of this sequence, will soon be actively involved in the current real-time events. So, in contrast to conscious, longer time-scale anticipatory processes mediated by cerebral systems, output of the cerebellum provides moment-to-moment, unconscious, very short time-scale, anticipatory information (italics added).

The results from our neurobehavioral and neurophysiological studies showing deficits in shifting and orienting attention in patients with cerebellar damage, as well as new fMRI studies showing cerebellar activation during focused attention and shifting attention in normal adults, suggest that the cerebellum plays an important role in several aspects of selective attention (italics added). Cerebral cortical regions appear to be primarily responsible for generating the commands for enhancement and inhibition of different sources of information and sensory signals. Our data suggest that the cerebellum plays an important role in the execution of these commands in order to optimize the quality of sensory information for coordinating the direction of selective attention. We have demonstrated that this includes the shifting, distribution, and orienting of attention (italics added). (pp. 592-593)

It is important to note here that the foregoing cerebellum-driven, sequence-based anticipatory control of attention (see italicized mentions of sequence processing above) is supported by the conclusion in [18] that cerebellar control of cerebral processes is indeed based on sequence detection.
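The hypothesis quoted above, that temporally ordered sequences are encoded and that a similar unfolding sequence elicits a readout of the full sequence in advance, can be sketched as a toy prefix-matching memory. This is only an illustrative analogy and not a model of cerebellar circuitry. The class `SequenceMemory`, its methods, and the knapping step names are hypothetical.

```python
from collections import defaultdict

class SequenceMemory:
    """Toy store of temporally ordered event sequences (hypothetical model)."""

    def __init__(self):
        self._by_prefix = defaultdict(list)

    def learn(self, sequence):
        """Index a practiced sequence under each of its leading prefixes."""
        events = tuple(sequence)
        for k in range(1, len(events)):
            self._by_prefix[events[:k]].append(events)

    def read_out(self, observed):
        """Return the anticipated remainder once the observed prefix is unambiguous."""
        matches = set(self._by_prefix.get(tuple(observed), []))
        if len(matches) != 1:
            return None  # unfamiliar or still ambiguous: no advance readout
        full = next(iter(matches))
        return list(full[len(observed):])

memory = SequenceMemory()
# Illustrative, invented step names for two practiced action sequences.
memory.learn(["pick core", "rotate core", "aim", "strike", "inspect flake"])
memory.learn(["pick core", "set core down", "select hammerstone"])

ambiguous = memory.read_out(["pick core"])
anticipated = memory.read_out(["pick core", "rotate core"])
print(ambiguous)    # None
print(anticipated)  # ['aim', 'strike', 'inspect flake']
```

Once the unfolding events match only one learned sequence, the remaining steps are available before they occur, which is the "in advance" readout in the quotation, reduced here to its simplest form.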

Figure 2 illustrates in a simplified manner the overall cerebral-cerebellar positive feedback flow toward cumulative culture, as described by Stout and Hecht [13], that can be derived from the foregoing decades of cerebellum research.

Fig. 2

Repetition of movement, thought, and social interaction initiated in the cerebral cortex leads to optimization of goals via internal models learned in the cerebellum. This creates a positive feedback loop leading to advances in cumulative culture.

A positive feedback loop operates as follows: “When output of the system is fed back, it increases the magnitude of the quality and/or quantity of the loop’s next output and so on” [30].
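The definition of a positive feedback loop given above can be made concrete with a minimal numerical sketch. The function name `positive_feedback` and the `gain` parameter are assumptions made for illustration. They do not correspond to any measured quantity in the cited research.

```python
def positive_feedback(initial_output, gain, steps):
    """Feed each output back so that it enlarges the next output."""
    outputs = [initial_output]
    for _ in range(steps):
        # The fed-back output adds a proportional increment to itself.
        outputs.append(outputs[-1] + gain * outputs[-1])
    return outputs

history = positive_feedback(initial_output=1.0, gain=0.5, steps=4)
print(history)  # [1.0, 1.5, 2.25, 3.375, 5.0625]
```

Each pass through the loop yields a larger output than the last, which is the compounding pattern the definition describes.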

The Silent Acquisition of Emotion: The Prominent Role of the Cerebellum

As a further note, when considering the cerebellar areas that contribute to emotion, and in particular their developmental and evolutionary course, it is important to keep in mind the growing insights from clinical and neuroimaging research into the nature and characteristics of emotions (listed as affect in Akshoomoff, Courchesne & Townsend’s above findings). In this regard, the well-differentiated avenues of primary, secondary and tertiary emotional processes are not restricted to the cerebrum but extend in detail to the well-organized cerebellar architecture, yielding a fascinating distribution of task- or action-dependent network peculiarities [36]. Three levels are distinguished: the evolutionarily early primary processes, which code for basic processing of emotions mainly along subcortical networks; secondary emotional processes, which reflect the evolutionarily growing skills of simple emotional learning such as fear; and the more recent tertiary emotion processing, with its higher-order affective-cognitive aspects and strong dependence on neocortical networks. These distinctions provide a satisfactory approach to understanding the high diversity of neuroscientific reports about the cerebellum and its contribution to emotions in the evolution of the human brain. Furthermore, this evolution of the brain’s emotion systems, from primordial instinctual behavior up to more elaborate intentional tool use, has been recognized as a strong cause of the expansion of the mammalian, and particularly the human, brain [37]. Indeed, the observation that the cerebellum, together with the parietal and prefrontal cortex, shows the strongest expansion underpins the specific development of cerebellar-cerebral networks that propagate new tool facilities; these networks build on and incorporate the earlier, simpler sensorimotor and cognitive-affective brain circuits [38, 39].

In line with this presumption of specific developmental aspects of the cerebellum in emotion evolution, research has successively delineated some clear-cut features of the cerebellar contribution to emotion processing. As noted in the comprehensive overview of neuroimaging findings in [40], the cerebellum is involved in emotion processing and exhibits a complex functional topography of basic emotions along a general emotion network (GEN) [41, 42]. Moreover, neuroimaging protocols have consistently identified distinct areas of the vermis and hemispheres, in particular lobules VI and VII with Crus I and II, as major hubs participating in cerebellar-cerebral networks. These hubs are intertwined with several intrinsically connected networks, such as the Default Mode Network, the Salience Network, and the Central Executive Network, and with the limbic system, as crucial core regions subserving emotion recognition, attention and behavior [43, 44]. The emotion-driven, context-specific and idiosyncratic combination of these networks may contribute to interindividual and task-dependent differences in brain activation patterns, with the cerebellum adjusting and optimizing autonomic/motor and cognitive aspects of emotion through its vermis and lobule VIIa, respectively [40]. More precisely, as outlined in [6], emerging data suggest multiple representations of cognitive and affective processing that simultaneously engage focal areas within different, presumably three, cerebellar regions: (a) lobule VI to crus I, (b) crus II-lobule VIIB, and (c) lobules IX-X [45, 46]. These regions might be active in parallel along the above-described cerebellar-cerebral pathways depending on task complexity, suggesting multifaceted task- and goal-oriented mechanisms along the cerebellar-cerebral circuitries and the accompanying domain-specific intrinsic networks [45, 46].

The specific involvement of the cerebellum in the perception or recognition of certain emotional cues, and in forwarding them for further processing, has been assumed to result from circuitry serving specific emotional domains that is built up during brain development. From an evolutionary point of view, the cerebellar vermis and parts of the anterior lobe might have been incorporated first into emotion-related survival circuits. These circuits support rapid and stimulus-specific motor response execution, such as aversive or defensive behaviors, with preferential recruitment by negative emotions, including bodily expression of emotion and associated autonomic adjustments (e.g., pain resistance or arousal adaptation), and the detection of salient stimuli along the intrinsic salience network. Accordingly, the vermis has been assigned a central role in the associative processes involved in forming emotional memory traces across all substantial stages (from acquisition to consolidation, retrieval, and extinction), indicating that associative emotional learning is a core mode by which cerebellar areas process emotion [47]. In subsequent stages of phylogenesis, with the need for higher-order processing of emotional stimuli, the neocerebellum seems to have been integrated with bottom-up and, furthermore, top-down and goal-directed attentional, language and executive networks, including working memory, emotion regulation, and context- and knowledge-dependent response selection, thereby exerting volitional control over, and adaptation of, more complex emotion-driven behavior [6, 40].

The Silent Acquisition of High Fidelity Social Learning: The Prominent Role of the Cerebellum

To clearly illustrate the intertwining of social learning and the obviously concomitant technological learning in making stone tools, Stout and Hecht [13] described in detail how “high fidelity” social learning (e.g., involving theory of mind (ToM) of others; imitation) takes place between learners and teachers (and took place for tens of thousands of generations across roughly 1.7 million years between those ancient learners and teachers). In the following quote from Stout and Hecht we have highlighted cerebellum-critical movements, mental and social processes in italics, and added their respective supportive research studies:

Knapping is a “reductive” technology involving the sequential detachment of flakes from a stone core using precise ballistic strikes with a handheld hammer (typically stone, bone, or antler) to initiate controlled and predictable fracture. This means that small errors in strike execution can have catastrophic, unreversible effects. Experiments by Bril and colleagues have shown that fracture prediction and control is a demanding perceptual-motor skill reliably expressed only in expert knappers. Building on this work, Stout and colleagues found that even 22 mo (x̄= 167 h) of knapping training produced relatively little evidence of perceptual-motor improvement, in contrast to clear gains in conceptual understanding.

The key bottleneck in the social reproduction of knapping is thus the extended practice required to achieve perceptual-motor competence. This requires mastery of relationships, for example between the force and location of the strike and the morphology, positioning, and support of the core, that are not perceptually available to naïve observers and cannot be directly communicated as semantic knowledge. Attempts to implement semantic knowledge of knapping strategies before perceptual-motor skill development are ineffective at best, and such knowledge decays rapidly along knapping transmission chains when practice time is limited, even if explicit verbal teaching is allowed. For observational learning [between learner and teacher], the challenge is to translate visual and auditory information of another’s actions to appropriate motor commands for one’s own body. This may be accomplished by linking the observed behavior with preexisting internal models [Stout and Hecht are referring here to internal models in the cerebral cortex and not in the cerebellum] of one’s own body and actions through associative learning and stimulus generalization. Novel behaviors are copied by breaking them down into familiar action elements (e.g., lift, turn, twist), matching these, and reassembling. ([13], pp. 7862-63).

(See also [8] for a discussion of Stout and Hecht’s above description of “high fidelity” social learning.)

Below, Stout and Hecht’s foregoing account of the movement, mental, social, and emotional processes required in stone-tool making is (1) disassembled into a list of critical skill requirements in quotes, each followed in order by (2) a description of the critical functions of the cerebellum that are necessary in learning that skill and (3) appropriate supporting cerebellum research sources:

Cerebellum-Critical Skill Components in Stone-Tool Making

  1. “using precise ballistic strikes”: requires cerebellar optimization of attentional control [1, 15, 16];

  2. “initiate controlled and predictable fracture”: requires cerebellar optimization of attentional control [1, 15, 16, 18];

  3. “small errors in strike execution can have catastrophic, unreversible effects”: requires cerebellar control of the focus of attention during an emotion-laden task [1, 48];

  4. “fracture prediction and control is a demanding perceptual-motor skill reliably expressed only in expert knappers”: requires cerebellum-driven increases in the smoothness, appropriateness, and speed of movement and mental skills toward optimization of automaticity [1, 15, 16];

  5. “even 22 mo (x̄ = 167 h) of knapping training produced relatively little evidence of perceptual-motor improvement”: requires cerebellum-driven increases in the smoothness, appropriateness, and speed of movement and mental skills toward optimization of refinement [1, 8, 15, 16];

  6. “The key bottleneck in the social reproduction of knapping is thus the extended practice [italics added]”: requires cerebellum-driven increases in the learning of models of smoothness, appropriateness, and speed of movement and mental skills toward optimization [8, 15, 16];

  7. “For observational learning [between learner and teacher], the challenge is to translate visual and auditory information of another’s actions to appropriate motor commands for one’s own body”: It is the cerebellum that learns social-cognitive automaticity through models of Theory of Mind (ToM) and copying the behavior of others (teachers), first in the form of stone-tool construction and then, through generalization, artistic representations/constructions of others themselves [10, 17, 19, 21, 49];

  8. “Novel behaviors are copied by breaking them down into familiar action elements (e.g., lift, turn, twist), matching these, and reassembling”: It is the cerebellum that learns ToM and behavior models, which are blended and generalized in the cerebral cortex to form new physical and new imaginative mental configurations, including vocalization-to-language selection [21, 30]. While these internal models are blended and generalized in the cerebral cortex, they are only learned during the process of optimization in the cerebellum [1, 17, 20, 31].

The foregoing eight-part breakdown of Stout and Hecht’s [13] detailed description of stone-tool making provides a strong indication of the prominent role of the cerebellum in the eventual rise of Homo sapiens with the development of composite stone tools about 300,000 years ago (see Fig. 1). Because skill optimization through extended practice (component 6 above), which requires the learning of internal models in the cerebellum, was absolutely necessary, that role was perhaps predominant.

We propose that the overarching finding among these eight points is the role of social-cognitive learning that must occur through extended practice between the teacher and the learner, which requires extensive involvement of the cerebellum leading to sustained focus of attention and automaticity [1]. This view is based on the fact that extended practice must be done in settings conducive to observational learning (see components 5, 6, and 7; notably component 7). Directly in this regard, Van Overwalle, Manto, Leggio et al. [17] described in detail the involvement of the cerebellum in social learning:

We hypothesize that the cerebellum acts as a “forward controller” of social, self-action and interaction sequences. We hypothesize that the cerebellum predicts how actions by the self and other people will be executed, what our most likely responses are to these actions, and what the typical sequence of these actions is. This function of forward controller allows people to anticipate, predict and understand actions by the self or other persons and their consequences for the self, to automatize these inferences for intuitive and rapid execution [italics added], and to instantly detect disruptions in action sequences….The cerebellum would be a “forward controller” that not only constructs and predicts motor sequences, but also takes part in the construction of internal models that support social and self-cognition. In this respect, the cerebellum crucially adds to the fluent understanding of planned and observed social inter-actions and contributes to sequencing mechanisms that organize autobiographical knowledge [48]. (p. 35).

Van Overwalle, Manto, Leggio et al.’s forward controller function of the cerebellum would be fundamental to the overarching processes of observational learning required in Stout and Hecht’s [13] description of stone-tool making. Moreover, within this context of observational learning, this forward controller capacity to “predict and understand actions by the self or other persons and their consequences for the self” would require the cerebellum-driven sustained focus of attention and optimization described in detail earlier by Akshoomoff, Courchesne, and Townsend [1]. During progressive stages of this social-cognitive cerebellar optimization, both within the lifetime of the individual and in the evolution of Homo sapiens, cerebellar optimization would underlie progressively more predictive solutions to problems. These progressively optimized new solutions would be fed forward to conscious working memory in the cerebral cortex, where blending of the fed-forward cerebellar models would occur [31,32,33], and they would be experienced as sudden insights or “intuitions.” This, we believe, is the cerebellar feedforward origin of the “a ha” experience, which perhaps takes place in working memory in the right anterior temporal lobe of the cerebral cortex, as reported in [50]. Importantly here, depending on the kind of prediction to be implemented, different sectors of the cerebellum would be involved: the anterior cerebellum is critical for motor resonance mechanisms, whereas posterior cerebellar sectors mediate both mechanisms involved in basic socio-emotional functions (e.g., identifying a biological movement and predicting action intention, medial posterior cerebellum) and higher-level social inferential/predictive processes (e.g., inferring an emotional state or an intention on the basis of a given context, lateral posterior cerebellum) [49].

The idea that the cerebellum evolved as a fast information-processing adjunct to the association cortex within the context of the evolution of tool manufacture and use is strongly supported by the fact that the newly lateralized regions of the cerebellum readily modularize for the actual, imagined, and observed use of tools [3, 33]. Figure 3 contains a partial list of the 16 tools which Imamizu and Kawato found to modularize in the cerebellar cortex. Note that during the pianists’ actual performances, the piano was also found to modularize in the cerebellar cortex [51]. In the eyes of the cerebellum, musical instruments are apparently learned and modularized much as are “tools.” Moreover, recent studies of the human cerebellum indicate that even the observation and imagination of the manipulation of tools are represented in the cerebellum [52]. Further, Henschke and Pakan argued that these functions would have occurred through social interaction, thus supporting Van Overwalle, Manto, Leggio et al.’s above contentions.

Fig. 3

A flattened view showing the posterior cerebellum appears on the left. The cognitive areas of the cerebellum expanded three- to four-fold in size in the last million or so years. The upper portions of the cognitive areas are modularized for tools. The two-way arrows in the brain illustration on the right depict in a simplified way the cerebellum’s massive number of two-way connections throughout the cerebral cortex. The 40 million nerve tracts between the cerebellum and the cerebral cortex are the most numerous in the brain, 40 times more than the one million that connect the eyes with the visual cortex [23]. Note: Fig. 3 edited by K. Weathers Illustrations (kweathers10@mywhitworth.edu).

Conclusions

It is concluded that the evolution of the cerebellum was prominent in the rise of Homo sapiens because repetition of movement and attentional control in social cognition were critical in the selective adaptation leading to the development of composite tools and significant cumulative culture 300,000 years ago [11, 21, 25]. Within stone-tool technologies, observational learning involving social-cognitive internal models in the cerebellum [17] was the dominant evolutionary brain context for the development of composite tools, cumulative culture, and the rise of Homo sapiens.

Cerebellar internal models (1) generalized toward optimization of attentional control of prediction of movement and thought [4, 20] and (2) were blended in the cerebral cortex toward optimization [31] of attentional control of movement and thought, resulting in creativity. Supporting these findings, the recently evolved lateral areas of the cerebellum readily modularize during the actual use, observation, and imagined use of a variety of tools [3, 33]; for a general discussion of these cerebellar processes, see [52].

The Combined Cerebro-Cerebellar Origins of Technology and Art–Stone Tools, Copying and Art

Of the above-cited eight cerebellum-critical evolutionary selection parameters from the analysis in [13] of the brain selection parameters necessary to the evolution of stone-tool making, the following two (numbers 7 and 8) appear to be especially significant to the evolutionary emergence of art among Homo sapiens:

7. “For observational learning [between learner and teacher], the challenge is to translate visual and auditory information of another’s actions to appropriate motor commands for one’s own body”: It is the cerebellum that learns social-cognitive automaticity through models of Theory of Mind (ToM) and through copying the behavior of others (teachers), first in the form of stone-tool construction and then, through generalization, in artistic representations/constructions of others themselves [1, 9, 17, 21, 48, 49];

8. “Novel behaviors are copied by breaking them down into familiar action elements (e.g., lift, turn, twist), matching these, and reassembling”: It is the cerebellum that learns ToM and behavior models, which are blended and generalized in the cerebral cortex to form new physical configurations, new imaginative mental configurations (including vocalization-to-language selection [21, 30]), and new emotional configurations, including those underlying art. While these internal models are blended in the cerebral cortex, they are first learned during the process of optimization only in the cerebellum [1, 9, 20, 31].

Thus, beginning with the first appearances of Lower Paleolithic Oldowan stone-tool making (perhaps 2 million years ago), such repetitive observation from very early childhood was in essence cerebro-cerebellar-driven repetitive "copying" not only of the challenging movement patterns and related ToM of a teacher(s) but also of those of animals. Within this social context, copying (via observation) of movement, objects, people, and animals would have been constantly generalized/blended with stone tool-related hand movements in the progressively emerging lateral regions of the cerebellum [3, 4, 14, 16, 20, 32] to refine prediction in both imagination and action toward optimization of cognitive, social, and behavioral expression. This would have applied not only toward constant improvement of prediction in skilled movements and imaginative thought for social copying [1] but would also eventually have become the basis of art, which consisted of hand movements for the expression of the above-copied animals, objects, and humans. This view is supported by the recent work on the role of lateral areas of the cerebellum in art by [19, 49]. According to this view, art arose from predominantly cerebellar refinements of Homo sapiens’ technologies of stone-tool making about 300,000 years ago. Furthermore, predictive mechanisms implemented by the cerebellum may also be involved in the perception of artistic creations. Indeed, an interesting model suggests that a temporary state of unpredictability (i.e., prediction error) is fundamental for the emergence of perceptual pleasure when encountering a work of art [53]. The available evidence suggests that posterior cerebellar regions, predominantly Crus I and II, should be considered of particular interest in diverse large-scale neural networks of aesthetic processing, including art perception and appreciation as well as creative thinking.
The cerebellar functional significance in these processes may rely on its ability to implicitly implement and coordinate both low-level sensorimotor predictive mechanisms and higher-level inferences requiring the appraisal of the cognitive and affective salience of stimuli. Within this same framework (notably, the cerebellar unconscious anticipatory forward control of attention found by [1] and supported by [17]), it is argued that the cerebellum feeds forward unconscious control of optimum sequences of thought and imagination which meet prospective problem-solving goals in working memory related to advances in science, technology, and culture [7, 8, 21]. The reception of such unconscious optimum sequences in ongoing working memory may be experienced as sudden intuition that produces solutions to such problems [8, 21, 30], perhaps the highest mental function associated with the evolution of sapience in Homo sapiens.

The evolution of sapiens’ ability to imagine realities, as described above in point 8 of the stone-tool making analysis in [13], translates well to what the anthropologist cited in [54, 55] referred to as sapiens’ ability to create “arbitrary forms,” forms that do not exist in nature. Such imagined arbitrary forms, such as advances in stone tools, were naturally selected into the cerebro-cerebellar system because they allowed sapiens to more powerfully predict and control nature. In support of these strong imaginative and intuitive optimization roles of the cerebellum, Einstein [56] felt strongly that intuition in problem-solving is the only path to the formation of new ideas:

A new idea [a new concept in working memory] comes suddenly and in a rather intuitive way. That means it is not [italics added] reached by conscious logical conclusions. But thinking it through afterwards you can always discover the reasons which have led you unconsciously to your guess and you will find a logical way to justify it. Intuition is nothing but the outcome of accumulated earlier intellectual experience. (Footnote 2)

Within the overall findings presented in this article, the “accumulated earlier experience” mentioned by Einstein is modeled and optimized in the cerebellum. In this view, such conceptual intuition may thus be best understood primarily as (1) a function of cerebellar unconscious forward control as described by [17] and (2) a key part of what drives cumulative culture [21].