Introduction

Recent studies pointed to the cerebellum as an important contributor to action understanding and high-level social cognition operations [1, 2]. Indeed, it has been consistently found to be activated during both action execution and observation [3], being involved in motor simulation of observed actions [4], and it has been held to be a necessary node of the action observation network (AON; [5]). Accordingly, patients with cerebellar alterations are impaired in action perception [6, 7] and these deficits may contribute to their social cognition deficits [8,9,10], known as the cerebellar cognitive affective syndrome (CCAS; [11]). What the specific role of the cerebellum in action processing is, however, has remained unclear.

In the last decades, the concept of prediction has assumed a central role in understanding how human beings perceive the world and act in conditions of uncertainty [12]. In particular, the view of a “predictive brain,” which constantly generates inferences and predictions about sensory inputs through internal models, is closely tied to the widespread adoption of the Bayesian approach in neuroscience [13]. Bayesian statistical theory is a mathematical framework for inference, according to which the probability of an event given the available evidence (i.e., the posterior) is proportional to the product of prior knowledge and the likelihood of newly acquired data. According to this predictive-coding account, perception can be considered the result of a hierarchical inference in which top-down and bottom-up processes interact at every stage to check the match between the internal model of the expected input and the incoming sensory information. The mismatch between top-down predictions and sensory input generates a prediction error that is used to update internal models [14]. While extensive evidence on predictive-coding mechanisms has been gathered in perception [15, 16] and sensorimotor control [17, 18], it has recently been proposed that a similar mechanism could account for both the low-level and the high-level mechanisms underlying social cognition [19, 20].
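
As a compact reminder of the Bayesian relation invoked in this paragraph, the posterior can be written as follows. The symbols H (a hypothesis about the cause of the input) and D (the incoming sensory data) are notation introduced here for illustration only and do not come from the cited works.

```latex
P(H \mid D) \;=\; \frac{P(D \mid H)\,P(H)}{P(D)} \;\propto\; P(D \mid H)\,P(H)
```

Here P(H) is the prior, P(D | H) is the likelihood of the data under the hypothesis, and P(D) is a normalizing constant that does not depend on H.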

Predictive coding appears to be a promising model to explore human behavior in dynamic social environments and, in particular, it may explain action understanding, which is a crucial hallmark of social cognition [21]. Motor simulation theories suggest that the activation of the motor system allows the comprehension and anticipation of others’ actions [22] through the inversion of forward models of action execution, ultimately leading to the prediction of others’ motor intentions. This mechanism relies on the assumption that actions driven by different intentions are performed differently and can thus be recognized by specific, reliable movement kinematics [23, 24]. Nevertheless, in conditions of perceptual uncertainty, visible kinematics may be too ambiguous, limiting our ability to uniquely infer the meaning of the observed action [25, 26]. Hence, the combination of both observed kinematics (i.e., data, in the Bayesian account) and internal models (i.e., priors) may be required to reduce perceptual uncertainty and efficiently reach the representation of others’ actions (i.e., posterior) [27, 28]. Predictive coding accounts overcome the limits of motor simulation by proposing that incoming kinematic information is compared with top-down predictions at all levels of cortical processing [26, 28, 29]. Accordingly, previous studies reported that visual familiarity [30] and motor experience [31,32,33] with the observed action facilitate the retrieval of its kinematic pattern, thus allowing prediction of its outcome ahead of its completion. This expert prediction ability has been linked to modulation of the parietal, temporal, and frontal areas that are part of the cortical AON [34,35,36].
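
The combination of observed kinematics (data) and internal models (priors) into a posterior can be illustrated with a minimal sketch. All numbers, the two intention labels, and the function name `posterior` are hypothetical values chosen for illustration; they are not estimates from the cited studies.

```python
def posterior(prior, likelihood):
    """Combine a prior over intentions with a kinematic likelihood (Bayes' rule)."""
    unnormalized = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}


# Hypothetical contextual prior: this context usually accompanies "eat".
context_prior = {"eat": 0.9, "offer": 0.1}

# Hypothetical likelihoods of the observed kinematics under each intention.
ambiguous_kinematics = {"eat": 0.5, "offer": 0.5}   # early, uninformative movement
clear_kinematics = {"eat": 0.05, "offer": 0.95}     # late, informative movement

# With ambiguous kinematics the posterior simply mirrors the prior;
# with clear kinematics the data can override the prior.
print(posterior(context_prior, ambiguous_kinematics))
print(posterior(context_prior, clear_kinematics))
```

In this toy case the prior dominates only when the kinematic evidence is uninformative, which is the situation of perceptual uncertainty described above.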

However, actions do not occur in isolation but are embedded in specific physical and social contexts, and the context in which an action takes place constrains the likelihood of the underlying intention [37, 38]. In other words, contextual information provides an early prior that guides the selection of the most likely action representation [39]. Accordingly, Amoruso and colleagues (2018) demonstrated that motor activation during action observation is modulated by context-based expectations and that this modulation likely involves cortical areas outside the motor system, such as the dorsolateral prefrontal cortex and middle temporal areas [40]. Nevertheless, it is still unclear how contextual priors are activated during action observation and then matched with incoming body kinematics, and which brain areas may undertake these functions.

The cerebellum is engaged in low-level computations combining forward modeling and stored memory to anticipate incoming information [41]. In this vein, on the basis of its uniform anatomical structure and its multiple connections with cortical areas, the “universal cerebellar transform hypothesis” (UCT; [42, 43]) has been proposed. This hypothesis holds that the cerebellum applies a universal computational mechanism, based on prediction and error signaling, to inputs conveyed in different cortico-cerebellar loops [44, 45]. Thus, beyond its engagement in motor control and simulation, this hypothesis assigns the cerebellum a specific role in integrating multiple cortical commands and contextual information to generate predictive internal models: these models are then matched with sensory feedback to provide a prediction error signal to cortical processing. This predictive computational mechanism may apply in multiple domains spanning from language to action processing [46,47,48]. Previous research has explored the relationship between cerebellar activity, prediction, and context in the linguistic domain (for a review, see [49]), while little is known about how contextual information embedded in internal models influences and modulates the processing of others’ actions. Here, we tested the hypothesis that the cerebellum plays a role in action prediction by extracting the spatio-temporal regularities of the embedding context to provide priors that explain away the incoming sensory kinematic information, thereby contributing to the selection of the most likely intention of an action. Integrating cerebellar functions into a predictive coding framework could provide new insights not only into the specific mechanisms involved in action prediction but also into their implications for the multiple neuropsychiatric symptoms of the CCAS [8, 50, 51].

To verify our hypothesis, in this study we investigated whether cerebellar damage could interfere with the representation of contextual priors during action prediction, thus affecting the understanding of others’ behavior in conditions of perceptual ambiguity. To this aim, we used an action prediction task composed of a probabilistic-learning (familiarization) phase followed by a testing phase [52]. The task was administered to children and adolescents with typical development (TD) and to age- and gender-matched survivors of a brain tumor affecting (infra-tentorial tumors; ITT) or sparing (supra-tentorial tumors; STT) the cerebellar areas. In the familiarization phase, participants were exposed to short videos showing a child grasping different objects with one of two distinct outcomes (e.g., grasping to eat or to offer) and were asked to recognize the action outcome. In this phase, the videos were interrupted when the hand of the actor reached the object, thus showing clear kinematic information that participants could use to predict the action outcome. Crucially, each action (e.g., grasping an apple to eat) was associated with a specific contextual cue (e.g., a red plate), with a pre-established probability of co-occurrence (i.e., 10, 40, 60, or 90%). In this way, we promoted the building of contextual priors, that is, expectations about specific spatio-temporal cues acquired through short-term learning, which are more easily handled in an experimental setting than structural priors, which reflect innate or long-term learned knowledge (e.g., the light-comes-from-above assumption) [52]. In the testing phase, presentation of the same videos was interrupted at an early reaching phase, well before the hand-object contact occurred, and participants were asked to predict the outcome of the action.
Since kinematic information was incomplete, we expected that participants’ responses would be biased toward the contextual prior implicitly learned in the familiarization phase (e.g., predicting a grasping to eat when the apple was on a red plate). This is what has been previously observed [52] and what we expected in TD children. Regarding patients with brain tumors, net of any overall impairment due to the brain tumor and its therapy, we expected different patterns of performance in ITT and STT patients. In particular, in keeping with the hypothesis of a specific role of the cerebellum in providing contextual prior representations of ongoing actions, we expected that ITT patients would not use contextual priors to compensate for the lack of kinematic information, thus showing no contextual modulation and a worse performance in high-probability associations than the STT and TD groups. Furthermore, such a deficit in taking advantage of contextual priors should be related to social perception deficits as measured with standardized neuropsychological tests. In contrast, according to a general motor simulation role of the cerebellum, ITT patients should present with general impairments in predicting actions independently of their probabilistic associations with the embedding context and should show a contextual modulation comparable to that exhibited by the STT and TD groups.

Methods and Materials

Participants

Forty-two children and adolescents with a previous diagnosis of brain tumor were enrolled in the study. All patients were referred to the Neurooncological and Neuropsychological Rehabilitation Unit of Scientific Institute IRCCS E. Medea (Bosisio Parini, Italy) for routine clinical and functional evaluation after illness and oncological treatments. Study inclusion criteria were (i) a previous diagnosis of brain tumor with no active disease at the moment of study inclusion; (ii) age ranging from 7 to 20 years; (iii) no ongoing oncological treatment; (iv) absence of moderate or severe cognitive delay; (v) absence of severe motor and sensorial diseases that could interfere with the execution of the task; and (vi) diagnosis of a tumor located only in the supra-tentorial or infra-tentorial region and not involving both these areas.

For all participants, a chart review was conducted in order to collect clinical (age at diagnosis, time since diagnosis, tumor site, tumor type, oncological treatments received, presence of hydrocephalus), demographic (sex and age at evaluation), and cognitive (full scale intellectual quotient, FSIQ; verbal intellectual quotient, VIQ; and perceptual reasoning intellectual quotient, PRIQ) information. Twenty-one patients were included in the supra-tentorial group (STT) and 21 in the infra-tentorial group (ITT). The classification of brain tumor patients as STT and ITT was in keeping with previous research examining neuropsychological outcomes after childhood brain damage [53]. With respect to demographic variables, no differences were found between the two groups for sex (χ²(1) = 2.38; p = 0.12) and age at evaluation (t(40) = 0.90; p = 0.37). As regards clinical variables, no differences were found for age at diagnosis (t(40) = −1.30; p = 0.20), radiotherapy (χ²(1) = 0.53; p = 0.47), chemotherapy (χ²(1) = 1.54; p = 0.21), neurosurgery (χ²(1) = 3.36; p = 0.07), and hydrocephalus (χ²(1) = 0.00; p = 1.00). In contrast, time since diagnosis was significantly longer in STT (M = 86.17 months; SD = 39.86) than in ITT patients (M = 54.44 months; SD = 39.01) (t(40) = 2.61; p = 0.01) and, thus, was considered in statistical analyses. Moreover, the two groups differed in tumor type (χ²(1) = 14.25, p < 0.01), with astrocytoma (19.0%) and ependymoma (19.0%) being the most frequent diagnoses in STT patients and medulloblastoma in ITT patients (57.1%); this was expected based on previous reports [54]. With respect to cognitive functioning, no differences were found in FSIQ (t(40) = −0.02; p = 0.99), VIQ (t(40) = 0.19; p = 0.85), and PRIQ (t(40) = −0.42; p = 0.68). Detailed demographic, clinical, and cognitive variables in ITT and STT patients are reported in Table 1.

Table 1 Demographic and clinical variables of brain tumor participants

A group of 21 TD children and adolescents with no previous history of neurological or psychiatric disorders was recruited at local schools and served as the control group. This group was comparable to the two clinical samples for age (F(2, 60) = 0.81; p = 0.45) and sex (χ²(2) = 0.90, p = 0.64). The study was approved by the Ethics Committee of the Scientific Institute IRCCS E. Medea (Prot. N.34/18–CE) and procedures were in accordance with the 1975 Declaration of Helsinki. Eligible participants were identified by the attending physician and reported to a research assistant, who provided parents with full information about the study and asked them to sign an informed consent form before starting the experimental session. Children, who were naïve to the aims and hypotheses of the experiment, gave their verbal assent before starting the procedure. At the end of the experiment, children and their parents were further debriefed about the scope and the design of the study.

Stimuli and Experimental Task Procedure

We adopted visual stimuli validated in a previous study with pediatric populations [52]. In all videos, a male child (10 years old) sat in front of another boy of the same age and executed a reach-to-grasp movement with the right hand toward one of two objects (i.e., an apple or a glass). The two actions, clearly identifiable through kinematics, were grasping-to-eat (apple) or grasping-to-drink (glass) and grasping-to-offer, thus corresponding, respectively, to an individual or an interpersonal outcome. Notably, each object was presented with a specific contextual cue, namely an orange or a violet dish for the apple and a blue or a white tablecloth for the glass.

Participants were tested in a single session lasting approximately 60 min. Before the experiment, participants were introduced to the objects (i.e., apple and glass) that would be presented in the videos and were informed about the different feasible object manipulations associated with either individual (i.e., to eat/drink) or interpersonal actions (i.e., to offer). The experimental task consisted of two blocks and lasted ~40 min. Each block comprised a familiarization phase (80 trials) immediately followed by a testing phase (40 trials), for a total of 120 trials per block (160 familiarization trials and 80 testing trials across the two blocks). Short breaks were allowed between blocks and phases. In both the familiarization and testing trials, participants were presented with an action video and were required to predict the final outcome of the action (i.e., to eat/drink vs. to offer) in a two-alternative forced choice (2AFC) task.

Each trial started with a fixation cross lasting 2000 ms followed by the video-clip presentation. In the familiarization phase, all videos had the same duration of 25 frames for a total of 1000 ms; conversely, in the testing phase, length was set at 15 frames, for a total of 600 ms, in order to occlude from view the last part of the reaching phase. Immediately after the videos, a prompt frame reporting the Italian verbal descriptors of the two possible goals (i.e., “mangiare,” to eat, and “offrire,” to give, for the apple or “bere,” to drink, and “offrire,” to give, for the glass) was presented. The two goal descriptors, written in white on a black background, were located on the right and on the left side of the screen, and participants were requested to respond by pressing the “m” or the “z” computer key with their right or left index finger to indicate the goal descriptor on the right or on the left, respectively. A QWERTY keyboard was used and white stickers were placed on these response keys to facilitate identification. The location of the two descriptors was counterbalanced between participants and was consistent across trials for each participant. The prompt frame remained on the screen until a response was recorded. Examples of trial structure and timeline of the two phases are reported in Fig. 1.

Fig. 1 (a) Trial structure, timeline, and examples of probabilistic contextual cue-action associations in the familiarization phase. (b) Trial structure and timeline in the testing phase

During the familiarization phase, a temporal occlusion paradigm was used in order to stop the videos (1000 ms long) two frames before the model touched the object. Thus, participants could not observe the grasping movement itself, but only the pre-shaping of the hand configuration during the reaching phase of the movement. The amount of visual information available was nonetheless high, as revealed by a previous validation study [52]. Half of the trials were presented with the apple and the other half with the glass. Crucially, in this phase, the association between contextual cues (i.e., color of the plate for the apple and of the tablecloth for the glass) and actions (i.e., to eat/drink and offer) was implicitly biased with a pre-established probability of co-occurrence: 10% (8 trials), 40% (32 trials), 60% (48 trials), 90% (72 trials). Action-context associations were counterbalanced between participants and remained constant in the two blocks of familiarization. Since neither explicit information about the associations between contextual cues and actions nor trial-by-trial feedback following participants’ responses was provided, participants remained completely naïve to the existence of underlying statistical regularities in the co-occurrence between actions and contextual cues.
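
The trial counts listed above are consistent with each percentage being taken over 80 trials, so that the complementary conditions (10% with 90%, 40% with 60%) each add up to 80 trials and all four add up to the 160 familiarization trials. The sketch below only checks this arithmetic; the constant `TRIALS_PER_PAIR` and the grouping into complementary pairs are an inference from the reported counts, not an explicit statement of the design.

```python
# Assumption: each complementary pair of conditions (10/90 and 40/60)
# spans 80 familiarization trials across the two blocks.
TRIALS_PER_PAIR = 80

probabilities = [10, 40, 60, 90]  # percent co-occurrence of cue and action

# Number of trials per probabilistic condition.
trial_counts = {p: p * TRIALS_PER_PAIR // 100 for p in probabilities}

total_trials = sum(trial_counts.values())

print(trial_counts)   # counts per condition
print(total_trials)   # total familiarization trials
```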

Within the testing phase, presentation of the same videos was interrupted at an early reaching phase (600-ms-long). Hence, the amount of visual information available to the children was considerably reduced in order to create a condition of perceptual ambiguity. In this phase, each action was presented embedded in each context for the same number of trials (10 trials per block for a total of 20 trials per condition). Since movement kinematics were ambiguous, we expected that children’s responses would be implicitly biased toward contextual priors acquired during the familiarization phase.

Neuropsychological Evaluation

After the experimental task, the two clinical groups underwent a neuropsychological evaluation lasting 20 min. We assessed the two clinical samples with the Social Perception subtests of the Italian version of the NEPSY-II [55, 56]. The theory of mind (ToM) subtest is composed of two parts. The verbal part uses verbal or pictorial descriptions of social situations in order to assess the ability to understand mental constructs, such as beliefs and intentions, and to appreciate that other people may have thoughts, emotions, and perspectives that differ from one’s own. Conversely, the non-verbal part evaluates the ability to infer others’ emotions and mental states from the social context. The Affect Recognition subtest assesses the ability to recognize affective states from emotional facial expressions using pictures of children. Raw scores of the two parts of the ToM subtest and of the Affect Recognition subtest were transformed into T-scores in accordance with the distribution of the age-matched normative values for the Italian sample [56]. In this way, we obtained separate standardized scores for the verbal and the non-verbal parts of the ToM subtest and we avoided the approximation at the low and high extremes adopted in standard conversion tables, thus including negative numbers for performance lower than −3.33 SD from the normative mean.

Data Reduction and Statistical Analysis

We excluded from analyses trials with anticipated or out-of-time responses (RT < 150 or > 5000 ms). For the familiarization phase, we calculated the individual mean percentage of correct responses (accuracy) across trials and entered it as the dependent variable into a one-way analysis of variance (ANOVA) with group as between-subjects factor. For the testing phase, we calculated the individual mean percentage of correct responses (accuracy) separately for each action-contextual cue association. Then, we conducted a two-way mixed-model, repeated-measures 4 × 3 ANOVA, with probability (10 vs. 40 vs. 60 vs. 90%) as within-subject variable and group as between-subjects factor. We adopted Duncan’s post-hoc test to correct for multiple comparisons when analyzing significant interaction effects. This sequential post-hoc test reduces the size of the critical difference depending on the number of steps separating the ordered means and is optimal for testing, within the same design, effects that may have different sizes [57,58,59]. In order to better explore the difference between groups in the probabilistic effect and in line with previous research [52], for each participant we calculated a standardized beta coefficient across trials of the testing phase by running a regression analysis with probability and accuracy as the independent and dependent variables, respectively. This beta coefficient provided an index of the effect of the probabilistic context-action associations on the performance in the action prediction task, thus representing a measure of the strength of the contextual priors. Then, we ran an ANOVA with the beta coefficient as the dependent variable and group as categorical factor.
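
The per-participant beta index described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually run in Statistica: trial-level accuracy is assumed to be coded as 0/1, the trial dictionaries and the function names `standardized_beta` and `beta_index` are hypothetical, and the example data are invented. It relies on the fact that, in a regression with a single predictor, the standardized beta equals Pearson's correlation.

```python
from statistics import mean, pstdev


def standardized_beta(x, y):
    """Standardized slope of a simple regression of y on x (equals Pearson's r)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when either variable is constant
    covariance = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return covariance / (sx * sy)


def beta_index(trials, rt_min=150, rt_max=5000):
    """Beta index for one participant, after dropping trials with RT < 150 or > 5000 ms."""
    valid = [t for t in trials if rt_min <= t["rt"] <= rt_max]
    probabilities = [t["probability"] for t in valid]
    correct = [t["correct"] for t in valid]
    return standardized_beta(probabilities, correct)


# Invented example: accuracy rises with the probability of the association.
example = {10: [0, 0, 1, 1], 40: [0, 1, 1, 1], 60: [1, 1, 1, 0], 90: [1, 1, 1, 1]}
trials = [
    {"probability": p, "correct": c, "rt": 800}
    for p, outcomes in example.items()
    for c in outcomes
]

print(beta_index(trials))  # positive when accuracy increases with probability
```

A positive index indicates that responses follow the learned associations, a value near zero indicates no contextual modulation, and a negative value indicates the opposite trend.
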
To verify the impact of clinical variables on the use of contextual priors, after excluding collinearity between the predictors, we entered the beta index as the dependent variable of a linear regression model with the standardized scores on the social perception subtests, FSIQ, and time since diagnosis as predictors. We reported effect sizes as partial eta squared (η²p), adopting conventional cut-offs of η²p = .01, .06, and .14 for small, medium, and large effect sizes, respectively [60]. Moreover, we reported data as mean and standard error of the mean (SEM). The significance threshold was set at p = 0.05 for all statistical tests. All analyses were implemented using the Statistica software version 8 (Statsoft, Tulsa, OK).
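
For readers who wish to check effect sizes against test statistics, partial eta squared can be recovered from an F value and its degrees of freedom, since η²p = SS_effect / (SS_effect + SS_error) = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function names are hypothetical, and the "negligible" label for values below .01 is an assumption added here, not part of the cited cut-offs.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def effect_size_label(eta_p2):
    """Label an effect using the .01 / .06 / .14 cut-offs described in the text."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"  # assumption: label for values below the smallest cut-off


# Example with an F statistic on 2 and 60 degrees of freedom.
eta = partial_eta_squared(13.67, 2, 60)
print(round(eta, 2), effect_size_label(eta))
```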

Results

Accuracy

The ANOVA on accuracy in the familiarization phase did not reveal any significant difference between groups in predicting actions displayed almost in full during the reaching phase (F(2, 60) = 0.91; p = 0.41; η²p = 0.03).

The two-way mixed-model, repeated-measures 4 × 3 ANOVA on accuracy during the testing phase revealed a main effect of probability (F(3, 180) = 3.25; p = 0.02; η²p = 0.05), further qualified by a significant probability × group interaction (F(6, 180) = 3.34; p = 0.004; η²p = 0.10), suggesting that the probabilistic modulation differed between groups. This was corroborated by the Duncan post-hoc tests. Indeed, the TD group was significantly more accurate for the 90% action-context association than for the 10% condition (89.33 ± 1.84% vs. 67.05 ± 5.63%, p < 0.01). In this group, accuracy for the low-probability condition (i.e., 10%) was also lower than accuracy for both the intermediate 40% (79.86 ± 4.50%, p = 0.03) and 60% (79.76 ± 5.24%, p = 0.03) associations. In the STT group, accuracy for the high-probability condition (i.e., 90%) was higher than accuracy for the 60% condition (80.33 ± 2.91% vs. 67.86 ± 5.51%, p = 0.04). Conversely, within the ITT group, no significant differences between probabilistic associations emerged (all p > 0.20), suggesting that performance was not affected by contextual priors. The between-groups comparisons revealed that the TD and the STT groups showed a comparable performance at all probabilistic associations (all p > 0.11), while TD individuals outperformed ITT patients for the high-probability condition (p = 0.01) (Fig. 2). Overall, these results pointed to a reliable effect of the contextual priors in both TD and STT participants, while ITT patients did not seem to benefit from previous learning of probabilistic action-context associations.

Fig. 2 Accuracy in familiarization and testing phase. Asterisks indicate significant comparisons (p < 0.05), error bars represent SEM

Beta Index

The pattern of results on accuracy was further corroborated by the analysis of the beta indexes, namely the individual coefficients of the regression between probability and accuracy. Indeed, the ANOVA on the beta index yielded a significant effect of group (F(2, 60) = 13.67; p < 0.01; η²p = 0.31), confirming the differences between groups in using contextual information to predict the unfolding action. In detail, the TD group presented a stronger modulation by contextual priors (0.55 ± 0.07) compared to both the STT (0.11 ± 0.13; p = 0.01) and the ITT (−0.23 ± 0.11; p < 0.01) groups. In turn, ITT patients showed a lower influence of contextual priors on their performance compared to STT patients (p = 0.03) (Fig. 3).

Fig. 3 Beta index results. Asterisks indicate significant comparisons (p < 0.05), error bars represent SEM

Regression Analysis

Preliminary Pearson’s correlation analyses revealed a positive association between FSIQ and the T-scores on both the non-verbal part of the ToM (r = 0.42, p = 0.01) and the Affect Recognition subtests (r = 0.38, p = 0.01), but these associations were not strong enough to indicate collinearity. Thus, all predictors were retained in the regression analysis. The whole model was marginally significant (F(5, 36) = 2.36; p = 0.06; adjusted R² = 0.14), with the T-score on the non-verbal part of the ToM as a significant positive predictor (β = 0.36; t(36) = 2.21; p = 0.03) (Fig. 4), while all other variables were non-significant (all |t| < 1.90; all p > 0.07). The coefficients are reported in Table 2.

Fig. 4 Plot of the association between the T-score at the non-verbal part of the ToM subtest and the Beta index. Dotted gray lines indicate 95% confidence interval, black circles represent individual values

Table 2 Coefficients of the linear regression model with the beta index as dependent variable

Thus, this analysis confirmed that time since diagnosis, which differentiated the clinical profile of STT and ITT participants, did not influence contextual modulation effects.

Discussion

In the present study, we investigated whether the role of the cerebellum in action observation could be fully accounted for by a motor simulation mechanism (i.e., covert rehearsal of the same motor programs used during action execution) or could be better explained within a predictive coding framework (i.e., using contextual priors to predict ongoing actions). To this aim, we compared the performance of child and adolescent survivors of a brain tumor in the cerebellar areas (ITT) with that of child and adolescent survivors of a brain tumor in a supra-tentorial location (STT), thus not involving the cerebellum. In this way, we tested whether damage to the cerebellum would affect action observation differently from damage to another brain area. A group of TD peers was also involved in the study, with the aim of comparing the performance of individuals with brain damage with that of individuals with a typically developing brain and, eventually, of exploring the nature of the differences. To assess the ability to predict actions by using contextual cues, we used an action prediction task consisting of a familiarization phase, in which participants are exposed to diverse levels of action-context probabilistic associations, and a testing phase, in which the same videos are presented in a drastically shortened form. We thus compared across groups the extent to which the probabilistic learning of the action-context associations in the familiarization phase biased the prediction of actions presented in a condition of perceptual ambiguity in the testing phase. In addition, we verified the impact of clinical variables of brain tumor survivors on the modulatory effect exerted by contextual priors in the action prediction task. We found that both the STT and ITT groups were overall impaired in the action prediction task as compared to the TD group, in keeping with the notion that the cerebellum is a crucial node of the cortico-subcortical AON that supports action perception [3, 5, 6].
However, while both TD and STT participants presented a contextual modulation of the action prediction performance, with higher response accuracy at higher probabilities of action-context associations, the performance of ITT participants was not affected by the action-context association learning. This favors our hypothesis of a specific role of the cerebellum in providing the AON with context-based predictions about what is to be expected from others in given situations.

Previous studies have demonstrated the role of top-down contextual information in action coding at diverse levels of representation [37, 38], especially in perceptually uncertain situations [27, 61]. While an involvement of associative cortical areas in generating context-based priors has been proposed [40], here we reported evidence of a direct contribution of the cerebellum to either forming or using contextual priors. A considerable body of literature has highlighted that the cerebellum could have the anatomical and functional features to be the locus of predictive models even for non-motor functions [62, 63]. In our action prediction task, forward modeling implemented by the cerebellum would allow for the processing of the co-occurrence between contextual features (i.e., object color) and specific body kinematics (i.e., reach-to-grasp configuration). This processing would result in a contextual prior of the observed action as a final output. As proposed by Sokolov and colleagues (2017), the cerebellum can feed the output of the forward model to the associative cortical areas through reciprocal connections with different network nodes. This would point to a cerebellar influence at any stage of stimulus processing rather than only on the final outcome of the process [47]. Therefore, the cerebellum could play a silent role when a great amount of sensory information is available, but it may assume a more prominent role in conditions of uncertainty [64, 65]. Accordingly, all participants in our study were able to predict the action outcome in the familiarization phase, since kinematics were sufficiently informative to infer the overarching goal of the action and, thus, the intervention of top-down contextual modulation was not necessary [66, 67]. Further, this result assured us that all participants were equally exposed to the action-contextual cue probabilistic associations.
A deficit in encoding contextual regularities during the familiarization phase would be in keeping with the function of the cerebellum in learning stimulus-outcome associations [68, 69] and in extracting and representing abstract combinations and higher-order rules [70, 71]. Nevertheless, since in the familiarization phase all groups performed comparably, we cannot exclude that ITT patients learned the probabilistic associations, but they did not use contextual priors to predict others’ actions in the testing phase. Further research is warranted to better delineate whether cerebellar lesions could affect the encoding or the use of contextual priors in action prediction.

While ITT patients did not show any facilitation for predicting actions embedded in more probable contexts, the contextual modulation effect was reliable not only in TD participants but also in the STT group, at least for the highest probabilistic association. This suggests that, despite cortical damage, STT participants could rely on implicitly learned contextual priors to disambiguate kinematics. However, it is noteworthy that STT participants also presented an impairment in relying on contextual priors with respect to the TD group, which showed a steeper slope of the relation between contextual probability and performance accuracy (i.e., higher beta coefficients) as compared to both the ITT and STT groups. On the one hand, this result was somewhat expected, as STT patients presented damage in the cortical nodes of the AON, which are recruited during action observation and are crucial for action perception and understanding [3, 5]. On the other hand, brain damage not involving the cerebellum but affecting areas connected to it may interfere with the flow of information along the cerebro-cerebellar pathways [44, 45]. This may hinder the integration of cerebellar outputs into cortical processing of the observed action, resulting in an attenuated reliance on contextual priors. Nevertheless, the deficit in benefiting from contextual priors to predict actions was more pronounced in ITT than in STT participants, pointing to a prominent role of cerebellar computations in generating contextual priors of observed actions.

Previous research has highlighted the link between action prediction and social cognition abilities [19, 72]. Here, we found that the strength of contextual priors during action prediction was positively associated with the ability to use information from the social context when understanding others’ emotions and mental states, as assessed by the non-verbal part of the ToM subtest of the NEPSY-II. This effect was independent of other clinical variables, in particular IQ and time since diagnosis, which might characterize the two patient groups. This result is in keeping with the notion that understanding complex behaviors in social interactions is associated with the ability to use contextual priors to predict actions [73, 74]. At a neural level, it has been proposed that the cerebellum forms and updates predictive models to assist the cerebrum in social processing at diverse levels of abstraction [1] through its connections with networks involved in action understanding [75] and mentalizing [76]. Thus, an alteration of the predictive-coding mechanism sustained by the cerebellum would result in impairments in building/using contextual priors as well as in the social cognition deficits and autism-like behaviors often shown by patients with cerebellar disorders [11, 50]. Accordingly, other studies have explored the association between deficits in using priors to predict incoming actions and social impairments in autism spectrum disorders (ASD; [77,78,79]). In particular, the “hypo-priors” hypothesis proposes that an enhanced reliance on sensory evidence in individuals with ASD may alter the top-down modulation exerted by contextual priors, providing an explanation for their social difficulties and atypical Bayesian inference processes [80, 81]. In a similar vein, previous studies have found that individuals with ASD present impairments in action prediction [82] and have documented that the use of priors is linked with both the severity of clinical symptoms in the area of social interactions [83, 84] and anxiety [52].
Notably, neuroimaging research has revealed associations between the social deficits shown by individuals with ASD and altered connections between the cerebellum and the AON and mentalizing systems [85,86,87,88]. Abnormalities in predictive coding mechanisms, supported by findings of altered structural and functional components of the cerebellum [89,90,91,92], have also been proposed to explain positive symptoms of schizophrenia [93, 94] and depression [95]. Furthermore, an over-reliance on priors, resulting in atypical prediction, has been found in patients with schizophrenia [96]. Thus, our finding that deficits in using contextual priors are associated with cerebellar damage on the one hand and with social cognition deficits on the other may shed new light on previous studies reporting social cognition deficits in patients with cerebellar damage. It is also in keeping with research on psychiatric conditions, pointing to the role of the cerebellum in regulating social behavior through predictive mechanisms.

Social cognition alterations are often neglected in clinical practice with cerebellar patients. Although a recently developed scale included emotional regulation and social skills in CCAS assessment [97], there is a lack of rehabilitative treatments addressing these deficits [98]. Our findings suggest that rehabilitative interventions focused on boosting the predictive functions exerted by the cerebellum in social contexts may enhance social cognition abilities in patients with neurological or psychiatric disorders involving cerebellar alterations. Accordingly, we have recently designed a virtual reality-based rehabilitation training aimed at boosting social prediction abilities in cerebellar patients [99]. This intervention requires participants to learn probabilistic context-outcome associations in social scenarios, prompting them to extract contextual regularities and use them in predicting others’ behavior. Given the results of the present study, we expect that such training could help patients to rely more on contextual priors during action perception and, thus, to improve their social cognition abilities. Moreover, in line with the theory of a unique computation applied to different information, our findings could provide a clear rationale for designing rehabilitative interventions based on the predictive functions exerted by the cerebellum and targeting other symptoms of CCAS, such as executive function deficits [100] or motor impairments [101].

Limitations of this study should be considered when discussing our results. First, the limited sample size and the age heterogeneity of the three enrolled groups call for caution when generalizing the findings. Furthermore, patients with brain tumor presented damage in different areas of the cerebrum or of the cerebellum, which may have affected the results. Forthcoming studies should further explore the relationship between the exact tumor site and alterations in action prediction, as diverse cerebellar modules are connected with different cortical loops to process a specific class of information [41, 44]. A previous study by Cattaneo and colleagues (2012) reported that cerebellar patients showed a general impairment in sequencing tasks, but they had a specific alteration when seeing biological motion compared to the motion of an inanimate object. This suggests that biological movement could be represented differently from other information in the cerebellum [6]. In keeping with this notion, we found a deficit in action prediction in ITT patients, but we could not verify whether cerebellar lesions also affect predictive processing of non-biological events. Future research should consider adopting experimental paradigms with inanimate objects to compare performance in social and non-social tasks, thus testing whether this alteration in predictive mechanisms is specific to the social domain. Moreover, we cannot exclude that the effects of neurosurgery and adjuvant therapies on cognition influenced the data, although no differences in these variables were found between the two groups of brain tumor survivors. Indeed, these medical procedures could have long-term neuropsychological sequelae, such as deficits in attention, processing speed, and visuospatial skills [102], which could have affected performance on the action prediction task.
However, it is likely that oncological therapies affected the functioning of cortical-cerebellar loops in both ITT and STT patients; still, the STT group presented a reliable effect of the contextual priors, at least for the high-probability condition, while no modulation was observed in the ITT group. These results seem to indicate that the deficits observed in the ITT group may reflect an alteration of predictive computations exerted by the cerebellum, irrespective of the influence of clinical variables not controlled in this study. Furthermore, even if time since diagnosis was longer for the STT than for the ITT group, likely reflecting a different clinical course of the tumor and its treatment, we found no effect of this variable on contextual modulation in our regression model. Lastly, our tasks did not allow us to examine error-related processing. Indeed, detecting errors and deviations from expected sensory outcomes in order to update internal models and successfully predict incoming sensory information is considered a core function of the cerebellum [103, 104] and is thought to contribute to adaptive social behavior [47]. Further research should evaluate the impact of cerebellar damage not only on building/using contextual priors but also on error detection and signaling in action observation.

Conclusions

In this study, we investigated whether and how cerebellar damage following childhood brain tumor affects the representation of contextual priors, thus limiting the ability to correctly predict others’ behavior. Our findings indicated that survivors of brain tumor not involving the cerebellum presented a reduced but still preserved contextual modulation of action prediction performance, as they could rely on contextual priors to compensate for the lack of observed kinematics. Conversely, survivors of brain tumor affecting the cerebellar areas were able to understand the unfolding action when a large amount of kinematic information was available (i.e., in the familiarization phase), but, in conditions of perceptual ambiguity (i.e., for the occluded videos in the testing phase), they did not rely on contextual priors to overcome the lack of kinematic information and correctly predict the overarching outcome of observed actions. In keeping with the role of the cerebellum in building and updating predictive internal models, we argue that this result reflects a specific deficit of cerebellar patients in either forming or relying on contextual priors during action observation. Thus, the role of the cerebellum in action prediction cannot be fully accounted for by motor simulation mechanisms and needs to be considered within a predictive coding framework, in which it plays a modulatory role by providing contextual priors that guide the selection of the most likely outcome of a specific action. Furthermore, we found that performance in the non-verbal part of the ToM subtest was directly associated with the strength of the contextual priors. This suggests that integrating cerebellar functions into a predictive coding framework could not only better define the role of the cerebellum in action observation but may also shed new light on the social cognition deficits shown by patients with neurological or psychiatric disorders involving cerebellar alterations.