Introduction

Human beings are not passive recipients of external environmental stimuli; rather, they actively engage with their surroundings through voluntary actions and process the resulting sensory information. The sensory consequences of such actions are perceived as notably less intense than externally generated sensations, a phenomenon known as the sensory attenuation effect (e.g., Blakemore, Wolpert, et al., 1998; Schafer & Marcus, 1973). This effect is not exclusive to humans and is observed across various species, from electric fish capable of distinguishing their own electric fields from external sources (Bell, 2001) to humans experiencing reduced itchiness when self-scratching compared to being scratched by others (Schafer & Marcus, 1973). Some researchers suggest that sensory attenuation serves as a reliable mechanism for distinguishing self-generated stimuli from external ones (e.g., Blakemore & Frith, 2003; Blakemore et al., 2000). Furthermore, sensory attenuation is considered an adaptive response that enhances survival by increasing awareness of externally induced stimuli. It achieves this by filtering out predictable information derived from motor activity, thereby enhancing sensitivity to external stimuli with potentially greater survival significance (Miall & Wolpert, 1996). For example, when walking alone in a quiet alley at night, one's own footsteps may be perceived as less salient than those of others. Additionally, sensory attenuation has significant clinical implications. Individuals with schizophrenia often exhibit reduced inhibition of self-generated sounds compared to healthy individuals (e.g., Ford et al., 2001; Ford et al., 2014; Perez et al., 2012), a phenomenon commonly associated with hallucinations and delusions (e.g., Fletcher & Frith, 2009; Heinks-Maldonado et al., 2007).

The phenomenon of sensory attenuation resulting from voluntary actions is also a robust finding observed across various methodologies. Behaviorally, it is characterized by a diminished perception of stimulus intensity (e.g., Blakemore, Wolpert, et al., 1998; Cardoso-Leite et al., 2010; Lubinus et al., 2022; Weiss et al., 2011), or a reduced ability to detect temporal delays between actions and their corresponding sensory feedback (e.g., Arikan et al., 2019; Pazen et al., 2020; Uhlmann et al., 2020). At the neural level, sensory attenuation has been demonstrated consistently. Studies using electroencephalography (EEG) and magnetoencephalography (MEG), which offer high temporal resolution, have demonstrated a significant decrease in the amplitude of the N/M100 component for self-generated compared with external-generated sounds (e.g., Baess et al., 2011; Horváth et al., 2012; Zouridakis et al., 1998). Functional magnetic resonance imaging (fMRI) studies, which offer high spatial resolution, have reported a significant reduction in blood oxygen level-dependent (BOLD) signals within brain regions implicated in sensory processing during voluntary action conditions (e.g., Arikan et al., 2019; Blakemore, Wolpert, et al., 1998; Pazen et al., 2020; Straube et al., 2017; Uhlmann et al., 2020).

The internal forward model has been extensively employed to explain the phenomenon of sensory attenuation (e.g., Blakemore et al., 2000). According to this model, when the motor cortex instructs the peripheral nervous system to carry out a motor action, a copy of this command is concurrently generated to predict the sensory feedback of the movement (von Holst & Mittelstaedt, 1950). This signal, known as "efference copy" (von Holst & Mittelstaedt, 1950) or "corollary discharge" (Sperry, 1950), is generated alongside the motor command by the central motor network (Stenner et al., 2015) and used by the brain to anticipate the sensory consequences of one's own behavior. The predictive signal is compared to the incoming signal, leading to the attenuation of self-generated sensory stimuli (Blakemore et al., 2000; Blakemore, Wolpert, et al., 1998).

While the theoretical framework of the internal forward model appears to provide a plausible explanation for the sensory attenuation effect, the underlying neural mechanisms remain unclear. Several brain regions are considered to participate in this effect. The cerebellum is a candidate site for the forward model, providing predictions of the sensory consequences of motor commands, which are then compared with the actual sensory feedback from the movement (Bastian, 2006; Ishikawa et al., 2016; Miall & Wolpert, 1996). Previous studies have also suggested cerebellar involvement in generating (Blakemore et al., 1999; Leube et al., 2003) and updating predictions regarding sensory inputs (Roth et al., 2013; Synofzik et al., 2008), through the transmission of prediction errors specific to voluntary actions (Blakemore et al., 2001; van Kemenade, Arikan, et al., 2019). For example, in a paradigm where tactile stimuli on a passive hand resulted either from participants' voluntary key-presses or from externally induced touches, Blakemore et al. (1998a, 1998b) found increased activity in the somatosensory cortex following passive external touches, while cerebellar activity decreased compared to movements eliciting tactile stimuli from the environment. However, these observations were based on a very small sample (6 volunteers) and fixed-effect analyses. In contrast, Shergill et al. (2013) observed that cerebellar activity increased rather than decreased in conditions involving voluntary touch compared to conditions involving external touch. Additionally, studies have shown that disrupting cerebellar activity through transcranial magnetic stimulation interferes with sensory attenuation of self-generated sounds at the cortical level (e.g., Cao et al., 2017).

In addition to the cerebellum, researchers have identified the involvement of several other brain regions in the processing of sensory attenuation, including the superior temporal gyrus (STG), middle temporal gyrus (MTG), motor cortex, and insular cortex. For example, the STG is particularly involved in detecting delays between actions and action outcomes. Using visual feedback that either immediately followed participants' movements or was presented after variable delays, Leube et al. (2003) observed a positive correlation between activation in the STG and the degree of delay. Subsequent research by Leube et al. (2010) found similar correlations with delay in a region of the temporal cortex among both schizophrenia patients and healthy control groups. Additionally, the lateral part of the middle temporal gyrus (i.e., extrastriate body area) has been shown to exhibit stronger responses to inconsistent action feedback compared to consistent action feedback (David et al., 2007). Neuroimaging evidence suggests that the insular cortex also plays a crucial role in distinguishing between self-generated and external-generated somatosensory stimuli (e.g., Limanowski et al., 2020). The insular cortex appears to serve as a higher-order, multisensory region located upstream in the somatosensory system (Eickhoff et al., 2006; Keysers et al., 2010; Kurth et al., 2010; Limanowski et al., 2014; Tsakiris et al., 2006). Furthermore, sustained inhibition of the sensory cortex has been observed under conditions of voluntary action, including in the auditory Heschl's gyrus (e.g., Hua et al., 2010; Pinheiro et al., 2019), somatosensory postcentral gyrus (e.g., Blakemore et al., 1998a, 1998b; Pazen et al., 2020), and visual calcarine sulcus (e.g., David et al., 2007).

As previously mentioned, sensory attenuation is a well-established phenomenon associated with voluntary actions. It involves the subjective modulation of perceived action outcomes across various modalities, including visual (e.g., Leube et al., 2003), auditory (e.g., Hashimoto & Sakai, 2003), somatosensory (e.g., Blakemore, Wolpert, et al., 1998; Blakemore et al., 1999; Kilteni & Ehrsson, 2020), and nociceptive (e.g., Braid & Cahusac, 2006; Wang et al., 2011) domains. However, the exact neural mechanisms underlying sensory attenuation remain elusive. To address this knowledge gap, we conducted a pioneering meta-analysis of existing neuroimaging studies on the sensory attenuation effect. Our study aimed to elucidate whether the neural activations associated with voluntary actions differ from those observed during the passive reception of externally induced stimuli. Furthermore, we aimed to find co-activation networks by leveraging distinct brain regions activated under self-generated and externally induced conditions, thus advancing our comprehension of the neural mechanisms that underlie sensory attenuation.

Method and materials

Search strategies

We implemented a multi-step methodology to identify literature regarding sensory attenuation, in accordance with the guidelines delineated in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Liberati et al., 2009). Initially, we conducted thorough searches across PubMed (http://www.ncbi.nlm.nih.gov/pubmed), Web of Science (https://www.webofscience.com/), and Scopus (https://www.scopus.com/), utilizing the specified search terms: 1) (self OR external OR voluntary OR involuntary) AND (initiated OR induced OR produced OR generated OR triggered OR administered) AND (action OR motor OR movement OR kinematic*) AND (visual OR auditory OR sound OR tone OR tactile OR touch OR tickle OR pain OR thermal OR feedback OR outcome OR effect OR consequence) AND (fMRI OR (functional magnetic resonance imaging) OR (functional MRI) OR PET OR (positron emission tomography)); 2) (("sensory attenuation") OR ("BOLD suppression") OR ("sensory suppression") OR ("BOLD attenuation")) AND (fMRI OR (functional magnetic resonance imaging) OR (functional MRI) OR PET OR (positron emission tomography) OR neuroimaging). Both searches were executed on April 22, 2022. These search methodologies yielded a cumulative total of 3988 articles. Furthermore, we scrutinized the bibliographies of the identified articles to identify any prospective neuroimaging studies pertaining to sensory attenuation, which led to the discovery of three additional articles.
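
The first search string is a conjunction of OR-groups, one group per concept. A minimal sketch of how such a string could be assembled programmatically is given below; the helper names `or_group` and `build_query` are illustrative assumptions, and the field syntax specific to each database is not reproduced here.

```python
# Illustrative helpers (not part of the original study); terms are copied from search string 1.
def or_group(terms):
    """Join alternative terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(groups):
    """Join the OR-groups with AND so that every concept must be matched."""
    return " AND ".join(or_group(group) for group in groups)


AGENT = ["self", "external", "voluntary", "involuntary"]
ORIGIN = ["initiated", "induced", "produced", "generated", "triggered", "administered"]
ACTION = ["action", "motor", "movement", "kinematic*"]
FEEDBACK = ["visual", "auditory", "sound", "tone", "tactile", "touch", "tickle",
            "pain", "thermal", "feedback", "outcome", "effect", "consequence"]
IMAGING = ["fMRI", "(functional magnetic resonance imaging)", "(functional MRI)",
           "PET", "(positron emission tomography)"]

query_1 = build_query([AGENT, ORIGIN, ACTION, FEEDBACK, IMAGING])
print(query_1)
```

Keeping each concept as a separate list makes it easy to check that the same terms were submitted to every database.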

Inclusion and exclusion criteria

After the removal of duplicates, the articles were screened against the following inclusion and exclusion criteria. Articles were deemed suitable for inclusion if they met the following criteria: (a) employing neuroimaging techniques, either fMRI or PET; (b) examining differences in activation patterns between self-generated and external-generated sensory stimuli (the latter elicited by participants' passive movement or by the computer); (c) enrolling healthy adult participants; (d) incorporating diverse sensory modalities, including auditory, visual, tactile, or nociceptive feedback; (e) conducting whole-brain analyses and reporting results in standard stereotactic coordinates, in either Talairach or Montreal Neurological Institute (MNI) space.

Studies were excluded if they met the following criteria: (a) lacking relevance to the research task; (b) failing to report findings pertaining to a healthy adult human population; (c) not original research articles; (d) not fMRI/PET studies; (e) not providing whole-brain results; (f) lacking analysis of pertinent contrasts; (g) not reporting coordinates in a standard stereotactic space (MNI or Talairach).

Study selection and data extraction

The process of data selection and extraction followed the PRISMA guideline. One author selected articles based on inclusion and exclusion criteria and extracted data. A second author conducted a comprehensive review of both the selection process and the extracted data. Any discrepancies were discussed until a consensus was reached. Activation foci and relevant details were extracted from the chosen articles, including the first author, publication year, sample size and number of female participants, feedback modality, experimental task, presence or absence of time delay between action and feedback, presence or absence of passive movement in external-generated condition, imaging method, standard stereotactic space, contrast type, and contrast details for each study (Table 1).

Table 1 Details of studies included in the meta-analysis

Activation likelihood estimation (ALE) analysis

The ALE analysis examines whether activation foci from different studies on a similar topic converge at a significantly higher level than expected under a null distribution of random spatial association between experiments (Eickhoff et al., 2012). We conducted ALE meta-analyses in GingerALE 3.0.2 (http://www.brainmap.org/ale/; refer to Eickhoff et al., 2012). Coordinates originally reported in Talairach space were first converted to MNI space using the conversion tool built into GingerALE, employing the tal2icbm_spm transform (Lancaster et al., 2007). Following this transformation, the coordinates were organized into text files for the various ALE analyses, adhering to the formatting guidelines specified by GingerALE.
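
To illustrate the quantity that ALE computes, the sketch below evaluates an ALE score at a single voxel. It is a simplified illustration and not GingerALE's implementation: the fixed kernel width, the voxel volume, and the example foci are assumptions (GingerALE derives the kernel width from each experiment's sample size). Each experiment contributes a modeled activation (MA) value, taken here as the maximum Gaussian probability over its foci, and the ALE score is the probabilistic union of the MA values across experiments.

```python
import math

VOXEL_VOLUME_MM3 = 8.0  # assumption: 2 x 2 x 2 mm voxels
FWHM_MM = 10.0          # assumption: fixed kernel width for illustration


def gaussian_probability(distance_mm, fwhm_mm=FWHM_MM):
    """Probability mass of a 3D Gaussian centred on a focus, for a voxel distance_mm away."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    density = math.exp(-distance_mm ** 2 / (2.0 * sigma ** 2)) / ((2.0 * math.pi * sigma ** 2) ** 1.5)
    return density * VOXEL_VOLUME_MM3


def modeled_activation(voxel, foci):
    """MA value of one experiment at a voxel: the maximum over that experiment's foci."""
    return max(gaussian_probability(math.dist(voxel, focus)) for focus in foci)


def ale_value(voxel, experiments):
    """ALE = 1 - prod(1 - MA_i): probability that at least one experiment activates the voxel."""
    no_activation = 1.0
    for foci in experiments:
        no_activation *= 1.0 - modeled_activation(voxel, foci)
    return 1.0 - no_activation


# Hypothetical foci (MNI, mm) for three experiments; not data from the included studies.
experiments = [
    [(60, -10, -8)],
    [(58, -12, -6), (30, -54, -28)],
    [(-40, 20, 10)],
]
print(ale_value((60, -10, -8), experiments))  # voxel where two experiments converge
print(ale_value((0, 0, 0), experiments))      # voxel far from most foci
```

Voxels near which foci from several experiments converge receive higher ALE scores; significance is then assessed against a null distribution of random spatial association.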

First, to assess the brain areas responsible for sensory attenuation, ALE analyses were conducted separately for the external-generated > self-generated and self-generated > external-generated comparisons. We then conducted subgroup ALE analyses on time delay and passive condition type to assess their potential influence on sensory perception. All the selected articles were categorized into two groups: with or without a time delay between the action and the sensory outcome. We then extracted the coordinates from experiments adopting non-delayed and delayed sensory outcomes, both pooled across and separated by comparison direction, i.e., external-generated > self-generated or self-generated > external-generated. For passive conditions, we differentiated between two types, namely perceiving stimuli generated by the computer (No Movement) or by participants' passive movement controlled by the experimenter or a device (Passive Movement), and conducted ALE analyses on them separately. It should be noted that some of the subgroup analyses included only a limited number of studies with small sample sizes (see Results for details). As the ALE algorithm may fail to achieve stable performance and reliable results with a small number of experiments (Eickhoff et al., 2016), the results of subgroup analyses should be interpreted with caution.
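
The subgrouping described above amounts to partitioning the contrasts by one or more attributes before running each ALE analysis. A minimal sketch follows; the record fields and the four example records are hypothetical and do not correspond to the studies in Table 1.

```python
from collections import defaultdict

# Hypothetical contrast records for illustration only; they are NOT the studies in Table 1.
contrasts = [
    {"id": "exp_a", "direction": "external > self", "delayed": False, "external": "No Movement"},
    {"id": "exp_b", "direction": "self > external", "delayed": False, "external": "Passive Movement"},
    {"id": "exp_c", "direction": "external > self", "delayed": True, "external": "Passive Movement"},
    {"id": "exp_d", "direction": "external > self", "delayed": False, "external": "No Movement"},
]


def group_contrasts(records, *keys):
    """Group contrast ids by the values of the given keys (e.g. delay, direction)."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[key] for key in keys)].append(record["id"])
    return dict(groups)


# Pooled over comparison direction.
by_delay = group_contrasts(contrasts, "delayed")
# Split by delay and comparison direction.
by_delay_and_direction = group_contrasts(contrasts, "delayed", "direction")
# Split by type of external-generated condition.
by_external_type = group_contrasts(contrasts, "external")
print(by_delay_and_direction)
```

Printing the group sizes before each analysis makes it easy to see which subgroups contain too few experiments for a stable ALE result.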

A cluster-level family-wise error (FWE) correction was applied for multiple comparisons with a threshold set at p < .05, and a cluster-forming threshold was set at p < .001 with 1000 permutations. The final thresholded MA maps were superimposed onto the MNI152 template using MRIcroGL (https://www.nitrc.org/plugins/mwiki/index.php/mricrogl:MainPage). Brain regions were identified following the MNI atlas guidelines, using Mango (http://rii.uthscsa.edu/mango/).
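
The logic of cluster-level FWE correction can be sketched as follows, assuming that each permutation yields the size of the largest cluster surviving the cluster-forming threshold. The function names, the toy null distribution, and the observed cluster sizes are illustrative assumptions; GingerALE's internal procedure may differ in detail.

```python
def cluster_p_value(cluster_size, null_max_sizes):
    """Corrected p: share of permutations whose largest cluster is at least this big."""
    exceed = sum(1 for size in null_max_sizes if size >= cluster_size)
    return exceed / len(null_max_sizes)


def surviving_clusters(observed_sizes, null_max_sizes, alpha=0.05):
    """Keep observed clusters whose corrected p-value is below alpha."""
    return [size for size in observed_sizes
            if cluster_p_value(size, null_max_sizes) < alpha]


# Toy null distribution: pretend 1000 permutations gave maximum cluster sizes of 1..1000 voxels.
null_max_sizes = list(range(1, 1001))
# Hypothetical observed cluster sizes (voxels) after the p < .001 cluster-forming threshold.
observed = [940, 960]
print(surviving_clusters(observed, null_max_sizes))  # [960]
```

Because each observed cluster is compared against the distribution of maximum cluster sizes, the probability of any false-positive cluster across the whole brain is controlled at the chosen alpha.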

Meta-analytic connectivity modelling (MACM) analyses

To further explore the regions responsible for cross-modal sensory attenuation at the network level, we utilized MACM analysis. MACM is a data-driven approach that leverages neuroimaging databases (e.g., BrainMap; Langner et al., 2014; Robinson et al., 2010) to establish co-activation maps for predefined regions of interest (ROIs). We identified peak activation clusters from the results of the ALE analyses. Specifically, the right middle temporal gyrus coordinates (60, -10, -8) were selected for the external-generated > self-generated contrast, and the right cerebellum coordinates (30, -54, -28) for the self-generated > external-generated contrast. Subsequently, two 10 mm spheres centered at these peak coordinates were created using Mango software and designated as predefined ROIs for the MACM analyses. Neuroimaging studies reporting at least one activation focus within these predefined ROIs were obtained from the BrainMap database. The inclusion criteria were set to "Activation Only" and "Normal Mapping". The coordinates from these studies were then subjected to ALE analyses to measure their convergence and co-activation with the predefined ROIs, using cluster-level FWE correction at p < .05, 1000 threshold permutations, and a cluster-forming threshold of p < .001.
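
The selection step of MACM, keeping experiments with at least one focus inside a spherical ROI, can be sketched as below. This is an illustration only: treating "10 mm" as the sphere radius is an assumption, the function names are hypothetical, and the small `database` dictionary stands in for a BrainMap query rather than reproducing its interface.

```python
import math

ROI_RADIUS_MM = 10.0  # assumption: "10 mm" is treated as the sphere radius

RMTG_PEAK = (60, -10, -8)         # external-generated > self-generated peak
CEREBELLUM_PEAK = (30, -54, -28)  # self-generated > external-generated peak


def in_sphere(focus, center, radius_mm=ROI_RADIUS_MM):
    """True if the focus lies within radius_mm of the ROI centre (Euclidean distance)."""
    return math.dist(focus, center) <= radius_mm


def experiments_hitting_roi(experiments, center, radius_mm=ROI_RADIUS_MM):
    """Keep experiments reporting at least one focus inside the spherical ROI."""
    return {name: foci for name, foci in experiments.items()
            if any(in_sphere(focus, center, radius_mm) for focus in foci)}


# Hypothetical stand-in for database content; not real BrainMap records.
database = {
    "exp_1": [(62, -8, -6), (-40, 22, 4)],
    "exp_2": [(28, -50, -30)],
    "exp_3": [(0, 0, 0)],
}
print(sorted(experiments_hitting_roi(database, RMTG_PEAK)))        # ['exp_1']
print(sorted(experiments_hitting_roi(database, CEREBELLUM_PEAK)))  # ['exp_2']
```

All foci of the retained experiments, not only those inside the sphere, are then passed to the ALE analysis, which is what reveals regions that co-activate with the ROI.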

Results

Outcomes of searching

Following the process of selection and extraction, a total of 19 papers were incorporated into our meta-analyses (see Figure 1 & Table 1). Among these, there were 18 contrasts utilized for the comparison of external-generated > self-generated, and 12 contrasts for self-generated > external-generated. Notably, one study featuring two distinct groups of participants across separate experiments was treated as two external-generated > self-generated contrasts. Of the included papers, 17 were based on fMRI methodology, while two employed PET imaging. Regarding spatial reporting, 16 studies presented results in MNI space, with the remaining three utilizing Talairach space. The collective dataset encompassed 354 foci from 338 participants (including 173 females) for the external-generated > self-generated comparison. For the self-generated > external-generated comparison, 194 participants (90 females) and 93 foci were included.

Fig. 1
Flow chart of literature search

General meta-analysis

Our primary focus was on analyzing the external-generated > self-generated comparison derived from 18 experiments across 17 studies. Notably, a significant activation cluster featuring a single peak (60 -10 -8) was identified in the right hemisphere, encompassing parts of rSTG, rMTG, and the right insula (Figure 2A).

Fig. 2
Results of ALE meta-analyses on external-generated > self-generated and self-generated > external-generated contrasts. A Regions exhibiting higher activations in external-generated conditions compared to self-generated conditions (external-generated > self-generated contrast). Peak activation (60 -10 -8) was identified in the rMTG. B Regions demonstrating higher activations in self-generated conditions compared to external-generated conditions (self-generated > external-generated contrast). Peak activation (30 -54 -28) was observed in the right cerebellum. The cluster threshold was set at p < .05 (cluster-level FWE correction). Figures were generated using MRIcroGL with the MNI152 template. Abbreviations: STG, superior temporal gyrus; MTG, middle temporal gyrus; Ins, insula; L, left; R, right

Next, we investigated the self-generated > external-generated contrasts from 12 experiments across 12 studies. We identified one cluster in the anterior lobe of the right cerebellum with two activation peaks (30 -54 -28; 22 -56 -26), encompassing Culmen and Dentate regions (see Figure 2B).

Effect of time delay

In general, we anticipate immediate sensory feedback following our actions. Thus, introducing a time delay between an action and feedback may lead to a disparity between expected and actual feedback. We categorized the selected articles into two groups: those with and without a time delay between the action and the sensory outcome. We then extracted all coordinates for non-delayed and delayed conditions, regardless of the comparison direction, which led to 278 foci included for non-delayed condition and 169 foci for delayed condition.

The separate meta-analysis of the non-delayed condition revealed two significant clusters. One cluster exhibited peak activation (60 -10 -8) centered in the rMTG, encompassing the rSTG, rMTG, and right sub-gyral regions. The other cluster displayed two peaks (40 16 -30; 48 10 -32) centered in the rSTG, involving the rSTG, rMTG, and the right inferior frontal gyrus (rIFG; Figure 3). When a time delay existed between the action and outcome, two significant clusters were also identified. One exhibited peak activation (20 -60 -50) centered in the cerebellar tonsil, a posterior part of the right cerebellum. The other cluster displayed a single peak (8 6 52) centered in the right medial frontal gyrus (rMeFG), encompassing the rMeFG and the cingulate gyrus (Figure 3).

Fig. 3
ALE Meta-Analyses results on the impact of time delay. The red-yellow bars indicate significant activation when there is no time delay between the action and sensory outcome (non-delayed condition), while the blue-green bar represents significant activation when there is a time delay between the action and sensory outcome (delayed condition). The cluster threshold was set at p < .05 with FWE correction. The figures were generated using MRIcroGL with the MNI152 template. Abbreviations used include STG for the superior temporal gyrus, MTG for the middle temporal gyrus, MeFG for the medial frontal gyrus, and L and R for left and right, respectively

To show a more detailed comparison of potential overlap in brain regions between delayed and non-delayed conditions, as well as between external and self-generated contrasts, we displayed the results of analyses for both delay conditions and for the external-generated > self-generated and self-generated > external-generated ALE analyses on the same template (Figures 4A and 4B).

Fig. 4
Comparison of ALE meta-analysis results for time delay and self- vs. external-generated contrasts. A The blue-green bar denotes significant activation for the external-generated > self-generated contrast. The red-yellow bar denotes significant activation when there is no time delay between the action and sensory outcome (non-delayed condition). B The blue-green bar denotes significant activation for the self-generated > external-generated contrast. The red-yellow bar denotes significant activation when there is a time delay between the action and sensory outcome (delayed condition). Cluster threshold: p < .05 (cluster-level FWE correction). Figures were created using MRIcroGL with the MNI152 template. Abbreviations: STG, superior temporal gyrus; MTG, middle temporal gyrus; Ins, insula

We also conducted ALE analyses on coordinates from experiments with and without a delay between the action and the ensuing sensory outcome, this time separating the comparison directions. A total of 10 external-generated > self-generated and 8 self-generated > external-generated contrasts from non-delayed experiments, and 8 external-generated > self-generated and 4 self-generated > external-generated contrasts from delayed experiments, were included.

We found two significantly activated clusters for the external-generated > self-generated contrast in the non-delayed condition, one with a single peak (60 -10 -8) and another with two peaks (40 16 -30; 48 10 -32). The first cluster comprised the rSTG, rMTG, and right sub-gyral region. The second cluster included the rSTG, rMTG, and the right inferior frontal gyrus (rIFG). Four significantly activated clusters were found for the external-generated > self-generated contrast in the delayed condition, located in right cerebellar tonsil (22 -60 -50), right precuneus (20 -50 42), right medial frontal gyrus and cingulate gyrus (8 6 52), and right thalamus (-10 -18 2), respectively. No significant clusters were identified by ALE analysis of coordinates from the self-generated > external-generated contrasts in either the delayed or the non-delayed condition.

Effect of external-generated movement

In some experiments, participants received computer-generated stimuli passively without engaging in any movement (No Movement), while in others, participants' bodies were passively moved by the experimenter or a device (Passive Movement). We also differentiated these two external-generated conditions.

We collected all coordinates for No Movement and Passive Movement, regardless of the comparison direction. A total of 215 foci were included for the No Movement condition and 232 foci for the Passive Movement condition. Individual meta-analysis of the No Movement condition revealed one significant cluster, with two peak activations (62 -10 -8; 54 -6 -12) centered in the rMTG and rSTG, including rSTG, rMTG, and the right sub-gyral region (Figure 5A). Individual meta-analysis of the Passive Movement condition revealed one significant cluster with one peak (20 -60 -50) centered in the cerebellar tonsil, a posterior part of the right cerebellum (Figure 5B).

Fig. 5
ALE Meta-Analysis results on the impact of external-generated condition. A Significant activation in the external-generated condition where participants passively received computer-generated stimuli without any movement (No Movement). Peak activation (62 -10 -8) located in the rMTG. B Significant activation in the external-generated condition where participants' bodies were controlled by the experimenter or a device to perform a passive movement (Passive Movement). Peak activation (20 -60 -50) located in the right cerebellum. Figures were created using MRIcroGL with the MNI152 template. Abbreviations: STG, superior temporal gyrus; MTG, middle temporal gyrus; SG, sub-gyral; MeFG, medial frontal gyrus; Cereb tonsil, cerebellar tonsil; L, left; R, right

Coordinates from external-generated > self-generated and self-generated > external-generated contrasts were also extracted separately for No Movement and Passive Movement. We collected coordinates from 9 experiments for No Movement in the external-generated > self-generated contrast, 6 for No Movement in the self-generated > external-generated contrast, 9 for Passive Movement in the external-generated > self-generated contrast, and 6 for Passive Movement in the self-generated > external-generated contrast. Separate ALE analyses were conducted on each of these data sets. For the passive perception (No Movement) condition, we found one significantly activated cluster with three peaks (62 -10 -8; 54 -6 -12; 46 -18 -12) for the external-generated > self-generated contrast, including the rMTG, rSTG, right sub-gyral region, and right insula. We also found one cluster with one peak (37 -21 55) for the self-generated > external-generated contrast, located in the right precentral and postcentral gyri. No significantly activated clusters were found for the passive action (Passive Movement) condition in either contrast.

MACM results

Using the ROIs derived from the ALE results, namely the rMTG for the external-generated > self-generated contrast and the right cerebellum for the self-generated > external-generated contrast, we conducted the MACM analyses. For the ROI centered in the rMTG, a total of 93 experiments with 1541 foci from 1510 participants were retrieved from BrainMap. For the ROI centered in the right cerebellum, 91 experiments with 2014 foci from 1367 participants were obtained. The coordinates of these foci were subsequently imported into GingerALE to obtain the co-activation patterns.

We observed that the following regions exhibited significant co-activation with the ROI centered in the rMTG: bilateral insula, bilateral inferior frontal gyrus, bilateral superior frontal gyrus, right middle frontal gyrus, right precentral gyrus, left medial frontal gyrus, left thalamus, and bilateral superior temporal gyrus (see Figures 6 & 7, Table S1). Meanwhile, the following regions showed significant co-activation with the ROI centered in the right cerebellum: bilateral inferior frontal gyrus, right amygdala, right medial frontal gyrus, bilateral superior temporal gyrus, bilateral precentral gyrus, bilateral postcentral gyrus, bilateral thalamus, right fusiform gyrus, right inferior temporal gyrus, left superior parietal lobule, left inferior parietal lobule, left middle frontal gyrus, left transverse temporal gyrus, left insula, bilateral lentiform nucleus, bilateral claustrum, left precuneus, and left cerebellum (see Figures 6 & 7, Table S2).

Fig. 6
Co-activation patterns based on MACM results. The blue-cyan bar represents co-activation patterns for the region of interest centered in the rMTG, while the red-yellow bar represents co-activation patterns for the region of interest centered in the right cerebellum. Clusters threshold was set at p < .05 (cluster-level FWE correction). Figures were generated using MRIcroGL with the MNI152 template. Abbreviations: L for left, R for right

Fig. 7
Connectivity maps for two ROIs from the MACM analyses. Blue lines represent brain areas co-activated with the cluster centered in the rMTG (Peak activation in MNI coordinate: 60 -10 -8), including the right superior temporal gyrus, right middle temporal gyrus, and right insula. Red lines represent brain areas co-activated with the cluster located in the right cerebellum (Peak activation in MNI coordinate: 30 -54 -28). Abbreviations: Cereb, Cerebellum; STG, Superior Temporal Gyrus; Tha, Thalamus; IFG, Inferior Frontal Gyrus; PCG, Precentral Gyrus; IPL, Inferior Parietal Lobule; MFG, Middle Frontal Gyrus; Ins, insula; PoCG, Postcentral Gyrus; TTG, Transverse Temporal Gyrus; SPL, Superior Parietal Lobule; Amy, Amygdala; FFG, Fusiform Gyrus; ITG, Inferior Temporal Gyrus; MeFG, Medial Frontal Gyrus; MTG, Middle Temporal Gyrus; SFG, Superior Frontal Gyrus. L, left; R, right

Discussion

The sensory attenuation effect refers to a decrement in sensory processing and neural response associated with self-generated action outcomes compared to externally induced stimuli (e.g., Benazet et al., 2016; Timm et al., 2014). This phenomenon is widely observed across various sensory modalities, including auditory (e.g., Bäss et al., 2008; Sato, 2009; Schafer & Marcus, 1973), visual (e.g., Cardoso-Leite et al., 2010; Hughes & Waszak, 2011; Schwarz et al., 2018) and somatosensory (e.g., Blakemore, Wolpert, et al., 1998; Blakemore et al., 1999). Our study employs meta-analysis for the first time to explore potential common neural mechanisms underlying the sensory attenuation effect across different modalities. Utilizing the latest GingerALE for comprehensive meta-analysis, we initially investigated the distinctions between self-generated and externally induced action outcomes. Subsequently, based on these findings, brain regions implicated in the sensory attenuation effect were further identified through contrastive analysis of regional differences using MACM analysis.

The contrast of external-generated condition versus self-generated condition

Sensory attenuation describes the phenomenon wherein externally generated stimuli are perceived with greater intensity than self-generated sensory stimuli. Synthesizing empirical studies on this effect reveals a notable cluster of activation when comparing externally induced conditions with those induced by self-initiated actions, notably including rSTG, rMTG, and right insula. The MTG, located adjacent to the temporal pole, has been identified as a region implicated in signaling violations or loss of agency (David et al., 2007; Nahab et al., 2011; Tsakiris et al., 2010). Recent research suggests that the MTG is involved in processing temporal discrepancies in action feedback monitoring, potentially conveying mismatch information about self-generated and external-generated actions (Kavroulakis et al., 2022; van Kemenade, Arikan, et al., 2019). In the external-generated conditions of studies included in this meta-analysis, stimuli were externally triggered, such as by experimenters or computers, and participants lacked a sense of agency under these circumstances. The activation of the MTG when participants passively perceived stimuli instead of voluntarily executing actions to introduce stimuli corroborates this hypothesis. It can be inferred that sensory attenuation involves alterations in the sense of agency, with the MTG attenuation process linked to an augmented perception of agency.

ALE analysis also revealed higher activation in rSTG when comparing external-generated to self-generated conditions. This result is in line with previous M/EEG findings that M/N100 responses in the superior temporal cortex were significantly weaker to self-triggered than to externally triggered sounds (Aliu et al., 2009; Curio et al., 2000; Numminen & Curio, 1999; Martikainen et al., 2005). The rSTG, renowned for its multifaceted roles (Craig, 2009; Lopez & Blanke, 2011), is acknowledged as a central hub for multimodal processing (Blanke, 2012). In this meta-analysis, we found that rSTG was less activated when action outcomes in different modalities were self-generated, which is consistent with the findings of previous empirical studies using different outcome modalities. For instance, the STG is sensitive to agency when participants receive visual feedback of their hand movement, with stronger activation when the hand is passively moved by a device than when participants move it themselves (Uhlmann et al., 2021). It is also sensitive to distortions in visual motion feedback stemming from temporal delays or spatial offsets (Decety & Sommerville, 2003; Farrer & Frith, 2002; Leube et al., 2003; Limanowski et al., 2017; Nahab et al., 2011). In speech studies, BOLD signals in the STG were stronger while listening to playback or to the same words spoken by others than during speaking (Christoffels et al., 2007; Creutzfeldt et al., 1989b; Woolnough et al., 2019). Notably, STG activity is enhanced under external-generated conditions, in which participants passively receive stimuli triggered by the external environment. Thus, our findings offer additional evidence supporting the engagement of the right superior temporal gyrus in discerning perceptual distinctions between passively induced and actively generated outcomes.

A parallel pattern of activation, where external-generated conditions surpass self-generated conditions, was also evident in the right insula. As a pivotal hub for integrating signals pertaining to vestibular, proprioceptive, and audio-visual sensations, the insular cortex stands as a complex neural region receiving inputs from diverse sensory modalities (Bamiou et al., 2003; Lopez & Blanke, 2011). Prior investigations underscore the insula's involvement in auditory processing, with robust bidirectional connections to auditory regions such as the superior temporal gyrus, thalamic medial geniculate nucleus, temporal pole, and auditory temporal area (Augustine, 1996; Flynn, 1999; Ghaziri et al., 2017; Jones & Burton, 1976; Mulert et al., 2004). Additionally, neuroimaging studies have highlighted the anterior insula's role in integrating visual and auditory signals associated with movement (Lewis et al., 2000). Notably, a right-handed patient afflicted with extensive damage to the right insula exhibited severe multimodal stimulus neglect syndrome, accentuating the insula's role in external stimulus awareness (Berthier et al., 1987). In a comprehensive meta-analysis covering over 800 neuroimaging studies, Kurth et al. (2010) delineated insula activation across more than 13 distinct domains, spanning fundamental sensory functions (e.g., olfaction, gustation, and interoception) and higher cognitive processes (e.g., attention, working memory, and language). Within the domain of sensory attenuation, antecedent literature has unveiled anterior insula activation during performance monitoring (Bastin et al., 2017; Ullsperger et al., 2010), with amplified activations observed when contrasting external-generated versus self-generated conditions (Stripeikyte et al., 2021), aligning with the outcomes of this meta-analysis. This suggests that the anterior insula plausibly serves as a hub for multisensory integration, responsible for amalgamating congruent multimodal sensory inputs linked to voluntary actions and attributing them to the self.

In short, our meta-analysis revealed that external-generated conditions elicited a more prominent cluster of activations compared to self-generated conditions, particularly involving the rSTG, rMTG, and right insula. Moreover, these regions exhibited interconnected activation, forming a cohesive cluster. Based on our analysis, these three regions are higher-order cortices that likely serve as pivotal sites for integrating complex sensory information and are commonly referred to as "sensory conflict detection" regions. This conclusion is further reinforced by the findings from the ALE analysis conducted under non-delayed conditions. Although the data were insufficient to conduct subgroup analyses for delayed and non-delayed conditions in both self-generated and external-generated settings, we observed significant activation in the rMTG-centered cluster of brain regions under non-delayed conditions.

The contrast of self-generated versus external-generated conditions

In contrast to external-generated conditions, significant activation was observed in the right cerebellar anterior lobe under self-generated conditions. The cerebellum is recognized as a vital component of the predictive system, providing precise predictions of sensorimotor outcomes and serving as a comparator between prediction and actual movement (Ramnani, 2006; Straube et al., 2017). Consequently, when a disparity arises between anticipated and actual sensory feedback, the cerebellum becomes active, indicating errors in motor performance (Blakemore et al., 2001; Knolle et al., 2012; Knolle et al., 2013a, 2013b; Miall et al., 1993). These error signals are subsequently relayed via the thalamus to cortical regions such as the temporal cortex (e.g., middle and superior temporal gyri; Christoffels et al., 2007; Creutzfeldt et al., 1989a; Curio et al., 2000; McGuire et al., 1996; Tourville et al., 2008), premotor cortex, and primary motor cortex (Christoffels et al., 2011; Golfinopoulos et al., 2011; Zheng et al., 2013). Within this meta-analysis, we observed activation in the right cerebellum under self-generated conditions, regardless of sensory modality, compared to external-generated conditions. Our findings are in line with a previous EEG study (Knolle et al., 2012), in which self-initiated sounds led to N100 suppression compared with externally produced sounds, and this suppression effect was largely attenuated in patients with focal cerebellar lesions in comparison to healthy controls. Accordingly, the cerebellum might be centrally involved in action outcome processing and motor control (Straube et al., 2017; Wolpert et al., 1998), especially in the representation and adjustment of behavioral predictions (Knolle et al., 2013a, 2013b; Roth et al., 2013; Synofzik et al., 2008).

Our findings reveal heightened activation of the cerebellum during self-generated conditions, corroborating computational models of motor control. These models posit that sensory attenuation serves as a perceptual complement to the brain's motor control mechanisms. Specifically, our brains utilize internal forward models to anticipate the sensory consequences of our actions, a process likely mediated by the cerebellum. These predictive models play a critical role in compensating for inherent delays and noise in the sensory system, thus facilitating efficient online motor control. Additionally, these predictions are utilized to suppress self-generated reafferent feedback, distinguishing it from externally generated inputs. Consequently, self-generated sensory information is attenuated as it has already been anticipated by the internal forward models.
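
The logic of such a forward model can be illustrated with a minimal sketch. It assumes, purely for illustration, a linear forward model and a subtractive attenuation rule; the function names, the gain, and the attenuation weight are hypothetical and are not taken from any of the cited models.

```python
def forward_model(motor_command: float, gain: float = 1.0) -> float:
    """Hypothetical forward model: predict sensory feedback from an efference copy."""
    return gain * motor_command


def perceived_intensity(actual: float, predicted: float, attenuation: float = 0.5) -> float:
    """Subtract a fraction of the predicted feedback from the actual feedback (assumed rule)."""
    return max(actual - attenuation * predicted, 0.0)


stimulus = 1.0  # identical physical stimulus in both conditions

# Self-generated: a motor command exists, so a prediction is available.
self_generated = perceived_intensity(stimulus, forward_model(1.0))

# External-generated: no motor command, hence no prediction to subtract.
external = perceived_intensity(stimulus, forward_model(0.0))
```

Under these assumptions the same physical stimulus yields a lower perceived intensity when it is self-generated than when it is externally generated, which is the behavioral signature of sensory attenuation.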

Furthermore, our ALE analyses on delayed conditions also partly support these hypotheses. Although our dataset precluded a direct comparison of brain region differences between delayed and non-delayed conditions in both self-generated and externally generated scenarios, our integrated analysis revealed heightened cerebellar activation under delayed conditions. This observation may also suggest that the right cerebellum is involved as a crucial component of the predictive motor system.

The networks responsible for sensory attenuation

To deepen our understanding of the neural mechanisms underlying sensory attenuation across various sensory modalities, we employed meta-analytic connectivity modeling (MACM). This method illuminates the functional relationships between brain regions by identifying regions that consistently co-activate with a given ROI across a diverse range of experimental contexts (Langner et al., 2014). In this study, we identified two ROIs by extracting peak activation clusters from probability estimates. Specifically, for the comparison of external-generated > self-generated conditions, we defined the ROI as the rMTG, encompassing all activated regions identified in the ALE analysis. The rMTG demonstrates predominant connections with bilateral brain regions, including the superior temporal gyrus, insula, inferior frontal gyrus, and superior frontal gyrus; regions in the right hemisphere, such as the middle frontal gyrus and precentral gyrus; and regions in the left hemisphere, such as the medial frontal gyrus and thalamus. These co-activated brain regions are primarily associated with a sensory conflict detection system involving the MTG and prefrontal cortex. While this detection system plays a lesser direct role in motor actions, it is more engaged in the evaluative process of general sensory information, resembling a feedback network.
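
The co-activation idea can be illustrated with a toy sketch. It is not the procedure used in this study: MACM typically operates on reported activation coordinates and applies ALE statistics, whereas the sketch below simply counts region labels. The experiment list and the function name are invented for illustration.

```python
from collections import Counter


def coactivation_profile(experiments, seed):
    """Fraction of seed-activating experiments that also report each other region."""
    with_seed = [regions for regions in experiments if seed in regions]
    if not with_seed:
        return {}
    counts = Counter(r for regions in with_seed for r in regions if r != seed)
    return {region: n / len(with_seed) for region, n in counts.items()}


# Invented toy data: each experiment is the set of regions it reports as active.
experiments = [
    {"rMTG", "STG", "insula"},
    {"rMTG", "STG"},
    {"cerebellum", "thalamus"},
    {"rMTG", "IFG", "insula"},
]

profile = coactivation_profile(experiments, "rMTG")
```

Regions with a high fraction in `profile` are those that most consistently co-activate with the seed across experiments; experiments that do not activate the seed contribute nothing.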

In the comparison of self-generated > external-generated conditions, the ROI was delineated as the right cerebellum, primarily comprising the right cerebellar anterior lobe. Brain regions co-activated with the right cerebellum encompass bilateral areas such as the inferior frontal gyrus, postcentral gyrus, precentral gyrus, thalamus, and superior temporal gyrus; right hemisphere regions including the fusiform gyrus, inferior temporal gyrus, medial frontal gyrus, and amygdala; and left hemisphere regions such as the cerebellum, superior parietal lobule, inferior parietal lobule, middle frontal gyrus, transverse temporal gyrus, and insula. These co-activated brain regions predominantly comprise the cerebellum, pre-motor cortex, motor cortex, and fronto-parietal network, which are more implicated in action prediction and resemble a feedforward network. Additionally, they display interconnections with various higher-order sensory cortices, such as the fusiform gyrus for vision, superior temporal gyrus for audition, and postcentral gyrus and precentral gyrus for somatosensory and pain perception.

The co-activation network of the right cerebellum encompasses not only the pre-motor cortex, which is responsible for movement preparation, but also the fronto-parietal network associated with attention and high-level sensory regions of diverse modalities. Given the prior observation of heightened cerebellar activation during self-generated compared to external-generated conditions, our results support the view that the cerebellum acts as a hub connecting motor control and sensory systems. This cerebellar-centric brain network could potentially elucidate the sensory attenuation effect proposed by the internal forward model. When anticipating an action, an efference copy of the action command can predict its outcome, necessitating a comparison between incoming environmental information and this predictive information, a process we posit takes place centrally within the cerebellum. This conclusion aligns with the most recent research findings (e.g., Arikan et al., 2019; van Kemenade et al., 2019).

This concept of a predictive network also aligns with previous research suggesting that perceptual differences between actively and passively generated sensations are partly attributable to the predictability of sensory feedback (Schmitter et al., 2021; Sperry, 1950; Wolpert & Kawato, 1998). Additionally, the predictive system we identified through MACM shows significant overlap with brain regions implicated in prediction in previous studies. Siman-Tov et al. (2019) conducted a meta-analysis involving 39 neuroimaging studies across three functional domains (action perception, language, and music), revealing a widely distributed brain network supporting domain-general predictions, including regions such as the inferior frontal gyrus, middle frontal gyrus, anterior insula, premotor cortex, supplementary motor area, temporo-parietal junction, striatum, thalamus/hypothalamus, and cerebellum. Friston and colleagues (Friston, 2005; Friston & Kiebel, 2009; Friston et al., 2017) outlined a general framework of brain regions potentially involved in predictive processing, encompassing primary sensory and motor cortices, motor-associated cortices, dorsomedial and ventromedial prefrontal cortices, parietal cortices, anterior cingulate cortex, insula, hippocampus, amygdala, basal ganglia, thalamus, hypothalamus, cerebellum, and midbrain. The significant overlap between these research findings and our results underscores the pivotal role of action prediction networks in sensory attenuation.

Through meta-analysis, significant activation in the cerebellum was observed when comparing self-generated with externally generated conditions. Furthermore, MACM analysis yielded an action prediction network centered on the right cerebellum. Given the cerebellum's involvement in predicting action outcomes (Blakemore, Wolpert et al., 1998; Kilteni & Ehrsson, 2020), we suggest that it, together with sensory cortices, fronto-parietal networks, and the motor system, forms the feedforward network of action, which predicts action outcomes and compares them with actual sensory feedback. Conversely, regions such as the superior temporal gyrus, middle temporal gyrus, and insula, which constitute the feedback loop, may also participate in sensory attenuation, as their activity increases when stimuli are externally triggered rather than self-generated. We propose that these two networks collectively contribute to the sensory attenuation effect. Further research on these two circuits would help deepen our understanding of their underlying mechanisms.

Neural mechanisms underlying sensory attenuation across various modalities

Variations in sensory attenuation across sensory modalities also need to be explored. Previous studies have identified sensory attenuation effects in visual (Leube et al., 2003), auditory (Hashimoto & Sakai, 2003), tactile (Blakemore, Wolpert, et al., 1998; Blakemore et al., 1999; Kilteni & Ehrsson, 2020), and pain perception (Braid & Cahusac, 2006; Wang et al., 2011). However, distinct cortical responses are evident within the sensory cortices across different sensory outcome modalities. In the case of tactile stimuli, the BOLD response in the somatosensory cortex decreases with self-generated tactile stimuli compared to externally triggered ones (Bays et al., 2005; Blakemore et al., 2000; Blakemore, Wolpert, et al., 1998; Shergill et al., 2003; Shergill et al., 2013). For auditory outcomes, the STG shows attenuated activity during speech production compared to passive listening (Agnew et al., 2013; Fu et al., 2006), although one study reported the opposite pattern, with bilateral enhancement in the STG during active sound generation relative to passive listening to identical sounds (Reznik et al., 2014). In the context of pain perception, the insula and the prefrontal cortex, known areas of the lateral pain system, showed different BOLD response patterns, with stronger pain-related activity increases in the insula to self-administered heat stimuli and stronger increases in the prefrontal cortex to uncontrollable external stimuli (Mohr et al., 2008). When visual feedback is the action outcome, passive actions elicit a greater BOLD signal than self-initiated actions in many areas, including visual cortices (e.g., lingual gyrus and precuneus) as well as other areas (e.g., MTG, PCG, and SFG) (Kavroulakis et al., 2022; Pazen et al., 2020).

Our meta-analysis did not reveal greater activation in any single sensory cortex when contrasting externally generated and self-generated conditions; instead, activation was more prominent in several multisensory integration cortices, possibly because data from different modalities were pooled. Interestingly, we observed significant involvement of the cerebellum as part of the predictive system in the sensory attenuation effect. MACM analysis results also showed cerebellar projections to various sensory cortices, consistent with previous research emphasizing the critical role of the cerebellum across different sensory modalities. There is evidence suggesting that the cerebellum is involved in generating predictions not only for motor but also for auditory sensations (Knolle et al., 2013a). Moreover, beyond single sensory modalities, actions spanning multiple sensory modalities may lead to BOLD suppression in multiple sensory processing regions in the brain, indicating that the cerebellum within the sensory prediction system can process any sensory information related to keypress operations (Straube et al., 2017). However, extensive further research is necessary to validate these findings.

Limitations

The current meta-analysis has several limitations. First, a sentence-reading study was included in the analysis (Agnew et al., 2013). Considering the unique mechanism of speech (Tonndorf, 1968), this study may be partly heterogeneous with respect to the other action-outcome paradigms, such as hand touching, although including or excluding it did not significantly affect the results of the ALE analyses. Future research could restrict the inclusion criteria to hand-induced action outcomes and exclude speech studies. Second, despite our exhaustive efforts to search for relevant studies, the number of papers incorporated into this meta-analysis was limited, and the distribution of outcome modalities was uneven. Furthermore, the number of articles in our subgroup analyses was also restricted. Consequently, despite our use of stringent correction methods, the findings of this meta-analysis still require cautious interpretation.

Conclusion

Our meta-analysis provides robust evidence for a shared neural network involved in sensory attenuation across various sensory modalities. First, significant activation was noted in the right superior temporal gyrus, right middle temporal gyrus, and right insula when comparing external-generated to self-generated conditions. We hypothesize that these regions participate in a feedback loop. Second, increased activation in the right cerebellum was observed during self-generated compared to external-generated conditions. This activation likely facilitates the comparison between predicted and actual action outcomes, suggesting that the cerebellum is part of the feedforward network. Furthermore, network analysis supports the idea that both feedforward and feedback networks may collectively contribute to the sensory attenuation phenomenon. These findings further support the role of computational theories of motor control in explaining sensory attenuation and provide additional evidence for the underlying neural mechanisms.