INTRODUCTION

The perception of self-movement (i.e., voluntary movement without external influence) is critical for navigation, spatial orientation, and motor control. A person or an animal can move effectively through the outside world only by constantly monitoring its location and trajectory. The main sensory signals that make it possible to perceive self-motion are visual, vestibular, and proprioceptive.

Vision relies heavily on the two-dimensional pattern of image motion that the external environment projects onto the retina during movement. It therefore specifies self-motion only in eye coordinates, even though the eyes move relative to the head and the head moves relative to the body. Another limitation of visual self-motion signals is that the retinal image contains information not only about self-movement but also about the movement of surrounding objects.

The vestibular system provides additional information. First, the vestibular organs located in the inner ear signal head movements directly, eliminating the need to convert eye-centered coordinates to head-centered coordinates and avoiding the errors that result from doing so. Second, the vestibular signal does not depend on the movement of external objects. On the other hand, the vestibular organs generate signals that carry information about head accelerations rather than velocities. The vestibular system also directly assists vision at the level of sensory detection. When the head moves, vision is potentially impaired. When the vestibular organs detect head movements, they trigger reflex compensatory movements that tend to keep the eyes and head stable in relation to the outside world. An example is the vestibulo-ocular reflex [1], in which turning the head causes a reflex counter-rotation of the eyes that helps to maintain visual stability. Thus, vestibular signals generate reflexes that reduce unwanted visual stimulation.
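The compensatory action of the vestibulo-ocular reflex described above can be illustrated with a minimal numerical sketch. It assumes an idealized linear reflex in which eye-in-head velocity is the negative of head velocity scaled by a gain; the function names, the gain values, and the head velocity are illustrative assumptions, not measurements from the cited literature.

```python
def vor_eye_velocity(head_velocity: float, gain: float = 1.0) -> float:
    """Eye-in-head velocity (deg/s) produced by an idealized linear reflex.

    A gain of 1.0 is an idealization: the eyes counter-rotate at exactly
    the speed of the head.
    """
    return -gain * head_velocity


def gaze_velocity(head_velocity: float, gain: float = 1.0) -> float:
    """Velocity of gaze in space: head velocity plus eye-in-head velocity."""
    return head_velocity + vor_eye_velocity(head_velocity, gain)


# Hypothetical head turn at 40 deg/s.
print(gaze_velocity(40.0))        # perfect compensation: gaze stays still
print(gaze_velocity(40.0, 0.9))   # imperfect gain leaves residual image motion
```

In this toy model, any gain below 1 leaves a residual gaze velocity, which corresponds to unwanted motion of the image on the retina.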

Since visual and vestibular signals must work together to provide the best estimate of self-motion, they must converge in the same areas of the brain. Although extensive visual and vestibular processing occurs in subcortical structures, a large body of new data on visuovestibular interactions in various areas of the cerebral cortex has accumulated in recent years for both monkeys and humans [2], and the nature of the interaction between different types of afferent signals has become the subject of detailed study. One might expect a single cortical area specialized for visuovestibular integration to be sufficient to explain the perception of self-motion, but numerous studies have shown that there are at least four such areas in both the macaque and the human brain. In macaques, tuning for the direction of movement is similar in the dorsal medial superior temporal area (MSTd) of the visual cortex, the ventral intraparietal area (VIP), the parietoinsular vestibular cortex (PIVC), and the parietotemporal association area, and this holds for visual, vestibular, and combined stimulation [3]. A possible interpretation is that these areas serve different purposes, in which case their response properties would be expected to differ. However, at least in macaques, a number of the response properties studied so far appear to be quite similar across areas. In part, this may reflect an incomplete understanding of how neural representations of self-motion encode object movement, as well as eye and head movements. It is likely that, in the analysis of self-motion, sensory and motor signals must be combined in various ways in order to generate representations of self-motion that are stable in the natural environment [2].

The existing literature on visuovestibular integration areas in macaque monkeys and humans suggests some functional differences between them. In addition, the respective roles of these cortical regions in the perception of self-movement are not fully understood. Therefore, this review provides a relatively detailed analysis of research data from only three areas involved in the assessment of self-motion. The first is area 7a of the posterior parietal cortex (PPC), in which, as was recently shown [4], vestibular input dominated visual input under combined visual-vestibular stimulation, to the point of suppressing it. The second is the cingulate sulcus visual area (CSv), which integrates not only visual and vestibular afferent signals but also proprioceptive signals from the lower extremities and thus presumably provides interaction between the sensory and motor systems during locomotion [5]. The third is the region of the superior parietal lobule (SPL), in which, according to recent studies [6], visual and somatic inputs interact, allowing one to control behavior when reaching for and grasping a target with one’s hand.

Posterior Parietal Area 7a

Efficient spatial navigation requires not only static external sensory signals but also dynamic signals generated by self-motion. One area of the cerebral cortex that may process information about self-motion and transmit it to other structures involved in navigation control is PPC [7]. As shown by anatomical studies in macaques [8], the main part of PPC with direct and indirect connections to the hippocampus (one of whose functions is to encode and memorize the surrounding space) is area 7a, the posterior part of the inferior parietal lobule, located between the intraparietal and superior temporal sulci. In cynomolgus macaques, a lesion in area 7a was shown to impair maze navigation [9]. It was also shown in monkeys [10] that neurons in area 7a are maximally activated by visual stimulation that causes the illusion of the observer moving forward or backward and that their responses are also modulated by head position, which indicates a possible role of this area in transforming sensory inputs for self-motion [11]. Vestibular inputs may reach area 7a through connections either with the thalamic nuclei that transmit the vestibular signal to the cortex or with multimodal areas that project to 7a, such as MSTd and VIP [12].

Thus, both anatomical and physiological evidence suggests that area 7a plays a role in the convergence of self-motion signals. However, although the responses of area 7a neurons to visual stimuli are well described [10], their sensitivity to vestibular inputs was not explicitly tested until 2019, even though indirect evidence of a vestibular influence on neural responses in 7a had been obtained earlier [13]. In particular, head orientation was found to modulate visual responses only when the speed of head rotation was above the vestibular threshold. In 2019, E. Avilla et al. [4] demonstrated a direct influence of the monkey’s own movement (both linear movement and rotation) on the responses of individual neurons in area 7a, and the sensitivity to vestibular signals in this area was higher than to visual ones. We therefore describe this study in more detail.

The authors used a motion platform with translational and rotational degrees of freedom to investigate the perception of self-motion speed derived from vestibular cues. To simulate visual motion, they used translational (radially expanding) and rotational visual stimuli, which allowed them to explore the perception of both the linear and angular components of self-motion speed. In addition, for the first time, they combined visual simulation with real movement to study the multisensory convergence of self-motion signals in area 7a during rectilinear movement.

Approximately 40% of neurons responded significantly to vestibular stimulation during linear movement, while about 27% responded to visually simulated movement. Although the proportion of neurons responding to the combined stimulus was relatively large (~38%), only a small fraction of these neurons responded to both stimuli when presented separately (~17%). In fact, given the ratio of vestibular to visual neurons in the study population, the number of bisensory neurons was only marginally higher than expected by chance. Therefore, although both visual and vestibular cues are present in 7a, multisensory convergence of information about movement is rare in this region at the level of individual neurons.
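The comparison with chance in the paragraph above can be made concrete with a short calculation. The sketch assumes that visual and vestibular responsiveness are statistically independent across neurons, so that the expected fraction of bisensory neurons is simply the product of the two unisensory fractions. The input values are the approximate percentages quoted above; the function name is illustrative, and the exact statistical test used in [4] is not reproduced here.

```python
def expected_bisensory_fraction(p_vestibular: float, p_visual: float) -> float:
    """Fraction of neurons expected to respond to both modalities by chance.

    Assumes responsiveness to each modality is independent across neurons.
    """
    for p in (p_vestibular, p_visual):
        if not 0.0 <= p <= 1.0:
            raise ValueError("fractions must lie in [0, 1]")
    return p_vestibular * p_visual


# Approximate fractions quoted in the text for linear motion.
chance = expected_bisensory_fraction(0.40, 0.27)
print(f"expected by chance: {chance:.3f}")  # prints "expected by chance: 0.108"
```

Under this independence assumption, roughly 11% of the recorded population would be expected to respond to both stimuli by coincidence alone, which is the baseline against which an observed bisensory fraction has to be judged.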

Experiments using real or visually simulated rotational movement showed that neural responses were largely similar to those observed during rectilinear movement. About 31% of neurons responded to real rotation, and about 20% to visually simulated rotation. The study by Avilla et al. [4] also revealed an important qualitative feature of the representation of angular velocity in 7a: the firing rate increases with rotation speed regardless of the direction of rotation. It can be concluded that, although the magnitude of angular velocity can be decoded from this area, information about the direction of rotation probably has to be obtained from other areas of the brain.
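Why direction-independent tuning permits decoding of speed but not of direction can be shown with a toy model. The sketch below assumes a hypothetical neuron whose firing rate grows linearly with the absolute value of angular velocity; the linear form, the baseline, and the gain are illustrative assumptions rather than fits to the data in [4].

```python
import math


def firing_rate(omega_deg_s: float, baseline: float = 5.0, gain: float = 0.2) -> float:
    """Hypothetical rate (spikes/s) that depends only on rotation speed."""
    return baseline + gain * abs(omega_deg_s)


def decode_speed(rate: float, baseline: float = 5.0, gain: float = 0.2) -> float:
    """Invert the tuning curve; only the magnitude of rotation is recoverable."""
    return (rate - baseline) / gain


leftward = firing_rate(-30.0)
rightward = firing_rate(30.0)

# Both directions produce the same rate, so the sign of rotation is lost.
print(leftward == rightward)
print(math.isclose(decode_speed(leftward), 30.0))
```

Because the two directions map onto the same firing rate, a downstream reader of this signal can estimate how fast the animal is rotating but needs an additional, direction-selective signal to determine which way.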

Thus, these experiments showed that neurons in area 7a are sensitive to both translational and rotational vestibular signals.

As a rule, vestibular rather than visual signals predominated in responses to a combined visuovestibular stimulus in a study by E. Avilla et al. [4]. In some cases, neurons that selectively respond to visual motion were even suppressed when it was combined with platform movement. This dominance of vestibular influences on responses was unexpected, because area 7a is thought to be primarily visual and is part of the dorsal visual hierarchy [14]. Moreover, vestibular dominance has not been reported in other multimodal parietal areas such as MSTd and VIP [15]. The fact that vestibular inputs elicit stronger neural responses in area 7a suggests that local mechanisms may suppress visual input signals when vestibular signals are available.

Under both visual and vestibular stimulation, the responses of neurons in area 7a often varied with stimulus amplitude, which the authors of [4] expressed as a function of peak movement speed, demonstrating a speed-dependent increase in firing rate. Firing rate increased with the speed of both rectilinear movement and rotation; however, the two populations of neurons showed practically no overlap. This separation of the representations allows the translational and rotational components of motion to be decoded independently by reading out the signals of the respective populations and integrating them to generate estimates of linear and angular position, respectively. Such a separate representation may also be important for behavioral flexibility, allowing animals to choose actions based on either their spatial orientation or their distance to the target, depending on the context.
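The idea of integrating separately decoded speed signals into position estimates can be sketched as follows. The example assumes that signed linear and angular velocity estimates are already available at regular time steps (for rotation, the sign may have to come from outside area 7a, as noted earlier) and uses simple rectangular integration. The velocity traces, the time step, and the function name are hypothetical.

```python
def integrate(velocities, dt):
    """Cumulative rectangular (Euler) integration of a sampled velocity trace."""
    position = 0.0
    trace = []
    for v in velocities:
        position += v * dt
        trace.append(position)
    return trace


# Hypothetical decoded velocity traces, sampled every 0.5 s.
dt = 0.5
linear_velocity = [0.0, 0.1, 0.2, 0.2, 0.1, 0.0]     # m/s
angular_velocity = [0.0, 5.0, 10.0, 10.0, 5.0, 0.0]  # deg/s

# Each component is integrated on its own, mirroring the separate populations.
distance_travelled = integrate(linear_velocity, dt)[-1]   # metres
heading_change = integrate(angular_velocity, dt)[-1]      # degrees

print(f"distance: {distance_travelled:.2f} m, heading change: {heading_change:.1f} deg")
```

Keeping the two integrations independent means that an error in one estimate (for example, heading) does not directly contaminate the other (distance), which is one possible advantage of a separated representation.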

The authors of [4] found that neurons in area 7a are spatially grouped in accordance with sensory perception of self-motion; however, the heterogeneity of responses in different areas suggests that there may be no clear topographic organization for a particular modality or speed.

It should be noted that in addition to encoding information about self-motion, area 7a, like some other PPC areas, can reflect several levels of hierarchical spatial representation, from the transmission of sensorimotor signals to signal changes in strength and time (for example, attention, working memory or decision processing), and even encoding abstract spatial information (spatial relations, categories) [16].

Cingulate Sulcus Visual Area CSv

For many years, the analysis of visual information about self-motion in macaques has been associated with the MSTd area, although similar responses have been recorded in many other areas of the brain of these animals, for example, in VIP [2]. Based on functional magnetic resonance imaging (fMRI) data, a human homolog of MSTd, commonly referred to as hMST, was proposed [17]. In 2008, in human studies, M.B. Wall and A.T. Smith reasoned [18] that any area of the cerebral cortex that specializes in extracting information about self-motion from the retinal image should be active in the presence of a natural visual stimulus but should not respond to visual motion stimuli that contain no information about self-motion. Using fMRI, they examined five regions of the cerebral cortex: hMST, hVIP (human VIP), hV6 (V6 is an area in SPL), PIVC, and a small cingulate sulcus area. Responses measured in these areas under visual stimulation that did not reflect self-motion were reduced relative to conditions in which the visual stimulus was consistent with self-motion, by amounts ranging from 10% in hMST to 80% in PIVC. The greatest decrease, about 90%, occurred bilaterally in the cingulate sulcus area under study, which M.B. Wall and A.T. Smith named the cingulate sulcus visual area (CSv). It was therefore suggested that CSv has the most developed ability to respond to visual stimuli of self-motion and to ignore stimuli that do not reflect self-motion [18].

In parallel and independently, a second group [19] identified the region corresponding to CSv, calling it the dorsal posterior cingulate cortex. In that fMRI study, the area was found to respond to visual stimuli but not to random motion. It has since been shown that the activity of neurons in this area is actually suppressed by random motion [20] and that CSv neurons respond during natural simulation of self-motion, but not during equivalent motion that simulates objects moving around a static observer [21]. In an experiment using motion that simulated a turn to the left or right while moving forward, the direction of the course change could be easily decoded in CSv. This suggests that CSv contains neurons that not only respond to a change in the course of motion but also respond selectively according to the direction of this change.

The use of artificial vestibular stimuli in combination with fMRI [22] showed a pronounced activity of CSv neurons in response to vestibular stimulation, which provided additional support for the assumption that CSv is involved in monitoring ego motion. Thus, CSv is potentially a site of visuovestibular interaction.

Despite the lack of clear neurophysiological evidence for the presence of CSv in macaques due to technical difficulties in recording neuronal activity deep in the medial cortex, a putative CSv analog (mCSv) in macaques was identified using fMRI [23]. The macaque cingulate sulcus contains three motor areas [24] known as the cingulate motor areas: rostral, dorsal, and ventral (CMAv). Two of them, CMAv and mCSv, are located on the ventral bank of the cingulate sulcus. An fMRI comparison [25] of the location and extent of these areas shows [24] that the mCSv is located near the caudal border of the CMAv. Moreover, careful analysis of the anatomical data suggests that the mCSv forms the caudal part of the CMAv and receives proprioceptive signals from the lower limbs represented in the CMAv.

Three motor regions of the cingulate gyrus have also been described in the human brain: the anterior and posterior zones of the rostral cingulate gyrus and the caudal cingulate gyrus zone [24]. However, their arrangement in relation to each other and to the CSv region differs from that of macaques. The most posterior of the three, and therefore closest to the CSv, is the caudal cingulate zone. In an fMRI experiment [26], the author confirmed the location of CSv at the bottom of the cingulate sulcus and concluded that the activity in CSv differs from the motor activity of the cingulate gyrus. In another fMRI experiment [27], CSv was found to be one of three visual regions that were active during leg movements.

Significantly, the association of CSv with the supplementary motor area (SMA) was noted. On this basis, CSv is perhaps better thought of as part of the sensorimotor system than as part of the perceptual system. In particular, it was suggested [22] that the function of CSv may be to transmit sensory information about self-motion to the motor system to facilitate effective online control of locomotion. In support of this suggestion, somatosensory connection has been found predominantly with the medial part of the primary somatosensory cortex, in the paracentral lobe, where the legs and feet are represented. This selective connection with the medial somatosensory cortex has been experimentally confirmed [22].

Thus, CSv can receive proprioceptive as well as visual and vestibular cues that are related to movement. Human and macaque CSv, despite some differences, have a homologous arrangement and similar properties. CSv is associated with medial motor areas in both species, especially cingulate motor areas and SMA, indicating its involvement in motor control. CSv is best thought of as part of the cingulate gyrus motor complex. The properties of CSv suggest that it provides a key interaction between the sensory and motor systems in motion control. It is likely that its role is to control movement online, including avoiding obstacles and maintaining the intended trajectory. However, there is still considerable uncertainty about the role of CSv, and research on several fronts will be required to resolve it. Refinement of knowledge about CSv connections, combined with refinement of knowledge of the functions of the regions with which it is associated, as well as neurophysiological recordings obtained in mCSv, especially during locomotion or at least leg movements, will help to better understand the role of CSv. In human studies, functional imaging during movement may be useful, although currently available imaging techniques during active body movement, such as EEG, have significant limitations in resolution and localization accuracy.

Superior Parietal Lobule SPL

For a long time, the macaque superior parietal lobule was considered a somatic structure in which the body, and especially the upper limbs, is represented [28]. However, recent studies have shown that SPL is also connected with the visual structures of the brain, and it is now clear that visual and somatic inputs interact in SPL, allowing behavior to be controlled when reaching for and grasping a target [29]. Essentially, the SPL is an interface between the visual and somatosensory areas. Accordingly, the caudal end of the SPL, on the border with the occipital pole, contains a cortical area (V6) in which visual input dominates, whereas its rostral end, on the border with the primary somatosensory cortex, contains an area (PE) in which somatosensory input dominates. The regions in between (V6A, PEc) exhibit intermediate functional properties, with visual sensitivity decreasing and somatosensory sensitivity increasing in the rostral direction [6].

The functional properties of SPL neurons have been studied in alert non-human primates using hundreds of extracellular microelectrode recordings [29]. Below, we discuss in more detail the SPL regions (V6, V6A, PEc, and PE), which are arranged in the caudorostral direction and receive visual and somatosensory information; some of them are directly connected with the dorsal premotor cortex.

V6 contains a complete, retinotopically organized representation of the contralateral visual field, with emphasis on the periphery and, in particular, the lower visual field [30]. In V6A, by contrast, the upper part of the contralateral visual field is overrepresented, the ipsilateral visual field is represented more extensively than in V6, and retinotopic organization is poor: the central part of the visual field is represented mainly dorsally and the periphery ventrally, on the border with V6, with mixed representations of the upper and lower visual fields [31]. In the PEc region, visual cells constitute a minority of the neural population; they are not organized retinotopically, and most of them represent the central 30° of the contralateral visual field, especially the lower hemifield [31]. Most visual cells in V6, V6A, and PEc are sensitive to the direction of movement of visual stimuli, but because the proportion of visual cells decreases from V6 to PEc, the total number of direction-sensitive cells decreases correspondingly. Visual cells are practically absent in the PE region [32].

The different representation of the visual field in the SPL regions is likely related to the functional roles of these regions. For reaching and grasping, several properties of V6 are needed to determine the features of the target object, especially when the object moves across the field of view as a result of self-motion, of its own movement, or both: the retinotopic representation of the entire field of view, including the far periphery; the high sensitivity of V6 neurons to orientation, size, and direction of motion; and the ability of many of them to recognize the real movement of objects (“real movement cells” [33]). It may be suggested that V6, as well as the V6A and PEc regions, provides this type of visual information to the visuomotor centers involved in the control of purposeful movements [34]. In this connection, it is worth noting that the lower quadrant of the visual field is especially well represented in V6, V6A, and PEc; this is the part of peripersonal space through which the limbs usually pass during purposeful grasping movements [35] and at which we look during locomotion to avoid obstacles. V6 can also provide useful visual information to other areas of the cerebral cortex involved in the control of movement and navigation, because it represents the entire visual field, including the far periphery [30], contains many direction-selective neurons, and is activated by visual stimuli simulating self-motion [36]. Some human neuroimaging studies support this view [37].

In the V6A region, the retinotopic organization of visual cells, in contrast to V6, is strongly “blurred.” It has been suggested that this apparently chaotic retinotopic organization is necessary for the creation of so-called “real position” cells, i.e., cells whose receptive field remains constant in space regardless of the position and movement of the eyes [30]. The activity of V6A neurons is also modulated by shifts in spatial attention. It is possible that the spatial coordinates encoded by real position cells can be used to direct attention to an object. V6A contains cells whose activity is modulated by the direction of gaze [30] and by the direction and amplitude of purposeful hand movements [38], as well as neurons whose activity is modulated by the shape the hand adopts to match the grasped object [35]; this is in good agreement with the view that V6A is directly involved in the control of grasping movements. It has recently been shown that the activity of individual V6A cells is modulated by most of these factors, that is, the cells show mixed selectivity [39]. The tuning of cell activity to each factor is not static but changes over time, indicating that visuospatial and visuomotor transformations occur sequentially in V6A, a property useful for controlling purposeful hand movement [39].

In the PEc region, the functional properties of visual neurons are very similar to those in V6A, the only difference being their prevalence in the overall cell population. Whereas visual cells account for about 60% of the total in V6A, they account for about 40% in PEc [31]. The remaining 60% of PEc neurons are somatosensory or somatomotor in nature, as are about 40% of V6A cells. However, the somatosensory and somatomotor neurons differ significantly between the two areas. While only the upper limbs are represented in V6A, both the upper and lower limbs are represented in PEc [31]. It was therefore suggested that V6A is involved in the control of object grasping by the upper limbs, whereas PEc is involved in the control of the interaction of the arms and legs with environmental objects and during locomotion [31]. A human neuroimaging study supports this suggestion for the homologous regions of the human brain, showing, in particular, that the putative human homolog of PEc responds both to hand and foot movements and to visual stimulation, similar to the macaque PEc [40].

As mentioned above, visual cells are practically absent in the PE region, where most neurons respond to proprioceptive stimulation [41]. PE has been found to contain a rough topographic representation of the body dominated by the upper extremities, with the legs less represented [32].

Studies using retrograde neuronal tracer injections have identified the cortical inputs to SPL [6, 34]. Afferents to the SPL closely follow the functional gradient observed in this structure: the caudal portion is dominated by visual properties and by visual afferents that originate from the primary visual cortex as well as from many extrastriate visual areas of the parietal cortex, whereas the most anterior SPL is dominated by somatosensory properties and by afferents from the primary somatosensory cortex and from some areas of the parietal cortex in both the superior and inferior parietal lobules, as well as by motor afferents from the primary motor and premotor areas [32]. In addition, as has long been known, the macaque SPL receives thalamic input from the pulvinar complex and the lateral posterior nucleus, as well as from several other thalamic nuclei [42].

In general, SPL regions receive both visual and sensorimotor afferents, with a clear functional gradient from posterior visual input at V6 to anterior somatosensory/somatomotor input at PE. It should be noted that the “pure” visual area V6 receives not only visual inputs but also inputs from bimodal areas of the cortex and thalamic nuclei. Similarly, the somatosensory PE region receives afferents from both somatosensory and bimodal cortical regions and thalamic nuclei [6]. It is possible, however, that only visual neurons in the bimodal regions project to V6 and, similarly, that only somatosensory inputs reach PE from the bimodal regions.

Thus, both anatomical and physiological evidence suggests that the SPL is an interface between the visual and somatosensory cortical areas. It is involved in the control of grasping movements, from the direction of hand movement to the shaping of the hand to match the object, and may also participate in the control of the interaction of the arms and legs with surrounding objects and during movement through space.

CONCLUSIONS

The review of literature data on the integration of vestibular, visual, and proprioceptive inputs in various areas of the cerebral cortex and the nature of their interaction showed that, despite the abundance of studies of numerous areas of the cortex that have vestibular, visual, and somatosensory inputs, their functions and connections are poorly studied and understood [43]. The same information about self-motion is presented in several anatomically diverse cortical areas, which suggests that this information in different areas of the brain is used for different purposes, and the multiplicity of signal representation in different areas of the cortex is, apparently, the leading principle of motor organization in humans and animals [2]. It may be suggested that the exchange and distribution of information between brain regions is probably modulated by the requirements of the task. To understand this, as well as how the brain converts sensory input to a behavioral format, future research needs to combine complex natural tasks with normative models of behavior [44].