Outline

There has been renewed interest in the functional organization of the rodent visual system over the last decade, with the mouse as a target of special interest, owing to an expanding suite of genetic tools available for use in vivo. These methods offer considerable advantages for systems neuroscience, including the opportunity to monitor and directly manipulate neuronal populations of defined cell type and cortical layer, and to identify biomarkers and neuronal populations underlying visual perception and cognitive behaviors. It is now possible to anatomically trace (Harris et al. 2019) and functionally image (de Vries et al. 2020) neurons of defined cell types and layers, and to combine these anatomical and functional investigations in single animals (Bennett et al. 2019; Huang et al. 2020; Kim et al. 2015; Young et al. 2021). These new technologies make the mouse a leading model organism for studying the relationship between structural and functional connectivity.

The seminal observations that triggered this surge in interest include reports that (1) the extrastriate visual cortex has a complex anatomical structure (Wang and Burkhalter 2007), and (2) some visual processing functions are shared with primates, such as orientation and spatial frequency selectivity in mouse V1 (Niell and Stryker 2008). Prior to these studies, the mouse was not considered an ideal model for vision, owing to its small brain size and low visual acuity (Prusky et al. 2000). In this review, we discuss how the cortical and subcortical visual regions in mice are interconnected anatomically, and how network connectivity shapes visual function. Where relevant data are available, we also cite other rodent species, with the caveat of non-negligible differences across species. We mainly consider the mesoscale level, which covers intra-areal local connectivity as well as inter-areal long-range connectivity.

The aim of this review is to summarize the organizational features governing the inter-areal network of the mouse visual system. For this purpose, we survey empirical evidence from the standpoint of topographical organization, inter-areal connectivity and hierarchy, and identify points of deviation from the hierarchy. We discuss whether and how these features are shared with those of the primate visual system, one of the best characterized systems among mammals, in the following three sections:


Distinctive features of the mouse visual system: We introduce the general anatomical and physiological features that differentiate the mouse visual system from other mammalian species. We present the currently known regions involved in vision, together with the distinct characteristics of the physical connections between the regions, which may contribute to a biased visual field covered by the higher visual areas (HVAs) of mice. In the following sections, we further examine organizational features of these visual areas as an inter-areal network.


Anatomical and physiological markers of hierarchy: We discuss the expression in the mouse of the two most prominent network features of the visual system: hierarchy and dorso-ventral streams. The network of the visual system has been studied extensively in primates as the best model for human vision. Its landmark topological feature is a hierarchical organization, summarized by Felleman and Van Essen (1991). Considered a fundamental principle of the visual system, the hierarchical organization has been widely adopted in leading theories of visual computation in the mammalian brain (Bastos et al. 2012; Kawato et al. 1993; Keller and Mrsic-Flogel 2018; Marr 1982; Rao and Ballard 1999). Moreover, at higher stages of the hierarchy in the primate cerebral cortex, the visual areas are organized in dorsal and ventral streams that compute distinct aspects of the visual information in a serial manner (Desimone and Ungerleider 1989; Kravitz et al. 2011; Milner and Goodale 1993; Ungerleider and Mishkin 1982; Zeki and Shipp 1988). These concepts of network organization appear to be key ingredients in explaining the neural activity of the visual system. Indeed, single-neuron activity in primate ventral stream areas has been successfully approximated by artificial neural networks with a hierarchical and serial network architecture (V4: Bashivan et al. (2019); IT: Yamins et al. (2014)). Do such streams exist, and are they also embedded in a hierarchy in the rodent visual system? To answer this question, we review recent anatomical tracing investigations as well as functional imaging and electrophysiological investigations. These studies point to the existence of a hierarchy, but one in which the levels are less distinct than in primates, indicating that the mouse visual system is also endowed with abundant non-hierarchical processing.


Non-hierarchical visual processing: We discuss two types of non-hierarchical pathways: (1) pathways within the visual cortex and (2) pathways from subcortical structures to the visual cortex. As an example of the former, we discuss bypassing connections that skip a hierarchical level. In primates, this type of connection sends a specialized visual attribute to a recipient higher visual area, distinct from the attributes conveyed by hierarchical projections. We review empirical evidence supporting this view in mice, together with its distinctive features. As examples of the latter type, we discuss thalamo-cortical and colliculo-cortical pathways. In contrast to the currently dominant hypothesis in primates that the colliculo-cortical pathway modulates rather than drives visual responses in the cortex, recent studies in mice suggest a more prominent role of the colliculo-cortical pathway in visual processing.

Elucidating the hierarchical and non-hierarchical pathways reviewed in this article will lead to a next-level question: how do these pathways interact to process and transform visual information inside the brain? To approach this next frontier, it will be critical to record from, and analyze activity in, multiple visual areas along the pathways. “Future perspective” presents some ideas for future studies that may provide critical information to answer this grand question.

Distinctive features of the mouse visual system

We introduce the regions of the mouse cortex involved in vision and how they differ from the primate visual system. We discuss the synaptic connectivity of these regions, which may explain the biased visual field coverage in higher visual cortical areas.

Visual cortical areas in mice

The primate visual cortex contains ~ 30 areas that respond to visual stimuli, and which cover more than half of the neocortex (Fig. 1A) (Felleman and Van Essen 1991). In their structural layout, there is a feature that can also be found in most diurnal mammals: area V1 is bordered almost entirely by a single elongated area V2, performing a second-order transformation of the whole visual field (Allman and Kaas 1974; Rosa and Krubitzer 1999). The neurons at the border between V1 and V2 have their receptive fields on the vertical meridian of the visual field. The rest of the visual areas are anterior to V2 (Fig. 1A), stacking one by one in a mosaic.

Fig. 1 Comparison between the visual areas of macaques and mice. A The flattened neocortices of macaques (adapted from Felleman and Van Essen (1991)) and mice (adapted from Gămănuţ et al. 2018), at the same scale. In macaque cortex, areas V1 and V2 were separated along their border during flattening. B The flattened neocortex of mice (same as in panel A), magnified ten times. In both panels, the coloured areas represent visual areas, while the white areas are non-visual areas. The areas coloured in orange belong to the ventral stream, while the areas coloured in blue are in the dorsal stream. The purple areas are other visual areas

The spatial arrangement of the mouse visual cortex diverges from that of the primate, particularly in the region around V1. The visually responsive region is occupied mainly by V1 and a relatively narrow strip around V1. This strip contains at least nine visual areas according to the widely accepted view today (Fig. 1B). Seven of them are directly adjacent to V1, unlike the single border with V2 in the primate visual cortex. Until 2007, the mouse visual cortex was considered to have at most four visual areas around V1 (Caviness 1975; Dräger 1975; Wagor et al. 1980). This reduced number of visual field representations in the cortex seemed in line with the low visual acuity of the mouse. However, using injections of high-precision anterograde tracers, coupled with electrophysiological recordings, Wang and Burkhalter (2007) found that this strip is composed of nine areas, each performing a unique transformation of the visual field. The vertical meridian is represented along the border between V1 and LM, making LM a candidate for the homologue of V2 in mice.

The organization in mice is consistent with that observed in rats, where anatomical and physiological investigations had also shown the existence of at least nine visual areas within regions similar to those cytoarchitectonically defined in mice (introduction in Olavarria and Montero (1989), Montero (1993)). This peculiar organization of the visual cortex may be related to the fact that mice and rats are nocturnal and do not rely heavily on vision. Interestingly, the squirrel, a diurnal rodent species that depends much more on vision, exhibits both rodent features and an extrastriate visual cortex organized more similarly to that of primates. On the one hand, connectivity studies suggested the existence of areas AM and PM medial to V1, and areas P and POR lateral-posterior to V1, the most probable homologues of the areas with the same names in mice (Negwer et al. 2017). On the other hand, there is evidence for an elongated V2 lateral to V1, and a mosaic of higher visual areas anterior to it, as in primates (Kaas et al. 1989; Negwer et al. 2017; Van Hooser and Nelson 2006). There is an ongoing debate about whether V2 should be divided into multiple areas, based on the observation that projections to V1 exhibit multiple patches within V2 (Kaas et al. 1989). This patchiness is interpreted as projections from different processing units within V2 to functionally related processing units in V1 (Kaas et al. 1989). However, these patches might also indicate the partition of V2 into multiple areas, similar to mice and rats (Laramee and Boire 2014).

The extrastriate areas in mice can further be grouped into ventral and dorsal streams (Fig. 1B). In the primate visual system, the two streams (Fig. 1A) are hypothesized to compute different aspects of an incoming visual scene. The ventral stream is specialized for object identity, whereas the dorsal stream is specialized for motion perception and visually guided behavior (Kravitz et al. 2011; Mishkin et al. 1983). The evidence for streams in mice came from anatomical studies, which showed that areas LM, LI, P, POR (Fig. 1B, orange) belong to the ventral stream because they project primarily to temporal and parahippocampal cortices, located ventrally in the cortex, and that areas AL, RL, A, AM, PM (Fig. 1B, blue) belong to the dorsal stream because they project primarily to parietal, motor and frontal cortices, located dorsally (Wang et al. 2011, 2012). In primates and carnivores, the ventral and dorsal streams are known to project preferentially to ventral and dorsal regions of the rest of the cortex, respectively (Hilgetag et al. 2000; Ungerleider and Mishkin 1982). Interestingly, the target areas of the ventral stream include the lateral entorhinal cortex, which has weak selectivity to space (Hargreaves et al. 2005), whereas those of the dorsal stream include the medial entorhinal cortex, which has strong selectivity to space and contains grid cells (Hafting et al. 2005). Moreover, ventral stream areas project strongly to the amygdala, whereas connections from the dorsal stream areas and from V1 are absent (Meier et al. 2021). These results support the notion that the two streams convey distinct visual information to their target areas. The streams in mice also have a distinctive feature: while in primates areas V2 and V3 process aspects from both streams, in mice area LM (currently considered the most likely homologue of V2) is a part of the ventral stream (Wang et al. 2011, 2012).

There are some reported differences in visual selectivity between the streams at the neural population level. For example, neurons in superficial layers of AL are preferentially responsive to visual stimuli with high temporal and low spatial frequencies, while neurons in PM are responsive to stimuli with low temporal and high spatial frequencies (Andermann et al. 2011; Marshel et al. 2011). However, these differences in stimulus preference tend to vanish in deeper layers (de Vries et al. 2020).

There are other cortical areas and subcortical structures in mice that reportedly have visual responses. In the anterior part of the cortex, the Anterior Cingulate cortex (ACC) is known to respond to visual stimulation (Mohajerani et al. 2013; Murakami et al. 2015). ACC receives direct, monosynaptic projections from V1 and from a medial visual area (Sidorov et al. 2020). A small portion of ACC projects back to visual areas, transferring eye-movement information, qualitatively similar to the primate Frontal Eye Field (Itokazu et al. 2018). A small fraction of neurons (~ 10%) in ACC are visually driven (Murakami et al. 2015), and these neurons are distributed within the area. A retinotopic organization has been suggested (Leinweber et al. 2017) but is not fully elucidated to date. Visual responses were also found extending to the S1 barrel and retrosplenial cortices, using a highly sensitive calcium indicator (Murakami et al. 2015; Zhuang et al. 2017). Prostriata in the subicular complex, located between the entorhinal cortex and the hippocampus, is reciprocally connected with V1 (Ding 2013; Lu et al. 2020). This area is involved in rapid processing of moving stimuli in the temporal visual field in primates (Yu et al. 2012). As for the subcortical structures involved in vision, the dorsomedial part of the striatum in the basal ganglia was recently reported to respond to visual stimulation. Simultaneous recording from cortex and basal ganglia, together with inactivation of the cortex, suggested that a part of this response is due to the projection from the cortical area AM (Peters et al. 2021).

As described in the previous paragraphs, there are ongoing efforts to determine regions related to vision in mice. For simplicity, we will be focusing in the rest of the article on the ten cortical areas that are highlighted in color in Fig. 1B, together with the geniculo-cortical and colliculo-cortical pathways.

Pathways from the retina to the cortical visual regions

In the mouse, as in other mammalian species, visual information reaches the cerebral cortex from the retina via two subcortical pathways: (1) the geniculo-cortical and (2) the colliculo-cortical pathways (Roth et al. 2016). In the geniculo-cortical pathway, the retinal signal travels through the dorsal part of Lateral Geniculate Nucleus (dLGN) in the thalamus. In the colliculo-cortical pathway, it travels through the Superior Colliculus (SC) in the midbrain before reaching the Lateral Posterior Nucleus (LPN, rodent equivalent of pulvinar, sometimes referred to as “higher-order” thalamic nucleus of the visual system) in the thalamus. LPN, moreover, is retinotopically divided into three sub-areas (Bennett et al. 2019; Tohmi et al. 2014). A similar parcellation is reported in the primate pulvinar, although a more detailed subdivision is proposed (Kaas and Lyon 2007). The mouse LPN and primate pulvinar share a handful of histological similarities. (1) One of the subdivisions (caudal LPN in mice and central-medial inferior pulvinar in primates) receives a particularly strong projection from SC, shown by the multitude of terminals in this subdivision that contain substance P, characteristic of colliculo-pulvinar projections (Stepniewska et al. 2000; Zhou et al. 2017). (2) Two subdivisions project either to the dorsal or the ventral visual cortical streams. (3) The remaining subdivision (rostral LPN in mice and medial pulvinar in primates) projects to regions including frontal cortex, cingulate cortex, as well as the amygdala (Bennett et al. 2019; Gutierrez et al. 2000).

Despite these similarities, there is a notable difference between mouse LPN and primate pulvinar in the interaction between the bottom-up projections from SC and the dorsal and ventral streams of the visual cortex. The primate SC projects to posterior and central-medial pulvinar, primarily connected to the dorsal cortical areas (Lin and Kaas 1979; Stepniewska et al. 2000). On the other hand, the mouse SC projects to posterior LPN, primarily connected to the ventral cortical areas (Beltramo and Scanziani 2019; Bennett et al. 2019). In contrast, the top-down projections from the visual cortex to SC are organized similarly to those in primates. Specifically, the strongest projection comes from V1, which together with ventral stream areas targets the superficial layers of the SC, while the dorsal stream areas project to both superficial and deep layers of the SC (Wang and Burkhalter 2013). Thus, LPN (or pulvinar) participates in a loop connecting the visual cortex and SC in both species, even though the cortical component of the loop belongs to a different stream in each species.

Connections between visual cortical areas

After reliable partitioning of the visual cortical areas, the next major challenge in understanding the mouse visual system was determining the cortical connectivity. Conventional tracer injections were complicated by the need to confine tracer deposits to a single identified area, given the small size of some of the areas relative to the minimal effective volumes of most tracers. This was further confounded by the need to avoid damage to the fibers of passage near the injection site. These fibers, corresponding to axons that have neither end at the injection site, run through layers 5 and 6 in mice, and they constitute a large proportion of the inter-areal connections (Watakabe and Hirokawa 2018). Damaged fibers may pick up the tracer and potentially label the corresponding neurons (Payne 1987), leading to false positives. The final obstacle was accurately identifying and registering labelled somas (retrograde tracers) and axon terminals (anterograde tracers). To date, these structures are most reliably identified under the supervision of human observers (Gămănuţ et al. 2018; Harris et al. 2019; Wang et al. 2012), rather than by fully automated processes facilitated by artificial intelligence (AI) (Oh et al. 2014). This is particularly the case for anterograde tracers, which label entire axons, so that one must distinguish between terminals and axons (especially fibers of passage).

The resulting connectivity between visual cortical areas of the mouse is distinct from primates in its high interconnectedness: 99% of all possible anatomical connections are present (ultra-density), with moderate and strong weights (anterograde: D’Souza et al. (2020) and Wang et al. (2012); retrograde: Gămănuţ et al. (2018)). Collated data suggests that similar high-density connectivity between visual areas might also exist in rats (Bota et al. 2015). In contrast, in macaques (Markov et al. 2014a) and marmosets (Theodoni et al. 2021), only 67% of all possible connections that could exist actually do exist, with overall smaller weights of connections between the visual areas than in mice.

There are at least two factors that could explain this difference in the percentage of existing connections: the size of the neocortex and the topography of the visual system. Both macaques and marmosets have larger neocortices than mice or rats. Larger neocortices, with more neurons and more cortical areas, cannot maintain the same degree of connectivity as small neocortices because the necessary axons would take up too much physical space. Therefore, as a general rule, larger neocortices exhibit lower overall connectivity than smaller ones (Ringo 1991). This is captured by the different parameter values of the Exponential Distance Rule (EDR) in each species (Ercsey-Ravasz et al. 2013; Horvát et al. 2016; Theodoni et al. 2021). The EDR is an essential constraint in the organization of cortical connections, which states that the physical lengths of individual axons forming inter-areal connections are distributed according to an exponential distribution (i.e., there are exponentially more short-range axons than long-range ones). The ratios of the lengths depend on the size of the neocortex (Horvát et al. 2016; Theodoni et al. 2021). To compare this across neocortices of very different sizes, a numerical procedure to rescale the physical distances across species was proposed (equivalent to adjusting the dimensions of the mouse cortex from Fig. 1A to Fig. 1B) (Horvát et al. 2016). The EDR was then expressed on the common template resulting from rescaling, with γ being the rate parameter of the exponential distribution of normalized axonal lengths. The numerical value of γ indicates how fast the probability of a projection decreases with the projection length. Species with bigger neocortices show higher γ, indicating a steeper decrease in the number of connections formed by long axons. Consequently, there are fewer long-range connections in bigger neocortices, leading to lower fractional connectivity.
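The effect of γ on long-range connectivity can be illustrated with a minimal sketch. It assumes that the EDR takes the form p(d) ∝ exp(−γd) over normalized distances d truncated to [0, 1]; the function name and the γ values are hypothetical illustrations, not fitted values from the cited studies.

```python
import math

def edr_tail_fraction(gamma, d_min, d_max=1.0):
    """Fraction of axons longer than d_min when lengths follow
    p(d) ~ exp(-gamma * d) on [0, d_max] (assumed truncated EDR)."""
    # The integral of exp(-gamma * d) from a to b is
    # (exp(-gamma * a) - exp(-gamma * b)) / gamma; the 1/gamma cancels in the ratio.
    total = 1.0 - math.exp(-gamma * d_max)
    tail = math.exp(-gamma * d_min) - math.exp(-gamma * d_max)
    return tail / total

# Hypothetical rate parameters: a larger gamma stands for a larger neocortex.
for gamma in (2.0, 5.0, 10.0):
    share = edr_tail_fraction(gamma, d_min=0.5)
    print(f"gamma={gamma:4.1f}  share of axons beyond half the maximal distance: {share:.4f}")
```

Under these assumptions the share of long axons shrinks monotonically as γ grows, which is the qualitative behaviour described above.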

The topography of the visual cortical areas further increases fractional connectivity in rodents. To better visualize this aspect, the scaled neocortices in Fig. 1 are again helpful. The ~ 30 visual areas in primates occupy about half of the neocortex (Felleman and Van Essen 1991), with V1 located posterior to all higher visual areas (Fig. 1A). In contrast, the ~ 10 areas in the nocturnal rodents occupy about 15% of the neocortex (Gămănuţ, unpublished results), with V1 in the center and the others arrayed around V1 (Fig. 1B). The result is that the normalized distances between visual areas in rodents are significantly shorter than those of primates. According to the EDR, shorter distances are dominated by high numbers of connections (Horvát et al. 2016), which leads to greater fractional connectivity in the visual system of the two rodent species.

However, the ultra-dense long-range connectivity and the EDR in the mouse are not enough to explain the more intricate organization of connections, which is shaped by the functional interaction between the areas. The best example is the retinotopic arrangement of the projections from V1 to the higher visual areas of rodents (Montero 1993; Wang and Burkhalter 2007). Here, the connections are made preferentially between portions of areas that process the same part of the visual field, and significantly less between pairs that process different parts of the visual field. Whether a retinotopic arrangement exists in the projections between HVAs is currently not known. What is known is that they maintain a relatively high degree of local specificity, defined by the hierarchical and non-hierarchical interactions, as we will see in “Anatomical and physiological markers of hierarchy” and “Non-hierarchical visual processing”.

Biased visual field coverage in HVAs

Since the study of Wang and Burkhalter (2007), the structure of the mouse visual cortical areas has been confirmed and refined using functional widefield optical imaging of hemodynamic activity and genetically encoded calcium indicator activity (Andermann et al. 2011; Garrett et al. 2014; Marshel et al. 2011; Zhuang et al. 2017). The HVAs were eventually classified following a strict, multipart definition: (1) consistent visual field sign within the areal borders; (2) no redundant representation of visual space within the same area; (3) all adjacent areas of the same visual field sign have overlapping representations of visual space; and (4) the area must be at a consistent cortical location across experiments (Garrett et al. 2014; Juavinett et al. 2017). Notably, these recent imaging studies have revealed that some of the HVAs do not process the entire visual field, unlike HVAs in primates. Areas located medial to V1 (e.g., AM and PM) tend to cover the temporal part (away from 30° azimuth) of the visual field, whereas areas located lateral to V1 tend to cover the nasal parts of the visual field. This partial representation of the visual field contrasts with V1, which represents every quadrant of the entire visual field from lower to upper and nasal to temporal. It is noteworthy that although the coverage of the visual field is not complete in each HVA, there are substantial overlaps of visual fields between the HVAs. For instance, the upper nasal visual field is represented by all the HVAs, according to widefield optical imaging (Garrett et al. 2014). Consistently, an anatomical projection from a portion of V1 corresponding to the nasal visual field is clearly observed in all the HVAs (Wang and Burkhalter 2007). These functional and anatomical results speak against the view that the areas surrounding V1 might constitute a single V2 (Rosa and Krubitzer 1999).
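Criterion (1) relies on the visual field sign, which distinguishes mirror from non-mirror representations of the visual field. A minimal sketch of one common way to compute it is given below, assuming that the sign is taken as the sine of the angle between the local gradients of the elevation and azimuth maps. The function name, the toy maps and the sign convention are assumptions for illustration, not the exact pipeline of the cited studies.

```python
import math

def field_sign(azimuth, elevation):
    """Visual field sign at interior pixels of two retinotopic maps.

    Both maps are 2D lists indexed as [row][col]; border pixels are left at 0.
    """
    rows, cols = len(azimuth), len(azimuth[0])
    sign = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central-difference gradients (x along columns, y along rows).
            az_dx = (azimuth[r][c + 1] - azimuth[r][c - 1]) / 2.0
            az_dy = (azimuth[r + 1][c] - azimuth[r - 1][c]) / 2.0
            el_dx = (elevation[r][c + 1] - elevation[r][c - 1]) / 2.0
            el_dy = (elevation[r + 1][c] - elevation[r - 1][c]) / 2.0
            az_angle = math.atan2(az_dy, az_dx)
            el_angle = math.atan2(el_dy, el_dx)
            # +1 and -1 mark the two possible map orientations (assumed convention).
            sign[r][c] = math.sin(el_angle - az_angle)
    return sign

# Toy maps: elevation grows along rows; azimuth grows (map a) or shrinks (map b) along columns.
n = 5
elev = [[float(r) for c in range(n)] for r in range(n)]
azim_a = [[float(c) for c in range(n)] for r in range(n)]
azim_b = [[float(-c) for c in range(n)] for r in range(n)]
sign_a = field_sign(azim_a, elev)[2][2]
sign_b = field_sign(azim_b, elev)[2][2]
print(sign_a, sign_b)
```

In this toy case the two maps differ only by a left-right flip of azimuth, and the sign flips accordingly; contiguous patches of constant sign are the candidates for individual areas.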

How do HVAs come to represent only parts of the visual field? We provide here two possible accounts, based on (1) inter-cortical projections and (2) sub-cortical projections. (1) The biased projections from V1 to HVAs are constrained by the geometrical distances between the areas, as described by the EDR. This implies that an HVA near a segment of V1 receives stronger projections from that segment of V1 and represents a similar portion of the visual field. For example, PM is adjacent to the part of V1 representing the temporal visual field, and it receives a stronger projection from that part of V1. (2) Alternatively, or in conjunction with the projections from V1, projections from subcortical structures, particularly LPN, may explain the biased visual field coverage in HVAs. In LPN, both neurons and their axonal projections have a retinotopic organization (Beltramo and Scanziani 2019). For example, the posterior end of LPN, preferring the upper visual field, projects preferentially to POR. The anterior part of LPN, preferring the lower visual field, projects preferentially to AM. The preferred elevation of all the HVAs, estimated by weighting the elevation map measured in LPN by its projection volume, is indeed very well correlated with the mean elevation measured in each visual cortical area (Bennett et al. 2019). The correlation in azimuth is less clear, but the clear bias of the anterior LPN toward the temporal visual field is consistent with that of its target AM, which has a strong bias to the temporal visual field. Currently, it remains unknown whether the bias in LPN creates the bias in HVAs, because these two structures are reciprocally connected. Given a recent finding that suppression of the colliculo-cortical pathway completely abolished the visual response in one of the higher visual areas (Beltramo and Scanziani 2019), it is tempting to speculate that LPN is the origin of the cortical bias.
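The projection-weighted estimate mentioned above amounts to a weighted mean. The sketch below shows the arithmetic under the assumption that LPN is divided into discrete sites, each with a preferred elevation and a projection volume to a given HVA; the function name and all numbers are hypothetical and are not data from Bennett et al. (2019).

```python
def predicted_elevation(lpn_elevation_deg, projection_volume):
    """Mean of LPN preferred elevations, weighted by projection volume to one HVA."""
    total = sum(projection_volume)
    if total == 0:
        raise ValueError("the area receives no projection from the sampled LPN sites")
    weighted = sum(e * w for e, w in zip(lpn_elevation_deg, projection_volume))
    return weighted / total

# Hypothetical example: three LPN sites and their projection volumes to one HVA.
site_elevation = [-10.0, 0.0, 20.0]   # preferred elevation of each site (degrees)
volume_to_hva = [1.0, 1.0, 2.0]       # arbitrary units
estimate = predicted_elevation(site_elevation, volume_to_hva)
print(estimate)
```

An HVA that receives most of its LPN input from sites preferring the upper visual field is thus predicted to have a high mean elevation, which is the quantity compared with the cortical measurement.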

So far, we have seen that the mouse visual system, although composed of a smaller number of areas, has cortical and subcortical structures comparable to those of primates. Both species have dorsal and ventral streams in the cortex, as well as geniculo-cortical and colliculo-cortical pathways from the retina to the cortex. In the following sections, we will examine the network features of these structures. We first focus on the hierarchy of the visual cortical areas, a prominent network feature known in the primate cerebral cortex (Felleman and Van Essen 1991).

Anatomical and physiological markers of hierarchy

Here, we examine the hierarchical organization of the visual areas, by which we mean a topological sequence of projections between areas (for other definitions of hierarchy, see Hilgetag and Goulas (2020)). We review two models of ranking cortical areas that have been proposed according to the distribution and quantification of projections between areas.

Anatomy of hierarchical connections in visual cortical areas

In primates and carnivores, the laminar distributions of axon terminals were historically the first measures used to build a cortical hierarchy. Axon terminals project to specific cortical layers, depending on the type of projection. Cortico-cortical feedforward pathways are those that originate primarily in L2 and 3 and terminate mainly in L4. Feedback projections arise primarily in L5 and 6, and target mainly L1, while avoiding L4 (Rockland and Pandya 1979). In mice, the feedforward projections terminate in different layers. For example, V1, the putatively lowest cortical area in the visual hierarchy, projects to all the layers of the higher visual areas, not just L4, but significantly less to L1. This projection pattern is also reported in the rat, suggesting a rodent-specific feature (Coogan and Burkhalter 1993). An additional feature in mice is the significant proportion of neurons in L4 projecting to other areas (Harris et al. 2019). On the other hand, the patterns of the feedback terminals are more comparable to what has been observed in primates and carnivores. The projections to V1 from HVAs concentrate more in L1, with intensities specific for each projection, and are less dense in other layers (D’Souza et al. 2016).

Based on the laminar properties specific to rodents reviewed above, two approaches have been proposed to define hierarchical levels in cortical areas of mice. The first used injections of the anterograde tracer Biotinylated Dextran Amine (BDA) in each visual area, every injection being confined to the borders of a single area (D’Souza et al. 2016, 2020). To classify the projections, the rules derived from the connections from and to V1 were used as reference. Thus, feedforward connections were considered those that tended to avoid L1 in target areas, while feedback were those that targeted more L1. This termination pattern was quantified as densities of terminations in L2–4 relative to terminations in L1–4. This measure is comparable to the quantified termination pattern used for the hierarchy analysis of macaque, counting supragranular projecting neurons in source areas relative to total supra- and infra-granular projecting neurons (Markov et al. 2014b). As in the macaque hierarchy analysis, the beta regression model was employed to determine hierarchical level and distance values, which best predict the laminar termination patterns for each interareal connection.
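The idea of inferring hierarchical levels from laminar termination ratios can be sketched as follows. This is a simplified stand-in, not the beta regression used in the cited studies: it assumes that the logit of the termination ratio of a connection equals the level of the target minus the level of the source, and fits the levels by least squares with plain gradient descent. All function names, area names and values are hypothetical.

```python
import math

def logit(p):
    """Map a fraction in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_levels(areas, ratios, steps=5000, lr=0.05):
    """Least-squares hierarchical levels from laminar termination ratios.

    ratios maps (source, target) to the fraction of terminals outside L1.
    The first area in `areas` is anchored at level 0.
    """
    level = {a: 0.0 for a in areas}
    for _ in range(steps):
        grad = {a: 0.0 for a in areas}
        for (src, tgt), r in ratios.items():
            err = (level[tgt] - level[src]) - logit(r)
            grad[tgt] += 2.0 * err
            grad[src] -= 2.0 * err
        for a in areas[1:]:  # the anchor area is never updated
            level[a] -= lr * grad[a]
    return level

# Hypothetical toy data generated from known levels, to check that they are recovered.
true_level = {"area_a": 0.0, "area_b": 1.0, "area_c": 2.0}
pairs = [("area_a", "area_b"), ("area_b", "area_a"), ("area_b", "area_c"),
         ("area_a", "area_c"), ("area_c", "area_a")]
toy_ratios = {(s, t): sigmoid(true_level[t] - true_level[s]) for s, t in pairs}
fitted = fit_levels(list(true_level), toy_ratios)
print({a: round(v, 3) for a, v in fitted.items()})
```

Because only differences between levels enter the model, one area has to be anchored; fixing the putatively lowest area at zero is one convenient choice.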

This modeling analysis revealed four processing stages in the visual hierarchy: the first three are the individual areas V1, LM, RL, while the fourth processing stage contains the remaining seven areas (D’Souza et al. 2020). The merging of the latter areas was due to many lateral connections (neither clearly feedforward nor feedback), which placed them on close hierarchical levels.

In the other approach to defining the hierarchy in mice, a machine learning method was applied on a large set of projections to the entire isocortex, to identify global patterns in the distributions of terminals across all the cortical layers (Harris et al. 2019). These projections were obtained from hundreds of experiments with injections of anterograde labeling viruses, in wild type and Cre driver mouse lines. Only the injections confined at least 50% to one cortical area or one thalamic nucleus were analyzed. The algorithm first classified all the projections into nine types, based on the relative proportions of terminals in each of the six layers. Then, it computed a hierarchy for each possible partition of the nine types in two sets, to determine which can be best considered feedforward and which feedback. In every partition, one of the sets contained a number from one to nine putative feedforward projections and the other contained the remaining types, considered feedback. The goal was to find the partition for which the hierarchy was the most consistent with the types of projections between levels (i.e., most of the feedforward projections go from lower to higher areas, and most feedback from higher to lower areas). The hierarchical position of every area was computed as the normalized difference between its feedforward and feedback connections. To avoid the bias of a mouse line for any of the two directions, each connection was adjusted with a confidence measure characteristic to the corresponding line.
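The partition search described above can be sketched on a toy network. The version below is a simplified assumption, not the published implementation: it omits the confidence weighting of mouse lines, defines the position of an area as half the difference between the mean direction of its incoming and outgoing connections, and breaks the mirror ambiguity (swapping the two sets reverses the hierarchy without changing the score) by requiring a designated area to sit at the bottom. All names and data are hypothetical.

```python
from itertools import combinations

def mean(values):
    return sum(values) / len(values) if values else 0.0

def levels_and_score(connections, ff_types):
    """Area positions and global consistency for one feedforward/feedback assignment."""
    def direction(kind):
        return 1.0 if kind in ff_types else -1.0

    incoming, outgoing = {}, {}
    for src, tgt, kind in connections:
        outgoing.setdefault(src, []).append(direction(kind))
        incoming.setdefault(tgt, []).append(direction(kind))
    areas = set(incoming) | set(outgoing)
    # Feedforward input raises an area; feedforward output lowers it.
    level = {a: 0.5 * (mean(incoming.get(a, [])) - mean(outgoing.get(a, [])))
             for a in areas}
    # A connection is consistent when feedforward goes up and feedback goes down.
    score = mean([direction(k) * (level[t] - level[s]) for s, t, k in connections])
    return level, score

def best_partition(connections, anchor_low):
    """Try every non-empty feedforward set; keep the most self-consistent one."""
    kinds = sorted({k for _, _, k in connections})
    best = (None, None, float("-inf"))
    for n_ff in range(1, len(kinds) + 1):
        for ff in combinations(kinds, n_ff):
            level, score = levels_and_score(connections, set(ff))
            if level[anchor_low] > min(level.values()):
                continue  # assumed tie-break: the anchor area must be lowest
            if score > best[2]:
                best = (frozenset(ff), level, score)
    return best

# Hypothetical toy network with three areas and three termination types.
toy = [("a", "b", "t1"), ("b", "c", "t1"), ("a", "c", "t3"),
       ("b", "a", "t2"), ("c", "b", "t2"), ("c", "a", "t2")]
best_ff, best_levels, best_score = best_partition(toy, anchor_low="a")
print(sorted(best_ff), best_levels, round(best_score, 3))
```

With nine termination types the same exhaustive search remains small (at most 511 candidate feedforward sets), so brute force is sufficient for a sketch of this kind.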

The partition that resulted in the most consistent hierarchy contained six feedforward projection types that were confined significantly to L2/3, L4, or both, while the other three types, considered feedback, were confined significantly to L1 & L5, L1 & L6 or L5 & L6. To assess the shallowness of this hierarchy, it was compared with a perfect hierarchy in which the directions of projections completely matched the hierarchical positions.

The above two studies in mice, using different experimental approaches and analysis techniques, obtained different rankings of the visual areas. Nevertheless, they both conclude that the hierarchy of the mouse visual cortex is shallow. The result by D’Souza et al. (2020) further allows direct comparison between primates and mice: the mouse visual cortical areas span 1.4 hierarchical levels (grey band in Fig. 2A), far fewer than the ~10 levels in macaques (black circles in Fig. 2A).

Fig. 2

A Hierarchical levels and hierarchical distances between the areas of the primate visual system (black line), contrasted with the range of hierarchical distances of the mouse (gray band). B Distribution of first spike times in response to the flash stimulus across all units in each of 6 visual cortical areas (V1, LM, RL, AL, PM, AM) and 2 thalamic visual nuclei (LGN and LPN). C Correlation between mean time to first spike and hierarchy score from anatomical tracing studies. The hierarchical order of the 6 visual cortical areas coincides with the one in D’Souza et al. (2020). A Modified from Vezoli et al. (2021). B, C Modified from Siegle et al. (2021)

Emergence of orientation and direction selectivity from retina to cortex

How does the hierarchical structure of the visual system impact the neuronal activity in these areas? We first examine the hierarchical visual processing from retina to cortex, along which selectivity for orientation and direction emerges.

In V1 of macaques and cats, there are neurons that selectively respond to visual orientations (orientation selectivity). By contrast, the majority of LGN relay cells do not show selectivity in orientation. To explain de novo orientation selectivity in V1, Hubel and Wiesel proposed a feedforward model from LGN to V1, where orientation selectivity in V1 is formed by convergence of spatially offset receptive fields of LGN relay cells (Hubel and Wiesel 1962). This model was clearly supported by simultaneous electrophysiological recording of spatially overlapping receptive fields in V1 and LGN in cats (Reid and Alonso 1995; Tanaka 1983). Although a debate continues about the impact of thalamic orientation selectivity on cortical orientation selectivity (Piscopo et al. 2013; Vidyasagar and Eysel 2015; Vidyasagar et al. 1996; Zhao et al. 2013), the feedforward circuit from LGN to V1 is generally considered a principal mechanism for the emergence of orientation selectivity in the visual pathway. The success of the feedforward model has led to the view that a combination of excitatory inputs with differing receptive field profiles generates novel and more complex receptive field profiles at each processing stage (Priebe 2016).

In the mouse, orientation selectivity in V1 is likely computed not only by the feedforward circuitry from LGN to V1, but also by other mechanisms at different stages. One report supports de novo computation of orientation selectivity by LGN inputs to V1 with spatially offset, yet overlapping receptive fields (Lien and Scanziani 2013). However, unlike in primates and cats, orientation selectivity has also been reported in LGN (Marshel et al. 2012; Piscopo et al. 2013; Scholl et al. 2013; Zhao et al. 2013), and other reports indicate that the orientation selectivity of V1 is largely inherited from that of LGN (Li et al. 2013; Sun et al. 2016). Current evidence for the mouse suggests that orientation selectivity in LGN likely derives from processing within the retina (Baden et al. 2016; Nath and Schwartz 2016). At least part of the retinal orientation selectivity has been shown to depend on two mechanisms: (1) the dendritic morphology of single retinal ganglion cells along the retinal surface and (2) an interplay between synaptic excitation and inhibition, which are selective to orthogonal orientations (Nath and Schwartz 2016). These mechanisms are distinct from the convergence of spatially offset inputs reported at the LGN-V1 stage of the primate. Thus, in mice, orientation selectivity is computed at multiple stages and by multiple circuit mechanisms.

Similar to orientation selectivity, direction selectivity in mice is likely built up at multiple stages along the visual pathway. It emerges in the retina through asymmetric excitation and inhibition (Fried et al. 2002), and is relayed to cortical L1 via a dedicated thalamic pathway (Briggman et al. 2011). In addition, direction selectivity is computed de novo in separate layers of V1: first in L4, through integration of thalamic inputs (Lien and Scanziani 2013), and again in L2/3, through integration of inputs from neighboring neurons (Rossi et al. 2020). Further, retinal direction selectivity has been shown to affect direction selectivity in visual cortical areas (Rasmussen et al. 2020). The computations of direction at these different stages share a common principle: direction selectivity is formed by integration of presynaptic inputs with displaced spatiotemporal receptive fields.

Hierarchical processing between visual cortical areas

After the visual signal reaches the cortex, are there signatures of hierarchical visual processing between the visual cortical areas? One of the functional signatures of hierarchical processing is activation latency: neurons in higher regions respond to incoming signals with longer latencies. Such latency differences have been reported multiple times in the macaque cerebral cortex [for review, see Lamme and Roelfsema (2000)]. In the mouse visual cortical areas, similar observations were made in earlier studies using widefield voltage imaging of neural population activity. Higher visual areas such as AL and LM are activated 8–9 ms later than V1 in response to visual stimulation (Polack and Contreras 2012), and after electrical stimulation of V1 (Fehérvári and Yagi 2016). These two areas were also shown to respond earlier than other higher areas, including AM and PM. Apart from AL and LM, the latencies of the other HVAs were not distinguishable. These studies left ambiguity as to whether the activation latency is genuinely the same across HVAs, or whether it differs by less than the experimental error of the population imaging technique employed (for instance, because of diversity between neurons in one area). The latter might be the case, according to a recent large-scale survey of nearly 100,000 individual neurons (Siegle et al. 2021), which reported extracellular single-unit recordings from the cortical areas V1, LM, RL, AL, AM and PM, and from the thalamic visual nuclei LGN and LPN. This massive data set revealed that neurons in HVAs tend to emit spikes later in response to visual stimulation (Fig. 2B). The visual response latencies averaged across neurons in one area gradually increased according to the position of the corresponding area in the anatomically defined hierarchy (Fig. 2C). Likewise, the temporal delay between all the visual areas, defined by the peak of the cross-correlogram, strongly correlated with the anatomical hierarchy score of these areas. Similar correspondence to the anatomically defined hierarchy was found for other functional properties: (1) receptive field size (cf. D’Souza et al. (2020)); (2) ratio of simple and complex cells in a given area; and (3) intrinsic timescale revealed by spike-train autocorrelation (cf. Runyan et al. (2017)). It is noteworthy that all these functional properties were substantially variable across neurons in each area (e.g., wide distributions of the response latencies across neurons in Fig. 2B). A part of this variability could be attributable to cortical layer or cell type, as reported in studies employing two-photon imaging and transgenic mouse lines (de Vries et al. 2020). It is also notable that this large-scale study was conducted mostly in visual areas of the dorsal stream. For the ventral stream areas, there are fewer published descriptions of hierarchical processing. However, the following proxies of the hierarchy were reported: (1) a reduction in neural sensitivity to the amount of luminous energy (Tafazoli et al. 2017; Vinken et al. 2016) and (2) more repetition suppression (Kaliukhovich and Op de Beeck 2018).

Despite the high correspondence between the anatomically and functionally measured signs of hierarchy, it remains elusive how synaptic connectivity generates the functional signs of the hierarchy, such as response latency and intrinsic timescale. For example, the response latency is affected not only by synaptic inputs from within the visual cortex but also by inputs from outside it, such as from LPN. Another example is that the increase of intrinsic timescales along the hierarchy is not explained by hierarchical connectivity, according to a simulation study of the inter-areal network (Chaudhuri et al. 2015). Rather, the increase in timescale was reproduced when excitatory neurons in higher areas received more excitatory inputs. This pattern could arise if neurons in higher areas have more spines than neurons in lower areas. Such a morphological variation along the cortical hierarchy has been reported in macaque (Elston and Rosa 1998). Similar findings have been reported for the visual cortical areas of a South American rodent, Dasyprocta prymnolopha (Elston et al. 2006), but not yet for those of more commonly used rodent species such as the rat and mouse. More generally, with the currently available data, there remains a possibility that the functional differences between areas in the hierarchy are due to other physiological or anatomical properties that vary across areas. Although the simulation study by Chaudhuri et al. (2015) provided possible origins for the functional differences along cortical hierarchies, it left ambiguity as to whether a particular cellular property (e.g., spine morphology, density, and distribution) serves to increase the effective excitatory connectivity.

The functional hierarchy might also be dynamically modulated depending on the state of the animal. In the macaque visual cortical areas, the anatomically defined hierarchy turned out to be aligned with directed functional asymmetry between two areas, estimated through causality analysis of ECoG data (Bastos et al. 2015). However, this functional marker of hierarchy was reorganized during different phases of a visual attention task. For instance, during the pre-stimulus period, area 8L in the frontal cortex was dynamically reassigned to the lowest position in the hierarchy, becoming a driver for V1. No such dynamic reorganization has yet been reported in mice. Given the shallowness of their anatomical hierarchy, the functional hierarchy may shift even more radically depending on the task context or the brain state. The mechanisms behind the dynamic reorganization are not yet understood, but may be related to dendritic spiking caused by top-down inputs that coincide with inputs from other areas. Such dendritic activity leads to bursting in the soma of the downstream neuron. In the somatosensory-motor system of the mouse, it has been reported that the motor area employs this dendritic spiking to drive downstream neurons in the somatosensory area (Manita et al. 2015). If such a circuit mechanism existed in the visual system, it could temporarily reverse the anatomically constrained hierarchy. Using such a mechanism, the brain might form temporary hierarchies depending on the type of computation.

What does the shallowness of the anatomical hierarchy imply for visual processing? One possibility is that it takes only four steps up the hierarchy to pool visual space into a 40 deg RF (D’Souza et al. 2020). In contrast, ten steps are needed in macaque to reach a similar RF dimension, which happens in the medial superior temporal area near the top of the cortical hierarchy (Raiguel et al. 1997). The shallow hierarchy may also imply that the representations of visual features in mice are themselves less sophisticated than in primates. Visual features can be decomposed into a hierarchical set of increasingly complex features, from edges to combined edges, to larger configurations forming objects (Richards et al. 2019). The shallow hierarchy in the visual cortex appears to suggest that mouse vision represents more basic visual features than primate vision does. In the next subsection, we will discuss neural responses of mouse HVAs to high-level visual features that are also processed in higher-order visual areas of primates.

Computations in dorsal and ventral streams

Apart from the functional signatures of the hierarchical processing above, little has been revealed about the computational properties or functions of the mouse dorsal and ventral streams [for a more detailed survey on reported functions of HVAs, see Glickfeld and Olsen (2017)]. A recent hypothesis suggests that they may have distinct roles for navigation: the dorsal stream computes elements of self-motion from the optic flow within the peripheral visual field, whereas the ventral stream computes objects and landmarks within the central field of vision (Saleem 2020).

In the dorsal stream, neural activities have been studied from the following aspects: (1) visually guided action and (2) motion detection or perception. Visually guided action is the deliberate movement of body parts such as hands and eyes based on visually constructed information. This information involves, for example, a representation of the hand that reflects the size, shape, and orientation of the object being grasped. This is a primary function that the dorsal stream performs [for review, see Kravitz et al. (2011)]. In mice, the cortical processes associated with body movement and decision making have been studied using navigational tasks in virtual reality environments (Funamizu et al. 2016; Harvey et al. 2012; Morcos and Harvey 2016). This led to the discovery, in the dorsal stream, of neural ensemble trajectories that were choice-specific (Harvey et al. 2012). The recordings were made in the Posterior Parietal Cortex (PPC), a term borrowed from the macaque counterpart, roughly corresponding to RL, A, and AM in the mouse (Lyamzin and Benucci 2019). However, a follow-up study revealed that neural populations in V1 could equally reconstruct choice-specific trajectories (Krumin et al. 2018), casting doubt on whether PPC plays an exclusive role in visually guided decisions. Another major function identified in the dorsal stream of primates is the detection of global motion. The global motion of an object can be detected by integrating the movement of the object’s components (Khawaja et al. 2013). In primate MT and MST, but not in V1, neurons respond preferentially to the global motion of a plaid, rather than to the motion of the individual gratings of the plaid (Khawaja et al. 2013; Movshon et al. 1985; Movshon and Newsome 1996). In mouse V1, some studies (Muir et al. 2015; Palagina et al. 2017) report the existence of neurons responding preferentially to the global motion [but see Juavinett and Callaway (2015)]. Consistently, reversible inactivation of V1 leads to deteriorated performance in discriminating the direction of motion in random-dot kinematograms, suggesting that a functioning V1 is necessary for motion perception in mice (Marques et al. 2018). In the dorsal stream, areas AL and RL compute the local but not the global motion of a plaid according to one report (Juavinett and Callaway 2015). A more recent study reported that the dorsal stream areas, particularly the highest-ranked area AM, are tuned for coherent motion (Sit and Goard 2020). These studies appear to suggest hierarchical processing of visual motion in the dorsal stream.

On the other hand, there is also evidence that global visual motion is computed in another pathway. The region particularly tuned for coherent motion was observed near the junction of AM, RL and V1, which represents the lower visual field (Sit and Goard 2020). Notably, this region is strongly connected to the posterior part of LPN, which is also sensitive to visual motion (Beltramo and Scanziani 2019; Bennett et al. 2019). Thus, it remains to be determined how this colliculo-cortical pathway contributes to motion processing.

In the ventral stream, neural activities related to object identification have been studied from the following aspects: recognition that is tolerant to transformation, and representation of object categories. Currently, reports on these aspects are limited to rats. For both aspects, different research groups have reported conflicting evidence. Transformation-tolerant recognition is the ability to identify objects despite substantial variation in their appearance, such as changes in size, position, viewpoint and illumination. Accumulating evidence indicates that transformation tolerance is computed in the primate ventral stream [for review, see DiCarlo et al. (2012)]. Investigating the rat’s ventral visual areas latero-medial (LM), latero-intermediate (LI), latero-lateral (LL) and lateral occipito-temporal (TO), Vermaercke et al. (2014) found that only TO was more tolerant to stimulus position, compared to V1, and only in relative terms (i.e., in terms of the stability, rather than of the magnitude, of the discrimination performance afforded by TO across two nearby positions). On the other hand, Tafazoli et al. (2017) found a substantial increase in the ability of neurons to support discrimination of visual objects under identity-preserving transformations (e.g., changes in position and size). High tolerance leads to a representation of category, for instance, an increased ability to distinguish between animal and non-animal pictures. In the primate ventral stream, neurons respond preferentially to coherent stimuli containing surfaces and objects compared to random texture patterns [human functional magnetic resonance imaging (fMRI): Grill-Spector et al. (1998); monkey fMRI: Rainer and Miller (2002); monkey single-unit electrophysiology: Vogels (1999)]. Such category selectivity in the rat ventral stream is supported by single-unit electrophysiology in one report (Tafazoli et al. 2017), but not in another (Vinken et al. 2016). Apart from these two aspects, there is also conflicting evidence on orientation tuning: it was initially reported to increase (Vermaercke et al. 2014), but later reported to decrease (Matteucci et al. 2019) along the progression of areas. The latter study instead reported an increase in bimodal tuning.

In this section, we reviewed the hierarchical, sequential organization from retina to visual cortex and within the visual cortex, from anatomical and physiological perspectives. Along the visual pathway from retina to cortex, orientation and direction selectivity are built up gradually across multiple stages. Within the visual cortex, both anatomical and functional studies showed the existence of a visual hierarchy. They also consistently suggest that the hierarchical levels are much less distinct than those of primates. This, in turn, indicates that the mouse visual system is endowed with abundant connectivity that is not governed by the hierarchy. In the following section, we will survey types of these connections and their possible functions in visual processing.

Non-hierarchical visual processing

In this section, we discuss two types of non-hierarchical pathways: (1) pathways within the visual cortex and (2) pathways from subcortical structures to the visual cortex.

Bypassing pathways within the visual cortex

As described in the previous section, the anatomical tracing studies reported evidence for a hierarchy and for its shallowness (D’Souza et al. 2020; Harris et al. 2019), the latter due to the dense connectivity between all the visual areas (Gămănuţ et al. 2018; Wang et al. 2012). In these tracing studies, only areas V1, LM and RL occupy a solid place in the anatomically defined hierarchy, while the other areas are close to one another. Moreover, the large-scale survey of single-unit spiking activities confirmed the shallowness of the hierarchy of the visual cortical areas (Siegle et al. 2021). For instance, area AM is located at a significantly higher rank than area PM according to the anatomical connectivity measurement (D’Souza et al. 2020), consistent with the cross-correlation of spiking activities between neurons in these areas. However, ~27% of neurons in PM fire after AM (Siegle et al. 2021). This suggests that, compared with primates, the visual cortical areas in mice have more connections that do not fit hierarchical, serial processing. Such a network includes many bypassing projections that connect distant hierarchical levels, skipping levels in between (Fig. 3A1). It also includes crosstalk projections between the dorsal and ventral streams, without descending through V1 (Fig. 3A2). Such crosstalk projections are weak between the primate dorsal and ventral streams (Markov et al. 2014a; Palmer and Rosa 2006). At the single-neuron level, divergent or broadcasting projections from V1 to multiple HVAs simultaneously have been reported in mice as well (Han et al. 2018) (Fig. 3A3). Surprisingly, more than 75% of inter-areal projections targeted multiple areas rather than a single area.

Fig. 3

Examples of non-hierarchical projections. A Projections between visual cortical areas. A1 Bypass projection connects distant hierarchical levels skipping levels in between (e.g., D’Souza et al. 2020; Glickfeld et al. 2013). A2 Crosstalk projection between the dorsal (red) and ventral (blue) cortical streams (e.g., D’Souza et al. 2020). A3 Divergent projection from single cortical neuron to multiple HVAs (e.g., Han et al. 2018). B Top: locations of subcortical structures Lateral Geniculate Nucleus (LGN), Lateral Posterior Nucleus (LPN) and Superior Colliculus (SC). Bottom: Parallel pathways from subcortical structures to visual cortical areas. Visual information from retina is primarily conveyed to the visual cortical areas through LGN. Additionally, visual information is conveyed via SC then LPN, at which the visual information is transmitted to dorsal and ventral cortical HVAs in parallel (e.g., Bennett et al. 2019). C Projections that involve interaction between the cortical and subcortical pathways. C1 Convergent projections to a HVA via intra-cortical and colliculo-cortical pathways (e.g., Blot et al. 2021). C2 Convergent projections to LPN from V1 and SC (e.g., Kirchgessner et al. 2021). C3 Loop connecting cortical areas, collicular and thalamic nuclei (e.g., Bennett et al. 2019)

As an example of the non-hierarchical pathways in visual cortical areas, we discuss the bypassing connections between the areas, and their functional consequences. Studies in primates have accumulated evidence that such bypassing connections provide specific information to higher areas, distinct from the canonical hierarchical pathway [see Nassi and Callaway (2009)]. For instance, in the primate dorsal stream, neurons in V1 communicate with MT through at least two cortical routes. One is the indirect projection via V2, the other is the direct projection to MT (Markov et al. 2014a; Movshon and Newsome 1996; Palmer and Rosa 2006). Selective inactivation of the indirect pathway reduced MT neurons’ stimulus preference for binocular disparity but not for direction of motion, suggesting that the direct pathway from V1 to MT provides information about speed and direction of motion, whereas the indirect pathway provides binocular disparity information (Ponce et al. 2008). Similarly, in mice, the projections from V1 to each HVA carry different aspects of the visual scene. Using optogenetic inhibition and/or antero/retrograde tracer injections, it was reported that neurons in V1 that innervate HVAs match the visual preference of these target areas (Glickfeld et al. 2013; Matsui and Ohki 2012). For instance, PM-projecting and AL-projecting neurons match the spatial and temporal frequency preference of PM and AL, respectively. A similar picture emerges for the thalamo-cortical interactions: the visual signals from SC via LPN to L1 of V1 enhance feature selectivity in the visual signals transmitted through the LGN-cortex projection (Fang et al. 2020), by providing subtractive surround suppression. These studies thus support the view that bypassing projection neurons in V1 innervating different HVAs specialize in distinct aspects of visual information.

However, there is also evidence against this view from two perspectives: the interference of non-visual signals and the heterogeneity of responses. In the first instance, if V1 sends functionally distinct projections to AL and PM, these projection neurons would be expected to fire with different temporal patterns. Indeed, AL-projecting and PM-projecting neurons rarely fire simultaneously, having low temporal correlations between them (Kim et al. 2018). However, this low correlation of firing persisted even after operational removal of the visual response, averaged across repeats (Kim et al. 2018). This result suggests that each pathway from V1 carries independent fluctuations that are irrelevant to visual inputs. The second issue comes from the discrepancy in visual preference between pre- and post-synaptic neurons in the HVA. Although the average tuning of the V1 inputs to the HVAs matches the neurons' tuning in the HVAs, there is considerably more diversity in the projections (Glickfeld et al. 2013; Kim et al. 2018). The specialization in visual preference could arise from other projection pathways to HVAs (Murgas et al. 2020). Given the dense connectivity between all the visual areas and the numerous connections with non-visual areas (Gămănuţ et al. 2018; Wang et al. 2012), it would be interesting to see if combinations of projections from HVAs to PM or AL can account for the visual preference of the two areas, and more generally, for the spiking activity. Alternatively, the specialization of visual preference in the HVAs could also occur due to differences in local processing within the HVAs.
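
The removal of the repeat-averaged visual response before correlating two neurons can be sketched as follows. This is a generic illustration of the idea and not the analysis pipeline of the cited study; the function names and the response values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def residuals(trials):
    """Subtract the across-repeat mean from every repeat, bin by bin."""
    n = len(trials)
    mean = [sum(bin_values) / n for bin_values in zip(*trials)]
    return [r - m for trial in trials for r, m in zip(trial, mean)]


def residual_correlation(trials_a, trials_b):
    """Correlate what is left after removing each neuron's mean response."""
    return pearson(residuals(trials_a), residuals(trials_b))


# Hypothetical responses: 2 repeats x 3 time bins per neuron.
neuron_a = [[1, 5, 2], [3, 7, 4]]
neuron_b = [[4, 8, 5], [2, 6, 3]]
print(residual_correlation(neuron_a, neuron_b))
```

In this toy case the two neurons share the same stimulus-locked time course, yet their repeat-to-repeat fluctuations are anticorrelated, which shows that the residual correlation measures something separate from the shared visual response.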

The different types of non-hierarchical projections (Fig. 3A) can coexist, but their interactions in visual processing remain to be determined. As reviewed in this subsection, a significant proportion of neurons are dedicated, projecting to a single HVA. This has so far been demonstrated for target areas LM, PM and P (the proportions of dedicated projections were 25%, 13% and 20% of all the detected projections within the respective areas) (Han et al. 2018). These subnetworks might represent independent serial streams from V1 to HVAs, forming the basis of the bypassing projections. Intriguingly, they coexist with neurons that project simultaneously to more than one HVA, the components of divergent projections (Fig. 3A3), which can have dramatic consequences for the way we view the functioning of the HVAs. For example, some of the most abundant such neurons are those that project simultaneously to areas LM and PM (Han et al. 2018). These broadcasting neurons might convey visual signals corresponding to the part of the visual field shared between the areas. In contrast, the dedicated neurons that project uniquely to PM or LM might encode non-overlapping visual fields. Thus, from this perspective, LM and PM might act like homologues of primate V2, performing a second-order transformation of the information coming from all the channels in V1 (Han et al. 2018). The visual response properties of these single neurons, considered in light of the areal definition (elaborated in “Biased visual field coverage in HVAs”), might add a layer of complexity to the future delineation of HVAs.

Parallel pathways from subcortical structures to the visual cortex

The other non-hierarchical projections in mice are embedded in the two pathways from retina to visual cortical areas: (1) via dLGN, and (2) via SC then LPN (Fig. 3B). The two pathways are largely parallel but are linked by connections between SC and dLGN (Bickford et al. 2015; Harting et al. 1991). dLGN (but not LPN) sends feedforward projections to L4 of V1 (Beltramo and Scanziani 2019; D’Souza et al. 2019; Harris et al. 2019). However, both dLGN and LPN project to L1 of V1, where an important part of the feedback connections from HVAs arrives (D’Souza et al. 2019; Ji et al. 2015). In L1 of V1, the projection from each of the two thalamic nuclei forms a distinct array of spatially clustered terminals, and the two types of clusters do not overlap with each other (D’Souza et al. 2019). The dLGN and LPN clusters are aligned with feedback projections from AL and PM, respectively, and with a lattice formed by clusters of cholinergic receptors (D’Souza et al. 2019; Ji et al. 2015).

What are the functional consequences of the two thalamo-cortical pathways? It is often assumed that the geniculo-cortical pathway via V1 forms higher-order visual preferences in HVAs, such as selectivity to moving objects (Vermaercke et al. 2014). Recent studies in mice instead reported that the colliculo-cortical pathway plays a critical role in velocity tuning. Preferred velocity in at least some HVAs (LM, AL, RL) was affected by lesioning of SC but not of V1 (Tohmi et al. 2014). Similarly, in a more recent study, the response to moving objects was completely abolished by optogenetic inhibition of the projections from SC to LPN, but persisted when only V1 was inhibited (Beltramo and Scanziani 2019). These observations agree with those of earlier studies in primates, that the colliculo-cortical pathway carries movement information (Kaas and Lyon 2007), and resonate with the notion that the terminal area in the cortex (MT) may serve as another primary visual cortex (Bourne and Rosa 2006; Mundinano et al. 2019; Warner et al. 2015). The result in mice was surprising in two ways. First, the visual response was routed more through LPN than through V1. This is at odds with the currently dominant hypothesis in primates that the colliculo-cortical pathway modulates rather than drives visual responses in the cortex (Kaas and Lyon 2007). Second, this finding was observed in area POR, the highest area of the putative ventral stream, providing strong evidence against the view that the geniculo-cortical pathway forms velocity tuning. In this study, only the part of SC that projects to posterior LPN, which in turn projects to POR, was optogenetically inhibited. This pathway via posterior LPN, more specifically, conveys object motion, rather than global motion (Bennett et al. 2019).

These recent studies appear to suggest more prevalent roles of the colliculo-cortical pathways than previously thought. Given that each subregion of LPN is reciprocally connected to different HVAs (Bennett et al. 2019), it is possible that other parts of LPN, and their upstream inputs from SC, also affect the visual preference of every HVA. So far, the driving effect was reported in POR (Beltramo and Scanziani 2019) and also in a subset of the projections to AL and PM (Blot et al. 2021; the fraction of driving boutons was estimated to be 39.4% in AL and 14.4% in PM). It would be important to determine the precise conditions of the driving effect, such as behavioral state, cell types and layers. In addition, the ability to dissect the two pathways optogenetically could be exploited to answer further questions, such as how the two pathways mature during development (Bourne and Rosa 2006).

The findings of driving influences of the colliculo-cortical pathway on HVAs opens another fundamental question: how can the cortical streams and the colliculo-cortical pathway co-exist? The functional hierarchy formed by the former appears to conflict with the latter. This potential conflict seems evident in some of the ventral stream areas, such as POR, where the driving effect is reported. We argue this is an open question which future research needs to address. In line with this question, the following two aspects have been actively investigated so far: (1) Information content processed along the colliculo-cortical pathway. The colliculo-cortical pathway has often been assumed to convey visual motion. However, this motion signal may derive from feedback from the cortex, as reported in the primate pulvinar (Berman and Wurtz 2011). Alternatively, this pathway may be responsible for non-visual signals such as saccadic eye movements (Bennett et al. 2019; Berman and Wurtz 2011), self-motion (Blot et al. 2021), and for sustaining background activity (Guo et al. 2017; Kirchgessner et al. 2021). One promising strategy to dissociate between multiple pathways projecting onto one cortical area is the imaging of axonal boutons from both cortico-cortical and colliculo-cortical projections, while manipulating activity of the source structures of the projections. The imaging of boutons in AL has revealed that the projection from V1 to AL mostly provides visual information, while the projection from LPN to AL is more relevant to both visual information and running speed (Blot et al. 2021). This finding is consistent with the hypothesis that LPN plays a pivotal role in integrating visual information and contextual signals, in line with a hypothesized role of primate pulvinar to regulate interaction between visual and non-visual areas (Saalmann and Kastner 2015; Saalmann et al. 2012). (2) The interaction between the cortical streams and the colliculo-cortical pathway. 
The two circuits can converge at the level of HVAs (Fig. 3C1) and also at LPN (Fig. 3C2). At these convergence sites, how do the two circuits interact? A powerful approach to study such interaction is simultaneous manipulation of the two circuits while recording from the convergence sites. This approach, employing dual-wavelength optogenetic inhibition, revealed the interaction of LPN activity with the feedback from V1 and the feedforward input from SC (Kirchgessner et al. 2021). At the putative single-neuron level, various types of LPN neurons were observed: 10% of LPN neurons were driven by retinotopic projections from L5 (but not L6) of V1, 3% were driven by projections from SC, and 2% were driven by both (Kirchgessner et al. 2021). This result indicates that LPN does not simply relay but rather integrates information conveyed by cortical and subcortical sources. In addition to these two types of convergence, the two streams can also form a cortico-collicular-thalamic loop (Fig. 3C3; Bennett et al. 2019). This loop seems at odds with the no-strong-loop hypothesis (Crick and Koch 1998), yet the exact conditions and context in which this loop is activated remain to be elucidated.

In this section, we surveyed different types of non-hierarchical projections in both cortical and subcortical visual areas. Within the cortex, the number of such projections is relatively large, due to the high density of the cortical network. As an example, we examined the role of the bypassing projection from V1 to HVAs in visual processing. Earlier studies provided evidence that V1 neurons projecting to different HVAs specialize in distinct aspects of visual information. More recent studies suggested that these projections may instead convey non-visual signals. In the colliculo-cortical pathway, the projections to the cortex via LPN are segregated from the projections from the retina via dLGN, although LPN does project to dLGN. A surprisingly strong driving projection from LPN was reported in at least one HVA, raising the question of how the cortical hierarchical streams interact with the colliculo-cortical pathway. The degree and conditions of the driving input in other HVAs, and its interaction with the hierarchical cortical system, remain important open questions.

Future perspective

In this article, we have reviewed the evidence for the hierarchical and non-hierarchical organization of the mouse visual system. “Anatomical and physiological markers of hierarchy” reviewed evidence that the visual cortical areas form the dorsal and ventral streams, each embedded in a hierarchical organization. The anatomically defined hierarchy is echoed by a few basic functional measurements, such as simple/complex cell ratio and response latency. Still, it remains largely unknown whether each stream is specialized for a particular visual attribute, as in primate visual areas, and whether the areas in each cortical stream progressively compute more sophisticated and intricate visual information useful for mice to act upon. “Non-hierarchical visual processing” reviewed examples of non-hierarchical organization found in geniculo-cortical and colliculo-cortical pathways. The functions of these projections in visual processing remain to be explored. One of the most important outstanding questions is how the hierarchical cortical streams interact with the colliculo-cortical pathway.

To address the outstanding issues raised in “Anatomical and physiological markers of hierarchy” and “Non-hierarchical visual processing”, it is critical to investigate the visual response properties of multiple visual areas in the same animal by employing the same visual stimuli, recording simultaneously in various areas, analyzing the results in the same manner, and disentangling causal influences between the areas. In this direction, large-scale electrophysiological recording has been partly implemented in the mouse dorsal stream (Siegle et al. 2021), but this approach remains challenging in conventional laboratory setups.

Future studies of the functional roles of visual areas may benefit from mesoscale imaging, such as widefield optical imaging and functional ultrasound imaging (fUSI). Widefield imaging enables the monitoring of neural activity from all the visual cortical areas at one time. fUSI enables monitoring of subcortical structures such as SC and LPN (Macé et al. 2011; Urban et al. 2015). With these imaging data at hand, the understanding of higher-order visual features represented in each visual area may be greatly advanced by a recently developed technique, encoding modeling, which analyzes the temporal information represented in imaging data (Naselaris et al. 2011). In an encoding model, visual stimuli are used to predict neuronal activity in each recording unit (e.g., neuron, voxel, or pixel). In human visual cortical areas, an encoding model fitted to each voxel of fMRI data successfully captured hemodynamic activity in the early visual system, confirming the systematic differences in receptive field position and speed tuning from V1 to V3 (Nishimoto et al. 2011). A similar approach also captured fMRI dynamics beyond the early visual system, such as in the parahippocampal place area and fusiform face area, elucidating their selectivity to particular visual scene categories (Stansbury et al. 2013). These analytical approaches could provide key insights into specialized visual processing in HVAs, and help to establish an understanding of the whole visual system as a network.
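To make the logic of an encoding model concrete, the following is a minimal sketch in Python rather than a reproduction of any of the cited analyses. It assumes a purely linear mapping from stimulus features to activity, fitted with ridge regression, and it uses synthetic data in place of real imaging signals. The function names (fit_ridge, prediction_accuracy), the feature matrix, the noise level and the regularization strength are all hypothetical choices made for illustration; published encoding models use their own stimulus feature spaces and fitting procedures.

```python
import numpy as np


def fit_ridge(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def prediction_accuracy(Y_true, Y_pred):
    """Pearson correlation between measured and predicted activity, per unit."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    norms = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / norms


# Synthetic example (assumption): rows are time points, columns of X are
# stimulus features, columns of Y are recording units (neurons, voxels, pixels).
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_units = 400, 100, 20, 5

X_train = rng.standard_normal((n_train, n_features))
X_test = rng.standard_normal((n_test, n_features))

# Hypothetical ground-truth tuning weights used only to simulate responses.
W_true = rng.standard_normal((n_features, n_units))
Y_train = X_train @ W_true + 0.5 * rng.standard_normal((n_train, n_units))
Y_test = X_test @ W_true + 0.5 * rng.standard_normal((n_test, n_units))

# Fit one weight vector per unit on training data, then evaluate on held-out data.
W_hat = fit_ridge(X_train, Y_train, alpha=1.0)
accuracy = prediction_accuracy(Y_test, X_test @ W_hat)
print(accuracy.round(2))
```

In this sketch, each column of W_hat plays the role of a tuning profile for one recording unit, and the held-out prediction accuracy indicates how well the chosen stimulus features account for that unit's activity. Comparing fitted weights or accuracies across areas is one way such a model could, in principle, be used to contrast visual areas.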

Concluding remarks

To date, the anatomical tracing data from the mouse, as well as functional imaging and electrophysiological investigations, point to the existence of a visual hierarchy with two streams. However, their respective roles in visual processing remain to be determined. These investigations also revealed the shallowness of the hierarchy compared to that of primates, indicating more abundant non-hierarchical connections, such as bypassing connections. The functional roles of the bypassing connections have been actively studied with respect to understanding the functional discrepancy between pre-synaptic and post-synaptic neurons. These findings point to a clear distinction between the mouse visual system and that of primates in hierarchical organization. Together with the ever-growing set of genetic tools primarily available in this species, the mouse visual system represents an ideal system to study non-hierarchical visual processing on a hierarchical backbone, but full exploitation of this resource will require several foundational investigations in the near future.