Introduction

The brain holds a distinctive position in the hierarchy of human organs for generating thought, emotion, and personality, all while managing the more base functions that maintain the body—and also for being so difficult to study. The cells of the brain are so numerous and intricately commingled that long after it was known that the other organs are made of separate cells, the brain was still thought of as one large unbroken reticulum rather than many individual units. In applying Golgi staining to brain in 1888 (50 years after Theodor Schwann first extended cell theory to animals), Santiago Ramón y Cajal was first able to visualize the elaborate form of an individual neuron, debunking the prevailing theory that the brain was a continuous network and establishing the inception of modern neuroscience [1].

In the intervening century, the field of neuroscience has made many advances, but there is still a great distance to travel in understanding the regulation of the genesis of the human brain—and how this goes awry when a person develops a psychiatric syndrome such as schizophrenia. One promising avenue for interrogating the origins of psychiatric illness is to interpret the influence of the genetic code on the developmental trajectories that coalesce into a mature functioning brain, trajectories that are often highly cell type specific. Cellular function is determined by the molecular phenotype of the cell and by its connectivity. Cells are interconnected in developmentally programmed micro- and macrocircuits that are thought to be altered in schizophrenia. Technology for examining gene regulation in individual brain cells or populations has greatly improved since the Golgi stain, yet remains a challenge for high-throughput studies. This review provides an overview of selected aspects of human brain development in the context of major brain cell types, mechanisms that regulate the gene function that shapes cellular function, and what is known about how they influence schizophrenia. Finally, we end with an overview of technical challenges to achieve cell type-specific profiles of gene regulation in human postmortem brain.

Neocortical cellular structure, composition, and development

The brain is functionally organized in a regional manner, where each region is associated with a relatively specific group of functions. This review focuses on the cerebral neocortex, a 2–4.5 mm sheet of tissue that covers most of both hemispheres. Dorsally and laterally, the cerebral cortex is subdivided by functions such as vision, hearing, motor control, cognition, and perception [2]. Of particular interest in psychiatry is the prefrontal cortex, the most rostral subdivision of the neocortex, which is associated with higher cognitive functions such as decision-making, self-control, language, and social processes [3]. The dorsolateral prefrontal cortex (DLPFC) occupies Brodmann areas 9, 10, and 46 (rostral to the premotor cortex) and is associated particularly with executive functions like working memory, sensory input synthesis, attention, and goal-directed behaviors [3].

With authoritative overviews of brain development available elsewhere [4], we will briefly highlight here key events in cortical development that underlie cell fate determination and maturation. The types of cells in the six layers of the DLPFC are myriad but can broadly be distinguished as neurons and glia, with neurons representing roughly 33–39% of cells in the frontal lobe [5]. Beyond this dichotomy, the diverse shapes, electrophysiological properties, gene expression patterns, neurotransmitters, location, and projections of these cells lead to numerous potential cell classes. In simplistic terms, however, the main categories break down to six types: pyramidal neurons, interneurons, oligodendrocytes, astrocytes, microglia, and endothelial cells.

Pyramidal neurons are relatively large cells that use glutamate as their primary neurotransmitter and play an excitatory role in neuronal circuits. They are the predominant cells in the “deep” layers (5 and 6) of the cortex and have long axons that project to other nervous system regions, such as the striatum, thalamus, and spinal cord. Pyramidal cells in superficial cortical layers (e.g., layers 2 and 3) project principally within the neocortex and connect the neuronal assemblies that comprise the local and distributed cortical circuits involved in higher-order behaviors and conscious thought [6]. Interneurons are smaller cells that use GABA as their primary neurotransmitter and mostly act to inhibit circuit activity. While GABAergic neurons represent only ~20% of cortical neurons, they are a much more phenotypically diverse group than their pyramidal counterparts, with subtypes characterized by distinct morphologies and molecular signatures [7]. Unlike pyramidal neurons, interneurons are relatively more abundant in layers 2 through 4. Most interneuron axons also do not project out of the cortex, instead ramifying around a cortical column or across neighboring columns, which generally gives them a more local impact on circuit function [7]. Among glial types, oligodendrocytes produce myelin, a substance that ensheaths neuronal axons and insulates their electrical signaling for more efficient conduction [8]. Astrocytes are the next most abundant cortical macroglia and serve a variety of functions, including maintaining homeostasis of water and ion distribution, maintaining the blood–brain barrier, providing structural support, recycling glutamate released from pyramidal neurons, and participating in cell-to-cell signaling [8]. Microglia are the class of macrophages that reside in the brain. They are critical players in inflammatory reactions within the central nervous system (CNS), early synapse pruning, maintaining homeostasis, and protecting against pathogens [9, 10]. Finally, endothelial cells play a critical role in forming the blood–brain barrier by lining brain microvessels at the blood–CNS interface [11].

The formation and composition of each cortical layer are determined by a developmental path and timeline unique to each cell type (Fig. 1a). Neurogenesis in humans begins soon after the neural folds fuse to form the neural tube roughly 4 weeks after conception [4]. Excitatory pyramidal neurons are born first in the dorsal pallium and migrate radially along radial glial processes into the cortical plate, the precursor to the cerebral cortex; inhibitory interneurons largely arise ventral to these cells in the ganglionic eminence and migrate tangentially to the cortical plate [4]. Laminar positioning proceeds from the bottom up, determined in part by birth order: earlier-born pyramidal neurons settle first in layers 5 and 6, and later-born neurons populate the layers above them [4]. While this occurs, pyramidal neurons help recruit specific interneuron subtypes to specific layers, depending on the subtype of each pyramidal neuron [12]. Neuronal expansion and migration continue throughout a prolonged window beginning in embryonic development and lasting until shortly before birth [13], although in some regions these processes last until age three [4].

Fig. 1: Developmental trajectories, delays, and molecular regulation in schizophrenia.

Depending on the timing of genetic or environmental influence, different cellular processes may be affected that then impact development of behavioral or cognitive systems. a Approximate timing of a selection of cell-specific developmental processes that occur during human cortical development, as well as known risk factors for schizophrenia (SCZ). a is adapted from [171]. Several cellular processes may be impacted differentially depending on the timing and magnitude of the risk factor effect; likewise, the normative patterns depicted here are also influenced by one another because of intercellular communication and coexistence in cortical circuits. b The typical time frame of achieving several developmental milestones that are often significantly delayed in children who go on to be diagnosed with schizophrenia as adults. Achievement of motor skills [172], language skills [173], and executive functioning skills [174] is impaired in children later diagnosed with schizophrenia. b is adapted from [4]. c Molecular mechanisms of gene regulation that govern cellular identity and development. The transcriptome and epigenome integrate genetic and environmental influences on developmental paths such as those highlighted in a.

Gliogenesis commences after neurogenesis begins. Oligodendrocytes first emerge in mid-gestation and continue to be generated extensively through the first 3 years of life and then at a slower rate through adulthood [4, 14]. Astrogliogenesis also is thought to begin during mid-gestation and continues abundantly through the first 3 years of postnatal life, though astrocytes can continue to divide throughout life [4, 15]. Interestingly, unlike neurons and the macroglia (oligodendrocytes and astrocytes), microglia are derived from erythromyeloid progenitors that migrate to the brain early in prenatal development [16]. Once the blood–brain barrier is formed and microglia mature, they rely on self-renewal to maintain their population under normal physiological conditions [9, 17, 18].

Cellular, laminar, and regional identity are dictated by specific transcription factor gradients. While neurons, oligodendrocytes, and astrocytes all originate from radial glial cells [8], these gradients lead to divergent outcomes. For example, excitatory pyramidal neurons are differentiated from radial glia via a transcription factor cascade including PAX6, TBR2, NEUROD1, and TBR1 [19]. Interneurons from the medial ganglionic eminence are shaped by a cascade beginning with SHH, then followed by NKX2-1, LHX6 and LHX8, and SOX6 [20]. Oligodendrocyte lineage is established with expression of PDGFRA, SOX10, and NG2, while astrocytes are guided by the expression of ALDH1L1 [8]. Cell subtype designation is further influenced by transcription factors that specify cortical layers, such as BCL11B, FEZF2, and SOX5, which specify the identity and connectivity of layer 5 glutamatergic excitatory projection neurons [21, 22]. Knowledge of these transcription factor cascades can help in identifying cell types and stages in human postmortem brain tissue, as further discussed below.

As these cells find their way to their destinations and continue to mature, the cortex undergoes an explosive period of forming cell-to-cell connections. Thalamocortical axons reach the cortex amidst prolific dendritic arborization and synaptogenesis in late mid-fetal development; at this time, the first evidence of electrical signaling occurs and the cortical plate begins to resemble the characteristic six layers of the mature cortex [4]. This extensive connective proliferation continues through the first and second year of life, depending on the cortical region, although the expansion of pyramidal neuron size and arborization persists at a slower rate until age five [23]. Myelination, meanwhile, begins to occur in early postnatal life and continues throughout adulthood, reaching a relative plateau in DLPFC in the third decade of life [4, 24].

Following this period of synaptic overproduction, the cortex undergoes extensive pruning of primarily excitatory synapses beginning in early childhood, plateauing in early adolescence, and continuing to a lesser extent into the third decade of life [25]. While much attention has been devoted to the pruning of selected excitatory synapses in adolescence, there are also changes in synaptic architecture that appear progressive during this period of postnatal life. For example, dendritic arbors undergo extensive refinement during adolescence, including becoming more elaborate and complex, and modulatory inputs from brainstem systems, such as the dopamine system, become much more extensive [26]. The timing of maturation for different circuits is in general consistent with the developmental timing of the cognitive skills associated with their function; for instance, humans do not fully develop executive functioning until early adulthood, commensurate with the slow maturation of the DLPFC that continues well beyond the second decade of postnatal life [25].

Schizophrenia: a developmental brain disorder

Despite the adult age of typical first diagnosis, evidence from both genetic and epidemiological studies implicates aberrant brain development as a component of the etiology of schizophrenia. Schizophrenia is a highly heritable disorder, with estimates of heritability as high as 80% [27]. A large genome-wide association study (GWAS) comparing schizophrenia cases and healthy controls identified 145 genomic loci marked by common single-nucleotide polymorphisms (SNPs) that were associated with schizophrenia risk [28]. Examining the expression of genes within genomic risk loci showed that they were relatively enriched for expression in fetal compared with postnatal DLPFC [29]. A complementary study of de novo variants in schizophrenia patients found that genes containing these variants formed a network of co-expression and protein interaction in fetal DLPFC, suggesting that cortical development may be altered during gestation in schizophrenia [30].

Likewise, several early life experiences have been linked to schizophrenia development. Women whose mothers experienced severe nutritional deficits during the first trimester of pregnancy had more than doubled odds of developing schizophrenia in adulthood [31]. Obstetric complications have also been associated with developing schizophrenia [32], as well as living in an urban environment early in life [33]. Interestingly, combining genetic and environmental risk factors can greatly amplify schizophrenia risk. It was shown recently that polygenic risk score, a metric measuring inheritance of risk-associated alleles identified in GWAS, was five times more predictive of schizophrenia diagnosis in individuals that had experienced early life complications during pregnancy, labor, or delivery than in individuals without those experiences [34]. Further reinforcing the developmental component to causation is that children who grow up to be diagnosed with schizophrenia in adulthood are more likely to have delays in achieving classic developmental milestones, including those involving motor, language and cognitive skills (Fig. 1b) [35,36,37].
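As a rough illustration of what a polygenic risk score measures, one common formulation is a weighted sum: each risk allele an individual carries is counted and weighted by the log of its GWAS odds ratio. The sketch below assumes that formulation; it is not necessarily the method used in [34], and the function name, variant IDs, and effect sizes are hypothetical placeholders.

```python
import math


def polygenic_risk_score(dosages, odds_ratios):
    """Sum risk-allele dosages weighted by the natural log of each odds ratio.

    dosages: dict mapping variant ID -> risk-allele count (0, 1, or 2)
    odds_ratios: dict mapping variant ID -> GWAS odds ratio for the risk allele
    Variants missing from either input are skipped.
    """
    shared = dosages.keys() & odds_ratios.keys()
    return sum(dosages[v] * math.log(odds_ratios[v]) for v in shared)


# Hypothetical variant IDs and effect sizes, for illustration only.
example_dosages = {"variant_a": 2, "variant_b": 0, "variant_c": 1}
example_odds_ratios = {"variant_a": 1.10, "variant_b": 1.20, "variant_c": 1.05}
example_score = polygenic_risk_score(example_dosages, example_odds_ratios)
```

In practice, scores of this kind are usually built from many thousands of variants and standardized within a cohort before being compared across individuals.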

Although the incrimination of developmental processes in schizophrenia risk is convincing, the specific neuropathology or molecular mechanisms are not understood. Several developmental trajectories have been implicated, and due to the phenotypic heterogeneity of schizophrenia, it is likely that at least several of these play a role. For instance, neural oscillations particularly in the beta- and gamma-band frequency observed with electroencephalographic techniques are abnormal in individuals with schizophrenia [38]. Gamma oscillations emerge late in brain development, are associated with executive functioning, and are attributed to the activity of fast-spiking parvalbumin interneurons in cortical layers 2 and 3 of the DLPFC [39]. Evidence from animal studies suggesting that cortical interneurons do not develop fast-spiking characteristics until late adolescence has encouraged speculation that early developmental genetic factors bias this population of late maturing GABAergic neurons to malfunction, but the malfunction phenotype is masked until the full maturation of these cells and the neural circuitries that they subserve [40]. One hypothesis is that N-methyl-D-aspartic acid-type glutamate receptor hypofunctioning in these interneurons leads to reduced gamma-oscillation production; however, it is unclear if this imbalance is the result of a primary reduction in interneuron inhibition, or a result of upstream dysfunction in pyramidal neuron excitation [41].

A smaller body of evidence also implicates dysregulation of glial processes in schizophrenia. A possible avenue leading to cognitive dysfunction in schizophrenia is aberrant myelination, which can alter synaptic formation and function [42]. Several studies have found schizophrenia to be associated with pathways related to myelination and oligodendrocyte function [43, 44]. Another popular hypothesis is that schizophrenia may result from aberrant synaptic pruning in adolescence, a process that relies heavily on the activity of glial cells such as astrocytes and microglia [45]. Recent work identifying schizophrenia risk-associated polymorphisms within the C4 gene, a part of the complement immune system in the MHC locus, implicated a potential role for complement activation as a mechanism of the putative synaptic loss in schizophrenia [46]. Another study using co-cultured fibroblasts reprogrammed to neuronal and microglial fates found increased synaptic engulfment by microglia in patient-derived samples and further association of C4 risk alleles with increased synapse uptake, although C4 genotype did not explain all disease-associated increased pruning in their in vitro model [47]. However, while there is evidence of reduced synapses in some studies of schizophrenia postmortem brain in selective DLPFC layers (e.g., layers 2 and 3), there is no direct evidence that this reduction is a consequence of abnormal pruning rather than an epiphenomenon of chronic illness and associated morbidities. A recently developed positron emission tomography technique using a radiotracer specific for SV2A, a synaptic protein that can be used as a synaptic density marker [48], may allow for higher-resolution in vivo studies of synaptic dysfunction in schizophrenia in the future.

Given the current lack of consensus, further work to elucidate the “when” and “where” of schizophrenia pathology in terms of brain development is required, particularly for understanding how these alternative narratives of developmental perturbation fit together. Given the highly complex and synergistic developmental processes undertaken by different brain cells, it is likely that more than one cell class is implicated. It is also worth noting that the same principles that apply to schizophrenia here also apply to other neurodevelopmental psychiatric disorders, such as autism spectrum disorder and bipolar affective disorder. Identifying the timing of and cell type(s) affected by a known genetic or environmental risk factor can illuminate the path forward for rectifying the altered developmental trajectories in many related disorders, perhaps placing the brain back on a healthy path.

Functional genomics of brain development

The mechanisms that govern developmental decisions within a cell begin fundamentally with the genetic code and how the code is utilized in each cell (Fig. 1c). The central dogma of molecular biology provides a skeletal view of how this occurs: DNA is transcribed in the nucleus to mRNA, which is exported to the cytoplasm and translated to a protein that performs some cellular function. Further scrutiny drapes layers of nuance over these bones. Noncoding RNAs such as miRNA and lncRNA, for example, are not translated; rather, they can be a part of feedback loops regulating later transcription or translation of other genes or transcripts, or act themselves within the cell as signaling molecules, scaffolding structures, or regulators of protein activity [49, 50]. Posttranscriptional modifications such as alternative splicing and RNA editing can also affect how and when a transcript is used, as well as the localization of a transcript within the cell. Proteins, particularly transcription factors as discussed above, also participate in regulatory cascades affecting transcription and translation. Finally, the epigenome regulates DNA use by controlling DNA accessibility and factor recruitment, altering if and how the DNA can be used [51, 52]. Profiling the transcriptome and epigenome is particularly useful because both integrate genetic and environmental information and provide a means to interpret the functional readout of that information. Here we briefly delve further into these facets of gene regulation.

Ribonucleic acid

The transcriptome is defined as the total RNA in a sample; however, the term masks a great deal of diversity in both form and function. In addition to the mRNA expressed from the roughly 20,000 protein-coding genes in the human genome, a plethora of other noncoding RNA species exist. For instance, as mentioned above, miRNAs are roughly 22 base noncoding RNAs that influence translation of mRNA [53]. lncRNA are transcribed from noncoding genes longer than 200 bases and perform a variety of functions, from regulating transcription of neighboring cis genes, to acting as signaling molecules [54, 55]. Noncoding RNA transcribed from repetitive DNA has been shown to be an important structural component of the nuclear matrix [56]. Other RNA species are increasingly being characterized, including piRNAs, snoRNAs, circular RNAs, and others, each with their own set of functions [57]. As technology improves, the list of known RNA species expands and augments our understanding of the diverse roles transcriptional products play within the cell.

Co- and posttranscriptional RNA modifications further increase the diversity within the transcriptome. Alternative splicing, the process by which multiple protein isoforms arise from the same genetic sequence, is the most pivotal of these modifications [58]. RNA can also be edited by RNA editing enzymes such as ADAR to change the coding sequence or alter binding specificity in the 3′UTR [59]. Reversible modifications such as RNA methylation (e.g., m6A) and transport chaperones such as RNA binding proteins can introduce further heterogeneity and complexity [59, 60].

Beyond transcript modifications, localization of RNA can also influence RNA function. Compartmentalization by the nuclear membrane is an often-overlooked mechanism of RNA regulation. The nuclear transcriptome is populated by pre-mRNA and sequestered aberrant and activity-dependent transcripts, in contrast to cytoplasmic RNA [61,62,63,64,65,66,67]. A major role of the nuclear membrane is to act as a transcriptional noise buffer by filtering stochastic bursts of gene expression from entering the cytoplasm [68, 69]. Further, transcripts can be targeted to specific locations in the cytoplasm for immediate translation given proper environmental cues [70]. A recent study revealed that genes implicated in neurodevelopmental and neurodegenerative disorders tend to show a bias toward nuclear sequestration, implicating efficiency of translating transcripts into protein as another potential mechanism of genetic risk [71].

The brain undergoes dramatic shifts in gene and isoform expression as it develops [72], including noncoding RNA species [73]. Both coding and noncoding gene expression and alternative splicing shape the developmental trajectories defining cell identity and function within the circuits in the maturing cortex [74]. RNA editing is also developmentally regulated in human brain and is especially involved in neuronal maturation [75]. In terms of RNA localization, many of the mechanisms that regulate RNA localization across the nuclear membrane are frequently used in brain cell lineages, and have been shown to play a role in developmental programs. For instance, intron retention often occurs in weakly expressed transcripts as a signal to the nuclear surveillance machinery to sequester and degrade aberrant or superfluous transcription products via exosomes [76]. In vitro neurons as well as other cell types show increasing levels of intron retention as they differentiate from induced pluripotent stem cells, principally in lowly expressed genes involved in determining counter cell fates [77,78,79]. RNA editing has also been shown to regulate activity-dependent nuclear transcript retention, although RNA editing does not seem to globally signal sequestration [66, 80]. Overall, RNA takes on a broad assortment of shapes and roles within the cell and is highly dynamic across brain development.

Chromatin structure

Chromatin is the compacted product of DNA incorporated around proteins, reducing the length of DNA per chromosome by a factor of up to 2 × 10^-5 [81]. Chromatin is composed of a series of nucleosomes (each a set of eight histone proteins around which roughly 146 base pairs of DNA wrap nearly twice) connected by variable length linker DNA and organized into a two-start helical 30 nm fiber [81]. Chromatin structure affects the accessibility of DNA for transcriptional or regulatory functions in a cell type-specific and activity-dependent manner as the cells develop, and can be remodeled by chromatin modifiers [82, 83].

Both histone variants themselves and modifications to the histone tails affect chromatin structure and therefore DNA use in a way that can be interpreted as a “histone code” [84]. For example, promoter sequence often occurs upstream of a nucleosome marked by trimethylation of the fourth lysine of histone 3 (H3K4me3). Monomethylation of the fourth lysine of histone 3 (H3K4me1), in conjunction with acetylation of the 27th lysine of histone 3 (H3K27ac), is a marker of active enhancer sequence. In contrast, trimethylation of the ninth lysine of histone 3 (H3K9me3) marks repressive heterochromatin [85]. These histone modifications not only label chromatin states but also actively participate in inducing them; for instance, H3K4me3 recruits the ATP-dependent chromatin remodeler CHD1 as well as other enzymes capable of repositioning nucleosomes for increased DNA accessibility and transcriptional activity [85]. H3K9me3, on the other hand, helps induce a heterochromatin state by providing a binding substrate for HP1, a protein that recruits the methyltransferase SUV39H1, which then deposits further H3K9me3 in a positive feedback loop to repress transcription [86]. Zooming out from the nucleosome to the genome scale, regions of euchromatin and heterochromatin are organized into topologically associated domains that are likewise dynamic [87].
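The idea of reading histone marks as a code can be sketched as a simple lookup from the marks observed at a locus to a coarse chromatin-state label. The mark-to-state pairs below are taken from the paragraph above; the function name, the label strings, and the order in which rules are checked are assumptions made for illustration, and real chromatin-state models use many more marks and probabilistic assignment.

```python
def annotate_chromatin_state(marks):
    """Return a coarse chromatin-state label for the histone marks at a locus.

    The precedence of the rules (heterochromatin, then enhancer, then
    promoter) is an arbitrary choice for this sketch.
    """
    marks = set(marks)
    if "H3K9me3" in marks:
        return "repressive heterochromatin"
    if {"H3K4me1", "H3K27ac"} <= marks:
        return "active enhancer"
    if "H3K4me3" in marks:
        return "promoter-associated"
    return "unclassified"


example_state = annotate_chromatin_state(["H3K4me1", "H3K27ac"])
```

A locus carrying only H3K4me1, without H3K27ac, falls through to "unclassified" here because the paragraph describes the active enhancer signature as the two marks in conjunction.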

Many studies have assessed the chromatin landscape over brain development and found chromatin patterning to reflect the rich history of cell fate decisions, experience, and genetic influence sustained by the brain region or cell population measured [88,89,90,91]. Chromatin state and chromatin remodeling are integral facets of the epigenome that reflect past developmental paths while remaining responsive to the current needs of the cell. As such, chromatin is an expression of the influence of the environment as well as genetic variation on regulation of the genome.

DNA methylation (DNAm)

DNAm, another major component of the epigenome, is the covalent addition of a methyl group to the fifth carbon of a cytosine. The methylated cytosine usually precedes a guanine (i.e., CG) but in some cell types can be followed by another base (i.e., CH, where H = A, C or T) [92].
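The CG versus CH distinction depends only on the base that follows each cytosine, which can be shown with a short sketch. The function name is hypothetical, only the given strand is scanned, and a cytosine at the very end of the sequence is skipped because its context cannot be determined.

```python
def cytosine_contexts(seq):
    """Yield (position, context) for each cytosine on the given strand.

    Context is "CG" when the next base is G and "CH" when it is A, C, or T.
    Cytosines followed by an ambiguous base (e.g., N) are skipped.
    """
    seq = seq.upper()
    for i, base in enumerate(seq[:-1]):
        if base != "C":
            continue
        next_base = seq[i + 1]
        if next_base == "G":
            yield i, "CG"
        elif next_base in "ACT":
            yield i, "CH"


example_contexts = list(cytosine_contexts("ACGTCCA"))
# -> [(1, "CG"), (4, "CH"), (5, "CH")]
```

Positions are zero-based offsets into the input string.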

Methylation in the CG context (mCG) is a stable modification that is faithfully copied from parent to daughter cell during DNA replication by DNMT1 [93], although de novo methylation can be added via the methyltransferases DNMT3a and DNMT3b [94]. mCG has historically been considered a repressive mark, although this canonical view is being overturned [92]. In reality, the mCG landscape reads more like a map to be deciphered, where stereotyped mCG features can be interpreted to signify certain genomic conditions. In next-generation sequencing data, the fraction of reads covering a given base that are methylated estimates the proportion of methylation at that site across the sequenced sample. In the mammalian genome, most CGs are fully methylated, but there are notable exceptions. For example, short unmethylated regions often correspond to promoter sequence because promoters frequently contain CG islands, dense clusters of rarely methylated CGs [95]. Low-methylated regions, or short regions of less than 30% methylation, often correspond to active distal regulatory elements such as enhancers [95]. Partially methylated domains are long stretches of disordered mCG levels that correspond to heterochromatin and polycomb repressed sequence [96]. Finally, DNAm valleys (DMVs) are longer regions of hypomethylation that correspond to epigenetically regulated developmental genes. Although DMVs are unmethylated, their chromatin profile reveals a bivalent state of both activating H3K4me3 and repressing H3K27me3 signal [97]. In this way, profiling the mCG landscape can provide a picture of genomic organization and activity in the cell population being measured, although the causal relationship between DNAm levels and genomic activity is not always clear. For instance, although DNAm levels are typically reduced at transcription factor binding sites when the factor is bound, suggesting a steric hindrance effect of DNAm on binding [95, 98], actively transcribed gene bodies have increased DNAm levels in most cell types [99], and at least for the typically methylation-sensitive transcription factor CTCF, in vivo removal of DNAm does not alter CTCF binding [100].
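To make the read-count arithmetic concrete, the sketch below computes a per-site methylation proportion as methylated reads divided by all reads covering the site, averages it across a region, and applies the 30% cutoff mentioned above for low-methylated regions. The function names and example counts are hypothetical, and this is a simplification: published region callers typically also consider region length, CG density, and coverage.

```python
def methylation_fraction(methylated_reads, unmethylated_reads):
    """Proportion of reads at one cytosine that are methylated, or None if uncovered."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        return None
    return methylated_reads / total


def region_mean_methylation(sites):
    """Mean per-site methylation over (methylated, unmethylated) read-count pairs."""
    fractions = [methylation_fraction(m, u) for m, u in sites]
    covered = [f for f in fractions if f is not None]
    if not covered:
        return None
    return sum(covered) / len(covered)


def is_low_methylated(mean_methylation, threshold=0.30):
    """Apply the less-than-30% cutoff described in the text (threshold is adjustable)."""
    return mean_methylation is not None and mean_methylation < threshold


# Hypothetical read counts for three CG sites; the third site has no coverage.
example_sites = [(1, 9), (2, 8), (0, 0)]
example_mean = region_mean_methylation(example_sites)
```

Uncovered sites are excluded from the mean rather than treated as unmethylated, so that missing data does not bias a region toward a low-methylation call.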

Neurons share with embryonic stem cells the distinction of having a high proportion of non-CG methylation (i.e., mCH) [101]. Unlike mCG sites, most of which are highly methylated, mCH sites show much lower methylation levels due to the increased heterogeneity at each cytosine within a sample [102]. Unlike mCG in other cell types [99], neuronal mCH in gene bodies is anticorrelated with gene expression and is largely established de novo postnatally by DNMT3a during neuronal development [103], although low levels are detectable in prenatal human frontal cortex [104].

In terms of brain development, mCG is a significant and early player, maintaining prenatally established global mCG levels throughout the lifespan; mCH, on the other hand, increases rapidly from birth to early childhood and then continues more slowly into the 20s [101]. Postnatally, mCG and particularly mCH show the most rapid changes in the first 5 years, implicating this time of life as a period of especially high epigenetic plasticity and environmental sensitivity [105]. Changes to DNAm in both cytosine contexts also occur robustly in response to neuronal activity [106] and are critical to learning and memory and the maintenance of synaptic plasticity [107], suggesting that DNAm plays an integral role in the brain maturational processes described in the previous section. Whether the driver or passenger in establishing the epigenomic landscape, DNAm integrates the genetic and environmental influences that have paved the developmental roads.

Schizophrenia and gene regulation in brain development

Transcriptomic regulation

Many studies have identified elements of schizophrenia risk reflected in the transcriptome, particularly risk attributed to common genetic variation. For instance, an early case–control RNA-seq study concluded that at least 20% of GWAS-implicated risk loci contained variants that could influence gene expression in the DLPFC in schizophrenia [108]. A more recent study found over two-thirds of GWAS-associated risk variants to be expression quantitative trait loci (eQTLs), or variants associated with changing gene expression [109]. Another case–control transcriptomic study conducted by the PsychENCODE Consortium identified almost 5000 genes and 4000 transcripts—including many noncoding—as significantly differentially expressed in the frontal and/or temporal cortex between schizophrenic and control brains, and transcript-level changes had much greater effect sizes than gene-level expression changes [110]. However, this extent of case–control differential expression was not observed in a subsequent study [109], perhaps because of differences in how RNA quality was addressed.

Transcriptome-wide association study (TWAS) analysis in the PsychENCODE study identified 64 candidate genes implicated in several different cell types as conferring risk for schizophrenia [110]. Two later TWASs identified many additional associations between disease risk and expressed features (e.g., specific exons, junctions), though interestingly, none of them were differentially expressed between cases and controls [109, 111]. Collado-Torres et al. were the first to explore the relationship between gene expression in DLPFC and hippocampus in patients and controls and revealed reduced coherence in gene expression between these regions in the brains of schizophrenia patients [109]. Jaffe et al. further utilized laser capture microdissection (LCM) to profile RNA from the granule cell layer of the human hippocampus, which identified unique molecular signatures of schizophrenia risk missed in bulk tissue [111]. However, taken together, the results largely suggest that differences in gene expression between schizophrenia cases and neurotypical controls reflect primarily state-associated factors related to the correlates, consequences, and associated phenomenology of illness.

Beyond case–control studies, several RNA sequencing studies are beginning to explore schizophrenia risk reflected in developmental transcriptomic changes. One example of a risk-associated transcript with developmental shifts in expression is a human-specific isoform of AS3MT that lacks methyltransferase activity; this isoform was found to explain at least part of the 10q24.32 locus association with schizophrenia, and its expression was regulated very early in in vitro neuronal differentiation [112]. Another study found that many expressed sequences identified as GWAS schizophrenia risk loci eQTLs mapped to genes with shifting isoform expression across the prenatal to postnatal transition—in other words, genetic risk for schizophrenia was associated with developmentally dynamic gene expression patterns [113]. In terms of noncoding RNA, several studies have also focused on miRNA as master regulators of the transcriptome that can participate in establishing the pathology of schizophrenia [114]. For example, miR-137 is critical for proper regulation of gene expression networks in human neuronal development, and both genetic and transcriptional evidence supports a role for miR-137 in dysregulation of gene expression networks in schizophrenia [115,116,117]. Importantly, genetic risk for schizophrenia has also been found to be enriched in eQTLs and splicing QTLs identified in human fetal brain [118, 119].

While a promising start, all the studies listed above share the constraint of being conducted in homogenate tissue, limiting their resolution to the average transcriptomic signal across all measured cells. Because of the technological limitations discussed in the next section, analysis of cell type-specific transcriptional effects in schizophrenia in postmortem brain has been confined mostly to examining the ontology of differentially expressed genes in these studies. However, several recent papers have attempted to identify the cell types most relevant to schizophrenia pathology by partitioning the heritability for schizophrenia captured by common variation in GWAS across cell type-specific gene expression profiles identified in mouse and human single cells or nuclei [120,121,122]. Interestingly, these papers have implicated a variety of cell types as the most likely culprits in the disorder. Finucane et al. found significant schizophrenia heritability enrichment in glutamatergic but not GABAergic neurons [120], while Skene et al. found enrichment in glutamatergic neurons, dopaminergic medium spiny neurons, and some cortical interneurons, with less enrichment in glial, embryonic, or progenitor cells [122]. On the other hand, Calderon et al. found significant enrichment of schizophrenia heritability in oligodendrocytes and replicating cells from fetal cortex [121]. These competing results may be influenced by the use of different mouse and human single-cell/nucleus RNA-seq datasets, as well as different statistical methods. Even so, these early studies offer new, albeit preliminary, evidence to support models of cell type-specific schizophrenia pathology.

Epigenomic regulation

Compared to similar transcriptomic studies, fewer significant differences have been identified between neurotypical and schizophrenic brains in genome-wide profiles of DNAm and chromatin accessibility in homogenate human cortex. Studies of DNAm using microarrays have identified 25–2104 significantly differentially methylated positions between schizophrenia cases and controls [123,124,125,126]. Likewise, a study of accessible chromatin regions in DLPFC found only three significantly differentially accessible regions between schizophrenia patients and matched neurotypical counterparts [127]. The paucity of differences identified in homogenate tissue analysis reflects the importance of accounting for cellular heterogeneity in epigenomic studies, particularly when looking for likely small changes associated with complex polygenic disorders like schizophrenia [128].

Although few strong epigenetic leads associated with diagnosis have been identified, several studies have found associations between developmentally dynamic patterns of chromatin structure or DNAm and genetic schizophrenia risk. For example, in one study, DLPFC-specific peaks of H3K27ac, a histone modification associated with active regulatory elements, were strongly enriched for schizophrenia heritability in fetal and infant samples but less so in adult samples, suggesting that these putative regulatory elements specific to higher cortical areas are established and active early in life [129]. Indeed, dynamic three-dimensional chromatin looping patterns identified in early neuronal development were also associated with schizophrenia genetic risk [130]. In terms of developmental DNAm patterns, two recent microarray-based DNAm studies (one in fetal cortices and one in control cortices ranging from fetal to old age) also found that GWAS risk loci were enriched for methylation quantitative trait loci, particularly in fetal cortices [124, 131]. Interestingly, CGs that were differentially methylated between schizophrenic and control cortical tissue were enriched for sites that were also differentially methylated between prenatal and postnatal timepoints but not for timepoints around the schizophrenia age of onset between adolescence and early adulthood [124]. Assessing developmental changes associated with genetic risk for schizophrenia has so far been a more fruitful direction for illuminating genome regulation associated with schizophrenia pathology than adult case–control epigenomic measures.

An increasing number of studies are now attempting to address cellular identity in terms of the epigenome in schizophrenia, and early evidence points to a large contribution of neuron biology and developmental patterns in the etiology of the disorder. For example, although accessible chromatin regions identified in homogenate cortex are broadly highly enriched for heritability of schizophrenia [127], this is particularly true for regions that were preferentially accessible in neurons compared with non-neurons, or regions that were differentially accessible between neurons isolated from different brain areas that can be attributable to differences in neuronal subtype identity [132]. Heritability for schizophrenia was also more enriched in regions marked by H3K4me3 and H3K27ac in human neurons than non-neurons, and histone QTLs (i.e., genetic variants associated with histone modification enrichment) tagged several schizophrenia GWAS risk SNPs more strongly in neurons than non-neurons, such as a stronger H3K4me3-QTL for the miR-137 marker SNP and stronger H3K27ac-QTL for the CACNA1C marker SNP [133]. Likewise, DNA regions hypomethylated in neurons and differentially methylated in neurons between brain areas were also highly enriched for schizophrenia heritability, indicating the potential importance of neuronal function generally and different neuronal subtypes particularly in this disease [132]. Furthermore, a recent paper examining the relationship of schizophrenia risk with cell type-specific developmental DNAm patterns found that regions progressively losing DNAm over postnatal development in postmortem human neurons but not non-neurons were significantly enriched for schizophrenia heritability [105]. The majority of demethylation in these regions occurred in early life in the first 5 years, highlighting the importance of examining developmental patterns when assessing schizophrenia disease etiology [105]. 
Interestingly, when using prenatal samples, schizophrenia heritability was found to be enriched in regions of accessible chromatin found in the germinal zone of the developing cerebral cortex, a region in which neural progenitor cells are abundant [134]. A second study combining chromatin accessibility data with single-cell transcriptomic data from fetal brain further expanded the neuronal focus of genetic risk implicated by epigenetic studies by identifying fetal neural progenitors, oligodendrocyte precursor cells, and microglia as being enriched for schizophrenia heritability in this earlier time point [135].

While these advances have improved our understanding of the molecular pathology associated with risk for schizophrenia, there is much more to uncover before more targeted therapies or prophylactic measures can be implemented. The cell type-specific epigenomic studies in adult brain tissue largely rely on positive selection of neurons from non-neurons using NeuN antibody (further described below), meaning that heritability in the non-neuron fraction may be masked by greater heterogeneity than the neuronal fraction. However, these early leads provide foundational evidence upon which to build.

Technical considerations for working with human postmortem brain cells

Given the human-specific nature of the affected organ in schizophrenia, and the neoteny associated with the developmental trajectories of the affected brain regions, it is critical to conduct research using human postmortem brain as the substrate. Yet working with human postmortem brain is not without its unique challenges.

As described above, a major challenge in designing experiments using human brain is addressing the serious confounder of cellular heterogeneity. Previous work has shown that cell type composition is one of the largest determinants of DNAm variability in a study that uses homogenate tissue, and that age-associated DNAm changes are highly confounded by cell type composition at individual CGs [128]. However, cells of the brain are not easily untangled, particularly after being flash frozen before examination. Proxy “omics” measurements taken from more easily acquired tissues such as blood, saliva, or buccal cells vary widely in their correlation to those in brain tissue at individual sites or genes of interest and still suffer from heterogeneity concerns [128, 136].

Several options now exist to address cellular heterogeneity in postmortem human brain sequencing studies. Reviews are now available that summarize options in greater detail [137, 138], but we will highlight the four most popular here. The first option is to use software that deconvolves the homogenate signal based on known cell type-specific profiles [139, 140]. This is the most rapid and cost-effective approach; however, the results will only be an estimate from pooled DNAm or RNA signal. It is also worth noting that applying this type of strategy to RNA data will estimate the proportion of the RNA that is contributed by the given cell types, and not the proportion of actual cells. This computational method also relies on the existence of pure cell population references, which may not be available for all cell types.
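To make the logic of reference-based deconvolution concrete, the following is a minimal sketch in Python. It is not the algorithm of any of the cited tools: the reference profiles, the marker values, the cell type names, and the RNA-per-cell ratios are all invented for illustration, and the fitting routine (non-negative least squares by projected gradient descent, followed by normalization) is only one simple choice among many. The second function illustrates the caveat about RNA data, namely that an estimated RNA fraction must be rescaled by an assumed RNA content per cell before it can be read as a cell fraction.

```python
# Illustrative sketch only. All profiles and numbers below are hypothetical.

def mix(reference, fractions):
    """Return the homogenate signal expected from a weighted mix of profiles."""
    n_features = len(next(iter(reference.values())))
    return [
        sum(fractions[t] * reference[t][i] for t in reference)
        for i in range(n_features)
    ]

def estimate_fractions(reference, bulk, steps=2000, lr=0.1):
    """Estimate non-negative cell type weights that best reproduce `bulk`.

    Minimizes squared error by projected gradient descent (weights are
    clipped at zero after every step), then rescales the weights to sum to 1.
    """
    types = list(reference)
    w = {t: 1.0 / len(types) for t in types}
    for _ in range(steps):
        predicted = mix(reference, w)
        residual = [p - b for p, b in zip(predicted, bulk)]
        # Gradient of the squared error with respect to each weight.
        grad = {
            t: 2.0 * sum(r * x for r, x in zip(residual, reference[t]))
            for t in types
        }
        w = {t: max(0.0, w[t] - lr * grad[t]) for t in types}
    total = sum(w.values())
    if total == 0:
        raise ValueError("all estimated weights are zero")
    return {t: w[t] / total for t in types}

def rna_to_cell_fractions(rna_fractions, rna_per_cell):
    """Convert RNA contribution fractions to cell fractions.

    `rna_per_cell` holds assumed relative RNA content per cell for each type.
    """
    weights = {t: rna_fractions[t] / rna_per_cell[t] for t in rna_fractions}
    total = sum(weights.values())
    return {t: weights[t] / total for t in weights}

# Hypothetical reference profiles: one value per marker feature.
reference = {
    "neuron":          [0.9, 0.1, 0.1, 0.5, 0.2],
    "astrocyte":       [0.1, 0.9, 0.1, 0.5, 0.3],
    "oligodendrocyte": [0.1, 0.1, 0.9, 0.5, 0.8],
}

# Simulate a homogenate sample with known composition, then recover it.
true_fractions = {"neuron": 0.5, "astrocyte": 0.3, "oligodendrocyte": 0.2}
bulk = mix(reference, true_fractions)
est = estimate_fractions(reference, bulk)
print({t: round(v, 3) for t, v in est.items()})

# If the signal were RNA and neurons carried twice the RNA of glia
# (an assumption made purely for illustration), the cell fractions differ.
rna_per_cell = {"neuron": 2.0, "astrocyte": 1.0, "oligodendrocyte": 1.0}
cells = rna_to_cell_fractions(est, rna_per_cell)
print({t: round(v, 3) for t, v in cells.items()})
```

In this toy setting the simulated sample is an exact mixture of the reference profiles, so the true composition is recoverable; with real homogenate data, noise and missing or impure references mean the output remains an estimate.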

Another option is to perform LCM, in which single cells or groups of cells are excised with a laser from a thin tissue section [141]. While LCM is currently the only technique that allows for the collection of the cell cytosol from postmortem brain, the method suffers from relatively low throughput, potentially poorer RNA quality, and potential cellular contamination from 3-D sampling. Nonetheless, LCM offers the unique feature of being able to select cells a priori based on their morphology or expression of a specific marker. This approach has been applied to several distinct cell populations in the human brain, including isolating dopamine neurons from the midbrain substantia nigra and the granule cell layer from the hippocampus [111, 142].

A third option is to use fluorescence-activated nuclear sorting (FANS) to isolate cell types in a high-throughput manner [143]. In addition to providing a higher yield, FANS benefits from the multiple lasers of a flow cytometer, which allow for multiparametric analysis that improves the purity of the yield and offers more flexibility in experimental design. Despite these advantages, performing FANS on frozen postmortem brain is too stressful on the tissue to isolate intact mature brain cells; therefore, sorting must be done on purified nuclei, and any antibodies used must label nuclear antigens.

While several groups have successfully used lineage-specific transcription factors to isolate select subtypes of neurons and glia [144,145,146], most studies isolate neurons from non-neurons using an antibody that targets NeuN, a splicing factor that is a constitutive component of the neuronal nuclear matrix and a marker of most mature neurons [147,148,149]. Important caveats to this method include the potential for loss of immunoreactivity in postmortem brain tissue, even when using as robust an antibody as the monoclonal anti-NeuN offered by Millipore Sigma (Catalog #MAB377X). Previous work has shown that NeuN reactivity decreased in mouse cortex following cerebral ischemia, but the number of neurons was unaffected [150]. Similarly, NeuN reactivity was completely lost in the spinal cord of elderly rats, although the number of neurons remained the same [151]. These studies are reminders that interpreting the absolute yield in terms of FANS-derived populations should be undertaken with care. It is also worth noting that because FANS limits RNA studies to the nuclear fraction, understanding the compositional differences between RNA compartments over human brain development would help inform future studies using nuclear RNA without a comparable cytoplasmic fraction. Several studies have now addressed this issue and determined that at the gene level, nuclear RNA can be used as a stand-in for the whole transcriptome [71, 152].

In recent years, several new technologies have been developed to profile the transcriptome and epigenome in single cells, and have been reviewed elsewhere [153, 154]. Briefly, the two main technologies emerging for this purpose are plate-based methods, in which each well in a PCR plate contains a single cell or nucleus [155], and droplet-based methods, in which thousands of cells can be individually sequenced by isolation in nanoliter droplets containing sequencing reagents [156]. While droplet-based sequencing strategies offer higher throughput than plate-based methods, RNA sampling in each cell is much sparser and is based on 3′ tagging, although methods for resolving full-length mRNA in droplet-based strategies are being developed [157].

These techniques are increasingly being applied to cells and nuclei isolated from human postmortem brain, furthering our understanding of intercellular variability and clarifying profiles of increasingly granular cell identity [135, 158,159,160,161,162,163,164,165,166]. Moreover, new spatially resolved single- or several-cell techniques such as RNA seqFISH+, an in situ hybridization-based method for resolving spatial gene expression patterns [167], or Slide-seq, a method that transfers RNA from a tissue section to a sheet covered in DNA-barcoded beads for sequencing [168], will add back further layers of information to sequencing data that were historically part of the cellular taxonomy [169]. As single-cell technologies improve, we will also be able to better explore genomic variation such as somatic mosaicism, a phenomenon that may preferentially affect restricted cell types and that is associated with schizophrenia [170], at the single-cell level. Assessing individual cellular landscapes of genomic regulation in developing human brains is the next frontier in parsing genetic and environmental risk for schizophrenia and will hopefully better illuminate how that risk is distributed across cell types and developmental stages.

Conclusion

The cells in a human brain develop according to genetically specified paths, influenced by environment, for well over 25 years. Faculties that become present in adulthood, such as the higher-order cognitive processes that are disrupted in schizophrenia, are built upon an infrastructure of previous cellular actions taken potentially decades earlier. These actions are reflected in the transcriptome and epigenome, along with the potential signature of genetic or environmental risk that influenced a schizophrenic brain toward that state. As noted previously, although the details of heritability, affected cell types and altered developmental trajectories may differ, the principles and approaches described here for schizophrenia apply to other neuropsychiatric diseases as well. By profiling the transcriptome and epigenome in human brain tissue—the most clinically relevant substrate to psychiatric disease—at timepoints throughout that 25-year period of risk, we can anticipate building a clearer picture of normal brain developmental infrastructure, throwing into relief the risk “stress fractures” that collapse healthy brain function in schizophrenia. As sequencing techniques advance and the field gains a finer understanding of gene regulation in individual cells and cell populations over development, we can expect our understanding of the developmental etiology of schizophrenia to improve and converge at the cellular level.