Introduction

Amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are two devastating, adult-onset neurodegenerative disorders with distinct clinical presentations. ALS, also known as Lou Gehrig’s disease, is the most common motor neuron disease and is invariably fatal. Its rapid progression is caused by the loss of upper and lower motor neurons and leads to muscle weakness and paralysis [1]. FTD is the second most common dementia in people under 65 years of age; only Alzheimer’s disease is more prevalent. In FTD, the degeneration of neurons in the frontal and temporal lobes leads to behavioral and personality changes, deficits in executive functions, and language impairments [2]. It is now increasingly recognized that ALS and FTD belong to a continuous disease spectrum with clinical, pathological, and genetic overlaps. In the clinic, about 15% of FTD patients show symptoms of ALS, and up to 50% of ALS patients have detectable cognitive impairment [3]. Pathologically, more than 97% of ALS and about 50% of FTD cases accumulate inclusions of the RNA-binding protein TAR DNA–binding protein 43 (TDP-43) in neurons and glia. Genetically, mutations in several genes cause either ALS or FTD [4]. In 2011, a hexanucleotide repeat expansion in the C9orf72 gene (named for its genomic locus, chromosome 9 open reading frame 72) was identified as the most common genetic cause of both ALS and FTD [5,6,7]; it accounts for 40% of familial ALS, 5–10% of sporadic ALS, 12–25% of familial FTD, and 6–7% of sporadic FTD cases [8].

Identification of the C9orf72 locus added ALS/FTD (referred to as c9ALS/FTD hereafter) to the increasing number of repeat expansion disorders (e.g., myotonic dystrophy, Huntington’s disease, and several spinocerebellar ataxias) [9]. Based on lessons learned from these other repeat expansion disorders and initial pathological characterization of C9orf72 postmortem tissues, three disease mechanisms have been proposed: 1) loss of C9orf72 function; 2) toxicity from the bidirectionally transcribed repeat-containing RNAs, for example through sequestration of key RNA-binding proteins (RBPs) into RNA foci; and 3) toxicity from production of dipeptide repeat (DPR) proteins through noncanonical repeat-associated non-AUG-dependent (RAN) translation. Although the C9orf72 mutation was identified only 8 years ago, research progress has been rapid, and insights from studying C9orf72 are significantly influencing the way we think about neurodegeneration and therapy. In this review, we discuss the current understanding of the pathogenic mechanisms triggered by the C9orf72 repeat expansions and the state of therapy development.

C9orf72 Gene Structure, Repeat Sizes, and Somatic Mosaicism

The hexanucleotide repeat expansion is located in the first intron of the C9orf72 gene (Fig. 1a). The actual sizes of the repeats vary dramatically among C9orf72 patients. The threshold pathogenic repeat size is unknown, but 30 is used in most publications. Healthy individuals typically have fewer than 11 repeats, whereas most patients have several hundred to several thousand [6, 10,11,12,13,14]. A small percentage of patients have shorter expansions of 45–80 repeats [15]. Unlike Huntington’s disease and other CAG repeat expansion disorders, in which a phenomenon called anticipation causes increased repeat size and a worse phenotype in offspring, increased C9orf72 repeat size in offspring has been reported in only a few families [16, 17]. The correlation between repeat size and clinical and pathological characteristics has also been explored, and the results are variable, depending on cohort size, ethnic background, and the tissues in which expansion size was measured [10, 13, 15, 18,19,20,21,22]. Longer repeat sizes seem to be associated with a survival disadvantage [19] and an earlier onset of disease [10, 13, 15]. Interestingly, individuals with an intermediate repeat length in blood or fibroblast DNA have a significant increase in repeat size (repeat instability) and variation in repeat size (mosaicism) in different regions of the central nervous system (CNS) [15, 18, 19]. The mechanisms of such instability and somatic mosaicism are not well understood but need to be considered when using repeat size from peripheral blood and fibroblast cells for genetic counseling.
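
As a rough illustration of what "repeat size" means at the sequence level, the following Python sketch counts the longest uninterrupted run of GGGGCC units in a DNA string and compares it with the 30-repeat cutoff used in most publications. The function names, the example sequences, and the choice to treat 30 or more repeats as exceeding the cutoff are assumptions made for illustration; the true pathogenic threshold is unknown, and this toy example ignores the practical difficulty of sizing long expansions.

```python
import re

# Cutoff used in most publications; the true pathogenic threshold is unknown.
CONVENTIONAL_CUTOFF = 30

def longest_repeat_run(seq, unit="GGGGCC"):
    """Return the largest number of consecutive copies of `unit` in `seq`."""
    pattern = f"(?:{re.escape(unit)})+"
    runs = (len(m.group(0)) // len(unit) for m in re.finditer(pattern, seq.upper()))
    return max(runs, default=0)

def exceeds_cutoff(seq, cutoff=CONVENTIONAL_CUTOFF):
    """True if the longest GGGGCC run is at least `cutoff` units (an assumed convention)."""
    return longest_repeat_run(seq) >= cutoff

# Hypothetical sequences for illustration only.
short_allele = "ATCA" + "GGGGCC" * 8 + "TTAA"
expanded_allele = "ATCA" + "GGGGCC" * 35 + "TTAA" + "GGGGCC" * 3

print(longest_repeat_run(short_allele), exceeds_cutoff(short_allele))
print(longest_repeat_run(expanded_allele), exceeds_cutoff(expanded_allele))
```

Because the count is taken per sequence, the same individual could yield different values in different tissues, which is the mosaicism issue raised above.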

Fig. 1 C9orf72 gene structure, RNA transcripts, and proteins. (a) Schematic representation of the human C9orf72 gene with expanded GGGGCC hexanucleotide repeats. (b) Three RNA transcripts can be produced from the C9orf72 gene. Variant 1 is predicted to result in a “short isoform” C9ORF72 protein of 222 amino acids, whereas variants 2 and 3 encode a long C9ORF72 protein of 481 amino acids.

Pathogenic Mechanisms of c9ALS/FTD: Loss of Function Versus Gain of Toxicity

Our current understanding of the contributions of loss of function and gain of toxicity to c9ALS/FTD pathogenesis is driven by complementary studies of human postmortem tissues and experimental model systems in vitro and in vivo. When interpreting these data, one should keep in mind the advantages and shortcomings of each system. Patient postmortem studies are more disease-relevant but are anatomically complex and generally capture the later stages of the disease. Published experimental model systems, on the other hand, often use repeat sizes significantly shorter than those found in patients, mainly because of the technical difficulties in obtaining pure and long GGGGCC repeats. They also frequently rely on high overexpression of a single potentially toxic species that may not reflect the endogenous expression in patients. Despite these challenges, a picture of pathogenic mechanisms leading to c9ALS/FTD is emerging.

Loss of C9orf72 Function Alone Is Insufficient to Cause ALS/FTD

Reduced C9orf72 Expression in c9ALS/FTD Patients

In humans, C9orf72 is transcribed into three major RNA isoforms (Fig. 1b). In variants 1 (NM_145005.6) and 3 (NM_001256054.2), the expanded GGGGCC repeat is located in an intron between two alternatively spliced exons, and in variant 2 (NM_018325.5), the repeat is located in the promoter region. Variant 2 is expressed at a higher level than variants 1 and 3 in CNS tissues [23]. The presence of an enormous GGGGCC repeat expansion in the intron may cause transcription to abort prematurely, increasing the levels of transcripts containing sequence upstream of the repeats and decreasing the levels of transcripts containing sequence downstream of the repeats in C9orf72 patients [24]. In addition, cytosine hypermethylation of CG dinucleotides in CpG islands (regions with a significantly higher frequency of the CG sequence) is an important epigenetic modification that can lead to gene silencing. There is a CpG island 5′ of the GGGGCC repeats, and the repeat expansion itself also provides many more CpG islands. In C9orf72 patients, the CpG island 5′ of the GGGGCC repeats is hypermethylated [25], as is the repeat itself [26]. Finally, trimethylation of histones H3 and H4 at the C9orf72 locus was detected in C9orf72 patient blood, suggesting another mechanism of transcriptional regulation [27]. All of these could contribute to downregulation of C9orf72 gene expression and loss of C9orf72 protein function (Fig. 2a). Many independent groups have shown reduced levels of one or more of the C9orf72 RNA transcripts in induced pluripotent stem cell (iPSC)-derived neurons [28,29,30] and patient tissues, including lymphoblasts, frontal cortex, and cerebellum [6, 7, 31]. Demonstrating reduced C9orf72 protein levels has been more challenging because of the low abundance of C9orf72 protein and a lack of good antibodies for its detection. However, several studies suggest that the expression levels of C9orf72 proteins are reduced in the frontal and temporal cortices in patients [32,33,34].
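
To make the point about the CpG content of the repeat concrete, the sketch below counts CG dinucleotides in a tract of pure GGGGCC repeats and computes its GC fraction and observed/expected CpG ratio. The thresholds mentioned in the comments (GC content above 50% and an observed/expected ratio above 0.6 over at least 200 bp) are the commonly used Gardiner-Garden and Frommer criteria for a CpG island and are not taken from this review; the function name and the choice of 100 repeat units are illustrative assumptions.

```python
def cpg_stats(seq):
    """Return (CpG count, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so str.count is exact
    gc_fraction = (c + g) / n
    # Observed/expected ratio: (CpG count * length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return cpg, gc_fraction, obs_exp

# Hypothetical tract of 100 pure repeat units (600 bp), for illustration only.
repeat_tract = "GGGGCC" * 100
cpg, gc_fraction, obs_exp = cpg_stats(repeat_tract)

# Commonly used criteria (an outside assumption): GC > 0.5, obs/exp > 0.6, length >= 200 bp.
print(cpg, gc_fraction, round(obs_exp, 2))
```

Each additional repeat unit contributes one CG dinucleotide at the junction between units, so a long expansion is itself a CpG-dense substrate for methylation.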

Fig. 2 Proposed pathogenic mechanisms in C9orf72 ALS/FTD. (a) The presence of expanded GGGGCC repeats potentially causes abortive transcription from exon 1a and hypermethylation of both DNA and histones to reduce C9orf72 RNA transcription, leading to a loss of C9orf72 protein function. (b) Bidirectionally transcribed repeat-containing RNAs cause neuronal toxicity by sequestration of RNA-binding proteins into RNA foci or production of at least 5 aberrant dipeptide repeat proteins [poly(GA), poly(GP), poly(GR), poly(PR), and poly(PA)] through a novel repeat-associated non-AUG-dependent translation mechanism.

Physiological Function of C9orf72 Protein

Transcript variants 2 and 3 encode the full-length C9orf72 protein of 481 amino acids, and variant 1 produces a predicted “short isoform” protein of 222 amino acids with a unique lysine residue at the C-terminus (Fig. 1b). Whether the predicted short isoform of C9orf72 is expressed endogenously is controversial. Xiao et al. [35] developed an antibody that selectively detects the predicted short isoform C9orf72 protein and showed that its expression levels on nuclear membranes were lower in c9ALS cases than in controls. However, using an antibody against the N-terminal region of C9orf72 that could recognize both the full-length and the predicted short isoform C9orf72, we demonstrated that C9orf72 is predominantly, if not exclusively, expressed as the full-length protein in human and mouse tissues [34]. This was further confirmed by another study using novel knockout-validated C9orf72 monoclonal rat and mouse antibodies [33].

Nevertheless, the physiological function of the full-length C9orf72 protein is emerging. Bioinformatic analysis shows that C9orf72 protein has high homology with the differentially expressed in normal and neoplastic cells (DENN) protein family, whose members function as guanine nucleotide exchange factors (GEFs) that activate Rab GTPases and regulate membrane trafficking [36, 37]. In agreement with this, C9orf72 interacts with Smith–Magenis syndrome chromosomal region candidate gene 8 (SMCR8) and WD repeat-containing protein 41 (WDR41), and this complex possesses GEF activity for different Rab proteins and plays a role in multiple steps of autophagy/lysosomal pathways [38,39,40,41,42,43,44,45]. Several Rab proteins are substrates of C9orf72, including Rab1, Rab3, Rab5, Rab7, Rab8a, Rab11, and Rab39 [40, 41, 43,44,45]. Knockdown of C9orf72 in human cell lines and primary neurons inhibits autophagy induction, leading to accumulation of P62 [40, 45]. Accumulation of autophagy substrates, including P62, is also observed in spleens of C9orf72 knockout mice and in iPSC-derived neurons from C9orf72 patients [45, 46]. iPSC-derived neurons from C9orf72 patients also showed increased sensitivity to autophagy inhibition, suggesting that reductions in C9orf72 levels contribute to cellular stress [29]. In agreement with this, C9orf72 protein is necessary for the formation of stress granules [47].

Loss of C9orf72 Function in ALS/FTD Pathogenesis

Is reduced C9orf72 function the main disease driver for ALS/FTD caused by the C9orf72 repeat expansions? In humans, only one sporadic ALS patient has ever been found to carry a splice site mutation potentially causing a heterozygous loss of function, and it is unclear whether this mutation is the cause of disease or a variant of undetermined significance [48, 49]. In addition, one would expect patients homozygous for C9orf72 repeat expansions (and thus with less C9orf72 than heterozygotes) to have more severe symptoms of disease, but this is not the case [50]. Although not conclusive, these human studies do not support a loss of function as the disease-driving mechanism.

Studies in model systems of lower species suggest that C9orf72 is critical for the development of motor systems and motor functions. In zebrafish, blocking C9orf72 translation or splicing with antisense morpholino oligonucleotides resulted in axonopathy and reduced mobility [51]. Overexpression of C9orf72 that lacks the complete DENN protein domain also phenocopied the morpholino experiment, suggesting the indispensability of the complete DENN protein domain [52]. Similarly, deletion of the C9orf72 orthologue in Caenorhabditis elegans led to age-dependent motility deficits, paralysis, and increased susceptibility to environmental osmotic stress [53]. Primary hippocampal neurons cultured from C9orf72 knockout mice have reduced dendritic arborization and spine density, suggesting a role for C9orf72 in neuronal morphogenesis [54]. However, sustained reduction of mouse C9orf72 to 30–40% of its normal level throughout the CNS using antisense oligonucleotides (ASOs) is well tolerated, producing no behavioral or pathological features characteristic of ALS/FTD [55]. Several ubiquitous or neuronal/glial cell–specific C9orf72 knockout mice have also been characterized. None of these mice produced any ALS/FTD-like pathological or behavioral phenotypes [41, 42, 56,57,58,59,60], indicating that C9orf72 loss of function alone is not the main ALS/FTD disease driver in mice. Interestingly, global C9orf72 knockout mice develop a proinflammatory/autoimmune phenotype with enlarged spleen and lymph nodes, consistent with the observation that C9orf72 ALS/FTD patients also show an increased prevalence of autoimmune disease [61]. These findings are particularly noteworthy, given the growing appreciation of the role of the immune system in a variety of neurodegenerative diseases [62].

Overall, currently available data suggest that C9orf72 loss of function is insufficient to precipitate disease. Instead, hypermethylation of the mutant C9orf72 allele (one of the mechanisms leading to reduced C9orf72 expression) might actually be protective, potentially by decreasing the transcription of toxic repeat-containing C9orf72 RNAs [63,64,65]. However, given the role of C9orf72 in autophagy and microglia, pathways implicated in ALS and FTD, haploinsufficiency might contribute to the disease process in synergy with the gain-of-toxicity mechanism. This will be discussed in the section “Synergistic Mechanisms.”

Gain of Toxicity Plays a Central Role in Disease Pathogenesis

RNAs containing sense GGGGCC and antisense CCCCGG repeats are bidirectionally transcribed from the expanded C9orf72 locus. These repeat-containing RNAs can exert a toxic gain of function and drive neurodegeneration (Fig. 2b). First, they may form complicated secondary structures, such as hairpins, i-motifs, and G-quadruplexes [66,67,68], which, in turn, sequester RBPs into intranuclear RNA foci that can be experimentally visualized using fluorescence in situ hybridization with probes against the repeats. These RNA foci lead to depletion of the normal function of the sequestered RBPs and to neuronal death. The RBP sequestration model is best demonstrated in the neuromuscular disease myotonic dystrophy type 1, where the muscleblind family proteins are sequestered by the CTG repeats in the DMPK gene and cause downstream splicing changes [69]. Second, despite being in an intronic region, the repeats in the C9orf72 RNAs can, remarkably, be translated through an unusual mechanism (RAN translation) in every reading frame to produce five different DPR proteins that might be detrimental to neurons. More specifically, poly(GA) and poly(GR) are uniquely translated from the sense GGGGCC RNA, poly(PR) and poly(PA) are uniquely translated from the antisense CCCCGG RNA, and poly(GP) can be translated from either strand.
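
The reading-frame logic described above can be checked with a few lines of Python. The sketch translates short stretches of the sense GGGGCC and antisense CCCCGG repeats in all three frames, using the standard genetic code restricted to the six codons that occur in these repeats. The function names and the choice of four repeat units are illustrative assumptions, and the code says nothing about how RAN translation initiates.

```python
# Standard genetic code, restricted to the six codons found in these repeats.
CODON_TABLE = {
    "GGG": "G", "GGC": "G",  # glycine
    "GCC": "A",              # alanine
    "CCC": "P", "CCG": "P",  # proline
    "CGG": "R",              # arginine
}

def translate_all_frames(unit, n_units=4):
    """Translate a tandem repeat of `unit` in reading frames 0, 1, and 2."""
    seq = unit * n_units
    peptides = []
    for frame in range(3):
        # Keep only complete codons starting at this frame offset.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        peptides.append("".join(CODON_TABLE[c] for c in codons))
    return peptides

def dipeptide_units(peptides):
    """Collapse each peptide to its unordered pair of residues, e.g. {'G', 'P'}."""
    return {frozenset(p[:2]) for p in peptides}

sense = translate_all_frames("GGGGCC")
antisense = translate_all_frames("CCCCGG")
shared = dipeptide_units(sense) & dipeptide_units(antisense)

print("sense:", sense)
print("antisense:", antisense)
print("shared:", [sorted(pair) for pair in shared])
```

One antisense frame reads alternating proline and glycine, which is the same repeating polymer as sense poly(GP); this is why detection of poly(GP) alone cannot be assigned to a single strand.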

Both sense and antisense RNA foci have been detected in fibroblasts, lymphoblasts, iPSC-derived neurons, and several CNS regions of c9ALS/FTD patients [28, 55, 70,71,72,73]. RNA foci are most abundant in neurons but also occur in astrocytes, microglia, and oligodendrocytes [55, 71, 72]. Similarly, accumulations of DPR proteins have been detected using either immunohistochemical assays for DPR inclusions or immunoblotting and ELISA-based assays for the soluble form [71, 73,74,75,76]. DPR inclusions are widely distributed throughout the brains of c9ALS/FTD patients and, interestingly, are less frequent in the spinal cord [76,77,78,79,80]. Based on the staining patterns, they can be categorized into neuronal cytoplasmic inclusions (NCIs), intranuclear inclusions (INIs), dystrophic neurites (DNs), and diffuse dendritic or cytoplasmic staining. So far, no DPR inclusions have been reported in astrocytes, microglia, or oligodendrocytes. Of the different types, the sense-strand RNA-encoded poly(GA), poly(GP), and poly(GR) inclusions are far more frequent than the antisense-strand RNA-encoded poly(PA) and poly(PR) inclusions, and they can co-occur in the same neuron [74]. SDS-soluble poly(GA) and poly(GP) can also be detected in CSF and blood by immunoassay [81,82,83]. This will be further discussed in the section “Biomarkers.”

To determine the gain-of-toxicity mechanism from the C9orf72 repeats, several studies have overexpressed GGGGCC repeats of variable sizes and demonstrated deleterious phenotypes in cell culture [84], C. elegans [85], zebrafish [86], and Drosophila [87,88,89]. The gain-of-toxicity mechanism is also supported by studies expressing individual DPR proteins (discussed below), as well as studies using iPSC-derived neurons. Short or interrupted repeats are generally used, given the technical difficulty of obtaining long and pure repeats. Adeno-associated viruses (AAVs) expressing 66 or 149 GGGGCC repeats were injected intracerebroventricularly into newborn mice to induce somatic transgenesis. These mice accumulated nuclear RNA foci and inclusions of DPR proteins and consequently exhibited hyperactivity, anxiety, antisocial behavior, and motor deficits [90, 91]. Four different groups, including ours, also generated transgenic mice expressing a bacterial artificial chromosome (BAC) that can accommodate genomic DNA from c9ALS/FTD patients to express long repeats (up to ~1000 repeats) in the context of the human gene. All these mouse models produced sense and antisense RNA foci, as well as DPR proteins from the sense strand. Two transgenic models failed to produce neurodegeneration [92, 93], whereas our mice, expressing exons 1–5 of the human C9orf72 gene and ~450 repeats, developed increased age-dependent anxiety and cognitive deficits, accompanied by the loss of hippocampal neurons, but no unusual motor phenotype [94]. In the fourth model, expressing the full-length C9orf72 gene and ~500 repeats, a subset of female mice developed an acute motor phenotype, including paralysis and decreased survival. Other females and males showed slower progression [95]. The reasons for the differences among these mouse models are unknown. They feature different genetic backgrounds, repeat sizes, transgene insertion sites, and expression levels. An in-depth comparison of all BAC models, especially of the accumulation of RNA foci and DPR proteins, will provide insights into the toxic species leading to motor and/or cortical neuron death.

In summary, expressing GGGGCC repeats in model systems clearly recapitulates both cellular pathology (RNA foci and DPR proteins) and neuronal degeneration as in patients, even though unphysiologically high levels of expression were used. It is also challenging to dissect the relative contributions of RNA foci or DPR proteins to disease pathogenesis since these repeats will produce both RNA foci and DPR proteins, and the relative contributions of sense and antisense repeat–containing RNAs remain unclear.

RNA Foci-Mediated Toxicity

Few neuropathological studies have attempted to correlate the abundance and distribution of RNA foci with c9ALS/FTD clinical features. Mizielinska et al. [72] showed, in a small number of C9orf72 FTD patients, that those with more sense RNA foci had an earlier age of symptom onset. Antisense foci showed a similar trend [72]. In agreement with this, antisense RNA foci are associated with nucleoli and mislocalization of TDP-43 [70, 96]. However, another study with a larger patient population reported a contradictory observation: patients with more antisense RNA foci in middle frontal gyrus (cortical layers 3–6) neurons showed a later age of onset [97].

Expressing C9orf72 GGGGCC repeats in primary neurons [84], Drosophila [87, 98], and zebrafish [86] suggests that the toxicity arises from RNA foci because no DPR proteins were detected. However, the inability to detect DPR proteins in these studies is insufficient to exclude a role for these proteins. Others argue against the toxicity from RNA foci. In two important studies, researchers generated Drosophila expressing either pure GGGGCC repeats or interrupted repeats of similar size, but with stop codons inserted in each reading frame to prevent the translation of the repeats into DPR proteins [88, 99]. Both pure and interrupted repeats form similar levels of RNA foci. Expression of pure repeats caused toxicity and early lethality whereas the interrupted ones had no effect. In another study, Drosophila expressing 160 GGGGCC repeats in the intron formed abundant sense RNA foci in the nucleus but produced little DPR protein and no neurodegeneration, again suggesting that nuclear sense RNA foci are not sufficient to drive neurodegeneration in this model [100]. There are, however, some caveats for interpreting these experiments, including whether the interrupted repeat RNAs have similar secondary structures to sequester the same key RBPs as the pure repeats, and whether RNA foci in humans may sequester RBPs that are not expressed in Drosophila.

A key to validating RNA foci–mediated toxicity is to identify the RBP(s) that are sequestered and thus lose function. Many proteins have been proposed to interact with short, synthesized C9orf72 repeat RNAs in vitro, including hnRNPA1, hnRNPA3, hnRNP-H, hnRNP-H1/F, nucleolin, Pur-α, double-stranded RNA-specific editase B2 (ADARB2), RanGAP1, THO complex subunit 4 (ALYREF), serine/arginine-rich splicing factor 1 (SRSF1), SRSF2, and Zfp106 [24, 28, 87, 101,102,103,104,105]. In c9ALS/FTD brain tissues, sense and antisense RNA foci colocalize with hnRNPA1, hnRNP-H/F, SRSF2, and ALYREF [101, 102, 106]. However, most of these RNA-binding proteins appear to colocalize with RNA foci at a low frequency; only hnRNP-H colocalizes with 70% of all sense foci detected [102]. Moreover, key data to support RNA foci–mediated toxicity are missing (i.e., a demonstration that loss of function of any of these RBPs leads to C9orf72 disease and that upregulation of their function can rescue phenotypes caused by C9orf72 repeat expansions).

DPR Protein-Mediated Toxicity

To gain insight into whether, and which, DPR proteins are the primary culprits in c9ALS/FTD, several clinicopathological studies have been carried out. Most of these studies raised doubts about the pathogenicity of DPR inclusions, for the following reasons: 1) the regions with the highest burden of DPR inclusions are the cerebellum, hippocampus, and neocortex, where no significant loss of neurons occurs; 2) the neuroanatomical distribution of DPR inclusions in the brain does not differ between FTD and ALS cases; 3) DPR inclusions are rarely observed in lower motor neurons in the spinal cord; and, most importantly, 4) there is no obvious correlation between the abundance of DPR inclusions and neurodegeneration [77, 78, 107, 108]. However, a few neuropathological studies support DPR protein–mediated toxicity. Mackenzie et al. showed moderate associations between the amount of poly(GA)-positive DNs and local degeneration in the frontal cortex [77]. In another study, more abundant poly(GA) NCIs in cortical, hippocampal, and motor regions were associated with earlier age at symptom onset [107]. Poly(GR) inclusions, but not the other DPR inclusions, correlated with areas of neurodegeneration in C9orf72 ALS and were unique in colocalizing with TDP-43 pathology in a small sample of brains that were obtained shortly after death [34]. In a larger cohort of 40 c9ALS/FTD cases, middle frontal gyrus poly(GR) inclusions were strongly correlated with neurodegeneration in this brain region [109]. Notably, these studies detect only aggregated, insoluble proteins, and it is possible that soluble species mediate toxicity. Alternatively, CNS regions that have extensive DPR pathology but are unaffected by neurodegeneration might contain protective factors.

To move from pathology to pathogenesis, several groups have expressed individual DPR proteins in isolation in yeast [110], mammalian cells [84, 111,112,113,114,115,116,117], Drosophila [84, 88, 89, 112, 118, 119], zebrafish [86, 120, 121], and mice [122,123,124,125,126,127]. Of all the DPR proteins, poly(GR) and poly(PR) are the most toxic. Synthetic 20-mer poly(GR) and poly(PR) can cause rapid death of U2OS cells and cultured human astrocytes when added exogenously [128]. These DPR proteins also consistently show toxicity when overexpressed in several different cell types, such as HEK293T cells, NSC-34 cells, and iPSC-derived neurons [84, 111, 112]. In Drosophila, targeted expression of poly(GR) or poly(PR) resulted in eye degeneration, motor deficits, and reduced survival [84, 88, 89, 112, 118, 119]. Finally, AAV-mediated or transgenic expression of poly(GR) and poly(PR) in mice caused age-dependent neurodegeneration, brain atrophy, and motor/memory deficits [124,125,126,127]. Poly(GA) also exerts toxicity. Synthetic poly(GA) exogenously applied to human cells or primary neurons is toxic [129]. Poly(GA) overexpression in cultured cells [115, 116], primary neurons [115, 116], zebrafish [120, 121], Drosophila [88], or mouse brains [122, 123] leads to toxicity. The remaining DPR proteins [poly(PA) and poly(GP)] are less likely to be the toxic species. These studies provide compelling evidence that certain DPR proteins are toxic. However, since high overexpression levels of individual DPR proteins were used, a key remaining question is whether, and which, DPR proteins are the main toxic species in c9ALS/FTD when expressed at the endogenous levels found in patients. Another important issue is whether different DPR proteins, or repeat RNA and DPR proteins, act synergistically, potentially in conjunction with the loss of function of C9orf72, to elicit downstream effects.

Synergistic Mechanisms

In c9ALS/FTD patients, pathological evidence exists for both loss of C9orf72 function and gain of toxicity. Whether loss of C9orf72 function synergizes with gain of toxicity to exacerbate ALS/FTD disease is not known. Mice with loss of C9orf72 do not develop ALS/FTD-related behavioral deficits, but additional stressors might be required as second hits to induce neurodegeneration in these mouse models. Shi et al. reported that loss of C9orf72 in human iPSC–derived motor neurons impacts lysosomal biogenesis and vesicular trafficking and exacerbates the toxicity of poly(GR) and poly(PR) exposure [30]. Similarly, loss of C9orf72 protein sensitized cells to toxicity induced by expression of 30 and 60 GGGGCC repeats [47]. Shao et al. [130] bred C9orf72 knockout mice with one BAC transgenic model expressing the repeat-expanded full-length human C9orf72 gene to test for synergy. One might question such a strategy, since transgene overexpression would be expected to compensate for loss of the endogenous mouse C9orf72 protein. Surprisingly, however, the transgenic mouse line does not express full-length C9orf72 protein from the transgene, and C9orf72 haploinsufficiency exacerbates motor behavior deficits in a dose-dependent manner [130]. It is premature to say that C9orf72 loss of function induces neurodegeneration together with toxic RNA and DPR proteins. Further investigations, either crossing C9orf72 knockout mice with transgenic mice that do not express the full-length C9orf72 protein or injecting AAV expressing GGGGCC repeats into C9orf72 knockout mice, are needed to address this question. In addition, poly(GA) recruits poly(GR) into inclusions and reduces poly(PR) toxicity when co-expressed in human cells and Drosophila [117, 118]. Whether interaction between individual DPR proteins enhances or suppresses their toxicity is thus a point of interest. Poly(GA) can also induce nuclear RNA foci in cells expressing 80 GGGGCC repeats and in patient fibroblast cells, suggesting a positive feedback loop between RNA foci and DPR proteins [131]. Finally, translational frameshifting has been suggested to occur in cells expressing short C9orf72 repeats [132] and other repeat expansions [133], which might produce chimeric repeat peptides. However, it is unknown whether such a frameshift occurs in C9orf72 mutation carriers and, if it does, what chimeric repeat peptides contribute to disease pathogenesis.

Downstream Molecular Pathway Dysfunctions

In addition to identifying the major toxic species, research has focused on determining cellular dysfunctions that result from the C9orf72 repeat expansions. These events provide a basic understanding of disease pathophysiology and new therapeutic targets (Fig. 3).

Fig. 3 Cellular processes impaired by the C9orf72 repeat expansions and potential therapeutic interventions. A wide range of cellular pathways have been implicated in c9ALS/FTD, including DNA damage, nucleolar stress, nucleocytoplasmic transport deficits, ER stress, autophagy dysfunction, translational inhibition, proteasome inhibition, and altered stress granule dynamics. Therapies targeting these deficits, as well as directly targeting repeat expanded C9orf72 DNA/RNA and DPR proteins, are also highlighted.

Nucleocytoplasmic Transport Deficits

Active transport of proteins and RNAs through nuclear pore complexes (NPCs), a process called nucleocytoplasmic transport (NCT), is essential for cellular functions [134]. Zhang et al. [98] showed that RanGAP1 binds to GGGGCC repeat RNA in vitro and is sequestered into sense RNA foci. RanGAP1 activates the small GTPase Ran, which plays an important role in regulating the interactions between cargo molecules and transport receptors, such as the importin and exportin families of proteins [134]. In both Drosophila overexpressing 30 GGGGCC repeats and c9ALS iPSC–derived neurons, the nucleocytoplasmic Ran gradient is decreased, and the nuclear import of proteins and nuclear export of RNAs are compromised. In addition, the repeat toxicity can be readily suppressed by increased activity of RanGAP1, suggesting that RNA foci–mediated NCT deficit is a fundamental pathway in c9ALS/FTD [98]. In parallel, unbiased genetic screens for phenotypic modifiers in Drosophila expressing GGGGCC repeats [89] or poly(PR) [119], in yeast [110], and in human cells expressing poly(GR) or poly(PR) [135] identified many proteins involved in NCT as modifiers that mitigate or exacerbate disease phenotypes. Mechanistically, poly(GR) and poly(PR) interact with nucleoporins (Nups), including those of the central channel of the nuclear pore, to reduce trafficking [136]. Poly(GA) also sequesters nucleocytoplasmic transport proteins, such as HR23, and induces an abnormal distribution of RanGAP1 in mice [122]. The primary cause of the NCT dysfunction in c9ALS/FTD is unknown. NCT defects have also been reported in other neurodegenerative disorders, such as Alzheimer’s disease [137] and Huntington’s disease [138, 139], suggesting that NCT dysfunction may play a more global role in the neurodegenerative process.

Nucleolar Dysfunction

The nucleolus plays an important role in ribosomal RNA biogenesis. When applied exogenously to astrocyte cultures, poly(GR) and poly(PR) 20-mers accumulate in the nucleolus, leading to splicing changes and impaired ribosomal RNA maturation [128]. Overexpressed poly(GR) and poly(PR) also colocalize with nucleoli in cultured cells, primary neurons, iPSC-derived neurons, and Drosophila, leading to abnormal nucleolar morphology [84]. However, poly(GR) and poly(PR) do not localize within the nucleolus in C9orf72 patient brains. Instead, sense RNA foci can bind nucleolin, a principal component of the nucleolus, and aberrant subcellular localization of nucleolin was observed in C9orf72 iPSC–derived motor neurons and patient motor cortex [24]. In contrast, antisense RNA foci associated with nucleoli were found neuropathologically to be linked to disease [96]. Nucleolin mislocalization was also found in C9orf72 BAC transgenic mice without significant changes in ribosomal RNA processing or splicing [92]. Interestingly, C9orf72 patient brains exhibit bidirectional nucleolar volume changes, with smaller nucleoli overall but enlarged nucleoli in neurons containing poly(GR) inclusions [140].

DNA Damage

Nucleolar stress induces DNA damage, to which postmitotic neurons are particularly susceptible. An age-dependent increase in DNA damage and oxidative stress was reported in iPSC-derived motor neurons of patients with C9orf72 repeat expansions [141]. DNA damage is also increased in spinal cord neurons of C9orf72 ALS patients [142]. Such DNA damage is partially caused by poly(GR), since expression of poly(GR) in iPSC-derived control neurons, neuronal cell lines, or mice in vivo is sufficient to induce DNA damage and toxicity [126]. Mechanistically, poly(GR) binds to Atp5a1 and compromises mitochondrial function, leading to increased oxidative stress and an overactivated Ku80-dependent DNA repair pathway [126, 143]. Ectopic Atp5a1 expression or reduction of oxidative stress by antioxidant treatment partially rescues DNA damage in motor neurons from C9orf72 carriers and in neurons expressing poly(GR). As a result of DNA damage, the levels of phosphorylated ATM and p53 and of other downstream proapoptotic proteins, such as PUMA, Bax, and cleaved caspase-3, are significantly increased in C9orf72 patient neurons. Partial loss of Ku80 function in these neurons through CRISPR/Cas9-mediated ablation or small RNA-mediated knockdown suppresses the apoptotic pathway. Thus, poly(GR)-mediated DNA damage contributes to ALS/FTD pathogenesis, and partial inhibition of the overactivated Ku80-dependent DNA repair pathway is a promising therapeutic strategy.

Alteration in Stress Granules

Eukaryotic cells have evolved sophisticated strategies to combat unexpected cellular stress. Cytoplasmic ribonucleoprotein (RNP) granules, such as processing bodies (P-bodies) and stress granules (SGs), assemble quickly under stress to sequester messenger RNAs (mRNAs), translation initiation factors, 40S ribosomal subunits, and many other RNA-binding proteins. In this way, only essential proteins needed for survival are produced. These granular structures can either disassemble upon release of cellular stress or be degraded by the autophagy pathway. Dysregulated formation or clearance of SGs has been proposed as a key pathogenic mechanism for ALS/FTD [144]. In recent years, groundbreaking work from several groups showed that SG-associated RNA-binding proteins, such as FUS and hnRNPA1/A2, undergo liquid–liquid phase separation (LLPS) to form droplets in vitro and have the propensity to further fibrillize into irreversible hydrogels or even insoluble amyloid structures [145,146,147,148]. One characteristic of these RBPs is that they all contain prion-like, intrinsically disordered low-complexity domains (LCDs). Poly(GR) and poly(PR) interact with these LCD proteins, impair LLPS, and disrupt the dynamics of stress granules [149]. Consistent with this, primary cortical neurons overexpressing poly(PR) showed a reduction in the number of cytoplasmic P-bodies, which were larger in size, and an increase in stress granule formation [84], and knockdown of several of the LCD proteins modifies the eye degeneration phenotype in Drosophila expressing poly(GR) [112]. Activation of the SG response or exposure to either poly(GR) or poly(PR) was also sufficient to disrupt nucleocytoplasmic transport by promoting the sequestration of several nuclear pore proteins into SGs [150]. Interestingly, poly(GR) and poly(PR) themselves undergo LLPS at high concentrations, and GGGGCC repeat RNA itself undergoes a gel transition [151] and promotes the phase transition of RNA granule proteins in vitro and in cells [152].

Translation Inhibition and Ubiquitin Proteasome System

In addition to regulating stress granule dynamics and ribosomal RNA synthesis, overexpression of poly(GR) and poly(PR) blocked global translation in an in vitro translation assay and in cell lines [114]. This was attributed to direct binding of poly(GR) and poly(PR) to mRNA, thereby blocking access to the translational machinery. Poly(PR) and poly(GR) also interact with translation initiation and elongation factors and ribosome subunits in pull-down assays [114]. In addition, GGGGCC repeat–containing RNAs sequester ribosomal subunits, trigger stress granule formation, and inhibit translation [152, 153]. Interestingly, 26S proteasome complexes, components of the ubiquitin proteasome system (UPS), are sequestered within poly(GA) aggregates in cells, as revealed by cryo-electron tomography, which allows 3D imaging of the cell interior in close-to-native conditions [154]. These results provide new insight into the mechanism by which poly(GA) deposition promotes UPS impairment and affects protein homeostasis.

Therapeutics

Currently, no cure is available for ALS or FTD. The identification of C9orf72 as the most common genetic cause and the fast progress in research unveiling disease mechanisms have inspired multiple therapeutic interventions for patients carrying this mutation, the prospects of which are particularly exciting (Fig. 3).

Targeting C9orf72 Repeat-Expanded RNA/DNA

Broad evidence supports a gain of toxicity in which the C9orf72 repeat RNAs play a central role in ALS/FTD pathogenesis. Thus, inhibiting the transcription of repeat-containing RNAs or selectively reducing their levels is a promising strategy, with antisense oligonucleotide (ASO) therapy being the most advanced in clinical development. Indeed, unraveling the exact contributions of RNA toxicity and DPR protein toxicity, or the exact cellular pathways involved, may not be necessary for such an approach to be successful. However, unraveling the exact contributions of sense and antisense strand transcripts is necessary, since these must be targeted separately.

Antisense Oligonucleotide Therapy

ASOs are designed synthetic oligonucleotides or oligonucleotide analogs that bind to target RNAs. Depending on their chemical modifications, ASOs selectively degrade mRNAs through recruitment of the endonuclease RNase H or prevent the interaction of RNAs with RBPs, thereby modulating RNA splicing/processing without degradation [155]. In recent years, ASO therapy for neurological disorders has gained significant interest, with the FDA approval of nusinersen (Spinraza) for spinal muscular atrophy (SMA) in 2016. Nusinersen increases the expression of survival motor neuron (SMN) protein by modulating splicing of the SMN2 pre-mRNAs [156]. The ability of ASOs to selectively degrade mRNAs is also advantageous if neurodegeneration is caused by gain of toxicity from mutant RNA/proteins. Phase I trials have been completed with ASOs targeting superoxide dismutase 1 (SOD1) in familial ALS [157] and the mutant huntingtin gene in Huntington’s disease [158]. For C9orf72 diseases, early studies showed that ASOs targeting within or immediately upstream of the sense repeats reduced sense RNA foci, increased survival from glutamate excitotoxicity, and abrogated aberrant gene expression patterns in fibroblasts and iPSC-derived neurons [28, 55, 101]. These observations were further extended to in vivo mouse models. We demonstrated that a single-dose, intraventricular administration of ASOs targeting the sense repeat–containing RNAs led to sustained reduction of sense RNA foci and DPR proteins and mitigated the behavioral and cognitive deficits in transgenic mice expressing 450 repeats [94]. Importantly, transcript variants 1 (NM_145005.6) and 3 (NM_001256054.2), which carry the repeat expansion, can be specifically targeted by ASOs without reducing variant 2 (NM_018325.5) expression. Since variant 2 is expressed at much higher levels than variants 1 and 3, overall C9orf72 abundance remained relatively intact [94]. Thus, even if loss of function is important in pathogenesis, toxic transcripts can be reduced without further exacerbating that loss. Based on these encouraging results, a phase I clinical trial of ASOs targeting the sense strand in C9ALS patients was started by Ionis Pharmaceuticals and its partner Biogen Inc. in September 2018 (NCT03626012).

RNA Interference Strategy

RNA interference (RNAi) represents an alternative approach against RNA/protein-mediated gain of toxicity. RNAi is a biological process in which certain double-stranded RNA molecules inhibit the expression of target genes by destroying the messenger RNAs in the cytoplasm through the RNA-induced silencing complex (RISC) [159]. The three most common RNAi approaches are short interfering RNAs (siRNAs), shRNAs, and artificial microRNAs (miRNAs). Given the recent progress in using adeno-associated virus (AAV) for therapeutic gene delivery [160], RNAi therapy has become even more attractive for treating neurodegenerative disorders. The major challenge for C9orf72 ALS/FTD is to target the repeat-containing C9orf72 transcripts in the nucleus. Indeed, one study showed that siRNA significantly reduced C9orf72 mRNA in patient fibroblasts but did not affect nuclear RNA foci abundance [55]. However, another study suggested that single-strand silencing RNAs reduced both sense and antisense RNA foci by reducing mutant RNA transcripts via RNAi and by sterically blocking RBP binding to RNAs [161]. Designed double-strand RNA targeting the repeat region can also block RNA foci formation [162]. In addition, AAV5-delivered miRNAs can silence C9orf72 and reduce RNA foci in both the nucleus and cytoplasm in iPSC-derived motor neurons and in an ALS mouse model [163, 164]. Further studies are needed to demonstrate the efficacy of RNAi in mitigating behavioral deficits in transgenic mice.

Small Molecules

Small molecules offer an attractive approach for targeting C9orf72 repeat DNA/RNAs given their pharmacological advantages, such as a small molecular mass, which is important for blood–brain barrier permeability. TMPyP4, a known G-quadruplex binder, binds (GGGGCC)8 RNA in vitro, ablates the sequestration of RBPs [165], and rescues transport defect pathologies and neurodegeneration in Drosophila overexpressing 30 repeats [98]. Recent progress has also been made in the design of nucleotide structure–specific small molecules. Several compounds bind the hairpin structure of GGGGCC repeats and significantly decrease foci formation and poly(GP) accumulation in cultured cells overexpressing 66 repeats and patient iPSC-derived neurons [83, 166]. It is unknown whether these compounds also reduce poly(GR) and poly(PR), two DPR proteins that are most toxic in model systems. Further experiments are also needed to demonstrate that they are beneficial in reducing cellular toxicity both in vitro and in vivo.

Reduce Repeat-Containing RNA Transcription

The complicated secondary structure formed by GC-rich C9orf72 repeat expansions suggests that specialized transcription machinery may be required to facilitate the passage of RNA polymerase II through such long repeats. The DRB sensitivity-inducing factor (DSIF) complex and the RNA polymerase II associated factor 1 complex (PAF1C) are highly conserved and have nonredundant roles in activating RNA polymerase II during elongation [167]. SUPT4H and SUPT5H (components of DSIF) and PAF1C play critical roles in the transcriptional elongation of C9orf72 repeat expansions [168, 169]. Targeting SUPT4H reduces levels of sense and antisense C9orf72 repeat transcripts, as well as accumulation of RNA foci and DPR products, and ameliorates neurodegeneration in C9orf72 iPSC-derived neurons and in Drosophila [168]. Similarly, in C9orf72 patients, the expression levels of PAF1 and LEO1 (components of PAF1C) are upregulated, and their expression correlates positively with the expression of repeat-containing transcripts. Depletion of PAF1C reduces RNA and poly(GR) dipeptide production in Drosophila expressing a GGGGCC transgene [169]. Although these findings are exciting, this therapeutic strategy requires careful examination, since depletion of SUPT4H causes a global RNA reduction [170].

CRISPR

Recent advances in genome editing with CRISPR/Cas technology have made it possible to target the mutant gene at the genomic level. CRISPR/Cas is an abbreviation for clustered regularly interspaced short palindromic repeats and CRISPR-associated protein. It was adapted from a naturally occurring genome-editing system in bacteria [171]. CRISPR/Cas9 can be designed to cut the C9orf72 repeat expansions, followed by gene repair through nonhomologous end joining. As a proof of concept, genetic correction of C9orf72 repeat expansions in patient iPSCs has been achieved [30, 126]. In addition, Pinto et al. [172] targeted enzymatically dead Cas9 to directly bind repeat-expanded DNAs, such as the CTG expansion in DM1, the CTGG expansion in DM2, and the GGGGCC expansion in C9ALS/FTD. This approach sterically impedes repeat RNA transcription and leads to a drastic decrease in the production of aberrant proteins produced from RAN translation [172]. Similar to ASOs, RNA-targeting Cas9 with an RNA endonuclease can degrade repeat-expanded RNAs and reduce RNA foci and DPR protein levels in cell lines [173]. The clinical use of CRISPR/Cas technology in ALS/FTD is still in its infancy. Ethical concerns, adequate methods of delivery and cell targeting, off-target effects and safety, and treatment efficacy are the main issues that need to be addressed before applying this method to patients.

Targeting DPR Proteins and Downstream Mechanisms

Overall, ASO, RNAi, and the other approaches described above target upstream disease mechanisms and also positively affect different downstream pathways. For example, Zhang et al. [98] showed that ASOs targeting C9orf72 sense RNA, or TMPyP4 treatment to inhibit RBP binding to repeat RNAs, can rescue NCT deficits. Nevertheless, strategies directly targeting DPR proteins and downstream mechanisms have also been pursued.

Target DPR Proteins

Aggregates of DPR proteins are a pathological hallmark of C9orf72 ALS/FTD, and DPR proteins seem to exert toxicity in model systems. In addition to inhibiting RAN translation, which may become feasible as its mechanisms are better understood [153, 174, 175], one way to directly mitigate DPR protein–mediated toxicity is to increase the turnover rate of the toxic proteins. To this end, overexpressing the small heat shock protein HSPB8 facilitated the autophagy-mediated disposal of a large variety of classical misfolded aggregation-prone proteins and significantly decreased the accumulation of most DPR insoluble species [176]. Antibody immunization against DPR proteins represents another novel therapeutic approach. In other neurodegenerative disorders, active and passive immunization have been explored to target toxic proteins that transmit from cell to cell, such as amyloid-β, tau, and α-synuclein [177]. The prion-like propagation of misfolded/aggregated proteins is also a pathogenic principle in ALS/FTD [178, 179]. Using coculture cell assays, poly(GA) was shown to possess cell-to-cell transmission properties [180], as were poly(GP) and poly(PA) [131]. Interestingly, poly(GA) antibodies reduce intracellular poly(GA) aggregation and the seeding activity of C9orf72 patient brain extracts [131]. Since poly(GR) and poly(PR) are more toxic than poly(GA), poly(GP), and poly(PA) in experimental models, further studies are needed to determine which DPR proteins and conformations to target. The specificity of antibodies against DPR proteins is also a concern, as similar short repetitive sequences are present in the human proteome.

Target Downstream Mechanisms

Other therapeutic approaches aim to correct downstream cellular pathways affected by C9orf72 repeat expansions. Reducing nuclear export by targeting the nuclear export factors SRSF1 or exportin 1 ameliorates the toxicity mediated by C9orf72 repeat expansion in Drosophila [98]. This could be due to reduced export of toxic repeat-containing RNA to the cytoplasm, and thus fewer DPR proteins [181], or to a more general mechanism that reverses alterations in NCT and the consequent mislocalization of RBPs, such as TDP-43 [98]. In addition to the genetic tools, selective inhibitors of nuclear export (SINEs) mitigate NCT deficits and neurodegeneration in a Drosophila model [98]. SINE compounds also improve primary neuron survival and partially improve motor function in rats overexpressing mutant TDP-43 [182]. Currently, Karyopharm Therapeutics and its partner Biogen Inc. are taking one of the SINE compounds, KPT-350, into clinical trials. Zhang et al. [150] also showed that inhibition of stress granule assembly by using ASOs targeting ataxin 2 is sufficient to abrogate NCT dysfunction and neurodegeneration in patient-derived neurons and in vivo. As reducing ataxin 2 significantly extends the lifespan of TDP-43 mice, this strategy holds great promise [183].

Biomarkers

One of the critical themes that has emerged over the last two decades of clinical trials in ALS relates to biomarkers. There is no clear biomarker for diagnosis, for tracking disease progression, or for determining whether a putative therapy engages its therapeutic target. In C9orf72-associated diseases, effective biomarkers could also identify phenoconversion of asymptomatic genetic carriers. Clinical, neurophysiological, and imaging biomarkers are difficult to employ and/or insensitive. Recently, progress has been made on biomarkers in the blood and CSF. Neurofilament proteins herald the onset of disease in presymptomatic carriers and appear to track with disease progression, although these biomarkers are nonspecific to C9orf72 diseases and are sensitive to damage caused by free radicals [184]. Significantly, the DPR protein poly(GP) was detected in CSF and in peripheral blood mononuclear cells from c9ALS patients. Thus, tracking poly(GP) in CSF provides a direct readout of target engagement for ASO and other clinical trials [82].

Conclusions and Future Perspectives

Since the discovery of C9orf72 repeat expansions, an impressive effort has been made to investigate disease mechanisms. Many disease models were developed to determine the loss of function and/or gain of toxicity and downstream cellular function alterations. Therapeutic approaches and biomarkers have also been explored. Excitingly, antisense oligonucleotide therapy to selectively degrade repeat-containing RNAs has gone into a phase I clinical trial for C9orf72 ALS patients. With a continuing fast pace of research on these devastating diseases, a promising therapy should soon be on the horizon.