INTRODUCTION

Trinucleotide repeats are located in genes that are expressed in a variety of tissues and play different roles, yet the diseases associated with expansion of these repeats predominantly affect particular neuron populations. For example, spinocerebellar ataxia type 17 (SCA17) is caused by expansion of CAG repeats in the gene encoding the TATA-binding protein (TBP), a ubiquitous transcription factor expressed in all cells. Huntingtin (gene HTT) is another example of a protein encoded by a CAG repeat-containing gene; it is also expressed in all cells and has pleiotropic functions [1]. Although the CAG repeats occur in different genes, the disease phenotypes caused by their expansion share some similarities. This suggests the existence of a general mechanism underlying neurodegeneration caused by trinucleotide repeat expansions. However, this mechanism remains a subject of debate. The contributions of the mutant protein and of the mutant RNA to neurodegeneration are under discussion [2]. It is also unclear why different numbers of repeats are required for disease manifestation. For instance, 36-40 CAG repeats in the HTT gene lead to incomplete penetrance of Huntington’s disease (HD) [3], while 41-48 repeats in the TBP gene cause SCA17 with reduced penetrance. Remarkably, in the cases with fewer than 48 CAG repeats, complete penetrance of SCA17 is associated with the presence of a particular variant in the STUB1 gene [4].

Early studies of CAG expansion diseases aimed to investigate the contribution of the polyglutamine tract (polyQ), which is encoded by the CAG triplets, to pathogenesis of these diseases [5]. It was assumed that polyQ is responsible for protein aggregation and subsequent cell death [6]. However, there is evidence that RNA containing CAG triplets is itself neurotoxic, i.e., the pathological process is induced directly by RNA. This is supported by the cases in which CAG repeats located in untranslated regions still lead to disease, as in spinocerebellar ataxia type 12 (SCA12) [7]. It is also noteworthy that interruption of the CAG tract by a CAA triplet, which also encodes glutamine, delays manifestation of the disease [8]. Hence, the CAG tract in RNA is thought to play a major role in pathogenesis of the CAG expansion diseases, although polyQ undoubtedly also influences pathogenesis. As a result, toxicity studies were carried out on RNAs containing CXG repeats. For example, evidence of pathogenicity of such RNAs has been observed in fruit flies [9], Caenorhabditis elegans [10], and mice [11].

RNA editing is a natural process of enzymatic modification of transcripts. It can have various consequences at the protein level, such as protein recoding, namely, non-synonymous amino acid substitution after editing of exons [12], a reading frame shift due to the appearance of a new start codon [13], a change in the length of the protein-coding region of mRNA if a stop codon is modified or a new one is formed [14], and alternative splicing in the case of adenosine deamination [15]. In humans, only two types of RNA editing are known: cytosine and adenosine deamination. The latter is predominant and leads to substitution of adenosine with inosine (A-to-I). This process is catalyzed by adenosine deaminases acting on RNA (ADARs) and occurs only in double-stranded RNA (dsRNA) regions [16, 17]. This deamination destabilizes and unwinds the RNA duplex, since the inosine-uridine pair is less stable than the adenosine-uridine pair. It is worth noting that RNA editing is a constitutive post-transcriptional modification process that occurs in most human cells and prevents unwanted production of type I interferon in response to excessive amounts of endogenous dsRNA, which mimics viral genomes and would otherwise trigger an antiviral immune response [18]. Thus, ADAR1 activity prevents spontaneous activation of dsRNA sensors by endogenous dsRNA [19, 20]. A-to-I RNA editing at a specific residue usually does not proceed to completion, so only a fraction of the adenosines at a given site is edited; this fraction is called the level of editing. Correct RNA editing is necessary for normal life and proper functioning of neurons [21]. For example, the ratio of recoded and original proteoforms can affect cell differentiation and development of the nervous system [22, 23]. Therefore, changes in the RNA editing profile could be involved in various diseases of the nervous system.
For instance, genomic mutations of ADAR in humans cause Aicardi–Goutières syndrome, an inflammatory disease affecting brain and skin [24]. Moreover, mutations in the Zα domain of ADAR cause bilateral striatal necrosis [25]. In addition, inosine is recognized as guanosine during translation of the messenger RNA. The ADAR2 isoform (ADARB1) has been shown to cause the majority of these amino acid-recoding editing events. More than 10% of A-to-I substitutions in RNA lead to protein recoding [26, 27]. This led to the hypothesis that alterations in the ADAR2-mediated editing could participate in the development of neurological diseases such as epilepsy [28], affective disorder, and schizophrenia [29]. Additionally, changes in the ADAR2-mediated RNA editing profile were observed in the hippocampus of patients with epilepsy [30].

Molecular mechanisms involved in pathogenesis of the CAG-repeat expansion diseases overlap with those involving RNA editing. For example, the dsRNA sensor PKR is activated in cells affected by an abnormal number of CAG repeats, and the level of PKR activation correlates with the number of repeats [31]. ADAR1, as mentioned above, is involved in modulation of this immune response pathway. Additionally, hairpin structures formed by the CXG repeats could serve as a substrate for Dicer ribonuclease, resulting in RNA inactivation via the RNA-induced silencing complex (RISC) [32]. The ADAR1 monomer, in turn, forms a complex with Dicer and guides it to the dsRNA, thereby enhancing gene silencing [33]. Moreover, sufficiently long CXG repeats form dsRNA hairpins, as well as Z-structures in DNA [34]. Hence, RNA-binding proteins (RBPs) could be involved in pathogenesis of the repeat expansion diseases by sequestering dsRNA formed by the repeats [35]. ADARs, in turn, have a dsRNA-binding domain (RBD), and ADAR1 p150 also has a Z-DNA-binding domain, so these enzymes could potentially bind to hairpins consisting of CXG repeats. For instance, GGGGCC repeats are associated with pathogenesis of amyotrophic lateral sclerosis (ALS). It was recently demonstrated that RNA containing GGGGCC repeats could sequester ADAR3 (ADARB2 gene). At the same time, knockdown of ADARB2 significantly reduced the number of neurons with nuclear RNA inclusions formed during differentiation of neurons from pluripotent stem cells obtained from patients with ALS [36]. In addition, the same work noted the sensitivity of neurons expressing an abnormal number of GGGGCC repeats to glutamate-mediated excitotoxicity. This is consistent with the results of previous studies on excitotoxic effects resulting from insufficient RNA editing of the glutamate receptor subunit GluR2 (GRIA2) [37, 38].
Another study demonstrated changes in the RNA editing caused by aberrant localization of ADAR2 in the cytoplasm of the cells expressing an abnormal number of GGGGCC repeats [39]. Moreover, CAG triplets are one of the most preferred substrates of ADARs [40]. An RNA hairpin of 15 bp is sufficient for the ADAR-mediated editing, and length of this hairpin correlates with the level of editing [41]. Thus, additional adenosine residues in the dsRNA regions formed as a result of repeat expansion could serve as an additional substrate for the constitutive form of ADAR1 (ADAR1 p110) in the nucleus, as well as for the interferon-induced ADAR1 p150 and ADAR2 isoforms in the cytoplasm. Interestingly, abnormal ADAR functioning in the context of reverse transcription has already been proposed as a mechanism involved in pathogenesis of the repeat expansion in neurons [42].

Considering all the above, we assumed that disruption of RNA editing may be involved in pathogenesis of the CAG expansion diseases. In this study, we conducted a targeted analysis of differential A-to-I editing in selected regions of various RNAs in cellular models of CAG repeat expansion diseases: induced pluripotent stem cells (iPSCs) obtained from patients with Huntington’s disease or spinocerebellar ataxia type 17, as well as midbrain organoids differentiated from these iPSCs. The aim of this study was to compare the RNA editing levels between the pathological and normal cell models.

MATERIALS AND METHODS

Cell lines. The list of iPSC lines used and the corresponding diseases, indicating the number of CAG repeats, is presented in Table S1 in the Online Resource 1.

Neuronal precursors and midbrain organoids were generated from iPSC lines obtained by reprogramming skin fibroblasts from patients with SCA17 (cell lines SCA17.9L, SCA17S5S, SCA17.4sev, and SCA17.8sev) or HD (cell lines HD76.1S, HD42.1.2, and HD46.5S). These cell lines were characterized according to the protocols outlined previously [43]. As controls, we used the iPSC lines RG4S, FF1S, and Huv4S5, obtained by reprogramming fibroblasts of healthy donors without an excessive number of CAG repeats [44].

Differentiation of iPSCs into midbrain organoids was carried out using the method described by Eremeev et al. [45].

Immunofluorescent staining. For immunofluorescent staining, organoids were washed with 1× phosphate-buffered saline (PBS), fixed for 30 min at room temperature in 4% paraformaldehyde/PBS, then embedded in a tissue freezing liquid (Leica, Germany) and frozen in liquid nitrogen vapor for 5 min, followed by preparation of 5-10-µm thick sections with a Thermo cryotome (Thermo Fisher Scientific, USA). The sections were fixed on a glass slide, cooled to –20°C with acetone for 5 min, and washed twice for 5 min each with PBS. They were incubated in a blocking buffer with primary antibodies for neuronal markers MAP2 (ELK Biotechnology, USA) and GFAP (Dako, Denmark) overnight in a humidified chamber at 4°C. Primary antibodies were applied in dilutions recommended by the manufacturer in PBS with 0.1% Tween 20 containing 5% fetal bovine serum (FBS) and 2% goat serum, incubated for 1 h at room temperature, and then washed 3 times for 5 min in PBS with 0.1% Tween 20. Secondary antibodies (Invitrogen, USA) conjugated with fluorescent tags (Alexa 488, Alexa 555) were applied in dilutions recommended by the manufacturer, incubated for 30 min at room temperature in the dark, and washed 3 times for 5 min in PBS with 0.1% Tween 20. Next, the preparations were incubated for 10 min with 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI) at a concentration of 0.1 µg/ml in PBS to visualize nuclei and washed 2 times with PBS. The resulting preparations were examined under an Olympus IX53 fluorescent microscope (Olympus, Japan).

RNA editing sites for targeted analysis. Based on the results of previous studies on proteogenomic searches for recoded peptides [46, 47], we selected common human and mouse RNA editing sites that were observed in both transcriptomic and proteomic datasets. Additionally, we added to our list noteworthy RNA editing sites, which were observed in the glioblastoma cell lines [48]. These included RNA editing sites located in the following genes: HTR2C, PWAR5, BLCAP, and ZNF669. All selected sites are presented in Table S2A in the Online Resource 1.

Primer design. RNA sequencing primers were designed using the following parameters: primer length 15-25 bp, GC content 40-75%, melting temperature 58-65°C. Required amplicon size was in the range of 260 to 450 bp, and distance from the editing site to the end of amplicon was required to be from 70 to 115 bp. Quantitative PCR (qPCR) primers were designed using the following parameters: amplicon length 50-200 bp, primer length 15-20 bp, melting temperature 62°C. All primers were designed using Primer-BLAST [49] and complementary DNA (cDNA) sequences from the NCBI reference sequence database [50]. Each designed primer pair was tested for nonspecific amplification using Primer-BLAST. Each pair was confirmed to correspond to one PCR amplicon with a known expected length under the planned PCR conditions. Absence of formation of hairpin structures and dimers in the selected primers was verified using OligoAnalyzer [51]. Primer pairs for targeted RNA sequencing and qPCR are presented in Tables S2A and S3A in the Online Resource 1, respectively. Primer synthesis was carried out by Evrogen (Russia).
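The design constraints for the sequencing primers can be expressed as a simple programmatic check. The following is a minimal sketch, not the procedure used in the study (primers were designed with Primer-BLAST): the function names are hypothetical, melting temperature is not evaluated because the calculation method is not stated, and the distance from the editing site to the amplicon end is interpreted here as the distance to the nearer end, which is an assumption.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a primer as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def sequencing_primer_ok(seq: str) -> bool:
    """Check primer length (15-25 nt) and GC content (40-75%).

    Melting temperature (58-65 degrees C in the text) is NOT checked here.
    """
    return 15 <= len(seq) <= 25 and 40.0 <= gc_content(seq) <= 75.0


def amplicon_ok(amplicon_start: int, amplicon_end: int, site_pos: int) -> bool:
    """Check amplicon size (260-450 bp) and site placement (70-115 bp).

    Coordinates are 1-based and inclusive. The distance is measured to the
    nearer amplicon end (an assumption about the intended meaning).
    """
    length = amplicon_end - amplicon_start + 1
    nearest_end = min(site_pos - amplicon_start, amplicon_end - site_pos)
    return 260 <= length <= 450 and 70 <= nearest_end <= 115
```

For example, a 300-bp amplicon with the editing site 99 bp from its start satisfies both amplicon criteria, whereas the same amplicon with the site in its center does not.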

RNA sample preparation. RNA isolation was performed using an RNeasy Plus Mini Kit (QIAGEN, Germany) according to the manufacturer’s protocol. Integrity of the isolated RNA was verified by electrophoresis in a 1.5% agarose gel. Concentration of total RNA was determined with a Qubit 4 fluorometer (Thermo Fisher Scientific) using a Qubit RNA BR Assay Kit (Thermo Fisher Scientific). Isolated RNA in an amount of 5-10 µg was purified from genomic DNA impurities using a TURBO DNA-free Kit (Thermo Fisher Scientific), after which the concentration was measured again. cDNA was synthesized from 1 µg of purified RNA using a MINT Universal kit (Evrogen, Russia). Reverse transcription was carried out according to the manufacturer’s protocol using a MiniAmp Plus thermal cycler (Thermo Fisher Scientific). The resulting cDNA was diluted 10 times and used to determine the level of RNA editing and for quantitative PCR.

Quantitative PCR. Gene expression was determined by qPCR using a CFX96 Touch Real-Time PCR Detection System (Bio-Rad, USA) in 96-well plates (Bio-Rad). Reactions were performed using a commercial real-time PCR kit qPCRmix-HS SYBR (Evrogen) in accordance with the manufacturer’s protocol; the final reaction volume was 25 µl. PCR was carried out in three technical replicates under the following conditions: 95°C, 3 min; then 39 cycles of 94°C, 15 s; 61°C, 10 s; 72°C, 15 s. Expression was normalized to the geometric mean of the expression of two genes (TBP and ACTB), because this method has been shown to be more stable in assessing expression changes during neuronal differentiation than normalization to a single gene.
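The normalization described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: ΔCt is taken as the threshold cycle of the target gene minus the geometric mean of the threshold cycles of the two reference genes, technical replicates are averaged arithmetically before normalization (an assumption), and the fold change uses the common 2^(-ΔΔCt) rule, which presumes close to 100% amplification efficiency. The function names are hypothetical.

```python
from statistics import geometric_mean, mean


def delta_ct(ct_target, ct_refs):
    """Return ΔCt: mean target Ct minus geometric mean of reference Cts.

    ct_target: list of technical-replicate Ct values for the target gene.
    ct_refs:   dict mapping reference gene name -> list of replicate Cts.
    """
    ref_means = [mean(replicates) for replicates in ct_refs.values()]
    return mean(ct_target) - geometric_mean(ref_means)


def fold_change(delta_ct_sample, delta_ct_baseline):
    """Relative expression by the 2^(-ΔΔCt) rule (assumes ~100% efficiency)."""
    return 2.0 ** -(delta_ct_sample - delta_ct_baseline)
```

Under this rule, a ΔCt that is 3 cycles lower in one sample than in another corresponds to an 8-fold higher relative expression, and a 1000-fold difference corresponds to a shift of roughly 10 cycles.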

Targeted RNA amplification. Amplification of the regions containing editing sites was carried out with a ProFlex 3 × 32-well PCR system (Thermo Fisher Scientific) using a commercial kit Tersus Plus (Evrogen). Reaction mixture (15 µl) was prepared according to the manufacturer’s protocol. Optimal PCR conditions were empirically selected for each fragment to increase the yield of the target PCR fragments and avoid formation of nonspecific products. Unless otherwise specified, all reactions were amplified under the following conditions: 95°C, 2 min; then 39 cycles of 95°C, 30 s; X°C, 30 s; 72°C, 30 s, where X is the annealing temperature specific for each individual pair of primers. Conditions for each primer pair are presented in Table S2A in the Online Resource 1. Amplicons were separated, visualized, and analyzed using electrophoresis in a 2% agarose gel. Gel composition: agarose 2%, 1× Tris-acetate (TAE) buffer (Litech, Russia), DNA marker 1 kb (Evrogen), ethidium bromide 10 mg/ml (Helicon, Russia). Samples were loaded into gel wells using 4× Gel Loading Dye, Blue (Evrogen). Electrophoresis was carried out in a 1× TAE buffer at a voltage of 180 V for one hour using a power source Elf-8 (DNA-technology, Russia). When amplification led to formation of nonspecific products, extraction of the target product from the gel was required (Table S2A in the Online Resource 1). In these cases, the amplicons were separated in a 1.5% agarose gel, and the target PCR products were excised using a scalpel and tweezers and purified using a Cleanup S-Cap kit (Evrogen, Russia) in accordance with the manufacturer’s protocol. To elute samples from the column, 20 µl of elution solution was used to obtain solutions with high concentrations of amplicons.

Library preparation. Amplicons for each sample were mixed at equimolar concentrations and assigned a unique index for identification. Concentration of each amplicon was measured using a Qubit Flex fluorimeter (Thermo Fisher Scientific) with a Qubit dsDNA BR Assay Kit (Thermo Fisher Scientific). Amplicon mixtures were purified using an Agencourt AMPure XP kit (Beckman Coulter Inc, USA) according to the standard protocol. Purified cDNA was used to generate libraries using a NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, USA) in accordance with the manufacturer’s protocol. DNA concentration in the libraries was determined using a Qubit 2.0 fluorimeter (Invitrogen) with a Quant-iT dsDNA HS Assay Kit (Invitrogen) according to the manufacturer’s recommendations. Quality of the prepared libraries was assessed with a BioAnalyzer 2100 microfluidic analyzer (Agilent, USA) using an Agilent DNA High Sensitivity Kit (Agilent, USA) according to the manufacturer’s instructions.
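Equimolar mixing requires converting the mass concentrations reported by the fluorimeter into molar concentrations, which depend on amplicon length. The sketch below is an illustration only, not the protocol used here: it assumes the commonly used average mass of 660 g/mol per base pair of double-stranded DNA, and the function names and the target amount per amplicon are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (common approximation)


def ng_per_ul_to_nM(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def equimolar_volumes(amplicons: dict, fmol_each: float) -> dict:
    """Return the volume (µl) of each amplicon that contains fmol_each fmol.

    amplicons: dict mapping name -> (concentration in ng/µl, length in bp).
    Since 1 nM equals 1 fmol/µl, volume = fmol_each / molarity in nM.
    """
    return {
        name: fmol_each / ng_per_ul_to_nM(conc, length)
        for name, (conc, length) in amplicons.items()
    }
```

For instance, a 500-bp amplicon at 33 ng/µl corresponds to 100 nM, so 2 µl of it contains 200 fmol; a shorter amplicon at the same mass concentration is proportionally more concentrated in molar terms.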

RNA sequencing. Libraries were mixed in equimolar ratios to prepare a 4 nM library solution. Resulting library was diluted to 10 pM and sequenced using a MiSeq Nano Reagent Kit v2 (500 cycles) (Illumina) according to the manufacturer’s protocol. Sequencing was performed with a MiSeq system (Illumina) using a set of paired reads 2×250 bp with 20% PhiX added as a control.

Differential RNA editing analysis. Quality of raw sequencing data was assessed using FastQC (version 0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) followed by preprocessing using fastp (version 0.22.0) [52]. The reads were then aligned to the human reference genome GRCh38.p14 using STAR (version 2.7.10b) [53] with default options and filtered with SAMtools [54]. Next, the generated BAM files were used for detection of RNA editing sites with REDItools [55]. The REDItools output was filtered according to the following criteria: the reference allele is A (T) and the alternative allele is G (C); the site is present in the REDIportal database [56]; the p-value generated by REDItools is below 0.05; and the editing level of the site is above 0.01%. In addition, each RNA editing site was required to be present in at least two samples of each group (HD, SCA17, control); otherwise it was excluded from the analysis. Analysis of differential editing between iPSCs and brain organoids was performed using the paired Wilcoxon test. REDIT LLR, a beta-binomial model-based tool for analyzing differential RNA editing, was used to compare RNA editing between the pathologies and controls [57]. Subsequent analysis was carried out using the R programming language (version 4.2.2).
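The site-level filtering criteria listed above can be sketched as follows. This is an illustration under stated assumptions rather than the actual pipeline: the record layout (dictionaries with the keys chrom, pos, ref, alt, pvalue, editing_level, group, sample) is hypothetical and does not reproduce the REDItools output format, the editing level is stored as a fraction (so 0.01% becomes 0.0001), and the set of known sites stands in for a lookup in REDIportal.

```python
from collections import defaultdict

GROUPS = ("HD", "SCA17", "control")


def passes_site_filters(call: dict, known_sites: set) -> bool:
    """Apply the per-call criteria: A>G (or T>C), known site, p < 0.05, level > 0.01%."""
    return (
        (call["ref"], call["alt"]) in {("A", "G"), ("T", "C")}
        and (call["chrom"], call["pos"]) in known_sites
        and call["pvalue"] < 0.05
        and call["editing_level"] > 0.0001  # 0.01% expressed as a fraction
    )


def sites_for_analysis(calls, known_sites, min_samples=2):
    """Keep sites detected in at least min_samples samples of every group."""
    samples_per_group = defaultdict(lambda: defaultdict(set))
    for call in calls:
        if passes_site_filters(call, known_sites):
            site = (call["chrom"], call["pos"])
            samples_per_group[site][call["group"]].add(call["sample"])
    return {
        site
        for site, by_group in samples_per_group.items()
        if all(len(by_group[group]) >= min_samples for group in GROUPS)
    }
```

A site that passes the per-call criteria but is detected in only one control sample is therefore dropped, even if it is present in every patient-derived sample.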

RESULTS AND DISCUSSION

Targeted NGS panel for analysis of A-to-I RNA editing in selected regions. We generated iPSCs from fibroblasts, which were obtained from 3 healthy donors, 3 patients with Huntington’s disease (HD) with different lengths of CAG repeats (42, 46, and 76) in the HTT gene, and 2 patients with spinocerebellar ataxia type 17 (SCA17) with 45 CAG repeats in the TBP gene. Among the SCA17 patients, a female proband carried 45 CAG repeats in the TBP gene but had no clinical symptoms of the disease. The obtained iPSCs were then differentiated into midbrain organoids according to the method described by Eremeev et al. [45]. The organoids were characterized using phase-contrast microscopy and immunofluorescence staining for the markers MAP2 and GFAP. All organoids expressed the markers (Fig. 1).

Fig. 1. Characteristics of midbrain organoids. a) Micrographs of brain organoids with morphogenesis zones after culturing in differentiation medium in minireactors for 45 days; phase-contrast microscopy (magnification ×200). b) Immunofluorescence staining of sections of brain organoids obtained from differentiated derivatives of different iPSC lines. Staining with MAP2 antibodies (green); staining with GFAP antibodies (red). Cell nuclei are stained with DAPI (blue). Magnification ×1000. SCA17.9L, cell line containing 45 CAG repeats in TBP; HD76.1S, cell line containing 76 CAG repeats in HTT; RG4S, control cell line.

To conduct the targeted analysis of RNA editing, we selected RNA editing sites that we previously identified in both the transcriptome and proteome of humans and mice [46, 47], as well as in the glioblastoma cell lines [48]. The recoding RNA sites observed in the previous proteome studies were located within the mRNAs of the following genes: CADPS, COPA, CYFIP2, FLNA, FLAB, GRIA2, GRIA3, GRIA4, and IGFBP7. Additionally, we included RNA editing sites within the genes HTR2C, BLCAP, EEF1AKMT2, FLNB, CCNI, SRP9, TROAP, ZNF669, as well as in the long non-coding RNA (lncRNA) PWAR5 (all selected sites are presented in Table S2A in the Online Resource 1).

Thus, we performed targeted RNA sequencing of 23 amplicons derived from iPSCs and midbrain organoids to obtain high coverage data for each selected RNA editing site. In the whole transcriptome sequencing, differential expression of transcripts results in uneven coverage of editing sites, which influences detection and statistical analysis of the editing events [58]. On the other hand, editing profiles did not depend on the number of amplification cycles in the range from 19 to 34 cycles [59]. In other words, the targeted RNA sequencing increases coverage and improves reliability of the RNA editing analysis.

We obtained RNA coverage of the regions of interest up to 16,000 reads, with an average of 3508 (Table S4A in the Online Resource 1). The filtration procedure excluded 3 out of the 23 initially selected RNA editing sites as their reads were not represented in all sample groups (Table S2A in the Online Resource 1). Beyond the mentioned sites, the analysis identified 44 additional editing sites that met our filtering criteria. Thus, after selection, 55 editing sites were analyzed for the iPSCs and 57 sites for the midbrain organoids, with 48 sites common to both groups. The sites included RNA positions edited either by ADAR1 or ADAR2 alone or by both enzymes (Table S2B in the Online Resource 1). For several of the sites examined, the ADAR isoforms that edit them are currently unknown. However, these sites are located within the Alu repeats, suggesting that they are edited by ADAR1 [27].

iPSC differentiation into midbrain organoids was accompanied by increase of expression of the genes encoding enzymes and regulators of A-to-I RNA editing and concomitant increase in the editing levels. Differentiation of iPSCs into brain organoids simulates the process of neural tissue formation in the embryo. Accordingly, increase in the ADAR2-mediated A-to-I RNA editing was expected [27]. To interpret the changes in RNA editing observed through the targeted panel, expression of several genes directly related to the A-to-I editing was assessed. Gene expressions were assessed in all investigated cultures, regardless of pathology or its absence, since they all underwent differentiation at the morphological level.

We observed a significant increase in the expression of ADARB1 (ADAR2) and ADARB2 (ADAR3, a catalytically inactive isoform) after differentiation into midbrain organoids, which is concordant with the existing knowledge on higher expression of these genes in neural tissues (Fig. 2). Further, we observed a decrease in the expression of the AIMP2 and SRSF9 genes, which have been described as negative regulators of RNA editing [27, 60]. AIMP2 has been assumed to reduce editing by both ADARs, while SRSF9 suppresses the ADAR2-mediated editing. For the ADAR gene, expression of mRNAs encoding all isoforms (total) as well as interferon-induced isoforms (IFN-induced) was assessed. As expected, ADAR mRNA (ADAR1) expression remained similar in both stem cells and organoids (Fig. 2; Table S3B in the Online Resource 1).

Fig. 2. Analysis of differential expression of the RNA editing-associated genes in the iPSCs and midbrain organoids. ΔCt, difference between the threshold cycle of the gene and geometric mean of the threshold cycles of ACTB and TBP; * p < 0.05; ** p < 0.01; ns, p > 0.05. More detailed data are presented in Table S3B in the Online Resource 1.

Next, we compared the editing levels of each site between the iPSCs and brain organoids for all 10 samples, regardless of pathology, to independently assess the impact of differentiation (Table S5 in the Online Resource 1). Given the observed increase in the ADARB1 expression and the decreased expression of AIMP2 and SRSF9, we expected an increase in the level of editing of the ADAR2-edited sites. The ADAR1-mediated editing sites were expected to remain at their original editing level. Indeed, the majority of the differentially edited sites (significance level <0.05, Wilcoxon test) have been described in the literature as ADAR2 substrates (Fig. 3a). For example, editing of the well-known ADAR2 editing sites in the mRNAs of calcium-dependent activator of secretion 2 (CADPS) and cytoplasmic FMR1-interacting protein 2 (CYFIP2) was increased (Fig. 3d). We also observed an increase in the editing of the BLCAP mRNA sites, which, according to various sources, are edited by both ADARs (see Table S5 in the Online Resource 1 for details on individual sites).
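The paired comparison of editing levels before and after differentiation can be illustrated with a self-contained implementation of the Wilcoxon signed-rank test. This is a minimal sketch, not the code used in the study (the analysis was performed in R): it drops zero differences, assigns average ranks to ties, and computes an exact two-sided p-value by enumerating all sign assignments, which is feasible for small sample sizes such as 10 pairs. Statistical packages may use a normal approximation or handle ties differently, so their p-values can differ slightly.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W+, exact two-sided p-value) for paired samples."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2  # expected W+ under the null hypothesis
    observed = abs(w_plus - centre)
    hits = 0
    # Enumerate every assignment of signs to the ranks (2^n cases).
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed - 1e-12:
            hits += 1
    return w_plus, hits / 2 ** len(ranks)
```

With five pairs in which every value increases, W+ equals 15 and the exact two-sided p-value is 2/32 = 0.0625, which shows that at least six consistently changing pairs are needed to reach p < 0.05 with this test.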

Fig. 3. Changes in RNA editing after differentiation of iPSCs into midbrain organoids. a) Volcano plot of differential editing analysis between iPSCs and midbrain organoids; b) no significant change: sites putatively edited by ADAR1; c) increase at the marginal significance level: sites putatively edited by both ADARs; d) significant increase: sites of ADAR2 confirmed by independent data.

Analysis of differential editing in the model of iPSC differentiation into brain organoids makes it possible to determine, with some certainty, which of the two active isoforms edits a site. The sites with a significant increase in the editing level (p < 0.05) are the ADAR2 substrates (Fig. 3d), whereas the sites with an increasing trend of the editing level (marginal significance, with p-values just above the cutoff) are probably edited by both ADARs (Fig. 3c). The stably edited sites are the substrates of ADAR1 (Fig. 3b). Interestingly, the RNAs with Alu and SINE repeats included in the study, such as lncRNA PWAR5 and TROAP-AS1, mainly include the sites annotated by us as the ADAR1 substrates. According to the literature, the RNA editing sites within these repeats with relatively long dsRNA regions are usually edited by this isoform [17].
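The annotation logic described above can be written as a small decision rule. This sketch is illustrative only: the function name is hypothetical, the upper bound for marginal significance (0.10) is an assumed value that is not given in the text, and sites whose editing decreases are left unclassified because the text does not address them.

```python
def annotate_isoform(p_value, median_change, alpha=0.05, marginal=0.10):
    """Suggest the editing isoform from the iPSC-to-organoid comparison.

    p_value:       p-value of the paired test for the site.
    median_change: change in editing level (organoids minus iPSCs).
    alpha:         significance cutoff used in the text (0.05).
    marginal:      assumed upper bound for "marginal" significance.
    """
    if p_value >= marginal:
        return "ADAR1"  # stably edited site
    if median_change <= 0:
        return "unclassified"  # decrease: not covered by the rule in the text
    return "ADAR2" if p_value < alpha else "ADAR1/ADAR2"
```

Under these assumptions, a site with p = 0.01 and increased editing is labeled ADAR2, a site with p = 0.07 and increased editing is labeled as a putative substrate of both isoforms, and a site with p = 0.6 is labeled ADAR1.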

Thus, differentiation of iPSC into midbrain organoids provided an interesting, alternative approach to annotate ADAR1 and ADAR2 substrates in addition to the knockout and knockdown experiments of the corresponding genes, which could be used in future work when the study is expanded to the transcriptome-wide scale.

A-to-I RNA editing in the midbrain organoids derived from the cells of the patients with trinucleotide repeat expansion diseases. Data analysis allowed separation of the iPSCs and midbrain organoids based on their RNA editing profiles, which include the editing levels of all sites edited by ADAR1, ADAR2, or both (Fig. 4, a and b, Fig. S1 in the Online Resource 1). Principal component analysis (PCA) separated the iPSCs and organoids into two groups, with two exceptions: brain organoid samples derived from the iPSCs of patients with Huntington’s disease (HD). One sample, characterized by 76 CAG repeats in the HTT gene (HD76), appeared to be an outlier among all the others, while another sample, characterized by 42 CAG repeats in the HTT gene (HD42), showed greater similarity to the iPSC group. It may be assumed that the HD42 organoids were characterized by incomplete differentiation and retained several features of the iPSCs from which they originated.

Fig. 4. Overview analysis of RNA editing in the iPSC and midbrain organoids differentiated from them. a) Distribution of the sites edited by different ADAR enzymes (source links in Table S2B in the Online Resource 1; ADAR1* – RNA sites in Alu- and SINE-repeats putatively edited by ADAR1). b) Principal component analysis of the editing levels of the studied sites in pathology and control. All samples were divided into two separate groups comprising iPSCs and midbrain organoids. Two HD organoid cultures containing 42 and 76 CAG repeats were outliers. c, d) Boxplots of RNA editing levels with inclusion of HD76 in the HD group and with exclusion of this sample; iPSC, induced pluripotent stem cells; HD, Huntington’s disease; SCA17, ataxia type 17; HD76, Huntington’s disease with 76 CAG repeats in the HTT gene; * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001.

Mehta et al. previously reported changes in the transcriptomes and morphological features of cortical neurons derived from the iPSCs of the HD patient [61]. Consistent with this, we observed lower levels of RNA editing, expressed as median editing of all sites examined, and narrower interquartile intervals in the HD and SCA17 patient-derived iPSCs and organoids compared to the controls (Fig. 4c). RNA editing in the midbrain organoids derived from the HD patients was significantly different from that derived from the SCA17 patients and healthy donors (Mann–Whitney test, p < 0.05).

The data were also analyzed with the HD76 midbrain organoids separated into their own group. RNA editing in the HD76 organoid sample was significantly different from that in the midbrain organoids obtained from the healthy donors, SCA17 patients, and the rest of the HD patients (Fig. 4d, Mann–Whitney U test, p < 0.05). The sample with such a large repeat expansion stood out not only in its level of RNA editing. Expression analysis of the genes associated with RNA editing revealed, in particular, a more than 1000-fold increase in the expression of protein kinase R (PKR) in the HD76 midbrain organoids compared to the corresponding iPSCs. Other samples did not show such a pronounced difference in the expression of any of the genes from the analyzed list (Table S3C in the Online Resource 1). This result is consistent with previous studies that have shown the presence of inflammation and high PKR expression in the CAG and CUG repeat expansion diseases [62]. Moreover, PKR can bind to the CAG tract, and this ability becomes more pronounced as the length of the tract increases [31]. Notably, alternative splicing abnormalities were detected in cells containing 69 and 74 CAG repeats in the huntingtin gene, but were absent in the samples with 44-46 such repeats [63]. Splicing abnormalities, in turn, could directly cause PKR activation via the dsRNA regions formed by the retained introns [64]. In addition to PKR, a 72-fold increase in the expression of the ADARB1 gene encoding the ADAR2 isoform was also observed in the HD76 sample. Considering the hypothesis that additional repeats may affect ADAR activity, this increase may be compensatory, since the normal amount of the enzyme is inhibited by the excess of double-stranded RNA in the huntingtin transcripts.

Taken together, these data support the common notion that the RNA editing profile changes significantly after differentiation of iPSC into brain organoids [65]. We also note significant reduction in the level of RNA editing in the midbrain organoids derived from the patient with 76 CAG repeats in the HTT gene.

Long non-coding RNA PWAR5 is enriched with the sites editing of which is reduced in pathology. After analyzing the RNA editing changes that accompany differentiation of iPSCs into the midbrain organoids, we searched for RNA sites that are differentially edited between the cells harboring normal numbers of CAG repeats and the cells with abnormal numbers of repeats. RNA sites were tested for differential editing in pairs for each disease, separately in iPSCs and brain organoids, using the REDIT LLR function [57]. No differentially edited sites were observed in iPSCs in any of the groups (Table 1, details in Fig. S2 in the Online Resource 1, Table S6A-C in the Online Resource 1). In contrast, we observed that the HD76 midbrain organoids, consistent with the above results, were enriched with the differentially edited sites (Table 1, Fig. 5, for details see Table S6D-F in the Online Resource 1). We observed 25, 28, and 16 differentially edited sites when comparing the HD76 midbrain organoids with the controls, SCA17, and other HD samples, respectively. In addition, we observed 5 differentially edited sites when comparing the organoids containing fewer than 47 CAG repeats in the HTT gene with the controls and SCA17 (Table 1; details in Fig. S3, Table S6G-H in the Online Resource 1). At the same time, RNA editing in the organoids derived from the patients with spinocerebellar ataxia type 17 was not different from the controls (Fig. S3C, Table S6I in the Online Resource 1). Importantly, RNA editing of most of the differentially edited sites was reduced in pathology.

Table 1. Number of differentially edited RNA sites in the studied genes in the pathology groups

Fig. 5. Volcano plots of differential editing analysis of midbrain organoids with 76 CAG repeats in the HTT gene (HD76). Blue dots indicate sites with significant (adj. p < 0.05) changes in editing level; X-axis represents the difference in RNA editing level (%); Y-axis represents log10P-value (adj.). Comparison of the HD76 organoids with the organoids from the healthy donors (a), from the patients with SCA17 (b), and with two other cases of Huntington’s disease (c).
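The significance call behind such a volcano plot can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that raw per-site p-values are already available (for example, from the differential editing test), and it assumes Benjamini-Hochberg adjustment, which the text does not specify. The function names, site identifiers, and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_differential_sites(sites, alpha=0.05):
    """Keep sites whose adjusted p-value is below alpha.

    `sites` is a list of (site_id, delta_editing_percent, raw_p) tuples,
    where delta is the editing level in pathology minus that in the reference.
    Returns (site_id, delta_editing_percent, adjusted_p) for significant sites.
    """
    adjusted = benjamini_hochberg([p for _, _, p in sites])
    return [
        (site_id, delta, q)
        for (site_id, delta, _), q in zip(sites, adjusted)
        if q < alpha
    ]


# Hypothetical sites: negative delta means reduced editing in pathology.
example_sites = [
    ("site_A", -18.4, 0.0004),
    ("site_B", -9.1, 0.0100),
    ("site_C", 1.2, 0.6200),
    ("site_D", -0.8, 0.4500),
]
for site_id, delta, q in call_differential_sites(example_sites):
    print(site_id, delta, round(q, 4))
```

Each retained site would correspond to one highlighted dot in the plot, with the editing difference on the X-axis and the adjusted p-value on the Y-axis.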

The majority of the differentially edited sites were located in non-coding regions (Fig. S4A in the Online Resource 1), and the lncRNA PWAR5 was enriched with sites displaying a significantly reduced level of editing (Fig. S4B in the Online Resource 1). PWAR5 is a long non-coding RNA associated with the severe inflammatory Prader-Willi/Angelman syndrome. This RNA is located within the SNHG14 gene, which contains more than 1000 A-to-I editing sites [66]. The SNHG14 gene contains clusters of box C/D small nucleolar RNAs (snoRNAs) and overlaps with several lncRNAs associated with Prader-Willi/Angelman syndrome (PWARs) [67]. Moreover, increased expression of this gene is associated with Parkinson’s disease [68], and a high density of editing sites has been reported for SNHG14 in schizophrenia [69]. Another gene located within this region, SNORD115, is associated with Prader-Willi/Angelman syndrome and has been shown to reduce ADAR2-mediated editing [70], in particular at the sites of the glutamate receptor GRIA2, a canonical target of this enzyme isoform.

Consistent with the previously described traits of the HD76 organoids, this sample exhibited a significant reduction in RNA editing of PWAR5. Moreover, most of the PWAR5 sites in these midbrain organoids were not edited at all (Fig. 6). At the same time, this trend was not observed in the corresponding iPSCs (Figs. 4d and 6). The reduced RNA editing in our experiments cannot be explained by differential expression of ADARs, as we did not observe any differences in the expression of ADAR1, ADAR2, and ADAR3, or of SRSF9 and AIMP2, between the pathologies (Table S3C in the Online Resource 1).

Fig. 6. Editing levels of the long non-coding RNA PWAR5. HD, Huntington’s disease; SCA17, spinocerebellar ataxia type 17; HD76, Huntington’s disease with 76 CAG repeats in the HTT gene; **** p < 0.0001.

In addition to the lncRNA PWAR5, several differentially edited sites were located in the mRNAs of the BLCAP and ZNF669 genes (Fig. S4B in the Online Resource 1). The BLCAP mRNA is known to be a target of both ADARs [71]. It has not yet been established which ADAR isoform edits the sites within the ZNF669 mRNA and the PWAR5 lncRNA. We hypothesize that some regions of these RNAs may be edited by both isoforms, while others may be edited only by ADAR1 (Table S5 in the Online Resource 1). These data suggest that the excessive number of CAG repeats in the HD76 sample may be capable of reducing the activity of both ADAR isoforms.

Editing of the GRIA2 mRNA site encoding the Q607R amino acid substitution was greater than 95% in almost all samples, which is consistent with the previous data on editing of this canonical region [72]. However, a slight decrease in editing of this site was again observed in the HD76 organoids (92.7%). A similar reduction in editing at this site has been previously described in the striatum and prefrontal cortex of HD patients, as well as in the prefrontal cortex of Alzheimer’s disease patients [73] and in motor neurons in amyotrophic lateral sclerosis [74]. Under-editing of this GRIA2 site results in glutamate-mediated excitotoxicity [37].

CONCLUSION

We studied A-to-I RNA editing of selected regions of previously described RNAs in cellular models of CAG expansion diseases (spinocerebellar ataxia type 17 and Huntington’s disease) in order to test the hypothesis that such repeats, which are prone to form double-stranded structures, may attract ADARs and thus lower the overall level of RNA editing. In particular, a decrease in the activity of ADAR1, which has an overall anti-inflammatory effect, might contribute to the pathogenesis of these diseases.

In addition to comparing the cell lines from the patients with pathologies, A-to-I RNA editing was assessed in the model of differentiation from iPSCs to midbrain organoids. Thus, there were two versions of the control and pathology groups: iPSCs and organoids. In agreement with the existing knowledge, differentiation into organoids increased the level of RNA editing in all samples but one. Expression analysis of the genes associated with the A-to-I RNA editing system showed that the ADAR1 isoforms maintain a stable level of expression during differentiation from stem cells to brain organoids, while the expression of ADAR2 increases significantly. This means that differentiation in culture provides a good tool for assessing which enzyme, ADAR1 or ADAR2, edits a particular site. If, according to the analysis with the targeted panel proposed here, the level of editing of a site increases significantly with differentiation, then this site is evidently edited by ADAR2. The results of this analysis generally agree with the literature data, which supports the validity of the suggested approach.
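The assignment rule described above can be written down as a small decision function. This is a sketch under stated assumptions rather than the procedure used in the study: the function name, the minimum increase of 10 percentage points, and the 0.05 significance threshold are hypothetical choices made only for illustration.

```python
def assign_adar(ipsc_level, organoid_level, adj_p, min_increase=10.0, alpha=0.05):
    """Suggest which enzyme edits a site from its change upon differentiation.

    ipsc_level, organoid_level: mean editing level (%) before and after
    differentiation. adj_p: adjusted p-value for the difference.
    min_increase and alpha are hypothetical thresholds.
    """
    increase = organoid_level - ipsc_level
    if adj_p < alpha and increase >= min_increase:
        # ADAR2 expression rises with differentiation while ADAR1 stays
        # stable, so a significant gain in editing points to ADAR2.
        return "likely ADAR2"
    # A stable editing level does not separate ADAR1-only sites from
    # sites that are edited by both enzymes.
    return "unresolved (ADAR1 or both)"


# Hypothetical sites.
print(assign_adar(ipsc_level=4.0, organoid_level=46.0, adj_p=0.001))
print(assign_adar(ipsc_level=31.0, organoid_level=33.0, adj_p=0.48))
```

The second branch is deliberately conservative: the absence of an increase is compatible with ADAR1 editing but does not prove it.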

We did not observe any significant differences between the normal and pathological conditions at any of the studied sites in iPSCs. Global changes in RNA editing were not observed in the midbrain organoids either. Among all the examined cell cultures, one was a significant outlier. It was obtained from a patient with HD carrying a high number of CAG repeats in the HTT gene (76, compared to 42-46 CAG repeats in the other HD samples). This sample confirmed our assumption to some extent, since it was characterized by a significantly reduced level of A-to-I editing in some of the studied RNA regions. The availability of only one such sample limits the generality of our conclusions; patients with such a high number of CAG repeats are extremely rare. Although the hypothesis was not confirmed for most of the samples, the obtained result points to directions for further research: precise determination of the number of CAG repeats that may directly or indirectly affect A-to-I editing, and clarification of the role of this process in the development of neuropathology. Such findings could form the basis of approaches that modulate RNA editing in order to treat repeat expansion diseases.