Introduction

Stroke is a complex disease caused by environmental and genetic factors. Previous studies have shown that genetic factors play a substantial role in the risk of stroke (Bak et al. 2002; Jerrard-Dunne et al. 2003). About 80% of strokes are ischemic strokes (IS). The most common subtypes of IS are large artery atherosclerosis (LAA), cardioembolic stroke (CES), and small vessel disease (SVD) (Jerrard-Dunne et al. 2003). Heritability varies across IS subtypes, with LAA showing the highest heritability of all subtypes (40.3%) (Bevan et al. 2012).

MicroRNAs (miRNAs) are a class of small noncoding RNAs that bind to the 3′-untranslated regions (3′-UTRs) of mRNAs and regulate their expression (Ambros 2004). Increasing evidence suggests that miRNAs are involved in the pathogenesis of many diseases, such as cancer (Lu et al. 2005), schizophrenia (Beveridge et al. 2010), Parkinson's disease (Kim et al. 2007), and stroke (Khoshnam et al. 2017). The biosynthesis of miRNAs is a complex process involving multiple miRNA biogenesis genes (Ambros 2004). Briefly, the primary miRNA is produced in the nucleus by RNA polymerase II. The primary miRNA is then processed into the precursor miRNA (pre-miRNA) by the RNase III DROSHA and the RNA-binding protein DGCR8. The pre-miRNA is exported to the cytoplasm via Ran-GTPase (RAN) and Exportin-5 (XPO5). Finally, the pre-miRNA is processed into the mature miRNA by a multiprotein complex that includes DICER, PIWIL1, GEMIN, and RAN (Bartel 2004).

We conducted this case–control study to investigate whether three known polymorphisms in miRNA biogenesis genes (DROSHA rs10719 T > C, RAN rs3803012 A > G, and PIWIL1 rs10773771 C > T) are associated with the risk of LAA stroke.

Materials and Methods

Study Subjects

This case–control study was approved by the institutional review boards of the local participating hospitals (2017NZGKJ-041). Written informed consent was obtained from all participants. All participants were drawn from the Chinese Han population. Our study comprised 710 LAA patients and 1,075 controls. All LAA patients had a focal neurologic deficit lasting > 24 h, which was confirmed by computed tomography (CT) or magnetic resonance imaging (MRI). All LAA patients were diagnosed according to the Trial of Org 10172 in Acute Stroke Treatment (TOAST) criteria (Adams et al. 1993). Patients with tumors, autoimmune diseases or hemorrhagic diseases were excluded. The healthy controls were recruited from individuals attending local hospitals for routine physical examinations during the same period. The controls had no neurological, cerebrovascular or cardiovascular diseases according to clinical examination or medical history. Risk factors of LAA stroke patients and controls were determined according to established criteria. Hypertension was defined as systolic pressure ≥ 140 mmHg and diastolic pressure ≥ 90 mmHg on at least three separate occasions, current use of anti-hypertensive medications, or a history of hypertension. Diabetes mellitus was defined as fasting glucose ≥ 126 mg/dl, use of anti-diabetic medications, or a history of diabetes.

SNP Selection and Genotyping

The single-nucleotide polymorphisms (SNPs) were selected from eight known miRNA biogenesis genes (DROSHA, DGCR8, RAN, XPO5, DICER, PIWIL1, GEMIN3, GEMIN4). The selection criteria were as follows: (1) location in the 3′-UTR region; (2) a minor allele frequency ≥ 0.05 in the 1000 Genomes CHB (Han Chinese in Beijing) data on the GRCh37 reference assembly (www.1000genomes.org/); (3) a predicted effect on miRNA binding according to the miRanda (John et al. 2004) and miRBase (Griffiths-Jones et al. 2008) databases, as queried through the SNPinfo Web Server (https://snpinfo.niehs.nih.gov/snpinfo/snpfunc.html) (Xu and Taylor 2009). A total of three SNPs (DROSHA rs10719, RAN rs3803012, and PIWIL1 rs10773771) met these criteria and were included in this study. Genomic DNA was isolated from whole blood. Genotyping for DROSHA rs10719, RAN rs3803012, and PIWIL1 rs10773771 was conducted using SNPscan technology, with technical assistance from the Center for Genetic and Genomic Analysis, Genesky Biotechnologies Inc. (Shanghai, China) (Supplementary Figs. 1–3). For quality control and validation, a random 5% of the samples were genotyped a second time to check for consistency, and the results were 100% consistent.
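The three selection criteria amount to a simple conjunctive filter over annotated candidate SNPs. The following is a minimal sketch of that logic only; the record type `CandidateSNP`, its field names and the function names are illustrative assumptions and do not correspond to the output format of SNPinfo or the 1000 Genomes Project.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CandidateSNP:
    """Hypothetical annotation record for one candidate SNP."""
    rsid: str
    gene: str
    region: str                  # annotation label, e.g. "3'-UTR"
    maf_chb: float               # minor allele frequency in 1000 Genomes CHB
    alters_mirna_binding: bool   # prediction flag taken from an annotation tool


MAF_THRESHOLD = 0.05  # criterion (2) in the text


def meets_criteria(snp: CandidateSNP) -> bool:
    """Return True only if all three selection criteria hold."""
    return (
        snp.region == "3'-UTR"               # criterion (1)
        and snp.maf_chb >= MAF_THRESHOLD     # criterion (2)
        and snp.alters_mirna_binding         # criterion (3)
    )


def select_snps(candidates: Iterable[CandidateSNP]) -> List[CandidateSNP]:
    """Keep the candidates that satisfy every criterion, preserving order."""
    return [snp for snp in candidates if meets_criteria(snp)]
```

A candidate failing any single criterion is dropped, which is why only a small number of SNPs from the eight genes remain after filtering.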

Statistical Analysis

Categorical variables were presented as number (percentage) and continuous variables as mean (SD, standard deviation). Categorical variables were compared with the Chi-square test and continuous variables with Student’s t test. The association between polymorphisms in miRNA biogenesis genes and LAA stroke was estimated as the odds ratio (OR) and 95% confidence interval (CI) from a multivariate logistic regression model adjusted for age, sex, hypertension and diabetes. Hardy–Weinberg equilibrium and the association between these polymorphisms and LAA stroke were assessed with the SNPStats web tool (http://bioinfo.iconcologia.net/SNPstats) (Sole et al. 2006). Goodness of fit was evaluated using the Akaike information criterion (AIC) to select the best genetic model for each SNP (Sole et al. 2006).
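For readers unfamiliar with the Hardy–Weinberg check, the textbook version is a one-degree-of-freedom Chi-square goodness-of-fit test on the genotype counts of the controls. The sketch below illustrates that textbook test under our own assumptions; it is not the SNPStats implementation, which may use a different (for example, exact) procedure, and the function name is hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Arguments are observed genotype counts (two homozygotes and the
    heterozygote). Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    # Cells with zero expectation (monomorphic SNP) contribute nothing.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A P value above 0.05 in the control group is conventionally read as no evidence of deviation from equilibrium.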

All reported P values are two-sided. P < 0.05 was considered statistically significant unless otherwise specified. All analyses were performed with IBM SPSS Statistics version 23.0 (Armonk, NY: IBM Corp.).
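The AIC-based choice among genetic models can be illustrated without covariates. For an unadjusted logistic regression on a categorical genotype grouping, the maximized log-likelihood has a closed form (each group's fitted risk equals its observed case proportion), so AIC = 2k − 2 ln L can be computed directly from a genotype-by-status table. The sketch below makes that simplifying assumption; the analysis in this study was adjusted for age, sex, hypertension and diabetes, so its AIC values would differ, and the log-additive and overdominant models are omitted. Function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def _group_loglik(cases: int, controls: int) -> float:
    """Maximized Bernoulli log-likelihood for one genotype group."""
    n = cases + controls
    ll = 0.0
    if cases > 0:
        ll += cases * math.log(cases / n)
    if controls > 0:
        ll += controls * math.log(controls / n)
    return ll


def aic_by_model(cases: Sequence[int], controls: Sequence[int]) -> Dict[str, float]:
    """Unadjusted AIC for three genetic models.

    `cases` and `controls` hold genotype counts ordered as
    (reference homozygote, heterozygote, variant homozygote).
    """
    groupings = {
        "codominant": [(0,), (1,), (2,)],  # three separate genotype groups
        "dominant": [(0,), (1, 2)],        # carriers vs. non-carriers
        "recessive": [(0, 1), (2,)],       # variant homozygotes vs. the rest
    }
    aic = {}
    for model, groups in groupings.items():
        loglik = sum(
            _group_loglik(sum(cases[i] for i in g), sum(controls[i] for i in g))
            for g in groups
        )
        k = len(groups)  # intercept plus one parameter per extra group
        aic[model] = 2 * k - 2 * loglik
    return aic
```

The model with the smallest AIC is taken as the best fit; when two genotype groups share the same risk, merging them costs no likelihood and saves one parameter, which is how a recessive or dominant model can beat the codominant one.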

In Silico Analysis

To evaluate the potential function of the 3′-UTR SNP rs10773771, we conducted in silico analysis using RNAfold Web server (http://rna.tbi.univie.ac.at/) (Hofacker 2003), SNPinfo Web server (https://snpinfo.niehs.nih.gov/snpinfo/snpfunc.html) (Xu and Taylor 2009) and HaploReg v4.1 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php) (Ward and Kellis 2016; Kheradpour and Kellis 2014).

Expression Quantitative Trait Loci (eQTL)

For SNP rs10773771, we queried the publicly available eQTL database GTEx V7 (https://www.gtexportal.org/home/) (The Genotype-Tissue Expression (GTEx) project 2013).

Results

miRNA Biogenesis Genes Polymorphisms and the Risk of LAA Stroke

The clinical characteristics of LAA patients and controls in the study are summarized in Table 1. The genotype distributions in the control group did not deviate from Hardy–Weinberg equilibrium (P > 0.05 for all 3 SNPs). As shown in Table 2, no significant associations between rs10719 or rs3803012 and LAA stroke were detected in the multivariate logistic regression analysis. However, the rs10773771 CC genotype was significantly associated with a decreased risk of LAA stroke in both the codominant (P = 0.014) and recessive models (P = 3 × 10⁻³). According to the AIC values, the recessive model was the best-fit model for rs10773771. In the recessive model, compared with the TT/CT genotypes, the rs10773771 CC genotype was associated with a significantly decreased risk of LAA stroke (OR 0.63, 95% CI 0.46–0.86, P = 3 × 10⁻³). Further stratification analysis suggested that this decreased risk was only observed in males (OR 0.63, 95% CI 0.42–0.96, P = 0.032), those aged above 66 (OR 0.52, 95% CI 0.33–0.82, P = 0.005), those without hypertension (OR 0.49, 95% CI 0.28–0.89, P = 0.018), and those without diabetes (OR 0.63, 95% CI 0.46–0.87, P = 0.005) (Supplementary Table 1).
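As a reading aid, a reported OR and its 95% CI can be checked against the reported P value under the usual Wald (normal) approximation: the standard error of ln(OR) is recovered from the width of the CI on the log scale, and a two-sided P value follows from z = ln(OR)/SE. The sketch below assumes the CI is a symmetric Wald interval on the log scale; the function name is hypothetical and this is not part of the original analysis.

```python
import math
from typing import Tuple


def wald_p_from_or_ci(
    odds_ratio: float, ci_lower: float, ci_upper: float, z_crit: float = 1.959964
) -> Tuple[float, float]:
    """Recover the Wald z statistic and two-sided P value from an OR and its CI.

    `z_crit` is the normal quantile matching the CI level (1.96 for 95%).
    """
    # Width of the CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Recessive-model estimate for rs10773771 reported in the text.
z, p = wald_p_from_or_ci(0.63, 0.46, 0.86)
```

For the recessive-model estimate, this back-calculation gives a P value of the same order of magnitude as the reported one; small differences are expected because the published OR and CI are rounded to two decimals.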

Table 1 Baseline information for the study participants
Table 2 Association between SNPs and LAA stroke

In Silico Analysis

The secondary structure of an mRNA can affect its interactions with miRNAs. Therefore, we examined whether the mRNA structure of PIWIL1 could be altered by rs10773771. After entering a 200 bp fragment of the PIWIL1 3′-UTR containing rs10773771 into RNAfold and changing the rs10773771 allele from C to T, we observed a change in the predicted mRNA secondary structure, with the minimum free energy (MFE) changing from ∆G = −35.5 kcal/mol to −36.6 kcal/mol (Fig. 1a), suggesting that the variant may affect mRNA stability. We further investigated whether rs10773771 could influence the binding of miRNAs to the 3′-UTR of PIWIL1 using the SNPinfo web server. The in silico results suggested that rs10773771 could affect the binding of three miRNAs (hsa-miR-1264, hsa-miR-340, hsa-miR-590-3p; Table 3) to the 3′-UTR of PIWIL1. In addition, based on HaploReg v4.1, we found that rs10773771 is likely to alter the binding affinity of the regulatory motifs HNF1, Irx, Pou2f2, Sp100 and TATA (Table 4).

Fig. 1

a Bioinformatics prediction of the influence of rs10773771 on RNA folding structures; MFE, minimum free energy. b Association between rs10773771 genotypes and PIWIL1 expression in human skin tissues based on GTEx V7. c Association between rs10773771 genotypes and PIWIL1 expression in human thyroid tissues based on GTEx V7. d Allele frequency of rs10773771 among different populations from the 1000 Genomes Project. Population descriptions: AMR American, AFR African, EUR European, EAS East Asian, SAS South Asian

Table 3 microRNA-binding sites based on SNPinfo
Table 4 Regulatory motifs altered for rs10773771 based on HaploReg v4.1

eQTL Analysis

Based on gene expression data extracted from GTEx, rs10773771 was identified as an eQTL for PIWIL1 in human skin (P = 1.534 × 10⁻¹⁰; Fig. 1b) and thyroid tissues (P = 4.869 × 10⁻⁶; Fig. 1c).

Discussion

In this study, we evaluated the associations between three polymorphisms in miRNA biogenesis genes (DROSHA rs10719 T > C, RAN rs3803012 A > G, and PIWIL1 rs10773771 C > T) and the risk of LAA stroke. To the best of our knowledge, this is the first study to report an association between rs10773771 and LAA stroke. These findings suggest that a genetic variant in PIWIL1 may contribute to the development of LAA stroke.

There is mounting evidence that miRNAs are involved in the pathogenesis of stroke (Khoshnam et al. 2017; Rink and Khanna 2011; Mirzaei et al. 2018). PIWIL1 plays an important role in stem cell renewal, division and RNA silencing (Hutvagner and Simard 2008). Moreover, it takes part in miRNA biogenesis at the step in which pre-miRNAs are processed into mature miRNAs (Bartel 2004). Given the function of PIWIL1 in miRNA biogenesis and the involvement of miRNAs in the pathogenesis of stroke, SNPs in PIWIL1 may affect the risk of stroke by altering its function. The SNP rs10773771, which reached statistical significance in this study, is located in the 3′-UTR of PIWIL1. We performed several complementary in silico analyses to predict the potential function of this SNP. rs10773771 is predicted to change the mRNA secondary structure of PIWIL1. Additionally, it is predicted to influence the binding of three miRNAs (hsa-miR-1264, hsa-miR-340, hsa-miR-590-3p) to the 3′-UTR of PIWIL1. In a previous study, reporter gene assays indicated that rs10773771 can change the binding of hsa-miR-1264 to the 3′-UTR of PIWIL1 (Liu et al. 2013). The miR-1264/1298/448 cluster has been identified as a biomarker to monitor reperfusion after cerebral ischemia (Uhlmann et al. 2017). Based on HaploReg v4.1, we also found that rs10773771 is likely to alter the binding affinity of the regulatory motifs HNF1, Irx, Pou2f2, Sp100 and TATA. Based on gene expression data extracted from GTEx, rs10773771 was identified as an eQTL for PIWIL1 in human skin (P = 1.534 × 10⁻¹⁰) and thyroid tissues (P = 4.869 × 10⁻⁶). The mechanism by which it influences the development of stroke requires further research. The allele frequency of rs10773771 differs between populations according to the 1000 Genomes Project (Fig. 1d), suggesting that the association observed in our study may differ across ethnic groups.

We found no significant association between DROSHA rs10719 T > C or RAN rs3803012 A > G and LAA stroke. DROSHA and RAN are essential for miRNA biogenesis. A Korean study found that DROSHA rs10719 was associated with the risk of IS (Kim et al. 2018). However, that association was not detected in LAA stroke, which is consistent with our results. This is the first study to investigate the association between RAN rs3803012 and LAA stroke. Our findings suggest that DROSHA rs10719 and RAN rs3803012 might not be associated with susceptibility to LAA stroke in the Chinese population.

The strengths of this study include the large sample size and the in silico analyses. However, several limitations should also be considered: (1) some risk factors for stroke, such as smoking and hyperlipidemia, were not included in the study, which may introduce additional bias; (2) our study was a hospital-based case–control study, so selection bias cannot be excluded.

In summary, we have identified an association between rs10773771 and LAA stroke. The underlying mechanisms remain to be elucidated in the future.