Introduction

Parkinson’s disease (PD) is the second most common neurodegenerative disease and the most prevalent movement disorder. The loss of dopaminergic neurons within the substantia nigra and consequent formation of intracellular α-synuclein (α-syn)-immunopositive inclusions are the hallmarks of PD [1]. To date, genetic, environmental, and aging factors have been found to contribute to its etiology; however, the exact neurodegenerative mechanisms are largely unknown.

Accumulating evidence indicates an important role of lysosomal activity in PD susceptibility and pathogenicity [2,3,4]. The first study to support this reported a higher rate of progressively developing parkinsonian features in relatives of patients with Gaucher disease, an autosomal recessive lysosomal storage disorder [5, 6]. These observations were further confirmed by genetic studies showing that heterozygous mutations in the glucocerebrosidase gene (GBA), which encodes a lysosomal enzyme, are the most prevalent genetic risk factors for PD (5–20-fold increased risk) [7]. At the genetic and biochemical levels, a clear association was identified between variants of several lysosomal genes and an increased risk of developing PD [3]. Together with mechanistic studies linking α-syn toxicity to lysosomal abnormalities, these findings suggest that lysosomal dysfunction plays an important role in PD [4].

Prosaposin, encoded by the PSAP gene, is a 524-amino acid protein containing a 16-residue signal peptide and four saposin (Sap) domains. These domains are further processed into four active saposins, SapA–D. SapA and SapC bind to lysosomal hydrolases, and SapC is an important co-activator of glucocerebrosidase (GCase), the enzyme encoded by GBA [8, 9], which degrades glucocerebrosides to glucose and ceramide. Importantly, SapC protects GCase from α-syn inhibition and competes with α-syn binding [10]. SapB and SapD interact with sphingomyelinase and function as activators that solubilize sphingomyelin for hydrolysis [11]. As reported in a recent study [12], reduced acid-sphingomyelinase activity caused by variants in SMPD1 (which encodes acid-sphingomyelinase) may lead to α-syn accumulation and is associated with PD. This suggests that disrupted binding of SapA–D to their related lysosomal hydrolases, or a decrease in their expression, may underlie reduced hydrolase activity and thereby increase the risk of developing PD.

Interestingly, genetic studies have identified a potential pathogenic role of variants in specific PSAP domains in PD. Studies involving Caucasian patients identified rare variants in SapC, which is critical for glucocerebrosidase activation, in 4 out of a total of 2,290 patients with PD (0.2%) versus 0 out of 2,838 controls (0%) [3, 13]. Recently, three pathogenic variants in SapD were found in three unrelated pedigrees among 230 patients with autosomal dominant PD (ADPD) [14]. The same study also reported that variants in SapD impaired autophagic flux, altered the intracellular localization of prosaposin, and caused α-syn aggregation in skin fibroblasts and induced pluripotent stem cell (iPSC)-derived dopaminergic neurons from patients [14].

However, some limitations of these studies should be considered. For example, Robak et al. [3] analyzed rare variants across the whole of PSAP but did not perform a gene-based burden analysis to test whether rare PSAP variants are significantly enriched in PD. Ouled Amar Bencheikh et al. [13] limited their focus to SapC, and Oji et al. [14] investigated rare variants in SapD but identified no exonic SapD variants in a total of 1145 subjects with sporadic PD (SPD). In addition, the clinical spectrum of patients with rare PSAP variants has not been described in detail in previous studies [3, 13, 14]. These limitations leave the role of PSAP in PD uncertain; a comprehensive investigation is therefore needed. In the current study, we explored the frequencies of rare PSAP variants, the collective burden of these rare variants on PD risk, and the clinical spectrum of patients with rare PSAP variants.

Materials and Methods

Subjects

A total of 400 patients with ADPD and 300 patients with SPD who were admitted to the Department of Neurology of the West China Hospital from December 2010 to June 2018 were recruited for this study. They were diagnosed by experienced neurologists based on the United Kingdom PD Society Brain Bank Clinical Diagnostic Criteria [15] or the 2015 Movement Disorder Society clinical diagnostic criteria for PD [16]. Patients with at least two consecutive generations of PD in their families were classified as ADPD patients [17]. Demographic data (including age, sex, education level, and medical history) and clinical data (including age of onset, disease duration, and initial medication) were collected for all participants. The Unified Parkinson’s Disease Rating Scale part III (UPDRS-III) and the Hoehn and Yahr (H&Y) stage were used to evaluate motor symptom severity. Cognitive function was evaluated using the Montreal Cognitive Assessment and the Frontal Assessment Battery. The study was approved by the ethics committee of West China Hospital, Sichuan University, and all participants provided informed consent prior to participation.

DNA Preparation and Genetic Analysis

Genomic DNA was extracted from peripheral blood leukocytes via the standard phenol-chloroform procedure. Genetic analyses, including whole-exome sequencing (WES), multiplex ligation-dependent probe amplification, data analysis, and variant annotation, are detailed in our previous publications [18, 19]. The pairwise linkage disequilibrium parameters (D′ and r²) were calculated using the SHEsis software [20].

Briefly, WES was conducted on the Illumina NovaSeq 6000 system following the manufacturer’s instructions. Clean reads were mapped to the reference genome (GRCh37/hg19) using the BWA and Picard protocol to obtain the BAM files. Genotype calling was performed using HaplotypeCaller from the Genome Analysis Toolkit (GATK). The average depth of coverage for PSAP was > 100×.

Samples with a high proportion of chimeric reads (> 5%), high contamination (> 5%), a poor call rate (< 90%), low mean depth (< 10×), or low mean genotype quality (< 65) were excluded from further analysis. For variant quality control, we restricted the data to GENCODE coding regions in which the Illumina exomes exceeded 10× mean coverage. Variants labeled “PASS” by the variant quality score recalibration (VQSR) filter of GATK were included in further analysis. In addition, individual genotypes had to meet the following criteria: (1) a genotype depth of more than 10; (2) an allele balance (alternative allele coverage/total allele coverage) between 0.2 and 0.8 for heterozygous sites and > 0.8 for homozygous sites; and (3) a genotype quality (GQ) of > 20.
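As a minimal sketch of the sample-level and genotype-level filters described above (not the pipeline actually used in this study), the criteria can be written as two predicates. The class and function names are hypothetical. The sketch assumes that the contamination threshold is an upper bound (samples above 5% are excluded), that the allele-balance interval for heterozygous sites is inclusive, and that the homozygous criterion refers to homozygous-alternative calls.

```python
from dataclasses import dataclass


@dataclass
class Genotype:
    """One individual genotype call (hypothetical container for illustration)."""
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting the alternative allele
    gq: int         # genotype quality
    is_het: bool    # True: heterozygous; False: homozygous-alternative


def sample_passes_qc(chimeric_frac: float, contamination_frac: float,
                     call_rate: float, mean_depth: float, mean_gq: float) -> bool:
    """Sample-level filter: keep the sample only if no exclusion criterion is met."""
    return (chimeric_frac <= 0.05          # chimeric reads not above 5%
            and contamination_frac <= 0.05  # assumed upper bound on contamination
            and call_rate >= 0.90
            and mean_depth >= 10
            and mean_gq >= 65)


def genotype_passes_qc(g: Genotype) -> bool:
    """Genotype-level filter: depth > 10, GQ > 20, and allele balance in range."""
    if g.depth <= 10 or g.gq <= 20:
        return False
    allele_balance = g.alt_reads / g.depth
    if g.is_het:
        return 0.2 <= allele_balance <= 0.8  # inclusive bounds are an assumption
    return allele_balance > 0.8
```

Keeping the two levels as separate predicates mirrors the order in the text: samples are excluded first, and genotype-level criteria are applied only to calls from retained samples.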

Allele frequencies were obtained from the following public databases: the 1000 Genomes Project (1000G), the Exome Aggregation Consortium-East Asian (ExAC_EAS), and the Genome Aggregation Database-East Asian (GnomAD_EAS). We classified variants as rare or common using the following criteria: (1) rare variants have a minor allele frequency (MAF) of < 0.001 in 1000G, ExAC_EAS, and GnomAD_EAS and (2) common variants have an MAF of > 0.01 in 1000G, ExAC_EAS, or GnomAD_EAS.
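The two frequency criteria can be illustrated with a short sketch. The function name and the “intermediate” label are hypothetical, and the sketch assumes that a variant absent from a database is treated as having an MAF of 0 there. The text does not state how variants that are neither rare nor common were handled.

```python
from typing import Mapping, Optional

DATABASES = ("1000G", "ExAC_EAS", "GnomAD_EAS")


def classify_by_maf(maf: Mapping[str, Optional[float]]) -> str:
    """Classify a variant as rare or common from its MAF in the three databases."""
    # Assumption: a missing or None entry means the variant was not observed (MAF 0).
    values = [maf.get(db) or 0.0 for db in DATABASES]
    if all(v < 0.001 for v in values):  # rare: below 0.001 in every database
        return "rare"
    if any(v > 0.01 for v in values):   # common: above 0.01 in at least one database
        return "common"
    return "intermediate"               # hypothetical label; not defined in the text


print(classify_by_maf({"1000G": 0.0, "ExAC_EAS": 0.0002, "GnomAD_EAS": 0.0005}))
```

Note the asymmetry: the rare criterion must hold in all three databases, whereas the common criterion needs to hold in only one.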

Rare variants annotated as “missense” or as protein-truncating variants (PTVs, including “frameshift_variant,” “splice_acceptor_variant,” “splice_donor_variant,” and “stop_gained”) were further classified as “deleterious” or “non-deleterious” based on the following criteria: (1) all PTVs were regarded as deleterious and (2) missense variants were considered deleterious only when they were predicted to be damaging by at least three of the following five in silico tools: SIFT, PolyPhen, Condel, Combined Annotation Dependent Depletion (CADD), and Genomic Evolutionary Rate Profiling (GERP++). Finally, rare variants were classified as pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, or benign, according to the recommendations of the American College of Medical Genetics (ACMG) [21]. Co-segregation analysis of candidate variants was performed in all available family members.
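The deleteriousness rule amounts to a simple vote across predictors, sketched below. This is an illustration rather than the annotation code used in the study. The function name is hypothetical, the consequence term “missense_variant” is an assumed label, and each tool is assumed to have already been reduced to a damaging/not-damaging call, because the score thresholds applied for CADD and GERP++ are not given in the text.

```python
PTV_CONSEQUENCES = {
    "frameshift_variant",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "stop_gained",
}
TOOLS = ("SIFT", "PolyPhen", "Condel", "CADD", "GERP++")


def is_deleterious(consequence: str, damaging_calls: dict) -> bool:
    """Return True if a rare variant counts as deleterious under the stated rule.

    damaging_calls maps a tool name to True when that tool predicts the variant
    to be damaging; tools missing from the mapping count as not damaging
    (an assumption).
    """
    if consequence in PTV_CONSEQUENCES:
        return True  # criterion 1: every PTV is deleterious
    if consequence == "missense_variant":
        votes = sum(1 for tool in TOOLS if damaging_calls.get(tool, False))
        return votes >= 3  # criterion 2: at least three of the five tools
    return False


print(is_deleterious("missense_variant", {"SIFT": True, "CADD": True, "GERP++": True}))
```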

Burden Analysis

To further investigate whether rare variants in the PSAP gene contribute collectively to PD risk, five different approaches implemented in the R package AssotesteR were used: the Sequence Kernel Association Test (SKAT), the sum of squared score U statistic (SSU), the sum test (SUM), the cumulative minor allele test (CMAT), and the Bayesian score test (BST) [22]. Rare variants identified in the PD patient group were compared with rare variants from the East Asian cohort in GnomAD (GnomAD_EAS) v2.1 and from Chinese data in the Chinese Millionome Database (CMDB), which served as the control groups. Damaging missense variants were those predicted as “deleterious” by Condel [23], and benign missense variants were those predicted as “neutral.” The burden of rare variants was analyzed for PSAP as a whole and for the SapC and SapD domains. Statistical significance was defined as p < 0.05. The Bonferroni method was used to correct for multiple comparisons, as necessary.
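To illustrate the general idea of a burden test, the sketch below collapses all qualifying rare variants in a region into a carrier/non-carrier count and compares cases with controls using a one-sided Fisher's exact test, followed by a Bonferroni adjustment. This is a simplified stand-in and not one of the five AssotesteR methods named above. The function names are hypothetical and the counts in the usage line are invented purely for illustration.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the number of
    carriers among cases given the fixed table margins.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_carriers)
    upper = min(n_cases, n_carriers)
    numer = sum(comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
                for k in range(a, upper + 1))
    return numer / denom


def collapsing_burden_test(case_carriers: int, n_cases: int,
                           control_carriers: int, n_controls: int) -> float:
    """Test whether carriers of qualifying rare variants are enriched among cases."""
    return fisher_exact_greater(case_carriers, n_cases - case_carriers,
                                control_carriers, n_controls - control_carriers)


def bonferroni(p_values: list) -> list:
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts for illustration only; these are not data from this study.
p = collapsing_burden_test(case_carriers=4, n_cases=700,
                           control_carriers=3, n_controls=9000)
print(p, bonferroni([p, 0.5]))
```

A collapsing test of this kind needs only aggregate carrier counts, which is why it is convenient when controls come from a public database. Kernel- and score-based tests such as SKAT and SSU weight individual variants differently and can behave differently when effect directions are mixed.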

Results

Demographic Characteristics

The demographic and clinical characteristics of the patients in the study are presented in Supplementary Table 1. Among the participating patients, 20 families had PD symptoms for 3 consecutive generations, while the remaining families had 2 consecutive generations affected by PD.

Rare Variants Identified in PD

A total of 10 candidate rare variants in PSAP, none of which had been previously reported in PD (except p.R186H [3]), were identified in 13 patients, including 8 unrelated patients with ADPD and 5 patients with SPD. Three of these variants (p.P189S, p.K303_K304insM, and p.Q358L) were each identified in two patients with PD (Table 1). Furthermore, none of the 13 patients carried any pathogenic or likely pathogenic variants in other previously known PD causative genes, based on our comprehensive genetic analysis.

Table 1 Details of rare variants identified in PD patients

Seven variants were located in SapA–D, including p.M76T in SapA; p.A221T in SapB; p.Q358L, p.G365S, and p.V381M in SapC; and p.S441N and p.H486R in SapD (Fig. 1a). Multiple sequence alignment analysis showed that most of them affect evolutionarily conserved amino acid positions in PSAP (Fig. 1b). According to the ACMG recommendations, six variants were classified as likely pathogenic, two as variants of uncertain significance, and two as benign. The p.V381M variant was found to be a novel variant (Table 1). The frequencies of patients with rare likely pathogenic variants in PSAP were 0.75% (3/400) in ADPD and 1.33% (4/300) in SPD.

Fig. 1

Schematic representation and protein sequence alignment analysis of PSAP, and schematic images of the SapA–D domains (O’Brien et al., 1991 and Oji et al., 2020), showing the variants identified in PD. a Variants above the schematic representation were reported in previous studies; variants below it were identified in this study. SP: signal peptide; red denotes pathogenic/likely pathogenic variants; blue denotes variants of uncertain significance (US); black denotes benign variants; * indicates variants identified in two patients. b Multiple sequence alignment analysis indicates that most of the variants identified in this study are at conserved positions. c–f SapA–D domains; residues forming an α-helix are indicated in yellow; residues forming a β-sheet are indicated in green; residues forming a β-turn are indicated in orange; missense mutations identified in our study are shown in red font in circles; missense mutations identified in previous studies are shown in red font in diamonds. g–i Pedigrees with autosomal dominant inherited parkinsonism carrying likely pathogenic variants of PSAP. Black symbols denote PD patients; half-black symbols denote essential tremor patients; oblique arrows indicate the probands; fork symbols indicate deceased individuals; circles indicate women; squares indicate men; “-” indicates the wild-type allele; and NA indicates that DNA was not available

Burden Analysis

Gene burden testing was performed to investigate whether the rare PSAP variants contribute to the risk of developing PD (Table 2; Supplementary Table 2). Using the Chinese cohort from CMDB and the East Asian cohort from GnomAD as controls, damaging missense variants were not found to be more common in PD patients.

Table 2 Burden analysis for rare variants of PSAP in patients with early-onset PD

Previous studies have indicated that rare variants in specific domains of PSAP, namely SapC [3, 13] and SapD [14], are linked to PD; therefore, we further explored their role in the risk of PD in a Chinese cohort. Compared with GnomAD_EAS, the burden of damaging missense variants in SapC was statistically significant (p = 0.002 in all five approaches of burden analysis) (Table 2).

Common Variants in PSAP Contribute to the Risk of PD

Two intronic variants, rs4747203 and rs885828, near the exonic sequence of SapD, have been reported to be associated with an increased risk for SPD [14]. To test this in the Chinese mainland population, genotypes of seven intronic polymorphisms located at the adjacent exon-intron boundaries (rs4747203, rs885828, rs4747202, rs11000016, rs3747860, rs55829339, and rs2070188) were obtained from WES. In the haplotype analysis using the seven ADPD polymorphisms, three strong linkage disequilibrium blocks were observed: rs4747203 and rs885828 (D′ = 0.935), rs4747202 and rs55829339 (D′ = 0.936), and rs3747860 and rs11000016 (D′ = 0.978) (Supplementary Table 3; Supplementary Fig. 1). Therefore, one polymorphism from each block (rs4747203, rs4747202, and rs3747860) together with rs2070188 was analyzed to determine whether it contributed to the risk of developing PD. Interestingly, the alternative allele “C” of rs4747203 was significantly associated with a decreased risk of developing sporadic PD when compared with the East Asian population (p = 8.6e−7) and the Chinese population from the CMDB (p = 0.002) (Table 3). Notably, two out of three studies in the ExSNP database (http://exsnp.ibbr.umd.edu/eQTL) identified an expression quantitative trait locus (eQTL) association between rs4747203 (exSNP) and PSAP (exGene) (p = 1.595e−6 and p = 1.541e−7, respectively), which indicates that rs4747203 is associated with PSAP expression.
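For readers unfamiliar with the pairwise linkage disequilibrium measures, the following sketch computes D′ and r² for two biallelic loci from a haplotype frequency and the two allele frequencies, using the standard definitions. It assumes that the haplotype frequency is already known; in practice, software such as SHEsis estimates it from unphased genotypes. The function name and the example frequencies are hypothetical.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple:
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B
          at locus 2; p_a and p_b: frequencies of alleles A and B.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies for illustration only.
print(ld_measures(p_ab=0.28, p_a=0.30, p_b=0.35))
```

D′ close to 1 indicates that little historical recombination separates the two sites, which is the rationale for carrying forward a single representative polymorphism from each block.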

Table 3 Allele frequency analysis of candidate risk variants of PSAP between PD patients and controls

Genotype-Phenotype Analysis

The detailed clinical features of patients carrying rare likely pathogenic variants of PSAP are shown in Table 4. Patients with PSAP variants had a varied age of onset (range 34.7–66.8 years), presented with typical PD symptoms, and responded well to levodopa. Six out of seven patients showed slow disease progression; the exception was the patient with the p.S441N variant, who showed a relatively rapid pace of deterioration as assessed by UPDRS-III and H&Y scores. At the last follow-up, when the mean disease duration was 5.9 ± 5.6 years, none of the patients had developed dyskinesia, cognitive impairment, or frontal lobe dysfunction, but depression and anxiety were common.

Table 4 Clinical features of patients carrying rare likely pathogenic variants of PSAP gene in the cohort

Discussion

In our study, the frequencies of patients with rare PSAP variants were 1.5% in ADPD and 1.67% in SPD. In gene-level and domain-level burden analyses, damaging missense variants in SapC showed modest statistical significance. In addition, rs4747203, an intronic variant potentially associated with PSAP expression, was associated with a reduced risk of developing PD in the Chinese population.

With the advent of next-generation sequencing, a growing number of causative genes with Mendelian inheritance have been identified in PD patients. Many genes associated with dominantly inherited PD (e.g., SNCA, LRRK2, and VPS35) can be considered causal, but several others, including UCHL1, HTRA2, GIGYF2, EIF4G1, DNAJC13, TMEM230, and LRP10, still need to be confirmed or have not yet been replicated [7]. Nevertheless, rare pathogenic variants account for only about 5–10% of PD, with lower rates in Asian patients than in European patients, owing to different genetic backgrounds. Recently, three probands carrying three pathogenic variants exclusively in the SapD domain of PSAP were identified in 230 patients with ADPD (1.3%, 3/230) but not in an additional SPD cohort of 1145 Japanese and Taiwanese patients [14]. This discovery led us to explore the role of PSAP in PD in the Chinese population, which shows substantial genetic overlap with Japanese subjects. In contrast to that report, however, the one pathogenic variant (p.S441N) that may affect the secondary structure of SapD (Fig. 1f) was found in a patient with early-onset SPD whose parents were 63 and 65 years old and showed no evidence of parkinsonism, and no pathogenic SapD variants were found in patients with ADPD. Combined with the previous study [14], our data suggest that pathogenic variants in SapD are rare in Asians (0.5% [3/630] in ADPD and 0.07% [1/1445] in SPD) and that variants in this domain may show incomplete penetrance. Moreover, in contrast to the previous study [14], our burden analysis indicated a lack of evidence for a genetic association of SapD with PD in the Chinese population.

Robak et al. and Ouled Amar Bencheikh et al. analyzed PSAP in a total of 2290 PD patients and 2332 controls and found that variants (p.G365C or p.T363M) within the region of SapC that binds and activates GBA were nominally enriched in PD [3, 13]. In our study, three putative pathogenic variants (p.Q358L, p.G365S, and p.V381M) that may affect the secondary structure of SapC (Fig. 1e) were identified in four patients, and no rare SapC variant was reported in the Chinese controls from CMDB. Co-segregation analysis of available family members in two ADPD pedigrees (p.Q358L and p.V381M) confirmed that relatives carrying the same variants had PD (Fig. 1h and i). In addition, with the East Asian population as controls, variant burden analysis indicated that damaging missense variants in SapC might be enriched in PD, suggesting that SapC is a potential candidate domain contributing to PD pathogenesis. However, this needs to be confirmed in further studies with larger samples.

Interestingly, no PD-associated variants have previously been reported in SapA or SapB, which overlap with SapC and SapD in structure and function [11]. In our study, a likely pathogenic variant (p.M76T in SapA) was found in a familial patient whose mother developed PD at the age of 65 (Fig. 1c and g). Co-segregation analysis revealed that his mother carried the same variant, while his healthy older brother did not (Fig. 1g). Another likely pathogenic variant (p.A221T in SapB) was found in an SPD patient with an onset age of 44.6 years. Segregation analysis within the family suggested incomplete penetrance of this variant, and no other rare variant in SapB has been reported in PD, suggesting that the role of SapB is limited.

In this study, we comprehensively analyzed the role of rare variants across the entire PSAP sequence in PD. Of note, the residual variation intolerance score of PSAP is −1, which means that this gene is intolerant to variation [24]. In addition, the probability of loss-of-function intolerance of PSAP is 0.99, based on the GnomAD database, indicating that this gene is intolerant to loss-of-function mutations. Functional studies identified impaired autophagic flux, accumulation of α-syn in patient-derived skin fibroblasts or iPSC-derived neurons, and abnormal pathology and behavior in mice with the PSAP p.C509S variant [14]. Therefore, although there was no significant difference in the distribution of rare PSAP variants between PD patients and controls after correction for multiple comparisons, pathogenic variants in PSAP remain an important potential cause of PD. As mentioned above, incomplete penetrance might explain why some subjects with rare variants of PSAP do not develop PD.

Saposins are important co-activator proteins of specific lysosomal hydrolases [11], and some of them, such as SapC, might be involved in essential cellular pathways that degrade α-syn [10]. Therefore, beyond alterations in the protein-coding sequence, it would also be interesting to assess expression levels. In our study, the “C” allele of rs4747203, located in intron 11, was associated with a decreased risk of developing PD. Notably, the minor allele of this variant is “T” in Chinese subjects but “C” in subjects of European ancestry. In other words, our results showed that the minor allele “T” increased the risk of PD. However, a Japanese study [14] found that the minor allele “T” decreased the risk of developing PD based on the East Asian dataset in GnomAD. This discrepancy might be due to differences in genetic background, sample size, and the matching of controls. We used data from East Asian controls from GnomAD as well as a large set of Chinese controls from the CMDB. Two studies in the ExSNP database [25] identified an eQTL association between rs4747203 and PSAP, indicating that this polymorphism is associated with PSAP expression. The “T” allele might decrease PSAP mRNA or protein levels and result in haploinsufficiency for activating lysosomal hydrolases, thus contributing to PD pathogenesis. However, more studies with larger sample sizes are needed to confirm this association, and further functional assessments of the relationship between rs4747203 and PSAP expression are warranted.

Although the sample size was limited, preliminary genotype-phenotype features were established for patients with rare variants of PSAP. Generally, patients with PSAP variants have a later age of onset, typical PD symptoms, a good response to levodopa, relatively slow disease progression, and no obvious cognitive impairment.

Although the burden analysis in our study argued against a significant overrepresentation of rare PSAP variants in patients with PD, the results should be interpreted with caution. In our burden analysis, Condel was used to classify missense variants in PD patients and in the database controls as benign or damaging; however, the pathogenicity of a predicted deleterious variant should be confirmed experimentally. The use of data from public databases rather than ethnically, age-, and sex-matched controls may also have contributed to negative findings. In addition, incomplete penetrance and late disease onset make it difficult to detect associations between rare variants and the disease in case-control studies. Finally, the methods of variant detection and the age of onset varied among the different studies and public databases, which may also have introduced bias.

Overall, we identified 6 rare likely pathogenic variants in PSAP, whose carriers accounted for 0.75% of the Chinese ADPD patients and 1.33% of the Chinese SPD patients. Except for SapC, burden analyses of rare variants indicated a lack of convincing evidence for genetic associations of SapD and PSAP with PD in the Chinese population. Association analyses of common risk variants identified that rs4747203, in the PSAP gene, was associated with a reduced risk of developing PD. However, further studies in different populations and functional analyses of rs4747203 in PSAP are required to confirm these findings.