Introduction

Transporters are membrane-embedded proteins that control the uptake of different solutes [1]. They are divided into solute carrier (SLC) and ATP-binding cassette (ABC) transporters [2]. The SLC transporters control the uptake of endogenous compounds essential for cell survival, including sugars, amino acids, digested peptides, nucleotides, and inorganic ions [3,4,5].

The solute carrier (SLC) transporter superfamily comprises 55 families encoded by at least 362 putatively functional protein-coding genes [6]. Because SLC transporters have key physiological roles, defects in specific SLC transporters can cause many Mendelian (monogenic) disorders [7]. More than 80 SLC genes have been implicated in monogenic disorders. For example, mutations in SLC25A19 and SLC5A2 lead to Amish lethal microcephaly and familial renal glucosuria, respectively [8,9,10]. Cystinuria is another disease caused by pathogenic variants in the SLC3A1 or SLC7A9 genes [11, 12].

Here, we focus on Cystinuria, an inherited autosomal recessive disorder of renal reabsorption of cystine, arginine, lysine, and ornithine [13]. The protein products of SLC3A1 (rBAT) and SLC7A9 (b0,+AT) form the heterodimeric amino acid transporter system b0,+, which is responsible for the uptake of cystine and dibasic amino acids in renal tubular and intestinal epithelial cells [14, 15]. In Cystinuria, mutations in these two genes result in increased urinary excretion of cystine and, ultimately, the formation of kidney stones [16]. Patients with two SLC3A1 mutations are classified as type I Cystinuria, whereas patients with two SLC7A9 mutations are classified as non-type I Cystinuria [17, 18]. Over 100 SLC3A1 mutations have been recognized, and all except one (dupE5–E9) were limited to patients with type I Cystinuria [17, 19, 20]. At least 66 SLC7A9 mutations have been identified, and these mutations were found in both type I and non-type I patients [21]. In our previous studies of Iranian patients with Cystinuria, we identified four missense mutations, one intron variant and one polymorphism in SLC3A1, as well as three missense mutations, one frameshift mutation, four intron variants and three polymorphisms in SLC7A9 [22,23,24,25,26].

Bioinformatics prediction tools can be applied in a cost-efficient manner to estimate the effects of specific mutations on protein structure and function, and thereby to select SNPs that are likely to contribute to an individual’s disease susceptibility. Recently, several computational methods have been developed to screen functional SNPs out of large pools of disease-associated SNPs related to the BRCA1, ATM, PON1, ADIPOR1 and SLC genes [27,28,29].

In the current study, we used different software packages and publicly available bioinformatics tools to comprehensively analyze the mutations identified in the SLC3A1 and SLC7A9 genes of Iranian patients with Cystinuria in our previous studies [22,23,24,25,26]. Since missense mutations of these genes are associated with more abnormalities, we aimed to study the effect of the mutations on protein stability. Moreover, the pathogenic effects of the intron variants were predicted using bioinformatics tools. Subsequently, the 3D modeled structures of the mutant proteins were compared with the native protein to evaluate structural deviations and topological similarities.

Methods

Bioinformatic pathogenicity predictions

The degree of pathogenicity of the missense mutations identified in the SLC3A1 and SLC7A9 genes was predicted using MutationTaster (http://www.mutationtaster.org/) [30], Polymorphism Phenotyping v2 (PolyPhen-2) (http://genetics.bwh.harvard.edu/pph2) [31], Protein Analysis Through Evolutionary Relationships (PANTHER) (http://www.pantherdb.org) [32], and Functional Analysis Through Hidden Markov Models (FATHMM) (http://fathmm.biocompute.org.uk/index.html) [33]. The Predictor of Human Deleterious Single Nucleotide Polymorphisms (PhD-SNP) (http://snps.biofold.org/phd-snp/phd-snp.html) [34] and MutPred (http://mutpred.mutdb.org) [35] were applied to estimate the functional effects of the mutations.

3D structure preparation

The 3D structures of the wild-type and mutant SLC3A1 and SLC7A9 proteins were prepared by homology modeling on the SWISS-MODEL web server (https://swissmodel.expasy.org/) [36,37,38,39] and were used for structural analysis.

Exploration of residue interaction networks

Cytoscape with two plugins, StructureViz [40] and RINalyzer [41], was used to analyze the residue interaction networks of the wild-type and mutated structures [42].

Sequence alignment

Sequence alignment and visualization of conserved amino acids were performed using the COBALT constraint-based multiple protein alignment tool (https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi) [43] and the Universal Protein Resource (UniProt) (http://www.uniprot.org/align/) [44] with default parameters.

Intron variant analysis

To evaluate in silico the possible effects of the identified intron variants on gene splicing, the Human Splicing Finder software (http://www.umd.be/HSF/, Marseille, France) was used [45]. With this tool, intron sequences were analyzed for putative branch points, and the breakage of exonic splicing enhancers (ESE) or the creation of exonic splicing silencers (ESS) was calculated.
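The idea of a variant breaking or creating a splicing motif can be illustrated with a minimal sketch. It is not the Human Splicing Finder algorithm, which relies on its own scoring matrices; here motifs are plain strings, and the hexamers and sequences below are hypothetical placeholders chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy motifs (assumption): real tools score motifs with matrices.
TOY_ESE = {"GAAGAA"}
TOY_ESS = {"TTAGGG"}


def count_motifs(seq, motifs):
    """Count (possibly overlapping) occurrences of each motif in seq."""
    counts = Counter()
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            counts[motif] += 1
            start = seq.find(motif, start + 1)
    return counts


def motif_changes(wild_type, mutant, motifs):
    """Return motifs whose count drops (broken) or rises (created) in the mutant."""
    wt, mt = count_motifs(wild_type, motifs), count_motifs(mutant, motifs)
    broken = sorted(m for m in motifs if mt[m] < wt[m])
    created = sorted(m for m in motifs if mt[m] > wt[m])
    return broken, created


# Hypothetical sequences (assumption): a single A>T substitution at index 6.
wt_seq = "ACGAAGAACT"
mut_seq = "ACGAAGTACT"
broken, created = motif_changes(wt_seq, mut_seq, TOY_ESE | TOY_ESS)
```

In this toy case the substitution removes the only enhancer-like hexamer and creates nothing, which corresponds to the "breakage" category described above.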

Results and discussion

In our previous studies [22,23,24,25,26], several variants were identified in the SLC3A1 and SLC7A9 genes, including missense mutations, polymorphisms, and intron variants, which are summarized in Table 1. In total, six and eleven novel mutations were identified in the SLC3A1 and SLC7A9 genes, respectively. Wass et al. found 57 different mutations in a UK population [46]. Similarly, they used computational methods to discover the functional and structural consequences of the nsSNPs [29]. In the current work, the novel variants that have not been reported so far were c.1136+2/3delT in SLC3A1 and c.177G/A, c.478+14insA, c.272−273insA, c.478+10T/C, c.604+66A/G and c.993G/A in SLC7A9. As shown in Table 1, the c.177G>A, c.411T>C and c.993G>A mutations in SLC7A9 and the c.114A/C mutation in SLC3A1 are synonymous polymorphisms at residues Thr 59, Cys 137, Ala 331 and Gly 38, respectively. The c.235+22T/G, c.478+14insA, c.478+10T>C and c.604+66C>G mutations in SLC7A9 and the c.1136+2/3delT mutation in SLC3A1 are located in intronic or untranslated/non-coding regions.

Table 1 Description of mutations identified in the SLC3A1 and the SLC7A9 genes

Intron variant mutations

Intron variant analysis indicated that only the c.478+10T/C mutation in SLC7A9 neither created nor changed a significant splicing motif, and it probably has no impact on splicing (Table 2). Furthermore, the c.235+22T/G and c.478+14insA intron variants in SLC7A9 were not predicted to affect splicing, although they created or changed ESS and ESE motif sites in the intronic regions (Int3 for c.235+22T/G and Int4 for c.478+14insA). However, the c.1136+2/3delT variant in SLC3A1 and the c.604+66C>G variant in SLC7A9 most probably affect splicing, through alteration of the WT donor site and of ESE sites, respectively (Table 2). These molecular events probably give rise to alternative splicing.

Table 2 Intron variant analysis for mutations identified in SLC3A1 and SLC7A9 genes

Missense mutations

Only the missense mutations change the amino acid sequence of the SLC transporter proteins. The pathogenic effects of these missense mutations on the SLC3A1 and SLC7A9 proteins were predicted using six bioinformatics programs that use different prediction algorithms: PolyPhen-2, PANTHER, FATHMM, PhD-SNP, MutPred and MutationTaster (Tables 3, 4). All of these programs predicted the variants p.R362C, p.M467K/T and p.T216M in SLC3A1 and p.G105R and p.R333W in SLC7A9 to be damaging/deleterious/disease causing. In contrast, the p.V142A variant in SLC7A9 was predicted to be benign/neutral polymorphism/polymorphism.

Table 3 The protein prediction analysis for missense mutations identified in the SLC3A1 gene
Table 4 The protein prediction analysis for missense and frameshift mutations identified in the SLC7A9 gene
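Because each program reports its verdict in its own vocabulary, agreement between tools is easier to judge after mapping the labels to shared classes. The following is a minimal sketch of such a tally, not the procedure used in this study; the label sets and the per-tool labels in the example are illustrative assumptions, since the individual outputs are given only in Tables 3 and 4.

```python
from collections import Counter

# Assumed label vocabularies; each tool's real output wording may differ.
DAMAGING = {"damaging", "probably damaging", "possibly damaging",
            "deleterious", "disease", "disease causing"}
BENIGN = {"benign", "neutral", "neutral polymorphism", "polymorphism"}


def normalise(label):
    """Map a tool-specific label to 'damaging', 'benign' or 'unknown'."""
    key = label.strip().lower()
    if key in DAMAGING:
        return "damaging"
    if key in BENIGN:
        return "benign"
    return "unknown"


def consensus(predictions):
    """Return the most frequent class and the fraction of tools supporting it.

    Ties are resolved in favour of the class encountered first.
    """
    if not predictions:
        raise ValueError("no predictions supplied")
    votes = Counter(normalise(label) for label in predictions.values())
    top_class, count = votes.most_common(1)[0]
    return top_class, count / len(predictions)


# Illustrative labels for one variant (assumption, not copied from the tables).
example = {
    "PolyPhen-2": "Probably damaging",
    "PANTHER": "Deleterious",
    "FATHMM": "Damaging",
    "PhD-SNP": "Disease",
    "MutPred": "Deleterious",
    "MutationTaster": "Disease causing",
}
call, support = consensus(example)
```

A support value of 1.0 corresponds to the unanimous agreement reported for the deleterious variants, while lower values would flag variants on which the tools disagree.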

The residue interaction analysis for these mutations is presented in Figs. 1 and 2. In the SLC7A9 protein, Gly 105 in the wild type had contacts with Met 101, Ile 107, Pro 108, and Ala 109, whereas Arg 105 in the mutant was connected to these residues as well as to Tyr 104. Trp 333 in the mutant protein lost its contacts with Ser 342 and Ala 331, whereas Arg 333 in the wild type was connected to them as well as to Tyr 329, Gly 325 and Val 330. Moreover, Val 142, which was connected to Cys 144, Lys 145 and Ala 142, had contacts in the mutant type with the same residues plus Cys 137. In the SLC3A1 protein, residue 362 in the mutant type had a residue interaction network similar to that of the wild type, with additional connections to Glu 404 and Gln 403. In the M467T mutation of the same protein, the mutant type lost its connections to Gly 645 and Asp 628 and gained a connection to Leu 597. Likewise, in the M467K missense mutation, the mutated protein lost its contacts with Leu 468, Leu 555, Gly 645 and Asp 628 and extended its network to other residues such as Asn 466, Leu 469 and Phe 470. In the last mutation, T216M, the mutant SLC3A1 protein had contacts with the same residues as the wild type and also with Leu 285, His 215 and Phe 280.
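Comparisons of this kind amount to set differences between the contact lists of the wild-type and mutant residue. As a minimal sketch, assuming the contacts have already been exported from the network analysis as residue labels (the function name is hypothetical), the p.G105R case described above can be expressed as follows.

```python
def contact_changes(wild_type, mutant):
    """Classify contacts as lost, gained or retained between two residue sets."""
    wt, mt = set(wild_type), set(mutant)
    return {
        "lost": sorted(wt - mt),
        "gained": sorted(mt - wt),
        "retained": sorted(wt & mt),
    }


# Contacts of residue 105 in SLC7A9 as reported in the text.
gly105_contacts = {"Met101", "Ile107", "Pro108", "Ala109"}
arg105_contacts = gly105_contacts | {"Tyr104"}

changes = contact_changes(gly105_contacts, arg105_contacts)
```

Here the only change is the gained contact with Tyr 104, and all four wild-type contacts are retained.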

Fig. 1

The residue interaction analysis for a p.G105R, b p.R333W and c p.V142A missense mutations in SLC7A9 protein

Fig. 2

The residue interaction analysis for a p.R362C, b, c p.M467T/K and d p.T216M missense mutations in SLC3A1 protein

Point mutations

In the point mutation c.272−273insA, the SLC7A9 transcript is predicted to be affected by nonsense-mediated mRNA decay (NMD). The length of the SLC7A9 protein sequence decreased from 478 amino acids to 120. In this mutation, the insertion of an “A” between nucleotides C272 and C273 leads to a frameshift and changes Lys 92 to Gln in the protein structure. The residue interaction analysis indicated that the residues interacting with position 92 also differ between the mutant and wild-type proteins (Fig. 3). Lys 92 of the SLC7A9 protein interacts with seven amino acids in the same chain: Ile 90, Ser 93, Gly 94, Gly 95, Pro 98, Glu 102 and Thr 242. In contrast, Gln 92 had contact with only four residues of the protein: Ile 90, Arg 94, Gly 95 and Ser 98. Therefore, the major changes in protein length and contact network are very likely to be pathogenic. MutationTaster classified the c.272−273insA mutation as “disease causing”.
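The way a single-nucleotide insertion shifts the reading frame and truncates the protein can be reproduced with a short sketch. The coding sequence below is a hypothetical toy example, not the SLC7A9 sequence, and the helper names are assumptions introduced for illustration; only the standard genetic code is taken as given.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence up to, but not including, the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(cds, after_position, base):
    """Insert one base after the given 1-based coding position."""
    return cds[:after_position] + base + cds[after_position:]


# Hypothetical toy coding sequence (assumption): ATG AAG GCT GAA CTG TAA.
wild_type_cds = "ATGAAGGCTGAACTGTAA"
mutant_cds = insert_base(wild_type_cds, 3, "A")

wild_type_protein = translate(wild_type_cds)
mutant_protein = translate(mutant_cds)
```

In this toy case the insertion changes the third residue and exposes a premature stop codon, so the translated product shrinks from five residues to three, analogous in kind to the truncation described above.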

Fig. 3

The residue 92 interaction network in wild-type and mutated SLC7A9 protein structures. a Lys 92 has contacts with Ile 90, Ser 93, Gly 94, Gly 95, Pro 98, Glu 102 and Thr 242. b Gln 92 is in connection with Ile 90, Arg 94, Gly 95 and Ser 98. Residues 93–120 are colored blue to show the frameshift mutation. (Color figure online)

Sequence alignment

The multiple sequence alignment obtained with the COBALT constraint-based multiple protein alignment tool indicated that Arg 362, Met 467 and Thr 216 in the SLC3A1 protein and Gly 105 and Arg 333 in the SLC7A9 protein are located in highly conserved regions, whereas Val 142 in the SLC7A9 protein is not conserved (Fig. 4). The other alignments are not shown. The substitution of conserved residues, which mainly contribute to protein structure and function, supports the deleterious effect of the mentioned mutations in the SLC3A1 and SLC7A9 genes and also the benign effect of p.V142A in SLC7A9, as predicted by the different bioinformatics programs.
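Conservation at a given alignment position can be quantified, in its simplest form, as the fraction of sequences that share the most common residue in that column. The sketch below illustrates this measure under stated assumptions: the four aligned sequences are hypothetical, gaps are written as "-", and the score is not the one computed by COBALT or UniProt.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, per column, the fraction of sequences sharing the top residue.

    Gap characters ('-') never count as the top residue but stay in the
    denominator, so gapped columns score lower.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical toy alignment (assumption): column 3 is variable, the rest fixed.
toy_alignment = ["MGVA", "MGIA", "MGLA", "MG-A"]
scores = column_conservation(toy_alignment)
```

Columns scoring 1.0 correspond to fully conserved positions, whereas the variable third column scores 0.25, the pattern expected for a tolerated substitution site.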

Fig. 4

The multiple sequence alignment for SLC7A9 protein (residues from 105 to 164). The mutation positions V142 and G105 are marked with pink boxes. (Color figure online)

Conclusion

The present study suggests that various computational tools are able to distinguish disease-causing mutations from benign polymorphisms. Four deleterious mutations (R362C, T216M, M467K/T) in the coding region of SLC3A1 were identified. Only the missense mutation V142A had a benign effect on the protein structure and function of SLC7A9. The intron variants c.604+66C>G in SLC7A9 and c.1136+2/3delT in SLC3A1 probably affect the splicing process. Overall, the present computational study provides insight into the genetic association of some novel deleterious mutations in the SLC3A1 and SLC7A9 genes with Cystinuria.