Introduction

Palmoplantar keratoderma (PPK) belongs to a group of common skin disorders characterized by excessive thickening of the epidermis over the palms and soles (Hennies et al. 1995). The disorder can be classified into hereditary and acquired forms (Lucker et al. 1994), and multiple family-based studies have revealed that genetic factors increase susceptibility to hereditary PPK. Hyperkeratosis may be isolated (the sole dominant clinical feature) or associated with other ectodermal abnormalities or extracutaneous manifestations.

Striate PPK (PPKS) is a form in which hyperkeratotic lesions are restricted to pressure regions and extend longitudinally along each finger to the palm. Mutations in three genes, desmoglein 1 (DSG1), desmoplakin (DSP) and keratin 1 (KRT1), have been shown to increase susceptibility to PPKS (Armstrong et al. 1999; Rickman et al. 1999; Whittock et al. 2002).

In the punctate type of PPK (PPPK), numerous hyperkeratotic papules are distributed irregularly on the palms and soles. Mutations in the alpha and gamma adaptin binding protein (AAGAB) gene and the collagen 14 α1 (COL14A1) gene have been reported as genetic causes of punctate type 1 and type 2 PPK, respectively (Giehl et al. 2012; Guo et al. 2012). All of the mutations mentioned above co-segregated with the disorders in a dominant manner.

The prevalence of PPK may be underestimated because mildly affected individuals may not seek specialized medical care or may be diagnosed incorrectly (Schiller et al. 2014). Isolated PPK patients (n = 36) have been observed without associated anomalies of the skin or appendages, including ichthyosis, ectodermal dysplasia and epidermolysis bullosa (Has and Technau-Hafsi 2016).

Recently, whole-exome sequencing (WES) has emerged as a promising approach for identifying the genetic causes of hereditary PPK.

In this study, we performed whole-exome sequencing in a consanguineous Pakistani family presenting clinical features of PPKS in order to pinpoint the causative mutation.

Materials and methods

Study subjects

A consanguineous family affected with PPKS (Fig. 1) was ascertained from a remote village in Khyber Pakhtunkhwa province, Pakistan. The disease was assumed to segregate in an autosomal dominant manner in this family. Informed consent was obtained from the family elders or the legal guardians of affected individuals under the age of 18. The study was approved by the institutional review board for medical genetics research and ethics, King Abdulaziz University, Jeddah, Saudi Arabia under project.

Fig. 1

Pedigree of the Pakistani family affected by striate palmoplantar keratoderma (PPKS). Black-filled symbols denote affected individuals, whose symptoms are present over their palms and soles. In this family, PPKS is inherited in a dominant mode. Blood samples from the individuals marked with an asterisk were available for whole-exome sequencing and Sanger sequencing. Each normal individual was homozygous for the normal C allele (serine) at nucleotide position 392 of COL20A1; in contrast, all affected individuals were heterozygous for the G allele, which replaces serine with cysteine at amino acid position 131 of the COL20A1 protein

Whole-exome and Sanger sequencing

Peripheral blood (10 ml) was drawn from five affected and eight unaffected family members after informed consent was obtained, and genomic DNA (gDNA) was isolated using the QIAamp DNA Blood Maxi kit (Qiagen, Hilden, Germany). gDNA concentrations were measured with the PicoGreen™ assay according to the manufacturer’s instructions (Promega, Madison, WI, USA). Whole exomes of five individuals were captured using the SureSelect V5 kit (Agilent Technologies, Santa Clara, CA, USA) and sequenced as 100-bp paired-end reads on an Illumina HiSeq 2000 machine (Illumina, San Diego, CA, USA). The genotypes of candidate variants were confirmed in all family members by Sanger sequencing.

Bioinformatic analysis

To screen for candidate variants likely to cause hyperkeratosis in the family, we used several bioinformatic tools and applied a series of criteria to filter out variants less likely to be related to the disease. Using SeqMan NGen (Lasergene Genomic Suite v.12, DNASTAR, Madison, WI, USA), we aligned the sequenced reads in FASTQ format to hg19 (GRCh37, NCBI). Arraystar v.12 (Rockville, MD, USA) was used to annotate normal and variant alleles based on dbSNP 142 (UCSC).

The annotated alleles were then filtered to identify candidate mutations causing hyperkeratosis, according to the following criteria. (1) We hypothesized that striate palmoplantar keratoderma is inherited in an autosomal dominant manner in this family; that is, affected individuals (IV-4, -6, -9, V-9 and -10) are heterozygous for the minor allele, and unaffected family members (IV-8, -10, V-5, -6, -8, -11 and -13) are homozygous for the major allele. (2) Variants with a minor allele frequency > 0.003 in the ExAC browser (http://exac.broadinstitute.org) were excluded. (3) Variants found in the heterozygous state in our in-house exome sequence data from 45 normal unrelated Pakistani individuals were also excluded. (4) Synonymous and deep intronic variants, other than those at splice junctions, were excluded. The remaining candidate variants were further analyzed in silico to predict their pathogenic effect using the PolyPhen-2, PROVEAN, and MutationTaster software (Adzhubei et al. 2010; Choi et al. 2012; Schwarz et al. 2014).
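The four filtering criteria can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study (which relied on SeqMan NGen and Arraystar): the `Variant` record, its field names, the genotype encoding, and the toy variants are all hypothetical. Only the sample IDs and the 0.003 frequency cutoff come from the text.

```python
from dataclasses import dataclass, field

# Sample IDs taken from criterion (1) in the text.
AFFECTED = {"IV-4", "IV-6", "IV-9", "V-9", "V-10"}
UNAFFECTED = {"IV-8", "IV-10", "V-5", "V-6", "V-8", "V-11", "V-13"}

MAX_MAF = 0.003  # criterion (2): ExAC minor allele frequency cutoff
EXCLUDED_CONSEQUENCES = {"synonymous", "deep_intronic"}  # criterion (4)


@dataclass
class Variant:
    """Hypothetical variant record; the field names are assumptions."""
    name: str
    consequence: str      # e.g. "missense", "synonymous", "splice_site"
    exac_maf: float       # 0.0 if the variant is absent from ExAC
    in_house_het: bool    # heterozygous in any in-house control exome
    genotypes: dict = field(default_factory=dict)  # sample ID -> "ref" / "het" / "hom"


def segregates_dominantly(v: Variant) -> bool:
    """Criterion (1): genotyped affected are het, genotyped unaffected are ref/ref."""
    for sample, genotype in v.genotypes.items():
        if sample in AFFECTED and genotype != "het":
            return False
        if sample in UNAFFECTED and genotype != "ref":
            return False
    return True


def passes_filters(v: Variant) -> bool:
    """Apply criteria (1) to (4) to a single variant."""
    return (
        segregates_dominantly(v)
        and v.exac_maf <= MAX_MAF
        and not v.in_house_het
        and v.consequence not in EXCLUDED_CONSEQUENCES
    )


# Toy genotypes for the five exome-sequenced individuals (invented data).
segregating = {"IV-6": "het", "V-9": "het", "IV-8": "ref", "V-5": "ref", "V-8": "ref"}
carrier = dict(segregating, **{"IV-8": "het"})  # an unaffected carrier breaks criterion (1)

toy_variants = [
    Variant("toy_missense", "missense", 0.0, False, segregating),
    Variant("toy_common", "missense", 0.02, False, segregating),
    Variant("toy_in_house", "missense", 0.0, True, segregating),
    Variant("toy_synonymous", "synonymous", 0.0, False, segregating),
    Variant("toy_carrier", "missense", 0.0, False, carrier),
]

kept = [v.name for v in toy_variants if passes_filters(v)]
print(kept)  # ['toy_missense']
```

Each toy variant other than `toy_missense` fails exactly one criterion, which makes it easy to see which rule removes which class of variant.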

Three-dimensional protein modeling

No X-ray crystal structure or suitable template was available for the N-terminal region of the protein where the mutation of interest occurred; therefore, we generated ab initio three-dimensional protein models using the online Robetta server (Bystroff et al. 2000) and Quark (Xu and Zhang 2011). Models were predicted only for residues 23–165 (excluding the signal peptide, residues 1–22) because of server limitations. The predicted models were assessed using Verify3D (Bowie et al. 1991), PROCHECK, WHAT_CHECK (Hooft et al. 1996), Errat (Colovos and Yeates 1993), and Prove (Pontius et al. 1996). The Ramachandran plot (Ramachandran et al. 1963) was analyzed to find residues in forbidden regions. The selected model was refined using ModRefiner (Xu and Zhang 2011). The effect of the serine-to-cysteine substitution at position 131 was assessed in the predicted wild-type and mutant models using PDB_Hydro (Azuara et al. 2006). Structural differences were assessed using BIOVIA Discovery Studio Visualizer.

Results

Clinical characteristics of subjects

Affected individuals in the Pakistani family were clinically diagnosed with the striate type of PPK. Both males and females were affected. All affected individuals had mild epidermal thickening over the pressure areas of the palms and fingers (Fig. 2a, b); however, the soles were not affected. One affected individual (V-10) had intellectual disability, global developmental delay and language impairment, and his brain magnetic resonance imaging showed delayed myelination; these neurological features were not found in the other PPK patients of the family.

Fig. 2

Clinical presentation of patients showing keratoderma over the palms. Note specifically the yellowish appearance at the pressure areas and fingers in patients IV-6 (a) and V-9 (b). c Sanger sequencing results of the normal individual (IV-10) and the PPKS patient (V-10), who is a son of the former. At the mutant locus (COL20A1 c.392C > G), all normal individuals, including IV-10, are homozygous for the C allele, whereas the affected individuals are heterozygous for the C and G alleles. (Color figure online)

Whole-exome sequencing and bioinformatic analysis

Of the 13 genomic DNA samples available, whole exomes of five (affected individuals IV-6 and V-9; unaffected individuals IV-8, V-5 and V-8) were sequenced. On average, the total number of bases in the reads was 9.2 Gb, 96.73% of bases had a Phred score above 20, and the mean coverage of the target regions was 100.
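For reference, a Phred quality score Q relates to the estimated base-calling error probability P by Q = -10 log10(P), so a score of 20 corresponds to an error probability of 1% per base. The short Python sketch below illustrates the conversion; the function names are ours and do not belong to any tool used in the study.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the estimated error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability back to a Phred quality score."""
    return -10 * math.log10(p)


q20_error = phred_to_error_prob(20)  # about 0.01, i.e. one expected error in 100 calls
q30_error = phred_to_error_prob(30)  # about 0.001
```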

Including synonymous as well as non-synonymous variations, the five individuals carried 46,138 exonic variants that met the first quality control criteria, which required a minimum Q call above 20 and a minimum depth of coverage above ten. Of these, 4137 variants were heterozygous, and 336 remained after excluding the 3801 variants found in 45 unrelated in-house Pakistani controls. After further exclusion of variants with minor allele frequencies higher than 0.003 and variants that do not change the amino acid sequence, 20 missense variants remained (Table 1). To narrow these down to the causative mutation, we genotyped all 20 variants in the eight other family members (III-3, IV-4, -9, -10, V-6, -10, -11, -13) by Sanger sequencing and found that all 19 variants other than COL20A1 p.Ser131Cys were carried by at least one normal individual in the family. We therefore identified the COL20A1 variant replacing serine with cysteine at amino acid position 131 (NM_020882.2, c.392C > G) on chromosome 20 as the candidate mutation. In addition, this variant was not found by Sanger sequencing in 219 healthy Pakistani individuals. This gene encodes the collagen type XX alpha 1 chain; however, no human disease has so far been associated with disruption of the normal function of this protein.
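The correspondence between coding position c.392 and amino acid 131 follows from simple codon arithmetic, sketched below in Python. The helper name `codon_of` is hypothetical. The reference codon is not stated in the text, so the enumeration of serine codons is illustrative only: using the standard genetic code, it shows which serine codons would yield cysteine after a C > G change at the mapped position.

```python
from typing import Dict, Tuple


def codon_of(cds_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding position to (codon number, position in codon), both 1-based."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


# Subset of the standard genetic code (DNA alphabet) needed for this example.
CODON_TABLE: Dict[str, str] = {
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "AGT": "Ser", "AGC": "Ser",
    "TGT": "Cys", "TGC": "Cys", "TGA": "Stop", "TGG": "Trp",
}

codon_number, offset = codon_of(392)  # (131, 2): second base of codon 131

# For every serine codon carrying C at that offset, apply C > G and translate.
outcomes = {}
for ref_codon, amino_acid in CODON_TABLE.items():
    if amino_acid == "Ser" and ref_codon[offset - 1] == "C":
        alt_codon = ref_codon[: offset - 1] + "G" + ref_codon[offset:]
        outcomes[ref_codon] = (alt_codon, CODON_TABLE[alt_codon])

# Reference codons compatible with the reported Ser-to-Cys change.
ser_to_cys = sorted(ref for ref, (_, aa) in outcomes.items() if aa == "Cys")
print(codon_number, offset, ser_to_cys)  # 131 2 ['TCC', 'TCT']
```

Under this arithmetic, c.392 is the second base of codon 131, and only a TCT or TCC reference codon is consistent with a serine-to-cysteine substitution; the other two serine codons with C at that position would give a stop codon (TCA) or tryptophan (TCG).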

Table 1 The list of candidate variations causing hyperkeratosis in the Pakistani family

In silico pathogenicity analysis showed that the COL20A1 p.Ser131Cys mutation was predicted to be “disease causing” by PolyPhen-2 and PROVEAN, whereas MutationTaster predicted it to be a polymorphism.

Three-dimensional structural modeling of the normal and mutated COL20A1 proteins showed that the two forms have almost identical topology apart from some minor changes. Both wild-type and mutant COL20A1 consist of seven beta strands in the N-terminal region, two small alpha helices and loop regions. The mutated amino acid at position 131 resides in a loop region. The mutation altered the distribution of surface atoms, as shown in the surface view of the models (Fig. 3).

Fig. 3

a Schematic diagram of the COL20A1 protein and the position of the p.Ser131Cys mutation. The domain information was obtained from ExPASy database (http://www.uniprot.org/uniprot/Q9P218). b–e The 3D models of COL20A1 (residues 23–165), with surfaces colored by atom (carbon: gray, oxygen: red, nitrogen: blue, sulfur: yellow): normal protein with Ser131 (b and d) and mutated protein with Cys131 (c and e). Residue 131 is shown in CPK representation. f Mutated model (blue) superimposed on the normal model (red). g Blue: serine (side-chain oxygen in purple); red: cysteine (side-chain sulfur in yellow). (Color figure online)

Discussion

Clinically, PPK can be classified into several subgroups, and 30 genes are associated with the term “palmoplantar keratoderma” in NCBI (https://www.ncbi.nlm.nih.gov/gene). Recently, various clinical subtypes of PPK have been assigned to their causative genes based on the functions of the encoded proteins: structural proteins (keratins), cornified envelope (loricrin and transglutaminase), cell-to-cell adhesion (plakophilin, desmoplakin, desmoglein 1), cell-to-cell communication (connexins) and transmembrane signal transduction (cathepsin C) (Sakiyama and Kubo 2016; Stypczynska et al. 2016). Here we propose COL20A1 p.Ser131Cys, identified through WES analysis, as a novel candidate mutation for the PPKS phenotype. Previously, its closest paralog, the collagen 14α1 gene (COL14A1, MIM 120324), was reported as the only collagen gene associated with PPK, in a Chinese family (Guo et al. 2012).

COL20A1 spans 41,665 base pairs on chromosome 20q13.33 and encodes a 135.84-kDa protein of 1284 amino acids. According to the UniProtKB database, COL20A1 has a signal peptide (1–22 aa) and a collagen alpha chain (23–1284 aa) with six fibronectin type III domains (1–6), two collagen-like domains (1 and 2) and a laminin G-like domain (http://www.uniprot.org/uniprot/Q9P218). The p.Ser131Cys mutation resides in the loop region after the fibronectin type III-1 domain (28–119 aa) (Fig. 3).

All our patients showed a similar phenotype. Incomplete or age-dependent penetrance of dominant mutations has previously been reported in families with punctate palmoplantar keratoderma (Guo et al. 2012; Martinez-Mir et al. 2003); however, we did not observe this phenomenon in our patients. Although one patient had additional neurological findings, these were not observed in the other affected individuals in our family, and thus the neurological phenotype was not considered to result from the COL20A1 p.Ser131Cys mutation.

The COL14A1 gene, the closest paralog of COL20A1, is the only collagen gene known to cause a PPK phenotype in an autosomal dominant fashion (Guo et al. 2012). Mutant COL14A1 was suggested to alter normal keratinocyte proliferation, which could lead to common features such as hyperkeratosis in PPK patients (Guo et al. 2012). COL14A1 is a member of the fibril-associated collagen with an interrupted triple helix (FACIT) superfamily; it interacts with the fibril surface and regulates fibrillogenesis (Ansorge et al. 2009). FACITs are a subgroup within the collagen family containing types IX, XII, and XIV collagens (Shaw and Olsen 1991); based on sequence homology, collagen XX has also been included in this subgroup (Koch et al. 2001). The predicted three-dimensional structure of mutant COL20A1 showed a different atom distribution on the surface. The serine at position 131 resides at the surface of the protein, and the Ser-to-Cys alteration might interfere with interactions between the monomers of collagen fibers. The oxygen in the serine residue can form a hydrogen bond with a neighboring collagen chain, which can stabilize the structure; cysteine instead forms disulfide bridges. We suspect that the mutant cysteine residue may affect the interaction with another collagen monomer, as cysteines produce knots in collagens (Barth et al. 2003; Boulegue et al. 2008). Furthermore, the hydrophobic nature of cysteine (Nagano et al. 1999), as compared with hydrophilic serine, can affect protein structure through interactions with solvent water molecules.

Our genetic and in silico analyses suggest that COL20A1 p.Ser131Cys is the genetic mutation underlying striate PPK in this consanguineous Pakistani family.