Introduction

Autosomal recessive primary microcephaly (MCPH) is a hereditary neurodevelopmental disorder characterized by a marked reduction in occipitofrontal head circumference (OFC) and non-progressive intellectual disability. It occurs either in isolation or in association with additional clinical features, including short stature, mild seizures or skeletal abnormalities. MCPH shows clinical and genetic overlap with related disorders such as Seckel syndrome, Meier–Gorlin syndrome and microcephalic osteodysplastic syndrome.

To date, 16 genetic loci and corresponding genes have been implicated in isolated MCPH (Faheem et al. 2015). These include MCPH1 at MCPH1, WDR62 at MCPH2, CDK5RAP2 at MCPH3, CEP152 at MCPH4, ASPM at MCPH5, CENPJ at MCPH6, STIL at MCPH7, CEP135 at MCPH8, CASC5 at MCPH9, ZNF335 at MCPH10, PHC1 at MCPH11, CDK6 at MCPH12, CENPE at MCPH13, SASS6 at MCPH14, MFSD2A at MCPH15 and ANKLE2 at MCPH16 (Jackson et al. 2002; Guernsey et al. 2010; Bond et al. 2002, 2005; Kumar et al. 2009; Bilgüvar et al. 2010; Yu et al. 2010; Nicholas et al. 2010; Hussain et al. 2012, 2013; Genin et al. 2012; Yang et al. 2012; Awad et al. 2013; Mirzaa et al. 2014; Khan et al. 2014; Yamamoto et al. 2014; Alakbarzade et al. 2015; Guemez-Gamboa et al. 2015). Almost all MCPH genes are expressed predominantly in neuroprogenitor cells during early brain development. Mutations in the MCPH genes lead to a reduction in neuroprogenitor and neuronal cells, which subsequently results in small brain size (Kaindl et al. 2010). MCPH proteins have also been found to be associated with the centrosome (Bond et al. 2005; Nicholas et al. 2010; Zhong et al. 2006; Kumar et al. 2009; Hussain et al. 2012, 2013; Guernsey et al. 2010; Kumar 2009; Sir et al. 2011), suggesting that cell cycle regulation plays a significant role in proper neurogenesis.

Here, we report the first mutation in the citron kinase (CIT) gene resulting in an MCPH phenotype. The splice donor site mutation detected in the family resulted in retention of intronic sequence in the patients' mRNA.

Materials and methods

Family recruitment and DNA isolation

Seven individuals, including four affected (V:1, V:2, V:4, V:5) and three unaffected (IV:1, IV:2, V:3), from a five-generation family of Saudi origin (Fig. 4) manifesting autosomal recessive primary microcephaly were investigated in the present study. Affected individuals in the family were clinically examined by a pediatric neurologist at Madinah Maternity and Children Hospital, Saudi Arabia. Permission to undertake the study was obtained from the Research Ethics Committee (REC) of Taibah University, Almadinah Almunawwarah, Saudi Arabia. Blood samples from the seven individuals were obtained for genetic evaluation after informed consent had been given. Genomic DNA was isolated using the QIAquick DNA extraction kit. DNA was quantified with a NanoDrop spectrophotometer (Maestrogen, 8275 South Eastern, Las Vegas, USA) and a Qubit fluorometer (Thermo Fisher Scientific, 81 Wyman Street, Waltham, MA USA 02451).

Sanger sequencing of known MCPH genes

The family members were first tested for mutations in the genes previously reported to be associated with MCPH. The genes sequenced included microcephalin 1 (MCPH1) at chromosome 8p23.1, WD repeat domain 62 (WDR62) at chromosome 19q13.12, CDK5 regulatory subunit-associated protein 2 (CDK5RAP2) at chromosome 9q33.2, cancer susceptibility candidate 5 (CASC5) at chromosome 15q15.1, abnormal spindle microcephaly associated (ASPM) at chromosome 1q31.3, centromere protein J (CENPJ) at chromosome 13q12.12, SCL/TAL1 interrupting locus (STIL) at chromosome 1p33, centrosomal protein 135 kDa (CEP135) at chromosome 4q12, centrosomal protein 152 kDa (CEP152) at chromosome 15q21.1, zinc finger protein 335 (ZNF335) at chromosome 20q13.12, component of a Polycomb group (PcG) multiprotein PRC1-like complex (PHC1) at chromosome 12p13.31, cyclin-dependent kinase 6 (CDK6) at chromosome 7q21.11, centromere protein E (CENPE) at chromosome q24, spindle assembly 6 homolog (SASS6) at chromosome 1p21.2, major facilitator superfamily domain containing 2A (MFSD2A) at chromosome 1p34.2 and ankyrin repeat and LEM domain containing 2 (ANKLE2) at chromosome 12q24.33.

Whole genome SNP genotyping

Illumina HumanOmni 2.5 M BeadChip containing 2,500,000 SNPs was used for whole genome SNP genotyping. A total of 200 ng genomic DNA was used as a starting material. DNA was denatured with 0.1 N NaOH, and whole genome amplification was carried out with random primers mix (RPM) using multi sample master mix (MSM). Enzymatic fragmentation of the amplified DNA was carried out using fragmentation mix (FMS) followed by precipitation using precipitation mix 1 (PM1) and 2-propanol. Fragmented DNA was hybridized to BeadChip by denaturing the sample and dispensing 35 μl of sample onto the BeadChip section followed by incubation for 18 h at 48 °C in the hybridization oven. BeadChips were washed, and the staining was performed following single base extension. This reaction incorporates labeled nucleotides into the extended primers.

BeadChips were scanned on an Illumina iScan using iScan Control Software. Illumina GenomeStudio software, HomozygosityMapper (Seelow et al. 2009) and AutoSNPa (Carr et al. 2006) were used to call loss of heterozygosity (LOH) regions. Allegro (Gudbjartsson et al. 2005), incorporated in easyLINKAGE (Hoffmann and Lindner 2005), was used to calculate multipoint LOD scores.
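The idea behind LOH calling for a recessive disorder can be illustrated with a minimal sketch: scan ordered SNP genotypes and report stretches where every affected individual is homozygous for the same allele. This is not the algorithm used by GenomeStudio, HomozygosityMapper or AutoSNPa. The function name, the `AA`/`AB`/`BB` genotype coding, the `min_snps` threshold and the toy genotypes are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def shared_homozygous_runs(
    genotypes: Dict[str, List[str]],
    affected: List[str],
    min_snps: int = 3,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP index pairs where all affected
    individuals carry the same homozygous genotype at every SNP."""
    n_snps = len(genotypes[affected[0]])
    runs: List[Tuple[int, int]] = []
    start = None
    for i in range(n_snps):
        calls = {genotypes[ind][i] for ind in affected}
        # Shared homozygosity: one distinct call, and it is not heterozygous.
        shared = len(calls) == 1 and next(iter(calls)) in ("AA", "BB")
        if shared and start is None:
            start = i
        elif not shared and start is not None:
            if i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and n_snps - start >= min_snps:
        runs.append((start, n_snps - 1))
    return runs


# Hypothetical toy genotypes for six ordered SNPs (not data from this study).
toy = {
    "V:1": ["AB", "AA", "BB", "AA", "AA", "AB"],
    "V:2": ["AA", "AA", "BB", "AA", "AA", "BB"],
    "V:4": ["AA", "AA", "BB", "AA", "AA", "AA"],
}
runs = shared_homozygous_runs(toy, ["V:1", "V:2", "V:4"])
print(runs)  # [(1, 4)]
```

Dedicated tools additionally tolerate genotyping errors and weight runs by physical or genetic length, which this sketch does not attempt.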

Whole exome sequencing (WES)

Three affected individuals (V:1, V:2, V:4) were subjected to whole exome sequencing. Nextera Rapid Capture Expanded Exome kit was used for library preparation and exome enrichment. This kit delivers 62 Mb of genomic content, including exons, untranslated regions (UTRs), and miRNA. Cluster generation and DNA sequencing were performed on Illumina NextSeq 500 instrument.

Briefly, 50 ng of DNA was fragmented enzymatically and tagged with adaptor sequences (tagmentation) followed by purification and amplification of the purified tagmented DNA. Resulting libraries were purified with magnetic beads, and target regions were captured with whole exome oligos followed by PCR amplification of the enriched library. Quantification of enriched library was performed with Qubit fluorometer, and library size distribution was measured with Agilent Bioanalyzer. Quantified DNA library was loaded on flow cell for subsequent cluster generation and sequencing on an Illumina NextSeq 500 instrument.

Whole exome sequencing data analysis

NextSeq 500 generated bcl files for each of the four lanes. These files were converted to FASTQ files using BCL2FASTQ tool. BWA aligner incorporated in BaseSpace was used to align FASTQ files to the reference genome using BWA-MEM algorithm. Variants were called using Genome Analysis Tool Kit (GATK). Illumina VariantStudio was used for annotation and filtration of genomic variants (Fig. 1).

Fig. 1 Flowchart showing stepwise exome data analysis workflow

Sanger validation of exome-discovered variants

Sanger sequencing was performed to validate the variants of interest identified by exome sequencing. Genomic sequences of DNM1L (NM_001278464) and CIT (NM_001206999), including exons, introns, 5′ untranslated region and 3′ untranslated region, were retrieved from the University of California Santa Cruz (UCSC) genome database browser (http://genome.ucsc.edu/cgi-bin/hgGateway). Primers for PCR amplification of the variants and their flanking regions were designed using the Primer3 software (http://frodo.wi.mit.edu/primer3/). Sequence variants were identified with BioEdit sequence alignment editor version 6.0.7 (Ibis Biosciences Inc., Carlsbad, CA, USA).

cDNA sequencing

Total RNA was extracted from blood collected in Tempus RNA tubes from two patients (V:1, V:4) and a control individual using the Tempus RNA isolation kit. The ProtoScript first-strand cDNA synthesis kit was used to synthesize cDNA. Reverse transcription (RT) was carried out with 200 ng of total RNA. Hybridization of the oligo (dT) was carried out by incubating the following mix for 5 min at 70 °C: 3 µL of RNA; 2 µL of polyT oligo primers (dT) (10 mM, New England Biolabs); and 3 µL of RNase-free H2O, followed by quenching on ice. RT was then performed for 60 min at 42 °C after the addition of 2 µL of M-MuLV enzyme mix and 10 µL of M-MuLV reaction mix (New England Biolabs). For the subsequent PCR, 5 µL of the obtained cDNA mix was used. Primers were designed to amplify exon 7–exon 8 (P1), exon 7–intron 7 (P2) and intron 7–exon 8 (P3) from cDNA. Sequences and product sizes of the cDNA primers used for RT-PCR are shown in Table 1. The reverse-transcribed RNA was amplified with these primers at different hybridization temperatures. Amplified products were then visualized on a 2 % agarose gel and sequenced on an ABI 3500 DNA sequencer (Applied Biosystems).

Table 1 Primer sequences used for RT-PCR

Results

Clinical description of patients

All affected individuals were clinically examined. They were born to second-cousin parents after a normal pregnancy and delivery. Clinical history showed that microcephaly was present at birth in all four affected individuals. Head circumferences of affected individuals ranged from 3 to 6 standard deviations below the age- and sex-matched means. The affected individuals examined were 6–15 years old, and intellectual disability ranged from mild to moderate in severity. Facial features of the affected individuals were normal except for a shared sloping forehead (Fig. 2). With the exception of intellectual impairment, no other neurological problems were observed in the affected individuals. Magnetic resonance imaging (MRI) of an affected individual (V:2) showed a simplified gyral pattern and enlarged extra-axial space with no other structural brain abnormalities. The intelligence quotient (IQ) scores of the affected individuals, measured at Madinah Maternity and Children Hospital, Almadinah, Saudi Arabia, ranged from 53 to 64.
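For readers less familiar with how head circumference is expressed in standard deviations, the calculation is a plain z-score against an age- and sex-matched reference. The sketch below assumes hypothetical reference values; the function name and the numbers are illustrative and are not measurements from this family or from any growth chart.

```python
def ofc_z_score(ofc_cm: float, ref_mean_cm: float, ref_sd_cm: float) -> float:
    """Number of standard deviations by which a measured OFC differs from
    the age- and sex-matched reference mean (negative = below the mean)."""
    if ref_sd_cm <= 0:
        raise ValueError("reference SD must be positive")
    return (ofc_cm - ref_mean_cm) / ref_sd_cm


# Hypothetical example values, chosen only to illustrate the arithmetic.
z = ofc_z_score(ofc_cm=46.0, ref_mean_cm=53.0, ref_sd_cm=1.5)
print(round(z, 2))  # -4.67
```

A value of about -4.7 would fall within the range of 3 to 6 SD below the mean described for the affected individuals.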

Fig. 2 Photograph of an affected individual (V:2). Sloping forehead is evident

Whole genome genotyping data analysis detected a shared region on chromosome 12

Sequence data analysis of all known MCPH disease genes (MCPH1, WDR62, CDK5RAP2, CASC5, ASPM, CENPJ, STIL, CEP135, CEP152, ZNF335, PHC1 and CDK6) excluded their involvement in causing MCPH in the present family. After excluding known genes, whole genome homozygosity mapping was carried out using the Illumina HumanOmni 2.5 M SNP array. SNP genotypes were called by the BRLMM algorithm incorporated in the Illumina GenomeStudio genotyping module. A call rate of more than 99.3 % was obtained across the entire sample. Mapping order and physical and genetic distances of SNPs were obtained from Illumina. SNP data were analyzed for LOH using GenomeStudio software, HomozygosityMapper and AutoSNPa. Homozygosity mapping identified a 16.9-Mb shared block of homozygosity on chromosome 12q24.11-q24.32 in all affected individuals. A multipoint LOD score greater than 3 was obtained for several SNPs in the homozygous region (Fig. 3). The homozygous region is flanked by SNPs rs741334 and rs7309523 (Fig. 4).
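A LOD score is the base-10 logarithm of the likelihood ratio for linkage versus no linkage, so a score of 3 corresponds to odds of 1000:1 in favor of linkage. The sketch below uses the textbook two-point formula for phase-known meioses to make that scale concrete. It is not the multipoint calculation performed by Allegro, and the function name and example counts are assumptions for illustration only.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD(theta) = log10(L(theta) / L(0.5)) for phase-known meioses,
    where L(theta) = theta**r * (1 - theta)**(n - r)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if theta == 0.0 and recombinants > 0:
        # Any recombinant is impossible under complete linkage.
        return float("-inf")
    non_recombinants = meioses - recombinants
    log_l_theta = non_recombinants * math.log10(1.0 - theta)
    if recombinants:
        log_l_theta += recombinants * math.log10(theta)
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical example: 10 informative meioses with no recombinants.
lod = two_point_lod(recombinants=0, meioses=10, theta=0.0)
print(round(lod, 2))  # 3.01
```

Under this simplified model each non-recombinant meiosis contributes log10(2), about 0.3, so roughly ten are needed to pass the conventional threshold of 3.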

Fig. 3 Graphical output of easyLINKAGE showing distribution of multipoint LOD values for SNPs on chromosome 12

Fig. 4 Haplotype of an MCPH family. The disease interval is flanked by two SNP markers (rs741334 and rs7309523) indicating key recombination events. For genotyped individuals, SNP haplotypes are shown beneath each symbol, revealing that all affected individuals are homozygous for the same haplotype, whereas normal parents and healthy siblings are heterozygous carriers between markers rs741334 and rs7309523. Arrow shows the position of CIT gene

Whole exome sequencing (WES) data analysis identified a splice site mutation in CIT

WES was performed in three affected individuals (V:1, V:2, V:4) of the family. The resulting variant call format (VCF) files contained on average ~86,000 variants. These variants were filtered based on quality, frequency, genomic position, protein effect, predicted pathogenicity and previous associations with the phenotype. Candidate variants were expected to follow an autosomal recessive inheritance pattern given the reported positive family history.

To find potential candidate variants in the family, the exome data of all three affected individuals were analyzed jointly. Different filter settings were applied to look for candidate variants under an autosomal recessive inheritance model. Only rare variants were taken into account (allele frequency of 1 % or less in 1000G, ExAC and our in-house database containing 64 exomes), and only variants located within genes (exonic and intronic) or promoter regions were considered. Furthermore, several quality parameters were taken into account, including depth of coverage (DP >10) in at least one family member, absence of allelic imbalance in at least one family member and good genotype quality (GQ >20) in at least one family member.
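The filter criteria above can be expressed as a small predicate. The sketch below is illustrative and is not the VariantStudio configuration used in the study. The class and field names are hypothetical, the allelic-imbalance check is omitted, and requiring homozygosity in every affected individual is a simplification that covers only the homozygous part of the recessive model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SampleCall:
    genotype: str  # "hom_alt", "het" or "hom_ref"
    depth: int     # DP
    gq: int        # GQ


@dataclass
class Variant:
    gene: str
    label: str
    max_pop_af: float  # highest frequency across 1000G, ExAC and in-house data
    region: str        # "exonic", "intronic", "promoter" or "intergenic"
    calls: Dict[str, SampleCall]


def passes_recessive_filter(
    v: Variant,
    affected: List[str],
    max_af: float = 0.01,
    min_dp: int = 10,
    min_gq: int = 20,
) -> bool:
    """Rare, genic, adequately covered and homozygous in all affected."""
    if v.max_pop_af > max_af:
        return False
    if v.region not in {"exonic", "intronic", "promoter"}:
        return False
    calls = [v.calls[s] for s in affected]
    # Quality thresholds need to be met in at least one family member.
    if not any(c.depth > min_dp for c in calls):
        return False
    if not any(c.gq > min_gq for c in calls):
        return False
    return all(c.genotype == "hom_alt" for c in calls)


# Hypothetical placeholder variants (not calls from this study).
affected = ["V:1", "V:2", "V:4"]
good = SampleCall("hom_alt", 35, 99)
candidates = [
    Variant("GENE_A", "rare_hom_intronic", 0.0, "intronic",
            {s: good for s in affected}),
    Variant("GENE_B", "common_hom_exonic", 0.12, "exonic",
            {s: good for s in affected}),
    Variant("GENE_C", "rare_het_in_one", 0.001, "exonic",
            {"V:1": good, "V:2": SampleCall("het", 40, 99), "V:4": good}),
    Variant("GENE_D", "rare_hom_low_quality", 0.0, "exonic",
            {s: SampleCall("hom_alt", 4, 10) for s in affected}),
]
kept = [v.gene for v in candidates if passes_recessive_filter(v, affected)]
print(kept)  # ['GENE_A']
```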

Initially, we looked for variants in the previously reported MCPH-associated genes; however, no candidate variants were present in those genes in any of the family members. Likewise, a search for rare homo-/hemizygous variants within the protein-coding regions of all genes previously associated with one of the symptoms present in our patients (panel 1) did not yield clearly plausible candidates. Thereafter, rare, potentially harmful variants present in those genes in the (compound) heterozygous state were considered. This approach yielded one candidate in DNM1L.

The analysis was then expanded to all genes, whereby we first focused on potentially harmful homo-/hemizygous variants and then moved on to potential candidates in the (compound) heterozygous state. This yielded one additional novel, homozygous splice donor site variant in CIT (c.753+3A>T). This variant is not present in dbSNP, ExAC or our in-house database of 64 exomes. Interestingly, this gene lies within the shared LOH region (Fig. 4). To date, this gene has not been associated with any disease in humans. However, given its function in cytokinesis and early CNS development, and observations from animal studies, it is a plausible candidate gene for microcephaly.

Sanger validation

The relevant coding exons of DNM1L and CIT were sequenced in all four affected individuals, one unaffected individual and both parents of the family. The variant in DNM1L did not segregate with the phenotype in the family. However, the splice donor site variant identified in CIT segregated with the phenotype in the family (Fig. 5).

Fig. 5 Sequence analysis of the CIT variant. Partial DNA sequence of CIT showing the variant (c.753+3A>T) in the homozygous state in the affected individual (lower panel), in the heterozygous state in the carriers (middle panel) and the wild-type sequence in the control individuals (upper panel). Arrows indicate the position of the mutation

In silico analysis

The variant in CIT occurs at the splice donor site, and in silico analysis with several splice site effect prediction tools, including SpliceView (http://bioinfo.itb.cnr.it/~webgene/wwwspliceview.html), NetGene2 (Hebsgaard et al. 1996), SplicePort (http://spliceport.cbcb.umd.edu/), Spliceman (Lim and Fairbrother 2012), GeneID (Parra et al. 2000), MaxEnt (Yeo and Burge 2004), ASSP (Wang and Marín 2006) and NNSPLICE (Reese et al. 1997), predicted that this variant abolishes the splice donor site.
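Why a +3 substitution matters can be seen from the 5′ splice site consensus, MAG|GTRAGT, in which position +3 is normally a purine (R = A or G). The sketch below only counts consensus matches. It is far simpler than the statistical models used by the tools listed above and is not a substitute for them. The nine-base reference site is hypothetical and is not the actual CIT exon 7/intron 7 junction sequence.

```python
from typing import List

# IUPAC codes used in the donor consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "M": "AC"}

# Exon positions -3..-1 followed by intron positions +1..+6.
DONOR_CONSENSUS = "MAGGTRAGT"


def donor_matches(site: str) -> List[bool]:
    """Per-position agreement of a 9-base donor site with the consensus."""
    if len(site) != len(DONOR_CONSENSUS):
        raise ValueError("expected 3 exonic bases followed by 6 intronic bases")
    return [b in IUPAC[code] for b, code in zip(site.upper(), DONOR_CONSENSUS)]


def substitute_intron_position(site: str, position: int, alt: str) -> str:
    """Replace intron position +1..+6 (1-based) with the base `alt`."""
    if not 1 <= position <= 6:
        raise ValueError("position must be between +1 and +6")
    idx = 3 + (position - 1)  # skip the three exonic bases
    return site[:idx] + alt + site[idx + 1:]


# Hypothetical reference donor site that fits the consensus perfectly.
ref_site = "CAGGTAAGT"
mut_site = substitute_intron_position(ref_site, 3, "T")  # models a +3A>T change

print(sum(donor_matches(ref_site)), sum(donor_matches(mut_site)))  # 9 8
```

In this toy case the A>T change at +3 removes the purine expected by the consensus, which is the kind of deviation the prediction tools score quantitatively.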

cDNA amplification

cDNA from patients (V:1, V:4) and control individuals was amplified using primers designed in exon 7, intron 7 and exon 8 to characterize the consequences of the splice donor site mutation (Fig. 6). Amplified products were visualized on a 2 % agarose gel. Primers designed in exon 7–exon 8 (P1) and intron 7–exon 8 (P3) did not show any amplification in patient cDNA, whereas amplification was observed with primers in exon 7–intron 7 (P2). This shows the presence of a portion of intron 7 in the patients' mRNA (Fig. 7). Retention of intron 7 sequence in the mRNA was confirmed by Sanger sequencing.

Fig. 6 Genomic location of primers used for RT-PCR

Fig. 7 RT-PCR products obtained with primers designed within exon 7, intron 7 and exon 8. Three sets of primers were used: forward primer in exon 7 and reverse primer in exon 8 (P1), forward primer in exon 7 and reverse primer in intron 7 (P2) and forward primer in intron 7 and reverse primer in exon 8 (P3). cDNA from two patients (V:1 and V:4) and a control individual was used for RT-PCR. Lane L, 100-bp ladder; Lane P1 in controls, RT-PCR product of 267 bp from control individuals showing amplification with primers designed in exon 7 and exon 8; Lane P2 in patients, RT-PCR product from affected individuals showing amplification with primers designed in exon 7 and intron 7

Discussion

Exome sequencing was used to search for the variants responsible for primary microcephaly in the family presented here. Two rare variants with potentially harmful effects were identified in two different genes. One of these variants was identified in DNM1L. Mutations in this gene are associated with an autosomal dominant disorder, lethal encephalopathy, resulting from defective mitochondrial and peroxisomal fission. Patients with DNM1L mutations have been reported to have microcephaly associated with several other features. These additional features were not observed in the family presented here. In addition, Sanger sequencing failed to show segregation of the DNM1L variant in our family.

The second variant (c.753+3A>T) was identified in CIT in the homozygous state in all four affected individuals of the family. Segregation of the variant with the disease phenotype within the family was validated by Sanger sequencing.

The CIT gene encodes citron rho-interacting kinase, which plays an important role in the regulation of cytokinesis and the development of the central nervous system (Di Cunto et al. 1998; Gruneberg et al. 2006; Kamijo et al. 2006; Ackman et al. 2007; Tan et al. 2011; Bassi et al. 2013). Importantly, this gene has not previously been associated with any disease in humans. Citron kinase knockout mice have been reported to grow slowly and to die before reaching adulthood as a consequence of fatal seizures (Di Cunto et al. 2000). The brains of these mice displayed defective neurogenesis and depletion of specific neuronal populations, which arose during development of the central nervous system as a result of altered cytokinesis and massive apoptosis. This indicated that CIT is essential for cytokinesis in vivo, but only in specific neuronal precursors. Moreover, a novel molecular mechanism for a subset of human malformative syndromes of the CNS characterized by microcephaly and epilepsy has been suggested (Muzzi et al. 2009). Furthermore, a flathead (fh) rat model with a single-bp deletion in exon 1 of CIT has been described (Sarkisian et al. 2002). This deletion causes a premature stop codon, resulting in cytokinesis failure in neural progenitors followed by apoptosis and a dramatic reduction in CNS growth. These fh/fh rats showed a phenotype nearly identical to that of the mice described by Di Cunto and colleagues (Di Cunto et al. 2000). Evidence exists for the co-localization of ASPM (a well-known microcephaly-causing gene) and citron kinase to the midbody ring during cytokinesis (Paramasivam et al. 2007). More recently, a direct functional role of CIT in microtubule stabilization has been reported as well (Sgrò et al. 2016).

The variant detected in our family is located in a splice donor site region of CIT (between exons 7 and 8). The variant is predicted to affect splicing according to the scores (SC Ada, SC RF) of Jian et al. (2014). In addition, an alteration of splicing due to this variant is predicted by multiple in silico tools, including Human Splicing Finder (http://www.umd.be/HSF3/), SpliceView, NetGene2 (Hebsgaard et al. 1996), SplicePort (spliceport.cbcb.umd.edu), Spliceman (Lim and Fairbrother 2012), GeneID (Parra et al. 2000), MaxEnt (Yeo and Burge 2004), ASSP (Wang and Marín 2006) and NNSPLICE (Reese et al. 1997). cDNA analysis also revealed the presence of intronic sequence in the mRNA extracted from the patients' blood. Retention of intronic sequence in the mRNA is predicted to add six amino acids to the catalytic domain of the CIT protein, followed by immediate truncation (Fig. 8). In conclusion, our study shows that the splice donor site mutation in CIT resulted in primary microcephaly in the family presented here. This is the first report of a mutation in the CIT gene causing a disorder in humans.
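The predicted protein consequence of intron retention can be reasoned through by translating across the exon boundary into the retained sequence until the first in-frame stop codon. The sketch below uses the standard genetic code and contrived toy sequences, built so that six residues are added before a stop, mirroring the count reported above. The sequences are hypothetical and are not the CIT exon 7 or intron 7 sequences, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_until_stop(cds: str) -> str:
    """Translate from position 0 and stop at the first in-frame stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def residues_added_by_retention(exon_cds: str, retained_intron: str) -> str:
    """Residues encoded beyond the exon when the intron is read through.
    Assumes the exon ends on a codon boundary (true for the toy data)."""
    protein = translate_until_stop(exon_cds + retained_intron)
    return protein[len(exon_cds) // 3:]


# Hypothetical sequences: three exonic codons, then six intronic codons
# followed by an in-frame TGA stop.
toy_exon = "ATGGCTGAA"
toy_intron = "GTAAGTCTTGGCAAACCTTGACC"

added = residues_added_by_retention(toy_exon, toy_intron)
print(added, len(added))  # VSLGKP 6
```

Applying the same logic to the real transcript requires the actual reading frame at the end of exon 7 and the retained intron 7 sequence, neither of which is reproduced here.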

Fig. 8 Schematic representation of the domain organization of the wild-type CIT protein (a) and the predicted mutant CIT protein (b). Red diamonds in b indicate the addition of six new amino acids. S_TKc catalytic domain of serine/threonine kinase, S_TK_X extension of serine/threonine-type protein kinase, C1 cysteine-rich domain, PH pleckstrin homology domain, CNH NIK1-like kinases

Online resources

http://genome.ucsc.edu/cgi-bin/hgGateway.

http://frodo.wi.mit.edu/primer3/.

http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq_acc.html.

http://www.fruitfly.org/seq_tools/splice.html.

www.homozygositymapper.org/.

http://dna.leeds.ac.uk/autosnpa/.