Introduction

Hearing loss (HL) is one of the most common sensory impairments in humans. It is estimated that one child in 1000 is born with prelingual HL, which can have a significant impact on the development of normal speech and language skills (Yoshinaga-Itano 2000). Approximately 10 % of the population is affected by disabling HL by the age of 60 years and ~50 % by the age of 80 years (Davis 1995). HL can be due to environmental factors, genetic factors, or a combination thereof. However, genetic factors are now regarded as the leading cause of childhood HL in developed countries, since other causes are generally prevented by vaccines, antibiotics, and workplace regulations (Nance 2003). It is estimated that approximately 30 % of all genetic HL is syndromic (SHL) (Online Mendelian Inheritance in Man; http://www.ncbi.nlm.nih.gov/omim/), and approximately 70 % is non-syndromic (NSHL), wherein hearing impairment is the only feature observed (Gorlin et al. 1995). NSHL is generally due to mutations in single genes. Approximately 80 % of NSHL is autosomal recessive (ARNSHL), 20 % is autosomal dominant (ADNSHL), 1 % is X-linked, and <1 % is mitochondrial. Most ARNSHL is prelingual and severe to profound, whereas ADNSHL is often post-lingual and progressive (Angeli et al. 2012).

The genetic basis of HL is heterogeneous, with numerous loci and genes already identified in humans. Over 140 loci have been described for NSHL (Hereditary Hearing Loss Homepage; http://hereditaryhearingloss.org). Over 700 syndromes may feature HL (Online Mendelian Inheritance in Man; http://www.ncbi.nlm.nih.gov/omim/). The same clinical syndrome can be caused by different genes, and different mutations in the same gene may result in either SHL or NSHL (Yan and Liu 2008). For some genes, there are both dominant and recessive alleles. Even the same variant in a single gene can be associated with quite variable phenotypes (Hutchin et al. 2000).

Recent technical advances have revealed new molecular mechanisms of HL and provided improved diagnostic methods. Molecular genetic testing for several HL-associated genes is now part of the standard protocol for the etiologic diagnosis of HL (King et al. 2012). An immediate benefit is that the identification of the specific genetic variant responsible for HL can establish or confirm a clinical diagnosis, and allow the implementation of personalized approaches to medical management. The information also facilitates risk assessment for affected families and enables reproductive decision making.

Decades of experience have proven the diagnostic utility of serial Sanger sequencing of candidate genes for Mendelian disorders (Maddalena et al. 2005; Richards et al. 2008). However, this approach is labor intensive and not cost effective for a disorder as heterogeneous as HL. An array-based method has also been developed, but it covers a limited number of genes, is expensive, and can analyze only known mutations (Kothiyal et al. 2010). A disorder with high heterogeneity, such as HL, is often difficult to dissect with these techniques because candidate genes must be identified before testing. Today, targeted capture and next-generation sequencing (NGS) technologies provide a viable alternative because of their massively parallel sequencing capability, which enables the simultaneous screening of multiple HL genes in multiple samples (Shearer et al. 2010; Brownstein et al. 2011; Yan et al. 2013; Tekin et al. 2016). Gene panels are useful when multiple genes are involved in a particular disorder or when there is extensive phenotypic overlap between different disorders. Panels are also more cost effective, and results can be obtained more rapidly than with a traditional gene-by-gene approach. In this study, we undertook targeted sequencing of 180 known and candidate HL-causing genes in a multi-ethnic cohort of 342 GJB2-mutation-negative probands.

Materials and methods

Subjects

This study was approved by the University of Miami Institutional Review Board (USA), the Madras ENT Research Foundation (P) Ltd (MERF) (India), the University Hospital of Mahdia (Tunisia), the Growth and Development Research Ethics Committee (Iran), the Ethics Committee of University of Ibadan (Nigeria), the Ankara University Medical School Ethics Committee (Turkey), the University Hospital of Sfax Ethics Committee (Tunisia), University of Pretoria School of Medicine Ethics Committee (South Africa), and Institute for Research on Genetic and Metabolic Diseases, INVEGEM (Guatemala). A signed informed-consent form was obtained from each participant or, in the case of a minor, from the parents.

We included in this study a total of 342 GJB2 mutation-negative families of diverse ethnicity. Of these, 185 were simplex and 157 were multiplex with at least two affected individuals. Since a three-generation pedigree was not available in some cases, we did not group multiplex families according to inheritance pattern. The multi-ethnic cohort comprised 91 indigenous families from South Africa, 90 from Nigeria, 53 from the USA (South Florida), 38 from Tunisia, 23 from India, 21 from Iran, 19 from Turkey, and 7 from Guatemala. The diagnosis of sensorineural HL (SNHL) was established via standard audiometry in a soundproofed room according to current clinical standards. HL was congenital or prelingual in onset, with a severity ranging from mild to profound. Clinical evaluation included a thorough physical examination and otoscopy in all cases. Additional evaluations, including high-resolution, thin-section computed tomography (CT) and magnetic resonance imaging (MRI) of the temporal bone, were performed when possible. None of the recruited individuals were diagnosed with a syndrome. DNA was extracted from peripheral blood leukocytes of probands according to standard procedures.

Sequencing

Using the Agilent SureDesign online tool (https://earray.chem.agilent.com/suredesign/), a SureSelect custom kit (Agilent, Santa Clara, CA, USA, https://www.agilent.com) was designed to include all exons, 5′ UTRs, and 3′ UTRs of 180 known and candidate deafness-causing genes (Supplementary Table S1) (Tekin et al. 2016). This custom capture panel (MiamiOtoGenes), with a target size of approximately 1.158 Mb encompassing 3494 regions, covers genes associated with both syndromic and non-syndromic forms of HL. The targeted sequencing was processed at the Hussman Institute for Human Genomics (HIHG) Sequencing core, University of Miami. The Agilent SureSelect Target Enrichment in-solution hybridization capture system (Agilent, Santa Clara, CA, USA) was used to capture coding exons and flanking intronic sequences, following the manufacturer’s standard protocol. Adapter sequences for the Illumina HiSeq 2000 were ligated, and the enriched DNA samples were prepared using the standard methods for the HiSeq 2000 instrument (Illumina). The average insert size after sample preparation was 180 bp, and paired-end reads were used. Regions with lower coverage were not subjected to additional sequencing.

Bioinformatics analysis

The Illumina CASAVA v1.8 pipeline was used to assemble 99 bp sequence reads. Burrows–Wheeler Aligner (BWA) was applied for alignment of sequence reads to the human reference genome (hg19) (Li and Durbin 2010), and variants were called using FreeBayes (Garrison and Marth 2012). Genesis 2.0 (https://www.genesis-app.com/) was then used for variant filtering based on quality score, read depth, and minor allele frequency (MAF thresholds of 0.005 for ARNSHL and 0.0005 for ADNSHL variants) as reported in dbSNP141, the National Heart, Lung, and Blood Institute Exome Sequencing Project Exome Variant Server (Seattle, WA; Exome Variant Server 2012), the Exome Aggregation Consortium (ExAC) browser (http://exac.broadinstitute.org/), the 1000 Genomes Project database, and our internal database of >3000 samples of European, Asian, and American ancestries. Variants meeting these criteria were further annotated based on their presence and pathogenicity information in the Human Gene Mutation Database (HGMD; http://www.hgmd.cf.ac.uk), the Deafness Variation Database (DVD) (deafnessvariationdatabase.org), and ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/). In the final step, all variants were re-classified based on the American College of Medical Genetics and Genomics (ACMG) and Association for Molecular Pathology (AMP) guidelines (Richards et al. 2015). These guidelines recommend standard terminology that places DNA variants in five categories: pathogenic, likely pathogenic, uncertain significance, likely benign, and benign. They describe criteria for variant interpretation that use evidence from population data, computational data, functional data, and segregation data. Copy number variation (CNV) calling was performed using an R-based tool (Nord et al. 2011). This method normalizes read-depth data by sample batch and compares median read-depth ratios using a sliding-window approach.
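The read-depth idea behind the CNV calling step can be illustrated with a short Python sketch. This is not the R-based tool of Nord et al. (2011); it is a simplified illustration that assumes per-target depths are normalized by each sample's own median, compared with the median of control samples from the same batch, and scanned with a sliding window. The function names, the window size, and the 0.1 ratio cut-off for a homozygous deletion are hypothetical choices made for the example.

```python
from statistics import median


def normalize_depths(depths):
    """Scale one sample's per-target read depths by that sample's median depth."""
    m = median(depths)
    if m == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [d / m for d in depths]


def depth_ratios(sample, batch):
    """Per-target ratio of a sample's normalized depth to the batch median."""
    norm_sample = normalize_depths(sample)
    norm_batch = [normalize_depths(b) for b in batch]
    ratios = []
    for i, value in enumerate(norm_sample):
        ref = median(b[i] for b in norm_batch)
        # A target with no reference coverage is treated as uninformative (ratio 1.0).
        ratios.append(value / ref if ref > 0 else 1.0)
    return ratios


def sliding_median(ratios, window=3):
    """Median ratio in each window of consecutive targets."""
    return [median(ratios[i:i + window]) for i in range(len(ratios) - window + 1)]


def flag_deletions(ratios, window=3, threshold=0.1):
    """Start indices of windows whose median ratio falls below the threshold."""
    return [i for i, m in enumerate(sliding_median(ratios, window)) if m < threshold]


# Toy data (invented): six consecutive targets, with targets 2 and 3 absent in the proband.
proband = [100, 110, 0, 0, 95, 105]
controls = [
    [100, 100, 100, 100, 100, 100],
    [200, 210, 190, 200, 205, 195],
    [50, 55, 45, 50, 52, 48],
]
ratios = depth_ratios(proband, controls)
flagged = flag_deletions(ratios, window=2, threshold=0.1)
print(flagged)
```

A window is flagged only when its median ratio is near zero, which is why a single low-coverage target next to well-covered ones does not trigger a call in this sketch.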

Sanger sequencing was used to confirm variant calls, and PCR was used to confirm CNVs. Family members, when available, were tested to assess segregation, de novo status, and the trans configuration of biallelic variants. During the interpretation, we also considered phenotypic correlations between the gene variants and their reported phenotypes.

Results

Targeted capture sequencing

Targeted capture genome enrichment (TGE) and massively parallel sequencing (MPS) were performed on all probands. An average of 99, 87, and 60 % of the targeted bases were covered at 10×, 50×, and 100×, respectively (Supplementary Fig. S1).

Molecular findings among probands in the multi-ethnic cohort

After QC and filtration (read depth >8, Genotype Quality >35, and QUAL >20), we detected 151 variants in 119 families that we classified as likely pathogenic, pathogenic, or variant of uncertain significance based on ACMG guidelines. Of these, 44 % (66/151) have been reported in at least one of the following three databases: ClinVar, HGMD, and DVD (Supplementary Table S2).
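The filtering logic can be sketched in Python as follows. The QC cut-offs and the MAF ceilings (0.005 for recessive, 0.0005 for dominant variants) are those quoted in the text; everything else is an assumption made for illustration, including the `Variant` record, the use of the highest MAF across databases, the inclusive MAF comparison, and the placeholder gene names. The study itself performed this step in Genesis 2.0, not with this code.

```python
from dataclasses import dataclass

# QC cut-offs quoted in the text (strictly greater than).
MIN_DEPTH = 8
MIN_GQ = 35
MIN_QUAL = 20

# MAF ceilings quoted in the text for recessive (AR) and dominant (AD) variants.
MAF_CEILING = {"AR": 0.005, "AD": 0.0005}


@dataclass
class Variant:
    gene: str        # placeholder label, not a real gene symbol
    depth: int       # read depth at the site
    gq: float        # genotype quality
    qual: float      # variant-call QUAL score
    max_maf: float   # highest MAF across the population databases (assumed summary)


def passes_qc(v: Variant) -> bool:
    """Apply the three QC cut-offs."""
    return v.depth > MIN_DEPTH and v.gq > MIN_GQ and v.qual > MIN_QUAL


def passes_maf(v: Variant, mode: str) -> bool:
    """Keep variants at or below the MAF ceiling (inclusive bound is an assumption)."""
    return v.max_maf <= MAF_CEILING[mode]


def keep(v: Variant, mode: str) -> bool:
    return passes_qc(v) and passes_maf(v, mode)


# Invented example calls.
variants = [
    Variant("GENE_A", depth=40, gq=99, qual=250, max_maf=0.0001),
    Variant("GENE_B", depth=8, gq=99, qual=250, max_maf=0.0),     # depth is not > 8
    Variant("GENE_C", depth=30, gq=60, qual=90, max_maf=0.002),   # rare enough for AR only
    Variant("GENE_D", depth=30, gq=60, qual=90, max_maf=0.02),    # too common for either
]
kept_ar = [v.gene for v in variants if keep(v, "AR")]
kept_ad = [v.gene for v in variants if keep(v, "AD")]
print(kept_ar, kept_ad)
```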

HL causative genes in the cohort

When only pathogenic and likely pathogenic variants were taken into consideration, the underlying genetic cause was identified in 53 families, providing an etiologic diagnostic rate of 15 % (53/342) in the cohort. The detection rates in different groups were 0 % (0/7, Guatemala), 4 % (4/91, South Africa), 4 % (4/90, Nigeria), 17 % (9/53, South Florida), 26 % (10/38, Tunisia), 26 % (6/23, India), 42 % (8/19, Turkey), and 57 % (12/21, Iran) (Table 1; Fig. 1a). Causative variants were detected in 7 % (13/185) of the simplex families and 25 % (40/157) of the multiplex families (Fig. 1a).
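The reported rates follow directly from the family counts. The short Python check below recomputes them from the numbers given in the text; the variable names are ours, and percentages are rounded to the nearest integer as in the text.

```python
# (solved families, total families) per recruitment site, copied from the text.
SITES = {
    "Guatemala": (0, 7),
    "South Africa": (4, 91),
    "Nigeria": (4, 90),
    "South Florida": (9, 53),
    "Tunisia": (10, 38),
    "India": (6, 23),
    "Turkey": (8, 19),
    "Iran": (12, 21),
}
SIMPLEX = (13, 185)
MULTIPLEX = (40, 157)


def pct(numerator, denominator):
    """Percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


site_rates = {site: pct(solved, total) for site, (solved, total) in SITES.items()}
solved_total = sum(solved for solved, _ in SITES.values())
family_total = sum(total for _, total in SITES.values())
overall_rate = pct(solved_total, family_total)

print(site_rates)
print(solved_total, family_total, overall_rate)
print(pct(*SIMPLEX), pct(*MULTIPLEX))
```

The site totals sum to 53 solved families out of 342, and the simplex and multiplex counts partition the same 53 families (13 + 40).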

Table 1 Identified likely pathogenic and pathogenic variants in the solved families

Fig. 1 Representation of solved, unsolved, and uncertain families, based on ethnicity and simplex/multiplex status (a). Number of solved families for each gene and ethnicity (b)

Of the 119 families, 66 (55 %) were classified as uncertain. These families had at least one allele with a variant of uncertain significance (VUS), even if another allele was classified as likely pathogenic or pathogenic. The rates of uncertain families among multiplex families were 22 % (6/27) in Nigeria, 38 % (8/21) in South Africa, 21 % (8/38) in Tunisia, 22 % (2/9) in India, 33 % (1/3) in Guatemala, 12 % (2/17) in Turkey, 8 % (1/13) in Iran, and 26 % (7/27) in the USA (Supplementary Table S3).

In this multi-ethnic cohort, sequence variants were identified in a total of 48 genes (Supplementary Table S2), while 27 different genes had variants in solved families. Genes identified in at least three solved families include MYO15A (MIM 602666) (13 %; 7/53), SLC26A4 (MIM 605646) (9 %; 5/53), USH2A (MIM 608400) (9 %; 5/53), MYO7A (MIM 276903) (8 %; 4/53), TRIOBP (MIM 609761) (6 %; 3/53), and MYO6 (MIM 600970) (6 %; 3/53) (Fig. 1b).

Of the 57 unique HL-causing variants identified in solved families, 26 have previously been reported in the literature (Table 1). The remaining 31 novel variants were considered pathogenic or likely pathogenic according to ACMG guidelines (Table 1). Of note, among the 53 probands found to carry causative variants, 81 % (43/53) were homozygous for the identified HL-causing variant (autosomal recessive), 11 % (6/53) were compound heterozygous (autosomal recessive), 6 % (3/53) were heterozygous for a single causative variant (autosomal dominant), and one individual was hemizygous for an X-linked variant (Table 1).

Two novel homozygous CNVs were identified in Tunisian families: a large deletion of approximately 86.3 kb with breakpoints within exons 21 and 22 of USH2A, and a deletion of approximately 12.3 kb spanning exons 12 and 13 of PCDH15 (Supplementary Table S4). The deleted exons did not amplify with confirmatory PCR in the probands.

Although we specifically asked about parental consanguinity when obtaining family history, we did not incorporate it into the analysis because of concerns regarding the reliability of self-reported consanguinity in different populations. When we reviewed the variants, we noted that all Indian and Iranian probands and most Turkish and Tunisian probands were homozygous for their pathogenic variants, likely pathogenic variants, and VUS, indicating shared ancestry between their parents.

Discussion

In the present study, we have used a panel of 180 genes sequenced by NGS for variant detection in a multi-ethnic group of 342 probands. We identified causative variants in 27 genes without predominant recurring pathogenic variants in the identified genes. The most commonly implicated genes include MYO15A, SLC26A4, USH2A, MYO7A, MYO6, and TRIOBP. As expected, most of the identified variants are autosomal recessive.

Use of the MiamiOtoGenes panel established a genetic diagnosis for 28 % of all probands from countries outside sub-Saharan Africa (Guatemala, USA, Tunisia, India, Turkey, and Iran). In contrast, the etiologic diagnostic rate for families from sub-Saharan Africa (Nigeria and South Africa) was 4 %. All the variants detected in the Guatemalan probands were classified as VUS, resulting in a “solved” rate of 0 % in this group. Molecular diagnostic rates for Turkish and Iranian probands are very similar to those reported by Shearer et al. (2013) using OtoSCOPE and Bademci et al. (2016) using whole-exome sequencing. It should be noted that a positive family history of deafness is an important indication of a genetic etiology. In our cohort, the distribution of simplex and multiplex cases varied widely among ethnic groups. Moreover, parental consanguinity is traditionally common in Turkey, Iran, and Tunisia, which increases the chance of having rare autozygous mutations. The current study found solved rates of around 7 % for simplex families compared to 25 % for multiplex families. Across a variety of studies utilizing NGS, the overall diagnostic rate averaged 41 % and ranged from 10 to 83 % (Shearer and Smith 2015). In an analysis of simplex cases, Gu et al. (2015) found a diagnostic rate of 13 %. Direct comparison between studies is difficult because of fundamental differences in study design. These include prescreening for GJB2 variants and the number of genes included on a “comprehensive” test, which ranges from 34 to 246 (Shearer and Smith 2015). In addition, the genes selected for each platform vary depending on whether only NSHL genes or also SHL genes are included (and for which syndromes), and on whether candidate genes identified through animal models or human studies are included, as is the case for our platform, the MiamiOtoGenes panel.
Overall, our data highlight the importance of family history and generation of databases with ethnically diverse samples to improve our ability to detect and accurately evaluate genetic variants for pathogenicity. The type of mutations evaluated should also be taken into account when considering a comprehensive genetic test. While all platforms include the analysis of point mutations and small deletions, not all the studies screened for large CNVs (Shearer et al. 2013, 2014). In the current study, CNVs account for 4 % of causative alleles, yet rates as high as 13–19 % have been reported.
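The regional rates and the CNV share quoted above can be re-derived from the counts given in the Results. The Python sketch below does so under two stated assumptions that are ours, not the authors': homozygous and compound heterozygous probands each contribute two causative alleles while heterozygous and hemizygous probands contribute one, and each of the two homozygous CNVs was found in a single proband.

```python
# (solved families, total families), copied from the Results.
SUB_SAHARAN = {"South Africa": (4, 91), "Nigeria": (4, 90)}
OTHER = {
    "Guatemala": (0, 7),
    "USA": (9, 53),
    "Tunisia": (10, 38),
    "India": (6, 23),
    "Turkey": (8, 19),
    "Iran": (12, 21),
}


def pooled_pct(groups):
    """Pooled solved rate across several sites, rounded to the nearest integer."""
    solved = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return round(100 * solved / total)


# (number of solved probands, causative alleles per proband); the allele
# counts per proband are an assumption made for this check.
ZYGOSITY = {
    "homozygous": (43, 2),
    "compound_heterozygous": (6, 2),
    "heterozygous": (3, 1),
    "hemizygous": (1, 1),
}
total_alleles = sum(n * per_proband for n, per_proband in ZYGOSITY.values())

# Two homozygous CNVs, assumed to be one proband each, give four CNV alleles.
cnv_alleles = 2 * 2
cnv_share = round(100 * cnv_alleles / total_alleles)

print(pooled_pct(OTHER), pooled_pct(SUB_SAHARAN))
print(total_alleles, cnv_share)
```

Under these assumptions the pooled rates are 45/161 and 8/181, and the four CNV alleles make up about 4 % of 102 causative alleles, in line with the figures in the text.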

As NGS technology becomes more widespread in the diagnostic setting, interpreting the clinical meaning of newly discovered variants will be one of the major challenges of ‘genomic’, or ‘precision’, medicine (Tsai and Liu 2014; Aronson and Rehm 2015). Classifying variants is an important issue. Online prediction programs, such as PolyPhen2 and SIFT, can indicate whether a variant that changes the amino acid at a certain position could be deleterious, but their predictions can be incorrect and should not be used alone to determine whether a variant is likely to be disease causing (Tchernitchko et al. 2004; Thusberg and Vihinen 2009). HGMD, ClinVar, and DVD are commonly consulted to assess the pathogenicity of a variant detected in HL. However, these databases do not always agree on the classification of DNA variants. While the recent ACMG-AMP guidelines provide a framework for addressing this problem, some of the suggested criteria are subjective, which can lead to disagreement between laboratories (Richards et al. 2015). Recently, nine molecular diagnostic laboratories involved in the Clinical Sequencing Exploratory Research (CSER) consortium tested the ACMG-AMP guidelines for variant interpretation. Concordance across laboratories was initially only 34 %; after consensus discussions and detailed review of the ACMG-AMP criteria, the authors reported that concordance increased to 71 % (Amendola et al. 2016).

In our study, the overall diagnostic rate is 15 %. A further 19 % (66/342) of the families were classified as uncertain because the probands in these families had at least one VUS. To solve these families, more functional, computational, or literature evidence is needed.