Introduction

A fully functional auditory system in humans is necessary for communication and the perception of the surrounding environment. Even mild sensorineural hearing loss (HL) can cause lifelong impacts on social, financial, vocational, and educational needs. HL is one of the most common disabilities and heterogeneous disorders concerning clinical manifestation and etiology (NIDCD 2021). The frequency of HL at birth is estimated at 1–3/1000 in developed countries (Joint Committee on Infant Hearing 2007) and, in Brazil, it may reach 3–4/1000 in some regions (Braga et al. 1999; Mattos et al. 2009; Barboza et al. 2013; Marinho et al. 2020). According to the National Institutes of Health (USA), it affects 14% of the adult population (20 to 69 years of age) and 30 to 50% of the elderly (NIDCD 2021). The last census conducted by the Brazilian Institute of Geography and Statistics in 2010 (IBGE 2021) estimated that 5.1% of the Brazilian population exhibit some degree of HL. The World Health Organization predicts that 1 in 4 people (close to 2.5 billion people worldwide) will be living with some degree of HL in the year 2050 (WHO 2021).

The frequency of HL of genetic etiology reaches 60% in developed countries, while in developing countries, such as in Brazil, this number is expected to be lower due to the outstanding contribution of environmental factors (Braga et al. 1999; Faistauer et al. 2021). However, owing to the recent improvement in health care, those frequencies should become closer. Hereditary hearing loss (HHL) is exceptionally genetically heterogeneous, exhibiting Mendelian inheritance patterns and mitochondrial inheritance. Nearly 70% of individuals with HHL are non-syndromic, and 30% present syndromic HL (Van Camp and Smith 2021; Shearer et al. 2017). Over 400 of these HL-related syndromes have been described, the most frequent being the Usher Syndrome (USH) and Waardenburg Syndrome (WS) (Shearer et al. 2017).

Since 1997, 123 genes have been associated with non-syndromic HL (Van Camp and Smith 2021), and 224 deafness genes are already listed in the Deafness Variation Database, which also includes syndromic hearing loss genes. Autosomal recessive inheritance is observed in about 80% of the non-syndromic cases, usually severe to profound, with prelingual and stable onset (Van Camp and Smith 2021; Shearer et al. 2017). The contribution of the causative genes varies across different populations and ethnicities. However, the GJB2 and GJB6 genes (connexins 26 and 30, respectively, the DFNB1 locus, OMIM #220290) account for more than 50% of autosomal recessive cases in the white populations of Europe and the United States (Snoeckx et al. 2005). Other genes frequently associated with autosomal recessive HL, in many different populations, are the SLC26A4 generally associated with cochlea-vestibular anomalies, ranging from the enlarged vestibular aqueduct to Mondini dysplasia (OMIM #600791, DFNB4 locus); and OTOF, associated with auditory neuropathy (OMIM #601071, DFNB9); followed by CDH23 (OMIM #601386, DFNB12), TMC1 (OMIM #600974, DFNB7/11), TMPRSS3 (OMIM #601072, DFNB8/10) and MYO15A (OMIM #600316, DFNB3), with no specific clinical signs associated (Hilgert et al. 2009; Shearer et al. 2017).

Conversely, autosomal dominant genes account for 10–20% of HL cases, which are often postlingual and progressive (Hilgert et al. 2009; Shearer et al. 2017). Estimates worldwide point to a frequency of 1% of mitochondrial inheritance, but a higher prevalence, of 2%, has been observed in Brazil (Abreu-Silva et al. 2006). However, the predisposition to aminoglycoside ototoxicity conferred by the mitochondrial variant m.1555A > G (OMIM #561000, 12S rRNA or MT-RNR1 gene) raises the significance of its screening in HL.

Genomic technologies have sped up the discovery of novel genes and variants, accelerated the establishment of genotype–phenotype correlations, and increased the power of prognosis based on molecular diagnosis. However, those technologies are still expensive in developing countries and are not accessible for screening every hearing-impaired subject. Thus, seeking alternatives to improve the screening of specific genes is crucial for genetic counseling and scientific enrichment. Moreover, variant interpretation and correlation to phenotype are further challenges to overcome.

In this study, hearing-impaired subjects were investigated through mutational screening of GJB2 variants, the common GJB6 deletions, and the m.1555A > G variant in MT-RNR1. Other frequent HL genes were screened when the clinical presentation and inheritance pattern suggested the involvement of a specific gene, and comprehensive NGS screening was performed in a few cases.

Patients and methods

Patients

A total of 542 subjects with HL were referred to the Genetic Deafness Counseling Unit of the Otorhinolaryngology Department (Clinics Hospital of University of Sao Paulo School of Medicine). Part of this cohort (313) was described in Sampaio-Silva et al. (2017). No subjects were excluded from this study, regardless of HL severity, laterality, additional clinical findings or syndromic features, or possible environmental causes. Given that most of our subjects (~ 72%) were referred by the Cochlear Implant Unit from the same hospital, our cohort is biased towards severe to profound HL.

All DNA samples were collected after signing a written informed consent form, within the scope of the Research Protocol No. 65111517.0.0000.0068, approved on 08/17/2017 by CAPPesq (ETHICS COMMISSION FOR THE ANALYSIS OF RESEARCH PROJECTS at the School of Medicine Clinics Hospital of the University of São Paulo approval No. 2,224,363).

Clinical evaluation

Audiological data, including the otoacoustic emissions test (OAE), ABR test (auditory brainstem responses), and tonal audiometry, were available from all patients. Additional audiological exams available for most cochlear implant patients included temporal bone imaging (MRI—Magnetic Resonance Imaging and CT—computed tomography scan). Vestibular dysfunction was investigated upon patient complaint or clinical suspicion. Syndromic features were assessed after physical examinations and complete anamneses.

Molecular analysis

Genomic DNA was extracted from peripheral blood leukocytes using commercial kits or salting-out methods (Miller et al. 1988). The DNA quality and quantity were verified with a Nanodrop spectrometer (Nanodrop Technologies, Wilmington, DE, USA). The routine screening included analysis of the GJB2 coding region by Sanger sequencing; testing for the GJB6 deletions Δ(GJB6-D13S1830) and Δ(GJB6-D13S1854) by multiplex PCR (Del Castillo et al. 2005); and detection of m.1555A > G in MT-RNR1 by RFLP analysis (Estivill et al. 1998). GJB2 exon 1 was also sequenced when a single recessive pathogenic variant was detected. The m.1555A > G variant cases were validated by Sanger sequencing.

Sanger sequencing

In selected cases, Sanger sequencing was also performed to screen for pathogenic variants in the following genes: MITF (NM_000248.3), PAX3 (NM_181457.2), SOX10 (NM_006941.3), EDN3 (NM_000114.2), EDNRB (NM_000115.2), SLC26A4 (NM_000441), FOXI1 (NM_012188), KCNJ10 (NM_002241), OTOF (NM_194248), and TMPRSS3 (NM_024022.2). Additional specific variants in the following genes were also screened in selected cases: NCOA3 (NM_181659, c.2810C > G; p.(Ser937Cys)) as described in Salazar-Silva et al. (2021) and MYO3A (NM_017433.5, c.2090T > G; p.(Leu697Trp)) as described in Dantas et al. (2018). PCR was used to amplify the coding region of each gene with specific primer pairs designed using Primer3 (https://www.bioinfo.ut.ee/primer3-0.4.0/primer3) or PrimerBlast (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). PCR products were purified using Exonuclease I and FastAP Thermosensitive Alkaline Phosphatase (Fermentas, ThermoFisher Scientific Inc., Waltham, MA). Purified samples were submitted to a sequencing reaction with the ABI Big Dye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, ThermoFisher Scientific Inc., Waltham, MA); the sequencing product, purified with ethanol/sodium acetate/EDTA, was analyzed in an ABI PRISM 3500 Genetic Analyzer (Applied Biosystems, ThermoFisher Scientific Inc., Waltham, MA). The sequences were aligned to Ensembl Human GRCh37 sequences (https://www.ensembl.org) and analyzed with the BioEdit v7.2.5 software (Tom Hall, Ibis Biosciences, Carlsbad, CA, USA).

Pathogenicity and variants databases

Variant pathogenicity prediction was carried out with the following bioinformatics tools: VEP (Variant Effect Predictor GRCh37, Ensembl, https://www.ensembl.org/info/docs/tools/vep/index.html), for which variants predicted as moderate to high impact were considered probably deleterious; and dbNSFP (RRID:SCR_005178), with the following tools enabled: Mutation Taster, with variants scored as disease-causing considered deleterious (RRID:SCR_010777, http://www.mutationtaster.org/); CADD Phred, deleterious if scores ≥ 20, of unknown significance if 15 ≤ scores ≤ 20, and benign if scores ≤ 15 (https://cadd.gs.washington.edu/; Kircher et al. 2014); PROVEAN, deleterious if scores ≤ − 2.5 (RRID:SCR_002182); SIFT, deleterious if scores ≤ 0.05 (RRID:SCR_012813 http://provean.jcvi.org/index.php); PolyPhen2, deleterious if scores ≥ 0.450 (http://genetics.bwh.harvard.edu/pph2/); MaxEntScan, considered deleterious (native loss) if scores pointed to high impact (diff > 0, with alt < 6.2/ diff ≥ 1.15) or medium impact (alt < 6.2/ diff < 1.15 or 6.2 ≤ alt ≤ 8.5/ diff ≥ 1.15) (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html; Shamsani et al. 2019); SpliceAI pred, considered deleterious through disruption of splicing if one of the delta scores (acceptor or donor, gain or loss) was ≥ 0.8, and as having some effect on splicing if 0.5 ≤ delta score < 0.8 (https://hpc.nih.gov/apps/SpliceAI.html, Jaganathan et al. 2019); and REVEL, deleterious if scores ≥ 0.700 and of unknown significance if scores ≤ 0.700 (https://sites.google.com/site/revelgenomics/, Oza et al. 2018). All variants were searched in the Deafness Variation Database (https://deafnessvariationdatabase.com), ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), the GnomAD database (https://gnomad.broadinstitute.org/, Karczewski et al. 2020), and ABraOM (Brazilian genomic variants, https://abraom.ib.usp.br). We followed the standard ACMG guidelines as described in Richards et al. (2015) combined with the specifications of Oza et al. (2018). All the novel variants classified as likely pathogenic or pathogenic were predicted as deleterious by the majority of in silico tools and had a low MAF, below the thresholds proposed by Shearer et al. (2014) for autosomal recessive and autosomal dominant variants, 0.005 (GJB2 and OTOF) and 0.0005 (Waardenburg genes), respectively. Variants were considered causative when they were classified as pathogenic or likely pathogenic by the ACMG guidelines. The novel variants herein reported are under submission for inclusion in the Deafness Variation Database, and all variants were included in ClinVar (accession numbers SCV001792212 to SCV001792241) and the Global Variome shared LOVD.
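The score cutoffs and MAF thresholds listed above can be sketched as a small filter. This is only an illustrative sketch, not the pipeline used in the study: the dictionary keys, the function names, and the labels "not_deleterious", "some_effect", and "no_effect" are hypothetical, and where the stated ranges overlap at a boundary (for example, a CADD Phred score of exactly 15) the code assumes the boundary value falls in the higher category.

```python
from typing import Dict

# MAF thresholds from the text (Shearer et al. 2014): recessive vs dominant.
MAF_THRESHOLD = {"AR": 0.005, "AD": 0.0005}


def call_tool(tool: str, score: float) -> str:
    """Map one in silico score to a call using the cutoffs stated in the text.

    Tool keys are hypothetical names chosen for this sketch.
    """
    if tool == "cadd_phred":
        if score >= 20:
            return "deleterious"
        # Assumption: a score of exactly 15 is treated as uncertain.
        return "uncertain" if score >= 15 else "benign"
    if tool == "provean":
        return "deleterious" if score <= -2.5 else "not_deleterious"
    if tool == "sift":
        return "deleterious" if score <= 0.05 else "not_deleterious"
    if tool == "polyphen2":
        return "deleterious" if score >= 0.450 else "not_deleterious"
    if tool == "revel":
        return "deleterious" if score >= 0.700 else "uncertain"
    if tool == "spliceai_max_delta":
        if score >= 0.8:
            return "deleterious"
        return "some_effect" if score >= 0.5 else "no_effect"
    raise ValueError(f"unknown tool: {tool}")


def majority_deleterious(scores: Dict[str, float]) -> bool:
    """True when more than half of the available tools call 'deleterious'."""
    calls = [call_tool(tool, score) for tool, score in scores.items()]
    return sum(call == "deleterious" for call in calls) > len(calls) / 2


def below_maf_threshold(maf: float, inheritance: str) -> bool:
    """True when the allele frequency is below the threshold for 'AR' or 'AD'."""
    return maf < MAF_THRESHOLD[inheritance]


# Hypothetical scores for a made-up missense variant (not data from the study).
example_scores = {
    "cadd_phred": 26.1,
    "provean": -4.2,
    "sift": 0.01,
    "polyphen2": 0.98,
    "revel": 0.65,
}
print(majority_deleterious(example_scores), below_maf_threshold(0.001, "AR"))
```

In this sketch a variant would need both checks to pass before being considered further under the ACMG criteria; the in silico calls are supporting evidence only and do not replace the full classification.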

CNVs analysis

Quantitative real-time PCR assays were conducted to screen for CNVs involving the Waardenburg genes in the WS2 patients with no pathogenic variants detected by Sanger sequencing in the genes PAX3, MITF, SOX10, EDNRB, and EDN3. Amplicons of the qPCR were located in: SOX10 exons 2 (NM_006941.3) and 4 and intron 1 (XP_005261777.1); MITF intron 2 and exons 6, 7, and 9 (NM_000248.3); PAX3 exon 8 (NM_181457.2); and EDNRB exon 1 (NM_000115). The qPCR reactions were carried out in a Step One System (Thermo Scientific, Waltham, MA, USA) with 10–15 ng of DNA, 0.1 to 0.35 µM of each primer, and 2 × PowerUp™ SYBR™ Green Master Mix (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer’s protocol, but with a final volume of 10 µl. All primer pairs employed exhibited only one amplified fragment, visualized as a single melting-curve peak, and an efficiency between 90 and 110%, obtained from standard curves of serial dilutions of the DNA template. The 2^(−ΔCt) model was used for CNV estimation (Livak and Schmittgen 2001). For all experiments, the reference gene and reference sample were run together with the tested gene, with samples from CNV carriers (when available) and non-carriers. CNV carriers (positive controls) were kindly provided by Dr. Regina C. Mingroni-Netto Lab (Laboratório de Genética Humana, Instituto de Biociências da Universidade de São Paulo, São Paulo, Brazil) and were described in Bocángel et al. (2018). A two-tailed unpaired t test, with a 95% confidence interval, was used to determine whether the copy number estimations of reference and tested samples were statistically different. Samples from subjects W12 and W13 underwent array CGH analysis (Agilent 180K, detection of ≥ 300 kb CNVs), according to the protocol described in Lezirovitz et al. (2020), and exome sequencing, both in collaboration with Dr. Regina C. Mingroni-Netto Lab. MLPA analysis was performed to validate the MITF exon 6 deletion detected by qPCR, in collaboration with Dr. Veronique Pingault Lab (Laboratory of Embryology and Genetics of Human Malformation, Imagine Institute, Université de Paris, Paris, France).
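The qPCR arithmetic described above can be illustrated with a short sketch. It is not the authors' analysis script: the function names and the Ct values are hypothetical, the efficiency formula is the usual one derived from the slope of a standard curve (Ct against log10 of template amount), and scaling to a two-copy reference sample is an assumption about how the reference sample was used.

```python
def efficiency_percent(slope: float) -> float:
    """Amplification efficiency (%) from the standard-curve slope.

    A slope of about -3.32 corresponds to a doubling per cycle (100%).
    """
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def relative_quantity(ct_target: float, ct_reference_gene: float) -> float:
    """2^(-dCt), with dCt = Ct(target amplicon) - Ct(reference gene)."""
    return 2.0 ** (-(ct_target - ct_reference_gene))


def estimated_copies(sample_rq: float, reference_sample_rq: float,
                     reference_copies: int = 2) -> float:
    """Scale a sample's relative quantity to a reference sample.

    Assumption for this sketch: the reference sample carries two copies.
    """
    return reference_copies * sample_rq / reference_sample_rq


# Hypothetical Ct values (not data from the study).
eff = efficiency_percent(-3.32)
primer_ok = 90.0 <= eff <= 110.0           # acceptance window from the text
reference_rq = relative_quantity(25.0, 24.0)  # non-carrier reference sample
tested_rq = relative_quantity(26.0, 24.0)     # tested sample, one cycle later
print(primer_ok, estimated_copies(tested_rq, reference_rq))
```

With these made-up values, the tested sample crosses the threshold one cycle later than the reference sample for the same reference-gene Ct, which corresponds to half the template and therefore an estimate of one copy, the pattern expected for a heterozygous deletion.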

CNVs involving the DFNA58 protein-coding genes were analyzed by MLPA as described in Lezirovitz et al. (2020) and by qPCR of three amplicons, one in each gene, with the same conditions described above.

Computational molecular modeling

We performed comparative molecular modeling of three-dimensional structures using SWISS-MODEL (Arnold et al. 2006; Biasini et al. 2014). The MITF variant was modeled for both the longest transcript, NM_198159, and its mutant p.(Ser351Tyr), and the melanocyte-specific isoform transcript NM_000248.3 and its mutant p.(Ser250Tyr). Sequences were collected from the GenBank database (Supplementary Table S1 and Supplementary Files S1-S5), and templates were evaluated using the protein–protein BLAST tool (Johnson et al. 2008), searching the Protein Data Bank (PDB) (Berman et al. 2000). Multiple sequence alignments were performed using the Clustal Omega web tool (Sievers and Higgins 2014). Model quality was assessed using the QMEAN score function (Studer et al. 2020). Ramachandran plot analyses were performed using the MolProbity web tool (Lovell et al. 2003). Finally, structural alignments were performed using PyMOL software (https://pymol.org). Comparative modeling reports and Ramachandran plots are available in the supplementary materials. The Ramachandran plot is a visual representation of a protein’s phi (φ) and psi (ψ) torsional angles. Due to restrictions in the main chain bonds, not all bond angles are allowed (Ramachandran et al. 1963). Hence, this plot can be used to assess the quality of structures obtained by comparative modeling (Wiltgen 2019). Additionally, Poisson-Boltzmann surface analyses (PBSA) were performed for wild-type and mutant p.(Lys150Glu) model structures of SOX10 using the APBS-PDB2PQR software suite (Baker et al. 2001; Dolinsky et al. 2004). Contact analyses were performed using the ARPEGGIO web tool (Jubb et al. 2017).
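To make the Ramachandran quantities concrete, the following sketch computes a torsion (dihedral) angle from four atomic coordinates. It is a generic geometric illustration rather than the procedure implemented in MolProbity; the function name and the test coordinates are hypothetical. For residue i, φ is the dihedral over C(i−1), N(i), Cα(i), C(i) and ψ is the dihedral over N(i), Cα(i), C(i), N(i+1).

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed torsion angle (degrees, -180..180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components along the central bond, leaving the projections
    # of the two outer bonds onto the plane perpendicular to that bond.
    s0 = _dot(b0, axis)
    s2 = _dot(b2, axis)
    v = (b0[0] - s0 * axis[0], b0[1] - s0 * axis[1], b0[2] - s0 * axis[2])
    w = (b2[0] - s2 * axis[0], b2[1] - s2 * axis[1], b2[2] - s2 * axis[2])
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: the fourth atom is rotated a quarter turn.
angle = dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                         (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(angle, 1))
```

Plotting the (φ, ψ) pair of every residue of a model gives the Ramachandran plot; residues falling outside the sterically allowed regions flag parts of the model that deserve inspection.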

Exome/panel NGS

The protocols are described in Sampaio-Silva et al. (2017), de Lima et al. (2018), and Bueno et al. (2021). In addition, the gene compositions of the two NGS HL panels, of 18 and 116 genes, are listed in Supplementary File S6.

Results and discussion

The present cohort consisted of 542 hearing-impaired subjects, mostly sporadic, non-syndromic, and sensorineural, with prelingual/perilingual onset, as shown in Fig. 1A. Among them, seven were from foreign countries, and 535 were from the five Brazilian regions (Southeast—SE, South—S, Northeast—NE, North—N, and MidWest—MW), representing 22 out of the 25 states (Fig. 1B, Supplementary Table S2). The characterization of the molecular screening strategies and findings are summarized in Fig. 1C.

Fig. 1

A Characterization of the cohort of 542 hearing-impaired subjects; B map showing the contribution of each Brazilian region to our cohort; C flow diagram of the study concerning the genetic screening strategies and their findings. Frequencies inside the boxes are related to the whole cohort (542). For example, 16.3% of the cases had GJB2/GJB6 pathogenic/likely pathogenic variants. The frequencies outside the boxes were calculated within the previous group. For example, in 81.8% of the cases with GJB2/GJB6 variants, they were causative. Grey squares indicate causative variants found (solved cases); the number of subjects in each category is inside circles. The number of observed probands with each of the most frequent variants is inside a square. #WS subject without HL not included; both subjects had skin/nails anomalies; *a total of 14 WS cases, all screened for WS genes, including the GJB2 monoallelic and one w/o HL; **only the most frequent variants; ***two had both malformations; Chr chromosomes, WS Waardenburg syndrome, AR autosomal recessive, AD autosomal dominant, bial biallelic, mono monoallelic, Coch-vest Cochleo-vestibular, Mid-Ex middle-external ear

The investigation of the genetic heterogeneity of HL is beneficial not only for precise genetic counseling but also for fundamental contributions to auditory physiology knowledge. In addition, molecular diagnosis has further advantages like orientation regarding risk factors (such as ototoxic drugs), distinguishing syndromic from non-syndromic cases, and providing clinical prognosis and care (Alford et al. 2014; de Lima et al. 2018; Nonose et al. 2018). GJB2/GJB6 variants are the primary cause of NSHL in most populations, and consequently, are the focus of molecular assessment in genetic counseling services and newborn hearing screening programs (Shearer et al. 2017). Besides, many researchers worldwide improved genetic counseling by establishing the pathogenic nature of controversial variants (Shearer et al. 2014; Oza et al. 2018; Shen et al. 2019; Deafness Variation Database).

The nonspecific clinical presentation associated with the deafness genes demands the simultaneous analysis of several genes to accelerate molecular diagnosis (Arnos 2003; Dror and Avraham 2010). However, many comprehensive technologies widely used in developed nations as a routine in molecular diagnosis are still unavailable for most people living in developing countries. Thus, as presented here (Fig. 1C), GJB2/GJB6 and m.1555A > G screening in every hearing-impaired patient is already a remarkable achievement (Lezirovitz and Mingroni-Netto 2021, this issue).

The present report is one of the few studies in Latin America that screened cases including those with malformations, or syndromic features, or due to environmental factors, providing more accurate estimations and highlighting the importance of molecular testing even when other risk factors are present (Lezirovitz and Mingroni-Netto 2021, this issue).

Surprisingly, a lower frequency of m.1555A > G was observed (0.6% versus 2%) compared to the study of Abreu-Silva et al. (2006), also conducted in São Paulo city (Fig. 1C and Supplementary Table S3). This difference in frequency could be due to some ascertainment bias, for instance, in HL age of onset or ancestry, and also to the stricter laws concerning antibiotic purchase implemented in 2011, which could diminish the penetrance of m.1555A > G-associated HL (Lezirovitz and Mingroni-Netto 2021, this issue).

Brazil’s regional differences in GJB2/GJB6 contribution: corroborating and expanding previous studies

Overall, 20 different GJB2/GJB6 variants, likely pathogenic or pathogenic, were identified, corresponding to 12.9% of subjects with biallelic recessive pathogenic variants and 0.4% with heterozygous dominant variants (Fig. 1C and Supplementary Tables S4, S5, and S6). This frequency is quite similar to the 12% of biallelic cases observed in an independent study also conducted in the city of São Paulo (Batissoco et al. 2009). In addition, biallelic GJB2 variants were found in two of the foreign cases, from the USA and Spain (29%), both compound heterozygotes for c.35delG and another GJB2 variant.
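As a quick consistency check on how the nested frequencies in Fig. 1C relate to the whole-cohort figures quoted here, the within-group percentage can be multiplied by the percentage of its parent group. The sketch below uses only percentages stated in the text and the figure legend; the function name is hypothetical.

```python
def whole_cohort_percent(parent_percent: float, within_percent: float) -> float:
    """Convert a frequency calculated within a parent group into a
    frequency relative to the whole cohort."""
    return parent_percent * within_percent / 100.0


# 16.3% of the cohort carried GJB2/GJB6 pathogenic/likely pathogenic variants,
# and the variants were causative in 81.8% of those carriers (Fig. 1C legend).
solved = whole_cohort_percent(16.3, 81.8)

# Biallelic recessive (12.9%) plus heterozygous dominant (0.4%) cases.
reported = 12.9 + 0.4

print(round(solved, 1), round(reported, 1))
```

Both routes give about 13.3% of the cohort solved by GJB2/GJB6 screening, which shows that the percentages inside and outside the boxes of Fig. 1C are mutually consistent.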

A higher contribution of GJB2/GJB6 causative variants was observed in cases with positive family history (19%), pre/perilingual onset (19%), consanguineous parents (22%), and non-syndromic cases (15%) (Fig. 2A). Furthermore, even though females correspond to most of our cohort (55%), males were more represented among subjects with GJB2/GJB6 variants. The males also had a higher diagnostic rate than females (17% against 10%), as shown in Figs. 1A and 2A.

Fig. 2

Summary of the GJB2/GJB6 variants and clinical characterization. A Graphic representation of the different GJB2/GJB6 diagnostic rates among gender or each clinical group; B chromatogram showing the novel GJB2 variant in compound heterozygosis with c.35delG; C contribution of each Brazilian region to the GJB2/GJB6 alleles; D the GJB2/GJB6 diagnostic rate among the different regions of Brazil; E diversity of the GJB2/GJB6 pathogenic variants in each region; F frequency of the c.35delG variant among GJB2/GJB6 pathogenic alleles in each region; G frequency of the c.35delG variant among tested chromosomes in each region; H clinical characterization of HL caused by GJB2/GJB6 variants, demonstrating the relationship between the type of mutation (truncating or not) and HL severity. SE southeast, S south, NE northeast, N north, MW midwest

A novel frameshift variant (c.79_82delGTCCinsAGA), found in compound heterozygosity with c.35delG, is described here. Frameshift is a frequent mutational mechanism in GJB2, and the variant was predicted as pathogenic by VEP and MutationTaster (Fig. 2B and Supplementary Tables S4, S5, and S6).

The present study is the first that captures the vast diversity of variants and their regional differences, representing all Brazilian regions, which contributed to the identified GJB2/GJB6 alleles in proportion to their contribution to the cohort (Supplementary Table S2; Fig. 2C, D). Although a wider variety of variants was found in the present study (Supplementary Table S5; Fig. 2E) compared to previous studies from São Paulo state, the c.167delT, for example, was not detected here but had been identified in these previous studies (Batissoco et al. 2009; Martins et al. 2013). Thus, this study demonstrated that GJB2/GJB6 is a significant cause of HL in all regions of Brazil. However, there were regions with unexpected GJB2/GJB6 allele contribution and diagnostic rates, compared to the Brazilian average, including N, with smaller than expected frequency, and NE, with higher-than-expected frequency, as shown in Fig. 2C (as compared to Fig. 1B) and Fig. 2D. For instance, the Northern region had a diagnostic rate of 7.7%, and c.35delG was the only pathogenic variant detected (Fig. 2D–F), in agreement with a study from the same region (Pará state) in which the diagnostic rate was 1.3% and c.35delG represented 80% of the mutated alleles, but GJB6 deletions were not tested (Castro et al. 2013). Conversely, the other studies performed in the NE region were not comparable to ours because they included related patients (Manzoli et al. 2013) or patients from a small geographical region (Melo et al. 2014). Based on genomic data, it has been estimated that all Brazilian regions show between 60% and 80% of European ancestry, with the smaller proportions exhibited in NE (60%) and N (69%) regions. On the other hand, the N region exhibits the highest proportion of Amerindian ancestry (18.6%), and the NE has the highest proportion of African ancestry (29%) (Pena et al. 2011). Thus, the lowest GJB2/GJB6 contribution in the N could be explained by this higher Amerindian contribution, similar to that observed in Guatemala (Carranza et al. 2016) and Venezuela (Angeli et al. 2000; Utrera et al. 2007).

The c.35delG variant constituted ~ 70% of the 158 GJB2/GJB6 mutated alleles identified in Brazil (Supplementary Tables S4, S5, and S6; Fig. 2F), and it was the most frequent variant in all regions except the MW, which had the lowest frequency, whether relative to the overall mutated alleles (20%) or to all chromosomes screened (3%). This region showed a larger variety of equally contributing variants: c.71G > A:p.(Trp24Ter), c.101T > C:p.(Met34Thr), c.551G > C:p.(Arg184Pro), and del(GJB6-D13S1854). These regional differences argue against the c.35delG single test as the best screening strategy to achieve a molecular diagnosis and reinforce the need to include the GJB6 deletions test, which was also demonstrated to some extent in other Brazilian studies (Melo et al. 2014; Felix et al. 2014). Conversely, the highest c.35delG frequency was obtained in the South region (Fig. 2F, G). The study of Faistauer et al. (2021) and the present one showed similar results concerning the c.35delG frequency among mutated alleles in the S region, 100% and 90%, respectively, but the present diagnostic rate was higher than that obtained in their study, 14.3% compared to 11.5% (Fig. 2D, F).

The second most frequent mutated allele in GJB2 was c.109G > A:p.(Val37Ile), which accounts for 6.4% of the mutated alleles and was detected in the SE and NE regions. Two other pathogenic variants were recurrent: c.71G > A:p.(Trp24Ter) (4.5%), identified in the SE and MW, and c.101T > C:p.(Met34Thr) (3.8%), found in the SE, NE, and MW regions (Fig. 1C and Supplementary Tables S4, S5, and S6). The large deletions involving GJB6, del(GJB6-D13S1830) and del(GJB6-D13S1854), were each found in 2.1% of the mutated chromosomes, in S/MW and SE/NE regions, respectively (Fig. 1C and Supplementary Tables S4, S5, and S6).

The most frequent polymorphism in the whole cohort was rs3751385 (NM_004004.6:c.*84T > C), with the T allele observed in 265/406 (65%) of the analyzed chromosomes with no pathogenic variants, and the second most common polymorphism was p.(Val27Ile), NM_004004.6:c.79G > A, rs2274084, detected in 12.5%.

Genotype–phenotype correlations of GJB2/GJB6 recessive variants

All cases attributed to recessive GJB2/GJB6 variants were bilateral (Fig. 2A) and of prelingual or perilingual onset. Genotypes with two truncating variants tend to exhibit more severe phenotypes, mostly congenital and profound, whereas genotypes including at least one non-truncating variant showed milder phenotypes, sometimes with progressive HL. Detailed clinical data of all genotypes are displayed in Fig. 2H (and Supplementary Table S6). In our cohort, c.35delG/c.35delG genotypes were slightly more frequently associated (73%) with congenital profound HL than c.35delG in compound heterozygosity with another truncating variant (67%). Variability in phenotype severity, when c.35delG is one of the alleles, also depends on the truncating/non-truncating characteristic of the second variant. Overall, HL tends to be less severe if the second variant is non-truncating compared to two truncating variants. Surprisingly, no genotype composed of two non-truncating variants was observed apart from those involving the so-called hypomorphic variants p.(Met34Thr) and p.(Val37Ile), findings similar to the study of Snoeckx et al. (2005).

Unexpectedly, we detected two cases of profound congenital phenotypes among the homozygous/compound heterozygous cases with two hypomorphic variants. Shen et al. (2019) proposed that another etiological factor should be suspected when congenital profound HL is observed among carriers of hypomorphic variants (either in homozygosis or compound heterozygosis). In agreement with this proposition, both carriers of hypomorphic variants with profound HL have other risk factors. Subject G6 was born to a closely consanguineous couple (father–daughter), making homozygosity for a variant in another deafness gene likely, and G58 had prematurity as another risk factor. Besides, a more thorough molecular screening such as WES could disclose other genetic causes, especially in the first case. Nonetheless, in a comprehensive study regarding these hypomorphic variants, only one case of 14 investigated through NGS had an additional genetic cause of HL (Chai et al. 2015).

Genotype–phenotype correlations of dominant variants

Patient G4 was a familial case with postlingual progressive HL and skin disease. Variant c.224G > A:p.(Arg75Gln) in GJB2 was identified in the proband and in three other family members, two also with the skin disease and one showing only HL at seven years (Supplementary Tables S4, S5, and S6). These findings finally led to the clinical diagnosis of the skin disease as palmoplantar keratoderma. Their phenotype differed from the phenotype observed in another study, where congenital HL was found in most cases, and all cases had syndromic features (Pang et al. 2014).

In a sporadic case of HL, the variant p.(Arg184Gln) was identified (patient G46). His normal-hearing parents did not carry this variant. Therefore, it must have occurred de novo. Since birth, the patient has also exhibited undiagnosed nail abnormalities, which are likely associated with this variant (Supplementary Table S6). Even though p.(Arg184Gln) is listed as associated with non-syndromic HL in the Deafness Variation Database, there have been a few reports of skin and nail abnormalities (Pang et al. 2014). These findings further emphasize the variability of clinical presentation of GJB2 dominant variants, even within families. Pang et al. (2014) suggested that ethnicity could influence the differences in the genotype–phenotype correlations. Moreover, both cases presented here illustrate the power of molecular diagnosis to provide the proper clinical diagnosis and, consequently, better management of their skin diseases.

The present frequency of 0.37% of dominant GJB2 variants is quite similar to that (0.32%) encountered by Pang et al. (2014), probably because about 70% of GJB2 dominant variants arise through de novo mutational events (Pang et al. 2014). In 50% of the present cases, they were de novo, as observed in case G46.

Monoallelic cases: causative or coincidence?

A meta-analysis showed that the frequency of heterozygotes with a truncating GJB2 variant (monoallelic) in hearing-impaired subjects is twice the normal-hearing population frequency (Chan and Chang 2014). A similar result was described by Seeman and Sakmaryová (2006) regarding the c.35delG. A statistical comparison of the frequency of the c.35delG, p.(Val37Ile) and p.(Met34Thr) variants between the general population and HL cohorts (present and Batissoco et al. 2009) is presented in Supplementary Table S7. Indeed, there is an excess of c.35delG heterozygotes among the hearing-impaired subjects, but the same is not observed for the two hypomorphic non-truncating variants. Brozkova et al. (2021) searched for CNVs, SNVs, and shared haplotypes in the GJB2/GJB6 genes in 28 GJB2 heterozygotes, failing to find another pathogenic GJB2/GJB6 variant. However, WES identified causative variants in other deafness genes in 22%. Hence, many monoallelic cases might remain unsolved even after complete molecular screening, which raises the hypothesis that limitations in techniques or current knowledge prevent identification of the cause. Alternatively, pathogenic alleles could be one of the risk factors in a multifactorial inheritance mechanism.

Specific genes screening revealed other significant contributions to HL in Brazil

After ruling out GJB2/GJB6 and m.1555A > G variants as causative, stepwise approaches were employed to select cases for additional molecular screenings based on clinical data and presumptive inheritance (Fig. 1C), since a comprehensive molecular screening was not affordable for all cases. In addition, for a few cases (21), NGS (exome or gene panel) was performed (Fig. 1C, Supplementary Table S3): one auditory neuropathy case (causative variants found), eight familial non-syndromic cases (two with causative variants), four USH cases (two with causative variants), four sporadic non-syndromic cases (one with causative variants), three of the 16 monoallelic GJB2 cases, one of them also with WS, and another WS case (none solved). None of the cases screened with the 18-gene panel were solved.

Besides the GJB2 novel variant described above, one novel variant was detected in OTOF and seven among the Waardenburg genes (six variants and one CNV) (Supplementary Tables S3, S4, S5, and S8).

SLC26A4 and cochleo-vestibular malformations

A range of ear malformations involving the external/middle ear or cochleo-vestibular malformations such as enlarged vestibular aqueduct (EVA), dysplasia/aplasia of vestibular organs, incomplete partition of the cochlea (e.g., Mondini), cochlear aplasia, hypoplasia, or absence of the cochlear nerve were observed in 36 subjects (10.8%, Fig. 1C). Additional syndromic features, such as Down syndrome, Treacher-Collins syndrome, and suspicion of CHARGE syndrome, were present in 27.8% (10/36) of the cases with these malformations. The frequency of cochleo-vestibular malformations found here (8.4%; Fig. 1C) is similar to the 7.5% obtained by Aldhafeeri and Alsanosi (2016) in cochlear implant recipients.

Incomplete partitioning of the cochlea (Mondini) or enlarged vestibular aqueduct (EVA) without other clinical features was the inclusion criterion used to select 15 cases for SLC26A4 screening. Biallelic variants were identified in three (3/15 or 20%), co-segregating with HL in these families (Figs. 1C and 3, and Supplementary Table S3). One proband with non-syndromic EVA was classified as monoallelic (Supplementary Table S3) since only a novel variant, c.1662T > G: p.(Ile554Met), was identified even after FOXI1 and KCNJ10 molecular analysis. This novel SLC26A4 variant was considered deleterious by all bioinformatics tools and molecular modeling analysis (Supplementary Tables S1 and S3, Supplementary File S2 and Fig. 3). Nevertheless, because no second variant was found, the clinical significance of this novel variant remains inconclusive, and further molecular screenings will be needed to clarify it, for example, analysis of the common SLC26A4-linked haplotype described by Chattaraj et al. (2017) and a search for possible CNVs involving SLC26A4. The proportion of cases without SLC26A4 causative variants found here (70%) was higher than reported for North American and European populations (Pryor et al. 2005; Albert et al. 2006; Campbell et al. 2001; Coyle et al. 1998). Possible reasons are our broader inclusion criteria (cochleo-vestibular malformation instead of EVA only) or a different SLC26A4 contribution in our population. Indeed, Nonose et al. (2018), in a study also from São Paulo, found no biallelic probands among suspected cases of Pendred syndrome or cochleo-vestibular malformations, but two monoallelic cases were detected (13%). Conversely, two biallelic cases were identified among 16 families with presumptive autosomal recessive inheritance and microsatellite segregation consistent with linkage to SLC26A4 (Nonose et al. 2018).

Fig. 3

Pedigrees in which the causative variants were identified and the respective chromatograms. The molecular modeling of the p.(Phe161Ile) and p.(Ile554Met) variants in SLC26A4 is represented below to the left. This was performed by comparison, using the structure of human SLC26A9 as a template. A and C show the wild-type structures, B the F161I or p.(Phe161Ile) mutant, and D the I554M or p.(Ile554Met) mutant. It is noteworthy that residue 161 is located near the surface, whereas residue 554 is buried in the protein center, and their respective variants will probably have opposite effects on protein properties: a decrease in the number of hydrophobic contacts (15 to 7) in the p.(Phe161Ile) mutant and an increase (13 to 20) in the p.(Ile554Met) mutant

OTOF and auditory neuropathy

The frequency of cases within the auditory neuropathy (AN) spectrum found here (3.5%) falls within the prevalence range described in the literature, 1%–19% (Psarommatis et al. 1997; Rance 2005; Foerst et al. 2006; Maris et al. 2011; Silva et al. 2015), but is higher than that of another Brazilian study, in which it was estimated at 1.2% among cases of deafness (Penido and Isaac 2013). This variation may be due to differences in the diagnostic criteria or to population differences.

Among the 19 AN cases, two were selected for molecular analysis: case O1, with two affected siblings, by OTOF Sanger sequencing, and the sporadic case O2 by WES (Fig. 1C). Biallelic causative variants were detected in both; one of the four variants is novel (Fig. 3 and Supplementary Table S3). Despite the small sample, OTOF screening proved effective in providing a molecular diagnosis for auditory neuropathy patients and, consequently, in indicating a good prognosis for a cochlear implant. Accordingly, a high prevalence of OTOF causative variants, ~ 86% (12/14), was found in patients with AN in a follow-up of the study conducted by Romanos et al. (2009), which is presented in this issue (Lezirovitz and Mingroni-Netto 2021).

Novel Brazilian deafness variants: frequency in our cohort

The finding of a third unrelated Brazilian family, whose postlingual progressive HL also segregates with the pathogenic variant c.2090T > G: p.(Leu697Trp) in the MYO3A gene (Dantas et al. 2018), led to a collaborative study to investigate its frequency among cases of autosomal dominant postlingual progressive HL (Fig. 1C). As a result, Bueno et al. (2021) demonstrated that five families from Brazil, two of them from this cohort, and one from The Netherlands had their HL attributed to this variant and shared a common ancestor (Fig. 1C, Supplementary Table S3). Thus, this variant accounted for 3.6% (2/56) of the cases screened here (Fig. 1C) and could be relevant for investigating Brazilian and European families.

A ~ 200 kb genomic duplication, including three well-known protein-coding genes (CNRIP1, PPP3R1, and PLEK), was identified as the genetic signature segregating with autosomal dominant postlingual progressive HL in 20 members of the DFNA58 family (Lezirovitz et al. 2009, 2020). However, this genomic duplication or other CNVs involving these genes were not detected among 50 Brazilian families with postlingual progressive HL with presumptive autosomal dominant inheritance in this sample (Fig. 1C). Furthermore, the study of another Brazilian pedigree revealed the NCOA3 gene as a likely novel deafness gene (Salazar-Silva et al. 2021). Nonetheless, the pathogenic NCOA3 variant was not detected in 29 subjects of our cohort selected for this screening.

Relevance of NGS to achieve a diagnosis

Among eight familial cases with non-syndromic HL screened through WES or HL-gene panels (Fig. 1C), two with postlingual progressive HL had causative variants revealed (25%). A novel MYO6 pathogenic variant and the corresponding genotype–phenotype correlations were reported in Sampaio-Silva et al. (2017) (Fig. 1C and Supplementary Table S3). Biallelic variants in TMPRSS3, both already described as pathogenic, were found in three affected members from the second familial case, and none of the seven normal-hearing members carried both (Fig. 3, Supplementary Table S3). Our findings further support the pathogenicity of p.(Ala426Thr) in TMPRSS3, which was at first described as a possibly benign polymorphism. Later, it was demonstrated that it affects protein function and causes HL depending on the second mutation in trans. Furthermore, the present cases expand knowledge about the phenotypic presentation associated with these variants: onset is usually prelingual or in childhood, but it was later in the cases described here, often in adulthood.

Identifying one case with TMPRSS3 biallelic pathogenic variants motivated the screening of this gene (Sanger sequencing) in postlingual progressive HL cases with presumptive autosomal recessive inheritance (Fig. 1C). Interestingly, one sporadic case out of the 31 tested (3.2%) was found to have biallelic variants in TMPRSS3, both previously described and here confirmed to be in trans (Fig. 3 and Supplementary Table S3).

Among the three sporadic, apparently non-syndromic cases screened with WES or the HL panel of 116 genes, two revealed potential genetic causes (Fig. 1C). In the first case (M1; see Fig. 3 and Supplementary Table S3), with congenital sensorineural HL (severe in the left ear and profound in the right ear), two loss-of-function variants in MYO15A were identified. The c.3524_3525insA:p.(Ser1176Valfs*14) variant was inherited from the father, but the mother did not carry either variant. This finding suggests that the pathogenic variant c.1615C > T:p.(Gln539Ter) might have occurred de novo or that the mother has germline mosaicism.

The second case was affected by congenital moderate HL and showed biallelic variants in the USH2A gene, both previously described, one as pathogenic and the other with conflicting interpretations of pathogenicity (Fig. 3, Supplementary Table S3). The USH2A gene is responsible for Usher Syndrome (USH), characterized by sensorineural HL and retinitis pigmentosa, with or without vestibular dysfunction, and autosomal recessive inheritance. It is the most common cause of deaf-blindness worldwide, exhibiting clinical and genetic heterogeneity (Toms et al. 2020). Until the age of 2 years, retinal mapping and all ophthalmological exams were normal in this patient. However, retinitis pigmentosa typically shows a later onset than HL in USH patients. Thus, the association between the detected variants and the HL phenotype remained inconclusive, given the unknown significance of one of the variants and the present clinical findings. Nevertheless, periodic ophthalmological exams were recommended.

In our cohort, eight cases had a clinical suspicion of USH because retinitis pigmentosa (RP) was noticed in the probands or in first-degree deaf relatives, representing 1.5% of the total and 7.9% of syndromic cases. The onset of HL was perilingual and progressive in one case, postlingual in two, and prelingual in five. Parental consanguinity was reported in three of the eight cases (37.5%), one of them also familial. Four of the eight USH cases were reported in de Lima et al. (2018), with biallelic pathogenic variants found in two cases and monoallelic variants in two others (Fig. 1C and Supplementary Table S3).

In summary, NGS analysis in the selected non-syndromic cases, sporadic or familial, allowed identification of causative variants in 23.5% (4/17). When only the large panel of 116 genes or WES is considered, a diagnosis was achieved in 28.6% (4/14), which is within the range of previously published WES diagnostic rates for non-syndromic HL, which vary from 20 to 40% depending on whether GJB2/GJB6 is excluded or not (Likar et al. 2018; Sloan-Heggen et al. 2016). In some cases, the failure to detect pathogenic variants could be attributed to technical issues, such as low coverage, variants outside coding regions, or CNVs that were not screened, or to yet unidentified HL genes.
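Because the diagnostic yields above rest on small denominators, it can help to see how much uncertainty surrounds them. The following minimal sketch recomputes the two proportions and attaches a 95% Wilson score interval to each; the choice of the Wilson interval and the function name are assumptions made for illustration and are not part of the original analysis.

```python
from math import sqrt

def yield_with_wilson_ci(solved, screened, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    p = solved / screened
    denom = 1 + z**2 / screened
    centre = (p + z**2 / (2 * screened)) / denom
    half = z * sqrt(p * (1 - p) / screened + z**2 / (4 * screened**2)) / denom
    return p, centre - half, centre + half

# Counts as quoted in the text: solved cases / screened cases
rates = {
    "NGS, all selected non-syndromic cases": (4, 17),
    "WES or 116-gene panel only": (4, 14),
}

for label, (solved, screened) in rates.items():
    p, low, high = yield_with_wilson_ci(solved, screened)
    print(f"{label}: {100 * p:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

With so few cases, the intervals are wide, so agreement with the published 20 to 40% range should be read as compatibility rather than as a precise estimate.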

Waardenburg syndrome—prevalence, genotype–phenotype correlations, and mechanisms

WS is characterized by the association of HL, pigmentation disorders, and facial dysmorphisms, most frequently with autosomal dominant inheritance. Fourteen cases (one of them with normal hearing, Fig. 1C and Supplementary Table S8) were clinically diagnosed with WS because at least two clinical signs among these three classes were observed. WS represented 12.9% of the syndromic cases or 3.8% of prelingual/perilingual cases in our sample, which is within the literature range of 1–5% of cases of congenital HL (Waardenburg 1951; Read and Newton 1997; Koffler et al. 2015).

WS is genetically and clinically heterogeneous, with four clinical types (WS I–IV) that depend on additional symptoms and are associated with pathogenic variants in five different genes: PAX3 for WS1 and WS3; MITF, SOX10, EDN3, and EDNRB for WS2 and WS4 (Waardenburg 1951; Madden et al. 2003; Pingault et al. 2010). Patients were classified as WS1 when telecanthus was present and as WS4 when intestinal complaints were reported (Sheffer and Zlotogora 1992; Read and Newton 1997; Pingault et al. 2002; Madden et al. 2003; Pardono et al. 2003). The most frequent type in the present cohort was WS2, with nine cases (64.3%), followed by WS1, with three cases (21.4%), and WS4, with two cases (14.3%) (Supplementary Table S8).

Iris pigmentation disorders were the most common clinical feature, present in all cases. In addition, facial dysmorphisms were observed in all WS1 cases, but in only 44% (4/9) of WS2 cases. The literature suggests that up to 50% of WS cases have temporal bone anomalies, the most frequent being EVA or semicircular canal malformation (Pingault et al. 2010). In agreement, 25% of our cases (3/12) showed such malformations: dysplasia of the vestibule and semicircular canals (WS2: subjects W6 and W10) and an unwound cochlea (WS4: subject W7).

Nine pathogenic/likely pathogenic variants were identified, and their segregation with WS could be confirmed in 9 of 10 cases (Figs. 1C and 4, and Supplementary Table S8). Missense variants were the most common (3); two were frameshifts, two were stop-gains, and two were splice site disruptions (one acceptor and one donor). Among the five remaining WS cases without causative variants identified, two (W12 and W13, the latter also a GJB2 monoallelic case) were submitted to array CGH and exome sequencing, with negative results. In addition, a customized qPCR assay was performed in three WS2 subjects (W10, W11, and W14) to screen for whole-gene CNVs or CNVs similar to those previously described. In this way, a de novo deletion of MITF was detected in subject W14 and validated by MLPA, which showed the absence of MITF exons 5 and 6 (Fig. 4). The MITF CNV represented 10% of the WS causative variants (1/10), a frequency lower than that found by Bocángel et al. (2018). One possible reason is that our CNV screening was not as complete as that of Bocángel et al. It is also possible that the contribution of each type of genetic alteration varies among samples: since most of these alterations are de novo, their frequency is not influenced by ancestry or founders.

Fig. 4

Pedigrees depicting the clinical features of the WS cases analyzed in this study, with segregation analysis and chromatograms of the identified pathogenic variants

All WS1 (PAX3) and WS4 (SOX10) cases disclosed causative variants, but only about half of the WS2 cases did (5/9), one with a SOX10 variant and four with MITF variants (Supplementary Table S8). Indeed, PAX3 is believed to explain most if not all cases of WS1 (Pingault et al. 2010), as found in the present study. On the other hand, the known WS2 genes explain only about 30% of cases (Pingault et al. 2010). In our cohort, however, a higher contribution of MITF and SOX10 was observed than previously reported. For example, 15% of WS2 cases are usually associated with MITF, but, in the present report, 44% of WS2 cases were due to MITF. Likewise, 50% of WS4 cases are generally related to SOX10 (Pingault et al. 2010), whereas 100% could be attributed to SOX10 in the present study. The WS diagnostic rate obtained here, 71.4% (10/14), was much higher than that achieved by a molecular study of Waardenburg syndrome in São Paulo (38.8%) (Bocángel et al. 2018).

In four cases, the pathogenic variants occurred de novo: three involving MITF (3/4) and one involving SOX10 (1/2; the origin in the third SOX10 case could not be determined). Wildhardt et al. (2013) also found that 2/3 of the MITF variants were de novo, in contrast to 1/6 of the PAX3-related cases. Similarly, all PAX3 variants reported here were inherited.

The recurrence of variants in the WS genes seems to be due to both hot-spot and founder-effect mechanisms. For example, p.(Arg223Ter), which was inherited here, had previous reports of apparent de novo occurrences (Jalilian et al. 2015; Sun et al. 2016) but was also found in large population cohorts with South Asian background (1/30782, 0.003%) (Lek et al. 2016); c.44_62del in SOX10 was also identified in a recent study from São Paulo (Bocángel et al. 2018), but occurred de novo in these two unrelated pedigrees; and c.635-1G > A in MITF, previously reported, also occurred de novo here.

Further analysis is needed to uncover the genetic etiology of the four unsolved WS cases. One of the four cases without pathogenic variants detected (W11) was born from a first-degree consanguineous marriage, which makes it probable that a homozygous recessive variant explains at least part of the phenotype. Even though two of the four unsolved cases were investigated by exome sequencing and array CGH, it is still possible that a variant located in a low-coverage region or a small CNV escaped detection by these approaches. Alternatively, a yet unidentified gene may be responsible for their syndrome, or the association of the clinical features may be coincidental and not a syndrome.

The variable expressivity associated with the Waardenburg genes frequently makes clinical diagnosis challenging, particularly when mild clinical signs can be easily overlooked, which raises the relevance of molecular screening. For instance, subject W1’s father carried the same likely pathogenic variant in MITF but presented only a tiny hair lock above the ear, a clinical sign that could easily be missed. Another interesting case is W8, associated with the p.(Arg223Ter) variant in PAX3, already described as associated with WS1, whose grandmother and aunt, also carriers of the variant, exhibit only facial dysmorphisms.

Molecular modeling provides further support to the missense variants’ deleterious effects

The analysis of the MITF variant p.(Ser250Tyr)/p.(Ser351Tyr) (transcripts NM_000248.4 and NM_198159.3, respectively) showed that tyrosine is a bulkier amino acid than serine. SWISS-MODEL predicted the oligomeric state of the model as a homodimer and showed that the tyrosine residues of both structures performed a pi-stacking interaction, which could cause instabilities in protein structure or impact protein–protein or protein-DNA interactions (Supplementary Table S1 and Supplementary Files S2-S5; Supplementary Figs. 1 and 2). The variant p.(Ser250Tyr) in MITF carried by W1 is novel; nonetheless, another substitution at the same amino acid position, p.(Ser250Pro), has already been reported in association with WS2 (Tassabehji et al. 1995). Moreover, this position is part of the HLH domain, and p.(Ser250Pro) was shown to lose DNA binding capacity and fail to activate expression from the melanocyte-specific promoters (Grill et al. 2013), which agrees with the molecular modeling predictions.

The substitution of glycine by alanine observed in the p.(Gly94Ala) variant in PAX3 does not cause many changes in the structure since both are small side-chain amino acids. However, while alanine presents a CH3 group in its side chain, glycine presents only one hydrogen (Supplementary Fig. 2A–C and Supplementary File S5). Additionally, glycine could be responsible for increasing the region’s mobility. Thus, molecular modeling predicted milder effects for this variant.

Regarding the p.(Lys150Glu) variant in SOX10, lysine is a positively charged amino acid, while glutamate is negatively charged, and the substitution occurs near a region of interaction with DNA molecules (Supplementary Fig. 2E–F, Supplementary File S6). Furthermore, a Poisson-Boltzmann surface analysis (PBSA) of the electrostatic surface of the protein demonstrated that this substitution alters the charge distribution at the surface (Supplementary Fig. 2G–H, Supplementary File S6), which could impact the protein thermostability and its interactions with the residues in this region.

Evidence for SOX10 genotype–phenotype correlations

Interestingly, two out of the three cases with cochleo-vestibular malformations carried SOX10 variants, giving further support for the association of this gene with temporal bone anomalies (Bondurand et al. 2007; Elmaleh-Bergès et al. 2013).

Variants in SOX10 are associated with a range of clinical presentations, including WS2, WS4, and a neurological phenotype with peripheral nerve and/or central nervous system involvement, designated PCWH (Inoue et al. 2004; Pingault et al. 2010; Chaoui et al. 2011). In classic WS4, the intestinal phenotype is Hirschsprung disease or congenital megacolon, but a milder presentation, known as chronic intestinal pseudo-obstruction, has been described (Inoue et al. 2004; Pingault et al. 2002). We identified three cases with SOX10 pathogenic variants, and none had neurological symptoms; two were classified as WS4 and one as WS2.

WS4 cases related to SOX10 are mostly due to de novo truncating variants (Bondurand et al. 2007; Pingault et al. 2010; Chaoui et al. 2011; Liang et al. 2016). It has been proposed that variants that trigger or escape NMD (nonsense-mediated decay) might explain the different neurological phenotypes, WS4 or PCWH, respectively (Inoue et al. 2004). Accordingly, truncating variants located up to the end of exon 4 usually trigger NMD and, consequently, do not exert a dominant-negative effect. On the other hand, genotype–phenotype correlations for missense variants have been more challenging to predict precisely. Nevertheless, it is believed that a dominant-negative effect is generally associated with a more severe phenotype (Thongpradit et al. 2020).

Among our SOX10-related WS cases, two supported this proposition, with WS2 included alongside WS4 as the milder phenotype. Proband W7 was classified as WS4 due to mild intestinal complaints and carried the truncating variant c.44_62del in exon 2, which will result in an aberrant mRNA that will likely be degraded by NMD. Thus, it should lead to a milder phenotype without severe intestinal or neurological anomalies (Inoue et al. 2004; Fernández et al. 2014; Liang et al. 2016). The second patient (W6) was classified as WS2 and inherited two adjacent variants in exon 2 from his mother (c.12_13delinsAT): the first is synonymous, has no frequency data (rs1383021831), and is predicted as disease-causing by MutationTaster; the second creates a stop codon and is thus more likely to be responsible for the phenotype. This premature stop at the fifth codon will likely result in no functional protein to exert a dominant-negative effect.
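The consequence of a change such as c.12_13delinsAT can be traced codon by codon: nucleotide 12 is the last base of codon 4 and nucleotide 13 is the first base of codon 5. The sketch below illustrates this with the standard genetic code. The 18-nucleotide sequence is an assumption chosen to be consistent with the description above (a synonymous change in codon 4 and a stop in codon 5); it should be checked against the SOX10 reference transcript before any real use, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence, stopping after the first stop codon (*)."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)

def delins(cds, start, end, inserted):
    """Apply c.start_end delins using 1-based, inclusive coding coordinates."""
    return cds[:start - 1] + inserted + cds[end:]

# ASSUMED start of the coding sequence, for illustration only
ASSUMED_CDS_START = "ATGGCGGAGGAGCAGGAC"

first_only = delins(ASSUMED_CDS_START, 12, 12, "A")   # change at position 12 alone
both = delins(ASSUMED_CDS_START, 12, 13, "AT")        # c.12_13delinsAT

print(translate(ASSUMED_CDS_START))  # MAEEQD
print(translate(first_only))         # MAEEQD (codon 4 unchanged at the protein level)
print(translate(both))               # MAEE*  (stop at codon 5)
```

Under this assumed sequence, the change at position 12 leaves the protein unchanged, while the change at position 13 converts the fifth codon into a stop, which matches the interpretation that the second substitution is the one likely to be responsible for the phenotype.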

The third case related to SOX10 was de novo and classified as WS4 because of mild intestinal complaints (W5). However, instead of truncating, the variant is a missense, p.(Lys150Glu), located in exon 3. A different substitution affecting the same amino acid, p.(Lys150Asn), has been described in a patient with PCWH and, by in vitro functional analysis, was shown to alter the subcellular protein localization (primary defect), ultimately impairing DNA binding and reporter transactivation; it might also disrupt the tertiary structure of the HMG domain (Chaoui et al. 2011). Our molecular modeling of p.(Lys150Glu) predicted that DNA binding and thermostability could be impaired. Functional assays comparing the effects of both variants could provide valuable insights into genotype–phenotype correlations of missense variants in SOX10; for example, p.(Lys150Asn) may have a dominant-negative effect not displayed by p.(Lys150Glu).

Conclusions

In conclusion, this study characterized the genetic causes of HL in a large cohort that includes subjects from all Brazilian regions. Through the strategies employed in selecting probands for prioritized genetic screening, a molecular diagnosis was achieved in 97/542 (18%) of our cohort, constituting 84/441 (19%) of the non-syndromic group and 13/101 (12.9%) of the syndromic group. Moreover, when the cases with a positive molecular diagnosis are added to the familial cases and to those with parental consanguinity, which indicate a probable genetic cause of HL, an estimate of ~ 50% of genetic cases is obtained. Thus, this report confirms the value of large unselected cohorts analyzed from different perspectives. In addition, the present study reinforces the need for an exhaustive evaluation of genetic causes of HL in the whole country, with all regions equally represented, to validate the present findings.