Overview of clinical types and the genetics of epilepsy

The International League Against Epilepsy defines an epileptic seizure as 'a transient occurrence of signs and/or symptoms due to abnormal excessive or synchronous neuronal activity in the brain' [1, 2]. The condition is common, with prevalence around 1% and lifetime incidence around 3% [3]. Most epilepsies can be broadly and easily classified based on their pattern of electroclinical onset as either generalized ('originating at some point from within, and rapidly engaging, bilaterally distributed networks') or focal ('originating within networks limited to one hemisphere') [1]. Within each of these broad classifications are multiple distinct syndromes, more than half of which are considered to be 'genetic epilepsies'. In older terminology, genetic epilepsies were referred to as 'idiopathic epilepsies' [4]. Syndromes, and sometimes subsyndromes, are delineated when the seizures are defined by easily recognizable electroclinical features and similar enough to be regarded as a homogeneous group, distinct from other groups in the same classification level (Table 1). For example, genetic generalized epilepsies are frequently divided into their subsyndromes of childhood absence epilepsy, juvenile absence epilepsy, juvenile myoclonic epilepsy and generalized tonic clonic seizures.

Table 1 Examples of genetic generalized and focal epilepsy syndromes

There is a subset of epilepsy syndromes that are clearly monogenic, and traditional linkage studies in large families have been useful for identifying causative genes [5, 6]. However, the vast majority of the genetic epilepsies are multifactorial, with an underlying genetic contribution that is polygenic, where few or usually none of the susceptibility genes have been identified. This multifactorial concept dates back to the early works of William Lennox [7] and was well established in the modern era with additional twin data [8]. It is important to note that epilepsy with complex genetics and complex epilepsy are distinct concepts. To the geneticist, complex epilepsy is epilepsy with complex genetics; that is, multifactorial epilepsy that is polygenic and influenced by environmental effects, both internal and external. Complex epilepsy to the epileptologist, on the other hand, refers to the complexity of the seizure pattern. Without an appreciation of the difference, interactions between basic and clinical scientists can be, and have been from personal experience, confused by 'complex epilepsy' meaning different things to different people. In the context of this article, complex epilepsy will mean that which is multifactorial in origin, rather than necessarily having complex seizure patterns.

Monogenic epilepsies

To date, more than 20 genes have been identified for the group of genetic epilepsies that are primarily monogenic [5, 6, 9, 10], prompting a recent update of clinically based classification [1]. Although the individual syndromes that comprise each of these groups are generally diagnosed through clinical assessment, molecular testing now allows more accurate definition of clinically similar disorders that are known to be caused by mutation of different genes. Gene identity provides an alternative or additional criterion for syndrome classification, and it also has clinical utility: it can provide a rapid definitive diagnosis that obviates an otherwise circuitous set of invasive or costly investigative procedures. Furthermore, in some cases, specific therapeutic intervention can be enabled to achieve improved outcomes or more accurate prognosis. Genetic testing for the epilepsies has high clinical utility in cases that may involve SLC2A1 (glucose transporter type 1 deficiency), SCN1A (Dravet syndrome), PCDH19 (familial epilepsy and mental retardation limited to females, 'Dravet-like' PCDH19 syndrome), ARX (X-linked infantile spasms and myoclonic seizures, dystonia, and X-linked lissencephaly with ambiguous genitalia) or STK9 (also known as CDKL5; X-linked infantile spasms) mutations. Testing has high analytical sensitivity (ability to detect the presence of a causative mutation) and high analytical specificity (ability to exclude mutation in a candidate gene) for all of the monogenic epilepsies, but not necessarily high clinical utility apart from some of the syndromes associated with the above genes [9]. It has little or no clinical utility at this time when knowledge of the gene is not needed for accurate syndrome classification, when knowledge of the gene does not direct or affect treatment, or in cases of genetically complex epilepsies triggered by the combined effects of multiple genes spread across the genome, most likely each having only a small effect on phenotype.

Complex epilepsies

Speculation about the genetic architecture of the genetically complex epilepsies centers on the common disease-common variant hypothesis [11] and the common disease-rare variant hypothesis [12]. The general failure of linkage and association studies applied to the complex epilepsies [13-16] argues against the common disease-common variant hypothesis, although the major criticism of such studies is that they are underpowered to detect the magnitude of odds ratios that are likely associated with susceptibility variants in the genetically complex epilepsies [17] and indeed other neuropsychiatric brain disorders.

The common disease-rare variant hypothesis, which suggests a variable subset of multiple rare genetic variants, has greater appeal for complex epilepsy [18, 19], especially given the failure of association studies, which work on the premise of the common disease-common variant hypothesis [16], to deliver consistent findings. A mixture of the two models is also entirely plausible [19] with functional differences in the electrophysiological properties of ion channels demonstrated for both rare and polymorphic genetic variation detected at the GABRD (encoding γ-aminobutyric acid A receptor, δ), CACNA1H (encoding calcium channel, voltage-dependent, T type, α 1 H subunit) and CLCN2 (encoding chloride channel 2) genes [20-23], for example. Computer simulation supports the notion that genetic variations associated with only very small functional changes in ion channel properties are sufficient to make meaningful contributions to increasing susceptibility to epilepsy [24].

Multiple sclerosis is another disorder with complex inheritance where extensive study suggests 'risk variants likely to include hundreds of modest effects and possibly thousands of very small effects' [25]. Similar conclusions with systematic effects of multiple rare variants across the genome have been suggested for schizophrenia and bipolar disorder [26]. We predict the same for epilepsy with complex inheritance, with seizure susceptibility thresholds determined by combinations of many rare to moderately common sequence variants, copy number variants (CNVs) and perhaps non-coding DNA sequences with functional effects. Weak effects will only be detectable by genome-wide association studies using massive sample sizes. Kryukov et al. [27] pre-empted outcomes from deep resequencing by massively parallel sequencing (previously referred to as next-generation sequencing [28]) by promoting an association study approach based on the premise of multiple rare variants present in susceptibility genes in higher numbers for a given disease group (for example, epilepsy) than in their corresponding controls. The statistical tools to support that approach are now surfacing [29].
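The case-control comparison of rare variant carriers described above can be illustrated with a minimal sketch. It assumes a simple carrier-count comparison for one susceptibility gene, evaluated with a one-sided Fisher's exact test; this is one common choice and is not necessarily the approach of Kryukov et al. [27] or of the statistical tools cited in [29]. The function name and all counts are hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact test for enrichment of carriers in cases.

    Returns P(X >= case_carriers) under the hypergeometric null, where X is
    the number of carriers that fall in the case group by chance alone.
    """
    total_carriers = case_carriers + control_carriers
    n_total = n_cases + n_controls
    denominator = comb(n_total, total_carriers)
    upper = min(total_carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, total_carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: individuals carrying at least one rare variant
    # in a candidate gene, in 500 cases and 500 controls.
    p_value = fisher_one_sided(case_carriers=20, n_cases=500,
                               control_carriers=5, n_controls=500)
    print(f"one-sided p-value: {p_value:.4g}")
```

Collapsing all rare variants in a gene into a single carrier status is the simplest form of such a test; it trades sensitivity to individual variants for the ability to detect an excess that no single rare variant could show on its own.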

The heritability of genetic generalized epilepsy suggests a major genetic component [8] but virtually none has yet been identified. This constitutes the 'dark matter' [30]. The task is to find this missing heritability and characterize it in terms of number of loci, effect sizes, allelic frequencies of variants and the nature of the variants [31]. Areas being investigated include cis-acting genome-wide regulatory variants [32], genome-wide copy number variants [33, 34] as discussed below, and, in the future, next-generation sequencing [28].

Copy number variation in epilepsy

CNVs are deletions, duplications or insertions of DNA in the genome that range in size from approximately 1 kb to several megabases. Many CNVs have no apparent clinical significance, and numerous studies have now established that CNVs are dispersed throughout the genomes of healthy individuals and some CNVs are quite common [35-37]. Importantly, CNVs have also been identified as a significant source of mutation. Small CNVs may result in the deletion or duplication of one or more exons of a known disease gene, and there are now many examples in the literature. In patients with intellectual disability (ID) or developmental delay, testing for large CNVs is now commonplace, as large CNVs underlie 15% to 20% of cases of ID [38, 39]. CNVs can be detected by targeted studies directed to specific known CNVs by techniques such as multiplex ligation-dependent probe amplification (MLPA). In the epilepsies, MLPA is generally targeted to exons of known epilepsy genes to detect intragenic deletions or duplications [40-45], some of which are too small to be detected by genome-wide approaches.

Genome-wide methods to detect CNVs include array-comparative genomic hybridization (array-CGH) and SNP genotyping arrays. These technologies can be targeted to specific chromosomal regions [43, 45-49]. However, their real power lies with capability for genome-wide interrogation, where there is no need for a priori knowledge of where a lesion may lie [33, 34, 46, 50]. Using that approach, Depienne et al. [46] discovered a Dravet-like syndrome caused by severe PCDH19 mutations on chromosome X, and McMahon et al. [50] 'rediscovered' the 15q13.3 CNV and found a novel 10q21.2 microduplication. Mefford et al. [33] and Heinzen et al. [34] used genome-wide approaches to establish the extent of rare CNVs in the genetic epilepsies (see below). For CNVs with boundaries extending beyond the target gene, array-CGH is a powerful tool for accurately determining size and gene content. Large epilepsy-associated CNVs detectable by MLPA, but extending well beyond the one gene of special interest (for example, beyond SCN1A), can also be reliably detected by array technologies [40, 43, 45].

The role of CNVs in epilepsy has now been addressed by several groups using both targeted and genome-wide approaches. Helbig and colleagues [51] first directed our attention to the role of the 15q13.3 microdeletion in the etiology of epilepsy. This microdeletion was first described in a series of patients with ID, most of whom also suffered from seizures [52], but is much more common in epilepsy cohorts [51, 53, 54]. This is one of the most prevalent genetic risk factors identified for the genetic generalized epilepsy syndromes. A range of rare mutations within SLC2A1 encoding the GLUT1 glucose transporter are at least as important within the childhood absence epilepsy subsyndrome of genetic generalized epilepsy [55, 56]. Although the confidence interval is broad, the estimated odds ratio of 68 (95% confidence interval 29 to 181) for the 15q13.3 deletion [54] greatly exceeds that of most common susceptibility variants detectable by genome-wide association studies in disorders other than epilepsy. Despite its relative 'severity' in relation to risk, its frequency in epilepsy cohorts is relatively high at around 1.3%. Conversely, this variant is difficult to find in the general control population, despite the screening of large numbers of controls, even though family studies following detection of an index case disclose frequent transmissions from non-penetrant carrier parents [54, 57]. Moreover, the position of the original mutation in the pedigree is often not too far back into its living ancestry, suggesting a relatively high recurrent mutation rate. Of the seven genes within the lesion, haploinsufficiency of CHRNA7 (nicotinic acetylcholine receptor, α7) is considered to be the most likely pathogenic element, although it is not the only neuronally expressed gene affected by the deletion. Interestingly, early genome-wide linkage studies implicated the CHRNA7 region in juvenile myoclonic epilepsy [58], but this could not be replicated [59], and screening of CHRNA7 did not detect convincing mutations [60]. Could it be that the families studied by Elmslie et al. [58] contained enough families segregating the 15q13.3 microdeletion to give a linkage signal?
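To show how an odds ratio and its confidence interval of the kind quoted above for the 15q13.3 deletion are obtained from carrier counts, the following is a minimal sketch. It assumes a standard 2x2 table with a Woolf (log-based) 95% interval and a 0.5 continuity correction when any cell is zero; the counts are hypothetical and are not those of the study cited as [54], whose exact method is not given here.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    Adds 0.5 to every cell when any cell is zero (Haldane-Anscombe
    correction), which matters for variants that are absent in controls.
    """
    a, b, c, d = (float(case_carriers), float(case_noncarriers),
                  float(control_carriers), float(control_noncarriers))
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 12 deletion carriers among 1,000 cases and
    # 2 carriers among 10,000 controls.
    estimate, lower, upper = odds_ratio_ci(12, 988, 2, 9998)
    print(f"OR = {estimate:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

With so few carriers in the control group, the standard error of the log odds ratio is dominated by the smallest cell, which is why intervals for rare, high-risk variants tend to be wide even when the point estimate is large.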

Subsequent studies investigated the role of other large CNVs that had previously been associated with increased risk of ID, autism and schizophrenia [53]. Somewhat surprisingly, significant numbers of the same recurrent CNVs involved in the disorders listed above were implicated as a component of the polygenic pathogenic genetic architecture in the clinically and genetically complex (idiopathic) epilepsies. Two microdeletions commonly associated with epilepsy are at 15q11.2 and 16p13.11 [33, 34, 53]. Together with the 15q13.3 microdeletion, their combined frequency in test populations of genetic generalized epilepsy is approximately 3% [33]. Other large recurrent CNVs associated with ID, autism or schizophrenia that have also been detected in epilepsy are at 1q21.1, 16p12, 22q11 and two regions within 16p11.2 [33, 53]. These CNVs represent clearly defined genetic determinants that overlap with a number of hitherto regarded distinct disorders comprising part or all of their genetic architectures. The three most common recurrent CNVs, which together account for up to 3% of epilepsies, are shown in Figure 1. Notably, the 15q13.3 microdeletion has been consistently present in 0.5% to 1% of all genetic generalized epilepsy cohorts but has not been seen in >3,000 patients who presented with focal epilepsy syndromes [34], and therefore it may be a risk factor specifically for generalized epilepsy syndromes. Deletions at 16p13.11 and 15q11.2 have been found in both generalized and focal epilepsies [33, 34, 53].

Figure 1

Three 'common' recurrent microdeletions in epilepsy. Microdeletion of 15q13.3 (1.5 Mb) in a patient with absence epilepsy. Microdeletion of 16p13.11 (800 kb) in a patient with juvenile myoclonic epilepsy. Microdeletion of 15q11.2 (350 kb) in a patient with infantile seizures. Regions depicted for each panel are as follows: 15q13.3 deletion: chr15, 28.0 to 31.0 Mb; 16p13.11 deletion: chr16, 15.0 to 16.7 Mb; and 15q11.2 deletion: chr15, 20.2 to 20.8 Mb (National Center for Biotechnology Information Build 36/hg18). Red vertical lines represent array-comparative genomic hybridization probes that are deleted. Segmental duplications are represented by orange, yellow and gray blocks. Note that blocks of segmental duplications flank each deleted region. Genes are represented in blue, with key proposed candidate genes in red.

The large, recurrent CNVs described above occur because of specific genomic architecture at each respective chromosome region. CNV is mediated by naturally occurring sets of low copy repeats or segmental duplications [61-63] that facilitate non-allelic homologous recombination [64, 65], resulting in deletion or duplication of the intervening unique sequence. Therefore, each region with such architecture is prone to rearrangement at meiosis, causing recurrence of large CNVs with nearly identical breakpoints in unrelated individuals. Because CNVs at these rearrangement-prone regions of the genome occur with an appreciable frequency, it has been possible to detect a statistically significant difference between cases and controls.

Apart from the recurrent CNVs discussed above, the rare non-recurrent CNVs are also likely to play a significant role in the genetic etiology of epilepsy. Two recent studies applied genome-wide technologies to detect CNVs in affected individuals. Heinzen and colleagues [34] evaluated 3,812 individuals and found an enrichment of large (>1 Mb) deletions in affected individuals, the majority of which were seen in one individual each. Mefford et al. [33] evaluated 517 individuals with various types of epilepsy and found that nearly 10% carried one or more rare CNVs that had not been previously found at an appreciable frequency in controls. Again, the majority of events were seen only once, and represent a subset of the rare non-recurrent CNVs involving genes that have been implicated in ID, autism or schizophrenia.

Syndrome constellations associated with CNVs

Taken literally, a constellation is a number of stars grouped within an outline. Here, we regard the CNV as the 'outline' encompassing a group of its associated syndromes comprising the syndrome constellation. Different combinations of syndromes define the constellations that are packaged within different CNVs. The CNVs can be recurrent in the population, and any recurrent CNV located in a given region is virtually identical from patient to patient. The syndrome constellations include one or more types of ID, dysmorphism, autism, schizophrenia and, more recently, genetic generalized epilepsy. The various syndromes within the constellations are themselves genetically and phenotypically heterogeneous, and in some cases have defined subsyndromes. For example, genetic generalized epilepsy consists of the subsyndromes childhood absence epilepsy, juvenile absence epilepsy, juvenile myoclonic epilepsy and generalized tonic clonic seizures. Recurrent deletions at 15q13.3 (1.5 Mb, seven genes), at 16p13.11 (1.2 Mb, eight genes) and at 15q11.2 (1.3 Mb, four genes) are emerging as the most common genetic determinants for various distinct disorders with complex inheritance. These generally include intellectual disability with or without dysmorphism, autism, schizophrenia or genetic generalized or focal epilepsy. Epilepsy was the latest addition to the constellations of syndromes associated with each of these CNVs, and is now well established [33, 34, 51, 53, 54]. A similar picture is emerging for the rarer recurrent CNVs at 1q21.1, 16p12 and two regions within 16p11.2 [33, 53].

Given the comorbidity of ID and epilepsy, autism and ID, and autism and epilepsy, for example, perhaps it should not be surprising that some CNVs cause overlapping neuropsychiatric features in affected individuals. However, it seems remarkable that the same CNV susceptibility lesion can be a genetic determinant for apparently disparate conditions (for example, only epilepsy in one patient, only schizophrenia in another). One possible explanation might be that the odds ratios associated with disorders included within a given constellation of syndromes are relatively high in the context of disorders with complex inheritance. For example, genetic generalized epilepsy has an odds ratio of 68 (95% confidence interval 29 to 181) for the 15q13.3 deletion [54]; this is far higher than for susceptibility variants generally detected in complex genetic disorders. Certainly another possible explanation is the presence of as yet undetected additional genetic or epigenetic variants that influence the phenotypic outcome. All of the 'common' recurrent CNVs in epilepsy (15q13.3, 16p13.11 and 15q11.2) have probably been identified already, given the extent of the array-CGH genome-wide searches already completed [33, 34]. Some of the less common recurrent microdeletions at 1q21.1, 16p12 and two regions within 16p11.2 may be associated with their own multisyndrome constellations.

Rare or unique non-recurrent CNVs are collectively more common than the combined recurrent ones. These lesions provide a wealth of leads to candidate epilepsy genes within or closely adjacent to them. The number, frequency and distribution of each gene-bearing CNV are consistent with the common disease-rare variant model for the genetic architecture for complex epilepsy. Overall genetic profiles of susceptibility genes for each individual are likely to be unique and fit the polygenic heterogeneity concept [18]. Genes within these epilepsy-associated CNVs and genes identified through massively parallel sequencing [66] each represent independent opportunities to break out of the ion channel paradigm that might potentially constrain our thinking when the genetic architecture of epilepsy might extend beyond ion channels. Results of studies performed so far suggest that haploinsufficiency (deletions) or overexpression (duplications) of some of the genes in non-recurrent CNVs may elicit the same syndromes as those in their associated constellations.

There are two common threads in these discussions. First, the constellations of syndromes associated with each recurrent CNV can include a range of diverse phenotypes, including, in most cases, some combination of ID, autism, schizophrenia and epilepsy. Each CNV probably elicits its own specific distribution of phenotypes and frequency of each phenotype, defining the associated constellation. Second, the mechanism for genesis of this extreme clinical heterogeneity observed within virtually identical lesions is not yet known. Several mechanistic possibilities have been outlined [34, 67-69] but none has been proven as a general mechanism, or even a mechanism specific to any given CNV. The clinical heterogeneity is likely to depend upon the nature of the other risk factors or genetic modifiers in the rest of the genome that alone or in combination may specify the phenotype.

Conclusions and future perspectives

The concept of extensive clinical heterogeneity in epilepsy associated with a well-defined genetic lesion is not new. Well known examples are genetic epilepsy with febrile seizures plus (GEFS+) [19], caused by mutations in sodium channel genes, and recently, genetic generalized epilepsy caused by the 15q13.3 CNV [70]. These observations have challenged complete reliance on the phenotype-first approach to diagnosis. Investigations will always begin with general clinical evaluation to broadly classify cases into disease categories. Taking genetic generalized epilepsy as an example, is it then necessary to further refine down to subsyndromes using clinical criteria alone, and to even contemplate endophenotyping for deeper clinical refinement? The answer is clearly no in the context of syndromic constellations associated with some CNVs and phenotypic spectrums associated with some familial missense mutations. The aim of that exercise of making phenotypes as clinically homogeneous as possible would be to promote genetic homogenization of study populations so that associations are easier to detect. But for CNVs and missense mutations in some genes, collections of the same CNV or same mutation are already genetically homogeneous, at least for that component of the complex polygenic architecture.

The approach needs to be turned upside down, by adoption of a genotype-first approach where novel genomic disorders such as genetic generalized epilepsy are classified and defined by detection of a common deletion or duplication. The collection of large numbers of patients with the same CNV genotype but wide variety of phenotypes including epilepsy will facilitate genotype-phenotype studies that might provide insight into the mechanisms that influence phenotype diversity in these and other disorders. Conversely, the collection of large numbers of genetic generalized epilepsy patients (not even subtyped into subsyndromes) with significantly more multiple rare DNA sequence changes within the same putative epilepsy susceptibility gene, as compared with unaffected controls, might be an outcome of their pursuit through massively parallel sequencing. That would enable us to work backwards, to endophenotype just those cases with mutations in a defined susceptibility gene to see if they have subtle phenotypic features in common. Thus might emerge a subsyndrome classification that is different to that currently in use, based on more relevant components of the phenotype that better reflect the underlying molecular genetics.

Finally, we agree that careful clinical phenotyping is a vital component of our research, as the constellations associated with each of the CNVs need to be accurately characterized. Consider cohorts comprising 15q13.3 deletions, for example. Some of the cases are regarded as epilepsy only. Others are regarded as having dual phenotypes, of epilepsy and ID, for example. Are these really dual phenotypes? Consider the hypothetical possibility that the haploid content of the 15q13.3 region lowers the seizure threshold and adversely affects intelligence in everyone who carries it. Some carriers will not have epilepsy because their susceptibility profile contains too few susceptibility variants at other loci throughout the genome, in addition to 15q13.3, to take them across the seizure threshold. Some carriers will not have ID because their baseline intelligence quotient will be high enough to begin with that even with some depression of intelligence quotient through the effects of the 15q13.3 deletion they remain within the normal range. Others, toward the lower end of the normal range to begin with, unfortunately drop down into the ID range. We challenge the clinical researchers to prove us wrong or, like us, seriously question the notion of dual phenotypes presenting in only a subset of the 15q13.3 deletion carriers.
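The hypothetical argument above about intelligence quotient can be made concrete with a small calculation. This is a sketch under stated assumptions only: intelligence quotient is taken to be normally distributed with mean 100 and standard deviation 15, ID is taken as a score below 70, and the deletion is assumed to lower every carrier's score by the same number of points. The shift sizes are invented for illustration; no such estimate is given in the text.

```python
from statistics import NormalDist

IQ_MEAN = 100.0       # assumed population mean
IQ_SD = 15.0          # assumed population standard deviation
ID_THRESHOLD = 70.0   # assumed cut-off for intellectual disability


def fraction_below_threshold(shift_points):
    """Fraction of carriers below the ID threshold after a uniform shift.

    shift_points is a hypothetical downward shift applied to every carrier.
    """
    shifted = NormalDist(mu=IQ_MEAN - shift_points, sigma=IQ_SD)
    return shifted.cdf(ID_THRESHOLD)


if __name__ == "__main__":
    for shift in (0, 5, 10, 15):  # hypothetical shift sizes
        fraction = fraction_below_threshold(shift)
        print(f"shift of {shift:>2} points: {fraction:.3f} below {ID_THRESHOLD:.0f}")
```

Under these assumptions, a uniform 15-point shift raises the fraction of carriers below the threshold from about 2.3% to about 16%, so most carriers remain in the normal range even though every one of them is affected. That is the pattern that could be misread as a dual phenotype present in only a subset of carriers.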