Introduction

Huntington’s disease (HD) is an inherited neurodegenerative disease that progresses to multiple behavioral dysfunctions in motor and cognitive abilities, which eventually result in death 15–20 years after onset of symptoms [1,2,3,4]. Expansion of the CAG triplet repeat domain in the mutant HTT gene causes HD, which is inherited in an autosomal dominant manner. Progression to increasing severity of these clinical features involves neurodegeneration in the HD brain. The purpose of this review is to describe the associations of significant clinical features of HD with CAG repeat lengths of the mutant HTT gene, and to gain understanding of how the severity of clinical disability relates to HTT CAG expansions. This topic has not been recently covered among the many HD reviews in the literature. Understanding how the extent of HTT CAG expansion relates to clinical features will benefit investigations of the HTT gene and the cellular mechanisms underlying the multiple disabilities suffered by HD patients.

The mutant HTT gene with expansion of CAG repeats is responsible for HD [3,4,5,6]. Normal healthy individuals have fewer than 35 CAG repeats (35Q), while affected HD individuals have 40 or more CAG repeats (Fig. 1). Most normal individuals have 15–21Q, while the majority of HD patients have adult-onset HD with repeats in the range of 40–50Q. The disease has reduced penetrance in the range of 36–39 repeats, while > 60 repeats usually result in juvenile HD with symptoms present in children before age 20. Importantly, these genetic features of the HTT gene illustrate that HD results from a spectrum of expanded CAG repeat lengths that initiate progressive dysfunctions of numerous HD clinical features. Thus, there is no single mutation identical in all patients, but rather a disease threshold followed by a spectrum of polyQ lengths which regulate HD disease outcomes. The mutant CAG repeat lengths of the HTT gene result in an age-dependent onset of HD motor symptoms and chorea, cognitive decline, loss of capacity for independent daily life, and brain tissue loss in HD, all of which are more severe with increasing CAG repeat numbers. These dysfunctions of HD are significantly associated with HTT CAG repeat length, as discussed in this review.

Fig. 1

HTT gene CAG triplet repeat expansions in Huntington’s disease (HD) corresponding to phenotypic dysfunctions in HD patients. The HD causative CAG triplet repeat mutations in the HTT gene and corresponding ages of onset for clinical symptoms are illustrated for normal CAG repeat lengths of < 35, 36–39 repeat lengths that may or may not result in HD, adult HD repeat lengths of 40 to ~ 50, and juvenile HD repeat lengths of > ~ 60. The clinical dysfunction features for each of these three ranges of HTT CAG repeats are displayed.

HD age of onset correlates with CAG repeat length of the HTT gene

The age of onset of HD motor dysfunction correlates with the CAG repeat lengths of the mutant HTT gene. Because the HTT disease allele is dominant, affected individuals are usually heterozygotes, with one wild-type allele. With the mutant allele, longer CAG repeats of > 60Q cause juvenile HD. These high expansion lengths have the strongest association with age of motor symptom onset, with repeat length accounting for about 70% of variance [7]. Studies examining a wide range of CAG repeat lengths have provided estimates of about 40–60% of variance in age of HD onset explained by repeat length for the whole spectrum of HD (summarized in Table 1) [7,8,9,10,11,12,13,14,15,16]. Also, lower correlation in individuals closest to the normal repeat range was observed [17]. A large study of more than 3500 participants found that CAG repeat length accounted for 66% of variation in onset age in the range of 40–53 repeats, representing most HD cases [13]. These findings demonstrate the strength of the correlation of CAG repeat numbers with age of HD onset.
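The "percent of variance explained" values quoted above are coefficients of determination. As a minimal sketch, assuming a simple one-predictor linear regression of onset age on repeat length (the cited studies may have used log-transformed or other models), R² equals the square of the Pearson correlation coefficient r. The function names below are hypothetical and serve only to make the conversion explicit.

```python
import math


def variance_explained(r: float) -> float:
    """Fraction of variance explained (R^2) for a Pearson r in a one-predictor linear fit."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson r must lie in [-1, 1]")
    return r * r


def abs_r_for_variance(r2: float) -> float:
    """Magnitude of Pearson r implied by a given R^2 (the sign cannot be recovered)."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("R^2 must lie in [0, 1]")
    return math.sqrt(r2)


# Variance fractions quoted in the text, converted to |r| for comparison.
quoted = [
    ("> 60Q", 0.70),
    ("whole spectrum, lower estimate", 0.40),
    ("whole spectrum, upper estimate", 0.60),
    ("40-53 repeats", 0.66),
]
for label, r2 in quoted:
    print(f"{label}: R^2 = {r2:.2f} -> |r| = {abs_r_for_variance(r2):.2f}")
```

Under this assumption, a repeat length that explains about 70% of variance corresponds to a correlation magnitude of about 0.84. The sign would be negative for age of onset, since longer repeats are associated with earlier onset.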

Table 1 Age of onset of HD motor symptoms correlates with HTT CAG repeat numbers

Further analyses of the change in onset age for each CAG repeat were carried out by Andresen and coworkers, who found in two separate large HD populations that there was an inflection point in the regression curve at approximately 50Q [18, 19]. Interestingly, there was a steeper slope in the 40–50Q range compared to > 50Q, suggesting that CAG repeat number actually has a stronger effect on age of onset in adult HD patients compared to juvenile HD patients [18, 19]. Further evidence for the greater variance at shorter CAG repeat lengths has been provided by the PREDICT-HD study of pre-symptomatic HD patients [20, 21] with greater than 35 CAG repeats. It is unknown why greater variance in age at onset occurs for lower CAG repeat lengths, compared to the decreasing variance with higher CAG repeats. Further, the rare HD individuals who carry two copies of mutant HTT gene alleles show a similar age of HD onset to heterozygotes when adjusted for the longest HTT CAG repeat length [7, 13, 22,23,24].

These findings demonstrate the significant relationships for correlation of CAG repeat numbers of the HTT gene with the age of onset of HD disease. Further, data show greater variance for age of onset at lower CAG repeat lengths, compared to lower variance with high CAG repeats. These findings suggest some differences in how the different CAG repeat lengths of mutant HTT regulate the age of onset of HD clinical features.

Motor dysfunction correlates with CAG repeat length of mutant HTT in HD

HD progression results in characteristic extrapyramidal motor signs of chorea and incoordination early in the disease, progressing to bradykinesia, dystonia, saccadic eye movements, and impaired voluntary movement [3, 4, 25]. Several studies have examined the relationship of CAG repeat length and progression of chorea and motor dysfunctions (summarized in Table 2), measured by the Quantified Neurological Examination (QNE) or the total motor score from the Unified Huntington’s Disease Rating Scale (UHDRS) [10, 11, 26,27,28]. The QNE and UHDRS motor examinations consist of tests quantifying ocular, speech, walking, and movement dysfunctions. As shown in Table 2, an inverse correlation of the progression of motor dysfunction with CAG repeat number was found to be significant, as assessed by Pearson correlation coefficient (r) values and p values. Higher scores indicate deteriorating motor function. Investigation of chorea alone showed that progression was significantly related to CAG repeat length in the study by Brandt and colleagues [10] with 46 subjects. On the other hand, two other studies with larger patient group sizes found trends towards correlation of chorea progression with CAG repeat number, but these trends were not statistically significant [26, 28]. Rosenblatt et al. [27] applied a mathematical correction for the age at onset of motor symptoms, and this correction increased the effect of CAG repeat numbers on motor impairment scores by 76%. The varying age at onset of motor symptoms for the same CAG repeat number may reflect genetic and environmental factors and the aging process [27].

Table 2 Progression of total motor symptoms and chorea correlates with HTT CAG repeats

Overall, studies of the QNE and UHDRS for neurological motor impairment and total motor scores showed significant correlation of motor dysfunction progression with changes in CAG repeat numbers for subjects with 36–109 CAG repeats in the HTT gene.

Cognitive deficits correlate with CAG repeat lengths of HTT

HD patients display cognitive changes before and during motor dysfunction. Development of cognitive deficits in HD often progresses to severe behavioral disturbances [3, 25, 29]. Studies examining the relationship of CAG repeat length with changes in cognitive symptoms, as measured by scores related to psychomotor function, emotional recognition, inhibition of cognitive interference, motor speed, planning, and correction, found that CAG repeat lengths in the HD disease range (36–109 CAG repeats) were inversely correlated with changes in cognitive scores, particularly when scores were normalized to disease duration [11] or adjusted for age [26, 27, 30, 31].

Recent efforts are establishing updated cognitive testing standards for monitoring disease progression and as outcome measures in clinical trials [30]. HD-CAB (Cognitive Assessment Battery) uses the most sensitive measures of cognitive decline during early disease manifestation consisting of Symbol Digit Modalities Test, Paced Tapping, One Touch Stockings of Cambridge (abbreviated), Emotion Recognition, Trail Making B, and the Hopkins Verbal Learning Test. Further evaluation of correlating cognitive decline with HTT CAG repeats in early HD may potentially reveal sensitive changes in cognitive function to predict individual patient cognitive dysfunction progression, using mathematical adjustments for age, shown in Table 3 as disease burden (DB) scores [30,31,32]. Disease burden (age × [CAG − 35.5]) models a patient’s expected disease state based on CAG repeat number and age at the time of onset or assessment. For example, a 20-year-old patient with a CAG repeat number of 55 would be expected to have similar symptom severity as an 86-year-old with a CAG repeat number of 40.
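The disease burden formula and the worked comparison above can be checked directly. The following is a minimal sketch using only the formula stated in the text, age × [CAG − 35.5]; the function name is hypothetical.

```python
def disease_burden(age_years: float, cag_repeats: int) -> float:
    """Disease burden (DB) = age x (CAG - 35.5), as defined in the text."""
    return age_years * (cag_repeats - 35.5)


# The two patients compared in the text.
younger = disease_burden(20, 55)  # 20 * 19.5
older = disease_burden(86, 40)    # 86 * 4.5

print(f"20-year-old, 55 repeats: DB = {younger}")
print(f"86-year-old, 40 repeats: DB = {older}")
```

The two scores are 390.0 and 387.0, which is why the two patients are expected to show similar symptom severity despite a 66-year difference in age.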

Table 3 Progression of cognitive symptoms correlates with HTT CAG repeat numbers

Overall, findings demonstrate correlation of HTT CAG repeat lengths with declining cognitive function in HD. It is noted that this relationship becomes less variable with increasing CAG length.

Daily living capacity and HTT CAG repeat lengths

Studies relating CAG repeat length with changes in quantitative scales measuring ability to function in daily life utilized the UHDRS Functional Independence Scale (FIS), total functional capacity (TFC), and the HD activities of daily living (ADL) scale [12, 26,27,28], as summarized in Table 4. These measures of daily function showed strong relationships with CAG repeat number [26, 27]. It is noted that Total Functional Capacity was not quite significant (p < 0.1) [26]. Adjusting for age was an important factor in the observed progression in these scores, since age at the time of assessment independently affected rate of decline in daily function [26, 27]. When adjusting for age at onset in a random effects model, Rosenblatt et al. found an increased effect size of 159% in ADL (activities of daily living) [27]. Of interest was one study which found that ten additional repeats were correlated with an 81% change in the UHDRS Functional Independence Scale (FIS) over the 30-month study period (shown in Table 4 as the difference in score per repeat) [26]. The relationship included age at assessment in the regression model [26].

Table 4 Progressive loss of daily living capacity correlates with HTT CAG repeat lengths

Overall, findings in the field demonstrate association of HTT CAG repeat numbers with the extent of decline in daily living capacity in HD patients, which occurs as the disease progresses.

Weight loss correlates with CAG repeat lengths of HTT

Many studies have utilized weight loss to represent overall deterioration in health. Two studies of 300–500 patients with repeats ranging from 36 to 69 found correlations of repeat length with BMI (body mass index) change and overall weight loss [26, 33]. Ravina et al. found significant differences in weight loss over 30 months for every ten repeats (p < 0.05) [26], and Aziz et al. found that each repeat was associated with − 0.22 change in BMI units per year (p = 0.017) [33]. Similarly, a study of 47 patients with 42–62 repeats reported a significant association between CAG repeat length and age of feeding tube insertion, as well as age of nursing home admission (p < 0.001 for both) [29].

A larger cohort study (n = 5821) from Enroll-HD confirmed a prior study that showed weight loss increased with increasing CAG repeats [34]. Interestingly, the researchers found that a higher BMI at the beginning of HD onset significantly correlated with slower decline of motor, cognitive, and TFC scores, independent of CAG repeat length (p < 0.001), even if patients were obese or morbidly obese (p < 0.01). Also, results of the van der Burg group [34] suggest that a higher BMI can supersede the impact of a patient’s genetic burden on disease progression.

In summary, these clinical findings illustrate a relationship of the extent of weight loss with HTT CAG repeat numbers.

Higher risk of death correlates with CAG repeat lengths of HTT

HD is ultimately a fatal disease, and three prospective cohort studies have examined the relationship of CAG repeat length with age at death for the subset of patients who died during the study periods [8, 14, 15]. A study of 135 patients, including 41 deceased at ages 18–83, found that each CAG repeat conferred a higher risk of death during the study period, assessed by the hazard ratio (HR) of 1.09 for one triplet repeat increase (p = 0.002) [15]. For the 51 deceased patients in a study of 360 patients, repeat length accounted for 18% of variation in age at death (p ≤ 0.01) [8]. However, another study with 40 deceased patients out of 112 total found that repeat length accounted for 58% of variation in age at death (p = 0.001) [14]. A larger study of 4448 patients confirmed a significant correlation between CAG repeat length and age of death, but the variance of age at onset and the age of death were not explained by CAG repeat length, especially at lower disease-penetrant CAG scores, indicating other factors are involved (p < 0.0001) [35]. A limitation of studies of this type is susceptibility to ascertainment bias, since patients still alive at the end of the study period would naturally be excluded from analysis; thus, length of follow-up and proportion of the cohort deceased may also affect estimated correlations.
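To illustrate what a per-repeat hazard ratio of 1.09 implies for larger differences in repeat length, the sketch below compounds the ratio across repeats. It assumes a proportional hazards model in which the log hazard is linear in repeat number; that is an assumption made here for illustration and is not a result reported by the cited study [15]. The function name is hypothetical.

```python
def hazard_ratio_for_repeats(hr_per_repeat: float, extra_repeats: int) -> float:
    """Hazard ratio for `extra_repeats` additional CAG repeats.

    Assumes a log-linear effect of repeat number (illustrative assumption only).
    """
    if hr_per_repeat <= 0:
        raise ValueError("hazard ratio must be positive")
    return hr_per_repeat ** extra_repeats


# Compound the reported per-repeat HR of 1.09 over 1, 5, and 10 repeats.
for k in (1, 5, 10):
    print(f"+{k} repeats: HR = {hazard_ratio_for_repeats(1.09, k):.2f}")
```

Under this assumption, ten additional repeats would correspond to a hazard ratio of about 2.37. Extrapolation beyond the repeat range observed in the study is not supported by the data.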

Although studies differ in their estimates of the exact contribution of repeat length to age at death, the overall evidence clearly supports a significant relationship linking CAG repeat lengths with risk of mortality.

Brain neurodegeneration and biomarkers correlated with CAG repeat lengths of HTT

Reduced brain volume and neurodegeneration

Post-mortem examination of human HD brains and imaging in live HD patients have revealed massive cell death and neurodegeneration in the striatum, and significant loss of volume occurs in cortical brain regions [1, 4] (Table 5). An average loss of 20% brain volume in terminal HD suggests that cell loss occurs in more brain regions than only striatum and cortex [36, 37]. A variety of studies have examined neuropathology directly, either at autopsy or in living patients by MRI or CT scan [9, 36, 38,39,40]. Studies quantifying striatal atrophy at a single time point (age adjusted) found that CAG repeat length accounted for 66–78% of variance in neuronal cell loss in striatum [38, 39]. Studies measuring rate of atrophy during a 1–2-year time period found significant but more modest relationships with CAG repeat number [9, 40]. For example, an MRI study in living HD patients found 0.3% faster cortical thinning for every CAG repeat in some cortical regions, although not all cortical regions and no subcortical regions had atrophy rates that were significantly associated with repeat number during the 1-year study [40]. The study by Hadzi and colleagues [36] used a cluster analysis approach for neuropathology in HD brains. The “striatal” cluster encompassed regions including striatum, which shows correlation of CAG repeat number with neuropathology. The overall evidence supports the role of mutant HTT CAG repeat length in HD brain neurodegeneration.

Table 5 Brain volume and white matter integrity of HD correlate with CAG repeat numbers

Various mathematical correlation models for neuroimaging and CAG or age-adjusted repeat length show significant decreases in brain tissue volume from premanifest HD to advanced HD, which include whole brain atrophy, atrophy of the caudate and putamen, midbrain, pons, thalamus, and gray and white matter of all cerebrocortical regions, and ventricular expansion. Atrophy was more severe with higher CAG repeat numbers, disease burden (DB) scores [31, 32, 41], and CAP scores, where CAP = (age at imaging) × [CAG − 33.6]. CAP is often used for age-adjustment in imaging studies [41, 42] (Table 5). These brain volume studies show that multiple brain regions, not only the striatum, degenerate beginning in premanifest HD [20, 31, 32, 43], and that brain atrophy occurs in a central to peripheral and posterior to anterior pattern [42]. Loss of white matter integrity also begins in premanifest HD in the corpus callosum [45], subcortical white matter of all cerebrocortical lobes [42], and white matter tracts of the basal ganglia [44]. These results suggest that neuronal connectivity deficit is an early event in HD that is exacerbated by CAG score. Additionally, axonal demyelination occurs more sharply in premanifest HD, and slows in early manifest HD [45].
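The age adjustment provided by CAP can be sketched as follows, using the formula given above, (age at imaging) × [CAG − 33.6]. The participants listed are hypothetical and are included only to show how age and repeat number combine into a single score; the function name is also an assumption.

```python
def cap_score(age_at_imaging: float, cag_repeats: int) -> float:
    """CAP = (age at imaging) x (CAG - 33.6), as given in the text."""
    return age_at_imaging * (cag_repeats - 33.6)


# Hypothetical participants: (label, age at imaging in years, CAG repeats).
participants = [("A", 30, 42), ("B", 45, 42), ("C", 30, 48)]

# Rank from highest to lowest CAP, i.e. highest expected atrophy first.
ranked = sorted(participants, key=lambda p: cap_score(p[1], p[2]), reverse=True)

for label, age, cag in ranked:
    print(f"{label}: age {age}, CAG {cag}, CAP = {cap_score(age, cag):.1f}")
```

Participants A and B carry the same repeat number, but B is older at imaging and therefore has the higher CAP. Participant C matches A in age but has more repeats and ranks highest of the three.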

Cellular and molecular HD brain alterations

Molecular imaging techniques have identified localized changes in receptor density, ionic and glucose balance and increased glial activation that correlate with CAG repeat length in premanifest and early manifest HD (Supplemental Table 1) [32, 46,47,48,49,50]. Decreased blood oxygenation showed significant correlation between CAG repeat and loss of connectivity between the primary motor cortex and motor- and visual-associated areas; connectivity to the posterior cingulate cortex, an area of pain perception and autobiographical information, was significantly increased [41].

Globally in the brain, elevated sodium levels and decreased glucose uptake are correlated with CAG repeats and disease burden [46, 48]. Quantification of brain glucose uptake was also used as a variable in mathematical prediction modeling for HD symptom progression and improved prediction of age at onset of motor symptoms by 37% [46].

Receptor density studies, measured by tracer-tagged receptor agonists, identify changes in neurotransmitter systems in premanifest and early stage HD that are correlated with CAG and age-adjusted repeat scores. Dopamine receptor 2 (D2) density is highly decreased (50%) in the striatum of premanifest to HD grade 2 patients [32]. Prefrontal and premotor cortical cannabinoid receptor 1 (CB1) levels are significantly decreased in early-to-late-stage HD [50].

Other studies identified neurotransmitter receptor changes in HD patients but did not model a correlation with CAG repeat number (see Pagano et al. for a review [51]). Findings included upregulation of the adenosine receptor A1 in the thalamic and cerebral cortical regions of early premanifest HD patients, but downregulation in the caudate and amygdala of later premanifest and manifest HD patients [52]. GABA receptors are also decreased in the striatum and thalamus of premanifest HD [41], along with opioid receptors [53]. These data show that neurotransmitter systems are disrupted before a patient experiences HD symptoms. Cell-specific markers also identify changes in premanifest HD that correlate significantly with CAG or adjusted scores. Activated microglia are sharply upregulated in the striatum, and also diffusely throughout the cerebral cortex [47, 49, 54].

Overall, neuroimaging of brain tissue degeneration and biomarker studies spanning premanifest through all stages of HD show that HD is a whole brain disease, not just a disease of the striatum. Volumetric studies of neurodegeneration, studies of white matter integrity disruption, molecular imaging studies of receptor systems, and cell-targeted imaging studies have characterized cellular and molecular HD brain features that are significantly associated with CAG repeat lengths of the HTT gene.

Discussion of HTT gene expression in human brain

Given the evidence in support of HTT CAG repeat length-dependent clinical features in HD, a logical question to address is what is the distribution of HTT gene expression in human brain regions? Analyses of HTT gene expression can indicate its expression in human striatal and cortical brain regions primarily responsible for the behavioral dysfunctions occurring in HD, compared to other brain regions that also undergo neurodegeneration and molecular alterations (explained in section “Brain neurodegeneration and biomarkers correlated with CAG repeat lengths of HTT”, above).

Gene expression microarray analyses of HTT in normal adult human brain regions have been compiled by the Allen Human Brain Atlas resource (http://human.brain-map.org/). Notably, these data show ubiquitous HTT expression among the 169 human brain regions investigated (Fig. 2 and supplemental Table 2). HTT gene expression was reported as average log2 expression values among donors. Multiple normalization steps provided log2 gene expression values relative to the normalized mean of all human brain gene expression levels (for 21,245 genes) (see Technical White Paper “Microarray Data Normalization” of the Allen Brain Atlas resource, http://human.brain-map.org/). HTT expression was present in all regions examined, with some variations in expression levels among the regions.
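The per-region averaging described above can be sketched as follows. The numbers are placeholders chosen for easy arithmetic and are not Allen Human Brain Atlas data; the region names are used only as labels. The conversion to a fold difference assumes that each value can be read as a log2 ratio to the all-gene mean, which is an interpretation of the normalization and should be checked against the white paper.

```python
from statistics import mean

# PLACEHOLDER log2 values for six donors per region (not real atlas data).
donor_log2 = {
    "caudate nucleus": [1.0, 1.5, 2.0, 1.5, 1.0, 2.0],
    "hippocampus": [0.5, 1.0, 1.5, 1.0, 0.5, 1.5],
}


def summarize(values):
    """Return (mean log2 value, implied fold difference 2**mean)."""
    avg = mean(values)
    return avg, 2.0 ** avg


region_summary = {region: summarize(vals) for region, vals in donor_log2.items()}

for region, (avg, fold) in region_summary.items():
    print(f"{region}: mean log2 = {avg:.2f}, fold vs. all-gene mean = {fold:.2f}")
```

Averaging is done on the log2 scale, which corresponds to a geometric mean on the linear scale. A mean log2 value of 1.0 would therefore indicate expression at twice the reference level under the stated assumption.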

Fig. 2

HTT gene expression in human brain regions. a Human brain regions. HTT gene expression was assessed by microarray analyses in 169 brain regions (from six donor brains). Brain regions are indicated by the second bar (at top of figure) with colors indicating the main brain regions of (top bar): FL frontal lobe, insula, CgG cingulate gyri, HiF hippocampal formation, OL occipital lobe, PL parietal lobe, TL temporal lobe, Amg amygdala, BsFb basal forebrain, Str striatum, Clstr claustrum, Epithal epithalamus; hypothalamus, Thal thalamus, Subthal subthalamus, MES mesencephalon, CbCtx cerebellar cortex, CbN cerebellar nuclei; pons, MY myelencephalon, WM white matter structures, CP choroid plexus of the lateral ventricles. Color keys of these brain regions, and subregions, are shown in Supplemental Table 2. b HTT gene expression in human brain regions. Graphical representation of log2 values of HTT gene expression assessed by microarray analyses (averaged for six donor brains) is shown for 169 brain regions. The colors of the graphed bars correspond to the brain region colors indicated in panel ‘a’. The key for brain regions #1-169 is shown in Supplemental Table 2

Human brain images of HTT gene expression localization overlaid with coronal MRI (magnetic resonance imaging) images and 3D-rendered sagittal images illustrate the expression of HTT in striatum, primary motor cortex (Brodmann area 4), premotor cortex (Brodmann area 6), and globus pallidus (Fig. 3). Examination of these regions in normal human brain is of interest since these areas undergo neurodegeneration in HD. In these normal brain images, HTT is expressed in striatal areas of caudate nucleus, nucleus accumbens, and putamen (Fig. 3) which function in movement and in HD chorea [37, 55]. Expression occurs in the cortical regions of Brodmann areas 4 and 6 that are essential for motor control and mental functions (Fig. 3). HTT expression occurs in hippocampus (Fig. 3) that is important in cognitive functions in normal brain, and in HD cognitive deficits [56, 57]. Additionally, normal human brain regions of cortex (Brodmann areas 1, 2, 3, 17–19, 41), nucleus accumbens, globus pallidus, cerebellar cortex, substantia nigra, superior olive, lateral vestibular nucleus, oculomotor nucleus, and others display abundant HTT expression. In HD, these brain regions undergo neurodegeneration and are involved in behavioral dysfunctions that occur in HD.

Fig. 3

Human brain images of HTT gene expression. HTT gene expression localization is illustrated in the human brain regions of a striatum—caudate nucleus, b primary motor cortex Brodmann area 4, c premotor cortex Brodmann area 6, and d hippocampus. Coronal brain MRI images and space filling 3D rendered sagittal brain images show colored overlaid gene expression levels (log2 values) of HTT gene expression. The MNI coronal coordinate is indicated below the coronal slab images (MNI: Montreal Neurological Institute coordinates are spatially organized to standardized stereotaxic space). Images credits: Allen Institute (human.brain-map.org)

These data of HTT expression in normal human brain illustrate that the gene is expressed in HD-affected brain areas (Supplemental Table 3). But HD brains from human subjects have not yet undergone complete HTT expression analyses. It is predicted that HD brains may display HTT expression similar to that in normal brain, since no differences have been found in the HTT gene promoter regions of mutant and normal HTT. Nonetheless, it is critical in future studies to evaluate the distribution of mutant HTT expression in human HD brains.

Overall, in normal human brain, HTT expression occurs in brain regions that are affected in HD by cellular loss and neurodegeneration. However, HTT is also expressed ubiquitously throughout the brain at levels substantially above normalized gene expression levels of all human genes. In HD, because behavioral dysfunctions and neurodegeneration occur in particular brain regions, the selectivity of HTT-mediated dysfunctions will involve the translated mutant huntingtin (Htt) protein and its unique cellular properties compared to normal Htt.

Human brain mutant HTT CAG repeat length mechanisms responsible for clinical HD features

The multiple clinical features displayed by HD patients result from a range of mutant CAG repeat expansions in the HTT gene, inherited in autosomal dominant manner in HD. The important question to address in HD research is what mutant HTT-initiated pathways are responsible for each of the numerous behavioral and brain biomarker deficits in HD? Elucidation of mutant HTT pathways leading to the clinical features will be of high importance in understanding the complex mechanisms responsible for CAG repeat length-dependent outcomes. Tremendous effort has been ongoing in the HD field to elucidate molecular pathways involved in HTT pathogenesis. Cellular pathway systems hypothesized to participate in HD include gene expression, synaptic vesicle trafficking, autophagy and energy metabolism, and are related as illustrated in Fig. 4, based on numerous studies in the field [1,2,3, 58,59,60,61,62,63,64,65,66]. It will be essential to gain understanding of mutant Htt-initiated pathway mechanisms that lead to the multiple clinical features of HD. Such knowledge will facilitate translational research to provide new targeted opportunities for therapeutics development in HD.

Fig. 4

Cellular systems linked to mutant Htt (mHtt) in non-human animal models. a Cellular functions regulated by mutant Htt (mHtt) protein interactions. Htt protein interactions involve the interrelated cellular pathways of transcriptional regulation, synaptic neurotransmitter systems, energy and mitochondria regulation, lysosome and autophagy, and protein translation and folding [3,4,5, 58,59,60,61,62,63,64,65,66]. This illustration is based on numerous mechanistic investigations of mutant Htt-induced protein interactions and molecular pathways in non-human animal model systems. b Clinical HD neurodegeneration and behavioral deficits. Mutant Htt protein interactions with functional cellular pathways result in motor and cognitive deficits, compromised daily living capacity, weight loss, and brain tissue loss in neurodegeneration. Ongoing HD research is defining the human HD brain mechanisms for mutant Htt induction of human HD clinical deficits.