Introduction

The lentiform nucleus is a lens-shaped, bilateral structure in the basal ganglia bounded by the internal and external capsules. It has three components: the internal and external globus pallidus (Diamond et al. 1985) and the putamen (Fig. 1). The putamen receives dense corticostriate projections from throughout the cortex and funnels information to the external and internal globus pallidus through dense intrabasal ganglionar fibers (Snell 2010). In addition, fibers from the internal globus pallidus project to several thalamic nuclei and continue back to the cortex, primarily to premotor area 6. These projections form the cortico-striato-thalamo-cortical loop, which is involved in initiating and terminating movements (Snell 2010). Both the globus pallidus, and the putamen especially, receive dopamine-rich connections from the substantia nigra—the main source of dopamine for the basal ganglia (Snell 2010). Dopamine projections to the basal ganglia are part of the brain’s reward circuitry (Schultz 2002).

Fig. 1

A coronal slice in a subject from the ADNI sample. The light-blue label represents the left globus pallidus and the darker blue label shows the putamen. External and internal portions of the globus pallidus are segmented as a single structure in FSL/FIRST (Patenaude et al. 2011)

The lentiform nucleus is implicated in several heritable degenerative and psychiatric disorders. Its role in movement disorders was first identified in studies of hepatolenticular degeneration (Wilson 1912)—a disorder that affects both the liver and the lentiform nucleus. Deficits in lentiform nucleus volume have been observed in Parkinson’s disease (Dexter et al. 1991; Obeso et al. 2000), Huntington’s disease (Marsden et al. 1983; Reiner et al. 1988), and normal aging (Raz et al. 2003). More subtle differences in lentiform nucleus volume are reported in some but not all studies of bipolar disorder (Arnone et al. 2009; Kempton et al. 2008; Strakowski et al. 1999), attention deficit hyperactivity disorder (Castellanos et al. 1996; Ellison-Wright et al. 2008) and schizophrenia (Elkashef et al. 1994; Ellison-Wright et al. 2008). Lesions in the midbrain tegmentum—which has reciprocal connections with the lentiform nucleus—are associated with visual and auditory hallucinations (Cascino and Adams 1986). In addition to its many links with known pathology, the lentiform nucleus is a plausible target for genetic analysis, as its volume is highly heritable (Kremen et al. 2010) and can be reliably measured using automated segmentation methods (Morey et al. 2010).

Building on prior studies, here we performed an unbiased genome-wide association study (GWAS) in two large independent cohorts to discover common genetic variants associated with differences in lentiform nucleus volume. The term “unbiased” is often used to describe the type of genetic analysis we performed—a genome-wide association scan—in which we search the whole genome for statistical associations, rather than prioritizing or choosing only a limited subset of variants, such as candidate genes. Arguably, if the genetic loci influencing a given trait are unknown, a broad survey of the genome may avoid missing associations in regions that have no currently known relation to the trait. Association statistics for the genetic variants were combined meta-analytically across two cohorts to boost power and reduce the risk of false positive findings. We assessed both an elderly and a young adult cohort to discover genes with robust associations throughout life.

Methods and materials

Subjects

We analyzed neuroimaging and genome-wide genotype data from two independent samples: the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Queensland Twin Imaging Study (QTIM). The Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a large longitudinal study initiated in 2003 as a public-private partnership between the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and non-profit organizations. The study aims for ADNI are to identify and investigate biological markers of Alzheimer’s disease through a combination of neuroimaging, genetics, neuropsychological tests and other measures in order to develop new treatments, track disease progression, and lessen the time required for clinical trials. The study was conducted according to the Good Clinical Practice guidelines, the Declaration of Helsinki, and U.S. 21 CFR Part 50—Protection of Human Subjects, and Part 56—Institutional Review Boards. Written informed consent was obtained from all participants before protocol-specific procedures were performed. Further information on inclusion criteria and the study protocol may be found online (http://adni-info.org/). Baseline structural MRI scans and genetic data were available for 818 subjects (as of August 1, 2011) from the public ADNI database (http://www.loni.ucla.edu/ADNI/). Here we analyzed all ADNI subjects as a single group, to exploit the broader phenotypic continuum (Petersen 2000) and increase power (Durston et al. 2005; Stein et al. 2010). Some subjects were excluded to eliminate problems caused by population stratification (Lander and Schork 1994; McCarthy et al. 2008) using multi-dimensional scaling as outlined previously for the same dataset (Stein et al. 2010). The remaining sample had 742 Caucasian subjects left for the analysis; this represented the largest homogeneous group attainable from the ADNI cohort. 
After removing subjects by applying quality control criteria to the lentiform nucleus segmentations (discussed below), the final ADNI dataset consisted of 706 subjects (average age ± s.d.: 75.5 ± 6.8 years; 413 men/293 women) including 162 patients with AD (75.6 ± 7.6 years; 88 men/74 women), 346 with mild cognitive impairment or MCI (75.0 ± 7.3 years; 221 men/125 women), and 198 healthy elderly controls (76.1 ± 4.8 years; 104 men/94 women).

The Queensland Twin Imaging Study (QTIM) is an ongoing, 5-year longitudinal project tasked with identifying genetic influences on brain structure. As of August 1, 2011 there were 672 subjects with both structural MRI scans and genome-wide genotyping data. An additional 40 subjects underwent repeated scans, which we used to verify the reliability of the segmentation algorithm. Each member of every twin pair and their siblings was assessed via extensive diagnostic interviews to exclude anyone with a history of brain related disorders, diseases, or injuries. To avoid problems caused by population stratification in this Caucasian sample, 10 subjects were removed from the QTIM sample, as determined by MDS analysis. All subjects were right-hand dominant as determined by Annett’s Handedness Questionnaire (Annett 1970). Written informed consent was obtained from all participants before protocol-specific procedures were performed. After quality control of the lentiform nucleus segmentations (discussed below), the final group we analyzed consisted of 639 subjects from 364 families (98 monozygotic twin pairs; 127 dizygotic twin pairs; 3 dizygotic triplet trios; 117 singletons; 63 siblings; 23.1 ± 3.1 years; 251 men and 388 women).

Genotyping and imputation

Genome-wide genotype data were collected using the Human610-Quad BeadChip (Illumina, Inc., San Diego, CA) in both samples. SNPs were excluded from the analysis based on filtering criteria that are standard in GWAS (Wellcome Trust Case Control Consortium 2007). In the ADNI sample, SNPs were excluded based on: call rate <95 % (42,176 SNPs removed), significant deviation from Hardy-Weinberg equilibrium at P < 1 × 10−6 (263 SNPs removed), minor allele frequency <0.01 (60,919 SNPs removed), and a platform-specific quality control score of <0.15 to eliminate “no call” genotypes (variable number of missing genotypes across subjects). Similarly, in the QTIM sample SNPs were excluded based on: call rate <95 % (8,447 SNPs removed), significant deviation from Hardy-Weinberg equilibrium at P < 1 × 10−6 (2,841 SNPs removed), minor allele frequency <0.01 (33,347 SNPs removed), and a platform-specific quality control score of <0.07 to eliminate “no call” genotypes (variable number of missing genotypes across subjects). In both samples, we restricted the analysis to autosomal SNPs, excluding those in mitochondrial DNA and on the sex chromosomes.
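As an illustration only, the three per-SNP filters named above (call rate, Hardy-Weinberg equilibrium, minor allele frequency) can be sketched as follows. The function names and the genotype-count interface are hypothetical, and the Hardy-Weinberg check here is a simple 1-df chi-square test, which is an assumption: the text does not say which test was used, and the platform-specific quality score is not modeled.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_subjects,
              min_call_rate=0.95, min_maf=0.01, hwe_threshold=1e-6):
    """Return True if a SNP survives the call-rate, MAF and HWE filters."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / n_subjects
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_threshold
```

The default thresholds mirror the values quoted in the text (95 % call rate, P < 1 × 10−6, minor allele frequency 0.01); everything else is a simplification.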

Imputation of hard genotype calls may be used to infer missing values based on the linkage disequilibrium among SNP sets. In addition, imputation may be used to infer SNPs not directly genotyped in a given sample, but genotyped in a reference dataset. The quality of imputed SNPs depends on the strength of the linkage disequilibrium between hard genotyped SNPs and the imputed SNPs. In ADNI, we excluded any SNPs imputed with an R² value of <0.3 from the analysis (62,053 SNPs removed); the same steps were taken for the QTIM (54,337 SNPs removed). We also excluded imputed SNPs with a minor allele frequency of <0.01 for both the ADNI (45,818 SNPs removed) and QTIM (38,481 SNPs removed). For each sample, we performed imputation with MaCH, which uses a Markov chain Monte Carlo method to infer missing genotypes and SNPs robustly and accurately (Abecasis et al. 2010). After all rounds of quality control filtering, the ADNI dataset had 2,449,382 SNPs and the QTIM dataset had 2,439,807 SNPs. We analyzed the set of overlapping SNPs present in both datasets, totaling 2,380,200 SNPs.

Image acquisition and pre-processing

High-resolution structural brain MR images were collected from both the ADNI and QTIM samples. Structural MRI scans in the ADNI study were obtained using a standardized protocol to maximize consistency across 58 image acquisition sites, using 1.5 Tesla MRI scanners. A T1-weighted 3D MP-RAGE sequence was used (TR/TE = 2400/1000 ms; flip angle = 8˚; FOV = 24 cm; with a final voxel resolution = 0.9375 × 0.9375 × 1.2 mm3).

In the QTIM cohort, structural MRI scans were obtained on a single 4 Tesla scanner (Bruker Medspec). T1-weighted images were acquired with an inversion recovery rapid gradient echo sequence (TI/TR/TE = 700/1500/3.35 ms; flip angle = 8˚; slice thickness = 0.9 mm, with a 256 × 256 acquisition matrix; with a final voxel resolution = 0.9375 × 0.9375 × 0.9 mm3).

Skull and all other non-brain tissues were removed from each subject’s scan using the Brain Extraction Tool (BET; Smith 2002) prior to analysis. Test-retest data were also available for 40 young normal individuals scanned on two occasions approximately 4 months apart.

Automated delineation of lentiform nucleus volume

We delineated the lentiform nucleus structures using the well-validated, automated FIRST segmentation algorithm (http://www.fmrib.ox.ac.uk/fsl/first/index.html), which is part of the FSL (Smith et al. 2004) image processing package. Morey et al. (2010) showed that the FIRST segmentation algorithm has relatively high reproducibility and accuracy for each of the subcortical structures segmented. Using a Bayesian framework, FIRST provides accurate and validated segmentations of subcortical brain structures (Patenaude et al. 2011).

Quality control of lentiform nucleus segmentation

The quality of segmentations was assessed by examining the left and right globus pallidus and putamen separately; these were checked by hand (by DPH) following established guidelines (Duvernoy and Bourgouin 1999). If any segmentation did not properly delineate any single structure the subject was removed from the analysis. After quality checking, 36 subjects were excluded from the ADNI sample and 23 subjects were excluded from the QTIM sample. As a further measure of segmentation quality, we examined the consistency of independent runs of the FIRST algorithm on repeated scans of 40 subjects, taken a short interval apart. The reliability of lentiform nucleus segmentation was tested by computing intraclass correlation coefficients (ICC) from the 40 repeated scans. All ICC calculations were performed using the psy package in the R statistical software (version 2.13.0; http://www.r-project.org/).
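To make the reliability measure concrete, the following is a minimal sketch of an intraclass correlation coefficient computed from repeated scans. It implements the two-way, single-measure, absolute-agreement form (often written ICC(2,1)); which ICC variant was reported is not stated in the text, so that choice, like the function name, is an assumption.

```python
def icc_agreement(scores):
    """Two-way, single-measure, absolute-agreement ICC.

    scores: one list per subject, each holding k repeated measurements
    (for example, the volume from scan 1 and from scan 2).
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of repeated scans per subject
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-session mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Identical volumes on both scans give a value of 1; scan-to-scan noise or a systematic shift between sessions pulls the value below 1.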

Fig. 2

Scan and re-scan volume values of the lentiform nucleus delineated using the automated FIRST segmentation algorithm. The black line on the diagonal represents the ideal situation where the segmented volumes are identical. In general, there is good agreement between the segmented volumes for both scans

Heritability analysis

To evaluate the overall genetic contribution to the variability in volumes, the heritability of the lentiform nucleus volume was estimated using a structural equation model (SEM) as implemented in the software package, Mx (version 1.68; http://www.vcu.edu/mx/). Heritability was estimated using the classic pathway-based ‘ACE’ model (Chiang et al. 2012; Neale et al. 1992). In families from the full QTIM sample, we used this analysis to compare the observed pattern of covariances in lentiform nucleus volume to what would be expected given different degrees of genetic influence. The heritability of the left, right, and average bilateral lentiform nucleus volumes was analyzed separately. We chose to study the average bilateral lentiform nucleus because it shows higher heritability than the left and right lentiform nucleus separately and because the inevitable segmentation errors should be a smaller proportion of the total volume if both sides are combined. Additionally, we estimated the genetic correlation (r g) between the putamen and globus pallidus volumes using Mx in the full QTIM sample.
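The intuition behind the ACE decomposition can be illustrated with Falconer's classical approximation, which derives the three variance components from the monozygotic and dizygotic twin correlations. This is not the maximum-likelihood structural equation model fitted in Mx; it is a back-of-the-envelope sketch, and the function name and the example correlations are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate ACE variance components from twin-pair correlations.

    r_mz: correlation of the trait within monozygotic pairs
    r_dz: correlation of the trait within dizygotic pairs
    """
    a2 = 2.0 * (r_mz - r_dz)  # additive genetic variance (heritability)
    c2 = 2.0 * r_dz - r_mz    # shared (common) environment
    e2 = 1.0 - r_mz           # unique environment plus measurement error
    return a2, c2, e2


# Hypothetical correlations chosen only to illustrate the arithmetic.
a2, c2, e2 = falconer_ace(r_mz=0.8, r_dz=0.4)
```

Under this approximation the three components always sum to 1, and a dizygotic correlation of about half the monozygotic correlation implies no shared-environment contribution.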

Genetic analysis

In the ADNI sample, we tested each SNP dosage value for association with the lentiform nucleus volume, assuming, by default, an additive model: each SNP dosage value was recorded as the number of minor alleles, with an implicit correction for the accuracy of imputation at that SNP. Tests of association were conducted using linear regression as implemented in the publicly available program, mach2qtl (Abecasis et al. 2010). We controlled for age and sex, which both showed significant effects on lentiform nucleus volume. We also covaried for age², sex × age, and sex × age² to account for any quadratic or interaction effects. In addition, we controlled for intracranial volume (ICV), calculated as 1/(determinant) of the transformation matrix from registration to the FSL common template. We chose to correct for head size because we are interested in individual differences in lentiform nucleus volume unrelated to differences in head size (Buckner et al. 2004). In the QTIM sample, association testing was carried out using mixed-model regression to control for the kinship structure of the twin sample, using the family-based association test implemented in merlin (Chen and Abecasis 2007) and the same covariates as in the ADNI model.
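The structure of the single-SNP model for unrelated subjects can be sketched as an ordinary least-squares fit of volume on allele dosage plus the covariates listed above. This is not mach2qtl or merlin; it is a plain-Python illustration with hypothetical function names, simulated data, and one added assumption: age is mean-centered before squaring to keep the normal equations well conditioned.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(dosage, age_c, sex, icv):
    """Intercept, dosage, age, sex, age^2, sex x age, sex x age^2, ICV."""
    return [1.0, dosage, age_c, sex, age_c ** 2,
            sex * age_c, sex * age_c ** 2, icv]


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def demo(n=200, seed=0):
    """Fit the model to simulated, noise-free data (all values hypothetical)."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        dosage = rng.uniform(0.0, 2.0)   # imputed minor-allele dosage
        age_c = rng.uniform(-15.0, 15.0)  # age minus the sample mean
        sex = float(rng.randint(0, 1))
        icv = rng.uniform(0.8, 1.2)
        rows.append(design_row(dosage, age_c, sex, icv))
        y.append(6500.0 - 100.0 * dosage - 10.0 * age_c + 200.0 * sex + 500.0 * icv)
    return fit_ols(rows, y)
```

The coefficient at index 1 of the returned list is the per-allele effect on volume (the β reported for each SNP). A family-based sample would additionally need a random effect for kinship, which this sketch omits.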

Meta-analysis of genetic results

Genome-wide association results from the ADNI and QTIM samples were meta-analyzed using a fixed effects inverse variance-weighted method, as implemented in METASOFT (Han and Eskin 2011). Beta coefficients from the regression analysis of each SNP from both studies were pooled based on the inverse of the variance of each beta coefficient. In addition to the standard fixed-effects meta-analysis, we performed a random-effects meta-analysis in METASOFT. The random-effects meta-analysis still follows the inverse variance-weighted model, but can more appropriately model the population statistics in cases where the effect size is not the same across cohorts (Han and Eskin 2011).
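The fixed-effects, inverse variance-weighted pooling step can be written out in a few lines. The sketch below is not METASOFT; the function name is hypothetical, and the two-sided P-value assumes a normal approximation for the pooled z statistic. The example call uses the per-cohort estimates reported for rs1795240 in the Results.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse variance-weighted fixed-effects meta-analysis.

    Returns the pooled beta, its standard error, the z statistic and a
    two-sided P-value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# rs1795240: ADNI (beta = -143.48, SE = 28.15), QTIM (beta = -76.57, SE = 30.12)
pooled_beta, pooled_se, z, p_value = fixed_effects_meta(
    [-143.48, -76.57], [28.15, 30.12]
)
```

With these inputs the pooled P-value comes out at roughly 5 × 10−8, in line with the meta-analyzed value reported for this SNP; small differences are expected because the published betas and standard errors are rounded.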

Gene-based tests and pathway enrichment analysis

The meta-analyzed P-values (P MA ) from the full set of SNPs from the GWAS analysis were used for gene-wide, gene-based association testing with the software package KGG (Li et al. 2010). No prioritizing or pre-selection of genes was performed. Gene-based tests in KGG combine univariate association statistics to evaluate the cumulative evidence of association in a gene with a phenotype, using the GATES-Simes test (Li et al. 2011). Similarly, KGG is integrated with biological pathway databases (e.g., KEGG) and combines gene sets to test for significant enrichment of a number of disease and biological pathways (Li et al. 2011). Pathways are considered to be significantly enriched if they contain more significant gene-based test statistics than expected by chance.
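To show how per-SNP P-values can be combined into one gene-level P-value, the sketch below implements the plain Simes procedure. It is a simplification, not the GATES test used by KGG: GATES extends Simes by replacing the raw SNP counts with effective numbers of independent tests that account for linkage disequilibrium, which is not modeled here. The function name and the example P-values are hypothetical.

```python
def simes_pvalue(pvalues):
    """Plain Simes combination of the SNP-level P-values within one gene."""
    ordered = sorted(pvalues)
    m = len(ordered)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


# Three hypothetical SNP P-values within a single gene.
gene_p = simes_pvalue([0.01, 0.04, 0.03])
```

Because the smallest P-value is multiplied by the number of SNPs, a single strong signal in a large gene is penalized more heavily than the same signal in a small gene.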

Results

Lentiform nucleus segmentations

In the ADNI sample, the volumes of the left (6422.2 ± 723.9 mm3) and right (6450.9 ± 686.9 mm3) lentiform nucleus were highly correlated (r = 0.83; P < 0.0001; df = 704). Similarly, in the QTIM sample the volumes of the left (6554.4 ± 744.8 mm3) and right (6729.5 ± 765.4 mm3) lentiform nucleus were highly correlated (r = 0.84; P < 0.0001; df = 637). Both samples have a slight asymmetry between left and right lentiform nucleus volume. In the ADNI sample the right lentiform nucleus was 0.4 % larger on average than the left. Similarly, in the QTIM sample the right lentiform nucleus was 2.6 % larger on average than the left. This follows a general trend in the brain where bilateral subcortical structures are slightly larger in the right hemisphere (Toga and Thompson 2003). As expected, because the cohort is younger, the average volumes for the QTIM sample were larger than the ADNI sample (Left: t = 3.30; P = 0.0010; ∆2.0 %; Right: t = 7.00; P < 0.0001; ∆4.1 %; Average Bilateral: t = 5.37; P < 0.0001; ∆3.1 %).

Reliability of lentiform nucleus segmentation

To examine the reliability of the automated FIRST segmentations, we obtained repeated scans for 40 subjects from the QTIM sample (time between scans: 120 ± 55 days) and applied the FIRST algorithm to each scan separately. Using the intraclass correlation coefficient (ICC), we found the FIRST segmentations to be very highly reliable for the left (ICC = 0.922), right (ICC = 0.890), and average bilateral (ICC = 0.928) lentiform nucleus volumes (Fig. 2).

Heritability of lentiform nucleus volume

Using twin and family data from the QTIM, we modeled the additive genetic effects (A), effects of the common environment shared by both twins (C), and unique environment effects and experimental error (E). The components of the ‘ACE’ model are used to estimate the amount of variance in a measure that can be ascribed to purely genetic influences (its heritability). Lentiform nucleus volume is highly heritable (between 70 and 80 %), as has been found for many other structures in the brain (Kremen et al. 2010) (Table 1). The heritability of the lentiform nucleus volume is also evident in a scatterplot showing the similarity among twin pairs: monozygotic twins (black dots) generally have more similar lentiform nucleus volumes than dizygotic twins (open dots; Fig. 3). The genetic correlation (r g) is the proportion of the observed variance between two traits that can be explained by common genetic influences (Neale et al. 1992). As the structures of the lentiform nucleus are closely related, we expect them to share common genetic determinants. Indeed, the genetic correlation between the putamen and globus pallidus was high: r g = 0.54 (95 % CIs 0.39, 0.82) for the left and r g = 0.56 (95 % CIs 0.40, 0.69) on the right. In addition, the genetic correlation between the left and right lentiform nucleus reveals that the volume of the structure on each side has almost perfect overlap in its genetic determinants: r g = 0.93 (95 % CIs 0.88, 1.00).

Table 1 Heritability estimates (a²) for lentiform nucleus volume. These analyses were run in Mx on 637 individuals (i.e., including up to 3 individuals per family, so two non-twin siblings who were the 4th family member were not included). Data were winsorised to ±3.3 SD. Sex and age were included as covariates
Fig. 3

Scatterplot of lentiform nucleus volume in monozygotic (black dots) and dizygotic (open dots) twin pairs from the QTIM. Data points closer to the diagonal line represent similar lentiform nucleus volumes across a given twin pair. In general, the lentiform nucleus volumes in the monozygotic twins are closer than their dizygotic counterparts, which is a sign of genetic influence (confirmed by our heritability analysis)

Genome-wide association testing

As the lentiform nucleus is involved in a number of brain disorders and its volume is heritable, we conducted genome-wide tests of association on a large set of SNPs from the two independent cohorts to identify genetic variants that help to explain the considerable genetic influence on lentiform nucleus volume. Q-Q plots of the distribution of P-values for each individual sample show that the association statistics are approximately Normal (Fig. 4). Genomic inflation factor values (lambda) indicate that the distribution of P-values is unbiased and that the results are not likely to be attributable to population stratification.
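The genomic inflation factor can be computed directly from a list of association P-values. The sketch below uses the common median-based definition (the median observed 1-df chi-square statistic divided by its expected median under the null); that this exact definition was used here is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(pvalues):
    """Median-based genomic inflation factor (lambda) from two-sided P-values.

    Each P-value must lie in (0, 1].
    """
    normal = NormalDist()
    # A two-sided P-value maps to a 1-df chi-square statistic as z squared,
    # where z is the normal quantile at p / 2.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    expected_median = normal.inv_cdf(0.75) ** 2  # about 0.455
    return median(chi2) / expected_median
```

Values close to 1 indicate that the bulk of the test statistics follows the null distribution; values well above 1 suggest inflation, for example from uncorrected population stratification.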

Fig. 4

Q-Q plots for observed association P-values of SNPs from both datasets (after removing poorly imputed SNPs and SNPs with a minor allele frequency below 0.01). The genomic inflation factor (lambda) is given for each measure (inset). There is no evidence of inflated P-values influencing the meta-analysis

Meta-analysis

Test statistics from each study were combined meta-analytically to increase the power to detect real effects and to reduce false positives. Beta values and their standard errors for SNPs from each regression model were combined across samples. The signs of Beta values were determined based on the reference allele in each study and combined using a fixed-effects, inverse variance-weighted meta-analysis (Han and Eskin 2011). Meta-analysis is preferred in this case over pooling all subjects into a single analysis, as the two samples have very different age distributions and image acquisition parameters, and the QTIM is a family-based study that requires more complex regression methods (to account for kinship).

In the Manhattan plot of the P-values from each meta-analysis, a number of promising genetic variants were associated with lentiform nucleus volume, including one SNP that exceeds the standard, nominal genome-wide significance level P < 5 × 10−8 after meta-analysis (Fig. 5). A list of the top SNPs from each analysis with a meta-analyzed P-value (P MA ) threshold of P MA  < 1 × 10−6 is given in Table 2.

Fig. 5

Manhattan plot of meta-analyzed P-values (P MA ) from both the ADNI and QTIM samples (N = 1345). Each plotted point is the –log10(P MA ) of a given SNP sorted by chromosome. The dotted gray line denotes the standard, nominal genome-wide significance level –log10(5 × 10−8). Each point plotted above the gray line indicates a genome-wide significant SNP

Table 2 Top SNPs identified from the meta-analysis (P MA  < 1 × 10−6). Individual significance statistics of the average bilateral lentiform nucleus are given for both the ADNI and QTIM samples as well as meta-analyzed statistics (Pooled). The R² value gives the estimated Pearson’s correlation coefficient of a SNP to the nearest genotyped marker. The minor allele (α) for the reported beta coefficient (β) for each sample is also given. Intergenic SNPs within 20 Kb of an annotated gene region are listed as part of that gene. P diag gives the association P-value for a given SNP when controlling for diagnosis in the ADNI sample

A broad band of SNPs shows high association with lentiform nucleus volume in the flavin-containing monooxygenase gene cluster on chromosome 1 (Fig. 6). The most highly associated SNP, rs1795240, is located approximately 5 Kb outside of the flavin-containing monooxygenase 3 (FMO3) gene. It shows genome-wide significant associations with lentiform nucleus volume (P MA  = 4.79 × 10−8). Individual association statistics for rs1795240 show significance in both the ADNI (β = −143.48; SE(β) = 28.15; minor allele = A; P = 3.46 × 10−7) and QTIM (β = −76.57; SE(β) = 30.12; minor allele = A; P = 0.011) samples. The observed effect is likely greater in the ADNI sample due to the greater sample size, but the effect may also increase with age or disease. Additionally, the second most associated SNP, rs1795243 (P MA  = 8.76 × 10−8), lies in an untranslated region of the FMO6P pseudogene. The variant rs1795243 shows strong evidence for association in both samples (ADNI: β = −141.44; SE(β) = 28.53; minor allele = C; P = 7.12 × 10−7; QTIM: β = −76.805; SE(β) = 30.20; minor allele = C; P = 0.011). Additionally, a number of the top hits were located in GATAD2B and EPB41L2 among others (detailed in Table 2). After controlling for diagnosis in the ADNI sample, there was little change in observed P-values (See the P diag column in Table 2). Two dummy variables were added as covariates to the regression model to account for each of the three different diagnostic categories in the ADNI sample. This was not necessary in the QTIM sample, as they are all healthy young adults. The random effects meta-analysis of these same SNPs gave nearly identical results to the fixed-effects meta-analysis (see Fig. 7).

Fig. 6

Detailed view of the flavin-containing monooxygenase gene cluster. Points correspond to the –log10(P MA -value) for the average lentiform nucleus volume. The colors of each point correspond to level of linkage disequilibrium (LD) between a given SNP and the most associated SNP rs1795240. Plots were generated using the LocusZoom software (http://csg.sph.umich.edu/locuszoom/)

Fig. 7

Manhattan plot of meta-analyzed P-values (P MA ) from both the ADNI and QTIM samples (N = 1345) using a random effects model (Han and Eskin 2011). Each plotted point is the –log10(P MA ) of a given SNP sorted by chromosome; points plotted higher on the y-axis are more significant. The dotted grey line denotes the nominal genome-wide significance level –log10(5 × 10−8). The results are consistent with those found using the standard fixed effects analysis

Gene-based tests and pathway analysis

The genes FMO3 (P = 1.03 × 10−6) and FMO6P (P = 1.32 × 10−6) exceed the nominal gene-wide significance level of P < 5 × 10−6. A number of other genes show promising evidence of association with lentiform nucleus volume: SLC39A1 (P = 7.56 × 10−6), DENND4B (P = 1.53 × 10−5), GATAD2B (P = 2.25 × 10−5), and FOXF2 (P = 9.59 × 10−5). Pathway enrichment analysis in KGG reveals seven pathways that exceed the nominal significance level for pathway enrichment (5 × 10−4) including the reactome phase 1 functionalization pathway (P = 1.34 × 10−5) and the KEGG drug metabolism pathway of cytochrome P450 (P = 5.66 × 10−5). Additional results of the pathway analysis are given in Table 3.

Table 3 Significantly enriched pathways determined by pathway analysis with KGG (Li et al. 2010, 2011). Details for the pathways given can be found on the Gene Set Enrichment Analysis website (http://www.broadinstitute.org/gsea/). Pathways that exceed the threshold P < 5 × 10−4 were considered to be significantly enriched

Discussion

In this study, we identified specific genetic variants associated with differences in lentiform nucleus volume in two large independent samples, including both young and elderly subjects (N = 1345). We were well powered to find genetic variants that explain some of the heritability of the lentiform nucleus volume, with one SNP exceeding the nominal genome-wide significance threshold. Our two cohorts differed in many ways, most notably in mean age (a difference of roughly 50 years). Despite the differences, we identified a number of variants with compelling evidence for association in both samples. Associations were detectable despite differences in study protocols; the genes implicated may therefore have a statistical effect on lentiform nucleus volume throughout life. Further replication in independent samples (e.g. as in Stein et al. 2012 and Bis et al. 2012) and examination of functional relevance will still be required to support a causal role for these variants.

We originally chose to study the lentiform nucleus as it is implicated in several genetically mediated disorders including Parkinsonian syndromes, Huntington’s disease, Wilson’s disease, Tourette’s syndrome, and ADHD. While the putamen is more similar to the caudate histologically, the putamen and globus pallidus are linked by dense intrabasal ganglionar fiber projections. In addition, the genetic correlations between the two structures of the lentiform nucleus were high (r g = 0.54 and r g = 0.56, for left and right, respectively). This high genetic correlation means that the two structures share many common genetic determinants. This provided empirical support for analyzing the two structures together, in addition to our theoretical reasons for choosing to study the lentiform nucleus. Even so, we note that other natural groupings of structures may be beneficial for future assessment. Although we opted to combine the putamen and globus pallidus, the putamen is functionally more related to the caudate, and together they make up the striatum, which receives afferent projections from large parts of the cortex. In the future, when a broad range of subcortical segmentations is available in large family-based samples, it will be possible to perform genetic clustering to determine logical groupings of subcortical nuclei with coherent genetic determination (C. H. Chen et al. 2012; Chiang et al. 2012). By clustering regions with overlapping genetic determinants, it should be possible to boost the power to detect underlying genetic determinants via GWAS (as shown by Chiang et al. 2012). In addition, variance component modeling performed in the QTIM sample shows that the left and right lentiform nucleus volumes are around 70–80 % heritable (see Table 1). This agrees with published heritability estimates for the putamen and globus pallidus (Kremen et al. 2010; Peper et al. 2007).
We examined the reliability of lentiform nucleus segmentations by processing repeated scans in 40 subjects from the QTIM sample. The resulting volumes were highly reliable using the automated FSL FIRST software (Patenaude et al. 2011)(Fig. 2), and the reproducibility also agrees with prior estimates (Morey et al. 2010).

A wide band of SNPs from the flavin-containing monooxygenase (FMO) gene cluster on chromosome 1 showed significant evidence of association in both samples and after meta-analysis, with one SNP exceeding the nominal genome-wide significance level. The FMO gene cluster consists of five tightly-spaced genes (FMO1-4 and FMO6P) responsible for the metabolism of trimethylamine, methionine, and cysteamine as well as a number of therapeutic medications including tamoxifen, ranitidine, sulindac, and itopride (Williams et al. 2004). Additionally, the FMO gene cluster is involved in the oxidation of certain environmental toxicants like insecticides and aldicarb (Krueger and Williams 2005). Of the genes in the FMO gene cluster, FMO1 and FMO3 have been studied extensively. Carriers of a number of common genetic variants have reduced efficacy in metabolizing certain drug substrates (Koukouritaki et al. 2002; Overby et al. 1997; Yeung et al. 2000). The role of the FMO gene cluster in the metabolism of common environmental toxicants suggests a common underlying mechanism that might yield association results in the young, healthy population of twins that overlap with the association we found in our sample of elderly controls and patients. It is unlikely that the associations in these samples were driven by the use of therapeutic medications such as opiates or anti-depressants, as most participants were healthy. Follow-up studies are still needed to determine whether commonly used substances, such as alcohol, nicotine, commonly abused drugs, or anti-inflammatory drugs, exert detectable and systematic anatomical effects on structures in the reward circuitry, and whether they lead to any detectable changes in FMO gene expression.

The most highly associated SNP, rs1795240, is located just downstream of the FMO3 gene. A number of common genetic variants in the FMO3 gene have been linked with decreased catalytic activity and the disorder trimethylaminuria (Hines 2006; Koukouritaki et al. 2007). The FMO3 gene is expressed mainly in the liver but also in the human brain, and may affect how numerous therapeutic drugs are metabolized by the central nervous system (Cashman and Zhang 2002). In addition, an analysis of the Allen Brain Atlas (http://human.brain-map.org/) shows that FMO3 is differentially expressed in the posterior portion of the lentiform nucleus (Fig. 8).

Fig. 8

Expression levels of the FMO3 gene in the lentiform nucleus of two different subjects (a and b; details can be found at http://human.brain-map.org/). Expression levels were standardized to a mean expression level to eliminate background noise and are presented here as Z-scores (where |Z-score| > 2.5 indicates evidence of differential expression of FMO3 compared to other regions of the brain). Numerous points in the posterior portion of the lentiform nucleus show evidence of significant differential expression of the FMO3 gene
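The standardization described in the caption can be illustrated with a minimal sketch. It assumes a plain Z-score (subtract the mean, divide by the population standard deviation) and uses made-up expression values; the function names and the data are hypothetical and do not reproduce the Allen Brain Atlas normalization pipeline.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize constant values")
    return [(v - mu) / sigma for v in values]


def flag_differential(values, threshold=2.5):
    """Return indices of sampling sites whose |Z-score| exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


# Hypothetical expression values at ten sampling sites (illustrative only).
expression = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 9.0]
flagged = flag_differential(expression)
print(flagged)  # only the last site exceeds |Z| > 2.5
```

In this toy example only the last site is flagged; the 2.5 cutoff is the one quoted in the caption.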

The second most highly associated SNP, rs1795243, was found in the FMO6P pseudogene, which is transcribed into mRNA but not translated into a protein product (Hines et al. 2002). Although pseudogenes are not translated into proteins, they can act as regulatory elements and are under evolutionary control (Poliseno et al. 2010; Wen et al. 2011). The exact mechanism of action of the FMO6P pseudogene is still unknown, but the associations identified in this study may make it an ideal candidate for future genetic studies of neurodegenerative disorders and functional tests of FMO6P mechanistic effects. Previously, a large case-control GWAS found mild evidence of association of FMO6P with schizophrenia (Athanasiu et al. 2010).

Gene-based test statistics confirmed the association of FMO3 and FMO6P with lentiform nucleus volume, as found in the univariate study, with both genes exceeding the nominal gene-wide significance level. The gene-based tests also promoted SLC39A1 to a higher significance level (P = 7.56 × 10⁻⁶) than might be expected compared to the other genes in the univariate SNP GWAS. The role of SLC39A1 is well studied: it is expressed in the brain and is involved in maintaining an appropriate zinc concentration inside the blood-brain barrier (Bobilya et al. 2008). Pathway enrichment analysis, performed with KGG, combines gene-based test statistics to examine whether known disease and biological pathways are over-represented in the gene sets from our analysis, relative to what might be expected by chance. In all, seven biological pathways reached significance (see Table 3). The most significant pathway, reactome phase 1 functionalization, supports the many studies suggesting that the FMO3 gene is involved in processing environmental toxins (Krueger and Williams 2005; Williams et al. 2004). The next most significant pathway, KEGG drug metabolism pathway of cytochrome P450, involves the cytochrome P450 superfamily of enzymes responsible for metabolizing numerous drugs including codeine, morphine, carbamazepine, citalopram, and clozapine (Hines et al. 2008). Cashman and Zhang (2002) showed that FMO3 is expressed in various regions throughout the brain, including the striatum. Earlier studies using human microsomes showed that numerous brain tissues actively metabolize psychoactive drug substrates including chlorpromazine, imipramine and fluoxetine (Bhagwat et al. 1996; Bhamre et al. 1995). In addition, several positron emission tomography studies have demonstrated significant differences in glucose metabolism in the lentiform nucleus in patient-versus-control comparisons of psychoactive drugs like fluoxetine and chlorpromazine (Chen et al. 2009; Mayberg et al. 2000; Wik et al. 1989). Each of these studies lends credibility to the findings of this study and supports future efforts to understand the mechanisms by which gene variants in the FMO gene cluster may influence lentiform nucleus volume.
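The idea of a pathway being over-represented relative to chance can be illustrated with a generic hypergeometric test. This is a sketch of the general concept only and is not the procedure implemented in KGG, which combines gene-based test statistics; the function name, its parameters, and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the list being tested
    n_overlap:    selected genes that belong to the pathway
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 70-gene pathway,
# 100 top-ranked genes, 5 of which fall in the pathway.
p = enrichment_p_value(20000, 70, 100, 5)
print(f"P(overlap >= 5) = {p:.3g}")
```

A small P value here means that the observed overlap would rarely arise if the top-ranked genes were drawn at random from the background.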

Several weaknesses of our study should be mentioned. First, we provide evidence for association of genetic variants in the flavin-containing monooxygenase (FMO) cluster, but we do not yet know the mechanisms by which they may change expression levels or protein structures, or how they might affect lentiform nucleus volumes. Unfortunately, functional and expression data are not yet available for either cohort, but they may be available in future cohorts. Second, the mean ages of the two samples differ by over 50 years. Combining data meta-analytically between groups penalizes SNPs that are significant in one sample but not the other. This analysis of two cohorts is a special case of a meta-analysis, which tends to boost power to detect true positive associations by aggregating information from multiple cohorts. The power to detect an association with a given effect size depends on the available sample size, so in general power increases with sample size alone. The power of a meta-analysis may be slightly lower than that obtainable in a very large sample all scanned on the same scanner with the same protocol, but practical limitations constrain how many subjects can be scanned and genotyped at any one center, so multi-site efforts can be more efficient than studies at any single site. In that case, meta-analyses may offer high power so long as the chosen phenotypes are measured consistently and reliably across datasets. At the same time, meta-analysis reduces the chance of false positives as it penalizes results that are not consistently detectable across sites; in other words, it favors effects that generalize to other cohorts and are less likely to be spurious associations attributable to the genetic diversity or particular ascertainment or sampling of any one cohort. In cases where a gene is expressed only during a restricted period of life, our analysis may lead to false negatives, as an effect could be detected in one sample but not the other. The genes identified in our analysis should therefore be thought of as those associated with lentiform nucleus volume throughout life. Genetic variants that were not associated with lentiform nucleus volume could certainly still be involved in cellular or functional differences, so the findings must be interpreted recognizing the power and limitations of the study. Third, the proportion of the sample variance explained by the top SNP, rs1795240, is relatively small (ADNI: 2.68 %; QTIM: 0.84 %). However, a SNP that explains 1–3 % of the overall variability is comparable to the strongest SNP effects observed for other complex traits in even larger studies (Bis et al. 2012; Stein et al. 2012). The small effect sizes and complexity of phenotypic traits mean that individual common SNPs will each probably explain a small portion of the overall observed variability of a given trait. In addition, the proportion of variability explained by the top SNP differs between the two samples. Further exploration is needed to determine age-related effects of FMO3 gene variants on lentiform nucleus volume.
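The way a two-cohort meta-analysis rewards consistent effects and penalizes inconsistent ones can be sketched with an inverse-variance-weighted fixed-effects model. This section does not state which meta-analysis model was used, so the choice of model is an assumption made for illustration, and the effect sizes and standard errors below are invented.

```python
from math import erfc, sqrt


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effects meta-analysis.

    Returns the pooled effect, its standard error, the Z statistic,
    and a two-sided P value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return pooled_beta, pooled_se, z, p


# Hypothetical SNP effects in two cohorts (illustrative values only).
# Each cohort on its own gives Z = 0.5 / 0.1 = 5.0 for a true effect.
_, _, z_consistent, _ = fixed_effects_meta([0.5, 0.5], [0.1, 0.1])
_, _, z_inconsistent, _ = fixed_effects_meta([0.5, 0.0], [0.1, 0.1])
print(round(z_consistent, 2), round(z_inconsistent, 2))
```

With these numbers, the consistent SNP reaches a pooled Z of about 7.07, stronger than either cohort alone, whereas the SNP seen in only one cohort drops to about 3.54. This is the behaviour behind both the gain in power and the risk of false negatives for age-specific effects.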

The genetic variants identified in our analysis provide replicated, genome-wide significant evidence for the FMO gene cluster’s involvement in lentiform nucleus volume. In addition, gene-based tests and pathway enrichment analysis provide evidence of probable mechanistic actions through which the variants in our analysis might affect lentiform nucleus volume. Future study is still needed to explain the functional mechanisms of change.