Introduction

In 1996, Lesch et al. (1996) reported an association between the promoter-based length polymorphism of the serotonin transporter gene (5-HTTLPR) and the anxiety-related personality traits neuroticism and harm avoidance. The 5-HTTLPR is defined by a length variation of a repetitive sequence with the short and the long fragment consisting of respectively 484 and 528 basepairs. In vitro analyses showed that the basal activity of the long variant was about threefold higher than that of the short variant, indicating that this is a functional polymorphism (Heils et al. 1996). As the association of the short variant with higher neuroticism and harm avoidance scores was found in two independent groups of subjects as well as within families, this seemed a very promising result (Lesch et al. 1996). The association between 5-HTTLPR and anxiety-related personality traits or depressive psychopathology has been investigated in numerous studies since then, but with conflicting results. Even meta-analyses on the association between 5-HTTLPR and personality traits (Munafo et al. 2003; Munafo et al. 2005a; Schinka et al. 2004; Sen et al. 2004) or affective disorders (Lasky-Su et al. 2005; Lotrich and Pollock 2004) reached conflicting conclusions. This might be partly due to methodological differences between the meta-analyses (Munafo et al. 2005b; Schinka 2005; Sen et al. 2005). Munafo et al. (2005b), therefore, stated that “Very large, well designed primary studies remain the most reliable way of obtaining reproducible results”.

Such a study was performed by Willis-Owen et al. (2005). An association study was carried out in three independent samples including respectively 564, 1,001 and 5,000 subjects selected from two general populations based on extreme high or low scores on neuroticism. Their samples retained virtually 100% power to detect a genetic effect accounting for just 0.5% of phenotypic variance at an alpha level of 0.05. They did not find any significant association between 5-HTTLPR and neuroticism measured with the Eysenck Personality Questionnaire (Eysenck and Eysenck 1975) or major depression as defined by the DSM-IV (American Psychiatric Association 1994).

We collected measures of neuroticism, as well as measures of anxiety, depression and DSM-IV based diagnoses of depression in a large sample of Dutch twin families. Association tests were carried out in a sample of 1,800 subjects who were genotyped for 5-HTTLPR. The self report measures of neuroticism, anxiety and depression were assessed five times over a period of around ten years. This made it possible to carry out analyses on the scores from the different time points to test whether effects replicated across time, as well as on the mean scores over the occasions. DSM-IV major depression data, diagnosed with a structured psychiatric interview, were also investigated for an association with 5-HTTLPR. In addition to testing multiple phenotypes repeatedly over time, the family based sample allowed testing for population stratification by decomposing the association effect into a between-family and a within-family component (Fulker et al. 1999). A significant within-family effect indicates genuine association, whereas a significant between-family effect may be due to population stratification. It is important to note that population stratification can not only result in spurious associations, but can also obscure genuine associations (Posthuma et al. 2004).

Materials and methods

Subjects

This study is part of an ongoing longitudinal survey study of the Netherlands Twin Register (NTR) that has assessed families with adolescent and adult twins roughly every two years since 1991 (Boomsma et al. 2002). Each survey was sent to the twins and additional family members, namely parents in 1991 and 1993, siblings in 1997 and 2000 and parents and siblings in 1995 and 2002. Each survey collected information on personality and psychopathology, although less extensively in the 1995 wave. As part of a project aiming to find the genes underlying the susceptibility to anxiety and depression, informative families were selected from this database in 1998. For a detailed description of the selection and the data collection, see Boomsma et al. (2000) and Middeldorp et al. (2006). In short, the selection strategy was based on the recommendation of Eaves and Meyer (1994) and Risch and Zhang (1995) to select for genotyping those sibling pairs who have extreme scores (high/high, low/low, low/high or high/low) on a quantitative scale of interest. Simulation studies have shown the optimal selection percentages for linkage analysis in sibling pairs from random samples (Dolan and Boomsma 1998). Concordant sibling pairs were selected when both had scores in the top 12% or in the bottom 12% of the phenotypic distribution. For discordant pairs, an ‘asymmetrical’ criterion was used. Sibling pairs were selected if one had a score in the top 25% and the other in the bottom 20% or vice versa. A quantitative factor score expressing a subject’s genetic susceptibility to ‘anxious depression’ was used for the selection. The genetic factor scores were derived from a multivariate genetic analysis of the anxiety, depression, neuroticism and somatic-anxiety data that showed that the covariance between these traits was explained by a common genetic factor (Boomsma et al. 2000).
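The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, percentile ranks are expressed as fractions between 0 and 1, the handling of scores that fall exactly on a cut-off is an assumption, and "or vice versa" is read as meaning that either sibling may be the high-scoring one.

```python
from bisect import bisect_left


def percentile_rank(sorted_scores, value):
    """Fraction of the reference distribution strictly below `value`."""
    return bisect_left(sorted_scores, value) / len(sorted_scores)


def classify_pair(rank_a, rank_b):
    """Classify a sibling pair from two percentile ranks (0..1).

    Cut-offs follow the text: concordant pairs need both siblings in the
    top 12% or both in the bottom 12%; discordant pairs need one sibling
    in the top 25% and the other in the bottom 20%. Whether a score exactly
    on a cut-off counts as extreme is an assumption of this sketch.
    """
    top12, bottom12 = 0.88, 0.12
    top25, bottom20 = 0.75, 0.20
    if rank_a >= top12 and rank_b >= top12:
        return "concordant high"
    if rank_a < bottom12 and rank_b < bottom12:
        return "concordant low"
    high, low = max(rank_a, rank_b), min(rank_a, rank_b)
    if high >= top25 and low < bottom20:
        return "discordant"
    return "not selected"
```

For example, a pair with ranks 0.80 and 0.15 would be classified as discordant regardless of which sibling holds the higher score, whereas a pair with ranks 0.50 and 0.95 would not be selected.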

A factor score was calculated for 7,836 twins and siblings who participated at least once in the 1991, 1993 or 1997 survey. Subjects who missed one or more of the inventories that measured neuroticism, anxiety, somatic anxiety or depression were excluded. Based on these factor scores, 561 families were selected in which two siblings had extreme factor scores. All members of the selected families, regardless of their value on the genetic factor score, were asked to provide a buccal swab for DNA isolation. Twins and siblings of these families were also asked to participate in a diagnostic psychiatric interview. Two hundred and fifty-four fathers, 305 mothers, 501 male and 744 female offspring of 466 participating families, all aged between 16 and 65 years, were genotyped for 5-HTTLPR. The offspring consisted of 327 MZ twins and 918 DZ twins or siblings, with on average 2.7 siblings per family. The psychiatric interview was administered to 1,256 twins and siblings, of whom 989 were also genotyped for 5-HTTLPR.

Table 1 shows the distribution of the genetic factor scores in the total offspring sample, the offspring genotyped for 5-HTTLPR and the offspring who participated in the interview. An individual was assigned to one of the groups (score below 25th percentile, above 75th percentile, or in between) based on the highest or lowest factor score in 1991, 1993 or 1997. This explains how, in the total population, the percentages of extreme scoring subjects are higher than 25%. The selection was also based on the highest or lowest genetic factor score on these three occasions. Subjects who scored high on one occasion and low on another were assigned to a separate group. Table 1 shows that the genotyped sample is similar to the total sample, probably due to the inclusion of all family members of the extreme scoring sibling pairs. The subjects who participated in the CIDI had somewhat more extreme scores than the total sample.

Table 1 Distribution of factor scores (fs) in the total sample of twins and siblings and the samples of twins and siblings who participated in the CIDI or were genotyped for 5-HTTLPR

Instruments

Association analyses were performed on the anxiety, neuroticism and depression scales. The scores on these instruments were used to calculate the genetic factor scores. In 1991, 1993, 1997, 2000 and 2002, anxiety was measured with the Spielberger State Trait Anxiety Inventory – Trait version (STAI) (Spielberger et al. 1970; Van der Ploeg et al. 1979) and neuroticism was measured with the Amsterdamse Biografische Vragenlijst (Wilde 1970), whose 30-item neuroticism scale is modelled after the neuroticism scale of the Eysenck Personality Questionnaire (Eysenck and Eysenck 1964). In 1991, 1995, 1997, 2000 and 2002, anxious depression was measured with the Young Adult Self Report (YASR) (Achenbach 1990; Verhulst et al. 1997). The scores were log transformed following earlier analyses of these data (Boomsma et al. 2000). Cross-sectional correlations between the scores on these three scales varied from 0.60 to 0.75 (Middeldorp et al. 2006). Longitudinal correlations were 0.58 for neuroticism and 0.62 for anxiety between 1991 and 1993, and 0.36 for depression and 0.50 for neuroticism between 1991 and 1997 (Boomsma et al. 2000).

Association analyses were carried out on the scores from each survey separately. In addition, the neuroticism, anxiety and depression scores were averaged over all surveys in which each subject had participated. Thus, for example, in the case of neuroticism, if a subject had participated once, this neuroticism score was used as the mean score. If a subject had participated twice, the average of these two scores was used as the mean score. Association analyses of the mean scores have the advantage that all available data are used, while measurement error is reduced because part of the error is averaged out.
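The averaging rule can be illustrated with a minimal Python sketch. The function name and the use of `None` to mark a survey in which a subject did not participate are assumptions made for illustration only, and the scores in the usage note are hypothetical.

```python
from statistics import mean


def mean_over_occasions(scores_by_survey):
    """Average only the scores a subject actually provided.

    `scores_by_survey` maps a survey year to a score, with None marking a
    survey the subject missed. Returns None if no score is available.
    """
    observed = [s for s in scores_by_survey.values() if s is not None]
    if not observed:
        return None
    return mean(observed)
```

A subject with hypothetical scores of 2.0 in 1991 and 4.0 in 1997 and no score in 1993 would receive a mean score of 3.0, and a subject with a single score keeps that score. If the measurement errors at the different occasions were independent with equal variance, the error variance of a mean over k occasions would be 1/k of the single-occasion error variance, which is the sense in which part of the error is averaged out.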

All offspring of selected families were asked to participate in a telephone interview, during which the computerized version of the Composite International Diagnostic Interview (CIDI) (World Health Organization 1992) was administered to obtain lifetime DSM-IV diagnoses of mood and anxiety disorders. The CIDI is a fully standardized diagnostic interview. No information on the reliability and validity of the Dutch version of the CIDI is available, but good reliability and validity have been reported for the English version of the CIDI (Andrews and Peters 1998). All interviewers were trained by the Dutch World Health Organization training center. In order to minimize observer bias, interviewers were unaware of interviewees’ scores on the initial selection variables throughout the study.

Genotyping

The 5-HTT regulatory gene region was amplified by polymerase chain reaction (PCR) with oligonucleotide primers 5′-GAGGGACTGAGCTGGACAACCCAC (Greenberg et al. 1999) and 5′-ATGCCAGCACCTAACCCCTAATGT (Gelernter et al. 1997). PCR was performed in a 40 μl volume containing 10 ng of genomic template, 0.33 mM of each primer, 0.4 mM deoxynucleotide triphosphates, 2.5% dimethyl sulfoxide (DMSO) and 1.6 units of rTaq DNA polymerase (Amersham Biosciences). Initial denaturation at 94°C for 3 min was followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 59°C for 30 s, and extension at 72°C for 1 min 30 s. The PCR procedure was terminated by extension at 72°C for 6 min. Amplified 469/513 bp fragments were electrophoresed through 2% agarose and were visualized by ultraviolet illumination upon ethidium bromide staining.

Statistical methods

Power analyses of the family and total association analyses were carried out with the Genetic Power Calculator designed by Purcell et al. (2003). Association between the ss, sl and ll variants of 5-HTTLPR and the quantitative traits was investigated by modelling an additive effect of the s-allele with sex included as a fixed effect. These analyses were carried out in QTDT using the test that models total association (Abecasis et al. 2000a). To exclude population stratification, the association was also tested in the orthogonal model as defined in QTDT. In this model, the association is decomposed into a between-family and a within-family association effect, with a significant within-family association effect indicating a genuine effect (Abecasis et al. 2000a; Fulker et al. 1999). Equal within- and between-family effects indicate the absence of population stratification (Abecasis et al. 2000a; Fulker et al. 1999). In each analysis, the dependency among individuals from the same pedigree was modelled as a function of their genetic relatedness.
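The decomposition into between- and within-family components can be sketched as follows. This is a simplified illustration and not the QTDT implementation: it assumes that the additive genotype score is the number of s alleles, that the between-family component is the mean score of the genotyped siblings in a family, and that the within-family component is each sibling's deviation from that mean. QTDT can additionally derive the between-family component from parental genotypes and models the relatedness within pedigrees, neither of which is attempted here. All names are hypothetical.

```python
from collections import defaultdict


def s_allele_count(genotype):
    """Additive coding: number of s alleles in 'll', 'sl' or 'ss'."""
    return genotype.count("s")


def between_within(records):
    """Split each sibling's additive score into family mean and deviation.

    `records` is a list of (family_id, genotype) tuples for siblings.
    Returns a list of (family_id, between, within) in the same order.
    """
    by_family = defaultdict(list)
    for family_id, genotype in records:
        by_family[family_id].append(s_allele_count(genotype))
    family_mean = {f: sum(v) / len(v) for f, v in by_family.items()}
    return [
        (f, family_mean[f], s_allele_count(g) - family_mean[f])
        for f, g in records
    ]
```

Because the within-family deviations sum to zero in every family, a regression of the trait on the within-family component compares siblings only with each other, which is why it is not affected by allele frequency differences between subpopulations.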

Differences between the prevalences of DSM-IV major depression for the three genotypes were tested with a permutation χ2 test using StatXact (1992). Population stratification was evaluated with the transmission disequilibrium test (TDT) in QTDT (Abecasis et al. 2000b).
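A permutation χ2 test of this kind can be sketched in plain Python. This is not the StatXact procedure; it is a minimal Monte Carlo version, written under the assumption that subjects can be treated as exchangeable (family relatedness is ignored), and the function names, the default number of permutations and the seed are illustrative choices.

```python
import random


def chi_square(table):
    """Pearson chi-square statistic for a list-of-rows contingency table."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(genotypes, affected, n_perm=2000, seed=1):
    """Monte Carlo permutation P-value for a genotype-by-diagnosis table.

    Diagnosis labels are shuffled over subjects, which keeps both margins
    fixed. Requires at least one affected and one unaffected subject.
    """
    levels = sorted(set(genotypes))

    def table_for(status):
        counts = [[0, 0] for _ in levels]
        for g, a in zip(genotypes, status):
            counts[levels.index(g)][1 if a else 0] += 1
        return counts

    observed = chi_square(table_for(affected))
    rng = random.Random(seed)
    shuffled = list(affected)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square(table_for(shuffled)) >= observed - 1e-12:
            hits += 1
    # Counting the observed table itself avoids reporting a P-value of zero.
    return (hits + 1) / (n_perm + 1)
```

The function takes one genotype label and one diagnosis indicator per subject and returns the proportion of permuted tables whose statistic is at least as large as the observed one.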

A P-value below 0.05 was considered significant, even though multiple tests were carried out. An important feature of our study is replication, as indicated by the agreement between the results for different time points. If, for example, we found that the test on the mean score over the occasions was significant, as well as four of the five tests on the scores across occasions with P-values between 0.01 and 0.04, an association with 5-HTTLPR would seem likely.
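To give a feeling for why agreement across time points matters when no correction for multiple testing is applied, the following sketch computes the probability of at least one nominally significant result among k tests when no true effect exists. It assumes independent tests, which does not hold for the correlated phenotypes and overlapping samples analysed here, so the numbers are only a rough guide. The function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more P-values below alpha among independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests
```

Under this independence assumption, 18 tests at an alpha level of 0.05 would yield at least one significant result by chance in about 60% of studies, which illustrates why a single isolated significant test carries little weight compared with a pattern that replicates across occasions.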

Results

Table 2 gives the number of subjects and their mean age available for the total association and the within family association tests of the neuroticism, anxiety and depression data. The number of subjects that participated in the CIDI is also shown. It is important to note that in the total association test, phenotypic data of parents as well as siblings are used, whereas in the within family association test only the phenotypic data of the siblings are analysed. In the latter test, the parents only provide genotypic information. As can be seen in Table 2, phenotypic parental data were available for anxiety and neuroticism measured in 1991, 1993 and 2002 and for depression measured in 1991 and 2002. The total number of subjects for the analyses of the averaged scores across occasions is higher for neuroticism and anxiety than for depression, as in 1993, neuroticism and anxiety were also assessed in the parents, while in 1995 depression was only assessed in the twins and siblings.

Table 2 N subjects and their mean age (SD) available for the total association analyses and N siblings and their mean age (SD) available for the family association analyses of the six surveys and the CIDI. The last 2 rows show N subjects and their mean age across occasions (SD) available for the analyses of the mean scores across occasions for neuroticism and anxiety and for depression

The power of the study was calculated for the analyses of the smallest and largest sample of siblings, i.e. the sample that participated in 1991 and the sample used to analyse the mean scores. When data from parents are not considered and sibships are defined as pairs, the power to find an effect explaining 5% of the variance at a P-level of 0.05 is 0.90 for the smallest sample and virtually 1.0 for the largest sample.

The null hypothesis of Hardy-Weinberg equilibrium, tested in the total sample, was not rejected. The allele frequencies of 57% for the long allele and 43% for the short allele were similar to the frequencies reported by Lesch et al. (1996). Table 3 shows the mean neuroticism, anxiety and depression scores for each occasion and across the five occasions as well as the results of the tests for association and population stratification. None of the total association tests was significant, with the exception of neuroticism measured in 2000, for which a positive regression coefficient for the s allele was found. The results of the between- and within-family association tests showed that anxiety and neuroticism measured in 1991 were significantly associated with 5-HTTLPR. Population stratification seemed to counteract the effect, since the total association tests for anxiety and neuroticism measured in 1991 were not significant. However, the effect was in the opposite direction to that expected, i.e. with a negative within-family regression coefficient for the s allele.
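The Hardy-Weinberg check mentioned at the start of this paragraph can be illustrated with a short sketch. It shows the textbook goodness-of-fit calculation for a biallelic marker and is not necessarily the test used in this study; in particular, it treats subjects as unrelated, which is a simplification for a family-based sample. Function names are hypothetical.

```python
def hardy_weinberg_expected(freq_l, n_subjects):
    """Expected ll, sl and ss counts under Hardy-Weinberg equilibrium."""
    freq_s = 1.0 - freq_l
    return {
        "ll": freq_l ** 2 * n_subjects,
        "sl": 2 * freq_l * freq_s * n_subjects,
        "ss": freq_s ** 2 * n_subjects,
    }


def hwe_chi_square(observed):
    """Goodness-of-fit statistic (1 df) from observed genotype counts."""
    n = sum(observed.values())
    # Estimate the l allele frequency by counting alleles.
    freq_l = (2 * observed["ll"] + observed["sl"]) / (2 * n)
    expected = hardy_weinberg_expected(freq_l, n)
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)
```

With the reported allele frequencies of 57% and 43%, the expected genotype proportions under equilibrium are about 32.5% ll, 49.0% sl and 18.5% ss.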

Table 3 Association tests between the three variants of the serotonin transporter gene (ss, sl and ll) and neuroticism (neu), anxiety (anx) or depression (dep). Phenotypes were measured on five different occasions (first 15 rows). The last 3 rows give results for mean scores over these five occasions. P-values below 0.05 are shown

The analyses on the mean neuroticism, anxiety and depression scores over the five occasions showed no significant results for either the total association test or the within-family association test. This also becomes clear from Table 3, which shows that the scores of the subjects in the three genotype groups are remarkably similar.

Table 4 shows the number of subjects diagnosed with major depression per genotype. The χ2 test was significant with a P-value of 0.044 and remained significant after 2,000 and 10,000 permutations with P-values of 0.045 and 0.042. This is due to the group of heterozygotes who appear to have a higher risk for depression than both homozygous groups. TDT did not show a significant association.

Table 4 Diagnoses of major depression (%) per genotype

Table 5 gives the results for three additional analyses on the mean neuroticism, anxiety and depression scores over the five occasions, resulting in the largest possible sample size. An earlier article suggested that the effect of 5-HTTLPR is largest in subjects scoring in the middle of the distribution (Sirota et al. 1999). Our selection could have obscured such an effect, although Table 1 shows that the selection had no obvious effect on the distribution of the factor scores in the total genotyped sample. Since not only selected sibling pairs but also their family members were approached for this study, it was possible to repeat our analyses including only the subjects scoring between the 25th and 75th percentile on the genetic factor score on all occasions. No association was found. On the other hand, it is also conceivable that the effect is largest in the subjects on whom the selection was based and that we did not find the effect because we also included non-extreme scoring family members. However, trying to maximize the effect of the selection by including only the families with sibling pairs scoring concordant high or low in the analyses did not change the results either. Another reason no effect was found could be that the neuroticism scores of young subjects are not reliable since their personality is still in development. Therefore, the analyses were repeated in a sample consisting of subjects aged over 30 years. These analyses converged with the other analyses in not showing an association with 5-HTTLPR.

Table 5 Mean (SD) of the scores averaged across occasions per genotype in additional analyses. Samples include (1) subjects scoring > 25th and < 75th percentile, (2) families with extreme high or low concordant sibling pairs, and (3) subjects over 30 years old when filling out the questionnaires

Discussion

The general picture emerging from these analyses clearly suggests no association between 5-HTTLPR and self report measures of neuroticism, anxiety and depression or DSM-IV major depression. These results support the conclusion of Willis-Owen et al. (2005) that the 5-HTTLPR variant does not contribute significantly to neuroticism, anxiety or depression.

The results of the additional analyses on three sub-samples indicate that the chance of a false negative finding is minimal. Finally, it was considered that the significant χ2 test for DSM-IV major depression could indicate that there is no additive or dominance effect, but that a mechanism called negative heterosis plays a role (Comings and MacMurray 2000). Since the TDT models an additive effect, the absence of a significant result does not exclude the possibility of heterosis. Negative heterosis signifies that having a heterozygous genotype is disadvantageous. Association studies of DRD2 gene polymorphisms and alcoholism or ADHD have suggested that this mechanism might play a role in psychiatric conditions (Comings and MacMurray 2000). However, only one out of eleven other studies investigating the association between 5-HTTLPR and major depression also found that heterozygous subjects are more at risk than the homozygotes, indicating that it is highly unlikely that negative heterosis is present (Bellivier et al. 1998; Collier et al. 1996; Frisch et al. 1999; Furlong et al. 1998; Hauser et al. 2003; Hoehe et al. 1998; Kunugi et al. 1997; Mendlewicz et al. 2004; Rees et al. 1997; Serretti et al. 2002; Willis-Owen et al. 2005).

Our results also show how associations can be found by coincidence. Had we chosen to report exclusively on the results of the 1991 survey, we would have drawn the conclusion that a significant association between 5-HTTLPR and anxiety-related traits is present, although the effect was in the opposite direction to that expected. Therefore, not only should different phenotypes be measured, as suggested by Munafo et al. (2005b), but they should also be measured repeatedly. This also provides the opportunity to analyse mean scores over occasions, which is more powerful because of the high number of subjects and the reduction of measurement error.

To summarize, this study analysed the association between 5-HTTLPR and self report measures of neuroticism, anxiety and depression as well as DSM-IV depression in a large family based sample, providing the opportunity to take population stratification into account. The quantitative phenotypes were measured on five different occasions. No consistent association was found, not even after repeating the analyses in sub-samples to decrease the effects of the selection or of the relatively low age of the original sample. These results strongly suggest the absence of a straightforward association between 5-HTTLPR and neuroticism, anxiety or depression.