Introduction

Dopamine receptor D4 (DRD4) has a seven-transmembrane structure, which couples to a G-protein. Owing to the importance of DRD4 as a candidate gene for mental disorders, DRD4 polymorphisms have been investigated in both coding and non-coding regions (Seaman et al. 1999; Seeman et al. 1994; Van Tol et al. 1991). In addition, a survey on repetitive variants in DRD4 exon III in humans showed a positive correlation between DRD4 genotypes and the normal personality trait of novelty seeking (Benjamin et al. 1996).

DRD4 exon I contains a 12-bp (GASA) tandem unit, which is repeated 1–3 times in the human genome (Catalano et al. 1993). The duplication occurs near the extracellular boundary of the first transmembrane region, which influences the efficiency of signal transduction (Schoots et al. 1996). To date, several surveys have been designed to compare the DRD4 genotypes among different types (e.g., regional) of human populations (Chang et al. 1996; Gelernter et al. 1997; Van Tol et al. 1992). A subsequent study has shown that the great apes possess a variable number of tandem repeats in the same region as in humans, whereas the Old World monkeys show no variation in length (Seaman et al. 2000). The difference in the degree of polymorphisms among primate taxa suggests that selective pressure on the exon may act as a decisive factor influencing the magnitude of polymorphisms. Moreover, our group has found that tandem repeats of proline commonly exist in the orthologous region of avian genomes, and that six breeds of chicken (Gallus gallus) showed intra-breed polymorphisms in the number of proline repeats (Sugiyama et al. 2004). Because the number of prolines in Galliformes is clearly different from that in the Japanese jungle crow (Corvus macrorhynchos; Passeriformes) and the Japanese cormorant (Phalacrocorax capillatus; Pelecaniformes), the extracellular region of DRD4 is expected to display high levels of genetic polymorphism among avian taxa.

The relationship between polymorphisms in neurotransmitter-related genes (e.g., functional genes in dopaminergic and serotonergic pathways) and individual behaviors has attracted much attention in a wide range of animal studies. Research targets are not restricted to captive and domesticated animals (Fidler et al. 2007; Momozawa et al. 2005; Niimi et al. 2001); wild animals can also be used as research subjects for advanced studies (Korsten et al. 2010). These studies illustrate the importance of polymorphisms in neurotransmitter-related genes as one of the key factors that propel behavioral diversity. If individual differences in a functional gene can be an initial force that encourages diversification of closely related populations in the wild, the current magnitude of polymorphisms in the gene may reflect different levels of selective pressure that have acted on this gene for a certain period.

In this study, we investigated the sequence variations in exon I of avian DRD4, especially in the extracellular region located between the amino terminus and the first transmembrane region. The main aim of our study was to characterize the extracellular region of avian DRD4 and to examine the possibility that different levels of polymorphisms in this region result from differences in selection pressure among avian taxa. We also examined the intraspecific polymorphisms detected in three eagle-owl species to gain insight into the occurrence and maintenance of intraspecific variation in the extracellular region of DRD4.

Materials and Methods

DNA Extraction and PCR Amplification

We investigated DRD4 sequences from 75 species of birds belonging to 30 families in 16 avian orders. The corresponding GenBank sequences of the zebra finch (Taeniopygia guttata) and the great tit (Parus major) in Passeriformes were added to the analysis. Information about the avian specimens used in this study is summarized in Table 1. Because the desired PCR product was obtained in insufficient amounts, probably owing to an imperfect match of the primers, we were unable to determine the sequences of taxa from the following orders: Struthioniformes, Gruiformes, Coraciiformes, Piciformes (except for Ramphastidae), and Ciconiiformes (except for Threskiornithidae and Ciconiidae). Feathers, tissues (liver or muscle), or peripheral blood were used as source material for DNA extraction, depending on the vital state of the sampled individuals. In each case, DNA was extracted using either commercial extraction kits or the standard technique of proteinase K treatment followed by a simple phenol–chloroform method. The main primer pair used in this study was DRD4F2 (5′-CCCGTGCAACGGCACCG-3′) and DRD4R4 (5′-CGGCGACGGCGAGGCTGACGATGAA-3′), previously described by Sugiyama et al. (2004). The DRD4F2/DRD4R4 combination worked well for most of the avian species, but DRD4F3 (5′-GGCCGCCCCGTGCAACGGCA-3′) and DRD4R3 (5′-GAGGGGCAGGACGAGGAG-3′) were used for the amplification of Anseriformes genomes because of a critical base substitution that affects DRD4F2 binding. Polymerase chain reaction (PCR) was performed with a PCR cocktail containing approximately 10–50 ng of genomic DNA, 0.25 μM of each primer, 0.2 mM of each dNTP, GC Buffer II (supplied with the polymerase), and 0.125 U of TaKaRa LA Taq or Ex Taq polymerase (TaKaRa Bio, Shiga, Japan). Cycling conditions were an initial incubation at 95°C for 1 min, followed by 35 cycles of 95°C for 15 s and 65°C for 30 s.
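The four primers listed above are GC-rich, which is one plausible reason (our inference, not stated by the authors) for the use of a GC buffer and a 65°C annealing/extension step. A minimal sketch of how the length and GC content of each primer can be checked is given below; the helper name gc_fraction is ours and is not part of the original protocol.

```python
# Primer sequences as given in the text (5'->3').
PRIMERS = {
    "DRD4F2": "CCCGTGCAACGGCACCG",
    "DRD4R4": "CGGCGACGGCGAGGCTGACGATGAA",
    "DRD4F3": "GGCCGCCCCGTGCAACGGCA",
    "DRD4R3": "GAGGGGCAGGACGAGGAG",
}

def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}")
```

All four primers come out well above 50% GC under this simple count.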

Table 1 Information on avian specimens used in this study

Sequencing and Genotyping

Sequencing was performed on an ABI 3130xl automated sequencer (Applied Biosystems) using BigDye Terminator Cycle Sequencing reagents v3.1 according to the manufacturer’s instructions (Applied Biosystems). When more than one specimen per species was available, the size of each allele was first measured by electrophoresis using an ABI 3100 DNA Sequencer followed by analysis with GeneScan software (Applied Biosystems, CA). Then, if intraspecific size polymorphisms were detected, each allele was sequenced to identify the polymorphic sites. In Strigiformes, a 197-bp allele in the spotted eagle-owl and a 200-bp allele in the Verreaux’s eagle-owl were found as heterogeneous alleles that could not be discriminated by agarose gel electrophoresis; thus, PCR products were cloned using the TOPO TA Cloning Kit (Invitrogen, CA). For each allele, at least eight clones were purified and sequenced using the M13 primers included in the kit.

Statistical Analyses

Sequences were aligned by CLUSTAL W (Thompson et al. 1994) implemented in BioEdit version 7.0.9.0 (Hall 1999). Multiple alignments were edited and refined by eye. A majority-rule consensus sequence was determined from nucleotide position +43 to +183 (corresponding to chicken DRD4). The degree of heterogeneity in the pattern of nucleotide substitutions for a pair of sequences was measured by the disparity index (I_D), which is known to be a more powerful indicator of heterogeneity than the commonly used χ² test (Kumar and Gadagkar 2001). Pairwise comparison of I_D was performed using MEGA software version 4.0 (Tamura et al. 2007). When sequence variants were identified within an avian taxon (order or family), the estimated values of I_D were averaged. A Monte Carlo test (1,000 replicates) was used to estimate the probability of rejecting the null hypothesis that a pair of sequences have evolved with the same pattern of substitution. To select the most appropriate species for estimating the P values between avian taxa, we first calculated I_D and the associated probability between each pair of species in the same order. All positions containing alignment gaps were treated in a pairwise deletion manner. We also calculated the ratio of nonsynonymous to synonymous substitutions (dN/dS) over all sequence pairs within each order to estimate the extent of selective pressure acting on the extracellular region. In this analysis, the order Passeriformes was separated from the other Neoaves (non-passerine Neoaves), because our preliminary data on DRD4 sequences clearly showed a difference in codon bias between passerine and non-passerine Neoaves. Analyses were conducted using MEGA software according to the Nei and Gojobori method (Nei and Gojobori 1986).
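Two of the steps described above, the majority-rule consensus and the pairwise deletion of gapped positions, can be illustrated with a short sketch. This is not the BioEdit or MEGA implementation; the function names, the tie-breaking rule, and the toy alignment are our assumptions for illustration only.

```python
from collections import Counter

def majority_consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Gap characters ('-') are ignored when counting; a column made up
    entirely of gaps yields '-'. Ties are broken by first occurrence.
    """
    consensus = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != "-")
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(consensus)

def p_distance_pairwise_deletion(a, b):
    """Proportion of differing sites, skipping sites gapped in either sequence."""
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped sites shared by the two sequences")
    return sum(x != y for x, y in compared) / len(compared)

# Toy alignment (made-up sequences, not data from this study).
toy = ["CCGCCG---CTG", "CCGCCGCCGCTG", "CCCCCG---CTA"]
print(majority_consensus(toy))
print(p_distance_pairwise_deletion(toy[0], toy[2]))
```

With pairwise deletion, a gapped column is dropped only for the pairs in which one of the two sequences has a gap, so the number of compared sites can differ from pair to pair.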

Results

Characterization of the Extracellular Region

A total of 75 DRD4 sequences, including two passerine sequences (zebra finch and great tit) from GenBank and five species from our previous study (Sugiyama et al. 2004), were aligned (see Fig. 1). The newly determined sequences were deposited in GenBank (Accession Nos. AB552664 to AB552735; Table 1). Among taxa, the number of nucleotides in the extracellular region differed according to the length of DNA insertions or deletions (indels) in this domain. The length of aligned nucleotides between the DRD4F2 and DRD4R4 primer sites (not including the length of primers) ranged from 105 (Columbidae and Ciconiidae) to 141 bp (Ramphastidae). There were 60 parsimony-informative sites in this region. The lengths of all indels were multiples of 3 bases (ranging from 12 to 36 bp), and no other indels were found in the first transmembrane region. Within-order length variations were detected in Falconiformes, Strigiformes, Ciconiiformes, Procellariiformes, Columbiformes, and Galliformes. A remarkable elongation observed in Ramphastidae (of Piciformes) was attributed to a 15-bp tandem duplication (CCGKGCAACGGYACC) in the extracellular region (Figs. 1, 2). Poor PCR amplification both in Piciformes (except Ramphastidae) and in Coraciiformes prevented us from determining whether the duplication was limited to Ramphastidae or shared among close relatives of Ramphastidae. Therefore, we could not exclude the possibility that gene rearrangement (e.g., truncation of gene, gene conversion) is common among these taxa.
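The 15-bp unit reported for Ramphastidae is written with IUPAC ambiguity codes (K = G or T, Y = C or T). As a minimal sketch of how such a tandem duplication could be screened for in a sequence, the example below converts the motif to a regular expression and counts back-to-back copies. The function names and the two test sequences are hypothetical and are not taken from the study.

```python
import re

# IUPAC ambiguity codes needed for the motif in the text (K = G/T, Y = C/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "K": "[GT]", "Y": "[CT]"}

def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in motif.upper())

def count_tandem_copies(seq: str, motif: str) -> int:
    """Return the largest number of back-to-back motif copies found in seq."""
    unit = iupac_to_regex(motif)
    runs = re.finditer(f"(?:{unit})+", seq.upper())
    return max((len(m.group(0)) // len(motif) for m in runs), default=0)

MOTIF = "CCGKGCAACGGYACC"  # 15-bp unit reported for Ramphastidae

# Made-up sequences for illustration only (not sequences from this study).
single = "ATG" + "CCGTGCAACGGCACC" + "GCG"
duplicated = "ATG" + "CCGTGCAACGGCACC" + "CCGGGCAACGGTACC" + "GCG"
print(count_tandem_copies(single, MOTIF), count_tandem_copies(duplicated, MOTIF))
```

Because the unit is 15 bp long, a duplication adds a multiple of 3 bases and leaves the reading frame intact, in line with the indel lengths reported above.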

Fig. 1

Alignment of the hypervariable region of DRD4 exon I from 75 avian species. A dot and a dash denote identity with the consensus sequence and indels, respectively. The number of individuals sampled per species is shown in parentheses when more than one individual was used for genotyping. For intraspecific polymorphisms, the variable number of prolines is shown in brackets with the number of alleles detected in each species. Diagnostic nucleotides in each avian order are highlighted, except within Falconiformes and Strigiformes where families are highlighted. The 15-base duplications observed in Ramphastidae of Piciformes are shaded. Numbers on the top show the nucleotide position corresponding to the initial position in human DRD4 (Accession No. L12398). Ordinal names are abbreviated as follows: PAS Passeriformes, PSI Psittaciformes, FAL Falconiformes, PIC Piciformes, STR Strigiformes, CHA Charadriiformes, PEL Pelecaniformes, CIC Ciconiiformes, PRO Procellariiformes, POD Podicipediformes, SPH Sphenisciformes, MUS Musophagiformes, COL Columbiformes, PHO Phoenicopteriformes, GAL Galliformes, and ANS Anseriformes

Fig. 2

Alignment of the deduced amino acid sequence of avian DRD4. Amino acid substitutions specific to the avian order (family) are highlighted. The 15-base duplications observed in Ramphastidae of Piciformes are shaded. Residues in parentheses indicate polymorphisms within the taxa. # denotes V/G/A/E

The existence of proline repeats in the extracellular region is a well-preserved characteristic in all species in Neognathae, although the number of repeats varies among taxa (Fig. 2). Interestingly, there is a striking bias in codon usage for proline in the extracellular region among avian taxa. CCG was the preferred codon for proline in all non-passerine Neoaves (average 88%), whereas CCC was uniquely prominent (74%) in Passeriformes (Table 2). Moreover, codon usage for proline was different even between Galliformes (CCT; 64%) and Anseriformes (CCG; 53%). Another obvious feature in this region was that a leucine residue just prior to the proline repeats was missing from the DRD4 sequences of Galliformes and Anseriformes, and leucine (CTG) was changed to glycine (GGA) in Passeriformes.
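The codon usage percentages reported here are, in essence, the share of each of the four proline codons (CCA, CCC, CCG, CCT) among all proline codons in the region. A minimal sketch of that tally is shown below; the function name and the two in-frame fragments are hypothetical and only mimic the direction of the bias described in the text.

```python
from collections import Counter

PROLINE_CODONS = ("CCA", "CCC", "CCG", "CCT")

def proline_codon_usage(coding_seq: str) -> dict:
    """Return the share of each proline codon among all proline codons.

    The sequence is read in frame from its first base; a trailing
    incomplete codon is ignored.
    """
    seq = coding_seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in PROLINE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in PROLINE_CODONS}
    return {c: counts[c] / total for c in PROLINE_CODONS}

# Made-up in-frame fragments for illustration (not sequences from this study).
toy_nonpasserine = "CTGCCGCCGCCGCCCCCG"   # Leu + 5 Pro, mostly CCG
toy_passerine = "GGACCCCCCCCGCCC"         # Gly + 4 Pro, mostly CCC

print(proline_codon_usage(toy_nonpasserine))
print(proline_codon_usage(toy_passerine))
```

Reading the sequence in the correct frame matters: a run such as CCCCCC contains the same bases regardless of frame, but the flanking codons determine whether a given triplet is counted as CCG or CCC.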

Table 2 Codon (proline) usage bias between avian taxa

Genetic Diversity Among Avian Taxa

The overall nucleotide diversity (π) was 0.13; nucleotide diversity was similar among three orders, Passeriformes, Galliformes, and Anseriformes (0.02–0.03), whereas that in non-passerine Neoaves was only slightly higher (0.05). We also measured divergence by the average dN and dS values, which represent the number of nonsynonymous and synonymous differences per site, respectively. Among the groups compared, the highest value of dN/dS was 1.65 in Passeriformes, which exceeded the average for the combined non-passerine Neoaves (range 0–2.63; average 1.17), whereas the value was zero in Anseriformes owing to a lack of nonsynonymous substitutions. Pairwise comparisons of I_D showed significant levels of heterogeneity in the pattern of nucleotide substitution between Passeriformes or Galliformes and the other avian orders (Table 3). In the pairwise comparisons involving Passeriformes, the null hypothesis of homogeneity was rejected in 11 of 16 pairs (69%) at the 5% significance level, whereas the null hypothesis was rejected in 8 of 16 pairs (50%) in the case of Galliformes. However, these comparisons should be interpreted with caution for the following reasons: (1) although a total of 19 significant cases were detected in Table 3, the expected number of type I errors at the 5% level is 6.8 per 136 comparisons; (2) when a Bonferroni correction for multiple comparisons was used to evaluate significance, no significant value was detected; and (3) owing to the short sequences of the extracellular region, the significant levels of heterogeneity detected in Passeriformes and Galliformes might be derived from a bias in codon usage for proline in these two orders (CCC for Passeriformes; CCT for Galliformes) relative to that in non-passerine Neoaves.
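The multiple-testing caveat can be made concrete with a few lines of arithmetic. The sketch below uses only the numbers given in the text (5% level, 136 comparisons, 19 nominally significant pairs); the variable names are ours.

```python
ALPHA = 0.05          # per-comparison significance level used in the text
N_COMPARISONS = 136   # number of pairwise tests reported for Table 3
N_SIGNIFICANT = 19    # nominally significant pairs reported in the text

# Expected number of false positives if every null hypothesis were true.
expected_false_positives = ALPHA * N_COMPARISONS

# Bonferroni-adjusted per-comparison threshold.
bonferroni_threshold = ALPHA / N_COMPARISONS

print(f"expected type I errors: {expected_false_positives:.1f}")
print(f"Bonferroni threshold:   {bonferroni_threshold:.6f}")
```

The first figure reproduces the 6.8 quoted above. The Bonferroni threshold is below 0.001, which is roughly the resolution of a P value estimated from 1,000 Monte Carlo replicates, so it is not surprising that no comparison remained significant after correction.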

Table 3 Test for the homogeneity of substitution patterns measured by the disparity index

In Strigiformes, we detected intraspecific polymorphisms in three species of eagle-owls (Spotted, Greyish, and Milky [Verreaux’s]; Table 4). The deviation from Hardy–Weinberg equilibrium was not significant at the 5% level (Fisher’s exact test) in the spotted eagle-owl and the Verreaux’s eagle-owl, whereas no heterozygous genotype was found in the greyish eagle-owls. Amino acid substitution (A → V) at +15 was observed only in the spotted and greyish eagle-owls, consistent with the treatment of these two species as being conspecific (Marks et al. 1999). The Verreaux’s eagle-owl was discriminated from all the other owls by the insertion of a leucine residue. The existence of leucine repeats (3 or 4 times) just prior to the proline repeats was a common feature for Strigidae, but it was absent in Tytonidae and the other avian species.
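For readers unfamiliar with exact tests of Hardy–Weinberg equilibrium, the sketch below shows one standard form of the test for a locus with two alleles. It is not necessarily the procedure the authors used, it assumes a biallelic locus, and the genotype counts are hypothetical rather than values from Table 4.

```python
from fractions import Fraction
from math import factorial

def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Exact Hardy-Weinberg test P value for a two-allele locus.

    Conditions on the observed allele counts and sums the probabilities
    of all heterozygote counts that are no more likely than the observed one.
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n_bb + n_ab          # copies of allele B

    def weight(het: int) -> Fraction:
        # Unnormalised probability of `het` heterozygotes given n_a and n_b.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return Fraction(2 ** het * factorial(n),
                        factorial(hom_a) * factorial(het) * factorial(hom_b))

    # Heterozygote counts must share the parity of n_a.
    weights = {h: weight(h) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = weights[n_ab]
    tail = sum(w for w in weights.values() if w <= observed)
    return float(tail / sum(weights.values()))

# Hypothetical genotype counts (AA, AB, BB), not values from Table 4.
print(hwe_exact_p(5, 0, 5))   # no heterozygotes among 10 birds
print(hwe_exact_p(1, 2, 1))
```

A sample with two common alleles but no heterozygotes gives a small P value under this test, whereas a sample fixed for a single allele is uninformative (P = 1).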

Table 4 DRD4 allele frequencies in the genus Bubo (Strigidae, Strigiformes)

Discussion

Extracellular Region of DRD4 as a Key for Regulating Gene Function

From a functional point of view, polymorphisms in the amino-terminal domain of G-protein-coupled receptors have attracted considerable attention from researchers. In vitro transcription and translation experiments suggest that the extracellular region of DRD4 plays an important role in determining the efficiency of signal transduction (Schoots et al. 1996) and that protein-interaction modules, such as Src homology 3, prefer proline-rich ligand sequences (Kay et al. 2000; Ren et al. 1993). Since these different modules recognize different types of proline-rich sequences, variations in the number of proline sequences found in the extracellular region of the avian DRD4 might be important as target sequences for a family of small binding proteins. Oertel et al. (2009) illustrated that a single amino acid exchange of the human μ-opioid receptor at the extracellular region altered the in vivo effects of opioids to different degrees in pain-processing brain regions. Furthermore, an increased number of CAG repeats (polyglutamine) in the transactivation site of the human androgen receptor was associated with lower cognitive functioning due to neuronal dysfunction (Yaffe et al. 2003). These facts support the hypothesis that a variable number of proline residues in the extracellular region of avian DRD4 can be a decisive factor in determining the efficiency and accuracy of DRD4 binding and substantially affect its function as a neurotransmitter receptor. The hypothesis presented here is slightly analogous to that by Lichter et al. (1993), suggesting that a proline-rich repetitive region in DRD4 exon III might affect signal transduction by altering interactions with G-proteins or other intracellular effectors.

Selection on the Extracellular Domain of Avian DRD4

With regard to the structure of avian DRD4, one of the striking features is a strong heterogeneity in the patterns of proline repeats and their codon bias. It should be stressed that the sequential expansion or contraction of codon repeats and codon bias are caused by different mechanisms. The variation in the number of tandem repeats at microsatellite loci could arise by replication slippage (Levinson and Gutman 1987), whereas codon bias results from an accumulation of point mutations, each of which is followed by genetic drift and/or selection. The current variations in the structure of avian DRD4 may be explained by the combination of different levels of stringency during replication and selective pressure that may have acted on each avian taxon. In Passeriformes, for example, the number of indels in the extracellular region is perfectly conserved among 11 families, although their DRD4 sequences contain many nonsynonymous substitutions, as indicated by their having the highest value of dN/dS. In contrast, Strigiformes was characterized by a variety of indels and a low level of sequence divergence within the order. These findings suggest two possibilities: either purifying selection in Passeriformes does not allow the elongation or contraction of short codon (i.e., CCC) repeats, eliminating such length mutations, or the frequency of slippage in Passeriformes is much lower than in Strigiformes.

Previous studies on codon bias, mainly in microorganisms and Drosophila melanogaster, reported that the direction and degree of codon bias vary between organisms (Ikemura 1985; Kanaya et al. 2001; Shields et al. 1988). In addition, codon usage bias in the plant genome was documented by correspondence analysis used to explore the variation in relative synonymous codon usage (Wang and Hickey 2007). The phenomenon that both mutational pressure and selection are involved in codon bias is currently explained by the mutation-selection-drift balance model (Akashi 1995; Bulmer 1991). This model proposes that selection favors the major codons over the minor codons, while both mutational pressure and genetic drift contribute to maintaining the minor codons. This mutation-selection-drift balance model suggests that selective pressure on the efficiency of transcription and translation could be one of the factors determining the direction of codon bias (reviewed by Hershberg and Petrov 2008). In some experiments using D. melanogaster, for instance, an artificial increase or decrease in the number of preferred codons for leucine in the alcohol dehydrogenase gene resulted in significant differences in ethanol tolerance when compared with that in the wild type (Carlini and Stephan 2003; Hense et al. 2010). Therefore, the bias in codon usage for proline found in avian DRD4 may also support the possibility that selection based on the functional properties of avian DRD4 can be attributed not only to the elongation or contraction of tandem codon repeats but also to directional changes in codon usage in the polyproline region. To address this hypothesis, further in vitro experiments will be required to compare the levels of DRD4 expression between groups that have different proline codons in the extracellular region.

Genetic Heterogeneity in Passeriformes and Its Relation to Evolutionary Significance

In this study, we detected significant differences in base substitution patterns between Passeriformes and the other avian orders by the homogeneity test using I_D. In addition, the value of dN/dS in Passeriformes was apparently higher than that in the other avian taxa, although nucleotide diversity within the order was similar to that in the others. These lines of evidence suggest that divergent forces have acted on the passerine clade. One plausible explanation for these heterogeneities in passerines is a significantly higher nucleotide substitution rate than that seen in other avian taxa (Yuri et al. 2008). Another comparative study on mutation rates of mtDNA also showed that Passeriformes has a 10-fold higher substitution rate than Anseriformes in the cytochrome b gene, and that the mutation rate is correlated with life-history traits such as maximum longevity and body mass (Nabholz et al. 2009). Therefore, it is plausible that a higher substitution rate in Passeriformes, together with subsequent selection acting on nucleotide substitutions, may give passerines an advantage over non-passerine birds in that they can quickly adapt to novel environments by changing their morphological traits as well as behavioral features.

The extremely high rate of nucleotide substitutions observed in Passeriformes may have directly given rise to their ability to adapt to novel environments by virtue of behavioral flexibility. Dopamine is known to regulate many social behaviors, including mate competition aggression (Kabelik et al. 2010) and singing (Sasaki et al. 2006), although the mechanisms underlying this behavioral regulation are unclear. Recently, Korsten et al. (2010) showed that DRD4 polymorphisms in a wild population of great tit (Passeriformes) were associated with the personality-related characteristic known as exploratory behavior. Their findings are valuable because they demonstrate that the correlation between DRD4 polymorphism and exploratory behavior in laboratory experiments can also be detected in a wild population. For free-living birds, such characteristics as highly exploratory behavior and aggressiveness might be advantageous traits for peripheral populations to expand their feeding and mating territories. For instance, Duckworth and Badyaev (2007) revealed that biased dispersal of highly aggressive males to a novel environment facilitated the range expansion of local passerine populations. Furthermore, DRD4-knockout mice exhibited reduced behavioral responses to novelty and a decrease in novelty-related exploration (Dulawa et al. 1999). All these findings support the hypothesis that DRD4 is one of the candidate genes that have contributed to adaptive evolution by diversifying behavior in the periphery of passerine populations.

Explication of Intra- and Interspecific DRD4 Polymorphisms

Intraspecific polymorphism has been previously reported in chickens, and we have found that three species of eagle-owls (Spotted, Greyish, and Milky) have intraspecific variations in the extracellular region of DRD4, depending on the number of proline and leucine repeats. It should be noted that all three eagle-owls that exhibit intraspecific polymorphism are found in Central Africa, whereas the other eagle-owls (Horned, Magellan, Rock, Eurasian, Turkmenian, Pharaoh, and Cape; Superspecies in Table 4) distributed outside Central Africa possess the same DRD4 sequence and show no variation even in the hypervariable region. One hypothesis drawn from these findings is that the proline and/or leucine repeats in the three eagle-owls may have elongated after the last ice age, and that the rapid growth of population in Central Africa served to maintain the genetic variations against genetic drift, although we could not find sufficient evidence to support this hypothesis in either fossil records or census data. In this scenario, the fitness effects of the number of proline and leucine residues are so weak that it takes many generations before the most preferred allele becomes fixed. The previous finding that intraspecific polymorphism also occurs in a variety of chicken breeds suggests the importance of rapid expansion of population as well as the maintenance of a large population size to preserve variation at a certain level. To address this hypothesis, we need to discover further examples of intraspecific polymorphisms by using a large number of individuals in a variety of avian species.

In Falconiformes, Accipitridae (hawks and eagles) were clearly discriminated from Falconidae (falcons) by both nucleotide substitutions and the length of indels; this is in agreement with a recent hypothesis that Falconiformes might not be a monophyletic group (Hackett et al. 2008). However, care must be taken when inferring the phylogenetic relatedness between two or more avian species on the basis of DRD4 sequence data. For instance, two families (Strigidae and Tytonidae) in Strigiformes have quite different characteristics in the presence or absence of leucine repeats and diagnostic nucleotides in DRD4 sequences, although other mitochondrial and nuclear studies have shown that these two families are sister taxa (Chubb 2004; Wink et al. 2008).

In summary, the extracellular region of avian DRD4 showed different levels of polymorphisms among avian taxa. Both the length variation in the tandem codon repeats (polyproline) and the bias in codon usage for proline among avian taxa were considered as evidence for the existence of selective pressure on the extracellular domain. Further, strong selection in this region indicates the functional importance of the extracellular domain by modulating conformation and binding efficiency. Of all avian taxa, Passeriformes was assumed to have experienced different patterns of DRD4 evolution on the basis of distinctive codon bias and an excess of nonsynonymous substitutions, but we still do not know whether the extracellular region of Galliformes, which also showed a different direction of codon bias, might have undergone processes of selection similar to those observed in passerines. Considering the neurotransmitter-related genes as key regulators of motor behavior, the variation in the extracellular region of DRD4 might have contributed to the degree of behavioral diversity in Passeriformes, which plays a critical role in the spreading of marginal populations into a novel environment.