Introduction

Ancient DNA (aDNA) studies have suffered much criticism since they began about 20 years ago. The field is still recovering from the effects of early spectacular and erroneous claims, such as that of DNA being preserved in plant fossils, dinosaur bones, and amber for many millions of years (for recent reviews see Hebsgaard et al. 2005; Willerslev and Cooper 2005). Unfortunately, unreplicated results of surprising age continue to be published, including those from old human remains (e.g., Adcock et al. 2001), microorganisms (e.g., Cano and Borucki 1995; Vreeland et al. 2000; Fish et al. 2002), and plant fossils (Kim et al. 2004). These studies have routinely underestimated the extent to which aDNA research is confounded by contamination with modern DNA, and are widely thought to result from such contamination (Willerslev et al. 2004a; Hebsgaard et al. 2005). In recent years, a greater understanding of postmortem damage and contamination has provided a more robust foundation for the field, although the authentication of studies of human remains and microbes remains highly problematic (e.g., Willerslev et al. 2004b; Gilbert et al. 2005a; Hebsgaard et al. 2005; Willerslev and Cooper 2005).

The first report of putative Neanderthal (Homo neanderthalensis) mitochondrial DNA (mtDNA) from the type specimen (Feldhofer I [Krings et al. 1997]) was a rare example of a remarkable aDNA result obtained using very strict criteria for authenticity, including the independent replication of results and tests of biochemical preservation (Cooper and Poinar 2001; Hofreiter et al. 2001a; Pääbo et al. 2004; Willerslev and Cooper 2005). The result is convincing as the Neanderthal sequence differs from any known modern human (Homo sapiens) and chimpanzee (Pan troglodytes) sequences but is clearly human-like. Furthermore, subsequent independent retrieval of similar, but not identical, mtDNA from other Neanderthal specimens strongly supports the sequence’s authenticity (Ovchinnikov et al. 2000; Krings et al. 2000; Schmitz et al. 2002; Serre et al. 2004a; Lalueza-Fox et al. 2005).

The retrieval of Neanderthal sequences makes it possible to address the long-running debate about modern human origins, which had remained unresolved in paleontological and modern genetic studies (Wolpoff 1989; Templeton 1992). Neanderthals have been suggested either (i) to be direct ancestors of modern humans or to have contributed to the gene pool of today’s humans (multiregional model [e.g., Wolpoff et al. 1984; Templeton 2002]) or (ii) to have been replaced by anatomically modern humans without leaving any genetic trace in contemporary populations (Out-of-Africa replacement model [Stringer and Andrews 1988; Harvati et al. 2003]). For a recent review of the current evidence pertaining to this debate, see Finlayson (2005). Most published phylogenetic analyses suggest that Neanderthal mtDNA is positioned outside the genetic diversity of contemporary humans. This points to the Out-of-Africa replacement model (Krings et al. 1997, 1999, 2000; Ovchinnikov et al. 2000; Schmitz et al. 2002; Knight 2003), though one cannot yet exclude other scenarios with the limited Neanderthal sequences that are available (e.g., Nordborg 1998). However, these studies share two main problems: (i) the use of few contemporary human mtDNA sequences (as few as 10 [Schmitz et al. 2002]), a restricted sampling that has been shown to affect the phylogenetic position of ancient human mtDNA sequences (Cooper et al. 2001); and (ii) the use of analytical methods (i.e., neighbor-joining or maximum parsimony) that are, to some extent, unable to account for the extreme among-site variation in substitution rates and the levels of parallel mutations (homoplasies) that exist in the human control region (Krings et al. 1997, 1999, 2000; Ovchinnikov et al. 2000; Gutiérrez et al. 2002).

Other important issues affecting investigations of Neanderthal genetics are recent claims that the Neanderthal sequences are erroneous or simply sequence artifacts. Based on disagreements in genetic distances of the Neanderthal sequences to contemporary humans and the age of the fossils, Gutiérrez et al. (2002) argue that the first published Neanderthal sequence (Feldhofer I [Krings et al. 1997]) is erroneous. This interpretation is supported because some of the positions in the Feldhofer I sequence are unique (Caldararo and Gabow 2000; Schmitz et al. 2002) and might possibly result from postmortem damage (Hansen et al. 2001). Recent results from Pusch and Bachmann (2004) suggest that the original Neanderthal sequences might not represent “authentic” Neanderthal DNA but sequence artifacts.

Altogether, these uncertainties and claims can have severe implications for the understanding of modern human evolution. This paper aims to address the claims and reevaluate Neanderthal genetics and phylogeny in an up-to-date framework. We evaluate the first published Neanderthal sequence (Feldhofer I [Krings et al. 1997]) with respect to damage-based errors, as discussed by Gutiérrez et al. (2002), and further investigate the study by Pusch and Bachmann (2004), where it is claimed that the Neanderthal sequences might not represent “authentic” Neanderthal DNA but sequence artifacts. The phylogenetic position of the Neanderthal mtDNA sequences relative to contemporary human mtDNA is analyzed using Bayesian inference and a newly compiled dataset together with the datasets used by Gutiérrez et al. (2002).

Materials and Methods

Assessing Damage and Sequence Artifacts

To investigate the problem of damage-based errors in the HVR1 Feldhofer I consensus sequence, we randomly simulated 100,000 sequences with the same base composition as observed in the 11 positions that are variable in Neanderthal HVR1 sequences (Table 1). This approach assumes that the Neanderthal population is fairly homogeneous and not strongly genetically structured. The 11 bases were drawn independently of one another, and the empirical distribution, D, of the average pairwise difference (APD) to all humans was computed. The APDs between the four Neanderthal sequences and all human sequences were calculated (Table 2), and a test was applied to determine whether the value obtained for the Feldhofer I sequence was extreme in D. Because sites are drawn independently of each other, the true variance of the APD (but not its mean) is likely to be underestimated in D. A variance correction was therefore performed by assuming that the APD is binomial with mean 11q, as in D (the observed mean of D was 4.56, yielding q = 0.414), and variance 11q(1 – q) = 2.67 (Fig. 1). This approach is justified because the human sequences (from the large HVR1 dataset) consist of one main type comprising 1584 of 1905 sequences, and the other types are very similar to the most frequent one. We then tested whether the APD between Feldhofer I and humans was extreme in a binomial distribution with q = 0.414 (Fig. 1).
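The binomial variance correction can be reproduced with a few lines of arithmetic. The sketch below is not the authors' code. It assumes only the figures quoted above (11 variable positions and an observed mean of 4.56) and evaluates the upper tail of the resulting binomial distribution; the function and variable names are illustrative.

```python
from math import comb

N_SITES = 11      # variable Neanderthal HVR1 positions (Table 1)
MEAN_APD = 4.56   # observed mean of the empirical distribution D


def binomial_upper_tail(n: int, q: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, q)."""
    return sum(comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(k, n + 1))


q = MEAN_APD / N_SITES             # per-site mismatch probability, about 0.414
variance = N_SITES * q * (1 - q)   # binomial variance, about 2.67

# Probability of an APD of 8 or more under the binomial approximation.
p_extreme = binomial_upper_tail(N_SITES, q, 8)

print(f"q = {q:.3f}, variance = {variance:.2f}, P(APD >= 8) = {p_extreme:.3f}")
```

With these inputs the tail probability comes out at roughly 0.037, which is the quantity compared against the Feldhofer I value in the test described above.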

Table 1. Variable positions among the Neanderthal mtDNA HVR1 sequences
Table 2. Mean distances between sequences of Neanderthals and contemporary humans
Fig. 1. The empirical distribution of the average pairwise difference between a randomly generated sequence (see Materials and Methods) and the sample of human HVR1 sequences in the large HVR1 dataset. The average pairwise differences are mainly determined by the distance to the most common human sequence (1584 of 1905 sequences).

To test for chimeric sequences caused by jumping PCR events (Pääbo et al. 1989) in the datasets of Pusch and Bachmann (2004) and Krings et al. (1997), we used the approach of Gilbert et al. (2003b), examining the clone sequences for incompatible miscoding lesion-derived base substitutions.
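The logic of such a test can be sketched as follows. The code is an illustrative simplification and not the implementation of Gilbert et al. (2003b). It assumes that each clone is already aligned to a reference (consensus) sequence of equal length, and it treats C→T and G→A (and likewise T→C and A→G) as the same lesion type read from opposite strands, so that a clone carrying both members of a pair is flagged as a putative chimera. All names are hypothetical.

```python
# Each pair lists the two ways one lesion type appears, depending on
# which template strand carried the damage.
COMPLEMENTARY_PAIRS = [
    (("C", "T"), ("G", "A")),  # type 2 transitions
    (("T", "C"), ("A", "G")),  # type 1 transitions
]


def substitutions(reference: str, clone: str) -> set:
    """Return the set of (reference_base, clone_base) changes in an aligned clone."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    return {
        (r, c)
        for r, c in zip(reference.upper(), clone.upper())
        if r != c and r in "ACGT" and c in "ACGT"
    }


def is_putative_chimera(reference: str, clone: str) -> bool:
    """Flag a clone that shows both members of a complementary transition pair."""
    observed = substitutions(reference, clone)
    return any(a in observed and b in observed for a, b in COMPLEMENTARY_PAIRS)
```

A flag from this sketch is only suggestive: genuine sequence differences from the chosen reference would also be counted, so the reference should be the consensus for the specimen rather than an unrelated sequence.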

The phylogenetic analyses of the interim consensus sequences (ICS) from Pusch and Bachmann (2004) and the Neanderthal sequences were performed using MrBayes (Huelsenbeck and Ronquist 2003). The Markov chain Monte Carlo analysis in MrBayes was run for 2 million generations with four chains, three times independently. Trees were sampled every 100 generations and a 50% majority-rule consensus tree was produced from the last 1000 trees. Stationarity was checked using the command “sump.”
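As a rough illustration of what a 50% majority-rule consensus summarizes, the sketch below counts how often each clade appears among sampled trees and keeps the clades found in more than half of them. It is not how MrBayes is implemented. Trees are represented simply as sets of clades, each clade being a frozenset of taxon labels, and the toy labels (N1, N2, H1, H2) are invented for the example.

```python
from collections import Counter


def clade_frequencies(trees):
    """Return the fraction of sampled trees in which each clade occurs."""
    if not trees:
        raise ValueError("need at least one sampled tree")
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}


def majority_rule_clades(trees, threshold=0.5):
    """Keep clades whose frequency among the sampled trees exceeds the threshold."""
    return {c: f for c, f in clade_frequencies(trees).items() if f > threshold}


# Toy posterior sample of three trees on four hypothetical taxa.
sampled_trees = [
    {frozenset({"N1", "N2"}), frozenset({"H1", "H2"})},
    {frozenset({"N1", "N2"}), frozenset({"H1", "H2"})},
    {frozenset({"N1", "N2"}), frozenset({"H1", "N1", "N2"})},
]

support = majority_rule_clades(sampled_trees)
for clade, freq in sorted(support.items(), key=lambda item: -item[1]):
    print(sorted(clade), round(freq, 2))
```

Restricting the input to the final part of the sample (for instance `all_trees[-1000:]`, where `all_trees` is a hypothetical list of every sampled tree) corresponds to discarding the earlier samples before summarizing.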

Phylogenetic Inference

To reevaluate the Neanderthal phylogeny, we took two approaches. In one set of analyses we used the aligned mtDNA control region sequence datasets from Gutiérrez et al. (2002). The data contained a large HVR1 dataset of the hypervariable region 1 from the mtDNA control region, consisting of 422 aligned positions from 1905 contemporary human and 3 Neanderthal sequences (AF011222, AF254446, AF282971). They also included a smaller HVR12 dataset (combined HVR1 and HVR2 mtDNA control region) of 843 aligned positions, consisting of 377 contemporary human and 2 Neanderthal sequences (AF011222, AF142095 and AF282971, AF282972). In our reanalysis, we additionally used a Neanderthal HVR1 sequence (AY149291) not used by Gutiérrez et al. (2002) and two HVR1 Cro-Magnon sequences (AY283027, AY283028) from Caramelli et al. (2003). The recently published HVR1 sequence from Vindija (Vi-80 [Serre et al. 2004b]) was not included in the analyses, as it is identical to the first Vindija Neanderthal sequence (Krings et al. 2000) and could be derived from the same individual.

In another set of analyses we created a dataset with 519 HVR1 and HVR2 sequences of 859 aligned positions from the HvrBase (Handt et al. 1998; http://www.hvrbase.org). The dataset consisted of 7 chimpanzee sequences used as outgroup, the Vindija Neanderthal HVR1 and HVR2 sequences (AF282971, AF282972 [Krings et al. 2000]), and 511 contemporary human sequences (see Supplementary Material for the complete dataset). The Feldhofer I HVR sequences (AF011222, AF142095 [Krings et al. 1997, 1999]) were not used to avoid bias from possible sequence errors. The 511 human sequences were composed of 52 Africans, 162 Asians, 21 Oceanic/Australians, 145 Europeans, and 131 Americans. The dataset was created from the approximately 4000 taxa containing both HVR1 and HVR2 in the database by eliminating identical or nearly identical sequences until the dataset was reduced to 511 human sequences. We used this procedure because the Markov chain Monte Carlo method that is used in MrBayes (see below) to approximate the posterior probability distribution relies on convergence of the log likelihood and other model parameters. Convergence can be difficult to achieve when dealing with a large number of sequences and a relatively low number of unique site patterns (Huelsenbeck et al. 2002). Additionally, we aimed to include the maximum amount of sequence divergence in the dataset in order to be able to run the dataset within a reasonable timeframe.
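The text does not specify how identical or nearly identical sequences were identified. One simple way to thin an alignment, shown here purely as a sketch under that caveat, is a greedy pass that keeps a sequence only if it differs from every sequence already kept at a minimum number of positions. The threshold `min_diff` and the function names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def thin_alignment(seqs: dict, min_diff: int = 1) -> dict:
    """Greedily keep sequences that differ from all kept ones at >= min_diff sites.

    min_diff=1 removes exact duplicates only; larger values also remove
    near-identical sequences.
    """
    kept = {}
    for name, seq in seqs.items():
        if all(hamming(seq, other) >= min_diff for other in kept.values()):
            kept[name] = seq
    return kept


# Toy alignment with invented labels and sequences.
toy = {"a": "ACGT", "b": "ACGT", "c": "ACGA", "d": "TTGA"}
print(list(thin_alignment(toy, min_diff=1)))
print(list(thin_alignment(toy, min_diff=2)))
```

A greedy pass like this depends on input order and compares every candidate with all retained sequences, so it is meant only to make the idea concrete, not to describe the procedure actually used.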

To account for the large amount of parallel evolution and rate variation within the HVR regions (Tamura and Nei 1993), we used the general time reversible model of nucleotide substitution (GTR [Tavaré 1986; Rodriguez et al. 1990]) with gamma-distributed rates among sites and a correction for invariable sites (GTR+Γ+I). The number of gamma categories was set to the standard value of 4. To investigate the phylogenetic signals within the HVR12 dataset from Gutiérrez et al. (2002), we partitioned it into the HVR1 and HVR2 sections.

The phylogenetic analyses of the datasets from Gutiérrez et al. (2002) were performed with the parallel version of MrBayes version 3 beta 4 (Huelsenbeck and Ronquist 2003) on the BioCluster at the Zoological Museum, University of Copenhagen (http://www.zmuc.dk/). The HVR12 dataset and individual HVR1 and HVR2 partitions from Gutiérrez et al. (2002) were each run for 15 million generations, and the large HVR1 dataset from Gutiérrez et al. (2002) was run for 30 million generations. Stationarity for these analyses was checked using the “sump” command in MrBayes. The HVR12 dataset created for this study was analyzed using MrBayes version 3.1.1 (Huelsenbeck and Ronquist 2003) and was run for 50 million generations. Trees were sampled every 1000 generations, with a 50% majority-rule consensus tree computed from samples after stationarity had been reached. Stationarity and effective sample size (the number of effectively independent draws from the posterior distribution that is sampled from) were checked with the program Tracer 1.2.1 (Rambaut and Drummond 2004).
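To make the notion of effective sample size concrete, the sketch below estimates it from the autocorrelation of a sampled trace (for example, the log likelihood values recorded during a run). This is a generic textbook-style estimator, ESS = N / (1 + 2 Σ ρ_k), truncated at the first non-positive autocorrelation; it is offered as an assumption and may differ from the estimator used in Tracer.

```python
def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(xs):
    """Estimate ESS as N / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(xs)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(xs, lag)
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)
```

A strongly autocorrelated trace gives an ESS far below the number of recorded samples, which is the situation that calls for longer runs or sparser sampling.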

To investigate the phylogenetic signal under the neighbor-joining method (as in Gutiérrez et al. 2002), we analyzed the individual HVR1 and HVR2 parts from the HVR12 dataset with neighbor joining using the TN93 (Tamura and Nei 1993) model with gamma-distributed rates among sites and correction for invariable sites with the program PAUP* version 4 beta 10 (Swofford 1998).

Results

Errors in the Feldhofer I Sequence

We found that the probability of a randomly generated sequence having a higher average pairwise difference than Feldhofer I is only 1.4%. Applying the variance correction, we found that the probability of obtaining an average pairwise difference of 8 or more (the observed pairwise difference between Feldhofer I and humans is 7.91; Table 2) is 3.7%. Thus, the Feldhofer I HVR1 sequence is extreme in base composition compared to the Neanderthal HVR1 sequences and it is likely that the sequence is erroneous.

Artificial Neanderthal DNA

A BLAST search (Altschul et al. 1997) demonstrates that seven of the different ICS (I–V, IX, and XI) obtained in the experiment of Pusch and Bachmann (2004) show 100% matches with human mtDNA GenBank sequences (accession numbers AF285377, AF285367, AY426291, AB059953, AF519867, AY314618, and AY314618), implying that a variety of contemporary human contaminant sequences was amplified. A higher frequency of transitions than transversions (164/72 = 2.3), combined with the higher frequency of type 2 (cytosine → thymine and guanine → adenine, i.e., CG→TA; total = 95) than type 1 (adenine → guanine and thymine → cytosine, i.e., AT→GC; total = 68) mutations observed among the clone products of Pusch and Bachmann (2004), is consistent with observations on postmortem damage-derived miscoding lesions (Hansen et al. 2001; Hofreiter et al. 2001b; Gilbert et al. 2003a, b; Binladen et al. 2006), suggesting that postmortem damage might be involved in their data. Furthermore, at least 24 of the 35 ICS (68.6%) can be identified as chimeras caused by PCR jumping events according to the method of Gilbert et al. (2003b); in some cases, up to four jumping PCR events were found per sequence (Fig. 2). In comparison, only 4 of 167 clone sequences (2.4%) used to generate the first published Neanderthal HVR1 sequence were found to contain similar evidence of jumping PCR events (clones A2.10, B11.4, B11.8, and B14.9 [Krings et al. 1997]).
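The substitution classes used in this paragraph can be made explicit with a small helper. This is only an illustrative sketch with hypothetical names; the counts at the end are those reported above.

```python
TYPE1 = {("A", "G"), ("T", "C")}  # AT→GC transitions
TYPE2 = {("C", "T"), ("G", "A")}  # CG→TA transitions


def classify(ref: str, alt: str) -> str:
    """Classify a single-base substitution from ref to alt."""
    pair = (ref.upper(), alt.upper())
    if pair[0] == pair[1] or not set(pair) <= set("ACGT"):
        raise ValueError("expected two different bases from A, C, G, T")
    if pair in TYPE1:
        return "type 1 transition"
    if pair in TYPE2:
        return "type 2 transition"
    return "transversion"


# Ratios and percentages recomputed from the counts reported in the text.
ts_tv_ratio = 164 / 72            # transitions / transversions
chimera_fraction_ics = 24 / 35    # chimeric ICS, Pusch and Bachmann (2004)
chimera_fraction_krings = 4 / 167 # chimeric clones, Krings et al. (1997)

print(classify("C", "T"), classify("A", "G"), classify("A", "C"))
print(round(ts_tv_ratio, 1), f"{chimera_fraction_ics:.1%}", f"{chimera_fraction_krings:.1%}")
```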

Fig. 2. Detection of “Jumping PCR” in the clone sequences from Pusch and Bachmann (2004). Transitions derived from mitochondrial Light strand modifications have been underlined, those from Heavy strand modifications, highlighted after (Gilbert et al. 2003b). Sequences containing both Light- and Heavy-strand derived transitions must have originated through recombination during the PCR reaction (marked with an asterisk).

Figure 3 demonstrates that the diverse set of clones (35 ICS) obtained in the experiment of Pusch and Bachmann (2004), including the XXVI sequences, is phylogenetically more similar to the CRS and to each other than to any of the published Neanderthal sequences. The separate position of the Neanderthals and the ICS is highly supported with a posterior probability of 100%.

Fig. 3. The phylogenetic position of the Neanderthal sequences relative to the clone consensus sequences generated by Pusch and Bachmann (2004). The XXVI clone consensus sequence is the one claimed to be Neanderthal-like.

Neanderthal Phylogeny

The Bayesian analyses of the four datasets from Gutiérrez et al. (2002) show that Neanderthal sequences are separated from modern human sequences with a posterior probability of 100% (Fig. 4). A schematic representation of the resulting trees for the large HVR1 and the smaller HVR12 datasets, and for the HVR1 and HVR2 partitions, is given in Figs. 4a–d. The large HVR1 (Fig. 4a), the HVR12 (Fig. 4b), and the small HVR1 (Fig. 4c) datasets support Neanderthal monophyly with a posterior probability of 100%. The small HVR2 dataset (Fig. 4d) supports Neanderthal monophyly with a posterior probability of 63%. Analyzing the dataset created for this study shows that the Vindija Neanderthal HVR1 and HVR2 sequences are positioned as a sister group to the contemporary humans with a posterior probability of 100% (Fig. 5). Further, this dataset shows that six sequences of African origin form a sister group to the rest of the contemporary humans. The neighbor-joining analyses of the datasets of Gutiérrez et al. (2002) confirm that the Neanderthal sequences fall outside the sequences of contemporary humans for the large HVR1 dataset. However, the Neanderthal sequences fall within modern human variation for the combined HVR12, the small HVR1, and the small HVR2 datasets (Figs. 6a and b).

Fig. 4. A schematic representation of the majority rule consensus trees from the Bayesian analyses of the datasets from Gutiérrez et al. (2002) showing the phylogenetic relationship between the Neanderthals and contemporary humans. (a) The large HVR1 dataset, (b) the HVR12 dataset, (c) the HVR1 partition of the HVR12 dataset, and (d) the HVR2 partition of the HVR12 dataset.

Fig. 5. Majority rule consensus trees from the MrBayes analyses showing the phylogenetic relationship among the seven Chimpanzee sequences used as outgroup, the Vindija Neanderthal (Krings et al. 2000), and 511 contemporary humans.

Fig. 6. Neighbor-joining tree using the datasets from Gutiérrez et al. (2002). (a) The large HVR1 dataset and (b) the combined HVR12 and the individual small HVR1 and HVR2 partitions from the HVR12 dataset.

Discussion

Recent phylogenetic and population genetic research suggests that any genetic interchange between Neanderthals and anatomically modern humans was very limited during the approximately 10,000 years (10 kyr) they potentially co-occupied the same areas of Europe and Asia (Currat and Excoffier 2004; Serre et al. 2004b) and that the Neanderthals have not contributed to the mtDNA genetic diversity found in present-day humans (Krings et al. 1997, 1999, 2000; Ovchinnikov et al. 2000; Schmitz et al. 2002; Knight 2003). These issues are central to the two main theories of modern human origins: the Out-of-Africa replacement model, where modern humans rapidly replaced archaic forms (e.g., Neanderthals) as they began to spread from Africa through Eurasia and the rest of the world sometime around 100,000 years ago (Stringer and Andrews 1988; Harvati et al. 2003); and the multiregional model, where genetic exchange or even continuity exists between archaic and modern humans (e.g., Wolpoff et al. 1984; Templeton 2002).

In this paper we have investigated the genetic affinities of the Neanderthals to anatomically modern humans. First, we have evaluated whether the first published Neanderthal sequence (Feldhofer I) is erroneous (Gutiérrez et al. 2002). Second, we have investigated whether the Neanderthal sequences are sequence artifacts (Pusch and Bachmann 2004). Finally, with our reflections on the first two questions in mind, we have readdressed the controversial question about the phylogenetic position of the Neanderthals.

Errors in the Feldhofer I Sequence

One explanation for the unresolved position of Neanderthals among anatomically modern humans is that the sequence data might be considered unreliable due to the degraded nature of the Neanderthal specimens and their DNA (Gutiérrez et al. 2002). Biochemical analyses for investigating the preservation condition of excavated Neanderthal bones and teeth indicate that most of the specimens are unlikely to yield any endogenous DNA (Serre et al. 2004b). The majority of samples that have yielded putative Neanderthal DNA have only enabled PCR amplification of mtDNA in the 50-base pair (bp) size range (Serre et al. 2004b; Lalueza-Fox et al. 2005). In addition, it has been difficult to replicate the entire Neanderthal HVR1 sequences in independent laboratories (Krings et al. 1997; Ovchinnikov et al. 2000), suggesting that preservation of Neanderthal fossils is at the edge of what is required for successful DNA studies. It is therefore possible that some of the published Neanderthal DNA sequences might contain errors due to miscoding lesions (Hansen et al. 2001). This type of DNA damage is of particular concern if amplifications start from few template molecules, which appears to be the case at least in the first published Neanderthal study (the Feldhofer I HVR1 sequence [Krings et al. 1997]).

In support of errors in the Feldhofer I HVR1 sequence it has been argued that the most recent Neanderthal specimen (Mezmaiskaya, ∼29 kyr old) shows a shorter genetic distance to contemporary humans than Feldhofer I (which is believed to be the oldest of the Neanderthal specimens) (Gutiérrez et al. 2002). However, the validity of this argument is questionable, as the Feldhofer I fossil has recently been redated to ∼40 kyr (Schmitz et al. 2002), and the young age of the Mezmaiskaya fossil is debated (Skinner et al. 2005). Additionally, it has been noted that the Feldhofer I HVR1 sequence harbors four unique substitutions (positions 107, 108, 111, and 112) possibly due to postmortem damage accumulated during amplification (Caldararo and Gabow 2000; Schmitz et al. 2002; Hansen et al. 2001). Using a maximum damage-based error rate of ∼0.06%, Hofreiter et al. (2001b) reject major errors in the Feldhofer I sequence. However, the rate might be underestimated because they do not take into account the possible presence of damage hotspots in the human D-loop (Gilbert et al. 2003a) and the error rate is calculated from the consensus of only three Neanderthal sequences.

Comparisons of the APD between the Feldhofer I sequence and the sequences of contemporary humans with the APD of randomly generated “Neanderthal sequences” indicate that the Feldhofer I HVR1 sequence is extreme in its genetic composition. It is therefore likely that the Feldhofer I HVR1 sequence is erroneous, and we cannot exclude that at least this sequence is modified due to postmortem damage (see Errors in the Feldhofer I Sequence, under Results).

Artificial Neanderthal DNA

Instead of the Neanderthal sequences being affected by damage, a recent study suggests that their unique substitution patterns are caused by PCR artifacts. Pusch and Bachmann (2004) report that 35 different mitochondrial HVR1 sequences, including a group containing 7 substitutions that are in combination characteristic of the Neanderthals (i.e., clone XXVI; Fig. 2), can be amplified from a single sequence of modern human mitochondrial DNA (matching the Cambridge Reference Sequence; CRS [Anderson et al. 1981]) if the PCR reaction is spiked, prior to amplification, with 14 different aDNA extracts of non-Neanderthal origin. The authors thereby indirectly imply that the published Neanderthal sequences could be explained in this manner and may, in fact, not represent “authentic” Neanderthal DNA. However, as shown in Fig. 3, the diverse set of clones (35 ICS) obtained in the experiment of Pusch and Bachmann (2004), including the Neanderthal-like XXVI sequences, is phylogenetically more similar to the reference sequence and to each other than to any of the published Neanderthal sequences. The separate position of the Neanderthals and the ICS is highly supported, with a posterior probability of 100%, and all of the artificially generated sequences are therefore clearly distinguishable from the published Neanderthal sequences.

Another interesting issue is that regular BLAST searches (Altschul et al. 1997) reveal that seven of the different ICSs obtained by Pusch and Bachmann (2004) show 100% match with different human mtDNA GenBank sequences. This strongly suggests that a variety of human contaminants is amplified in the experiment. Furthermore, the higher frequency of transitions than transversions combined with the higher frequency of type 2 than type 1 mutations (see Materials and Methods) among the clone products of Pusch and Bachmann (2004) is consistent with the presence of damage-based misincorporation in the template DNA (Hansen et al. 2001; Gilbert et al. 2003a, b, 2005a; Willerslev et al. 2003). This could also explain the high frequency of chimeric sequences recorded in their dataset (>68%). Such chimeric sequences are caused by “jumping PCR” events that frequently take place when the template molecules are damaged (Pääbo et al. 1989; Willerslev et al. 1999) and/or the amplification starts from very similar molecules (von Wintzingerode et al. 1997). In some cases, up to four jumping PCR events are found per ICS in the dataset of Pusch and Bachmann (Table 2). In this context it is important to keep in mind that six of the seven Neanderthal-characteristic substitutions in the XXVI clone are recorded in contemporary human populations and a chimeric sequence of as few as five contaminant molecules (accession numbers are listed in Table 3) could generate a XXVI-like sequence. Importantly, only 4 of 167 clone sequences (2.4%) used to generate the first published Neanderthal HVR1 sequence contain similar evidence of jumping PCR (clones A2.10, B11.4, B11.8, and B14.9 [Krings et al. 1997]).

Table 3. Sequence comparisons of mtDNA HVR1

Considering these results, the retrieval of 35 different ICS (clones I to XXVIII, clones sls and srs [Pusch and Bachmann 2004, Table 2]) from the 14 spiked reactions is not surprising, taking the experimental design and the resulting sequence compositions into account. Exogenous human DNA is present in DNA extracts from museum remains even after extensive cleaning of the specimens (e.g., Malmström et al. 2005; Gilbert et al. 2006) and up to 20 different human sequences have been reported from a single fossil (Hofreiter et al. 2001a). Contaminant human DNA may originate from a variety of sources, including handling of the specimens (often over several decades), previous PCR products, and reagents and tools used for DNA extraction and PCR (Willerslev and Cooper 2005). Even though blank controls are negative, contaminant DNA sequences might be amplified from sample extracts due to sample contamination and/or “carrier effects” (Handt et al. 1994; Cooper and Poinar 2001; Hofreiter et al. 2001a). The 14 different ancient specimens used by Pusch and Bachmann (2004) for DNA extractions can be expected to carry a considerable load of human contamination, some of which is likely to be highly degraded after years of storage. Apparently, their experiments were not conducted using aDNA standards such as an isolated facility for DNA extractions and PCR setup and the cleaning of specimens, reagents, and tools (Cooper and Poinar 2001; Hofreiter et al. 2001a; Pääbo et al. 2004; Willerslev and Cooper 2005), so both sample and laboratory-based contamination are of major concern. Furthermore, the results have not proven reproducible (Serre et al. 2004a; Beauval et al. 2005), even though independent replication has become standard in human aDNA work (Cooper and Poinar 2001; Hofreiter et al. 2001a; Willerslev and Cooper 2005), so it is impossible to determine how much of the effect might be specific to the extraction and amplification techniques used.

If, as implied by Pusch and Bachmann (2004), the amplification of artificially “Neanderthal-like” XXVI clone sequences is common in aDNA studies and constitutes 16% of their clone products, it is surprising that similar sequences are not already present in GenBank after more than 50 ancient human mtDNA publications (i.e., PubMed search). In particular, such sequences should have been spotted in recent studies such as that of mtDNA from Cro-Magnons (Caramelli et al. 2003) and the Andaman Islanders (Endicott et al. 2003), where cloning was applied. In a recent study where more than 900 human clone sequences were obtained from 34 Viking specimens (Gilbert et al. 2003a), no sequences were obtained that matched the XXVI clones of Pusch and Bachmann or that had any Neanderthal-like substitution patterns.

Neanderthal Phylogeny

Readdressing the controversy of the phylogenetic position of the Neanderthal sequences (Gutiérrez et al. 2002), we favored the Bayesian inference method (Huelsenbeck and Ronquist 2003) because it assigns a posterior probability to each possible phylogeny rather than just selecting a single best tree, and further, the Bayesian inference method allows comparison of the support for conflicting phylogenies. The method uses a Markov chain Monte Carlo method which allows large sequence datasets to be analyzed in a statistical framework with an adequate model of substitution (Huelsenbeck et al. 2002). The method has been parallelized (Altekar et al. 2004), which has increased the complexity of analytical problems that can be solved.

The Bayesian inference delivers consistent results for all the datasets (the large HVR1 and small HVR1, HVR2, and HVR12 datasets used by Gutiérrez et al. [2002] and our newly constructed HVR12 dataset; see Materials and Methods), which strongly supports the separation of Neanderthals from contemporary humans (Figs. 4 and 5). Although the 95% credibility interval (the Bayesian equivalent of a confidence interval) also includes phylogenies where the Neanderthal sequences do not form a monophyletic group, the support for a monophyletic Neanderthal group is considerable. Altogether the Bayesian phylogenies are in agreement with previous phylogenetic analyses (Krings et al. 1997, 1999, 2000; Ovchinnikov et al. 2000; Schmitz et al. 2002; Knight 2003) and show that the ambiguous results of Gutiérrez et al. (2002) could be due to the inadequacy of the neighbor-joining method with the TN93 model (Tamura and Nei 1993) for the given data. Analyzing the dataset created for this study, we see not only that the Vindija Neanderthal HVR1 and HVR2 sequences are positioned as a sister group to the contemporary humans with posterior probability 100%, but also the ancestral position of six sequences with African origin.

Thus our results are in agreement with the expectations of the Out-of-Africa replacement model for modern human origin. To exclude the possibility that the results are influenced by the possible sequence errors in the Feldhofer I HVR1 sequence (Krings et al. 1997), this sequence is not included in our newly constructed HVR12 dataset.

Conclusion

The evolutionary relationship between Neanderthals and anatomically modern humans is highly debated. Intriguingly, phylogenetic analyses addressing this issue have so far suffered from limited sequence sampling and inadequate methodology causing conflicting results. Large-scale Bayesian analyses strongly support a position of the Neanderthal mtDNA sequences outside that of anatomically modern humans, in agreement with the expectations of the Out-of-Africa replacement model for modern human origin. It is noteworthy, however, that with the limited number of Neanderthal sequences that are available one cannot yet rule out other scenarios (e.g., Nordborg 1998). It was recently estimated that the maximum interbreeding rates between Neanderthals and anatomically modern humans have been <0.1% (Currat and Excoffier 2004). However, the result relies, among other things, on the existence of a very precise relationship between the chemical preservation of amino acids and endogenous DNA and on depurination being the main type of damage limiting the half-life of DNA in fossil remains (Serre et al. 2004b). Both of these assumptions have been questioned (Collins et al. 1999; Hansen et al. 2006). Another interesting point is the fact that Neanderthal male contribution to anatomically modern humans is not recorded in the maternally inherited mtDNA sequences, and nuclear DNA (nuDNA) from Neanderthals may tell a different story. However, despite technical advances, such as the creation of metagenomic libraries (a technique allowing for the sequencing of large amounts of genomic DNA without an initial amplification step) from the extinct cave bear (Noonan et al. 2005), it is likely to be difficult to obtain reliable Neanderthal nuDNA sequences with current techniques. This is due to contamination problems (i.e., distinguishing endogenous Neanderthal and contaminant contemporary human nuDNA if the sequences are identical) and the generally poor preservation of nuDNA in fossil remains (Poinar et al. 2003). Thus, the recovery of a well-preserved Neanderthal specimen free of contemporary human contamination or the search for variable sequences of ancient bone proteins (Nielsen-Marsh et al. 2005) might currently be the only realistic approaches for addressing this issue. Although some of the published Neanderthal mtDNA sequences may contain a few sequence errors due to damage in the template molecules for PCR, there is currently no solid evidence for the sequences being a result of PCR artifacts.