Introduction

Animal mitochondrial (mt) genomes generally contain the same 37 genes (Boore 1999) and are thought to be selected for small sizes (Brown et al. 1979; Quinn and Wilson 1993; Rand 1993). Nonetheless, there is an increasing number of cases where partial duplications in the mt genome are documented (D’Onorio de Meo et al. 2012). For birds, such duplications were found in several unrelated taxa, i.e., albatrosses, waders, and passerines (Mindell et al. 1998; Bensch and Härlid 2000; Eberhard et al. 2001; Abbott et al. 2005; Gibb et al. 2007; Cadahía et al. 2009; Cho et al. 2009; Morris-Pocock et al. 2010; Verkuil et al. 2010; Sammler et al. 2011; Schirtzinger et al. 2012). They represent six distinct gene arrangements, all differing from the standard avian (i.e., chicken) mt genome (Desjardins and Morais 1990). These span the region close to the control region (CR), i.e., between NADH dehydrogenase subunit 5 (NADH5) and tRNA F/12S rRNA (12S). This is in agreement with expectations from previous studies on the molecular evolution of the mt genome, i.e., that genes flanking the origins of strand replication (OL in the vertebrate WANCY region, OH in the vertebrate CR) form hotspots of duplications that make convergent gene order arrangement more probable (Mindell et al. 1998; Dowton and Austin 1999; Boore 2000; San Mauro et al. 2006; Irisarri et al. 2010). Since in birds the OL is absent from its usual location within the WANCY region (Desjardins and Morais 1990; Seutin et al. 1994), it is not surprising that no such rearrangements have been described so far in this region. For the region close to the OH, Gibb et al. (2007) suggested a gene conversion scenario for avian species according to the tandem duplication and random loss model (Moritz and Brown 1987; Boore 2000).

The most common explanation for mt tandem duplication events is “slipped strand mispairing” favored by the presence of repeat units or sequences that form secondary structures (Moritz and Brown 1986, 1987; Levinson and Gutman 1987; Moritz et al. 1987; Stanton et al. 1994). During DNA replication, a portion of the DNA strand dissociates between two repeats, forming a loop. The polymerase then reassociates at the first repeat and duplicates the looped section. Further explanations are highlighted by Bernt et al. (2012), including imprecise termination (Mueller and Boore 2005), head-to-head or head-to-tail dimerization of linearized monomeric mitogenomes (Lavrov et al. 2002), and other enzymatic errors causing, e.g., erroneous identification of the origin of light-strand replication (Macey et al. 1997).

In species with duplicated fragments, these duplicates are often found to evolve in concert (Eberhard et al. 2001; Abbott et al. 2005; Singh et al. 2008; Cadahía et al. 2009; Cho et al. 2009; Morris-Pocock et al. 2010; Verkuil et al. 2010; Sammler et al. 2011). However, the underlying mechanisms are not fully understood. In Sammler et al. (2011), we have shown that the mt genome of two Philippine hornbill species, Aceros waldeni and Penelopides panini, is characterized by a tandemly duplicated region encompassing part of cytochrome b (cytb), three tRNAs, NADH6, and the CR (Fig. 1a). The duplicated fragments are largely identical to each other, except for a short section in domain I and for the length of repeat motifs in domain III of the CR. We discovered concerted evolution between the duplicated fragments within individuals, except for the short region, which we hypothesize to correspond to the so-called Replication Fork Barrier (RFB). In this region, we found orthologous copies across six individuals per species (duplicate I of all individuals and duplicate II of all individuals, respectively) more closely related to one another than to the paralogous copies (duplicate I and duplicate II) within individuals. The detection of these regions exempted from gene conversion apparently offers the possibility to distinguish easily between the two duplicates.

Fig. 1

MtDNA genome organization in Philippine hornbills (only the tandemly duplicated part). a Gene order and position of five overlapping PCR amplicons. Arrows indicate the annealing location of the primer AcePen_Glu-for used for sequencing. b Variable sites of a 646 bp alignment of the duplicated control regions of the P. m. subniger haplogroup defined in Fig. 2. The gray shaded sections represent the putative RFB region. Pmm32 (red framed) possesses interchanged control regions, which becomes apparent in the RFB region. Dots indicate positions that are identical to the first sequence; dashes are indels (Color figure online)

Here, we surveyed several individuals of another Philippine hornbill, the Luzon Tarictic Hornbill Penelopides manillae, for the same features. Specifically, we analyzed the putative RFB regions in both CRs and compared their characteristics within and across individuals.

Material and Methods

Sampling, Amplification and Sequencing

We extracted DNA from blood samples of 27 captive individuals from two subspecies of Penelopides manillae which had been collected for a phylogeographic study on Philippine hornbills (Sammler et al. 2012).

The 5′ end of control region 1 (CR1) and the 5′ end of control region 2 (CR2) were inspected by amplifying fragments 3 and 5, respectively (Fig. 1a), as described in Sammler et al. (2011). Products were sequenced with the primer AcePen_Glu-for (Sammler et al. 2011). Note that the CR1 sequences of 26 specimens were used previously in a phylogeographic study (Sammler et al. 2012), whereas the CR1 sequence of the 27th specimen (Pmm32) and the CR2 sequences of all 27 specimens were produced for the present study. To verify that individual Pmm32 possesses the same mt genome organization as the other individuals (Fig. 1a), we sequenced 1,927 base pairs (bp) of its tandemly duplicated part in the overlapping fragments 1–5, following the protocol of Sammler et al. (2011).

Alignments and Analysis of Sequence Data

Sequences were aligned in BioEdit version 7.0.5.3 (Hall 1999). Mt haplotypes were defined based on the first 646 bp of CR1 and CR2. To infer the pattern of evolution of the duplicated region, we conducted a phylogenetic analysis including both CR copies of each individual separately, with both CR copies of one individual each of Aceros waldeni (Aw, GenBank: JX273782, JX274015), Penelopides affinis (Pa, GenBank: JX273976, JX274014), and P. panini (Pp, GenBank: HQ834457, HQ834467) as outgroups. Each sample is designated by an abbreviation identifying the subspecies (Pmm for P. manillae manillae and Pms for P. m. subniger), a unique number for the individual, and “CR1” (amplified with primers for fragment 3, Fig. 1a) or “CR2” (amplified with primers for fragment 5, Fig. 1a) to distinguish between the two copies. We used JModelTest (Posada 2008) to determine the model of sequence evolution that best fits our data. The GTR + Γ + I model, identified by JModelTest as the best-fitting model, was then used for all subsequent Maximum Likelihood (ML) and Bayesian searches. ML analyses were carried out with the fastDNAml 1.2.2 program (Olsen et al. 1994) available on the Mobyle portal (http://mobyle.pasteur.fr/cgi-bin/portal.py) with 1,000 bootstrap replicates. The Bioportal of the University of Oslo (Norway) (Kumar et al. 2009) was used to run MrBayes 3.0 (Ronquist and Huelsenbeck 2003) for 4,000,000 generations with a sampling frequency of 100 generations. We ran two independent runs, each with one cold and three heated Markov chains. Stationarity of the Markov chains was checked in AWTY (Nylander et al. 2008); the first 10 % of the sampled trees were discarded as burn-in, and posterior probabilities for each node were calculated based on the remaining 90 % of sampled trees. These trees were used to construct a 50 % majority rule consensus tree using PAUP* 4.0b10 (Swofford 2003). PAUP* was also used to analyze the data by Maximum Parsimony (MP) with 1,000 bootstrap replicates (heuristic searches, ACCTRAN character-state optimization, 100 random stepwise additions, TBR branch-swapping algorithm, and gaps treated as a fifth base) (Farris 1970; Hendy and Penny 1982).
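The haplotype definition used above (identical sequence over the first 646 aligned positions) can be illustrated with a minimal Python sketch. This is not the procedure run in BioEdit; the function name `define_haplotypes`, the sample labels, and the toy sequences are hypothetical, and the cutoff is shortened to 10 positions so that the example stays readable.

```python
from collections import defaultdict


def define_haplotypes(seqs, length=646):
    """Group sample labels whose aligned sequences are identical
    over the first `length` columns (case-insensitive)."""
    groups = defaultdict(list)
    for label, seq in seqs.items():
        groups[seq[:length].upper()].append(label)
    # One list of labels per haplotype, in order of first appearance.
    return [sorted(labels) for labels in groups.values()]


# Hypothetical toy alignment (not data from this study).
toy_alignment = {
    "sampleA": "ACGTACGTAA",
    "sampleB": "ACGTACGTAA",
    "sampleC": "ACGTACCTAA",
    "sampleD": "ACGTACGTAAGG",  # differs only beyond the cutoff
}

haplotypes = define_haplotypes(toy_alignment, length=10)
for number, members in enumerate(haplotypes, start=1):
    print(f"haplotype {number}: {', '.join(members)}")
```

In this sketch, differences beyond the cutoff do not create a new haplotype, which mirrors restricting the comparison to the first 646 bp.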

Results and Discussion

All studied individuals of Penelopides manillae possess two similar CRs (GenBank: JX273936-JX273953, JX273963-JX273974, JX273978, JX273979 for CR1 and GenBank: JX273980-JX274013 for CR2). Due to single heteroplasmic sites (cf. Morris-Pocock et al. 2010; Sammler et al. 2011), some specimens exhibited two haplotypes of one or both CRs. Within each individual, the two CRs generally differ in a short distinct interspaced region (grey shaded in Fig. 1b). This section corresponds to the putative RFB region described for Aceros waldeni and Penelopides panini (Sammler et al. 2011). In 26 individuals, orthologous copies (CR1 of all individuals and CR2 of all individuals, respectively) in the putative RFB region are more closely related to one another across individuals than they are to paralogous copies (CR1 and CR2) within individuals. Figure 1b shows ten (the P. m. subniger haplogroup, Fig. 2; see also Figs. 2b and 3 in Sammler et al. 2012) of these 26 sequences; the remaining 16 show an identical pattern. Consequently, CR1 and CR2 sequences cluster, in general, separately in supported monophyletic groups in the gene tree (Fig. 2). In the surrounding sections (nonshaded in Fig. 1b), a reversed pattern can be found: Paralogous copies within individuals are more similar to one another than are orthologous copies across specimens. This concerted evolution of the paralogous copies may be explained by frequent recombination (Sammler et al. 2011).
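The contrast between orthologous and paralogous copies can be made explicit by comparing mean pairwise distances. The following Python sketch is an illustration only: it uses an uncorrected p-distance that ignores gapped columns (an assumption; the analyses in this study relied on tree-based methods), and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def ortholog_vs_paralog(copies):
    """copies maps individual -> (CR1 segment, CR2 segment), all aligned.
    Returns (mean distance among orthologs, mean distance between paralogs)."""
    individuals = sorted(copies)
    ortholog = []
    for i, j in combinations(individuals, 2):
        ortholog.append(p_distance(copies[i][0], copies[j][0]))  # CR1 vs CR1
        ortholog.append(p_distance(copies[i][1], copies[j][1]))  # CR2 vs CR2
    paralog = [p_distance(*copies[i]) for i in individuals]      # CR1 vs CR2
    return sum(ortholog) / len(ortholog), sum(paralog) / len(paralog)


# Hypothetical toy segments standing in for a putative RFB region.
toy_rfb = {
    "ind1": ("AACCGGTTAA", "AATTGGCCAA"),
    "ind2": ("AACCGGTTAA", "AATTGGCCAT"),
    "ind3": ("AACCGGTTAC", "AATTGGCCAA"),
}

mean_ortholog, mean_paralog = ortholog_vs_paralog(toy_rfb)
print(f"orthologs: {mean_ortholog:.3f}, paralogs: {mean_paralog:.3f}")
```

A segment exempt from homogenization is expected to show a smaller ortholog than paralog distance, as in the toy data; a segment evolving in concert is expected to show the reverse.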

Fig. 2

Maximum Likelihood tree for all Penelopides manillae manillae (Pmm) and P. m. subniger (Pms) haplotypes of control region 1 (CR1) and control region 2 (CR2) found in the study. The tree is based on the GTR + Γ + I model (α = 0.146; I = 0.529) with Aceros waldeni (Aw), Penelopides affinis (Pa), and P. panini (Pp) as outgroups. Numbers at nodes are bootstrap support values for ML and MP searches (1,000 replicates; first and third value, respectively) and posterior probabilities for the Bayesian search (second value). Only support values ≥75 % are reported. For heteroplasmic individuals, both haplotypes are shown, distinguished by the suffixes a and b. P. m. manillae haplogroups I, II, and III (HI, II, and III) and the P. m. subniger haplogroup correspond to the haplogroups of the same names in Fig. 2B in Sammler et al. (2012), where discrepancies between the morphological identification of samples (Pmm28 and Pmm32) and the phylogenetic clustering of their haplotypes were interpreted as the result of migration events between islands. Sequences of Pmm32 (red framed) cluster interchanged, i.e., CR1 of Pmm32 in the CR2 assembly, and CR2 of Pmm32 in the CR1 assembly of the P. m. subniger haplogroup (Color figure online)

In individual Pmm32, we found the same pattern, but the interspaced regions are interchanged (Fig. 1b). In CR1 (amplified with primers for fragment 3; Fig. 1a), Pmm32 possesses an interspaced sequence which is more similar to the putative RFB region in the CR2 of the other 26 individuals (amplified with primers for fragment 5; Fig. 1a), while the interspaced region in CR2 of Pmm32 (amplified with primers for fragment 5; Fig. 1a) corresponds to the RFB region in CR1 of the other 26 individuals (amplified with primers for fragment 3; Fig. 1a). By sequencing 1,927 bp of the tandemly duplicated part of the mt genome in the overlapping fragments 1–5 (Fig. 1a) (GenBank: JX274016, JX274017), we showed that Pmm32 possesses the same genome organization as the other individuals, but with CR1 and CR2 sequences interchanged. Apart from the differences presented in Fig. 1b, no further differences between the two 1,927 bp long copies of the individual Pmm32 were found.
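A simple screen for such an interchange is to compare each copy of an individual with the consensus of the same copy in all other individuals. The Python sketch below is a hypothetical illustration and not the approach taken in this study, where the interchange was identified from the alignment and the gene tree; the leave-one-out consensus, the function names, and the toy sequences are all assumptions.

```python
from collections import Counter


def consensus(seqs):
    """Column-wise majority consensus of equally long aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def mismatches(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def flag_interchanged(copies):
    """copies maps individual -> (CR1 segment, CR2 segment).
    Flags individuals whose CR1 is closer to the CR2 consensus of the other
    individuals AND whose CR2 is closer to their CR1 consensus."""
    flagged = []
    for ind, (cr1, cr2) in copies.items():
        others = [pair for name, pair in copies.items() if name != ind]
        cons1 = consensus([pair[0] for pair in others])
        cons2 = consensus([pair[1] for pair in others])
        cr1_swapped = mismatches(cr1, cons2) < mismatches(cr1, cons1)
        cr2_swapped = mismatches(cr2, cons1) < mismatches(cr2, cons2)
        if cr1_swapped and cr2_swapped:
            flagged.append(ind)
    return flagged


# Hypothetical toy segments; "indX" carries the two copies in swapped positions.
toy_copies = {
    "ind1": ("AACCGGTT", "AATTGGCC"),
    "ind2": ("AACCGGTT", "AATTGGCC"),
    "ind3": ("AACCGGTA", "AATTGGCC"),
    "indX": ("AATTGGCC", "AACCGGTT"),
}

print(flag_interchanged(toy_copies))
```

Requiring both copies to be closer to the opposite consensus keeps a single divergent copy from being reported as an interchange.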

This intragenomic rearrangement is also reflected in our phylogenetic analyses based on both CRs of each individual (Fig. 2). P. manillae is split into four haplogroups (P. m. manillae HI, II, III, and P. m. subniger). Within each of these groups, the CR1 sequences and the CR2 sequences each consistently cluster together. The sole exception is the individual Pmm32. CR1 and CR2 of this individual (red framed in Fig. 2) cluster interchanged in the P. m. subniger CR2 and CR1 clades. As both CR sequences of Pmm32 cluster well within the P. m. subniger haplogroup (Fig. 2), the interchange must have occurred after the separation of this lineage from the other P. manillae lineages, i.e., quite recently.

Mitochondrial genome rearrangements have repeatedly been found in birds (e.g., Mindell et al. 1998; Bensch and Härlid 2000; Eberhard et al. 2001; Abbott et al. 2005; Gibb et al. 2007; Cadahía et al. 2009; Cho et al. 2009; Morris-Pocock et al. 2010; Verkuil et al. 2010; Sammler et al. 2011; Schirtzinger et al. 2012). Two different genome arrangements within a single species are, however, rare and have so far been reported only from the aberrant mt genomes of asexual squamates (Moritz and Brown 1987; Moritz 1991; Zevering et al. 1991; Fujita et al. 2007). To the best of our knowledge, we provide the first report of such a phenomenon within a bird species.

The differences between the two putative RFB regions within individuals, together with the similarity of orthologs across individuals, facilitate the detection of such an interchange. However, the similarity between the duplicates outside the putative RFB region precludes an exact delineation of the length of the fragment exchanged.

The sequence of the cytb gene in the second duplicate is only partial, and thus not functional. This could imply that the complete first duplicate is the functional one, whereas the second is only maintained by gene conversion. In parrots, at least six independent origins of duplicated CRs are found, but other duplicated genes are degraded or eliminated in most instances (Schirtzinger et al. 2012). Maintaining additional CRs has been interpreted as an advantageous trait for, e.g., faster replication (Kumazawa et al. 1996) or protection against age-related deterioration of mitochondrial function (Schirtzinger et al. 2012). In species where duplicated genes are neither eliminated nor degraded (Thalassarche albatrosses, Abbott et al. 2005; the Black-faced Spoonbill, Cho et al. 2009; three booby species, Morris-Pocock et al. 2010; the ruff, Verkuil et al. 2010; Philippine Aceros and Penelopides hornbills, Sammler et al. 2011), their functionality is still an open question. However, the interchange in Pmm32 indicates that the RFB sequences in both CRs, which are supposed to halt replication forks (Kurabayashi et al. 2008; Sammler et al. 2011), seem to fulfill their function, regardless of being reciprocally exchanged.

In Sammler et al. (2011), we explained the homogenization of duplicates within individuals in light of the findings of Reyes et al. (2005) and the recombination model of Kurabayashi et al. (2008). According to these authors, the RFB plays a paramount role. Since it is excluded from the prominent initiation zone between cytb and 12S, and since it halts the replication fork, this region is excluded from the homogenization across the duplicated CR regions.

While our findings strongly indicate that the mt genome of Pmm32 could remain functional and replicable despite the intragenomic rearrangement, one can only speculate about the mechanism by which this apparent interchange occurred. A possible scenario is that, once in the evolutionary history of this mitochondrial lineage, the replication fork was erroneously not halted at the RFB, and the RFB itself was subjected to strand exchange. Under such a scenario, instead of repair of the heteroduplex strand, which would have produced two homogenized CR1 or two homogenized CR2 sequences in one mt genome, we hypothesize that a process analogous to the chromosomal double crossing over reported from nuclear DNA during meiosis might have produced the interchange we have identified. Such a hypothesis, however, warrants experimental examination.
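The hypothesized double-crossover-like exchange can be illustrated with a toy model in Python. This is a sketch of the logic only, not a model of the molecular process; the function name `double_crossover`, the breakpoints, and the sequences are hypothetical. Two aligned copies share identical flanks and differ only in a central segment standing in for the putative RFB region.

```python
def double_crossover(copy_a, copy_b, start, end):
    """Reciprocally exchange the aligned segment [start, end) between two copies."""
    if len(copy_a) != len(copy_b):
        raise ValueError("copies must be aligned to the same length")
    if not 0 <= start <= end <= len(copy_a):
        raise ValueError("breakpoints out of range")
    new_a = copy_a[:start] + copy_b[start:end] + copy_a[end:]
    new_b = copy_b[:start] + copy_a[start:end] + copy_b[end:]
    return new_a, new_b


# Hypothetical copies: identical flanks, distinct central segment.
cr1 = "AAAA" + "CCC" + "GGGG"
cr2 = "AAAA" + "TTT" + "GGGG"

# Breakpoints exactly at the edges of the central segment ...
tight = double_crossover(cr1, cr2, 4, 7)
# ... or well inside the identical flanks.
wide = double_crossover(cr1, cr2, 1, 10)

print(tight)
print(wide)
print(tight == wide)
```

Both choices of breakpoints yield the same pair of sequences, because breakpoints placed anywhere within the identical flanks are indistinguishable in the product. This is why the length of the exchanged fragment cannot be delineated exactly from the sequences alone.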

Most studies on mt gene evolution concentrate on surveying and comparing a broad spectrum of species to understand how rearrangements came about, but often do not include replicate specimens of the species analyzed. Our study provides a valuable example of mitochondrial genome rearrangement within a species.

The mitochondrial CR is also a standard molecular marker for population genetics and phylogeography. Here, efforts are usually made to amplify only the targeted mt sequences. Where duplicates and/or nuclear copies of the mt genome are known to exist (Sorenson and Quinn 1998), one general strategy is to design putatively specific primers to avoid amplification of unwanted DNA fragments. However, if such primers are situated within an element subjected to rearrangements, an interchange such as that reported in this study would go unnoticed. Our findings reiterate that amplification of mitochondrial sequences, even with specific primers, does not guarantee orthologous sequences. We therefore make a plea for targeting either complete mitochondrial genomes or at least longer fragments when amplifying regions prone to rearrangements.