Introduction

Bananas (Musa spp., family Musaceae) originated in southeast Asia and the western Pacific, and have spread widely throughout the tropics and subtropics to become one of the most important tropical food sources, next to rice, wheat, and maize. The genus Musa comprises 70 wild species (Häkkinen 2013) and 500 cultivars (Simmonds 1966).

The classification of bananas dates back to Linnaeus (1753), who named the genus Musa. Later, Sagot (1887) divided the genus into giant bananas, fleshy edible bananas, and ornamental bananas. Based on Sagot's work, Baker (1893) further divided Musa into three subgenera: physocaulis, eumusa, and rhodochlamys. Classification at the chromosomal level was first performed by Cheesman (1947), who divided Musa into four sections: eumusa and rhodochlamys (2n = 22), and callimusa and australimusa (2n = 20). Subsequently, Simmonds (1960) added a group, ingentimusa, containing only two species: Musa ingens Simmonds (2n = 14) and Musa beccarii Simmonds (2n = 18). Thereafter, Argent (1976) created a new section ingentimusa, into which M. ingens was placed, and Häkkinen et al. (2005) suggested the inclusion of M. beccarii in the callimusa section. For half a century there has been little change in the genus classification system for Musa as proposed by Cheesman (1947). However, studies examining the taxonomic relationships within Musa using molecular approaches have questioned the validity and practicability of this system and have generally classified the genus into two groups, namely, Musa (2n = 22) and Callimusa (2n = 20/18) (Gawel et al. 1992; Wong et al. 2002; Nwakanma et al. 2003; Li et al. 2010; Liu et al. 2010; Bekele and Shigeta 2011; Christelová et al. 2011; Häkkinen 2013). Most recently, Feng et al. (2016) confirmed this classification using simple sequence repeat (SSR) markers to determine the molecular phylogeny of the genus Musa.

To date, more than 70 wild Musa species have been identified; of these, Musa acuminata Colla (A genome) and Musa balbisiana Colla (B genome) are the most prominent. The genomic constitutions AA, AB, AAA, AAB, and ABB (Stover and Simmonds 1987) observed in present-day banana cultivars evolved through intra- and inter-specific crosses (Cheesman 1948; Simmonds and Shepherd 1955) of these two species. M. acuminata is genetically diverse, comprising 10 subspecies (banksii, burmannica, burmannicoides, errans, malaccensis, microcarpa, siamea, truncata, and zebrina) and the variety (var.) chinensis (Feng et al. 2009). Although higher genetic diversity has been observed in M. balbisiana (Sotto and Rabara 2000), no intraspecific classification has been reported to date.

Banana cultivars are mostly diploid, triploid, or tetraploid, and are characterized by sterility, parthenocarpy, polyploidy, or unknown origin, which has slowed progress in banana genetic improvement. Examining the genetic diversity and phylogenetic relationships of banana germplasms would therefore help clarify the origin and evolution of banana cultivars and accelerate their breeding.

Three different genomes exist within plant cells: a nuclear genome, a chloroplast genome, and a mitochondrial genome. Several studies have shown the potential of using complete organellar genomes to analyze phylogenetic relationships in plants. The nuclear genome size of M. acuminata is 523 Mbp (D’Hont et al. 2012) and the M. balbisiana genome size is 79% that of M. acuminata (Davey et al. 2013). In contrast, the organellar genomes are much smaller than the nuclear genome; the chloroplast genome is 0.17 Mbp (Barrett et al. 2014; Shetty et al. 2016; Li et al. 2017), while the exact size of the mitochondrial genome is currently unknown. Moreover, the complete chloroplast genomes of M. acuminata (Martin et al. 2013) and Musa itinerans Cheesman (Li et al. 2017) have been published. In Musa spp., inheritance of the chloroplast genome is strongly biased toward the maternal lineage, while the mitochondrial genome is paternally inherited (Fauré et al. 1994). Thus, organellar genomes enable maternal and paternal lineages to be followed using chloroplast and mitochondrial markers, respectively.

Gawel and Jarret (1991a) used chloroplast DNA restriction fragment length polymorphisms (RFLPs) to analyze the phylogenetics of Musa species and subspecies, and reported cytoplasmic diversity among Musa cultivars (Gawel and Jarret 1991b). Later, Carreel et al. (2002) combined RFLPs with the hybridization of heterologous mitochondrial and chloroplastic probes to characterize 71 wild accessions, and 131 diploid and 103 triploid cultivars of Musa, and identified 10 chloroplastic patterns and more than 100 mitochondrial DNA patterns. Umali and Nakamura (2003) reported a single nucleotide polymorphism (SNP) marker from the trnL-F intergenic spacer region of chloroplast DNA, which could be used to discriminate M. acuminata from M. balbisiana. Nwakanma et al. (2003) constructed a molecular phylogeny of Musa species using restriction-site polymorphisms of organelles, and suggested that the evolutionary status of M. balbisiana was primitive. More recently, Swangpol et al. (2007) analyzed SNPs from selected non-coding chloroplast DNA sequences of Musa interspecific hybrids and found that the M. acuminata and M. balbisiana genomes could be clearly distinguished. Boonruangrod et al. (2008) analyzed the relationship between chloroplast and mitochondrial haplotypes of 54 accessions and identified six chloroplastic and seven mitochondrial gene pools. Combining the chloroplast and mitochondrial gene pools identified 14 cytotypes; Cytotype VIII, resulting from the crossing of maternal Cytotypes I and II and paternal Cytotype III ancestors, was identified in the majority of the analyzed cultivars.

In the present study, sequence data from 25 chloroplast and 12 mitochondrial genes were used to assess the phylogenetic relationships of 60 Musa accessions, including a wide range of wild and cultivated types. We aimed to elucidate the genetic diversity and phylogenetic relationships of wild and cultivated Musa species and subspecies, especially Chinese species. Furthermore, we generated additional cytoplasmic data on Musa germplasms to refine phylogenetic research, and reconstructed the paternal and maternal lineages of diploid wild species as well as those of banana cultivars.

Materials and methods

Plant materials and DNA extraction

Sixty Musa spp. samples were used in the present study (Table 1). Of these, 49 accessions were obtained from the Bioversity International Musa Germplasm Transit Centre (ITC, Leuven University, Belgium) and 11 were collected by the authors during trips throughout south China. Samples were identified and morphological characters were described using the Musa descriptors (INIBAP/CIRAD 1996). Total genomic DNA was extracted from young leaves using a modified cetyltrimethylammonium bromide (CTAB) method (Paterson et al. 1993). The quality of the extracted DNA was assessed by visualization on a 1% agarose gel and by measurement on a NanoDrop 2000 (Thermo Fisher Scientific, MA, USA). Total DNA samples were diluted to 50 ng/μL with sterile water.

Table 1 Analyzed accessions and their cytotype affiliations

PCR amplification and sequencing

Twenty-five pairs of chloroplast DNA primers and 12 pairs of mitochondrial DNA primers were selected for use in PCR amplifications (Table 2). Reactions were carried out in 50-μL mixtures containing 2 μL of 50 ng/μL DNA, 2 μL of each primer (10 μM), and 25 μL of 2 × Taq Master Mix (Vazyme Biotech Co., Ltd., China), with the final volume adjusted using double distilled water. PCR amplifications were performed in a FlexCycler (Analytik Jena, Germany) under the following reaction conditions: 94 °C for 5 min; 30 cycles of 94 °C for 30 s, 48–58.5 °C for 30 s, and 72 °C for 60 s; and a final extension at 72 °C for 7 min. PCR products were visualized on 1% agarose gels and subsequently purified and sequenced at BGI Technology Co., Ltd. (China) using the Sanger method. The primers used for sequencing were the same as those used for the PCR. Sequencher v.4.2 software (Gene Codes Corp., MI, USA) was used to assemble the sequences.
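The reaction composition and cycling program above imply a water volume, final reagent concentrations, and a total programmed run time. A minimal Python sketch of that arithmetic is given below. The function names and dictionary keys are illustrative and not part of the original protocol; "each primer" is assumed to mean one forward and one reverse primer, and ramping time between temperatures is ignored.

```python
def pcr_mix(total_ul=50.0, dna_ul=2.0, dna_ng_per_ul=50.0,
            primer_ul=2.0, primer_um=10.0, n_primers=2, master_mix_ul=25.0):
    """Return the water volume and final concentrations for one reaction."""
    # Water makes up whatever volume the other components leave free.
    water_ul = total_ul - dna_ul - n_primers * primer_ul - master_mix_ul
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {
        "water_ul": water_ul,
        "dna_ng_total": dna_ul * dna_ng_per_ul,
        # Dilution: C_final = C_stock * V_stock / V_total
        "primer_final_um": primer_um * primer_ul / total_ul,
        "master_mix_final_x": 2.0 * master_mix_ul / total_ul,
    }


def cycling_seconds(initial_s=300, cycles=30, step_s=(30, 30, 60), final_s=420):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return initial_s + cycles * sum(step_s) + final_s


mix = pcr_mix()
run_minutes = cycling_seconds() / 60
print(mix)
print(f"programmed hold time: {run_minutes:.0f} min")
```

Under these assumptions each reaction takes 19 μL of water, contains 100 ng of template and 0.4 μM of each primer, and the program holds for 72 min in total.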

Table 2 Primers used to amplify chloroplast and mitochondrial genes

Sequence alignment and phylogenetic analyses

Nucleotide sequences were aligned with MAFFT v.7 (Katoh and Standley 2013). Characteristics of genetic diversity, including conserved sites, variable sites, and parsimony-informative (Pi) sites, were computed in DnaSP v.6.12.03 (Rozas et al. 2017). The incongruence length difference (ILD) test was performed in PAUP v.4.0b (Swofford 2002) to estimate the level of potential incongruence in the data. The aligned chloroplast and mitochondrial DNA sequences were then concatenated in SequenceMatrix-Windows v.1.7.8 (Vaidya et al. 2011), and used for further phylogenetic analyses.
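To make the site categories concrete, the following Python sketch concatenates two toy per-gene alignments and counts conserved, variable, and parsimony-informative columns. It is an illustration only, not a reimplementation of DnaSP or SequenceMatrix: the sequences and taxon names are invented, gaps and ambiguous characters are simply ignored within a column (an assumption that may differ from DnaSP's handling), and a column counts as parsimony-informative when at least two bases each occur in at least two sequences.

```python
from collections import Counter


def concatenate(alignments):
    """Join per-gene alignments (dicts: taxon -> aligned sequence) end to end."""
    # Keep only taxa present in every gene so all rows have equal length.
    taxa = sorted(set.intersection(*[set(a) for a in alignments]))
    return {t: "".join(a[t] for a in alignments) for t in taxa}


def classify_sites(alignment):
    """Count conserved, variable, and parsimony-informative columns."""
    counts = {"conserved": 0, "variable": 0, "parsimony_informative": 0}
    for column in zip(*alignment.values()):
        # Assumption: gaps and ambiguity codes are ignored within a column.
        states = Counter(b for b in (c.upper() for c in column) if b in "ACGT")
        if len(states) <= 1:
            counts["conserved"] += 1
            continue
        counts["variable"] += 1
        # Informative: at least two bases, each seen in two or more sequences.
        if sum(1 for n in states.values() if n >= 2) >= 2:
            counts["parsimony_informative"] += 1
    return counts


# Toy data (hypothetical taxa and sequences).
gene1 = {"t1": "ACGT", "t2": "ACGA", "t3": "ATGA", "t4": "ATGT"}
gene2 = {"t1": "GG-A", "t2": "GGCA", "t3": "GTCA", "t4": "GGCA"}

matrix = concatenate([gene1, gene2])
stats = classify_sites(matrix)
print(stats)
```

Every parsimony-informative site is also a variable site, so the Pi count can never exceed the variable-site count.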

Phylogenetic relationships were inferred using maximum likelihood (ML) and Bayesian inference (BI) methods. For the ML analysis, IQ-TREE v.1.6 was first used to estimate the best-fit nucleotide substitution model and the corresponding parameters for the sequence matrix (Kalyaanamoorthy et al. 2017). The ML tree was then constructed in IQ-TREE v.1.6 (Nguyen et al. 2015), retaining the tree with the highest likelihood after 1000 repeated searches, and the confidence of each branch was tested with 1000 ultrafast bootstrap (BS) replicates (Hoang et al. 2018). FigTree was used to view and edit the resulting ML tree.

For the BI analysis, the best-fit model of evolution for each data set was first selected with IQ-TREE v.1.6 using the Akaike information criterion (AIC), and the data were then analyzed in MrBayes v.3.2 (Ronquist et al. 2012). The BI analysis used the Markov chain Monte Carlo algorithm, starting from a random tree, with four chains (one cold and three heated) run for 10,000,000 generations and sampled once every 1000 generations. After the chains reached stationarity, the first 25% of the samples were discarded as burn-in, and the remaining samples were used to build a consensus tree. Support for the Bayesian trees was evaluated using posterior probabilities (PP).
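The sampling scheme above determines how many trees contribute to the consensus. The short Python sketch below works through that arithmetic; the function name is illustrative, and the count ignores the initial state that some programs also record and any additional independent runs, neither of which is specified in the text.

```python
def mcmc_samples(generations=10_000_000, sample_every=1000, burnin_fraction=0.25):
    """Return (total, discarded, retained) sample counts for one MCMC run."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


total, discarded, retained = mcmc_samples()
print(f"sampled: {total}, burn-in: {discarded}, retained: {retained}")
```

With the settings reported here, about 10,000 samples are drawn per run, of which 2500 are discarded and 7500 are retained.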

Results

Phylogenetic analysis of Musa spp. based on 25 chloroplast gene sequences

The ILD test supported combining the 25 chloroplast genes into a single dataset (P = 0.01). The combined dataset covered 22,306 bp and contained 617 variable sites, of which 265 were Pi sites. The IQ-TREE analysis found that the best-fit models for the ML and BI trees were K3Pu + F + I and GTR + G + I, respectively. The phylogenetic trees constructed by both methods revealed the same topological structure (Fig. 1).

Fig. 1 Maximum likelihood (ML) tree of 60 Musa spp. accessions based on 25 chloroplast gene sequences (including chloroplast genotyping)

Three major clades were identified in the phylogenetic trees. Clade A (BS = 100, PP = 1) was formed by the M. acuminata complex, Musa laterita Cheesman, Musa yunnanensis Häkkinen & H. Wang, Musa chunii Häkkinen, M. balbisiana from the ITC, M. balbisiana × Musa textilis Née, and most banana cultivars. Clade B (BS = 81, PP = 0.94) consisted of M. itinerans, Musa nagensium Prain, most M. balbisiana, and two cultivars with a B genome. Clade C consisted of the outgroup M. beccarii, which belongs to section callimusa and has a chromosome number of 2n = 18. The chromosome number of the other test materials was 2n = 22 or 2n = 20.

Clade A contained eight subclades. Subclade A1 (BS = 100, PP = 1) consisted of M. acuminata var. chinensis, M. acuminata subsp. siamea, M. acuminata ssp529, and M. laterita. Among these, M. acuminata var. chinensis is a unique variant found in China. Subclade A2 (BS = 95, PP = 1) comprised cultivars with the A genotype, which are widely distributed, and most of the M. acuminata wild accessions, including subspecies malaccensis, burmannica, and burmannicoides; these wild accessions are mainly distributed in Malaysia. Subclade A3 (BS = 89, PP = 1) comprised most A-B genotype cultivars and a single M. acuminata wild accession, M. acuminata ssp516. Subclade A4 (BS = 54, PP = 1) comprised three wild accessions of M. acuminata (subspecies zebrina, ssp503, and ssp505), M. peekelii Lauterb., and one AA genotype cultivar. Subclade A5 (BS = 58, PP = 1) consisted of M. acuminata subsp. microcarpa, M. yunnanensis, and three AAB genotype cultivars. Subclade A6 (PP = 1) formed an independent branch containing one AA genotype cultivar and no wild types. Subclade A7 (BS = 100, PP = 1) comprised two wild accessions, M. balbisiana and M. balbisiana × M. textilis. Subclade A8 contained only wild accessions of M. chunii, which are found exclusively in Yunnan, China.

Clade B was divided into two subclades. Subclade B1 (BS = 99, PP = 1) contained M. nagensium and four M. itinerans accessions, which were collected from different populations in China. M. itinerans accessions collected from the Guangdong and Hainan populations were grouped together, while those from the Guangxi and Yunnan populations formed another group. Subclade B2 (BS = 100, PP = 1) contained four M. balbisiana accessions (ITC0080, ITC0246, ITC0271, and one collected from China) and two ABB genotype cultivars.

Phylogenetic analysis of Musa spp. based on 12 mitochondrial gene sequences

The ILD test supported combining the 12 mitochondrial genes (P = 0.01); these fragments were therefore merged into a single dataset for subsequent phylogenetic analyses. The combined dataset covered 6802 bp and contained 553 variable sites, of which 279 were Pi sites. The IQ-TREE analyses found that the best-fit model was TVM + F + I for the ML tree and GTR + G + I for the BI tree. The phylogenetic trees constructed using these two methods presented the same topological structure (Fig. 2).
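Because the two combined matrices differ greatly in length, the raw site counts are easier to compare as proportions. The Python sketch below converts the counts reported for the chloroplast and mitochondrial datasets into percentages of aligned length; the variable and function names are illustrative, and only figures stated in the text are used.

```python
# Counts reported in the Results for the two combined datasets.
datasets = {
    "chloroplast": {"length_bp": 22306, "variable": 617, "pi": 265},
    "mitochondrial": {"length_bp": 6802, "variable": 553, "pi": 279},
}


def site_percentages(d):
    """Return (variable %, parsimony-informative %) of the aligned length."""
    return (100.0 * d["variable"] / d["length_bp"],
            100.0 * d["pi"] / d["length_bp"])


pct = {name: site_percentages(d) for name, d in datasets.items()}
for name, (var_pct, pi_pct) in pct.items():
    print(f"{name}: {var_pct:.2f}% variable, {pi_pct:.2f}% Pi")
```

On these figures the mitochondrial matrix shows the higher proportion of variable sites (about 8.1% versus 2.8%), despite being less than a third of the length of the chloroplast matrix.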

Fig. 2 Maximum likelihood (ML) tree of 60 Musa spp. accessions based on 12 mitochondrial gene sequences (including mitochondrial genotyping)

Four major clades were identified in the phylogenetic trees. Clade A (BS = 100, PP = 1) contained the M. acuminata complex, M. laterita, M. chunii, and most of the banana cultivars in our sample set. Clade B (BS = 100, PP = 1) included all tested M. balbisiana, M. nagensium, the natural wild hybrid M. balbisiana × M. textilis, and three A-B genotype cultivated bananas. Clade C (BS = 100, PP = 0.99) contained samples from different populations of M. itinerans and M. yunnanensis, and the AAB genotype cultivar ‘Chuoi mat’. Clade D consisted of the outgroup M. beccarii.

Clade A consisted of five subclades. Subclade A1 contained M. acuminata var. chinensis, M. acuminata subspecies burmannicoides and malaccensis, eight unclassified wild M. acuminata accessions, M. laterita, two AA genotype cultivars (‘Pisang lilin’ and ‘Pisang berlin’), one AB genotype cultivar (‘Safet velchi’), two AAB genotype cultivars (‘Kluai roi wi’ and ‘Chuoi mat’), and two AAAB genotype cultivars (‘Tetraploide EMBRAPA 401’ and ‘Pc12-05’). Subclade A2 included M. acuminata subspecies burmannica and zebrina, two unclassified wild M. acuminata accessions, and M. peekelii, with no related cultivars in our sample set. Subclade A3 contained M. acuminata subsp. microcarpa, M. acuminata ssp528, and 14 cultivars, including five AA genotypes (‘Tjau lagada’, ‘Gu nin chio’, ‘Guyod’, ‘Amas’, and ‘Khai’), one AB genotype (‘Datil’), two AAA genotypes (‘Highgate’ and ‘Williams’), four AAB genotypes (‘Prata’, ‘Pisang raja bulu’, ‘Mdzodji’, and ‘Vudi wai wai’), and two AAAB genotypes (‘Fhia-01’ and ‘Fhia-18’). Subclades A4 and A5 each contained one sample: the cultivar ‘Kluai lep mu nang’ (AA genotype) and M. chunii, respectively.

Clade B consisted of two subclades, with M. nagensium forming an independent subclade (BS = 100, PP = 1). M. balbisiana, the natural wild hybrid M. balbisiana × M. textilis, and three A-B genotype cultivated bananas were grouped in another subclade (BS = 100, PP = 1), which was further divided into three branches (BS = 100, PP = 1). M. balbisiana-ITC0246 and M. balbisiana-ITC0080 formed independent branches, while the other M. balbisiana samples, M. balbisiana × M. textilis, and the cultivars ‘Fenjiao’ (ABB), ‘Maduranga’ (ABB), and ‘Namwa khom’ (ABB) clustered together.

Clade C consisted of two subclades: subclade C1 contained four samples of M. itinerans obtained from different geographical sources in China, and one AAB genotype cultivar, ‘Chuoi mat’. M. itinerans from the Guangdong and Hainan populations grouped together, and those from the Guangxi and Yunnan populations also grouped together. In contrast, subclade C2 contained only M. yunnanensis.

Origin and evolution of cultivated bananas based on cytoplasmic genes

To elucidate the gene pools of the ancestors of cultivated bananas, we analyzed 49 wild and cultivated types of M. acuminata and M. balbisiana using chloroplast and mitochondrial genes inherited from single parents. Based on 25 chloroplast genes, eight chloroplast gene pools were identified: five from M. acuminata and three from M. balbisiana (Table 1). The first gene pool (Ca1) contained three M. acuminata wild types (var. chinensis, subspecies siamea, and ssp529), with no related cultivars in our sample set. The largest gene pool (Ca2) was characterized by the diploid wild type M. acuminata subspecies burmannica, malaccensis, burmannicoides, ssp517, ssp513, ssp528, ssp521, ssp523, and ssp527, along with six AA cultivars (‘Pisang lilin’, ‘Gu nin chio’, ‘Khai’, ‘Kluai lep mu nang’, ‘Amas’, and ‘Pisang berlin’), one AB diploid hybrid cultivar (‘Datil’), and two triploid AAA cultivars (‘Highgate’ and ‘Williams’). This gene pool contained the most abundant wild type M. acuminata accessions, most of the AA cultivars, and all AAA cultivars. One unclassified M. acuminata wild type (ssp516) and 10 cultivars formed the third gene pool (Ca3). The cultivars in Ca3 included one AB diploid hybrid (‘Safet velchi’), five triploids (AAB: ‘Prata’, ‘Kluai roi wi’, ‘Chuoi mat’, and ‘Mdzodji’; ABB: ‘Maduranga’), and all four tetraploid AAAB cultivars (‘Fhia-01’, ‘Fhia-18’, ‘Tetraploide EMBRAPA 401’, and ‘Pc12-05’). All cultivars in this gene pool contained genome B. Three M. acuminata wild type subspecies (zebrina, ssp503, and ssp505) and one AA diploid cultivar (‘Guyod’) formed the fourth gene pool (Ca4). The Ca5 gene pool comprised two M. acuminata wild type subspecies (microcarpa and ssp507), one AA cultivar (‘Tjau lagada’), and three AAB triploids (‘Vudi wai’, ‘Pisang raja bulu’, and ‘Chuoi mat’).

Based on 25 chloroplast genes, three subgroups of M. balbisiana were identified. Three diploid M. balbisiana wild types (one collected from China and subspecies ssp501 and ssp510) formed the largest gene pool (Cb1) along with two ABB triploid cultivars (‘Fenjiao’ and ‘Namwa Khom’), while subspecies ssp504 and ssp513 were found in the Cb2 and Cb3 gene pools, respectively.

Similarly, based on 12 mitochondrial genes, six mitochondrial gene pools were identified among the 49 analyzed accessions (Table 1). M. acuminata var. chinensis, M. acuminata subspecies siamea, malaccensis, and burmannicoides, and eight unclassified M. acuminata wild accessions formed the largest gene pool (Ma1) along with two diploid AA cultivars (‘Pisang lilin’ and ‘Pisang berlin’), one diploid AB cultivar (‘Safet velchi’), two triploid AAB cultivars (‘Kluai roi wi’ and ‘Chuoi mat’), and two tetraploid AAAB cultivars (‘Tetraploide EMBRAPA 401’ and ‘Pc12-05’). Three diploid wild type M. acuminata subspecies (burmannica, zebrina, and ssp516) formed the second gene pool (Ma2) with no related cultivars. Gene pool Ma3 consisted of two diploid M. acuminata wild subspecies (microcarpa and ssp528), six diploid cultivars (AA: ‘Tjau lagada’, ‘Gu nin chio’, ‘Khai’, ‘Amas’, and ‘Guyod’; AB: ‘Datil’), six triploid cultivars (AAA: ‘Highgate’ and ‘Williams’; AAB: ‘Prata’, ‘Mdzodji’, ‘Vudi wai’, and ‘Pisang raja bulu’), and two tetraploid AAAB cultivars (‘Fhia-01’ and ‘Fhia-18’). M. balbisiana likewise comprised three gene pools. The Mb1 and Mb3 gene pools were represented by one subspecies each (ssp501, Mb1; ssp510, Mb3), while Mb2 consisted of six samples, including three diploid M. balbisiana wild types (one collected from China, ssp504, and ssp513), and three ABB triploid cultivars (‘Fenjiao’, ‘Namwa Khom’, and ‘Maduranga’).

Eighteen cytotypes (a combination of chloroplast and mitochondrial gene pools) were identified among the analyzed samples (Table 1). The analyzed wild type M. acuminata accessions yielded nine cytotypes (I, II, III, IV, VI, VIII, IX, XI, and XII), while five cytotypes (XIV, XV, XVI, XVII, and XVIII) were found among the M. balbisiana wild types. Seven different cytotypes were found among 23 cultivars; three of these cytotypes resembled those found in the wild types (cytotypes II, IV, and XV), while the remaining four represented new combinations.
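A cytotype, as used here, is the pairing of one chloroplast gene pool with one mitochondrial gene pool. The Python sketch below groups a subset of cultivars by that pair, using the pool memberships listed in the preceding paragraphs. It is illustrative only: the function and variable names are hypothetical, only eight cultivars are included, and the Roman-numeral cytotype labels of Table 1 are deliberately not reproduced.

```python
from collections import defaultdict

# (chloroplast pool, mitochondrial pool) per cultivar, taken from the text above.
pools = {
    "Pisang lilin": ("Ca2", "Ma1"),
    "Pisang berlin": ("Ca2", "Ma1"),
    "Khai": ("Ca2", "Ma3"),
    "Amas": ("Ca2", "Ma3"),
    "Highgate": ("Ca2", "Ma3"),
    "Guyod": ("Ca4", "Ma3"),
    "Safet velchi": ("Ca3", "Ma1"),
    "Prata": ("Ca3", "Ma3"),
}


def group_by_cytotype(pools):
    """Group accessions that share the same chloroplast/mitochondrial pair."""
    groups = defaultdict(list)
    for accession, pair in pools.items():
        groups[pair].append(accession)
    return dict(groups)


cytotypes = group_by_cytotype(pools)
for (cp, mt), members in sorted(cytotypes.items()):
    print(f"{cp} + {mt}: {', '.join(members)}")
```

Each distinct pair corresponds to one cytotype, so cultivars that share a maternal pool can still fall into different cytotypes when their paternal pools differ.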

Three different cytotypes (II, IV, and X) were identified among the six diploid AA genotypes. The chloroplast genomes of ‘Pisang lilin’ and ‘Pisang berlin’ belonged to the Ca2 gene pool, which contained the wild type M. acuminata subspecies malaccensis, burmannica, and burmannicoides. The mitochondrial genomes of these cultivars originated from the Ma1 gene pool, which contained the wild type subspecies malaccensis, siamea, and burmannicoides, and the var. chinensis, all of cytotype II. The same chloroplast type was identified in the diploid AA cultivars ‘Gu nin chio’, ‘Khai’, and ‘Amas’; however, their mitochondrial genomes represented the Ma3 gene pool, which contained M. acuminata subsp. microcarpa. The remaining AA hybrid cultivar ‘Guyod’ had Ca4 chloroplast and Ma3 mitochondrial genomes, representing the gene pools containing M. acuminata subsp. zebrina (chloroplast) and M. acuminata subsp. microcarpa (mitochondrial). Two other diploid AA cultivars, ‘Kluai lep mu nang’ and ‘Tjau lagada’, could not be fully characterized. The chloroplast gene pool of ‘Kluai lep mu nang’ was Ca2, which contained M. acuminata subspecies malaccensis, burmannica, and burmannicoides; however, the identity of its mitochondrial genome remains unknown. Conversely, the mitochondrial gene pool of ‘Tjau lagada’ was Ma3, which contained M. acuminata subsp. microcarpa, while the identity of its chloroplast genome remains unknown. Therefore, we were unable to classify these two cultivars into the abovementioned cytotypes.

Similar to the diploid AA cultivars, two triploid AAA genotypes were of cytotype IV (‘Highgate’ and ‘Williams’). Therefore, cytotype IV was found most frequently among the intraspecific M. acuminata hybrids. Among the interspecific hybrids analyzed in the present study, the two diploid AB cultivars were of two cytotypes, IV and V.

The six triploid AAB cultivars were of three cytotypes (V, VII, and XI). However, the cytotype of ‘Chuoi mat’ with an AAB genome was identified as cytotype XIX, as it combined the Ca5 chloroplast genome and the M. itinerans mitochondrial genome. The three ABB cooking bananas were of cytotypes XIII and XV.

Similarly, two of the four tetraploid AAAB cultivars (‘Tetraploide EMBRAPA 401’ and ‘Pc12-05’) were of cytotype V, while the remaining two tetraploid cultivars (‘Fhia-01’ and ‘Fhia-18’) were of cytotype VII.

Discussion

In Musa spp., inheritance of the chloroplast genome is strongly biased toward the maternal lineage, while the mitochondrial genome is paternally inherited (Fauré et al. 1994). Consequently, the organellar genomes enable the maternal as well as the paternal lineages to be followed through the use of chloroplast and mitochondrial markers, respectively. In the present study, ML and BI trees of 60 Musa spp. accessions were constructed based on 25 chloroplast and 12 mitochondrial gene sequences. The topologies identified using both approaches were consistent for the two organellar genomes.

Maternal phylogenetic analysis of Musa spp. based on chloroplast genes

Based on 25 chloroplast gene sequences, the M. acuminata wild types and cultivars grouped together and were distinguished from M. balbisiana wild types and cultivars. Gawel and Jarret (1991a) reported similar findings using different chloroplast probes and Southern blot analysis. However, in the present study, differences in the maternal origin of M. balbisiana were identified. Most M. balbisiana samples grouped with M. itinerans and M. nagensium, while M. balbisiana (ITC0545) and M. balbisiana × M. textilis clustered with M. acuminata. Although relatively independent from M. acuminata, both M. balbisiana (ITC0545) and M. balbisiana × M. textilis were expected to have a common maternal origin. Therefore, we hypothesized that some of the M. balbisiana germplasms might have come into contact with M. acuminata germplasms during their evolution. In addition, M. balbisiana (ITC0545) and M. balbisiana × M. textilis clustered together, indicating that M. balbisiana and M. textilis have a close genetic relationship in terms of their maternal origin (Gawel and Jarret 1991b).

Most M. balbisiana samples were closely related to M. itinerans, indicating that they may have shared a common maternal ancestor. In China, the wild germplasm of M. balbisiana is distributed only in Yunnan and does not form large populations. In contrast, M. itinerans is distributed throughout Hainan, Guangdong, Guangxi, and Yunnan (Häkkinen 2008), and can form large populations in all of these provinces. Here, we showed that M. itinerans from the Guangdong and Hainan populations clustered together, while those from the Guangxi and Yunnan populations formed a different group, indicating that M. itinerans in southern China has distinct chloroplast genomes. Our findings contrast with those of Ge et al. (2005), who analyzed the populations of M. balbisiana distributed in China using chloroplast PCR–RFLP markers and identified two major clades corresponding to two geographical regions; thus, the wild Musa spp. germplasms might have been incorrectly identified by Ge et al. (2005).

Except for M. chunii, the wild germplasms of other Eumusa groups in the tested materials were interspersed between M. acuminata complexes. Among them, M. laterita, M. acuminata subsp. siamea, and M. acuminata var. chinensis have the same maternal origin. M. yunnanensis was first identified by Häkkinen and Hong (2007). Based on its maternal evolution, this species has the same origin as M. acuminata subsp. microcarpa. However, Feng et al. (2016) analyzed the nuclear genome of this species using SSR markers and revealed a closer relationship with M. balbisiana. M. chunii was first identified by Häkkinen (2009) in Yunnan, China. The maternal origin of this species is unique; it did not cluster with any of the tested materials used in the present study.

Based on the 25 chloroplast gene sequences, we conclude that the maternal evolution of Musa spp. followed two main routes: via M. acuminata and via M. balbisiana.

Paternal phylogenetic analysis of Musa spp. based on mitochondrial genes

Based on the 12 mitochondrial gene sequences, the test materials could be divided into two independent branches of M. acuminata and M. balbisiana. Therefore, we believe that the paternal evolution of Musa spp. also followed two evolutionary routes: via M. acuminata and via M. balbisiana. However, M. itinerans and M. balbisiana followed the same maternal evolutionary path, while M. itinerans was closer to M. acuminata in terms of paternal evolution. M. laterita and M. acuminata var. chinensis grouped together in both the paternal and maternal trees, indicating the same parental origin. Feng et al. (2016) used SSR markers for phylogenetic research on Musa spp., and also found that M. laterita was most closely related to M. acuminata var. chinensis. M. peekelii clustered with M. acuminata subsp. zebrina in both the paternal and maternal trees, again indicating the same parental origin. M. chunii represents an independent branch in mitochondrial evolution. This species is unique in terms of both its maternal and paternal origin.

Few studies have used the mitochondrial genome to investigate the phylogenetic evolution of Musa spp. Most previous studies have relied on molecular marker technologies, such as PCR–RFLP (Nwakanma et al. 2003; Boonruangrod et al. 2008) and RFLP (Carreel et al. 2002). Additionally, the wild germplasm resources of Musa spp. are very limited, and there are few references to the wild germplasms distributed in China. Therefore, we believe that our study is the first to report the use of multiple mitochondrial gene sequences for phylogenetic analysis, providing insight into the paternal evolution of the genus Musa.

Cytoplasm gene pools of M. acuminata ancestors

In the present study, 25 chloroplast gene sequences resulted in the identification of five chloroplast gene pools in the M. acuminata complex, namely, chinensis/siamea, burmannica/malaccensis/burmannicoides, and M. acuminata subspecies ssp516, zebrina, and microcarpa. Carreel et al. (2002) previously reported five chloroplast patterns for M. acuminata: zebrina, malaccensis, siamea, banksii, and errans/burmannica/burmannicoides/siamea/malaccensis/microcarpa/truncata. Conversely, three chloroplast gene pools were identified by Boonruangrod et al. (2008): errans/banksii, microcarpa, and burmannicoides/siamea/burmannica/zebrina/malaccensis. The results of the present study support the findings of Carreel et al. (2002), which are, in part, consistent with those of Boonruangrod et al. (2008), and consider burmannica/malaccensis/burmannicoides as representing the same gene pool, while zebrina and siamea have independent chloroplast gene pools. Moreover, consistent with Boonruangrod et al. (2008), we also found that microcarpa belongs to an independent chloroplast gene pool. The limited disparity between our findings and those of Carreel et al. (2002) and Boonruangrod et al. (2008) could be due to the large number of M. acuminata subspecies used in those earlier studies, which might have influenced their separation. However, we believe that this ambiguity was addressed here through the use of chloroplast gene sequences totaling 22,306 bp, generating 352 Pi loci and 1663 insertion/deletion loci, which is sufficiently large to make the distinction.

Based on 12 mitochondrial gene sequences, three mitochondrial gene pools were identified: chinensis/siamea/malaccensis/burmannicoides, burmannica/zebrina, and microcarpa. The results are, in part, supported by the findings of previous studies (Boonruangrod et al. 2008; Carreel et al. 2002).

Cytoplasm gene pools of M. balbisiana ancestors

Based on 25 chloroplast gene sequences, three chloroplast gene pools were identified in M. balbisiana: balbisiana (ITC0080)/Cameroun (ITC0246)/China, Eti Kehel (ITC0271), and balbisiana (ITC0545). Carreel et al. (2002) and Boonruangrod et al. (2008) reported two maternal gene pools for M. balbisiana. However, direct comparisons are not possible because only one sample (Cameroun [ITC0246]) was common to the present and previous studies. In those studies, Cameroun formed an independent gene pool (Carreel et al. 2002) or clustered with Singapurii and Butuhan (Boonruangrod et al. 2008).

Based on the 12 mitochondrial gene sequences, M. balbisiana could also be divided into three mitochondrial gene pools: balbisiana (ITC0080), Eti Kehel/balbisiana (ITC0545)/China, and Cameroun. Boonruangrod et al. (2008) also divided M. balbisiana into three mitochondrial gene pools. The material shared by the two studies (Cameroun) formed an independent gene pool. However, Carreel et al. (2002) reported no differences in the mitochondrial genome of M. balbisiana.

The present results support the view that M. balbisiana originated via different evolutionary routes; however, the intraspecific classification of M. balbisiana warrants further discussion.

Origin and evolution of banana cultivars based on organelle DNA sequences

Banana cultivars originated from intraspecific crosses of M. acuminata or interspecific crosses between M. acuminata and M. balbisiana. M. acuminata provided the A genome and M. balbisiana provided the B genome to form a series of banana cultivars of different genotypes: AA, AB, AAA, AAB, ABB, and AAAB. Because most banana cultivars are parthenocarpic and sterile, and parthenogenetic genomes are susceptible to mutations, it is possible that the ancient organelle genome was trapped in existing cultivars, remaining more or less unchanged since the initial cultivar formation. Considering that present-day wild types are the offspring of ancient species, comparing these to cultivars may reveal gene pools of common origin. In the present study, 25 chloroplast genes and 12 mitochondrial genes were used to study 18 M. acuminata wild types, 5 M. balbisiana wild types, and 26 cultivars with different genotypes.

The maternal origin of the 26 cultivars with different genotypes was found to follow three patterns. First, the maternal origin of most AA/AAA genotype cultivars was the Ca2 gene pool, which was represented by the wild type M. acuminata subspecies malaccensis, burmannica, and burmannicoides. This finding was similar to those of Carreel et al. (2002) and Boonruangrod et al. (2008). Second, the maternal origin of the A-B genotype cultivars (for example, AB, AAB, AAAB) was mostly the Ca3 (M. acuminata ssp516) and Ca5 (M. acuminata subsp. microcarpa) gene pools. Third, when the genome of cultivated bananas contained only one B genome (for example, AB or AAB genotypes), the female parent tended to be M. acuminata. Conversely, when the genome of cultivated bananas contained two B genomes (for example, the ABB genotype), most of the maternal sources were derived from M. balbisiana, consistent with previous studies (Carreel et al. 2002; Boonruangrod et al. 2008).

Three paternal origins were identified for the 26 cultivated bananas tested in the present study: Ma1 (chinensis/siamea/malaccensis/burmannicoides), Ma3 (microcarpa), and Mb2 (Eti Kehel/balbisiana [ITC0545]/China), with Ma3 being the most common paternal gene pool. However, Boonruangrod et al. (2008) and Carreel et al. (2002) reported that the most common paternal gene pool of the A genome was errans/banksii. Additionally, in those studies all tested ABB cultivated bananas belonged to the Mb1 gene pool (‘Pisang klutuk wulung’/‘Pisang batu’/‘Honduras’/‘Lal vechi’/‘Tani’). Although the results cannot be directly compared because different materials were used, the common material, M. balbisiana (Cameroun), was not involved in the development of banana cultivars in either the present or the previous studies. In addition, in the present study, the AAB genotype ‘Chuoi mat’ grouped with M. itinerans rather than with M. balbisiana or M. acuminata. Therefore, further exploration is needed to determine whether M. itinerans also acted as a parent during the hybridization of banana cultivars.