Introduction

The genus Salvia L. (tribe Mentheae) is the largest genus in the Lamiaceae, comprising nearly 1,000 species. Salvia has radiated extensively in three regions of the world: Central and South America (500 spp.), West Asia (200 spp.), and East Asia (100 spp.) (Alziar 1988–1993). It is distinguished from the other genera in the Lamiaceae by the presence of two aborted posterior stamens and a markedly elongated connective tissue separating the thecae of the two expressed stamens, which may act as an effective tool for pollination (e.g., Grant and Grant 1964; Faegri and Van der Pijl 1979; Huck 1992; Claßen-Bockhoff et al. 2003; Reith et al. 2007). The presence of such an unusual stamen structure led taxonomists to believe, until recently, that Salvia was monophyletic. However, recent molecular phylogenetic studies have shown that the genus is polyphyletic, with three major lineages and five other genera intercalated within it (Walker et al. 2004; Walker and Sytsma 2007). Salvia clade I sensu Walker and Sytsma (2007), which is sister to Rosmarinus and Perovskia, contains European, Central African, Southern African and West Asian species, while clade II contains species from the New World. Clade III comprises West Asian, Central Asian, East Asian, Mediterranean, and African Salvia species.

Since Thunberg’s (1784) first account of Japanese Salvia (S. japonica Thunb.), ten species, eight varieties, and one putative hybrid have been described; they are classified into the subgenera (subg.) Salvia, Allagospadonopsis, and Sclarea (Murata and Yamazaki 1993; Inoue 1997; Hihara et al. 2001). Most of the taxa are endemic to Japan, except for S. japonica (which also occurs in Korea, Taiwan and Central to Southern China) and S. plebeia R. Br. (widely distributed in temperate and tropical Asia, and Australia) (Murata and Yamazaki 1993). Additionally, one of the varieties of S. nipponica Miq., var. formosana (Hayata) Kudo, is known only from Taiwan (Li and Hedge 1994; Huang and Wu 1998). We therefore hypothesize that almost all Japanese taxa, including the Japanese endemics, speciated in Japan.

Inoue and Ozawa (1998) conducted morphological and allozymic analyses among species of subg. Salvia, and argued that S. glabrescens (Franch. et Sav.) Makino and S. nipponica var. nipponica should be considered conspecific, because genetic similarity between them was so high (I = 0.84–0.99). They reported genetic differentiation between S. nipponica populations in eastern and western Japan. They also suggested that S. nipponica var. kisoensis K. Imai should be elevated to species rank, because it was found to be genetically differentiated from S. nipponica var. nipponica and S. glabrescens. However, they did not undertake taxonomic or nomenclatural revisions of Japanese Salvia. Sudarmono and Okada (2007) conducted cpDNA and nrDNA phylogenetic analyses of selected Japanese and Taiwanese Salvia. They considered the speciation process in S. isensis Nakai, taking into account a contradiction in the phylogenetic positions of the species studied. Sudarmono and Okada (2008) analyzed additional species and demonstrated that species of subg. Allagospadonopsis formed a well-supported clade, that two species of subg. Salvia (S. glabrescens and S. nipponica) belonged together in another clade, and that S. plebeia, the sole Japanese species in subg. Sclarea, was sister to the clade of subg. Salvia. Although their results suggested the possibility of polyphyly among S. lutescens (Koidz.) Koidz., S. nipponica and S. glabrescens, their discussion was limited to examining the genetic differentiation among populations of S. japonica and related species. Two Japanese Salvia (S. koyamae and S. omerocalyx Hayata) were not included in Sudarmono and Okada’s (2008) analyses.

As part of our ongoing investigations into speciation processes of endemic Japanese Salvia, we conducted molecular phylogenetic analyses of all taxa of Japanese Salvia above variety level not considered by Sudarmono and Okada (2008), i.e., varieties of S. glabrescens (var. glabrescens and var. purpreomaculata (Makino) K. Inoue ex. T. Shimizu), S. koyamae Makino, S. lutescens var. stolonifera (G. Nakai), varieties of S. nipponica (var. kisoensis, var. trisecta (Matsum. ex Kudo) Honda), S. omerocalyx (var. omerocalyx and var. prostrata Satake), S. pygmaea Matsum var. simplicior, and S. sakuensis Naruh. et Hihara.

Materials and methods

DNA extraction, PCR, and DNA sequencing

Total DNA was isolated from 0.7-1.5 g of fresh or silica gel-dried leaves, using a modified version of the 2× cetyltrimethylammonium bromide (CTAB) extraction protocol of Doyle and Doyle (1987). DNA sequences were amplified with rbcL 1-1 as the forward primer and rbcL NN3-2 as the reverse primer for rbcL (Hasebe et al. 1994), FRF as the forward primer and 5FR as the reverse primer for trnL-F (Sudarmono and Okada 2007), and ITS5 as the forward primer and ITS4 as the reverse primer for nrDNA ITS (White et al. 1990). The protocol and conditions of the polymerase chain reaction (PCR), purification, and cycle sequencing followed Sudarmono and Okada (2007). To sequence these three regions amplified by PCR, we used an additional pair of internal primers, i.e., 724F, 744R for rbcL (Sudarmono 2007), 3RF and 3FR for trnL-F (Sudarmono and Okada 2007), and ITS2 and ITS3 (White et al. 1990) for the nrDNA region, including the ITS1-5.8S rDNA-ITS2 region (hereafter, ITS).

Sequence alignment and phylogenetic analysis

Novel sequences were obtained for 18 individuals of 12 taxa for each of the 3 regions. Other sequences were obtained from GenBank. Raw sequences were assembled and edited using the BioEdit software (ver. 5.0.9; Hall 1999). DNA sequences were aligned using the CLUSTALW 1.83 software package with default settings (Thompson et al. 1994). Alignments of rbcL and the intergenic spacer region of trnL-F of cpDNA were combined. Gaps were treated as missing data.

We used 32 individuals of 10 species, 13 varieties, 4 formas, and 1 hybrid (S. × sakuensis) from Japan, and other selected species belonging to clade I (S. roemeriana Scheele, S. sclarea L., and S. texana Torr.), clade II (S. cedrosensis Greene, S. chionopeplica Epling, S. clevelandii (Gray) Greene, and S. farinacea Benth.), and clade III (S. digitaloides Diels, S. flava Forrest ex Diels, S. glutinosa L., S. hians Royle ex Benth., S. miltiorrhiza Bunge, S. przewalskii Maxim., S. trijuga Diels, and S. yunnanensis C. H. Wright) sensu Walker and Sytsma (2007). Horminum pyrenaicum L. and Melissa officinalis L. were used as outgroups. Materials, accession numbers for the sequences, vouchers, and literature are presented in ESM 1. Three datasets were constructed: (1) cpDNA (=rbcL+trnL-F) contained 47 individuals from 36 taxa, (2) ITS (hereafter nrDNA) contained 53 individuals from 42 taxa (several not included in cpDNA and cpDNA+nrDNA datasets), and (3) cpDNA+nrDNA contained 47 individuals from 36 taxa.

We analyzed these datasets using three methods. Maximum Parsimony (MP) analysis was performed with the PAUP* 4.0b10 software (Swofford 2003). Heuristic searches were conducted with RANDOM addition, tree-bisection-reconnection (TBR) branch swapping, and MULPARS options. Support for branches was estimated using bootstrap analysis with 1,000 replications (Felsenstein 1985), through a heuristic search using RANDOM addition and TBR branch swapping. Maximum likelihood (ML) was also conducted with the PAUP* 4.0b10 software (Swofford 2003). We conducted a hierarchical likelihood ratio test using MrModeltest software (ver. 2.3; Nylander 2004) to determine the best-fit model of sequence evolution in the ML analysis. The GTR+G (for cpDNA and nrDNA datasets) and GTR+G+I (for the cpDNA+nrDNA datasets) models were chosen by the analysis. Heuristic searches were used in the analyses to find ML trees with RANDOM sequence addition and TBR branch swapping; we saved all of the best trees at each step (Multrees). Bootstrap analysis under the ML criterion was conducted using “fast” stepwise addition searches with 200 replicates. Additionally, a Bayesian analysis was conducted using the MrBayes software (ver. 3.1.2; Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003). The best fitting substitution model (GTR+G for the cpDNA and nrDNA datasets, and GTR+I+G for the cpDNA+nrDNA dataset) for Bayesian analysis was selected using a series of hierarchical likelihood ratio tests implemented in the MrModeltest software (ver. 2.3). We performed the analysis using the selected model with two simultaneous runs of two million generations with four chains, sampling every 100 generations. Each analysis reached stationarity (the average standard deviation of split frequencies between runs ≤0.01) well before the end of the run. Burn-in (=5,000) trees were discarded, and the remaining trees and their parameters were saved. A 50% majority rule consensus tree was constructed. 
The results of the Bayesian analysis are reported as the posterior probabilities (PP; Huelsenbeck and Ronquist 2001), which are equal to the percentage of trees sampled when a given clade was resolved. Only PP scores in excess of 50% are shown.
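
The sampling arithmetic and the definition of PP given above can be illustrated with a minimal Python sketch. It assumes that the burn-in of 5,000 counts sampled trees discarded from each run (the text does not say whether the figure is per run or in total) and that the starting state of each chain is also sampled. The function names and the toy trees are hypothetical and are not part of MrBayes.

```python
from typing import FrozenSet, List, Set

Clade = FrozenSet[str]


def retained_samples(generations: int, sample_freq: int, burnin: int, runs: int) -> int:
    """Number of post-burn-in trees pooled across independent runs.

    Assumption: generation 0 is sampled, and `burnin` is the number of
    sampled trees discarded from each run.
    """
    per_run = generations // sample_freq + 1
    return runs * (per_run - burnin)


def posterior_probability(clade: Clade, trees: List[Set[Clade]]) -> float:
    """Fraction of sampled trees in which `clade` is resolved."""
    if not trees:
        raise ValueError("no trees sampled")
    return sum(clade in tree for tree in trees) / len(trees)


# Settings reported in the text: two runs of two million generations,
# sampled every 100 generations, with a burn-in of 5,000 trees.
n_trees = retained_samples(2_000_000, 100, 5_000, 2)

# Toy posterior sample: each tree is the set of clades it contains
# (hypothetical taxa A-D, not taxa from this study).
toy_trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "D"})},
]
pp_ab = posterior_probability(frozenset({"A", "B"}), toy_trees)
```

In the toy sample, the clade (A, B) is present in three of the four trees, so it would appear on a 50% majority rule consensus tree.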

We assessed the degree of phylogenetic incongruence between the cpDNA (=rbcL+trnL-F) and nrDNA (=ITS) datasets of 47 individuals using the incongruence-length difference (ILD) test (Mickevich and Farris 1981; Farris et al. 1994) as implemented in the PAUP* 4.0b10 software, with TBR branch swapping and all of the most parsimonious trees saved.

Results

The features of alignments in the combined cpDNA (rbcL+trnL-F), nrDNA (ITS), and cpDNA+nrDNA datasets are shown in ESM 2. The G+C contents of ITS regions varied from 59.9% (S. roemeriana) to 68.7% (S. chionopeplica).
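
As an illustration of how a G+C content such as those above can be computed, the following Python sketch counts G and C among unambiguous bases only. The text does not state how gaps and ambiguity codes were handled, so excluding them is an assumption, and the example sequence is invented rather than taken from the study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T).

    Gaps and ambiguity codes are ignored (an assumption of this sketch).
    """
    s = seq.upper()
    counts = {base: s.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Invented example: 6 G/C out of 8 unambiguous bases.
example_gc = gc_content("GCGC-ATNGC")
```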

The homogeneity test (ILD test) gave a P value < 0.01 for the cpDNA versus ITS (hereafter, nrDNA) partition compared with random partitions, indicating significant incongruence between the two datasets. However, variable evolutionary rates and heterogeneous substitution rates appear to affect the ILD test results, increasing the probability of type I errors (the error of incorrectly rejecting the correct hypothesis of congruence) (Baker and Lutzoni 2002; Darlu and Lecointre 2002). This may also be the case for our datasets, since the substitution rates of the cpDNA and nrDNA regions studied were quite different (9.9 vs. 40.8%, respectively; ESM 2); therefore, we decided to combine the cpDNA and nrDNA datasets.

The cpDNA data set of 47 individuals from 36 taxa contained 2,139 characters, and 99 of these were parsimony-informative. Parsimony analysis produced eight most parsimonious trees of 265 steps, a consistency index (CI) of 0.834 and a retention index (RI) of 0.901. Likelihood analysis resulted in a ML tree with –ln L = 4,863.258. The MP strict consensus, ML, and Bayesian trees had the same topology; the MP tree is shown with bootstrap and PP support in Fig. 1. The species of clade I sensu Walker and Sytsma (2007) were sister to all other Salvia, and the species of clade II were sister to the species of clade III plus Japanese and Taiwanese Salvia. There were three subclades within the clade: 1. S. plebeia, 2. species of Japanese subg. Salvia (S. glabrescens, S. koyamae, S. nipponica, S. sakuensis) plus S. glutinosa (species of clade III), and 3. Japanese and Taiwanese species of subg. Allagospadonopsis plus S. hians (species of clade III). The rbcL and trnL-F sequences in S. koyamae and S. nipponica var. kisoensis were identical, and those of all taxa included in this subclade were also highly similar (99.6–99.9% identical) to one another, except for S. nipponica var. nipponica from Kumamoto and S. glutinosa. The species in subg. Allagospadonopsis formed another well-supported subclade, and Taiwanese (=S. arisanensis Hayata and S. hayatana Makino ex Hayata) and Ryukyu (=S. pygmaea var. pygmaea) Salvia were sister to other species. Individuals of S. omerocalyx and S. lutescens and its varieties were found in two different subclades: two varieties of S. omerocalyx (var. omerocalyx and var. prostrata) formed a subclade with the S. lutescens group (except for S. lutescens var. crenata), S. isensis and S. ranzaniana Makino; and a subclade containing S. japonica, an individual S. omerocalyx plant (specimen from Takeno, Toyooka), S. lutescens var. crenata (Makino) Murata, and one Ryukyu taxon, S. pygmaea var. simplicior. 
Salvia hians was sister to the Allagospadonopsis clade.
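
For readers unfamiliar with the tree statistics quoted above, the standard definitions of the consistency index (CI = m/s) and the retention index (RI = (g − s)/(g − m)) can be sketched in Python. Here m is the minimum conceivable number of steps summed over characters, s is the observed tree length, and g is the maximum conceivable number of steps. The symbols, function names, and toy numbers are illustrative assumptions, and the values m and g for the present datasets are not given in the text.

```python
def consistency_index(min_steps: int, observed_steps: int) -> float:
    """CI = m / s: 1.0 means no homoplasy on the tree."""
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    return min_steps / observed_steps


def retention_index(min_steps: int, observed_steps: int, max_steps: int) -> float:
    """RI = (g - s) / (g - m): share of potential synapomorphy retained."""
    if max_steps == min_steps:
        raise ValueError("RI is undefined when max_steps equals min_steps")
    return (max_steps - observed_steps) / (max_steps - min_steps)


# Toy numbers (not from this study): m = 10, s = 12, g = 20.
toy_ci = consistency_index(10, 12)
toy_ri = retention_index(10, 12, 20)
```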

Fig. 1

A strict consensus tree of eight most parsimonious trees derived from the cpDNA (rbcL + trnL-F) dataset, CI = 0.834, RI = 0.901. MP/ML bootstrap support is shown above branches, and Bayesian PP numbers are shown below. Asterisks indicate ≤50% support in the corresponding analysis. Clades I, II, and III are sensu Walker and Sytsma (2007). Subg., subgenus; S., Salvia; S. lut. v. lut. f. lutescens, S. lutescens var. lutescens f. lutescens; S. lut. v. lut. f. lobatocrenata, S. lutescens var. lutescens f. lobatocrenata

The ITS dataset for 53 individuals in 42 taxa contained 687 characters, and 179 of these were parsimony-informative. Parsimony analysis produced 2,795 most parsimonious trees of 629 steps, a consistency index (CI) of 0.618 and a retention index (RI) of 0.791. Likelihood analysis resulted in an ML tree with –ln L = 4,263.061. The MP strict consensus, ML, and Bayesian trees had the same topology, and the MP tree is shown with bootstrap and PP support in Fig. 2. The species of clade III sensu Walker and Sytsma (2007) plus Japanese and Taiwanese Salvia formed a strongly supported clade. Salvia plebeia was sister to all other species. As in the cpDNA tree, S. glabrescens and its varieties, S. koyamae, S. nipponica and its varieties, and S. sakuensis were grouped with relatively low support in a clade with several species of clade III sensu Walker and Sytsma (2007) (=S. digitaloides, S. flava, S. glutinosa, S. hians, S. przewalskii, S. trijuga). The sequences of S. glabrescens, S. nipponica and their varieties were highly similar to each other (95.8–99.2% identical). The species belonging to subg. Allagospadonopsis and two Chinese species of clade III sensu Walker and Sytsma (2007), S. miltiorrhiza and S. yunnanensis (in subg. Sclarea), formed a well-supported clade. Within the clade, the two Chinese species were sister to the other species, and the species in subg. Allagospadonopsis formed a clade with relatively low support. In this clade, S. lutescens and its varieties, S. omerocalyx and its variety, S. isensis and S. ranzaniana comprised a subclade with rather low support. Individuals of S. lutescens and its varieties were scattered within the subclade, intercalated with S. isensis, S. omerocalyx and S. ranzaniana. Two Taiwanese species, S. arisanensis and S. hayatana, were sister to this subclade; S. japonica and its formas, and S. pygmaea and its variety formed a subclade sister to the remaining species of Allagospadonopsis.
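
Percent identities such as those reported above can be obtained from a pair of aligned sequences. The text does not describe the exact procedure, so the Python sketch below is only one plausible reading: positions where either sequence has a gap or an ambiguous base are skipped, in line with the treatment of gaps as missing data. The function name and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str, missing: str = "-?N") -> float:
    """Percent identity of two aligned sequences.

    Alignment columns where either sequence has a symbol in `missing`
    are excluded from the comparison (assumed handling, see lead-in).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in missing or y in missing:
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Invented example: 9 comparable columns, 8 of them identical.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
```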

Fig. 2

A strict consensus tree of 2,795 most parsimonious trees derived from the ITS dataset, CI = 0.618, RI = 0.791. MP/ML bootstrap support is shown above branches, and Bayesian PP numbers are shown below. Asterisks indicate ≤50% support in the corresponding analysis. Clades I, II, and III are sensu Walker and Sytsma (2007). Subg., subgenus; S., Salvia; S. lut. v. lut. f. lutescens, S. lutescens var. lutescens f. lutescens; S. lut. v. lut. f. lobatocrenata, S. lutescens var. lutescens f. lobatocrenata

The combined cpDNA and nrDNA datasets for 47 individuals in 36 taxa contained 2,823 characters, and 285 of these were parsimony-informative. Parsimony analysis produced 26,058 most parsimonious trees of 923 steps, a consistency index (CI) of 0.681 and a retention index (RI) of 0.810. Likelihood analysis resulted in an ML tree with –ln L = 9,561.679. The MP strict consensus, ML, and Bayesian trees had the same topology, and the MP tree is shown with bootstrap and PP support in Fig. 3. The species of clade I sensu Walker and Sytsma (2007) were sister to all other Salvia, and the species of clade II were sister to the species of clade III plus Japanese and Taiwanese Salvia. Salvia glutinosa and S. hians, species of clade III, comprised a strongly supported clade with Japanese Salvia subg. Salvia, i.e., Salvia glabrescens and its varieties, S. koyamae, S. nipponica and its varieties, and S. sakuensis, and were sister to Japanese Salvia. Among Japanese Salvia, S. nipponica var. nipponica from Kumamoto was sister to the others. Salvia plebeia was sister to the species of subg. Allagospadonopsis. Salvia omerocalyx and its variety were sister to all the other taxa, and the two Taiwanese Salvia were sister to the rest. Salvia lutescens and its varieties, S. isensis and S. ranzaniana comprised a subclade with rather low support, and S. japonica and its formas and S. pygmaea and its variety formed another subclade.

Fig. 3

A strict consensus tree of 26,058 most parsimonious trees derived from the combined cpDNA (rbcL + trnL-F) and nrDNA (ITS) datasets, CI = 0.681, RI = 0.810. MP/ML bootstrap support is shown above branches, and Bayesian PP numbers are shown below. Asterisks indicate ≤50% support in the corresponding analysis. Clades I, II, and III are sensu Walker and Sytsma (2007). Subg., subgenus; S., Salvia; S. lut. v. lut. f. lutescens, S. lutescens var. lutescens f. lutescens; S. lut. v. lut. f. lobatocrenata, S. lutescens var. lutescens f. lobatocrenata

Discussion

Our phylogenetic analyses of Japanese species, two Taiwanese species, and species of clade I, II, and III sensu Walker and Sytsma (2007) using cpDNA, nrDNA, and a combined dataset showed that all Japanese species and the two Taiwanese species of Salvia formed a well-supported clade with the Salvia species of clade III (Figs. 1, 2, 3). Thus, Japanese and Taiwanese Salvia may belong to clade III, as proposed by Walker and Sytsma (2007).

The results of our molecular phylogenetic analysis generally supported Murata and Yamazaki’s (1993) system, which divided Japanese Salvia into three subgenera. The species belonging to subg. Salvia (S. glabrescens, S. koyamae, S. nipponica, S. sakuensis) were included in the same phylogenetic clade, the species of subg. Allagospadonopsis formed another clade, and S. plebeia (subg. Sclarea) fell outside both these clades in all cpDNA, nrDNA, and cpDNA+nrDNA trees. However, the monophyly of subg. Salvia and of subg. Sclarea was not supported. Salvia glutinosa, a member of subg. Sclarea, was sister to all the species belonging to subg. Salvia in our cpDNA and cpDNA+nrDNA trees (Figs. 1, 3), and formed a subclade together with S. sakuensis in the Salvia clade of our nrDNA tree (Fig. 2). The monophyly of the species belonging to subg. Allagospadonopsis was supported in the cpDNA and cpDNA+nrDNA trees (Figs. 1, 3); however, the bootstrap/PP support for the Allagospadonopsis clade in the nrDNA tree was weak (Fig. 2). In our nrDNA tree, S. miltiorrhiza and S. yunnanensis (i.e., subg. Sclarea species) formed a well-supported clade that included species of subg. Allagospadonopsis. Monophyly of subg. Allagospadonopsis therefore remains possible, but further analyses with more taxa and data are needed to confirm it.

The status of some species is also questionable. For example, all individuals of S. nipponica and S. glabrescens (and their varieties) fell into the same unresolved clade in both cpDNA and nrDNA trees, with a few exceptions, because of the high similarity of their sequences (99.6–99.8% in cpDNA and 95.5–99.2% in nrDNA). This supports Inoue and Ozawa’s (1998) claim that the two species are conspecific. However, their proposal to elevate S. nipponica var. kisoensis to species rank was not supported by our analysis. Furthermore, the geographic differentiation trend suggested by Inoue and Ozawa (1998) was not apparent in our data, although genetic differentiation might have occurred in some populations, e.g., S. nipponica var. nipponica from Kumamoto (cpDNA and cpDNA+nrDNA trees; Figs. 1, 3) and Tokushima (nrDNA tree; Fig. 2).

Monophyly of S. lutescens and its varieties is also problematic. In our cpDNA tree, S. isensis and S. ranzaniana were included in the S. lutescens clade, while one S. lutescens var. crenata individual fell into the S. japonica clade (Fig. 1). One individual of S. lutescens var. crenata was sister to S. isensis, S. lutescens and its varieties, S. omerocalyx, and S. ranzaniana in the nrDNA tree (Fig. 2). All varieties of S. lutescens were gathered into one clade in the cpDNA+nrDNA tree, but at the same time, this clade contained S. isensis and S. ranzaniana (Fig. 3). Hybridization/introgression may have occurred because the distributions and flowering seasons of S. japonica and S. lutescens partly overlap. It is clear that taxonomic revision of S. lutescens will require more information on morphological characters, habitat, and ecological niches.

Unexpectedly, S. pygmaea var. simplicior was included in the clade containing S. japonica in all cpDNA, nrDNA, and cpDNA+nrDNA trees, suggesting close affinity between the two taxa (Figs. 1, 2, 3). Distributions of S. japonica and S. pygmaea var. simplicior do not overlap; the former is distributed in Honshu, Shikoku, and Kyushu southward to Yaku-shima Island, Taiwan and China, but is absent from the island chain from Amami-oshima Island to Iriomote Island, while S. pygmaea var. simplicior is found only on Amami-oshima Island and Tokunoshima Island. Thus, hybridization/introgression between these two taxa seems unlikely, at least in recent times. Morphologically, the taxa are readily distinguishable: the calyx tube of S. japonica is larger and has long pilose hairs on the upper half of the inner tube, and its leaves are generally radical and cauline; the calyx tube of S. pygmaea var. simplicior is puberulent on all inner surfaces, and leaves are all radical (Murata and Yamazaki 1993). However, S. pygmaea var. simplicior has been little studied since it was formally established in 1993 by T. Yamazaki. Further comparative examinations of S. japonica and varieties of S. pygmaea (var. pygmaea and var. simplicior) are required to determine taxonomic status.

Our molecular analyses revealed several taxonomic issues warranting additional study in Japanese Salvia: the non-monophyletic nature of S. lutescens and its varieties, and the close relationships between S. glabrescens and S. nipponica, as well as between S. japonica and varieties of S. pygmaea. Further analyses, e.g., using population genetic markers, should provide a better understanding of evolutionary history in Japanese Salvia.