Introduction

Cyanobacteria are considered major contributors to primary production in aquatic environments. In addition, many cyanobacterial species can convert atmospheric nitrogen into ammonia (nitrogen fixation) and thereby contribute to the global nitrogen cycle (Capone et al. 1997; Zehr et al. 2001). Although eukaryotes generally lack the ability to fix nitrogen, some eukaryotic lineages have acquired this metabolic capacity by hosting nitrogen-fixing cyanobacteria as symbionts (Kneip et al. 2007).

Some diatom species (e.g., Climacodium frauenfeldianum and Rhizosolenia clevei) are known to maintain non-obligatory and non-hereditary cyanobacterial symbionts for nitrogen fixation (Carpenter and Janson 2000; Janson et al. 1995). Another nitrogen-fixing diatom species, Rhopalodia gibba, which belongs to the family Rhopalodiaceae, possesses cyanobacterium-like structures termed spheroid bodies (SBs) in its cells in addition to the typical plastid (Drum and Pankratz 1965). Although the precise function(s) of the SBs in R. gibba cells remain unclear, photosynthesis is unlikely to occur in the SBs, as they are devoid of chlorophyll autofluorescence (Kies 1992). Rather, the SBs are proposed to carry out nitrogen fixation for the host cells, since (1) nitrogen-fixing capacity was observed in R. gibba (Prechtl et al. 2004), (2) the nif operon encoding the proteins involved in nitrogen fixation was found in the SB genome (Kneip et al. 2008), and (3) the α subunit of dinitrogenase (NifD) was immunolocalized to the SBs (Prechtl et al. 2004). In addition to this putative nitrogen-fixing function, previous studies have revealed that the SBs show several unusual characteristics that are not seen in other cyanobacterial symbionts. For example, the SBs cannot survive outside host cells (Prechtl et al. 2004) and are inherited through the sexual cycle of the host (Geitler 1977), implying that they are well integrated into the host cell system.

The family Rhopalodiaceae contains three genera, Rhopalodia, Epithemia, and Protokeelia (Round et al. 1990). Although members of the first two genera are known to possess SBs (DeYoe et al. 1992; Geitler 1977; Kies 1992), it is unclear whether members of Protokeelia also possess these structures. To our knowledge, molecular data for the endosymbionts are available solely for R. gibba (Kneip et al. 2008; Prechtl et al. 2004). A comparison between a 63-kbp region of the SB genome in R. gibba and the corresponding genomic region in Cyanothece sp. ATCC 51142, which is most likely the closest relative of the SB ancestor, revealed multiple gene loss events, hinting that genome reduction is ongoing in the SBs (Kneip et al. 2008). The genomic and cytological features of the SBs in rhopalodiacean diatoms suggest that they are indeed organelles specialized for nitrogen fixation. Furthermore, the SBs, which are restricted to a single extant diatom family, are certainly evolutionarily much “younger” than the mitochondria and plastids found in a wider spectrum of eukaryotes. Thus, in-depth investigations of the SBs in rhopalodiacean diatoms may provide key insights into the transition from an endosymbiont into an organelle.

As a first step toward understanding the evolutionary processes leading to the establishment of an organelle specialized for nitrogen fixation, we determined and analyzed both host and endosymbiont phylogenetic markers (i.e., 18S and 16S ribosomal DNA sequences, respectively) from three rhopalodiacean diatom species, Epithemia turgida, Epithemia sorex, and R. gibba, isolated from different sites in Japan. The results presented here clearly indicate a single origin of the SBs in rhopalodiacean diatoms.

Materials and methods

E. turgida and R. gibba cells were isolated from water samples collected from Lake Yunoko, Tochigi, Japan (36°80′0″N, 139°42′53″E), and a pond in Namiki Park, Tsukuba, Ibaraki, Japan (36°06′55″N, 140°14′20″E), respectively. We successfully established cultured strains of the two species, which were maintained in modified CSi medium (Table S1) at 20°C on a 14:10 h light:dark cycle. Total DNA was extracted from the cultured cells using a DNeasy Plant Mini Kit (Qiagen). For the 18S rDNA amplification, a primer set of SR1 (5′-TACCTGGTTGATCCTGCCAG-3′) and SR12 (5′-CCTTCCGCAGGTTCACCTAC-3′) (Nakayama et al. 1998) was used. We amplified the SB 16S rDNA genes in two steps. Initially, nearly the entire gene was amplified using a set of universal primers for 16S rDNA, U16F1 (5′-AGAGTTTGATCCTGGCTCAG-3′) and U16R1 (5′-ACGGCTACCTTGTTACGACTT-3′) (Yoon et al. 2009). For the second PCR, the product from the first PCR was used as the template, and the 3′ and 5′ halves of the SB 16S rDNA sequence were separately amplified using primers U16R1 and SB-int-F (5′-AAACGATGGAAACTAGGTGTGGCTTGTA-3′), and primers U16F1 and SB-int-R (5′-CCTCGACTTTCATCAAGGTTCGCG-3′), respectively. The primers SB-int-F and SB-int-R are specific to 16S rDNA sequences of the SBs and their cyanobacterial relatives. The PCR conditions were as follows: an initial denaturation at 94°C for 5 min, followed by 28 cycles of 30 s at 94°C, 30 s at 50–55°C, and 1–2 min at 72°C.
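The annealing range of 50–55°C can be compared with rough melting-temperature estimates for the primers listed above. The following is a minimal sketch, assuming the Wallace rule (Tm ≈ 2(A+T) + 4(G+C)), a rule of thumb intended for short oligonucleotides; this rule and the helper names are illustrative assumptions, not the procedure used to design the primers in this study.

```python
# Primer sequences (5'->3') copied from the text above.
PRIMERS = {
    "SR1": "TACCTGGTTGATCCTGCCAG",
    "SR12": "CCTTCCGCAGGTTCACCTAC",
    "U16F1": "AGAGTTTGATCCTGGCTCAG",
    "U16R1": "ACGGCTACCTTGTTACGACTT",
    "SB-int-F": "AAACGATGGAAACTAGGTGTGGCTTGTA",
    "SB-int-R": "CCTCGACTTTCATCAAGGTTCGCG",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough Tm in degrees C by the Wallace rule (assumption: a coarse
    rule of thumb that becomes less reliable for longer primers)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length={len(seq)} nt, "
              f"GC={gc_fraction(seq):.2f}, Wallace Tm={wallace_tm(seq)} C")
```

Such estimates are only a coarse guide; in practice the annealing temperature is usually tuned empirically, which is consistent with the range rather than a single value reported here.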

We collected E. sorex and R. gibba cells from a water sample taken from an artificial stream on the campus of the University of Tsukuba (Tsukuba, Ibaraki, Japan), and E. turgida cells were obtained from a water sample from Lake Saiko, Yamanashi, Japan (35°50′21″N, 138°70′04″E). Since we failed to culture these isolates in the laboratory, single-cell PCR was conducted to obtain their 18S and 16S rDNA sequences. The isolated cells were washed 10 times with sterilized water to avoid contaminating the PCR with non-diatom cells. A single cell was broken using a micropipette and served as the template for PCR. The conditions of the single-cell PCR were the same as those described above. The nucleotide sequences determined in this study have been deposited in DDBJ/EMBL/GenBank under the accession numbers AB546729–AB546738.

The 18S rDNA sequences from the two cultured strains and three isolates of rhopalodiacean diatoms were manually aligned with those from 40 pennate and three centric diatom species (outgroup). After the exclusion of ambiguously aligned positions, 1,638 nucleotide positions were subjected to maximum likelihood (ML) analysis under the GTR model, with among-site rate variation approximated by a discrete gamma distribution (GTR + Γ model), using RAxML v.7.0.3 (Stamatakis 2006). We manually aligned 16S rDNA sequences of two proteobacteria (Escherichia coli and Agrobacterium tumefaciens; outgroup), 31 cyanobacteria, nine plastids, and six SBs from rhopalodiacean diatoms. In total, 1,320 unambiguously aligned positions were subjected to ML analysis under the GTR + Γ model. ML bootstrap analyses (1,000 replicates) were conducted as described above.
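The bootstrap analysis rests on resampling alignment columns with replacement to build pseudo-replicate alignments, each of which is then used to infer a tree. The following is a minimal sketch of the resampling step only, using a hypothetical toy alignment; RAxML performs this internally, and the function and taxon names here are illustrative assumptions.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one pseudo-replicate: columns drawn with replacement,
    keeping the same number of sites as the original alignment."""
    n_sites = len(next(iter(alignment.values())))
    # The same column indices are applied to every sequence, so each
    # resampled site is an intact column of the original alignment.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}


# Hypothetical toy alignment (not data from this study).
toy = {
    "taxonA": "ACGTACGT",
    "taxonB": "ACGTTCGT",
    "taxonC": "ACGAACGA",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(5)]

if __name__ == "__main__":
    for rep in replicates:
        print(rep)
```

The bootstrap value of a clade is then the percentage of replicate trees in which that clade is recovered.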

We additionally subjected the 18S and 16S rDNA alignments to Bayesian phylogenetic analyses under the GTR + Γ model using MrBayes v.3.1.2 (Huelsenbeck and Ronquist 2001). One cold and three heated Markov chain Monte Carlo chains with default chain temperatures were run for 1,000,000 generations, sampling log-likelihood (lnL) values and trees at 100-generation intervals. The first 40,000 generations (in the 18S rDNA analysis) and 30,000 generations (in the 16S rDNA analysis) were discarded as “burn-in.” Bayesian posterior probabilities (BPP) and branch lengths were calculated from the remaining trees.
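The run length, sampling interval, and burn-in given above determine how many sampled trees contribute to the posterior probabilities. The following is a minimal sketch of that arithmetic, assuming one sample per 100 generations and ignoring the sample taken at generation 0, so the actual counts reported by the software may differ by one; the function name is hypothetical.

```python
def retained_samples(total_generations: int,
                     sample_interval: int,
                     burnin_generations: int) -> int:
    """Number of sampled trees left after discarding the burn-in.

    Assumption: the initial (generation 0) sample is not counted.
    """
    total = total_generations // sample_interval
    discarded = burnin_generations // sample_interval
    return total - discarded


# Values taken from the text: 1,000,000 generations sampled every 100.
kept_18s = retained_samples(1_000_000, 100, 40_000)
kept_16s = retained_samples(1_000_000, 100, 30_000)

if __name__ == "__main__":
    print("18S rDNA trees retained:", kept_18s)
    print("16S rDNA trees retained:", kept_16s)
```

Under these assumptions, roughly 9,600 trees (18S rDNA) and 9,700 trees (16S rDNA) remain after the burn-in is discarded.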

Results and discussion

Here, we determined both 18S (host/eukaryotic) and 16S (SB/prokaryotic) rDNA sequences of R. gibba, E. turgida, and E. sorex collected from different areas in Japan (Fig. 1a–c). In the 18S rDNA tree, all rhopalodiacean diatoms formed a monophyletic clade with a bootstrap (BP) value of 93% and a BPP of 1.00 (Fig. 2a). The Rhopalodiaceae clade was composed of two well-supported subclades, one of which included the three Epithemia sequences (BP = 81%, BPP = 1.00) and the other the two R. gibba sequences (BP = 100%, BPP = 1.00). As a whole, the Rhopalodiaceae clade showed a specific affinity to Entomoneis cf. alata and Surirella fastuosa, members of the order Surirellales (Round et al. 1990), with a BP of 100% and a BPP of 1.00. Simonsen (1979) speculated that Rhopalodiaceae is closely related to Auriculaceae, a family in the Surirellales. Thus, the 18S rDNA phylogeny reported here is consistent with Simonsen’s idea. The cluster of rhopalodiacean diatoms, E. cf. alata, and S. fastuosa was further grouped with Amphora cf. capitellata (BP = 91%, BPP = 1.00).

Fig. 1 LM images of shells of diatom species used in the present study. a Rhopalodia gibba, b Epithemia sorex, and c Epithemia turgida. Scale bars 20 μm

Fig. 2 Host and endosymbiont phylogenies of rhopalodiacean diatoms. a Maximum-likelihood (ML) tree based on 18S rDNA sequences of pennate and centric diatoms. b ML tree based on 16S rDNA sequences of the SBs, cyanobacteria, plastids, and proteobacteria. Bootstrap values ≥70% are shown. Nodes supported by Bayesian posterior probabilities ≥0.95 are indicated by thick lines. Accession numbers are shown to the right of species names

In the 16S rDNA tree (Fig. 2b), all SB sequences were recovered as a monophyletic clade with a BP of 100% and a BPP of 1.00. This clade was placed in the radiation of cyanobacteria, and was clearly distant from the plastid sequences (Fig. 2b). Nearly identical sequences from three R. gibba isolates formed a clade with a BP of 100% and a BPP of 1.00 (Fig. 2b), and the three Epithemia sequences grouped with a BP of 84% and a BPP of 0.99 (Fig. 2b). As reported by Prechtl et al. (2004), the SB sequences showed a specific affinity to Cyanothece species and their relatives (BP = 92%, BPP = 1.00; Fig. 2b).

The 18S and 16S rDNA analyses recovered the monophyly of the host rhopalodiacean diatoms and of their SBs, respectively, with high statistical support (Fig. 2). These results strongly support a scenario in which a common ancestor of rhopalodiacean diatoms acquired the SBs from a cyanobacterium through a single endosymbiotic event, and the SBs have been vertically inherited during host speciation. Analysis of the 63-kbp region of the SB genome in R. gibba implied ongoing genome reduction (Kneip et al. 2008). By combining the cytological observations and the genomic data from R. gibba with the 18S and 16S rDNA phylogenies, we propose that the SBs have been fully integrated into the diatom cells as organelles. As the fossil record suggests that surirellacean diatoms were established in the middle Miocene epoch (approximately 12 million years ago; Hajos 1986; Sims et al. 2006) and surirellacean and rhopalodiacean species are closely related to each other (Fig. 2a), we also propose that the SBs in the ancestral rhopalodiacean diatom were established in the middle Miocene epoch. Nevertheless, as this divergence time estimate is based solely on the fossil record, it needs to be re-examined by combining both fossil and molecular data in the future.

The endosymbioses that gave rise to mitochondria and plastids occurred early in eukaryotic evolution and contributed greatly to the shaping of modern eukaryotic cells and their genomes. Thus, the process by which a prokaryotic endosymbiont is integrated into eukaryotic cells as an organelle is considered one of the most intriguing topics in evolutionary biology. Unfortunately, it is difficult to reconstruct the detailed processes of these endosymbiont-organelle transitions, since most of the key information about these ancient events has been lost. However, evolutionarily “young” organelles (or highly integrated endosymbionts) and their host cells can serve as models that provide snapshots of the transition from a prokaryotic endosymbiont into an organelle. Intriguingly, the cercozoan testate amoeba Paulinella chromatophora possesses a permanent cyanobacterial endosymbiont (chromatophore) with photosynthetic activity (Yoon et al. 2009), and molecular data strongly suggest that the endosymbiont has indeed attained the status of an organelle (Nakayama and Ishida 2009). Therefore, rhopalodiacean diatoms, together with P. chromatophora, are ideal organisms for investigating the acquisition of organelles from endosymbiotic prokaryotes.