Introduction

The actin cytoskeleton is a ubiquitous and essential system in all eukaryotes (Sheterline et al. 1999). Chlamydomonas reinhardtii, a unicellular green alga, has two actin genes, of which one codes for a conventional actin (hereafter referred to as “actin”) and the other for an unconventional actin designated NAP, for novel actin-like protein (Kato-Minoura et al. 1998; Lee et al. 1997; Sugase et al. 1996). The amino acid sequence identity between NAP and actin (64 %) is exceptionally low among actins. Studies using ida5, a mutant that lacks the actin gene, have indicated that NAP can substitute for actin in cell growth and division (Kato-Minoura et al. 1997, 1998). At the same time, NAP apparently cannot substitute for actin in other functions. Most notably, it cannot function as a subunit of certain kinds of flagellar inner arm dyneins, which normally contain a monomeric actin, and cannot form the fertilization tubule in “plus” mating-type gametes, most likely because it has a low ability to polymerize and cannot readily form filamentous actin bundles that serve as the core of the fertilization tubule. NAP molecules ectopically expressed in cultured animal cells show very low ability to polymerize compared with actin (Kato-Minoura 2011). Thus, NAP appears to function as a partial substitute for actin. The question then arises as to how Chlamydomonas benefits from NAP.

As a first step towards answering this question, we performed phylogenetic analyses using various eukaryotic species to explore the origin of the two actin homologs. Since the first determination of the complete amino acid sequence of actin from rabbit skeletal muscle (Zakut et al. 1982), innumerable actin sequences have been identified in a variety of eukaryotes. Actins from most species show high similarity to each other; for example, the identity of the amino acid sequence of yeast actin with human actin is as high as 89 %.

Actins in volvocine green algae display conserved sequences consistent with their phylogenetic positions (An et al. 1999). However, the phylogenetic position of NAP has not been determined, although NAP sequences have been identified in the genomes of three Volvocales species, Chlamydomonas reinhardtii, Chlamydomonas moewusii, and Volvox carteri (Cresnar et al. 1990; Kato-Minoura et al. 1998, 2003; Lee et al. 1997; Prochnik et al. 2010; Sugase et al. 1996). Our previous study suggested that these two types of actin form separate clades (Kato-Minoura et al. 2003). However, their phylogenetic relationship was not clearly resolved, and we were unable to determine whether NAP homologs were present in other species of the Chlorophyta. The ambiguity of the NAP origin in the previous study arose partly because only partial NAP sequences were available (except that of C. reinhardtii), and actin-related sequences had been determined in only a few species of the Volvocales. In this study, we determined the full sequences of NAP and actin genes in C. moewusii and Gonium pectorale, and partial sequences in several other volvocine species. These newly determined sequences prompted us to reanalyze the phylogeny of volvocine NAP using a wide range of available data, including the complete sequence of V. carteri NAP and complete actin sequences of eukaryotes, and using more rigorous methods of phylogenetic analysis. Specifically, we analyzed the phylogeny of actin and NAP genes from more than 170 operational taxonomic units (OTUs) selected without bias. Although our results cannot resolve the basal branches of the lineages, and the origin of NAP therefore remains ambiguous, they strongly support the monophyly of the NAP genes of the Volvocales.

Materials and methods

Sources of volvocine algae and extraction of DNA/RNA

The volvocine algal strains used in this study and their culture conditions are shown in Table 1. Genomic DNA was purified by the CTAB extraction method, and further purified by CsCl gradient centrifugation when required (Kato-Minoura et al. 2003). Total RNA of Gonium pectorale was isolated by using Trizol reagent (Life Technologies Co., Carlsbad, CA, USA).

Table 1 The volvocine algal strains used in this study

Acquisition of genomic sequences

Inverse and standard PCR methods were used to obtain the complete genomic sequences of Chlamydomonas moewusii actin, C. moewusii NAP, Gonium pectorale actin, and partial sequences of Eudorina elegans (NIES-351) actin, E. elegans (NIES-717) actin, Volvulina steinii actin, Chlorogonium capillatum actin, Pleodorina californica actin, E. elegans (NIES-717) NAP, and V. steinii NAP. The procedures, including primers (Online Resource 1), are detailed in the Supplementary Information. Genomic PCR products were directly sequenced, or were cloned into either pBluescript (Agilent Technologies, Inc., Santa Clara, CA, USA) or pCR2.1 vectors (Life Technologies Co., Carlsbad, CA, USA) and then sequenced. The genomic sequence of G. pectorale NAP was obtained from a draft genome sequence database (made available by Dr. H. Nozaki of the University of Tokyo and Dr. A. Fujiyama of the National Institute of Genetics). The sequence was checked and corrected against the partial cDNA sequence determined by RT-PCR (see Online Resource 2). Except for G. pectorale NAP, exon/intron boundaries in the newly obtained genomic sequences were predicted using either the GreenGenie2 (Kwan et al. 2009) or the sim4cc (Zhou et al. 2009) program, and manually adjusted with reference to the C. reinhardtii actin and NAP cDNA sequences. All sequences were deposited in GenBank/EMBL/DDBJ (Online Resource 3). The entire actin and NAP sequences of C. reinhardtii and V. carteri were obtained from the JGI database (http://genome.jgi.doe.gov/, Online Resource 3).

Phylogenetic analysis

In addition to the new sequences from algae, we collected actin amino acid sequences in other eukaryotes from several genome databases. We selected a representative species for each of the 24 families registered in MBGD (a database of completely sequenced microbial genomes) (Uchiyama 2003). As the representatives included only one species (Arabidopsis thaliana) of green plant (Viridiplantae) and no species of red algae (Rhodophyta), we added 11 plant and 3 red alga species whose genomes have been sequenced. From the genome records of the 38 species selected as above, genes encoding amino acid sequences similar to that of C. reinhardtii actin (CHLREDRAFT_24392) were collected by BLAST search of KEGG (http://www.genome.jp/tools/blast/) (Kanehisa et al. 2012). Data of 158 sequences with E-value <1e-100 were selected. In addition to the actin and NAP genes, the collected sequences included the genes of Arp1 and Arp2 actin-related proteins. Further BLAST searches using three genes (NAP1 [CHLREDRAFT_168932] of C. reinhardtii; Gasu_17630 of Galdieria sulphuraria; and CYME_CMS412C of Cyanidioschyzon merolae) as queries did not yield any novel sequences. Green alga sequences similar to the sequence of C. reinhardtii actin gene (CHLREDRAFT_24392) were also collected by BLAST search of the NCBI database in May 2014. Finally, 178 genes were selected as the data set of actin/NAP/Arp1/Arp2 homologous genes.
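The selection step described above can be sketched in a few lines of Python. This is only an illustration of the thresholding logic, not the procedure actually run against KEGG: the function name `filter_hits`, the tuple layout, and the example hits are hypothetical, and only the cutoff (E-value <1e-100) and the species counts (24 + 11 + 3) come from the text.

```python
from typing import Iterable, List, Tuple

# Cutoff taken from the text; everything else in this sketch is illustrative.
EVALUE_CUTOFF = 1e-100

# Species tally stated in the text: 24 MBGD representatives, 11 plants, 3 red algae.
N_SPECIES = 24 + 11 + 3


def filter_hits(hits: Iterable[Tuple[str, float]],
                cutoff: float = EVALUE_CUTOFF) -> List[str]:
    """Return gene IDs whose E-value is strictly below the cutoff.

    Duplicate IDs (e.g. the same gene hit by several queries) are kept once,
    in order of first appearance.
    """
    kept: List[str] = []
    seen = set()
    for gene_id, evalue in hits:
        if evalue < cutoff and gene_id not in seen:
            seen.add(gene_id)
            kept.append(gene_id)
    return kept


# Hypothetical hits as (gene identifier, E-value); not data from the study.
example_hits = [
    ("geneA", 0.0),
    ("geneB", 3e-150),
    ("geneC", 2e-80),    # fails the cutoff
    ("geneB", 1e-120),   # duplicate of an already accepted gene
]
selected = filter_hits(example_hits)
```

With a cutoff this strict, only close homologs pass, which is consistent with the collected set containing actin, NAP, Arp1, and Arp2 genes.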

For phylogenetic analyses, 170 completely sequenced genes out of the 178 genes in the data set were used. The amino acid sequences were aligned with the MAFFT program in E-INS-i mode (Katoh et al. 2005). Poorly aligned sites were selected and deleted by using the trimAl algorithm in the automated1 mode (Capella-Gutierrez et al. 2009). The final data set of 170 genes with 393 aligned sites was used for maximum likelihood (ML) phylogenetic analysis in PhyML 3.0 software (Guindon and Gascuel 2003). In addition, to infer the relationship among the volvocine species, full and partial sequences of 17 actin- and 8 NAP-related genes were used.
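To illustrate what removing poorly aligned sites means in practice, the sketch below drops alignment columns that consist mostly of gaps. It is a deliberately simplified stand-in: the automated1 mode of trimAl chooses its trimming strategy heuristically from the alignment, and that logic is not reproduced here. The function name, the 0.5 gap threshold, and the toy sequences are assumptions made for this example.

```python
from typing import Dict


def trim_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.5) -> Dict[str, str]:
    """Drop columns whose fraction of '-' characters exceeds max_gap_fraction.

    A simplified gap filter, NOT the trimAl automated1 heuristic.
    """
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        gaps = sum(1 for name in names if alignment[name][col] == "-")
        if gaps / len(names) <= max_gap_fraction:
            keep.append(col)

    return {name: "".join(alignment[name][c] for c in keep) for name in names}


# Toy alignment (hypothetical sequences): the fifth column is 75 % gaps.
toy = {
    "seq1": "MDDE-IAAL",
    "seq2": "MDDEVIAAL",
    "seq3": "M-DE-IQAL",
    "seq4": "MDDE-IAAL",
}
trimmed = trim_gappy_columns(toy)
```

In the toy input, the column with three gaps out of four is removed, while the column with a single gap is retained.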

Molecular evolutionary analysis

The final data set of 178 genes contained 18 Volvocales genes, which belonged to the actin, NAP, or Arp2 family. Of these, we used 11 completely sequenced genes in the analysis of patterns of nucleotide substitution in actin/NAP/Arp2 genes. The amino acid sequences were aligned with the MAFFT program in E-INS-i mode (Katoh et al. 2005). The nucleotide sequences were aligned to the amino acid alignment with PAL2NAL software (Suyama et al. 2006). Pairwise genetic distances in the amino acid sequences of actin, NAP, and Arp2 were estimated by MEGA4.0.0 software using the command “Compute pairwise distances” (Tamura et al. 2011). Nonsynonymous and synonymous substitution rates were estimated by the Yang and Nielsen method using the yn00 program in the PAML software package (Yang and Nielsen 2000; Yang 2007).
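The idea behind aligning nucleotide sequences to an amino acid alignment can be shown with a minimal sketch: each residue in the gapped protein row is replaced by its codon, and each gap by three gap characters. This is not the PAL2NAL implementation, which also handles frameshifts and mismatches. The function name `thread_codons` and the toy input are hypothetical, and the sketch assumes a coding sequence without a stop codon whose length is exactly three times the number of residues.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an unaligned coding sequence onto a gapped protein alignment row.

    Assumes the CDS has no stop codon and encodes exactly the residues
    present in aligned_protein (an assumption of this sketch).
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length must be three times the number of residues")

    out = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")            # a protein gap becomes a codon-sized gap
        else:
            out.append(cds[pos:pos + 3])  # copy the codon for this residue
            pos += 3
    return "".join(out)


# Toy example: protein row "MD-E" with its coding sequence ATG GAC GAG.
codon_row = thread_codons("MD-E", "ATGGACGAG")
```

Keeping codons intact in this way is what allows synonymous and nonsynonymous differences to be counted site by site in the subsequent rate estimation.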

Results and discussion

Using inverse PCR (see Online Resource 2), we extended the previously determined partial sequences of actin and NAP of C. moewusii to cover their entire sequences (Online Resources 4, 5). The complete sequences of the actin and NAP genes of G. pectorale were newly determined (Online Resources 6, 7). The full sequence of V. carteri NAP was obtained by the Volvox genome project (Prochnik et al. 2010). The splice sites in these sequences, except for the G. pectorale NAP sequence, were first estimated using either GreenGenie2 or sim4cc program, and then manually fine-adjusted referring to the splice site information from the C. reinhardtii actin/NAP genes. The splice sites in G. pectorale NAP were determined by its cDNA sequence. Although the splice sites in many species dealt with in our analysis are those predicted from the genomic data, we believe our predictions are fairly accurate because (1) all of the putative exon sequences thus determined had high BLAST optimized scores (>200 for most exons, 100–200 for some exons) against C. reinhardtii actin/NAP exon sequences, (2) all putative introns were inserted following the conserved gt-ag rule, and (3) most of the putative exons had exactly the same lengths as those in the C. reinhardtii actin/NAP gene.
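Criterion (2) above, the gt-ag rule, is straightforward to check once exon coordinates are available. The sketch below is an illustration only: the function name, the coordinate convention (0-based, half-open intervals), and the toy gene are assumptions of this example and do not represent the output of GreenGenie2 or sim4cc.

```python
from typing import List, Tuple


def introns_follow_gt_ag(genomic: str,
                         exons: List[Tuple[int, int]]) -> List[bool]:
    """Report, for each intron between consecutive exons, whether it starts
    with GT and ends with AG.

    Exons are 0-based half-open (start, end) intervals in genomic order
    (a convention chosen for this sketch).
    """
    seq = genomic.upper()
    results = []
    for (_, end_a), (start_b, _) in zip(exons, exons[1:]):
        intron = seq[end_a:start_b]
        results.append(
            len(intron) >= 4
            and intron.startswith("GT")
            and intron.endswith("AG")
        )
    return results


# Toy gene (hypothetical): exons in upper case, introns in lower case.
# The second intron starts with "gc" and therefore violates the rule.
toy_gene = "ATGGCC" + "gtaagtttcag" + "GATGAA" + "gcaaatttcag" + "TAA"
toy_exons = [(0, 6), (17, 23), (34, 37)]
checks = introns_follow_gt_ag(toy_gene, toy_exons)
```

A predicted gene model in which every entry of the returned list is true satisfies the rule; any false entry flags an intron whose boundaries deserve manual inspection.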

Most of the intron positions are conserved in both groups of sequences (Figs. 1, 2). The intron positions in the C. moewusii NAP gene are unique among NAP genes, but one position (123–3) is shared with actin genes. The C. moewusii NAP gene might retain some features of the ancestral gene, as a shared intron position in the actin gene might reflect the history of gene evolution (Weber and Kabsch 1994).

Fig. 1

Genomic structure of volvocine actin and NAP genes. For comparison, previously identified sequences are also shown. Filled boxes, putative coding exons; open boxes, putative 5′ and 3′ untranslated regions. Intervening sequences are shown by solid lines. Intron positions are indicated by codon and phase numbers with reference to the three alpha-actins of vertebrates (377 amino acids) (Weber and Kabsch 1994). The conserved intron positions are linked with dotted lines. ATG, translation start codon; TAA or TGA, stop codon

Fig. 2

Alignment of nine actin and six NAP amino acid sequences. The deduced amino acid sequences are aligned using CLUSTAL W (http://www.genome.jp/tools/clustalw/). Amino acid residues shown as dots are identical to those of V. carteri actin. Long strings of dashes indicate unidentified sequences. Asterisks indicate gaps introduced to maximize alignment. Intron boundaries are marked by vertical bars with diamonds. The numeric codes indicate positions with reference to the codon and phase numbers of the three alpha-actins of vertebrates (377 amino acids) (Weber and Kabsch 1994)

For further analysis, we obtained partial sequences of actin genes from V. steinii, E. elegans (NIES-351 and NIES-717), C. capillatum, and P. californica (Online Resources 8, 9, 10, 11, 12), and partial NAP sequences from V. steinii (Online Resource 13) and E. elegans NIES-717 (Online Resource 14). Figure 2 shows the alignment of the deduced amino acid sequences of all species. Intron positions were estimated using the above-mentioned prediction programs (Figs. 1, 2). Among the nine actin genes, exon sequences are highly conserved. The intron positions are mostly conserved, but their sequences and lengths are not (Table 2). Of these, introns 7 and 8 are markedly longer in unicellular algae than in multicellular algae (Table 2).

Table 2 Length of introns in volvocine actin orthologs

We next analyzed the phylogenetic relationship of the NAP and actin genes isolated in this study with their homologs in 38 other eukaryotic species. The resulting ML phylogeny using the 170 completely sequenced genes divided the genes of the volvocine species into three clades of actin, NAP, and Arp2 lineages (Fig. 3 and Online Resource 15). The inferred tree strongly suggests the monophyly of each of the three clades. The actin clade of the Volvocales was included in the clade of the Chlorophyta in the lineage of green plants (Viridiplantae). The Arp2 clade was also included in the lineage of the Chlorophyta. In contrast, the NAP clade was restricted to the Volvocales; the genomes of other Chlorophyta or Viridiplantae species did not contain NAP genes. Several species in the Amoebozoa, Fungi, and Metazoa also have highly diverged actin paralogs in their genomes, as the Volvocales do with NAP (Fig. 3). The actin gene may have been duplicated independently in several lineages of eukaryotes. The phylogenetic relationship of actin and NAP genes in the Volvocales was also analyzed using a sequence set that included those partially sequenced in this study (Fig. 4). The phylogenies of actin (Fig. 4a) and NAP genes (Fig. 4b) are mostly consistent with the relationship in the phylogenetic tree inferred from five chloroplast genes (Fig. 4c; Herron and Michod 2008). If the branches with low bootstrap values (<80 %) are ignored, there is no inconsistency among the three phylogenies in Fig. 4. NAP genes may have been vertically transmitted in the lineage of the Volvocales. However, it is important to note that in most cases, bootstrap support of the basal branches was low. Further analysis will be needed to elucidate the origin and evolutionary relationship of the NAP genes.

Fig. 3

Schematic representation of the phylogenetic tree of homologous actin/NAP genes of eukaryotes. The original phylogeny with the gene codes (KEGG entry name or NCBI accession number) of amino acid sequences inferred by the ML method is shown in Online Resource 15. Genes of the Volvocales are shown in red; symbols indicate species with actin, NAP, and Arp2 genes. Large triangles indicate a clade of multiple genes; the length of a triangle corresponds to the longest branch among the genes. The analysis used the LG model of substitution with a gamma distribution (four categories) for rate variation among sites. The first number at a node corresponds to the bootstrap support; the second indicates the approximate likelihood ratio test (aLRT) support. The scale bar represents 0.5 substitutions per site

Fig. 4

a Phylogenetic tree of volvocine actin genes. Sequences newly identified in this study and other known sequences of Chlorophyta actin genes were used. The first number at a node corresponds to the bootstrap support; the second indicates the approximate likelihood ratio test (aLRT) support. The scale bar represents 0.02 substitutions per site. b Phylogenetic tree of volvocine NAP genes. Sequences newly identified in this study and other known sequences of volvocine NAP homologs were used. Actin sequences of D. discoideum and G. sulphuraria were used as outgroups (Fig. 3). The scale bar represents 0.2 substitutions per site. c A representative phylogenetic tree of Volvocales modified from Herron and Michod (2008)

The phylogeny showed longer branches for the NAP genes than for the actin and Arp2 genes of the Volvocales (red branches in Fig. 3). Therefore, we analyzed genetic distances between the genes of the Volvocales (Table 3). When compared between C. reinhardtii and V. carteri, the genetic distances in the NAP (0.227 ± 0.026 substitutions/site; Table 3) and Arp2 (0.146 ± 0.020) genes were larger than the genetic distance in the actin genes (0.019 ± 0.007). Pairwise comparison of other species supported this tendency (Table 3). We tested whether the difference in genetic distance reflects changes in the nonsynonymous (dN) and synonymous (dS) substitution rates, using previously reported methods (Aoki et al. 2013; Arbiza et al. 2006; Neiman et al. 2010; Zhang et al. 2006). Synonymous substitution rates were similar among NAP, actin, and Arp2 genes (3.995 ± 6.240, 3.890 ± 5.824, and 3.967 ± 6.127 substitutions/site, respectively, in C. reinhardtii and V. carteri) (Table 3). The high dS values may indicate saturation of synonymous substitutions. In contrast, the nonsynonymous substitution rates differed greatly among the genes. The rate in the actin gene was very low (0.021 ± 0.005 substitutions/site; Table 3) compared with NAP (0.149 ± 0.014) and Arp2 (0.091 ± 0.010). The difference in the phylogenetic branch length and genetic distance seems to reflect the divergence of nonsynonymous substitution rates among NAP, actin, and Arp2 genes (Fig. 3; Table 3). Duplicated genes can face relaxed functional constraints, allowing one of the copies to acquire novel functions (Barton et al. 2007). The low rate of nonsynonymous substitution in actin sequences must reflect the indispensable function of the actin protein. NAP genes might have a cellular function different from that of actin genes in species of the Volvocales.
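The size of these differences is easier to appreciate as multiples of the actin values. The short calculation below uses only the point estimates quoted above for the C. reinhardtii and V. carteri comparison; the function name is hypothetical, and the standard errors are ignored, so the ratios should be read as rough indications rather than estimates with known uncertainty.

```python
# Point estimates quoted in the text for C. reinhardtii vs. V. carteri (Table 3).
aa_distance = {"actin": 0.019, "NAP": 0.227, "Arp2": 0.146}
dn = {"actin": 0.021, "NAP": 0.149, "Arp2": 0.091}


def fold_over_actin(values):
    """Express each gene's value as a multiple of the actin value."""
    return {gene: value / values["actin"] for gene, value in values.items()}


aa_fold = fold_over_actin(aa_distance)  # amino acid distance relative to actin
dn_fold = fold_over_actin(dn)           # nonsynonymous rate relative to actin

for gene in ("NAP", "Arp2"):
    print(f"{gene}: distance x{aa_fold[gene]:.1f}, dN x{dn_fold[gene]:.1f}")
```

On these figures, the nonsynonymous rate of NAP is roughly seven times that of actin and the rate of Arp2 roughly four times, while the amino acid distances differ by about twelve-fold and eight-fold, respectively.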

Table 3 Comparison of genetic distances and synonymous (dS) and nonsynonymous (dN) substitution rates between actin/NAP/Arp2 genes of the Chlorophyta

Our studies reconfirmed that two distinct types of actin, “actin” and “NAP”, are present in a wide range of species of the Volvocales. The inferred tree suggests that NAP was already present in the common ancestor of C. moewusii and C. reinhardtii. Despite detailed phylogenetic analyses using a large number of sequences, the origin of NAP remains unclear. Since NAP homologs are found only in the Volvocales, the function of NAP is possibly specific to the Volvocales or to some members of the Chlorophyta. The long branch length of the NAP clade suggests that NAP faces a more relaxed functional constraint than actin. Perhaps NAP has a more limited cellular function than actin. Our previous studies showed that NAP has a role in the formation of flagella and fertilization tubules (Kato-Minoura et al. 1997). Examining whether or not NAP is present in other green algae that do not have flagella or fertilization tubules may provide further insights.