Introduction

Class Anthozoa (Cnidaria) forms the foundations of a diversity of ecosystems with ~ 7500 extant species spread across shallow waters to deep reefs (Daly et al. 2007), including many of the most ecologically and economically important marine organisms (Moberg and Folke 1999; White et al. 2000; Komyakova et al. 2013). On tropical shallow-water reefs, for example, reef-building stony corals (order Scleractinia) build the underlying structural framework which serves as habitats for an extraordinary diversity of marine taxa (Small et al. 1998; Hughes et al. 2002; Plaisance et al. 2011). In deeper waters, the distribution of anthozoans continues from mesophotic zones to the deepest ocean trenches, contributing to biodiversity and ecological functioning at various spatial scales (Bongiorni et al. 2010; Watling et al. 2011). Anthozoans also provide an array of goods and services such as pharmaceutical products, fisheries, shore protection and tourism (Newton et al. 2007; Chao et al. 2012; Villanoy et al. 2012; Spalding et al. 2017; Morlighem et al. 2018). Despite their importance, the taxonomy and phylogeny of most anthozoan groups (Fig. 1), except for certain scleractinians, remain poorly studied to date.

Fig. 1 Phylogeny of orders in Anthozoa, adapted from Quattrini et al. (2020) but excluding Relicanthus daphneae

Traditionally, a lack of consistent and informative markers has stymied progress in the classification of anthozoans based on morphology alone (Budd et al. 2010; McFadden et al. 2010). From the mid-1990s, the application of polymerase chain reaction (PCR) and Sanger sequencing—primarily of mitochondrial and ribosomal RNA (rRNA) genes—profoundly revolutionised the systematics of anthozoans (Chen et al. 1995; Romano and Palumbi 1996; Berntson et al. 1999). The most commonly used molecular markers targeted have been generally similar among anthozoan taxa, with the DNA mismatch repair protein (MutS) in octocorals (subclass Octocorallia) being a remarkable exception (Pont-Kingdon et al. 1995). This gene likely originated from a horizontal gene transfer event in the last common octocorallian ancestor (Bilewitch and Degnan 2011). Phylogenies reconstructed from these mitochondrial and rRNA genes provided unprecedented insights into anthozoan evolution, such as the distinct lineages that arose between the Atlantic and Pacific Oceans (Fukami et al. 2004; Silvestri et al. 2019), the monophyly of subclasses Hexacorallia and Octocorallia (Chen et al. 1995; Berntson et al. 1999; Fig. 1), as well as the role of hybridisation in anthozoan evolution (Odorico and Miller 1997; McFadden and Hutchinson 2004; Reimer et al. 2007).

Despite the use of multiple directly sequenced markers in most recent studies, the taxonomy and systematics of many anthozoans remain in flux today owing to poor phylogenetic resolution of these markers. Uncertainties limiting more comprehensive taxonomic revisions can be attributed chiefly to the poor phylogenetic signal of mitochondrial genes due to slow substitution rates, sequence compositional bias and substitution saturation (Shearer et al. 2008; Huang et al. 2008; Kitahara et al. 2014; Figueroa and Baco 2015; Pratlong et al. 2017a), as well as intragenomic variation in nuclear markers such as the internal transcribed spacers of rRNA (Odorico and Miller 1997; Reimer et al. 2007; Sánchez and Dorado 2008). These issues encountered with the most commonly targeted loci are compounded by anthozoans’ morphological variation, plasticity and convergence that hamper accurate species delimitation and stable taxonomic classification more generally (Todd 2008; Ament-Velásquez et al. 2016; Santos et al. 2016).

Within just the last decade, advances in DNA sequencing technologies have firmly rooted phylogenetic biology in the genomic era. Driven by developments in high-throughput sequencing (HTS), phylogeny reconstructions using genome-wide markers, broadly termed phylogenomic analyses (Philippe et al. 2005), are being churned out at an unprecedented pace and taxonomic scale across the tree of life (Galindo et al. 2019; Laumer et al. 2019; One Thousand Plant Transcriptomes Initiative 2019; Prasanna et al. 2020; Williams et al. 2020). For example, over 700 bird species have been robustly placed on the evolutionary tree by combining multiple phylogenomic studies using either ultraconserved elements (UCEs) or whole genome sequencing (WGS) into a supertree (Kimball et al. 2019).

For Anthozoa, phylogenetic analyses of HTS data primarily employ reduced-representation methods such as genome skimming, restriction site-associated DNA (RAD) sequencing, phylotranscriptomics and hybrid capture (Table 1). Nevertheless, the most comprehensive data available for phylogenomics—whole genome assemblies—have seen an uptick in recent years (Table 1; Fig. 2). Indeed, whole genome sequencing enables the extraction of orthologous loci even for traditional markers (e.g. mitochondrial and rRNA genes; see section on ‘Genome skimming’ below), so that the immense number of Sanger sequences deposited on GenBank can also be integrated for large multi-locus analyses.

Table 1 Studies employing high-throughput sequencing data for phylogenomic reconstruction that include Anthozoa, arranged in chronological order for each broad approach. Under “Taxon”, where two taxonomic groups are indicated, the first taxon name indicates the scale of phylogeny reconstructed (excluding outgroup), and the second indicates focal taxa. The various optimality criteria used are BI Bayesian inference; FFP feature frequency profiling; GN continuous-time general Markov nucleotide substitution model; ML maximum likelihood; MP maximum parsimony; NJ neighbour joining; PN phylogenetic network; ST species tree

Fig. 2 Number of phylogenomic studies that include Anthozoa based on (a) the phylogenomic approach utilised, and (b) the clade investigated

There are several key considerations prior to any phylogenomic study. These include the temporal scale of divergence, sample size, the research question, the bioinformatic pipeline adopted and the costs involved (for a review, see Young and Gillung 2020). For most anthozoans, there exists another layer of consideration: endosymbionts. Due to the simple body plan and often small size of the polyps, digestion of samples typically encompasses tissue from the entire holobiont, defined as the organism along with its endosymbiont microbiota, including the prokaryotic partners and Symbiodiniaceae, plus viruses, fungi and other microorganisms where present (Thompson et al. 2015). As a result, any host genomic study must adopt measures to ensure that non-target DNA from contaminant organisms is discarded to avoid erroneous inferences (e.g. Artamonova and Mushegian 2013). Fortunately, there exist a variety of laboratory and bioinformatic methods to minimise the leakage of contaminant sequences into downstream analyses, such as extracting DNA from gametes (e.g. Shinzato et al. 2011, 2020), reducing the symbiont load prior to extraction (e.g. Cunning et al. 2018), and filtering non-target reads by mapping to appropriate references (e.g. Barshis et al. 2013; Forsman et al. 2017; Quek and Huang 2019; Buitrago-López et al. 2020; Quek et al. 2020).
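The bioinformatic filtering step can be illustrated with a deliberately simplified sketch. The studies cited above map reads to reference sequences with dedicated aligners; the example below instead uses shared k-mers as a stand-in for mapping, and the function names, the k-mer size and the 0.5 cutoff are all hypothetical choices made for illustration only.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_host_reads(reads, contaminant_refs, k=8, max_shared=0.5):
    """Keep reads whose k-mers mostly do NOT occur in contaminant references.

    reads: dict of read name -> sequence
    contaminant_refs: list of reference sequences (e.g. symbiont contigs)
    max_shared: hypothetical cutoff on the fraction of k-mers shared
                with the contaminant references
    Simplification: reverse complements and sequencing errors are ignored.
    """
    contaminant_kmers = set()
    for ref in contaminant_refs:
        contaminant_kmers |= kmers(ref.upper(), k)

    kept = {}
    for name, seq in reads.items():
        read_kmers = kmers(seq.upper(), k)
        if not read_kmers:
            continue  # read shorter than k, cannot be assessed
        shared = len(read_kmers & contaminant_kmers) / len(read_kmers)
        if shared <= max_shared:
            kept[name] = seq
    return kept
```

In practice the choice of references matters as much as the filter itself, since reads from organisms absent from the reference set pass through unnoticed.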

Phylogenomics is a powerful tool in the arsenal of an evolutionary biologist. As climate change accelerates, the urgency to understand the evolutionary basis of species’ adaptation capacity and resilience has become more pressing than ever. The destruction of large swathes of shallow, mesophotic and deep-sea reef habitats around the world (Roberts et al. 2006; Hughes et al. 2017, 2018; Cunning et al. 2019), compounded by rising water temperatures and ocean acidification, translates into devastating effects projected for the environment and global economy in the next few decades (Speers et al. 2016; Dey et al. 2016; Stuart-Smith et al. 2018). To mitigate impacts on marine populations and initiate conservation policies, genomic data on anthozoans are critical as analyses can help identify cryptic species for biodiversity estimation and protection, improve population connectivity, and evaluate phylogenetic diversity as a key biodiversity measure for prioritisation (Faith 1992; Huang and Roy 2013, 2015; Winter et al. 2013; Huang et al. 2015, 2016b; Balbar and Metaxas 2019; Titus et al. 2019a, 2019b). Furthermore, the study of genomes and transcriptomes in the context of the anthozoan phylogeny can provide insights into the evolution of genes and adaptations in this diverse and ancient lineage (~ 800 Ma; Quattrini et al. 2020; McFadden et al. 2021) to environmental stressors amidst the current climate crisis (Moya et al. 2012; Takeuchi et al. 2016; Huang et al. 2017; Lin et al. 2017a, 2017b; Conci et al. 2019).

This review consolidates phylogenomic studies performed to date on Anthozoa in order to assess the taxonomic coverage and phylogenetic utility of the wide variety of tools available, as well as to identify gaps and opportunities for advancing a comprehensive, resolved and accurate representation of the Anthozoa tree of life. To compile published papers on anthozoan phylogenomic inferences, we searched Google Scholar with associated keywords, each preceded by the term ‘Anthozoa’. We trawled through the first ten pages of results, and selected studies that employed HTS on anthozoans and/or conducted phylogenomic/mitogenomic reconstructions with anthozoans included as a focal taxon (Tables 1, 2, S1; Fig. 2). Studies that investigated gene evolution or population-level phylogeography without reference to species relationships, cryptic or otherwise (e.g. Reitzel et al. 2013; Poole and Weis 2014; Takeuchi et al. 2016; Klompen et al. 2020; Sturm et al. 2020) were excluded. Only reconstructions with a broad sampling of orthologous loci were retained (Tables 1, 2; Fig. 2). We note that, depending on the resolution of search terms, some phylogenomic studies might inadvertently be excluded in this manner of searching. For instance, a RAD-seq study by Spano et al. (2018) on sea anemones (order Actiniaria) does not include “Anthozoa” in the title or as a keyword, similar to a number of mitogenome announcements (e.g. Niu et al. 2016a, 2016b, 2017a, 2017b; Teranneo et al. 2018b; Wang et al. 2018; Yu et al. 2019). Therefore, additional studies were identified from the reading of other papers and their references (i.e. snowballing; Greenhalgh and Peacock 2005), in conjunction with our previous research experiences and publications shared by colleagues (Tables 1, S1). Together, the publications assembled here encompass a wide variety of studies and methods on a considerable diversity of corals and sea anemones for a comprehensive review of the phylogenomic advances in Anthozoa.

Table 2 Phylogenetic software and associated optimality criteria used in anthozoan phylogenomic studies (see Table 1; BI Bayesian inference; FFP feature frequency profiles; GN continuous-time general Markov nucleotide substitution model; ML maximum likelihood; MP maximum parsimony; NJ neighbour joining; PN phylogenetic network; ST species tree)

Genome skimming

Genome skimming involves extracting data from shallow HTS of whole genomes, whereby high-copy fractions (e.g. plastid, mitochondrial and rDNA genes) are ‘skimmed’ from the sequencing data for phylogenetic analysis. First established in plant phylogenetic studies, this method has been successfully applied on herbarium samples (Straub et al. 2012; Ripma et al. 2014; Nevill et al. 2020), and is especially useful for non-model organisms with no reference genomes or transcriptomes available for hybrid capture and sequencing of genome-wide markers (see section on ‘Hybrid capture and target enrichment sequencing’ below). Importantly, genome skimming enables analyses to take advantage of the large quantity of Sanger-sequenced data that can be combined with HTS data on the naturally-enriched markers for phylogenetic reconstruction, and may also help identify low-copy nuclear loci to be targeted in follow-up studies (Cronn et al. 2012; see also Zhang et al. 2019a).
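The idea that high-copy fractions stand out in shallow sequencing data can be sketched as a simple depth filter. This is a minimal illustration, assuming that contigs have already been assembled and that a mean read depth is available for each; the function name and the ten-fold threshold are hypothetical and not taken from any of the cited studies.

```python
from statistics import median


def skim_high_copy(contig_depth, fold=10.0):
    """Return names of contigs sequenced far more deeply than the background.

    contig_depth: dict of contig name -> mean read depth
    fold: hypothetical multiple of the median depth above which a contig
          is treated as high-copy (e.g. mitochondrial or rDNA)
    """
    baseline = median(contig_depth.values())
    return sorted(
        name for name, depth in contig_depth.items()
        if depth >= fold * baseline
    )


# Toy depths: three low-copy nuclear contigs and two high-copy contigs.
depths = {"nuc1": 2.0, "nuc2": 3.0, "nuc3": 2.5, "mito": 400.0, "rDNA": 90.0}
high_copy = skim_high_copy(depths)
```

With the toy depths above, the median is 3.0, so only the two contigs at or above a depth of 30 are retained. Candidate contigs would still need to be confirmed by similarity to known mitochondrial or rRNA sequences.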

Prior to HTS, whole mitochondrial genomes were typically sequenced through long-range polymerase chain reaction or primer walking (Uda et al. 2013; Lin et al. 2014; Figueroa and Baco 2015), both potentially costly and time-consuming methods. With HTS, mitogenomes across a number of samples can be obtained via multiplexing samples. One of the earliest studies investigating the mitogenomic phylogeny of anthozoans using HTS was on 11 Acropora (Scleractinia) species sequenced on an Illumina Solexa platform (Liu et al. 2015). Since then, HTS of mitogenomes has been applied widely on anthozoans, such as pennatulaceans (order Pennatulacea; Hogan et al. 2019), actiniarians (Foox et al. 2016; Xiao et al. 2019), ceriantharians (order Ceriantharia; Stampar et al. 2019) and the poorly studied antipatharians (order Antipatharia; Barrett et al. 2020). In addition to mitogenomes, other studies have skimmed mitochondrial, rRNA and histone genes from alternative sequencing data, including ezRAD (see section on ‘Restriction site-associated DNA sequencing (RAD-seq)’ below), hybrid capture markers (Poliseno et al. 2020; Quek et al. 2020) and transcriptomes (Chi and Johansen 2017).

In the past decade, mitogenomes have provided valuable insights into the genome and organismal evolution of anthozoans, challenging the conventional understanding of genome organisation in anthozoans and phylogenetic relationships among certain clades. For example, bipartite circular mitochondrial genomes have been discovered in Protanthea simplex (Actiniaria; Dubin et al. 2019) and Umbellula sp. (Pennatulacea; Hogan et al. 2019), contrary to the typical single circular mitogenome in most anthozoans. Relatedly, ceriantharians have been suggested to possess a linear mitochondrial genome (Stampar et al. 2019; but see Smith 2020), a trait characteristic of Medusozoa (Kayal et al. 2013) but not Anthozoa. With respect to phylogenetic relationships, Kayal et al. (2013) doubted the monophyly of Anthozoa, with Octocorallia recovered as sister to Medusozoa on their mitochondrial tree even after accounting for biases that may result from amino acid composition and limited taxon sampling. However, recent studies based on phylotranscriptomics supported the monophyly of Anthozoa (Zapata et al. 2015; Pratlong et al. 2017a; Kayal et al. 2018; see also Fig. 1). More pertinently, one of the greatest controversies first raised with mitogenome phylogenies was the “naked coral hypothesis” (Medina et al. 2006), which posited that corallimorpharians (order Corallimorpharia) are scleractinians that have lost their skeleton. Examining amino acid trees that nested corallimorpharians within scleractinians, Kitahara et al. (2014) surmised that the unexpected relationship was likely an artefact of amino acid sequence biases, and this conclusion was later supported by broad-scale genomic analyses (Lin et al. 2016; Wang et al. 2017a). Given the slow rate of evolution in mitochondrial genes in anthozoans, substitution saturation and amino acid composition biases in some clades (Shearer et al. 2008; Huang et al. 2008; Kitahara et al. 2014; Pratlong et al. 2017a), we caution against over-reliance on mitochondrial analyses in studying anthozoan evolution, especially given the numerous high-throughput methods that have been established (as reviewed here). Instead, mitogenome sequencing would be best targeted at evaluating mitochondrial evolution and gene rearrangements (e.g. Chen et al. 2009; Lin et al. 2014; Dubin et al. 2019; Hogan et al. 2019).

With an increasing number of available genomes, the range of phylogenetic markers for genome skimming is likely to expand beyond mitochondrial and rRNA genes. Increasingly, seascape genomic analyses are taking advantage of genome data to recover phylogeographic patterns. For example, Bellis et al. (2016) identified several locality-specific lineages in the model anemone Exaiptasia (Actiniaria) based on a genome skimming approach, specifically by mapping their sequencing reads to a reference genome from Baumgarten et al. (2015). Likewise, Kitchen et al. (2019) utilised the Acropora digitifera reference genome (Shinzato et al. 2011) to identify single nucleotide variants between the two Caribbean acroporids (A. cervicornis and A. palmata) and their hybrid (A. prolifera).

However, relative to the genome-wide approaches below, the shallow sequencing effort generally results in low probability of obtaining data for low-copy genes that are orthologous across a number of samples. Therefore, the applicability of genome skimming across divergent clades and as a broad-based phylogenomic tool is likely to remain circumscribed (but see Zhang et al. 2019a).

Restriction site-associated DNA sequencing (RAD-seq)

RAD-seq (Baird et al. 2008) and its predecessor, RAD marker identification and analysis (Miller et al. 2007), are traditionally used as tools for genotyping and discovering variable genetic markers. Generally, this approach utilises specific restriction enzymes to digest genomic DNA, and the flanking regions of each restriction site are used to identify homologous loci bioinformatically with software packages such as Stacks (Catchen et al. 2013; Rochette et al. 2019) and PyRAD (Eaton 2014). Alignments of homologous loci enable the identification of single nucleotide polymorphisms (SNPs), which can be used to infer population structure and phylogeny. Analyses of RAD-seq data for phylogenomics are typically carried out with one of two methods: in the form of concatenated supermatrix analysis (Iguchi et al. 2019) or species tree approaches (Herrera and Shank 2016; Johnston et al. 2017), although there are alternatives such as the assembly- and alignment-free method (Fan et al. 2017). Detailed laboratory and experimental protocols, along with variations of RAD-seq, have been covered thoroughly in a review by Andrews et al. (2016).
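The digestion step described above can be mimicked in silico to show where RAD loci come from. The sketch below is an illustration only and does not reproduce Stacks or PyRAD: the function name and flank length are hypothetical, and the eight-base site CCTGCAGG (recognised by the enzyme SbfI) is used simply as an example recognition sequence.

```python
import re


def rad_tags(genome, site="CCTGCAGG", tag_len=10):
    """Locate each recognition site and return its flanking sequences.

    Returns a list of (position, upstream flank, downstream flank) tuples.
    Simplifications: only the given strand is scanned, and flanks are cut
    at the ends of the recognition site rather than at the true cut point.
    """
    tags = []
    # A zero-width lookahead reports every occurrence, including overlaps.
    for match in re.finditer(f"(?={re.escape(site)})", genome):
        start = match.start()
        end = start + len(site)
        upstream = genome[max(0, start - tag_len):start]
        downstream = genome[end:end + tag_len]
        tags.append((start, upstream, downstream))
    return tags


# Two toy "genomes": the second carries a point mutation in one site,
# so that locus is no longer recovered from that sample.
sample_a = "AAAAA" + "CCTGCAGG" + "TTTTTGGGGG" + "CCTGCAGG" + "CCCCC"
sample_b = "AAAAA" + "CCTGCAGG" + "TTTTTGGGGG" + "CCTGTAGG" + "CCCCC"
loci_a = rad_tags(sample_a, tag_len=5)
loci_b = rad_tags(sample_b, tag_len=5)
```

Comparing the two toy samples shows how a single substitution within a recognition site removes a locus from one sample, which is the source of the missing data that grows with divergence.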

First tested on Nematostella (Actiniaria) by Reitzel et al. (2013), RAD-seq has since proved to be an invaluable tool for resolving the taxonomy, phylogeny and population structure of anthozoans. The benefits of RAD-seq in phylogenomics stem from several facets: the relatively low investment required in time and cost; applicability for non-model taxa without a reference genome; and suitability for resolving both relatively deep and shallow divergences (Peterson et al. 2012; Toonen et al. 2013; Eaton et al. 2017). There are several variations of RAD-seq commonly used in anthozoans, varying mainly in the number of restriction enzymes used and the sequencing approach. They are: (1) conventional RAD-seq using a single restriction enzyme and single-end reads in Nematostella (Reitzel et al. 2013) and Chrysogorgia (order Alcyonacea) (Pante et al. 2015); (2) RAD-PE with paired-end sequencing of RAD fragments (Etter et al. 2011) in Pocillopora (Scleractinia) (Combosch and Vollmer 2015) and Anemonia viridis (Actiniaria) (Porro et al. 2020); (3) double-digest RAD-seq with two restriction enzymes (Peterson et al. 2012), which has been used on Bartholomea annulata (Actiniaria) (Titus et al. 2019a); (4) DArTSeq, a genome complexity reduction method similar to RAD-seq as tested in Acropora (Scleractinia) (Rosser et al. 2017) and Pocillopora (Smith et al. 2017); and (5) ezRAD (Toonen et al. 2013), a popular method which uses either a single enzyme or a combination of enzymes in non-model organisms, as applied on Porites (Forsman et al. 2017), Montipora (Scleractinia) (Cunha et al. 2019) and Ovabunda (Alcyonacea) (McFadden et al. 2017), for example.

At relatively shallow divergences (< 100 Ma), RAD-seq is a powerful tool that is able to reconstruct robust phylogenies and characterise cryptic species. For example, the species boundaries and phylogeny of the scleractinian genera Pocillopora and Leptastrea have been clarified considerably by Johnston et al. (2017) and Arrigoni et al. (2020), respectively. Herrera and Shank (2016) tested a species delimitation hypothesis with a RAD-seq phylogeny for the morphologically enigmatic octocoral Paragorgia (Alcyonacea). In resolving specific cryptic species problems, Iguchi et al. (2019) demonstrated that the blue coral Heliopora coerulea (order Helioporacea) around Okinawa actually comprises two distinct species, and Quattrini et al. (2019) not only discovered several cryptic species within the genus Sinularia (Alcyonacea) at Dongsha Atoll but also highlighted the importance of hybrid speciation in the evolution of anthozoans (see also Mao 2020).

As with other reduced-representation sequencing techniques (e.g. hybrid capture; Allio et al. 2020), certain genomic features such as mitochondrial genes or genomes can be extracted and assembled (Table 1). Consequently, RAD-seq data have been used to assemble mitogenomes for several Porites (Scleractinia) species to date (Tisthammer et al. 2016; Teranneo et al. 2018a, 2018b). Similarly, Capel et al. (2020) were able to skim mitochondrial and rRNA genes from RAD data, thereafter describing the genus Atlantia (Scleractinia). Beyond phylogeny reconstruction, RAD-seq data analyses guided Pratlong et al. (2017b) to propose an XX/XY sex-determination system in Corallium rubrum (Alcyonacea). Such data were also used to investigate DNA methylation in Porites astreoides resulting from environmental changes (Dimond and Roberts 2020). While Dimond and Roberts (2020) were able to annotate several genes experiencing changes in DNA methylation involved in cell signalling and proliferation, mRNA splicing and apoptosis, they lamented the lack of well-annotated genomes for further functional inferences in other methylated loci (see Neri et al. 2017).

Despite the increasing popularity of RAD-seq in anthozoan phylogenomics (Fig. 2a), its limitations are apparent for deep divergences. In particular, mutations at the restriction sites accumulate with evolutionary time, so that the amount of missing data increases the more evolutionarily divergent the focal species are. More importantly, this pattern reduces the ability to infer orthology in RAD loci based on sequence similarity (Rubin et al. 2012). Recent studies (e.g. Herrera and Shank 2016; Eaton et al. 2017) have demonstrated that RAD-seq is able to infer robust phylogenies for divergences between 50 and 80 Ma, but given that the origin of Anthozoa is estimated to be ~ 800 Ma (Quattrini et al. 2020; McFadden et al. 2021), the applicability of RAD-seq for broad-scale phylogenies is limited. Therefore, the use of RAD-seq will likely remain confined to phylogenetic analyses of closely-related species.
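The relationship between divergence and locus dropout can be made concrete with a toy calculation. The model below is an assumption introduced here for illustration and is not taken from the cited studies: it treats every base of a recognition site as differing between two samples independently with probability equal to their pairwise sequence divergence, and it ignores multiple substitutions at the same position, fragment size selection and allele dropout within species.

```python
def expected_shared_fraction(divergence, site_len=8):
    """Expected fraction of restriction sites still intact in both samples.

    divergence: proportion of differing sites between two samples (0 to 1)
    site_len: length of the recognition sequence in bases (hypothetical
              default of 8, as for an eight-base cutter)
    Toy model: each base of the site differs independently with
    probability `divergence`, so the whole site is shared with
    probability (1 - divergence) ** site_len.
    """
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must lie between 0 and 1")
    return (1.0 - divergence) ** site_len


# Shared fraction of loci at increasing pairwise divergence.
for d in (0.01, 0.05, 0.10, 0.20):
    print(f"divergence {d:.2f}: {expected_shared_fraction(d):.3f}")
```

Even under these generous assumptions, the expected proportion of shared loci falls well below half at 10% divergence, and protocols requiring two intact sites per fragment would lose loci faster still.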

Reductively-amplified DNA sequencing (ReAD-seq)

Leveraging the principles of RAD-seq, reductively-amplified DNA sequencing (ReAD-seq) circumvents the need for restriction enzyme digestion by substituting it with PCR-based reactions. Principally, two types of ReAD-seq are employed: Nextera-tagmented reductively-amplified DNA (NextRAD) (Russello et al. 2015) and multiplexed inter-simple sequence repeat (ISSR) genotyping (MIG-seq) (Suyama and Matsuki 2015). In the former, selective primers are used to amplify consistent loci across samples, whereas the latter amplifies anonymous non-repetitive regions using ISSR primers that anchor on simple sequence repeats (SSRs; e.g. microsatellites). Similar to RAD-seq, homologous loci are identified, with subsequent genotyping and SNP identification. Applications of this method include cryptic species identification and population genetics.

Despite the high efficiency and utility, the popularity of ReAD-seq pales in comparison with RAD-seq for anthozoan phylogenomics (Fig. 2). Nevertheless, a recent NextRAD study by Bongaerts et al. (2020; see also Bongaerts et al. 2017) clearly demonstrated that, contrary to popular belief, the widespread Indo-Pacific coral Pachyseris speciosa (Scleractinia) is an ancient cryptic species complex with three morphologically indistinguishable lineages. Using MIG-seq, Richards et al. (2018) described a blue coral species Heliopora hiberniana (Helioporacea) from north Western Australia as distinct from the widespread H. coerulea. Despite identical cytochrome c oxidase subunit I (COI) and mutS sequences among all individuals analysed, principal coordinate analysis (PCoA) and STRUCTURE analysis (Pritchard et al. 2000) on MIG-seq SNPs confirmed the two distinct lineages, H. coerulea and H. hiberniana, which were also morphologically distinguishable. Similarly, Pleurocorallium elatius and P. konojoi (Alcyonacea) from Japan were inseparable with two mitochondrial loci, but both morphology and MIG-seq data clearly delineated the two species with almost no demonstrable gene flow (Takata et al. 2019, see also Tu et al. 2015). Notably, these MIG-seq studies did not reconstruct a phylogeny and have been excluded from Fig. 2 and Table 1.

For deeper divergences, the power of both RAD- and ReAD-seq trails off precipitously (see above), so we now turn our attention to the more versatile phylogenomic tools: phylotranscriptomics, hybrid capture and whole genome sequencing.

Phylotranscriptomics

The transcriptome of an organism refers to all RNA transcripts of both coding and non-coding regions. Broadly, data generated from RNA sequencing (or RNA-seq) are typically used for investigating developmental biology (Okubo et al. 2016; Rentzsch and Technau 2016), phylotranscriptomics (Lin et al. 2016; Quek and Huang 2019; Richards et al. 2020), gene evolution (Gacesa et al. 2015; Huang et al. 2016a; Lin et al. 2017a) and comparative genomics (Bhattacharya et al. 2016). Similar to the aforementioned methods, phylotranscriptomics bypasses the need for assembled genome data, as single-copy orthologous genes are identified bioinformatically following sequencing and assembly. Phylogenetic reconstructions tend to be equivalent to inferences made from whole genome sequencing but can be achieved at a fraction of the cost (see Cheon et al. 2020).
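The bioinformatic selection of single-copy orthologous genes can be sketched as a filter over orthogroups. This is a minimal illustration, assuming that transcripts have already been clustered into orthogroups by a dedicated tool; the data layout, function name and occupancy parameter are hypothetical.

```python
def single_copy_orthogroups(orthogroups, taxa, min_occupancy=1.0):
    """Select orthogroups with at most one gene per taxon.

    orthogroups: dict of orthogroup id -> dict of taxon -> list of gene ids
    taxa: list of all taxa in the analysis
    min_occupancy: hypothetical minimum fraction of taxa that must be
                   represented (1.0 means no missing taxa are tolerated)
    """
    selected = []
    for og_id, members in orthogroups.items():
        counts = [len(members.get(taxon, [])) for taxon in taxa]
        if any(count > 1 for count in counts):
            continue  # more than one copy in some taxon: possible paralogy
        occupancy = sum(1 for count in counts if count == 1) / len(taxa)
        if occupancy >= min_occupancy:
            selected.append(og_id)
    return selected


taxa = ["Acropora", "Porites", "Nematostella"]
orthogroups = {
    "OG1": {"Acropora": ["a1"], "Porites": ["p1"], "Nematostella": ["n1"]},
    "OG2": {"Acropora": ["a2", "a3"], "Porites": ["p2"], "Nematostella": ["n2"]},
    "OG3": {"Acropora": ["a4"], "Porites": ["p3"]},
}
strict = single_copy_orthogroups(orthogroups, taxa)
relaxed = single_copy_orthogroups(orthogroups, taxa, min_occupancy=0.6)
```

Relaxing the occupancy threshold admits loci that are absent from some transcriptomes, trading a larger dataset for more missing data.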

Phylotranscriptomics is an ideal tool for analysing the phylogeny of early-diverging metazoans such as sponges (Simion et al. 2017; Feuda et al. 2017), ctenophores (Whelan et al. 2015; Borowiec et al. 2015) and cnidarians (Zapata et al. 2015; Kayal et al. 2018). In Anthozoa, phylotranscriptomics has clarified perplexing phylogenies reconstructed from mitochondrial data. In particular, phylotranscriptomics cemented the monophyly of both Anthozoa (Zapata et al. 2015; Pratlong et al. 2017a; Kayal et al. 2018) and Scleractinia (Lin et al. 2016). To date, the largest phylotranscriptomic study at the order level included only 44 scleractinians (Quek and Huang 2019), despite an ever-increasing number of transcriptomes available. This approach also remains underutilised for shallower timescales. A recent paper on Acroporidae (Scleractinia) determined that Alveopora is sister to Montipora rather than Astreopora (Richards et al. 2020; see Ryu et al. 2019a), but such studies are few and far between.

We note that there has been much research on the transcriptomes of anthozoans such as sea anemones (Urbarova et al. 2012; Mitchell et al. 2020), scleractinians (Pootakham et al. 2018; Ryu et al. 2019a), octocorals (Ryu et al. 2019b) and zoantharians (order Zoantharia; Huang et al. 2016a, 2017), but with specific objectives other than phylogeny reconstruction. Primarily, these studies focused on identifying toxins and bioactive compounds, testing gene responses under a variety of environmental stressors and reconstructing gene evolution. Consequently, there are multiple well-annotated transcriptomes and associated databases (Liew et al. 2016; Zhang et al. 2019b) that serve as foundations for subsequent phylotranscriptomic and gene evolution studies.

Unlike with RAD-seq where anonymous loci are sequenced, annotation of gene transcripts opens up opportunities for investigating gene and genome evolution (Bhattacharya et al. 2016; Lin et al. 2017a; Koch and Grimmelikhuijzen 2019). When genetic changes are traced on a robust phylogeny, specific hypotheses about the evolutionary history of anthozoans can be tested. For example, Bessho-Uehara et al. (2020) demonstrated that the last common ancestor of all octocorals was likely bioluminescent. In the same vein, the origin and evolution of the aragonite skeleton in scleractinians are of particular interest for reef-building corals (Quattrini et al. 2020). Bhattacharya et al. (2016) examined the transcriptomes of 20 different stony corals and identified genes involved in accretion of the coral skeleton, whereas Lin et al. (2017a) focused primarily on differences in carbonic anhydrases between Scleractinia and Corallimorpharia, concluding that the greater number of scleractinian genes involved in secreted and membrane-associated carbonic anhydrases was critical for the evolution of calcification. As reef ecosystems are threatened by accelerating climate change, more comparative genomic studies across broader phylogenetic scales (e.g. Guzman et al. 2018; Conci et al. 2019; Quattrini et al. 2020) will be needed to determine their fate.

While phylotranscriptomic data are cheaper to obtain and more readily available than whole genome data, there are some drawbacks. Crucially, due to the instability of RNA, samples have to be carefully handled and stored prior to extraction (Seelenfreund et al. 2014). This constraint excludes most museum samples and increases the cost of sample preservation. Furthermore, due to uneven expression levels among genes and differences in sequencing depth, missing data would be expected among orthologs. As a result, orthologs may be misidentified by algorithms that use gene-tree-free methods (Cheon et al. 2020). Missing data and data type (i.e. amino acid vs. DNA) may also impact phylogenetic inference, although Quek and Huang (2019) found limited effects of missing data on tree topologies inferred for scleractinians, instead emphasising the importance of meticulously checking for irregularities among gene trees (see also Xi et al. 2016). The authors also noted that DNA sequences are superior to amino acid sequences in achieving higher branch supports and congruence among loci (Quek and Huang 2019; see also Ying et al. 2018). To overcome some of the limitations with phylotranscriptomic analysis, more studies are beginning to opt for hybrid capture instead.

Hybrid capture and target enrichment sequencing

A relatively novel approach, hybrid capture coupled with target enrichment has been gaining traction in phylogenomic studies across the tree of life (Faircloth et al. 2012; Lemmon et al. 2012; Karin et al. 2020). Briefly, specific DNA or RNA ‘probes’ (also known as ‘baits’) that are complementary to target loci are designed which are then hybridised to DNA libraries. Orthologous targets are typically conserved regions of DNA, with variable flanking regions that may be phylogenetically informative. Unbound DNA is then discarded, and the targeted loci are enriched and sequenced accordingly. Hybrid capture has been used in identification of infectious diseases (Gaudin and Desnues 2018), assembling genomes from ancient remains and museum samples (Gasc et al. 2016), and phylogenomics (for a review, see Andermann et al. 2020). This approach is particularly versatile as researchers have the flexibility to design probes based on the intended level of taxonomic inclusivity. Hybrid capture baits have been used for a number of marine taxa based on transcriptomes, genomes and RAD data, including ophiuroids (Hugall et al. 2016), mobulids (White et al. 2017) and zooplankton (Choquet et al. 2019), respectively.
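Bait design can be illustrated by tiling overlapping probes along a target locus. The sketch below is only an illustration and does not reproduce the design procedure of any bait set discussed here: the function name, the 120-base bait length and the 60-base step are assumed values, and real baits would be synthesised to be complementary to the target rather than identical to it.

```python
def tile_baits(target, bait_len=120, step=60):
    """Return overlapping bait sequences that cover the target end to end.

    target: sequence of the locus to be captured
    bait_len: bait length in bases (assumed value)
    step: distance between consecutive bait start positions (assumed value)
    """
    if bait_len <= 0 or step <= 0:
        raise ValueError("bait_len and step must be positive")
    if len(target) <= bait_len:
        return [target]  # a single bait spans the whole target
    last_start = len(target) - bait_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # final bait sits flush with the end
    return [target[s:s + bait_len] for s in starts]


# A 250-base toy target tiled with 120-base baits every 60 bases.
toy_target = "ACGT" * 62 + "AC"
baits = tile_baits(toy_target)
```

A smaller step increases tiling density and hence the number of baits per locus, which raises synthesis cost but gives each target base more opportunities to be captured.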

There are currently three sets of anthozoan baits designed for different taxonomic levels and groups: Hexacorallia (Cowman et al. 2020; see also Quattrini et al. 2018), Octocorallia (Erickson et al. 2020; see also Quattrini et al. 2018) and Scleractinia (Quek et al. 2020). The first two consist of a combined set of loci targeting all anthozoans, optimised from Quattrini et al. (2018) based on both ultraconserved elements (UCEs) and exons as references, whereas the last design is based on exons. These bait sets have shown great promise in phylogenomic reconstructions. Quattrini et al. (2018) first demonstrated the feasibility of their baits across almost all major clades of anthozoans (n = 33). A greatly expanded sampling in Quattrini et al. (2020), sequenced using two bait sets (Quattrini et al. 2018; Cowman et al. 2020), resulted in a robust, time-calibrated phylogeny of 234 taxa from all anthozoan orders. This study found that the evolution of skeletal traits and diversification rates of anthozoans closely mirrored palaeoclimatic changes through geologic time. For example, high levels of carbon dioxide in the atmosphere during the early to mid-Palaeozoic likely resulted in ocean acidification that impacted aragonitic skeleton production among scleractinians but drove the cladogenesis of actiniarians, ceriantharians and zoantharians. Accordingly, it has been suggested that contemporary climate change and ocean acidification could lead to the replacement of reef-building corals by their more resilient octocoral relatives (Quattrini et al. 2020), as exemplified by contemporary ecological trends (Inoue et al. 2013; Tsounis and Edmunds 2017).

Hybrid capture has also proven applicable to more recent anthozoan divergences. Both Cowman et al. (2020) and Quek et al. (2020) tested their bait sets on scleractinians, with the former focusing on Acroporidae, Agariciidae and Fungiidae, and the latter on a broad sampling across 12 different families. Notably, Quek et al. (2020) highlighted the impact of sequence specificity on bait design, as they captured considerably fewer loci from sister clade Corallimorpharia compared to Scleractinia. At the genus level, Erickson et al. (2020) utilised hybrid capture to study species delimitation and population genomics in soft corals Alcyonium and Sinularia (Alcyonacea). Leveraging variable sequences that flank UCEs or exons (i.e. introns), their analysis broadly corroborated the RAD-seq phylogeny of Sinularia in Quattrini et al. (2019) while resolving the phylogeny and population genomics of the two genera.

Despite the manifold advantages of hybrid capture-based phylogenomics, such as its affordability, applicability for old museum samples and capacity to capture homologous loci at various taxonomic scales (see McKain et al. 2018), it still falls short compared to whole genome methods. For example, bait design and downstream analyses can be confounded by paralogy stemming from whole genome duplications such as in Acropora (Scleractinia) (Mao and Satoh 2019). Furthermore, genome characterisation may generate broader insights beyond phylogenetics and population genomics, such as the evolution of body plans (Putnam et al. 2007; DuBuc et al. 2018), novel metabolic pathways and gene evolution (Ying et al. 2018; Cunning et al. 2018; Surm et al. 2019), sexual reproduction (Wilding et al. 2020), eco-evolutionary processes (Ledoux et al. 2020) and adaptive radiation (Mao et al. 2018). Therefore, whole genome analysis is likely the ultimate frontier for advancing phylogenomic research.

Whole genomes

The undertaking of sequencing a genome is no small feat. Challenges can arise from cross-contamination by endosymbionts, difficulties in genome assembly and annotation as well as the overall investment required (Dominguez Del Angel et al. 2018). The genomes of many anthozoans are particularly difficult to assemble due to cross-contamination from a mixture of Symbiodiniaceae and other microorganisms (Artamonova and Mushegian 2013). To reduce non-target sequence load, DNA can be isolated from gametes (Shinzato et al. 2011, 2020; Robbins et al. 2019), symbionts separated prior to DNA extraction (Cunning et al. 2018) or non-target reads bioinformatically filtered post-sequencing (Buitrago-López et al. 2020).
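As a rough illustration of the last option, post-sequencing filtering of non-target reads, the Python sketch below flags reads that share a large fraction of their k-mers with a set of known contaminant (e.g. symbiont) sequences. This is a toy example under stated assumptions: the function names, the k-mer size and the 0.5 threshold are hypothetical, reverse-complement matches are ignored for brevity, and it does not reproduce the pipelines used in the cited studies, which typically rely on dedicated alignment or classification software.

```python
from typing import Iterable, List, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in a sequence (empty if shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_reads(
    reads: Iterable[str],
    contaminant_refs: Iterable[str],
    k: int = 21,
    max_shared_fraction: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split reads into (kept, removed) by k-mer overlap with contaminants.

    k and max_shared_fraction are illustrative values, not recommendations.
    Only the forward strand is compared in this simplified sketch.
    """
    contaminant_kmers: Set[str] = set()
    for ref in contaminant_refs:
        contaminant_kmers |= kmers(ref, k)

    kept: List[str] = []
    removed: List[str] = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            kept.append(read)  # too short to assess, so keep it
            continue
        shared = len(read_kmers & contaminant_kmers) / len(read_kmers)
        if shared > max_shared_fraction:
            removed.append(read)
        else:
            kept.append(read)
    return kept, removed
```

In practice, the removed reads would usually be inspected rather than simply discarded, since an overly permissive threshold risks filtering out genuine host sequence that happens to resemble the contaminant reference.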

Presently, whole genomes of anthozoans are still sorely lacking compared to several other metazoan groups, for which researchers are aiming to sequence 5000 to 10,000 genomes (Zhang 2015; i5K Consortium 2013; Fan et al. 2020). In stark contrast, one of the largest anthozoan genome sequencing projects underway involves the sequencing of only 10 scleractinians (ReFuGe 2020 Consortium 2015). Nevertheless, there has been substantial progress since the emergence of the first sea anemone genome (Putnam et al. 2007), with a recent bumper crop of 18 Acropora (Scleractinia) genomes assembled within a single study (Shinzato et al. 2020). Unlike hybrid capture studies where baits can be designed across different scales of divergence, whole genomes are taxon-specific. As a result, there exists an imbalance of whole genomes assembled among the major anthozoan clades. There are now sequenced genomes for scleractinians, actiniarians, corallimorpharians and two octocoral orders, Alcyonacea and Pennatulacea (Jiang et al. 2019; Ledoux et al. 2020; Table 1), but not for antipatharians, ceriantharians, helioporaceans and zoantharians.

Similar to phylotranscriptomics, genomic phylogenetic analysis is typically based on ortholog assignments of protein-coding genes (but see Cunning et al. 2018; Sims et al. 2009), which enable combining of transcriptomic and genomic data for phylogeny reconstruction (e.g. Lin et al. 2016; Kayal et al. 2018; Richards et al. 2020). Genome assemblies can generate critical insight into gene and genome evolution. For instance, Ying et al. (2018) identified an ancestral, complete fungal-like histidine biosynthesis pathway that is present only in “Robust” corals, but not in any other metazoan. In the “Complex” scleractinian Acropora, whole genome analyses have suggested that there was likely a whole genome duplication event that occurred ~ 31 Ma in the most recent common ancestor of the genus (Mao and Satoh 2019), and the contemporary diversity likely resulted from introgression and adaptations to past climate conditions (Mao et al. 2018; Shinzato et al. 2020).

Fundamentally, genome-based phylogenies can be applied across nearly all taxonomic levels (Kapli et al. 2020). However, we emphasise that whole genomes do not necessarily lead to more accurate inferences. For example, the earliest branch of metazoans was the subject of great contention even with analyses of large phylogenomic supermatrices (reviewed in Philippe et al. 2017). Rather than maximising the number of loci analysed, it is critical to ensure that taxon sampling is sufficiently broad (Philippe et al. 2011), robust model selection parameters are adopted (Simon et al. 2017; but see Abadi et al. 2019 cf. Gerth 2019), accurate orthologs are inferred (Cheon et al. 2020), and other systematic errors are avoided (see Philippe et al. 2017; Young and Gillung 2019). Unfortunately, whole genome assemblies in the majority of anthozoans remain a distant possibility due to the astronomical costs and challenges of such an undertaking. All of the alternatives reviewed above are therefore likely to be instrumental for anthozoan phylogenetics in the near future.

Considerations in anthozoan phylogenomics

The development of HTS technologies over the last decade has been mirrored by the uptake of genomic approaches for phylogenetic studies (see Fig. 2). The scale of analysis has also increased as a result. Genome skimming is the most popular method to date, but downstream phylogenetic analyses have largely relied on mitochondrial genes (Fig. 2a; Table 1). In contrast, other reduced-representation approaches that are increasingly being used can include hundreds to thousands of loci, with some methods allowing for the merging of loci for phylogeny reconstruction (e.g. Bossert et al. 2019). This trend towards greater gene sampling will continue with the increasing availability of newer sequencing technologies, particularly the third-generation sequencing platforms produced by Oxford Nanopore Technologies and Pacific Biosciences (PacBio). The nanopore and single-molecule real-time (SMRT) sequencing technologies generate longer reads, offering a number of improvements over short-read sequencing, albeit with lower accuracy (Pollard et al. 2018; Burgess 2018). Studies adopting either of these methods thus typically include Illumina sequencing for error correction to generate hybrid assemblies (Jeon et al. 2019; Jiang et al. 2019; Shumaker et al. 2019; Shinzato et al. 2020). Nevertheless, long-read sequencing remains expensive relative to its output and requires computationally intensive analyses (for a review, see Amarasinghe et al. 2020).

Hybrid assemblies are the gold standard for reference assemblies and downstream phylogenomic analyses (Philippe et al. 2005; Burgess 2018). Current short-read sequencing technologies face some restrictions, including missing genes in the final assembled genomes (Alkan et al. 2011), technical and computational challenges in de novo assemblies owing to repetitive sequences and high likelihood of misassemblies (for a review, see Liao et al. 2019). Accurate annotations of well-assembled genomes—currently limited for Anthozoa (Table 1; Artamonova and Mushegian 2013; Shinzato et al. 2020)—are also necessary for numerous downstream phylogenomic analyses. For example, they enable cross-validation of sequence similarity-based read filtering which is helpful for removing endosymbiont sequences (e.g. Forsman et al. 2020). Furthermore, adaptive loci under selection sequenced via reduced-representation methods can be identified based on genome annotations and excluded from phylogenomic studies or separately analysed for their associated functions (Titus et al. 2019a; Forsman et al. 2020; see also Ahrens et al. 2017).

Despite the best efforts of research teams around the world, whole genome assemblies for numerous anthozoans remain unlikely even in the next few decades. Realistically, hybrid capture and target enrichment sequencing may hold the greatest promise for building a comprehensive anthozoan tree of life. The flexibility of the approach, combined with its applicability across different scales of divergence, makes it a forerunner among all reduced-representation methods (Faircloth et al. 2012; Lemmon et al. 2012; Andermann et al. 2019; Erickson et al. 2020). However, there are currently only three available anthozoan bait sets (see section on ‘Hybrid capture and target enrichment sequencing’ above), and the baits designed by Quek et al. (2020) for Scleractinia could not capture a large proportion of targeted loci for the sister clade of Corallimorpharia. To this end, we encourage the development of more taxon-specific baits, as more targeted baits would result in greater evenness in coverage and higher capture efficacy (Andermann et al. 2020). Where different bait sets are used, the results can be woven together into a single supertree (e.g. Kimball et al. 2019). Alternatively, multiple bait sets can be combined and used for hybrid capture experiments, and orthologous loci identified post-capture alongside genome and transcriptome data (e.g. Bossert et al. 2019). We expect more bait sets to emerge in the near future, each differing in its targeted taxonomic group, but all contributing towards a comprehensive, robust phylogeny of anthozoans.

Genome-wide data do not shield phylogenomic analyses from incongruence (see Philippe et al. 2011, 2017; Jeffroy et al. 2006; Roure et al. 2013; Lambert et al. 2015). For instance, erroneous phylogenetic inferences can arise from misalignments, errors in ortholog assignments or violations of model assumptions. Furthermore, incongruent gene trees due to incomplete lineage sorting and cross-contamination will affect downstream analyses. At the sequence level, substitution saturation, inappropriate evolutionary model assignments and, to a limited extent, missing data all have negative impacts on phylogenetic accuracy. One key consideration for anthozoan phylogenetic research is introgression, which is an important process in the evolution of scleractinian corals and sea anemones but can confound typical tree reconstruction methods (Mao et al. 2018; Cunha et al. 2019; Porro et al. 2019; Quattrini et al. 2019; Bongaerts et al. 2020). Specifically, introgressive hybridisation gives rise to a reticulated pattern of evolution that cannot be captured in the typical bifurcating model (Knowles 2009; Townsend et al. 2011; Lambert et al. 2015; Tonini et al. 2015; Quattrini et al. 2019). Phylogenetic network approaches can help to represent this pattern (Huson and Bryant 2006; Morrison 2014; Solís-Lemus et al. 2017), but introgression events first need to be characterised with genomic data (see Morrison 2014). More generally, a better understanding of recent gene flow and introgressive hybridisation would clarify species boundaries and improve phylogenetic estimates for closely related species (Mao et al. 2018; Quattrini et al. 2019; Erickson et al. 2020).

We acknowledge that phylogenomic analyses can seem daunting, especially for the inexperienced. Fortunately, there are a number of software programs designed to ease the entire reads-to-phylogeny workflow for a variety of sequencing approaches. For example, RADIS (Cruaud et al. 2016) automates the analysis of RAD-seq data from raw sequences to phylogenetic inference, and ParGenes (Morel et al. 2019; Darriba et al. 2020) performs model selection and gene/species tree reconstruction from multiple sequence alignments. Similarly, TREEasy (Mao et al. 2020) can automate the processing of sequence alignments to build a phylogenetic network or infer the species tree given a set of FASTA files, each corresponding to an orthologous locus. The degree to which each procedure in the reads-to-phylogeny process is put through these tools depends on the scope, method and expected output of each study, as well as the experience level of the user. More experienced researchers may opt for running each step separately in order to fine-tune specific parameters or compare different analyses. Overall, we recommend that, as far as possible, phylogenomic studies should apply multiple optimality criteria (e.g. maximum likelihood, Bayesian and species tree) to ensure congruence and accuracy of the inferred phylogeny (Table 2) (Mao et al. 2018; Arrigoni et al. 2020; Quek et al. 2020; Quattrini et al. 2020; Shinzato et al. 2020).
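For readers who prefer to run individual steps themselves, one common intermediate step between a set of per-locus FASTA alignments and a concatenated maximum likelihood or Bayesian analysis is building a supermatrix. The Python sketch below shows one simple way this might be done, padding taxa that are missing from a locus with gaps and recording partition boundaries. The function names and the in-memory input format are assumptions made for illustration; this is not how RADIS, ParGenes or TREEasy implement the step.

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {taxon: aligned sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # taxon label is the first token
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {taxon: "".join(parts) for taxon, parts in records.items()}


def concatenate_loci(
    loci: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-locus alignments into a supermatrix.

    Returns the supermatrix and a list of (locus, start, end) partitions
    with 1-based inclusive coordinates. Taxa absent from a locus are
    filled with gap characters so that all rows have equal length.
    """
    taxa = sorted({taxon for aln in loci.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    position = 1
    for locus in sorted(loci):
        alignment = loci[locus]
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{locus}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(alignment.get(taxon, "-" * length))
        partitions.append((locus, position, position + length - 1))
        position += length
    return {taxon: "".join(parts) for taxon, parts in rows.items()}, partitions
```

The partition list can then be used to assign separate substitution models to each locus, whereas species tree methods would instead take the per-locus alignments or gene trees as input without concatenation.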

Future of anthozoan phylogenomics

Phylogenomics has become an important tool for guiding and supporting research across a wide range of biological phenomena. Indeed, well-sampled and robust phylogenomic trees are powerful for addressing complex evolutionary questions. As anthropogenic climate change is currently driving the “sixth mass extinction” (Barnosky et al. 2011; Ceballos et al. 2015), with marine taxa not spared from this seemingly inexorable decimation (Carpenter et al. 2008; McGill et al. 2015; Johnson et al. 2017; Luypaert et al. 2019), studies focusing on gene evolution, diversification trends, biogeography and trait evolution based on a reliable time-calibrated phylogeny are crucial for shaping and adapting conservation strategies (Drake et al. 2014; Bhattacharya et al. 2016; Mouillot et al. 2016; Hartmann et al. 2017; Kayal et al. 2018; Mao et al. 2018; Palmer and Traylor-Knowles 2018; Huang et al. 2017; Dishon et al. 2020; Quattrini et al. 2020).

Overall, phylogenomic approaches are revolutionising our understanding of anthozoan evolution, driven by sustained sequencing efforts that generate unprecedented amounts of data for more species. However, there is clearly an imbalance in taxon coverage among anthozoan clades, with groups such as ceriantharians and zoantharians remaining under-sampled for phylotranscriptomic, hybrid capture and whole genome studies (Fig. 2b). While efforts on actiniarians and scleractinians continue, it is imperative that more attention be placed on these overlooked taxa, which are consequently more uncertain phylogenetically (Kushida and Reimer 2019; Stampar et al. 2019; Xiao et al. 2019; Mejia et al. 2020; Poliseno et al. 2020). Furthermore, while whole genome assemblies are expected to result in the most robust trees, reduced-representation techniques such as genome skimming, RAD-seq and hybrid capture should continue to be adopted given the research objectives and resources at hand. Finally, we caution that as phylogenomics becomes increasingly accessible both financially and bioinformatically, tried and tested methods that are appropriate for addressing specific biological questions should not be cast aside. Phylogenetic research using genomic data remains well complemented by established approaches focusing on the morphology, physiology, behaviour and ecology of species (Valdecasas et al. 2007; Padial and De la Riva 2010; Padial et al. 2010).