Introduction

The class Halobacteria consists of a large and phenotypically heterogeneous assemblage of halophilic archaea within the phylum Euryarchaeota (Grant et al. 2001; Oren 2012). The class Halobacteria currently contains over 175 species placed into fifty distinct genera (Parte 2013; NamesforLife 2015). Our understanding of interrelationships of the genera within the class Halobacteria was previously based on chemotaxonomic characteristics and analysis of the 16S rRNA gene (Grant et al. 2001; Oren 2006; Wright 2006; Oren 2012). However, chemotaxonomic properties have not proven useful for classification above the genus level (Oren 2006, 2012) and the presence of multiple highly divergent copies of the 16S rRNA gene in many members of the class Halobacteria has limited the inferences that can be drawn from its analysis (Mylvaganam and Dennis 1992; Vreeland et al. 2002; Boucher et al. 2004; Cui et al. 2009; Oren 2012). Analyses of other individual gene/protein sequences, such as rpoB’/rpoC, have also thus far proven of limited value in elucidating the evolutionary relationships of the members of the class Halobacteria due to concerns regarding recombination events and lateral gene transfers (Walsh et al. 2004; Enache et al. 2007; Minegishi et al. 2010b; Naor et al. 2012; Williams et al. 2012).

The use of a concatenated set of unlinked and conserved loci in phylogenetic reconstruction is able to mitigate the effects of any instances of recombination or lateral gene transfer and provide greater resolving power than trees based on single genes/proteins (Rokas et al. 2003; Ciccarelli et al. 2006; Wu et al. 2009). The advent of widely available genome sequencing technology has provided taxonomists with a wealth of data from which to elucidate the relationships between various prokaryotic groups (Gao and Gupta 2012a; Zhi et al. 2012; Oren and Garrity 2014). A genome-centric, polyphasic approach to taxonomy, in which categorisation is driven primarily by inferences drawn from the genome sequence data and secondarily by molecular, biochemical, and phenotypic traits is now the recommended approach for prokaryotic taxonomy (Klenk and Goker 2010; Oren and Garrity 2014; Rossello-Mora and Amann 2015; Sutcliffe 2015; Whitman 2015).

The increasing availability of genome sequencing technology has provided us with genome sequence data from 35 genera within the class Halobacteria, covering a majority of the diversity within the group (NCBI 2015). This genome sequence data has allowed for robust and in-depth phylogenetic reconstructions of the sequenced Halobacteria species based on multiple concatenated gene and protein sequences (Papke et al. 2011; Andam et al. 2012; Williams et al. 2012; Soucy et al. 2014; Gupta et al. 2015). The most comprehensive phylogenetic reconstructions to date have been based on the sequences of thirty-two concatenated housekeeping proteins from 98 Halobacteria genomes (Gupta et al. 2015) and fifty-five concatenated ribosomal proteins from 118 Halobacteria genomes (Soucy et al. 2014). This genome sequence data is also enabling the detection of conserved molecular characteristics shared by evolutionarily related groups of organisms. In particular, two classes of conserved molecular characteristics have recently been utilised in prokaryotic taxonomy (Bhandari and Gupta 2014; Gupta 2014; Naushad et al. 2014; Gupta et al. 2015): conserved signature insertions/deletions (CSIs), which are insertions or deletions (indels) that are present only in a related group of organisms, and conserved signature proteins (CSPs), which are whole proteins that are present only in a related group of organisms. Both classes of molecular characteristics represent synapomorphic characteristics and provide reliable evidence, independent of phylogenetic trees, that the species from the groups in which they are found are specifically related to each other due to common ancestry. Recently, the class Halobacteria, which previously contained a single order (Halobacteriales), was divided into three orders (Halobacteriales, Haloferacales, and Natrialbales) on the basis of CSIs and CSPs (Gupta et al. 2015). 
However, due to the size of the class Halobacteria, the previous analysis only focused on the higher level divisions within the class Halobacteria, placing a single family within each of the three orders despite the size, diversity, and, in the case of the order Halobacteriales, the polyphyly of the identified groups (Gupta et al. 2015).

In this work, we have employed the whole genome sequences of 129 genome sequenced members of the class Halobacteria to reconstruct a highly robust phylogenetic tree based on 766 shared proteins and to identify conserved molecular characteristics that can be used to determine the interrelationships of the halobacterial genera within the three orders. We present 20 CSIs and 31 CSPs which are unique characteristics of infra-order level groups of genera within the class Halobacteria. Additionally, we present 40 CSIs and 234 CSPs that are characteristic of Haloarcula, Halococcus, Haloferax, or Halorubrum. Importantly, the order Haloferacales has been found to contain two main groups: a group containing the genus Haloferax and related genera, which is supported by four CSIs and five CSPs, and a group containing the genus Halorubrum and related genera, which is supported by four CSPs. We have also identified molecular characteristics that suggest that the polyphyletic order Halobacteriales contains at least two large monophyletic clusters of organisms in addition to the polyphyletic members of the order: one cluster containing the genus Haloarcula and related genera, which is supported by ten CSIs and nineteen CSPs, and the other containing the members of the genus Halococcus, which is supported by nine CSIs and 23 CSPs. On the basis of the phylogenetic analyses and the identified conserved molecular characteristics presented here, we propose a division of the order Haloferacales into two families, an emended family Haloferacaceae and Halorubraceae fam. nov., and a division of the order Halobacteriales into three families, an emended family Halobacteriaceae, Haloarculaceae fam. nov., and Halococcaceae fam. nov.

Methods

Phylogenetic analyses

A phylogenetic tree was produced based on the concatenated sequences of 766 proteins obtained from 129 genome sequenced members of the class Halobacteria. The protein families used in this phylogeny were identified using the UCLUST algorithm (Edgar 2010) to identify protein families present in at least 80 % of the input genomes which shared at least 50 % sequence identity and 50 % sequence length. Input genomes that were not annotated had all of their open reading frames translated from nucleotide sequences to amino acid sequences using USEARCH 8 (Edgar 2010). Each identified protein family was individually aligned using Clustal Omega (Sievers et al. 2011) and trimmed using Gblocks 0.91b (Castresana 2000) with relaxed parameters (Talavera and Castresana 2007). The concatenated dataset of the trimmed sequence alignments contained 212,988 aligned amino acid residues. A maximum-likelihood tree based on this alignment was constructed using FastTree 2 (Price et al. 2010) employing the Whelan and Goldman model of protein sequence evolution (Whelan and Goldman 2001) and RAxML 8 (Stamatakis 2014) using the Le and Gascuel model of protein sequence evolution (Le and Gascuel 2008). SH-like statistical support values (Guindon et al. 2010) for each branch node in the final phylogenetic tree were calculated using RAxML 8 (Stamatakis 2014). The resultant phylogenetic tree was drawn and artificially rooted on the midpoint using MEGA 6 (Tamura et al. 2013). This process was completed using an internally developed software pipeline. A manuscript describing this pipeline is currently in preparation and the pipeline will be available for public use on Gleans.net once released.
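The family selection and concatenation steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is unpublished; the function names and the input layout (a mapping from family identifier to per-genome aligned, trimmed sequences) are assumptions made for illustration. The sketch covers only the 80 % presence filter and the concatenation, and it assumes that clustering, alignment, and trimming have already been carried out upstream.

```python
from typing import Dict, List

# Hypothetical layout: family id -> {genome id -> aligned, trimmed sequence}
Families = Dict[str, Dict[str, str]]


def select_core_families(families: Families, genomes: List[str],
                         min_fraction: float = 0.8) -> List[str]:
    """Return the ids of families present in at least min_fraction of genomes."""
    genome_set = set(genomes)
    threshold = min_fraction * len(genomes)
    return sorted(
        fam for fam, members in families.items()
        if len(genome_set & set(members)) >= threshold
    )


def concatenate_families(families: Families, genomes: List[str],
                         selected: List[str]) -> Dict[str, str]:
    """Join the selected alignments per genome, padding absent families with gaps."""
    parts: Dict[str, List[str]] = {g: [] for g in genomes}
    for fam in selected:
        members = families[fam]
        # All sequences in one trimmed alignment share the same length.
        length = len(next(iter(members.values())))
        for g in genomes:
            parts[g].append(members.get(g, "-" * length))
    return {g: "".join(chunks) for g, chunks in parts.items()}
```

Padding missing families with gap characters is one common convention for concatenated datasets; whether the original pipeline handled absent proteins this way is not stated in the text.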

In parallel, a phylogenetic tree based on the 16S rRNA gene sequences of type strains covering all validly named genera within the class Halobacteria was also constructed. The 16S rRNA sequences were retrieved from the Ribosomal Database Project (Cole et al. 2014) and aligned using the SINA aligner (Pruesse et al. 2012) to form a multiple sequence alignment that was 1604 nucleotides long with common gaps removed. A maximum-likelihood phylogenetic tree based on this multiple sequence alignment was created in MEGA 6 (Tamura et al. 2013) employing the General Time-Reversible model of sequence evolution (Tavaré 1986) with branch support based on 1000 bootstrap replicates.
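For readers unfamiliar with bootstrap support, the resampling step that underlies each replicate can be sketched as follows. This is a generic illustration of nonparametric bootstrapping of alignment columns, not a description of MEGA's internal implementation; the function name and the dictionary input format are assumptions.

```python
import random
from typing import Dict


def bootstrap_replicate(alignment: Dict[str, str],
                        rng: random.Random) -> Dict[str, str]:
    """Build one pseudo-replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }
```

A tree is inferred from each replicate, and the support for a branch is the percentage of replicate trees in which that branch is recovered.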

Identification of conserved signature indels

Conserved signature indels were identified as detailed by Gupta (2014). In summary, BLASTp (Altschul et al. 1997) searches were performed on each protein in the genome of Halobacterium salinarum R1 (Pfeiffer et al. 2008) against all available sequences in the GenBank non-redundant database. Multiple sequence alignments were then created using ClustalX (Jeanmougin et al. 1998) for proteins that returned high scoring matches from Halobacteria and other prokaryotes. The alignments were then visually inspected for the presence of insertions or deletions that were flanked on both sides by at least 5–6 conserved amino acid residues in the neighbouring 30–40 amino acids. Detailed BLASTp searches were then carried out on short sequence segments containing the indel and the flanking conserved regions (60–100 amino acids long) to determine the specificity of the indels. SIG_CREATE and SIG_STYLE (available on Gleans.net) were then used to create Signature files for CSIs that were specific to Halobacteria subgroups as described by Gupta (2014). Due to the large number of genome sequences available for Halobacteria, the sequence alignment files presented here contain sequence information for only a limited number of species (generally only the type species from different genera). However, unless otherwise indicated, all members of the specified groups displayed similar sequence characteristics.
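The flanking criterion used during inspection of the alignments can be approximated programmatically. The sketch below is a simplified stand-in for the visual inspection described above: it assumes that a "conserved" position is a gap-free column in which every sequence carries the same residue, which is stricter than a human curator might be. The function names, the 30-residue window, and the threshold of five conserved positions are illustrative choices taken from the lower bounds quoted in the text.

```python
from typing import List


def conserved_columns(alignment: List[str], start: int, end: int) -> int:
    """Count gap-free, fully identical columns in the half-open range [start, end)."""
    width = len(alignment[0])
    count = 0
    for col in range(max(start, 0), min(end, width)):
        residues = {seq[col] for seq in alignment}
        if len(residues) == 1 and "-" not in residues:
            count += 1
    return count


def is_flanked_by_conserved(alignment: List[str], indel_start: int,
                            indel_end: int, window: int = 30,
                            min_conserved: int = 5) -> bool:
    """Check that both flanks of an indel at [indel_start, indel_end) are conserved."""
    left = conserved_columns(alignment, indel_start - window, indel_start)
    right = conserved_columns(alignment, indel_end, indel_end + window)
    return left >= min_conserved and right >= min_conserved
```

An indel that passes such a filter would still need the follow-up BLASTp searches described above to establish its specificity for a particular group.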

Identification of conserved signature proteins

Identification of conserved signature proteins for different Halobacteria subgroups was carried out by completing BLASTp (Altschul et al. 1997) searches using all proteins in the genomes of Halobacterium salinarum R1, Haloarcula marismortui ATCC 43049, Halococcus thailandensis JCM 13552, Haloferax volcanii DS2 and Halorubrum lacusprofundi ATCC 49239 (Baliga et al. 2004; Pfeiffer et al. 2008; Hartman et al. 2010) as query sequences. BLASTp searches were performed against all available sequences in the GenBank non-redundant sequence database. The results of the BLAST searches were then manually inspected for proteins for which all significant hits were from well-defined groups within the class Halobacteria, or proteins for which there was a large increase in E value from the last hit belonging to a particular group of organisms within the class Halobacteria to the first hit from an organism belonging to any other group, and for which the E values of the latter hits were >1 × 10⁻⁴ (Gao and Gupta 2007; Gupta and Mok 2007; Naushad et al. 2014).
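The two acceptance criteria applied during manual inspection can be expressed as a small screening function. This is only a sketch of the stated logic: the text does not quantify what constitutes a "large increase" in E value, so the `min_fold_jump` parameter and its default are hypothetical, as are the function name and the `(organism, E value)` input format.

```python
from typing import List, Set, Tuple


def is_candidate_csp(hits: List[Tuple[str, float]], group: Set[str],
                     outgroup_cutoff: float = 1e-4,
                     min_fold_jump: float = 1e3) -> bool:
    """Screen the BLASTp hits of one query protein for group specificity.

    hits: (organism, E value) pairs; group: organisms in the target group.
    min_fold_jump is an assumed stand-in for "a large increase in E value".
    """
    in_group = [e for org, e in hits if org in group]
    out_group = [e for org, e in hits if org not in group]
    if not in_group:
        return False
    if not out_group:
        return True  # every reported hit comes from the target group
    best_out = min(out_group)
    worst_in = max(in_group)
    if best_out <= outgroup_cutoff:
        return False  # an out-group hit is too significant
    if worst_in == 0.0:
        return True  # E value of zero: the jump is effectively unbounded
    return best_out / worst_in >= min_fold_jump
```

Proteins flagged by a screen of this kind would still require manual review, as in the procedure described above.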

Results

Phylogenetic analysis of the class Halobacteria based on concatenated protein sequences

In this work, we have produced a phylogenetic tree containing 129 genome sequenced members of the class Halobacteria using a concatenated set of 766 proteins (Fig. 1). The phylogenetic reconstruction produced for this work is the most comprehensive phylogenetic analysis of the genome sequenced Halobacteria completed to date. The branching patterns in this concatenated protein based phylogenetic tree largely reflect those seen in prior publications (Soucy et al. 2014; Gupta et al. 2015). In this concatenated protein based phylogenetic tree, the orders Natrialbales and Haloferacales form monophyletic groups with strong statistical support. The order Haloferacales can be further divided into two smaller groups, one containing the genus Haloferax and other closely related Halobacteria and the other containing Halorubrum and its relatives, labelled HF1 and HF2, respectively, in Fig. 1. The order Halobacteriales is currently a polyphyletic assemblage of halobacterial groups which do not show strong affinity to either the order Natrialbales or Haloferacales (Gupta et al. 2015). In our concatenated protein tree, the order Halobacteriales spans at least four subgroups. Two of these subgroups consist of distinct pairings of genera (viz. Halobacterium/Halarchaeum, Haladaptatus/Halalkalicoccus). In addition to these smaller subgroups, two large neighbouring clusters of Halobacteriales, one consisting of Haloarcula, Halomicrobium, Halorhabdus, Halosimplex, and Natronomonas (labelled HB1) and the other consisting of the members of the genus Halococcus (labelled HB2), are also observed in the phylogenetic tree.

Fig. 1

A maximum likelihood phylogenetic tree based on the concatenated sequences of 766 proteins obtained from 129 genome sequenced members of the class Halobacteria. The members of orders Natrialbales, Haloferacales, and Halobacteriales are highlighted in green, blue, and red, respectively. Major clades are labelled. SH-like statistical support values are shown at branch nodes

In this work, we have also produced a 16S rRNA gene sequence based phylogenetic tree containing all named Halobacteria species (Fig. 2). The branching patterns observed in the 16S rRNA based phylogenetic tree are similar to those observed in the concatenated protein tree. As observed in the concatenated protein tree, the orders Natrialbales and Haloferacales are monophyletic and well-separated entities in the 16S rRNA gene based tree. Within the order Natrialbales, the members of the genus Halopiger form a monophyletic grouping that is not observed in the concatenated protein tree. We have further examined the significance of the monophyletic grouping of Halopiger in the 16S rRNA tree by creating individual and concatenated phylogenetic trees for the five multi-locus sequence analysis proteins (viz. atpB, EF-2, radA, rpoB’, and secY) for the class Halobacteria proposed by Papke et al. (2011) (Supplemental Figs. 54–59). A monophyletic grouping of the members of the genus Halopiger was not observed in any of these phylogenetic trees and there was stochasticity in their branching patterns, indicating a high level of genetic heterogeneity within the members of this genus. In phylogenetic trees based on different protein sequences, Natrinema altuense 1AG-DGR consistently shows a longer branch in comparison to the other Natrinema spp., but the significance of this observation is at present unclear. Within the order Haloferacales in the 16S rRNA tree, the clades HF1 (Haloferax and related genera) and HF2 (Halorubrum and related genera) are identifiable and well separated, with the sole exception of the members of the genus Haloplanus, which branched within the Haloferacales adjacent to the clades HF1 and HF2. The order Halobacteriales showed weakly supported monophyletic branching in the 16S rRNA based phylogenetic tree. Within this weakly supported grouping of the Halobacteriales, there are nine distinct and well-supported subgroups.
Importantly, the large clusters of Halobacteriales species observed in the concatenated protein tree (clades HB1 and HB2) are also identifiable in the 16S rRNA tree. However, the genus Halococcus (clade HB2) did not branch with clade HB1 in the 16S rRNA tree; instead, Halococcus branched with the genus Halalkalicoccus alongside the genera Haladaptatus, Halorussus, and Halorubellus. The genus Halosimplex also did not branch within clade HB1 in the 16S rRNA tree. The single member of the genus Halosimplex formed a cluster with the lone member of the genus Halovenus, which branched in the vicinity of clade HB1 and other Halobacteriales.

Fig. 2

A maximum likelihood phylogenetic tree based on 144 16S rRNA gene sequences from members of the class Halobacteria covering all known species. The members of orders Natrialbales, Haloferacales, and Halobacteriales are highlighted in green, blue, and red, respectively. Major clades are labelled. Bootstrap support values above 50 % are shown at branch nodes

Molecular characteristics distinguishing the two main groups within the order Haloferacales

Molecular characteristics, such as CSIs and CSPs, that are uniquely found in a well-defined group of organisms are powerful tools for evolutionary studies (Rokas and Holland 2000; Gao and Gupta 2012a; Jones 2012; Gupta 2014). Recently, CSIs and CSPs have been used to revise the taxonomy of the class Halobacteria and a large number of other prokaryotic groups at varying taxonomic depths (Sawana et al. 2014; Gupta et al. 2015; Naushad et al. 2015). However, the previous analysis of the class Halobacteria focused only on the higher level divisions within the class. In this work we have identified four CSIs and five CSPs which are shared by clade HF1 (Haloferax and related genera). An example of one such CSI, consisting of a one amino acid insertion in the members of clade HF1 located in a conserved region of a hypothetical protein, is shown in Fig. 3. This insertion is uniquely found in all members of clade HF1 and absent in the members of clade HF2 and all other members of the class Halobacteria. The sequence alignments for three additional CSIs which are specific for clade HF1 are shown in Supplemental Figs. 1–3 and their properties are briefly summarised in Table 1. The second group within the order Haloferacales, clade HF2 (Halorubrum and related genera), is characterised by four identified CSPs. GenInfo Identifier numbers (GI Numbers) for the five CSPs specific for clade HF1 and the four CSPs specific for clade HF2 are provided in Table 3A, B, respectively.

Fig. 3

A partial sequence alignment of a hypothetical protein showing a 1 amino acid insertion (boxed) that is characteristic of the members of clade HF1. Sequence information for a limited number of Halobacteria and other archaea is shown here, but unless otherwise indicated similar CSIs were detected in all members of the indicated group and not detected in any other species in the top 250 BLASTp hits. The dashes (-) in the alignments indicate identity with the residue in the top sequence. GenInfo identification (GI) numbers for each sequence are indicated in the second column

Table 1 Conserved signature indels specific for infra-order level groups of genera within the class Halobacteria

Molecular characteristics which provide evolutionary insights for the order Halobacteriales

In this work, we have also identified a number of CSPs and CSIs that provide novel insights into the interrelationships within the polyphyletic order Halobacteriales. Most importantly, we have identified three CSIs and four CSPs which are shared by clade HB1 as seen in the concatenated protein based phylogenetic tree (viz. the genera Haloarcula, Halomicrobium, Halorhabdus, Halosimplex, and Natronomonas) and seven CSIs and fifteen CSPs which are unique characteristics of all clade HB1 members except the genus Natronomonas, which forms the outermost branch of clade HB1. One example of each of these two types of CSI is shown in Fig. 4. The first CSI, consisting of a one amino acid insertion in ATP-dependent DNA helicase (Fig. 4a), is uniquely found in the members of clade HB1 while the second CSI, consisting of a one amino acid deletion in a conserved region of the protein acetylglutamate kinase (Fig. 4b), is uniquely found in all members of clade HB1 except the genus Natronomonas. Neither of the CSIs shown in Fig. 4, nor any of the other CSIs indicated to be specific for clade HB1, is found in the members of clade HB2 or other members of the class Halobacteria. The sequence alignments for eight additional CSIs which are specific for clade HB1 are shown in Supplemental Figs. 4–11 and their properties are briefly summarised in Table 1. In total, nineteen CSPs are specifically found in members of clade HB1. Identification numbers for these nineteen clade HB1-specific CSPs are provided in Table 3C. These CSIs and CSPs provide strong support that the genus Halosimplex shares an evolutionary lineage with the members of clade HB1 despite not branching within the clade in our 16S rRNA tree (Fig. 2).

Fig. 4

Partial sequence alignments of a an ATP-dependent DNA helicase showing a one amino acid insertion that is characteristic of the members of clade HB1, b acetylglutamate kinase showing a one amino acid deletion that is characteristic of the members of clade HB1 except the genus Natronomonas. Sequence information for a limited number of Halobacteria and other archaea is shown here, but the indicated CSIs were detected only in members of the indicated group and not detected in any other species in the top 250 BLASTp hits. The dashes (-) in the alignments indicate identity with the residue in the top sequence. GenInfo identification (GI) numbers for each sequence are indicated in the second column

Table 2 Conserved signature indels specific for Haloarcula, Halococcus, Haloferax, or Halorubrum
Table 3 GI numbers for conserved signature proteins that are specific for family, infrafamilial, and genus level groups within the class Halobacteria

We have also identified a number of CSIs and CSPs that are uniquely shared by groups within the order Halobacteriales that show inconsistent branching affinity in phylogenetic trees. An association between the members of the genera Halococcus and Halalkalicoccus is supported by two CSIs and one CSP identified in this work (Tables 1, 3D). Additionally, Halococcus and Halalkalicoccus cluster together in the 16S rRNA sequence based phylogenetic tree alongside the genera Haladaptatus, Halorussus, and Halorubellus. However, there is no association observed between Halococcus and Halalkalicoccus in the concatenated protein tree and no CSIs or CSPs were identified that are specific to a grouping of the sequenced members of the genus Halococcus and the associated genera identified in the 16S rRNA tree (viz. Halalkalicoccus and Haladaptatus). Thus, clade HB2 is currently limited to the members of the genus Halococcus until the affinity between the genus Halococcus and the genera Halalkalicoccus, Haladaptatus, Halorussus, and Halorubellus can be determined. Within the order Halobacteriales, we have also identified phylogenetic and molecular support for a supergeneric relationship between the genera Halobacterium and Halarchaeum. The genera Halobacterium and Halarchaeum cluster together in the concatenated protein and in the 16S rRNA gene based phylogenetic trees along with the genus Salarchaeum. A relationship between Halobacterium and Halarchaeum is further supported by four CSIs (Table 1) and one CSP (GI: 169236474) identified in this work which are uniquely shared by these two genera. We have also identified some CSIs that support a relationship between Halobacterium and Haladaptatus, the significance of which is currently unclear. The sequence alignments for the CSIs specific to either Halococcus and Halalkalicoccus or Halobacterium and Halarchaeum are shown in Supplemental Figs. 12–17 and their properties are briefly summarised in Table 1.

Molecular characteristics which characterise important genus level groups within the class Halobacteria

In addition to the CSIs and CSPs identified for groups of multiple genera within the class Halobacteria, we have also identified a number of CSIs and CSPs which are uniquely found within members of specific genera within the class. Within the order Haloferacales, we have identified seven CSIs and 38 CSPs which are restricted to the members of the genus Haloferax, and seven CSIs and 60 CSPs which are restricted to the members of the genus Halorubrum. Examples of CSIs which are uniquely found in members of the genera Haloferax and Halorubrum, respectively, are shown in Fig. 5. A one amino acid insertion in a hypothetical protein that is specific for members of the genus Haloferax is shown in Fig. 5a, whereas a one amino acid insertion in the 50S ribosomal protein L19 that is specific for members of the genus Halorubrum is shown in Fig. 5b. In both cases, these inserts are unique characteristics of these genera and are not found in any other member of the class Halobacteria. The sequence alignments for the other identified Haloferax and Halorubrum CSIs are presented in Supplemental Figs. 18–29 and their properties are summarised in Table 2. The GI numbers for the 38 CSPs specific for the genus Haloferax and the 60 CSPs specific for the genus Halorubrum are provided in Table 3E, F, respectively.

Fig. 5

Partial sequence alignments of a a hypothetical protein showing a one amino acid insertion (boxed) identified in all sequenced members of the genus Haloferax, b 50S ribosomal protein L19 showing a one amino acid insertion (boxed) identified in all sequenced members of the genus Halorubrum. Sequence information for a limited number of Halobacteria and other archaea is shown here, but unless otherwise indicated similar CSIs were detected in all members of the indicated group and not detected in any other species in the top 250 BLASTp hits. The characteristics of all identified CSIs specific for the genera Haloferax and Halorubrum are summarised in Table 2

We have also identified nine CSIs and 23 CSPs which are restricted to the large assemblage of sequenced species within the genus Halococcus and seventeen CSIs and 113 CSPs that are restricted to the genus Haloarcula, both of which are found within the order Halobacteriales. An example of a CSI which is specific for the genus Haloarcula and another CSI which is specific for the genus Halococcus are shown in Fig. 6. The CSI shown in Fig. 6a consists of a five amino acid insertion in a gufA family protein that is uniquely found in the members of the genus Haloarcula, whereas the CSI shown in Fig. 6b consists of a two amino acid insertion in DNA gyrase subunit B that is uniquely found in the genus Halococcus. The sequence alignments for eight additional CSIs which are specific for the genus Halococcus and sixteen additional CSIs which are specific for the genus Haloarcula are shown in Supplemental Figs. 30–53 and their properties are briefly summarised in Table 2. The identification numbers for the 23 CSPs which are specific for the genus Halococcus and the 113 CSPs specific for the genus Haloarcula are provided in Table 3D, G, respectively.

Fig. 6

Partial sequence alignments of a a gufA family protein showing a five amino acid insertion (boxed) identified in all sequenced members of the genus Haloarcula, b DNA gyrase subunit B showing a two amino acid insertion (boxed) identified in all sequenced members of the genus Halococcus. Sequence information for a limited number of Halobacteria and other archaea is shown here, but unless otherwise indicated similar CSIs were detected in all members of the indicated group and not detected in any other species in the top 250 BLASTp hits. The characteristics of all identified CSIs specific for the genera Haloarcula and Halococcus are summarised in Table 2

Discussion

The analyses presented here, in the form of a highly robust phylogenetic tree based on 766 proteins obtained from 129 genome sequenced members of the class Halobacteria (Fig. 1), the 20 identified CSIs (Table 1), and the 31 identified CSPs (Table 3), provide a reliable basis for understanding the infra-order level relationships of genera within the class Halobacteria. These analyses have identified a number of family level clusters within the orders Haloferacales and Halobacteriales which are supported by a large number of CSIs and CSPs (Fig. 7). The results of the phylogenomic studies and the identified molecular characteristics provide strong evidence that the order Haloferacales contains two main groups, one consisting of Haloferax and related genera and the other consisting of Halorubrum and related genera, each of which is supported by multiple lines of evidence.

Fig. 7

A summary diagram depicting the distribution of identified CSIs and CSPs within the class Halobacteria and the proposed families described in this study. The genera within each family are listed under the corresponding family name. The superscript letter T beside a genus indicates that it is the type genus of the family

The order Halobacteriales, as currently described (Gupta et al. 2015), is a polyphyletic assemblage of members of the class Halobacteria that do not fall into the monophyletic orders Haloferacales and Natrialbales. The present analysis has identified one large and consistently observed group of multiple genera within the order Halobacteriales, referred to as clade HB1, and a second clade consisting of the large assemblage of species which make up the genus Halococcus, referred to as clade HB2, which are both supported by a large number of identified CSIs and CSPs (Fig. 7). Prior research on the interrelationships of the species within the genus Halococcus has noted that the members of the genus are more genetically diverse than other genera within the class Halobacteria and that the genus Halococcus contains two phylotypes which may eventually be separated into two genera (Montero et al. 1993; Goh et al. 2006). It should be mentioned that in the 16S rRNA tree, the genera Halalkalicoccus, Haladaptatus, Halorussus and Halorubellus also group together with the Halococcus spp. (clade HB2). An affinity of Halococcus and Halalkalicoccus is also supported by a number of identified CSIs and CSPs. Although we are presently limiting clade HB2 to the genus Halococcus, with some additional research it is likely that this group will be expanded to include the genera Halalkalicoccus, Haladaptatus, Halorussus and Halorubellus, which show some affinity to this clade. Further work is also required to clearly delimit the groups within the genus Halococcus.

The phylogenetic trees and molecular characteristics presented here also provide strong evidence for the separation of the genera Halobacterium, Halarchaeum, and Salarchaeum from the other members of the order Halobacteriales as a novel family level taxon. However, the Bacteriological Code (Lapage et al. 1992) states that a family consisting of the genera Halobacterium, Halarchaeum, and Salarchaeum must retain the name Halobacteriaceae, precluding any reclassification of this group. Thus, an important task for the future will be to identify reliable, genome-derived characteristics, utilising novel techniques and the increasing wealth of genomic data, which can be used to classify the other members of the order Halobacteriales as members of distinct clades/families or clearly establish their close affinity to the grouping of the genera Halobacterium, Halarchaeum, and Salarchaeum.

The analyses presented here have also led to the identification of 31 CSIs and 211 CSPs which are characteristic of Haloferax, Halorubrum, and Haloarcula (Tables 2, 3E–G). Based upon these molecular characteristics, it is now possible to differentiate these groups of species from all other Halobacteria on the basis of the presence or absence of unique molecular features. Earlier work by our group on other groups of prokaryotes has shown that these CSIs and CSPs have strong predictive value and will likely be found in other members of these groups as more members are sequenced (Gao and Gupta 2012b; Gupta and Lali 2013; Bhandari and Gupta 2014; Howard-Azzeh et al. 2014). Additionally, the conserved nature of these CSIs and CSPs makes them promising targets for the development of diagnostic assays that can be used to identify novel members of these genera from isolates or environmental samples (Ahmod et al. 2011; Wong et al. 2014). Further analyses of these genus specific molecular characteristics should also lead to the discovery of novel functions in these organisms mediated by CSIs and CSPs which may provide important insights into the physiology, evolution, and novel adaptations of these groups of Halobacteria.

Overall, the results presented here provide strong support for the presence of two family-level groups within the order Haloferacales and for two novel family-level clusters within the polyphyletic order Halobacteriales. Based on the phylogenetic analyses and on the CSIs and CSPs identified in this work, we propose a division of the order Haloferacales into two families, an emended family Haloferacaceae (clade HF1) and Halorubraceae fam. nov. (clade HF2), and a division of the order Halobacteriales into three families, Haloarculaceae fam. nov. (clade HB1), Halococcaceae fam. nov. (clade HB2), and an emended family Halobacteriaceae, containing all members of the order Halobacteriales that do not fall into clade HB1 or HB2. Descriptions of the new and emended families are provided below.

Emended description of the family Halobacteriaceae Gibbons 1974 (approved lists 1980)

The family Halobacteriaceae contains the type genus Halobacterium (Oren et al. 2009) and the genera Haladaptatus (Cui et al. 2010d), Halalkalicoccus (Xue et al. 2005), Halarchaeum (Minegishi et al. 2010a), Haloarchaeobius (Yuan et al. 2015), Halomarina (Inoue et al. 2011), Halorubellus (Cui et al. 2012), Halorussus (Cui et al. 2010c), Halovenus (Makhdoumi-Kakhki et al. 2012), Natronoarchaeum (Shimane et al. 2010), Natronomonas (Burns et al. 2010b), and Salarchaeum (Shimane et al. 2011). The description of the family is the same as that of the order Halobacteriales given by Gupta et al. (2015) with the following modifications: the members of this family also lack the CSIs and CSPs that are specific for the families Haloarculaceae and Halococcaceae.

Description of Haloarculaceae fam. nov.

Haloarculaceae (Ha.lo.ar.cu.la.ce’ae. N.L. fem. n. Haloarcula type genus of the family; -aceae ending to denote a family; N.L. fem. pl. n. Haloarculaceae the family whose nomenclatural type is the genus Haloarcula).

The family Haloarculaceae contains the type genus Haloarcula (Oren et al. 2009) and the genera Halapricum (Song et al. 2014), Halomicroarcula (Echigo et al. 2013), Halomicrobium (Oren et al. 2002), Halorhabdus (Antunes et al. 2008), Halorientalis (Amoozegar et al. 2014), and Halosimplex (Han and Cui 2014). The description of the family is the same as that of the order Halobacteriales given by Gupta et al. (2015) with the following modifications: the members of the family Haloarculaceae can be distinguished from the Halobacteriaceae and all other archaea by the ten CSIs listed in Table 1 and by the twenty CSPs listed in Table 3C.

Description of Halococcaceae fam. nov.

Halococcaceae (Ha.lo.coc.ca.ce’ae. N.L. masc. n. Halococcus type genus of the family; -aceae ending to denote a family; N.L. masc. pl. n. Halococcaceae the family whose nomenclatural type is the genus Halococcus).

The family Halococcaceae contains only the type genus Halococcus (Oren et al. 2009). The description of the family is the same as that of the order Halobacteriales given by Gupta et al. (2015) with the following modifications: the members of the family Halococcaceae can be distinguished from the Halobacteriaceae and all other archaea by the nine CSIs listed in Table 2 and by the 23 CSPs listed in Table 3D.

Emended description of the family Haloferacaceae Gupta 2015

The family Haloferacaceae contains the type genus Haloferax (Oren et al. 2009) and the genera Halobellus (Cui et al. 2011b), Halogeometricum (Cui et al. 2010e), Halogranum (Cui et al. 2011c), Halopelagius (Zhang et al. 2013), Haloplanus (Qiu et al. 2013), Haloquadratum (Burns et al. 2007), and Halosarcina (Cui et al. 2010a). The description of the family is the same as that of the order Haloferacales given by Gupta et al. (2015) with the following modifications: the members of the family Haloferacaceae can be distinguished from the Halorubraceae and all other archaea by their branching in phylogenetic trees, by the four CSIs listed in Table 1, and by the five CSPs listed in Table 2.

Description of Halorubraceae fam. nov.

Halorubraceae (Ha.lo.ru.bra.ce’ae. N.L. neut. n. Halorubrum type genus of the family; -aceae ending to denote a family; N.L. neut. pl. n. Halorubraceae the family whose nomenclatural type is the genus Halorubrum).

The family Halorubraceae contains the type genus Halorubrum (Oren et al. 2009) and the genera Halobaculum (Oren et al. 1995), Halogranum (Cui et al. 2010b), Halohasta (Mou et al. 2012), Halolamina (Cui et al. 2011a), Halonotius (Burns et al. 2010a), Halopenitus (Amoozegar et al. 2012), and Salinigranum (Cui and Zhang 2014). The description of the family is the same as that of the order Haloferacales given by Gupta et al. (2015) with the following modifications: the members of the family Halorubraceae can be distinguished from the Haloferacaceae and all other archaea by their branching in phylogenetic trees and by the four CSPs listed in Table 2.
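The family-level scheme described above amounts to a simple decision rule, which can be sketched in code as follows. This is only an illustrative sketch, not part of the formal descriptions: the clade labels (HB1, HB2, HF1, HF2) and family names are taken from the text, whereas the function name `assign_family` and the idea of passing in a set of clade labels whose specific CSIs/CSPs have already been detected are assumptions made for the example. Detecting the markers themselves (for instance by sequence alignment) is outside the scope of the sketch.

```python
# Illustrative sketch only. Clade labels and family names follow the text;
# the function name and its inputs are hypothetical.

FAMILY_BY_CLADE = {
    "Halobacteriales": {"HB1": "Haloarculaceae", "HB2": "Halococcaceae"},
    "Haloferacales": {"HF1": "Haloferacaceae", "HF2": "Halorubraceae"},
}


def assign_family(order, clade_markers):
    """Return the proposed family for a member of `order`.

    `clade_markers` is the set of clade labels whose specific CSIs/CSPs
    are assumed to have been found in the genome. Returns None when no
    family can be assigned from the markers alone.
    """
    clades = FAMILY_BY_CLADE.get(order)
    if clades is None:
        raise ValueError(f"order not covered by this scheme: {order}")

    hits = sorted(set(clade_markers) & set(clades))
    if len(hits) > 1:
        # Markers of two clades in one genome contradict the scheme.
        raise ValueError(f"conflicting clade markers: {hits}")
    if hits:
        return clades[hits[0]]

    # The emended Halobacteriaceae holds all members of the order
    # Halobacteriales that lack the HB1- and HB2-specific markers.
    if order == "Halobacteriales":
        return "Halobacteriaceae"
    return None


print(assign_family("Halobacteriales", {"HB1"}))
print(assign_family("Halobacteriales", set()))
print(assign_family("Haloferacales", {"HF2"}))
```

The asymmetry between the two orders is deliberate: within Halobacteriales the absence of clade-specific markers is itself diagnostic of the emended Halobacteriaceae, whereas within Haloferacales the sketch leaves such a genome unassigned, since the text also relies on branching in phylogenetic trees for those families.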