Introduction

Streptomyces are a group of aerobic, high %G + C, Gram-positive bacteria that undergo complex differentiation to form filamentous mycelium, aerial hyphae and spores. In addition, they produce a broad range of secondary metabolites including antibiotics, antiparasitic agents, herbicides, anti-cancer drugs and various enzymes of industrial importance. Three Streptomyces species have their complete genome sequences made publicly available, namely the model organism Streptomyces coelicolor (%G + C = 72.1), the avermectin producer Streptomyces avermitilis (%G + C = 70.7) and the type strain for the genus, Streptomyces griseus (%G + C = 72.2) (Bentley et al. 2002; Ikeda et al. 2003; Ohnishi et al. 2008). Several more Streptomyces genomes are becoming available and this will enhance our ability to carry out comparative genomic studies (http://www.broadinstitute.org/science/data#). Nonetheless, the Streptomyces are not unique among the Actinomycetales; other genera also undergo complex differentiation, such as the formation of aerial hyphae, fragmentation, single spores forming directly on the vegetative mycelium and sporangia. Streptomyces are by far the best studied of these organisms. The other genera are phylogenetically quite closely related to Streptomyces based on 16S small subunit sequence analysis, but relatively few of them have complete genome sequences available. These are Saccharopolyspora erythraea, various Frankia spp. and two Salinispora spp. There are many Actinomycetales genera about which little is known beyond their formal taxonomic description and the fact that they undergo some form of complex differentiation; these include Kitasatospora, Saccharomonospora, Saccharopolyspora, Streptosporangium, Amycolatopsis and many others (for detailed information see “BacMap”, http://wishart.biology.ualberta.ca/BacMap).
Understanding the conservation of the genes involved in development will enable a greater understanding of how sporulation evolved across these organisms and how the complement of genes varies in species that undergo different types of differentiation.

The developmental biology of S. coelicolor has been studied for over 40 years and many sporulation genes have been identified (Chater and Chandra 2006; Flärdh and Buttner 2009). However, no other Actinomycetales species has been this well characterized in terms of developmental biology. Understanding developmental transitions and the genes required for development and how these are distributed throughout the order is central to understanding the evolution of these complex lifecycles.

Two important aspects of the genome structure of Streptomyces need to be borne in mind. Firstly, the genome of Streptomyces is large compared to other bacteria: 8,667,507 base pairs for S. coelicolor (7,825 protein coding genes), 9,025,608 base pairs for S. avermitilis (7,577 protein coding genes), 8,545,929 base pairs for S. griseus (7,138 coding genes) and 10,148,695 base pairs for Streptomyces scabies (%G + C = 71.45; potentially 9,107 open reading frames, although analysis is not complete; http://www.sanger.ac.uk/projects/S_scabies/). Secondly, the genomes of these four species are linear and both ends contain unique terminal inverted repeats that probably covalently bind a terminal protein (Yang et al. 2002). Terminal inverted repeats and covalently bound terminal proteins are not found in the limited number of other bacteria that have linear chromosomes, such as Borrelia burgdorferi and Agrobacterium tumefaciens, and, up to the present, seem to be unique to the Streptomyces and perhaps some other Actinobacteria; the genome of Rhodococcus RHA1 is also linear (Lin et al. 1993; Chen et al. 2002; Gollub et al. 1999). This has a direct impact on sporulation and development because circularization of these linear genomes is common and in many cases blocks sporulation through gene loss (Volff et al. 1997).

There is significant gene diversity at the interspecies level across the genomes of the completely sequenced Streptomyces, with >2000 genes being unique to each species (http://avermitilis.ls.kitasato-u.ac.jp/specific/index.html). Genome comparisons across the Actinomycetales have revealed that they share some common features. A core region of about 3000 genes in the centre of the Streptomyces linear chromosome is shared syntenously across the order. Furthermore, there appear to be genus specific regions on either side of the core region that are conserved in Streptomyces. These are distinct from the highly divergent terminal regions, which themselves are not the same as the terminal inverted repeat regions found at the ends of Actinobacterial linear genomes and plasmids (Bentley et al. 2002; Ikeda et al. 2003; Hsiao and Kirby 2008; Jayapal et al. 2007; Kirby et al. 2008). These features suggest that a core of developmental/sporulation genes essential to the process may be conserved across the Actinomycetales.

Although genome sequencing is now much easier and cheaper, DNA/DNA microarray genome comparisons have the advantage of speed when highly conserved genes of interest are compared across many species. For this reason, the genomes of a number of Streptomyces species as well as Saccharomonospora and Streptosporangium were compared using this approach. There are, however, at least two considerations when using this approach. Firstly, because the array used is limited to the genes from S. coelicolor, the presence or absence of a gene in another species cannot be detected if that gene is not present in S. coelicolor. Secondly, DNA/DNA microarray comparisons do not give information on synteny, although congruent gene presence may suggest that synteny is conserved. Neither of these disadvantages should have a major effect on an analysis of the genes involved in Actinomycetales sporulation and development.

Chater and Chandra (2006) reviewed in depth the evolution of development in Streptomyces using genome comparison and this analysis was used as a basis of the present study. Overall, the DNA/DNA microarray analysis of the developmental gene set across a wider range of species helps to shed light on which genes are important, which genes are conserved and which genes have undergone rapid evolutionary change. This has implications for studies outside of S. coelicolor and will be informative for future studies.

Materials and methods

Phylogenetic analyses

The 16S phylogeny was carried out on the small subunit 16S ribosomal RNA gene sequences (bp 91–447, S. coelicolor A3(2) 16S DNA sequence AL939108) obtained from Ribosomal Database Project-II Release 9 (http://rdp.cme.msu.edu/index.jsp). These were aligned using CLUSTALX (Thompson et al. 1997). The analysis was carried out using the Neighbor-Joining algorithm from CLUSTALX. The other phylogenetic analyses used a similar approach but involved the translated protein sequences of the various genes analyzed.

Microarrays

PCR arrays covering about 97% of the complete genome of S. coelicolor A3(2) (www.surrey.ac.uk/SBMS/Fgenomics/Microarrays/index.html) were used in this study. The Surrey microarray is made up of 7758 unique PCR amplified sequences, 7563 from the chromosome and 195 from SCP1, the large linear plasmid found in S. coelicolor that encodes methylenomycin (Kirby et al. 1975; Kirby and Hopwood 1977; Bentley et al. 2004). There are an additional 376 non-unique, alternative and cross-hybridizing sequences that are also spotted onto the array together with no probe spots and control spots. These microarrays do not include a number of transposition element related genes. The sequences of the PCR products were unavailable due to intellectual property protection requirements.

Strains and growth conditions

The following species were used for the interspecific comparative genomics aspect of this study: Streptomyces coelicolor A3(2) (SCP1+), Streptomyces antibioticus (ATCC15848), Streptomyces argenteolus (ATCC11009), Streptomyces aureofaciens (BCRC11610), Streptomyces bikiniensis (ATCC11062), Streptomyces cattleya (ATCC35852), Streptomyces clavuligerus (ATCC27064), Streptomyces fradiae (BCRC11172), Streptomyces hydrogenans (BCRC11855), Streptomyces lipmanii (BCRC11889), Streptomyces maritimus (Yang-Ming), Streptomyces rimosus subsp. rimosus (type strain from Pfizer Ltd and purportedly ATCC10970), Streptomyces rochei (BCRC15102), Streptomyces tanashiensis (ATCC23967), Streptomyces venezuelae (BCRC11510), Streptomyces virginiae (ATCC12630), Streptosporangium roseum (ATCC12428) and Saccharomonospora viridis (ATCC15345). Fresh spores were collected from solid medium (R5 agar) and mycelium cultured in TSB liquid medium with 0.5% glycine at 30°C for 3 days.

Preparation of labeled DNA

Genomic DNA from a stationary phase culture was purified by the salting out procedure (Pospiech and Neumann 1995) and was sonicated to a size less than 2 kb. In total, 4–6 μg of sonicated genomic DNA were used as template and this was denatured in the presence of 12 μg of 72%-GC-content random hexamers in a total volume of 25 μl at 100°C for 10 min. The mixture was then snap-cooled on ice before adding the remaining reaction components: 1.5 μl of Cy3-dCTP or Cy5-dCTP (Amersham Pharmacia Biotech), 4 μl Klenow fragment (NEB #212), 5 μl Klenow buffer, 0.5 μl dNTP (4 mM dATP, 4 mM dTTP, 10 mM dGTP, and 0.2 mM dCTP), and 14 μl double distilled H2O. The random primed labeling reaction was carried out for 2–3 h at 37°C. Buffer exchange, purification and concentration of the DNA products were accomplished by three cycles of diluting the reaction mixture in 0.5 ml TE buffer (10 mM Tris and 1 mM EDTA pH 8.0) and filtering through a Microcon-30 microconcentrator (Millipore).
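As a quick consistency check on the labeling reaction, the volumes listed above can be summed and the nucleotide concentrations in the final reaction estimated. This is only a minimal sketch: it assumes that the volumes are simply additive and that the quoted dNTP concentrations are those of the 0.5 μl stock mix. The variable and function names are illustrative and are not part of the protocol.

```python
# Volumes in microlitres, taken from the labeling protocol described above.
denatured_mix_ul = 25.0  # sonicated genomic DNA plus random hexamers
added_components_ul = {
    "Cy3-dCTP or Cy5-dCTP": 1.5,
    "Klenow fragment": 4.0,
    "Klenow buffer": 5.0,
    "dNTP mix": 0.5,
    "double distilled H2O": 14.0,
}

# Assumption: the quoted concentrations describe the dNTP stock mix (in mM).
dntp_stock_mM = {"dATP": 4.0, "dTTP": 4.0, "dGTP": 10.0, "dCTP": 0.2}

total_volume_ul = denatured_mix_ul + sum(added_components_ul.values())


def final_concentration_uM(stock_mM, stock_volume_ul, total_ul):
    """Dilute a stock given in mM into the final volume; result in micromolar."""
    return stock_mM * stock_volume_ul / total_ul * 1000.0


final_dntp_uM = {
    name: final_concentration_uM(conc, added_components_ul["dNTP mix"], total_volume_ul)
    for name, conc in dntp_stock_mM.items()
}

print(f"Total reaction volume: {total_volume_ul} ul")
for name, conc in final_dntp_uM.items():
    print(f"{name}: {conc:.0f} uM")
```

Under these assumptions the reaction volume is 50 μl and unlabeled dCTP ends up far more dilute than the other nucleotides, which would be consistent with favouring incorporation of the Cy-labeled dCTP.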

Microarray hybridization and data analysis

In all cases microarray hybridizations were carried out in duplicate. The two DNA pools to be compared were mixed and applied to an array in a hybridization mixture that contained 3.68× SSC, 0.18% SDS, and 1 μg yeast tRNA (total 16.3 μl), which had been heated at 100°C for 5 min before being applied to the array. Hybridization took place under a glass coverslip sealed with glue in a humidified Omnislide (Thermo Hybaid) at 60°C for 12–14 h. The slides were washed, dried and scanned for fluorescence using a GenePix 4000B scanner (Axon Instruments). Average signal intensity and local background measurements were obtained for each spot on each array using GenePix Pro software. The dataset was screened for aberrant spots and these were eliminated from the analysis after manual checking. The signal from each gene spot was analyzed and processed using ScanAlyze (Eisen et al. 1998; Gollub et al. 1999). The data were then processed into a mean Log2 Cy3/Cy5 ratio format. The dataset was normalized for each array separately and exported to Excel where, after checking the alignment of the datasets from each array, a mean signal for each common gene was calculated. Based on Bentley et al. (2002), the mean signal and standard deviation for the core region of genes from SCO2050 to SCO5800 were calculated. The standard deviation was used to set a cut-off for gene absence at 2 SD below the core mean signal. The microarray data are presented relative to the S. coelicolor standard in three ways. Firstly, as a grey scale plot changing from white, representing a negative hybridization signal, to black, representing a positive hybridization signal, created using the program Treeview (Eisen et al. 1998). Secondly, as numeric values for the signal from each gene, which are presented in Table 1. Thirdly, as a color plot with green as the negative hybridization signal, black as an equal hybridization signal and red as a positive hybridization signal (see supplementary Fig. S1).
The microarray data for the S. rimosus species described here can be accessed at NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/).
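The ratio and cut-off steps described above can be sketched in a few lines. This is a minimal illustration and not the workflow actually used in the study, which relied on ScanAlyze and Excel. The function names, the toy intensities and the use of the sample standard deviation are assumptions made for illustration, and the intensities are assumed to be already background-corrected. The core region is taken as SCO2050 to SCO5800, as stated in the text.

```python
import math
import statistics


def log2_ratio(cy3, cy5):
    """Log2 of the Cy3/Cy5 intensity ratio for one spot (inputs assumed background-corrected)."""
    return math.log2(cy3 / cy5)


def mean_log2_ratio(replicates):
    """Average the log2 ratios of one gene across duplicate hybridizations."""
    return statistics.mean(log2_ratio(cy3, cy5) for cy3, cy5 in replicates)


def sco_number(gene_id):
    """Extract the numeric part of an identifier such as 'SCO2050'."""
    return int(gene_id[3:])


def absence_cutoff(signals, core_start=2050, core_end=5800, n_sd=2.0):
    """Return (core mean, core SD, cut-off) computed from genes inside the core region."""
    core = [s for g, s in signals.items() if core_start <= sco_number(g) <= core_end]
    core_mean = statistics.mean(core)
    core_sd = statistics.stdev(core)  # assumption: sample standard deviation
    return core_mean, core_sd, core_mean - n_sd * core_sd


# Toy duplicate (Cy3, Cy5) intensities; illustrative values only, not real data.
raw = {
    "SCO0409": [(50.0, 800.0), (50.0, 800.0)],
    "SCO2082": [(1000.0, 1000.0), (1000.0, 1000.0)],
    "SCO3886": [(500.0, 1000.0), (500.0, 1000.0)],
    "SCO5314": [(2000.0, 1000.0), (2000.0, 1000.0)],
}
signals = {gene: mean_log2_ratio(reps) for gene, reps in raw.items()}
core_mean, core_sd, cutoff = absence_cutoff(signals)
absent = sorted(g for g, s in signals.items() if s < cutoff)
print(absent)
```

With these toy values the three core genes give a mean of 0 and a standard deviation of 1, so the cut-off is −2 and only the gene with a mean log2 ratio of −4 is flagged as absent.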

Table 1 Analysis of the genes involved in sporulation and development using DNA/DNA microarray signal data for various unsequenced sporoactinomyces together with the identification of the presence of homologous genes in complete genome sequences of other Actinomycetales using bioinformatics

Bioinformatics analysis

The S. coelicolor genes involved in sporulation/development that were identified in the review of Chater and Chandra (2006) were used as the base list for investigation in this study. Other genes were added after a comprehensive literature search so that the list was as inclusive as possible. Each gene sequence from S. coelicolor was then used as a BLAST query against the genome sequences of five Actinomycetales, namely Streptomyces avermitilis, Streptomyces griseus, Saccharopolyspora erythraea, Thermobifida fusca and Salinispora tropica. Unfortunately, due to the lack of a searchable ORF database, this approach could not as yet be extended to the other available complete genome sequence, S. scabies. Using the same criteria as de Been et al. (2006), the orthologs of the S. coelicolor genes that are present in these genomes were identified and are also listed in Table 1.

Criteria used to analyze gene conservation in the various species using the microarray data

A lack of gene conservation or gene absence in a particular species was specified as a microarray signal that was lower than two standard deviations below the mean gene signal for that species. A lack of gene conservation across the Actinomycetales species investigated in this study was specified as an average gene signal across all the species studied of less than zero. The microarray signals and the status of all the genes analyzed in this study with respect to conservation/gene loss are shown in Table 1.
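The two criteria above translate directly into a small classification routine. The sketch below is illustrative only: the species labels, gene labels, signal values and function names are hypothetical. The per-species mean and standard deviation are computed here over all genes supplied, which is an assumption about how the mean gene signal for a species was obtained.

```python
import statistics


def absent_genes(species_signals, n_sd=2.0):
    """Genes whose signal lies more than n_sd SDs below that species' mean signal."""
    values = list(species_signals.values())
    threshold = statistics.mean(values) - n_sd * statistics.stdev(values)
    return {gene for gene, value in species_signals.items() if value < threshold}


def poorly_conserved(signals_by_species):
    """Genes whose signal averaged across all species is below zero."""
    genes = next(iter(signals_by_species.values())).keys()
    return {
        gene
        for gene in genes
        if statistics.mean(sp[gene] for sp in signals_by_species.values()) < 0
    }


# Hypothetical log2 signals for two species; toy values, not measurements.
signals = {
    "species_A": {
        "gene1": -4.0, "gene2": 0.2, "gene3": -0.2, "gene4": 0.1,
        "gene5": -0.1, "gene6": 0.0, "gene7": 0.3, "gene8": -0.3,
    },
    "species_B": {
        "gene1": -0.3, "gene2": 0.1, "gene3": 0.4, "gene4": 0.2,
        "gene5": 0.3, "gene6": 0.1, "gene7": 0.0, "gene8": -0.1,
    },
}

for species, data in signals.items():
    print(species, "absent:", sorted(absent_genes(data)))
print("poorly conserved overall:", sorted(poorly_conserved(signals)))
```

One practical consequence of a cut-off based on the standard deviation is that a single outlier can only fall more than two sample SDs below the mean when enough genes are included, so the rule is meaningful for a genome-scale dataset but not for a handful of genes.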

Results and discussion

The microarray signals (Fig. S1) and the status of all the genes analyzed in this study with respect to conservation/gene loss are shown in Table 1. For convenience, the genes analyzed in this study have been divided into groups based on their functions in morphogenesis and these groups are summarized and discussed below.

Spore structure

Spore structure is highly variable within the sporulating Actinomycetales and Streptomyces are no exception (Miyadoh 1997). Therefore, a lack of conservation among the genes underlying these structures is not unexpected; it may reflect a high level of sequence variation in these genes, which would result in the low hybridization signals found in the microarray comparison. The microarray dataset identifies sapA (SCO0409), rdlA (SCO2718), rdlB (SCO2719), sortase A (SCO3849) and many of the genes in the whiE cluster (SCO5314 to SCO5321) as showing low hybridization signals, suggesting low conservation at the DNA level. In S. avermitilis, sapA is absent from the genome, whereas in S. coelicolor and S. scabies it lies in the unstable terminal regions; furthermore, although it is present in the core region of S. griseus, it is absent from Sacc. erythraea and Sal. tropica. This suggests that SapA varies tremendously across the Actinomycetales and may reflect niche specificity, of which we know little in the Streptomycetes. These results also correlate well with the high level of detected variation in the presence of rdlA and rdlB, which form the highly insoluble hydrophobic rodlet outer layer of the spores (Wildermuth et al. 1971; Smucker and Pfister 1978; Claessen et al. 2002). Such variability is supported by the genome sequence of S. avermitilis, where no rdlA/rdlB homologues can be detected, and by that of S. griseus, where the rdl gene cluster is different from that in S. coelicolor. If the ecology of sporoactinomyces varies greatly from terrestrial to aquatic environments, then so will the need for hydrophobic surface proteins, especially as rodlets are known not to be essential for spore formation (Claessen et al. 2002).

Conservation of the chaplin genes (chpA to chpH) also seems to correlate with the functional requirements of spore structural proteins (Claessen et al. 2003, 2004; Elliot et al. 2003). These proteins are involved in the assembly of the rodlet layer, with ChpD to ChpH, the shorter chaplins, being involved in reducing surface tension during the erection of aerial hyphae. In general, the chaplin genes seem to be well conserved across the Streptomyces based on the microarray data, with ChpC, one of the large chaplins, being the least conserved despite the role it plays in augmentation of aerial hyphae formation and assembly of small chaplins and rodlins on the spore surface (Di Berardo et al. 2008). This lack of conservation may also reflect sequence diversity in this gene; conversely, there is strong conservation of the small amyloid-like fibril forming chaplins (de Jong et al. 2009; Elliot et al. 2003). In silico analysis of the Sacc. erythraea and Sal. tropica genomes indicates an absence of these genes in these more divergent species, which suggests that, at least in the case of the aquatic Salinispora, hydrophobic spore proteins may represent a soil niche adaptation. Therefore, the chaplins may be genus specific in their distribution.

The sortases are involved in exporting proteins for anchoring in the cell wall (Marraffini and Schneewind 2006). In Streptomyces this includes the chaplins (Elliot et al. 2003), and seven sortases have been identified in S. coelicolor (Bentley et al. 2002). All except two of these genes, SCO2841 and SCO3850, show a low level of conservation based on the microarray data, suggesting that SCO2841 and SCO3850, both in the core region of the Streptomyces genome, may be the major sortases that export these proteins. The genes upstream of SCO2841 in S. coelicolor are conserved syntenously in the genomes of S. avermitilis and S. griseus, while those downstream are not. No synteny is apparent in the Sacc. erythraea and Sal. tropica genomes. Additionally, the putative sortase SCO3849, which is in the core region and beside another putative sortase (SCO3850), is not as well conserved as its partner based on the microarray, yet in silico analysis demonstrates synteny across the S. avermitilis, S. scabies and S. griseus genomes. It is intriguing to hypothesize that these and the other sortases outside the core regions are species specific and help to target spore coat proteins such as ChpC for survival in specific environmental niches.

Cell wall structure

The cell wall structure and composition of the Gram-positive Streptomyces is one of the basic characteristics of the genus and thus most genes involved in cell wall biosynthesis are likely to be conserved. Indeed, the division and cell wall cluster (dcw) (Tamames et al. 2001) is highly conserved throughout the eubacteria and largely correlates with cell shape. The microarray results presented here suggest that this conservation holds for most of the muropeptide biosynthetic genes, including murD, murE, murF, murG and murX, and extends throughout the Actinomycetales, as might be expected. The major exceptions in this study are murA and murA2, which encode UDP-N-acetylglucosamine transferases. murA shows more variation in microarray signal than murA2 and exhibits a significant lack of signal with non-Streptomyces sporulating Actinomycetales; the genome sequences of Sacc. erythraea and Sal. tropica confirm this latter observation, although highly divergent copies may be present. In contrast, murA2 is well conserved based on the microarray results. There is evidence for lateral transfer of murA-like genes in the bacterial lineage (Griffiths and Gupta 2002), suggesting that the presence of a paralog within the genomes has led to rapid divergence of the sequences. This evidence perhaps indicates that murA is involved in creating genus specific variation in the peptidoglycan cell wall structure of the Streptomyces, perhaps for aerial hyphae, while murA2 is required for core peptidoglycan backbone structures.

The remainder of the dcw genes, which are involved in cell growth and chromosome segregation, form a cluster from SCO2077 to SCO2092 and are all highly conserved with the exceptions of DivIVA (SCO2077) and SCO2078, a gene encoding a hypothetical protein. DivIVA has homologues throughout the Gram-positive bacteria, including the species analyzed here, and the lack of hybridization may reflect sequence diversity. DivIVA is essential for polar growth and morphogenesis (Flärdh 2003), which is highly variable across the Actinomycetes in terms of branching frequency and fragmentation. SCO2078 is not as variable as divIVA, but the diversity is still much higher than the other conserved genes in this cluster.

Outside of the main dcw cluster, ftsH (two genes, SCO3404 and SCO5587), ftsK (four genes, SCO3934, SCO4508, SCO5734 and SCO5780) and ftsW (SCO3846) have been identified in S. coelicolor (Wang et al. 2007; Datta et al. 2006). FtsW, a binding partner of FtsZ (Mercer and Weiss 2002), is also well conserved, as would be expected based on the conservation of FtsZ. FtsK, which is involved in coupling cell division and chromosome segregation, has one copy, SCO5780 (FtsK1), which has been shown by Wang et al. (2007) to be involved in the correct segregation of the linear chromosome into spores and is highly conserved. In contrast, SCO3934 (FtsK2) varies to some extent in its conservation within the Streptomyces based on the microarray data and is also within a known HTR/GI. Thus, it is highly likely that this gene is a recent addition to the S. coelicolor genome, which is supported by its absence when analyzed bioinformatically. FtsK3 (SCO4508) is conserved both bioinformatically and based on the microarray data, while ftsK4 (SCO5734) is not conserved either bioinformatically or based on the microarray data. Neither of these genes is in a horizontally transferred region. A phylogenetic analysis of the four types of ftsK shows higher genetic diversity among the genes outside of the main ftsK1 cluster (Fig. 1). Of these, ftsK4 is present in all analyzed Streptomyces as well as Salinispora tropica, Nocardia farcinica and Saccharopolyspora erythraea. This contrasts with ftsK2 and ftsK3, which are present in some Streptomyces and Frankia genomes, but from two very distinct clades. Frankia lacks ftsK4, which suggests that it is replaced by ftsK3, which is present in all three Frankia genomes. The microarray data agree with the above, in that ftsK4 is generally present in the Streptomyces, including S. rimosus, and also in Saccharomonospora and Streptosporangium.
This contrasts with ftsK2, which shows a variable presence in Streptomyces as well as other genera, and with ftsK3, which is generally absent from Streptomyces and other genera. Together, these results support ftsK4 as a secondary ftsK gene within the Actinomycetales, perhaps with important functions. FtsK2, on the other hand, would seem to be horizontally transferred when present, while FtsK3 may be the functional equivalent of FtsK4 in Frankia but is present in only a few other Actinomycetales; the functional significance in these species remains to be seen.

Fig. 1

Protein sequence phylogenetic tree of various Actinobacterial FtsK proteins rooted using FtsK sequences from other Gram-positive bacteria. The four FtsK clades are indicated. FtsK1 is the homologous group of genes that are present in most Actinomycetales. FtsK4 is a group that is present in Streptomyces and other sporoactinomyces. FtsK3 is a group that seems to be quite widely distributed across the Streptomyces and includes one example from Frankia EAN1pec, which has three FtsK copies. FtsK2 is a group that seems to be largely Frankia related but with homologues in S. coelicolor and Streptomyces sp. SPB74

FtsH is a metalloprotease that seems to be anchored to the cytoplasmic membrane and in Bacillus is involved in development (Wehrl et al. 2000). Deletion in Bacillus causes filamentous growth. The function of the protein in Streptomyces has not been explored, but two paralogues (SCO3404 and SCO5587) are conserved across the Streptomyces and closely related species and are therefore worthy of further study.

The gene mreB (SCO2611) in this cluster, the knockout of which results in defective spores although the mutant grows and develops normally (Mazza et al. 2006), is highly conserved as would be expected, as are mreD (SCO2609) and sfr (SCO2607). The functionally important but not essential gene mreC (SCO2610) is more variable, whilst pbp2 (SCO2608), particularly in non-Streptomyces, is also quite variable.

The synteny of the dcw cluster contrasts with the other major group of proteins thought to be involved in cell wall biosynthesis, the penicillin binding proteins (PBPs). These show much more variation, with 9 of the 14 annotated PBPs being relatively well conserved in the Streptomyces; however, only six of these are conserved across the non-Streptomyces species. An essential core of Actinomycetales penicillin binding proteins (SCO2897, SCO3847, SCO4013, SCO5039 and SCO5301) can be identified from this microarray analysis. Of these, all exhibit syntenous conservation in the sequenced Streptomyces genomes, with the exception of SCO4013, which seems to be conserved by microarray, yet is non-syntenous in the sequenced genomes. Microarray data suggest conservation within the Streptomyces of SCO2608, SCO3157 and SCO5110, which may have specifically evolved within the Streptomyces to produce genus specific cell wall components. Finally, SCO2608, SCO3156, SCO3580, SCO3771 and SCO3901 all show high variability in microarray signal levels between species in addition to low conservation in non-Streptomyces species, suggesting they may encode species specific genes involved in cell wall formation.

Aerial hyphae erection

bldN, bldM and their interacting partners

BldN is a sigma factor that directly controls bldM, and is required for aerial mycelium formation (Bibb et al. 2000). bldN itself is conserved across almost all Streptomyces but not in Saccharomonospora, Streptosporangium and Streptomyces rimosus by microarray analysis. The gene is also present in Sacc. erythraea and Sal. tropica but is absent from Rhodococcus RHA1 and Thermobifida fusca. This suggests that BldN functionally evolved in response to the evolution of sporulation, being found in organisms closely related to the Streptomyces, rather than as part of the sporulation process itself (Chater and Chandra 2006) or, alternatively, that it has undergone gene loss during diversification of the Actinomycetales lineage.

WhiJ

The functions of whiJ (SCO4543) and the whiJ-like gene SCO1242 are as yet unknown, although they are part of an extensive and apparently Streptomyces specific gene family including bldB (SCO5723) and SCO0704, all of which may have developmental functions (Chater and Chandra 2006). These genes appear to encode regulatory proteins (Aínsa et al. 2010), with all four of these genes showing low levels of conservation across the Actinomycetales; this suggests that they may have been acquired horizontally or they may have arisen through gene duplication.

ParA/B

There are three parA homologues in the S. coelicolor genome (parA1, parA2 and minD1) with one copy of parB. There has been considerable work on the paired genes parA1/parB (SCO3886 and SCO3887), which are found in most bacteria (Jakimowicz et al. 2006). This region appears to be highly syntenous in all the sequenced organisms examined in this study, as would be expected. Moreover, the parA2 gene (SCO1772) and surrounding genes are also syntenously conserved, suggesting that it was present in the progenitor Actinomycetales or arose relatively early in the Actinomycetales lineage, as it is not within a horizontally transferred region (HTR/GI) (Hsiao and Kirby 2008). The third parA-like gene (SCO5006), annotated as minD1 (http://strepdb.streptomyces.org.uk), is conserved within the Streptomyces in a syntenous manner, yet appears to be absent outside of this genus, with the exception of the three fully sequenced Frankia genomes. The largely genus specific conservation of SCO5006 suggests that it is not of plasmid origin via horizontal transfer, as it is not found in a putative HTR/GI region and has an average genome GC content. The microarray data demonstrate that the parA2 gene is highly conserved along with parB (SCO3887). Unexpectedly, parA1 (SCO3886) shows significant divergence across the Streptomyces in terms of the microarray data; however, this agrees with the phylogenetic tree shown in Fig. 2. The minD1 gene appears to be conserved when assessed using the microarray results, although the phylogenetic tree suggests there is significant sequence diversity within this gene. Thus within this genus, there is a single conserved ParB protein that binds to the chromosome during segregation, with three potential ParA-like proteins that might interact with ParB to permit chromosome segregation.
The sequence divergence observed for parA1 (SCO3886) may indicate how each species evolves a unique ParA to control chromosome copy number, especially given the multigenomic nature of Streptomyces cells. It is also possible that the other ParA homologues play a role in different aspects of chromosome segregation during development. One homologue (possibly ParA2) may be involved in conserved aspects of development, while the other may produce some of the diversity seen in Streptomyces aerial mycelium morphology between species. The identification of minD1 homologues outside of the Streptomyces in the three Frankia genomes, coupled with their sequence divergence from the Streptomyces clade, perhaps suggests a divergence in function, given that this genus lacks aerial mycelium and instead forms spores directly on the vegetative mycelium. Overall, the clade structures of parA1 and parA2 are similar, but there are some distinct differences such as the positions of Sacc. erythraea and Sal. tropica.

Fig. 2

Protein sequence phylogenetic tree of various Actinobacterial ParA1 and ParA2 proteins rooted using par gene sequences from other Gram-positive bacteria. The higher diversity of Streptomyces ParA1 genes compared to Streptomyces ParA2 genes is marked. Three versions of Par are present in the Frankia spp.

The bldJ/bldK,L/bldA,H/bldG/bldC/bldD,M/ram genes

Whilst the exact gene functions of bldJ and bldL are unknown, bldC and bldD, which have homologues in the aerial hyphae forming, sporulating T. fusca, show good conservation in the Streptomyces and closely related species. This suggests that the function of these genes is important and that their sequences, as judged by microarray hybridization, are not subject to rapid evolutionary change. This also appears to be true of bldM, which is downstream of the bldN sigma factor and RamA,B,C during development (Keijser et al. 2002). The response regulator RamR and the small precursor protein RamS are not well conserved, suggesting species specificity in this part of the developmental process. This is consistent with the divergence seen in the spore structures, which are highly variable. The rag gene cluster (SCO4072–SCO4075) is conserved within the Streptomyces, but much less so outside, suggesting that these components of the RamR regulon are important to the development of Streptomyces specifically. The Ram cluster, with the exception of RamR, is present in Saccharopolyspora, suggesting that Rag evolved later than Ram and that RamR is the specific interaction point, as suggested by Keijser et al. (2002).

Based on our array data, the developmental genes that are Streptomyces specific in terms of gene distribution (bldG, bldH and bldK; Champness 1988; Bignell et al. 2000; Nodwell et al. 1999) are not well conserved across the Streptomyces at the level of nucleotide sequence, yet appear to be well distributed across the developing Actinomycetales when analyzed using BlastN.

SmeA/SffA

These proteins are involved in spore maturation, chromosome segregation and septal placement following Z-ring assembly in the aerial hyphae (Ausmees et al. 2007). They are highly conserved across all Streptomyces and closely related species, but not the more divergent Actinomycetales. This supports a major role in species specific spore development and spore chromosome segregation. Data emerging from the Broad Institute sequencing initiative (www.broadinstitute.org) also confirm this observation.

ssgA and ssgA-like genes

The SsgA-like proteins (SALPs) are found exclusively in sporulating Actinomycetales, and our results confirm the exclusive nature of these genes (Noens et al. 2005). The SALPs are proposed to play a chaperonin-like role in peptidoglycan maintenance and, based on this hypothesis, the highly conserved SALP (ssgB; SCO1541) ought to play a central role in this process (Xu et al. 2009). The less conserved genes, ssgC and ssgF (absent from S. griseus) and ssgG (absent from S. avermitilis), are perhaps involved in more species-specific roles. This agrees with previous work (Noens et al. 2005), in which SsgF is proposed to act with SsgE in autolytic spore separation and SsgG in septum location, perhaps both in a species-specific manner.

DevA–DevE cluster

This group of genes contains devA (SCO4190), which encodes a vegetatively expressed, metabolite-responsive GntR transcriptional regulator that represses its own expression, is expressed transiently in the substrate mycelium and, on deletion, causes a major disruption of sporulation (Hoskisson et al. 2006). The gene in the same operon, devB (SCO4191), encodes a hydrolase, disruption of which also affects sporulation. SCO4191 is conserved in our study, while SCO4190 is less well conserved but not highly divergent. The other GntR regulator, devE (SCO4188), is conserved, while the two small ORFs, SCO4189 and SCO4187, are poorly conserved. This suggests that DevA and DevE may have important roles in responding to intracellular metabolites, with DevA perhaps having a more species-specific role in terms of what triggers the regulator.

WhiG, WhiH, and WhiI

These genes are involved in aerial mycelium development, with WhiG being a sigma factor that targets whiH and whiI (Tian et al. 2007). All except whiI show strong conservation within the Streptomyces based on the microarray data, suggesting that whiI might have species-specific functions. However, Blast analysis of whiI indicates that this gene is well conserved throughout the Streptomyces.

WhiA, WhiB, and SigF

The genes whiA and whiB, but not sigF, are reported by Chater and Chandra (2006) to have homologues in the simple Actinobacteria. In the microarray data, whiA and whiB are conserved, whereas sigF is conserved within the Streptomyces and related strains but not across the broader range of sporoactinomycetes, confirming the important role it plays in late spore development (Kelemen et al. 1996).

whiB-like genes

These include whiD and form a group of putative regulatory genes, some of which are well known to be involved in development, such as WhiD and WblA (Chater and Chandra 2006). Most show significant variation in the microarray signal and low conservation across the species, including whiD (SCO4767); the exceptions are wblA (SCO3579), wblC (SCO5190), wblI (SCO5046) and wblK (SCO7306). This suggests a high degree of variation in function across the Actinomycetes for this group of genes, exemplified by the role these genes play in drug resistance in mycobacteria (Morris et al. 2005).

Gyrases

Two pairs of gyrase gene homologs are conserved in Streptomyces based on the microarray data, suggesting that both the core gyrase pair (SCO3873 and SCO3874) and the gyrase pair found close to the Streptomyces-specific region (SCO5822 and SCO5836) are functional in the Streptomyces and other closely related species. This duplication seems to be absent in more distant species. Phylogenetic analysis, together with the microarray results, suggests that the duplication of the gyrases may have occurred early in the evolution of the sporulating Actinobacteria, but after Saccharopolyspora had diverged from the lineage. However, gene loss in Saccharopolyspora cannot be excluded. This hypothesis is supported by the parallel evolutionary paths of gyrA1/gyrA2 and gyrB1/gyrB2, which show high congruence (Fig. 3). As would be expected for protein pairs that interact, gyrA1/gyrB1 and gyrA2/gyrB2 also show good congruence. One interesting observation is the presence of a third gyrase subunit, gyrB3, in S. griseus. The position of gyrB3 in the phylogenetic tree suggests a horizontal transfer event from outside the Streptomyces.

Fig. 3

Comparative protein sequence analysis of various Actinobacterial GyrA and GyrB proteins, rooted using GyrA and GyrB sequences from other Gram-positive bacteria. Note the agreement between the two trees, suggesting that GyrA1/GyrB1 and GyrA2/GyrB2 have evolved in parallel throughout recent sporoactinomycete evolution.
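
The congruence between the GyrA and GyrB trees is assessed visually here. As an illustration only, one common way to quantify agreement between two rooted trees is to count the clades found in one tree but not the other (a Robinson-Foulds style distance). The sketch below is not the method used in this study; the taxon labels (sp1 to sp4) and topologies are hypothetical placeholders and are not taken from Fig. 3.

```python
from typing import FrozenSet, Set, Tuple, Union

# A tree is either a leaf label or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set under every internal node, including the root."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def clade_distance(a: Tree, b: Tree) -> int:
    """Number of clades present in exactly one of the two trees (0 = same topology)."""
    return len(clades(a) ^ clades(b))


# Hypothetical topologies; leaves are species labels shared by both trees.
gyra_tree = ((("sp1", "sp2"), "sp3"), "sp4")
gyrb_tree = ((("sp1", "sp2"), "sp3"), "sp4")
discordant_tree = ((("sp1", "sp3"), "sp2"), "sp4")

print(clade_distance(gyra_tree, gyrb_tree))        # congruent pair
print(clade_distance(gyra_tree, discordant_tree))  # one grouping differs
```

A distance of zero indicates identical branching order for the shared taxa; larger values indicate more conflicting groupings. Both trees must be labelled with the same set of taxa for the comparison to be meaningful.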

This study provides a microarray analysis of the conservation of genes associated with development and sporulation in the Streptomyces and related sporoactinomycetes, using DNA/DNA microarray hybridization data informed by bioinformatics. Studies such as this provide a basis for targeting areas of interest for further study, in particular genes that may have a significant role in the developmental process. This is particularly true of genes that are present in more than one copy in S. coelicolor due to lineage-specific duplication and amplification, an area of increasing interest (Andersson and Hughes 2009). Although DNA/DNA microarray analysis can never rival full genome sequencing, owing to inaccuracies caused by chip-to-chip variation, experimental variation and intergenic cross-hybridization, it does provide a useful overview that pinpoints genes of particular interest based on conservation.
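
To make the idea of calling conservation from hybridization signal concrete, the following minimal sketch classifies a gene from the log2 ratio of test-strain signal to reference-strain signal. It is an illustration under stated assumptions, not the analysis pipeline of this study: the cutoff values, the gene names (geneA to geneC) and the signal intensities are all hypothetical.

```python
import math


def log2_ratio(test_signal: float, reference_signal: float) -> float:
    """log2 of test-strain signal over reference-strain signal for one probe."""
    return math.log2(test_signal / reference_signal)


def classify(ratio: float,
             conserved_cutoff: float = -1.0,   # assumed threshold, not from the study
             divergent_cutoff: float = -2.0) -> str:  # assumed threshold
    """Assign a coarse conservation call from a log2 hybridization ratio."""
    if ratio >= conserved_cutoff:
        return "conserved"
    if ratio <= divergent_cutoff:
        return "absent or highly divergent"
    return "intermediate"


# Hypothetical (test, reference) intensities for three made-up genes.
signals = {
    "geneA": (950.0, 1000.0),
    "geneB": (400.0, 1000.0),
    "geneC": (100.0, 1000.0),
}

calls = {gene: classify(log2_ratio(test, ref))
         for gene, (test, ref) in signals.items()}

for gene, call in calls.items():
    print(gene, call)
```

Because a weak signal can reflect either gene absence or sequence divergence, and a strong signal can arise from cross-hybridization, calls of this kind are best treated as a screen to be checked against sequence data where available.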