Introduction

The diversification of morphology and the emergence of new appendage functions have been important features of animal evolution. This is best exemplified by the diversity seen among closely related species of Arthropoda. In bilaterians, the Hox genes, first discovered in Drosophila melanogaster, regulate the development of serially homologous structures along the anterior-posterior axis (Lewis 1978; McGinnis and Krumlauf 1992; Carroll 1995). Subsequently, a large number of studies have reported that variations in Hox gene function, due to changes in (1) their copy number, (2) their expression patterns and (3) their regulation of downstream target genes (reviewed in Hrycaj and Wellik 2016; Pick and Heffer 2012), are critical for morphological diversification.

In Drosophila melanogaster, the Hox protein Ultrabithorax (Ubx) specifies the development of halteres in the third thoracic (T3) segment (for detailed review see Lewis 1978 and Khan et al. 2020). Loss-of-function mutations of Ubx give rise to four-winged flies with complete duplication of the T2 segment in the place of T3, while overexpression of Ubx in larval wing discs leads to wing-to-haltere transformations (Lewis 1978; Castelli-Gair et al. 1990; Cabrera et al. 1985; White and Akam 1985; White and Wilcox 1985). While Ubx is expressed in hindwing primordia of different insect species (Carroll 1995; Prasad et al. 2016), the outcome of its expression is not the same in all insect groups. Functional studies have shown that Ubx is required for the suppression of elytra and the specification of hindwings in the T3 of Tribolium (Tomoyasu et al. 2005) and to specify differences in eyespot patterns between the hindwings and forewings in Precis (Weatherbee et al. 1999; Matsuoka and Monteiro 2021). However, in Hymenopterans such as Apis mellifera, wherein T2 and T3 are distinguished by a marginally smaller hindwing, Ubx is expressed in both forewing and hindwing primordia, although the expression is stronger in the hindwing primordia (Prasad et al. 2016). Over-expression of Ubx derived from A. mellifera, B. mori or T. castaneum can suppress wing development and specify haltere fate in transgenic D. melanogaster (Prasad et al. 2016), suggesting that changes in Ubx at the protein level may have contributed little, if at all, to the evolution of its function. This, in turn, is consistent with one of the prevalent hypotheses in the field, which proposes that the evolution of Ubx binding sites in the cis-regulatory elements of target genes determines the differential regulation of common targets, thereby steering wing to haltere evolution (Weatherbee et al. 1999). A study comparing genome-wide targets of Ubx in developing halteres of D. melanogaster and developing hindwings of A. mellifera and B. mori revealed that a large number of genes have remained common targets of Ubx in these three species (Prasad et al. 2016), which diverged nearly 350 million years ago. Among those common targets, a few wing patterning genes are differentially expressed in D. melanogaster, but not in A. mellifera or B. mori (Prasad et al. 2016).

To identify evolutionary changes in the regulatory sequences of targets of Ubx that may have brought some of the wing patterning genes under the regulation of Ubx, we performed comparative analyses of genome-wide Ubx-binding motifs between D. melanogaster and A. mellifera. We find that while in Drosophila a motif with a core TAAAT sequence (TDATTTATGG; hereafter referred to as the TAAAT motif) was enriched in sequences pulled down by Ubx ChIP, the same was not observed in A. mellifera. However, previously reported low-affinity binding sites such as those with the TAAT motif (Galant et al. 2002; Walsh and Carroll 2007; Crocker et al. 2015; Loker et al. 2021) were not enriched in ChIP pulled-down sequences from either Drosophila or Apis (Prasad et al. 2016). In this context, we focused on how changes in Ubx binding sites can influence haltere fate specification during the evolution of Dipterans. Using a combination of in vitro, cell culture and transgenic assays, we show that the TAAAT motif is important (and bears greater relevance than the TAAT motif) for the Ubx-mediated regulation of one of its target genes, CG13222, in Drosophila. We then analyzed the significance of the TAAAT motif in the Ubx-mediated down-regulation of a Drosophila pro-wing gene, vestigial (vg), as against the TAAT motif in its orthologue in Apis. In transgenic D. melanogaster, the enhancer of vg of A. mellifera drives reporter gene expression in both wings and halteres. We show that changing the TAAT motif to a TAAAT motif was sufficient to bring this enhancer of vg from Apis under the regulation of Drosophila Ubx. Taken together, our study suggests that evolutionary changes in the regulatory sequences of Ubx targets in dipteran lineages, such as from TAAT to TAAAT as the core sequence in Ubx-binding motifs, may have been an important mechanism contributing to the evolution of wings into halteres.

Results

Motif With a TAAAT Core Sequence is Enriched for Direct Targets of Ubx in Developing Halteres in D. melanogaster but not for Developing Hindwings in A. mellifera

Previous studies indicate that despite the differences in hindwing morphology, the Ubx protein is expressed in the third thoracic segment of both insects (as well as other insect lineages) (Fig. 1A) (Carroll 1995; Prasad et al. 2016). Additionally, Ubx is highly conserved across A. mellifera, B. mori, T. castaneum, Junonia coenia and D. melanogaster at the level of its DNA-binding homeodomain and the protein interaction domains, YPWM and UbdA (Supplementary Figure S1A). We have previously shown that overexpression of Ubx from different insect species in the wing imaginal disc of transgenic Drosophila resulted in wing to haltere transformation in a pattern similar to that caused by Drosophila Ubx (Prasad et al. 2016). A triple mutant combination of alleles, abx bx3 pbx, results in complete transformation of T3 to T2 (Lewis 1982). This four-wing phenotype is rescued when Ubx is over-expressed exogenously during development using a GAL4-UAS system. This phenotype can also be rescued by expressing Ubx derived from Apis or Bombyx (Merabet S; unpublished results). This suggests that morphological differences between the two insects may be due to events occurring downstream of or in parallel to Ubx. Comparison of targets of Ubx between D. melanogaster and A. mellifera suggests that many functionally important genes are common to the two lineages, although many key genes are differentially expressed between T2 and T3 in Drosophila, but not in Apis (Prasad et al. 2016; Supplementary Figure S1B). This reconfirms that morphological differences between the two insects may lie at the level of regulation of those targets.

Fig. 1

Motif with a TAAAT core sequence is enriched in direct targets of Ubx in developing halteres in D. melanogaster but not in developing hindwings in A. mellifera. A Differences in hindwing morphology across closely related insect species. While Ubx (shown in magenta) is expressed in hindwing primordia of all insect species, its mere presence is not correlated with the differences in morphology observed in the third thoracic segment. B De-novo motif analysis of Ubx ChIP-seq in Apis hindwings and Drosophila halteres reveals enrichment of the TAAAT motif specifically in Drosophila but not in Apis. Binding sites for transcription factors like Trl and Aef1, however, are enriched in both datasets. C FIMO analysis of the TAAAT motif reveals a 1.6-fold enrichment of sites in Drosophila ChIP-seq sequences as compared to the entire genome. No such enrichment was found for the Apis ChIP-seq datasets. The TAAT motif was not enriched in either of the datasets

We carried out ChIP-Seq for Ubx from the third instar haltere discs (GSE205177) using polyclonal antibodies raised against the N-terminal region of Drosophila Ubx, which are specific to Ubx and do not cross-react to other Hox proteins or non-Hox homeodomain containing proteins in Drosophila (Agrawal et al. 2011). This method provided better resolution over earlier methods used by us (Agrawal et al. 2011) and others (Choo et al. 2011; Slattery et al. 2011) to identify Ubx-specific binding motifs. We then performed a comparative analysis of Ubx ChIP-Seq data generated from D. melanogaster haltere imaginal discs to previously reported Ubx ChIP-seq data generated from A. mellifera hindwing discs (again using polyclonal antibodies specific to the N-terminal region of Apis Ubx; Prasad et al. 2016).

De-novo motif analysis of ChIP-Seq data from D. melanogaster showed significant enrichment for a motif with a TAAAT core sequence (TDATTTATGG) (henceforth referred to as the TAAAT motif) (Fig. 1B). A similar motif has been reported to be enriched in Ubx pulled-down sequences in Drosophila embryos (Shlyueva et al. 2016). We observed that the TAAAT motif is enriched only in Drosophila and not in Apis, while both ChIP-Seq datasets showed enrichment for binding sites for other transcription factors such as GAGA-binding factor and Aef1 (Fig. 1B). The frequency of the TAAAT motif in Ubx pulled-down sequences from Drosophila is 1.6 times its overall prevalence in the whole genome. Interestingly, the frequency of the TAAAT motif in the entire genome was 1.6-fold higher in A. mellifera as compared to D. melanogaster (Fig. 1C), although there was no significant enrichment in Ubx pulled-down sequences. This suggests that the TAAAT motif is selectively enriched in the Ubx-bound regions only in D. melanogaster and not in A. mellifera.
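The fold enrichment described above is a ratio of motif densities: sites per kilobase in Ubx-bound sequences divided by sites per kilobase in a background such as the whole genome. The following is a minimal Python sketch of that arithmetic, not the analysis pipeline used in this study. It assumes exact matching of the IUPAC consensus TDATTTATGG on both strands, whereas FIMO scores a position weight matrix against a p-value threshold; the function names are hypothetical.

```python
import re

# IUPAC codes needed for the consensus reported in the text (D = A, G or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "D": "[AGT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_sites(seq, motif="TDATTTATGG"):
    """Count exact consensus matches on both strands, allowing overlaps.

    Simplification: a consensus string stands in for the PWM used by FIMO.
    """
    # A lookahead makes every start position count, even if matches overlap.
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in motif) + ")")
    seq = seq.upper()
    return len(pattern.findall(seq)) + len(pattern.findall(revcomp(seq)))


def sites_per_kb(seqs, motif="TDATTTATGG"):
    """Motif density over a collection of sequences, in sites per 1000 bp."""
    total_bp = sum(len(s) for s in seqs)
    return 1000.0 * sum(count_sites(s, motif) for s in seqs) / total_bp


def fold_enrichment(bound_seqs, background_seqs, motif="TDATTTATGG"):
    """Ratio of motif density in bound sequences to that in the background."""
    return sites_per_kb(bound_seqs, motif) / sites_per_kb(background_seqs, motif)
```

Under this definition, a value of 1.6 means that the motif occurs 1.6 times as often per kilobase in the bound sequences as in the background, and a value below 1 indicates under-representation.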

We also performed a similar analysis for the canonical TAAT motif (PWM obtained from the JASPAR database) and observed, similar to previous results (Agrawal et al. 2011), that it is not enriched in Ubx pulled-down sequences; rather, it was under-represented compared to its prevalence in the whole genome (Fig. 1C). This was true for both the Drosophila and Apis datasets. We also observed a difference in enrichment of Ubx binding sites around the TSS of target genes between the two species (Supplementary Figure S1C). However, in keeping with the scope of this paper, we did not investigate the importance of such differences further.

The TAAAT Motif is Critical for Ubx Mediated regulation of a Target Gene in Drosophila Halteres

Our results suggest that the TAAAT motif is specifically enriched in Ubx pulled-down sequences from haltere imaginal discs of Drosophila, but not from developing hindwings of A. mellifera. While earlier studies have demonstrated the role of low-affinity binding sites, homotypic binding clusters and DNA structure in Ubx binding to chromatin (Galant et al. 2002; Slattery et al. 2011; Crocker et al. 2015), the TAAAT motif is primarily reported to be bound by the Ubx–Extradenticle–Homothorax complex (Slattery et al. 2011; Sánchez-Higueras et al. 2019). However, neither Extradenticle (Exd) (which is expressed in the haltere pouch but is cytoplasmic) nor Homothorax (Hth) (which is not expressed in the haltere pouch) is required for the development of the haltere capitellum (Casares and Mann 2000). In this context, we sought to determine the functional significance of Ubx binding to the TAAAT motif during the specification of the haltere capitellum.

We first compared Ubx ChIP-seq data from Drosophila halteres to RNA-seq data (GSE205352) for Drosophila wing and haltere imaginal discs to identify those direct targets that are upregulated or downregulated in the haltere. We then compared the frequency of the TAAAT motif in Ubx-bound response elements of putative direct targets that are upregulated or downregulated between wing and haltere discs. We found that the frequency of the TAAAT motif in Ubx-bound response elements in the upregulated category is marginally higher (1.4×) than in the downregulated category (Supplementary Table S1). This suggests that while the TAAAT motif may be a preferred site for Ubx to bind on the enhancers of its targets, Ubx may use this motif to regulate both upregulated and downregulated targets in the haltere. However, this comparison requires a more granular analysis that takes into account whether a target is expressed in the haltere pouch or in the notum region. In this study, we validated the functional importance of the TAAAT motif through detailed analyses of two targets of Ubx, CG13222 and vestigial (vg), one of which is expressed outside the pouch (CG13222) and the other in the pouch (vg).

We first evaluated the affinity of Ubx binding to the TAAAT motif in CG13222, which is upregulated by Ubx in developing halteres (Mohit et al. 2006; Hersh et al. 2007). The “edge” enhancer of CG13222 has two TAAT motifs 17 bp apart (CG13222_WT, Fig. 2A). Hersh et al. (2007) named the two motifs site1 and site2. In their transgenic assays, site1 appears to be critical for Ubx-mediated regulation of CG13222, while site2 is dispensable. Interestingly, site1 in the edge enhancer, which is critical for its regulation by Ubx, has a TAAAT motif that overlaps with the TAAT motif (Fig. 2A). Their analysis of site1 involved mutating both the TAAT and TAAAT motifs, which leads to loss of enhancer-driven reporter gene expression (Hersh et al. 2007). We mutated site1 in such a way that the mutant enhancers retain either only the TAAT motif (CG13222_M1_A) or only the TAAAT motif (CG13222_M1_B) (Fig. 2A).

Fig. 2

The TAAAT motif is critical for Ubx mediated regulation of a target gene in Drosophila halteres. A Sequence of part of sal enhancer (Spalt 5/6), CG13222 enhancer and mutations of the CG13222 enhancer (CG13222_M1_A, CG13222_M1_B) used as probes for EMSA. The wild type CG13222 enhancer has two Ubx binding sites (shown in green) termed here as CG13222_WT (site1) and CG13222_WT (site2). The TAAAT site is in continuation with TAAT (shown by the red bracket) at site1. Mutations were generated to specifically modify either the TAAT (CG13222_M1_B, CG13222_M2_A) or the TAAAT (CG13222_M1_A) motifs while keeping the other one intact in site1 of the CG13222 enhancer. For Luciferase and Transgenic assays, the entire 512 bp spanning the CG13222 edge enhancer was used. B Binding of Ubx to CG13222_WT (site1) is severely reduced when the TAAAT motif is mutated to TAAT motif (CG13222_M1_A), whereas no such effect is seen on mutating the TAAT motif (CG13222_M1_B). While both Ubx and Scr bind to TAAT motif of sal, only Ubx binds to TAAAT motif of CG13222_WT (site1). The red arrows indicate the probe bound by Ubx whereas the blue arrow indicates the probe bound by Scr. Supershift is indicated using asterisks and black arrows indicate the unbound probe. C The CG13222 enhancer drives Luciferase activity in an S2 cell culture system in presence of Ubx. Significant loss of reporter activity is observed in the mutant enhancer (CG13222_M1_A) where the TAAAT motif is affected. For statistical analysis, t-test was performed (two-tailed). D–H The CG13222 enhancer drives reporter expression in the posterior edge of the third instar haltere imaginal discs (D). The CG13222_M1 mutant where both the TAAT and TAAAT motifs are affected shows a marked reduction in reporter expression (E). The CG13222_M1_A mutant carrying changes in the TAAAT site of the enhancer has a similar phenotype as the CG13222_M1 mutant (F) and shows a significant reduction in GFP expression as seen from calculation of average fluorescence intensity (H). In the CG13222_M2_A enhancer where the TAAT is mutated to TAAAT in site2, a significantly higher level of reporter expression is observed (H), along with ectopic expression in some domains (shown by red arrows) (G). For statistical analysis, t-test was performed (two-tailed)

To examine the relative affinities of Ubx binding to the TAAAT and TAAT sites by Electrophoretic Mobility Shift Assay (EMSA), we used a 21 bp long probe covering site1 of the edge enhancer of CG13222 (nucleotide stretch in the grey box in Fig. 2A). We find that Ubx can bind to the wild type probe derived from the CG13222 edge enhancer (indicated by red arrows; super-shifts are indicated by asterisks). Mutations in the TAAAT motif in site1 of CG13222 lead to a substantial loss of this binding (probe CG13222_M1_A, Fig. 2B, Supplementary Figure S2A). Conversely, mutating the TAAT motif alone had no significant effect on binding of Ubx to the probe (CG13222_M1_B, Fig. 2B, Supplementary Figure S2A).

As all Hox proteins are known to bind TAAT motifs, we carried out EMSA to determine the relative affinities of another Hox protein for the TAAT and TAAAT motifs. For a better comparison, we examined the binding of the Hox proteins Sex combs reduced (Scr) and Ubx to the TAAAT-containing site in the enhancer of CG13222 and the TAAT-containing site in the enhancer of sal (Galant et al. 2002). We observed that Scr was unable to bind to the wild type CG13222 probe. On the other hand, both Ubx and Scr bind to a 29 bp probe derived from the sal enhancer containing TAAT sites (Fig. 2B). This highlights the relative importance of Ubx binding to TAAAT as against TAAT.

Next we examined the importance of the TAAAT motif in gene regulation using Luciferase reporter assays in S2 cells and haltere imaginal discs in transgenic Drosophila. For these functional assays, we used a 512 bp long region spanning the edge enhancer of CG13222 as reported by Hersh et al. (2007), which has both site1 and site2. Consistent with CG13222 being a target of Ubx upregulated during haltere development, in the presence of Ubx a luciferase expression construct driven by the enhancer of the gene was significantly upregulated (2.8-fold) in S2 cells (Fig. 2C). Using a similar experimental design, we observed a significant decrease in Ubx-dependent expression when TAAAT alone is mutated (construct CG13222_M1_A in Fig. 2C) compared to the wildtype CG13222 enhancer.

We next generated transgenic strains of D. melanogaster carrying the wild type or mutant enhancers (as explained above) cloned upstream of a GFP reporter. The wild type CG13222 enhancer showed GFP expression at the posterior edge of the haltere imaginal discs, consistent with earlier reports (Fig. 2D) (Mohit et al. 2006; Hersh et al. 2007). We first replicated the mutation described in Hersh et al. (2007), wherein both the TAAT and TAAAT motifs are mutated (CG13222_M1), and found that reporter gene expression is significantly reduced when driven by the mutant enhancer (Fig. 2E). We then analyzed the CG13222_M1_A mutant, where the TAAAT motif is specifically mutated while the TAAT motif is kept intact, and observed a significant loss of reporter expression (Fig. 2F, H). The GFP expression patterns driven by the CG13222_M1 and CG13222_M1_A enhancers were similar, suggesting that the TAAAT motif is specifically required for Ubx-mediated activation of the CG13222 enhancer in Drosophila halteres.

As described earlier, the edge enhancer of CG13222 has a second TAAT motif (site2 in Fig. 2A), which is redundant for Ubx-mediated upregulation of CG13222 in the haltere discs (Hersh et al. 2007). We also examined the effect of mutating this TAAT motif to a TAAAT motif (in site2, construct CG13222_M2_A), which results in two high-affinity binding motifs in the edge enhancer of CG13222. We observed a significant reduction in luciferase activity (Supplementary Figure S2B), perhaps due to a negative effect of two high-affinity binding sites for Ubx in close proximity. However, the Drosophila transgenic line carrying the same mutation showed significantly stronger expression of the GFP reporter (Fig. 2G, H) compared to the wild type CG13222 enhancer. Enhancer-driven expression was also observed in ectopic regions in the posterior compartment (red arrows in Fig. 2G), suggesting that the presence of an additional TAAAT motif may bring the otherwise unresponsive site2 under Ubx regulation.

The TAAAT Motif Causes Ubx Mediated Repression of an Orthologous Enhancer of the vestigial Gene from A. mellifera in a Transgenic Drosophila Assay

We had previously reported that several wing patterning genes such as the vestigial (vg) gene are expressed in both fore- and hindwing primordia in A. mellifera. In contrast, in Drosophila, vg is a direct target of Ubx (Supplementary Figure 2C) and is downregulated in the developing halteres (Galant et al. 2002; Hersh and Carroll 2005). Of the two well characterized enhancers, the quadrant enhancer of vg (henceforth named quad-vg) is differentially expressed between the wing and haltere imaginal discs. In a transgenic Drosophila assay, an enhancer of vg from Apis (henceforth named Apis-vg) equivalent to quad-vg showed similar patterns and levels of expression between wing and haltere discs (Prasad et al. 2016), suggesting that Drosophila Ubx too cannot repress the Apis-vg enhancer. We evaluated the role of TAAAT in Drosophila as against TAAT in Apis in the regulation of the quad-vg and Apis-vg enhancers in transgenic assays.

We first scanned the entire 850 bp sequence of the quad-vg enhancer and the 575 bp sequence of the Apis-vg enhancer for the Ubx binding motifs TAAAT and TAAT. We observed a 25 bp cassette containing both TAAT and TAAAT motifs in the quad-vg enhancer (Fig. 3A). In the Apis-vg enhancer, we found only a single TAAT motif and did not find any TAAAT motif (Fig. 3A). To test the significance of this difference in Ubx binding motifs between the two enhancers, we generated several Drosophila transgenics carrying the wild type and mutated quad-vg and Apis-vg constructs (mutations in TAAT or in TAAAT motifs) upstream of a GFP reporter (Fig. 3A; Supplementary Figures 2D, 3A). As reported earlier, we observed that the wild type quad-vg enhancer drives expression of the GFP reporter in the wing imaginal discs, albeit at much lower levels (compared to the earlier reported quad-vg-lacZ reporter), but not in the haltere imaginal discs (Supplementary Figure 2E). However, owing to complete loss of enhancer readout in both wing and haltere imaginal discs in the transgenic carrying mutations in the TAAAT motif of the quad-vg enhancer (Supplementary Figure 2E), we were unable to draw conclusions about the role of the TAAAT motif or the TAAT motif in Ubx-mediated regulation of the quad-vg enhancer.
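A scan of an enhancer for the two core sequences can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure used in this study: it reports exact occurrences of TAAT and TAAAT on either strand, the function name is hypothetical, and the sequences in the usage note are made up rather than taken from the quad-vg or Apis-vg enhancers.

```python
def find_core_sites(seq, cores=("TAAT", "TAAAT")):
    """Return sorted (start, strand, core) tuples for exact core matches.

    start is 0-based on the forward strand; a "-" hit means the reverse
    complement of the core was found at that forward-strand position.
    """
    seq = seq.upper()
    complement = str.maketrans("ACGT", "TGCA")
    hits = []
    for core in cores:
        reverse_core = core.translate(complement)[::-1]
        for strand, word in (("+", core), ("-", reverse_core)):
            start = seq.find(word)
            while start != -1:
                hits.append((start, strand, core))
                # Step by one so that overlapping occurrences are also found.
                start = seq.find(word, start + 1)
    return sorted(hits)
```

For the made-up sequence GGTAATCCATTTAGG, this returns a TAAT site at position 2 on the forward strand and a TAAAT site at position 8 on the reverse strand (read as ATTTA on the forward strand). Note that TAAT is not a substring of TAAAT, so the two cores are reported independently.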

Fig. 3

The TAAAT motif causes Ubx mediated repression of an orthologous enhancer of the vestigial gene from A. mellifera in a transgenic Drosophila assay. A Sequence of part of the 805 bp enhancer of the vestigial gene in Drosophila (quad-vg) containing both TAAT and TAAAT motifs and part of the 575 bp enhancer of the vestigial gene in Apis (Apis-vg), showing the presence of a TAAT motif, and the Apis-vg-M1 modified sequence. B The Apis-vg enhancer drives similar expression of reporter GFP in hinge and pouch regions of wing and haltere imaginal discs. C The mutant enhancer (Apis-vg_M1) with introduced TAAAT motif shows reduced levels of GFP expression in the haltere pouch. Magnified images of the haltere pouch of Drosophila transgenics expressing GFP under Apis-vg and Apis-vg_M1 enhancers. Note much reduced GFP levels specifically in the haltere pouch of Apis-vg_M1 transgene. Orthogonal views of the haltere imaginal pouch indicate a clear difference in GFP expression driven by the WT and the mutant enhancers. Hth staining (in red), which is hinge-specific is used to demarcate the pouch region. D Quantification of average fluorescence intensity in the haltere pouch for the Apis-vg and Apis-vg_M1 transgenics. For statistical analysis, t-test was performed (two-tailed)

We next analyzed reporter GFP expression in wing and haltere imaginal discs of Drosophila transgenics carrying the wild type and mutated Apis-vg enhancers. As reported earlier (Prasad et al. 2016), we observed similar expression of GFP in both wing and haltere imaginal discs (Fig. 3B). Interestingly, in a chimeric enhancer (Apis-vg_M2), where the 25 bp Ubx-binding cassette from the Drosophila quad-vg enhancer (containing the TAAT, GAGA binding and TAAAT motifs) replaced its counterpart in the Apis-vg enhancer (containing only the TAAT motif) (Supplementary Figure 3Aiii), we observed differential expression of the GFP reporter between wing and haltere imaginal discs (Supplementary Figure 3B). Its expression was much lower in the haltere pouch compared to the GFP reporter driven by the wild type Apis-vg enhancer. The loss of reporter GFP expression in the pouch region was also seen in the Apis-vg_M3 mutant, where the GAGA binding and TAAAT motifs were present, but not the TAAT motif. However, the loss of reporter GFP expression was not observed in the Apis-vg_M4 mutant, which contains the GAGA binding and TAAT motifs, but not the TAAAT motif (Supplementary Figure 3B). More importantly, changing only the TAAT motif of the Apis-vg enhancer to a TAAAT motif (Apis-vg_M1) was sufficient for the repression of the GFP reporter in the pouch of haltere discs (Fig. 3C, D; Supplementary Figure 3B) without affecting its expression in the wing pouch (Supplementary Figure 3C).

Taken together, our results reveal that a microevolutionary change in the enhancer sequence, specifically modifying the TAAT motif to a TAAAT motif, may have brought certain critical wing-patterning genes, such as the pro-wing gene vg, under the regulation of Ubx during the evolution of dipterans.

Discussion

Genetics, cell and molecular biology methods have established, unequivocally, the role of Hox genes as master control genes in regulating segment-specific developmental pathways and diversification of body plans during evolution. However, the precise mechanism by which orthologous Hox proteins mediate differential development of a specific morphological feature in different species is still largely unknown. A main reason this has eluded systematic study is that the DNA-binding domains of Hox proteins are highly conserved across species.

In Drosophila, the Ubx protein is expressed in the T3 segment and specifies the development of the haltere by activating or repressing a number of genes at various hierarchical levels of wing patterning (Weatherbee et al. 1998; Shashidhara et al. 1999). Ubx is also expressed in the T3 segment of other insect species and is thought to specify different fates in each of them. For example, in Coleopterans, Ubx specifies the development of hindwings instead of elytra, and in Lepidopterans it specifies differences in eyespot patterns between the forewings and hindwings (Weatherbee et al. 1999). In more ancestral Hymenopterans such as Apis mellifera, which have two pairs of wings, hindwings are marginally smaller than the forewings. In A. mellifera, Ubx is expressed in both forewing and hindwing primordia, although its expression is stronger in T3 (Prasad et al. 2016). It is possible that the divergence of Coleopterans, Lepidopterans and Dipterans involved both suppression of Ubx expression in T2 and the acquisition by Ubx of the ability to regulate various developmental pathways in T3.

To identify the mechanisms governing the differential regulation of wing patterning genes between T2 and T3 in Drosophila, we employed a comparative genomics approach and found that the TAAAT motif is enriched specifically in Ubx-bound enhancers of Ubx targets in Drosophila halteres but not in Apis hindwings. The functional studies reported here suggest that Ubx binds to the TAAAT motif with higher affinity than to the TAAT motif and that its binding to the former is critical for the up-regulation of CG13222 expression in the haltere imaginal discs.

An earlier report of ours (Prasad et al. 2016) and this report suggest that unlike the quad-vg enhancer of Drosophila, an equivalent enhancer of the vg gene of A. mellifera is not differentially expressed, but drives GFP expression in both wing and haltere imaginal discs in a transgenic assay. We show that changing the TAAT motif to a TAAAT motif brought the enhancer under the negative regulation of Ubx. Additionally, we observe that mutating binding sites for adjacent transcription factors like GAGA binding proteins had a much more pronounced effect when combined with the presence of the TAAAT motif, but not with the TAAT motif, further demonstrating the role of the TAAAT motif in regulation of the enhancer. The Apis-vg enhancer thus allowed us to dissect the importance of the TAAAT motif in dipteran evolution. As vg is a pro-wing selector gene and can assign wing fate to any group of dorsal epithelial cells (Kim et al. 1996; Klein and Arias 1998; Williams et al. 1991; Neumann and Cohen 1996), it is a critical target of Ubx in specifying haltere development. In this context, a microevolutionary change (TAAT to TAAAT motif) in the enhancer of an ancestral vg may have been a critical step in the evolution of dipterans.

It is, however, very unlikely that a large number of convergent mutations, in which multiple TAAT motifs evolved into TAAAT motifs, were involved in this process. A few critical genes, such as vg, upstream of wing patterning pathways may have evolved to become targets of Ubx through this route. Once the regulatory networks of those genes are modulated by Ubx, the chromatin landscape of many other targets (Ubx may bind to these targets via TAAT motifs) would change, making them amenable to Ubx-mediated regulation and thereby specifying the haltere fate.

Materials and Methods

ChIP Sequencing

Third instar wandering larvae were cut, inverted and fixed with 1.5% PFA (ThermoFisher scientific) for 20 min at room temperature and subsequently quenched (125 mM Glycine solution, 1×PBS). Samples were washed twice with 1×PBS (Sigma) for 10 min each and dissected for wing and haltere discs (500 wing discs and 1000 haltere discs were used per replicate). Samples were lysed with Cell Lysis Buffer [10 mM Tris–Cl pH 8, 10 mM NaCl, 0.5% NP40, 1×PIC (Roche)] followed by mechanical shearing for 10 min on ice. After centrifugation at 2000 rpm for 10 min at 4 °C, the pellet was resuspended in Sonication buffer (50 mM Tris–Cl pH 8, 1% SDS, 1% NP40, 10 mM EDTA, 1×PIC) followed by incubation on ice for 30 min. Sonication was carried out using the Covaris S2 sonicator with the following settings: DC 20%, Intensity 5, Cycles of Burst 200, Time = 20 min. Samples were centrifuged at 14,000 rpm for 15 min at 4 °C and precleared with 4 mL of Magnetic A Beads (Diagenode) at 4 °C for 1 h. Preclearing beads were removed and 2 mL of Polyclonal anti-Ubx antibodies (Agrawal et al. 2011) were added with overnight incubation at 4 °C. Magnetic A beads were added and incubated for 4 h at 4 °C. The supernatant was removed and beads were washed twice with Low Salt buffer (20 mM Tris–Cl pH 8, 150 mM NaCl, 0.1% SDS, 2 mM EDTA, 1% TritonX-100), twice with High Salt buffer (20 mM Tris–Cl pH 8, 500 mM NaCl, 0.1% SDS, 2 mM EDTA, 1% TritonX-100), twice with LiCl buffer (10 mM Tris–Cl pH 8, 250 mM LiCl, 1% NP40, 1% Na-Deoxycholate, 500 mM EDTA), and once with TE buffer (10 mM Tris–Cl pH 8, 500 mM EDTA). Samples were eluted in elution buffer (50 mM Tris–Cl, 1 mM EDTA, 1% SDS, 50 mM NaHCO3). Samples were decrosslinked, followed by RNase treatment at 37 °C for 1 h and Proteinase K treatment at 42 °C for 2 h. DNA was purified using PCI purification and quantified using the Qubit HS DNA quantification system.

For library preparation, equal amounts of DNA (~2 ng) were used as input for NEB Ultra II DNA library prep kits (NEB #E7645). The number of cycles for amplification of adapter-ligated libraries was estimated by qPCR before final amplification and indexing (NEB #E7350) to avoid any bias arising from PCR amplification. Final amplified libraries were purified twice, first with 1× followed by 0.8× volume of beads per sample, using the HiPrep PCR clean up system (Magbio #AC-60050). Library concentration was determined using the Qubit HS DNA kit (Invitrogen #E7350) and average fragment size was estimated using the DNA HS assay on a Bioanalyzer 2100 (Agilent #5067-4626) before pooling libraries in equimolar ratio. Sequencing reads (100 bp PE) were obtained on the Hiseq 2500 V4 platform at Macrogen Inc, Korea.

ChIP Sequencing Analysis

For D. melanogaster, raw reads were trimmed using the Trimmomatic software (command) and aligned to the dm6 genome (version BDGP6.28). Peak calling was performed using MACS2 (maxdup = 1, FDR 0.05) (Zhang et al. 2008) and high confidence peaks (occurring in at least two biological replicates) were identified (https://ro-che.info/articles/2018-07-11-chip-seq-consensus). For Apis mellifera, raw fastq reads were downloaded from NCBI (GSE71847). Reads were aligned to the Amel_HAv3.1 genome and peak calling was performed using MACS2 (maxdup = 1, FDR 0.05). High confidence peaks occurring in both replicates were identified. However, since a properly annotated GTF file was not available for the Amel_HAv3.1 genome version, gene ontology and deepTools analyses were done using the Amel_4.5 genome (version 4.5.47). Peaks were annotated to their nearest TSS using the Homer software (annotatePeaks.pl) (Heinz et al. 2010). Gene ontology analysis of targets was performed using GO (Ashburner et al. 2000; Carbon et al. 2021) and overrepresented GO terms were plotted. Genome tracks were made using deepTools (normalizeUsing RPGC, binsize 1, minMappingQuality 10, smoothLength 150) (Ramírez et al. 2016) and visualized using the IGV software (Robinson et al. 2011). Heatmaps were generated using deepTools (referencePoint TSS).
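
The idea of retaining peaks supported by at least two replicates can be sketched as a coverage sweep over the replicate peak sets. This is a minimal illustration, not necessarily the exact procedure of the cited article: it assumes half-open (chrom, start, end) intervals that do not overlap within a single replicate, and the function name and coordinates are hypothetical.

```python
from collections import defaultdict


def consensus_regions(replicates, min_replicates=2):
    """Return regions covered by peaks from at least `min_replicates` replicates.

    `replicates` is a list of peak lists; each peak is (chrom, start, end),
    half-open, and peaks within one replicate are assumed not to overlap.
    """
    events = defaultdict(list)
    for peaks in replicates:
        for chrom, start, end in peaks:
            events[chrom].append((start, 1))   # a peak opens
            events[chrom].append((end, -1))    # a peak closes

    regions = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # At equal positions, closing events (-1) sort before opening events (+1),
        # so peaks that merely abut are not counted as overlapping.
        for pos, delta in sorted(events[chrom]):
            depth += delta
            if depth >= min_replicates and region_start is None:
                region_start = pos
            elif depth < min_replicates and region_start is not None:
                regions.append((chrom, region_start, pos))
                region_start = None
    return regions


if __name__ == "__main__":
    rep1 = [("chr2L", 100, 200), ("chr2L", 500, 600)]
    rep2 = [("chr2L", 150, 250)]
    rep3 = [("chr2L", 180, 300), ("chr3R", 10, 50)]
    print(consensus_regions([rep1, rep2, rep3]))
```

Setting `min_replicates` to the number of replicates gives the stricter "present in both replicates" criterion used for the A. mellifera data.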

RNA Seq Data Generation and Analysis

Third instar wandering larvae of the Canton S strain were collected, cut, inverted and dissected for wing and haltere discs in cold PBS. Experiments were carried out in triplicate and samples were snap frozen in Trizol. Library preparation and sequencing on the Illumina HiSeq 2000 were performed at Genotypic Technology, Bangalore, India. Raw fastq files were obtained and used for analysis. Raw reads were trimmed using Trimmomatic and aligned to the reference genome using the HISAT2 software (Kim et al. 2015), followed by read counting using the HTSeq software (Anders et al. 2015). Differentially expressed genes were identified using the edgeR software (Robinson et al. 2010) with a 1.5-fold difference as the cut-off.
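
As an illustration of the fold-change cut-off, a 1.5-fold difference corresponds to an absolute log2 fold change of about 0.585. The sketch below assumes log2 fold changes oriented as haltere relative to wing and applies only the fold-change criterion; any additional significance filter is outside its scope, and the function name is hypothetical.

```python
import math


def classify_by_fold_change(log2_fc: float, fold_cutoff: float = 1.5) -> str:
    """Label a gene 'up', 'down' or 'unchanged' from its log2 fold change."""
    threshold = math.log2(fold_cutoff)  # 1.5-fold is roughly 0.585 on the log2 scale
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    for value in (1.0, -0.7, 0.3):
        print(value, classify_by_fold_change(value))
```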

Motif Analyses and Protein Sequence Comparison

De novo motif analysis was carried out using Homer (findMotifsGenome.pl). The PWM for the TAAAT motif obtained from Homer was converted to Transfac format using RSAT (Nguyen et al. 2018) and then converted to MEME format using the transfac2meme command. Identification of binding motifs in the enhancers of CG13222, quad-Vg and ApisVg was performed using the MAST software from the MEME suite (Bailey et al. 2009).

Protein sequences of Ubx from different insect species were downloaded from UniProt and multiple sequence alignment was performed using Clustal Omega (Madeira et al. 2019). Alignments were visualized and processed using the Jalview software (Waterhouse et al. 2009).

Frequency Calculation of TAAAT Motif in Ubx Response Elements

Using a 1.5-fold difference (FD) cut-off, genes that are upregulated or downregulated in the haltere imaginal discs were identified. Ubx targets obtained from ChIP-seq were mapped to differentially expressed genes to identify putative Ubx response elements associated with genes that are upregulated, downregulated or not differentially expressed in the halteres. The PWM of the TAAAT motif was generated and its frequency was calculated in each group of Ubx response elements using the FIMO software.
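
The per-group frequency comparison can be illustrated with a simplified counter. Note the substitution: the sketch below uses exact string matching of TAAAT on both strands as a stand-in for PWM-based scanning with FIMO, and it reports hits per kilobase, which is one possible normalization assumed here. Function names and sequences are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_motif(seq: str, motif: str = "TAAAT") -> int:
    """Count exact motif matches on either strand (overlapping matches allowed)."""
    seq = seq.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))


def motifs_per_kb(sequences, motif: str = "TAAAT") -> float:
    """Motif hits per kilobase across a group of response-element sequences."""
    total_bp = sum(len(s) for s in sequences)
    total_hits = sum(count_motif(s, motif) for s in sequences)
    return 1000.0 * total_hits / total_bp if total_bp else 0.0


if __name__ == "__main__":
    # Hypothetical groups of Ubx response elements.
    groups = {
        "upregulated": ["GGTAAATCC", "ACGTACGT"],
        "downregulated": ["TAAATTTA", "CCATTTAGG"],
    }
    for name, seqs in groups.items():
        print(name, round(motifs_per_kb(seqs), 1))
```

Normalizing by total sequence length keeps groups with different numbers or sizes of response elements comparable.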

Plasmid Constructs, Cloning and Transgenic Fly Generation

The full list of primers, as well as the sequences of the CG13222, quad-Vg and ApisVg enhancers, is provided as supplementary data (File S2). The DNA sequence constructs used for S2 cell culture-based assays and transgenic assays are also described in detail there (File S2). For Luciferase assays, site-directed mutagenesis was used to clone enhancer constructs of the CG13222 gene between the KpnI and NheI restriction sites, upstream of a modified pGL3 vector containing a 5× Dorsal binding site. Metallothionein-inducible pRMHa3 vectors containing Ubx were previously generated in the lab, and the empty pRMHa3 vector was generated by excising the cloned Ubx sequence. For generating transgenics, all enhancer constructs were cloned between the NheI and KpnI restriction sites of the pH-stinger-attb vector (a kind gift from Manfred Frasch) and injected into the attP40 site on the second chromosome using a mini white screen at the NCBS fly facility, Bangalore. All constructs were sequence verified before use in functional assays. Inserting all transgenes at the same attP40 site allowed a direct comparison of the wild-type and mutant versions of a given enhancer.

Luciferase Assays

S2 cells were checked for contamination before performing Luciferase assays. Cells were plated onto 24-well plates at a density of 3 × 10^5 cells per well 6 h prior to transfection. For every construct, either wild type or mutant, two sets of experiments were designed: one well was co-transfected with the enhancer construct in the pGL3 vector and the empty pRMHa3 vector, whereas the other well was co-transfected with the enhancer construct and the pRMHa3 vector containing Ubx. Renilla luciferase was used as an internal control and co-transfected in all experiments. Transfection was carried out using the Effectene transfection reagent, and all experiments were carried out in three technical replicates and at least three biological replicates. 48 h post transfection, Ubx expression was induced with sterile CuSO4 solution at a final concentration of 500 µM, and cells were incubated for 24 h. Cells were harvested and pelleted (1000 rpm for 4 min), and 100 µL of 1× Passive Lysis buffer was added and vortexed to dissolve the cell pellet. Cells were incubated for 15 min at room temperature and then spun at 10,000 rpm for 90 s to collect the supernatant. Luminescence was measured with the Dual-Glo Luciferase assay kit (Promega) on an EnSight plate reader (PerkinElmer). All readings were normalized to Renilla luminescence and datasets were compared using the Prism software.
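
The normalization step can be sketched as follows. This is an illustrative calculation only: it assumes paired firefly and Renilla readings from the same wells, averages the ratios across technical replicates, and expresses the effect of Ubx as a ratio to the empty-vector control. The fold-change summary, function names and readings are assumptions for the example and are not taken from the study.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Mean firefly/Renilla ratio across technical replicates of one condition."""
    return mean(f / r for f, r in zip(firefly, renilla))


def fold_change_with_ubx(empty_vector, with_ubx):
    """Ratio of normalized reporter activity with Ubx to that with empty vector.

    Each argument is a (firefly_readings, renilla_readings) pair of equal-length lists.
    """
    return normalized_activity(*with_ubx) / normalized_activity(*empty_vector)


if __name__ == "__main__":
    # Hypothetical luminescence readings from three technical replicates.
    empty = ([200.0, 400.0, 600.0], [100.0, 200.0, 300.0])
    ubx = ([100.0, 100.0, 100.0], [100.0, 100.0, 100.0])
    print(fold_change_with_ubx(empty, ubx))
```

A value below 1 would indicate lower reporter activity in the presence of Ubx, and a value above 1 higher activity.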

EMSA

VC-Ubx and VC-Scr were cloned in the pcDNA3 vector and produced with the TNT-T7-coupled in vitro transcription/translation system (Promega) for EMSAs, as previously described (Hudry et al. 2012). Briefly, between 3 and 6 µL of programmed lysate was used for each protein (100 ng/mL of proteins were produced on average). The VC-Ubx and VC-Scr fusion proteins were produced separately (0.5 mg of each plasmid was used for the in vitro transcription/translation reaction). Supershift against the VC fragment was performed by adding the anti-GFP antibody after 15 min in the binding reaction. Each band shift experiment was repeated at least two times.

Immunostaining and Microscopy

Wandering third instar larvae were cut and inverted in cold PBS followed by fixation with 4% PFA for 20 min with gentle rocking. Samples were washed thrice with 0.1% PBTX (0.1% Triton in PBS) for 10 min each, followed by 1 h of blocking at room temperature (blocking solution: 0.5% BSA in 0.1% PBTX). Samples were incubated with primary antibodies (dilutions made in blocking solution) at 4 °C overnight followed by washing with 0.1% PBTX for 10 min (×3) at room temperature. Samples were incubated with secondary antibodies for 1 h at room temperature, washed with 0.1% PBTX for 10 min (×3) and finally with PBS before mounting. Wing and haltere imaginal discs were dissected out, mounted using Prolong Gold Antifade (Invitrogen) and stored at 4 °C. Antibodies used in the study were Rb-GFP (1:1000) (Invitrogen), Rb-Ubx (1:1000) (Agrawal et al. 2011), m-Ubx (1:30) (DSHB), goat-Hth (1:100) (Santa Cruz Biotechnology), Alexa Fluor 568 (1:1000) (Invitrogen) and Alexa Fluor 633 (1:1000) (Invitrogen). All images were taken on a Leica SP8 system at the Microscopy Facility, IISER Pune. Laser settings [laser power (LP) and gain (G)] used for measuring GFP fluorescence are indicated for each image panel. All images were processed using the ImageJ software and compiled using Microsoft PowerPoint and Adobe Illustrator.