Recent large-scale studies of chromatin states across the entire D. melanogaster genome [1] led to the idea that genes with ubiquitous activity are located in the interbands of polytene chromosomes [2]. Genes that are specifically active at certain stages and in individual tissues (developmental genes) were found to be located in tightly packed bands, which were previously named the intercalary heterochromatin bands [3]. Earlier, we showed that, at the whole-genome level, the chromatin that corresponds to the interbands has a complex genetic organization [2]. In particular, interbands contain unidirectional and bidirectional promoters, genes located entirely within the interbands, and genes with different numbers of transcription initiation sites. This prompted a comprehensive study of all interbands mapped on the molecular and cytological maps of the D. melanogaster genome in order to determine how promoter types are organized in them.

In this study, we obtained new data on the topology and location of proteins and of various chromatin characteristics within interband DNA sequences. We found that interbands containing bidirectional genes and those containing unidirectional genes differ in the organization of the promoter region, which is apparently crucial for the formation of the open-chromatin region that we describe as an interband in polytene chromosomes of D. melanogaster.

On the basis of data from the international modENCODE project and the results of our studies [4], we developed a model of four chromatin states [2]. Using information about the location of proteins that are characteristic of genetically active chromatin, we classified chromatin by its degree of compaction, from the minimum (aquamarine, which corresponds to interbands) to the maximum (ruby, which corresponds to the black dense bands). The other two states, lazurite and malachite, correspond primarily to the coding gene regions [2, 4] and the borders of blocks of developmental genes [5].

According to calculations, there are five thousand interbands in D. melanogaster chromosomes, which contain approximately 5% of the DNA of the euchromatic part of the genome [6, 7]. Only 33 of them have been precisely mapped on the molecular and cytological maps of the D. melanogaster genome. Since all interbands share common properties, such as the location of 5'-UTRs of the genes with ubiquitous activity, the enrichment in genes with bidirectional orientation of regulatory regions, and the location of insulator proteins (CHRIZ, BEAF-32, and others), and correspond to the aquamarine chromatin [2], the analysis of their genetic structure and architecture is of primary interest.

Previously, we compared the organization and functions of the genes that are fully located in the open-chromatin region (interband) and the genes that occupy two structures (interband and gray band) [8].

In the present study, we investigated in detail the molecular-genetic organization of the sequences of 33 interbands localized on the molecular map of D. melanogaster. For the analysis, we used FlyBase release r.5.57 for D. melanogaster, which contains 29761 transcripts of 13753 protein-coding genes; the interbands studied contained 58 genes and 97 transcripts with unique starts.

In this study, we found that interbands differed in genetic organization (Table 1). Most of the studied interbands contained one gene. The majority of these single-gene interbands contained a gene with one transcription initiation site (group I), and the remaining ones contained a gene with several alternative promoters (group II; Table 1). Other interband sequences contained two or more genes. Group III includes the interbands with two unidirectional genes and several transcription initiation sites. Group IV includes the interbands with bidirectional genes, and group V includes the intricately organized interbands with three or more genes in both unidirectional and bidirectional orientation.
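The grouping described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the `Gene` record, its fields, and the rule that any two genes on opposite strands count as bidirectional are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for a gene whose start lies in an interband."""
    name: str    # illustrative identifier, not a real annotation ID
    strand: str  # "+" or "-"
    n_tss: int   # number of distinct transcription initiation sites


def classify_interband(genes: List[Gene]) -> str:
    """Assign an interband to groups I-V following the verbal description.

    Assumptions: two genes on the same strand are treated as unidirectional
    (group III), two genes on opposite strands as bidirectional (group IV),
    and any interband with three or more genes as group V.
    """
    if not genes:
        raise ValueError("an interband must contain at least one gene")
    if len(genes) == 1:
        # One gene: a single start (I) or several alternative promoters (II).
        return "I" if genes[0].n_tss == 1 else "II"
    if len(genes) == 2:
        same_strand = genes[0].strand == genes[1].strand
        return "III" if same_strand else "IV"
    return "V"
```

In a real analysis, the orientation test would also need to distinguish divergent from convergent gene pairs, which this sketch does not do.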

Table 1. Gene location in interbands

Previously, we showed [9] that the transcription initiation sites of genes are located primarily in the center of interband sequences, and the border of the aquamarine chromatin, which corresponds to interbands on the molecular map, is almost always located in the first intron of an interband gene.

For the topology analysis, we chose the interbands from two contrasting groups, I and IV, which contained the genes with one start and the bidirectional genes, respectively (Table 1). We evaluated the size of the interbands whose borders were localized using the four chromatin state model [2]. The length of the interbands that contained transcripts with one start was, on average, 3.5 kb, and their size ranged from 1 to 15.4 kb (Fig. 1a). The size of the interbands that contained bidirectional genes averaged 3.4 kb (1.6–7 kb). The distance between the promoters in these interbands averaged 0.7 kb (Fig. 1b). The large size of four of the 33 interbands studied may be due to the fact that the model used assigns the long first intron of an active gene to the aquamarine chromatin. However, all genomic elements and proteins characteristic of the interbands are located in these domains in the promoter regions of the genes.

Fig. 1. Location of genes in interbands. Here and in Figs. 2 and 3, (a) shows the interbands containing the genes with one transcription initiation site (group I in Table 1) and (b) shows the interbands containing bidirectional genes (group IV).

Recently, we showed [10] that the insulator proteins CTCF and CHRIZ, when artificially recruited to a chromosomal site, form a new interband. We studied [11] the distribution of the insulator proteins CHRIZ and BEAF-32 as well as RNA polymerase II in the sequences of interbands containing one gene (group I) and bidirectional genes (group IV). The aquamarine chromatin sequences corresponding to the interbands were aligned with respect to the gene transcription initiation sites for group I interbands and with respect to the center between the transcription initiation sites for group IV interbands. Although the average size of interbands was 3.4 kb (Figs. 1a, 1b), the proteins were localized in the promoter region of the genes rather than occupying the entire sequence (Fig. 2). In the sequences of the interbands that contained one gene, the insulator proteins CHRIZ and BEAF-32 were located immediately at the gene transcription initiation site, and RNA polymerase II was shifted 200 bp upstream of the transcription initiation site (Fig. 2a). In the interbands containing two bidirectional genes, the BEAF-32 and CHRIZ proteins were located between the gene transcription initiation sites, and RNA polymerase II had two peaks in the promoter regions of the genes (Fig. 2b).

Fig. 2. Location of proteins in the DNA sequences corresponding to the interbands of polytene chromosomes. Here and in Fig. 3, point 0 on the abscissa axis corresponds to the gene transcription initiation site. Numbers designate the average binding intensity of proteins in S2 cells (modENCODE data): (1) CHRIZ (CHROMATOR), (2) RNA polymerase II, (3) BEAF-32.
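The alignment and averaging behind such profiles can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used here: the function names, the per-position dictionary used for the binding signal, and the flipping of minus-strand genes so that negative offsets always mean "upstream" are all hypothetical choices.

```python
from typing import Dict, List, Optional


def anchor_position(tss_positions: List[int]) -> int:
    """Return the alignment point of an interband.

    One TSS (group I): the TSS itself. Two TSSs (group IV): the center
    between them, as described in the text.
    """
    if len(tss_positions) == 1:
        return tss_positions[0]
    if len(tss_positions) == 2:
        return (tss_positions[0] + tss_positions[1]) // 2
    raise ValueError("expected one or two TSS positions")


def average_profile(signals: List[Dict[int, float]],
                    anchors: List[int],
                    strands: List[str],
                    flank: int) -> List[Optional[float]]:
    """Average a per-position signal over interbands aligned at their anchors.

    signals[i] maps a genomic coordinate to a binding intensity (assumed
    format). For "-" strand genes the offset is mirrored so that negative
    offsets are upstream. Offsets with no data yield None.
    """
    profile: List[Optional[float]] = []
    for offset in range(-flank, flank + 1):
        values = []
        for signal, anchor, strand in zip(signals, anchors, strands):
            direction = 1 if strand == "+" else -1
            position = anchor + direction * offset
            if position in signal:
                values.append(signal[position])
        profile.append(sum(values) / len(values) if values else None)
    return profile
```

For group IV interbands, which have no single orientation, one could pass "+" for every interband so that no mirroring is applied; that choice is likewise an assumption.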

The promoters of many genes contain paused RNA polymerase II, which generates short noncoding RNAs 25–60 nucleotides long. The presence of this paused form of the polymerase helps create an accessible chromatin conformation in the gene regulatory region and facilitates the binding of transcription factors [12].

In [13], short noncoding RNAs generated by the paused RNA polymerase II were mapped in the Drosophila genome, and their 5' and 3' ends were determined. Therefore, their exact sequences are known. It was shown that such RNAs are characteristic of the open aquamarine chromatin domains (interbands) of Drosophila [9]. We investigated the topology of these noncoding RNAs, the proteins of the replicative complex (ORC2) [14], DNase I hypersensitive sites [1], P-element insertions (FlyBase r.5.57), and “broad” promoters [15] in the studied interbands.

All of these structures were located in the region of the transcription initiation sites in the interbands that contained the genes with one transcription initiation site and had two independent peaks in the interbands that contained bidirectional genes (Figs. 3a, 3b).

Fig. 3. Location of genomic elements in DNA sequences corresponding to the interbands of polytene chromosomes. Numbers designate the average number of annotations of characteristics in the interbands studied: (1) short RNAs generated by the paused form of RNA polymerase II in D. melanogaster embryos [13]; (2) P-element insertions (FlyBase r.5.57); (3) broad promoters [15]; (4) replicative complex proteins (ORC2) in S2 cells [14]; (5) DNase I hypersensitive sites (DHS) in S2 cells [1].
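An "average number of annotations" of this kind can be obtained by counting point features in fixed-width bins around the alignment point and dividing by the number of interbands. The sketch below shows one way to do this; the function name, the bin layout, and the representation of each annotation by a single coordinate are assumptions for illustration and are not taken from the cited data sets.

```python
from typing import List


def binned_annotation_density(features: List[List[int]],
                              anchors: List[int],
                              flank: int,
                              bin_size: int) -> List[float]:
    """Average number of annotations per bin, relative to each anchor.

    features[i] lists the coordinates of annotations (for example, insertion
    sites) in interband i; anchors[i] is its alignment point. Bins cover
    offsets from -flank (inclusive) to +flank (exclusive).
    """
    if (2 * flank) % bin_size != 0:
        raise ValueError("2 * flank must be a multiple of bin_size")
    counts = [0] * ((2 * flank) // bin_size)
    for coordinates, anchor in zip(features, anchors):
        for position in coordinates:
            offset = position - anchor
            if -flank <= offset < flank:
                counts[(offset + flank) // bin_size] += 1
    # Normalize by the number of interbands to get a per-interband average.
    return [count / len(anchors) for count in counts]
```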

Thus, although the size of individual interbands is, on average, 2 kb [6, 7] (according to our estimates, approximately 3.4 kb), the insulator proteins and other previously described structures characteristic of the interbands [2] were narrowly localized in the regulatory region of the genes located in the interbands. The enrichment of the aquamarine chromatin in ORC proteins and 5'-noncoding regions of the genes with mostly ubiquitous activity [2, 9] confirmed the fact that interbands combine the processes of transcription initiation and replication of genes. Apparently, the gene regulatory region plays the decisive role in the formation of the domain of open decompacted chromatin of interbands.

ACKNOWLEDGMENTS

This work was supported by the Russian Science Foundation (project no. 14-14-00934).

COMPLIANCE WITH ETHICAL STANDARDS

The authors declare that they have no conflict of interest. This article does not contain any studies involving animals or human participants performed by any of the authors.