Introduction

Eukaryotic genomes present many enigmatic features, especially when viewed in light of the reductionist belief that, because the fundamental principles of organization are common to all life forms on this planet, genome organization in diverse organisms should also follow common norms. While the genome does follow some common ‘rules’ extending across very diverse levels of organization, the many variations that do exist appear paradoxical as they defy logic based on our current understanding of biological principles. Sometimes, simplistic explanations are advanced to explain the paradoxical situations, and these gain wider acceptance because of the apparent absence of a ‘logical’ alternative. Heterochromatin and non-coding DNAs are examples of paradoxical components of our genome. Heterochromatin was a fact for cytologists, and geneticists knew that despite its appearing to be a ‘gene desert’, it has significant and decisive effects on phenotypes in various organisms [132, 187]. However, the unusual cytological properties and diversity of phenotypic effects of heterochromatin, and the fact that it apparently did not follow the rules of the game defined by Mendelian genetics, often led to it being sidelined, with some even believing that this component of the nuclear chromatin can be dispensable. Likewise, the non-coding DNA, which was largely, but not absolutely, related to the cytologists’ heterochromatin, defied the logic following the protein-centric central dogma of molecular biology [48, 49]. During the last quarter of the 20th century, heterochromatin acquired novel connotations in relation to chromatin organization and epigenetic modifications and thus molecular biologists regained their interest in the riddle of heterochromatin. However, the non-coding DNA was generally cast aside as selfish or junk DNA [59, 159, 160], and therefore, remained largely ignored till the beginning of this century.

Prior to the end of the 20th century, several of the non-coding DNA sequences and their transcripts were demonstrated to have far-reaching implications in the organism’s life [3, 29, 41, 72, 123, 128, 150]. With the advent of large scale genomic studies in diverse organisms, it became clear by the beginning of this century that the non-coding DNAs are present in all organisms although their relative as well as absolute amounts in the total genome can vary widely even between related species. The improved RNA sequencing technologies provided compelling evidence that a large fraction of the so-called ‘selfish’ or ‘junk’ DNA was actually transcribed in most organisms [7, 26, 46, 148, 152]. Catalyzed by these leads, recent times are indeed witnessing great excitement about the non-coding DNA, so much so that concepts like ‘selfish’ or ‘junk’ DNA have themselves become junk!

The present review attempts to correlate, taking examples largely from Drosophila, the various ‘functions’ and actions ascribed to heterochromatin with the increasingly better understood activities of the diverse non-coding RNAs (ncRNA).

Early notions of heterochromatin: subtle and large phenotypic effects despite being ‘gene desert’, transcriptionally silent and highly repetitive DNA enriched

The chromatin regions that showed differential staining and condensation cycle in cells of mosses (Bryophytes) were termed by Heitz [92] as heterochromatin, in contrast to the euchromatin that showed lighter staining and the expected condensation cycle during mitotic and interphase stages. Subsequent genetic and cytological studies in Drosophila and plants like maize [45, 52, 93, 138, 149, 161, 196, 202] revealed unusual cytological and genetic properties of heterochromatin. These included identification of heterochromatic regions of chromosomes as ‘gene deserts’, involvement in chromosome rearrangements or gene transpositions, ectopic pairing, position effect variegation (PEV), diverse effects on phenotypes, and a requirement for male fertility in Drosophila. The condensed chromatin regions were found by early studies, utilizing cellular autoradiography to identify the 3H-uridine incorporating sites in intact nuclei, to be transcriptionally inactive [30, 143, 163]. The transcriptional inactivity inferred on the basis of these studies apparently complemented the results of genetic studies that showed the condensed heterochromatic chromosomal regions to be devoid of typical ‘Mendelian’ genes. Another general feature of heterochromatin, established in the 1960s, was that these chromosome regions were ‘late’ replicating, i.e., they replicated in the later part of the S-phase of the cell cycle [30, 187]. The polytene chromosomes, present in certain tissues of Drosophila and other dipteran insect larvae, contributed immensely to understanding of gene function and chromatin organization [11, 14, 161] and surprisingly, also to the paradox associated with heterochromatin. The large polytene chromosomes in Drosophila permitted identification of many small intercalary heterochromatic regions dispersed through the euchromatic regions of different chromosomes which were characterized by constrictions, ectopic pairing [196] and late replication [5, 124].
Studies on distribution of the large pericentromeric heterochromatin blocks on different chromosomes in Drosophila melanogaster revealed that these regions did not participate in the endoreduplication cycles that generate the polytene nuclei [93, 125, 127, 175]. Later studies showed that the rDNA sequences located within the pericentromeric heterochromatin of X and Y chromosomes [200], and many intercalary heterochromatic regions [18, 117, 158] also displayed reduced or no participation in the endoreplication cycles in larval salivary glands of Drosophila. Discovery of satellite and repetitive DNAs and application of in situ hybridization to determine their nuclear localization in the 1970s [43, 106, 227] revealed a widespread association of intercalary, pericentromeric and telomeric heterochromatic regions with diverse transposon and highly repetitive sequences in all eukaryotes examined [47, 66, 78, 85, 129, 132, 210, 218, 228].

Heterochromatin remained paradoxical to geneticists, cytologists and evolutionary biologists because of its condensed state in most cell types of an organism, its being largely devoid of ‘genes’ and yet claimed to be exerting remarkable effects on diverse phenotypes, and its persistence in species’ genomes [30, 45, 187]. The fact that certain species that show a precisely regulated and orderly ‘chromatin diminution’, a process that eliminates large blocks of heterochromatin and related DNA sequences from somatic cells during embryonic development [13, 23, 54, 197], and also those species that show under-replication of heterochromatin during endoreplication cycles, have not got rid of heterochromatin from their germline, further added to the enigma.

Darkly stained and condensed heterochromatic regions are generally similar on both homologs in diploid cells, with a few notable exceptions. One is the heterochromatinized inactive X-chromosome in somatic cells of female mammals. In this case, one of the two X-chromosomes in eutherian mammalian female’s somatic cells is randomly inactivated early in development and remains condensed in the form of Barr body [131, 142] while the other X-chromosome remains euchromatic and active like the autosomes. Another well known case is that of the mealy bugs (Coccid insects) in which the males develop parthenogenetically and thus are haploid while females are diploid but all somatic cells of females carry the paternally derived haploid set of chromosomes in an inactive heterochromatinized state [36]. To account for such diversity of heterochromatin regions, Brown [30] grouped the cytologically condensed heterochromatin into two classes: (i) constitutive heterochromatin where both homologs showed similar condensation and dark staining in most of the cell types, and (ii) facultative heterochromatin, in which the genetically similar homologs in the same diploid nucleus behaved differentially so that one of them gets epigenetically modified to become condensed and transcriptionally inactive. An additional and a very significant fundamental difference between the two types is that the constitutive heterochromatin, whether present as pericentromeric or telomeric blocks or dispersed through chromosomes as intercalary heterochromatic regions, is majorly composed of highly repetitive/satellite sequences and functional as well as non-functional defective transposons [31, 139].

Epigenetics of heterochromatin: revelation of chromatin condensation mechanism

Recent decades have seen a remarkable interest in the field of epigenetics, which can explain many phenomena and observations, including heterochromatinization, that appear enigmatic in terms of the conventional understanding of Mendelian genetics and gene expression. The term ‘epigenetics’ was first used by Waddington [215], who also coined the term ‘epigenotype’ for a whole complex of developmental processes that lie “between genotype and phenotype, and connecting them to each other”. The term epigenetics has become very popular in recent decades, although with varying and sometimes misleading/confusing interpretations. A widely accepted view of epigenetics implies study of changes in gene function that are heritable through mitotic and/or meiotic cell generations without entailing changes in the DNA sequence [53]. The first indication of such epigenetic changes was provided by studies on DNA methylation which seemed to affect expression of the given gene [96, 173]. Constitutive as well as facultative heterochromatin regions were found to have higher incidence of DNA methylation [44, 173]. Subsequently, the various histones, which associate with DNA to make the eukaryotic chromatin, were also found to display a variety of isoforms, each with characteristic post-translational modifications that have predictable consequences on chromatin organization, gene activity and ‘heritability’ of the chromatin state through cell generations [2, 12, 15, 53, 60, 85, 90, 91, 102, 105, 122, 129, 131, 169, 188, 230]. 
Interestingly, the increasing understanding of ‘histone code’, that seems to underlie the epigenotype and the cell inheritable active or inactive state of chromatin and specific genes, has revealed that constitutive heterochromatin while showing some unique post-translational histone modifications also shares some epigenetic marks with facultative heterochromatin and with typical euchromatic regions that get temporarily silenced as part of developmental gene regulation programme [2, 20, 21, 102, 172, 181, 217]. The constitutive heterochromatin is primarily characterized by the presence of H3K9me2/3 and Heterochromatin Protein 1 (HP1) while the facultative heterochromatin shows presence of H3K27me3 and polycomb group (PcG) based PRC1 and/or PRC2 repressive complexes [71, 157]. Although generally believed that the polycomb family based repressive protein complexes are absent in constitutive heterochromatin [71], the BMI1 protein, a member of the PRC1 complex, has been reported [1] to be associated with constitutive heterochromatin in mammalian cells in a developmentally regulated manner. In addition to these major epigenetic marks, other marks are also variably associated with it so that within a block of constitutive heterochromatin, different regions may show heterogeneous and dynamic epigenetic marks [86, 206, 217]. The combinatorial patterns of chromatin marks on active and silent genes within a constitutive heterochromatic block are unusual in terms of levels of enrichment/depletion and in distributions across gene segments, and thus different from those on euchromatic genes or facultative heterochromatin regions. Higher expression of constitutive heterochromatin associated genes correlates with lower enrichments for H3K9me2 across all gene segments, but not with HP1 levels [172]. 
Thus, the composition and architecture of different constitutive heterochromatin domains are spatially more complex and dynamic than generally perceived so that the diverse functions of heterochromatin are regulated and executed through the network of its sub-domains [206]. Interestingly, it is now also known that HP1 and a few other epigenetic marks that have generally been identified with the repressed constitutive heterochromatin are also essential for active transcription, at least at several euchromatin sites [33, 51, 62, 110, 120, 121, 164, 166, 167, 172, 206, 213, 217]. Thus, while at a gross level the different types of chromatin appear to be characterized by distinctive epigenetic marks, it is notable that when looked at a finer level, the specific epigenetic marks in the given region of condensed constitutive heterochromatin are largely dependent on the local context.

Consanguinity of heterochromatin and non-coding RNAs (ncRNAs)

The advent of eukaryotes with a nuclear genome necessitated evolution of chromatin and temporal regulation of different genes. Further, as eukaryotes evolved multi-cellularity and division of labour, a spatial regulation of activity of different genes also had to be incorporated into the gene regulatory network. Condensed state of chromatin limits access to transcriptional machinery and thus provides simple but efficient spatio-temporal regulatory machinery. Although conventional views considered proteins as the major players in the complex eukaryotic gene regulation network, ncRNAs must have played such roles from the very beginning of eukaryotic organization since bacteria too make use of their regulatory property [69, 220]. With the increasing genome size in eukaryotes, the propensity for insertion of transposons in the genome also enhanced, which on one hand would adversely affect the genome integrity but at the same time also provide the much needed raw material for evolution of new genes and regulatory networks. Like the prokaryotic restriction-modification, and the guide-RNA dependent CRISPR–CAS genome defense systems [116], the evolution of small RNA pathways (miRNA, siRNA, piRNA or P-element–induced wimpy testis (PIWI)—interacting RNAs etc.) early in eukaryotes provided not only the additional layers of gene regulatory networks but also would have helped in keeping the virus and transposon activities in check [70, 75, 115, 182]. One of the simple and effective ways to restrict the increasing load of such invading mobile DNAs would be to heterochromatinize those chromatin regions. Thus evolutionarily, heterochromatin and at least some of the ncRNAs seem to share common origin and task.

Historically, heterochromatin has been associated with a set of negative features, viz., condensed chromatin state, devoid of ‘genes’ and, therefore, inactive. In agreement with such perceived properties, heterochromatic regions were also ‘found’ in early studies to be transcriptionally inactive. It is interesting to note, however, that while the evidence for paucity of typical genes in heterochromatin was based on mutagenesis and gene-mapping studies applied to constitutive heterochromatin, the notion of transcriptional inactivity of heterochromatin was largely based on studies on facultative heterochromatin like the inactive X-chromosome in somatic cells of female mammals [208] or the paternally derived heterochromatinized chromosome set in female coccids [177]. Since the resolution provided by conventional cellular autoradiography of 3H-uridine labeled diploid cells was limited, it was generally accepted that, like the facultative heterochromatin and condensed bands in polytene chromosomes, the condensed constitutive heterochromatin regions too were transcriptionally silent [191]. In view of the strongly held belief of transcriptional inactivity of heterochromatin, some of the early studies that demonstrated genetic and transcriptional activity of typical constitutive heterochromatic regions in Drosophila did not attract widespread attention. A series of studies by G. Meyer’s group in the 1960s showed that the Drosophila Y-chromosome, a classical example of constitutively heterochromatic and gene-desert chromosome, gets decondensed in primary spermatocytes and assembles transcriptionally active ‘lampbrush loops’ [94, 95, 132]. Active transcription of the β-heterochromatin regions in the chromocentre in polytene nuclei of Drosophila was also demonstrated in early 1970s [126]. 
Likewise, many of the biochemical studies in the 1960s and 1970s that showed the existence of a variety of nuclear RNAs that did not move to the cytoplasm [61, 80, 189, 199, 222] fell into oblivion as the concepts of ‘selfish’ or ‘junk’ DNA became popular. Although the tools and reagents available in the early days of molecular biology could not be precise about the identity of these nuclear RNAs, it is obvious that the diverse nucleus-limited heterogeneous nuclear RNAs (hnRNAs) noted in those early studies included the many nuclear ncRNAs that are now known and others yet to be discovered.

Chromosomal RNA was identified as a component of chromosomes of higher organisms in the 1960s [16, 99, 100, 195] but was later claimed to be an artifact [222] and, therefore, remained largely ignored. However, recent studies have revealed a variety of RNAs, especially ncRNAs, to be associated with chromatin, including heterochromatin. The ncRNAs regulate gene expression through modulating availability and/or activity of the regulatory proteins and by regulating the 3-dimensional organization of chromatin in the nucleus. As widely reviewed in recent years, many of the ncRNAs are actually derived from centromeric, telomeric and intercalary heterochromatic regions or are essential for heterochromatinization itself [21, 32, 34, 38, 42, 57, 64, 66, 86, 87, 88, 89, 112, 132, 140, 184, 192, 193, 195, 212, 214, 218, 223, 224]. Obviously, heterochromatin and ncRNAs are not only closely related but often interdependent.

As noted above, the process of heterochromatinization or chromatin condensation has conventionally been associated with transcriptional silencing and thus genetic inactivity. The general co-localization of the diverse epigenetic silencing marks with condensed and transcriptionally inactive chromatin buttressed the notion that heterochromatinization essentially serves to keep the chromatin components that are not permitted to transcribe in a silent mode. However, such commonly prevailing negative image of heterochromatin, that it is only a mechanism for suppression, needs a re-assessment since more positive actions of the constitutive heterochromatin are now known and understood. The classical cytological and later cell and molecular biological methods in 1970s and 1980s provided only a limited cytological and biochemical resolution resulting in the near all-or-none descriptions of organization and functions of heterochromatin in any cell type. The significantly enhanced resolution provided by the contemporary microscopic and other high-throughput biochemical and molecular techniques permit us to have a greatly magnified and resolved picture of local variations in the organization and properties of heterochromatin blocks that earlier looked monolithic. The Y-chromosome of D. melanogaster is briefly discussed below to provide glimpses of the fine structure of typical constitutive heterochromatin that is emerging from synthesis of the extensive cytogenetic studies on this chromosome during the past nearly 100 years with high throughput molecular analyses in recent decades. Reexamination of the classical documentation of subtle effects of changes in heterochromatin on specific phenotypes in light of the contemporary molecular analyses reveals that heterochromatin is not just a chromatin state that is required primarily for silencing. It is becoming increasingly clear that the heterochromatic regions have proactively widespread roles in lives of cells and organisms.

Y-chromosome of Drosophila: a remarkable example of complex molecular organization of constitutive heterochromatin and its genome wide regulatory effects

It is generally believed that the Y-chromosome (or the W-chromosome) in heterogametic sex is a degenerate chromosome with its role being required essentially for fertility and/or the very early steps in sex-determination [10, 17, 37, 83, 162]. Very soon after the beginning of genetic mapping of ‘Mendelian genes’ at specific loci on the linear linkage maps of the fruit-fly chromosomes [205], it was discovered that its large Y-chromosome, although not involved in determination of sex, was essential for male fertility [25] and yet, it was a ‘gene desert’ with only the bobbed (now known to be the locus for rRNA genes) and the enigmatic k1 and k2 fertility factors mapping to this chromosome [45, 108]. However, recent studies on the Y-chromosome of Drosophila have revealed it to have a finely peppered molecular organization (Fig. 1), which may help in understanding the diverse obvious and not so obvious subtle phenotypic effects that classical genetic studies [22, 27, 45, 77] ascribed to this chromosome, although without a clue at that time to their mechanistic bases.

Fig. 1

Organization of constitutively heterochromatic Y chromosome of Drosophila melanogaster. a A Hoechst 33258 stained metaphase from male somatic cells; b a magnified view of Hoechst-stained Y chromosome; Ce marks the centromere. c Diagrammatic representation of the landmark regions (1–25B) of the Y chromosome: upper half represents Hoechst 33258 banding (bright, dull and no fluorescence indicated by white, gray and black, respectively); lower half shows locations of the six fertility factors (kl-5, kl-3, kl-2, kl-1, ks-1 and ks-2 in green or red shaded regions) and the lampbrush-like loop forming domains (green); the N-band regions are marked by N below the schematic. d, e Locations of the different satellite DNA sequences and transposable elements, respectively, along the Y-chromosome. f Locations of different genes (protein coding and ncRNA genes and pseudogenes); vertical lines below the gene names indicate exons while the connecting diagonal lines represent introns. The different abbreviations and color shades are explained at bottom. Images in a and b are reproduced from [108] with permission of author and the Genetics Society of America. c–e are based on data summarized in [168] while f is based on data in [98] and www.flybase.org (color figure online)

The Y chromosome of D. melanogaster, which contains ~40 Mb of DNA and accounts for ~20% of the male haploid genome, appears completely heteropycnotic and heterochromatic in all somatic cells, and is comprised mostly of highly repetitive DNA and transposable elements (Fig. 1). Based on extensive genetic, cytogenetic, Hoechst 33258 and N-banding data, the Y-chromosome is subdivided into 25–26 segments onto which the different genetic elements, and satellite and transposable element sequences have been mapped (Fig. 1) [22, 73, 77, 108, 168]. Following the early cytogenetic mapping of six ‘fertility factors’, viz., ks-1 and ks-2 on the short arm and kl-1, kl-2, kl-3 and kl-5 on the long arm (Fig. 1) and bobbed or the rDNA locus [22, 27, 45, 77], subsequent molecular genetic studies have identified at least 13 protein-coding and some ncRNA genes, besides a few pseudogenes (Fig. 1). Some of the protein coding genes correspond to or overlap with the earlier identified fertility factors. As was predicted by studies on the Y-chromosomal ‘lampbrush loops’ in primary spermatocytes [95], some of the Y-linked genes are megabase-sized with gigantic introns and comprised mostly of repetitive and transposable element sequences [73, 98, 168]. Although the DNA sequence information for the Y-chromosome is still incomplete [98] due to the abundance of highly repetitive and transposable element sequences, it is clear that the protein coding and ncRNA genes on the Y-chromosome of D. melanogaster are interspersed, and often buried within long stretches of diverse satellite DNAs and transposable elements (Fig. 1).

The protein coding genes on the D. melanogaster Y chromosome are expressed exclusively in testis and thus seem to be required only for male fertility [35, 73, 98, 168, 178]. Yet, this chromosome exerts significant effects, in males as well as females (when present), on diverse non-germline phenotypes governed by autosomal and X-chromosomal genes. Such Y-linked regulatory variations (YRV), due to structurally altered Y-chromosomes or even to apparently wild type Y-chromosomes derived from different populations/individuals, affect various phenotypes like geotaxis, fitness of males, temperature sensitivity of spermatogenesis, expression of other genes in primary spermatocytes, immune response and silencing of X-chromosomal rDNA genes, through trans-effects on transcription of a very large number of X-linked or autosomal genes [24, 73, 84, 119, 136, 168, 178, 229]. The Y-chromosome also has a remarkable effect on PEV, a phenomenon of variable suppression of expression of a ‘euchromatic’ gene brought within or in close proximity of heterochromatin [138]. An extra Y in XXY females and XYY males suppresses the PEV while its absence in X0 males enhances it [58, 216, 221]. Autosomal heterochromatin regions also modulate the PEV in a comparable manner and, interestingly, the effect in all cases is based on the amount of heterochromatin rather than the presence or absence of a discrete region of Y-chromosomal or autosomal heterochromatin [19, 58].

Like the Y-chromosome of Drosophila, other constitutive heterochromatin domains in different genomes are also enriched in diverse repetitive and transposon sequences, with some protein coding, non-coding and pseudogenes buried within the landscape of repetitive/transposon sequences, and like the Y-chromosome of Drosophila, the other heterochromatic regions too affect a wide range of somatic and germline phenotypes [89, 181, 198, 218]. Such effects of constitutive heterochromatin are mostly exerted through modulation of the 3D-organization of nuclear chromatin and the associated trans-effects of the variety of ncRNAs produced by and/or associated with the constitutive heterochromatin regions. Significantly, the nuclear architecture in turn can also modulate heterochromatin activity.

Heterochromatin sculpts the 3D-organization of chromatin in nucleus

Beginning with the early cytological studies, the heterochromatic regions have been known to remain clumped close to the nuclear envelope. It is interesting to note that some of the early cytological studies on organization of constitutive heterochromatin had indicated cell type specific patterns of heterochromatin staining. For example, a cell-type specific spatial distribution of the large blocks of sex-chromosome associated constitutive heterochromatin in the vole, Microtus agrestis, was reported by Lee and Yunis in 1971 [134]. The methods available then could explain neither the underlying mechanism/s nor the functional significance of such cell type specific distinct patterns of heterochromatin.

More recent studies using advanced microscopy in conjunction with in situ hybridization/immunostaining, the various chromatin-capture and other high-throughput techniques, have revealed that cell type specific gene activity requires highly ordered yet dynamic 3-dimensional organization of chromatin in the nucleus and that this is dynamically interdependent on organization of the heterochromatin and other nuclear components [64, 183, 207, 214]. The variable positioning of heterochromatin–euchromatin borders in different cell types in terms of the epigenomic patterns [71] provides another example of regulated variability in local heterochromatin domains in relation to specific requirements of the cell. Following a new transposon insertion in euchromatin, the H3K9me2 repressive epigenetic marks can spread up to 20 kb at > 50% of the euchromatic transposon insertion sites [135], resulting in differential epigenetic states of alleles and their expression.

As a source of transposons, heterochromatin can impact the euchromatin sites where they insert. Heterochromatin may affect genome organization by directing insertion of different retroposons in constitutive or facultative heterochromatin or in euchromatin regions through interactions with repeats in their 5’UTR [155]. Indeed involvement of piRNAs and mobile DNA elements in generating transposon mediated heterogeneity in genomes of different cells in brain has been reported in mammals, Aplysia and Drosophila [6, 65, 153, 165, 170]. Such derived heterogeneity in genomes of different brain cells may have significant roles in memory and behavior of the organism. Quantitative and/or qualitative changes in heterochromatic regions may impact such genome reorganizations in somatic cells with varying consequences. A local enrichment of Piwi on genomic regions tethered to nuclear pore complexes, which have highly paused PolII, has been noted in Drosophila ovarian somatic cells [101]. It remains to be seen if this enrichment is only a part of the PIWI scanning mechanism or has some other biological consequence.

Recent studies make it clear that the profile of histone modifications and chromosomal proteins associated with constitutive heterochromatin is very diverse and much more complex than initially believed. It is now obvious that sub-domains within a constitutive heterochromatin block influence the overall organization of heterochromatin which in turn globally affects nuclear architecture and activities through short- and long-range interactions [39, 50, 171, 183, 207]. The phenomenon of position effect clearly demonstrates the dramatic consequences of the local topography of a given chromosome region or a gene in the 3-dimensional space of a nucleus based on the dynamicity of interplay of local chromatin organization, boundary elements and ncRNAs [42, 50]. The PEV, which involves ‘spreading’ of the condensed state to neighbouring euchromatic loci, is related to a breakdown in the normal borders of heterochromatin which are generally occupied by boundary or insulator elements like Gypsy retroposon derived sequences or the CTCF-binding sites, some of which are regulated by ncRNAs [42]. Such interactions between heterochromatin associated boundary elements and ncRNAs contribute to maintenance of the cell-type specific 3-dimensional architecture of the nucleus.

In agreement with the wider effects of Y-chromosomal heterochromatin on genome activity in somatic cells of Drosophila, a meta-analysis of the X-chromosomal and autosomal genes affected by YRV revealed that they show tissue- and to some extent species-specific expression and are often located close to the nuclear lamina in a repressive chromatin context, i.e., are usually associated with polycomb regulated repressed euchromatin and intercalary heterochromatin domains [178]. The enrichment of YRV-sensitive genes in repressive chromatin domains is significant since the inactive and condensed domains are major determinants of the spatial organization of chromosomal territories in a nucleus. Since the genes located in repressive context show high affinity for binding with Suppressor of Under-Replication (SuUR), Lam, and D1 proteins [71], it is possible that the altered content of Y-chromosome associated sequences modifies the availability and distribution of DNA binding proteins and consequent changes in the 3D spatial organization of chromatin in the nuclear volume which together affect activity of genes modulated by YRV [97, 132, 178, 207].

The Drosophila Tctp (Translationally controlled tumour protein) has roles in transcription and the stability of repeated sequences (rDNA and pericentromeric heterochromatin) through interactions with Brm and the Su(var)3–9 encoded H3K9 methyl transferase [97]. Since these proteins also have wider roles in nuclear topology and gene activity, the constitutive heterochromatin associated repetitive sequences can affect the overall genome organization and activity by modulating the availability of these proteins.

Like the Y-chromosome of Drosophila, the other constitutive heterochromatin blocks in Drosophila and other organisms too are fine mosaics of highly repetitive and transposon sequences, pseudogenes and other protein-coding and non-coding genes. Their higher order actions in cohesion of sister chromatids at pericentromeric regions, homologous chromosome pairing and segregation without chiasma during meiosis and in maintaining genome integrity [206] seem to be dependent upon the panoply of epigenetic marks and the interacting chromatin regulating proteins.

Heterochromatin organization affects aging in Drosophila since the age-related increase in activation of transposons is mitigated by over-expression of Sir2, Su(var)3–9, and Dicer-2 or down-regulation of Adar, all of which affect heterochromatin structure, and this is accompanied by an increase in life span [225]. The presence of more heterochromatic DNA in male than in female flies, due to the repeat-rich large Y-chromosome, is accompanied by shorter life span of males, presumably because of age-dependent loss of heterochromatic organization of repetitive elements resulting in enhanced transposon activity [28, 218]. In mammals also, senescence and disease conditions like Hutchinson–Gilford progeria syndrome (HGPS or progeria) or chronic cell stress are associated with changes in constitutive heterochromatin organization and its epigenetic marks [82, 146] and this seems to be one of the factors responsible for aging.

Recent understanding of the membrane-less organelles in cells as phase-separated entities has been extended to the condensed inclusion bodies carrying repeat-containing RNA bound to certain proteins [103] and to the HP1 associated condensed heterochromatin domains [203]. Such condensed masses are suggested to be formed by specific interactions between certain proteins and DNA or RNA sequences via phase separation which generates organized condensed masses that include liquid and stable compartments [8, 103, 179, 203]. The biophysical properties of phase-separated systems are expected to explain the unusual behaviors of heterochromatin, and the mechanisms through which these domains regulate many nuclear functions. The act of transcription and the local presence of different RNAs, including ncRNAs, have profound effect on nuclear topology not only because of the complex interactions between the nucleic acids and different proteins (transcription factors, chromatin remodelers, RNA pol etc.) but also because all these interactions affect the phase-separated structural features of diverse nuclear domains like speckles, nucleolus, Cajal bodies, and heterochromatin masses etc. [183]. The PcG proteins catalyze the formation of the PRC1 and PRC2 types of repressive complexes some of which form nuclear domains called PcG bodies. The PcG bodies are also examples of phase-separated systems and are often bound to heterochromatin and located near centromeres [211]. Since the PcG proteins regulate diverse and large numbers of genes [185], any change in heterochromatin organization can also affect the PcG bodies and thus have wider implications for chromatin organization in nucleus and thus on gene activity in the cell.

Heterochromatin derived or associated ncRNAs have wide-ranging effects in somatic cells

A less discussed but likely to be very pervasive mechanism through which constitutive heterochromatin exerts its genome-wide effects seems to operate through cis and trans effects of the diverse ncRNAs that are produced by and/or are associated with heterochromatin. These RNAs can modulate the chromatin organization and/or have more direct effect on expression of other genes. The major group of ncRNAs produced by the transposon and highly repetitive DNA sequences enriched constitutive heterochromatin is that of small ncRNAs like siRNAs and piRNAs. These transcripts, especially the piRNAs, are usually regarded as a defense mechanism against the invading transposons and viruses by keeping them silenced [53, 113, 176, 186, 190, 194, 207, 219, 226]. Although expression of siRNA or piRNA is known to be required for chromatin condensation and heterochromatin formation [67, 86, 88, 114], these heterochromatin derived or associated small RNAs have other functions too. There is increasing evidence that heterochromatin associated repetitive and transposon sequence derived small RNAs are also expressed in somatic cells and, besides their effects on chromatin condensation, they have other developmental roles as well through regulation of expression of chromatin modifier and other genes [4, 67, 79, 182, 186, 194, 201, 219]. Further, in view of the condensed heterochromatin blocks being phase-separated entities (see above), the act of transcription of the heterochromatin associated DNA sequences/genes by itself alters the topology of chromatin in a given cell and impacts the genome activity in a specific manner.

Oncogenic transformation of Drosophila somatic cells through expression of oncogenic Ras combined with loss of the Hippo tumor suppressor pathway activates the primary piRNA pathway, including transcription of the piRNA cluster on the Y-linked Su(Ste) gene [68]. In an unpublished study (M. Ray and S. C. Lakhotia, unpublished), an elevated expression of the Y-linked Su(Ste) nc transcripts has also been noted in cells over-expressing activated Ras and the non-coding hsrω gene. The various piRNAs produced by the Y-linked Su(Ste) regions have a complex regulatory relationship with the X-heterochromatin located repetitive Stellate gene, which in turn shares high homology with the X-linked CK2-β and autosomal CK2-β′ and Suppressor of Stellate Like (SSL or CK2-β Tes) genes [9, 73, 201]. Since the CK2 family proteins have multiple roles in development [9, 201], expression of Su(Ste) nc transcripts in developing somatic cells under certain conditions would have significant consequences.

The piRNAs may affect nuclear metabolism indirectly as well since a loss of nuclear PIWI was found to be associated with an increase in abundance of small nuclear spliceosomal RNAs, suggesting that PIWI may be involved in post-transcriptional regulation [114]. The PIWI proteins, besides cleaving the transposon RNAs, use the piRNAs generated from transposons and pseudogenes to regulate mRNAs at post-transcriptional levels [219]. Thus changes in availability and abundance of piRNAs in somatic cells may have global effect on cell’s transcriptome through their interactions with PIWI proteins which in turn would affect metabolism of snRNAs and mRNAs.

The pericentromeric and intercalary heterochromatin regions show interesting relation with diverse ncRNAs [132]. About 20% of the annotated lncRNAs are reportedly associated with heterochromatic and under-replicated regions in D. melanogaster [147]. The borders of under-replicated domains in endoreplicating cells too are enriched in short ncRNA encoding sequences and rapidly evolving transposable elements that are transcriptionally active [147, 209]. The propensity for rapid evolution displayed by the various ncRNA genes, repetitive DNA sequences, transposons etc. underlies the significant roles that the constitutive heterochromatin is believed to play in speciation and reproductive isolation. The heterochromatic regions, enriched in “non-coding” elements (lncRNA or retroposed genes), seem to be hotspots and testing grounds for evolution of novel genes through expression in testis or even as transcriptional noise [111, 147]. Many studies on evolution of ‘new’ genes on the Y-chromosome of Drosophila indeed show a rapid DNA sequence divergence and acquisition of new functions by them and their critical roles in reproductive isolation through diverse mechanisms including hybrid dysgenesis [4, 9, 35, 56, 74, 76, 119, 141, 144, 174, 204]. The diverse transposon derived sequences like LINEs, SINEs etc. in mammalian cells are also known to play very significant roles in organization of heterochromatin, lncRNA evolution and gene regulation [63, 107, 130].

It is interesting that the lncRNAs required for inactivation of X-chromosome in somatic cells of female mammals and those associated with the hyperactive X-chromosome in somatic cells of male Drosophila also affect other activities in cell. The non-coding roX transcripts in conjunction with Msl1, Msl3 and Mle proteins regulate the normal expression of autosomal heterochromatin genes in male but not in female flies [55, 118]. It is reported that a failure of the imprinted X-inactivation centre (XCI) also affects autosomal gene expression [180].

Roles of epigenetic trans-generational inheritance are now increasingly appreciated [91, 133, 188]. Heterochromatin and the associated ncRNAs in different systems have been shown to affect the trans-generational epigenetic effects [133]. In a screen for sex-linked paternal effects in D. melanogaster, both X- and Y-chromosomes were found to substantially contribute to non-genetic paternal effects [74]. Maternally or paternally inherited Y-chromosome has been shown to affect the roX1 and roX2 ncRNA dependent hyperactivity of X-chromosome in male flies [151]. A P-element mediated white gene insert near tip of short arm of Y-chromosome expresses at a lower level in progeny when transmitted by male than by female parent [81]. A study on genome wide effects of sex chromosome imprinting in Drosophila [137] revealed hundreds of genes to be differentially expressed in relation to maternal and paternal origin of sex chromosomes. Y chromosome of D. melanogaster shows chromosome-wide imprinting [145]. Further, many examples of imprinting in Drosophila result in parent-of-origin effects on expression of genes in or near heterochromatic regions [137, 145]. It is possible that the mechanisms and pathways that operate for the trans-generational effects can also be effective within the body of an organism so that activities of the constitutive heterochromatin derived ncRNAs in one cell type can impinge upon other cells in the body.

The above few examples illustrate how the various heterochromatin derived and/or associated ncRNAs affect a range of phenotypes through modulation of the ‘epigenotype’. It is interesting to draw a parallel between the currently understood functions of constitutive heterochromatin and those that were empirically suggested earlier on the basis of cytogenetic studies. In a detailed analysis of properties and ‘functions’ of heterochromatin, Cooper as early as 1959 [45] argued that heterochromatin, even though largely inert genetically, acts (1) on genes, (2) within chromosomes, (3) transchromosomally, (4) metabolically, (5) on the cell, (6) on development, (7) in speciation and, finally, (8) in theory as the especial “seat of the unorthodox” in genetic systems. All these ‘acts’ ascribed to heterochromatin by Cooper can now be explained in terms of actions of various heterochromatin-derived or associated ncRNAs [132]. With better appreciation and understanding of the ncRNAs, the constitutive heterochromatic component of eukaryotic genomes is revealing its mysteries and thus need no longer be considered mysterious or a paradox. In hindsight, it is indeed remarkable that classical geneticists, with the tools of cytology and genetics as the only weapons in their armoury, speculated so concisely about diverse functions of heterochromatin, which are now being appreciated in mechanistic detail using powerful and technologically advanced high-throughput techniques.

Epilogue

Identity of heterochromatin has become a little blurred in recent decades because a distinction between the constitutive and facultative heterochromatin domains, and between them and the chromatin regions that appear transcriptionally silent and/or show some repressive epigenetic marks, has often not been maintained. It is clear that in spite of their sharing several cytological features and epigenetic marks of repression, the three chromatin types, viz., the constitutive heterochromatin, facultative heterochromatin and transiently silenced euchromatin need to be considered separately for understanding their organizational features and evolutionary consequences. While considering these different chromatin domains to be distinctive in some ways, it is also to be appreciated that they are part of a continuum, the basic chromatin fibre.

A general impression about the repetitive DNA sequence and transposon rich constitutive heterochromatin is that these regions are epigenetically marked only to keep them condensed and inactive so that the high propensity of transposon and viral DNAs to remain mobile is kept in check. Much of the paradox about heterochromatin, doing very little for the organism and yet continuing to be a significant part of the genome, is rooted in such perceived negative roles of heterochromatin. Invasion of genomes by viruses, transposons and other DNA sequences is inevitable but it is not correct to imagine that keeping them silent is the only act that heterochromatin can do or does. Likewise, to believe that transposons are only parasites and thus must always be kept in check is also not correct. Biological systems are plastic and they evolve so that while transposons invade, the host genome evolves strategies to restrain their propensity to multiply and spread [109]. However, during this continuing tug of war, the host genome also evolves newer ways to exploit the ‘invading’ genomes to improve its own fitness [40, 109]. Sometimes, however, the adaptive strategies may also be associated with a negative offset. For example, the evolutionarily conserved template switching and DNA break induced repair replication pathways, although essential for maintenance of the highly repetitive and transposon sequence rich centromeric and telomeric heterochromatin in mammalian genomes, also are claimed to be responsible for the occasional triplet expansion to generate tandem repeats of simple sequences which cause serious disease conditions when expanded beyond a threshold upper limit [156]. Apparently, the disease burden is outweighed by the criticality of maintenance of genomic integrity in the face of telomeric and centromeric heterochromatin associated highly repetitive sequences [156]. Such continuing evolutionary forces indeed have sculpted the different genomes as we find them.

The constitutive heterochromatin, especially the Y- or the W-chromosome in species with heterogametic mode of sex-determination, has often been considered to be evolutionarily degenerate [83] because of the general absence or paucity of protein-coding genes but a greater proportion of pseudogenes derived from functional autosomal genes. The perceived lack of well documented phenotypes that can be associated with specific parts of these chromosomes has added to the notion of these chromosomes being degenerate. However, rather than being degenerate, these chromosomes are to be viewed as “seat of the unorthodox” in genetic systems [45], since, as discussed here, they perform a variety of functions, which are often not mediated through their own protein coding genes, but are effected through modulation of diverse ncRNAs, which in turn have trans-effects directly or indirectly through alterations in spatial organization of nuclear chromatin and on activities of protein coding genes located on other chromosomes. Way back in 1959, Cooper [45] stated this very succinctly as follows: “Thus ‘heterochromatin’ is not to be viewed necessarily as evolutionarily degenerated ‘euchromatin’. Rather it is suggested that the major heteropycnotic and heterochromatic regions are specialized elements, having exceptionally long periods of relative condensation, most or many genes of which act only at particular points of development or in particular tissues. The Y for example, is not a degenerate chromosome, but a highly specialized genetic system many genes of which are essential for survival of the species because their actions uniquely confer functional capacity upon the spermatozoa”. While fertility of individuals of a given sex because of the presence of these chromosomes is a very vital function, their roles in modulating diverse somatic phenotypes are of equal evolutionary significance. The small variations in phenotypes, for example due to YRV, may not appear significant under the constant laboratory conditions but these would have long-term significant consequences under the unpredictably variable natural conditions. Thus considering the constitutive heterochromatin as an unwanted guest and burden on the genome is to grossly undermine its significance.

The considerably varying amounts of constitutive heterochromatin in different species have also contributed to the paradox of heterochromatin. Using the reductionist approach, it is indeed difficult to justify the wide variations in relative as well as absolute amounts of chromatin in the form of constitutive heterochromatin. However, biological systems are not created by design but are products of random events and natural selection. As stated by Jacob [104], “The action of natural selection has often been compared to that of an engineer. This, however, does not seem to be a suitable comparison. First, because in contrast to what occurs in evolution, the engineer works according to a preconceived plan in that he foresees the product of his efforts. Second, because of the way the engineer works: to make a new product, he has at his disposal both material specially prepared to that end and machines designed solely for that task. Finally, because the objects produced by the engineer, at least by the good engineer, approach the level of perfection made possible by the technology of the time. In contrast, evolution is far from perfection. This is a point which was repeatedly stressed by Darwin who had to fight against the argument of perfect creation. In the Origin of Species, Darwin emphasizes over and over again the structural or functional imperfections of the living world.” Thus if a species’ genome can take care of a greater amount of constitutive heterochromatin without tipping the balance of natural selection, it survives and continues as well as another species which maintains a much smaller proportion. The C-value paradox [230] also needs to be looked at in the same vein without worrying that our conventional reductionist approach fails to explain the enormous variations in the haploid DNA content in related species. Biological systems, being products of chance and necessity [154], do not always follow human logic.