Introduction

Nucleosomes are composed of ~146 bp of DNA and a histone core, which is an octamer consisting of two copies each of histones H2A, H2B, H3 and H4. All four histones can be modified at specific residues, particularly in their amino (N)-terminal tails. Among these post-translational modifications, acetylation and methylation are the two most commonly studied forms. Generally, acetylation of specific lysine residues of the N-terminal tails of histones H3 and H4 is associated with transcriptional activation, while mono-, di- and tri-methylation occurring at lysine or mono- and di-methylation occurring at arginine residues can fulfill different regulatory roles (Pfluger and Wagner 2007). Together with other modifications, e.g., phosphorylation, ubiquitination, glycosylation and sumoylation, these epigenetic marks are believed to compose a histone code, which directs specific and distinct DNA-templated programs based on distinct combinations of post-translational histone modifications.

Recently, a number of studies reported efforts on the global identification of DNA methylation and histone modification sites in several model organisms. So far, about 20 modifications have been examined in yeast on a genomic basis (Bernstein et al. 2002; Kurdistani et al. 2004; Wirén et al. 2005; Pokholok et al. 2005; Rao et al. 2005; Sinha et al. 2006; Millar and Grunstein 2006; Liu et al. 2005). In higher eukaryotes, chromatin immunoprecipitation (ChIP)-Chip as well as ChIP-seq methods have been utilized to examine DNA methylation and histone modification patterns (Eckhardt et al. 2006; Schübeler et al. 2004; Bernstein et al. 2005; Kondo et al. 2004; Roh et al. 2004, 2005, 2006; Barski et al. 2007; Bernatavichute et al. 2008). These studies have greatly improved our understanding of the relationships between chromatin modifications/structure and gene expression. Furthermore, the availability of these mapping data allows us to progressively decipher the histone code by examining the combinatorial effects of these epigenetic marks (Liu et al. 2005; Agalioti et al. 2002; Kouzarides 2007; Jenuwein and Allis 2001).

In Arabidopsis, several repressive histone methylation marks have been reported: H3K9me1/2, H3K27me1/2 and H4K20me1 preferentially associate with the heterochromatic chromocenters (Fuchs et al. 2006). Additionally, H3K9me2 was found to be localized in the heterochromatic knob region of chromosome 4, as well as in transposable elements (TEs) and related repeats (Lippman et al. 2004; Turck et al. 2007), and H3K9me2 showed a higher level in pericentromeric/centromeric regions than in euchromatic regions (Bernatavichute et al. 2008). ChIP-Chip analysis of H3K9me3 and H3K27me3 showed that these marks are mostly found in euchromatic regions of chromosome 4, indicating transcriptional repression of genes (Turck et al. 2007). Moreover, genome-wide approaches were used to map DNA methylation and H3K27me3 (Zhang et al. 2006, 2007; Zilberman et al. 2007), and it has been demonstrated that these two epigenetic marks play important roles in the regulation of gene expression and chromatin structure.

While great efforts have been made to generate genome-wide landscapes of repressive epigenetic marks in Arabidopsis, the global mapping of an epigenetic modification associated with transcriptional activation has not been extensively reported. Yet, on a gene-to-gene basis, H3K9ac is one of the most characterized epigenetic marks associated with active transcription and has been shown to influence numerous developmental and biological processes in higher plants (Benhamed et al. 2006; Ausín et al. 2004; Ng et al. 2006). However, a systematic identification of H3K9ac-associated regions in the entire genome is still lacking.

Lysine residue 9 of histone H3 can either be acetylated or mono-, di-, or tri-methylated. These epigenetic states have a diverse impact on regulating transcriptional activity and chromatin organization. H3K9ac is invariably correlated with transcriptional activation in all the species analyzed so far (Kurdistani et al. 2004; Pokholok et al. 2005; Sinha et al. 2006; Liu et al. 2005; Schübeler et al. 2004; Bernstein et al. 2005; Roh et al. 2005, 2006). On the other hand, H3K9me2 was reported to be restricted to constitutive heterochromatin in Arabidopsis (Lippman et al. 2004; Turck et al. 2007; Tariq et al. 2003; Johnson et al. 2002; Zilberman et al. 2003; Houben et al. 2003), and the knowledge concerning its mode of action outside heterochromatin remains limited (Bernatavichute et al. 2008).

In this report, we used a ChIP-Chip method to identify genomic regions associated with H3K9ac and H3K9me2 in Arabidopsis. Our analyses demonstrated that the distribution of these two epigenetic modifications in euchromatic regions has a major impact on transcriptional activity. Our study, together with the available genomic landscapes for H3K27me3 and DNA methylation (Zhang et al. 2007; Zilberman et al. 2007), supports the idea that a combination of different epigenetic modifications is tightly correlated with gene transcriptional activity.

Materials and methods

Plant materials and growth conditions

Arabidopsis ecotype Columbia (Col-0) was the genetic background of all the lines used. T-DNA knockout lines for the HDA19 (At4g38130) gene were obtained from the Salk Institute Genome Analysis Laboratory (SALK_139445.43.60.X; http://signal.salk.edu/; Alonso et al. 2003). Surface-sterilized seeds were sown on sterile half-strength Murashige and Skoog medium (Sigma–Aldrich). Seeds were stratified for 2 days at 4°C, then germinated and grown for 6 days at 22°C under continuous light as previously described (Feng et al. 2004). Six-day-old seedlings from three independent biological replicates were collected for further analysis.

Chromatin immunoprecipitation and probe labeling

Chromatin immunoprecipitation (ChIP) was performed with 1.5 g of seedlings according to Bowler et al. (2004). The following antibodies were used at 3 and 10 μL, respectively: anti-H3K9ac (Upstate 07-352) and anti-H3K9me2 (Upstate 07-441). ChIP DNA and input DNA samples were amplified and labeled as previously described (Lee et al. 2007). An aliquot of untreated sonicated chromatin equal to the amount of ChIP-DNA was reverse cross-linked, amplified and used as total input DNA control.

Tiling microarray design

The Arabidopsis genome assembly release 5.0 by TAIR was used to design the tiling-path microarray. Probes were selected using the NASA Oligonucleotide Probe Selection Algorithm (NOPSA; Stolc et al. 2005) and arrays were printed with the NimbleGen Maskless Array Synthesizer system. A total of 390,580 36 bp probes, excluding highly repetitive regions, positioned on average every 250 bp along one strand of the genome were synthesized on a single array. This tiling array platform can be found in the NCBI gene expression omnibus (GEO) database with the accession ID GPLXXXX.

Probe hybridization

Microarray chips were prehybridized with prehybridization buffer (5× SSC, 0.1% SDS, 1% BSA) at 42°C for 1 h. Arrays were then washed twice with water and hybridized with labeled DNA in hybridization buffer (5× SSC, 0.1% SDS, 0.5 mg/mL BSA, 0.5 mg/mL ssDNA) for 14–16 h at 50°C. After hybridization, the arrays were washed in 2× SSC and 0.1% SDS, in 0.2× SSC and 0.1% SDS and in 0.2× SSC at room temperature for 10 min each and dried by centrifugation (1,000 rpm, 1 min). Microarrays were scanned with a GenePix 4200A scanner (Axon), and independent TIFF images for both Cy-3 and Cy-5 channels were acquired at a 5 μm resolution.

Microarray data analysis

First, microarray raw data were processed by the LOWESS normalization module in the LIMMA R package (Smyth et al. 2005) to balance the hybridization intensities for biases between Cy3 and Cy5 fluorescence. As a second step, the Quantile normalization module in the NMPP package (Wang et al. 2006) was used to remove the experimental variances across all the replicate arrays and samples.
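The second step can be illustrated with a minimal sketch of quantile normalization, which forces the intensity distributions of all arrays onto a common empirical distribution. This is not the NMPP implementation: the function name quantile_normalize and the toy intensities are hypothetical, and ties are simply broken by probe order.

```python
from typing import List


def quantile_normalize(arrays: List[List[float]]) -> List[List[float]]:
    """Quantile-normalize equally sized arrays (one list of intensities per array)."""
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must contain the same number of probes")

    # Sort each array, then average across arrays at every rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # Probe indices ordered by intensity within this array
        # (ties are broken by probe order, a simplification).
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            # Replace each intensity by the cross-array mean at its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Hypothetical intensities for two arrays with three probes each.
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization every array contains the same set of values, so differences between replicates reflect probe rank rather than overall array brightness.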

Identification of genomic regions enriched for H3K9ac- or H3K9me2-modified histones was based on the Wilcoxon signed-rank test (Hollander and Wolfe 1999). We performed this test on paired data points (ChIP/Input) within a sliding window of ±1 kb around each probe to determine whether hybridization intensities of the ChIP-enriched DNA samples were significantly greater than those of the total genomic DNA input samples. The algorithm of the Wilcoxon signed-rank test is described in the Affymetrix statistical algorithms description document. To control the error from multiple testing, the P values generated by the Wilcoxon signed-rank test were adjusted by the Benjamini and Hochberg false discovery rate procedure (Benjamini and Hochberg 1995). A modified genomic region was defined by combining adjacent probes with an FDR-adjusted P value below 0.05, allowing a maximal gap of 200 bp and requiring a minimal run of two consecutive probes. We considered a gene to carry a particular modification if the gene and a modified region overlapped by at least 250 bp. The mapping results of H3K9ac and H3K9me2 in this study are publicly available at http://plantgenomics.biology.yale.edu. To characterize the relationships between epigenetic modifications and transcriptional activity we used a previously published expression dataset (Jiao et al. 2005). Raw and processed ChIP-Chip data were deposited in GEO (accession number GSEXXXX).
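The steps that follow the per-probe test (FDR adjustment, merging of significant probes into regions, and the 250 bp gene overlap rule) can be sketched as below. This is an illustrative reading of the procedure, not the code used in the study. The function names are hypothetical, and two points are assumptions: the gap is measured from the end of one significant probe to the start of the next, and coordinates are half-open intervals on a single chromosome.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open (start, end), an assumption of this sketch


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_regions(positions: Sequence[int], adj_p: Sequence[float],
                 probe_len: int = 36, alpha: float = 0.05,
                 max_gap: int = 200, min_probes: int = 2) -> List[Interval]:
    """Merge significant probes into modified regions."""
    regions: List[Interval] = []
    current: List[int] = []  # start positions of probes in the open region
    for pos, p in sorted(zip(positions, adj_p)):
        if p >= alpha:
            continue  # only significant probes contribute to a region
        # Assumed gap definition: end of previous probe to start of this one.
        if current and pos - (current[-1] + probe_len) > max_gap:
            if len(current) >= min_probes:
                regions.append((current[0], current[-1] + probe_len))
            current = []
        current.append(pos)
    if len(current) >= min_probes:
        regions.append((current[0], current[-1] + probe_len))
    return regions


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def gene_is_modified(gene: Interval, regions: Sequence[Interval],
                     min_overlap: int = 250) -> bool:
    """Apply the rule of at least 250 bp overlap with a modified region."""
    return any(overlap_bp(gene, r) >= min_overlap for r in regions)
```

For example, call_regions([0, 150, 1000], [0.01, 0.01, 0.01]) yields a single region covering the first two probes; the third probe is isolated and is dropped by the two-probe minimum.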

ChIP-PCR validation

The ChIP-DNA and input DNA samples were resuspended in 30 μL water, and 1 μL was used for PCR amplification with specific primers as listed in additional file 2. PCR conditions were as follows: 94°C for 2 min, 28–30 cycles at 94°C for 1 min, 58°C for 1 min, 72°C for 1 min, followed by 72°C for 10 min.

Quantitative real-time PCR

ChIP-DNA samples and cDNA samples (synthesized from total RNA) were analyzed by quantitative real-time PCR. Primers were designed to amplify 100–300 bp DNA fragments (additional data files 3 and 4). Quantitative PCR was performed in 96-well plates on an ABI PRISM 7500 fast real-time PCR system using a SYBR PreMix Ex Taq kit (Takara). Cycling conditions were as follows: 10 s at 95°C, 40 cycles of 3 s at 95°C, 34 s at 60°C, 34 s at 72°C, followed by a 60–95°C dissociation protocol. The ChIP enrichment level of each gene was determined as previously described (Frank et al. 2001). For real-time PCR analysis, actin was used as internal control. The 2^(−ΔΔCt) values were calculated relative to reference PCR values (Livak and Schmittgen 2001).
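The relative quantification of Livak and Schmittgen (2001) can be sketched as follows. The function name and the Ct values in the example are hypothetical, and the calculation assumes an amplification efficiency close to 100% for both the target and the reference (here actin) amplicons.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a sample relative to a control.

    Ct values are normalized to a reference gene within each condition,
    then the sample is compared to the control condition.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # Each cycle corresponds to a doubling under the 100% efficiency assumption.
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: mutant (target 22, actin 18) vs wild type (target 24, actin 18).
    print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

A target that crosses the threshold two cycles earlier in the mutant, with an unchanged reference, corresponds to a fourfold increase.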

Results

Identification of genomic regions associated with H3K9ac and H3K9me2

Genomic maps of H3K9ac and H3K9me2 were generated, using chromatin immunoprecipitation (ChIP) and a custom oligonucleotide tiling array. The custom tiling array contained ~390 K 36-mer probes, which covered 82% of the non-repetitive genome of Arabidopsis with an average resolution of ~250 bp. Genomic regions associated with H3K9ac and H3K9me2 were identified by sliding a 1,000 bp window across the interrogated regions and by testing whether hybridization signals of oligonucleotide probes within the window were significantly higher when probed with H3K9ac and H3K9me2 ChIP samples than with input DNA. Probes with a significance threshold of P < 0.05 along with adjacent probes within a maximum gap of 200 bp were selected as positive regions.

Using this methodology, we identified 5,853 H3K9ac regions and 4,009 H3K9me2 regions with median sizes of 656 and 506 bp, respectively. These modified regions covered 3.3 and 1.8% of the Arabidopsis nuclear genome. As expected, H3K9ac regions were highly enriched in the euchromatic regions resembling the distribution of genes while H3K9me2 regions were enriched in heterochromatin (Fig. 1a). Two representative genomic regions of chromosome 2 and one region of chromosome 5 are shown in Fig. 1b to illustrate the distribution of the two modifications for individual genes.

Fig. 1

Genome-wide landscapes of H3K9ac and H3K9me2 modifications in Arabidopsis. a Chromosomal distribution of H3K9ac, H3K9me2. The average ratio of each modification per 100 kb region (Y-axis) along each chromosome (X-axis) is shown. b Distribution of H3K9ac, H3K9me2, and annotated genes along three representative genomic regions of chromosomes 2 and 5. The two lower tracks show the log10-transformed P values for each probe. A TE-related gene is marked in light blue while non-TE genes are marked in pink. c Conventional ChIP-PCR assays on randomly selected genes to validate the array analysis of target regions

To validate the accuracy of our ChIP-Chip method, we performed conventional ChIP-PCR on a random collection of 45 candidate genomic loci. These loci were either positive or negative for H3K9ac or H3K9me2, or positive for both modifications (additional data file 5). This conventional approach successfully validated 93% (42 out of 45) of the selected loci. Representative ChIP-PCR results are shown in Fig. 1c. These ChIP-PCR results confirmed our array data and their prediction of modified regions.

H3K9ac largely targets active Arabidopsis genes while H3K9me2 predominantly modifies transposons

A systematic analysis of the locations of H3K9ac regions revealed a striking correlation with the presence of annotated genes. Based on the criteria described in “Materials and methods”, 5,100 expressed genes with known or predicted functions, 106 expressed genes with unknown functions, and 321 TEs were found to be associated with H3K9ac. This is in stark contrast with H3K9me2, with which 1,031 expressed genes with known or predicted functions, 81 expressed genes with unknown functions, and 2,280 TEs were found to be associated (Fig. 2a; additional data files 1A, 1B, and 1C). Based on the genome annotation version 7 of Arabidopsis released by TAIR, our tiling array results identified 19.3% of non-TE genes (5,206 out of 27,029) as positive for H3K9ac and 4.1% of non-TE genes (1,112 out of 27,029) as positive for H3K9me2.

Fig. 2

Distribution of H3K9ac and H3K9me2 within Arabidopsis genes. a Frequencies of H3K9ac (left) and H3K9me2 (right) modified regions in known and unknown non-TE genes and TE-related genes. b Comparison of distributions of H3K9ac within short, intermediate, long non-TE genes and TE genes aligned at their translation start sites (ATG). In this study, a promoter region is defined as the 2,000 bp region upstream of the ATG. The Y axis shows the averaged ratios of the signals from the samples enriched for each modification to the signal from total genomic DNA. The average transcription start site of genes is shown by the 124 bp bar. c Comparison of the distribution of H3K9me2 within non-TE and TE genes aligned at their translation start sites (ATG). The Y axis shows the averaged ratios of the signals from the samples enriched for each modification to the signal from total genomic DNA

H3K9ac is enriched downstream of translation start site of target genes while H3K9me2 exhibits no preference along gene territory

Furthermore, we examined the distribution of H3K9ac and H3K9me2 within genes aligned at their translation start sites (ATG). The gene territory was divided into bins of 500 bp and the average ratio of a specific modification was plotted for each bin. To better understand the potential effect of gene length, we divided all selected non-TE genes into three categories according to their length: short (<1 kb), intermediate (1–2 kb) and long (>2 kb). For each gene category, a distinct peak of histone acetylation was detected downstream of the start codon (ATG) along with decreasing acetylation in the coding region (Fig. 2b). Interestingly, a lower level of H3K9ac was observed for short genes in this region. This peak distribution near the translation start site was not detected for H3K9me2 targets (Fig. 2c). It was evident that H3K9ac avoided TEs while H3K9me2 was enriched in TEs (Fig. 2a, b).
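The binning described above can be sketched as follows. This is a simplified illustration rather than the analysis code of the study: the function name is hypothetical, all genes and probes are assumed to lie on one chromosome, and the window of 2,000 bp upstream and 3,000 bp downstream of the ATG is an arbitrary choice for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def average_profile(genes: List[Tuple[int, str]],
                    probes: List[Tuple[int, float]],
                    bin_size: int = 500,
                    upstream: int = 2000,
                    downstream: int = 3000) -> Dict[int, float]:
    """Average ChIP/input ratio per bin, with genes aligned at their ATG.

    genes:  (ATG coordinate, strand), strand being "+" or "-".
    probes: (probe coordinate, ChIP/input ratio).
    Returns a mapping from bin start (relative to the ATG) to mean ratio.
    """
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for atg, strand in genes:
        sign = 1 if strand == "+" else -1
        for pos, ratio in probes:
            # Offset is positive downstream of the ATG on either strand.
            offset = (pos - atg) * sign
            if -upstream <= offset < downstream:
                # Floor division keeps upstream bins negative (e.g. -500).
                bin_start = (offset // bin_size) * bin_size
                sums[bin_start] += ratio
                counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}
```

Running the same function separately on the short, intermediate and long gene sets would give one profile per length category.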

H3K9ac positively correlates with gene expression while H3K9me2 does not

H3K9ac and H3K9me2 target genes were analyzed with respect to their expression levels using a previously published microarray dataset (Jiao et al. 2005). The genes targeted by each modification were classified into three groups of high, medium and low expression based on their normalized expression intensities. The genes in each category were aligned at their translation start sites (ATG) and their averaged modification level ratios were plotted. As shown in Fig. 3a, highly expressed genes possessed higher levels of H3K9ac near their translation start sites, while target genes that were expressed at significantly lower levels contained less H3K9ac modification. However, no significant correlation could be put forward between the presence of H3K9me2 and gene expression, since a uniform and low level of H3K9me2 was observed along gene territories of all three types of non-TE genes (Fig. 3b).

Fig. 3

Relationships between gene expression, H3K9ac and H3K9me2. a Comparison of distributions of H3K9ac within non-TE genes aligned at their translation start sites (ATG) and belonging to different expression intensity categories: low (1,000–3,000), medium (5,000–7,000) and high (9,000–11,000). The Y axis shows the averaged ratios of the signals from the samples enriched for each modification to the signal from total genomic DNA. b Comparison of the distribution of H3K9me2 within non-TE genes aligned at their translation start sites (ATG) and belonging to different expression intensity categories: low (1,000–3,000), medium (5,000–7,000) and high (9,000–11,000). The Y axis shows the averaged ratios of the signals from the samples enriched for each modification to the signal from total genomic DNA

H3K9ac can span gene body and promoter, while H3K9me2 is restricted to either gene body or promoter

To investigate the positional effects of H3K9ac or H3K9me2 on transcriptional activity, we divided all non-TE genes that were positive for H3K9ac (5,206 genes) or H3K9me2 (1,112 genes) into three groups: only promoter-modified, only body-modified, as well as both promoter- and body-modified. This classification revealed that ~64% of the H3K9ac target genes were modified in both body and promoter, ~35% were only body-modified, and ~1% was exclusively promoter-modified (Fig. 4a). In contrast, the H3K9me2 modification was either only in target gene promoters (46.4%) or only in gene bodies (48.8%), while it rarely occurred in both promoter and gene body of the target genes (4.8%) (Fig. 4a).
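The three-way classification can be sketched as follows, taking the promoter as the 2,000 bp upstream of the ATG as defined in this study. The function name and the example coordinates are hypothetical, and applying the 250 bp minimum overlap separately to promoter and body is an assumption of this sketch.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end)


def classify_gene(atg: int, gene_end: int, strand: str,
                  regions: List[Interval],
                  promoter_len: int = 2000,
                  min_overlap: int = 250) -> str:
    """Label a gene by where modified regions overlap it."""
    if strand == "+":
        promoter = (atg - promoter_len, atg)
        body = (atg, gene_end)
    else:
        # On the minus strand the promoter lies at higher coordinates.
        promoter = (atg, atg + promoter_len)
        body = (gene_end, atg)

    def overlap(a: Interval, b: Interval) -> int:
        return max(0, min(a[1], b[1]) - max(a[0], b[0]))

    in_promoter = any(overlap(promoter, r) >= min_overlap for r in regions)
    in_body = any(overlap(body, r) >= min_overlap for r in regions)

    if in_promoter and in_body:
        return "both promoter and body"
    if in_promoter:
        return "only promoter"
    if in_body:
        return "only body"
    return "unmodified"
```

A single region spanning the ATG, such as (4500, 5500) for a plus-strand gene with its ATG at 5000, overlaps both intervals by 500 bp and is therefore counted in the combined class.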

Fig. 4

H3K9ac and H3K9me2 modifications in gene promoter and body regions. a Numbers of ‘‘only promoter’’, ‘‘only body,’’ and ‘‘both promoter and body’’ H3K9ac-modified non-TE genes (left) and H3K9me2-modified non-TE genes (right). b Expression levels of ‘‘only promoter’’, ‘‘only body’’, ‘‘both promoter and body’’ modified, and ‘‘unmodified’’ genes containing H3K9ac, or H3K9me2. The Y axis shows the average expression level calculated from a previously published expression dataset (Jiao et al. 2005)

Following this classification, the average expression intensities of each gene group along with a control group composed of unmodified genes were compared for H3K9ac and H3K9me2. This analysis clearly demonstrated that among H3K9ac target-genes, body-modified or both body- and promoter-modified genes were expressed at relatively high levels. On the other hand, unmodified genes were expressed at much lower levels (Fig. 4b). Since the group of genes possessing H3K9ac solely in their promoters only contained 28 genes, we did not compare their expression with that of other gene groups (Fig. 4b). Moreover, genes lacking H3K9me2 had a higher average expression intensity compared to genes harboring this modification, suggesting that H3K9me2 correlates with gene suppression (Fig. 4b).

Bivalent modification of H3K9ac and H3K9me2 in the same gene

We observed a bivalent modification of specific genes by both H3K9ac and H3K9me2, even though these two modifications were reported to have opposite effects on transcriptional activity (Pfluger and Wagner 2007). To further clarify if the occurrence of one modification affects the occurrence of the other, we compared their target gene sets. On the one hand, we observed that 40.1% (446 out of 1,112) of the non-TE genes targeted by H3K9me2 were H3K9ac-modified as well. On the other hand, 8.6% (446 out of 5,206) of the H3K9ac target genes were also H3K9me2-modified (Fig. 5a). Considering the genic distributions of 19.3% for H3K9ac and 4.1% for H3K9me2 reported in this study, these results suggest there is a significant degree of overlap between these two epigenetic marks. However, this study does not provide evidence that these two modifications can coexist at the same nucleosome or histone, but only that they can occur at the same gene.
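The comparison with the genic distributions can be made explicit with a small calculation. The sketch below uses the gene counts reported above and, as a null model that is an assumption of this illustration rather than a test performed in the study, treats the two marks as independent across the 27,029 non-TE genes. The function name is hypothetical.

```python
from typing import Tuple


def overlap_enrichment(n_a: int, n_b: int, n_both: int, n_total: int) -> Tuple[float, float]:
    """Expected overlap under independence and observed/expected ratio."""
    expected = n_a * n_b / n_total
    return expected, n_both / expected


if __name__ == "__main__":
    # Counts from this study: H3K9ac targets, H3K9me2 targets,
    # genes carrying both marks, and all annotated non-TE genes.
    expected, fold = overlap_enrichment(5206, 1112, 446, 27029)
    print(f"expected overlap: {expected:.0f} genes, observed/expected: {fold:.1f}")
```

Under this null model about 214 genes would be expected to carry both marks, so the observed 446 genes correspond to roughly a twofold excess.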

Fig. 5

Relationship between H3K9ac and H3K9me2 modified gene sets. a Venn diagram comparing groups of non-TE genes containing H3K9ac and H3K9me2 modifications. b Venn diagrams comparing groups of non-TE genes that are only body-modified by H3K9ac and promoter-modified by H3K9me2. c Venn diagrams comparing groups of non-TE genes that are only body-modified by H3K9ac and body-modified by H3K9me2

To further clarify the impact of a possible co-occurrence of these two marks we looked at their respective positions along the gene territories. This analysis revealed that H3K9me2 promoter-modified genes were unlikely to be only body-modified by H3K9ac (Fig. 5b). However, ~34% (200 out of 596) of the genes that were H3K9me2 body-modified were also H3K9ac body-modified (Fig. 5c). This suggests that genes are more likely to have both H3K9me2 and H3K9ac body modifications, while genes that are H3K9me2 promoter-modified are less likely to be also H3K9ac body-modified.

Correlations between combinations of multiple histone modifications and target gene expression status

To investigate a possible interplay among multiple histone modifications, we combined and analyzed the 4,517 H3K27me3-modified genes identified in 10- to 14-day-old seedlings by Zhang et al. (2007) and the 4,268 DNA-methylated genes identified in the above-ground parts of 25-day-old plants by Zilberman et al. (2007), together with our 5,206 H3K9ac-modified genes and 1,112 H3K9me2-modified genes identified in 6-day-old seedlings. These previously published studies reported genic distributions of 20% for DNA methylation and 17% for H3K27me3. By comparing these data sets, we found that the fraction of H3K9ac target genes that also contained H3K27me3 was significantly lower (410 out of 5,206; 9.8%) than the average genic distribution (17%) reported for this modification by Zhang et al. (2007), while the fraction of H3K9ac target genes that were also DNA methylated (1,114 out of 5,206; 21.4%) was found to be very similar to the value (20%) reported by Zilberman et al. (2007) (Fig. 6a, b). On the other hand, the fraction of H3K9me2 target genes overlapping with genes targeted by DNA methylation (304 out of 1,112; 27.3%) was higher than the reported value, while the fraction of H3K9me2 target genes also containing H3K27me3 (179 out of 1,112; 16.1%) was at the genome average (Fig. 6a, b). Collectively, these results indicate a negative correlation between the occurrence of H3K9ac and H3K27me3 and a positive correlation between the occurrence of H3K9me2 and DNA methylation.

Fig. 6

Combinatorial effects of epigenetic modifications on transcription in Arabidopsis non-TE genes. a Numbers of non-TE genes containing H3K9ac, H3K9me2 and H3K27me3 regions and various combinations of these modifications. b Numbers of non-TE genes containing H3K9ac, H3K9me2 and DNA methylation regions and various combinations of these modifications. c Expression of non-TE genes containing various combinations of H3K9ac and H3K9me2 and H3K27me3. The Y axis shows the average transcript abundance within each group. Expression dataset from Jiao et al. (2005). d Expression of non-TE genes containing various combinations of H3K9ac and H3K9me2 and DNA methylation. The Y axis shows the average transcript abundance within each group. Expression dataset from Jiao et al. (2005)

It is conceivable that the observed correlations could reflect the potential transcriptional status of a given gene. Thus, we calculated the effects on gene expression of possible combinations of these four epigenetic modifications. Our results indicated that genes that were solely H3K9ac modified were highly expressed. However, lower expression levels were observed when H3K9ac target genes were also targeted by DNA methylation, H3K9me2, or H3K27me3 (Fig. 6c, d). In addition, genes that did not bear an H3K9ac modification mark, but which were possibly modified by a combination of either H3K9me2 and H3K27me3 or H3K9me2 and DNA methylation, were expressed at lower levels compared to genes modified by DNA methylation or by H3K27me3 alone (Fig. 6c, d). These results strongly suggest that H3K9me2, H3K27me3 and DNA methylation can repress the positive effect of H3K9ac on gene expression and that a combination of repressive modifications is probably even more effective.

Increased gene expression and H3K9ac modification levels in a hda19 mutant

To address whether decreasing HDA19 expression could influence the level of H3K9ac, we selected seven target genes belonging to different biological processes and performed real-time ChIP-PCR assays in wild type and hda19 seedlings. Our results indicated that decreasing HDA19 expression increased the H3K9ac level of the seven selected genes (Fig. 7a). To investigate whether the change of the H3K9ac level of these seven genes in a hda19 mutant background correlated with gene expression activity, we conducted an expression analysis of the seven genes under study in wild type and hda19 mutant seedlings using real-time PCR. Our results showed that the expression levels of these seven selected genes were increased in hda19 mutants compared to wild type (Fig. 7b), suggesting that HDA19 influences the H3K9ac levels and consequently the expression levels of these genes in Arabidopsis.

Fig. 7

Histone 3 lysine 9 acetylation state and expression analysis of seven genes in wild type and hda19 mutant seedlings. a Histone 3 lysine 9 acetylation state of seven selected genes in Col-0 wild type and hda19 mutant seedlings. Immunoprecipitates were obtained with an antibody specific to acetylated histone H3 lysine 9 and were analyzed by real-time PCR. Relative amounts of the PCR products were calculated and normalized with respect to the input chromatin. b Expression analysis of seven selected genes in wild type and hda19 mutant seedlings. Total RNA was isolated from light-grown seedlings, reverse transcribed, and subjected to quantitative real-time PCR. Relative transcript abundance was calculated and normalized with respect to the actin transcript level. Error bars represent standard deviation (SD) values from at least three repetitions. The seven selected genes are: 1, At1g65380 (CLV2); 2, At4g16280 (FCA); 3, At5g63860 (UVR8); 4, At2g22670 (IAA8); 5, At2g01570 (RGA1); 6, At3g23150 (ETR2); 7, At2g26990 (CSN2)

Discussion

H3K9ac is restricted to euchromatin while H3K9me2 is found in both euchromatic and heterochromatic regions

Our work revealed that H3K9ac is widely distributed in the Arabidopsis genome, and shows that H3K9ac targets 5,206 non-TE genes. As expected, H3K9ac was restricted to euchromatin and was associated with active genes (Figs. 1b, 2a). This observation is consistent with the fact that histone acetylation has been genetically and biochemically linked to transcriptional activation (Kuo et al. 1998; Wu et al. 2000). Conversely, the H3K9me2 level was significantly higher in heterochromatin than in euchromatin (Fig. 1a). This distribution pattern is in agreement with the proposed role of H3K9me2 in heterochromatin formation and/or maintenance (Johnson et al. 2002; Houben et al. 2003; Bernatavichute et al. 2008). To our surprise, 1,112 non-TE genes were found to be H3K9me2-modified, suggesting that this epigenetic mark is also an important gene silencing mechanism in plants that regulates a rather large number of genes.

H3K9ac modification correlates with high gene expression but with a plant-specific feature

A systematic analysis of the locations of lysine 9 acetylated sites revealed a striking correlation with the 5′ ends of annotated genes. These data are consistent with recent studies, which documented a similar distribution in yeast, human and mouse, species in which the 5′ end was the most highly acetylated region within a gene (Kurdistani et al. 2004; Pokholok et al. 2005; Sinha et al. 2006; Liu et al. 2005; Bernstein et al. 2005; Roh et al. 2004, 2005, 2006). However, in Arabidopsis, the level of H3K9ac at the translation start site was correlated with gene length, since short genes presented a lower modification level (Fig. 2b), while longer genes presented higher H3K9ac levels. This is consistent with a previous study, which reported that highly expressed genes tend to have more and longer introns than genes expressed at a low level in both rice and Arabidopsis (Ren et al. 2006). Taken together, this could be a unique plant feature, since these observations are in contrast with previous findings in yeast (Sinha et al. 2006), and suggest that even if the general correlation of H3K9ac with transcriptional activation is conserved during species evolution, the mechanism of action and the way this modification is established and maintained differ widely.

Combinatorial effect of multiple modifications on gene expression level

In this study, we found that 40.1% of the genes modified by H3K9me2 (446 out of 1,112) were also modified by H3K9ac (Fig. 5a) and that the expression levels of these genes were similar to those only bearing H3K9ac (Fig. 6c, d). Interestingly, genes with H3K9me2 but lacking H3K9ac are expressed at much lower levels. Hence, our findings support the presence of a balancing mechanism, in which the dual recruitment of a repressing and a stimulating modification at H3 lysine residue 9 makes the transcription efficiency more stable and less sensitive to fluctuations. However, when the equilibrium is broken and the repressive mark becomes predominant, the transcriptional activity is repressed.

This balancing act is not the only example of a relationship between two epigenetic marks. It is well established that chromatin compaction and gene silencing involve a functional relationship between the plant DNA-methylation system and H3K9 methylation (Lippman et al. 2004; Houben et al. 2003). A similar cooperation between methyl-H3K9 and DNA methylation in mammals was found in heterochromatin regions and major satellite repeats (Sims et al. 2003), while in euchromatic regions or at the individual gene level, results have been less clear (Umlauf et al. 2004; Gartler et al. 2004; Zhao et al. 2005; Zhang et al. 2005; Yan et al. 2001). Our finding that a significant fraction of the H3K9me2 target genes was also DNA-methylated supports a cooperation model between these two repressive marks. This is further reinforced by the fact that the fraction of H3K9ac target-genes that also contained DNA methylation was within reported values, suggesting that this cooperation is specific to the methylation of lysine residue 9.

Recently, H3K27me3 was reported as an important silencing mechanism independent of DNA methylation and RNA interference that targets 17% of the non-TE Arabidopsis genes (Zhang et al. 2007). Our analysis established that only 9.8% of the H3K9ac target genes were also targeted by H3K27me3 (Fig. 6a), while 16% of the H3K9me2 target genes were found to be H3K27me3-modified (Fig. 6b). This lower than average fraction of H3K9ac target genes also modified by H3K27me3 is not surprising since both modifications have opposite effects on gene transcription and are probably antagonistic to each other. A similar situation was reported for human T-cells, in which H3K27me3 regions did not overlap with chromatin regions presenting high levels of H3K9ac and H3K14ac (Roh et al. 2006).

HDA19 affects both H3K9ac and gene expression levels

Fifteen histone deacetylase (HDA) genes have been predicted to be encoded in the Arabidopsis genome and only a few were reported to be involved in a specific biological process (Verbsky and Richards 2001). One of them, HDA19 (also named HD1) is involved in light signaling, growth, and development. Plants expressing a mutant allele of HDA19 exhibit a variety of developmental abnormalities. These abnormalities are believed to be the results of a perturbation in histone acetylation (Wu et al. 2000; Tian and Chen 2001; Tian et al. 2003). Our data showed that the H3K9ac level and expression activity of selected genes were affected in the hda19 mutants at the seedling stage (Fig. 7), which suggests that HDA19 is mainly responsible for the deacetylation of lysine 9 of histone H3, therefore playing an important role in the regulation of gene expression.

We have mapped the distributions of H3K9ac and H3K9me2 in the Arabidopsis genome at a 250 bp resolution. The data presented here will help in understanding how H3K9ac and H3K9me2 regulate the expression of individual genes and will facilitate a systematic insight into the overall structure and function of chromatin. The results presented here are consistent with previous findings in yeast, mouse, human and plants, and support a model in which H3K9ac enhances gene expression, and in which H3K9me2 is an important silencing system. Collectively, these results suggest that although certain epigenetic mechanisms might differ between plants and animals, the chromatin distributions and functions of H3K9ac and H3K9me2 seem to be widely conserved.