2.1 Introduction

Androgens control many physiological processes as diverse as the development and maintenance of external genitalia, musculoskeletal development, fertility, and sexual behavior. To this end, the androgen receptor (AR), which acts as a ligand-inducible transcription factor, has tissue-specific sets of target genes. In the prostate too, and by extension in prostate cancer cells, the AR has many target genes, which can be not only protein-coding but also miRNA- and even lncRNA-coding genes. In this chapter, we discuss how the AR can locate these genes within the 6.6 × 10⁹ bp of the human genome and how the DNA sequences to which the AR binds codetermine the activity of the receptor.

2.2 The DNA-Binding Domain of the Androgen Receptor

Like all nuclear receptors, the AR has a centrally located DNA-binding domain (DBD). The DBD is the signature domain for the nuclear receptors that can interact autonomously with high affinity and specificity with DNA elements near the androgen target genes (reviewed in [1]). These 15 basepair DNA elements were called Androgen Response Elements (ARE). They will be discussed in more detail in a later section.

2.2.1 Comparison of the AR with the Other Steroid Receptors

The AR and the glucocorticoid, progesterone, and mineralocorticoid receptors (GR, PR, and MR) bind to the same sequences, at least in vitro [2]. In vivo, however, the androgen target genes are different from those of glucocorticoids, progestins, or mineralocorticoids. So, while the consensus sequences for the GRE and ARE are identical, and even individual response elements from different genes can be identical [3], the in vivo responses are different. This is most likely a consequence of tissue-specific expression of the receptors, of tissue-specific metabolism of the activating ligands, and also of differences in coregulator expression as well as of tissue-specific chromatin structure and organization (reviewed in [4]).

The DBDs of the GR, PR, MR, and AR resemble each other more than they resemble the DBDs of the other nuclear receptors. These DBDs consist of two so-called zinc fingers, which are zinc-nucleated modules in which a zinc ion is coordinated by four cysteine residues [5].

2.2.2 The First Zn Finger

When the 3D structure of the AR-DBD was solved, it was immediately apparent that it is nearly identical to that of GR and PR [6]. The carboxyterminal part of the first Zn finger is part of an alpha-helical structure that has the ideal dimensions to enter the major groove of B-helical DNA and make sequence-specific contacts with a DNA segment five to six basepairs long. The DNA-contacting residues are conserved between AR, GR, PR, and MR [5].

2.2.3 The Second Zn Finger

The second Zn finger module does not make sequence-specific contacts with the DNA, but is involved in dimerization. The classical dimerization surface in the two contacting monomers is oriented antiparallel which explains the high symmetry of the dimer. In contrast to the first zinc finger, the residues involved are not completely conserved between the AR, GR, PR, and MR: the AR is the only steroid receptor which has a Serine in the center of the dimerization surface, where the others have a Glycine, the result being that in the GR, PR, and MR dimerization interface there is a hole called “Glycine hole” [6]. While in theory this would lower the dimerization forces, residue swapping between AR and GR indicates that AR dimerization on DNA is not very different from GR (Verrijdt et al. 2006).

2.2.4 The Hinge Region

The DBD and the ligand-binding domain (LBD) of the nuclear receptors are connected via a highly variable hinge region, which was initially thought to function merely as a hinge, allowing flexibility in the relative orientations of the LBD and DBD. However, later work on the AR revealed that this region has many roles [7, 8]. In all steroid receptors, it is known to contain a nuclear localization signal (NLS). It is also the target of activity-controlling posttranslational modifications like phosphorylation and acetylation (reviewed in [9]). Moreover, it plays a role in DNA binding, receptor activity, and intranuclear mobility.

2.2.4.1 Role in Nuclear Translocation

It was first noted in 1993 that a sequence resembling the NLS of the large T antigen of the SV40 virus is present in the AR hinge [10, 11]. NLS mutations affected the intracellular distribution. More recently, a crystal structure of the AR-NLS with importin beta was resolved and this showed that the 629RKLKK633 motif is involved in very specific interactions [12]. Not surprisingly, androgen insensitivity syndrome patients can have mutations in this NLS, but in seeming contradiction are the observations that AR NLS mutations have also occurred in prostate cancer. Moreover, it was shown that these PrCa mutations increased the activity of the AR, although they had a negative effect on the nuclear translocation [7, 12].

2.2.4.2 Role in DNA Binding

When we produced the AR-DBD for the first time, we included a large part of the hinge region, simply because constructs were based on the presence of restriction sites [13]. Later on, more directed cloning by PCR revealed that an AR fragment which only covers the two Zn fingers has no or very low affinity for DNA. The minimal AR-DBD has to include a carboxyterminal extension for high affinity DNA binding [14]. The twelve most aminoterminal residues of the hinge region, covering the 629RKLKK633 motif, sufficed [15]. Unfortunately, the structure of this so-called carboxyterminal extension (CTE) is unclear. It is striking, however, that it colocalizes with the NLS.

2.2.4.3 Role in AR Activity

Further studies of the role of the CTE in the full size receptor revealed initially puzzling data. Indeed, when the 629RKLKK633 motif was deleted, the AR did not seem to enter the nucleus (based on immunocytochemistry) but the androgen responses increased up to sevenfold [7]. So while the AR was apparently absent in the nucleus, the undetectable nuclear amounts of AR clearly were highly active. This was also true when a heterologous NLS was fused to the aminoterminal end of the AR [8]. This seeming contradiction is in accordance with the observation that AR hinge mutations found in prostate cancer result in a more potent AR even when the nuclear translocation is slowed down [16]. Moreover, Gioeli et al. [17] demonstrated the crucial role of hinge region phosphorylation in AR activity control.

2.2.4.4 Role in Intranuclear Mobility

A possible explanation for the increased AR activity came from the observations of the effects of hinge region mutations on the intranuclear mobility of the AR. With Fluorescence Recovery After Photobleaching (FRAP), we observed that not only the distribution of the AR between mobile and immobile fractions but also the residence time of the AR in the immobile fraction was changed [8]. FRAP gives only an overview of the AR population and cannot definitively discriminate between DNA binding and other binding events (e.g., to coactivator complexes). However, the increased mobility and shorter residence times do fit the hypothesis that nuclear receptors, like all transcription factors, cycle on the enhancers, with each cycle having a different function, resulting in a chronological recruitment of complexes involved in writing and reading the histone code, recruiting RNA polymerase, RNA polymerase-modifying enzymes, etc. [18].

In conclusion, these data demonstrate that the CTE or 629RKLKK633 motif is both a nuclear localization signal and an extension of the DBD. In addition, this motif must be recognized by at least one other control mechanism, since it determines the intranuclear mobility of the AR. In the spirit of Occam’s razor, we would predict that the same mechanism determines the transactivation potential of the AR, but at this moment it cannot be ruled out that yet other mechanisms are involved. In this respect, it should be noted that the hinge region is also an interaction site for transcription coregulatory complexes like SWI/SNF [19] or nucleophosmin [20]. Moreover, the N/C interactions are also affected by the hinge region mutations [7, 21].

2.3 Androgen Response Elements

Like most transcription factors, the AR has to locate its specific DNA motifs, which can be present anywhere in the 6.6 × 10⁹ bp of, e.g., the human genome. At a first level, the inactive part of the genome is packed into heterochromatin and thus invisible to most transcription factors. The ARE-containing enhancers near the tissue-specific genes that are androgen targets will be in open chromatin. While this reduces the complexity of the search for AREs considerably, it is still unclear how exactly the AR can find them, although growing evidence points to several pioneering factors like FoxA1 and Nkx3.1 that aid the AR in ARE finding ([22–24]; for more details see chapter by Wang). However, in this section, we will restrict ourselves to the description of the DNA elements that are recognized by the AR DBD, the so-called androgen response elements.
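To give a feeling for the scale of this search problem, the following back-of-the-envelope sketch estimates how often a fixed motif would be expected to occur by chance in a genome of the size quoted above. It assumes uniform, independent base composition and counts both strands separately, which is a simplification (real genomes are neither uniform nor independent, and palindromic motifs would be double-counted); the function name and parameters are illustrative only.

```python
GENOME_BP = 6.6e9  # genome size quoted in the text


def expected_random_hits(motif_length: int,
                         genome_bp: float = GENOME_BP,
                         strands: int = 2) -> float:
    """Expected chance matches of one fixed motif.

    Assumption: each base is drawn independently with probability 1/4,
    so a given motif of length L matches a given position with
    probability (1/4) ** L.
    """
    return strands * genome_bp / 4 ** motif_length


# A single fixed hexamer (e.g. one half-site) versus twelve fixed
# positions (two hexamers, ignoring the three-nucleotide spacer).
hexamer_hits = expected_random_hits(6)
two_hexamer_hits = expected_random_hits(12)
```

Under these assumptions, a single hexamer is expected millions of times by chance and even twelve fixed positions several hundred times, which illustrates why chromatin accessibility and cooperating factors are needed to narrow down the search.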

2.3.1 Definition of an ARE

An ARE is a simple DNA motif, able to convey androgen responsiveness to a heterologous reporter gene through direct binding of the AR. Experimentally defining an ARE involves in vitro binding assays like electrophoretic mobility shift assay (EMSA) and DNase I footprinting on the one hand, and transient or stable transfection data on the other (for more experimental details, see [25]). Ultimate proof for an ARE comes from AR binding demonstrated in chromatin immunoprecipitation (ChIP) assays and AR activity shown in, e.g., a transgenic approach in which the ARE is mutated. Unfortunately, the latter demands a long-term investment. Moreover, deleting one ARE is most likely insufficient to affect the androgen responsiveness of a gene that can be controlled by several androgen-responsive enhancers. However, for the enhancers of the PSA, the C3(1), and the mouse vas deferens protein genes, such proof has been provided in transgenic animal models (reviewed in [3]). Most AR ChIP data to date have been derived from prostate cancer cell lines. However, AR ChIP data have also been reported for epididymis and prostate tissue [26, 27], and these provide more physiological proof of AR binding. AR ChIP-seq data on prostate cancer will no doubt be very informative on DNA binding and how it is modulated by antagonists and other therapeutic strategies against cancer.

2.3.1.1 The Optimal Hexamer Motif for AR Binding

Historically, the consensus high affinity binding sequence for the GR was described to be 5′-AGAACA-3′. After the description of a number of AREs in cellular genes, it became clear that the AR too recognizes this motif (e.g., [28]).

2.3.1.2 AREs Are Hexamer Repeats

It also became clear that most AREs cover a DNA stretch that extends at its 3′ end beyond the 5′-AGAACA-3′-like motif. A consensus sequence showed that 3′ of the high-affinity binding site, a second binding site is present with a similar consensus, but on the other strand and in the opposite direction [29]. This is explained by the fact that the AR binds DNA as a symmetrical dimer (see also above), binding two 5′-AGAACA-3′-like motifs separated by a three-nucleotide spacer and organized as an inverted repeat (Fig. 2.1).
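The inverted-repeat organization can be made concrete with a small sketch that scans a sequence for 15 bp windows resembling two 5′-AGAACA-3′ half-sites separated by a three-nucleotide spacer. This is only an illustration: the mismatch tolerance (`max_mismatches`) and the function names are assumptions chosen for the example, and a sequence match of this kind says nothing about actual AR binding, which has to be tested experimentally.

```python
HALF_SITE = "AGAACA"  # consensus hexamer from the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_classical_are_like(seq: str, max_mismatches: int = 2, spacer: int = 3):
    """List windows resembling an inverted repeat of the half-site.

    Returns (start, window, total_mismatches) tuples. The threshold is
    an arbitrary, illustrative choice.
    """
    seq = seq.upper()
    half = len(HALF_SITE)
    right_consensus = reverse_complement(HALF_SITE)  # same motif on the other strand
    width = 2 * half + spacer
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        left, right = window[:half], window[half + spacer:]
        total = mismatches(left, HALF_SITE) + mismatches(right, right_consensus)
        if total <= max_mismatches:
            hits.append((start, window, total))
    return hits


# Made-up test sequence containing one perfect inverted repeat.
example_hits = find_classical_are_like("CCAGAACAGCATGTTCTGG")
```

Read on the top strand, the downstream half-site of a perfect inverted repeat appears as the reverse complement of the upstream one, which is why the scan compares the right hexamer with `reverse_complement(HALF_SITE)`.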

Fig. 2.1 Sequence logos for the classical ARE and selective ARE, based on AREs for which selectivity was checked in EMSA and in functional analysis

2.3.1.3 Selective ARE

The DBDs of GR, PR, MR, and AR are very similar, with identity of the residues involved in contacting the DNA and high similarity of the dimerization interface. However, since each corresponding hormone has its specific target genes, even in cells where the receptors are coexpressed, efforts have been made to find DNA sequences that are selective for any of the four receptors. DNA motif selections based on PCR amplification of DBD-bound oligonucleotides did not reveal selective elements [29]. It was only through the analysis of a series of AREs isolated from androgen target genes that it became apparent that several of these AREs were not recognized by the GR-DBD. These so-called selective AREs (selAREs) consist of a 5′-AGAACA-3′-like hexamer, followed three nucleotides downstream by a second hexamer. The similarity of this downstream hexamer to 5′-AGAACA-3′ is lower than in the classical AREs (clAREs). Mutation analyses indicated that in the selAREs, the two hexamers have a parallel orientation, rather than the inverted orientation seen for clAREs, GREs, and PREs [30, 31]. This was underpinned by experiments like the one described in Fig. 2.3.

The mutational analyses of a series of selAREs revealed which bases are most important for AR binding and which determine selectivity (i.e., prevent GR binding). Despite this information, it is still difficult to predict from its sequence whether an ARE will fall into the selARE or the clARE group. This is due to several factors: in selAREs as well as clAREs, the guanines and cytosines are at the same positions; the left hexamers have the same orientation; and the downstream hexamer can diverge considerably from the consensus in clAREs as well as in selAREs (Fig. 2.1). Although for many selAREs a change of adenine into thymine at position 3 of the downstream hexamer abolishes selectivity, other selAREs do not have an adenine at this position ([32] and Fig. 2.2). For all these reasons, EMSA and functional analyses are needed to determine whether an ARE is selective or not.
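The point that guanines and cytosines occupy the same positions in both element types can be checked directly on the consensus hexamers. The sketch below compares the downstream hexamer as it would read in a direct repeat (5′-AGAACA-3′) with how it would read in an inverted repeat (5′-TGTTCT-3′, the reverse complement), and counts matches of a candidate hexamer to each. It is not a classifier: the names are illustrative, and, as stated above, selectivity has to be determined experimentally.

```python
DIRECT = "AGAACA"    # downstream hexamer in a perfect direct repeat
INVERTED = "TGTTCT"  # downstream hexamer in a perfect inverted repeat

# 1-based positions at which both orientations carry the same base.
shared_positions = [
    i + 1 for i, (d, v) in enumerate(zip(DIRECT, INVERTED)) if d == v
]


def orientation_scores(downstream_hexamer: str) -> dict:
    """Count matches of a downstream hexamer to each orientation.

    Higher counts mean closer resemblance; they are NOT a prediction
    of selectivity.
    """
    hexamer = downstream_hexamer.upper()
    return {
        "direct": sum(a == b for a, b in zip(hexamer, DIRECT)),
        "inverted": sum(a == b for a, b in zip(hexamer, INVERTED)),
    }


# A made-up, diverged downstream hexamer for illustration.
diverged = orientation_scores("GGCTCT")
```

The two orientations share only the G at position 2 and the C at position 5, and they differ at position 3 (A versus T), which is consistent with the observation that an adenine-to-thymine change at that position can abolish selectivity. A diverged hexamer can score poorly or ambiguously against both, which is one way to see why sequence alone is not decisive.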

Fig. 2.2 Schematic presentation of AR- and GR-DBD binding to classical and selective AREs. The orientation of the monomers and the hexamer-DNA sequences are indicated with arrows. The structure induced by the carboxyterminal extension of the second zinc finger is represented by a triangular extension. For the GR-DBD, this prevents dimerization on selective AREs

2.3.1.4 Role of the Second Zn Finger

For the DBD of the estrogen receptors, residues in the first Zn-finger module dictate higher affinity for 5′-AGGTCA-3′, and in GR, AR, PR, and MR, alternative residues at the same positions dictate high affinity for 5′-AGAACA-3′ [33]. Since the two hexamers that constitute all AREs, selAREs and clAREs alike, resemble the same consensus, it is not surprising that the binding of the AR to selAREs and the nonbinding of the GR to these elements are not determined by differences in the first zinc finger. Indeed, it was when the second Zn finger was swapped between AR- and GR-DBD that the selectivity was also swapped [14]. We concluded that the second zinc finger of the AR allows dimerization on selective elements, while the second zinc finger of the GR does not (Fig. 2.1). Moreover, the 629RKLKK633 motif is necessary but not sufficient to confer high affinity for selAREs [7].

2.3.1.5 ChIP Data Evaluations: The Consensus Revisited

Genomic AR-binding sites (ARBS) have been described by chromatin immunoprecipitation assays [34–36]. Because of limitations of the software used for in silico ARE searches and motif finding, it has been hypothesized that the AR binds not only clAREs and selAREs but also other types of dimeric binding sites in which the two hexamers are organized as direct, inverted, or everted repeats separated by spacers of different lengths. Monomeric AR binding has also been proposed. Careful analysis of six such candidate AREs revealed that they are all either selAREs or clAREs, with three-nucleotide spacers ([32] and Table 2.1). The fact that the downstream half-sites can diverge considerably from the 5′-AGAACA-3′ consensus has been confusing. As shown in Fig. 2.3, a selective ARE can be converted into a classical ARE by enhancing its inverted repeat nature at the less conserved hexamer.

Table 2.1 The position-specific probability matrix derived from those AREs for which AR binding as well as androgen responsiveness has been demonstrated. The use of this PSPM in ARE searching is described in section “ARE search with a position-specific probability matrix (PSPM)”

Fig. 2.3 Change of selectivity of a selective ARE. The ARE is from an AR-binding site near the phosphodiesterase 9 gene [36]. The left upper panel shows an EMSA with DBDs from AR, GR, PR, and MR as indicated. The right upper panel shows the results of an EMSA with a mutant PDE9-ARE in which the inverted repeat nature was increased. The lower panel shows that, while the PDE9 ARE-based reporter is only responsive to androgens and progesterone, the mutant responds to all four steroids. Details on materials and methods are described in Denayer et al. [32] and Kerkhofs et al. [39]

The ARBS in the vicinity of the gene encoding transmembrane protease, serine 2 (TMPRSS2) was first described by ChIP-on-chip [36]. Since the TMPRSS2 upstream sequence is fused to the coding part of oncogenes of the E-twenty-six (ETS) family of transcription factors in over 40% of prostate cancers, its androgen regulation is of particular interest. The TMPRSS2 enhancer situated 13.5 kb upstream of the gene indeed binds the AR. The DNA motif that resembles an ARE and is necessary for androgen responsiveness in transient transfection experiments has, however, very low affinity for the AR. Most likely, cooperativity with other transcription factors, like the pioneering factors discussed in the chapter by Wang, explains how the AR can be recruited to this site. When binding depends on such cooperativity, the AR itself can have low affinity for its binding site, which can make the traditional way of identifying AREs, i.e., by EMSA and transfections, difficult.

Recently, the group of Olli Jänne discovered in AR ChIP-seq data on mouse prostate chromatin that in some cases the AR seems to bind DNA elements as a heterodimer with FoxA1 [27]. This is reflected in the sequence of the mixed elements, which have one 5′-AGAACA-3′ half-site and one FoxA1 half-site. It will be interesting to see whether other transcription factors can act similarly as heterodimers with AR. Because of the high relevance of FoxA1 as a pioneering factor and its deregulation in prostate cancer cells, this atypical DNA binding might be an interesting candidate for the development of targeted antagonists for use in prostate cancer.

2.3.1.6 ARE Search with a Position-Specific Probability Matrix (PSPM)

The differences between selective and classical AREs are so small, and the numbers of known selAREs and clAREs so limited, that relevant separate matrices cannot yet be devised. For the time being, we devised a single matrix based only on AREs for which direct binding as well as functional data are available (Table 2.1). For searching AREs, we use the matrix-scan software [37] available on http://rsat.ulb.ac.be/rsat/. Because the number of false positives increases with fragment length, the use of this approach is limited to genomic fragments of approximately 500 bp. About 75% of the candidate AREs that are indicated by such in silico searches of genomic AR-binding sites were shown to be positive in band shift and functional analyses [32].
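The general idea of scanning a fragment with a position-specific probability matrix can be sketched as follows. This is a minimal illustration and not the matrix-scan implementation: the aligned sites in `TOY_SITES` are invented placeholders (they are not the validated AREs behind Table 2.1), and the pseudocount, uniform background, and score threshold are assumptions made for the example.

```python
import math

BASES = "ACGT"

# Hypothetical aligned 15-mers, for illustration only.
TOY_SITES = [
    "AGAACAGCATGTTCT",
    "AGAACATCGTGTTCT",
    "GGAACAGGGTGTACT",
    "AGTACAAAATGTTCC",
]


def build_pspm(sites, pseudocount=1.0):
    """Column-wise base probabilities from equal-length aligned sites."""
    pspm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pspm.append({base: counts[base] / total for base in BASES})
    return pspm


def log_odds(window, pspm, background=0.25):
    """Sum of log2(P(base at position) / background) over the window."""
    return sum(math.log2(pspm[i][base] / background)
               for i, base in enumerate(window))


def scan(seq, pspm, threshold):
    """Return (start, window, score) for windows scoring >= threshold."""
    seq = seq.upper()
    width = len(pspm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if set(window) <= set(BASES):  # skip windows with N or other symbols
            score = log_odds(window, pspm)
            if score >= threshold:
                hits.append((start, window, round(score, 2)))
    return hits


toy_pspm = build_pspm(TOY_SITES)
toy_hits = scan("TTTT" + TOY_SITES[0] + "TTTT", toy_pspm, threshold=5.0)
```

Because every additional window is another chance for a spurious high score, the number of false positives at a fixed threshold grows with fragment length, which is in line with restricting such searches to short fragments.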

2.4 The SPARKI Model

Although the in vitro data clearly suggested that the AR has a second type of response element, it was difficult to assess the in vivo importance of this alternative mode of DNA binding. Based on the in vitro data on the role of the second Zn finger in selARE binding, and the fact that this receptor fragment is encoded by a separate exon in the AR as well as in the GR genes, we developed a transgenic model in which this exon in the AR gene was replaced by that of the GR gene. The resulting model, called SPARKI for “SPecificity affecting AR Knock In,” expresses an AR that still binds clAREs with high affinity but has lost high affinity for selAREs [38]. In effect, this model can be considered a knockout of selective AREs. These mice only have a phenotype in the male reproductive organs, which are all reduced in size to approximately 60%. No differences were observed in other androgen target tissues like bone, muscle, kidney, or lacrimal glands, so it seems that selAREs are not involved in the anabolic effects of androgens but have a specific role in reproduction.

2.4.1 Role of selAREs in Fertility

The reduced fertility observed in SPARKI is mainly explained at two sites: in the testis, the number of Sertoli cells is reduced and the spermatogenic process seems to be affected at the second meiotic division; in the epididymis, the sperm maturation is impaired and this correlates with the reduced expression of a subset of the androgen-regulated genes in this tissue. Several of these genes have a known role in sperm maturation and we were able to describe selAREs in two of them [39]. Although the prostates of the SPARKI mice are also reduced in size, gene expression comparison with wild type organs did not reveal significant differences, but this needs further analyses. AR ChIP seq data on SPARKI organs will reveal the importance of the second zinc finger in DNA selectivity of the AR.

2.4.2 Role for selAREs in Prostate

Several of the AREs described in AR-binding segments found in human prostate cancer cell lines are selAREs. The fact that the prostate of SPARKI mice is smaller [38] indicates that selAREs have a role in the development of normal prostates, but it still is unclear whether this type of AREs is also involved in the etiology or evolution of prostate cancer.

Interestingly, SRD5A2, the enzyme that converts testosterone into dihydrotestosterone, is itself a target of androgen regulation. Two AR-binding segments in the SRD5A2 gene reported by Hu et al. [26] were demonstrated to contain selective AREs, indicating a possible feedback mechanism [39]. Whether these AREs are also active in prostate and in prostate cancer still remains to be determined.

2.5 Allostery

While the cognate ligands of the AR are testosterone and dihydrotestosterone, the DNA elements can also be considered ligands rather than merely AR docking sites near the androgen target genes. There are several lines of evidence that indicate that the DNA sequence indeed can modulate the activity of the binding AR. In this section, we will discuss a possible pathway of allosteric signaling from the DBD to the LBD.

2.5.1 Differential Effect of Selective Versus Classical AREs

Several features of the AR have been studied by monitoring the effect of point mutations on the functionality of the receptor in reporter assays involving simple AREs. The effects of disrupting the N/C interactions, the sumoylation of the aminoterminal domain, and the role of the glutamine stretch in the control of the overall activity of the human AR were all initially described on reporter genes controlled by clAREs (reviewed in [40]). However, the same analyses performed with reporter genes based on selAREs gave much less pronounced or no effects [41–43]. Clearly, these data demonstrate that the DNA is not a passive partner of the AR but somehow controls its activity.

2.5.2 The DBD–LBD Communications

Many AR mutations have been found in patients with complete or partial androgen insensitivity syndrome (AIS) as well as in biopsies of castration-resistant metastatic prostate cancer [44]. Most of these mutations affect the function of the domain they are situated in. However, some DBD mutations do not affect DNA binding and some LBD mutations do not affect ligand binding. Much to our surprise, a DBD mutation can affect ligand binding and, vice versa, an LBD mutation can affect DNA binding. These mutations are situated at the surface of these domains, pointing away from the DNA or the ligand. Based on modeling of the AR domains on the DBD–LBD coordinates of the crystal structure of PPARγ-RXRα, as well as on docking of the AR-DBD against the AR-LBD, we propose that there is indeed a functional interface between these domains, allowing signals from the DNA to reach the LBD and signals from the ligand to reach the DBD [45]. In living cells too, the AR-LBD stabilizes DNA binding [46]. Final proof of this allostery might come from structural studies of AR dimers bound to DNA.

2.6 Conclusions

The AR was cloned more than 20 years ago. We have learned a lot about its main mechanisms of actions since then. However, we also know that there is still a lot to be discovered, even if we focus on the DNA binding alone:

  • How can the DNA-binding domain and the carboxyterminal extension control the different functions of the AR?

  • How can different DNA sequences affect the activity of the AR: is there a direct interaction between the DBD and other domains? Despite strong indications, this still needs to be proven in structural analyses.

  • Can we exploit the allosteric signals between the DBD and the other domains and translate them in one or more therapeutic targets?

  • What is the exact role of selective AREs in prostate cancer, and in the control of the cell cycle in the primary tumor as well as in the metastases, be it hormone sensitive or castration resistant?