Introduction

Alternative splicing of pre-mRNA plays a major role in both normal development and cancer progression. By hijacking and leveraging the complex and tightly regulated process of alternative splicing, cancer cells are able to acquire many of the “hallmarks of cancer” [1]. Prostate cancer (PCa), the most diagnosed cancer in men in the USA, is no exception. There have been several comprehensive review articles detailing the important role of alternative splicing in PCa progression and aggressiveness [1,2,3,4,5,6,7]. These reviews, however, do not address the critical topic of alternative splicing in PCa health disparities. PCa exhibits dramatic race/ethnic disparities as African American (AA) men have significantly higher risk, morbidity, and mortality compared to European American (EA) men. In this review, we will summarize some of the major molecular mechanisms and alternative splice events in PCa, as well as introduce our recent study elucidating the important role of differential alternative splicing in mediating PCa disparities.

Prostate Cancer Health Disparities

PCa is the most diagnosed cancer in men in the USA and accounts for over one-fifth of all newly diagnosed cancers in men [8]. More than 164,000 new cases are diagnosed each year, and PCa is the second leading cause of male-cancer-related deaths annually. PCa also has the highest heritability of any cancer at 10% [9]. In addition to family history, well-established risk factors for PCa include Lynch syndrome, age, and race/ethnicity [10, 11]. Despite increased screening and overall decreasing mortality rates of PCa, AA men have significantly higher rates of PCa incidence, high-risk cancer, and mortality [12]. AA men are 1.7 times more likely to be diagnosed with PCa and have a 2.4 times greater mortality rate compared to EA men [13]. This mortality ratio is larger than that of any other malignancy in the USA [14]. Additionally, PCa appears to develop at an earlier age in AA men, who present with significantly higher prostate-specific antigen (PSA) plasma levels and more clinically advanced disease, and who develop higher grade metastatic disease at a three- to four-fold greater rate [15,16,17,18]. This health disparity has been attributed to epidemiological differences in socioeconomic status, health-seeking behavior, access to healthcare, and treatment plans [15, 16]. Even after adjusting for clinical and epidemiological factors, however, AA men still have significantly higher occurrence and mortality rates [19,20,21]. This disease disparity suggests that genetic ancestry plays an important role in PCa incidence, progression, and aggressiveness.

Molecular Differences in African American Prostate Cancer

Multiple studies have shown genetic and biological differences in prostate tumors in AA and EA patient populations. TMPRSS2-ERG gene fusions and PTEN deletions were once thought to be characteristic of all prostate tumors. However, recent reports have shown that these genetic alterations occur at a much lower frequency in AA PCa. Only 20–30% of AA PCa tumors contain TMPRSS2-ERG gene fusions compared to 40–50% in EA patients [22], and loss of PTEN was observed in 34% of EA men and only 18% of AAs [23].

Genome-wide association studies (GWAS) have identified multiple loci that confer a greater risk for PCa in AA men compared to EA men. The rs1447295 variant at the 8q24 locus has been associated with earlier diagnosis and increased risk in AA patients [24]. Six other variants (rs16901979, rs7000448, rs6983267, rs111906932, rs114798100, and rs111906923) have also been linked to increased PCa risk in AA men [25, 26]. African ancestry-specific PCa risk alleles have been identified at chromosomes 13q34 and 22q12 [27]. Additionally, a risk variant at the 17q21 locus has been found more frequently in men of African descent compared to other populations [28]. Many of these alleles reside within long noncoding RNA sequences.

Single nucleotide polymorphisms (SNPs) in genes that regulate androgen and testosterone metabolism have also been linked to PCa disparity in AA men. Polymorphisms in the cytochrome P450 enzyme CYP17 increase the risk of PCa in AA men by 60% [29]. A homozygous “CC” genotype in the 5′ promoter region (rs743572) in AA men is clinically associated with advanced PCa disease [30].

In terms of the cancer transcriptome, AA PCa has been shown to exhibit increased expression of genes that promote growth (e.g., EGFR and AKT1) and metastasis (e.g., CXCR4 and BMP2) compared to EA PCa [31,32,33]. For the IL-6 gene, a race-specific and anti-correlated expression pattern is observed during PCa progression. Namely, EA PCa has increased expression of IL-6 compared to EA normal prostate, while IL-6 is downregulated in AA PCa compared to AA normal prostate [34]. Exogenous treatment with IL-6 downregulated TP53 in AA PCa cell lines and upregulated expression of a splice variant of MBD2, promoting a cancer stem-like cell phenotype [34]. Additionally, AA PCa exhibits an increased inflammatory signature, including increased expression of inflammatory genes (e.g., CCR7) and more frequent copy number variations of genes related to the immune response (e.g., IL-27, ITGAL, and ITGAM) [31, 33, 35, 36].

AA PCa cell lines and patient specimens have distinct miRNA profiles compared to EA PCa. AA PCa cell lines have increased expression of hsa-miR-26a compared to EA cell lines derived from tumors of similar stage and grade [37]. Theodore et al. [38] showed decreased expression of five miRNAs due to hypermethylation of CpG islands within promoter regions in AA PCa. Of particular interest, miR-152 had significantly lower expression in AA patients versus EA patients (in both nonmalignant and malignant tissue). Ectopic over-expression of miR-152 in PCa cell lines downregulated expression of DNMT1 by binding to the 3′UTR of the mRNA, leading to decreased proliferation, migration, and invasion.

Ten miRNAs have been identified that exhibit enriched or depleted expression in AA versus EA PCa [39]. These miRNAs, including miR-133a (AA depleted), miR-513c (AA depleted), and miR-96 (AA enriched), were computationally predicted and experimentally shown to target key genes known to promote cancer, such as MCL1, STAT1, and FOXO3A. Ectopic treatment of PCa cell lines with AA-depleted miRNA mimics (for miR-133a and -513c) or AA-enriched miRNA antagomirs (for miR-96) resulted in decreased proliferation, invasion, and caspase activity. In agreement with these in vitro findings, AA PCa specimens showed significantly increased expression of MCL-1 and STAT1 and decreased expression of FOXO3A compared to EA PCa samples.

The role of epigenetics in PCa disparities is also being explored. Using quantitative pyrosequencing, Kwabi-Addo et al. [40] and Devaney et al. [41] revealed increased gene promoter methylation in AA PCa specimens compared to EA PCa. RARβ2, SPARC, TIMP3, NKX2-5, ABCG5, and SNRPN genes were all found to be highly methylated in AA PCa samples and cell lines. Tang et al. [42] identified an association between increased RARB and APC methylation and increased PCa risk in AA men.

Alternative Splicing

An area of research that has recently garnered considerable attention with the advent of genome-wide approaches (e.g., exon arrays and RNA-Seq) is the role of alternative splicing (AS) in cancer and cancer disparities. AS is the major mechanism for post-transcriptional regulation of gene expression, mRNA diversity, and protein modification. During AS, introns are typically excised from the precursor mRNA (pre-mRNA) and the remaining exons can be joined together in different combinations to produce multiple unique mature mRNA transcripts from a single gene. It is estimated that over 90% of human genes transcribe pre-mRNAs that undergo AS with an average of five unique mRNA variants per coding gene. This generates a proteomic complexity of ~100,000 distinct protein isoforms from ~20,000 protein-coding genes. Types of splicing events include exon skipping (removal of specific exons), cryptic exon expression, selection between two mutually exclusive exons, exon scrambling, intron retention, alternative 5′ or 3′ splice sites (altering boundaries between introns and exons), alternative promoters (which can alter reading frames), and alternative polyadenylation sites (Fig. 1). This is a highly complex and flexible system that responds to cell type, tissue type, developmental stage, physiological system, and disease state.
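As a rough illustration of how exon skipping alone multiplies transcript diversity, the following Python sketch enumerates the exon combinations for a toy four-exon gene and reproduces the back-of-the-envelope isoform estimate quoted above. The gene, its exon names, and the choice of which exons are treated as skippable ("cassette") exons are hypothetical and serve only as an example; real genes combine several event types at once.

```python
from itertools import combinations


def cassette_isoforms(exons, cassette):
    """Return every exon combination obtained by skipping any subset of
    the cassette exons while keeping all other exons in order."""
    isoforms = []
    cassette = list(cassette)
    for k in range(len(cassette) + 1):
        for skipped in combinations(cassette, k):
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms


# Hypothetical gene: E1 and E4 are constitutive, E2 and E3 can be skipped.
exons = ("E1", "E2", "E3", "E4")
isoforms = cassette_isoforms(exons, cassette=("E2", "E3"))
for isoform in isoforms:
    print("-".join(isoform))

# Estimate from the text: ~20,000 protein-coding genes with an average
# of five mRNA variants each.
genes = 20_000
variants_per_gene = 5
estimated_isoforms = genes * variants_per_gene
print(estimated_isoforms)
```

With two independent cassette exons the toy gene already yields four transcripts; each additional skippable exon doubles that number, which is why a modest average of five variants per gene is enough to reach the ~100,000 figure.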

Fig. 1 Schematic representations of different types of splicing events. Exons are depicted as rectangles and introns as solid lines. Broken lines represent splicing events. Abbreviations: Poly(A), polyadenylation site. Designed on https://prosite.expasy.org

AS generates a variety of protein isoforms with different sequences and altered functions from the same gene, promoting diversification of the transcriptome and proteome at both the species and interspecies levels. Although not all AS variants are functional, many can have similar or different functions, different stability kinetics, alternative subcellular localizations, or encode isoforms that are susceptible to different post-translational modifications (e.g., phosphorylation and ubiquitination). By altering the repertoire of splice variants within a cell in a time- and/or spatial-dependent manner, AS can lead to protein isoforms with different interactome networks by promoting or inhibiting different DNA–protein, protein–protein, protein–ligand, and protein–drug interactions.

Splicing events are regulated by cis-acting sequences (splice sites, splicing enhancers or silencers, and branch points) located within the pre-mRNA and 30–500 trans-acting factors of the spliceosome, including small nuclear RNAs (snRNAs) and RNA-binding proteins (RBPs). AS is also strongly influenced by RNA polymerase kinetics, chromatin modifications, chromatin structure, epigenetic modifications (e.g., DNA and/or RNA methylation), nucleosome occupancy, location of cis-elements, secondary structure of pre-mRNA, and sequence editing [43, 44].

Cis-regulatory sequences are divided into two groups: splice sites that are required for spliceosome binding and binding sites for other RBPs. Sequences within exons (5′ and 3′ splice sites) and within introns (branch point and polypyrimidine sequences) designate exon–intron boundaries for the spliceosome. These splice sites can be constitutive (always recognized as splice sites) or alternative. The strength of a splice site is important for splicing accuracy and frequency. Strong splice sites contain consensus sequences that are well recognized by the spliceosome and thereby undergo splicing at a high rate. Weak splice sites rely on cis-acting sequences and cell context for splicing to occur. Splicing regulatory elements (SREs) include intronic or exonic splicing enhancers (ISE, ESE) or silencers (ISS, ESS). These provide binding sites for trans-acting factors, such as splicing factors (SF).
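The notion of splice site strength can be made concrete with a deliberately simplified Python sketch that scores a candidate 5′ splice site by the fraction of positions matching the consensus CAG|GTAAGT (the last three exon bases followed by the first six intron bases). The scoring scheme and the example sequences are assumptions made for illustration; published splice site predictors rely on position weight matrices or more elaborate statistical models rather than a plain match count.

```python
# Consensus 5' splice site in DNA alphabet: 3 exonic + 6 intronic bases.
CONSENSUS_5SS = "CAGGTAAGT"


def five_prime_ss_score(site):
    """Toy strength score: fraction of positions identical to the consensus."""
    if len(site) != len(CONSENSUS_5SS):
        raise ValueError("expected a 9-nucleotide window around the exon-intron boundary")
    site = site.upper().replace("U", "T")  # accept RNA or DNA input
    matches = sum(a == b for a, b in zip(site, CONSENSUS_5SS))
    return matches / len(CONSENSUS_5SS)


# Hypothetical candidate sites.
strong_site = "CAGGTAAGT"  # identical to the consensus
weak_site = "TTGGTCTCA"    # keeps the GT dinucleotide but little else

print(f"strong: {five_prime_ss_score(strong_site):.2f}")
print(f"weak:   {five_prime_ss_score(weak_site):.2f}")
```

In this toy model a low-scoring site corresponds to a weak splice site, i.e., one whose use would depend on nearby enhancers or silencers and on cell context.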

The spliceosome is composed of five snRNPs and over 200 SFs and auxiliary proteins. The snRNPs (U1, U2, U4, U5, and U6) are the core components of the spliceosome ribozyme and are responsible for recognizing splice sites. The spliceosome also contains DEAD/H-box RNA-dependent ATPases that allow for changes in RNA–RNA base pairing [45]. A splicing event begins with U1 binding the 5′ splice site, the SF3b complex within U2 binding the branch point site, and the U2AF1 and U2AF2 auxiliary proteins binding the 3′ splice site. U1 and U2 interact to form the pre-spliceosome. Next, U4, U5, and U6 are recruited, the spliceosome rearranges, U1 and U4 are released, and the spliceosome becomes activated. In the first splicing reaction, the phosphodiester bond at the 5′ splice site is cleaved via nucleophilic attack from the adenosine in the branch point site. The intron then forms an intermediate lariat structure, and the phosphodiester bond at the 3′ splice site is cleaved via nucleophilic attack by the free 3′ hydroxyl group of the upstream exon on the phosphate of the 3′ splice site. Finally, the two exons are ligated together and the intron lariat is released [46].
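The two catalytic steps described above can be followed on a toy sequence. The Python sketch below treats a pre-mRNA as a plain string and models only the bookkeeping of the two reactions (which bonds are cut and which pieces are joined); the sequences, the coordinate convention, and the function name `splice` are hypothetical, and none of the snRNP recognition or rearrangement steps is simulated.

```python
def splice(pre_mrna, five_ss, branch_point, three_ss):
    """Toy model of the two splicing reactions.

    five_ss      -- index of the first intron base (5' splice site)
    branch_point -- index of the branch point adenosine
    three_ss     -- index of the first base of the downstream exon
    Returns (mature_mrna, intron_lariat).
    """
    if not five_ss < branch_point < three_ss:
        raise ValueError("branch point must lie inside the intron")
    if pre_mrna[branch_point] != "A":
        raise ValueError("branch point must be an adenosine")

    # Reaction 1: the branch point adenosine attacks the 5' splice site,
    # freeing the upstream exon and leaving a lariat intermediate
    # (intron still attached to the downstream exon).
    upstream_exon = pre_mrna[:five_ss]
    lariat_intermediate = pre_mrna[five_ss:]

    # Reaction 2: the free 3' hydroxyl of the upstream exon attacks the
    # 3' splice site, ligating the exons and releasing the intron lariat.
    intron_length = three_ss - five_ss
    intron_lariat = lariat_intermediate[:intron_length]
    downstream_exon = lariat_intermediate[intron_length:]
    return upstream_exon + downstream_exon, intron_lariat


# Hypothetical pre-mRNA: exon 1, an intron starting with GU and ending
# with AG, and exon 2.
exon1 = "AUGCAG"
intron = "GUAAGUUACUAACUUUUCAG"
exon2 = "GUCUGA"
pre_mrna = exon1 + intron + exon2

five_ss = len(exon1)
branch_point = five_ss + intron.index("UACUAAC") + 5  # second A of the AA pair
three_ss = five_ss + len(intron)

mature_mrna, lariat = splice(pre_mrna, five_ss, branch_point, three_ss)
print(mature_mrna)
print(lariat)
```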

Trans-acting RBPs, such as SFs and auxiliary proteins (e.g., SF1 and U2AF), complex with the spliceosome to add additional flexibility and complexity to the splicing process. RBPs bind cis-regulatory sites to promote or inhibit splice site recognition, an effect that depends on the location of binding (e.g., within intronic or exonic sequences, upstream of an alternative exon, and within a downstream intron), cellular context, regulation by other RBPs, and expression level of the RBP [47, 48]. The most well-studied trans-acting factors are the serine/arginine-rich SF (SRSF) and heterogeneous nuclear ribonucleoprotein particle (hnRNP) families. SRSFs are composed of two RNA recognition motifs at the N-terminus and an arginine/serine-rich (RS) domain at the C-terminus that is involved in protein–protein interactions. SRSFs are generally considered positive splicing regulators. They promote exon inclusion by preferentially binding to purine-rich ESE or ISE sequences and recruiting U1 to 5′ splice sites and U2AF to 3′ splice sites [49]. SRSF protein kinases (SRPKs) and CDC-like kinases (CLKs) activate SRSFs by phosphorylation in the cytoplasm or nucleus, respectively. The hnRNP family is largely classified as negative splicing regulators. Like SRSFs, they have two RNA recognition motifs; however, their protein–protein interaction domains are unstructured. HnRNPs promote exon skipping by binding ESS and ISS sequences and inhibiting recognition of splice sites. They may also prevent spliceosome assembly after 3′ splice site recognition via steric hindrance of snRNPs.

SRSFs and hnRNPs have more nuanced roles than exclusively positive or negative splicing regulators [50, 51]. Their effect on splicing can depend on several factors, such as the location of the binding site. For example, SRSFs enhance splicing when binding to sequences within exons and repress splicing when bound to introns [49]. The functional consequences of SF binding can also be influenced by cell differentiation, cell fate, tissue identity, organ development, and disease state [52].

Alternative Splicing and Cancer

All components of the splicing process are tightly regulated, and any alteration can lead to disease causation and progression. Splicing dysregulation is known to activate oncogenes and inactivate tumor suppressors. Gene expression program changes via aberrant splicing in cancer cells select for functional changes that promote the malignant progression of the tumor [53]. Modifications in the splice sites or splicing machinery can lead to DNA damage, genomic instability, changes in epigenetics, alterations in transcriptional elongation, and changes in gene expression, thus helping to promote any of the “hallmarks of cancer” [54,55,56]. Splice variants are being used to characterize tumor subtypes and are targets of interest for cancer biomarkers and therapeutics [57]. Due to the potential for functional differences, individual AS variants need to be studied separately to better understand each variant’s role in disease progression. In addition, because a change in one trans-acting factor can affect the splicing of hundreds of transcripts, an understanding of the overall splicing changes will be instrumental in identifying the role of AS in cancer.

The Cancer Genome Atlas (TCGA) data have been used to identify genome-wide AS events in cancer versus normal tissues and between different tumor subtypes and stages. Globally, AS events occur more frequently than somatic mutations in driver genes. AS also occurs more often in cancer-related pathways and in genes that are frequently mutated in cancers [58, 59]. Analysis of TCGA data has also identified key somatic mutations in splice sites that affect exon–intron boundaries, resulting in changes in expression of oncogenes and tumor suppressors in cancer [60]. In general, splicing of proto-oncogenes generates constitutively active or gain of function variants that confer an increased oncogenic advantage. Synonymous mutations, which can alter splice sites, are also more highly enriched in oncogenes. Conversely, AS of tumor suppressors can introduce premature stop codons and altered reading frames, resulting in decreased protein levels via nonsense-mediated decay or decreased function. Cancer cells have increased levels of intron retention in tumor suppressor transcripts which promote premature termination, nonsense-mediated decay, and tumor suppressor inactivation [61,62,63]. Mutations in splice sites or splice site choice can result in isoform switching or generation of novel splice variants [64]. Thus, somatic mutations in key genes or splice sites involved in AS may be a major driver in many cancers.

Differential splicing can generate variants with opposing functions or shift the balance between two isoforms. For example, while the full-length isoform of caspase-9 is pro-apoptotic, a shorter isoform missing exons 3–6 is anti-apoptotic and has been identified in cancers, including non-small cell lung carcinoma [65]. SRSF1, which is overexpressed in many cancers, binds within intron 6 to promote inclusion of exons 3–6 to generate the long variant [66]. Conversely, hnRNPL binds an ESS in exon 3 and induces splicing exclusion of exons 3–6 to generate the short variant [67]. Kinases such as AKT are predicted to phosphorylate and activate both SRSF1 and hnRNPL [68]. The known tumor suppressor gene TP53 has over seven different splice variants that have been detected in a variety of cancers [69]. Splicing events are concentrated in the 5′ and 3′ ends and result in alternative promoter selection, exon skipping, intron retention, alternative 5′ and 3′ splice sites, or alternative reading frames. These P53 isoforms can inhibit full-length P53, impair growth or senescence suppression, and are associated with decreased patient survival.
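A shift in the balance between two isoforms, such as the long and short caspase-9 variants, is often summarized as the fraction of transcripts that include the alternative region. The Python sketch below computes this "percent spliced in" (PSI) value from read counts and the change between two conditions. PSI is a commonly used metric but is not one introduced in this review, and the read counts are invented purely for illustration.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Fraction of informative reads supporting inclusion of the
    alternative region (0 = always skipped, 1 = always included)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no informative reads for this splicing event")
    return inclusion_reads / total


# Hypothetical read counts for an exon 3-6 cassette in two samples.
psi_normal = percent_spliced_in(inclusion_reads=80, exclusion_reads=20)
psi_tumor = percent_spliced_in(inclusion_reads=30, exclusion_reads=70)
delta_psi = psi_tumor - psi_normal

print(f"PSI normal: {psi_normal:.2f}")
print(f"PSI tumor:  {psi_tumor:.2f}")
print(f"delta PSI:  {delta_psi:+.2f}")
```

A negative change in PSI in this made-up example would indicate a shift toward the short (exon-skipped) isoform in the tumor sample.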

Frequently, tumors display alterations in trans-acting factors. Perturbations in the expression level, localization, activity, or degradation of RBPs, SFs, or their upstream regulators can vary dramatically between different cancers. While hnRNPA2/B1 is an oncogenic driver in glioblastoma via splicing of tumor suppressors IG20 and MST1R (RON) [70], RBM4 controls apoptosis, proliferation, and migration as a tumor suppressor in a variety of other solid tumors [71]. The most common SF mutations in hematological and solid tumors are heterozygous missense gain or alteration of function mutations in SF3B1, U2AF1, and SRSF2 and homozygous loss of function mutations in ZRSR2 [54, 72]. SF3B1 is a member of the SF3b complex within the U2 snRNP of the spliceosome. Mutations have been observed in the 3′ splice site of the SF3B1 pre-mRNA resulting in nonsense-mediated decay, which commonly occurs in breast cancer [73]. Mutations in the zinc finger domains of U2AF1, a U2 small nuclear RNA auxiliary factor, are frequently identified in non-small cell lung cancer [74]. Missense mutations are often observed in SRSF2 that change its binding affinity to ESE sequences and are common in chronic myelomonocytic leukemia [75]. ZRSR2 is a zinc finger RBP in the U12 minor spliceosome complex. Mutations that introduce in-frame stop codons or disrupt the reading frame are common in myelodysplastic syndrome [76].

Due to the pivotal role of AS in cancer, many researchers are focusing their efforts on identifying or developing molecules that target aberrant AS in cancers [44, 54, 77]. Potential targets for these therapies include mutations in splice sites, cis-regulatory elements, and promoter or coding regions of trans-acting factors. Due to the high mutation rates observed in cancers, SF3B1 is a major target for splicing-modulating drugs. In addition, upstream factors such as SF kinases and specific splice isoforms of oncogenes or tumor suppressors are also attractive targets. Cancers that rely on splicing activity are ideal candidates for AS-targeted therapy. For example, MYC-driven cancers rely on the spliceosome, through BUD31, for promoting oncogenesis [78]. BUD31 associates with SF3B1, U2AF1, and other core spliceosome factors. Inhibition of the spliceosome via spliceosome inhibitors or BUD31 depletion downregulates survival, tumor growth, and metastatic potential of breast cancers driven by MYC.

A variety of therapeutic compounds are used to target AS. The most well-known are antisense oligonucleotides (ASOs), which are composed of nucleotides or analogs that hybridize with a complementary nucleic acid sequence. By carrying the sequence complementary to the target, ASOs can potentially block splice sites via steric hindrance, target mRNAs for degradation, redirect splicing, or prevent trans-factors from binding. ASOs have gained traction in treating Duchenne muscular dystrophy (DMD), spinal muscular atrophy (SMA), and amyotrophic lateral sclerosis (ALS) [44]. In oncology, two ASOs, AZD9150 targeting STAT3 and AZD4785 targeting KRAS, are in clinical trials for advanced and metastatic solid tumors [79, 80].

Small molecule inhibitors (SMIs) have been designed to target SF kinases and spliceosome components. SRPIN340, which targets SRPKs, and TG-003, which targets CLKs, cause decreased activity of SFs and subsequent decreased expression of “splice-correct” signaling proteins, such as VEGF and p70-S6K [81, 82]. ML315, a chemically modified quinazoline probe, selectively inhibits the CLK family as well [83]. Cp028 has been shown to inhibit intermediate stage spliceosome assembly by causing the early release of U4/U6 [84].

Natural products and their derivatives have also shown promise in targeting different stages of AS. Leucettine L41, derived from the marine sponge product leucettamine B, is an ATP-competitive inhibitor against CLK1 and CLK3 and has been shown to inhibit phosphorylation of SRSF4 and SRSF6 [85]. Another natural product, N-palmitoyl-L-leucine, targets late stage spliceosome assembly [86].

Indole derivatives, such as benzopyridoindoles and pyridocarbazoles, alter SF-ESE-dependent splicing in key oncogenic genes such as MST1R [87, 88]. The RBP RBM39 is a target of sulfonamides derived from para-aminobenzoic acid. Mutations in RBM39 and resistance to indisulam are common in leukemia and lymphomas. These mutations block the complex formation of RBM39 with the CUL4-DCAF15 ubiquitin ligase complex, halting the normal proteasomal degradation of RBM39 and resulting in aberrant pre-mRNA splicing [89]. SRPIN340 is an isonicotinamide compound shown to inhibit expression of SRPK1 and a pro-angiogenic VEGF variant [90]. Derivatives of the natural compound FR901464 have shown promising ability to inhibit SF3B. These analogs include spliceostatin A, meayamycin, and sudemycin [91,92,93].

Pladienolide-scaffold derivatives have had the most success in clinical trials. E7107, derived from pladienolide B, inhibits SAP130 of the SF3B complex [94]. This weakens the binding interaction between U2 and the pre-mRNA by locking SF3B1 in an inactive conformation and sterically preventing binding to the branch point adenosine [95]. E7107 was one of the first splicing modulator drugs to enter clinical trials in solid tumors in 2007 [96, 97]; however, further studies in humans were suspended due to unexpected toxicity. H3B-8800, another pladienolide derivative, selectively inhibits wild-type and mutated SF3B1 isoforms and enriches for intron retention in SF-coding mRNAs [98]. Trials using H3B-8800 in hematological cancers have been ongoing since 2016.

Our currently limited understanding of overall “splicing sickness,” restoration of normal splicing, and downstream effects of spliceosomal mutations needs to be addressed in order to develop new AS drugs. Overcoming issues of systemic delivery, toxicity, off-target effects, efficacy, and targeting of the desired cell type will be key to splice-modulating therapies becoming a safe and efficacious therapeutic option for cancer patients.

Alternative Splicing in Prostate Cancer

A number of genes undergoing AS have been associated with PCa development and progression. The androgen receptor (AR), a nuclear steroid hormone receptor that plays a major role in normal prostate homeostasis and PCa development, is the primary target for early PCa treatment. PCa tumors, however, develop resistance to AR-targeted treatment (i.e., become castrate-resistant) as the disease progresses. One mechanism by which PCa tumors develop drug resistance is through AS of the AR. Among the 20 different AR splice variants identified, ARv7 is the most clinically frequent and relevant variant. The ARv7 variant is generated by inclusion of a cryptic exon within intron 3 and encodes a protein isoform with a truncation of the entire C-terminal ligand binding domain (LBD). The LBD is important for AR activation by androgens and subsequent translocation of the AR into the nucleus for transcriptional regulation of androgen-dependent genes. The ARv7 isoform acts independently of androgen binding and is constitutively present in the nucleus of prostate cells, regardless of androgen stimulation [99]. Levels of ARv7 mRNA in PCa patients can help predict responsiveness to anti-androgen therapies, such as abiraterone and enzalutamide [100]. The SF hnRNPA1 and RBP SAM68 are believed to contribute to the regulation of the ARv7 variant. Relocalization of hnRNPA1 from the nucleus to the cytoplasm decreases expression of ARv7 in PCa cells and resensitizes them to enzalutamide [100,101,102,103]. SAM68 preferentially increases expression of ARv7 in PCa cells by stabilizing the ARv7 mRNA through direct RNA–protein binding and through indirect mediation by SRSF1 [104].

A second clinically relevant AR splice variant, ARv567es, has been identified in PCa cells where exons 5–7 (of 8 total) are skipped, truncating the majority of the LBD. Similar to the ARv7 isoform, ARv567es is constitutively active and androgen independent [105]. This variant is highly expressed in metastatic and malignant prostate tissue [106]. ARv567es regulates oncogenes involved in cell cycle progression including UBE2C, which codes for a ubiquitin-conjugating protein involved in the machinery that inactivates the mitotic checkpoint and promotes proliferation [107].

The fibroblast growth factor receptor (FGFR) 2 undergoes AS of the third Ig-like extracellular domain, generating two isoforms: FGFR2IIIb and FGFR2IIIc. FGFR2IIIb is expressed highly in normal prostate epithelial cells and is a known tumor suppressor. FGFR2IIIc is involved in autocrine signaling and expressed more highly in mesenchymal cells. While no change in overall FGFR2 protein expression is observed as PCa progresses [108], a switch in FGFR2 isoforms occurs due to AS. Decreased expression of the IIIb isoform and exclusive expression of the IIIc isoform are associated with epithelial-to-mesenchymal transition (EMT) and loss of AR sensitivity [109]. This increase in FGFR2IIIc expression correlates with an increase in fibroblast growth factor (FGF) 8b, a ligand associated with PCa [110].

The vascular endothelial growth factor (VEGF) is largely responsible for cellular growth and survival via angiogenesis in both normal and cancerous conditions. “Canonical” splicing of VEGF produces a VEGF isoform that is pro-angiogenic, while an alternative 3′ splice site event generates an anti-angiogenic isoform VEGF165b, which is the main isoform. VEGF165b differs from pro-angiogenic VEGF in the last six amino acids and acts as an antagonist of the VEGF receptor [111]. Expression of pro-angiogenic VEGF is an early driver of PCa, and increased expression corresponds with later stage PCa and increased expression of SRSF1 [112]. Inhibition of the SF kinase SRPK1, a known activator of SRSF1, causes splice switching toward VEGF165b in PCa cells and decreased tumor formation in PCa mouse models [113].

Bcl-x plays a pivotal role in regulating apoptosis. Alternative 5′ splice site usage within exon 2 of the BCL2L1 pre-mRNA generates two variants that have opposing functions. The long anti-apoptotic isoform, Bcl-x(L), is associated with cell survival, while the shorter isoform Bcl-x(S) promotes apoptosis, cell death, and sensitivity to chemotherapeutics in PCa [114]. High Bcl-x(L) to Bcl-x(S) ratios have been observed in PCa and over-expression of the short isoform induces apoptosis-mediated cell death in cancer cells [1, 6]. SAM68 selectively favors the upstream 5′ splice site, thus favoring production of the BCL2L1 long variant and preventing apoptosis. Increased expression of Bcl-x(L) has been identified in PCa patients and cell lines, resulting in decreased apoptotic-induced cell death and decreased sensitivity to cytotoxic therapeutics [115].

Splice variants of another apoptosis-related gene, SH3GLB1, which codes for the BAX-binding protein Bif-1, have recently been implicated in the transition of adenocarcinoma to aggressive treatment-induced neuroendocrine (t-NE) PCa. Bif-1a, the pro-apoptotic protein isoform encoded by a variant lacking exon 6, is the predominant isoform expressed in adenocarcinoma specimens [116]. Bif-1b (encoded by a variant containing a short version of exon 6) and Bif-1c (encoded by a variant containing a long version of exon 6), however, become highly expressed in t-NE PCa. This switch in dominant variant expression is regulated by the SF SRRM4.

Cyclin D1 (CCND1) associates with cyclin-dependent kinase 4 (CDK4) to promote cell cycle progression through the G1 phase. Two alternative splice variants of CCND1 have been identified: the cyclin D1a mRNA, which is the full-length and more common variant, and the cyclin D1b variant, in which intron 4 is retained, leading to early termination. The cyclin D1b protein isoform plays a distinct role as an AR co-regulator to promote expression of AR-dependent genes associated with tumor growth and metastasis in PCa, specifically SNAI1 [117]. Additionally, increased expression of SRSF1 in PCa cells correlates with enhanced expression of cyclin D1b, but not D1a [118, 119].

ST6GalNac1 is an enzyme that synthesizes the sialyl-Tn (sTn) antigen and modifies the glycosylation pattern of cell surface glycoproteins that play a role in cell adhesion and metastasis. ST6GalNac1 is androgen-sensitive, thus indicating a role for this enzyme in PCa. Recently, RNA-seq data have identified a splice variant of ST6GalNac1 encoding a shorter protein isoform that has only been reported in PCa [120]. The short isoform results from the inclusion of an additional exon (exon 2) within the 5′ UTR that generates a new start codon. The resulting mRNA variant is longer but encodes a shorter, fully functional protein isoform that has increased expression compared to the full-length protein isoform, whose transcript lacks exon 2. In vitro studies suggest a role for the short isoform in promoting EMT through decreased cell adhesion and increased cell motility.

A new splice variant of PCSK6 has been identified in PCa. PCSK6 codes for the proprotein convertase PACE4 that modifies proprotein substrates in secretory and known oncogenic pathways. Couture et al. [121] identified a variant of PCSK6 with a shorter 3′UTR via AS of exon 25 (variant known as PACE4-altCT). While both the full-length PACE4 and the shorter PACE4-altCT are expressed in PCa specimens, PACE4-altCT showed increased expression in higher grade tumors. Additionally, PACE4-altCT appears less susceptible to degradation and secretion, is more stable, more rapidly activated, and increases growth and proliferation when compared to the full-length protein. PACE4-altCT directly increases the processing of pro-GDF15 (i.e., prostate differentiation factor), a TGFβ ligand with a known role in immunosuppression, protection against radiation-induced cell death, and neovascularization.

Three splice variants of CLK1 have been identified: full-length CLK1, CLK1T1 (skipping of exon 4), and CLK1T2 (retention of intron 4). CLK1 is responsible for phosphorylating and activating SRSFs and other SFs. Both T1 and T2 isoforms lack the catalytic domain and are inactive. CLK1T2 has been found to be the more prominent isoform in PCa. Treating PCa cells with a CLK1 inhibitor shifts the CLK1 variant expression ratio to favor expression of both full-length active CLK1 and the pro-apoptotic variants of CASP9, MCL-1, BCL2L1, and survivin [122].

The HSD17B4 gene encodes 17β-hydroxysteroid dehydrogenase type 4 (17βHSD4), an enzyme involved in testosterone and dihydrotestosterone metabolism. Recently, five splice variants of HSD17B4 were identified, four of which encode enzyme isoforms that do not inactivate testosterone and dihydrotestosterone via conversion to inert steroid products [123]. The remaining isoform, isoform 2, is the major enzyme expressed in prostate tissue and is able to inactivate androgens. The splice variant encoding isoform 2 of 17βHSD4 is missing part of exon 2 and all of exon 3, which code for sections of the short-chain alcohol dehydrogenase domain. This isoform was found to be functionally suppressed in metastatic castration-resistant PCa.

As outlined above, AS plays an important role in PCa development, progression, and drug resistance. While it is fairly well accepted that AA PCa is genetically different from EA PCa, the role of AS in PCa disparities is less clear. Over 60% of the studies cited above used only cell lines derived from EA patients (Table 1). Among the remaining studies, two cell lines (22RV1 and M12) of mixed or self-reported AA ancestry were utilized. The 22RV1 cell line was derived from the CWR22 line, a primary prostatic carcinoma serially transplanted in nude mice [124, 125]. A recent study determined that 22RV1 has only 41% AA ancestry [126]. The M12 line was immortalized from the P69SV40T cell line via transfection with SV40 T antigen and passaged in nude mice [127, 128]. While the ancestry of the M12 line has not been confirmed by genotyping, the parental cell line was reported to be derived from prostate epithelial cells from a 63-year-old AA man. None of these studies used an AA PCa cell line with over 50% AA genetic ancestry, such as MDA PCa 2b or RC77 T/E (74% and 73%, respectively) [126]. Forty-one percent of the studies cited analyzed primary prostatic samples, but none specified the ancestry (genotyped or self-reported) of the patients.

Table 1 Summary of splicing events in prostate cancer

The lack of AA cell lines and patient samples used in AS PCa studies reflects the lack of minority subjects across all cancer, and specifically PCa, research [129]. In order to better understand PCa disparities, eliminate the disproportionate disease burden, provide novel biomarkers, and improve survival and quality of life in AA PCa patients, we must increase the use of AA PCa cell lines and specimens in our research.

Differential Alternative Splicing of PIK3CD in Prostate Cancer Disparities

In order to further our understanding of AS in PCa health disparities, we applied a functional genomics approach to investigate differential AS (dAS) events in AA and EA PCa patients [59]. Twenty AA and 15 EA PCa tumor and matched normal specimens (treatment naïve, Gleason score 6–8, age 49–81) were collected, and samples were analyzed using the Affymetrix Human GeneChip exon array to identify genes undergoing AS (Table 2). We identified 158 unique genes that underwent AS in both AA and EA PCa. These genes, which included TMPRSS2 and AR, therefore appear to be important for PCa development regardless of race. In comparing AA versus EA PCa, 1876 unique genes undergoing dAS were identified, including RASGRP2, NF1, and BAK1. Over 2200 unique genes underwent dAS in AA versus EA normal prostate tissue, suggesting a differential role for these genes in normal prostate homeostasis. We identified splicing events involving 644 genes, including PIK3CD, ITGA4, and MET, that were present in both AA PCa and AA normal tissue, but were absent in EA specimens. We also identified 1575 unique genes (e.g., FGFR3 and TSC2) undergoing dAS in AA PCa versus AA normal, but not EA PCa versus EA normal. These last two comparisons identify two important groups of splicing events: splicing events involving 644 genes that are inherited based on African ancestry, and splicing events involving 1575 genes that occur de novo during PCa progression solely in AA men. Over 70% of dAS events identified in AA PCa versus EA PCa occur in pathways known to contribute to oncogenesis (e.g., cell growth, proliferation, cell survival, cell adhesion, and DNA repair), and the majority were in-frame exon skipping events. Further validation of a subset of genes identified as potential targets for dAS was performed in an additional cohort of 22–25 AA and 21–24 EA specimens. Ninety-one percent of genes chosen for validation via RT-PCR were confirmed.
The exon array results also identified 886 differentially expressed genes in AA versus EA PCa (compared to 1876 dAS genes). These data suggest that dAS is playing a much greater role in AA PCa disparities than differential gene expression.
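The group comparisons described above amount to set operations on lists of gene symbols. The following is a minimal sketch in Python, assuming each comparison has already been reduced to a set of genes. The function names and the toy gene sets (including the placeholder GENE_X) are hypothetical illustrations that reuse a few gene names from the text; they are not the exon array results.

```python
def ancestry_associated(aa_tumor, aa_normal, ea_specimens):
    """Genes whose splicing event is present in both AA PCa and AA normal
    tissue but absent from EA specimens."""
    return (aa_tumor & aa_normal) - ea_specimens


def aa_tumor_specific(aa_tumor_vs_normal, ea_tumor_vs_normal):
    """Genes undergoing dAS in AA PCa versus AA normal, but not in
    EA PCa versus EA normal."""
    return aa_tumor_vs_normal - ea_tumor_vs_normal


# Toy inputs for illustration only (not measured data).
aa_tumor = {"PIK3CD", "ITGA4", "MET", "GENE_X"}
aa_normal = {"PIK3CD", "ITGA4", "MET"}
ea_specimens = {"GENE_X"}

aa_tumor_vs_normal = {"FGFR3", "TSC2", "GENE_X"}
ea_tumor_vs_normal = {"GENE_X"}

print(sorted(ancestry_associated(aa_tumor, aa_normal, ea_specimens)))
print(sorted(aa_tumor_specific(aa_tumor_vs_normal, ea_tumor_vs_normal)))
```

With real data, each input set would come from the corresponding pairwise comparison on the array, and the sizes of the two outputs would correspond to the 644-gene and 1575-gene groups reported above.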

Table 2 Examples of differential alternative splicing events in AA PCa, EA PCa, AA normal, and EA normal specimens

Of the dAS genes identified, we focused on PIK3CD. PIK3CD codes for the p110δ (or PIK3Cδ) catalytic subunit of the class I PI3Ks, which binds the p85 inhibitory subunit. Upon activation by a receptor tyrosine kinase, p110δ phosphorylates phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2), generating phosphatidylinositol 3,4,5-trisphosphate (PIP3), which recruits AKT1 to the cell membrane, thus activating downstream signaling cascades involved in cell growth, survival, and proliferation. The delta isoform of p110 is highly expressed in leukocytes [130, 131]. In addition to the full-length variant PIK3CD-L, which includes all 24 exons, four short PIK3CD splice variants were identified: PIK3CD-Si is missing exon 8 (encoding a domain between the Ras-binding and C2 domains); PIK3CD-Sii is missing exon 20 (encoding part of the catalytic domain) (Fig. 2); PIK3CD-Siii is missing exons 8 and 20; and PIK3CD-Siv has a large deletion spanning the region that encodes the helical domain and part of the catalytic domain.

Fig. 2 Alternative splicing of PIK3CD. Schematic representations of alternative splicing of PIK3CD pre-mRNA. EA patients predominantly express PIK3CD-L which includes exon 20 (blue), while exon 20 is skipped (red) in AA patients to generate PIK3CD-S. Designed on https://prosite.expasy.org

We selected the PIK3CD-Sii variant (identified as PIK3CD-S from here on) for further characterization for two reasons. First, the PIK3CD-Sii variant encodes a protein isoform missing 56 amino acids (Fig. 3) of the catalytic domain which has an important role in p110δ activity. Second, analysis of 494 PCa patients from The Cancer Genome Atlas (TCGA) revealed significantly decreased disease-free survival in patients with high PIK3CD-S/PIK3CD-L expression ratios (p = 0.0052). Although this analysis was performed irrespective of race or tumor grade, these data provide evidence for the clinical relevance of PIK3Cδ-S in PCa.
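The reading-frame consequence of skipping an exon follows from simple arithmetic: a deletion of 56 amino acids corresponds to 56 × 3 = 168 nucleotides, and a skipped segment whose length is a multiple of three leaves the downstream reading frame intact. The sketch below illustrates this check in Python. It assumes that the 56-residue deletion maps to a 168-nt skipped segment with no change at the junction codon; the exon length itself is not stated in the text, and the function names are illustrative.

```python
def is_in_frame(skipped_length_nt: int) -> bool:
    """A skipped segment preserves the reading frame when its length
    is a multiple of three nucleotides."""
    return skipped_length_nt % 3 == 0


def residues_lost(skipped_length_nt: int) -> int:
    """Number of amino acids removed by an in-frame skipping event."""
    if not is_in_frame(skipped_length_nt):
        raise ValueError("frameshift: downstream residues change as well")
    return skipped_length_nt // 3


# Assumed length: 56 missing amino acids x 3 nucleotides per codon.
EXON_LENGTH_NT = 56 * 3

print(is_in_frame(EXON_LENGTH_NT), residues_lost(EXON_LENGTH_NT))
```

An in-frame event of this kind removes a block of residues while leaving the rest of the protein sequence unchanged, which is consistent with a short isoform that lacks part of the catalytic domain but is otherwise intact.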

Fig. 3 Protein isoforms of p110δ due to alternative splicing. Schematic representations of long isoform due to exon 20 inclusion (top) and short isoform due to exon 20 skipping (bottom) of p110δ. Adaptor (p85) binding domain (ABD), RAS binding domain (RBD), C2, helical, and catalytic domains are shown in gray. Key phosphorylation and ubiquitin lysine (K), serine (S), tyrosine (Y), and threonine (T) sites are indicated as diamonds. The catalytic domain region encoded by exon 20 is shown in blue. Designed on https://prosite.expasy.org

We performed siRNA-mediated knockdown of the PIK3CD-L variant in the EA PCa cell line VCaP (which has little to no expression of PIK3CD-S) and knockdown of either the PIK3CD-L or PIK3CD-S variant in the AA PCa cell line MDA PCa 2b (which expresses both variants). Knockdown of PIK3CD-L in VCaP cells results in a significant decrease in invasion, proliferation, and phosphorylation of key downstream signaling proteins (i.e., AKT, mTOR, and S6). In the AA cell line, knockdown of PIK3CD-L enriches expression of PIK3CD-S, leading to increased proliferation, invasion, and phosphorylation of AKT, mTOR, and S6. Not surprisingly, we observe no significant effects on invasion, proliferation, or phosphorylation of signaling proteins after “knockdown” of PIK3CD-S in the EA cell line. However, a significant decrease in invasion, proliferation, and phosphorylation was observed after knockdown of PIK3CD-S (thereby enriching for PIK3CD-L) in the AA cell line.

In order to determine the effect of both variants on drug resistance, we ectopically overexpressed either PIK3Cδ-L or PIK3Cδ-S in two EA cell lines, VCaP and PC-3, and treated cells with the SMI CAL-101. CAL-101 (idelalisib (Zydelig®)) targets p110δ and is approved for treatment of hematological malignancies such as chronic lymphocytic leukemia, follicular B-cell non-Hodgkin lymphoma, and small lymphocytic lymphoma. Treatment of PCa cells overexpressing PIK3Cδ-L with CAL-101 results in a significant decrease in proliferation and in AKT and S6 phosphorylation. CAL-101 treatment of PIK3Cδ-S-overexpressing cells results in no significant suppression of proliferation or of AKT and S6 phosphorylation compared to vehicle-treated cells. PIK3Cδ-S-expressing cells also have greater baseline proliferation than PIK3Cδ-L-expressing cells.

Next, we investigated the effect of both PIK3Cδ isoforms on tumor formation, metastasis, and responsiveness to CAL-101 treatment in non-obese diabetic/severe combined immunodeficiency (NOD/SCID) mice. We observe significantly decreased tumor formation in mice injected subcutaneously with PIK3Cδ-L-expressing PCa cells and treated with 50 mg/kg CAL-101 compared to mice injected with PIK3Cδ-S-expressing cells and treated with CAL-101. Additionally, mice injected with PIK3Cδ-L-expressing cells via tail vein and treated with CAL-101 develop significantly fewer lung metastases compared to mice injected with PIK3Cδ-S-expressing cells and treated with CAL-101. These data suggest that CAL-101 is not effective against the PIK3Cδ-S isoform in vivo.

In order to test the functional differences between the two PIK3Cδ isoforms, we performed co-immunoprecipitation (co-IP) and cell-free kinase assays. Co-IP experiments demonstrate that the PIK3Cδ-L isoform binds with a significantly higher affinity to the p85α regulatory subunit compared to PIK3Cδ-S. We also observe higher activity of PIK3Cδ-S than of PIK3Cδ-L in a cell-free kinase assay, both with and without the p85α subunit present and with or without wortmannin or CAL-101 treatment. This suggests that PIK3Cδ-S activity is not as tightly suppressed by p85α and that PIK3Cδ-S retains kinase activity even in the presence of SMIs such as CAL-101.

What has not been investigated up to this point is the mechanism responsible for the preferential expression of the PIK3CD-S variant in AA PCa. Therefore, we returned to gene expression data generated from previous studies [39, 132] to investigate which upstream SFs may be playing a role in the generation of the PIK3CD-S variant in AA PCa. Interestingly, we identified several SFs, including SRSF2, SRSF7, and HNRNPF, with increased expression in AA PCa compared to EA PCa at both the mRNA and protein levels (data not shown). Moreover, the intronic regions surrounding exon 20 of the PIK3CD pre-mRNA have computationally predicted binding sites for these three SFs. We hypothesized that binding of SRSF2, SRSF7, and/or hnRNPF to flanking regions of exon 20 in the PIK3CD pre-mRNA may facilitate exon 20 skipping, leading to the generation of PIK3CD-S. In order to test this hypothesis, we treated MDA PCa 2b cells with siRNAs targeting SRSF2, SRSF7, or HNRNPF, and observed a decrease in expression of PIK3CD-S and an enrichment of PIK3CD-L (Fig. 4a). We refer to this phenomenon as splice switching. Knockdown of each SF by siRNA was confirmed by western blot analysis (Fig. 4b). Our findings suggest that aberrant expression of SFs may be playing a role in the dAS observed in AA PCa.

Fig. 4 Knockdown of overexpressed splicing factors causes splice switching of PIK3CD variants. (a) siRNA-mediated knockdown of three splicing factors in an AA cell line switches predominant expression of PIK3CD from the −S to the −L variant. Blots were quantified by densitometry and numbers underneath blots represent the −S/−L expression ratio. Shown are representative blots from 3–4 independent experiments. (b) Western blot confirms knockdown of splicing factors at the protein level. Abbreviations: F2, SRSF2; F7, SRSF7; PF, hnRNPF; β, β-actin. Shown are representative blots from 3–4 independent experiments
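The −S/−L expression ratio used to describe splice switching is a ratio of band intensities. The sketch below shows one way such a ratio could be computed and compared between lanes. It assumes background-subtracted densitometry values for each band; the lane names and intensity values are invented for illustration and are not the measured data, and the function names are hypothetical.

```python
def s_to_l_ratio(s_intensity: float, l_intensity: float) -> float:
    """Ratio of the -S band intensity to the -L band intensity in one lane."""
    if l_intensity <= 0:
        raise ValueError("-L band intensity must be positive")
    return s_intensity / l_intensity


def splice_switched(control_ratio: float, knockdown_ratio: float) -> bool:
    """True when the knockdown lane has a lower -S/-L ratio than the control,
    i.e. expression has shifted toward the -L variant."""
    return knockdown_ratio < control_ratio


# Hypothetical (S, L) band intensities per lane, arbitrary units.
lanes = {
    "control siRNA": (300.0, 100.0),
    "SF siRNA": (80.0, 160.0),
}

ratios = {lane: s_to_l_ratio(s, l) for lane, (s, l) in lanes.items()}
print(ratios)
print(splice_switched(ratios["control siRNA"], ratios["SF siRNA"]))
```

Expressing the result as a within-lane ratio, rather than as the absolute intensity of either band, makes lanes comparable even when total loading differs between them.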

Thus, we propose a mechanism in which “normal” expression of specific SFs (SRSF2, SRSF7, and/or hnRNPF) in EA PCa promotes inclusion of exon 20 in the final transcript of PIK3CD-L and generation of the PIK3Cδ-L protein (Fig. 5). This protein isoform is sensitive to CAL-101 and has high binding affinity to the p85α regulatory subunit. In AA PCa, however, increased expression of SFs SRSF2, SRSF7, and/or hnRNPF results in dAS of the PIK3CD pre-mRNA, leading to skipping of exon 20, and subsequent generation of the PIK3Cδ-S protein isoform, which exhibits increased oncogenic signaling and decreased sensitivity to SMIs such as CAL-101 (idelalisib). Of interest, ~50% of patients treated second line with idelalisib for certain B-cell malignancies (e.g., chronic lymphocytic leukemia) will exhibit primary resistance to this SMI [133,134,135]. The mechanism of resistance is currently unknown. We propose that expression of the PIK3Cδ-S protein isoform may be responsible, in part, for this resistance. We are currently performing a high throughput chemical library screen to identify an SMI that will effectively suppress PIK3Cδ-S activity.

Fig. 5 Proposed mechanism for role of aberrant splicing of PIK3CD in PCa disparities. EA cell lines (e.g., VCaP) and patient specimens show “normal”/low expression of SFs, such as SRSF2 (left panel). Normal splicing of PIK3CD pre-mRNA generates the long variant containing exon 20, which encodes the p110δ-L protein that has low oncogenic properties. Aberrant over-expression of SRSF2 in AA cell lines (MDA PCa 2b) and patient samples results in differential alternative splicing of PIK3CD (right panel). This generates p110δ-S protein that has higher oncogenic signaling and is resistant to CAL-101 (idelalisib)

Conclusion

While studies focusing on PCa disparities have increased over the past 10 years, the RNA splicing landscape has not been fully characterized as a potential mechanism for race-related PCa aggressiveness. Our recent study has highlighted genome-wide dAS events occurring specifically in AA PCa. The dAS events in AA PCa are overrepresented in known oncogenic signaling pathways, possibly providing a mechanistic explanation for PCa disparities. While further studies are needed to fully understand the oncogenic capacity of other variants identified in our study (e.g., FGFR3, MET, and TSC2), these AS variants could serve as useful biomarkers for prognostic predictions and for identifying patients who are non-responsive to SMIs. Further characterization of dAS variants in AA PCa patients will provide greater, and much needed, insight into the mechanisms responsible for PCa disparities and possible new leads for therapeutic intervention.