Introduction

Molecular heterogeneity is a characteristic feature of all human cancers. At the most granular level, every cancer can be considered a distinct entity with its own specific pattern of subclonal heterogeneity, DNA methylation, mutations, copy number variations, and RNA expression. However, large-scale mRNA expression profiling studies demonstrated that, at higher levels, human cancers can be grouped into molecular subtypes that share coordinated patterns of gene expression and biological characteristics. The first such studies were performed in diffuse large B-cell lymphoma (DLBCL), the most common subtype of non-Hodgkin B-cell lymphoma, and they identified two molecular subtypes with distinctive gene expression patterns and different survival outcomes [1, 2]. Subsequent examinations of human breast cancers demonstrated that they could also be grouped into subtypes based on differential expression of biomarkers expressed by cells located at different positions within the normal mammary epithelium, reflecting increasing levels of terminal differentiation [3]. More recent studies, including essentially all of the projects led by The Cancer Genome Atlas (TCGA), demonstrated that molecular subtypes could be identified in most human cancers. Importantly, the molecular subtypes were associated with different survival outcomes and responses to chemotherapy and biological therapies, confirming their clinical relevance.

Bladder cancers are particularly heterogeneous tumors. Clinicians have long recognized that bladder cancers progress along two “tracks” that have very different implications for prognosis. Non-muscle invasive bladder cancers (NMIBCs) do not typically pose a threat to survival, but they almost always recur, necessitating lifelong, expensive surveillance and surgical management. On the other hand, muscle-invasive bladder cancers (MIBCs) are clinically aggressive and can progress rapidly to become metastatic to lymph nodes, liver, lungs, bone, and brain, and these cancers are usually fatal. Pathologists have also recognized that the tumors associated with the two clinical tracks tend to display distinct histopathological features. The majority of NMIBCs are enriched with “papillary” features, whereas MIBCs tend to be flat lesions that are usually associated with carcinoma in situ (CIS). At the molecular level, papillary NMIBCs are highly enriched with activating mutations in FGFR3, whereas CIS and MIBC are characterized by high levels of genomic instability and inactivating mutations in TP53.

Heterogeneity in histopathological characteristics adds further complexity to bladder cancer diversity. Many tumors contain additional variant histopathological characteristics, including squamous, sarcomatoid, small cell/neuroendocrine, micropapillary, and plasmacytoid differentiation, and tumors containing these variants tend to be associated with more aggressive clinical phenotypes. Identifying these variants can be highly subjective, posing challenges for diagnosis, and levels of inter-observer variability can therefore be disturbingly high.

Molecular Subtypes of Bladder Cancer

Following the lead of investigators working with lymphomas and breast cancer, several groups used large-scale mRNA expression profiling to look for molecular subtypes in cohorts of bladder cancers [4,5,6]. All of the studies concluded that, at the highest level, NMIBCs tended to form one subtype and MIBCs another [4,5,6]. Analyses of the gene expression features that distinguished the two subtypes revealed that the MIBC-associated subtype exhibited higher expression of genes related to cell cycle progression and mitosis, extracellular matrix, and immune cells, whereas ribosomal genes were upregulated in the NMIBC cluster [4]. The groups also recognized that a subset of MIBCs expressed genes associated with squamous metaplasia [4, 5]. Classifiers were created that could accurately assign tumors to one of the two clusters with smaller numbers of genes, but distinguishing tumors by stage does not address an unmet clinical need, so the added value of these classifiers was unclear. Furthermore, when an attempt was made to generate a classifier for squamous tumors, some conventional MIBCs were assigned to the squamous cell carcinoma (SCC) subtype, which at the time was attributed to inaccurate assignments by the classifier [4]. Work performed more recently suggests that these squamous subtype assignments were probably accurate.
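The idea of assigning tumors to one of two clusters with a small number of genes can be illustrated with a nearest-centroid rule. The following is a minimal sketch only: the cited studies' classifiers are not described here, the three marker genes are hypothetical, and every expression value is synthetic and chosen purely for illustration.

```python
from statistics import mean

# Synthetic log-expression values for three HYPOTHETICAL marker genes
# (order: cell-cycle marker, matrix marker, ribosomal marker).
# These numbers are invented for illustration and come from no cohort.
TRAINING = {
    "NMIBC": [[2.0, 1.5, 8.0], [2.5, 2.0, 7.5], [1.5, 1.0, 8.5]],
    "MIBC": [[7.0, 6.5, 4.0], [6.5, 7.0, 3.5], [7.5, 6.0, 4.5]],
}


def centroids(training):
    """Average each gene across the tumors of each class."""
    return {
        label: [mean(column) for column in zip(*profiles)]
        for label, profiles in training.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(profile, class_centroids):
    """Assign a profile to the class whose centroid is closest."""
    return min(
        class_centroids,
        key=lambda label: squared_distance(profile, class_centroids[label]),
    )


CENTROIDS = centroids(TRAINING)
print(classify([6.8, 6.2, 4.1], CENTROIDS))
print(classify([2.2, 1.4, 7.9], CENTROIDS))
```

A rule of this kind separates the two groups well whenever the marker genes track stage, which is also why such a classifier adds little beyond what staging already provides.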

A subsequent study employed gene expression profiling, array comparative genomic hybridization for measurements of copy number variations (CNVs), and focused DNA sequencing to identify known bladder cancer mutations to further characterize molecular heterogeneity [7]. The work identified two major subtypes, termed “MS1” and “MS2,” that largely recapitulated the NMIBC/MIBC dichotomizations reported previously. The MS2/MIBC subset was distinguished from the MS1/NMIBC subset by higher genomic instability and higher TP53 mutation rates, although the genomic instability was not dependent on TP53 inactivation [7]. Mutations and copy number variations involving the E2F3 oncogene and the RB1 tumor suppressor were only observed in the MS2/MIBC subset. Conversely, and consistent with their high prevalence in papillary NMIBCs, FGFR3-activating mutations were enriched in the MS1/NMIBC subset, and they were also enriched with activating mutations in PIK3CA, which are associated with downstream activation of AKT [7]. Although these analyses provided an even deeper understanding of the underlying mechanisms that give rise to bladder cancer molecular heterogeneity, they did not reveal much about the heterogeneity that might be present within NMIBCs and MIBCs.

The Lund Molecular Taxonomy

Subsequent to their description of the MS1 and MS2 tumors [7], the Lund group performed gene expression profiling on another cohort of 308 tumors to search for additional heterogeneity within their MS1 and MS2 subtypes [8]. They first reproduced the MS1 and MS2 dichotomization in the new dataset, and then they performed successive 2-group divisions within each of the MS1 and MS2 clusters [8]. In the end, they concluded that the MS1 subtype could be split into two distinct subsets, and the MS2 subtype could be split into five subsets [8]. After applying their classifier to additional public datasets, they concluded that 5 subtypes were highly reproducible: one consisted of all of the MS1 tumors (termed “urobasal A”, or uroA); another consisted of MS2 tumors that contained a related gene expression signature (termed “urobasal B”, or uroB); a third combined two clusters and was termed “genomically unstable” (GU); a fourth was characterized by heavy stromal cell infiltration (“infiltrated”); and a fifth expressed high levels of biomarkers associated with squamous differentiation (“SCC-like”). Importantly, the subtypes had prognostic value: patients with uroA tumors enjoyed the best prognoses, whereas patients with uroB and SCC-like tumors had the worst [8]. Interestingly, the uroA and uroB tumors were both highly enriched with FGFR3-associated gene expression signatures and activating FGFR3 mutations, suggesting that they might be dependent on FGFR3 for their growth. Finally, the Lund group showed that the subtypes could be accurately identified by immunohistochemistry [8], suggesting that their molecular taxonomy could be adapted for routine use by pathologists.
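The logic of successive 2-group divisions can be sketched on a toy example. The clustering procedure used in the study is not reproduced here. As a stand-in, the sketch below repeatedly splits a one-dimensional synthetic score at its widest gap and stops when no gap exceeds an arbitrary threshold. The scores, the `min_gap` parameter, and the function names are all assumptions made for illustration.

```python
def split_at_largest_gap(scores):
    """Divide sorted scores into two groups at the widest gap between neighbors."""
    ordered = sorted(scores)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    widest = max(gaps)
    cut = gaps.index(widest) + 1
    return ordered[:cut], ordered[cut:], widest


def successive_divisions(scores, min_gap=1.0):
    """Keep bisecting each group until its widest internal gap is below min_gap."""
    if len(scores) < 2:
        return [sorted(scores)]
    left, right, widest = split_at_largest_gap(scores)
    if widest < min_gap:
        # No convincing two-group structure remains in this group.
        return [sorted(scores)]
    return successive_divisions(left, min_gap) + successive_divisions(right, min_gap)


# Synthetic one-dimensional "signature scores" for ten hypothetical tumors.
SCORES = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 9.0, 9.2, 9.1]
GROUPS = successive_divisions(SCORES)
print(len(GROUPS))
```

The first division separates the highest-scoring tumors from the rest, and a second division inside the remaining group yields three groups in total. The number of subsets obtained this way depends on the stopping rule, which is one reason reproducibility across independent datasets matters.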

The Lund group has refined and expanded their classifier since first describing it in 2012. Using an interim cohort of 234 tumors from TCGA, they reproduced their original 5 subtypes and identified a new subset of tumors characterized by expression of biomarkers associated with neuroendocrine differentiation [9•]. Subsequently, they analyzed both mRNA expression profiles and expression of 28 protein biomarkers in a new cohort of 307 cystectomy specimens and identified a “mesenchymal-like” tumor subtype [10]. As will be discussed below, all of the Lund subtypes are largely reproduced within the basal and luminal molecular subtypes identified independently by other groups.

A major limitation of gene expression profiling studies of bulk tumors is that the gene expression patterns represent a mixture of tumor and stromal phenotypes. To address this problem, the Lund group refined their molecular taxonomy by comparing expression of 28 proteins to the whole transcriptome gene expression profiles from 307 cystectomies [10]. They concluded that 5 distinct tumor phenotypes could be observed—urothelial-like, genomically unstable, basal/SCC-like, mesenchymal-like, and small cell/neuroendocrine-like. They also concluded that the subtype calls made using mRNA expression profiling were sometimes inconsistent with the IHC calls [10], suggesting that stromal cell infiltration might make mRNA-based calls inaccurate.

Defining Basal and Luminal Subtypes

The Lund molecular taxonomy was developed on mixed cohorts of NMIBCs and MIBCs. Because RNA-based molecular subtypes are defined by relative gene expression patterns and are therefore strongly influenced by the tumors selected for inclusion in a cohort, it seemed possible that focused analyses of cohorts of pure MIBCs might yield different results. Therefore, TCGA and groups at the University of North Carolina (UNC) and MD Anderson Cancer Center performed independent studies that were focused on identifying molecular subtypes within MIBCs. The group at UNC assembled a large public meta-dataset and used unsupervised consensus clustering to show that the tumors could be optimally divided into two clusters [11]. The MD Anderson team generated a new discovery cohort consisting of 73 MIBCs and a new validation cohort of 57 MIBCs, and they used unsupervised hierarchical clustering to identify 3 candidate subtypes [12]. Finally, TCGA used RNA sequencing and a hybrid unsupervised approach to identify 4 clusters within a cohort of 129 tumors [13]. Importantly, all 3 groups recognized that the subtypes were distinguished by mutually exclusive expression of differentiation-associated biomarkers that had been previously associated with the intrinsic basal and luminal subtypes of breast cancer, and that at the highest level, the tumors in all of the cohorts could be divided into two subtypes, “basal” and “luminal” [11,12,13]. Two other groups independently recognized the significance of a distinct “basal” MIBC subtype [14, 15], and a consensus meeting held in 2015 proposed that this subtype be termed “basal squamous” to reflect the terminology adopted by the different groups.
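The dependence of relative expression calls on cohort composition can be made concrete with a small example. This is a sketch under stated assumptions, not a method taken from any of the cited studies: it uses one hypothetical marker gene, synthetic expression values, and simple median centering as the "relative" measure.

```python
from statistics import median

# Synthetic log-expression values of one HYPOTHETICAL luminal marker gene.
# The numbers are invented for illustration only.
NMIBC = {"nmibc_1": 9.0, "nmibc_2": 9.5, "nmibc_3": 10.0, "nmibc_4": 9.2}
MIBC = {"mibc_1": 8.0, "mibc_2": 7.5, "mibc_3": 3.0, "mibc_4": 2.5, "mibc_5": 3.5}


def relative_call(cohort):
    """Median-center the gene within the cohort and label each tumor high or low."""
    center = median(cohort.values())
    return {
        tumor: ("high" if value > center else "low")
        for tumor, value in cohort.items()
    }


mixed_calls = relative_call({**NMIBC, **MIBC})
mibc_only_calls = relative_call(MIBC)

# The same tumor, with the same measured value, receives different calls.
print(mixed_calls["mibc_1"], mibc_only_calls["mibc_1"])
```

In the mixed cohort the uniformly high NMIBC values raise the median, so the tumor labeled `mibc_1` is called "low". Within the MIBC-only cohort the same tumor is called "high". Clustering on centered data inherits this behavior, which is the motivation for analyzing pure MIBC cohorts separately.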

Although the studies identified different numbers of subtypes, inspection of their properties revealed strong concordance among them [16, 17]. The MD Anderson group’s third subtype consisted of tumors expressing active p53, stromal cell, and extracellular matrix gene expression signatures [12, 16, 17], and this “p53-like” subtype largely corresponded to one of the TCGA luminal clusters (“cluster II”) [13, 16, 17]. TCGA’s classifier divided the basal tumors into two subsets [13] that were largely distinguished by differential expression of biomarkers associated with epithelial-to-mesenchymal transition (EMT) [17], a reversible developmental process that is reactivated in solid tumors to promote invasion, metastasis, “stemness”, and drug resistance. The potential significance of the mesenchymal basal tumors was also appreciated by the group at UNC, who identified them using a breast cancer “claudin-low” gene expression signature [11, 18]. The Lund group performed a head-to-head comparison of their molecular taxonomy with the UNC and MD Anderson classifications, and again, they observed remarkable concordance [9•]. The Lund uroB and SCC-like subtypes corresponded to the UNC and MD Anderson basal tumors, the Lund “infiltrated” tumors were similar to the MD Anderson p53-like MIBCs, and the Lund uroA and GU tumors were contained within the UNC and MD Anderson luminal subtype [9•]. Importantly, as a direct consequence of these comparisons, the Lund group also identified a new “neuroendocrine” (CC3-2) basal subtype characterized by high levels of the stem cell transcription factors SOX2 and SOX21 and of small cell/neuroendocrine markers [9•].

TCGA also reevaluated its mRNA-based molecular subtypes in its final cohort of 408 MIBCs using unbiased non-negative matrix factorization (NMF) consensus clustering [19]. The analysis identified 5 subtypes (luminal papillary, luminal infiltrated, luminal, basal squamous, and neuronal) that partially overlapped with the 4 clusters identified in the 2014 study [13, 19]. Luminal papillary tumors were enriched with papillary morphology, breast cancer luminal genes, activating FGFR3 mutations, and low tumor stage, and they had higher tumor cell purities than the tumors in the other subtypes [19], characteristics they shared with the Lund uroA tumors [8, 10]. The main features of the luminal infiltrated tumors were enrichment with the MD Anderson p53-like gene expression signature [12] and lower tumor cell purities than the other subtypes, and they corresponded well with the original TCGA “cluster II” tumors [13]. The new basal squamous subtype contained tumors exhibiting various degrees of squamous differentiation, expression of breast cancer basal and stem cell markers, high expression of CIS signature genes, high lymphocytic infiltration, and low expression of Sonic hedgehog pathway genes [19]. This subtype contained most of the mesenchymal tumors that were assigned to “cluster IV” in the original study [13], and they were assigned to the basal squamous cluster because of low-level expression of luminal genes. TCGA’s new neuronal subtype exhibited high-level expression of biomarkers characteristic of neuroendocrine differentiation, mixed expression of canonical basal biomarkers, and low expression of squamous markers [19], and the characteristics of these tumors overlapped with the neuroendocrine subtype identified previously [9•, 10]. Finally, the new luminal subtype was characterized by the highest frequency of TP53 mutations among luminal tumors overall and uniform, high-level expression of KRT20 and uroplakins [19]. The high frequency of TP53 mutations and high uroplakin expression were shared by the Lund GU tumors [8], suggesting some degree of biological concordance.

Clinical Characteristics of the Bladder Cancer Molecular Subtypes

The molecular subtypes identified independently by the Lund, UNC, MD Anderson, Baylor, and TCGA groups also shared similar clinical characteristics. Basal squamous tumors were more common in women [12, 19] and were clinically aggressive, characterized by advanced stage and metastatic disease at presentation and shorter disease-specific and overall survival [8, 11,12,13,14, 20, 21]. However, as is also true with breast cancer, patients with basal bladder cancers received the most clinical benefit from neoadjuvant chemotherapy (NAC) [22•, 23••], suggesting that patients with these tumors should be treated aggressively with cisplatin-based regimens. Even though basal tumors expressed high levels of lymphocyte genes [18, 19] and had relatively high tumor mutational burdens, patients with basal tumors derived only intermediate benefit from therapy with immune checkpoint blockade [24••, 25•]. Other molecular determinants, possibly including TGFβ gene expression signatures [25•], could limit response to immune checkpoint blockade in basal tumors, presenting a possible opportunity to increase clinical activity with combination therapies. Alternatively, it is possible that the lymphocytes in basal tumors are not specific for tumor neoantigens and are recruited to the tumors via other mechanisms.

Importantly, not all subsets of basal tumors appear to be equally sensitive to chemotherapy. Patients with “epithelial” basal tumors obtained substantial clinical benefit from NAC, but patients with “mesenchymal”/claudin-low basal tumors did not [23••]. Similarly, even though the majority of patients with small cell/neuroendocrine tumors respond to chemotherapy [26], the small number of patients with neuronal tumors in TCGA’s cohort had the worst clinical outcomes [19]. Because the clinical data that are available in retrospective cohorts are less accurate and complete than those obtained with prospective data collection, prospective studies are necessary to validate (or disprove) these associations.

Patients with uroA/luminal papillary tumors enjoyed the longest disease-specific and overall survival [8, 19], and about half of luminal tumors were downstaged by NAC [12]. However, in the preliminary analyses NAC produced little added clinical benefit beyond what is observed with cystectomy alone [23••], perhaps because patients with luminal tumors have less of the subclinical metastatic disease that is probably the most critical target of systemic neoadjuvant or adjuvant chemotherapy. On the other hand, NAC produced the lowest rates of downstaging in patients with p53-like/luminal infiltrated tumors [12], and it may have even shortened overall survival in these patients [23••]. With respect to immune checkpoint blockade, patients with TCGA cluster II (i.e., luminal infiltrated) tumors displayed the highest response rates, and patients with TCGA cluster I (i.e., luminal papillary) tumors had the lowest [24••]. In a subsequent analysis by the same group, patients with Lund GU tumors had the highest response rates and patients with luminal papillary tumors the lowest [25•]. However, the subtype assignments in these studies were not made by the individuals who developed the original classifiers, so these conclusions must be considered preliminary until they have been confirmed by the other groups.

Molecular Subtypes and Bladder Cancer Variants

The observation that tumors with squamous or neuroendocrine variant histopathological features cluster with some conventional basal MIBCs [4, 8, 9•, 10, 12, 13, 19] raises the possibility that other variants also display basal or luminal subtype bias, and strong support for this idea is emerging. For example, whole transcriptome analyses of micropapillary bladder cancers revealed that they express luminal and p53-like/luminal infiltrated biomarkers and co-cluster with some conventional MIBCs [27•]. Similarly, immunohistochemical analyses employing antibodies that were specific for basal and luminal biomarkers [10, 28] indicated that micropapillary, nested, and plasmacytoid variant MIBCs are all luminal, whereas tumors with squamous differentiation are basal [29]. Although the biological relationships between conventional MIBCs and the bladder cancer variants need to be confirmed, the preliminary results suggest that integrating variants into conventional MIBC datasets could help to define new bladder cancer subtypes, just as tumors with squamous differentiation helped to define the basal/squamous subtype and tumors with small cell and neuroendocrine differentiation helped to isolate the neuroendocrine subtype of basal tumors.

Subtypes of NMIBCs

The high impact of the MIBC subtyping studies has prompted investigators to perform similar analyses on cohorts of NMIBCs. In the largest study performed to date, a European consortium subjected a cohort of 460 flash-frozen NMIBCs to whole-transcriptome RNA sequencing and used unbiased consensus clustering to define subtypes [30•]. They concluded that the tumors formed three discrete clusters: one corresponded well with the Lund uroA tumors, another with the Lund GU tumors, and a third displayed more of a basal-like bias based on differential expression of UNC’s BASE47 biomarkers [11]. In another study, a group from Leeds used unsupervised clustering of low-pass whole-genome sequencing and array CGH data to identify two copy number groups (genomic subtypes GS1 and GS2) among 141 stage Ta NMIBCs, and then performed RNA sequencing on a subset of the cohort (GS1 = 48, GS2 = 31) to determine the differences in mRNA expression between the two groups. They concluded that all of the tumors expressed luminal biomarkers, but that they could be segregated into two subtypes based on differential loss of chromosome 9q and relative levels of genomic instability [31]. It is possible that these DNA-based subtypes also correspond to the Lund uroA and GU tumors, although this was not discussed.

There is great interest in determining whether NMIBC molecular subtypes display differential sensitivities to intravesical BCG, which is the frontline therapy for high-risk tumors. Unfortunately, only a fraction of the tumors in the European cohort came from patients treated with BCG [30•], so additional cohorts consisting of BCG-sensitive and BCG-resistant tumors with strong companion clinical data will need to be created to address this question.

Conclusion

It appears that MIBCs can be segregated into at least 5 molecular subtypes: luminal papillary/uroA, luminal infiltrated/p53-like, luminal/GU, basal squamous, and neuroendocrine. A sixth subtype (TCGA cluster IV/claudin-low/mesenchymal-like) may also emerge as a basal subtype that is distinct from both the basal squamous (epithelial) and the neuroendocrine (mesenchymal, but expressing neuroendocrine biomarkers) tumors. Additional subtypes may also emerge as larger cohorts of bladder cancer variants are profiled. The subtypes display prognostic significance and also appear to be associated with differential benefit from systemic therapies. Basal squamous, claudin-low, and neuroendocrine tumors all appear to be intrinsically aggressive, possibly because they all express higher levels of EMT biomarkers than the luminal tumors do and are therefore more prone to invasion and metastasis. However, patients with basal squamous tumors may derive the most benefit from chemotherapy, and in these patients, aggressive clinical management can dramatically improve prognosis. Small cell/neuroendocrine tumors are also initially highly chemosensitive, but responses tend to be very short-lived, requiring that new therapeutic approaches be developed. Their high tumor mutational burdens (TMBs) make them attractive as targets for immune checkpoint blockade, and preclinical data suggest that they may also be sensitive to PARP inhibitors. New approaches are also required for patients with claudin-low tumors.

Luminal papillary/uroA tumors are associated with the best survival. Approximately half of these tumors appear to be downstaged by neoadjuvant chemotherapy, but the impact on survival is less obvious than it is in patients with basal tumors, perhaps because these patients tend to have less subclinical metastatic disease. Luminal papillary tumors display an “immune desert” phenotype, and patients with luminal papillary tumors may derive the least benefit from immune checkpoint blockade. However, activating FGFR3 mutations and fusions are greatly enriched in these tumors, so local or systemic therapy with FGF receptor inhibitors (with or without immunotherapy) is an attractive therapeutic approach.

The genomic characteristics of GU tumors would be expected to make them particularly chemosensitive, but the available data have argued that this is not the case [23••]. Instead, the GU tumors appear to be particularly sensitive to immune checkpoint blockade [25•]. The new TCGA luminal subtype is clinically aggressive [19], but its biological properties are consistent with its being a subset of the Lund GU tumors, so these tumors may also be sensitive to chemotherapy and/or immune checkpoint blockade. The luminal infiltrated/p53-like tumors appear to be chemoresistant, but many patients with these tumors also benefit from immune checkpoint blockade [24••, 25•]. It is possible that a newly described TGFβ gene expression signature could be useful in identifying the resistant luminal infiltrated tumors [25•].