Introduction

One of the central unresolved issues in mammary gland biology is the relationship between the normal cellular developmental hierarchy and the different subtypes of breast cancer. Key questions include the following: Do different tumor subtypes originate from distinct cells of origin? What is the nature of the relationship between normal and cancer stem cells? Does cell fate influence susceptibility to malignant transformation? What degree of cell fate plasticity do different cell types exhibit during normal development and tumorigenesis? The prerequisite for resolving these questions is an in-depth understanding of the cellular hierarchies in normal breast tissue.

The mammary gland consists of hollow, bi-layered epithelial tubes that are organized into elaborate tree-like structures embedded in fatty stroma. The gland undergoes intense remodeling during three distinct developmental stages. In puberty, the network of primary and secondary branches is formed by proliferative expansion of terminal end buds (TEBs), bifurcation and side branching. In the adult gland, tertiary branches and small alveolar buds are formed alongside primary and secondary ducts during each estrous cycle. In pregnancy, the alveolar epithelium undergoes major expansion to yield milk-producing alveoli. After each lactation period, the gland undergoes massive remodeling and involution. It is widely accepted that such profound regenerative capacity is sustained by stem cells; however, their nature and location remain a subject of intense debate.

Transplantation assay

For over half a century, the transplantation assay has been the gold standard for evaluating stem and progenitor cell activity in the mammary epithelium. The first evidence for the presence of stem cells in the mammary gland came from DeOme and Faulkin [1], who showed that transplantation of small fragments of mouse mammary epithelium into de-epithelialized mammary fat pads gives rise to an entire ductal epithelial tree. The progeny of the transplanted cells could be serially transplanted, confirming that mammary epithelium contains cells with self-renewing properties and multi-lineage potential. Subsequent studies demonstrated that transplantation of dissociated mammary epithelial cells (MECs) at limiting dilutions gives rise to three types of outgrowths: ductal, alveolar (lobular) and complete glands, suggesting the presence of lineage-restricted progenitors [2, 3].
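As an illustration of how limiting-dilution transplantation data are commonly quantified, the following minimal sketch estimates the frequency of repopulating cells under a single-hit Poisson model, in which a transplant of d cells yields no outgrowth with probability exp(-f·d). The choice of model, the function names and the example counts are assumptions made here for illustration; they are not taken from the studies cited above.

```python
import math


def log_likelihood(freq, groups):
    """Log-likelihood of a single-hit Poisson model.

    groups: list of (cells_per_transplant, n_transplants, n_outgrowths).
    A transplant is negative (no outgrowth) with probability exp(-freq * cells).
    """
    total = 0.0
    for cells, n_transplants, n_outgrowths in groups:
        n_negative = n_transplants - n_outgrowths
        # log P(no outgrowth) = -freq * cells
        total += n_negative * (-freq * cells)
        if n_outgrowths:
            # log P(outgrowth) = log(1 - exp(-freq * cells))
            total += n_outgrowths * math.log(-math.expm1(-freq * cells))
    return total


def estimate_frequency(groups, lo=1e-9, hi=1.0, iterations=200):
    """Maximum-likelihood frequency found by ternary search.

    The log-likelihood is concave in freq, so narrowing the bracket
    towards the larger of two interior values converges on the maximum.
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, groups) < log_likelihood(m2, groups):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Hypothetical data: (cells injected, fat pads transplanted, outgrowths observed)
example_groups = [(1000, 8, 8), (100, 10, 5), (10, 10, 1)]
frequency = estimate_frequency(example_groups)
print(f"Estimated frequency: about 1 repopulating cell per {1 / frequency:.0f} cells")
```

Such an estimate describes repopulating activity in the transplantation setting only; it does not by itself indicate how the same cells behave in an intact gland.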

More recently, transplantation assays have mostly been used for characterization of prospectively isolated mouse MEC subpopulations obtained using different combinations of markers [4–15] and other parameters such as cell size [16] or label retention [17]. In addition, lentiviral genomic barcoding has been employed for analysis of clonal dynamics within serially transplanted cells [18]. Collectively, these studies demonstrated significant heterogeneity in both luminal and basal/myoepithelial layers (reviewed in [19, 20]). Markers specific for stem or progenitor cells have not been identified to date; the existing combinations merely enrich for these subpopulations.

One of the key findings that emerged from these analyses is that MECs within the basal/myoepithelial layer have stem cell-like properties [4, 5]. It is unclear, however, whether all myoepithelial cells can potentially function as stem cells in transplantation assays, or whether there is a specific subset that has intrinsic stem cell-like properties. A recent report by Prater et al. [21] suggests that a high proportion of myoepithelial cells can acquire multi-lineage potential in transplantation assays when cultured in vitro in the presence of agents that disrupt actin–myosin interactions.

Lineage tracing

There is accumulating evidence that transplanted MECs do not always reflect their physiological developmental fate [21–25], emphasizing the need to revisit and refine traditional hierarchy models using methodologies that preserve tissue architecture. Lineage tracing, a technique originally developed to analyze early embryos, is one of the most powerful methods available for identifying stem and progenitor cells and analyzing their function in a physiological developmental context. The method is based on permanent labeling of a single cell or a group of cells and subsequent tracking of their fate and the fate of their progeny in vivo. This approach is also essential for identification and characterization of topologically and functionally specialized microenvironments such as the stem cell niche. Genetic lineage tracing (also called genetic fate mapping) is a critical tool for analysis of tissue development, homeostasis and cellular origin of cancer [26].
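The labeling principle described above can be sketched as a toy model: a heritable label is switched on only in cells in which the driver is active at the time of induction, and is then passed to all descendants regardless of whether they still express the driver. The class, function names and the fate rule below are hypothetical and chosen purely for illustration; the sketch does not model any specific mouse strain or reporter.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    expresses_driver: bool   # is the driver promoter active in this cell now?
    labeled: bool = False    # has the reporter been permanently switched on?


def induce(cells):
    """Labeling pulse: switch on the reporter only in driver-expressing cells."""
    for cell in cells:
        if cell.expresses_driver:
            cell.labeled = True


def divide(parent, daughter_driver_states):
    """Daughters inherit the label irrespective of their own driver expression."""
    return [
        Cell(f"{parent.name}.{i}", expresses_driver=state, labeled=parent.labeled)
        for i, state in enumerate(daughter_driver_states, start=1)
    ]


def trace(founders, generations, fate):
    """Expand founders for several generations.

    `fate` maps a parent cell to the driver states of its daughters.
    """
    population = list(founders)
    for _ in range(generations):
        population = [d for cell in population for d in divide(cell, fate(cell))]
    return population


# Hypothetical example: founder A expresses the driver, founder B does not.
founders = [Cell("A", True), Cell("B", False)]
induce(founders)
# Assumed fate rule: one daughter keeps the parental driver state, the other loses it.
progeny = trace(founders, 2, lambda cell: (cell.expresses_driver, False))
for cell in progeny:
    print(cell.name, "labeled" if cell.labeled else "unlabeled",
          "driver+" if cell.expresses_driver else "driver-")
```

In this toy example every descendant of founder A carries the label, including descendants that no longer express the driver, whereas no descendant of founder B does. The label therefore reports ancestry at the time of induction rather than current marker expression.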

In situ lineage tracing

The first in situ fate mapping in the mammary gland was done over a decade ago by the Smith laboratory [27]. The term “in situ” refers here to the procedure consisting of labeling and tracking of the cells in an intact mammary gland of the host animal. Wagner et al. [27] identified a MEC subpopulation called “parity-induced” mammary epithelial cells (PI-MECs), using transgenic mice carrying Cre-recombinase under control of the pregnancy-specific promoter for whey acidic protein (WAP).

In recent years, a combination of inducible Cre-recombinase and high-resolution imaging has provided extraordinary insight into the mouse MEC composition and function. Several conditional genetic labeling strategies have been employed to track mouse MECs in situ, using either knock-in or transgenic mice. They include the doxycycline-inducible TetO-Cre system, Tamoxifen-inducible Cre-recombinase (CreER, CreERT2) and orthotopic adenoviral delivery of lineage-specific Cre-recombinases (AdCre) [21–24, 28–31]. All methods are based on temporal and cell-type-specific regulation of Cre-recombinase activity, resulting in permanent expression of the reporter allele in target cells.

Each approach has some general, as well as mammary-specific, advantages and disadvantages. Generation of new transgenic or knock-in strains and their subsequent breeding is time-consuming and expensive. Intraductal injection of lineage-specific AdCre is not only faster and cheaper, but also enables spatial control of Cre-recombinase activity, which is especially useful for downstream applications such as modeling breast cancer, where systemic expression of Cre-recombinase could have deleterious effects. Tamoxifen can temporarily delay ductal development and skew the relative distribution and proliferation rate of MEC subpopulations in a dose-dependent manner; thus, caution should be used with the CreER system when interpreting short-term in situ tracing studies of a quantitative nature [32]. Doxycycline-inducible Tet-strains contain constructs consisting of an endogenous gene promoter sequence and a tTA or rtTA cassette. In transgenic Tet-strains, these constructs are randomly inserted into the genome and thus may not faithfully reflect endogenous gene expression due to chromosomal position effects or uncontrolled copy number. Therefore, the lack of positive cells in lineage tracing studies using transgenic Tet-strains may not reflect their true absence in a tissue and should be interpreted with caution. The same caveat applies to transgenic strains carrying Cre-recombinase.

Various gene promoters have been used to drive Cre expression in specific MEC subpopulations (summarized in Table 1). They include components of the developmental signaling pathways (Wnt, Notch), structural genes (cytokeratins, actin) and mammary-related genes (WAP, Elf5). The choice of the drivers has mostly been based on information obtained in ex vivo analyses of various MEC subpopulations, or stem cell markers characterized in other tissues, such as the skin and intestine.

Table 1 Mouse strains used for in situ lineage tracing in the mammary gland

One important consideration when interpreting lineage tracing data is that a gene promoter chosen to drive Cre expression can be switched on or off in a developmental stage-dependent manner in a particular MEC subset. Conversely, a single reporter gene may label two or more MEC subpopulations at different developmental stages. Thus, a single marker does not necessarily identify the same and unique MEC subpopulation at different stages of morphogenesis.

3D visualization

The mammary gland consists of a spatially complex network of branches that undergo intense remodeling in puberty, pregnancy, lactation and involution. This presents some unique challenges for visualization of labeled cells and interpretation of long-term lineage tracing data. Standard two-dimensional tissue sections are inadequate for quantitative clonal analysis and can also be misleading in cases where traced cells are rare or exhibit complex spatial patterns (Fig. 1). Analysis of whole-mount specimens in different focal planes is essential in such cases; however, traditional protocols for visualization of LacZ-positive cells are not suitable for high-resolution imaging at the cellular level.

Fig. 1
Two-dimensional tissue sections do not reveal important topological and quantitative information in lineage tracing experiments. Schematic representation of mammary ducts harboring genetically labeled cells and the respective tissue sections. The dotted line represents the plane of the section. Complex topological patterns, the branch of origin and quantitative information cannot be discerned from tissue sections.

To visualize individual cells within their intact three-dimensional context, Sale and colleagues developed an improved X-gal labeling protocol that enables high-resolution, single-cell imaging of mammary whole-mounts [28]. Using the Notch2 paralogue as a genetic marker, they discovered and functionally characterized two previously unrecognized mammary epithelial cell types: S and L cells [28]. S cells exhibit a unique combination of morphological and topological features: small size (S cell diameter is approximately one-third of the luminal cell diameter), formation of strings around a single large Notch2-negative cell and a distinct reiterative spatial placement relative to the longitudinal axis and the circumference of the duct. During the estrous cycle, tertiary branches are formed directly above the S cell strings. These complex topological arrangements would be very difficult, if not impossible, to extrapolate from serial sections.

An alternative approach to whole-mount analysis of X-gal-labeled samples is 3D confocal reconstruction of optical sections from fluorescently labeled specimens. This method is particularly useful in conjunction with a multicolor reporter system for analysis of multiple clones in close proximity [31, 33].

In situ cell ablation

A major drawback of transplantation assays is disruption of the physiological developmental and topological context, which may induce non-physiological lineage commitment in transplanted cells. Moreover, in some cases, the transplantation assay may not reveal important functional features of the target cells. For example, S and L cells do not give rise to outgrowths in transplantation assays, nor do they form clones in situ after long-term lineage tracing [28]. To gain more insight into their functional properties, Sale and colleagues used conditional in situ cell ablation, a powerful method for the characterization of cells in their physiological developmental and spatial context. This technique is based on conditional expression of the diphtheria toxin receptor (iDTR) in Cre-positive cells [34]. Upon administration of diphtheria toxin, Cre-expressing cells are depleted, whereas other cells remain intact. Ablation of S and L cells during active ductal morphogenesis revealed that their function indeed reflects their topological context: S cells regulate spatial placement of tertiary branches alongside the ductal tree, whereas L cells orchestrate formation of the alveolar lumen and spatial organization of alveolar clusters [28].

A recent report by Wang et al. [31] described an alternative approach based on conditional expression of Diphtheria toxin fragment A (DTA) in Cre-positive cells. This method does not require administration of diphtheria toxin; however, it may require multiple Tamoxifen injections in a short period of time to achieve optimal ablation efficiency [31]. DTA-based ablation of Procr (protein C receptor)-positive MECs during puberty largely prevented ductal growth, thus corroborating lineage tracing data, which indicated that this population of cells exhibits stem cell properties. The authors note that the role of Procr-positive stromal cells should also be taken into account when interpreting these experiments, since they were also affected by the ablation [31].

Lineage tracing and MEC hierarchy models

Traditional depictions of the mammary epithelial cell hierarchy conform to the classical Waddington landscape model [35] and are based on the premise that MECs comprise two major lineages, luminal and basal/myoepithelial. It is becoming increasingly evident, however, that mammary epithelium is much more complex than previously thought, both developmentally and functionally, and exhibits a considerable degree of cell fate plasticity.

The first evidence that mammary epithelium consists of several distinct morphotypes came from the ultrastructural studies done by Chepko and Smith [36]. They divided MECs into five subtypes: classical luminal and myoepithelial cells, undifferentiated and differentiated large light cells (ULLCs, DLLCs) and small light cells (SLCs). Sale and colleagues provided the first genetic evidence that MECs constitute at least four independent lineages: the classical luminal and myoepithelial lineages, the L-cell lineage and the S-cell lineage [28]. There are marked morphological and topological similarities between the light cells (small and large) identified by electron microscopy and the S and L cells, respectively, identified by genetic labeling. Whether these populations overlap remains to be demonstrated experimentally. These studies highlight the importance of defining distinct MEC subsets based on multiple variables that, in addition to classical marker profile, include morphology, topological placement, function and developmental context.

Alveologenesis is another example where fate mapping has recently revealed previously unrecognized developmental complexity of the mammary epithelium. Two independent groups have demonstrated, using different genetic labeling strategies, that each alveolus consists of at least two distinct luminal alveolar lineages with independent progenitors [25, 28]. Using Notch2-CreERT2 knock-in mice, Sale and colleagues identified single and paired Notch2-positive L-alveolar cells in each alveolus, whereas classical secretory alveolar cells remained unlabeled. In situ depletion of L-alveolar progenitors prior to pregnancy did not impair proliferation of secretory alveolar cells; instead, they formed outgrowths resembling alveoli but with no lumen [28]. Chang et al. [25] used Wap-Cre transgenics to label PI-MECs and found an inverse labeling pattern: whereas all luminal estrogen receptor (ER)-negative cells in an alveolus could be derived from PI-MECs, single or paired alveolar ER-positive cells were unlabeled, indicating that they are derived from a different lineage. Importantly, in both studies, basal alveolar cells remained unlabeled, indicating that they have a different origin. This result is consistent with the report by van Amerongen and colleagues, who demonstrated using Axin2-CreERT2 knock-in mice that a subset of MECs positive for Axin2 in prepubescent animals gives rise to myoepithelial but not luminal alveolar cells [23]. The different origin of basal and luminal alveolar cells in individual alveoli was also demonstrated by Rios et al. [33] using K5-rtTA/TetO-cre and Elf5-rtTA/TetO-cre transgenics.

In addition to classical secretory alveolar cells and L cells, Notch2 lineage tracing has revealed a third, possibly functionally distinct subset of the luminal alveolar lineage, called “transient alveolar cells” [28]. They fill the lumen of actively growing tertiary branches and sprouting alveolar buds in early pregnancy and are cleared as soon as these structures are formed. Transient alveolar cells are Notch2 positive, indicating that they are not direct descendants of the secretory alveolar lineage.

Taken together, these studies imply that individual alveoli are not clonal outgrowths, but are formed by spatially and temporally coordinated growth of different cell lineages.

One of the most controversial issues arising from in situ lineage tracing studies is the identity and role of stem cells in adult mammary gland homeostasis. Two opposing stem cell hierarchy models have been proposed based on data obtained with a partially overlapping set of drivers (Table 1) [22, 33]. One model suggests that the luminal and basal compartments are maintained by lineage-restricted, unipotent stem cells [22], whereas the other proposes that adult mammary epithelium is maintained by bi-potent stem cells [33]. Discrepancies between the results obtained using the same set of drivers could be attributed to several factors discussed in detail above (see In situ lineage tracing and 3D visualization). These include differences in imaging techniques (2D versus 3D), inherent differences between the mouse models (knock-in versus transgenic strains, Tamoxifen versus doxycycline induction) and heterogeneity within the stem cell compartment.

The evidence for the latter comes from Wang and colleagues [31], who demonstrated the existence of at least two distinct subsets of basal stem cells in transplantation assays: the Procr-positive/K5-low/K14-low and Procr-negative/K5-high/K14-high populations. It is unclear, however, whether both subsets contribute to the maintenance of the basal lineage under physiological conditions. While multi-potency of Procr-positive basal cells was confirmed by in situ lineage tracing, the developmental potential of Procr-negative basal cells was not assessed in situ, because they presently lack a unique marker suitable for such analysis.

To address the issue of lineage identity and developmental plasticity, Granit and colleagues have recently proposed an alternative to a traditional MEC hierarchy model [37]. They suggest a multidimensional classification of MECs along several distinct axes, including stem cell versus differentiated identity, basal versus luminal identity and mesenchymal versus epithelial identity. Such multidimensional models allow description of intermediate and mixed differentiation states and can be applied to both normal mammary gland and breast cancer.

Breast cancer cell-of-origin

In situ lineage tracing combined with gain- and loss-of-function analyses will be instrumental for elucidating the cell-of-origin of different breast cancer subtypes. So far, the Cre-lox-based murine models of breast cancer have not been used for classical in situ fate mapping, wherein specific MEC subsets are transformed, labeled and traced in an intact mammary gland of the host animal.

A distinctive genetic labeling strategy based on ubiquitous (i.e. non-cell-type-specific) labeling has recently been utilized for intravital analysis of clonal dynamics in the MMTV-PyMT model of breast cancer [38]. Four-week-old MMTV-PyMT transgenic mice develop mammary hyperplasia, which gradually progresses to invasive cancer [39]. Zomer et al. [38] generated compound MMTV-PyMT/R26-CreERT2/R26R-Confetti mice and treated them with Tamoxifen at different time points to induce random expression of one of the four Confetti colors in each mouse cell. They then surgically implanted a mammary optical window to visualize fluorescently labeled mammary tumor cells and to track the fate of their progeny in live animals. These analyses revealed a high degree of plasticity in putative cancer stem cells [38].

Going forward, a critical factor in tracing the cellular origin of breast cancer will be the use of specific drivers that allow targeting of non-overlapping MEC subpopulations at different stages of development. Simultaneous transformation and labeling of specific MEC subsets will enable in situ visualization of individual cells in initial stages of tumorigenesis, clonal analysis and fate mapping in fully formed tumors, ex vivo characterization of labeled cells and evaluation of therapeutic efficacy of anticancer drugs.

Concluding remarks

Lineage tracing has recently provided some exciting and unexpected new insights into mammary gland biology and has emerged as an indispensable tool for dissecting MEC hierarchy. Many questions remain, in particular those regarding the location and the nature of the stem cell niche. A major challenge facing the field will be to reconcile data obtained using different tracers and developmental time points. The extent of the overlap between the various MEC subpopulations identified using prospective isolation and fate mapping remains to be determined.