Main

Somatic cells that have lost their pluripotent properties through the acquisition of differentiation-associated epigenetic marks can be driven to acquire an induced pluripotent stem cell (iPSC) state by the forced expression of key transcription factors1. iPSCs can fulfil the strictest of murine developmental assays, tetraploid embryo complementation2, to form all the cells of the embryo proper and the resulting adult animal3. During the reprogramming of somatic cells, a spectrum of distinct cell types is visibly apparent. The embryonic stem cell (ESC)-like iPSCs capable of generating healthy mice represent just one end of this spectrum. Many studies describe the successful derivation of iPSCs; however, relatively few address the fate of cells that do not reprogram to an ESC-like state. It has been reported that somatic cells expressing the four reprogramming factors1 can stabilize at a Nanog-negative cell state that morphologically resembles ESCs, yet fail to fully acquire an ESC-like expression profile4,5,6. ‘Partially reprogrammed cell’ has become a term to describe any cell that fails to reprogram to an ESC-like state. However, it is likely that a range of cell types exist, whose stable phenotypes and associated epigenetic profiles are different from ESCs.

For somatic cells to acquire an ESC-like state they require extensive genome-wide remodelling, with epigenetic mechanisms regulating cell state transitions throughout the entire reprogramming process. Incomplete remodelling of the somatic epigenome is associated with transgene-dependent cells5 and a functional memory of somatic cell origin7,8. The modulation of epigenetic regulators such as DNA dioxygenases9, histone deacetylases10, H3K36 demethylase (Jhdm1b)11, H3K27 demethylase (Utx)12 and H3K9 demethylases6 greatly influences the efficiency and kinetics of reprogramming towards an ESC-like iPSC state. In particular, vitamin C has been reported to facilitate the transition of cells from a ‘partially reprogrammed state’ to an ESC-like state6,13. In addition to chromatin remodelling, the expression level of reprogramming transcription factors directs cell state. A narrow window of Oct4 expression is required to maintain the ESC state, whereby a twofold perturbation of expression induces cells to transition to a non-ESC state14. During reprogramming there are two potential sources of Oct4: the transgene, whose expression has to be high at the beginning, and the endogenous gene, which is reactivated during the process of reprogramming. Towards the end of reprogramming the total expression of these two Oct4 sources has to stabilize within the narrow window required by the ESC-like state. Elevated expression of the four reprogramming factors has the potential to direct cell identity to a non-ESC-like state. In agreement with this, significant changes in global gene expression are observed when the reprogramming factors are shut down15,16,17.

Somatic-cell-derived epigenetic marks and the conceivable permutations of reprogramming factor expression levels present a unique opportunity to generate novel cell types. Thus, in an experimental approach unbiased by preconceptions of what constitutes a reprogrammed cell, we characterize the diversity of cell states that arise during somatic cell reprogramming. We define a Nanog-positive cell state (F-class cells) that is stable, occurs frequently and depends on high reprogramming factor expression; cells in this state do not form typical ESC-like colonies and exhibit advantageous cell culture properties, yet demonstrate pluripotency.

Reprogramming diversity

To extensively characterize the diversity of cell states arising from embryonic fibroblasts, we initiated reprogramming with the doxycycline-inducible piggyBac transposon system18. Colonies of proliferative cells were picked in a randomized manner, irrespective of gene expression and morphological appearance, establishing clonally derived cell lines (Fig. 1a). Notably, the transgene-expressing cell lines segregated into two distinct cohorts (Fig. 1b), which we had initially classified by morphological appearance as compact colony forming cells (C-class) and fuzzy colony forming cells (F-class). For all 28 cell lines established, the reprogramming genes Oct4 (also known as Pou5f1), Sox2, Klf4 and c-Myc were expressed many fold above ESC levels (Extended Data Fig. 1a), with each clonal cell line exhibiting substantial global gene expression differences when compared to ESCs (Fig. 1b). The majority of genes (67%) that were expressed above ESC levels were also expressed above (>twofold) parental fibroblast levels (Extended Data Fig. 1b, c and Supplementary Information 1), suggesting that these genes were induced upon reprogramming rather than representing a fibroblast memory. 2,959 differentially expressed genes (P < 0.01; false discovery rate (FDR) < 0.05) separated F-class and C-class cells (Extended Data Fig. 1d, Supplementary Information 2), with the F-class cell lines being particularly intriguing as they expressed Nanog and endogenous Oct4 at ESC levels (Extended Data Fig. 1e, 2a, b), yet did not possess an ESC-like morphology (Fig. 1b). The fuzzy appearance of F-class colonies and low intercellular adhesion were reminiscent of E-cadherin-null ESCs19,20,21 and could be attributed to diminished E-cadherin expression (Extended Data Fig. 1e). When mapped to the previously established PluriNet22 (Extended Data Fig. 1f), F-class cells exhibited significantly reduced expression of many PluriNet genes (Dnmt3b, Zfp42 and Tdgf1), yet they expressed many genes at ESC levels such as Sall4, endogenous Oct4 and Nanog (Supplementary Information 3). In addition, the F-class cells expressed transcription factors associated with lineage commitment including the homeobox protein En2, the helix–loop–helix factor Ngn3 and homeobox protein Nkx2.3.

Figure 1: Fibroblasts reprogram to multiple states.

a, Fibroblasts were transfected with Yamanaka factors in four separate piggyBac transposons (pB) and clonal lines were derived. b, Unsupervised hierarchical clustering and sample distance matrix (Pearson correlation) of gene expression at day 16. Phase contrast images representative of F-class (clone 1) and C-class (clone 23) iPS cell lines. Scale bars, 200 μm.

We compared the F-class cells to another well-characterized pluripotent stem cell population, epiblast stem cells (EpiSCs), and found that the F-class cells are transcriptionally distinct (Extended Data Fig. 2c, d). Furthermore, F-class cells could not be generated or maintained in EpiSC media (Extended Data Fig. 2e).

An alternative stem-cell state

Differentially expressed genes (P < 0.01; FDR < 0.05) between ESCs and F-class cells are enriched with genes involved in cell adhesion and the extracellular matrix (Fig. 2a, b), which probably contributes to the morphological appearance of F-class cells. Forced expression of Cdh1 induced some cells to acquire an ESC-like morphology; however, it was insufficient for most cells in culture (Extended Data Fig. 3a, b), suggesting that Cdh1 was not the only factor required. Furthermore, elevated Cdh1 expression did not induce the expression of Esrrb and Dppa5, genes that are downregulated in Cdh1-null ESCs20 (Extended Data Fig. 3a). The F-class gene expression profile remained unchanged upon prolonged culture, with cells maintaining a stable transcriptome and no convergence towards an ESC-like state (Fig. 2c). Independent sub-lines exhibited low variance in gene expression, further demonstrating the stable self-renewal of the F-class cell state (Extended Data Fig. 3c). The absence of interspersed Dppa4-expressing cells suggested that cells do not spontaneously progress to an ESC-like state at a detectable rate (Extended Data Fig. 3d). F-class cells possessed a normal karyotype (Extended Data Fig. 3e) and could be expanded exponentially beyond 40 passages. The cells remained in a transgene-dependent state (Extended Data Fig. 3f), whereby turning off transgene expression induced population-wide differentiation within 48 h, demonstrating that cells had not transformed. The self-renewal of F-class cells was independent of LIF or JAK signalling (Extended Data Fig. 4a, b); furthermore, F-class cells can be generated in media supplemented with JAK inhibitor (Extended Data Fig. 4c–f). F-class cells rapidly proliferated to the extent that, when mixed with ESCs, an initial 1% F-class cells became the dominant cell type (>50%) within three passages (Extended Data Fig. 4g). Stable gene expression, rapid proliferation (Extended Data Fig. 4h) and low intercellular adhesion (Extended Data Fig. 4i) confer F-class cells with highly desirable properties for stirred suspension culture.
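
The mixing experiment implies a lower bound on the relative growth advantage of F-class cells over ESCs. The following is a back-of-envelope sketch only: it assumes (a simplification not stated in the text) that each population expands by a constant factor per passage, and the function names are hypothetical.

```python
# Illustrative model: constant per-passage growth advantage (an assumption).

def min_relative_advantage(start_fraction, end_fraction, passages):
    """Smallest constant per-passage fold advantage that moves a population
    from start_fraction to end_fraction of a two-population culture."""
    start_odds = start_fraction / (1.0 - start_fraction)
    end_odds = end_fraction / (1.0 - end_fraction)
    return (end_odds / start_odds) ** (1.0 / passages)


def fraction_after(start_fraction, advantage, passages):
    """Fraction of the culture occupied after a number of passages."""
    odds = start_fraction / (1.0 - start_fraction) * advantage ** passages
    return odds / (1.0 + odds)


# 1% F-class cells reaching 50% of the culture within three passages
advantage = min_relative_advantage(0.01, 0.50, 3)
print(f"minimum per-passage advantage: {advantage:.2f}-fold")
```

Under this assumption the odds of F-class to ESC must rise from 1:99 to at least 1:1, which requires F-class cells to out-expand ESCs by the cube root of 99 (roughly 4.6-fold) at every passage.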

Figure 2: The F-class state.

a, Differentially expressed genes between the ESC-like state (n = 4) and the F-class state (n = 6) (two-tailed Welch's t-test, P < 0.01; FDR < 0.05). b, Gene ontology term analysis of differentially expressed genes. c, Two-way scatter plot comparisons of global gene expression (Illumina BeadArray); blue lines represent the fourfold differential threshold. d, Histological analysis of teratomas containing differentiated tissues of all three germ layers. Arrowheads denote ciliated epithelia. Scale bars, 100 μm.

Teratomas initiated by pluripotent cells (ESC, ESC-like iPSC and F-class cells) contained well-differentiated (non-dividing) and less differentiated dividing compartments. The teratomas from the F-class cells were indistinguishable from those derived from ESCs, each consisting of complex differentiated tissues representing all three germ layers (Fig. 2d). In vitro, removal of doxycycline in serum-free media initiated efficient neural differentiation of F-class cells, generating multiple neuronal subtypes (Extended Data Fig. 5a–c). Differentiation in serum-based media generated cells representative of the mesoderm (α-SMA+) and endoderm (FoxA2+) lineages (Extended Data Fig. 5d). We then assessed the embryonic developmental potential of F-class cells and found that they do not contribute to the development of chimaeras, nor do they incorporate into blastocysts after injection into the perivitelline space of eight-cell stage embryos (data not shown). In summary, we describe a novel cell state that is distinct from ESCs yet passes the criterion used to functionally identify the pluripotent potential of human ESC and iPSC lines, namely the teratoma-forming assay.

Requirement of transgene expression

To determine the influence of transgene expression levels on the establishment of F-class and ESC-like states, we examined three different reprogramming systems: three-factor (3F), which excludes c-Myc; low-expressing four-factor (4F; Col1a1 transgenic secondary system23); and high-expressing four-factor (4F; 1B secondary system18) (Extended Data Fig. 6a, b). High-expressing 4F fibroblasts underwent population-wide proliferation and generated distinct colonies within 5 days, which stabilized at a state morphologically and transcriptionally resembling F-class cells (Extended Data Fig. 6c–e). In contrast, 3F and low-expressing 4F fibroblasts sporadically (<0.1%) gave rise to colonies from day 10 onwards, stabilizing at a state that morphologically and transcriptionally resembled ESCs (Extended Data Fig. 6c–e). During low-expressing 4F reprogramming, no morphologically overt F-class cells were observed at any time point, nor were F-class identifier genes expressed at elevated levels (Extended Data Fig. 6f). These observations suggest a model whereby low-transgene-expressing cells do not generate an F-class cell state (Extended Data Fig. 6g). We found that high four-factor expression can also reprogram adult tail skin fibroblasts to the F-class state (Extended Data Fig. 7a–c).

During somatic cell reprogramming retroviral transgenes become silenced and it is thought that this helps stabilize a fully reprogrammed ESC-like state24. Since F-class cells require maintained transgene expression, we questioned whether a retroviral transgene system could give rise to F-class cells. We initially observed rapidly dividing cells possessing an F-class morphology (days 8–16 post-transduction); however, we did not observe these cells beyond day 30. Retrovirus-delivered transgene expression (green fluorescent protein, GFP) was attenuated during transposon-mediated reprogramming to an F-class state and within established F-class cells (Extended Data Fig. 7d, e). We propose that silencing of the retroviral transgenes is not compatible with the F-class cells’ requirement for high transgene expression.

To examine the continued requirement of all four reprogramming factors, F-class cells were generated in which three factors were constitutively expressed and the fourth factor was doxycycline-inducible. Doxycycline was removed at day 30 and in all four cases turning off the fourth factor induced a rapid loss of proliferation and a flattening of cell morphology (data not shown). Thus, all four reprogramming factors are needed to maintain the F-class state. The consistent inability to obtain F-class cells with 3F reprogramming indicates that elevated c-Myc expression is necessary. Using the TetO-Myc F-class cells, we found that upon doxycycline removal there was a downregulation of genes involved in growth factor activity and positive regulation of transcription (Extended Data Fig. 8a–d), consistent with reduced proliferation. Although cells did not transition to an ESC-like state, a number of ESC-associated genes were upregulated (Extended Data Fig. 8c, Supplementary Information 4), supporting the theory that reprogramming factor expression actively suppresses the final acquisition of an ESC-like state15.

Cell-state transitions

We questioned whether re-expressing the reprogramming factors at high levels in the ESC-like state would induce a transition to the F-class state. Reprogramming factor expression was re-activated in the iPSC line 1B18 and cells were transferred to media conditions that are conducive to F-class cells but not ESC-like cells: JAK inhibition in the absence of LIF and feeders (Extended Data Fig. 8e, f). Within 48 h, colonies of cells arose that morphologically resembled F-class cells. These cells maintained expression of some ESC-associated genes (Lin28 and Dnmt3B) yet diminished others such as Dppa5, Dnmt3l and Cdh1 (Extended Data Fig. 8g). Notably, cells upregulated genes expressed by F-class cells, suggesting that elevated reprogramming transgene expression can induce an F-class-like state, with the starting cell type (ESCs or MEFs) leaving a signature on the F-class cell state.

Next, we investigated whether established F-class cells can be induced to transition to an ESC-like state. Exposure to the DNA methyltransferase inhibitor 5-aza-deoxycytidine (Aza) was toxic at active concentrations (>0.05 μM), while vitamin C (ascorbic acid) supplementation and 2i media failed to induce an ESC-like morphology (Fig. 3a and Extended Data Fig. 9a). In contrast, inhibition of histone deacetylases (HDAC) induced F-class cells to acquire an ESC-like morphology (Fig. 3a) and transcriptional profile (Fig. 3b, Extended Data Fig. 9a). To determine whether HDAC inhibition (HDACi) selects for a sub-population of cells, we exposed twelve newly established subclones to HDACi and found that they acquired an ESC-like morphology and consistently upregulated ESC-like markers (Extended Data Fig. 9b). Furthermore, when single cells were treated with HDACi, every subsequent colony possessed elevated expression of ESC-associated genes (Fig. 3c). Direct observation of cells by time-lapse microscopy revealed that HDACi treatment decreased cell proliferation (Extended Data Fig. 9c) with no evidence of cell death (Extended Data Fig. 9d). HDACi-mediated acquisition of an ESC-like state was rapid, with transcriptionally silent genes upregulated to ESC expression levels within 72 h (Extended Data Fig. 10a–c, Supplementary Information 5). During the first 24 h of HDACi treatment, genes with chromatin and cell-division related ontology were upregulated (Extended Data Fig. 10d). The upregulation of chromatin-related factors possibly facilitated the transcriptional activation of further ESC-associated genes. Following HDACi treatment, cells could be maintained as transgene-independent ESC-like cells capable of contributing to chimaeras and the germ line (Fig. 3d, e). This was not possible before HDACi treatment.

Figure 3: HDACi-induced F-class to ESC-like transition.

a, Day 30 F-class cells (clone 1) treated for ten days. Scale bar, 100 μm. b, Principal component analysis of 32 cell-state identifier genes (determined by quantitative PCR with reverse transcription, qRT–PCR). C3 and C5 represent C-class clones, F1 and F2 represent F-class clones, maintained in different media. c, F-class cells (clone 1) clonally HDACi-treated (10 nM TSA). Scale bar, 250 μm. Each point represents an individual cell colony profiled by qRT–PCR (n = 10 biological replicates, 3 technical replicates per colony). Ngn3 is also known as Neurog3. d, Chimaeric contribution of HDACi-treated F-class cells aggregated with eight-cell stage embryo, visualized by LacZ activity, representative of four embryos. e, Genital ridge dissected from chimaeric embryo (n = 1). GFP represents HDACi-treated clone 2 F-class cells and Oct4 represents the germ cells. Scale bar, 100 μm.

Epigenetic forces contribute to F-class state

To identify the epigenetic landmarks associated with the establishment of the F-class cell state, we exploited a high-resolution genome-wide resource that profiles fibroblast reprogramming at the molecular level to both F-class and ESC-like states25. Doxycycline-induced high-level reprogramming factor expression directs 1B secondary fibroblast reprogramming to an F-class transcriptional state (Extended Data Fig. 10e)25,26,27,28. Comparison of primary F-class cell lines and ESC-like cell lines identified 86 genes that exhibited substantial (>fivefold) differential expression (Fig. 4a). For these genes we assessed the status of three major chromatin marks: the activating histone H3K4 trimethylation (H3K4me3), the suppressing histone H3K27 trimethylation (H3K27me3)25 and CpG methylation27 (Supplementary Information 6). Transcriptional activity of 68 of the 86 genes (79%) correlated (Pearson correlation coefficient |r| > 0.5) with at least one epigenetic mark (Fig. 4b). The upregulation of F-class state identifiers, such as Nkx2-3 and Insm1 (Fig. 4c, d), was associated with an active loss of H3K27me3 during the reprogramming process, fitting the model that the F-class state is not an intermediate reprogrammed state but a distinct cell state achieved through active epigenetic changes. Further substantiating this is the observation that genes associated with the ESC-like state (Gbx2, Lefty1, Cldn6) acquired hypermethylation at their genomic loci (Fig. 4e), which is uncharacteristic of the ESC-like state. We further validated a subset of differentially methylated regions (DMRs) within primary F-class cells (Fig. 4f). In summary, fibroblast reprogramming to the F-class state is governed by multiple epigenetic marks, whereby active epigenetic modifications direct cell identity away from both fibroblast and ESC-like state, and repressive epigenetic marks are inherited from the parental cell type (fibroblasts).
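
The |r| > 0.5 criterion used to link transcription to an epigenetic mark can be illustrated with a minimal sketch. The time-course values below are invented placeholders rather than data from this study, and the function names are hypothetical; the original analysis may have been computed differently.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def correlated_marks(expression, marks, threshold=0.5):
    """Return the marks whose signal satisfies |r| > threshold."""
    selected = {}
    for name, signal in marks.items():
        r = pearson_r(expression, signal)
        if abs(r) > threshold:
            selected[name] = r
    return selected


# Placeholder time course for one gene across five reprogramming time points
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
marks = {
    "H3K27me3": [5.0, 4.0, 3.5, 2.0, 1.0],  # falls as expression rises
    "CpG": [0.8, 0.9, 0.7, 0.9, 0.8],       # no linear trend
}
result = correlated_marks(expression, marks)
print(sorted(result))
```

In this toy example only the H3K27me3 series passes the threshold, with a strongly negative coefficient, which is the pattern expected when loss of a repressive mark accompanies gene activation.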

Figure 4: Epigenetic marks steer reprogramming trajectory.

a, Differentially expressed genes in primary lines (Welch t-test P < 0.01, FDR < 0.01). Black data points depict >fivefold difference in 1B secondary reprogramming system of F-class and ESC-like cells25. Lines depict fivefold threshold. b, Euler diagram depicting genes (black points in a) whose transcriptional activity corresponds with differential epigenetic marks in 1B secondary reprogramming system derived F-class cells25 (n = 1). c, Unsupervised clustering heat map of H3K27me3 marks identified in b. d, Transcription and histone modifications at the genomic locus of F-class identifier Insm1. e, Unsupervised clustering heat map of differentially methylated regions identified in b. f, Differentially methylated regions observed in the secondary reprogramming system confirmed in primary F-class cells.

Discussion

In this study, we observed that reprogramming somatic cells, in the presence of elevated reprogramming factor expression, could stabilize at a Nanog-positive fuzzy colony forming (F-class) state. Previous studies may have overlooked this state as the F-class cells highly express Nanog without completing one of the early reprogramming events, the mesenchymal to epithelial transition16,29. Chan and colleagues previously described a human reprogrammed cell type (type II cells) that is Nanog-positive and persists in a state that represents an intermediate stage of somatic cell reprogramming30. In contrast to the human type II cells, the murine F-class cells do not morphologically resemble ESCs, nor do they transcriptionally or epigenetically represent an intermediate cell state that reprogramming cells transit through as they acquire an ESC-like state. Two central observations support the notion that the F-class cell state is not representative of an intermediate state. First, F-class cells upregulate a cohort of genes that were not observed during reprogramming without c-Myc (3F) or with low-level four-factor (Oct4, Klf4, Sox2 and c-Myc) expression. Second, the expression of these genes in F-class cells is associated with the loss of repressive epigenetic marks (H3K27me3 and/or DNA methylation) that are typically present in the parental fibroblasts and the ESC-like state. The loss of these repressive marks suggests that, during sustained reprogramming factor expression, cell identity is diverted away from the molecular pathway that leads to an ESC-like state (Fig. 5). This is further supported by the observation that ESC-associated genes (Lefty1, Cldn6, Gbx2) actually acquire inhibitory DNA methylation in the F-class state. To our knowledge, this is the first report to identify dynamic epigenetic changes that actively propel reprogramming cells towards an alternative pluripotent cell state. In conclusion, the F-class cells represent an acquired state and not an intermediate state that all reprogramming cells transition through on the way to an ESC-like state.

Figure 5: Schematic representation of cell-state transitions during reprogramming.

HDACi denotes histone deacetylase inhibition; 4F denotes the four Yamanaka factors; 3F denotes the four Yamanaka factors minus c-Myc.

We propose that the F-class cell state is stably maintained as a consequence of high reprogramming factor expression and multiple epigenetic determinants. Through elevated expression of the four reprogramming factors we showed that F-class cells could be generated from both fibroblasts and ESC-like iPSCs. Notably, the cell type of origin leaves distinct signatures on the resultant F-class cells, as an imprint of their respective origin.

The ability to reprogram cells to novel cell states, such as the F-class state, can be harnessed to create a variety of artificial cells that possess desirable properties for regenerative medicine and drug discovery, such as the ability for scalable expansion in bioreactors and reproducible differentiation. ESCs are themselves an artificial in vitro cell state, captured during a brief developmental window and requiring specific culture conditions for their maintenance. The F-class cell state can be considered to be a distant pluripotent relative of the ESC state. The frequency at which F-class cells arise in transposon-based reprogramming, in combination with their advantageous properties, presents the opportunity to study and utilize a novel pluripotent cell type in biology, medical research and future medicine.

Methods

Cell culture

All cell lines were established in-house with full pathogen testing performed and maintained in a mycoplasma-free facility. Mouse embryonic fibroblasts (MEFs) were isolated as previously described18. ROSA26-rtTA-IRES-GFP embryos (JAX 005572)31 at 15.5 days post coitum were decapitated, eviscerated, dissociated with 0.25% trypsin, 0.1% EDTA and plated in DMEM, 10% FBS, penicillin–streptomycin and GlutaMAX. MEFs were reprogrammed within 4 passages of derivation. Tail-tip fibroblasts (TTFs) were obtained from 8-week-old mice. Tail-tips were mechanically dissociated with 0.25% trypsin and 1,000 U ml−1 collagenase (Type XI-S).

A standardized transfection protocol was established to electroporate fibroblasts (Neon, Invitrogen) with piggyBac transposons encoding the four reprogramming factors. In brief, 2 × 10⁶ MEFs were electroporated with 4 μg of plasmid (0.5 μg PBase transposon and 3.5 μg factors), using optimized parameters (2 pulses, 1,200 V). Electroporated fibroblasts were plated in serum-based mouse ESC media32 supplemented with 1.5 μg ml−1 doxycycline on gelatinized (0.1%) plates, at a density of 1.5 × 10⁴ cells per cm². Cells were fed every three days with doxycycline-containing media (1.5 μg ml−1). Colonies were clonally picked and expanded in a 96-well format. Unless stated otherwise, clonal cell lines were maintained in mouse ESC media supplemented with 1.5 μg ml−1 doxycycline. ROSA26-rtTA-IRES-GFP ESCs were used as control cells. 2i media conditions represent serum-free media consisting of DMEM:F12 supplemented with 15% Knockout serum replacement (Gibco), 3 μM CHIR99021 (GSK3β inhibitor) and 1 μM PD0325901 (MEK inhibitor) as previously described33.

Transgene-independent ESC-like iPSCs were obtained from F-class cells by exposure to sodium butyrate (0.25 mM) for seven days (plus doxycycline). Cells were then maintained in 2i media in the absence of sodium butyrate (plus doxycycline) for five days and then doxycycline was removed. Cells were thereafter maintained in either serum-based ESC media or 2i media.

EpiSCs were maintained in X-vivo base media (Lonza) supplemented with 10 mM β-mercaptoethanol (Sigma), 1 mM MEM-NEAA (Invitrogen), 2 mM GlutaMAX (Invitrogen), 20 ng ml−1 Activin A (R&D Systems), and 20 ng ml−1 basic fibroblast growth factor (R&D Systems). EpiSCs were passaged every 3–4 days as single cells in TrypLE (Invitrogen) and plated on wells pre-coated with Matrigel.

For retrovirus-mediated reprogramming, retroviral packaging of pMX constructs and subsequent transduction of cells was performed as previously described34.

Stirred suspension culture

Adherent cells were trypsinized and seeded into spinner flasks at 2 × 10⁴ cells per ml. 30-ml culture volumes were maintained at a constant stirring speed of 85 r.p.m. at 37 °C and 10% CO2. Every three days cell numbers were quantified and suspension cultures reset to 2 × 10⁴ cells per ml. One-half of the culture medium was replaced every two days.
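
The three-day reset of the suspension culture reduces to a simple dilution calculation. The sketch below is illustrative only: the function name and the example cell count are hypothetical, and the defaults are the 30-ml volume and the seeding density of 2 × 10⁴ cells per ml given above.

```python
def reset_volumes(counted_per_ml, target_per_ml=2e4, total_ml=30.0):
    """Return (culture volume to keep, fresh medium to add), both in ml,
    so that the flask is reset to target_per_ml in total_ml."""
    if counted_per_ml < target_per_ml:
        raise ValueError("culture is below the target density; cannot dilute")
    keep_ml = total_ml * target_per_ml / counted_per_ml
    return keep_ml, total_ml - keep_ml


# Hypothetical count of 6 x 10^5 cells per ml at the three-day time point
keep_ml, fresh_ml = reset_volumes(6e5)
fold_expansion = 6e5 / 2e4
print(f"keep {keep_ml:.1f} ml, add {fresh_ml:.1f} ml fresh medium")
```

For this hypothetical count, 1 ml of culture would be retained and topped up with 29 ml of fresh medium, corresponding to a 30-fold expansion since the previous reset.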

In vitro neural differentiation

Cells were plated on Geltrex (1:100 PBS dilution) coated plates at 5,000 cells per cm². 24 h after plating cells, ESC media was changed to serum-free media that consisted of DMEM:F12 supplemented with N2 (Gibco), B27 (Gibco), and 4 μg ml−1 insulin. All traces of doxycycline were removed by washing cells three times with PBS. Differentiation media was changed every three days.

Diploid aggregation generation of chimaeras

Cells were maintained for two passages in 2i media, and cell clumps of ∼8–15 cells were collected from gelatinized dishes by gentle trypsinization. For diploid chimaeras, 2.5 d.p.c. Hsd:ICR(CD-1) or C57BL/6 embryos were aggregated with in-vitro-derived cell clumps and cultured overnight at 37 °C in 5% CO2 in KSOM medium33. All embryos were transferred into pseudopregnant recipient ICR females 24 h later. For LacZ detection, pregnant dams were fed doxycycline food and water (0.2 mg ml−1 doxycycline; 5% sucrose in water) 24 h before dissection to activate β-geo expression in iPSC-derived cells. All mouse procedures were performed in accordance with the Toronto Centre for Phenogenomics animal care committee.

LacZ staining

As described in ref. 18, cells and embryos were fixed with 0.25% glutaraldehyde, rinsed in wash buffer (2 mM MgCl2, 0.01% sodium deoxycholate, and 0.02% Nonidet-P40 in PBS) and stained overnight (∼16 h) in LacZ staining solution: 20 mM MgCl2, 5 mM K3Fe(CN)6, 5 mM K4Fe(CN)6 and 1 mg ml−1 X-gal in PBS. Embryos were embedded in paraffin, sectioned and counterstained with neutral red.

Teratoma formation

Cells were trypsinized and suspended in DMEM:Matrigel mix (1:1), and 100 μl containing 1 × 10⁶ cells was injected subcutaneously into the dorsal flanks of nude mice (CByJ.Cg-Foxn1nu/J females, 6 weeks of age) anaesthetized with isoflurane. 4–6 weeks after injection, teratomas were dissected and fixed overnight in 4% formalin. Tissue was embedded in paraffin, sectioned and stained with haematoxylin and eosin.

Immunostaining and flow cytometry

Cells were washed once with PBS, fixed in 4% PFA for 15 min at room temperature and permeabilized with 0.1% Triton X-100 in PBS for 10 min. Primary antibody was added overnight at 4 °C: anti-α-SMA (C6198, Sigma), anti-Nanog (RCAB0002P, Reprocell), anti-DPPA4 (AF3730, R&D Systems), anti-FoxA2 (ab40874, Abcam), anti-SSEA1 (MAB4301, Millipore), anti-Sox2 (MAB2018, R&D Systems), anti-Oct3/4 (611203, BD), anti-GFP (6673, Abcam), anti-βIII-tubulin (TUJ1, Covance), anti-tyrosine hydroxylase (AB152, Millipore), anti-VGAT (131103, SYSY), anti-VGLUT1 (135302, SYSY). Secondary antibody (Jackson ImmunoResearch Cy3 IgG, 1:200; Alexa488 IgG or IgM, 1:400; Alexa594 IgG, 1:400) was added for 1 h at room temperature. Cell nuclei were stained with Hoechst 33342 (5 μg ml−1) for 15 min.

Flow cytometry

Cells were trypsinized and fixed in 4% PFA for 15 min at room temperature. Cells were washed and then stained with 0.1% Triton X-100 in PBS (2% FBS), incubated with primary antibody (Nanog 1:200) for 1 h on ice, washed twice in PBS (2% FBS), incubated with secondary antibody for 30 min on ice, washed twice and resuspended in PBS with 2% FBS for analysis on a FACS-Calibur. Cells were gated on the basis of forward scatter and side scatter.

Cell viability assay

Cell samples were trypsinized, resuspended in Annexin V buffer (10 mM HEPES, 140 mM NaCl, and 2.5 mM CaCl2, pH 7.4) and then incubated with Sytox AADvanced for 5 min and Annexin V for 10 min. Cellular fragments and debris were excluded from analysis using forward-scatter and side-scatter selection.

G-band karyotyping

G-banding was performed on actively dividing cells at the TCAG facility (Toronto, Canada). Cells were incubated with 0.2 μg ml−1 colcemid for 2 h at 37 °C and dissociated with 0.25% trypsin-EDTA. After pipetting to a single-cell suspension, cells were resuspended in pre-warmed (37 °C) 75 mM KCl for 15 min. Cells were then fixed with methanol:glacial acetic acid (1:3) and dropped onto glass slides. The slides containing cells were stained in Giemsa solution for 3 min, with 20 metaphases counted and scored for karyotyping.

Quantitative RT–PCR

Cells for RNA preparation were passaged on gelatin-coated plates. Total RNA was extracted from cells using an RNeasy kit (Qiagen). 1 μg of DNase-treated RNA was used as template to generate cDNA by QuantiTect reverse transcription kit (Qiagen). For quantitative RT–PCR we used LuminoCt SYBR Green qPCR ReadyMix (Sigma) with a JANUS automated liquid handling robot (PerkinElmer) loading the 384-well plates for RT-qPCR. 384-well plates were run on a CFX384 (Bio-Rad) with an annealing temperature of 58 °C for all primers. Primer pairs were all assessed for efficiency and melt curves performed. All PCR reactions were performed in triplicate. Primer sequences are listed in Supplementary Information 7.

Illumina BeadChip

Total RNA was assessed for quality and quantity on a Bioanalyzer and global gene expression profiling performed with the Illumina microarray. Purified and labelled RNA was hybridized to MouseRef-8 v2 expression BeadChips (Illumina) according to the manufacturer’s instructions. Bead intensities were mapped to gene information using BeadStudio 3.2 (Illumina). Background correction was performed using the Affymetrix Robust Multi-array Analysis and data log2-scaled with gene expression quantile normalized in the lumi package of Bioconductor.
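
For readers unfamiliar with quantile normalization, the principle can be sketched as follows. This is a minimal illustration and not the lumi implementation: the function names are hypothetical, the intensities are toy values, ties are broken by probe order, and applying the log2 transform before normalization is an assumption about the order of operations.

```python
import math


def log2_scale(samples):
    """Log2-transform every intensity (values must be positive)."""
    return [[math.log2(v) for v in sample] for sample in samples]


def quantile_normalize(samples):
    """Force all arrays to share one distribution: the k-th smallest value
    of each array is replaced by the mean k-th smallest value across arrays."""
    n_probes = len(samples[0])
    if any(len(sample) != n_probes for sample in samples):
        raise ValueError("all arrays must contain the same number of probes")
    sorted_samples = [sorted(sample) for sample in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_probes)
    ]
    normalized = []
    for sample in samples:
        # probe indices ordered from lowest to highest intensity
        order = sorted(range(n_probes), key=lambda i: sample[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Two toy arrays with three probes each
raw = [[8.0, 2.0, 4.0], [16.0, 64.0, 32.0]]
normalized = quantile_normalize(log2_scale(raw))
print(normalized)
```

After normalization both toy arrays contain the same set of values, differing only in which probe carries which value, so that between-array comparisons reflect rank changes rather than array-wide intensity shifts.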

Bisulphite sequencing

Bisulphite conversion was performed on genomic DNA samples (1 μg) using the EpiTect Bisulfite Kit (Qiagen). Bisulphite-treated genomic DNA was amplified by EpiTaq HS (Takara) using previously published bisulphite-specific primers35 and novel primers (Supplementary Information 4), with a PCR protocol consisting of an initial 1 min denaturation step followed by 35 cycles of 95 °C for 15 s, 55 °C for 30 s and 72 °C for 30 s. The resultant PCR amplicons were cloned into pGEM-T Easy and sequenced at the Centre for Applied Genomics (Toronto, Canada).

Statistical analysis

Unless otherwise stated, all data presented are representative of at least three independent experiments. Hierarchical clustering, principal component analysis and gene distance matrices were performed with MultiExperiment Viewer. Statistical analysis was performed with either Prism (GraphPad) or MultiExperiment Viewer (http://www.tm4.org/index.html). Gene ontology term analysis was performed with DAVID (Database for Annotation, Visualization and Integrated Discovery, http://david.abcc.ncifcrf.gov). Gene network association analysis was performed with GeneMANIA (http://www.genemania.org). Genes in the network analysis were chosen based on their membership of the PluriNet network22 and statistically significant differential expression between F-class samples and ESC samples. Differential expression was assessed using the limma package; P values were adjusted using the Benjamini–Hochberg method and the significance cut-off was set at 0.05.
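
The Benjamini–Hochberg adjustment named above can be sketched in a few lines. This is an illustrative stand-alone version, not the limma code path; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values for four genes
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

With these example values only the first gene remains below the 0.05 cut-off after adjustment, although three of the four raw P values were below 0.05.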