Abstract
Transcriptional programs control cellular lineage commitment and differentiation during development. Single-cell RNA sequencing (RNA-seq) has advanced our understanding of cell fate, but current analytic methods are limited by their assumptions about the structure of the data. We present single-cell topological data analysis (scTDA), an algorithm for topology-based, unbiased computational analysis of temporal transcriptional regulation. Unlike other methods, scTDA is a nonlinear, model-independent, unsupervised statistical framework that can characterize transient cellular states. We applied scTDA to the analysis of murine embryonic stem cell (mESC) differentiation in vitro in response to inducers of motor neuron differentiation. scTDA resolved asynchrony and continuity in cellular identity over time and identified four transient states (pluripotent, precursor, progenitor, and fully differentiated cells) based on changes in stage-dependent combinations of transcription factors, RNA-binding proteins, and long noncoding RNAs (lncRNAs). scTDA can be applied to study asynchronous cellular responses to either developmental cues or environmental perturbations.
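The abstract does not spell out the topological construction. As a rough illustration only, the sketch below builds a toy graph in the style of the Mapper algorithm (Singh, Mémoli & Carlsson, listed in the references): cells are binned by overlapping intervals of a filter value, clustered within each bin, and clusters that share cells are linked. It is not the scTDA implementation. The function names, the parameters (`n_intervals`, `overlap`, `eps`), the single-linkage clustering, and the synthetic data are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def single_linkage(points, idx, eps):
    """Split the indices in idx into connected components of the eps-neighbour graph."""
    unseen, clusters = set(idx), []
    while unseen:
        stack = [unseen.pop()]
        comp = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unseen if dist(points[i], points[j]) <= eps}
            unseen -= near
            comp |= near
            stack.extend(near)
        clusters.append(frozenset(comp))
    return clusters


def mapper_graph(points, filter_values, n_intervals=4, overlap=0.5, eps=1.0):
    """Toy Mapper-style graph: cover the filter range, cluster each bin, link shared cells."""
    lo, hi = min(filter_values), max(filter_values)
    step = (hi - lo) / n_intervals
    pad = step * overlap / 2  # each interval is widened by pad on both sides
    nodes = []
    for k in range(n_intervals):
        a, b = lo + k * step - pad, lo + (k + 1) * step + pad
        idx = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage(points, idx, eps))
    # Two nodes are connected when their clusters have at least one cell in common.
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2) if nodes[u] & nodes[v]}
    return nodes, edges


# Hypothetical data: ten "cells" spread along a single trajectory.
cells = [(float(t), 0.0) for t in range(10)]
pseudo_time = [c[0] for c in cells]  # stand-in filter function, not real sampling times
nodes, edges = mapper_graph(cells, pseudo_time, n_intervals=3, overlap=0.5, eps=1.5)
print(sorted(edges))  # [(0, 1), (1, 2)]
```

In this toy case the three overlapping bins form a chain, which is the shape a continuous, unbranched progression of cell states would be expected to produce; a branch point would appear as a node with more than two neighbours.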
References
Jessell, T.M. Neuronal specification in the spinal cord: inductive signals and transcriptional codes. Nat. Rev. Genet. 1, 20–29 (2000).
Wichterle, H., Lieberam, I., Porter, J.A. & Jessell, T.M. Directed differentiation of embryonic stem cells into motor neurons. Cell 110, 385–397 (2002).
Sances, S. et al. Modeling ALS with motor neurons derived from human induced pluripotent stem cells. Nat. Neurosci. 19, 542–553 (2016).
Phatnani, H.P. et al. Intricate interplay between astrocytes and motor neurons in ALS. Proc. Natl. Acad. Sci. USA 110, E756–E765 (2013).
Bratt-Leal, A.M., Carpenedo, R.L. & McDevitt, T.C. Engineering the embryoid body microenvironment to direct embryonic stem cell differentiation. Biotechnol. Prog. 25, 43–51 (2009).
Haghverdi, L., Büttner, M., Wolf, F.A., Buettner, F. & Theis, F.J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
Setty, M. et al. Wishbone identifies bifurcating developmental trajectories from single-cell data. Nat. Biotechnol. 34, 637–645 (2016).
Welch, J.D., Hartemink, A.J. & Prins, J.F. SLICER: inferring branched, nonlinear cellular trajectories from single cell RNA-seq data. Genome Biol. 17, 106 (2016).
Angerer, P. et al. destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 32, 1241–1243 (2016).
Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).
Marco, E. et al. Bifurcation analysis of single-cell gene expression data reveals epigenetic landscape. Proc. Natl. Acad. Sci. USA 111, E5643–E5650 (2014).
Chan, J.M., Carlsson, G. & Rabadan, R. Topology of viral evolution. Proc. Natl. Acad. Sci. USA 110, 18566–18571 (2013).
Cámara, P.G., Levine, A.J. & Rabadán, R. Inference of ancestral recombination graphs through topological data analysis. PLoS Comput. Biol. 12, e1005071 (2016).
Camara, P.G., Rosenbloom, D.I., Emmett, K.J., Levine, A.J. & Rabadan, R. Topological data analysis generates high-resolution, genome-wide maps of human recombination. Cell Syst. 3, 83–94 (2016).
Nicolau, M., Levine, A.J. & Carlsson, G. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proc. Natl. Acad. Sci. USA 108, 7265–7270 (2011).
Li, L. et al. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci. Transl. Med. 7, 311ra174 (2015).
Bendall, S.C. et al. Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725 (2014).
Satija, R., Farrell, J.A., Gennert, D., Schier, A.F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
Singh, G., Mémoli, F. & Carlsson, G.E. Topological methods for the analysis of high dimensional data sets and 3D object recognition. in SPBG 91–100 (Citeseer, 2007).
Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012).
Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).
McDavid, A., Finak, G. & Gottardo, R. The contribution of cell cycle to heterogeneity in single-cell RNA-seq data. Nat. Biotechnol. 34, 591–593 (2016).
Mi, H., Muruganujan, A., Casagrande, J.T. & Thomas, P.D. Large-scale gene function analysis with the PANTHER classification system. Nat. Protoc. 8, 1551–1566 (2013).
Balmer, J.E. & Blomhoff, R. Gene expression regulation by retinoic acid. J. Lipid Res. 43, 1773–1808 (2002).
Rhinn, M. & Dollé, P. Retinoic acid signalling during development. Development 139, 843–858 (2012).
Gaunt, S.J. & Strachan, L. Temporal colinearity in expression of anterior Hox genes in developing chick embryos. Dev. Dyn. 207, 270–280 (1996).
Zhang, X., Weissman, S.M. & Newburger, P.E. Long intergenic non-coding RNA HOTAIRM1 regulates cell cycle progression during myeloid maturation in NB4 human promyelocytic leukemia cells. RNA Biol. 11, 777–787 (2014).
Lin, M. et al. RNA-Seq of human neurons derived from iPS cells reveals candidate long non-coding RNAs involved in neurogenesis and neuropsychiatric disorders. PLoS One 6, e23356 (2011).
Mallo, M. & Alonso, C.R. The regulation of Hox gene expression during animal development. Development 140, 3951–3963 (2013).
Dinger, M.E. et al. Long noncoding RNAs in mouse embryonic stem cell pluripotency and differentiation. Genome Res. 18, 1433–1445 (2008).
Sommer, L., Ma, Q. & Anderson, D.J. neurogenins, a novel family of atonal-related bHLH transcription factors, are putative mammalian neuronal determination genes that reveal progenitor cell heterogeneity in the developing CNS and PNS. Mol. Cell. Neurosci. 8, 221–241 (1996).
Darnell, R.B. RNA protein interaction in neurons. Annu. Rev. Neurosci. 36, 243–270 (2013).
Quesnel-Vallières, M., Irimia, M., Cordes, S.P. & Blencowe, B.J. Essential roles for the splicing regulator nSR100/SRRM4 during nervous system development. Genes Dev. 29, 746–759 (2015).
Calarco, J.A. et al. Regulation of vertebrate nervous system alternative splicing and development by an SR-related protein. Cell 138, 898–910 (2009).
Treutlein, B. et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509, 371–375 (2014).
Petropoulos, S. et al. Single-cell RNA-Seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell 165, 1012–1026 (2016).
Telley, L. et al. Sequential transcriptional waves direct the differentiation of newborn neurons in the mouse neocortex. Science 351, 1443–1446 (2016).
Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).
Anders, S., Pyl, P.T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Stegle, O., Teichmann, S.A. & Marioni, J.C. Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet. 16, 133–145 (2015).
Grün, D., Kester, L. & van Oudenaarden, A. Validation of noise models for single-cell transcriptomics. Nat. Methods 11, 637–640 (2014).
Kharchenko, P.V., Silberstein, L. & Scadden, D.T. Bayesian approach to single-cell differential expression analysis. Nat. Methods 11, 740–742 (2014).
Shalek, A.K. et al. Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 510, 363–369 (2014).
Brennecke, P. et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat. Methods 10, 1093–1095 (2013).
Edelsbrunner, H., Letscher, D. & Zomorodian, A. Topological persistence and simplification. Discrete Comput. Geom. 28, 511–533 (2002).
Zomorodian, A. & Carlsson, G. Computing persistent homology. Discrete Comput. Geom. 33, 249–274 (2005).
Binns, D. et al. QuickGO: a web-based tool for Gene Ontology searching. Bioinformatics 25, 3045–3046 (2009).
UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 43, D204–D212 (2015).
Zhao, Y. et al. NONCODE 2016: an informative and valuable data source of long non-coding RNAs. Nucleic Acids Res. 44, D203–D208 (2016).
Acknowledgements
We thank T. Jessell, N. Francis, and H. Phatnani for critical reading of the manuscript. A.H.R. and T.M. thank the New York Genome Center and D. Goldstein for sequencing support, S. Morton for providing Engrailed antibody, and P. Sims for experimental discussions. P.G.C. and R.R. thank A.J. Levine, G. Carlsson, F. Abate, I. Filip, S. Zairis, U. Rubin, and P. van Nieuwenhuizen for useful comments and discussions, O.T. Elliott for technical support with the online database, and Ayasdi Inc. for technical support. The work of P.G.C. and R.R. is supported by the NIH grants U54-CA193313-01 and R01GM117591. The work of A.H.R., E.K.K., T.J.R. and T.M. is supported by ALS Therapy Alliance grant ATA-2013-F-056 and NIH grant NS088992.
Author information
Contributions
P.G.C. and R.R. developed the topology-based computational approach (scTDA) and applied it to single cell RNA sequencing data. A.H.R., E.K.K., and T.M. designed all experiments. A.H.R., E.K.K., and T.J.R. conducted experiments. I.S. conducted all flow cytometry. A.H.R., P.G.C., E.K.K., T.M., and R.R. analyzed the data and wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–20, Supplementary Table 1, and Supplementary Notes 1 and 2 (PDF 9143 kb)
Supplementary Tables 2–4
All genes; Ontology; lncRNAs. (XLSX 3377 kb)
Supplementary Table 5
All genes: characterization of the expression profile in the topological representation of 80 embryonic (E18.5) mouse lung epithelial cells (Treutlein et al., 2014; ref. 35). (XLSX 542 kb)
Supplementary Table 6
Characterization of the expression profile in the topological representation of 1,529 individual cells from 88 human preimplantation embryos. (XLSX 1929 kb)
Supplementary Table 7
All genes: characterization of the expression profile in the topological representation of 272 newborn neurons from the mouse neocortex. (XLSX 1660 kb)
Supplementary Table 8
Barcoded reverse transcription primers utilized in motor neuron differentiation experiment 2. (XLSX 5 kb)
Supplementary Code
Python code for single-cell topological data analysis (scTDA). Also available at https://github.com/RabadanLab/scTDA. (TXT 94 kb)
About this article
Cite this article
Rizvi, A., Camara, P., Kandror, E. et al. Single-cell topological RNA-seq analysis reveals insights into cellular differentiation and development. Nat Biotechnol 35, 551–560 (2017). https://doi.org/10.1038/nbt.3854
DOI: https://doi.org/10.1038/nbt.3854
- Springer Nature America, Inc.