Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis

Perspective

From Nature Genetics

The high proportion of zeros in typical single-cell RNA sequencing datasets has led to widespread but inconsistent use of terminology such as dropout and missing data. Here, we argue that much of this terminology is unhelpful and confusing, and we outline simple ideas to help reduce the confusion. These include: (1) observed single-cell RNA sequencing counts reflect both true gene expression levels and measurement error, and carefully distinguishing between these contributions helps to clarify thinking; and (2) method development should start with a Poisson measurement model, rather than more complex models, because it is simple and generally consistent with existing data. We outline how several existing methods can be viewed within this framework and highlight how these methods differ in their assumptions about expression variation. We also illustrate how our perspective helps to address questions of biological interest, such as whether messenger RNA expression levels are multimodal among cells.
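The separation of measurement and expression models can be illustrated with a small simulation. The sketch below is not the authors' code; it assumes, purely for illustration, a gamma expression model for one gene across cells and a Poisson measurement model whose rate is the cell's expression level times a per-cell size factor. All names (`size_factor`, `mean_expr`, `dispersion`) and parameter values are hypothetical. Under these assumptions the observed counts are marginally negative binomial, so a large fraction of zeros can arise without any separate zero-inflation or dropout component.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_zero_fraction(n_cells, size_factor, mean_expr, dispersion, seed=0):
    """Simulate counts for one gene and return the observed fraction of zeros.

    Expression model (assumed here): lambda ~ Gamma(shape=1/dispersion,
    scale=mean_expr*dispersion), so E[lambda] = mean_expr.
    Measurement model: x | lambda ~ Poisson(size_factor * lambda).
    """
    rng = random.Random(seed)
    shape = 1.0 / dispersion
    scale = mean_expr * dispersion
    zeros = 0
    for _ in range(n_cells):
        lam = rng.gammavariate(shape, scale)        # true expression level
        x = sample_poisson(size_factor * lam, rng)  # observed molecule count
        zeros += (x == 0)
    return zeros / n_cells


def expected_zero_fraction(size_factor, mean_expr, dispersion):
    """Zero probability of the resulting negative binomial marginal distribution."""
    return (1.0 + size_factor * mean_expr * dispersion) ** (-1.0 / dispersion)


if __name__ == "__main__":
    observed = simulate_zero_fraction(20000, 5000.0, 1e-4, 0.5, seed=1)
    expected = expected_zero_fraction(5000.0, 1e-4, 0.5)
    print(f"observed zero fraction: {observed:.3f}")
    print(f"expected under Poisson-gamma: {expected:.3f}")
```

With these illustrative values the mean observed count is 0.5 per cell, and most cells show a zero even though every cell has non-zero true expression. The zeros here come from sampling at low depth combined with expression variation, not from a distinct dropout process.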

Fig. 1: Comparing single-gene expression models on scRNA-seq data.

Data availability

Sorted immune cell and PBMC data were downloaded from [URL missing]. iPSC data were downloaded from the Gene Expression Omnibus (accession number GSE118723). Brain data were downloaded from the Genotype-Tissue Expression portal. Kidney and retina data were downloaded from the Human Cell Atlas Data Portal. Control data were downloaded from [URL missing]. All of the results generated in this study are available at [URL missing], and all analysis notebooks have been published at [URL missing].

Code availability

All of the code used to perform the analysis is available at [URL missing] and [URL missing].


Acknowledgements

We thank members of the M.S. and Y. Gilad laboratories for helpful comments. This work was supported by NIH grant HG002585 and a Gut Cell Atlas grant from The Leona M. and Harry B. Helmsley Charitable Trust (both to M.S.).

Author information


Contributions

A.S. and M.S. developed the theory. A.S. performed the analysis. A.S. and M.S. wrote the paper.

Corresponding authors

Correspondence to Abhishek Sarkar or Matthew Stephens.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Supplementary information

Supplementary Information: Supplementary Notes 1–5, Figs. 1–4 and Methods

Reporting Summary

