Abstract
Marker-gene sequencing is a cost-effective method of taxonomically profiling microbial communities. Unlike metagenomic approaches, marker-gene sequencing does not provide direct information about the functional genes present in the genomes of community members. However, by capitalizing on the rapid growth in the number of sequenced genomes, it is possible to infer which functions are likely associated with a marker gene based on its sequence similarity with a reference genome. The PICRUSt tool is based on this idea and predicts functional category abundances from input marker-gene data. In brief, the method requires a reference phylogeny whose tips correspond both to taxa with reference genomes and to taxa lacking sequenced genomes. A modified ancestral state reconstruction (ASR) method is then used to infer counts of functional categories for the taxa without reference genomes. These predictions are written to pre-calculated files, which can be cross-referenced with other datasets to quickly generate predictions of functional potential for a community. This chapter gives an in-depth description of these methods and describes how PICRUSt should be used.
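The cross-referencing step described above can be illustrated with a minimal sketch. It is not PICRUSt's own code: the taxon names, function names, counts, and the `predict_functions` helper are all hypothetical, and the sketch assumes that a community's predicted functional profile is the abundance-weighted sum of the per-taxon predicted counts held in a pre-calculated table. Any further normalization a real workflow applies is left out.

```python
# Toy illustration only: every identifier and number below is made up.

# Pre-calculated table: predicted count of each functional category per taxon.
PRECALCULATED = {
    "taxon_A": {"func_1": 2, "func_2": 0},
    "taxon_B": {"func_1": 1, "func_2": 3},
}

# Community table: observed marker-gene counts per sample and taxon.
COMMUNITY = {
    "sample_1": {"taxon_A": 10, "taxon_B": 5},
    "sample_2": {"taxon_A": 0, "taxon_B": 4},
}


def predict_functions(community, precalculated):
    """Sum (taxon abundance x predicted function count) within each sample."""
    predictions = {}
    for sample, taxa in community.items():
        totals = {}
        for taxon, abundance in taxa.items():
            for function, count in precalculated[taxon].items():
                totals[function] = totals.get(function, 0) + abundance * count
        predictions[sample] = totals
    return predictions


PREDICTED = predict_functions(COMMUNITY, PRECALCULATED)
for sample, totals in PREDICTED.items():
    print(sample, totals)
```

Because the per-taxon counts are computed once and stored, a step of this kind reduces to a table lookup and a weighted sum, which is why predictions for a new community can be generated quickly.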
Acknowledgments
We would like to acknowledge the other coauthors of PICRUSt who helped develop and test the software, as well as the many users of the PICRUSt mailing list who have provided insightful questions and comments.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Douglas, G.M., Beiko, R.G., Langille, M.G.I. (2018). Predicting the Functional Potential of the Microbiome from Marker Genes Using PICRUSt. In: Beiko, R., Hsiao, W., Parkinson, J. (eds) Microbiome Analysis. Methods in Molecular Biology, vol 1849. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8728-3_11
DOI: https://doi.org/10.1007/978-1-4939-8728-3_11
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-8726-9
Online ISBN: 978-1-4939-8728-3
eBook Packages: Springer Protocols