Abstract
In this chapter, we focus on XPath, a domain-specific language that we can use from within R (among other languages) to query sets of nodes in an XML tree by matching patterns in those nodes. XPath is quite simple but very powerful. Much like navigating a file hierarchy, it allows us to identify nodes of interest by specifying paths through the tree, based on node names, node content, and a node’s relationship to other nodes in the hierarchy. We typically use XPath to locate nodes in a tree and then use R functions to extract data from those nodes and bring the data into R. The combination of R and XPath gives us flexible facilities for working with XML, and anyone working with XML on a regular basis should learn the details of XPath. XPath is the primary tool for working with XML content, whether we are scraping data from Web pages, calling Web services, or processing local XML documents.
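As a rough illustration of the workflow described in the abstract (locate nodes with XPath, then pull their data into R), the following is a minimal sketch using the XML package. The document, element names, attribute names, and values are all made up for this example and do not come from the chapter.

```r
library(XML)  # provides xmlParse(), xpathSApply(), xmlValue(), xmlGetAttr()

# A small, hypothetical document used only for illustration.
txt <- '<catalog>
  <book id="b1" year="2004"><title>First Book</title><price>39.95</price></book>
  <book id="b2" year="2002"><title>Second Book</title><price>24.95</price></book>
  <book id="b3" year="2008"><title>Third Book</title><price>49.99</price></book>
</catalog>'

doc <- xmlParse(txt, asText = TRUE)

# Path by node name: every <title> that is a child of a <book>.
titles <- xpathSApply(doc, "//book/title", xmlValue)

# Predicate on an attribute: titles of books with year greater than 2003.
recent <- xpathSApply(doc, "//book[@year > 2003]/title", xmlValue)

# Predicate on node content: the id attribute of books priced under 30.
cheap_id <- xpathSApply(doc, "//book[price < 30]", xmlGetAttr, "id")

# Relationship to other nodes: the title of the first <book> sibling
# that follows the book whose id is "b1".
next_title <- xpathSApply(
  doc, "//book[@id='b1']/following-sibling::book[1]/title", xmlValue
)

# Bring the located content into R as ordinary numeric data.
prices <- as.numeric(xpathSApply(doc, "//book/price", xmlValue))
```

Each call pairs an XPath expression, which selects the nodes, with an R function such as `xmlValue` or `xmlGetAttr`, which extracts the data from each selected node. The result is an ordinary R vector that can be used in further computation.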
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Nolan, D., Lang, D.T. (2014). XPath, XPointer, and XInclude. In: XML and Web Technologies for Data Sciences with R. Use R!. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7900-0_4
DOI: https://doi.org/10.1007/978-1-4614-7900-0_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7899-7
Online ISBN: 978-1-4614-7900-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)