Abstract
Two of the leading open-source environments for machine and statistical learning in data mining and knowledge discovery are the software packages Weka and R, which have emerged from the machine learning and statistics communities, respectively. To make the tools of both environments available in a single unified system, the R package RWeka is proposed, which interfaces Weka’s functionality to R. With only a thin layer of (mostly R) code, it provides a set of general interface generators that set up interface functions with the usual “R look and feel”, re-using Weka’s standardized interface of learner classes (including classifiers, clusterers, associators, filters, loaders, savers, and stemmers) and their associated methods.
Cite this article
Hornik, K., Buchta, C. & Zeileis, A. Open-source machine learning: R meets Weka. Comput Stat 24, 225–232 (2009). https://doi.org/10.1007/s00180-008-0119-7