Abstract
We present Tree², a new approach to structural classification. This integrated approach induces decision trees whose inner nodes test for pattern occurrence. It combines state-of-the-art tree mining with sophisticated pruning techniques to find the most discriminative pattern in each node. In contrast to existing methods, Tree² uses no heuristics, and the user has to choose only a single, statistically well-founded parameter. The experiments show that Tree² classifiers achieve good accuracies while the induced models are smaller than those of existing approaches, which makes them easier to comprehend.
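The idea of a decision tree whose inner nodes test for pattern occurrence can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: instances are simplified to sets of items, candidate patterns are a fixed list of item sets (a stand-in for mined subtrees), pattern quality is scored with a chi-squared statistic, and the single parameter is taken to be a significance cutoff (3.84, the 5% critical value for one degree of freedom). All function names are hypothetical, and the tree mining and pruning steps named in the abstract are replaced here by exhaustive scoring of the candidate list.

```python
from collections import Counter

def chi2(labels, present):
    """Chi-squared statistic of the (pattern present?) x (class) table."""
    n = len(labels)
    score = 0.0
    for has in (True, False):
        row = sum(1 for p in present if p == has)
        for c in set(labels):
            col = sum(1 for y in labels if y == c)
            expected = row * col / n
            if expected == 0:
                continue  # empty row contributes nothing
            observed = sum(1 for p, y in zip(present, labels)
                           if p == has and y == c)
            score += (observed - expected) ** 2 / expected
    return score

def build(data, patterns, threshold=3.84):
    """Grow a tree; data is a list of (item_set, label) pairs.

    Assumption: `patterns` is a fixed candidate list of frozensets,
    and a pattern "occurs" in an instance if it is a subset of it.
    """
    labels = [y for _, y in data]
    majority = Counter(labels).most_common(1)[0][0]
    best, best_score = None, 0.0
    for p in patterns:
        s = chi2(labels, [p <= x for x, _ in data])
        if s > best_score:
            best, best_score = p, s
    # Stop when no pattern is significantly correlated with the class.
    if best is None or best_score < threshold:
        return majority
    yes = [(x, y) for x, y in data if best <= x]
    no = [(x, y) for x, y in data if not (best <= x)]
    return (best, build(yes, patterns, threshold),
            build(no, patterns, threshold))

def predict(tree, x):
    """Follow pattern tests from the root down to a leaf label."""
    while isinstance(tree, tuple):
        pattern, yes, no = tree
        tree = yes if pattern <= x else no
    return tree

# Toy data: the item "a" separates the two classes perfectly.
data = [
    ({"a", "b"}, "pos"), ({"a", "c"}, "pos"),
    ({"a"}, "pos"), ({"a", "b", "c"}, "pos"),
    ({"b"}, "neg"), ({"c"}, "neg"),
    ({"b", "c"}, "neg"), (set(), "neg"),
]
patterns = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"})]
tree = build(data, patterns)
```

On this toy data the sketch produces a single test on the pattern containing "a", with one leaf per class. Scoring every candidate in every node is only feasible for a short list; the abstract indicates that the actual method instead integrates the search for the best pattern with tree mining and pruning.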
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bringmann, B., Zimmermann, A. (2005). Tree² – Decision Trees for Tree Structured Data. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science, vol 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_10
DOI: https://doi.org/10.1007/11564126_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7
eBook Packages: Computer Science (R0)