Abstract
Tree mining consists of discovering the frequent subtrees in a forest of trees. This problem has many application areas. For instance, a huge volume of the data available on the Internet is now described by trees (e.g. XML). Still, documents dealing with the same topic do not always share the same description, so a common structure must be mined before these documents can be queried. Biology is another field where data may be described by means of trees. The problem of mining trees has been addressed for several years, leading to well-known algorithms. However, these algorithms can hardly handle real data in a soft manner: they consider a subtree to be included in a super-tree only if the inclusion is complete, which means that all of its nodes must appear. In this paper, we extend this definition to fuzzy inclusion, based on the idea that a tree is included within another one to a certain degree, this fuzzy degree being correlated with the number of matching nodes.
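To make the idea of a degree of inclusion concrete, here is a minimal Python sketch. It is an illustration only and not the definition used in the paper, which the abstract does not spell out. It assumes rooted, labeled trees encoded as `(label, [children])` pairs, and it counts a pattern node as matching when its root-to-node label path also occurs in the target tree. The names `label_paths`, `inclusion_degree` and `fuzzy_support`, and the threshold parameter, are hypothetical.

```python
from collections import Counter

# Assumption of this sketch: a tree is a (label, [children]) pair.


def label_paths(tree, prefix=()):
    """Return a multiset with one root-to-node label path per node."""
    label, children = tree
    path = prefix + (label,)
    paths = Counter([path])
    for child in children:
        paths += label_paths(child, path)
    return paths


def inclusion_degree(pattern, target):
    """Fraction of pattern nodes whose label path also occurs in target.

    A degree of 1.0 corresponds to the crisp case where every node appears.
    """
    pattern_paths = label_paths(pattern)
    target_paths = label_paths(target)
    # Counter intersection keeps the minimum count of each shared path.
    matched = sum((pattern_paths & target_paths).values())
    return matched / sum(pattern_paths.values())


def fuzzy_support(pattern, forest, threshold):
    """Share of trees that include the pattern to at least `threshold`."""
    hits = sum(1 for tree in forest if inclusion_degree(pattern, tree) >= threshold)
    return hits / len(forest)


# Hypothetical XML-like documents describing the same topic differently.
pattern = ("book", [("title", []), ("author", []), ("year", [])])
target = ("book", [("title", []), ("author", [("name", [])])])

print(inclusion_degree(pattern, target))            # 3 of 4 pattern nodes match
print(fuzzy_support(pattern, [target, pattern], 1.0))
```

With a threshold of 1.0 the sketch falls back to full inclusion, while a lower threshold also counts trees in which some pattern nodes are missing. Matching on label paths is a deliberate simplification; a real miner would need a proper notion of tree embedding.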
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Del Razo Lopez, F., Laurent, A., Poncelet, P., Teisseire, M. (2007). Fuzzy Tree Mining: Go Soft on Your Nodes. In: Melin, P., Castillo, O., Aguilar, L.T., Kacprzyk, J., Pedrycz, W. (eds) Foundations of Fuzzy Logic and Soft Computing. IFSA 2007. Lecture Notes in Computer Science(), vol 4529. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72950-1_15
DOI: https://doi.org/10.1007/978-3-540-72950-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72917-4
Online ISBN: 978-3-540-72950-1
eBook Packages: Computer Science, Computer Science (R0)