Abstract
Graphs are rich and flexible enough to express discrete structures hidden in large amounts of data. Several search methods that use graph-algorithmic techniques have been developed in Knowledge Discovery. A term graph, one representation of graph-structured data, is a hypergraph whose hyperedges are regarded as variables. Although term graphs can represent complicated patterns found in structured data, pattern matching and pattern search on them are hard. We have been studying a subclass of term graphs, called regular term trees, which is suited to expressing tree-like structured data. In this paper, we consider a matching problem for a regular term tree t and a standard tree T: decide whether there exists a tree T′ such that T′ is isomorphic to T and T′ is obtained by replacing the variables in t with some trees. First, we show that the matching problem for a regular term tree and a tree is NP-complete even if each variable in the regular term tree contains only 4 vertices. Next, we give a polynomial time algorithm that solves the matching problem for a regular term tree and a tree of bounded degree, provided that every variable of the regular term tree consists of a constant number of vertices greater than one. We also report some computational experiments and compare our algorithm with a naive algorithm.
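To make the matching problem concrete, the following is a minimal sketch that only verifies a proposed witness: given a term tree t, a binding of each variable to a tree, and a target tree T, it builds T′ by substitution and tests whether T′ is isomorphic to T. It is not the paper's polynomial time algorithm, which must search for the bindings. The function names (`substitute`, `canonical_form`, `is_witness`), the edge-list representation, and the restriction to unlabeled trees with at least one edge are assumptions made for illustration; the isomorphism test is the standard canonical encoding rooted at the tree centers.

```python
from collections import defaultdict

def substitute(edges, variables, bindings):
    """Build T' from a term tree by replacing every variable with a tree.

    edges:     ordinary edges of the term tree, as (u, v) pairs
    variables: variables[i] is the tuple of port vertices of variable i
    bindings:  bindings[i] = (tree_edges, tree_ports); tree_ports[j] is
               glued onto variables[i][j] (hypothetical representation)
    """
    result = list(edges)
    for i, ports in enumerate(variables):
        tree_edges, tree_ports = bindings[i]
        glue = dict(zip(tree_ports, ports))
        for u, v in tree_edges:
            # Non-port vertices get fresh names so that bindings never clash.
            new_u = glue.get(u, ("x", i, u))
            new_v = glue.get(v, ("x", i, v))
            result.append((new_u, new_v))
    return result

def canonical_form(edges):
    """Canonical string of an unlabeled tree (assumes the input is a tree)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Find the one or two centers by repeatedly peeling off leaves.
    degree = {v: len(adj[v]) for v in adj}
    leaves = [v for v in adj if degree[v] == 1]
    remaining = len(adj)
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            for nb in adj[leaf]:
                degree[nb] -= 1
                if degree[nb] == 1:
                    next_leaves.append(nb)
        leaves = next_leaves

    def encode(v, parent):
        # Sorting the child encodings makes the result order-independent.
        children = sorted(encode(c, v) for c in adj[v] if c != parent)
        return "(" + "".join(children) + ")"

    return min(encode(c, None) for c in leaves)

def is_witness(edges, variables, bindings, target_edges):
    """True if substituting the bindings into t gives a tree isomorphic to T."""
    t_prime = substitute(edges, variables, bindings)
    return canonical_form(t_prime) == canonical_form(target_edges)

# Term tree t: edges 1-2 and 3-4, with one variable on the ports (2, 3).
t_edges = [(1, 2), (3, 4)]
t_vars = [(2, 3)]
# Bind the variable to a path p-m-q whose end vertices are the ports.
bindings = [([("p", "m"), ("m", "q")], ("p", "q"))]

path5 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
star4 = [("c", 1), ("c", 2), ("c", 3), ("c", 4)]

print(is_witness(t_edges, t_vars, bindings, path5))  # True
print(is_witness(t_edges, t_vars, bindings, star4))  # False
```

A naive matcher would enumerate candidate bindings and call a check like `is_witness` on each one, which is exponential in general.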
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miyahara, T., Shoudai, T., Uchida, T., Takahashi, K., Ueda, H. (2000). Polynomial Time Matching Algorithms for Tree-Like Structured Patterns in Knowledge Discovery. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science(), vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_4
DOI: https://doi.org/10.1007/3-540-45571-X_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive