Abstract
Advanced database systems face a great challenge arising from the emergence of massive, complex structural data in bioinformatics, chem-informatics, business processes, etc. One of the most important functions needed in these areas is efficient search of complex graph data. Given a graph query, it is desirable to retrieve relevant graphs quickly from a large database via efficient graph indices. This chapter gives an introduction to graph substructure search, approximate substructure search, and their related graph indexing techniques, particularly feature-based graph indexing.
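To make the idea of feature-based indexing concrete, the following is a minimal sketch of a filter-then-verify substructure search. It is not the chapter's method: it assumes, purely for illustration, that the indexed features are the label pairs of single edges (practical indices use richer features such as paths, trees, or frequent subgraphs), and that verification is a brute-force subgraph isomorphism test suitable only for tiny graphs. All names (`make_graph`, `edge_features`, `build_index`, `candidates`, `is_subgraph`, `search`) and the toy database are hypothetical.

```python
from itertools import permutations

def make_graph(labels, edge_list):
    """A graph is (labels, edges): vertex id -> label, and a set of undirected edges."""
    return labels, {frozenset(e) for e in edge_list}

def edge_features(graph):
    """Assumed feature set: the sorted label pair of every edge (no self-loops)."""
    labels, edges = graph
    return {tuple(sorted(labels[v] for v in e)) for e in edges}

def build_index(database):
    """Inverted index: feature -> ids of the database graphs that contain it."""
    index = {}
    for gid, graph in database.items():
        for feature in edge_features(graph):
            index.setdefault(feature, set()).add(gid)
    return index

def candidates(index, database, query):
    """Filtering step: keep only graphs containing every feature of the query."""
    result = set(database)
    for feature in edge_features(query):
        result &= index.get(feature, set())
    return result

def is_subgraph(query, graph):
    """Verification step: brute-force label-preserving subgraph isomorphism."""
    q_labels, q_edges = query
    g_labels, g_edges = graph
    q_nodes = list(q_labels)
    # Try every injective assignment of query vertices to graph vertices.
    for image in permutations(g_labels, len(q_nodes)):
        m = dict(zip(q_nodes, image))
        labels_ok = all(q_labels[v] == g_labels[m[v]] for v in q_nodes)
        edges_ok = all(frozenset(m[v] for v in e) in g_edges for e in q_edges)
        if labels_ok and edges_ok:
            return True
    return False

def search(index, database, query):
    """Answer a substructure query: filter with the index, then verify."""
    return sorted(gid for gid in candidates(index, database, query)
                  if is_subgraph(query, database[gid]))

# Toy database (hypothetical data).
DATABASE = {
    "g1": make_graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)]),
    "g2": make_graph({1: "C", 2: "O", 3: "N"}, [(1, 2), (1, 3)]),
    "g3": make_graph({1: "C", 2: "C", 3: "C", 4: "O"}, [(1, 2), (3, 4)]),
}
INDEX = build_index(DATABASE)

# Query: the path C-C-O.
QUERY = make_graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)])

if __name__ == "__main__":
    print(sorted(candidates(INDEX, DATABASE, QUERY)))
    print(search(INDEX, DATABASE, QUERY))
```

In this toy example the index prunes `g2` without any isomorphism test, while `g3` survives filtering (it contains both edge features, but in separate components) and is rejected only at verification. This shows why the filtering power of the chosen features matters: weaker features leave more false positives for the expensive verification step.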
Copyright information
© 2010 Springer-Verlag US
About this chapter
Cite this chapter
Yan, X., Han, J. (2010). Graph Indexing. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_5
DOI: https://doi.org/10.1007/978-1-4419-6045-0_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-6044-3
Online ISBN: 978-1-4419-6045-0
eBook Packages: Computer Science, Computer Science (R0)