Abstract.
Finding efficient, effective ways to compare graphs arising from recognition processes with their corresponding ground-truth graphs is an important step toward more rigorous performance evaluation.
In this paper, we examine in detail the graph probing paradigm we first put forth in the context of our work on table understanding and later extended to HTML-coded Web pages. We present a formalism showing that graph probing provides a lower bound on the true edit distance between two graphs. From an empirical standpoint, the results of two simulation studies and an experiment using scanned pages show that graph probing correlates well with the latter measure. Moreover, our technique is very fast; graphs with tens or hundreds of thousands of vertices can be compared in mere seconds. Ease of implementation, scalability, and speed of execution make graph probing an attractive alternative for graph comparison.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Babai L, Erdös P, Selkow SM (1980) Random graph isomorphism. SIAM J Comput 9(3):628-635
Bunke H (1997) On a relation between graph edit distance and maximum common subgraph. Patt Recog Lett 18:689-694
Bunke H (2000) Recent developments in graph matching. In: Proceedings of the 15th International Conference on Pattern Recognition, Barcelona, Spain, September 2000, 2:117-124
Bunke H, Messmer BT (1997) Recent advances in graph matching. Int J Patt Recog Artif Intell 11(1):169-203
Corneil DG, Kirkpatrick DG (1980) A theoretical analysis of various heuristics for the graph isomorphism problem. SIAM J Comput 9(2):281-297
Dubois D, Prade H, Sédes F (1999) Some uses of fuzzy logic in multimedia databases querying. In: Proceedings of the Workshop on Logical and Uncertainty Models for Information Systems, London, July 1999, pp 46-54
Fortin S (1996) The graph isomorphism problem. Department of Computer Science Technical Report TR 96-20, University of Alberta, Canada
Hu J, Kashi R, Lopresti D, Nagy, Wilfong G (2001) Why table ground-truthing is hard. In: Proceedings of the 6th International Conference on Document Analysis and Recognition, Seattle, September 2001, pp 129-133
Hu J, Kashi R, Lopresti D, Wilfong G (2000) A system for understanding and reformulating tables. In: Proceedings of the 4th IAPR International Workshop on Document Analysis Systems, Rio de Janeiro, December 2000, pp 361-372
Hu J, Kashi R, Lopresti D, Wilfong G (2001) Table structure recognition and its evaluation. In: Proceedings of Document Recognition and Retrieval VIII, San Jose, January 2001, 4307:44-55
Hu J, Kashi R, Lopresti D, Wilfong G (2002) Evaluating the performance of table processing algorithms. Int J Doc Anal Recog 4(3):140-153
Jolion JM (2001) Graph matching: what are we really talking about? In: Proceedings of the 3rd IAPR Workshop on Graph-Based Representations in Pattern Recognition, Ischia, Italy, May 2001. http://citeseer.nj.nec.com/503443.html
Kanungo T, Lee CH, Czorapinski J, Bella I (2001) TRUEVIZ: a groundtruth / metadata editing and visualizing toolkit for OCR. In: Proceedings of Document Recognition and Retrieval VIII, San Jose, January 2001, 4307:1-12
Koutsofios E, North SC (1991) Drawing graphs with dot. Technical Report 59113-910904-08TM, AT&T Bell Laboratories
Lazarescu M, Bunke H, Venkatesh S (2000) Graph matching: fast candidate elimination using machine learning techniques. In: Advances in pattern recognition. Lecture Notes in Computer Science, vol 1876, Springer, Berlin Heidelberg New York, pp 236-245
Lopresti D, Wilfong G (2001a) Applications of graph probing to Web document analysis. In: Proceedings of the 1st international workshop on Web document analysis, Seattle, September 2001, pp 51-54
Lopresti D, Wilfong G (2001b) Comparing semi-structured documents via graph probing. In: Proceedings of the 7th International Workshop on Multimedia Information Systems, Capri, Italy, November 2001, pp 41-50
Lopresti D, Wilfong G (2001c) Evaluating document analysis results via graph probing. In: Proceedings of the 6th International Conference on Document Analysis and Recognition, Seattle, September 2001, pp 116-120
McKay B (1990) Nauty User’s Guide (Version 1.5). Computer Science Department, Australian National University, Canberra, Australia
McKay B (1981) Practical graph isomorphism. Congressus Numerantium 30:45-87
Messmer BT, Bunke H (1995) Efficient error-tolerant subgraph isomorphism detection. In: Shape, Structure and Pattern Recognition, World Scientific, Singapore, pp 231-240
Myers R, Wilson RC, Hancock ER (2000) Bayesian graph edit distance. IEEE Trans Patt Anal Mach Intell 22(6):628-635
Nagy G, Seth S (1984) Hierarchical representation of optically scanned documents. In: Proceedings of the 7th International Conference on Pattern Recognition, Montréal, July 1984, pp 347-349
Ousterhout JK (1994) Tcl and the Tk toolkit. Addison-Wesley, Reading, MA
Papadopoulos AN, Manolopoulos Y (1998) Structure-based similarity search with graph histograms. In: Proceedings of the 10th International Workshop on Database and Expert Systems Applications, pp 174-178. IEEE Press, New York
Phillips I, Chen S, Haralick R (1993) CD-ROM document database standard. In: Proceedings of the 2nd International Conference on Document Analysis and Recognition, Tsukuba Science City, Japan, October 1993, pp 478-483
Sanfeliu A, Fu KS (1983) A distance measure between attributed relational graphs for pattern recognition. IEEE Trans Sys Man Cybern 13(3):353-362
Schlieder T, Naumann F (2000) Approximate tree embedding for querying XML data. In: Proceedings of the ACM SIGIR Workshop on XML and Information Retrieval, Athens, Greece, July 2000, pp 53-67
Tsai WH, Fu KS (1979) Error-correcting isomorphisms of attributed relational graphs for pattern analysis. IEEE Trans Sys Man Cybern 9(12):757-768
Valiente G, Martínez C (1997) An algorithm for graph pattern-matching. In: Proceedings of the 4th South American Workshop on String Processing, Valparaíso, Chile, November 1997, pp 180-197. Carleton University Press, Ottawa, Ontario
Author information
Authors and Affiliations
Corresponding author
Additional information
Received: 1 October 2002, Accepted: 15 January 2003, Published online: 6 February 2004
Correspondence to: D. Lopresti
Rights and permissions
About this article
Cite this article
Lopresti, D., Wilfong, G. A fast technique for comparing graph representations with applications to performance evaluation. IJDAR 6, 219–229 (2003). https://doi.org/10.1007/s10032-003-0106-z
Issue Date:
DOI: https://doi.org/10.1007/s10032-003-0106-z