A Tabular Survey of Automated Table Processing

  • Conference paper
Graphics Recognition Recent Advances (GREC 1999)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1941))

Abstract

Tables are the only acceptable means of communicating certain types of structured data. A precise definition of “tabularity” remains elusive because some bureaucratic forms, multicolumn text layouts, and schematic drawings share many characteristics of tables. There are significant differences between typeset tables, electronic files designed for display of tables, and tables in symbolic form intended for information retrieval. Although most research to date has addressed the extraction of low-level geometric information from scanned raster images of paper tables, the recent trend toward the analysis of tables in electronic form may pave the way to a higher level of table understanding. Recent research on table composition and table analysis has improved our understanding of the distinction between the logical and physical structures of tables, and has led to improved formalisms for modeling tables. The present study indicates that progress on half-a-dozen specific research issues would open the door to using existing paper and electronic tables for database update, tabular browsing, structured information retrieval through graphical and audio interfaces, multimedia table editing, and platform-independent display. Although tables are not a conventional format for conveying the primary content of technical papers, here we attempt to subdue our natural garrulity by adopting this genre to communicate what we have to say about tables entirely in tabular form.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lopresti, D., Nagy, G. (2000). A Tabular Survey of Automated Table Processing. In: Chhabra, A.K., Dori, D. (eds) Graphics Recognition Recent Advances. GREC 1999. Lecture Notes in Computer Science, vol 1941. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40953-X_9

  • DOI: https://doi.org/10.1007/3-540-40953-X_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41222-9

  • Online ISBN: 978-3-540-40953-3

  • eBook Packages: Springer Book Archive
