Abstract
We propose a uniform and flexible mechanism for creating reference links from SGML documents to database objects. In addition to typical logical document structures such as sections and paragraphs, our mechanism allows arbitrary character strings in documents to serve as the sources of these links. With this mechanism, the SGML attributes of marked-up words, together with their values, can be transparently stored as database attributes, and hyperlinks can be established between keywords in documents to reflect the relationships between the corresponding database objects. We also present a query language for retrieving SGML documents that are coupled with databases in this manner. The query language does not assume a particular database schema; instead, it uses DTD graphs, which represent the element structures of DTDs, as virtual schemas.
The first author was supported in part by the International Information Science Foundation. The research by the second author was done while he was at Nara Institute of Science and Technology.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoshikawa, M., Ichikawa, O., Uemura, S. (1996). Amalgamating SGML documents and databases. In: Apers, P., Bouzeghoub, M., Gardarin, G. (eds) Advances in Database Technology — EDBT '96. EDBT 1996. Lecture Notes in Computer Science, vol 1057. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014157
DOI: https://doi.org/10.1007/BFb0014157
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61057-1
Online ISBN: 978-3-540-49943-5
eBook Packages: Springer Book Archive