Abstract
This research aims to improve the efficiency of web information extraction and web accessibility. We present the motivation, aim, and objectives of the research and describe the work performed on developing a web page model for information extraction. We also present work on making extracted information accessible to blind users, giving them the means to navigate to the required information and access it quickly. Finally, we describe our ongoing research on efficient methods and approaches for extracting information from the proposed model. Two main approaches are considered: 1) developing a library that provides the required functionality to the programmer; 2) developing a declarative Datalog-like language for information extraction.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fayzrakhmanov, R.R. (2012). Information Extraction from Web Pages Based on Their Visual Representation. In: Harth, A., Koch, N. (eds) Current Trends in Web Engineering. ICWE 2011. Lecture Notes in Computer Science, vol 7059. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27997-3_37
DOI: https://doi.org/10.1007/978-3-642-27997-3_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27996-6
Online ISBN: 978-3-642-27997-3
eBook Packages: Computer Science, Computer Science (R0)