Abstract
A problem many Web users face is keeping track of changes in distant Web sources. Web servers frequently do not provide push services that inform clients about data changes. It is therefore necessary to apply intelligent pull strategies that optimize reload requests by observing the data sources. This article presents an adaptive pull strategy that optimizes reload requests with respect to the ‘age’ of data and lost data. The method is applicable if the remote change pattern can be approximately described by piecewise deterministic behavior, which is frequently the case when data sources are updated automatically. Emphasis is placed on autonomous estimation, where only a minimal number of parameters have to be provided manually.
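The idea of a pull strategy that adapts to a regular remote update pattern can be illustrated with a minimal sketch. This is not the estimation method of the paper; it only assumes that the source changes at roughly constant intervals and that earlier polling has already produced a list of observed change times. The names `estimate_period`, `next_reload`, and the `slack` parameter are hypothetical and chosen for illustration.

```python
from statistics import median


def estimate_period(change_times):
    """Estimate the update period as the median gap between observed changes.

    Assumption: the source is updated at roughly regular intervals.
    """
    gaps = [b - a for a, b in zip(change_times, change_times[1:])]
    if not gaps:
        raise ValueError("need at least two observed changes")
    return median(gaps)


def next_reload(change_times, now, slack=1.0):
    """Return the time of the next reload request.

    The reload is scheduled `slack` time units after the next predicted
    change, so that the fetched copy is as young as possible while the
    change is unlikely to be missed.
    """
    period = estimate_period(change_times)
    predicted = change_times[-1] + period
    # Skip predicted changes whose reload time has already passed.
    while predicted + slack <= now:
        predicted += period
    return predicted + slack


# Hypothetical observations (seconds since the start of monitoring).
observed = [0.0, 60.0, 120.0, 181.0, 240.0]
period = estimate_period(observed)
reload_at = next_reload(observed, now=250.0)
```

A small `slack` keeps the age of the retrieved data low, while a larger value reduces the risk of reloading just before a slightly delayed update; how to balance the two is the kind of trade-off between age and lost data that the abstract refers to.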
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kukulenz, D. (2004). Learning the Grammar of Distant Change in the World-Wide Web. In: Webb, G.I., Yu, X. (eds) AI 2004: Advances in Artificial Intelligence. AI 2004. Lecture Notes in Computer Science, vol 3339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30549-1_41
DOI: https://doi.org/10.1007/978-3-540-30549-1_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24059-4
Online ISBN: 978-3-540-30549-1
eBook Packages: Computer Science (R0)