Abstract
Requests for dynamic and personalized content make up a growing share of Internet traffic. To serve this content efficiently, several server-side and cache-side fragment-based techniques, which exploit reuse of Web pages at the sub-document level, have been proposed. Most of these techniques do not address how to create fragmented content from existing dynamic content. In addition, existing caching techniques do not support fragment movement across the document, a common behavior in dynamic content.
This paper presents two proposals to solve these problems. The first, DyCA, a dynamic content adapter, takes original dynamic Web content and converts it to fragment-enabled content, separating the dynamic parts of the document into fragments distinct from the document's static template. DyCA relies on our proposed keyword-based fragment detection approach, which uses predefined keywords to find these fragments and split them out of the core document. Our second proposal, an augmentation to the ESI standard, uses a mapping table to separate the position of each fragment in the template from the template data itself. With this table, a fragment-enabled cache can identify fragments at a finer granularity, independent of their location in the template, which lets it account for fragment behaviors such as fragment movement.
We used content taken from three real Web sites to carry out a detailed performance evaluation of our proposals. Our results show that our keyword-based approach to fragment extraction yields cacheable fragments that, when combined with our proposed mapping table augmentation, can provide significant advantages for fragment-based Web caching of existing dynamic content.
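The two ideas described above, keyword-based fragment splitting and a mapping table that keeps fragment positions apart from the template, can be illustrated with a minimal Python sketch. This is not DyCA's implementation or the proposed ESI syntax. It assumes, purely for illustration, that each fragment is delimited by comment markers of the form `<!-- begin KEYWORD -->` and `<!-- end KEYWORD -->`. The keyword list, the `{{slot:N}}` placeholder syntax, and the function names `split_page` and `assemble` are all hypothetical.

```python
import re

# Hypothetical keyword list; the keywords actually used by DyCA are not given here.
KEYWORDS = ["headlines", "weather"]


def split_page(html, keywords=KEYWORDS):
    """Split a page into (template, fragments, mapping).

    template  -- the page with every detected fragment replaced by a numbered slot
    fragments -- fragment id (here simply the keyword) -> fragment body
    mapping   -- list whose index is the slot number and whose value is the fragment id
    """
    # Assumed marker convention: <!-- begin KEYWORD --> ... <!-- end KEYWORD -->
    alternatives = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(
        r"<!-- begin (%s) -->(.*?)<!-- end \1 -->" % alternatives,
        re.DOTALL,
    )
    fragments = {}
    mapping = []

    def to_slot(match):
        frag_id, body = match.group(1), match.group(2)
        fragments[frag_id] = body
        mapping.append(frag_id)
        # The template records only "a fragment goes here", not which one.
        return "{{slot:%d}}" % (len(mapping) - 1)

    template = pattern.sub(to_slot, html)
    return template, fragments, mapping


def assemble(template, fragments, mapping):
    """Rebuild a page by resolving each slot through the mapping table."""
    return re.sub(
        r"\{\{slot:(\d+)\}\}",
        lambda m: fragments[mapping[int(m.group(1))]],
        template,
    )
```

Under these assumptions, two versions of a page that differ only in the order of their fragments produce the same template and the same fragment bodies. Only the mapping table differs, so a cache built along these lines could keep reusing the stored template and fragments when a fragment moves.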
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brodie, D., Gupta, A., Shi, W. (2004). Accelerating Dynamic Web Content Delivery Using Keyword-Based Fragment Detection. In: Koch, N., Fraternali, P., Wirsing, M. (eds) Web Engineering. ICWE 2004. Lecture Notes in Computer Science, vol 3140. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27834-4_45
DOI: https://doi.org/10.1007/978-3-540-27834-4_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22511-9
Online ISBN: 978-3-540-27834-4
eBook Packages: Springer Book Archive