Abstract
Engineering the Data Web in the Big Data era demands time- and space-efficient solutions that cover the lifecycle of Linked Data. As shown in previous works, pure in-memory solutions are doomed to fail because datasets keep growing over time. We present a study of caching solutions for one of the central tasks on the Data Web, i.e., the discovery of links between resources. To this end, we evaluate six caching approaches on real data under different settings. Our results show that while existing caching approaches already allow Link Discovery to be performed on large datasets using local resources, the cache hits achieved are still poor. Hence, we suggest that dedicated solutions to this problem are needed to tackle the upcoming challenges of building the Semantic Web.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hassan, M.M., Speck, R., Ngonga Ngomo, A.-C. (2015). Using Caching for Local Link Discovery on Large Data Sets. In: Cimiano, P., Frasincar, F., Houben, G.-J., Schwabe, D. (eds) Engineering the Web in the Big Data Era. ICWE 2015. Lecture Notes in Computer Science, vol 9114. Springer, Cham. https://doi.org/10.1007/978-3-319-19890-3_22
DOI: https://doi.org/10.1007/978-3-319-19890-3_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19889-7
Online ISBN: 978-3-319-19890-3
eBook Packages: Computer Science, Computer Science (R0)