Abstract
Recent file storage applications built on peer-to-peer distributed hash tables lack search capabilities. We believe that search is an important part of any document publication system. To that end, we have designed and analyzed a distributed search engine based on a distributed hash table. Our simulation results predict that our search engine can answer an average query in under one second, using under one kilobyte of bandwidth.
This research is supported in part by the National Science Foundation (EIA-99772879, ITR-0082912), Hewlett Packard, IBM, Intel, and Microsoft. Vahdat is also supported by an NSF CARREER award (CCR-9984328), and Reynolds is also supported by an NSF fellowship.
© 2003 IFIP International Federation for Information Processing
About this paper
Cite this paper
Reynolds, P., Vahdat, A. (2003). Efficient Peer-to-Peer Keyword Searching. In: Endler, M., Schmidt, D. (eds) Middleware 2003. Lecture Notes in Computer Science, vol 2672. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44892-6_2
DOI: https://doi.org/10.1007/3-540-44892-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40317-3
Online ISBN: 978-3-540-44892-1
eBook Packages: Springer Book Archive