ACM Home Page
Please provide us with feedback. Feedback
Probabilistic file indexing and searching in unstructured peer-to-peer networks
Source Computer Networks: The International Journal of Computer and Telecommunications Networking archive
Volume 50 ,  Issue 1  (January 2006) table of contents
Pages: 106 - 127  
Year of Publication: 2006
ISSN:1389-1286
Authors
An-Hsun Cheng  Department of Information Management, National Taiwan University, Taipei, Taiwan, ROC
Yuh-Jzer Joung  Department of Information Management, National Taiwan University, Taipei, Taiwan, ROC
Publisher
Elsevier North-Holland, Inc.  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): n/a,   Downloads (12 Months): n/a,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: 10.1016/j.comnet.2005.04.009

ABSTRACT

Thanks to the advance of network and computing technology, Peer-to-Peer (P2P) has become a popular way for file sharing. A huge amount of files can now be directly accessed and downloaded by a simple mouse click. Among the types of P2P networks, unstructured architecture has been proven quite successful, mainly due to its simplicity and robustness. However, searching for distant and rare files is still a challenging problem in unstructured P2P networks. Existing approaches either have poor response time, or generate too much network traffic.In this paper we propose a simple, practical, yet powerful index scheme to enhance search in unstructured P2P networks. The index scheme uses a data structure "Bloom filters" to index files shared at each node, and then lets nodes gossip to one another to exchange their Bloom filters. In effect, each node indexes a random set of files in the network, thereby allowing every query to have a constant probability to be successfully resolved within a fixed search space. The experimental results show that our approach can improve the search in Gnutella by an order of magnitude. For example, in a typical Gnutella network consisting of about 89,000 nodes, by replicating a node's Bloom filter to less than 0.45% of the nodes in the network, 70% of the queries can be resolved within a search space of 200 nodes. In contrast, within the same search space size, only 1.6% of the queries can be resolved without the index scheme; or, alternatively, more than 48,000 nodes need to be searched in Gnutella in order to reach the same success rate as our index scheme.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
{1} L.A. Adamic, R.M. Lukose, A.R. Puniyani, B.A. Huberman, Search in power-law networks, Physical Review E 64 (4) (2001) 046135.
 
2
{2} E. Adar, B.A. Huberman, Free riding on Gnutella, First Monday, 5(10), October 2000. Available from: 〈http:// firstmonday.org/issues/issue5_10/adar/index.html〉.
3
 
4
{4} I. Clarke, O. Sandberg, B. Wiley, T.W. Hong, Freenet: a distributed anonymous information storage and retrieval system, in: Proceedings of the 2000 International Workshop on Design Issues in Anonymity and Unobservability (DIAU 2000)Lecture Notes in Computer Science, vol. 2009, Springer-Verlag, 2001, pp. 41-66.
 
5
{5} CNN, Available from: 〈http://www.cnn.com/2001/tech} internet/06/28napster .usage/〉.
6
 
7
 
8
{8} Gnutella, Available from: 〈http://www.gnutella.com〉.
 
9
{9} M.A. Jovanovic, F.S. Annexstein, K.A. Berman, Scalability issues in large peer-to-peer networks--a case study of Gnutella, Technical report, University of Cincinnati, 2001.
10
11
 
12
{12} R. Mahajan, M. Castro, A. Rowstron, Controlling the cost of reliability in peer-to-peer overlays, in: Proceedings of Second International Workshop on Peer-to-Peer Systems (IPTPS 2003)Lecture Notes in Computer Science, vol. 2735, Springer-Verlag, 2003, pp. 21-32.
 
13
 
14
{14} Napster, Available from: 〈http://www.napster.com〉.
 
15
{15} Sharman Networks. KaZaA, Retrieved July 27, 2004 from 〈http://www.kazaa.com〉.
 
16
17
 
18
{18} P. Reynolds, A. Vahdat, Efficient peer-to-peer keyword searching, in: Proceedings of the 2003 ACM/IFIP/USE-NIX International Middleware Conference (Middleware 2003), Lecture Notes in Computer Science, vol. 2672, Springer-Verlag, 2003, pp. 21-40.
 
19
{19} S. Rhea, J. Kubiatowicz, Probabilistic location and routing, in: Proceedings of Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2002), vol. 3, pp. 1248-1257, June 2002.
 
20
 
21
 
22
{22} S. Saroiu, P. Krishna Gummadi, S.D. Gribble, A measurement study of peer-to-peer file sharing systems, in: Proceedings of the 2002 Multimedia Computing and Networking (MMCN 2002), The International Society of Optical Engineering, January 2002.
 
23
{23} K. Sripanidkulchai, Bruce, H. Zhang, Efficient content location using interest-based locality in peer-to-peer systems, in: Proceedings of Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2003), vol. 3, pp. 216-2176, March/ April 2003.
24
 
25
{25} Web Traces and Logs. Available from: 〈http://www.web-caching.com/traces-logs.html〉.
 
26
 
27
{27} B.Y. Zhao, L. Huang, S.C. Rhea, J. Stribling, A.D Joseph, J.D. Kubiatowicz, Tapestry: a resilient global-scale overlay for service deployment, IEEE Journal on Selected Areas in Communications 22 (1) (2004) 1-15, Special Issue on Service Overlay Networks.


Collaborative Colleagues:
An-Hsun Cheng: colleagues
Yuh-Jzer Joung: colleagues