Peer-to-peer information retrieval using self-organizing semantic overlay networks
Source: Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications
Karlsruhe, Germany
Session: Overlays
Pages: 175-186
Year of Publication: 2003
ISBN: 1-58113-735-4
Authors
Chunqiang Tang  University of Rochester, Rochester, NY
Zhichen Xu  HP Laboratories, Palo Alto, CA
Sandhya Dwarkadas  University of Rochester, Rochester, NY
Sponsors
ACM: Association for Computing Machinery
SIGCOMM: ACM Special Interest Group on Data Communication
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 29,   Downloads (12 Months): 218,   Citation Count: 77
DOI: http://doi.acm.org/10.1145/863955.863976

ABSTRACT

Content-based full-text search is a challenging problem in Peer-to-Peer (P2P) systems. Traditional approaches are either centralized or use flooding to ensure the accuracy of the results returned.

In this paper, we present pSearch, a decentralized non-flooding P2P information retrieval system. pSearch distributes document indices through the P2P network based on document semantics generated by Latent Semantic Indexing (LSI). The search cost for a given query (in terms of the number of distinct nodes searched and the data transmitted) is thereby reduced, since the indices of semantically related documents are likely to be co-located in the network.

We also describe techniques that help distribute the indices more evenly across the nodes, and that further reduce the number of nodes accessed by using appropriate index distribution as well as index samples and recently processed queries to guide the search.

Experiments show that pSearch can achieve performance comparable to centralized information retrieval systems by searching only a small number of nodes. For a system with 128,000 nodes and 528,543 documents (from news, magazines, etc.), pSearch searches only 19 nodes and transmits only 95.5 KB of data during the search, while the top 15 documents returned by pSearch and by LSI have a 91.7% intersection.
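
The two ideas in the abstract, co-locating the indices of semantically related documents and judging accuracy by the intersection of top-k result lists, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the 2-D vectors stand in for LSI-derived semantic vectors, and the node "centres", document ids, and function names (`place`, `search`, `intersection_pct`) are hypothetical. This is not pSearch's actual placement or routing scheme, only a minimal picture of why searching a few nodes can match a centralized ranking.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, docs, k):
    """Rank document ids by similarity to the query, best first."""
    ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return ranked[:k]

def place(docs, node_centres):
    """Store each document's index on the node whose centre is most similar.

    Assumption: each node is summarised by one 'centre' vector.
    """
    nodes = {n: {} for n in node_centres}
    for doc_id, vec in docs.items():
        best = max(node_centres, key=lambda n: cosine(vec, node_centres[n]))
        nodes[best][doc_id] = vec
    return nodes

def search(query, nodes, node_centres, k, max_nodes):
    """Consult only the max_nodes nodes most similar to the query."""
    visited = sorted(node_centres,
                     key=lambda n: cosine(query, node_centres[n]),
                     reverse=True)[:max_nodes]
    candidates = {}
    for n in visited:
        candidates.update(nodes[n])
    return top_k(query, candidates, k), visited

def intersection_pct(a, b):
    """Percentage of the ids in list a that also appear in list b."""
    return 100.0 * len(set(a) & set(b)) / len(a)

# Hypothetical data: three nodes and five documents in a 2-D semantic space.
node_centres = {"n0": (1.0, 0.0), "n1": (0.7, 0.7), "n2": (0.0, 1.0)}
docs = {
    "d1": (0.9, 0.1), "d2": (0.8, 0.3), "d3": (0.6, 0.6),
    "d4": (0.2, 0.9), "d5": (0.1, 1.0),
}
query = (1.0, 0.2)

nodes = place(docs, node_centres)
exact = top_k(query, docs, k=2)                      # centralized baseline
approx, visited = search(query, nodes, node_centres, k=2, max_nodes=1)

print("visited nodes:", visited)
print("intersection with centralized top-2: %.1f%%" % intersection_pct(approx, exact))
```

In this toy setting a single node holds both of the best matches, so visiting one node out of three reproduces the centralized top-2 list. With real data the overlap would generally be below 100%, which is what the 91.7% figure in the abstract measures for the top 15 documents.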


