Distributed cache table: efficient query-driven processing of multi-term queries in P2P networks
Source: Information Retrieval In Peer-To-Peer Networks
Proceedings of the international workshop on Information retrieval in peer-to-peer networks
Arlington, Virginia, USA
SESSION: Resource selection
Pages: 33-40
Year of Publication: 2006
ISBN:1-59593-527-4
Authors
Gleb Skobeltsyn  Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Karl Aberer  Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1183579.1183586

ABSTRACT

The state-of-the-art techniques for processing multi-term queries in P2P environments are query flooding and inverted list intersection. However, it has been shown that neither method scales well enough to support full-text search over large document collections distributed among the nodes of a P2P network. Although a number of optimizations based on these techniques have been suggested recently, little evidence has been given of their scalability. In this paper we propose a novel query-driven indexing strategy that generates and maintains only those index entries that are actually used for query processing. In our approach, called Distributed Cache Table (DCT) by analogy with Distributed Hash Table (DHT), we propose to abandon the distinction between data indexing and query caching and to store result sets (caches) for the most profitable queries. DCT employs a distributed index to efficiently locate caches that can answer a given multi-term query, and broadcasts the query to all peers only if no such cache is found. Evaluations on real data and query loads show that DCT converges to a high cache-hit ratio and offers a large-scale distributed solution for storing and efficiently querying vast amounts of documents in a P2P setting. DCT reduces traffic consumption by two orders of magnitude compared to a standard distributed single-term indexing approach.
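
The lookup-then-broadcast behaviour described in the abstract can be illustrated with a small single-process sketch. It is not the authors' implementation: the class name `ToyDCT`, the `cache_after` popularity threshold standing in for "most profitable queries", conjunctive (all-terms) query semantics, and untruncated result sets are all assumptions made for illustration. A plain dictionary stands in for the distributed index, and a scan over all simulated peers stands in for the broadcast.

```python
class ToyDCT:
    """Toy model of a cache table; a real system would spread `caches` over a DHT."""

    def __init__(self, peers, cache_after=2):
        # peers: list of {doc_id: set_of_terms} dicts, one per simulated peer.
        self.peers = peers
        # Hypothetical "profitability" rule: cache a query once it was seen this often.
        self.cache_after = cache_after
        self.caches = {}      # frozenset(terms) -> set(doc_ids)
        self.seen = {}        # frozenset(terms) -> number of times queried
        self.broadcasts = 0   # how many queries had to be flooded to all peers

    def _broadcast(self, terms):
        # Fallback: every peer matches the query against its local documents.
        self.broadcasts += 1
        return {
            doc_id
            for peer in self.peers
            for doc_id, doc_terms in peer.items()
            if terms <= doc_terms
        }

    def query(self, terms):
        q = frozenset(terms)
        self.seen[q] = self.seen.get(q, 0) + 1

        # Cached queries whose term sets are contained in q can help answer q.
        usable = [key for key in self.caches if key <= q]
        if usable and frozenset().union(*usable) == q:
            # Assumption: caches hold complete conjunctive results, so
            # intersecting them yields exactly the documents matching q.
            result = set.intersection(*(self.caches[key] for key in usable))
            return result, "cache"

        # No suitable caches: flood the query, then cache it if it is popular.
        result = self._broadcast(q)
        if self.seen[q] >= self.cache_after:
            self.caches[q] = result
        return result, "broadcast"


peers = [
    {"d1": {"p2p", "cache", "index"}, "d2": {"p2p", "search"}},
    {"d3": {"p2p", "cache"}, "d4": {"cache", "index"}},
]
dct = ToyDCT(peers, cache_after=2)
for _ in range(2):
    dct.query({"p2p", "cache"})   # flooded twice, cached after the second time
    dct.query({"index"})          # same for the single-term query
result, source = dct.query({"p2p", "cache", "index"})
print(sorted(result), source)     # ['d1'] cache
```

In this sketch the three-term query is answered without a broadcast because the two cached queries together cover all of its terms; whether and how the actual DCT combines caches is not stated in the abstract.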



