ABSTRACT
To search collections of XML documents, structural information, whether given by a user in the form of a structured query or provided by the self-describing structure of the XML documents, has been used in recent years to improve Information Retrieval (IR) quality in terms of recall and precision. However, all known approaches have been used only in classical client/server (C/S) architectures. None has been applied to improve retrieval in large-scale distributed systems such as Peer-to-Peer (P2P) networks, where efficiency issues must be handled carefully, e.g. to reduce communication overhead between distributed nodes. As P2P networks can be considered promising alternatives to C/S systems for storing large amounts of information, including XML documents, possibilities for improving retrieval in such networks should be investigated. In this paper, we concentrate on query routing in such a scenario and ask how structured queries can be routed in a highly distributed environment so as to increase both efficiency and effectiveness. We provide an infrastructure for investigating this question and propose techniques for performing routing based on a mixture of document, element, collection, and peer evidence. We also report preliminary evaluation results on the INEX collection.