ACM Home Page
Please provide us with feedback. Feedback
Relaxation in text search using taxonomies
Full text PdfPdf (508 KB)
Source
Proceedings of the VLDB Endowment archive
Volume 1 ,  Issue 1  (August 2008) table of contents
SESSION: IR and forms table of contents
Pages 672-683  
Year of Publication: 2008
ISSN:2150-8097
Authors
Marcus Fontoura  Yahoo! Research, Sunnyvale, CA
Vanja Josifovski  Yahoo! Research, Sunnyvale, CA
Ravi Kumar  Yahoo! Research, Sunnyvale, CA
Christopher Olston  Yahoo! Research, Sunnyvale, CA
Andrew Tomkins  Yahoo! Research, Sunnyvale, CA
Sergei Vassilvitskii  Yahoo! Research, Sunnyvale, CA
Publisher
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 74,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1453856.1453930
What is a DOI?

ABSTRACT

In this paper we propose a novel document retrieval model in which text queries are augmented with multi-dimensional taxonomy restrictions. These restrictions may be relaxed at a cost to result quality. This new model may be applicable in many arenas, including multifaceted, product, and local search, where documents are augmented with hierarchical metadata such as topic or location. We present efficient algorithms for indexing and query processing in this new retrieval model. We decompose query processing into two sub-problems: first, an online search problem to determine the correct overall level of relaxation cost that must be incurred to generate the top k results; and second, a budgeted relaxation search problem in which all results at a particular relaxation cost must be produced at minimal cost. We show the latter problem is solvable exactly in two hierarchical dimensions, is NP-hard in three or more dimensions, but admits efficient approximation algorithms with provable guarantees. We present experimental results evaluating our algorithms on both synthetic and real data, showing order of magnitude improvements over the baseline algorithm.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
4
 
5
6
7
 
8
9
10
 
11
12
13
 
14
 
15
 
16
17
 
18
 
19
R. J. Fowler, M. Paterson, and S. L. Tanimoto. Optimal packing and covering in the plane are NP-complete. IPL, 12(3):133--137, 1981.
 
20
 
21
 
22
 
23
 
24
 
25
 
26
27
 
28
 
29
30
 
31
L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
 
32
 
33
I. Witten, A. Moffat, and T. Bell. Managing Gigabytes. Morgan Kaufmann, 1999.
34
35
36


Collaborative Colleagues:
Marcus Fontoura: colleagues
Vanja Josifovski: colleagues
Ravi Kumar: colleagues
Christopher Olston: colleagues
Andrew Tomkins: colleagues
Sergei Vassilvitskii: colleagues