| Efficient interactive fuzzy keyword search |
| Full text |
Pdf
(1.31 MB)
|
Source
|
International World Wide Web Conference
archive
Proceedings of the 18th international conference on World wide web
table of contents
Madrid, Spain
SESSION: Search/session: search UI
table of contents
Pages 371-380
Year of Publication: 2009
ISBN:978-1-60558-487-4
|
|
Authors
|
|
Shengyue Ji
|
University of California, Irvine, Irvine, CA, USA
|
|
Guoliang Li
|
Tsinghua University, Beijing, China
|
|
Chen Li
|
University of California, Irvine, Irvine, CA, USA
|
|
Jianhua Feng
|
Tsinghua University, Beijing, China
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 39, Downloads (12 Months): 186, Citation Count: 2
|
|
|
ABSTRACT
Traditional information systems return answers after a user submits a complete query. Users often feel "left in the dark" when they have limited knowledge about the underlying data, and have to use a try-and-see approach for finding information. A recent trend of supporting autocomplete in these systems is a first step towards solving this problem. In this paper, we study a new information-access paradigm, called "interactive, fuzzy search," in which the system searches the underlying data "on the fly" as the user types in query keywords. It extends autocomplete interfaces by (1) allowing keywords to appear in multiple attributes (in an arbitrary order) of the underlying data; and (2) finding relevant records that have keywords matching query keywords approximately. This framework allows users to explore data as they type, even in the presence of minor errors. We study research challenges in this framework for large amounts of data. Since each keystroke of the user could invoke a query on the backend, we need efficient algorithms to process each query within milliseconds. We develop various incremental-search algorithms using previously computed and cached results in order to achieve an interactive speed. We have deployed several real prototypes using these techniques. One of them has been deployed to support interactive search on the UC Irvine people directory, which has been used regularly and well received by users due to its friendly interface and high efficiency.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
Holger Bast , Alexandru Chitea , Fabian Suchanek , Ingmar Weber, ESTER: efficient search on text, entities, and relations, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
[doi> 10.1145/1277741.1277856]
|
| |
3
|
H. Bast, C. W. Mortensen, and I. Weber. Output-sensitive autocompletion search. In SPIRE, pages 150--162, 2006.
|
 |
4
|
|
| |
5
|
H. Bast and I. Weber. The completesearch engine: Interactive, efficient, and towards ir&db integration. In CIDR, pages 88--95, 2007.
|
| |
6
|
A. Behm, S. Ji, C. Li, and J. Lu. Space-constrained gram--based indexing for efficient approximate string search. In ICDE, 2009.
|
 |
7
|
|
| |
8
|
|
 |
9
|
|
| |
10
|
M. Hadjieleftheriou, A. Chandel, N. Koudas, and D. Srivastava. Fast indexes and algorithms for set similarity selection queries. In ICDE, pages 267--276, 2008.
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
 |
14
|
|
| |
15
|
C. Li, J. Lu, and Y. Lu. Efficient merging and filtering algorithms for approximate string searches. In ICDE, pages 257--266, 2008.
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
 |
20
|
|
CITED BY 2
|
|
Jiannan Wang , Guoliang Li , Jianhua Feng , Chen Li, Automatic URL completion and prediction using fuzzy type-ahead search, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
|
|
|
Guoliang Li , Shengyue Ji , Chen Li , Jianhua Feng, Efficient type-ahead search on relational data: a TASTIER approach, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|