Dynamic faceted search for discovery-driven analysis
Source
Conference on Information and Knowledge Management
Proceeding of the 17th ACM conference on Information and knowledge management
Napa Valley, California, USA
SESSION: DB: faceted search, web query results presentation
Pages 3-12
Year of Publication: 2008
ISBN:978-1-59593-991-3
Authors
Debabrata Dash  Carnegie Mellon University, Pittsburgh, PA, USA
Jun Rao  IBM Almaden Research Center, San Jose, CA, USA
Nimrod Megiddo  IBM Almaden Research Center, San Jose, CA, USA
Anastasia Ailamaki  Carnegie Mellon University, Pittsburgh, PA, USA and Ecole Polytechnique Fédérale de Lausanne
Guy Lohman  IBM Almaden Research Center, San Jose, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 23,   Downloads (12 Months): 328,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1458082.1458087

ABSTRACT

We propose a dynamic faceted search system for discovery-driven analysis on data with both textual content and structured attributes. From a keyword query, we want to dynamically select a small set of "interesting" attributes and present aggregates on them to a user. Similar to work in OLAP exploration, we define "interestingness" as how surprising an aggregated value is, based on a given expectation. We make two new contributions by proposing a novel "navigational" expectation that's particularly useful in the context of faceted search, and a novel interestingness measure through judicious application of p-values. Through a user survey, we find the new expectation and interestingness metric quite effective. We develop an efficient dynamic faceted search system by improving a popular open source engine, Solr. Our system exploits compressed bitmaps for caching the posting lists in an inverted index, and a novel directory structure called a bitset tree for fast bitset intersection. We conduct a comprehensive experimental study on large real data sets and show that our engine performs 2 to 3 times faster than Solr.
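The two ideas in the abstract, counting facet values by intersecting bitsets and scoring a count by how surprising it is under an expectation, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python integers as uncompressed bitsets rather than compressed bitmaps or a bitset tree, it takes the whole collection as the expectation rather than the paper's "navigational" expectation, and it assumes a one-sided binomial tail as the p-value. All function names and the sample documents are hypothetical.

```python
from collections import defaultdict
from math import comb

# Hypothetical sample data: (text, structured attributes) per document.
DOCS = [
    ("faceted search engine", {"year": 2008, "venue": "CIKM"}),
    ("faceted search interface", {"year": 2008, "venue": "CIKM"}),
    ("query optimization", {"year": 2007, "venue": "SIGMOD"}),
    ("keyword search databases", {"year": 2007, "venue": "SIGMOD"}),
]


def build_bitsets(docs):
    """Build one bitset per term and per (attribute, value); bit i = document i."""
    term_bits = defaultdict(int)
    facet_bits = defaultdict(int)
    for doc_id, (text, attrs) in enumerate(docs):
        for term in set(text.lower().split()):
            term_bits[term] |= 1 << doc_id
        for attr, value in attrs.items():
            facet_bits[(attr, value)] |= 1 << doc_id
    return term_bits, facet_bits


def popcount(bits):
    """Number of set bits, i.e. number of documents in the bitset."""
    return bin(bits).count("1")


def match_bits(query, term_bits):
    """AND together the bitsets of all query terms (conjunctive keyword query)."""
    terms = query.lower().split()
    if not terms:
        return 0
    result = term_bits.get(terms[0], 0)
    for term in terms[1:]:
        result &= term_bits.get(term, 0)
    return result


def facet_counts(result, facet_bits):
    """Count matching documents per facet value via bitset intersection."""
    counts = {}
    for key, bits in facet_bits.items():
        n = popcount(result & bits)
        if n:
            counts[key] = n
    return counts


def upper_tail_p(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): an assumed stand-in for the paper's p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def surprise(query, docs):
    """Return {facet value: p-value}; a smaller p-value means a more surprising count."""
    term_bits, facet_bits = build_bitsets(docs)
    result = match_bits(query, term_bits)
    n = popcount(result)
    scores = {}
    for key, k in facet_counts(result, facet_bits).items():
        expected_rate = popcount(facet_bits[key]) / len(docs)  # whole-collection rate
        scores[key] = upper_tail_p(k, n, expected_rate)
    return scores
```

Calling `surprise("faceted search", DOCS)` scores each facet value that appears in the matching documents; a real system would rank attributes by such scores and show only the top few. With only four documents the p-values are far too coarse to be meaningful, so the example shows the mechanics only.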


