ABSTRACT
Gaining business insights from data has recently been a focus of research and product development. On-Line Analytical Processing (OLAP) tools provide elaborate query languages that allow users to group and aggregate data in various ways and to explore interesting trends and patterns in the data. However, the dynamic nature of today's data, along with the overwhelming level of detail at which data is provided, makes it nearly impossible to organize the data in the way a business analyst needs in order to reason about it. In this paper, we introduce "Keyword-Driven Analytical Processing" (KDAP), which combines intuitive keyword-based search with the power of aggregation in OLAP, without requiring considerable effort to organize the data in terms that the business analyst understands. Our design point is a user mentality that we frequently encounter: "users don't know how to specify what they want, but they know it when they see it". We present our complete solution framework, which implements the various phases from disambiguating the keyword terms to organizing and ranking the results in dynamic facets that allow the user to explore the aggregation space efficiently. We address specific issues that analysts encounter, such as joins, groupings, and aggregations, and we provide efficient and scalable solutions. We show how KDAP can handle both categorical and numerical data equally well. Finally, using various experiments on real data, we demonstrate the generality and applicability of KDAP to two different aspects of OLAP: finding exceptions or surprises in the data, and finding bellwether regions where local aggregates are highly correlated with global aggregates.