AlphaSum: size-constrained table summarization using value lattices
Source
Extending Database Technology; Vol. 360
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Saint Petersburg, Russia
SESSION: Research sessions: Database summarization
Pages 96-107
Year of Publication: 2009
ISBN: 978-1-60558-422-5
ABSTRACT
Consider a scientist who wants to explore multiple data sets to select the relevant ones for further analysis. Since the visualization real estate may put a stringent constraint on how much detail can be presented to this user on a single page, effective table summarization techniques are needed to create summaries that are both sufficiently small and effective in communicating the available content. In this paper, we first argue that table summarization can benefit from knowledge about acceptable alternatives for clustering the values in the database. We formulate the problem of table summarization with the help of value lattices. We then provide a framework to express alternative clustering strategies and to account for various utility measures (such as information loss) in assessing different summarization alternatives. Based on this interpretation, we introduce three preference criteria, max-min-util (cautious), max-sum-util (cumulative), and pareto-util, for the problem of table summarization. To tackle the inherent complexity, we rely on the properties of the fuzzy interpretation to further develop a novel ranked set cover based evaluation mechanism (RSC). These are brought together in AlphaSum, a table summarization system. Experimental evaluations showed that RSC improves both the execution time and the summary quality in AlphaSum by pruning the search space more effectively than the existing solutions.
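The three preference criteria named in the abstract can be illustrated with a small sketch. The abstract does not define them formally, so the code below rests on assumptions: each candidate summary is described by a hypothetical vector of utility scores in [0, 1], max-min-util is read as "maximize the worst score", max-sum-util as "maximize the total score", and pareto-util as "keep the candidates that no other candidate dominates". The function names, the input format, and the sample scores are all illustrative and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical input: each candidate summary maps to a vector of utility
# scores in [0, 1] (for example, one score per generalized value).
Utilities = Dict[str, Sequence[float]]


def max_min_util(candidates: Utilities) -> str:
    """Cautious choice (assumed reading): maximize the smallest utility score."""
    return max(candidates, key=lambda name: min(candidates[name]))


def max_sum_util(candidates: Utilities) -> str:
    """Cumulative choice (assumed reading): maximize the total utility."""
    return max(candidates, key=lambda name: sum(candidates[name]))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_util(candidates: Utilities) -> List[str]:
    """Return the candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Made-up scores for three candidate summaries of the same table.
candidates = {
    "A": [0.6, 0.6, 0.6],
    "B": [0.9, 0.9, 0.2],
    "C": [0.5, 0.5, 0.5],
}

print(max_min_util(candidates))  # A: its worst score (0.6) is the highest
print(max_sum_util(candidates))  # B: its total score is the highest
print(pareto_util(candidates))   # ['A', 'B']: C is dominated by A
```

With these made-up scores the criteria disagree: the cautious criterion prefers the evenly scored candidate, the cumulative one prefers the candidate with the larger total despite one weak score, and the Pareto criterion keeps both while discarding the dominated candidate.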