ACM Home Page
Please provide us with feedback. Feedback
Concise descriptions of subsets of structured sets
Full text PdfPdf (374 KB)
Source ACM Transactions on Database Systems (TODS) archive
Volume 30 ,  Issue 1  (March 2005) table of contents
Special Issue: SIGMOD/PODS 2003
Pages: 211 - 248  
Year of Publication: 2005
ISSN:0362-5915
Authors
Ken Q. Pu  University of Toronto, Canada
Alberto O. Mendelzon  University of Toronto, Canada
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 62,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1061318.1061324
What is a DOI?

ABSTRACT

We study the problem of economical representation of subsets of structured sets, which are sets equipped with a set cover or a family of preorders. Given a structured set U, and a language L whose expressions define subsets of U, the problem of minimum description length in L (L-MDL) is: “given a subset V of U, find a shortest string in L that defines V.” Depending on the structure and the language, the MDL-problem is in general intractable. We study the complexity of the MDL-problem for various structures and show that certain specializations are tractable. The families of focus are hierarchy, linear order, and their multidimensional extensions; these are found in the context of statistical and OLAP databases. In the case of general OLAP databases, data organization is a mixture of multidimensionality, hierarchy, and ordering, which can also be viewed naturally as a cover-structured ordered set. Efficient algorithms are provided for the MDL-problem for hierarchical and linearly ordered structures, and we prove that the multidimensional extensions are NP-complete. Finally, we illustrate the application of the theory to summarization of large result sets and (multi) query optimization for ROLAP queries.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
Batchelor, B. 1980. Hierarchical shape description based on convex hulls of concavities. J. Cybernet. 10, 205--210.
 
4
 
5
 
6
 
7
 
8
 
9
Hansen, M. H. and Yu, B. 2001. Model selection and the principle of minimum description length. J. Amer. Statist. Assoc. 96, 454, 746--774.
10
 
11
 
12
Keil, J. 1999. Polygon decomposition. In Handbook of Computational Geometry. Elsevier Sciences, Amsterdem, The Netherlands, Chap. 11, 491--518.
 
13
Kimball, R. 1996. The Data Warehouse Toolkit. Wiley, New York, NY.
14
 
15
Lakshmanan, L., Ng, R. T., Wang, C. X., Zhou, X., and Johnson, T. J. 1999. The generalized MDL approach for summarization. In Proceedings of VLDB 1999. 445--454.
 
16
Lam, W. and Bacchus, F. 1994. Learning Bayesian belief networks: An approach based on the MDL principle. Comput. Intel. 10, 269--293.
 
17
 
18
 
19
Lodi, E., Luccio, F., Mugnai, C., and Pagli, L. 1979. On two-dimensional data organization I. Fundam. Inform. 2, 211--226.
 
20
21
22

Collaborative Colleagues:
Ken Q. Pu: colleagues
Alberto O. Mendelzon: colleagues