Characteristic relational patterns
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Paris, France
Session: Research track papers
Pages 437-446
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Arne Koopman  Universiteit Utrecht, Utrecht, Netherlands
Arno Siebes  Universiteit Utrecht, Utrecht, Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557071

ABSTRACT

Research in relational data mining has two major directions: finding global models of a relational database and the discovery of local relational patterns within a database. While relational patterns show how attribute values co-occur in detail, their huge numbers hamper their usage in data analysis. Global models, on the other hand, only provide a summary of how different tables and their attributes relate to each other, lacking detail of what is going on at the local level.

In this paper we introduce a new approach that combines the positive properties of both directions: it provides a detailed description of the complete database using a small set of patterns. More specifically, we utilise a rich pattern language and show how a database can be encoded by such patterns. Then, based on the MDL principle, the novel RDB-KRIMP algorithm selects the set of patterns that allows for the most succinct encoding of the database. This set, the code table, is a compact description of the database in terms of local relational patterns. We show that this resulting set is very small, both in terms of database size and in the number of its local relational patterns: a reduction of up to 4 orders of magnitude is attained.
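To make the idea of MDL-based pattern selection concrete, the following is a minimal sketch on a single table of plain item sets. It is not the RDB-KRIMP algorithm from the paper: relational patterns, the paper's encoding, and its candidate generation are not given in the abstract, so everything here is an assumption for illustration. The function names (`cover`, `encoded_size`, `select_patterns`), the candidate order, the cover order, and the simplified cost L(D, CT) = L(CT) + L(D | CT) are all hypothetical choices. A candidate pattern is kept only if adding it to the code table shrinks the total encoded size.

```python
import math
from itertools import combinations

def support(pattern, db):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in db if pattern <= t)

def cover_order(patterns, db):
    """Assumed cover order: longer patterns first, then higher support."""
    return sorted(patterns, key=lambda p: (-len(p), -support(p, db), sorted(p)))

def cover(transaction, code_table):
    """Greedily cover a transaction with non-overlapping code-table patterns."""
    remaining = set(transaction)
    used = []
    for pattern in code_table:
        if pattern <= remaining:
            used.append(pattern)
            remaining -= pattern
    return used

def encoded_size(db, code_table, standard_bits):
    """Total size in bits: encoded database plus encoded code table."""
    usage = {p: 0 for p in code_table}
    for t in db:
        for p in cover(t, code_table):
            usage[p] += 1
    total_usage = sum(usage.values())
    size = 0.0
    for p, u in usage.items():
        if u == 0:
            continue  # unused patterns cost nothing in this simplified model
        code_bits = -math.log2(u / total_usage)  # Shannon-optimal code length
        size += u * code_bits                    # L(D | CT)
        size += code_bits + sum(standard_bits[i] for i in p)  # L(CT)
    return size

def select_patterns(db, min_support=2):
    """Greedy selection: keep a candidate only if it improves compression."""
    db = [frozenset(t) for t in db]
    items = sorted({i for t in db for i in t})
    occurrences = sum(len(t) for t in db)
    # Cost of spelling out a single item, based on its relative frequency.
    standard_bits = {
        i: -math.log2(sum(1 for t in db if i in t) / occurrences) for i in items
    }
    # Brute-force candidate enumeration: only feasible for toy data.
    candidates = [
        frozenset(c)
        for n in range(2, len(items) + 1)
        for c in combinations(items, n)
        if support(frozenset(c), db) >= min_support
    ]
    candidates.sort(key=lambda p: (-support(p, db), -len(p), sorted(p)))
    # Start from singletons so that every transaction can always be covered.
    table = cover_order([frozenset([i]) for i in items], db)
    best = encoded_size(db, table, standard_bits)
    for candidate in candidates:
        trial = cover_order(table + [candidate], db)
        size = encoded_size(db, trial, standard_bits)
        if size < best:
            table, best = trial, size
    return table, best

toy_db = [{"a", "b", "c"}] * 6 + [{"a", "b"}] * 3 + [{"c"}]
toy_table, toy_bits = select_patterns(toy_db)
print([sorted(p) for p in toy_table], round(toy_bits, 2))
```

The accepted patterns play the role of the code table described above: a small set that, together with the singletons, describes every row of the data. Exhaustive candidate enumeration is exponential in the number of items, so a real implementation would take candidates from a frequent-pattern miner instead.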

