Generating semantic annotations for frequent patterns with context analysis
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Philadelphia, PA, USA
Session: Research track papers
Pages: 337-346
Year of Publication: 2006
ISBN: 1-59593-339-5
Authors
Qiaozhu Mei, University of Illinois at Urbana-Champaign, Urbana, IL
Dong Xin, University of Illinois at Urbana-Champaign, Urbana, IL
Hong Cheng, University of Illinois at Urbana-Champaign, Urbana, IL
Jiawei Han, University of Illinois at Urbana-Champaign, Urbana, IL
ChengXiang Zhai, University of Illinois at Urbana-Champaign, Urbana, IL
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1150402.1150441

ABSTRACT

As a fundamental data mining task, frequent pattern mining has widespread applications in many different domains. Research in frequent pattern mining has so far mostly focused on developing efficient algorithms to discover various kinds of frequent patterns, but little attention has been paid to the important next step: interpreting the discovered frequent patterns. Although some recent work has studied the compression and summarization of frequent patterns, the proposed techniques can only annotate a frequent pattern with non-semantic information (e.g., support), which provides only limited help for a user trying to understand the patterns.

In this paper, we propose the novel problem of generating semantic annotations for frequent patterns. The goal is to annotate a frequent pattern with in-depth, concise, and structured information that can better indicate the hidden meanings of the pattern. We propose a general approach to generate such an annotation for a frequent pattern by constructing its context model, selecting informative context indicators, and extracting representative transactions and semantically similar patterns. This general approach has many potential applications, such as generating a dictionary-like description for a pattern, finding synonym patterns, discovering semantic relations, and summarizing semantic classes of a set of frequent patterns. Experiments on different datasets show that our approach is effective in generating semantic pattern annotations.
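
The steps named in the abstract (context model, context indicators, representative transactions, similar patterns) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy transactions, the choice of single items as context units, Jaccard co-occurrence as the weighting, cosine similarity as the comparison, and all function names.

```python
from math import sqrt

# Toy market-basket data; every value here is illustrative only.
transactions = [
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"milk", "cereal"}),
    frozenset({"milk", "cereal", "banana"}),
]

def support(pattern, transactions):
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)

def context_model(pattern, transactions):
    """Assumed context model: weight each item outside the pattern by the
    Jaccard coefficient of its transaction set and the pattern's."""
    vocabulary = set().union(*transactions)
    with_pattern = {i for i, t in enumerate(transactions) if pattern <= t}
    weights = {}
    for item in vocabulary - pattern:
        with_item = {i for i, t in enumerate(transactions) if item in t}
        union = with_pattern | with_item
        weights[item] = len(with_pattern & with_item) / len(union) if union else 0.0
    return weights

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_indicators(weights, k):
    """The k highest-weighted context units, ties broken alphabetically."""
    return sorted(weights, key=lambda item: (-weights[item], item))[:k]

def representative_transactions(weights, transactions, k):
    """Indices of the k transactions closest to the context vector."""
    scored = [(cosine(weights, {item: 1.0 for item in t}), idx)
              for idx, t in enumerate(transactions)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]

pattern = frozenset({"bread", "butter"})
ctx = context_model(pattern, transactions)
indicators = top_indicators(ctx, 2)                       # ["jam", "milk"]
representatives = representative_transactions(ctx, transactions, 1)

# Candidate "similar patterns" are compared through their context vectors.
ctx_bread = context_model(frozenset({"bread"}), transactions)
ctx_cereal = context_model(frozenset({"cereal"}), transactions)
sim_bread = cosine(ctx_bread, ctx)
sim_cereal = cosine(ctx_cereal, ctx)

print(support(pattern, transactions), indicators, representatives)
```

In this toy setting the pattern {bread, butter} has support 3, its strongest context indicator is "jam", and the context of {bread} is closer to it than the context of {cereal}. The weighting and similarity measures are interchangeable; the sketch only shows how an annotation could be assembled once a context vector exists.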

