Detection of unique temporal segments by information theoretic meta-clustering
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Research track papers
Pages 59-68  
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Shin Ando  Gunma University, Kiryu, Japan
Einoshin Suzuki  Kyushu University, Fukuoka, Japan
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 58,   Downloads (12 Months): 165,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1557019.1557033

ABSTRACT

The central challenge in temporal data analysis is to obtain knowledge about its underlying dynamics. In this paper, we address the observation of noisy, stochastic processes and attempt to detect temporal segments that are related to inconsistencies and irregularities in their dynamics. Many conventional anomaly detection approaches detect anomalies based on the distance between patterns, and often provide only limited intuition about the generative process of the anomalies. Meanwhile, model-based approaches have difficulty in identifying a small, clustered set of anomalies.

We propose Information-theoretic Meta-clustering (ITMC), a formalization of model-based clustering principled by the theory of lossy data compression. ITMC identifies a 'unique' cluster whose distribution diverges significantly from the entire dataset. Furthermore, ITMC employs a regularization term derived from the preference for high compression rate, which is critical to the precision of detection.
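
As a rough illustration of what it means for one cluster's distribution to diverge from the entire dataset, the sketch below scores each cluster of discretized segments by the Kullback-Leibler divergence between its empirical symbol distribution and that of the pooled data, then picks the highest-scoring cluster. This is not the ITMC algorithm. The choice of KL divergence, the symbolic encoding, and the names `empirical_distribution`, `kl_divergence`, and `most_divergent_cluster` are assumptions made for this example, and the compression-based regularization term is not modeled.

```python
import math
from collections import Counter


def empirical_distribution(symbols, alphabet, smoothing=1e-9):
    """Relative frequency of each alphabet symbol, lightly smoothed (assumed)."""
    counts = Counter(symbols)
    total = len(symbols) + smoothing * len(alphabet)
    return {a: (counts[a] + smoothing) / total for a in alphabet}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same alphabet."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in p if p[a] > 0)


def most_divergent_cluster(clusters):
    """Return the cluster whose distribution diverges most from the pooled data.

    `clusters` maps a cluster name to a list of discrete symbols
    (a hypothetical encoding of the temporal segments).
    """
    pooled = [s for members in clusters.values() for s in members]
    alphabet = sorted(set(pooled))
    overall = empirical_distribution(pooled, alphabet)
    scores = {
        name: kl_divergence(empirical_distribution(members, alphabet), overall)
        for name, members in clusters.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: two clusters share the same dynamics, one small cluster does not.
clusters = {
    "regular_1": list("ababababab"),
    "regular_2": list("abababab"),
    "irregular": list("cccc"),
}
best, scores = most_divergent_cluster(clusters)
print(best, scores)
```

In this toy setting the small cluster made up entirely of a rare symbol receives the largest divergence, which matches the intuition of a small, clustered set of anomalies standing out against the bulk of the data.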

For empirical evaluation, we apply ITMC to two temporal anomaly detection tasks. Datasets are taken from generative processes involving heterogeneous and inconsistent dynamics. A comparison to baseline methods shows that the proposed algorithm detects segments from irregular states with significantly high precision and recall.


