ACM Home Page
Please provide us with feedback. Feedback
Fast mining of complex time-stamped events
Full text PdfPdf (495 KB)
Source
Conference on Information and Knowledge Management archive
Proceeding of the 17th ACM conference on Information and knowledge management table of contents
Napa Valley, California, USA
SESSION: KM: data mining table of contents
Pages 759-768  
Year of Publication: 2008
ISBN:978-1-59593-991-3
Authors
Hanghang Tong  Carnegie Mellon University, Pittsburgh, PA, USA
Yasushi Sakurai  NTT Communication Science Laboratories, Kyoto, Japan
Tina Eliassi-Rad  Lawrence Livermore National Laboratory, Livermore, CA, USA
Christos Faloutsos  Carnegie Mellon University, Pittsburgh, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 19,   Downloads (12 Months): 187,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1458082.1458184
What is a DOI?

ABSTRACT

Given a collection of complex, time-stamped events, how do we find patterns and anomalies? Events could be meetings with one or more persons and one or more agenda items at zero or more locations (e.g., teleconferences), or they could be publications with authors, keywords, publishers, etc. In such settings, we want to find time stamps that look similar to each other and group them; we also want to find anomalies. In addition, we want our approach to provide interpretations of the clusters and anomalies by annotating them. Furthermore, we want our approach to automatically find the right time-granularity in which to do analysis. Lastly, we want fast, scalable algorithms for all these problems.

We address the above challenges through two main ideas. The first (T3) is to turn the problem into a graph analysis problem, by carefully treating each time stamp as a node in a graph. This viewpoint brings to bear the vast machinery of graph analysis methods (PageRank, graph partitioning, proximity analysis, and CenterPiece Subgraphs, to name a few). Thus, T3 can automatically group the time stamps into meaningful clusters and spot anomalies. Moreover, it can select representative events/persons/locations for each cluster and each anomaly, as their interpretations. The second idea (MT3) is to use temporal multi-resolution analysis (e.g., minutes, hours, days). We show that MT3 can quickly derive results from finer-to-coarser resolutions, achieving up to 2 orders of magnitude speedups. We verify the effectiveness as well as efficiency of T3 and MT3 on several real datasets.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
 
4
R. Albert, H. Jeong, and A.-L. Barabasi. Diameter of the world wide web. Nature, (401):130--131, 1999.
5
 
6
 
7
8
 
9
F. R. K. Chung. Spectral graph theory. American Mathematical Society, 1997.
 
10
 
11
S. Dorogovtsev and J. Mendes. Evolution of networks. Advances in Physics, 51:1079--1187, 2002.
12
13
 
14
 
15
16
 
17
M. Girvan and M. E. J. Newman. Community structure is social and biological networks. Proc. Natl. Acad. Sci. USA, 99:7821--7826, 2002.
 
18
M. Grötschel, C. L. Monma, and M. Stoer. Design of survivable networks. In Handbooks in Operations Research and Management Science 7: Network Models. North Holland, 1993.
19
20
21
22
23
 
24
M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167---256, 2003.
 
25
A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, pages 849--856, 2001.
 
26
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998. Paper SIDL-WP-1999-0120 (version of 11/11/1999).
27
 
28
29
30
31
32
 
33
 
34
H. Tong, S. Papadimitriou, P. S. Yu, and C. Faloutsos. Proximity tracking on time-evolving bipartite graphs. In SIAM-DM, pages 704--711, 2008.
 
35

Collaborative Colleagues:
Hanghang Tong: colleagues
Yasushi Sakurai: colleagues
Tina Eliassi-Rad: colleagues
Christos Faloutsos: colleagues