|
ABSTRACT
Given a collection of complex, time-stamped events, how do we find patterns and anomalies? Events could be meetings with one or more persons and one or more agenda items at zero or more locations (e.g., teleconferences), or they could be publications with authors, keywords, publishers, etc. In such settings, we want to find time stamps that look similar to each other and group them; we also want to find anomalies. In addition, we want our approach to provide interpretations of the clusters and anomalies by annotating them. Furthermore, we want our approach to automatically find the right time-granularity in which to do analysis. Lastly, we want fast, scalable algorithms for all these problems. We address the above challenges through two main ideas. The first (T3) is to turn the problem into a graph analysis problem, by carefully treating each time stamp as a node in a graph. This viewpoint brings to bear the vast machinery of graph analysis methods (PageRank, graph partitioning, proximity analysis, and CenterPiece Subgraphs, to name a few). Thus, T3 can automatically group the time stamps into meaningful clusters and spot anomalies. Moreover, it can select representative events/persons/locations for each cluster and each anomaly, as their interpretations. The second idea (MT3) is to use temporal multi-resolution analysis (e.g., minutes, hours, days). We show that MT3 can quickly derive results from finer-to-coarser resolutions, achieving up to 2 orders of magnitude speedups. We verify the effectiveness as well as efficiency of T3 and MT3 on several real datasets.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
B. Aditya , Gaurav Bhalotia , Soumen Chakrabarti , Arvind Hulgeri , Charuta Nakhe , Parag Parag , S. Sudarshan, BANKS: browsing and keyword searching in relational databases, Proceedings of the 28th international conference on Very Large Data Bases, p.1083-1086, August 20-23, 2002, Hong Kong, China
|
 |
2
|
|
 |
3
|
|
| |
4
|
R. Albert, H. Jeong, and A.-L. Barabasi. Diameter of the world wide web. Nature, (401):130--131, 1999.
|
 |
5
|
Lars Backstrom , Dan Huttenlocher , Jon Kleinberg , Xiangyang Lan, Group formation in large social networks: membership, growth, and evolution, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
[doi> 10.1145/1150402.1150412]
|
| |
6
|
|
| |
7
|
Andrei Broder , Ravi Kumar , Farzin Maghoul , Prabhakar Raghavan , Sridhar Rajagopalan , Raymie Stata , Andrew Tomkins , Janet Wiener, Graph structure in the Web, Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking, p.309-320, June 2000, Amsterdam, The Netherlands
|
 |
8
|
Yun Chi , Xiaodan Song , Dengyong Zhou , Koji Hino , Belle L. Tseng, Evolutionary spectral clustering by incorporating temporal smoothness, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
[doi> 10.1145/1281192.1281212]
|
| |
9
|
F. R. K. Chung. Spectral graph theory. American Mathematical Society, 1997.
|
| |
10
|
|
| |
11
|
S. Dorogovtsev and J. Mendes. Evolution of networks. Advances in Physics, 51:1079--1187, 2002.
|
 |
12
|
|
 |
13
|
Michalis Faloutsos , Petros Faloutsos , Christos Faloutsos, On power-law relationships of the Internet topology, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, p.251-262, August 30-September 03, 1999, Cambridge, Massachusetts, United States
|
| |
14
|
|
| |
15
|
|
 |
16
|
David Gibson , Jon Kleinberg , Prabhakar Raghavan, Inferring Web communities from link topology, Proceedings of the ninth ACM conference on Hypertext and hypermedia : links, objects, time and space---structure in hypermedia systems: links, objects, time and space---structure in hypermedia systems, p.225-234, June 20-24, 1998, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/276627.276652]
|
| |
17
|
M. Girvan and M. E. J. Newman. Community structure is social and biological networks. Proc. Natl. Acad. Sci. USA, 99:7821--7826, 2002.
|
| |
18
|
M. Grötschel, C. L. Monma, and M. Stoer. Design of survivable networks. In Handbooks in Operations Research and Management Science 7: Network Models. North Holland, 1993.
|
 |
19
|
Jingrui He , Mingjing Li , Hong-Jiang Zhang , Hanghang Tong , Changshui Zhang, Manifold-ranking based image retrieval, Proceedings of the 12th annual ACM international conference on Multimedia, October 10-16, 2004, New York, NY, USA
[doi> 10.1145/1027527.1027531]
|
 |
20
|
|
 |
21
|
|
 |
22
|
Jure Leskovec , Jon Kleinberg , Christos Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
[doi> 10.1145/1081870.1081893]
|
 |
23
|
|
| |
24
|
M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167---256, 2003.
|
| |
25
|
A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, pages 849--856, 2001.
|
| |
26
|
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998. Paper SIDL-WP-1999-0120 (version of 11/11/1999).
|
 |
27
|
Jia-Yu Pan , Hyung-Jeong Yang , Christos Faloutsos , Pinar Duygulu, Automatic multimedia cross-modal correlation discovery, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
[doi> 10.1145/1014052.1014135]
|
| |
28
|
|
 |
29
|
|
 |
30
|
|
 |
31
|
Hanghang Tong , Christos Faloutsos , Brian Gallagher , Tina Eliassi-Rad, Fast best-effort pattern matching in large attributed graphs, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
[doi> 10.1145/1281192.1281271]
|
 |
32
|
|
| |
33
|
|
| |
34
|
H. Tong, S. Papadimitriou, P. S. Yu, and C. Faloutsos. Proximity tracking on time-evolving bipartite graphs. In SIAM-DM, pages 704--711, 2008.
|
| |
35
|
|
|