Efficient aggregation for graph summarization
International Conference on Management of Data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Vancouver, Canada
SESSION: Research Session 13: Graphs II
Pages 567-580
Year of Publication: 2008
ISBN: 978-1-60558-102-6
ABSTRACT
Graphs are widely used to model real-world objects and their relationships, and large graph datasets are common in many application domains. To understand the underlying characteristics of large graphs, graph summarization techniques are critical. However, existing graph summarization methods are mostly statistical (studying statistics such as degree distributions, hop-plots and clustering coefficients). These statistical methods are very useful, but the resolution of their summaries is hard to control. In this paper, we introduce two database-style operations to summarize graphs. Like the OLAP-style aggregation methods that allow users to drill down or roll up to control the resolution of summarization, our methods provide analogous functionality for large graph datasets. The first operation, called SNAP, produces a summary graph by grouping nodes based on user-selected node attributes and relationships. The second operation, called k-SNAP, further allows users to control the resolution of summaries and provides the "drill-down" and "roll-up" abilities to navigate through summaries with different resolutions. We propose an efficient algorithm to evaluate the SNAP operation. We also prove that the k-SNAP computation is NP-complete and propose two heuristic methods to approximate the k-SNAP results. Through extensive experiments on a variety of real and synthetic datasets, we demonstrate the effectiveness and efficiency of the proposed methods.
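To make the idea of grouping nodes by attributes and relationships concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm from the paper, which this page does not describe. It assumes a single undirected relationship type and reads "grouping by attributes and relationships" as: nodes in a group share the selected attribute values and are connected to the same set of groups. The names `relabel` and `snap_sketch` and the sample data are hypothetical.

```python
from collections import defaultdict


def relabel(keys):
    """Map arbitrary hashable keys to small integer group ids (first-seen order)."""
    ids = {}
    return {node: ids.setdefault(key, len(ids)) for node, key in keys.items()}


def snap_sketch(nodes, edges, attrs):
    """Group nodes by attribute values, then refine by neighbouring groups.

    nodes: dict mapping node -> dict of attribute values
    edges: list of (u, v) pairs, treated as undirected (an assumption)
    attrs: attribute names selected by the user
    Returns (members, summary_edges).
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Step 1: nodes with identical selected attribute values share a group.
    group = relabel({n: tuple(a[k] for k in attrs) for n, a in nodes.items()})

    # Step 2: split groups until every member of a group touches the same
    # set of groups. Each round only refines the partition, so an unchanged
    # group count means the partition itself is unchanged.
    while True:
        refined = relabel(
            {n: (group[n], frozenset(group[m] for m in adj[n])) for n in nodes}
        )
        if len(set(refined.values())) == len(set(group.values())):
            break
        group = refined

    members = defaultdict(set)
    for n, g in group.items():
        members[g].add(n)
    # One summary edge per pair of groups that have at least one edge between them.
    summary_edges = {tuple(sorted((group[u], group[v]))) for u, v in edges}
    return dict(members), summary_edges


# Hypothetical data: four people with a "dept" attribute and two links.
people = {
    "a": {"dept": "db"}, "b": {"dept": "db"},
    "c": {"dept": "ai"}, "d": {"dept": "ai"},
}
members, summary_edges = snap_sketch(people, [("a", "c"), ("b", "d")], ["dept"])
```

With these two edges, every "db" node links to an "ai" node, so the attribute grouping is already stable and the summary has two groups joined by one edge. Dropping the edge ("b", "d") forces the refinement step to split both groups, because "b" and "d" no longer have the same neighbouring groups as "a" and "c".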
CITED BY 3
Ruoming Jin, Yang Xiang, Ning Ruan, David Fuhry, 3-HOP: a high-compression indexing scheme for reachability query, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA