| Graph summarization with bounded error |
| Full text |
Pdf
(1.11 MB)
|
Source
|
International Conference on Management of Data
archive
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
table of contents
Vancouver, Canada
SESSION: Research Session 10: Graphs I
table of contents
Pages 419-432
Year of Publication: 2008
ISBN:978-1-60558-102-6
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 24, Downloads (12 Months): 295, Citation Count: 1
|
|
|
ABSTRACT
We propose a highly compact two-part representation of a given graph G consisting of a graph summary and a set of corrections. The graph summary is an aggregate graph in which each node corresponds to a set of nodes in G, and each edge represents the edges between all pair of nodes in the two sets. On the other hand, the corrections portion specifies the list of edge-corrections that should be applied to the summary to recreate G. Our representations allow for both lossless and lossy graph compression with bounds on the introduced error. Further, in combination with the MDL principle, they yield highly intuitive coarse-level summaries of the input graph G. We develop algorithms to construct highly compressed graph representations with small sizes and guaranteed accuracy, and validate our approach through an extensive set of experiments with multiple real-life graph data sets. To the best of our knowledge, this is the first work to compute graph summaries using the MDL principle, and use the summaries (along with corrections) to compress graphs with bounded error.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
 |
3
|
|
| |
4
|
|
| |
5
|
|
| |
6
|
|
 |
7
|
|
 |
8
|
|
| |
9
|
|
 |
10
|
|
| |
11
|
|
| |
12
|
|
 |
13
|
Joseph M. Hellerstein , Peter J. Haas , Helen J. Wang, Online aggregation, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.171-182, May 11-15, 1997, Tucson, Arizona, United States
|
 |
14
|
Marios Iliofotou , Prashanth Pappu , Michalis Faloutsos , Michael Mitzenmacher , Sumeet Singh , George Varghese, Network monitoring using traffic dispersion graphs (tdgs), Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, October 24-26, 2007, San Diego, California, USA
[doi> 10.1145/1298306.1298349]
|
| |
15
|
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
Laks V. S. Lakshmanan , Raymond T. Ng , Christine Xing Wang , Xiaodong Zhou , Theodore J. Johnson, The generalized MDL approach for summarization, Proceedings of the 28th international conference on Very Large Data Bases, p.766-777, August 20-23, 2002, Hong Kong, China
|
| |
23
|
|
| |
24
|
A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, pages 849--856, 2001.
|
 |
25
|
|
| |
26
|
S. Raghavan and H. Garcia-Molina. Representing web graphs. In ICDE, pages 405--416, 2003.
|
| |
27
|
|
| |
28
|
J. Rissanen. Modelling by the shortest data description. Automatica, 14:465--471, 1978.
|
| |
29
|
P. Shannon, A. Markiel, O. Ozier, N. Baliga, J. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13(11):2498--504, 2003.
|
| |
30
|
|
| |
31
|
Godfrey Tan , Massimiliano Poletto , John Guttag , Frans Kaashoek, Role classification of hosts within enterprise networks based on connection patterns, Proceedings of the annual conference on USENIX Annual Technical Conference, p.2-2, June 09-14, 2003, San Antonio, Texas
|
|