ACM Home Page
Please provide us with feedback. Feedback
Estimating PageRank on graph streams
Full text PdfPdf (371 KB)
Source
Symposium on Principles of Database Systems archive
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Vancouver, Canada
SESSION: Streaming data & best paper award table of contents
Pages 69-78  
Year of Publication: 2008
ISBN:978-1-60558-152-1
Authors
Atish Das Sarma  Georgia Tech, Atlanta, GA, USA
Sreenivas Gollapudi  Microsoft Research, Mountain View, CA, USA
Rina Panigrahy  Microsoft Research, Mountain View, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 170,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376916.1376928
What is a DOI?

ABSTRACT

This study focuses on computations on large graphs (e.g., the web-graph) where the edges of the graph are presented as a stream. The objective in the streaming model is to use small amount of memory (preferably sub-linear in the number of nodes n) and a few passes.

In the streaming model, we show how to perform several graph computations including estimating the probability distribution after a random walk of length l, mixing time, and the conductance. We estimate the mixing time M of a random walk in Õ(nα+Mα√n+√Mn/

α) space and Õ(√Mα) passes. Furthermore, the relation between mixing time and conductance gives us an estimate for the conductance of the graph. By applying our algorithm for computing probability distribution on the web-graph, we can estimate the PageRank p of any node up to an additive error of √εp in Õ(√M/α) passes and Õ(min(nα + 1/ε √M/α + 1/ε Mα, αnMα + 1/ε √M/α)) space, for any α ∈ (0, 1]. In particular, for ε = M/n, by setting α = M--1/2, we can compute the approximate PageRank values in Õ(nM--1/4) space and Õ(M3/4) passes. In comparison, a standard implementation of the PageRank algorithm will take O(n) space and O(M) passes.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
3
 
4
 
5
 
6
7
 
8
9
10
11
 
12
 
13
 
14
Jon Feldman, S. Muthukrishnan, Anastasios Sidiropoulos, Cliff Stein, and Zoya Svitkina. On the complexity of processing massive, unordered, distributed data. In CoRR abs/cs/0611108, 2006.
15
16
 
17
Sudipto Guha and Andrew McGregor. Space-efficient sampling. In In AISTATS, pages 169--176, 2007.
18
 
19
Sudipto Guha and Andrew McGrgor. Lower bounds for quantile estimation in random-order and multi-pass streaming. In ICALP, pages 704--715, 2007.
 
20
 
21
P. Indyk and D. P. Woodruff. Optimal approximations of the frequency moments of data streams. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 283--292, 2003.
22
 
23
 
24
H. Jowhari and M. Ghodsi. New streaming algorithms for counting triangles in graphs. In In COCOON, pages 710--716, 2005.
25
 
26
A. McGregor. Finding graph matchings in data streams. In In APPROX-RANDOM, pages 170--181, 2005.
27
28
 
29
John Wicks and Amy R. Greenwald. Parallelizing the computation of pagerank. In Proc. 5th Workshop On Algorithms And Models For The Web-Graph (WAW), pages 202--208, 2007.
 
30


Collaborative Colleagues:
Atish Das Sarma: colleagues
Sreenivas Gollapudi: colleagues
Rina Panigrahy: colleagues