ACM Home Page
Please provide us with feedback. Feedback
Digital Library logoTake a look at the new version of this page: [ beta version ]. Tell us what you think.
Estimating PageRank on graph streams
Full text PdfPdf (371 KB)
Source
Symposium on Principles of Database Systems archive
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Vancouver, Canada
SESSION: Streaming data & best paper award table of contents
Pages: 69-78  
Year of Publication: 2008
ISBN:978-1-60558-152-1
Authors
Atish Das Sarma  Georgia Tech, Atlanta, GA, USA
Sreenivas Gollapudi  Microsoft Research, Mountain View, CA, USA
Rina Panigrahy  Microsoft Research, Mountain View, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 119,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376916.1376928
What is a DOI?

ABSTRACT

This study focuses on computations on large graphs (e.g., the web-graph) where the edges of the graph are presented as a stream. The objective in the streaming model is to use small amount of memory (preferably sub-linear in the number of nodes n) and a few passes.

In the streaming model, we show how to perform several graph computations including estimating the probability distribution after a random walk of length l, mixing time, and the conductance. We estimate the mixing time M of a random walk in Õ(nα+Mα√n+√Mn/

α) space and Õ(√Mα) passes. Furthermore, the relation between mixing time and conductance gives us an estimate for the conductance of the graph. By applying our algorithm for computing probability distribution on the web-graph, we can estimate the PageRank p of any node up to an additive error of √εp in Õ(√M/α) passes and Õ(min(nα + 1/ε √M/α + 1/ε Mα, αnMα + 1/ε √M/α)) space, for any α ∈ (0, 1]. In particular, for ε = M/n, by setting α = M--1/2, we can compute the approximate PageRank values in Õ(nM--1/4) space and Õ(M3/4) passes. In comparison, a standard implementation of the PageRank algorithm will take O(n) space and O(M) passes.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
3
 
4
 
5
 
6
7
 
8
9
10
11
 
12
 
13
 
14
Jon Feldman, S. Muthukrishnan, Anastasios Sidiropoulos, Cliff Stein, and Zoya Svitkina. On the complexity of processing massive, unordered, distributed data. In CoRR abs/cs/0611108, 2006.
15
16
 
17
Sudipto Guha and Andrew McGregor. Space-efficient sampling. In In AISTATS, pages 169--176, 2007.
18
 
19
Sudipto Guha and Andrew McGrgor. Lower bounds for quantile estimation in random-order and multi-pass streaming. In ICALP, pages 704--715, 2007.
 
20
 
21
P. Indyk and D. P. Woodruff. Optimal approximations of the frequency moments of data streams. In IEEE Symposium on Foundations of Computer Science, FOCS, pages 283--292, 2003.
22
 
23
 
24
H. Jowhari and M. Ghodsi. New streaming algorithms for counting triangles in graphs. In In COCOON, pages 710--716, 2005.
25
 
26
A. McGregor. Finding graph matchings in data streams. In In APPROX-RANDOM, pages 170--181, 2005.
27
28
 
29
John Wicks and Amy R. Greenwald. Parallelizing the computation of pagerank. In Proc. 5th Workshop On Algorithms And Models For The Web-Graph (WAW), pages 202--208, 2007.
 
30


Collaborative Colleagues:
Atish Das Sarma: colleagues
Sreenivas Gollapudi: colleagues
Rina Panigrahy: colleagues