The scalable hyperlink store
Source
Conference on Hypertext and Hypermedia
Proceedings of the 20th ACM conference on Hypertext and hypermedia
Torino, Italy
SESSION: Link analysis
Pages 89-98
Year of Publication: 2009
ISBN: 978-1-60558-486-7
Author
Marc Najork, Microsoft Research, Mountain View, CA, USA
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557914.1557933

ABSTRACT

This paper describes the Scalable Hyperlink Store, a distributed in-memory "database" for storing large portions of the web graph. SHS is an enabler for research on structural properties of the web graph as well as new link-based ranking algorithms. Previous work on specialized hyperlink databases focused on finding efficient compression algorithms for web graphs. By contrast, this work focuses on the systems issues of building such a database. Specifically, it describes how to build a hyperlink database that is fast, scalable, fault-tolerant, and incrementally updateable.

