ACM Home Page
Please provide us with feedback. Feedback
Less is more: sampling the neighborhood graph makes SALSA better and faster
Full text PdfPdf (658 KB)
Source Web Search and Web Data Mining archive
Proceedings of the Second ACM International Conference on Web Search and Data Mining table of contents
Barcelona, Spain
SESSION: Web ranking table of contents
Pages 242-251  
Year of Publication: 2009
ISBN:978-1-60558-390-7
Authors
Marc Najork  Microsoft Research, Mountain View, CA
Sreenivas Gollapudi  Microsoft Research, Mountain View, CA
Rina Panigrahy  Microsoft Research, Mountain View, CA
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
: Google
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
: Yahoo! Research
Microsoft : Microsoft
: Nokia
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 88,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1498759.1498832
What is a DOI?

ABSTRACT

In this paper, we attempt to improve the effectiveness and the efficiency of query-dependent link-based ranking algorithms such as HITS, MAX and SALSA. All these ranking algorithms view the results of a query as nodes in the web graph, expand the result set to include neighboring nodes, and compute scores on the induced neighborhood graph. In previous work it was shown that SALSA in particular is substantially more effective than query-independent link-based ranking algorithms such as PageRank. In this work, we show that whittling down the neighborhood graph through consistent sampling of nodes and edges makes SALSA and its cousins both faster (more efficient) and better (more effective). We offer a hypothesis as to why "less is more", i.e. why using a reduced graph improves performance.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
A. Broder and M. Mitzenmacher. Network Applications of Bloom Filters: A Survey. Internet Mathematics 1(4):485--509, 2005.
 
4
 
5
S. Gollapudi, M. Najork and R. Panigrahy. Using Bloom Filters to Speed Up HITS-like Ranking Algorithms. In 5th Workshop on Algorithms and Models for the Web Graph, pages 195--201, 2007.
6
 
7
8
 
9
10
 
11
G. Linden. Marissa Mayer at Web 2.0. Online at: http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html
 
12
 
13
F. McSherry and M. Najork. Computing Information Retrieval Performance Measures Efficiently in the Presence of Tied Scores. In 30th European Conference on Information Retrieval, pages 414--421, 2008.
14
15
16
 
17
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
18
 
19
H. Zaragoza, N. Craswell, M. Taylor, S. Saria, and S. Robertson. Microsoft Cambridge at TREC-13: Web and HARD tracks. In 13th Text Retrieval Conference, 2004.


Collaborative Colleagues:
Marc Najork: colleagues
Sreenivas Gollapudi: colleagues
Rina Panigrahy: colleagues