A generalized Co-HITS algorithm and its application to bipartite graphs
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Research track papers
Pages 239-248  
Year of Publication: 2009
ISBN:978-1-60558-495-9
Authors
Hongbo Deng  The Chinese University of Hong Kong, Hong Kong, China
Michael R. Lyu  The Chinese University of Hong Kong, Hong Kong, China
Irwin King  The Chinese University of Hong Kong, Hong Kong, China
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 47,   Downloads (12 Months): 202,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1557019.1557051

ABSTRACT

Many data types arising from data mining and Web search applications can be modeled as bipartite graphs. Examples include queries and URLs in query logs, and authors and papers in scientific literature. However, previous algorithms consider the content and link information from only one side of the bipartite graph. They also lack constraints that ensure the final relevance of the scores propagated on the graph, which matters because the bipartite graph contains many noisy edges. In this paper, we propose a novel and general Co-HITS algorithm that incorporates the bipartite graph with the content information from both sides as well as relevance constraints. We investigate the algorithm under two frameworks, the iterative framework and the regularization framework, and illustrate the generalized Co-HITS algorithm from different views. The iterative framework contains HITS and personalized PageRank as special cases. In the regularization framework, we build a connection with HITS and develop a new cost function that considers the direct relationship between the two entity sets, which leads to a significant improvement over the baseline method. To illustrate our methodology, we apply the Co-HITS algorithm, under many different settings, to query suggestion by mining the AOL query log data. Experimental results demonstrate that CoRegu-0.5 (i.e., a model of the regularization framework) achieves the best performance with consistent and promising improvements.
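
The abstract does not state the update rule, so the following Python is only a minimal sketch of what iterative score propagation on a bipartite graph with initial relevance scores on both sides might look like. The function names (`normalize_rows`, `co_propagate`), the parameters `lam_u` and `lam_v`, the fixed iteration count, and the specific update form are illustrative assumptions, not the formulation given in the full paper.

```python
def normalize_rows(weights):
    """Scale each row of a non-negative matrix so it sums to 1 (all-zero rows stay zero)."""
    result = []
    for row in weights:
        total = sum(row)
        result.append([w / total if total > 0 else 0.0 for w in row])
    return result


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def co_propagate(weights, x0, y0, lam_u=0.5, lam_v=0.5, iterations=50):
    """Propagate scores between the two sides U and V of a bipartite graph.

    weights[i][k] is the edge weight between u_i and v_k.
    x0, y0 are initial (content-based) relevance scores for U and V.
    lam_u, lam_v (assumed names) trade off initial scores against
    scores received from the other side.
    """
    p_uv = normalize_rows(weights)             # transition probabilities u -> v
    p_vu = normalize_rows(transpose(weights))  # transition probabilities v -> u
    x, y = list(x0), list(y0)
    for _ in range(iterations):
        # Each u_i mixes its initial score with the scores flowing in from V.
        new_x = [
            (1 - lam_u) * x0[i]
            + lam_u * sum(p_vu[k][i] * y[k] for k in range(len(y)))
            for i in range(len(x))
        ]
        # Each v_k mixes its initial score with the updated scores flowing in from U.
        new_y = [
            (1 - lam_v) * y0[k]
            + lam_v * sum(p_uv[j][k] * new_x[j] for j in range(len(new_x)))
            for k in range(len(y))
        ]
        x, y = new_x, new_y
    return x, y


# Hypothetical toy graph: 2 queries (U) linked to 3 URLs (V).
toy_weights = [[1, 1, 0],
               [0, 1, 1]]
query_scores, url_scores = co_propagate(toy_weights, [1.0, 0.0], [0.2, 0.3, 0.5])
```

In this sketch, setting `lam_u` and `lam_v` to 0 returns the initial scores unchanged, while larger values let the graph structure dominate. Which parameter settings recover HITS or personalized PageRank is described in the paper itself and is not reproduced here.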

