Learning multiple graphs for document recommendations
Source
International World Wide Web Conference
Proceeding of the 17th international conference on World Wide Web
Beijing, China
SESSION: Data mining: algorithms
Pages 141-150
Year of Publication: 2008
ISBN: 978-1-60558-085-2
Authors
Ding Zhou, Facebook Inc., Palo Alto, CA, USA
Shenghuo Zhu, NEC Labs America, Cupertino, CA, USA
Kai Yu, NEC Labs America, Cupertino, CA, USA
Xiaodan Song, Google Inc, Mountain View, CA, USA
Belle L. Tseng, Yahoo! Inc., Sunnyvale, CA, USA
Hongyuan Zha, Georgia Institute of Technology, Atlanta, GA, USA
C. Lee Giles, The Pennsylvania State University, University Park, PA, USA
Bibliometrics
Downloads (6 Weeks): 23, Downloads (12 Months): 205, Citation Count: 7
ABSTRACT
The Web offers rich relational data with different semantics. In this paper, we address the problem of document recommendation in a digital library, where the documents in question are networked by citations and are associated with other entities by various relations. Because a single graph is sparse and graph construction is noisy, we propose a new method for combining multiple graphs to measure document similarities, where different factorization strategies are used based on the nature of different graphs. In particular, the new method seeks a single low-dimensional embedding of documents that captures their relative similarities in a latent space. Based on the obtained embedding, a new recommendation framework is developed using semi-supervised learning on graphs. In addition, we address the scalability issue and propose an incremental algorithm. The new incremental method significantly improves efficiency by calculating the embedding for new incoming documents only. The new batch and incremental methods are evaluated on two real-world datasets prepared from CiteSeer. Experiments demonstrate significant quality improvement for our batch method and significant efficiency improvement with tolerable quality loss for our incremental method.
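To make the idea of a single embedding learned from several graphs concrete, here is a minimal sketch in Python with NumPy. It is not the factorization method of the paper: as a stand-in, it blends a normalized citation graph with a normalized co-authorship graph and takes the leading eigenvectors of the blend as the shared embedding, then recommends by cosine similarity rather than by semi-supervised learning. The function names, the weight `alpha`, the dimension `k`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def normalize_sym(W):
    """Symmetric normalization D^{-1/2} W D^{-1/2}; isolated nodes keep zero rows."""
    d = W.sum(axis=1)
    inv_sqrt = np.zeros_like(d, dtype=float)
    nz = d > 0
    inv_sqrt[nz] = 1.0 / np.sqrt(d[nz])
    return W * inv_sqrt[:, None] * inv_sqrt[None, :]

def joint_embedding(citations, doc_author, k=2, alpha=0.5):
    """Shared k-dimensional document embedding from two graphs (illustrative only).

    citations  : (n, n) 0/1 matrix, citations[i, j] = 1 if doc i cites doc j
    doc_author : (n, m) 0/1 matrix, doc_author[i, a] = 1 if author a wrote doc i
    alpha      : hypothetical weight on the citation graph
    """
    C = np.asarray(citations, dtype=float)
    C = np.maximum(C, C.T)            # assumption: treat citations as undirected
    B = np.asarray(doc_author, dtype=float)
    co = B @ B.T                      # docs are linked when they share an author
    np.fill_diagonal(co, 0.0)
    S = alpha * normalize_sym(C) + (1.0 - alpha) * normalize_sym(co)
    vals, vecs = np.linalg.eigh(S)    # S is symmetric, so eigh applies
    top = np.argsort(vals)[::-1][:k]  # indices of the k largest eigenvalues
    return vecs[:, top]

def recommend(X, query, n=3):
    """Return the n documents closest to `query` by cosine similarity in X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.where(norms > 0, norms, 1.0)
    scores = U @ U[query]
    scores[query] = -np.inf           # never recommend the query itself
    return np.argsort(scores)[::-1][:n]

# Toy data: documents 0-2 and 3-5 form two separate groups in both graphs.
citations = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (3, 4), (4, 5)]:
    citations[i, j] = 1
doc_author = np.zeros((6, 2))
doc_author[:3, 0] = 1                 # author 0 wrote documents 0, 1, 2
doc_author[3:, 1] = 1                 # author 1 wrote documents 3, 4, 5

X = joint_embedding(citations, doc_author, k=2, alpha=0.5)
recs_for_0 = recommend(X, query=0, n=2)
recs_for_4 = recommend(X, query=4, n=2)
```

In this toy case neither graph links the two groups, so the recommendations for document 0 come from documents 1 and 2, and those for document 4 from documents 3 and 5. A fixed blend weight is the simplest possible design choice; the abstract instead describes choosing a factorization strategy per graph.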
CITED BY 7
Yun Chi, Shenghuo Zhu, Yihong Gong, Yi Zhang, Probabilistic polyadic factorization and its application to personalized recommendation, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Hao Ma, Haixuan Yang, Michael R. Lyu, Irwin King, SoRec: social recommendation using probabilistic matrix factorization, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Hao Ma, Haixuan Yang, Irwin King, Michael R. Lyu, Learning latent semantic relations from clickthrough data for query suggestion, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Yun Chi, Shenghuo Zhu, Koji Hino, Yihong Gong, Yi Zhang, iOLAP: A framework for analyzing the internet, social networks, and other networked data, IEEE Transactions on Multimedia, v.11 n.3, p.372-382, April 2009