|
ABSTRACT
We consider the setting of lazy random graph walks over directed graphs, where entities are represented as nodes and typed edges represent the relations between them. This framework has been used in a variety of problems to derive an extended measure of entity similarity. In this paper we contrast two different approaches for applying supervised learning in this framework to improve graph walk performance: a gradient descent algorithm that tunes the transition probabilities of the graph, and a reranking approach that uses features describing global properties of the traversed paths. An empirical evaluation on a set of tasks from the domain of personal information management and multiple corpora show that reranking performance is usually superior to the local gradient descent algorithm, and that the methods often yield best results when combined.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
|
| |
3
|
M. Barthelemy, E. Chow, and T. Eliassi-Rad. Knowledge representation issues in semantic graphs for relationship detection. In AAAI Spring Symposium, 2005.
|
| |
4
|
|
| |
5
|
|
 |
6
|
|
| |
7
|
|
| |
8
|
W. W. Cohen and E. Minkov. A graph-search framework for associating gene identifiers with documents. BMC Bioinformatics, 7(440), 2006.
|
| |
9
|
W. W. Cohen, P. Ravikumar, and S. Fienberg. A comparison of string distance metrics for name-matching tasks. In IIWEB, 2003.
|
| |
10
|
W. W. Cohen, R. E. Schapire, and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research (JAIR), 10:243--270, 1999.
|
| |
11
|
|
| |
12
|
|
 |
13
|
|
| |
14
|
M. Diligenti, M. Gori, and M. Maggini. Learning web page scores by error back-propagation. In IJCAI, 2005.
|
 |
15
|
|
| |
16
|
T. Hughes and D. Ramage. Lexical semantic relatedness with random graph walks. In EMNLP, 2007.
|
 |
17
|
|
| |
18
|
B. Klimt and Y. Yang. The enron corpus: A new dataset for email classification research. In ECML, 2004.
|
| |
19
|
E. Minkov and W. W. Cohen. An email and meeting assistant using graph walks. In CEAS, 2006.
|
 |
20
|
|
| |
21
|
J. Neville and D. Jensen. Data mining in social networks. In National Academy of Sciences Symposium on Dynamic Social Network Analysis, 2002.
|
 |
22
|
|
| |
23
|
L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. In Technical Report, Computer Science department, Stanford University, 1998.
|
 |
24
|
|
| |
25
|
|
 |
26
|
Kristina Toutanova , Christopher D. Manning , Andrew Y. Ng, Learning random walk models for inducing word dependency distributions, Proceedings of the twenty-first international conference on Machine learning, p.103, July 04-08, 2004, Banff, Alberta, Canada
[doi> 10.1145/1015330.1015442]
|
 |
27
|
Wensi Xi , Edward A. Fox , Weiguo Fan , Benyu Zhang , Zheng Chen , Jun Yan , Dong Zhuang, SimFusion: measuring similarity using unified relationship matrix, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
[doi> 10.1145/1076034.1076059]
|
| |
28
|
B. Zhao, P. Sen, and L. Getoor. Entity and relationship labeling in affiliation networks. In ICML Workshop on Statistical Network Analysis, 2006.
|
CITED BY
|
|
Haizheng Zhang , John Yen , C. Lee Giles , Bamshad Mombaster , Myra Spiliopoulou , Jaideep Srivastava , Olfa Nasraoui , Andrew McCallum, WebKDD/SNAKDD 2007: web mining and social network analysis post-workshop report, ACM SIGKDD Explorations Newsletter, v.9 n.2, December 2007
|
|