SimFusion: measuring similarity using unified relationship matrix
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Salvador, Brazil
SESSION: Categorization and classification
Pages: 130 - 137
Year of Publication: 2005
ISBN: 1-59593-034-5
Authors
Wensi Xi, Virginia Tech, Blacksburg, VA
Edward A. Fox, Virginia Tech, Blacksburg, VA
Weiguo Fan, Virginia Tech, Blacksburg, VA
Benyu Zhang, Microsoft Research Asia, Beijing, China
Zheng Chen, Microsoft Research Asia, Beijing, China
Jun Yan, Beijing University, Beijing, China
Dong Zhuang, Beijing Institute of Technology, Beijing, China
Bibliometrics
Downloads (6 Weeks): 22, Downloads (12 Months): 162, Citation Count: 9
ABSTRACT
In this paper we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous data objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlinks, user click-through sequences). We claim that iterative computations over the URM can help overcome the data sparseness problem and detect latent relationships among heterogeneous data objects, and can thus improve the quality of information applications that require the combination of information from heterogeneous sources. To support our claim, we present a unified similarity-calculating algorithm, SimFusion. By iteratively computing over the URM, SimFusion can effectively integrate relationships from heterogeneous sources when measuring the similarity of two data objects. Experiments based on a web search engine query log and a web page collection demonstrate that SimFusion can improve similarity measurement of web objects over both traditional content-based algorithms and the cutting-edge SimRank algorithm.
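The abstract does not state the update rule, so the following is only an illustrative sketch of what "iteratively computing over the URM" could look like. It assumes a similarity-reinforcement update of the form S ← L·S·Lᵀ, where L is a row-normalized URM and S starts as the identity matrix. The toy objects (`page_a`, `page_b`, `query_x`, `query_y`), the relationship weights, and the function names are all hypothetical and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def row_normalize(m):
    """Scale each row to sum to 1 (rows that sum to 0 are left unchanged)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([x / total for x in row] if total else list(row))
    return out


def reinforce_similarity(urm, iterations=5):
    """Assumed update rule: S <- L * S * L^T, starting from S = identity."""
    n = len(urm)
    l = row_normalize(urm)
    lt = transpose(l)
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        s = matmul(matmul(l, s), lt)
    return s


# Hypothetical toy URM over four objects, in this order:
# page_a, page_b, query_x, query_y.
# The two pages are not linked to each other; they are related only
# through click-throughs from query_x.
urm = [
    [1.0, 0.0, 1.0, 0.0],  # page_a: itself, clicked for query_x
    [0.0, 1.0, 1.0, 1.0],  # page_b: itself, clicked for query_x and query_y
    [1.0, 1.0, 1.0, 0.0],  # query_x: itself, led to page_a and page_b
    [0.0, 1.0, 0.0, 1.0],  # query_y: itself, led to page_b
]

sim = reinforce_similarity(urm, iterations=3)
print("similarity(page_a, page_b) =", sim[0][1])
```

In this toy setup the two pages end up with a nonzero similarity even though no direct relationship connects them, which is the kind of latent relationship the abstract refers to. The number of iterations and the choice of row normalization are arbitrary here.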
CITED BY 9
Xuanhui Wang, Jian-Tao Sun, Zheng Chen, ChengXiang Zhai, Latent semantic analysis for multiple-type interrelated data objects, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
Yang Song, Ziming Zhuang, Huajing Li, Qiankun Zhao, Jia Li, Wang-Chien Lee, C. Lee Giles, Real-time automatic tag recommendation, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
Einat Amitay, David Carmel, Nadav Har'El, Shila Ofek-Koifman, Aya Soffer, Sivan Yogev, Nadav Golbandi, Social search and discovery using a unified approach, Proceedings of the 20th ACM conference on Hypertext and hypermedia, June 29-July 01, 2009, Torino, Italy