Discovering missing links in Wikipedia
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 3rd international workshop on Link discovery
Chicago, Illinois
Pages: 90-97
Year of Publication: 2005
ISBN: 1-59593-215-1
Bibliometrics
Downloads (6 Weeks): 20, Downloads (12 Months): 135, Citation Count: 9
ABSTRACT
In this paper we address the problem of discovering missing hypertext links in Wikipedia. The method we propose consists of two steps: first, we compute a cluster of highly similar pages around a given page, and then we identify candidate links from those similar pages that might be missing on the given page. The main innovation is in the algorithm that we use for identifying similar pages, LTRank, which ranks pages using co-citation and page title information. Both LTRank and the link discovery method are manually evaluated and show acceptable results, especially given the simplicity of the methods and conservativeness of the evaluation criteria.
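The two steps described in the abstract can be illustrated with a small sketch. This is not LTRank: the abstract only says that the ranking uses co-citation and page title information, so the scoring formula, the `title_weight` and `min_support` parameters, the function names, and the toy link graph below are all assumptions made for illustration.

```python
from collections import Counter


def cocitation_scores(links, page):
    """Count, for every other page, how many pages link to both it and `page`."""
    scores = Counter()
    for source, targets in links.items():
        if page in targets:
            for other in targets:
                if other != page:
                    scores[other] += 1
    return scores


def title_overlap(a, b):
    """Jaccard overlap of lowercase title words (assumed stand-in for title information)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def similar_pages(links, page, k=3, title_weight=1.0):
    """Step 1: rank co-cited pages by co-citation count plus a title-overlap bonus."""
    scores = cocitation_scores(links, page)
    ranked = sorted(
        scores,
        key=lambda p: (-(scores[p] + title_weight * title_overlap(page, p)), p),
    )
    return ranked[:k]


def candidate_links(links, page, k=3, min_support=2):
    """Step 2: suggest targets linked from several similar pages but not from `page`."""
    existing = links.get(page, set())
    support = Counter()
    for sim in similar_pages(links, page, k):
        for target in links.get(sim, set()):
            if target != page and target not in existing:
                support[target] += 1
    return sorted(t for t, n in support.items() if n >= min_support)


# Hypothetical toy link graph: page title -> set of outgoing link targets.
links = {
    "List of fruits": {"Apple", "Pear", "Plum"},
    "Orchard": {"Apple", "Pear"},
    "Rosaceae": {"Apple", "Pear", "Plum"},
    "Apple": {"Fruit", "Tree"},
    "Pear": {"Fruit", "Tree", "Pollination"},
    "Plum": {"Fruit", "Pollination"},
}

print(similar_pages(links, "Apple", k=2))    # → ['Pear', 'Plum']
print(candidate_links(links, "Apple", k=2))  # → ['Pollination']
```

In this toy graph, "Pear" and "Plum" are co-cited with "Apple" by three and two pages respectively, and both link to "Pollination", which "Apple" does not, so it is the only suggested candidate.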
CITED BY 9
Meiqun Hu, Ee-Peng Lim, Aixin Sun, Hady Wirawan Lauw, Ba-Quy Vuong, Measuring article quality in wikipedia: models and evaluation, Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, November 06-10, 2007, Lisbon, Portugal