HITS algorithm improvement using anchor-related text extracted by DOM structure analysis
Source
Proceedings of the 2009 ACM Symposium on Applied Computing, Honolulu, Hawaii
Session: Information access and retrieval track
Pages: 1691-1698
Year of Publication: 2009
ISBN: 978-1-60558-166-8
ABSTRACT
Kleinberg's HITS algorithm is a popular method for ranking web pages. One of its known weaknesses is topic drift. Previous researchers have tried to address this problem using anchor-related text. In a previous study, we proposed another type of anchor-related text, which is obtained by a deep analysis of the DOM structure of web pages; we call it DOM-based anchor-related text (DOM-text). In this paper, we investigate how effectively DOM-text improves the HITS algorithm: we measure how much improvement it yields and compare it with other kinds of anchor-related text. The experimental results show that, among the kinds of anchor-related text compared, DOM-text improves the HITS algorithm the most.
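As a rough illustration of the setting described in the abstract, the sketch below runs the standard HITS hub/authority iteration with optional per-link weights. The abstract does not say how DOM-text is extracted or how it enters the computation, so the `weights` argument is purely an assumption: it stands in for some relevance score between a query and the text associated with a link. The function names and the toy graph are hypothetical and are not taken from the paper.

```python
from math import sqrt


def _normalize(scores):
    """Scale scores to unit L2 norm; leave an all-zero vector unchanged."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {node: v / norm for node, v in scores.items()}


def hits(edges, weights=None, iterations=50):
    """Run HITS on a directed graph given as (source, target) pairs.

    `weights` is an optional dict mapping an edge to a non-negative weight.
    It is a hypothetical stand-in for a text-relevance score of the link;
    edges missing from the dict default to weight 1.0.
    Returns (hub_scores, authority_scores) as dicts keyed by node.
    """
    edges = list(edges)
    weights = weights or {}
    edge_weight = {e: float(weights.get(e, 1.0)) for e in edges}
    nodes = {n for e in edges for n in e}

    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for (src, dst), w in edge_weight.items():
            new_auth[dst] += w * hub[src]
        auth = _normalize(new_auth)

        # Hub update: weighted sum of authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for (src, dst), w in edge_weight.items():
            new_hub[src] += w * auth[dst]
        hub = _normalize(new_hub)
    return hub, auth


if __name__ == "__main__":
    toy_edges = [("a", "c"), ("b", "c"), ("a", "d")]
    # Unweighted run, then a run where one link is treated as more relevant.
    print(hits(toy_edges)[1])
    print(hits(toy_edges, weights={("a", "d"): 5.0})[1])
```

With uniform weights this reduces to plain HITS; raising the weight of a link shifts authority toward its target, which is the general mechanism by which link-associated text can counteract topic drift.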