A practical hypertext categorization method using links and incrementally available class information
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Athens, Greece
Pages: 264-271
Year of Publication: 2000
ISBN: 1-58113-226-3
Authors
Hyo-Jung Oh, Electronics and Telecommunications Research Institute (ETRI), Taejon, Korea
Sung Hyon Myaeng, Department of Computer Science, Chungnam National University, Taejon, Korea
Mann-Ho Lee, Department of Computer Science, Chungnam National University, Taejon, Korea
ABSTRACT
As the WWW grows at an increasing speed, classifiers targeted at hypertext have come into high demand. While document categorization is quite mature, the issue of utilizing hypertext structure and hyperlinks has been relatively unexplored. In this paper, we propose a practical method for enhancing both the speed and the quality of hypertext categorization using hyperlinks. In comparison against a recently proposed technique that appears to be the only one of its kind, we obtained up to 18.5% improvement in effectiveness while reducing the processing time dramatically. We attempt to explain through experiments what factors contribute to the improvement.