Learning cross-document structural relationships using boosting
Conference on Information and Knowledge Management
Proceedings of the twelfth international conference on Information and knowledge management
New Orleans, LA, USA
SESSION: Knowledge management session 2: semantic web
Pages: 124 - 130
Year of Publication: 2003
ISBN:1-58113-723-0
ABSTRACT
Multi-document discourse analysis has emerged with the potential to improve various information retrieval applications. Based on the newly proposed Cross-document Structure Theory (CST), this paper describes an empirical study that uses boosting to classify CST relationships between sentence pairs extracted from topically related documents. We show that the binary classifier for determining the existence of structural relationships significantly outperforms the baseline. We also achieve promising results on the multi-class case, in which the full taxonomy of relationships is considered.
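To make the binary setting concrete, the following is a minimal sketch of boosting applied to sentence pairs. It is not the paper's implementation: the abstract does not state which boosting variant or features were used, so this assumes plain AdaBoost over one-feature threshold rules (decision stumps) and two hypothetical features, word overlap and length ratio. The sentences and labels are invented toy data.

```python
import math

def overlap(a, b):
    """Jaccard word overlap between two sentences (hypothetical feature)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def length_ratio(a, b):
    """Shorter length divided by longer length, in words (hypothetical feature)."""
    la, lb = len(a.split()), len(b.split())
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0

def features(pair):
    a, b = pair
    return [overlap(a, b), length_ratio(a, b)]

def stump_predict(x, feat, thresh, sign):
    """Weak learner: predict `sign` if the feature reaches the threshold."""
    return sign if x[feat] >= thresh else -sign

def train_adaboost(X, y, rounds=5):
    """Plain AdaBoost with decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # uniform example weights to start
    ensemble = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for feat in range(len(X[0])):
            for thresh in sorted({x[feat] for x in X}):
                for sign in (1, -1):
                    err = sum(wi for wi, xi, yi in zip(w, X, y)
                              if stump_predict(xi, feat, thresh, sign) != yi)
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, sign)
        err, feat, thresh, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feat, thresh, sign))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feat, thresh, sign))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, feat, thresh, sign)
                for alpha, feat, thresh, sign in ensemble)
    return 1 if score >= 0 else -1

# Invented toy data: +1 = some relationship holds, -1 = no relationship.
pairs = [
    ("The plane crashed near Milan", "A plane crashed close to Milan"),
    ("Rescue teams reached the site", "Rescue teams arrived at the site"),
    ("The plane crashed near Milan", "Stock markets fell sharply today"),
    ("Rescue teams reached the site", "The weather was sunny in Rome"),
]
labels = [1, 1, -1, -1]

model = train_adaboost([features(p) for p in pairs], labels)
predictions = [predict(model, features(p)) for p in pairs]
```

On this toy set a single overlap threshold already separates the two classes, so the ensemble reproduces the training labels. The multi-class case described above would need a multi-class boosting scheme and a label set covering the CST taxonomy, neither of which is sketched here.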