Learning cross-document structural relationships using boosting
Source: Conference on Information and Knowledge Management
Proceedings of the Twelfth International Conference on Information and Knowledge Management
New Orleans, LA, USA
SESSION: Knowledge management session 2: semantic web
Pages: 124-130
Year of Publication: 2003
ISBN:1-58113-723-0
Authors
Zhu Zhang  University of Michigan, Ann Arbor, MI
Jahna Otterbacher  University of Michigan, Ann Arbor, MI
Dragomir Radev  University of Michigan, Ann Arbor, MI
Sponsors
ACM: Association for Computing Machinery
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 58,   Citation Count: 2
DOI: http://doi.acm.org/10.1145/956863.956887

ABSTRACT

Multi-document discourse analysis has emerged with the potential to improve various information retrieval applications. Based on the newly proposed Cross-document Structure Theory (CST), this paper describes an empirical study that uses boosting to classify CST relationships between sentence pairs extracted from topically related documents. We show that the binary classifier for determining the existence of structural relationships significantly outperforms the baseline. We also achieve promising results on the multi-class case, in which the full taxonomy of relationships is considered.
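
To make the binary case concrete, the following is a minimal sketch of boosting applied to sentence pairs. It is not the paper's implementation: the abstract does not name the boosting variant or the feature set, so this sketch assumes AdaBoost over decision stumps, two hypothetical lexical features (`pair_features`), and four made-up sentence pairs labeled +1 (some relationship holds) or -1 (none).

```python
import math

def pair_features(s1, s2):
    """Hypothetical features for a sentence pair (an assumption, not the paper's set)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    union = w1 | w2
    jaccard = len(w1 & w2) / len(union) if union else 0.0
    n1, n2 = len(s1.split()), len(s2.split())
    length_ratio = min(n1, n2) / max(n1, n2) if max(n1, n2) else 0.0
    return [jaccard, length_ratio]

def stump_predict(x, feature, threshold, polarity):
    """Decision stump: vote `polarity` if the feature reaches the threshold."""
    return polarity if x[feature] >= threshold else -polarity

def train_adaboost(X, y, rounds=10):
    """AdaBoost with exhaustive stump search; labels must be +1 or -1."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    # Weighted error of this candidate stump
                    err = sum(w for x, yi, w in zip(X, y, weights)
                              if stump_predict(x, f, t, pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, pol)
        err, f, t, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, t, pol))
        # Raise the weight of misclassified pairs, then renormalize
        weights = [w * math.exp(-alpha * yi * stump_predict(x, f, t, pol))
                   for x, yi, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, pol) for a, f, t, pol in ensemble)
    return 1 if score >= 0 else -1

# Made-up illustrative pairs; not data from the paper.
pairs = [
    ("The plane crashed near the airport", "The plane crashed close to the airport", 1),
    ("Officials said 20 people were injured", "Officials said 25 people were injured", 1),
    ("The storm hit the coast on Monday", "Stock markets closed higher today", -1),
    ("The mayor announced a new budget", "Rain is expected over the weekend", -1),
]
X = [pair_features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
ensemble = train_adaboost(X, y, rounds=5)
predictions = [predict(ensemble, x) for x in X]
```

On this toy data a single stump on word overlap already separates the two classes, so the later rounds add nothing; real CST data would need richer features and a held-out evaluation. The multi-class case would require a multi-class boosting scheme, which this sketch does not cover.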


