ABSTRACT
Breaking news often contains timely definitions and descriptions of current terms, organizations and personalities. We use such web sources to construct definitions for these terms. Previous work has identified definitions using hand-crafted rules or supervised learning that constructs rigid, hard text patterns. In contrast, we demonstrate a new approach that uses flexible, soft matching patterns to characterize definition sentences. Our soft patterns effectively accommodate the diversity of definition sentence structure exhibited in news. We use pseudo-relevance feedback to automatically label sentences for use in soft pattern generation. Our unsupervised method significantly improves on baseline systems for both the standardized TREC corpus and crawled online news articles, by 27% and 30%, respectively, in terms of F measure. When applied to a state-of-the-art definition generation system recently fielded in the TREC 2003 definitional question answering task, it improves performance by 14%.