ABSTRACT
This paper shows that linguistic techniques combined with machine learning can extract high-quality noun phrases that convey the gist, or summary, of email messages. We describe a set of comparative experiments using several machine learning algorithms for the task of salient noun phrase extraction. Three main conclusions can be drawn from this study: (i) the modifiers of a noun phrase can be semantically as important as the head for the task of gisting, (ii) linguistic filtering improves the performance of machine learning algorithms, and (iii) a combination of classifiers improves accuracy.
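As an illustration of conclusion (iii), the sketch below combines several classifiers by simple majority vote. The abstract does not say which combination scheme was used, so majority voting is an assumption here, and the three toy classifiers, the `Classifier` type, and the `majority_vote` helper are all hypothetical stand-ins rather than the models or features evaluated in the paper.

```python
from typing import Callable, Sequence

# Hypothetical type: a classifier maps a candidate noun phrase
# to True (salient) or False (not salient).
Classifier = Callable[[str], bool]


def majority_vote(classifiers: Sequence[Classifier], noun_phrase: str) -> bool:
    """Return True when more than half of the classifiers vote 'salient'."""
    salient_votes = sum(1 for clf in classifiers if clf(noun_phrase))
    return salient_votes > len(classifiers) / 2


# Toy stand-in classifiers (assumptions for illustration only).
def has_modifier(noun_phrase: str) -> bool:
    # Treats any multi-word phrase as having a modifier before the head.
    return len(noun_phrase.split()) > 1


def has_long_head(noun_phrase: str) -> bool:
    # Treats the last word as the head and checks its length.
    return len(noun_phrase.split()[-1]) > 5


def mentions_meeting(noun_phrase: str) -> bool:
    return "meeting" in noun_phrase.lower()


CLASSIFIERS = [has_modifier, has_long_head, mentions_meeting]

if __name__ == "__main__":
    for phrase in ["budget meeting", "project deadline", "conference room", "it"]:
        print(phrase, majority_vote(CLASSIFIERS, phrase))
```

With an odd number of voters a tie cannot occur; with an even number, this sketch treats a tie as "not salient".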