Automatic adaptation of proper noun dictionaries through cooperation of machine learning and probabilistic methods
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Athens, Greece
Pages: 128-135
Year of Publication: 2000
ISBN: 1-58113-226-3
Authors
Georgios Petasis
Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Alessandro Cucchiarelli
Istituto di Informatica, Università di Ancona, Via Brecce Bianche, Ancona
Paola Velardi
Dip. di Scienze dell'Informazione, Università di Roma 'La Sapienza', Via Salaria 113, Roma
Georgios Paliouras
Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Vangelis Karkaletsis
Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Constantine D. Spyropoulos
Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
ABSTRACT
The recognition of Proper Nouns (PNs) is considered an important task in the area of Information Retrieval and Extraction. However, the high performance of most existing PN classifiers depends heavily on the availability of large dictionaries of domain-specific Proper Nouns, and on a certain amount of manual work for rule writing or manual tagging. Relying on an existing PN dictionary is not a heavy requirement in itself (such resources are often available on the web), but its coverage of a domain corpus may be rather low in the absence of manual updating. In this paper we propose a technique for the automatic updating of a PN Dictionary through the cooperation of an inductive and a probabilistic classifier. In our experiments we show that, whenever an existing PN Dictionary allows the identification of 50% of the proper nouns within a corpus, our technique allows, without additional manual effort, the successful recognition of about 90% of the remaining 50%.
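The figures in the abstract can be combined into an overall recognition rate. The sketch below is only an illustration of that arithmetic: it assumes that every proper noun found through the dictionary counts as recognized and that "about 90%" is taken as exactly 0.9. The function name `overall_coverage` and its parameters are hypothetical and do not come from the paper.

```python
def overall_coverage(dictionary_coverage: float, recovery_rate: float) -> float:
    """Overall fraction of proper nouns recognized.

    dictionary_coverage: fraction already identified by the existing dictionary.
    recovery_rate: fraction of the remaining proper nouns that are recovered.
    Both names are illustrative assumptions, not terms from the paper.
    """
    if not (0.0 <= dictionary_coverage <= 1.0 and 0.0 <= recovery_rate <= 1.0):
        raise ValueError("both rates must lie in [0, 1]")
    missed = 1.0 - dictionary_coverage  # share the dictionary does not cover
    return dictionary_coverage + recovery_rate * missed


if __name__ == "__main__":
    # 50% from the dictionary plus about 90% of the remaining 50%
    print(f"{overall_coverage(0.5, 0.9):.2f}")
```

Under these assumptions, a dictionary covering half of the proper nouns would lead to roughly 95% of them being recognized overall.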
CITED BY 6
Paola Velardi, Paolo Fabriani, Michele Missikoff, Using text processing techniques to automatically enrich a domain ontology, Proceedings of the international conference on Formal Ontology in Information Systems, p.270-284, October 17-19, 2001, Ogunquit, Maine, USA
Georgios Petasis, Frantz Vichot, Francis Wolinski, Georgios Paliouras, Vangelis Karkaletsis, Constantine D. Spyropoulos, Using machine learning to maintain rule-based named-entity recognition and classification systems, Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, p.426-433, July 06-11, 2001, Toulouse, France