Automatic adaptation of proper noun dictionaries through cooperation of machine learning and probabilistic methods
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Athens, Greece
Pages: 128-135
Year of Publication: 2000
ISBN: 1-58113-226-3
Authors
Georgios Petasis  Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Alessandro Cucchiarelli  Istituto di Informatica, Università di Ancona, Via Brecce Bianche, Ancona
Paola Velardi  Dip. di Scienze dell'Informazione, Università di Roma 'La Sapienza', Via Salaria 113, Roma
Georgios Paliouras  Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Vangelis Karkaletsis  Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Constantine D. Spyropoulos  Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece
Sponsors
Athens U of Econ & Business : Athens University of Economics and Business
Greek Com Soc : Greek Computer Society
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/345508.345563

ABSTRACT

The recognition of Proper Nouns (PNs) is considered an important task in the area of Information Retrieval and Extraction. However, the high performance of most existing PN classifiers depends heavily on the availability of large dictionaries of domain-specific Proper Nouns and on a certain amount of manual work for rule writing or manual tagging. Although relying on an existing PN dictionary is not a heavy requirement (such resources are often available on the web), its coverage of a domain corpus may be rather low in the absence of manual updating. In this paper we propose a technique for the automatic updating of a PN dictionary through the cooperation of an inductive and a probabilistic classifier. In our experiments we show that, whenever an existing PN dictionary allows the identification of 50% of the proper nouns within a corpus, our technique allows, without additional manual effort, the successful recognition of about 90% of the remaining 50%.
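
The two percentages in the last sentence can be combined into a single overall figure. The sketch below is only an arithmetic illustration, not part of the paper: the function name overall_coverage and its parameters are hypothetical, and it assumes that both fractions are measured over the same set of proper nouns in the corpus and that "recognition" can be read as coverage.

```python
def overall_coverage(dictionary_coverage: float, recovery_rate: float) -> float:
    """Combine dictionary coverage with the share of the remainder recovered.

    dictionary_coverage: fraction of proper nouns the existing dictionary
        identifies (0.5 in the abstract).
    recovery_rate: fraction of the remaining proper nouns that are recognized
        after automatic updating (about 0.9 in the abstract).
    Both names are hypothetical, chosen for this illustration only.
    """
    remaining = 1.0 - dictionary_coverage
    return dictionary_coverage + remaining * recovery_rate


if __name__ == "__main__":
    # Figures quoted in the abstract: 50% from the dictionary, ~90% of the rest.
    total = overall_coverage(0.5, 0.9)
    print(f"Overall share of proper nouns recognized: {total:.0%}")
```

Under these assumptions, a dictionary that covers half of the proper nouns, plus recovery of about 90% of the other half, corresponds to roughly 95% of all proper nouns in the corpus.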

