A comparison of methods for the automatic identification of locations in Wikipedia
Source
Workshop On Geographic Information Retrieval
Proceedings of the 4th ACM workshop on Geographical information retrieval
Lisbon, Portugal
SESSION: Mining geographic information and GIR applications
Pages 89-92  
Year of Publication: 2007
ISBN: 978-1-59593-828-2
Authors
Davide Buscaldi, Universidad Politécnica de Valencia, Valencia, Spain
Paolo Rosso, Universidad Politécnica de Valencia, Valencia, Spain
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1316948.1316971

ABSTRACT

In this paper we compare two methods for the automatic identification of geographical articles in encyclopedic resources such as Wikipedia. The first is a WordNet-based method that uses a set of keywords related to geographical places; the second is a multinomial Naïve Bayes classifier trained on a randomly selected subset of the English Wikipedia. This task can be viewed as part of the broader task of Named Entity classification, a well-known problem in the field of Natural Language Processing. The experiments were carried out considering both the full text of the articles and only the definition of the entity described in the article. The results show that the information contained in the page templates and the category labels is more useful than the text of the articles.
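
As a rough illustration of the two kinds of method the abstract contrasts, the sketch below pairs a keyword lookup with a small multinomial Naïve Bayes classifier using add-one smoothing. It is not the authors' implementation: the keyword set `GEO_KEYWORDS`, the four toy training sentences, the labels `"geo"` and `"other"`, and all function and class names are assumptions made up for this example, and the paper's WordNet-derived keywords and Wikipedia training sample are not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical keyword list, standing in for a WordNet-derived set.
GEO_KEYWORDS = {"city", "river", "country", "mountain", "capital", "island"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_classify(text, keywords=GEO_KEYWORDS):
    """Return True if the text contains at least one geographic keyword."""
    return any(tok in keywords for tok in tokenize(text))


class MultinomialNB:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / n_docs)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented purely for illustration.
train_docs = [
    "Valencia is a city in Spain",
    "The Danube is a river in Europe",
    "Mozart was a composer",
    "Python is a programming language",
]
train_labels = ["geo", "geo", "other", "other"]

model = MultinomialNB().fit(train_docs, train_labels)

if __name__ == "__main__":
    print(keyword_classify("Lisbon is the capital of Portugal"))
    print(model.predict("a city near a river"))
```

The keyword method needs no training data but depends entirely on the quality of the keyword list, whereas the Naïve Bayes classifier learns its vocabulary from labeled examples; either could in principle be applied to the full article text or only to its first defining sentence.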

