ABSTRACT
We describe Web-a-Where, a system for associating geography with Web pages. Web-a-Where locates mentions of places and determines the place each name refers to. In addition, it assigns to each page a geographic focus, a locality that the page discusses as a whole. The tagging process is simple and fast, aimed to be applied to large collections of Web pages and to facilitate a variety of location-based applications and data analyses.

Geotagging involves arbitrating two types of ambiguities: geo/non-geo and geo/geo. A geo/non-geo ambiguity occurs when a place name also has a non-geographic meaning, such as a person name (e.g., Berlin) or a common word (Turkey). Geo/geo ambiguity arises when distinct places have the same name, as in London, England vs. London, Ontario.

An implementation of the tagger within the framework of the WebFountain data mining system is described, and evaluated on several corpora of real Web pages. Precision of up to 82% on individual geotags is achieved. We also evaluate the relative contribution of various heuristics the tagger employs, and evaluate the focus-finding algorithm using a corpus pretagged with localities, showing that as many as 91% of the foci reported are correct up to the country level.
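The two ambiguity types named in the abstract can be illustrated with a minimal sketch. This is not Web-a-Where's implementation: the toy gazetteer, the list of non-geographic senses, and the function name `ambiguity_types` are all hypothetical, chosen only to mirror the examples in the abstract (Berlin, Turkey, London).

```python
# Toy gazetteer mapping a name to the places it may denote.
# Hypothetical data for illustration only; a real gazetteer is far larger.
GAZETTEER = {
    "london": ["London, England", "London, Ontario"],
    "berlin": ["Berlin, Germany"],
    "turkey": ["Turkey"],
}

# Hypothetical list of names that also carry a non-geographic meaning.
NON_GEO_SENSES = {
    "berlin": "person name",
    "turkey": "common word",
}


def ambiguity_types(name):
    """Return the set of ambiguity types a candidate place name exhibits."""
    key = name.lower()
    places = GAZETTEER.get(key, [])
    types = set()
    if not places:
        # Not a known place name, so there is nothing to disambiguate.
        return types
    if key in NON_GEO_SENSES:
        # The name may not refer to a place at all.
        types.add("geo/non-geo")
    if len(places) > 1:
        # Several distinct places share this name.
        types.add("geo/geo")
    return types
```

A tagger would need to resolve each reported ambiguity before emitting a geotag; how Web-a-Where does so is described in the full paper, not in this sketch.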
CITED BY 52
Lee Wang, Chuang Wang, Xing Xie, Josh Forman, Yansheng Lu, Wei-Ying Ma, Ying Li, Detecting dominant locations from search queries, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
Wenbo Zong, Dan Wu, Aixin Sun, Ee-Peng Lim, Dion Hoe-Lian Goh, On assigning place names to geography related web pages, Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, June 07-11, 2005, Denver, CO, USA
Chuang Wang, Xing Xie, Lee Wang, Yansheng Lu, Wei-Ying Ma, Detecting geographic locations from web resources, Proceedings of the 2005 workshop on Geographic information retrieval, November 04-04, 2005, Bremen, Germany
Chuang Wang, Xing Xie, Lee Wang, Yansheng Lu, Wei-Ying Ma, Web resource geographic location classification and detection, Special interest tracks and posters of the 14th international conference on World Wide Web, May 10-14, 2005, Chiba, Japan
Yinghua Zhou, Xing Xie, Chuang Wang, Yuchang Gong, Wei-Ying Ma, Hybrid index structures for location-based web search, Proceedings of the 14th ACM international conference on Information and knowledge management, October 31-November 05, 2005, Bremen, Germany
Qingqing Gan, Josh Attenberg, Alexander Markowetz, Torsten Suel, Analysis of geographic queries in a search engine log, Proceedings of the first international workshop on Location and the web, p.49-56, April 22-22, 2008, Beijing, China
Yunyao Li, Rajasekar Krishnamurthy, Shivakumar Vaithyanathan, H. V. Jagadish, Getting work done on the web: supporting transactional queries, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
Bruno Martins, Jose Borbinha, Gilberto Pedrosa, João Gil, Nuno Freire, Geographically-aware information retrieval for collections of digitized historical maps, Proceedings of the 4th ACM workshop on Geographical information retrieval, November 09-09, 2007, Lisbon, Portugal
Jiebo Luo, Jie Yu, Dhiraj Joshi, Wei Hao, Event recognition: viewing the world with a third eye, Proceeding of the 16th ACM international conference on Multimedia, October 26-31, 2008, Vancouver, British Columbia, Canada
Ross S. Purves, Paul Clough, Christopher B. Jones, Avi Arampatzis, Benedicte Bucher, David Finch, Gaihua Fu, Hideo Joho, Awase Khirni Syed, Subodh Vaid, Bisheng Yang, The design and implementation of SPIRIT: a spatially aware search engine for information retrieval on the Internet, International Journal of Geographical Information Science, v.21 n.7, p.717-745, January 2007
Karla A. V. Borges, Alberto H. F. Laender, Claudia B. Medeiros, Clodoveu A. Davis, Jr., Discovering geographic locations in web pages using urban addresses, Proceedings of the 4th ACM workshop on Geographical information retrieval, November 09-09, 2007, Lisbon, Portugal
Joe Szabo, John Aycock, Randal Acton, Jörg Denzinger, The tale of the weather worm, Proceedings of the 2008 ACM symposium on Applied computing, March 16-20, 2008, Fortaleza, Ceara, Brazil
Benjamin E. Teitler, Michael D. Lieberman, Daniele Panozzo, Jagan Sankaranarayanan, Hanan Samet, Jon Sperling, NewsStand: a new view on news, Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems, November 05-07, 2008, Irvine, California
Álvaro Zubizarreta, Pablo de la Fuente, José M. Cantera, Mario Arias, Jorge Cabrero, Guido García, César Llamas, Jesús Vegas, A georeferencing multistage method for locating geographic context in web search, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Wei Wang, Chuan Xiao, Xuemin Lin, Chengqi Zhang, Efficient approximate entity extraction with edit distance constraints, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
Huajing Li, Zhisheng Li, Wang-Chien Lee, Dik Lun Lee, A probabilistic topic-based ranking framework for location-sensitive domain information retrieval, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
REVIEW
"Wei Tang : Reviewer"
Location-assisted search has been gaining momentum recently. For example, Google has introduced a new service called "Search by Location." (Other search engines offer similar services, for example, Gigablast.com and local-news.net.) However, there