ABSTRACT
As part of a broader effort to acquire large repositories of facts from unstructured text on the Web, a seed-based framework for textual information extraction allows for weakly supervised extraction of class attributes (e.g., side effects and generic equivalent for drugs) from anonymized query logs. The extraction is guided by a small set of seed attributes, without any need for handcrafted extraction patterns or further domain-specific knowledge. The attributes extracted for classes pertaining to various domains of interest to Web search users reach accuracy levels significantly exceeding the current state of the art. Inherently noisy search queries are shown to be a highly valuable, albeit unexplored, resource for Web-based information extraction, in particular for the task of class attribute extraction.