ABSTRACT
Extensive and deep paraphrase corpora are important for a variety of natural language processing and user interaction tasks. In this paper, we present an approach which i) collects multiple paraphrases per given item from volunteers and ii) incentivises responsible contributions from those volunteers. Our approach is to solicit paraphrases from Web volunteers, both collecting new paraphrases with no prompting and asking contributors to guess partially obfuscated paraphrases. To test the approach, we have implemented an online game, 1001 Paraphrases (http://ai-games.org/paraphrase.html), and deployed it to collect 20,944 entries focused on paraphrases of 400 statements. The approach complements existing text extraction methods and has some inherent advantages of its own. We present and motivate our design, and share preliminary observations and lessons learned about the performance of the approach.