Joke retrieval: recognizing the same joke told differently
Source
Conference on Information and Knowledge Management
Proceedings of the 17th ACM Conference on Information and Knowledge Management
Napa Valley, California, USA
Session: IR: medley
Pages 883-892  
Year of Publication: 2008
ISBN: 978-1-59593-991-3
Authors
Lisa Friedland  University of Massachusetts Amherst, Amherst, MA, USA
James Allan  University of Massachusetts Amherst, Amherst, MA, USA
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 127,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1458082.1458199

ABSTRACT

In a corpus of jokes, a human might judge two documents to be the "same joke" even if characters, locations, and other details are varied. A given joke could be retold with an entirely different vocabulary while still maintaining its identity. Since most retrieval systems consider documents to be related only when their word content is similar, we propose joke retrieval as a domain where standard language models may fail. Other meaning-centric domains include logic puzzles, proverbs and recipes; in such domains, new techniques may be required to enable us to search effectively. For jokes, a necessary component of any retrieval system will be the ability to identify the "same joke," so we examine this task in both ranking and classification settings. We exploit the structure of jokes to develop two domain-specific alternatives to the "bag of words" document model. In one, only the punch lines, or final sentences, are compared; in the second, certain categories of words (e.g., professions and countries) are tagged and treated as interchangeable. Each technique works well for certain jokes. By combining the methods using machine learning, we create a hybrid that achieves higher performance than any individual approach.
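The two alternatives to the bag-of-words model described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the category lexicon `CATEGORIES`, the sentence splitter, the choice of cosine similarity, and every function name here are assumptions made for illustration only. It computes three similarity scores for a pair of jokes: plain bag of words, punch lines only (taken as the final sentence), and bag of words after replacing category words with a shared tag.

```python
import math
import re
from collections import Counter

# Hypothetical category lexicon (an assumption for illustration). The abstract
# names professions and countries as example categories of interchangeable words.
CATEGORIES = {
    "PROFESSION": {"doctor", "lawyer", "engineer", "priest"},
    "COUNTRY": {"france", "germany", "italy", "spain"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tag_categories(tokens):
    """Replace each token found in a category with that category's tag."""
    tagged = []
    for tok in tokens:
        for label, words in CATEGORIES.items():
            if tok in words:
                tagged.append(label)
                break
        else:
            tagged.append(tok)
    return tagged


def punch_line(text):
    """Return the final sentence, using a naive split on ., ! and ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    return sentences[-1] if sentences else ""


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two token lists, using raw term counts."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_features(joke_a, joke_b):
    """Compute the three similarity scores for a pair of jokes."""
    tokens_a, tokens_b = tokenize(joke_a), tokenize(joke_b)
    return {
        "bag_of_words": cosine(tokens_a, tokens_b),
        "punch_line": cosine(
            tokenize(punch_line(joke_a)), tokenize(punch_line(joke_b))
        ),
        "tagged": cosine(tag_categories(tokens_a), tag_categories(tokens_b)),
    }


joke_a = "A doctor walks into a bar in France. He says ouch."
joke_b = "A lawyer walks into a bar in Spain. He says ouch."
features = similarity_features(joke_a, joke_b)
```

For this pair, the tagged score is higher than the plain bag-of-words score because the differing profession and country words collapse to the same tags. Scores like these could serve as input features to a learned classifier or ranker, in the spirit of the hybrid the abstract mentions, though the actual features and learner used in the paper are not specified here.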


