A comprehensive human computation framework: with application to image labeling
Source
International Multimedia Conference
Proceeding of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Applications track A5/H3: browsing
Pages 479-488
Year of Publication: 2008
ISBN:978-1-60558-303-7
Authors
Yang Yang (University of Science and Technology of China, Hefei, Anhui, China)
Bin B. Zhu (Microsoft Research Asia, Beijing, China)
Rui Guo (Beihang University, Beijing, China)
Linjun Yang (Microsoft Research Asia, Beijing, China)
Shipeng Li (Microsoft Research Asia, Beijing, China)
Nenghai Yu (University of Science and Technology of China, Hefei, Anhui, China)
ABSTRACT
Image and video labeling is important both for image and video search and for enabling computers to understand visual content. Manual labeling is tedious and costly, and automatic image and video labeling remains out of reach. In this paper, we adopt a Web 2.0 approach to labeling images and videos efficiently: Internet users around the world are mobilized to apply their "common sense" to problems that are hard for today's computers, such as labeling images and videos. We first propose a general human computation framework that binds problem providers, Web sites, and Internet users together to solve large-scale common-sense problems efficiently and economically. The framework addresses technical challenges such as preventing a malicious party from attacking others, removing answers from bots, and distilling human answers to produce high-quality solutions. The framework is then applied to image labeling in three incremental refinement stages. The first stage collects candidate labels for objects in an image. The second stage refines the candidate labels through multiple-choice questions and also correlates synonymous labels. To prevent bots and lazy humans from selecting every choice, trap labels are generated automatically and intermixed with the candidate labels. Semantic distance is used to ensure that the selected trap labels differ enough from the candidate labels that human users would not select them by mistake. The last stage asks users to locate an object in a segmented image, given its label. Experimental results indicate that the proposed schemes can successfully remove spurious answers from bots and distill human answers to produce high-quality image labels.
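The trap-label mechanism of the second stage can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pick_trap_labels`, `distill`), the distance threshold, the vote rule, and the toy category-based distance are all assumptions made for illustration, and the semantic distance measure actually used in the paper is not specified in this abstract.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence

# A semantic distance maps two labels to a non-negative number
# (larger means less related). Which measure to use is left open here.
Distance = Callable[[str, str], float]


def pick_trap_labels(candidates: Sequence[str], vocabulary: Iterable[str],
                     distance: Distance, min_distance: float,
                     count: int) -> List[str]:
    """Return up to `count` vocabulary words that lie at least
    `min_distance` away from every candidate label."""
    traps: List[str] = []
    for word in vocabulary:
        if len(traps) >= count:
            break
        if word in candidates:
            continue
        if all(distance(word, c) >= min_distance for c in candidates):
            traps.append(word)
    return traps


def distill(responses: Iterable[Iterable[str]], traps: Iterable[str],
            min_votes: int) -> List[str]:
    """Discard every response that selected a trap label, then keep the
    labels chosen by at least `min_votes` of the remaining responses."""
    trap_set = set(traps)
    votes: Counter = Counter()
    for selected in responses:
        chosen = set(selected)
        if chosen & trap_set:
            continue  # likely a bot or a user ticking every box
        votes.update(chosen)
    return sorted(label for label, n in votes.items() if n >= min_votes)


# Hypothetical hand-made category table standing in for a real
# semantic distance measure.
TOY_CATEGORY = {"dog": "animal", "puppy": "animal", "cat": "animal",
                "wolf": "animal", "stapler": "office",
                "grass": "nature", "volcano": "nature"}


def toy_distance(a: str, b: str) -> float:
    if a == b:
        return 0.0
    return 0.3 if TOY_CATEGORY[a] == TOY_CATEGORY[b] else 1.0


candidates = ["dog", "puppy", "cat"]
vocabulary = ["wolf", "stapler", "grass", "volcano"]
traps = pick_trap_labels(candidates, vocabulary, toy_distance,
                         min_distance=0.8, count=2)

responses = [
    ["dog", "puppy"],
    ["dog"],
    ["dog", "puppy", "cat", "stapler", "grass"],  # selects every choice
    ["dog", "puppy"],
]
labels = distill(responses, traps, min_votes=2)
print(traps, labels)
```

In this toy run, "wolf" is rejected as a trap because it is too close to the candidate "dog", and the response that ticks every choice is discarded because it includes trap labels. In practice the distance threshold would need tuning so that no reasonable human selects a trap by mistake.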