Autonomous visual model building based on image crawling through internet search engines
Source International Multimedia Conference archive
Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval
New York, NY, USA
SESSION: Internet and WWW-based systems
Pages: 315-322
Year of Publication: 2004
ISBN: 1-58113-940-3
Authors
Xiaodan Song, University of Washington, Seattle, WA
Ching-Yung Lin, IBM T.J. Watson Research Center, Hawthorne, NY
Ming-Ting Sun, University of Washington, Seattle, WA
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 20,   Citation Count: 4
DOI: http://doi.acm.org/10.1145/1026711.1026762

ABSTRACT

In this paper, we propose an autonomous learning scheme that builds visual semantic concept models from the output of Internet search engines without any manual labeling. First, images are gathered by crawling the Internet using a search engine such as Google. We then model the search results as "Quasi-Positive Bags" in the Multiple-Instance Learning (MIL) framework; we call this generalized MIL (GMIL). We propose an algorithm called "Bag K-Means" to find the maximum of Diverse Density (DD) in the absence of negative bags. Its cost function takes the form of K-Means with a special "Bag Distance". We also propose a measure called "Uncertain Labeling Density" (ULD), which describes the target density distribution of instances in the case of quasi-positive bags. A "Bag Fuzzy K-Means" algorithm is presented to find the maximum of ULD. Using this generalized MIL with ULD, the model for a particular concept is learned from the images crawled from Internet search engines. Experiments show that our algorithm learns correct models for the concepts of interest. Compared to the original Google Image Search, our algorithm shows improved accuracy.
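
The abstract does not spell out the "Bag Distance" or the update rule, so the following is only a minimal sketch of how a Bag K-Means step might look, not the authors' implementation. It assumes (hypothetically) that the distance between a candidate concept point and a bag is the squared Euclidean distance to the closest instance in that bag, that a single concept point is sought (K = 1), and that every bag contains at least one instance. The names `bag_distance`, `closest_instance`, and `bag_k_means` are illustrative.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bag_distance(center, bag):
    """ASSUMED bag distance: squared distance to the nearest instance in the bag."""
    return min(sq_dist(inst, center) for inst in bag)


def closest_instance(center, bag):
    """Instance of the bag that lies nearest to the current center."""
    return min(bag, key=lambda inst: sq_dist(inst, center))


def bag_k_means(bags, init_center, max_iters=100, tol=1e-9):
    """Alternate between picking one instance per bag and averaging the picks.

    Returns the final center and the summed bag distance (the cost).
    """
    center = list(init_center)
    for _ in range(max_iters):
        # Assignment step: each bag contributes its closest instance.
        picks = [closest_instance(center, bag) for bag in bags]
        # Update step: move the center to the mean of the picked instances.
        new_center = [sum(col) / len(picks) for col in zip(*picks)]
        shift = math.dist(center, new_center)
        center = new_center
        if shift < tol:
            break
    cost = sum(bag_distance(center, bag) for bag in bags)
    return center, cost


# Toy data: each bag holds one instance near (1, 1) plus an unrelated distractor.
bags = [
    [(1.0, 1.0), (5.0, 0.0)],
    [(1.2, 0.8), (-4.0, 3.0)],
    [(0.8, 1.2), (0.0, -6.0)],
]
center, cost = bag_k_means(bags, init_center=(0.0, 0.0))
```

On this toy data the center settles near (1, 1), the region shared by all bags, while the distractors are ignored. Like ordinary K-Means, the result depends on the starting point, so in practice one would likely restart from several initial centers. The sketch does not cover quasi-positive bags, ULD, or the fuzzy variant described in the abstract.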



