Fusing semantics, observability, reliability and diversity of concept detectors for video search
Source
International Multimedia Conference archive
Proceeding of the 16th ACM international conference on Multimedia
Vancouver, British Columbia, Canada
SESSION: Content track C2: semantic video annotation
Pages 81-90
Year of Publication: 2008
ISBN: 978-1-60558-303-7
Authors
Xiao-Yong Wei, City University of Hong Kong, Kowloon, Hong Kong
Chong-Wah Ngo, City University of Hong Kong, Kowloon, Hong Kong
Sponsors
ACM: Association for Computing Machinery
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1459359.1459371

ABSTRACT

Effective utilization of semantic concept detectors for large-scale video search has recently become a topic of intensive study. One of the main challenges is the selection and fusion of appropriate detectors, which must consider not only semantics but also the reliability, observability and diversity of detectors in the target video domain. In this paper, we present a novel fusion technique that considers these different aspects of detectors for query answering. In addition to utilizing detectors to bridge the semantic gap between user queries and multimedia data, we also address the "observability gap" among detectors, which cannot be directly inferred from semantic reasoning such as the use of an ontology. To facilitate the selection of detectors, we propose building two vector spaces: a semantic space (SS) and an observability space (OS). We categorize the detectors selected separately from SS and OS into four types: anchor, bridge, positive and negative concepts. A multi-level fusion strategy is proposed to combine the detectors, enhancing detector reliability while allowing the observability, semantics and diversity of concepts to be utilized for query answering. By evaluating the proposed approach on the TRECVID 2005-2007 datasets and queries, we demonstrate the significance of considering observability, reliability and diversity, in addition to the semantics of detectors with respect to queries.


