ABSTRACT
Effective use of semantic concept detectors for large-scale video search has recently become a topic of intensive study. One of the main challenges is the selection and fusion of appropriate detectors, which must take into account not only their semantics but also their reliability, observability and diversity in the target video domain. In this paper, we present a novel fusion technique that considers these different aspects of detectors for query answering. In addition to using detectors to bridge the semantic gap between user queries and multimedia data, we address the "observability gap" among detectors, which cannot be directly inferred from semantic reasoning such as ontology-based reasoning. To facilitate the selection of detectors, we propose building two vector spaces: a semantic space (SS) and an observability space (OS). We categorize the detectors selected separately from SS and OS into four types: anchor, bridge, positive and negative concepts. We propose a multi-level fusion strategy that combines these detectors, improving detector reliability while exploiting the observability, semantics and diversity of concepts for query answering. Experiments on the TRECVID 2005-2007 datasets and queries demonstrate the significance of considering observability, reliability and diversity, in addition to the semantics of detectors with respect to queries.
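The idea of selecting detectors separately from two vector spaces and then fusing their outputs can be illustrated with a minimal sketch. The abstract does not state how SS and OS are constructed, how detectors are ranked, or how the multi-level fusion is computed, so everything below is an assumption made for illustration: cosine similarity as the ranking measure, a top-k cut-off per space, and a reliability-weighted average as a single-level stand-in for the fusion step. The function names (`cosine`, `select_detectors`, `fuse_scores`) and all inputs are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, detector_vecs, k):
    """Rank detectors by similarity to the query in one space and keep the top k."""
    ranked = sorted(
        detector_vecs,
        key=lambda name: cosine(query_vec, detector_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


def select_detectors(query_ss, query_os, ss_vecs, os_vecs, k):
    """Select detectors separately from SS and OS, then merge without duplicates.

    Assumption: a plain ordered union; the paper's anchor/bridge/positive/negative
    categorization is not modelled here.
    """
    selected = []
    for name in top_k(query_ss, ss_vecs, k) + top_k(query_os, os_vecs, k):
        if name not in selected:
            selected.append(name)
    return selected


def fuse_scores(shot_scores, selected, reliability):
    """Reliability-weighted average of detector scores for one video shot.

    Assumption: a simple linear fusion used as a stand-in for the paper's
    multi-level strategy, which the abstract does not specify.
    """
    total = sum(reliability[d] for d in selected)
    if total == 0.0:
        return 0.0
    return sum(reliability[d] * shot_scores[d] for d in selected) / total
```

In this sketch a detector that is semantically close to the query and a detector that is close only in the observability space can both contribute to the final score, which is the intuition the abstract describes; the weighting scheme itself is purely illustrative.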