ABSTRACT
A number of researchers have been building detectors for high-level semantic concepts such as outdoors, face, and building to support semantic video retrieval. Using the TRECVID video collection and LSCOM truth annotations for 300 concepts, we simulate video retrieval performance under different assumptions about concept detection accuracy. Even low detection accuracy yields good retrieval results when sufficiently many concepts are used. Extrapolating under reasonable assumptions, we conclude that "concept-based" video retrieval with fewer than 5000 concepts, each detected with a minimal accuracy of 10% mean average precision, is likely to provide high-accuracy results, comparable to text retrieval on the web, in a typical broadcast news collection. We also present evidence that it should be feasible to find sufficiently many new, useful concepts for retrieval.
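For readers unfamiliar with the accuracy measure quoted above, the following is a minimal sketch of how non-interpolated average precision and its mean over several ranked lists are commonly computed. It is an illustration only, not the evaluation code used in this work: the function names are hypothetical, and it assumes every relevant item appears somewhere in the ranked list.

```python
from typing import Iterable, Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Non-interpolated average precision of one ranked list.

    ranked_relevance[i] is True if the item at rank i + 1 is relevant.
    Assumption: all relevant items are present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Iterable[Sequence[bool]]) -> float:
    """Mean of the per-list average precision values."""
    scores = [average_precision(run) for run in runs]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this definition, a ranked list with relevant items at ranks 1 and 3 scores (1/1 + 2/3) / 2, so a detector at 10% mean average precision places relevant shots far from the top on average.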