On the detection of semantic concepts at TRECVID
Source: International Multimedia Conference
Proceedings of the 12th annual ACM international conference on Multimedia
New York, NY, USA
SESSION: Brave new topics -- session 3: the effect of benchmarking on advances in semantic video
Pages: 660 - 667  
Year of Publication: 2004
ISBN:1-58113-893-8
Authors
Milind R. Naphade  IBM Thomas J. Watson Research Center, Hawthorne, NY
John R. Smith  IBM Thomas J. Watson Research Center, Hawthorne, NY
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 114,   Citation Count: 34
DOI: http://doi.acm.org/10.1145/1027527.1027680

ABSTRACT

Semantic multimedia management is necessary for the effective and widespread use of multimedia repositories and for realizing the potential that lies untapped in their rich multimodal content. This challenge has driven researchers to devise new algorithms and systems that enable automatic or semi-automatic tagging of large-scale multimedia content with rich semantics. An emerging research area is the detection of a predetermined set of semantic concepts that can act as semantic filters and aid in search and manipulation. The NIST TRECVID benchmark has responded by creating a task that evaluates the performance of concept detection. Within the scope of this benchmark task, this paper studies trends in the emerging concept detection systems, architectures, and algorithms. It also analyzes strategies that have yielded reasonable success, as well as the challenges and gaps that lie ahead.

