Naming every individual in news video monologues
Source: International Multimedia Conference
Proceedings of the 12th annual ACM international conference on Multimedia
New York, NY, USA
SESSION: Technical session 6: learning in multi-modal data
Pages: 580 - 587
Year of Publication: 2004
ISBN:1-58113-893-8
Authors
Jun Yang  Carnegie Mellon University, Pittsburgh, PA
Alexander G. Hauptmann  Carnegie Mellon University, Pittsburgh, PA
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1027527.1027666

ABSTRACT

Naming every individual appearing in broadcast news video with names detected in the video transcript improves access to news video content. In this paper, we approach this challenging problem with a statistical learning method. We explore two categories of information extracted from multiple video modalities: features, which help distinguish the true name of each person, and constraints, which reveal the relationships among the names of different persons. The person-naming problem is formulated as a learning framework that predicts the most likely name for each person based on the features and refines the predictions using the constraints. Experiments on ABC World News Tonight and CNN Headline News videos demonstrate that this approach outperforms a non-learning alternative by a large margin.
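
The two-stage idea in the abstract (predict a name per person from features, then refine with constraints) can be illustrated with a toy sketch. This is not the paper's model: the score values, the person and name labels, the `must_differ` constraint type, and the penalty weight are all assumptions made for illustration, and the refinement here is a plain exhaustive search rather than whatever procedure the authors use.

```python
from itertools import product
from math import log


def independent_prediction(scores):
    """Stage 1: pick the highest-scoring candidate name for each person alone."""
    return {person: max(cands, key=cands.get) for person, cands in scores.items()}


def refine_with_constraints(scores, must_differ, penalty=5.0):
    """Stage 2: search all joint assignments, penalising violated constraints.

    must_differ is a hypothetical constraint type: pairs of persons that are
    assumed to carry different names. Exhaustive search is only feasible for
    a handful of persons and candidate names.
    """
    persons = sorted(scores)
    best, best_value = None, float("-inf")
    for names in product(*(scores[p] for p in persons)):
        assignment = dict(zip(persons, names))
        # Sum of log-scores of the chosen names (feature-based evidence).
        value = sum(log(scores[p][assignment[p]]) for p in persons)
        # Subtract a fixed penalty for every violated pairwise constraint.
        value -= penalty * sum(assignment[a] == assignment[b] for a, b in must_differ)
        if value > best_value:
            best, best_value = assignment, value
    return best


# Made-up per-person name scores, standing in for a learned model's output.
scores = {
    "person_1": {"Alice Smith": 0.60, "Bob Jones": 0.40},
    "person_2": {"Alice Smith": 0.55, "Bob Jones": 0.45},
}
initial = independent_prediction(scores)
refined = refine_with_constraints(scores, must_differ=[("person_1", "person_2")])
```

In this toy input both persons are initially assigned the same name; the constraint then moves the less confident one to its second-best candidate.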



