Multiple instance learning for labeling faces in broadcasting news video
Source
International Multimedia Conference
Proceedings of the 13th annual ACM international conference on Multimedia
Hilton, Singapore
SESSION: Content 1: news video processing
Pages: 31-40
Year of Publication: 2005
ISBN: 1-59593-044-2
ABSTRACT
Labeling faces in news video with their names is an interesting research problem that was previously solved using supervised methods, which demand significant user effort in labeling training data. In this paper, we investigate a more challenging setting of the problem where complete information on data labels is not available. Specifically, by exploiting the uniqueness of a face's name, we formulate the problem as a special multi-instance learning (MIL) problem, namely the exclusive MIL (eMIL) problem, so that it can be tackled by a model trained with partial labeling information in the form of anonymity judgments of faces, which require less user effort to collect. We propose two discriminative probabilistic learning methods, named Exclusive Density (ED) and Iterative ED, for eMIL problems. Experiments on the face labeling problem show that the performance of the proposed approaches is superior to that of traditional MIL algorithms and close to that of supervised methods trained with complete data labels.
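
The abstract does not spell out the ED algorithm, so the following is not the authors' method. It is only a minimal Python sketch of the exclusivity idea, assuming a bag holds the faces that co-occur with one candidate name and that some model (not shown here) has already produced a score per face. The function names, the scores, and the 0.5 threshold are hypothetical and chosen for illustration.

```python
from typing import List, Optional


def pick_exclusive_positive(scores: List[float], threshold: float = 0.5) -> Optional[int]:
    """Return the index of the one face allowed to carry the name, or None.

    Assumption (not from the paper): `scores` holds one model score per face
    in a bag, and a name may be given to at most one face in that bag.
    """
    if not scores:
        return None
    # Highest-scoring face; ties resolve to the earliest index.
    best = max(range(len(scores)), key=lambda i: scores[i])
    # If even the best face is below the threshold, nobody gets the name.
    return best if scores[best] >= threshold else None


def label_bag(scores: List[float], threshold: float = 0.5) -> List[bool]:
    """Turn per-face scores into labels with at most one True per bag."""
    chosen = pick_exclusive_positive(scores, threshold)
    return [i == chosen for i in range(len(scores))]


if __name__ == "__main__":
    print(label_bag([0.2, 0.9, 0.7]))  # one face wins the name
    print(label_bag([0.1, 0.3]))       # no face is confident enough
```

The contrast with conventional MIL is the constraint itself: a positive bag there needs at least one positive instance, whereas this sketch never allows more than one.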