Multimodal metadata fusion using causal strength
Source: International Multimedia Conference
Proceedings of the 13th annual ACM international conference on Multimedia
Hilton, Singapore
SESSION: Content 6: multimodal processing
Pages: 872-881
Year of Publication: 2005
ISBN: 1-59593-044-2
Authors
Yi Wu  University of California, Santa Barbara, CA
Edward Y. Chang  University of California, Santa Barbara, CA
Belle L. Tseng  NEC Labs America, Cupertino, CA
Sponsors
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1101149.1101338

ABSTRACT

We propose a probabilistic framework that uses influence diagrams to fuse metadata from multiple modalities for photo annotation. The framework combines contextual information (location, time, and camera parameters), visual content (holistic and local perceptual features), and a semantic ontology. Causal strengths encode the causal relationships among variables, and between variables and semantic labels. Through analytical and empirical studies, we demonstrate that this fusion approach achieves high-quality photo annotation with good interpretability, substantially outperforming traditional methods.
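
The abstract does not say how a causal strength is computed. As a rough illustration only, the sketch below uses one common definition from the causal-reasoning literature, generative causal power: q = (P(e|c) - P(e|not c)) / (1 - P(e|not c)). Whether the paper uses exactly this form is an assumption, and the function name, the counts, and the "flash fired" / "indoor" cue and label are all hypothetical.

```python
def causal_power(n_label_with_cue, n_with_cue, n_label_without_cue, n_without_cue):
    """Generative causal power of a cue c on a label e, from photo counts.

    ASSUMPTION: this is one standard definition of causal strength, not
    necessarily the one used in the paper.
    """
    if n_with_cue <= 0 or n_without_cue <= 0:
        raise ValueError("need at least one photo with and without the cue")
    p_e_given_c = n_label_with_cue / n_with_cue            # P(e | c)
    p_e_given_not_c = n_label_without_cue / n_without_cue  # P(e | not c)
    if p_e_given_not_c >= 1.0:
        raise ValueError("causal power is undefined when P(e | not c) = 1")
    # Share of the cases not already explained by background causes
    # that the cue accounts for.
    return (p_e_given_c - p_e_given_not_c) / (1.0 - p_e_given_not_c)


# Hypothetical counts: 30 of 40 photos with the cue (say, flash fired) carry
# the label "indoor"; 15 of 60 photos without the cue carry it.
strength = causal_power(30, 40, 15, 60)
print(round(strength, 3))
```

Under a definition like this, a cue with strength near 1 almost always produces the label when no background cause does, which is one way such a score can stay interpretable when several modalities are fused.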


