Cross-media correlation: a case study of navigated hypermedia documents
Source: International Multimedia Conference
Proceedings of the tenth ACM international conference on Multimedia
Juan-les-Pins, France
SESSION: Session 3: interfacing stored media I
Pages: 57-66
Year of Publication: 2002
ISBN:1-58113-620-X
Authors
Wei-Ta Chu  National Chi Nan University, Puli, Nantou, Taiwan, R.O.C.
Herng-Yow Chen  National Chi Nan University, Puli, Nantou, Taiwan, R.O.C.
Sponsors
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
SIGCOMM: ACM Special Interest Group on Data Communication
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 3,   Downloads (12 Months): 33,   Citation Count: 5
DOI: http://doi.acm.org/10.1145/641007.641017

ABSTRACT

Research issues in multiple-media correlation have arisen as multimedia applications become more integrated. Multimedia correlation is used to coordinate different media and to facilitate cross-media access. This paper presents our work on two types of multimedia correlation: explicit and implicit relations. We develop a system that carefully captures explicit relations and devise computed synchronization processes that discover implicit relations between media objects. The proposed computed synchronization techniques are addressed: a speech-text alignment process in the temporal domain, an automatic scrolling process in the spatial domain, and a content dependency check process in the content domain. Experimental results show that in the speech-text alignment process 80% of forced alignments are in sync even when the speech recognition accuracy is as low as 25%. The automatic scrolling process effectively maintains a resynchronization mechanism across different display environments.
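
As an illustration only (not the authors' system), the sketch below suggests why word timing can stay usable when recognition accuracy is low: the few recognized words that match the known transcript act as timed anchors, and the remaining words receive times by linear interpolation. The function name `align_transcript`, its parameters, and the use of Python's `difflib.SequenceMatcher` as the matcher are assumptions made for this example; the paper's alignment process may differ substantially.

```python
from difflib import SequenceMatcher


def align_transcript(transcript, recognized, duration):
    """Estimate a start time for every transcript word (hypothetical helper).

    transcript: list of words from the known text.
    recognized: list of (word, start_time) pairs from a recognizer,
                sorted by time; many of the words may be wrong.
    duration:   total length of the audio in seconds.
    """
    rec_words = [word for word, _ in recognized]
    matcher = SequenceMatcher(a=transcript, b=rec_words, autojunk=False)

    # Transcript positions whose word was also recognized become anchors.
    anchors = {}
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            anchors[block.a + k] = recognized[block.b + k][1]

    # Add boundary points so words before the first anchor and after the
    # last anchor can be interpolated too (an assumption of this sketch).
    points = sorted(anchors.items())
    if 0 not in anchors:
        points.insert(0, (0, 0.0))
    points.append((len(transcript), duration))

    # Linear interpolation between consecutive anchor points.
    times = [0.0] * len(transcript)
    for (i0, t0), (i1, t1) in zip(points, points[1:]):
        for i in range(i0, i1):
            times[i] = t0 + (t1 - t0) * (i - i0) / (i1 - i0)

    return list(zip(transcript, times))


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog".split()
    # Only "quick" and "over" are recognized correctly.
    heard = [("quick", 1.0), ("crown", 2.0), ("socks", 3.0),
             ("over", 5.0), ("hazy", 7.0)]
    for word, t in align_transcript(text, heard, duration=9.0):
        print(f"{t:4.1f}s  {word}")
```

In this toy input only two of the five recognized words match the transcript, yet every transcript word still receives a plausible time because the matches are spread across the utterance. Real speech is not evenly paced, so interpolation of this kind is a rough approximation.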



