Cross-media correlation: a case study of navigated hypermedia documents
Source
International Multimedia Conference
Proceedings of the tenth ACM international conference on Multimedia
Juan-les-Pins, France
SESSION: Session 3: interfacing stored media I
Pages: 57 - 66
Year of Publication: 2002
ISBN: 1-58113-620-X
Authors
Wei-Ta Chu, National Chi Nan University, Puli, Nantou, Taiwan, R.O.C.
Herng-Yow Chen, National Chi Nan University, Puli, Nantou, Taiwan, R.O.C.
Bibliometrics
Downloads (6 Weeks): 3, Downloads (12 Months): 33, Citation Count: 5
ABSTRACT
Research issues in multiple-media correlation have arisen as multimedia applications become increasingly integrated. Multimedia correlation is used to coordinate different media and to facilitate cross-media access. This paper presents our work on two types of multimedia correlation: explicit and implicit relations. We develop a system that carefully captures explicit relations, and we devise computed synchronization processes that discover implicit relations between media objects. We describe the proposed computed synchronization techniques: a speech-text alignment process in the temporal domain, an automatic scrolling process in the spatial domain, and a content dependency check process in the content domain. Experimental results show that, in the speech-text alignment process, 80% of forced alignments are in sync even when the speech recognition accuracy is as low as 25%. The automatic scrolling process effectively maintains a resynchronization mechanism across different display environments.
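To give a rough intuition for why speech-text alignment can stay in sync when recognition accuracy is low, the sketch below anchors a known transcript on the few words a recognizer gets right and interpolates timestamps for the rest. This is not the authors' forced-alignment process, which this record does not describe in detail. It is a simplified text-level stand-in built on Python's standard `difflib`. The function names `align_transcript` and `interpolate_times`, the sample words, and the timestamps are all hypothetical.

```python
from difflib import SequenceMatcher


def align_transcript(transcript, recognized):
    """Anchor transcript words on matching recognizer output.

    transcript: list of words from the known text.
    recognized: list of (word, start_time) pairs from a (hypothetical) recognizer.
    Returns one start time per transcript word, or None where no anchor exists.
    """
    rec_words = [word for word, _ in recognized]
    matcher = SequenceMatcher(a=transcript, b=rec_words, autojunk=False)
    times = [None] * len(transcript)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            times[block.a + k] = recognized[block.b + k][1]
    return times


def interpolate_times(times, total_duration):
    """Fill unanchored words by linear interpolation between neighboring anchors.

    Assumes the recording starts at 0.0 and ends at total_duration.
    """
    n = len(times)
    filled = list(times)
    anchors = [(-1, 0.0)]
    anchors += [(i, t) for i, t in enumerate(times) if t is not None]
    anchors.append((n, total_duration))
    for (i0, t0), (i1, t1) in zip(anchors, anchors[1:]):
        for i in range(i0 + 1, i1):
            filled[i] = t0 + (t1 - t0) * (i - i0) / (i1 - i0)
    return filled


# Hypothetical data: two of the five words are misrecognized.
transcript = ["the", "quick", "brown", "fox", "jumps"]
recognized = [("the", 0.0), ("quit", 0.5), ("brown", 1.0),
              ("box", 1.5), ("jumps", 2.0)]

anchored = align_transcript(transcript, recognized)
estimated = interpolate_times(anchored, total_duration=2.5)
```

In this toy input only three of five words are recognized correctly, yet every transcript word ends up with a plausible start time, because the correct words bound the positions of the misrecognized ones.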
CITED BY 5
Richard Anderson , Crystal Hoyer , Craig Prince , Jonathan Su , Fred Videon , Steve Wolfman, Speech, ink, and slides: the interaction of content channels, Proceedings of the 12th annual ACM international conference on Multimedia, October 10-16, 2004, New York, NY, USA