ABSTRACT
This paper describes a new unified representation for the information in a video. We reduce the dimensionality of the signal with either a singular-value decomposition (on the semantic and image data) or mel-frequency cepstral coefficients (on the audio data) and then concatenate the vectors to form a multi-dimensional representation of the video. Using scale-space techniques, we find large jumps in the video's path, which we call edges. We use these techniques to analyze the temporal properties of the audio and image data in a video. This analysis creates a hierarchical segmentation of the video, or a table-of-contents, from the audio, semantic and image data.
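The edge-finding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the per-frame image and audio vectors have already been reduced (the SVD and MFCC steps are skipped), and it treats an edge as a local maximum in the frame-to-frame distance of the Gaussian-smoothed path. The function names, the `min_ratio` threshold, and the synthetic two-scene data are all hypothetical choices made for illustration.

```python
import math
import random


def gaussian_kernel(sigma):
    """Normalized Gaussian weights, truncated at about three standard deviations."""
    radius = max(1, round(3 * sigma))
    weights = [math.exp(-(t * t) / (2 * sigma * sigma))
               for t in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_path(path, sigma):
    """Smooth every dimension of the path over time at scale `sigma`."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(path) - 1
    dims = len(path[0])
    smoothed = []
    for t in range(len(path)):
        point = [0.0] * dims
        for offset, weight in enumerate(kernel):
            # Clamp the index so the ends of the path reuse the edge frames.
            src = path[min(max(t + offset - radius, 0), last)]
            for d in range(dims):
                point[d] += weight * src[d]
        smoothed.append(point)
    return smoothed


def find_edges(path, sigma, min_ratio=0.5):
    """Return indices i where the smoothed path jumps most between frames i and i+1.

    `min_ratio` is an assumed heuristic: peaks weaker than this fraction
    of the largest jump are ignored.
    """
    smoothed = smooth_path(path, sigma)
    speed = [math.dist(a, b) for a, b in zip(smoothed, smoothed[1:])]
    floor = min_ratio * max(speed)
    return [i for i in range(1, len(speed) - 1)
            if speed[i] >= floor
            and speed[i] > speed[i - 1]
            and speed[i] >= speed[i + 1]]


def make_synthetic_path(frames_per_scene=100, seed=0):
    """Hypothetical data: two scenes whose image and audio vectors differ in mean."""
    rng = random.Random(seed)
    path = []
    for scene_mean in (0.0, 1.0):
        for _ in range(frames_per_scene):
            image = [scene_mean + rng.gauss(0, 0.1) for _ in range(5)]
            audio = [scene_mean + rng.gauss(0, 0.1) for _ in range(13)]
            path.append(image + audio)  # concatenate into one vector per frame
    return path


if __name__ == "__main__":
    video_path = make_synthetic_path()
    for sigma in (1.0, 2.0, 4.0):
        print(sigma, find_edges(video_path, sigma))
```

Running `find_edges` at several values of `sigma` is one way to approximate the hierarchy the abstract mentions: edges that survive at coarse scales would mark major boundaries, and edges that appear only at fine scales would mark sub-sections.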
INDEX TERMS
Primary Classification:
I. Computing Methodologies
  I.2 ARTIFICIAL INTELLIGENCE
    I.2.10 Vision and Scene Understanding
      Subjects: Video analysis
Additional Classification:
G. Mathematics of Computing
  G.4 MATHEMATICAL SOFTWARE
    Subjects: Parallel and vector implementations
H. Information Systems
  H.5 INFORMATION INTERFACES AND PRESENTATION (I.7)
    H.5.1 Multimedia Information Systems
      Subjects: Evaluation/methodology
General Terms: Design, Experimentation, Performance
Keywords: audio, automatic segmentation, color space, hierarchy, images, latent semantic indexing, multimedia, video, scale space, semantic space, singular-value decomposition, temporal properties