ABSTRACT
This paper describes the design and implementation of a client-centered multimedia content adaptation system for mobile environments comprising resource-constrained handheld devices (clients). The primary contributions of this work are: (1) the overall architecture of the client-centered content adaptation system, (2) a data-driven, multi-level hidden Markov model (HMM)-based approach that performs both video segmentation and video indexing in a single pass, and (3) the formulation and implementation of a video personalization strategy based on the Multiple-choice Multidimensional Knapsack Problem (MMKP). To segment and index video data, a video stream is modeled at both the semantic unit level and the video program level. These models are learned entirely from training data; no domain-dependent knowledge about the structure of video programs is used. The system can therefore handle various kinds of videos without the program model having to be manually redefined. The proposed MMKP-based personalization strategy is shown to include more relevant video content in response to the client's request than existing strategies based on the 0/1 knapsack problem and the fractional knapsack problem, and it can satisfy multiple client-side constraints simultaneously. Experimental results on CNN news videos and Major League Soccer (MLS) videos are presented and analyzed.
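To make the MMKP formulation concrete, the following is a minimal sketch of the general problem, not the solver used in this work. It assumes each video segment forms one group of mutually exclusive options (including a zero-cost "skip" option), that each option carries a relevance value and a cost in every constrained resource, and that exactly one option is chosen per group. The function name `solve_mmkp`, the data layout, and all numbers are hypothetical, and the exhaustive search is only practical for tiny instances.

```python
from itertools import product


def solve_mmkp(groups, capacities):
    """Exhaustively solve a tiny MMKP instance (illustrative only).

    groups: list of groups; each group is a list of (value, costs) options,
            where costs has one entry per resource dimension.
    capacities: tuple of resource limits, one per dimension.
    Returns (best_value, chosen_indices), or (None, None) if infeasible.
    """
    best_value, best_choice = None, None
    # Pick exactly one option index from every group.
    for choice in product(*(range(len(g)) for g in groups)):
        totals = [0] * len(capacities)
        value = 0
        for group, idx in zip(groups, choice):
            v, costs = group[idx]
            value += v
            for d, c in enumerate(costs):
                totals[d] += c
        # Every resource constraint must hold simultaneously.
        if all(t <= cap for t, cap in zip(totals, capacities)):
            if best_value is None or value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice


# Hypothetical data: three segments, options are
# (relevance, (seconds, kilobytes)); option 0 means "skip the segment".
segments = [
    [(0, (0, 0)), (3, (10, 200)), (5, (20, 500))],
    [(0, (0, 0)), (4, (15, 300)), (6, (30, 700))],
    [(0, (0, 0)), (2, (5, 100)), (3, (10, 250))],
]
limits = (40, 900)  # assumed client limits: 40 seconds, 900 kilobytes

print(solve_mmkp(segments, limits))  # (11, (2, 1, 1))
```

In contrast to a 0/1 knapsack, where a segment is either included whole or dropped, the per-group choice lets a shorter or lower-quality version of a segment stand in for the full one, and the cost tuple lets several client-side limits be checked at once.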