ABSTRACT
In this paper, we propose an approach that retrieves the motion of objects from videos based on dynamic time warping of view-invariant characteristics. The motion is represented as a sequence of dynamic instants and intervals, which are computed automatically from the spatiotemporal curvature of the trajectory of the moving object in the video. The Dynamic Time Warping (DTW) method matches trajectories using a view-invariant similarity measure. Our system incrementally learns different actions without any initialization mode, so it can work in an unsupervised manner. Relevant videos can then be retrieved by computing a simple distance metric. This paper makes two fundamental contributions to view-invariant video retrieval: (1) detection of dynamic instants in the trajectories of moving objects acquired from video; (2) view-invariant Dynamic Time Warping to measure the similarity between two trajectories of actions performed by different persons and seen from different viewpoints. Although the learning algorithm in our approach is relatively simple, we achieve a high recognition rate because of the view-invariant representation and the DTW-based similarity measure.
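The trajectory matching step described above can be illustrated with a minimal sketch of classic Dynamic Time Warping. This is not the authors' implementation: the function name `dtw_distance`, the `dist` parameter, and the plain Euclidean point distance are assumptions made for illustration. The abstract does not define the view-invariant similarity measure, so it would have to be supplied through `dist` in place of the Euclidean stand-in.

```python
import math


def dtw_distance(a, b, dist=None):
    """Classic DTW cost between two sequences of points.

    `a` and `b` are sequences of equal-dimension coordinate tuples.
    `dist` is a hypothetical hook for a per-sample dissimilarity;
    it defaults to Euclidean distance, which is NOT view invariant
    and only stands in for the measure used in the paper.
    """
    if dist is None:
        dist = math.dist  # Euclidean distance (Python 3.8+)

    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Same path sampled at two different speeds: warping absorbs the difference.
slow = [(0, 0), (0, 0), (1, 1), (1, 1), (2, 2)]
fast = [(0, 0), (1, 1), (2, 2)]
other = [(0, 0), (1, 0), (2, 0)]

same_cost = dtw_distance(fast, slow)
diff_cost = dtw_distance(fast, other)
```

In this toy example the two samplings of the same path align with zero cost, while the different path yields a positive cost, which is the property that lets a simple distance threshold drive retrieval.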