View-invariant action recognition using interest points
Source
International Multimedia Conference archive
Proceeding of the 1st ACM international conference on Multimedia information retrieval
Vancouver, British Columbia, Canada
SESSION: Video concept, action, and retrieval
Pages 305-312  
Year of Publication: 2008
ISBN:978-1-60558-312-9
Authors
Yuedong Yang, Beihang University, Beijing, China
Aimin Hao, Beihang University, Beijing, China
Qinping Zhao, Beihang University, Beijing, China
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1460096.1460146

ABSTRACT

In this paper, we present a two-layer classification model for view-invariant human action recognition based on interest points. Training videos of each action are recorded from multiple viewpoints and represented as space-time interest points. These videos require neither temporal alignment nor camera estimation. The first layer of the model is view clustering: we cluster all the videos of an action using K-Means, which breaks the action into several sub-actions. The second layer is Bayes classification: we use Naïve Bayes to train sub-classifiers for the sub-actions, and then generate an optimal classifier for the action. The optimal classifiers can recognize unlabeled data, which may be single-view videos, multi-view videos, or long multi-action videos. Finally, we test our algorithm on the IXMAS dataset and the CMU motion capture library. The experiments demonstrate that our algorithm recognizes actions in a view-invariant manner and achieves high recognition rates.
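
The two layers described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: each video is reduced to a histogram of quantized interest-point codewords (a bag-of-words representation), the sub-classifiers are multinomial Naïve Bayes models with Laplace smoothing, all actions have equal priors, and the "optimal classifier" for an action is approximated by taking the best-scoring sub-classifier. All function names and the toy data are hypothetical.

```python
import math
import random

def normalize(hist):
    """Scale a codeword histogram so its entries sum to 1."""
    total = sum(hist) or 1
    return [h / total for h in hist]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-Means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def train_sub_classifier(hists, alpha=1.0):
    """Multinomial Naive Bayes: smoothed log-probability of each codeword."""
    vocab = len(hists[0])
    counts = [alpha + sum(h[w] for h in hists) for w in range(vocab)]
    total = sum(counts)
    return [math.log(c / total) for c in counts]

def train_action(hists, k):
    """Layer 1: cluster an action's videos into sub-actions (views).
    Layer 2: train one Naive Bayes sub-classifier per non-empty cluster."""
    labels = kmeans([normalize(h) for h in hists], k)
    model = []
    for c in range(k):
        members = [h for h, l in zip(hists, labels) if l == c]
        if members:
            model.append(train_sub_classifier(members))
    return model

def score(model, hist):
    """Assumed combination rule: best log-likelihood over sub-classifiers."""
    return max(sum(n * lp for n, lp in zip(hist, sub)) for sub in model)

def classify(models, hist):
    """Pick the action whose model scores the histogram highest."""
    return max(models, key=lambda action: score(models[action], hist))

# Toy data (hypothetical): 4 codewords, two views per action.
training = {
    "wave": [[8, 1, 1, 0], [9, 0, 1, 0], [1, 8, 0, 1], [0, 9, 1, 0]],
    "kick": [[1, 0, 8, 1], [0, 1, 9, 0], [0, 1, 1, 8], [1, 0, 0, 9]],
}
models = {action: train_action(hists, k=2) for action, hists in training.items()}

print(classify(models, [9, 1, 0, 0]))
print(classify(models, [0, 0, 1, 9]))
```

Clustering the videos of one action before training lets each sub-classifier model a single viewpoint's appearance, so no explicit camera information is needed at training time. Taking the maximum over sub-classifiers is only one possible way to combine them; the abstract does not specify the rule used in the paper.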


