Learning complex motions by sequencing simpler motion templates
Source: ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 753-760  
Year of Publication: 2009
ISBN: 978-1-60558-516-1
Authors
Gerhard Neumann  Graz University of Technology, Graz, Austria
Wolfgang Maass  Graz University of Technology, Graz, Austria
Jan Peters  Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Sponsors
MITACS
NSF
Microsoft Research
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 25,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1553374.1553471

ABSTRACT

Abstraction of complex, longer motor tasks into simpler elemental movements enables humans and animals to exhibit motor skills which have not yet been matched by robots. Humans intuitively decompose complex motions into smaller, simpler segments. For example, when describing simple movements like drawing a triangle with a pen, we can easily name the basic steps of this movement.

Surprisingly, such abstractions have rarely been used in artificial motor skill learning algorithms. These algorithms typically choose a new action (such as a torque or a force) at a very fast time-scale. As a result, both the policy and the temporal credit assignment problem become unnecessarily complex, often beyond the reach of current machine learning methods.

We introduce a new framework for temporal abstractions in reinforcement learning (RL), i.e., RL with motion templates. We present a new algorithm for this framework which can learn high-quality policies by making only a few abstract decisions.
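
To make the contrast between fast low-level actions and a few abstract decisions concrete, the triangle example from the abstract can be sketched in Python. This is only an illustration under stated assumptions, not the algorithm presented in the paper: the names `line_template` and `draw_triangle`, the straight-line template, and the choice of 100 low-level steps per stroke are all hypothetical.

```python
def line_template(start, end, n_steps):
    """Hypothetical motion template: one straight stroke from start to end,
    expanded into n_steps low-level position targets."""
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * k / n_steps, y0 + (y1 - y0) * k / n_steps)
        for k in range(1, n_steps + 1)
    ]


def draw_triangle(corners, n_steps=100):
    """Sequence three line templates; choosing each template (with its
    end point) counts as one abstract decision."""
    trajectory = []
    decisions = 0
    for i in range(3):
        start, end = corners[i], corners[(i + 1) % 3]
        trajectory.extend(line_template(start, end, n_steps))
        decisions += 1
    return trajectory, decisions


corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
trajectory, decisions = draw_triangle(corners)
print(decisions, len(trajectory))  # abstract decisions vs. low-level steps
```

In this toy setting the pen path consists of 300 low-level targets, yet only three choices were made. A learner that selects one action per low-level step would have to assign credit across all 300 of them.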


