ACM Home Page
Please provide us with feedback. Feedback
Scaling ant colony optimization with hierarchical reinforcement learning partitioning
Full text PdfPdf (411 KB)
Source
Genetic And Evolutionary Computation Conference archive
Proceedings of the 10th annual conference on Genetic and evolutionary computation table of contents
Atlanta, GA, USA
SESSION: Ant colony optimization, swarm intelligence, and artificial immune systems papers table of contents
Pages 25-32  
Year of Publication: 2008
ISBN:978-1-60558-130-9
Authors
Erik J. Dries  Air Force Institute of Technology, Wright-Patterson AFB, OH, USA
Gilbert L. Peterson  Air Force Institute of Technology, Wright-Patterson AFB, OH, USA
Sponsors
ACM: Association for Computing Machinery
SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 8,   Downloads (12 Months): 110,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1389095.1389100
What is a DOI?

ABSTRACT

This paper merges hierarchical reinforcement learning (HRL) with ant colony optimization (ACO) to produce a HRL ACO algorithm capable of generating solutions for large domains. This paper describes two specific implementations of the new algorithm: the first a modification to Dietterich's MAXQ-Q HRL algorithm, the second a hierarchical ant colony system algorithm. These implementations generate faster results, with little to no significant change in the quality of solutions for the tested problem domains. The application of ACO to the MAXQ-Q algorithm replaces the reinforcement learning, Q-learning, with the modified ant colony optimization method, Ant-Q. This algorithm, MAXQ-AntQ, converges to solutions not significantly different from MAXQ-Q in 88% of the time. This paper then transfers HRL techniques to the ACO domain and traveling salesman problem (TSP). To apply HRL to ACO, a hierarchy must be created for the TSP. A data clustering algorithm creates these subtasks, with an ACO algorithm to solve the individual and complete problems. This paper tests two clustering algorithms, k-means and G-means. The results demonstrate the algorithm with data clustering produces solutions 20 times faster with 5-10% decrease in solution quality due to the effects of clustering.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
T. G. Dietterich. Hierarchical reinforcement learning with the maxq value function decomposition. Journal of Artificial Intelligence, (13):227--303, Nov. 2000.
 
4
M. Dorigo, M. Birattari, and T. Stutzle. Ant colony optimization. Computational Intellligence Magazine, IEEE, 1(4):28--39, 2006.
 
5
M. Dorigo and L. M. Gambardella. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1(1):53--66, April 1997.
 
6
L. M. Gambardella and M. Dorigo. Ant-q: A reinforcement learning approach to the traveling salesman problem. In International Conference on Machine Learning, pages 252--260, 1995.
 
7
G. Hamerly and C. Elkan. Learning the k in k-means. In Proceedings of the Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), pages 281--288, 2003.
 
8
 
9
J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Symposium on Mathematical Statistics and Probabilities, pages 281--297, 1967.
 
10
R. Parr and S. Russell. Reinforcement learning with hierarchies of machines, Jan. 17 1997.
 
11
 
12
G. Reinelt. Tsplib 95, 2007.
 
13
W. D. Smart. Explicit manifold representations for value-functions in reinforcement learning. In Proceedings of the Eighth International Symposium on Artificial Intelligence and Mathematics, January 2004. Paper number AI&M 25-2004.
 
14
T. Stutzle and H. Hoos. The max-min ant system and local search for combinatorial optimization problems, 1999.
 
15
 
16
 
17
C. Walshaw. A Multilevel Approach to the Travelling Salesman Problem. Tech. Rep. 00/IM/63, Comp. Math. Sci., Univ. Greenwich, London SE10 9LS, UK, August 2000.
 
18
C. Watkins. Learning From Delayed Rewards. PhD thesis, King's College, Oxford, May 1989.

Collaborative Colleagues:
Erik J. Dries: colleagues
Gilbert L. Peterson: colleagues