ABSTRACT
This paper addresses the challenge of multi-policy optimization in decentralized autonomic systems. We evaluate several multi-policy optimization techniques based on reinforcement learning in an urban traffic control simulation, a canonical example of a decentralized autonomic system. Our results indicate that W-learning, which learns separately for each policy and then selects among the nominated actions based on their current importance, is a suitable approach for optimizing towards multiple policies on non-collaborating agents in heterogeneous autonomic environments.
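The selection step described above can be illustrated with a minimal sketch. It assumes each policy keeps its own table of action values and a per-state importance weight (a W-value); the function name, the dictionary layout, and the traffic-signal action names are hypothetical and are not taken from the paper. How the weights themselves are learned is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: q_tables[i][state][action] is the value that policy i
# assigns to an action; w_values[i][state] is the importance policy i
# attaches to having its nominated action executed in that state.
QTable = Dict[str, Dict[str, float]]
WTable = Dict[str, float]


def select_action(
    q_tables: List[QTable], w_values: List[WTable], state: str
) -> Tuple[int, str]:
    """Return (index of the winning policy, the action it nominated)."""
    # Each policy nominates the action it values most in the current state.
    nominations = []
    for q in q_tables:
        actions = q[state]
        nominations.append(max(actions, key=actions.get))

    # The policy with the highest importance weight for this state wins.
    winner = max(range(len(q_tables)), key=lambda i: w_values[i][state])
    return winner, nominations[winner]


# Two illustrative policies at one junction, with made-up numbers.
q_tables = [
    {"s0": {"extend_green": 1.0, "switch": 0.2}},
    {"s0": {"extend_green": 0.1, "switch": 0.9}},
]
w_values = [{"s0": 0.3}, {"s0": 0.7}]

winner, action = select_action(q_tables, w_values, "s0")
```

In this example the second policy has the larger weight in state `s0`, so its nomination is the one executed, even though the first policy prefers a different action.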