ABSTRACT
Meta-level control manages the allocation of limited resources to deliberative actions. This paper discusses our efforts to add meta-level control capabilities to a Markov Decision Process (MDP)-based scheduling agent. The agent's reasoning process involves continuous partial unrolling of the MDP state space and periodic reprioritization of the states to be expanded. The meta-level controller makes situation-specific decisions about when the agent should stop unrolling and derive a partial policy, while bounding the cost of state reprioritization. The approach combines performance profiling with multi-level strategies in its decision making. We present results showing the performance advantage of dynamic meta-level control for this complex agent.
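The control loop summarized above can be illustrated with a minimal sketch. This is not the agent described in the paper: the function `unroll` and its parameters (`successors`, `priority`, `marginal_gain`, `step_cost`, `max_expansions`, `reprioritize_every`) are hypothetical names chosen for illustration, and the stopping rule is a simple stand-in for a decision based on a performance profile. It assumes that the expected benefit of one more expansion can be estimated from the number of states expanded so far.

```python
import heapq
from itertools import count


def unroll(start, successors, priority, marginal_gain, step_cost,
           max_expansions, reprioritize_every):
    """Partially unroll a state space under a toy meta-level stopping rule.

    All names here are illustrative assumptions, not the paper's API.
    Returns the states in the order they were expanded.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(-priority(start), next(tie), start)]
    seen = {start}
    expanded = []
    while frontier and len(expanded) < max_expansions:
        # Meta-level decision: stop once the estimated gain of another
        # expansion no longer exceeds its cost.
        if marginal_gain(len(expanded)) <= step_cost:
            break
        _, _, state = heapq.heappop(frontier)
        expanded.append(state)
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-priority(nxt), next(tie), nxt))
        # Rescoring the whole frontier is expensive, so it is done only
        # every `reprioritize_every` expansions to bound that cost.
        if len(expanded) % reprioritize_every == 0:
            frontier = [(-priority(s), next(tie), s) for _, _, s in frontier]
            heapq.heapify(frontier)
    return expanded


# Toy binary-tree state space: state s has children 2s and 2s + 1.
children = lambda s: [2 * s, 2 * s + 1] if s < 100 else []
order = unroll(1, children, priority=lambda s: -s,
               marginal_gain=lambda n: 1.0 / (n + 1), step_cost=0.25,
               max_expansions=50, reprioritize_every=2)
```

In this toy setting the assumed gain estimate decays as more states are expanded, so the loop halts well before the expansion budget is reached. A lower `reprioritize_every` keeps the frontier ordering fresher at the price of more rescoring work.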