Optimizing time warp simulation with reinforcement learning techniques
Source
Winter Simulation Conference
Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
Washington D.C.
SESSION: Modeling methodology A: distributed simulation I
Pages 577-584
Year of Publication: 2007
ISBN:1-4244-1306-0
Publisher
IEEE Press
Piscataway, NJ, USA
ABSTRACT
Adaptive Time Warp protocols in the literature are usually based on a pre-defined analytic model of the system, expressed as a closed-form function that maps system state to a control parameter. The underlying assumption is that this model itself is optimal. In this paper we present a new approach that uses Reinforcement Learning techniques, also known as simulation-based dynamic programming. Instead of assuming an optimal control strategy, the goal of Reinforcement Learning is to find the optimal strategy through simulation. A value function that captures the history of system feedback is used, and no prior knowledge of the system is required. Our reinforcement learning techniques were implemented in a distributed VLSI simulator with the objective of finding the optimal size of a bounded time window. Experiments on two benchmark circuits indicate that the approach succeeded in doing so.
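To make the idea of learning a window size from feedback concrete, the following is a minimal sketch and not the paper's algorithm, which the abstract does not specify. It assumes an epsilon-greedy, n-armed-bandit formulation in which each candidate window size keeps a running value estimate updated from observed rewards. The names `learn_window_size`, `toy_reward`, `epsilon`, and `alpha`, and the reward shape itself, are hypothetical stand-ins for feedback that a real Time Warp simulator would supply (for example, event commit rate).

```python
import random


def learn_window_size(candidate_windows, measure_reward,
                      steps=2000, epsilon=0.1, alpha=0.1, seed=0):
    """Epsilon-greedy search over candidate time-window sizes.

    values[w] is a running estimate of the reward obtained with window w;
    it plays the role of a value function built from the history of feedback.
    """
    rng = random.Random(seed)
    values = {w: 0.0 for w in candidate_windows}
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random window size.
            window = rng.choice(candidate_windows)
        else:
            # Exploit: use the window with the best current estimate.
            window = max(candidate_windows, key=lambda w: values[w])
        reward = measure_reward(window, rng)
        # Incremental update toward the newly observed reward.
        values[window] += alpha * (reward - values[window])
    best = max(candidate_windows, key=lambda w: values[w])
    return best, values


def toy_reward(window, rng):
    # Hypothetical feedback: best at window size 40, with measurement noise.
    # A real system would measure simulator performance here instead.
    return -abs(window - 40) / 40.0 + rng.gauss(0.0, 0.05)


best, values = learn_window_size([10, 20, 40, 80, 160], toy_reward)
```

The constant step size `alpha` weights recent feedback more heavily than old feedback, which is a common choice when the best setting may drift during a run. No model of the system is supplied to the learner; it sees only the rewards.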