Optimizing time warp simulation with reinforcement learning techniques
Source Winter Simulation Conference archive
Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
Washington D.C.
SESSION: Modeling methodology A: distributed simulation I
Pages 577-584  
Year of Publication: 2007
ISBN:1-4244-1306-0
Authors
Jun Wang  McGill University, Montreal, Canada
Carl Tropper  McGill University, Montreal, Canada
Sponsors
INFORMS-SIM : Institute for Operations Research and the Management Sciences: Simulation Society
NIST : National Institute of Standards and Technology
(SCS) : The Society for Modeling and Simulation International
ACM/SIGSIM : Association for Computing Machinery: Special Interest Group on Simulation
IIE : Institute of Industrial Engineers
ASA : American Statistical Association
IEEE/SMC : Institute of Electrical and Electronics Engineers: Systems, Man, and Cybernetics Society
Publisher
IEEE Press  Piscataway, NJ, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 33,   Citation Count: 0
ABSTRACT

Adaptive Time Warp protocols in the literature are usually based on a predefined analytic model of the system, expressed as a closed-form function that maps the system state to a control parameter. The underlying assumption is that this model is itself optimal. In this paper we present a new approach that uses Reinforcement Learning techniques, also known as simulation-based dynamic programming. Instead of assuming an optimal control strategy, the goal of Reinforcement Learning is to find the optimal strategy through simulation. A value function that captures the history of system feedback is used, and no prior knowledge of the system is required. Our Reinforcement Learning techniques were implemented in a distributed VLSI simulator with the objective of finding the optimal size of a bounded time window. Our experiments using two benchmark circuits indicated that the approach was successful in doing so.
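
The abstract does not state which learning algorithm, state representation, or reward signal the authors used, so the following is only a minimal sketch of the general idea: a value function built up from observed feedback, used to choose a bounded time window size without a prior model. It assumes an epsilon-greedy learner over a small discrete set of candidate window sizes with an incremental value update. The class `WindowSizeLearner`, the candidate sizes, and `toy_reward` (a made-up stand-in for simulator feedback such as event commit rate) are all hypothetical and do not come from the paper.

```python
import random


class WindowSizeLearner:
    """Epsilon-greedy learner over candidate window sizes (illustrative only)."""

    def __init__(self, candidate_sizes, step_size=0.1, epsilon=0.1, seed=0):
        self.sizes = list(candidate_sizes)
        self.step_size = step_size
        self.epsilon = epsilon
        # One value estimate per candidate size, all starting at zero.
        self.values = {size: 0.0 for size in self.sizes}
        self.rng = random.Random(seed)

    def choose(self):
        """Explore with probability epsilon, otherwise pick the best-rated size."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.sizes)
        return self.best()

    def update(self, size, reward):
        """Move the estimate for `size` a fraction of the way toward the reward."""
        self.values[size] += self.step_size * (reward - self.values[size])

    def best(self):
        """Return the size with the highest current value estimate."""
        return max(self.sizes, key=lambda size: self.values[size])


def toy_reward(size, rng):
    """Hypothetical feedback: highest on average at size 40, plus small noise.

    A real system would measure this from the running simulator instead.
    """
    return -abs(size - 40) / 40.0 + rng.gauss(0.0, 0.05)


def run_demo(steps=2000, seed=1):
    """Run the learner against the toy reward and return it for inspection."""
    rng = random.Random(seed)
    learner = WindowSizeLearner([10, 20, 40, 80, 160], seed=seed)
    for _ in range(steps):
        size = learner.choose()
        learner.update(size, toy_reward(size, rng))
    return learner
```

Calling `run_demo().best()` returns the window size the learner currently rates highest. The constant step size weights recent feedback more heavily than old feedback, which is one plausible way to track a system whose behavior drifts over time; whether the paper makes the same choice is not stated in the abstract.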

