ABSTRACT
We consider the problem of estimating the optimal parameter trajectory over a finite time interval in a parameterized stochastic differential equation (SDE), and propose a simulation-based algorithm for this purpose. To this end, we discretize the SDE at finitely many time instants and reformulate the problem as one of finding an optimal parameter at each of these instants. A stochastic approximation algorithm based on the smoothed functional technique is adapted to this setting to find the optimal parameter trajectory. We present a proof of convergence of the algorithm, together with results of numerical experiments in two different settings, in which the algorithm exhibits good performance. We also extend our framework to the problem of finding optimal parameterized feedback policies for controlled SDEs and present numerical results for this scenario as well.
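To make the general idea concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes a hypothetical scalar SDE dX = -theta(t) X dt + sigma dW discretized with the Euler-Maruyama scheme, a hypothetical quadratic running cost, a two-sided Gaussian smoothed functional gradient estimate, and a projected stochastic approximation update with a simple decaying step size. All function names, parameter values, the cost, and the step-size schedule are assumptions chosen for illustration.

```python
import math
import random


def simulate_cost(theta, x0=1.0, sigma=0.2, horizon=1.0, reg=0.1, rng=random):
    """One Euler-Maruyama rollout of the assumed SDE dX = -theta_k X dt + sigma dW.

    theta holds one parameter per time instant; returns one sampled cost
    sum_k (X_k^2 + reg * theta_k^2) * h  (a hypothetical cost, for illustration).
    """
    n = len(theta)
    h = horizon / n
    x, cost = x0, 0.0
    for k in range(n):
        cost += (x * x + reg * theta[k] ** 2) * h
        x += -theta[k] * x * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
    return cost


def sf_gradient(theta, beta, rng, **sim_kwargs):
    """Two-sided smoothed functional gradient estimate with Gaussian perturbations."""
    eta = [rng.gauss(0.0, 1.0) for _ in theta]
    plus = [t + beta * e for t, e in zip(theta, eta)]
    minus = [t - beta * e for t, e in zip(theta, eta)]
    diff = simulate_cost(plus, rng=rng, **sim_kwargs) - simulate_cost(
        minus, rng=rng, **sim_kwargs
    )
    return [e * diff / (2.0 * beta) for e in eta]


def optimize(n_steps=10, iterations=2000, beta=0.1, lo=0.0, hi=5.0, seed=0):
    """Projected stochastic approximation over the whole parameter trajectory."""
    rng = random.Random(seed)
    theta = [1.0] * n_steps  # assumed initial guess
    for m in range(1, iterations + 1):
        step = 1.0 / (m + 10)  # assumed decaying step size
        grad = sf_gradient(theta, beta, rng)
        # Descent step followed by projection onto the box [lo, hi].
        theta = [min(hi, max(lo, t - step * g)) for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(optimize())
```

Each entry of the returned list is the estimated parameter for one discretization instant. The two-sided estimate is used here only because it typically has lower variance than a one-sided one; the paper may use a different variant.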