A novel orthogonal NMF-based belief compression for POMDPs
Source
ICML; Vol. 227
Proceedings of the 24th International Conference on Machine Learning
Corvallis, Oregon
Pages: 537-544
Year of Publication: 2007
ISBN: 978-1-59593-793-3
Authors
Xin Li, Hong Kong Baptist University, Kowloon Tong, HK
William K. W. Cheung, Hong Kong Baptist University, Kowloon Tong, HK
Jiming Liu, Hong Kong Baptist University, Kowloon Tong, HK
Zhili Wu, Hong Kong Baptist University, Kowloon Tong, HK
ABSTRACT
The high dimensionality of a POMDP's belief state space is a major reason why computing the optimal policy is intractable. Belief compression alleviates this problem by projecting the belief state space onto a low-dimensional one. In this paper, we propose a novel orthogonal non-negative matrix factorization (O-NMF) for the projection. The proposed O-NMF not only factors the belief state space by minimizing the reconstruction error but, owing to its orthogonality, also allows the compressed POMDP formulation to be computed efficiently in a value-directed manner, so that the value function takes the same values for corresponding belief states in the original and compressed state spaces. We have tested the proposed approach on a number of benchmark problems, and the empirical results confirm its effectiveness in achieving substantial computational cost savings in policy computation.
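As a rough illustration of the factorization step only, the sketch below applies the standard multiplicative-update NMF of Lee and Seung to a matrix of sampled belief vectors. It is not the O-NMF proposed in the paper: neither the orthogonality constraint nor the value-directed condition is enforced. The function name `nmf_multiplicative`, the Dirichlet-sampled beliefs, and the sizes (20 states, 5 compressed dimensions) are assumptions made for the example.

```python
import numpy as np


def nmf_multiplicative(V, k, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, V ~= W @ H, via Lee-Seung multiplicative updates.

    V : (n_states, n_samples) non-negative matrix, one belief per column.
    k : number of compressed dimensions (hypothetical choice).
    Returns the basis W (n_states, k), the compressed beliefs H
    (k, n_samples), and the Frobenius reconstruction error per iteration.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Element-wise updates preserve non-negativity of W and H.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors


# Assumed toy data: 100 random belief states over 20 hidden states.
rng = np.random.default_rng(1)
B = rng.dirichlet(np.ones(20), size=100).T  # shape (20, 100), columns sum to 1

W, H, errors = nmf_multiplicative(B, k=5)
print("initial error:", errors[0])
print("final error:  ", errors[-1])
```

Each column of `H` plays the role of a compressed belief, and `W @ H` is the reconstruction in the original belief space. An orthogonal variant would additionally constrain the factor so that projecting a belief reduces to a single matrix product, which is the property the abstract credits for the efficient compressed formulation.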