This paper studies the convergence of optimal values and optimal policies of continuous-time Markov decision processes (CTMDPs for short) under constrained average criteria. For a given original model $\mathcal{M}_\infty$ of a CTMDP with denumerable states and a sequence $\{\mathcal{M}_n\}$ of CTMDPs with finite states, we give a new convergence condition ensuring that the optimal values and optimal policies of $\{\mathcal{M}_n\}$ converge, respectively, to the optimal value and optimal policy of $\mathcal{M}_\infty$ as the state spaces $S_n$ of $\mathcal{M}_n$ converge to the state space $S_\infty$ of $\mathcal{M}_\infty$. The transition rates and cost/reward functions of $\mathcal{M}_\infty$ are allowed to be unbounded. Our approach can be viewed as a combination of the linear programming method and Lagrange multipliers.
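For orientation, constrained average-criterion MDPs are commonly reformulated as linear programs over occupation measures; the following is a generic sketch of that standard formulation, not necessarily the exact one used in this paper (the symbols $K$, $c_k$, $d_k$, $q(j\mid i,a)$, and $\mu$ are illustrative):
$$\begin{aligned}
\min_{\mu \ge 0}\quad & \sum_{(i,a)\in K} c_0(i,a)\,\mu(i,a)\\
\text{s.t.}\quad & \sum_{(i,a)\in K} c_k(i,a)\,\mu(i,a) \le d_k, \qquad k=1,\dots,p,\\
& \sum_{(i,a)\in K} q(j\mid i,a)\,\mu(i,a) = 0, \qquad j\in S,\\
& \sum_{(i,a)\in K} \mu(i,a) = 1,
\end{aligned}$$
where $K$ denotes the set of admissible state-action pairs and $q(j\mid i,a)$ the transition rates. Attaching Lagrange multipliers to the constraints indexed by $k$ yields an unconstrained average-criterion problem, which is how the linear programming and Lagrangian viewpoints are typically combined.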