Volume 35, Issue 4
Convergence of Controlled Models for Continuous-Time Markov Decision Processes with Constrained Average Criteria

Wenzhao Zhang & Xianzhu Xiong

Ann. Appl. Math., 35 (2019), pp. 449-464.

Published online: 2020-08

  • Abstract

This paper studies the convergence of optimal values and optimal policies of continuous-time Markov decision processes (CTMDPs) under constrained average criteria. Given an original CTMDP model $\mathcal{M}_\infty$ with a denumerable state space and a sequence $\{\mathcal{M}_n\}$ of CTMDP models with finite state spaces, we give a new convergence condition ensuring that the optimal values and optimal policies of $\{\mathcal{M}_n\}$ converge, respectively, to the optimal value and optimal policy of $\mathcal{M}_\infty$ as the state space $S_n$ of $\mathcal{M}_n$ converges to the state space $S_\infty$ of $\mathcal{M}_\infty$. The transition rates and cost/reward functions of $\mathcal{M}_\infty$ are allowed to be unbounded. Our approach combines linear programming with the method of Lagrange multipliers.
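
For orientation, the linear-programming side of such an approach can be sketched for a finite-state model $\mathcal{M}_n$ in terms of occupation measures. The notation here is assumed for illustration and is not taken from the paper: $A(i)$ is the set of actions available in state $i$, $q(j \mid i,a)$ are the transition rates, $c$ is the cost to be minimized, $d$ is the constrained cost with bound $\kappa$, and $\mu(i,a)$ is the long-run fraction of time spent in state $i$ using action $a$. Under a suitable ergodicity assumption, the constrained average problem takes the standard form

$$
\begin{aligned}
\min_{\mu \ge 0} \quad & \sum_{i \in S_n} \sum_{a \in A(i)} c(i,a)\,\mu(i,a) \\
\text{subject to} \quad & \sum_{i \in S_n} \sum_{a \in A(i)} q(j \mid i,a)\,\mu(i,a) = 0 \quad \text{for all } j \in S_n, \\
& \sum_{i \in S_n} \sum_{a \in A(i)} \mu(i,a) = 1, \\
& \sum_{i \in S_n} \sum_{a \in A(i)} d(i,a)\,\mu(i,a) \le \kappa .
\end{aligned}
$$

A stationary randomized policy can then be read off from a feasible $\mu$ as $\varphi(a \mid i) = \mu(i,a) / \sum_{a' \in A(i)} \mu(i,a')$ whenever the denominator is positive. The Lagrange-multiplier side moves the constraint into the objective with a multiplier $\lambda \ge 0$:

$$
L(\mu, \lambda) = \sum_{i \in S_n} \sum_{a \in A(i)} \bigl( c(i,a) + \lambda\, d(i,a) \bigr)\,\mu(i,a) - \lambda \kappa .
$$

This is only a sketch of the textbook formulation; the paper's own conditions, sign conventions (cost versus reward), and treatment of the denumerable model $\mathcal{M}_\infty$ may differ.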

  • Copyright

COPYRIGHT: © Global Science Press

  • Keywords

continuous-time Markov decision processes, optimal value, optimal policies, constrained average criteria, occupation measures.