Implementation of Gradient Estimation to a Constrained Markov Decision Problem

V Krishnamurthy; K Martin; FV Abad

Conference Proceedings

Proceedings of the IEEE Conference on Decision and Control | Published: 2003

DOI: 10.1109/CDC.2003.1272362

Abstract

Consider the problem of a constrained Markov Decision Process (MDP). Under a parameterization of the control strategies, the problem can be transformed into a non-linear optimization problem with non-linear constraints. Both the cost and the constraints are stationary averages. We assume that the transition probabilities of the underlying Markov chain are unknown: only the values of the control variables are known, as well as the instantaneous values of the cost and the constraints, so no analytical expression for the stationary averages is available. To find the solution to the optimization problem, a stochastic version of a primal/dual method with an augmented Lagrangian is used. The updat..
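As a rough illustration of the kind of scheme the abstract describes, the following is a minimal sketch of a stochastic primal/dual iteration with an augmented Lagrangian on a toy scalar problem. It is not the paper's algorithm: the abstract is truncated before the update rule is stated, so the toy cost and constraint, the Gaussian noise standing in for a sample-path gradient estimator, the step-size schedule, and all names (`noisy`, `stochastic_primal_dual`, `rho`) are assumptions made for illustration.

```python
import random


def noisy(value, rng, sigma=0.1):
    """Return a noisy observation of `value` (assumed stand-in for a
    sample-path estimate; the paper's estimator is not reproduced here)."""
    return value + rng.gauss(0.0, sigma)


def stochastic_primal_dual(steps=20000, rho=1.0, seed=0):
    """Toy problem (hypothetical): minimize (theta - 2)^2 subject to theta - 1 <= 0.

    The constrained optimum is theta = 1 with multiplier lambda = 2.
    Only noisy values of the cost gradient and the constraint are observed.
    """
    rng = random.Random(seed)
    theta, lam = 0.0, 0.0
    for n in range(1, steps + 1):
        step = 1.0 / (n + 10) ** 0.6  # decreasing step size (assumed schedule)

        grad_cost = noisy(2.0 * (theta - 2.0), rng)  # noisy gradient of the cost
        constraint = noisy(theta - 1.0, rng)         # noisy constraint value
        grad_constraint = 1.0                        # known exactly in this toy case

        # Gradient of the augmented Lagrangian for an inequality constraint:
        # grad C + max(0, lambda + rho * B) * grad B
        multiplier = max(0.0, lam + rho * constraint)
        theta -= step * (grad_cost + multiplier * grad_constraint)

        # Dual ascent on the multiplier, projected onto lambda >= 0
        lam = max(0.0, lam + step * rho * constraint)
    return theta, lam


if __name__ == "__main__":
    theta, lam = stochastic_primal_dual()
    print(f"theta ~ {theta:.3f}, lambda ~ {lam:.3f}")
```

In this sketch the primal variable descends along a noisy gradient of the augmented Lagrangian while the multiplier ascends along the noisy constraint value. In the setting of the abstract, the two noisy quantities would instead come from estimates built on observed costs and constraints, since the transition probabilities are unknown.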

University of Melbourne Researchers

Felisa Vazquez-Abad (Author), Computing and Information Systems

Citation metrics

14 Scopus

13 Dimensions

Keywords

4905 Statistics

4901 Applied Mathematics

49 Mathematical Sciences