Conference Proceedings

Implementation of Gradient Estimation to a Constrained Markov Decision Problem

V Krishnamurthy, K Martin, FV Abad

Proceedings of the IEEE Conference on Decision and Control | Published: 2003

Abstract

Consider the problem of a constrained Markov Decision Process (MDP). Under a parameterization of the control strategies, the problem can be transformed into a non-linear optimization problem with non-linear constraints. Both the cost and the constraints are stationary averages. We assume that the transition probabilities of the underlying Markov chain are unknown: only the values of the control variables are known, as well as the instantaneous values of the cost and the constraints, so no analytical expression for the stationary averages is available. To find the solution to the optimization problem, a stochastic version of a primal/dual method with an augmented Lagrangian is used. The updat..
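
The abstract is cut off before it states the update rule, so the paper's actual recursion is not reproduced here. As a rough illustration only, the sketch below runs a stochastic primal/dual iteration with an augmented Lagrangian on a toy one-dimensional problem: minimize C(theta) = (theta - 2)^2 subject to B(theta) = theta - 1 <= 0. Everything in it is an assumption made for illustration: the toy cost and constraint, the function names `noisy_oracle` and `primal_dual`, the step-size schedule, the penalty parameter `rho`, and above all the Gaussian-noise oracle, which stands in for the gradient estimators that the paper applies to the MDP.

```python
import random


def noisy_oracle(theta, rng):
    """Hypothetical simulator: returns noisy estimates only.

    Assumed toy problem (not from the paper):
      cost       C(theta) = (theta - 2)^2
      constraint B(theta) = theta - 1 <= 0
    The algorithm never sees these expressions, only noisy samples.
    """
    grad_cost = 2.0 * (theta - 2.0) + rng.gauss(0.0, 0.1)
    constraint = (theta - 1.0) + rng.gauss(0.0, 0.1)
    grad_constraint = 1.0 + rng.gauss(0.0, 0.1)
    return grad_cost, constraint, grad_constraint


def primal_dual(steps=20000, rho=1.0, seed=0):
    """Stochastic primal/dual iteration on an augmented Lagrangian.

    Augmented Lagrangian for one inequality constraint:
      L(theta, lam) = C(theta)
                      + (max(0, lam + rho * B(theta))**2 - lam**2) / (2 * rho)
    """
    rng = random.Random(seed)
    theta, lam = 0.0, 0.0
    for n in range(1, steps + 1):
        # Decreasing step size (an assumed schedule).
        step = 1.0 / (n ** 0.6 + 10.0)
        g_c, b, g_b = noisy_oracle(theta, rng)
        # Noisy estimate of dL/dtheta = grad C + max(0, lam + rho * B) * grad B.
        shifted = max(0.0, lam + rho * b)
        theta -= step * (g_c + shifted * g_b)
        # Projected ascent on the multiplier, kept non-negative.
        lam = max(0.0, lam + step * b)
    return theta, lam


if __name__ == "__main__":
    theta, lam = primal_dual()
    print(f"theta = {theta:.3f}, multiplier = {lam:.3f}")
```

For this toy problem the constrained minimizer is theta = 1 with multiplier 2, so the iterates should settle near those values. The noise keeps them from matching exactly.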
