Towards Q-learning the Whittle Index for Restless Bandits

Jing Fu, Yoni Nazarathy, Sarat Moka, Peter G Taylor

2019 Australian & New Zealand Control Conference (ANZCC) | IEEE | Published: 2019


J. Fu and P.G. Taylor's research is supported by the Australian Research Council (ARC) Laureate Fellowship FL130100039 and the ARC Centre of Excellence for the Mathematical and Statistical Frontiers (ACEMS). S. Moka's research is supported by ACEMS, under grant number CE140100049. Y. Nazarathy's research is supported by ARC grant DP180101602. The authors also thank Prof. Vivek Borkar for preliminary discussions.