Number of records: 1  

Second Order Optimality in Transient and Discounted Markov Decision Chains

  1.
    0448938 - ÚTIA 2016 RIV CZ eng C - Conference paper (international conf.)
    Sladký, Karel
    Second Order Optimality in Transient and Discounted Markov Decision Chains.
    Proceedings of the 33rd International Conference Mathematical Methods in Economics MME 2015. Plzeň: University of West Bohemia, Plzeň, 2015, pp. 731-736. ISBN 978-80-261-0539-8.
    [Mathematical Methods in Economics 2015 /33./. Cheb (CZ), 09.09.2015-11.09.2015]
    CEP grant: GA ČR GA13-14445S; GA ČR GA15-10331S
    Institutional support: RVO:67985556
    Keywords: dynamic programming * discounted and transient Markov reward chains * reward-variance optimality
    RIV field code: BC - Control systems theory
    http://library.utia.cas.cz/separaty/2015/E/sladky-0448938.pdf

    The article is devoted to second-order optimality in Markov decision processes. Attention is primarily focused on the reward variance for discounted models and for undiscounted transient models (i.e. models where the spectral radius of the transition probability matrix is less than unity). Considering second-order optimality criteria means that, within the class of policies maximizing (or minimizing) the total expected discounted reward (or the undiscounted reward for the transient model), we choose the policy minimizing the total variance. Explicit formulae for calculating the variances for transient and discounted models are reported, along with sketches of algorithmic procedures for finding second-order optimal policies.
    Permanent link: http://hdl.handle.net/11104/0250633
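
The abstract mentions explicit formulae for the reward variance without reproducing them. As a rough illustration only, the sketch below uses the standard first- and second-moment recursions for a Markov reward chain under a fixed policy; it is not taken from the paper. The function names `solve_linear` and `reward_mean_and_variance`, the transition-reward matrix `R`, and the convention that `beta = 1.0` with a substochastic `P` represents the transient model are all assumptions made for this example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def reward_mean_and_variance(P, R, beta=1.0):
    """Mean and variance of the total (discounted) reward per starting state.

    P[i][j] is the transition probability and R[i][j] the reward earned on
    the transition i -> j (hypothetical inputs). Requires the spectral
    radius of beta * P to be below one.
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Expected one-step reward in each state.
    r_bar = [sum(P[i][j] * R[i][j] for j in range(n)) for i in range(n)]
    # First moment: v = r_bar + beta * P v.
    A1 = [[identity[i][j] - beta * P[i][j] for j in range(n)] for i in range(n)]
    v = solve_linear(A1, r_bar)
    # Second moment: s = b + beta^2 * P s,
    # where b_i = sum_j p_ij * (r_ij^2 + 2 * beta * r_ij * v_j).
    b = [
        sum(P[i][j] * (R[i][j] ** 2 + 2.0 * beta * R[i][j] * v[j]) for j in range(n))
        for i in range(n)
    ]
    A2 = [[identity[i][j] - beta ** 2 * P[i][j] for j in range(n)] for i in range(n)]
    s = solve_linear(A2, b)
    # Variance = second moment minus squared mean.
    variance = [s[i] - v[i] ** 2 for i in range(n)]
    return v, variance
```

For instance, `reward_mean_and_variance([[0.5]], [[1.0]], beta=1.0)` models a single transient state that continues with probability 0.5 and earns one unit each time it does. A second-order comparison in the sense of the abstract would then pick, among policies with equal optimal mean, the one with the smallest variance.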

     
     