
Risk-Sensitive Optimality in Markov Games

  1. SYSNO ASEP: 0480036
    Document Type: C - Proceedings Paper (int. conf.)
    R&D Document Type: Conference Paper
    Title: Risk-Sensitive Optimality in Markov Games
    Author(s): Sladký, Karel (UTIA-B) RID
    Martínez Cortés, V. M. (MX)
    Number of authors: 2
    Source Title: Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017). Hradec Králové: University of Hradec Králové, 2017. ISBN 978-80-7435-678-0
    Pages: pp. 684-689
    Number of pages: 6
    Publication form: Online - E
    Action: MME 2017. International Conference Mathematical Methods in Economics /35./
    Event date: 13.09.2017 - 15.09.2017
    Event location: Hradec Králové
    Country: CZ - Czech Republic
    Event type: EUR
    Language: eng - English
    Country: CZ - Czech Republic
    Keywords: two-person Markov games; communicating Markov chains; risk-sensitive optimality; dynamic programming
    Subject RIV: AH - Economics
    OECD category: Applied Economics, Econometrics
    R&D Projects: GA13-14445S GA ČR - Czech Science Foundation (CSF)
    Institutional support: UTIA-B - RVO:67985556
    UT WOS: 000427151400117
    Annotation: The article is devoted to risk-sensitive optimality in Markov games. Attention is focused on Markov games evolving on communicating Markov chains with two players pursuing opposite aims. Under risk-sensitive optimality criteria, the total reward generated by the game is evaluated by an exponential utility function with a given risk-sensitivity coefficient. In particular, the first player (resp. the second player) tries to maximize (resp. minimize) the long-run risk-sensitive average reward. Observe that if the second player is a dummy, the problem reduces to finding an optimal policy of a Markov decision chain under the risk-sensitive optimality criterion. Recall that for a risk-sensitivity coefficient equal to zero we arrive at the traditional (risk-neutral) optimality criteria. In this article, connections between risk-sensitive and risk-neutral Markov decision chain and Markov game models are studied using discrepancy functions. Explicit formulae for bounds on the long-run risk-sensitive average reward are reported. A policy iteration algorithm for finding suboptimal policies of both players is suggested. The obtained results are illustrated on a numerical example. (An illustrative sketch of the risk-sensitive average reward criterion follows the record below.)
    Workplace: Institute of Information Theory and Automation
    Contact: Markéta Votavová, votavova@utia.cas.cz, Tel.: 266 052 201.
    Year of Publishing: 2018
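
As a brief illustration of the criterion described in the annotation: for a fixed pair of stationary policies the game reduces to a Markov reward chain with transition matrix P and one-step rewards R, and under the exponential utility with risk-sensitivity coefficient gamma the long-run risk-sensitive average reward of an irreducible (communicating) chain is (1/gamma) * log rho(Q(gamma)), where Q(gamma)_ij = P_ij * exp(gamma * R_ij) and rho denotes the spectral (Perron) radius; as gamma tends to zero this recovers the risk-neutral average reward. The sketch below assumes this standard characterisation and is not the paper's algorithm; the matrices P and R and the function name risk_sensitive_average_reward are illustrative choices, not taken from the paper.

```python
# Minimal sketch (illustrative, not the paper's method): risk-sensitive average
# reward of an irreducible Markov reward chain for a fixed stationary policy pair.
import numpy as np

def risk_sensitive_average_reward(P, R, gamma):
    """Long-run risk-sensitive average reward under exponential utility."""
    if abs(gamma) < 1e-12:
        # gamma -> 0: risk-neutral average reward = stationary distribution
        # times the expected one-step reward in each state.
        eigvals, eigvecs = np.linalg.eig(P.T)
        pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
        pi = pi / pi.sum()
        return float(pi @ (P * R).sum(axis=1))
    Q = P * np.exp(gamma * R)                 # exponentially weighted transition matrix
    rho = max(np.abs(np.linalg.eigvals(Q)))   # Perron (spectral) radius
    return float(np.log(rho) / gamma)

# Hypothetical two-state chain used only for illustration.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
R = np.array([[1.0, 4.0],
              [2.0, 0.0]])

for gamma in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"gamma = {gamma:+.1f}: {risk_sensitive_average_reward(P, R, gamma):.4f}")
```

Running the sketch for several values of gamma shows the risk-sensitive average reward approaching the risk-neutral value as gamma tends to zero, in line with the annotation's remark that a zero risk-sensitivity coefficient recovers the traditional optimality criteria.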

Metadata are licensed under CC0
