Number of the records: 1
Second Order Optimality in Markov and Semi-Markov Decision Processes
SYSNO ASEP: 0517875
Document Type: K - Proceedings Paper (Czech conf.)
R&D Document Type: Conference Paper
Title: Second Order Optimality in Markov and Semi-Markov Decision Processes
Author(s): Sladký, Karel (UTIA-B) RID
Number of authors: 1
Source Title: Conference Proceedings. 37th International Conference on Mathematical Methods in Economics 2019. - České Budějovice : University of South Bohemia in České Budějovice, Faculty of Economics, 2019 / Houda M. ; Remeš R. - ISBN 978-80-7394-760-6
Pages: 338-343
Number of pages: 6
Publication form: Online - E
Action: MME 2019: International Conference on Mathematical Methods in Economics /37./
Event date: 11.09.2019 - 13.09.2019
Event location: České Budějovice
Country: CZ - Czech Republic
Event type: WRD
Language: eng - English
Country: CZ - Czech Republic
Keywords: semi-Markov processes with rewards ; discrete and continuous-time Markov reward chains ; risk-sensitive optimality ; average reward and variance over time
Subject RIV: BB - Applied Statistics, Operational Research
OECD category: Statistics and probability
R&D Projects: GA18-02739S GA ČR - Czech Science Foundation (CSF)
Institutional support: UTIA-B - RVO:67985556
Annotation: Semi-Markov decision processes can be viewed as an extension of discrete- and continuous-time Markov reward models. Unfortunately, traditional optimality criteria such as the long-run average reward per unit time may be insufficient to characterize the problem from the decision maker's point of view. To this end it may be preferable, if not necessary, to select more sophisticated criteria that also reflect the variability-risk features of the problem. Perhaps the best-known approaches stem from the classical work of Markowitz on mean-variance selection rules, i.e. we optimize a weighted sum of the average or total reward and its variance. Such an approach has already been studied for very special classes of semi-Markov decision processes, in particular for Markov decision processes in discrete- and continuous-time settings. In this note these approaches are summarized, and possible extensions to the wider class of semi-Markov decision processes are discussed. Attention is mostly restricted to uncontrolled models in which the chain is aperiodic and contains a single class of recurrent states. Considering finite time horizons, explicit formulas are derived for the first and second moments of the total reward, as well as for the corresponding variance.
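The finite-horizon moment computation mentioned in the annotation can be illustrated for the simplest case, a discrete-time Markov reward chain with per-step state rewards. The sketch below uses the standard backward recursions for the first and second moments of the total reward (the function name and interface are illustrative, not taken from the paper):

```python
import numpy as np

def total_reward_moments(P, r, n):
    """First and second moments (and variance) of the total reward
    accumulated over n steps of a discrete-time Markov reward chain.

    P : (m, m) row-stochastic transition matrix.
    r : (m,) reward earned in each state per step.

    Uses the standard recursions, conditioning on the first step:
        v_i(k) = r_i + sum_j P_ij v_j(k-1)
        s_i(k) = r_i^2 + 2 r_i sum_j P_ij v_j(k-1) + sum_j P_ij s_j(k-1)
    where v is the first and s the second moment of the k-step reward.
    """
    m = len(r)
    v = np.zeros(m)   # E[xi_k | X_0 = i], starting from horizon 0
    s = np.zeros(m)   # E[xi_k^2 | X_0 = i]
    for _ in range(n):
        Pv = P @ v                      # one-step-ahead expected reward
        s = r**2 + 2 * r * Pv + P @ s   # update second moment first
        v = r + Pv                      # then first moment
    return v, s, s - v**2               # variance = E[xi^2] - (E[xi])^2
```

For a two-state chain with P = [[0.5, 0.5], [0.5, 0.5]] and rewards r = (0, 1), the two-step total reward from state 0 is 0 or 1 with equal probability, so the recursion returns mean 0.5 and variance 0.25 for that state.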
Workplace: Institute of Information Theory and Automation
Contact: Markéta Votavová, votavova@utia.cas.cz, Tel.: 266 052 201
Year of Publishing: 2020