Aproximace konvexních funkcí v algoritmech pro řešení stochastických her

Jaroslav Šafář

Approximation of Bound Functions in Algorithms for Solving Stochastic Games

Type of document

bachelor thesis

Author

Jaroslav Šafář

Supervisor

Bošanský Branislav

Opponent

Kléma Jiří

Field of study

Informatika a počítačové vědy (Computer and Information Science)

Study program

Otevřená informatika (Open Informatics)

Institutions assigning rank

katedra kybernetiky (Department of Cybernetics)

Rights

A university thesis is a work protected by the Copyright Act. Extracts, copies, and transcripts of the thesis may be made for personal use only and at one's own expense. Use of the thesis must comply with the Copyright Act http://www.mkcr.cz/assets/autorske-pravo/01-3982006.pdf and with citation ethics http://knihovny.cvut.cz/vychova/vskp.html

Abstract

In this thesis, we focus on approximating the bound functions in the Heuristic Search Value Iteration (HSVI) algorithm for solving One-Sided Partially Observable Stochastic Games (OS-POSGs). These are infinite-horizon dynamic games in which the first player has imperfect information about the game while the second player has full information. The bound functions are convex functions that bound the value function of the game from below and from above. The lower bound is represented as an upper envelope (pointwise maximum) of linear functions, while the upper bound is represented as the lower convex envelope of a set of points. We focus only on approximating the upper bound, mainly by using the Approximate Convex Hull algorithm. We show that approximating the upper bound is problematic and that better results also require approximating the lower bound function.
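
The two bound representations named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: it assumes a game with only two states, so that a belief is a single number b in [0, 1], and the function names `lower_bound` and `upper_bound` as well as all numeric values are hypothetical. The upper bound is evaluated by brute force over pairs of points, which is a stand-in for the convex hull computation and not the Approximate Convex Hull algorithm itself.

```python
from itertools import combinations_with_replacement


def lower_bound(alphas, b):
    """Upper envelope (pointwise maximum) of linear functions at belief b.

    Each linear function is given by its values (a0, a1) in the two states.
    """
    return max((1 - b) * a0 + b * a1 for a0, a1 in alphas)


def upper_bound(points, b):
    """Lower convex envelope of (belief, value) points, evaluated at belief b.

    In one dimension the envelope at b is the smallest value obtained by
    linearly interpolating between two points that bracket b.
    Returns infinity if no pair of points brackets b.
    """
    best = float("inf")
    # Sorting guarantees b1 <= b2 for every generated pair.
    for (b1, v1), (b2, v2) in combinations_with_replacement(sorted(points), 2):
        if b1 <= b <= b2:
            if b1 == b2:
                value = min(v1, v2)
            else:
                t = (b - b1) / (b2 - b1)
                value = (1 - t) * v1 + t * v2
            best = min(best, value)
    return best


# Hypothetical data: three linear functions and four belief points.
alphas = [(0.0, 1.0), (1.0, 0.0), (0.6, 0.6)]
points = [(0.0, 1.2), (0.5, 0.8), (1.0, 1.2), (0.25, 1.5)]

for b in (0.0, 0.25, 0.5, 1.0):
    print(b, lower_bound(alphas, b), upper_bound(points, b))
```

In this made-up data set the point (0.25, 1.5) lies above the envelope spanned by its neighbours, so removing it would not change the upper bound. Points of this kind are the natural candidates for removal when the point set is approximated by a smaller one.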