Hluboké posilované učení pro hru Super Mario Bros

Ondřej Schejbal

Deep Reinforcement Learning for Super Mario Bros

Type of document

master's thesis

Author

Ondřej Schejbal

Supervisor

Vašata Daniel

Opponent

Novák Petr

Field of study

Knowledge Engineering (Znalostní inženýrství)

Study program

Informatics (Informatika)

Institutions assigning rank

Department of Applied Mathematics (katedra aplikované matematiky)

Rights

A university thesis is a work protected by the Copyright Act. Extracts, copies and transcripts of the thesis are allowed for personal use only and at one's own expense. Use of the thesis must comply with the Copyright Act http://www.mkcr.cz/assets/autorske-pravo/01-3982006.pdf and with citation ethics http://knihovny.cvut.cz/vychova/vskp.html

Abstract

This master's thesis presents a fine-tuned reinforcement learning model that can train an intelligent agent to play the game Super Mario Bros. Its architecture is based on a survey of current state-of-the-art reinforcement learning techniques, in which the models most relevant to this type of task were compared with one another. To make this comparison possible, the tools that allow a model to interact with the game were surveyed and described. Based on the comparison results, the most suitable method was selected. Experiments applying various modifications to the selected model were then carried out to find those best suited to Super Mario Bros. The fine-tuned model was then used to train an intelligent agent, whose performance was tested on the level it was trained on and on two further levels it had never seen. The agent performed very well and showed good behavioral patterns, mainly on the level it was trained on; its performance on the unseen levels was understandably worse.