Reprezentace a generování adversariálních řetězců

Marek Galovič

Representation Learning and Adversarial Sample Generation for Strings

dc.contributor.advisor	Bošanský Branislav
dc.contributor.author	Marek Galovič
dc.date.accessioned	2021-06-07T22:51:37Z
dc.date.available	2021-06-07T22:51:37Z
dc.date.issued	2021-06-07
dc.identifier	KOS-958759756305
dc.identifier.uri	http://hdl.handle.net/10467/94648
dc.description.abstract	V analýze správania malwaru je zoznam vytvorených alebo otvorených súborov často silný príznak pre klasifikačný problém, v ktorom sa rozhoduje či je daný súbor bezpečný alebo nebezpečný. Autori malwaru sa snažia uniknúť odhaleniu s pomocou generovania náhodných názvov súborov, alebo modifikovaním existujúcich názvov súborov v nových verziách malwaru. Tieto zmeny predstavujú adversariálne útoky na detekčný klasifikátor. Cieľom tejto práce je učenie sa latentných reprezentácií znakových reťazcov, generovanie adversariálnych vstupov, a zlepšenie robustnosti klasifikátora voči adversariálnym útokom. Pre učenie sa vektorových reprezentácií znakových reťazcov sme vyvinuli rekurentnú autoenkóder architektúru, ktorá dosahuje vysokú rekonštrukčnú kvalitu. S použitím perturbácií latentných reprezentácií, ktoré sú založené na znalosti gradientu, sme potom boli schopní generovať adversariálne vstupy a použiť tieto adversariálne vstupy na zlepšenie robustnosti klasifikátora. Taktiež sme ukázali, že latentné reprezentácie získané pomocou variačných autoenkóderov zlepšujú adversariálnu robustnosť bez potreby adversariálneho učenia.	cze
dc.description.abstract	In malware behavioral analysis, the list of accessed and created files is often a strong predictive feature for classifying whether the examined file is malicious or benign. However, malware authors try to evade detection by generating random filenames or by modifying existing filenames in new versions of the malware. These changes act as real-world adversarial examples against the detection classifier. The goal of this work is to learn latent representations of character sequences, generate realistic adversarial examples, and improve the classifier's robustness against adversarial attacks. To obtain fixed-size vector representations of character sequences, we developed a recurrent autoencoder architecture that achieves high sample reconstruction accuracy. Using gradient-based adversarial attacks in the latent representation space, we were able to generate realistic adversarial examples in the input space and use them to improve the classifier's robustness. Additionally, we showed that latent representations obtained using variational autoencoders improve adversarial robustness without the need for adversarial training.	eng
dc.publisher	České vysoké učení technické v Praze. Výpočetní a informační centrum.	cze
dc.publisher	Czech Technical University in Prague. Computing and Information Centre.	eng
dc.rights	A university thesis is a work protected by the Copyright Act. Extracts, copies and transcripts of the thesis are allowed for personal use only and at one's own expense. The use of the thesis should be in compliance with the Copyright Act http://www.mkcr.cz/assets/autorske-pravo/01-3982006.pdf and the citation ethics http://knihovny.cvut.cz/vychova/vskp.html	eng
dc.rights	Vysokoškolská závěrečná práce je dílo chráněné autorským zákonem. Je možné pořizovat z něj na své náklady a pro svoji osobní potřebu výpisy, opisy a rozmnoženiny. Jeho využití musí být v souladu s autorským zákonem http://www.mkcr.cz/assets/autorske-pravo/01-3982006.pdf a citační etikou http://knihovny.cvut.cz/vychova/vskp.html	cze
dc.subject	strojové učenie	cze
dc.subject	reprezentačné učenie	cze
dc.subject	adversariálne útoky	cze
dc.subject	adversariálne trénovanie	cze
dc.subject	klasifikácia viacerých inštancií	cze
dc.subject	machine learning	eng
dc.subject	representation learning	eng
dc.subject	adversarial attacks	eng
dc.subject	adversarial training	eng
dc.subject	multiple instance learning	eng
dc.title	Reprezentace a generování adversariálních řetězců	cze
dc.title	Representation Learning and Adversarial Sample Generation for Strings	eng
dc.type	bakalářská práce	cze
dc.type	bachelor thesis	eng
dc.contributor.referee	Šmídl Václav
theses.degree.discipline	Základy umělé inteligence a počítačových věd	cze
theses.degree.grantor	katedra kybernetiky	cze
theses.degree.programme	Otevřená informatika	cze

Files in this record

Name: F3-BP-2021-Galovic-Marek-Repre ...
Size: 3.467Mb
Format: PDF
Description: PLNY_TEXT

Name: F3-BP-2021-Galovic-Marek-prilo ...
Size: 2.927Mb
Format: Unknown
Description: PRILOHA

Name: F3-BP-2021-posudek-Smidl_Vaclav.pdf
Size: 110.2Kb
Format: PDF
Description: POSUDEK

Name: F3-BP-2021-posudek-Bosansky_Br ...
Size: 143.4Kb
Format: PDF
Description: POSUDEK

This record appears in the following collections

Bachelor theses - 13133 [777]
