Explainability of classification of graph represented data
Vysvětlitelnost klasifikace grafových reprezentací dat
Publisher
České vysoké učení technické v Praze
Czech Technical University in Prague
Abstract
This thesis deals with the explainability of classification of graph-structured data. Its aims are to become acquainted with the methods available in this area and to analyse bank client data represented as graphs, with the goal of classifying clients. The thesis gives an overview of machine learning models for classification, including logistic regression and neural networks. We emphasize that it is not enough for a classifier to predict well; it is also necessary to know why the model decided the way it did. Explainability of machine learning models can be achieved by reducing the structure of the model, that is, by feature selection. For graph-structured data, obtaining an explainable model also involves finding a subset of the graph that is relevant for classifying its vertices. The thesis closes with a description of the proposed algorithm for finding an explainable model and a discussion of the results of the analysis of real data.