Learning Domains for Named Entities
Učení domén pojmenovaných entit
Publisher
České vysoké učení technické v Praze
Czech Technical University in Prague
Abstract
This master's thesis deals with the domains of named entities and the possibilities for applying machine learning to them. The thesis first analyzes the machine learning problem, the data sources, and the existing solutions. Based on this analysis, an application for creating training datasets and a REST service that automates the process of learning entity domains are designed and implemented. The thesis also introduces the Weka tool, which assists in building the trained models, and the DBpedia project, which is the main source of named entities. Finally, experiments are carried out to evaluate the quality of the models created for learning the domains of named entities.