Zlepšování algoritmů pro učení se řadit

Vu, Huy Hoang

Improving Learning to Rank Algorithms

Authors

Vu, Huy Hoang

Supervisors

Kordík, Pavel

Reviewers

Maldonado Lopez, Juan Pablo

Publisher

České vysoké učení technické v Praze
Czech Technical University in Prague

Files

Full Text (334.09 KB)

Review (137.25 KB)

Review (135.97 KB)

Abstract

In this thesis I survey existing approaches to the learning to rank problem, specifically re-ranking URLs by their relevance to a user's search engine query, along with collaborative filtering methods. I apply them to the dataset that Yandex provided for the Personalized Web Search Challenge competition on Kaggle.com. I build on existing submissions by replicating the top competitor's feature extraction from the dataset. I then implement ES-Rank and matrix factorization, apply them to these features, and test whether collaborative filtering based on matrix factorization significantly increases ranking accuracy. Next, I compare the ranking accuracy of the implemented algorithms with that of other submissions on Kaggle.com. Finally, I analyze the time complexity of my solution.

Keywords

information retrieval, learning to rank, collaborative filtering, matrix factorization, evolution strategy

Permanent link

http://hdl.handle.net/10467/76809

Rights/License

A university thesis is a work protected by the Copyright Act of the Czech Republic. Extracts, copies, and transcripts of the thesis may be made for personal use only and at one's own expense. Any use of the thesis must comply with the Copyright Act, as amended.

Collections

Bachelor Theses - 18101
