Manipulace kapacitou doporučovacích modelů pro optimalizaci v Recall-Coverage rovině

Tomáš Řehořek

Manipulating the Capacity of Recommendation Models in Recall-Coverage Optimization

Type of document

doctoral thesis

Author

Tomáš Řehořek

Supervisor

Kordík Pavel

Opponent

Vojtáš Peter

Field of study

Informatics

Study program

Informatics

Institutions assigning rank

Department of Applied Mathematics

Rights

A university thesis is a work protected by the Copyright Act. Extracts, copies and transcripts of the thesis are allowed for personal use only and at one's own expense. Use of the thesis must comply with the Copyright Act http://www.mkcr.cz/assets/autorske-pravo/01-3982006.pdf and with citation ethics http://knihovny.cvut.cz/vychova/vskp.html

Abstract

Traditional approaches in Recommender Systems ignore the problem of long-tail recommendations. There is no systematic approach to controlling the magnitude of long-tail recommendations generated by the models, nor is there a proper methodology for evaluating the quality of long-tail recommendations. This thesis addresses the long-tail recommendation problem from both the algorithmic and the evaluation perspective. We propose controlling the magnitude of long-tail recommendations generated by models by manipulating the capacity hyperparameters of learning algorithms, and we define such hyperparameters for multiple state-of-the-art algorithms. We also summarize multiple such algorithms under the common framework of the score function, which allows us to apply popularity-based regularization to all of them. We propose searching for Pareto-optimal states in the Recall-Coverage plane as the right way to search for long-tail, high-accuracy models. In an exhaustive set of experiments on a mixture of public and industrial datasets, we empirically demonstrate the correctness of our theory for 5 different algorithms and their different versions.
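
As an illustration of what searching for Pareto-optimal states in the Recall-Coverage plane can look like, the following minimal Python sketch filters a set of evaluated model configurations down to those that no other configuration beats on both recall and coverage. It is not taken from the thesis: the function name `pareto_front` and the (recall, coverage) values are hypothetical, and it assumes each configuration has already been scored with both metrics, where higher is better for both.

```python
from typing import List, Tuple

# (recall, coverage); both metrics are assumed to be "higher is better".
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that are not dominated by any other point."""
    # Visit points from highest to lowest recall; ties go to higher coverage.
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front: List[Point] = []
    best_coverage = float("-inf")
    for recall, coverage in ordered:
        # Every point seen so far has recall >= this one, so this point is
        # non-dominated only if it strictly improves on their best coverage.
        if coverage > best_coverage:
            front.append((recall, coverage))
            best_coverage = coverage
    return front


# Hypothetical results for five model configurations (made-up numbers).
results = [(0.30, 0.10), (0.28, 0.25), (0.25, 0.20), (0.20, 0.40), (0.20, 0.35)]
front = pareto_front(results)
print(front)
```

In this sketch, a configuration such as (0.25, 0.20) is discarded because (0.28, 0.25) is better on both axes, while the remaining points represent different trade-offs between accuracy and long-tail coverage.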