Emotion Detection from Speech and Written Text (Detekce emocí z řeči a psaného textu)

This thesis addresses emotions and their recognition, primarily from speech, and also presents a model for recognizing emotions from text. We begin by introducing emotions and several theories of emotion. We then describe which features of a speech recording matter most for emotion recognition and how they are extracted from the recording. We outline how speech emotion recognition has developed over recent decades, covering both the methods used in the past and those used today. The next chapter presents several selected English and Czech datasets. The core of the work is a presentation of several existing state-of-the-art models for speech emotion recognition and their evaluation on these datasets. After comparing the results, we selected the best-performing model, emotion2vec, and tested it further with added noise and with noise reduction. Finally, we add a model for recognizing emotions from text and implement an algorithm for both models whose input is a speech recording, a text, or a combination of the two, and whose output is the emotions they contain. We then tested the functionality of this algorithm. We conclude that combining the results of both models can eliminate misclassifications in which an emotion is assigned to the opposite sentiment, but it also loses many correctly classified emotions, which are then labeled as unknown.

Keywords

speech emotion recognition, text emotion recognition, SER, wav2vec, HuBERT, emotion2vec, emotions, sentiment

Permanent link

http://hdl.handle.net/10467/115362

Rights/License

A university thesis is a work protected by the Copyright Act of the Czech Republic. Extracts, copies, and transcripts of the thesis may be made for personal use only and at one's own expense. Any use of the thesis must comply with the Copyright Act, as amended.

Collections

Bachelor Theses - 13133

Emotion Detection from Speech and Written Text