Predicting the Quality of Synthesized and Natural Speech Impaired by Packet Loss and Coding Using PESQ and P.563 Models
Document type
Article
Author
Počta, P.
Holub, J.
Abstract
This paper investigates the impact of independent and dependent packet losses and coding on speech quality predictions
provided by PESQ (also known as ITU-T P.862) and P.563 models, when both naturally-produced and synthesized
speech are used. Two synthesized speech samples generated with two different Text-to-Speech systems
and one naturally-produced sample are investigated. In addition, we assess the variability of PESQ’s and P.563’s
predictions with respect to the type of speech used (naturally-produced or synthesized) and loss conditions as
well as their accuracy, by comparing the predictions with subjective assessments. The results show that there is
no difference between the impact of packet loss on naturally-produced speech and synthesized speech. On the
other hand, the impact of coding is different for the two types of stimuli. In addition, synthesized speech seems
to be insensitive to degradations introduced by most of the codecs investigated here. The reasons for these findings
are discussed. Finally, it is concluded that both models are capable of predicting the quality of transmitted
synthesized speech under the investigated conditions to a certain degree. As expected, PESQ performs better
than P.563 under almost all of the investigated conditions.