Publication Repository - Politechnika Gdańska

Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions

The aim of the work is to analyze the Lombard effect in speech recordings and then to modify the speech signal so as to improve objective speech quality indicators after the useful signal is mixed with noise or with an interfering signal. The modifications are based on the characteristics of Lombard speech, in particular the increase in the fundamental frequency F0. The recording session comprises sets of words and sentences in Polish, recorded in silence as well as in the presence of interfering signals, i.e. pink noise and babble speech (background bustle), also referred to as the “cocktail-party” effect. Research on Lombard speech often focuses on subjective studies of speech intelligibility. There are, however, objective indicators such as PESQ (Perceptual Evaluation of Speech Quality) and P.563, which are used to assess the quality of telecommunication channels. The study shows that increasing the fundamental frequency increases the speech quality index measured with the PESQ standard. The research consists of several stages: (1) recording speech samples (words and sentences) without interference, i.e. the reference signal (“clean” speech), and then recording the same words/sentences in the presence of pink noise and babble speech (the so-called cocktail-party effect), disturbances that force the Lombard effect to occur in the recordings; (2) analyzing differences between “clean” speech and Lombard speech based on objective audio parameters; (3) mixing speech recordings with pink noise at different signal-to-noise ratios (SNR) in order to measure PESQ MOS coefficients; (4) measuring the PESQ coefficients of the reference files (“clean” speech) processed by increasing the F0 value and sound intensity level, and then of the same files mixed with pink noise and babble speech interfering signals; (5) repeating step (2), i.e. analyzing the differences in objective parameters and indicating whether these differences are statistically significant.
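
Stage (3), mixing a speech recording with noise at a chosen SNR, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`rms`, `mix_at_snr`, `measured_snr_db`), the SNR values, the 200 Hz tone standing in for speech, and the white Gaussian noise standing in for pink noise are all assumptions made for illustration. The sketch scales the noise so that the ratio of speech RMS to noise RMS matches the target SNR in decibels.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_rms = rms(speech)
    noise_rms = rms(noise)
    if speech_rms == 0 or noise_rms == 0:
        raise ValueError("speech and noise must be non-silent")
    # SNR_dB = 20 * log10(speech_rms / (gain * noise_rms)), solved for gain.
    gain = speech_rms / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR of `mixture` relative to the known clean `speech`, in dB."""
    residual = [m - s for m, s in zip(mixture, speech)]
    return 20 * math.log10(rms(speech) / rms(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    fs = 16000  # hypothetical sampling rate in Hz
    # Hypothetical stand-ins: a 200 Hz tone for speech, white noise for pink noise.
    speech = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]
    for snr_db in (0, 5, 10):  # hypothetical SNR values
        mixed = mix_at_snr(speech, noise, snr_db)
        print(snr_db, round(measured_snr_db(speech, mixed), 2))
```

In an actual experiment the mixture and the clean reference would then be passed to a PESQ implementation to obtain the MOS score at each SNR; that step is omitted here because it depends on the tool used.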

Authors