The Lombard effect is a phenomenon in which speech produced in the presence of noise becomes more intelligible. Lombard speech has many distinctive features, which are reviewed in this dissertation. This work proposes a system capable of improving speech quality and intelligibility in real time, as measured by objective metrics and subjective tests. The system consists of three main components: speech type detection, noise profiling, and adaptive modification selection. The first component detects Lombard speech in the input signal, so that speech that is already Lombard in character is not modified unnecessarily. The second component profiles the noise, as the type of noise strongly influences which modification works best. The third component selects the modification adaptively, based on features of the speech signal, so as to yield the greatest improvement in speech quality as measured with objective metrics. To solve this problem, the dissertation uses machine learning, in particular deep learning with convolutional neural networks and typical multilayer networks. It was shown that it is possible to create an adaptive system that improves speech quality in the presence of noise in real time or near real time.
Authors
- mgr inż. Krzysztof Kąkol
Additional information
- Category: Doctoral theses, habilitation dissertations, nostrifications
- Type: doctoral thesis by staff employed at PG or by students of the doctoral programme
- Language: English
- Publication year: 2023