Publication Repository - Gdańsk University of Technology

Bimodal classification of English allophones employing acoustic speech signal and facial motion capture

A method for automatic transcription of English speech into the International Phonetic Alphabet (IPA) is developed and studied. The principal objective is to evaluate to what extent visual data related to lip reading can improve the accuracy of transcribing English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip-tracking data synchronized with the audio signal. In total, 32 markers were used, 20 of which were placed on the speaker's inner lips and 4 on a special cap, which served as the point of reference and stabilized the facial motion capture (FMC) image during post-processing. Speech samples were recorded simultaneously as a list of approximately 300 words in which all English consonantal and vocalic allophones were represented. Different parameterization strategies were tested and the accuracy of vocalic segments
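The role of the cap markers as a point of reference can be illustrated with a minimal sketch. It is not the authors' post-processing procedure, which the abstract does not describe. It assumes the simplest possible stabilization: each frame is re-expressed relative to the centroid of the reference markers, which removes head translation only and leaves rotation uncorrected. The names `stabilize` and `ref_idx`, and the toy coordinates, are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Frame = List[Point]  # one 3-D position per marker


def centroid(points: Sequence[Point]) -> Point:
    """Mean position of a set of 3-D points."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def stabilize(frames: Sequence[Frame], ref_idx: Sequence[int]) -> List[Frame]:
    """Express every marker relative to the centroid of the reference markers.

    Assumption: translation-only compensation. Head rotation is NOT removed.
    """
    out = []
    for frame in frames:
        cx, cy, cz = centroid([frame[i] for i in ref_idx])
        out.append([(x - cx, y - cy, z - cz) for (x, y, z) in frame])
    return out


# Hypothetical toy data: 2 frames, 3 markers.
# Markers 0 and 1 stand in for cap (reference) markers, marker 2 for a lip marker.
frames = [
    [(0.0, 10.0, 0.0), (2.0, 10.0, 0.0), (1.0, 0.0, 0.0)],
    # The head shifts by (5, 1, 0) and the lip additionally rises by 0.5.
    [(5.0, 11.0, 0.0), (7.0, 11.0, 0.0), (6.0, 1.5, 0.0)],
]

stable = stabilize(frames, ref_idx=[0, 1])
print(stable[0][2], stable[1][2])
```

After stabilization the lip marker differs between the two frames only by its own 0.5 rise, since the head shift shared with the cap markers cancels out. A fuller treatment would also estimate rotation from the reference markers, which is one reason to use several of them.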

Authors