Publications Repository - Gdańsk University of Technology

Investigating Feature Spaces for Isolated Word Recognition

The study examines how suitable two-dimensional representations of the speech signal are for speech recognition based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) with a time-frequency signal representation converted to the investigated feature spaces. Waveforms and fractal dimension features of the signal were chosen for the time domain, and three feature spaces were investigated for the frequency domain: the Linear Prediction Coefficient (LPC) spectrum, the Hartley spectrum, and the cochleagram. Because deep learning requires a sufficiently large training corpus, and the content of that corpus may significantly influence the outcome, the created dataset was augmented with mixes of the speech signal and noise at various Signal-to-Noise Ratios (SNRs). To evaluate the applicability of the implemented feature spaces to isolated word recognition, three experiments were conducted, covering 10-, 70-, and 111-word cases.
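The noise-mixing augmentation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech" signal, the Gaussian noise, and the SNR values of 0, 10, and 20 dB are all assumptions chosen for illustration, since the abstract does not state which noise types or SNR levels were used.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to give the target SNR (dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Mean power of each signal.
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixed):
    """SNR of a mix, computed from the clean signal and the residual noise."""
    p_speech = sum(s * s for s in speech)
    p_resid = sum((m - s) ** 2 for s, m in zip(speech, mixed))
    return 10 * math.log10(p_speech / p_resid)


# Hypothetical stand-ins for a recorded word and a noise recording.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Assumed SNR levels; the abstract only says "various SNRs".
augmented = {snr: mix_at_snr(speech, noise, snr) for snr in (0, 10, 20)}
```

Each entry in `augmented` is a noisy copy of the same utterance, so a single recording yields several training examples. Scaling the noise rather than the speech keeps the level of the clean signal unchanged across the mixes.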

Authors

Additional information

DOI
10.1007/978-3-030-39250-5
Category
Monographic publication
Type
chapter, article in a book - collective work / textbook in a language of international reach
Language
English
Publication year
2020

Source: MOSTWiedzy.pl - publication "Investigating Feature Spaces for Isolated Word Recognition"