A neural predictive coding feature extraction scheme in DCT domain for phoneme recognition

Mahmood Yousefi Azar, Farbod Razzazi

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Nonlinear feature extraction from speech signals has been a major research concern in recent years. In this paper, phoneme feature extraction with the neural predictive coding (NPC) model is generalized to a combination of the time and DCT domains. Two main ideas are proposed and evaluated. First, a frame-wise DCT-based NPC feature extractor is proposed to reduce the computational complexity of the system. This approach applies a DCT pre-feature extractor to remove unwanted, redundant data, and the extracted features are the outputs of the hidden layer. It is shown that this pre-processing stage improves both computational efficiency and accuracy. In the second approach, DCT-domain features play a complementary role in classic NPC modeling: the approach uses the residual of the predicted signal in the DCT domain. The experiments were conducted on the voiced plosive phonemes of the TIMIT database. Simulations showed that the combined method performs well on plosive phonemes, achieving a 70.3% recognition rate on the /b/, /d/, /g/ phonemes, which is higher than the results of traditional NPC approaches.
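The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: the frame size, the hop, the number of retained DCT coefficients, the tanh hidden layer, the next-frame prediction target, the synthetic test signal, and all names (`dct2`, `split_frames`, `TinyPredictor`). Each frame is transformed with a DCT-II and truncated (the pre-feature extractor), a small one-hidden-layer network is trained to predict the next frame's coefficients, the hidden-layer outputs are taken as features, and the prediction residual is computed in the DCT domain.

```python
import math
import random

def dct2(frame):
    """Unnormalized DCT-II of one frame (pure Python, O(n^2))."""
    n = len(frame)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(frame))
            for k in range(n)]

def split_frames(signal, size, hop):
    """Cut the signal into overlapping frames of `size` samples."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]

class TinyPredictor:
    """One-hidden-layer network predicting the next frame's DCT coefficients.

    Hypothetical stand-in for an NPC model; the paper's architecture and
    training rule are not given in the abstract.
    """
    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_in)]

    def hidden(self, x):
        # Hidden-layer activations; these serve as the extracted features.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self.w1]

    def predict(self, x):
        h = self.hidden(x)
        return [sum(w * hi for w, hi in zip(row, h)) for row in self.w2]

    def train_step(self, x, target, lr=0.05):
        """One stochastic gradient step on the squared prediction error."""
        h = self.hidden(x)
        y = [sum(w * hi for w, hi in zip(row, h)) for row in self.w2]
        err = [yi - ti for yi, ti in zip(y, target)]
        # Back-propagate to the hidden layer before touching w2.
        dh = [sum(err[o] * self.w2[o][j] for o in range(len(err)))
              * (1.0 - h[j] ** 2) for j in range(len(h))]
        for o, e in enumerate(err):
            for j, hj in enumerate(h):
                self.w2[o][j] -= lr * e * hj
        for j, d in enumerate(dh):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d * xi
        return sum(e * e for e in err) / len(err)

# Synthetic two-tone signal standing in for a speech segment (assumption).
signal = [math.sin(2 * math.pi * 5 * t / 200)
          + 0.5 * math.sin(2 * math.pi * 12 * t / 200) for t in range(400)]

K = 8  # number of low-order DCT coefficients kept per frame (assumption)
coeffs = [dct2(f)[:K] for f in split_frames(signal, size=32, hop=16)]
scale = max(abs(c) for row in coeffs for c in row)
coeffs = [[c / scale for c in row] for row in coeffs]  # scale into [-1, 1]

net = TinyPredictor(n_in=K, n_hidden=4)
pairs = list(zip(coeffs[:-1], coeffs[1:]))  # (current frame, next frame)
losses = []
for _ in range(200):
    losses.append(sum(net.train_step(x, t) for x, t in pairs) / len(pairs))

# First idea: hidden-layer outputs as per-frame features.
features = [net.hidden(x) for x in coeffs]
# Second idea: prediction residual in the DCT domain.
residuals = [[t - p for t, p in zip(target, net.predict(x))]
             for x, target in pairs]
```

Truncating to `K` coefficients before the network is what shrinks the input layer, which is one plausible reading of how a DCT pre-processing stage could lower computational cost. The paper may define the residual differently, for example as the DCT of a time-domain prediction error.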

Original language: English
Pages (from-to): 565-574
Number of pages: 10
Journal: Neural Computing and Applications
Volume: 21
Issue number: 3
Publication status: Published - 1 Apr 2012
Externally published: Yes

Keywords

  • Automatic feature extraction
  • Discrete cosine transform
  • Neural network
  • Nonlinear predictive coding
