Abstract
The syllables of speech contain information about the vocal tract length (VTL) of the speaker as well as the phonetic message. Ideally, the pre-processor used for automatic speech recognition (ASR) should segregate the phonetic message from the VTL information. This paper describes a method to calculate VTL-invariant auditory feature vectors from speech, using a method in which the message and the VTL are segregated. Spectra produced by an auditory filterbank are summarized by a Gaussian mixture model (GMM) to produce a low-dimensional feature vector. These features are evaluated for robustness in comparison with conventional mel-frequency cepstral coefficients (MFCCs) using a hidden-Markov-model (HMM) recognizer. A dynamic, compressive gammachirp (dcGC) auditory filterbank is also introduced. The dcGC provides a leveldependent spectral analysis, with near instantaneous compression, and two-tone suppression.
Original language | English |
---|---|
Title of host publication | ISCAS 2010 - 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems |
Place of Publication | Piscataway, NJ |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 3813-3816 |
Number of pages | 4 |
ISBN (Print) | 9781424453085 |
DOIs | |
Publication status | Published - 2010 |
Externally published | Yes |
Event | 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems, ISCAS 2010 - Paris, France Duration: 30 May 2010 → 2 Jun 2010 |
Other
Other | 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems, ISCAS 2010 |
---|---|
Country/Territory | France |
City | Paris |
Period | 30/05/10 → 2/06/10 |