This work proposes using a pseudo-color cochleagram image of a sound signal for feature extraction in robust acoustic event recognition. A cochleagram is a variation of the spectrogram that uses a gammatone filter and has been shown to better reveal spectral information. We propose mapping the grayscale cochleagram image to a higher-dimensional color space to better distinguish the signal from environmental noise. The resulting time-frequency representation is referred to as the pseudo-color cochleagram image, and the resulting feature, which captures the statistical distribution of the image, as the pseudo-color cochleagram image feature (PC-CIF). In addition, sequential backward feature selection is applied to select the most useful feature dimensions, thereby reducing the feature dimensionality and improving classification performance. We evaluate the effectiveness of the proposed methods using two classifiers, k-nearest neighbor and support vector machines. Performance is evaluated on a dataset of 50 sound classes taken from the Real World Computing Partnership Sound Scene Database in Real Acoustical Environments, with environmental noise added at various signal-to-noise ratios. The experimental results show that the proposed techniques significantly improve classification performance over the baseline methods, with the largest improvements observed at low signal-to-noise ratios.
- Acoustic event recognition
- Sequential backward feature selection
- Support vector machines
- Time-frequency image
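
The pseudo-color mapping described in the abstract can be illustrated with a minimal sketch. It assumes a jet-like piecewise-linear colormap and whole-image mean and standard deviation per color channel as the statistical features; the abstract specifies neither the colormap nor the exact statistics, so the function names (`normalize`, `pseudo_color`, `channel_statistics`) and both choices are hypothetical stand-ins rather than the authors' method.

```python
from statistics import mean, pstdev


def normalize(image):
    """Scale a 2-D grayscale image (list of rows) to the range [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant image
    return [[(v - lo) / span for v in row] for row in image]


def clamp(x):
    """Limit x to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def pseudo_color(v):
    """Map an intensity v in [0, 1] to an (r, g, b) triple.

    Assumption: a jet-like colormap built from three shifted,
    clipped triangle functions. The paper's mapping may differ.
    """
    r = clamp(1.5 - abs(4.0 * v - 3.0))
    g = clamp(1.5 - abs(4.0 * v - 2.0))
    b = clamp(1.5 - abs(4.0 * v - 1.0))
    return (r, g, b)


def channel_statistics(image):
    """Return [mean_r, std_r, mean_g, std_g, mean_b, std_b] for the image.

    Assumption: whole-image moments stand in for the statistical
    distribution features mentioned in the abstract.
    """
    pixels = [pseudo_color(v) for row in normalize(image) for v in row]
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        features.extend([mean(values), pstdev(values)])
    return features
```

A single grayscale value becomes three channel values, so each time-frequency cell contributes to three separate distributions instead of one. That is the sense in which the mapping moves the image into a higher-dimensional space before features are computed.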