Time-frequency image resizing using interpolation for acoustic event recognition with convolutional neural networks

Roneel V. Sharan, Tom J. Moir

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

5 Citations (Scopus)

Abstract

Convolutional neural networks (CNNs) are increasingly used for audio signal classification, including acoustic event recognition. A CNN is an image classifier, so acoustic event signals are often represented as time-frequency images for this purpose. However, the duration of sound event signals can vary greatly, and an important consideration is how to produce equally sized time-frequency images for classification with a CNN. In this paper, we use techniques from digital image processing to address this problem. In particular, we apply interpolation-based image resizing techniques to form equally sized time-frequency representations. We consider nearest-neighbor, bilinear, bicubic, and Lanczos kernel interpolation methods for this purpose. A database containing 50 sound event classes with sound events of varying duration is used to evaluate the classification performance of these resized time-frequency images. The results show that time-frequency images resized using bicubic and Lanczos kernel interpolation give much better classification performance than the conventional time-frequency image representation.
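As a rough illustration of the resizing step described in the abstract, the sketch below maps time-frequency images of different durations onto a common grid using the four interpolation kernels named there. It is not the authors' implementation: the function names (`resize_1d`, `resize_tf_image`), the toy input arrays, the target size, the Keys cubic parameter a = -0.5, and the Lanczos window a = 3 are all assumptions chosen for the example, and the sketch interpolates without the anti-aliasing prefilter that image libraries often apply when shrinking.

```python
import math

# Interpolation kernels; the parameter choices below are illustrative assumptions.
def _nearest(x):
    return 1.0 if -0.5 <= x < 0.5 else 0.0

def _linear(x):
    return max(0.0, 1.0 - abs(x))

def _cubic(x, a=-0.5):  # Keys cubic convolution kernel, a = -0.5 assumed
    x = abs(x)
    if x < 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * (x**3 - 5.0 * x**2 + 8.0 * x - 4.0)
    return 0.0

def _lanczos(x, a=3):  # sinc(x) * sinc(x / a), window a = 3 assumed
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

# Each entry: (kernel function, half-width of the kernel in input samples).
KERNELS = {
    "nearest": (_nearest, 1),
    "bilinear": (_linear, 1),
    "bicubic": (_cubic, 2),
    "lanczos": (_lanczos, 3),
}

def resize_1d(samples, out_len, method):
    """Resample a 1-D sequence to out_len points with the chosen kernel."""
    kernel, support = KERNELS[method]
    n = len(samples)
    scale = n / out_len
    out = []
    for i in range(out_len):
        # Pixel-centre mapping from the output index to an input coordinate.
        src = (i + 0.5) * scale - 0.5
        base = math.floor(src)
        acc = 0.0
        wsum = 0.0
        for j in range(base - support + 1, base + support + 1):
            w = kernel(src - j)
            # Clamp indices so that edge samples are replicated.
            acc += w * samples[min(max(j, 0), n - 1)]
            wsum += w
        out.append(acc / wsum)  # normalise so the weights sum to one
    return out

def resize_tf_image(image, out_rows, out_cols, method="bicubic"):
    """Separable resize: along time (columns) first, then frequency (rows)."""
    rows = [resize_1d(r, out_cols, method) for r in image]
    cols = [resize_1d(list(c), out_rows, method) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Two toy "events" with 4 frequency bins but 7 and 12 time frames.
short_event = [[float(f + t) for t in range(7)] for f in range(4)]
long_event = [[float(f * t) for t in range(12)] for f in range(4)]

for method in KERNELS:
    a = resize_tf_image(short_event, 4, 8, method)
    b = resize_tf_image(long_event, 4, 8, method)
    print(method, (len(a), len(a[0])), (len(b), len(b[0])))
```

After resizing, both toy events share the same 4 × 8 shape regardless of their original duration, which is the property a fixed-input CNN needs. In practice an image-processing library would normally be used for this step; the pure-Python version is only meant to make the kernels explicit.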

Original language: English
Title of host publication: 2019 IEEE International Conference on Signals and Systems
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 8-11
Number of pages: 4
ISBN (Electronic): 9781728121772
ISBN (Print): 9781728121789
DOIs
Publication status: Published - 1 Jul 2019
Externally published: Yes
Event: 2019 IEEE International Conference on Signals and Systems, ICSigSys 2019 - Bandung, Indonesia
Duration: 16 Jul 2019 to 18 Jul 2019

Conference

Conference: 2019 IEEE International Conference on Signals and Systems, ICSigSys 2019
Country/Territory: Indonesia
City: Bandung
Period: 16/07/19 to 18/07/19

Keywords

  • acoustic event recognition
  • convolutional neural network
  • image resize
  • interpolation
  • spectrogram
  • time-frequency image
