Separating voices from multiple sound sources using 2D microphone array

Xinran Lu, Lei Xie, Fang Wang, Tao Gu, Chuyu Wang, Wei Wang, Sanglu Lu

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

1 Citation (Scopus)

Abstract

Voice assistants are widely used for human-computer interaction and automatic meeting minutes. With multiple sound sources, however, the speech recognition performance of a voice assistant degrades dramatically. It is therefore crucial to separate multiple voices efficiently if voice assistant applications are to be effective in multi-user scenarios. In this paper, we present a novel voice separation system that uses a 2D microphone array in scenarios with multiple sound sources. Specifically, we propose a spatial filtering-based method that iteratively estimates the Angle of Arrival (AoA) of each sound source and separates the voice signals with adaptive beamforming. We use BeamForming-based cross-Correlation (BF-Correlation) to accurately assess the performance of beamforming and automatically optimize the voice separation within the iterative framework. Unlike plain cross-correlation, BF-Correlation performs cross-correlation among the beamformed voice signals produced by each linear microphone array. In this way, mutual interference from voice signals outside the specified direction is effectively suppressed or mitigated by spatial filtering. We implement a prototype system and evaluate its performance in real environments. Experimental results show that, in the presence of three sound sources, the average AoA error is 1.4 degrees and the average ratio of automatic speech recognition accuracy is 90.2%.
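
The spatial filtering idea behind the abstract can be illustrated with a generic delay-and-sum beamformer. The sketch below is not the paper's BF-Correlation method or its adaptive beamformer. It assumes a single simulated uniform linear array, a far-field white-noise source, and hypothetical parameters (6 microphones, 4 cm spacing, 16 kHz sampling, a source at 30 degrees). It scans candidate angles and picks the one with the highest beamformer output power.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(num_mics, spacing, angle_deg):
    """Arrival delay (s) at each mic for a far-field source.

    The angle is measured from broadside of a uniform linear array.
    """
    positions = np.arange(num_mics) * spacing
    return positions * np.sin(np.deg2rad(angle_deg)) / SPEED_OF_SOUND


def delay_signals(signal, delays, fs):
    """Simulate array recordings by applying (circular) fractional delays."""
    n = len(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(signal)
    phase = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(spectrum[None, :] * phase, n=n)


def delay_and_sum(recordings, spacing, angle_deg, fs):
    """Undo the delays expected for angle_deg, then average the channels."""
    num_mics, n = recordings.shape
    delays = steering_delays(num_mics, spacing, angle_deg)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(recordings, axis=1)
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=n)


def estimate_aoa(recordings, spacing, fs, candidates=np.arange(-90, 91)):
    """Return the candidate angle (degrees) with maximum output power."""
    powers = [
        np.mean(delay_and_sum(recordings, spacing, angle, fs) ** 2)
        for angle in candidates
    ]
    return float(candidates[int(np.argmax(powers))])


# Hypothetical simulation parameters (not taken from the paper).
NUM_MICS, SPACING, FS, TRUE_ANGLE = 6, 0.04, 16000, 30.0

rng = np.random.default_rng(0)
source = rng.standard_normal(4096)  # white noise stands in for a voice
recordings = delay_signals(
    source, steering_delays(NUM_MICS, SPACING, TRUE_ANGLE), FS
)
recordings += 0.05 * rng.standard_normal(recordings.shape)  # sensor noise

estimated_angle = estimate_aoa(recordings, SPACING, FS)
print(f"true angle: {TRUE_ANGLE}, estimated angle: {estimated_angle}")
```

Steering toward the estimated angle with `delay_and_sum` reinforces the signal from that direction and attenuates signals arriving from elsewhere, which is the basic effect a separation pipeline of this kind relies on.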

Original language: English
Title of host publication: IEEE INFOCOM 2022 - IEEE Conference on Computer Communications
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 989-998
Number of pages: 10
ISBN (Electronic): 9781665458221
ISBN (Print): 9781665458238
Publication status: Published - 2022
Event: 41st IEEE Conference on Computer Communications, INFOCOM 2022 - Virtual, London, United Kingdom
Duration: 2 May 2022 to 5 May 2022

Publication series

ISSN (Print): 0743-166X
ISSN (Electronic): 2641-9874

Conference

Conference: 41st IEEE Conference on Computer Communications, INFOCOM 2022
Country/Territory: United Kingdom
City: Virtual, London
Period: 2/05/22 to 5/05/22
