Chinese sentence semantic matching based on multi-granularity fusion model

Xu Zhang, Wenpeng Lu*, Guoqiang Zhang, Fangfang Li, Shoujin Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference proceeding contributionpeer-review

Abstract

Sentence semantic matching is the cornerstone of many natural language processing tasks, including Chinese language processing. It is well known that Chinese sentences with different polysemous words or word order may have totally different semantic meanings. Thus, to represent and match the sentence semantic meaning accurately, one challenge that must be solved is how to capture the semantic features from the multi-granularity perspective, e.g., characters and words. To address the above challenge, we propose a novel sentence semantic matching model which is based on the fusion of semantic features from character-granularity and word-granularity, respectively. Particularly, the multi-granularity fusion intends to extract more semantic features to better optimize the downstream sentence semantic matching. In addition, we propose the equilibrium cross-entropy, a novel loss function, by setting mean square error (MSE) as an equilibrium factor of cross-entropy. The experimental results conducted on Chinese open data set demonstrate that our proposed model combined with binary equilibrium cross-entropy loss function is superior to the existing state-of-the-art sentence semantic matching models.

Original languageEnglish
Title of host publicationAdvances in Knowledge Discovery and Data Mining
Subtitle of host publication24th Pacific-Asia Conference, PAKDD 2020 Singapore, May 11–14, 2020, Proceedings, Part II
EditorsHady W. Lauw, Ee-Peng Lim, Raymond Chi-Wing Wong, Alexandros Ntoulas, See-Kiong Ng, Sinno Jialin Pan
Place of PublicationCham, Switzerland
PublisherSpringer, Springer Nature
Pages246-257
Number of pages12
ISBN (Electronic)9783030474362
ISBN (Print)9783030474355
DOIs
Publication statusPublished - 2020
Event24th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2020 - Singapore, Singapore
Duration: 11 May 202014 May 2020

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12085 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference24th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2020
CountrySingapore
CitySingapore
Period11/05/2014/05/20

Keywords

  • Equilibrium cross-entropy
  • Multi-granularity fusion
  • Sentence semantic matching

Fingerprint Dive into the research topics of 'Chinese sentence semantic matching based on multi-granularity fusion model'. Together they form a unique fingerprint.

Cite this