A contextual semantic-based approach for domain-centric lexicon expansion

Muhammad Abulaish, Mohd Fazil, Tarique Anwar

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

Abstract

This paper presents a contextual semantic-based approach for expanding an initial lexicon of domain-centric seed words. Starting with a small lexicon containing some domain-centric seed words, the proposed approach models the text corpus as a weighted word-graph, where the initial weight of a node (word) represents the contextual semantic-based association between the node and the target domain, and the weight of an edge represents the co-occurrence frequency of the respective nodes. The semantic-based association between a node and the target domain is calculated as a function of three contextual semantic-based association metrics. Thereafter, a random walk-based modified PageRank algorithm is applied to the weighted graph to rank and select the most relevant terms for domain-centric lexicon expansion. The proposed approach is evaluated over five datasets and is found to perform significantly better than three baselines and three state-of-the-art approaches.
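
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the three association metrics or the exact PageRank modification, so the sketch substitutes a hypothetical prior (seed words get weight 1.0, all other words 0.1) and a standard weighted, prior-biased PageRank computed by power iteration. The function names, the toy corpus, and the parameter values are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Undirected word graph; edge weight = number of sentences containing both words."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for u, v in combinations(sorted(set(tokens)), 2):
            graph[u][v] += 1.0
            graph[v][u] += 1.0
    return graph


def rank_words(graph, prior, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration, with teleportation biased by `prior`.

    `prior` is a hypothetical stand-in for the paper's node weights
    (the contextual semantic-based association with the target domain).
    """
    nodes = list(graph)
    total = sum(prior.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("prior must give positive weight to at least one node")
    teleport = {n: prior.get(n, 0.0) / total for n in nodes}
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    score = dict(teleport)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour m passes on its score in proportion to edge weight.
            incoming = sum(score[m] * graph[m][n] / out_weight[m] for m in graph[n])
            updated[n] = (1.0 - damping) * teleport[n] + damping * incoming
        score = updated
    return score


# Toy corpus (invented for illustration), one token list per sentence.
sentences = [
    ["vaccine", "dose", "clinic"],
    ["vaccine", "dose", "trial"],
    ["clinic", "parking", "ticket"],
    ["trial", "dose", "placebo"],
]
seeds = {"vaccine"}

graph = build_cooccurrence_graph(sentences)
# Assumed prior: seeds weigh 1.0, every other word 0.1.
prior = {word: (1.0 if word in seeds else 0.1) for word in graph}
scores = rank_words(graph, prior)

# Expansion candidates: non-seed words, highest score first.
candidates = sorted((w for w in scores if w not in seeds),
                    key=scores.get, reverse=True)
print(candidates[:3])
```

In this sketch the prior only biases the teleportation step; a closer reproduction would replace it with the paper's combined association score and its own PageRank variant.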

Original language: English
Title of host publication: Databases Theory and Applications
Subtitle of host publication: 31st Australasian Database Conference, ADC 2020, Melbourne, VIC, Australia, February 3–7, 2020, Proceedings
Editors: Renata Borovica-Gajic, Jianzhong Qi, Weiqing Wang
Place of Publication: Cham, Switzerland
Publisher: Springer, Springer Nature
Pages: 216-224
Number of pages: 9
ISBN (Electronic): 9783030394691
ISBN (Print): 9783030394684
Publication status: Published - 2020
Event: Australasian Database Conference (31st: 2020) - Swinburne University of Technology, Melbourne, Australia
Duration: 4 Feb 2020 to 7 Feb 2020
https://adc2020.github.io/

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12008 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Australasian Database Conference (31st: 2020)
Abbreviated title: ADC 2020
Country: Australia
City: Melbourne
Period: 4/02/20 to 7/02/20

Keywords

  • Text mining
  • Keyword extraction
  • Lexicon expansion
  • Contextual similarity
