A framework for clustering and dynamic maintenance of XML documents

Ahmed Al-Shammari, Chengfei Liu, Mehdi Naseriparsa, Bao Quoc Vo, Tarique Anwar, Rui Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

6 Citations (Scopus)

Abstract

Web data clustering has been widely studied in the data mining community. However, dynamic maintenance of web data clusters remains a challenging task. In this paper, we propose a novel framework called XClusterMaint that supports both clustering and maintenance of XML documents. For clustering, we take both structure and content into account and propose an efficient solution for grouping documents based on a combination of structure and content similarity. For maintenance, we propose an incremental approach that maintains the existing clusters dynamically as new XML documents arrive. Since dynamic maintenance of the clusters is computationally expensive, we also propose an improved approach that uses a lazy maintenance scheme to improve the performance of cluster maintenance. Experimental results on real datasets verify the efficiency of the proposed clustering and maintenance model.
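The abstract does not give the similarity measure or the maintenance algorithm, so the following is only a minimal illustrative sketch of the general idea, not the XClusterMaint method itself. It assumes, purely for illustration, that structure similarity is the Jaccard overlap of root-to-node tag paths, that content similarity is the cosine of term counts, that the two are mixed by a hypothetical weight `alpha`, and that "lazy" maintenance means buffering incoming documents and assigning them in batches. The names `LazyClusters`, `threshold`, and `batch_size` are likewise hypothetical.

```python
from collections import Counter
import math
import xml.etree.ElementTree as ET


def paths_and_terms(xml_text):
    """Return (set of root-to-node tag paths, Counter of text tokens)."""
    root = ET.fromstring(xml_text)
    paths, terms = set(), Counter()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        if node.text and node.text.strip():
            terms.update(node.text.lower().split())
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths, terms


def jaccard(a, b):
    """Assumed structure similarity: overlap of tag-path sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    """Assumed content similarity: cosine of term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity(d1, d2, alpha=0.5):
    """Weighted mix of structure and content similarity (alpha is hypothetical)."""
    return alpha * jaccard(d1[0], d2[0]) + (1 - alpha) * cosine(d1[1], d2[1])


class LazyClusters:
    """Buffers incoming documents and assigns them to clusters in batches."""

    def __init__(self, threshold=0.5, batch_size=2, alpha=0.5):
        self.threshold = threshold
        self.batch_size = batch_size
        self.alpha = alpha
        self.clusters = []  # each cluster is a list of (paths, terms) pairs
        self.pending = []   # documents received but not yet assigned

    def add(self, xml_text):
        self.pending.append(paths_and_terms(xml_text))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        for doc in self.pending:
            best, best_sim = None, 0.0
            for cluster in self.clusters:
                # The first member stands in as the cluster representative.
                sim = similarity(doc, cluster[0], self.alpha)
                if sim > best_sim:
                    best, best_sim = cluster, sim
            if best is not None and best_sim >= self.threshold:
                best.append(doc)
            else:
                self.clusters.append([doc])
        self.pending = []
```

Calling `add` for each arriving document defers the comparison work until `batch_size` documents have accumulated, and `flush` can be called explicitly to force assignment. Comparing only against one representative per cluster keeps each assignment cheap, at the cost of ignoring how the cluster has drifted since it was created.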

Original language: English
Title of host publication: Advanced Data Mining and Applications
Subtitle of host publication: 13th International Conference, ADMA 2017, Singapore, November 5–6, 2017, Proceedings
Editors: Gao Cong, Wen-Chih Peng, Wei Emma Zhang, Chengliang Li, Aixin Sun
Place of Publication: Cham, Switzerland
Publisher: Springer, Springer Nature
Pages: 399-412
Number of pages: 14
ISBN (Electronic): 9783319691794
ISBN (Print): 9783319691787
DOIs
Publication status: Published - 2017
Externally published: Yes
Event: 13th International Conference on Advanced Data Mining and Applications, ADMA 2017 - Singapore, Singapore
Duration: 5 Nov 2017 → 6 Nov 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10604 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Advanced Data Mining and Applications, ADMA 2017
Country/Territory: Singapore
City: Singapore
Period: 5/11/17 → 6/11/17

Keywords

  • Clustering
  • Dynamic maintenance
  • Structure and content similarity
  • XML documents
