An effective contrast sequential pattern mining approach to taxpayer behavior analysis

Zhigang Zheng*, Wei Wei, Chunming Liu, Wei Cao, Longbing Cao, Maninder Bhatia

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

45 Citations (Scopus)

Abstract

Data mining for client behavior analysis has become increasingly important in business. However, further analysis of transactions and sequential behaviors would be of even greater value, especially in financial services such as banking and insurance, and in government. In a real-world business application of taxation debt collection, understanding the relationship between taxpayers’ sequential behaviors (payment, lodgment and actions) and their compliance with their debt requires finding the contrast sequential behavior patterns between compliant and non-compliant taxpayers. Contrast Patterns (CP) are defined as itemsets that show the difference/discrimination between two classes/datasets (Dong and Li, 1999). However, existing CP mining methods, which can only mine itemset patterns, are not suitable for mining sequential patterns such as the time-ordered transactions in taxpayer sequential behaviors. Little work has been conducted on Contrast Sequential Pattern (CSP) mining so far. To address this issue, we develop a CSP mining approach, eCSP, which uses an effective CSP-tree structure that improves the PrefixSpan tree (Pei et al., 2001) for mining contrast patterns. We propose heuristics and interestingness filtering criteria and integrate them seamlessly into the CSP-tree, both to reduce the search space and to find patterns of business interest. The performance of the proposed approach is evaluated on three real-world datasets. In addition, a case study shows how to apply the approach to analyze taxpayer behavior. The results show promising performance and convincing business value.
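
To make the notion of a contrast sequential pattern concrete, the following is a minimal sketch and not the eCSP algorithm or its CSP-tree. It assumes that a pattern matches a behavior sequence when its events occur in the same order (gaps allowed), and it uses the ratio of supports between two classes as the contrast measure, in the spirit of the growth rate used for emerging patterns. The abstract does not state which measure eCSP applies, so this choice is an assumption. The function names and the toy event sequences are invented for illustration.

```python
from typing import List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if the pattern's events occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so each match must come after the previous one.
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], dataset: List[Sequence[str]]) -> float:
    """Fraction of sequences in `dataset` that contain `pattern` as a subsequence."""
    if not dataset:
        return 0.0
    return sum(is_subsequence(pattern, seq) for seq in dataset) / len(dataset)


def growth_rate(pattern: Sequence[str],
                target: List[Sequence[str]],
                background: List[Sequence[str]]) -> float:
    """Ratio of the pattern's support in `target` to its support in `background`."""
    sup_target = support(pattern, target)
    sup_background = support(pattern, background)
    if sup_background == 0:
        return float("inf") if sup_target > 0 else 0.0
    return sup_target / sup_background


# Toy data (hypothetical): time-ordered events per taxpayer, split by class.
compliant = [
    ["lodgment", "payment"],
    ["action", "lodgment", "payment"],
    ["payment", "lodgment"],
    ["lodgment", "action", "payment"],
]
non_compliant = [
    ["action", "action"],
    ["lodgment", "action"],
    ["payment", "action"],
    ["lodgment", "payment", "action"],
]

pattern = ["lodgment", "payment"]
rate = growth_rate(pattern, compliant, non_compliant)
```

In this toy data the pattern "lodgment followed by payment" occurs in three of the four compliant sequences and in one of the four non-compliant ones, so its support ratio is 3.0. Event order matters: the sequence "payment, lodgment" does not match, which is what separates a sequential pattern from an itemset pattern.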

Original language: English
Pages (from-to): 633-651
Number of pages: 19
Journal: World Wide Web
Volume: 19
Issue number: 4
Publication status: Published - Jul 2016
Externally published: Yes

Keywords

  • Contrast pattern
  • Sequential pattern
  • Client behavior analysis
