TY - JOUR
T1 - KSCB
T2 - a novel unsupervised method for text sentiment analysis
AU - Jiang, Weili
AU - Zhou, Kangneng
AU - Xiong, Chenchen
AU - Du, Guodong
AU - Ou, Chubin
AU - Zhang, Junpeng
PY - 2023/1
Y1 - 2023/1
N2 - In recent years, deep learning models (e.g. Convolutional Neural Networks (CNN) and Long Short-Term Memories (LSTM)), have been successfully applied to text sentiment analysis. However, the class-imbalance and unlabeled corpus still limit the accuracy of text sentiment classification. To overcome the two issues, in this work, we propose a new classification model named KSCB (integrating K-means++, SMOTE, CNN and Bi-LSTM models) for text sentiment analysis. The K-means++-SMOTE (combining K-means++ and SMOTE) operation in KSCB is firstly used to cluster sentiment text, and further generate new corpora via imbalance ratio to adjust data distribution. Then the loss function between K-means++-SMOTE and CNN-Bi-LSTM (combining CNN and Bi-LSTM) is applied to construct end-to-end learning. Different from other deep learning models, our proposed method KSCB can adjust data distribution for different sentiment corpora via KSCB optimization. We have applied KSCB into the balanced and imbalanced corpora, and the comparison results show that KSCB is better than or comparable to the other five state-of-the-art methods in text sentiment classification. Moreover, the ablation experiment in the balanced and imbalanced corpora have demonstrated the effectiveness of KSCB in text sentiment analysis.
AB - In recent years, deep learning models (e.g. Convolutional Neural Networks (CNN) and Long Short-Term Memories (LSTM)), have been successfully applied to text sentiment analysis. However, the class-imbalance and unlabeled corpus still limit the accuracy of text sentiment classification. To overcome the two issues, in this work, we propose a new classification model named KSCB (integrating K-means++, SMOTE, CNN and Bi-LSTM models) for text sentiment analysis. The K-means++-SMOTE (combining K-means++ and SMOTE) operation in KSCB is firstly used to cluster sentiment text, and further generate new corpora via imbalance ratio to adjust data distribution. Then the loss function between K-means++-SMOTE and CNN-Bi-LSTM (combining CNN and Bi-LSTM) is applied to construct end-to-end learning. Different from other deep learning models, our proposed method KSCB can adjust data distribution for different sentiment corpora via KSCB optimization. We have applied KSCB into the balanced and imbalanced corpora, and the comparison results show that KSCB is better than or comparable to the other five state-of-the-art methods in text sentiment classification. Moreover, the ablation experiment in the balanced and imbalanced corpora have demonstrated the effectiveness of KSCB in text sentiment analysis.
KW - K-means++
KW - SMOTE
KW - CNN
KW - Bi-LSTM
KW - End-to-end Learning
KW - Text Sentiment Classification
UR - http://www.scopus.com/inward/record.url?scp=85128103195&partnerID=8YFLogxK
U2 - 10.1007/s10489-022-03389-4
DO - 10.1007/s10489-022-03389-4
M3 - Article
AN - SCOPUS:85128103195
SN - 0924-669X
VL - 53
SP - 301
EP - 311
JO - Applied Intelligence
JF - Applied Intelligence
IS - 1
ER -