TY - JOUR
T1 - MASK-Net
T2 - robust health mention classification by masking a disease or symptom terms
AU - Naseem, Usman
AU - Thapa, Surendrabikram
AU - Zhang, Qi
AU - Rashid, Junaid
AU - Hu, Liang
PY - 2024/12/2
Y1 - 2024/12/2
N2 - Social media users often use disease or symptom terms in ways other than describing their health conditions, which can lead to flawed conclusions in data-driven public health surveillance. The health mention classification (HMC) task aims to identify posts in which users use disease or symptom terms to discuss their health conditions instead of using them for other reasons. Existing methods rely on features extracted from external resources and are tested on data from either Twitter or Reddit; therefore, their generalizability and transferability are unproven. In this work, we present MASK-Net, which masks disease or symptom terms and relies on the context of a post. Furthermore, to capture the negative sentiments associated with the experience of having a disease, we incorporate sentiment information to improve the HMC. We conduct experiments using publicly available health-mention datasets collected from Twitter and Reddit. Experimental results demonstrate that our method outperforms state-of-the-art methods on both HMC datasets, highlighting the relevance of context words in identifying HMC. Additionally, we evaluate our method in cross-domain and multidomain settings to analyze the transferability and generalizability of MASK-Net and conclude with a discussion on the empirical and ethical considerations of our study.
AB - Social media users often use disease or symptom terms in ways other than describing their health conditions, which can lead to flawed conclusions in data-driven public health surveillance. The health mention classification (HMC) task aims to identify posts in which users use disease or symptom terms to discuss their health conditions instead of using them for other reasons. Existing methods rely on features extracted from external resources and are tested on data from either Twitter or Reddit; therefore, their generalizability and transferability are unproven. In this work, we present MASK-Net, which masks disease or symptom terms and relies on the context of a post. Furthermore, to capture the negative sentiments associated with the experience of having a disease, we incorporate sentiment information to improve the HMC. We conduct experiments using publicly available health-mention datasets collected from Twitter and Reddit. Experimental results demonstrate that our method outperforms state-of-the-art methods on both HMC datasets, highlighting the relevance of context words in identifying HMC. Additionally, we evaluate our method in cross-domain and multidomain settings to analyze the transferability and generalizability of MASK-Net and conclude with a discussion on the empirical and ethical considerations of our study.
KW - Health mention classification (HMC)
KW - mental health
KW - social media
KW - word masking
UR - http://www.scopus.com/inward/record.url?scp=85211316882&partnerID=8YFLogxK
U2 - 10.1109/TCSS.2024.3492143
DO - 10.1109/TCSS.2024.3492143
M3 - Article
AN - SCOPUS:85211316882
SN - 2329-924X
JO - IEEE Transactions on Computational Social Systems
JF - IEEE Transactions on Computational Social Systems
ER -