Social media sentiment analysis: lexicon versus machine learning

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Purpose: With the soaring volumes of brand-related social media conversations, digital marketers have extensive opportunities to track and analyse consumers’ feelings and opinions about brands, products or services embedded within consumer-generated content (CGC). These “Big Data” opportunities render manual approaches to sentiment analysis impractical and raise the need to develop automated tools to analyse consumer sentiment expressed in text format. This paper aims to evaluate and compare the performance of two prominent approaches to automated sentiment analysis applied to CGC on social media and to explore the benefits of combining them.
Design/methodology/approach: A sample of 850 consumer comments from 83 Facebook brand pages is used to test and compare lexicon-based and machine learning approaches to sentiment analysis, as well as their combination, using the LIWC2015 lexicon and the RTextTools machine learning package.
Findings: Results show the two approaches are similar in accuracy, both achieving higher accuracy when classifying positive sentiment than negative sentiment. However, they differ substantially in their classification ensembles. The combined approach demonstrates significantly improved performance in classifying positive sentiment.
Research limitations/implications: Further research is required to improve the accuracy of negative sentiment classification. The combined approach needs to be applied to other kinds of CGC on social media, such as tweets.
Practical implications: The findings inform decision-making around which sentiment analysis approach (or combination thereof) is best suited to analysing CGC on social media.
Originality/value: This study combines two sentiment analysis approaches and demonstrates significantly improved performance.
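
The abstract does not say how the lexicon-based and machine learning classifiers were combined, so the following is only a minimal sketch under stated assumptions. It uses a tiny hand-made word list in place of the LIWC2015 lexicon, a hand-rolled naive Bayes classifier in place of the RTextTools package (which is an R library), and a hypothetical agreement rule as the combination step. All names, training comments and the rule itself are illustrative and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the study used LIWC2015, which is not reproduced here.
POSITIVE = {"love", "great", "good", "amazing"}
NEGATIVE = {"hate", "bad", "terrible", "awful"}


def tokenize(text):
    """Lowercase, split on whitespace, strip basic punctuation, drop empties."""
    tokens = (w.strip(".,!?").lower() for w in text.split())
    return [t for t in tokens if t]


def lexicon_label(text):
    """Lexicon approach: count positive words minus negative words."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (a stand-in ML model)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        self.vocab = set()
        for text, label in zip(texts, labels):
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n / total)
            for token in tokenize(text):
                logp += math.log((counts[token] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def combined_label(text, model):
    """Assumed combination rule: keep a label only when both approaches agree."""
    lex = lexicon_label(text)
    ml = model.predict(text)
    return lex if lex == ml else "uncertain"


# Invented training comments, for illustration only.
train_texts = [
    "I love this brand",
    "great product, amazing service",
    "I hate the new design",
    "terrible support, bad experience",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)

for comment in ["love the great service", "bad and awful", "the new brand"]:
    print(comment, "->", lexicon_label(comment), model.predict(comment),
          combined_label(comment, model))
```

An agreement rule of this kind trades coverage for precision: comments on which the two classifiers disagree are set aside rather than labelled. Whether the study used this or another scheme cannot be determined from the abstract.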
Language: English
Pages: 480-488
Number of pages: 9
Journal: Journal of Consumer Marketing
Volume: 34
Issue number: 6
DOI: 10.1108/JCM-03-2017-2141
Publication status: Published - 2017

Fingerprint

Machine learning
Social media
Sentiment analysis
Sentiment
Marketers
Design methodology
Decision making
Sentiment classification
Consumer sentiment
Facebook

Cite this

@article{4cf39a97e28446339b08160563afbc6b,
title = "Social media sentiment analysis: lexicon versus machine learning",
abstract = "Purpose: With the soaring volumes of brand-related social media conversations, digital marketers have extensive opportunities to track and analyse consumers’ feelings and opinions about brands, products or services embedded within consumer-generated content (CGC). These “Big Data” opportunities render manual approaches to sentiment analysis impractical and raise the need to develop automated tools to analyse consumer sentiment expressed in text format. This paper aims to evaluate and compare the performance of two prominent approaches to automated sentiment analysis applied to CGC on social media and to explore the benefits of combining them. Design/methodology/approach: A sample of 850 consumer comments from 83 Facebook brand pages is used to test and compare lexicon-based and machine learning approaches to sentiment analysis, as well as their combination, using the LIWC2015 lexicon and the RTextTools machine learning package. Findings: Results show the two approaches are similar in accuracy, both achieving higher accuracy when classifying positive sentiment than negative sentiment. However, they differ substantially in their classification ensembles. The combined approach demonstrates significantly improved performance in classifying positive sentiment. Research limitations/implications: Further research is required to improve the accuracy of negative sentiment classification. The combined approach needs to be applied to other kinds of CGC on social media, such as tweets. Practical implications: The findings inform decision-making around which sentiment analysis approach (or combination thereof) is best suited to analysing CGC on social media. Originality/value: This study combines two sentiment analysis approaches and demonstrates significantly improved performance.",
keywords = "Sentiment analysis, Social media, Consumer-generated content",
author = "Chedia Dhaoui and Cynthia Webster and LayPeng Tan",
year = "2017",
doi = "10.1108/JCM-03-2017-2141",
language = "English",
volume = "34",
pages = "480--488",
journal = "Journal of Consumer Marketing",
issn = "0736-3761",
publisher = "Emerald Group Publishing",
number = "6",

}

Social media sentiment analysis: lexicon versus machine learning. / Dhaoui, Chedia; Webster, Cynthia; Tan, LayPeng.

In: Journal of Consumer Marketing, Vol. 34, No. 6, 2017, p. 480-488.

TY - JOUR

T1 - Social media sentiment analysis: lexicon versus machine learning

T2 - Journal of Consumer Marketing

AU - Dhaoui, Chedia

AU - Webster, Cynthia

AU - Tan, LayPeng

PY - 2017

Y1 - 2017

AB - Purpose: With the soaring volumes of brand-related social media conversations, digital marketers have extensive opportunities to track and analyse consumers’ feelings and opinions about brands, products or services embedded within consumer-generated content (CGC). These “Big Data” opportunities render manual approaches to sentiment analysis impractical and raise the need to develop automated tools to analyse consumer sentiment expressed in text format. This paper aims to evaluate and compare the performance of two prominent approaches to automated sentiment analysis applied to CGC on social media and to explore the benefits of combining them. Design/methodology/approach: A sample of 850 consumer comments from 83 Facebook brand pages is used to test and compare lexicon-based and machine learning approaches to sentiment analysis, as well as their combination, using the LIWC2015 lexicon and the RTextTools machine learning package. Findings: Results show the two approaches are similar in accuracy, both achieving higher accuracy when classifying positive sentiment than negative sentiment. However, they differ substantially in their classification ensembles. The combined approach demonstrates significantly improved performance in classifying positive sentiment. Research limitations/implications: Further research is required to improve the accuracy of negative sentiment classification. The combined approach needs to be applied to other kinds of CGC on social media, such as tweets. Practical implications: The findings inform decision-making around which sentiment analysis approach (or combination thereof) is best suited to analysing CGC on social media. Originality/value: This study combines two sentiment analysis approaches and demonstrates significantly improved performance.

KW - Sentiment analysis

KW - Social media

KW - Consumer-generated content

UR - http://www.scopus.com/inward/record.url?scp=85029675452&partnerID=8YFLogxK

U2 - 10.1108/JCM-03-2017-2141

DO - 10.1108/JCM-03-2017-2141

M3 - Article

VL - 34

SP - 480

EP - 488

JO - Journal of Consumer Marketing

JF - Journal of Consumer Marketing

SN - 0736-3761

IS - 6

ER -