
Sentiment Analysis: What is it and how does it work?

by Masha Maksimava on September 28, 2020

Sentiment analysis is the ultimate buzzword. And as buzzwords go, it's a concept that's very often misunderstood.

At Awario, we just released a brand new sentiment analysis system, and we've been getting a lot of questions about sentiment since. With any luck, this guide will help you learn more about sentiment analysis: from how it's used to the ins and outs of the mechanics behind it. Here are the main questions I'm going to try and answer in this article; feel free to click on whichever ones you're most interested in!

  1. What is sentiment analysis?
  2. How is sentiment analysis used? 
  3. How does sentiment analysis work?
  4. How accurate is sentiment analysis?
  5. What are the caveats of sentiment analysis?

Let's start with the elephant of a question:

What is sentiment analysis?

Sentiment analysis, also called opinion mining, is the process of determining the emotion (often classified as positive, negative, or neutral) expressed by someone towards a topic or phenomenon. In the context of social listening and online reputation management, sentiment analysis is most commonly used to capture the voice of the customer and determine the attitude of consumers towards a brand, company, product, or person. 

How is sentiment analysis used? 

Sentiment analysis is arguably the most important thing to look for in a social listening tool. From analyzing brand health to improving customer service, here are some of the main things sentiment analysis tools help you do.

1. Monitor brand health

Sentiment analysis for mentions of Kanye West. Screenshot from Awario.

Sentiment analysis is an important part of monitoring your brand and assessing brand health. In your social media monitoring dashboard, keep an eye on the ratio of positive and negative mentions within the conversations about your brand and look into the key themes within both positive and negative conversations to learn what your customers tend to praise and complain about the most. 

When you look at the share of positive mentions within the conversations around your brand, then negative ones, and then neutral ones, you're working with three numbers, which is not an ideal way to measure one thing. To keep things simple, it's a good idea to calculate net sentiment instead.

Net sentiment is a single metric (a number on a scale from -1 to 1) that shows how your brand is doing sentiment-wise. The formula goes like this (note that neutral mentions are excluded from the calculation):

Net sentiment = (positive mentions - negative mentions) / (positive mentions + negative mentions)

In other words, net sentiment shows you whether you have more positive or negative mentions, and by how much. Here’s an example: on the sentiment chart for mentions of Kanye West above, you can see that there are 28.2% positive mentions and 26.9% negative ones. From there, you can calculate net sentiment:

(28.2 - 26.9) / (28.2 + 26.9) = 1.3 / 55.1 = 0.024

The fact that the number is positive tells us there are more positive mentions than negative ones. The value itself tells us by how much: the lead of positive mentions equals about 2.4% of all positive and negative mentions combined.
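
For readers who prefer code, here is a minimal Python sketch of the net sentiment calculation, (positive - negative) / (positive + negative). The function name net_sentiment and the choice to return 0 when there are no positive or negative mentions are assumptions made for illustration, not part of Awario.

```python
def net_sentiment(positive: float, negative: float) -> float:
    """Return net sentiment on a scale from -1 to 1.

    `positive` and `negative` can be mention counts or percentage shares;
    neutral mentions are left out of the calculation.
    """
    total = positive + negative
    if total == 0:
        # No positive or negative mentions at all: report 0 (assumption).
        return 0.0
    return (positive - negative) / total


# Shares from the Kanye West example: 28.2% positive, 26.9% negative.
print(round(net_sentiment(28.2, 26.9), 3))  # 0.024
```

Because the result is a ratio, it does not matter whether you pass raw counts or percentages, as long as both arguments use the same unit.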

2. Spot reputation crises early

A spike in negative mentions of Burger King caused by its questionable sustainability move. Screenshot from Awario.

Sentiment analysis isn't only used for social media analytics and reporting. It's just as important to log into your social listening dashboard daily and look out for any spikes in negative mentions - this way, you'll be able to catch a reputation crisis early and prevent it from turning into a full-on disaster. In Awario, with the help of Insights, you can also see the reasons behind any spikes in the volume of negative or positive conversations. By clicking on these insights, you can dig deeper into the data and see what caused the influx of negative (or positive) mentions.

3. Track performance of campaigns

Alert setup for #beforealexa campaign. Screenshot from Awario.

In the same way you're monitoring your brand, you can track mentions of your marketing campaigns, collaborations, events you're organizing, or literally any other initiative by your company that generates buzz online. Just like with brand monitoring, you can use sentiment analysis to measure the overall sentiment around the campaign and look out for spikes to identify the reasons behind them.

4. Perform competitor analysis

Sentiment analysis for Ford, Jeep, and Dodge. Screenshot from Awario.

Monitoring your competitors' sentiment will help you see which aspects of their products customers are most (and least) excited about. On top of that, competitor sentiment can also serve as a benchmark when you analyze the sentiment behind the mentions of your own brand and product. Let’s say, 50% of your mentions are positive, 40% are negative, and the rest are neutral. How do you know if that’s a good thing or a bad thing without a benchmark?

To quickly evaluate where you stand in terms of sentiment, it can be useful to assess the sentiment for all your competitors, combined. If you’re using Awario, you don’t have to calculate this manually - just go to the Mention Statistics report and select all of your competitors to see their total sentiment. From there, you can use it as a benchmark for your own sentiment.

5. Improve customer care

Negative mentions of Oatly on Twitter. Screenshot from Awario.

Monitoring customer sentiment can also help your Customer Support team prioritize their work. Make sure to address negative mentions (particularly those with a higher reach) first - in Awario, you can do that by using filters in your mentions feed. If you catch these negative conversations early, chances are you can turn the situation around for this specific client, and improve the customer experience for other consumers.

How does sentiment analysis work?

In data science lingo, sentiment analysis is a classification problem: the algorithm is presented with pieces of text that need to be classified as positive, negative, or neutral. The problem is usually tackled with the help of Natural Language Processing (NLP) in one of these three ways: supervised machine learning, rule-based techniques, or a combination of the two approaches. Let's take a look at each of these methods.

1. Supervised machine learning

In supervised machine learning, the system is presented with a full set of labeled data for training. This dataset consists of documents whose sentiment has already been determined by human evaluators. The system then learns the classification factors of the document from the training set and labels new input data (the test set).

Source: Exploring, Visualizing, and Modeling Big Data with R

Various classification algorithms can be used for sentiment analysis, such as Naive Bayes, logistic regression, Support Vector Machines, and others. Regardless of the approach, the system will usually assign a score to each word or phrase in the text it's analyzing to reflect its sentiment: say, on a scale from -1 for 'extremely negative' to 1 for 'extremely positive'. These scores will then be added up for all words in the text and divided by the number of words in it to get the average score. From there, it is up to the researcher to set the boundaries. For instance, you could say that an overall score between -1 and -0.33 should mean the statement is negative, use -0.33 to 0.33 for neutral, and 0.33 to 1 for positive.
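
The averaging and boundary step described above could look roughly like this in Python. This is only a sketch: the function names, the made-up per-word scores, and the decision to count a score that lands exactly on a boundary as neutral are all assumptions.

```python
from typing import Sequence


def average_score(word_scores: Sequence[float]) -> float:
    """Average the per-word scores; an empty text is scored 0 (assumption)."""
    if not word_scores:
        return 0.0
    return sum(word_scores) / len(word_scores)


def label_from_score(score: float, boundary: float = 0.33) -> str:
    """Map an average score in [-1, 1] to a label using symmetric boundaries."""
    if score < -boundary:
        return "negative"
    if score > boundary:
        return "positive"
    # Scores exactly on a boundary are counted as neutral (assumption).
    return "neutral"


# Made-up per-word scores, for illustration only.
example_scores = [0.8, 0.1, 0.6]
print(label_from_score(average_score(example_scores)))  # positive
```

Moving the boundary closer to 0 makes the system label fewer statements as neutral; moving it further away does the opposite.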

Pros: Supervised machine learning techniques allow for creating trained models tailored for the specific purpose of the analysis.

Cons: These models will often have poor adaptability between domains or different writing styles.

2. Rule-based methods

A rule-based system uses a set of human-crafted (and optionally machine-enriched) rules for text analysis. These rules commonly include lexicons (i.e. dictionaries of pre-labeled words and expressions).

Here’s a very basic example of what a dictionary may look like:

Word         Sentiment score
Fantastic     1
Good          0.5
Okay          0
Bad          -0.5
Terrible     -1

Once the system's at work, the first step is to look for words from the dictionary in the text it's analyzing. From there, the maths is the same as with machine learning models: add up the scores of the words that were found and divide the result by the number of those words to get the average. Lastly, determine the sentiment of the text based on the boundaries you set for positive, negative, and neutral. Let's take the following sentence as an example:

The coffee was okay, but the food was terrible.

Since this text contains two words from our dictionary, the score would be:

(0 + (-1)) / 2 = -1 / 2 = -0.5

If you decided that a score between -1 and -0.33 means the statement is negative, there you have it - this sentence will be labeled as negative by the system.
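
Here is a small Python sketch of the same calculation, using the example dictionary from the table above. The tokenization (lowercasing and a simple regular expression) and the function name lexicon_score are assumptions; a production system would need something more robust.

```python
import re
from typing import Optional

# The example dictionary from this article, lowercased for lookup.
LEXICON = {
    "fantastic": 1.0,
    "good": 0.5,
    "okay": 0.0,
    "bad": -0.5,
    "terrible": -1.0,
}


def lexicon_score(text: str) -> Optional[float]:
    """Average the scores of dictionary words found in `text`.

    Returns None when no dictionary word is present.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[word] for word in words if word in LEXICON]
    if not scores:
        return None
    return sum(scores) / len(scores)


print(lexicon_score("The coffee was okay, but the food was terrible."))  # -0.5
```

Note that the average is taken over the two matched words ("okay" and "terrible"), not over all nine words in the sentence.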

Pros: Rule-based systems are easier to build, and they are often not as resource-heavy as machine learning algorithms. They also give the researcher full control over the vocabulary and can therefore have better term coverage.

Cons: Basic rule-based systems look at individual words or phrases and not at how they are used in a sequence. Adding new rules may help, but eventually, the whole system can get very complex. On top of that, the number of words in dictionaries is finite, which may be a problem for natural language processing in dynamic environments (I'm looking at you, social media). In other words, lexicon-based systems will often require continuous fine-tuning.

3. Hybrid approaches

Supervised machine learning and lexicon-based approaches are sometimes combined to improve sentiment accuracy without sacrificing performance. The techniques can be used alongside each other in different ways, but most commonly, a rule-based system (which is typically faster than ML algorithms) will attempt to classify the sentiment of a statement. If a certain degree of confidence is not achieved (e.g., when few or no words from the sentence are available in the lexicon), a machine learning classifier will be used to identify the sentiment of the statement.
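
The fallback logic described above can be sketched in a few lines of Python. Everything here is hypothetical: the confidence rule (at least min_hits lexicon words found), the function names, and the placeholder classifier that stands in for a trained machine learning model.

```python
import re
from typing import Callable, Optional

# Toy lexicon reusing the example dictionary from this article.
LEXICON = {"fantastic": 1.0, "good": 0.5, "okay": 0.0, "bad": -0.5, "terrible": -1.0}


def rule_based_label(text: str, min_hits: int = 2) -> Optional[str]:
    """Return a label, or None when too few lexicon words were found."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[word] for word in words if word in LEXICON]
    if len(scores) < min_hits:
        return None  # not confident enough; let the caller fall back
    average = sum(scores) / len(scores)
    if average < -0.33:
        return "negative"
    if average > 0.33:
        return "positive"
    return "neutral"


def hybrid_label(text: str, ml_classifier: Callable[[str], str], min_hits: int = 2) -> str:
    """Try the fast rule-based pass first, then fall back to the ML model."""
    label = rule_based_label(text, min_hits)
    return label if label is not None else ml_classifier(text)


def placeholder_ml_classifier(text: str) -> str:
    """Stand-in for a trained model; it always answers 'neutral'."""
    return "neutral"


print(hybrid_label("The coffee was okay, but the food was terrible.", placeholder_ml_classifier))  # negative
print(hybrid_label("Loved every sip!", placeholder_ml_classifier))  # neutral
```

The second call contains no lexicon words, so the answer comes from the placeholder classifier rather than from the rules.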

Pros: Hybrid approaches can have the perks of both rule-based and machine learning methods. They can have the performance benefits of lexicon-based techniques most of the time, yet surpass them in accuracy on statements whose sentiment can't be easily identified with a rule-based approach.

Cons: Naturally, these systems take the most time and effort to build.

How accurate is sentiment analysis?

The accuracy of sentiment analysis is a term used to refer to how much of a sentiment analysis system's output agrees with human evaluations. 

However, it's not as straightforward as it seems - research shows that human raters will only agree with each other between 65% and 80% of the time. Effectively, you can paraphrase this to say:

The sentiment analysis accuracy of the human brain is between 65% and 80%. 

Sentiment is often subjective, which makes it hard to measure accuracy. On average, researchers agree that a sentiment analysis system needs to be at least 50% accurate to be considered effective; an accuracy of over 65% is considered good, even if it may not sound impressive. Awario, with its sentiment analysis accuracy of just over 70%, is doing nearly as well as humans. 

There's also another reason why accuracy isn't always the ultimate way to measure how good an algorithm is. Here's a great example of when it's not (unrelated to sentiment analysis):

Would you believe someone who claimed to create a model entirely in their head to identify terrorists trying to board flights with greater than 99% accuracy? Well, here is the model: simply label every single person flying from a US airport as not a terrorist. Given the 800 million average passengers on US flights per year and the 19 (confirmed) terrorists who boarded US flights from 2000–2017, this model achieves an astounding accuracy of 99.9999999%! That might sound impressive, but I have a suspicion the US Department of Homeland Security will not be calling anytime soon to buy this model. While this solution has nearly-perfect accuracy, this problem is one in which accuracy is clearly not an adequate metric!
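
To see where that number comes from, here is a quick Python check of the arithmetic in the quote. It assumes the 19 cases are spread over 18 years of flights (2000 through 2017) at 800 million passengers per year; the function name is made up for this example.

```python
def always_negative_accuracy(total_cases: int, actual_positives: int) -> float:
    """Accuracy of a model that labels every single case as negative.

    Such a model is right about every true negative and wrong about
    every actual positive.
    """
    return (total_cases - actual_positives) / total_cases


passengers_per_year = 800_000_000
years = 18  # 2000 through 2017 inclusive (assumption)
terrorists = 19

accuracy = always_negative_accuracy(passengers_per_year * years, terrorists)
print(f"{accuracy:.7%}")  # 99.9999999%
```

The model scores almost perfectly on accuracy, yet it catches 0 of the 19 people it was supposed to find.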

Two other metrics that tell researchers how good their algorithm is are precision and recall.

Precision is the percentage of instances correctly identified as X by the system among all instances identified as X by the system. 

Recall, on the other hand, is the ratio of the number of instances of X correctly identified by the system to all instances of X in the dataset. 

Source: Wikipedia

For instance, imagine we have a dataset of 10 statements: 7 of those are labeled by human experts as positive, and 3 as negative. A sentiment analysis system identifies 5 statements as positive. Of these 5 instances, only 3 are actually positive (as evaluated by human experts). The system's precision is 3/5 while its recall is 3/7.
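
The same example in Python, as a minimal sketch. The function name and parameter names are assumptions, and fractions are used only so that the output matches the numbers in the text exactly.

```python
from fractions import Fraction
from typing import Tuple


def precision_recall(
    true_positives: int, predicted_positives: int, actual_positives: int
) -> Tuple[Fraction, Fraction]:
    """Compute precision and recall for one class (here: 'positive').

    true_positives: statements the system labeled positive that really are
    predicted_positives: all statements the system labeled positive
    actual_positives: all statements human experts labeled positive
    """
    precision = Fraction(true_positives, predicted_positives)
    recall = Fraction(true_positives, actual_positives)
    return precision, recall


# 7 positive statements in the dataset, 5 flagged by the system, 3 of them correct.
precision, recall = precision_recall(3, 5, 7)
print(precision, recall)  # 3/5 3/7
```

In decimal terms, that is a precision of 0.6 and a recall of roughly 0.43.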

What are the caveats of sentiment analysis?

We've seen that sentiment analysis is tricky even for humans, let alone machines - but why? Here are the biggest challenges sentiment analysis systems face.

1. Sarcasm

Source: Giphy

Any kind of natural language processing is tricky for short documents, and sentiment analysis is no exception. Social media posts are generally shorter than other kinds of web content, such as news articles, which means there's often little context to work with. This is particularly important for statements that are sarcastic or ironic. 

In many cases, sarcasm is pretty obvious to people, but extremely tricky to detect for machines. Here's an example:

It only took me 5 minutes to get a coffee at Starbucks. Great start of the day!

It only took me 30 minutes to get a coffee at Starbucks. Great start of the day!

2. Negations

Negations are a linguistic means of reversing the meaning of words, phrases, and even entire sentences. For the purposes of sentiment analysis, it is important not only to identify negation, but also to figure out which words are affected by it so that the system can reverse their sentiment.

Like sarcasm, negation is pretty easy to interpret for humans, but it can be quite challenging for computers.

I wouldn't say the coffee was particularly good.

To deal with negation, sentiment classification algorithms will often reverse the sentiment of all words starting with the negation word and up to the next punctuation mark. However, this approach can sometimes fail, as you can see in the example below.

I was hoping the coffee would be great. It wasn't.
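
Here is a rough Python sketch of that punctuation-bounded negation rule, run on both coffee examples from this section. The tiny lexicon, the list of negation words, and the tokenization are assumptions made for illustration; the regular expression only handles straight apostrophes.

```python
import re

# Tiny illustrative lexicon and negation list (assumptions).
LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}
NEGATIONS = {"not", "no", "never"}  # tokens ending in "n't" are handled below
PUNCTUATION = set(".,!?;:")


def score_with_negation(text: str) -> float:
    """Average lexicon scores, flipping the sign of any scored word that
    appears between a negation word and the next punctuation mark."""
    tokens = re.findall(r"[a-z']+|[.,!?;:]", text.lower())
    negated = False
    scores = []
    for token in tokens:
        if token in PUNCTUATION:
            negated = False  # a punctuation mark ends the negation scope
        elif token in NEGATIONS or token.endswith("n't"):
            negated = True
        elif token in LEXICON:
            value = LEXICON[token]
            scores.append(-value if negated else value)
    return sum(scores) / len(scores) if scores else 0.0


print(score_with_negation("I wouldn't say the coffee was particularly good."))  # -0.5
print(score_with_negation("I was hoping the coffee would be great. It wasn't."))  # 1.0
```

The first sentence is handled correctly, because "good" falls inside the scope of "wouldn't". The second one gets a fully positive score: "great" sits in a different sentence from "wasn't", so the rule never flips it.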

3. Ambiguity

Emotional words, such as love and hate, are easy for both humans and machines to interpret. However, some words can be negative in one context, and neutral or positive in another, such as in the example below.

I tend to drink ice cold coffee in the summer.

When I finally got my coffee, it was ice cold.

4. Multipolarity

Oftentimes, the text you're analyzing will express several emotions at once. When you're presented with text that expresses one emotion towards one object or topic, and a different emotion towards another one, you are dealing with multipolarity.

Starbucks coffee is much better than Dunkin'.

(Shameless plug: if you're wondering which really is better, head to our social listening analysis of the two brands.)

In this case, a basic opinion mining system might conclude the sentiment of the statement is positive, based off of the phrase much better. However, if the brand you're monitoring is Dunkin', I bet you wouldn't agree. To deal with multipolarity, an approach called aspect-based sentiment analysis is used.

I hope this guide provided a solid introduction to sentiment analysis, its uses, and its challenges. Still have questions about sentiment? Please let me know in the comments below.
