Introduction to sentiment analysis in NLP

Updated Aug 17, 2023 • 11 min read

Sentiment analysis plays an important role in natural language processing (NLP). It is the confluence of human emotional understanding and machine learning technology.

Sentiment analysis in NLP can be implemented to achieve varying results, depending on whether you opt for classical approaches or more complex end-to-end solutions.

Sentiment analysis is a classification task in the area of natural language processing. Sometimes called ‘opinion mining,’ sentiment analysis models transform the opinions found in written language or speech data into actionable insights. For many developers new to machine learning, it is one of the first tasks that they try to solve in the area of NLP. This is because it is conceptually simple and useful, and classical and deep learning solutions already exist.

While this article provides only a surface-level exploration of the possibilities of sentiment analysis in NLP, you should come away with an idea of the problems that it can solve, the different types of analysis that are possible, and which Python libraries and NLP methods can be used to implement it.

What is sentiment analysis?

Customers are driven by emotion when making purchasing decisions: by some estimates, as much as 95% of each decision is dictated by subconscious, emotional reactions. What’s more, with the increased use of social media, customers are more open about their thoughts and feelings when communicating with the businesses they interact with. A sentiment analysis model gives a business a tool to analyze sentiment, interpret it, and learn from these emotion-heavy interactions.

Sentiment analysis involves determining whether the author or speaker’s feelings are positive, neutral, or negative about a given topic. For instance, suppose you would like to gain a deeper insight into customer sentiment, so you begin looking at customer feedback under purchased products or comments under your company’s posts on a social media platform. You want to know whether each customer is pleased with your services, neutral, or has complaints; in other words, whether the customer has a positive, neutral, or negative sentiment regarding your products, services, or actions. Figuring this out is called sentiment analysis.

Manually searching through and reading these reviews might seem easier, but when you have thousands of comments, posts, emails, and reviews, as well as physical mail, it is impractical to analyze each item individually. The only way to get a representative analysis of customer sentiment at that scale is to do it automatically. That’s where sentiment analysis as an NLP technique comes in!

Before we jump into the details of how to do sentiment analysis, let’s discuss the different types of sentiment analysis. Four common groups for sentiment analysis are:

Graded sentiment analysis
Aspect-based sentiment analysis
Emotion detection
Intent analysis

Graded sentiment analysis (or fine-grained analysis) does not simply sort content into positive, neutral, or negative. Instead, each piece of content is assigned a grade on a given scale, which allows for a much more nuanced analysis. For example, on a scale of 1-10, 1 could mean very negative and 10 very positive. Rather than just three possible answers, sentiment analysis now gives us 10. The scale and its range are determined by the team carrying out the analysis, depending on the level of detail and insight they need.
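As a rough illustration of grading, the sketch below maps a continuous sentiment score onto a 1-10 scale. It assumes a model that outputs a score between -1 (very negative) and 1 (very positive); that score range, the function name `to_grade`, and the equal-width bins are all hypothetical choices made for this example, not part of any particular library.

```python
def to_grade(score, levels=10):
    """Map a sentiment score in [-1, 1] to an integer grade in 1..levels."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and 1")
    # Rescale the score to [0, 1], then split that range into equal-width bins.
    position = (score + 1.0) / 2.0
    # The top edge (score == 1) would fall into an extra bin, so clamp it.
    return min(levels, 1 + int(position * levels))


for score in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(score, to_grade(score))
```

A team that needs less detail could call `to_grade(score, levels=5)` to get a five-point scale from the same score.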

Aspect-based sentiment analysis focuses on opinions about a particular aspect of the products or services that your business offers. The general attitude is not useful here, so a different approach must be taken. For example, suppose you produce smartphones and your new model has an improved lens. You would like to know how users are responding to the new lens, so you need a fast, accurate way of analyzing comments about this feature.

Emotion detection assigns emotion labels rather than a position on a negative-to-positive scale. It leaves more room for interpretation and accounts for more complex customer responses. For instance, even though sadness and anger are both negative emotions, they have different connotations, so distinguishing between them can give more precise information about customers’ interactions with your products and better insight into areas for improvement.

Intent analysis focuses on what the person intends to do, for example, whether they are going to buy your company’s next product or not. This can be helpful in separating a positive reaction on social media from leads that are actually promising.

Some types of sentiment analysis overlap with other broad machine learning topics. Emotion detection, for instance, isn’t limited to natural language processing; it can also include computer vision, as well as audio and data processing from other Internet of Things (IoT) sensors.

What can you use sentiment analysis for?

So far, we have covered just a few examples of sentiment analysis usage in business. To quickly recap, you can use it to examine whether your customers’ feedback in online reviews about your products or services is positive, negative, or neutral. You can also rate this feedback using a grading system, investigate customers’ opinions about particular aspects of your products or services, and infer their intentions or emotions.

You might be asking yourself: if I already have a great tool for collecting feedback from my customers, is there space for me to take advantage of sentiment analysis?

The answer is yes! You can still use sentiment analysis in ways that will enhance your understanding of customer responses and give you a competitive advantage. For example:

Monitoring people’s attitude to your brand - this is more general than user feedback about a particular product or service, and gives an overview of how your brand is perceived. Sentiment analysis can also show you how these attitudes change over time.
Customer service - find out if your clients are satisfied with the customer experience and service they have received, by classifying and grading their recorded conversations with your helpdesk.
Employee satisfaction - why should sentiment analysis be restricted to customers? You can also investigate how your employees view their work by analyzing their comments.
Social media monitoring - improve your marketing strategy and future product development by following and analyzing what is popular online and why.
Market research - see how people speak about your competitors, and identify those that perform better than you. Then, to give yourself a key advantage, analyze why they prove more popular and use this information to inform your marketing campaigns, product development, and customer service plans.

NLP methods for sentiment analysis

Because the range of methods and applications for sentiment analysis is so broad, this introduction focuses on the most common form of sentiment analysis system: assigning a positive, neutral, or negative sentiment to text data. The text can be drawn from any source, such as social media comments, emails, online conversations, or even automatically generated transcriptions of phone conversations.

There are two main approaches to performing this kind of sentiment analysis: classical and deep learning. Both are available in Python.

In classical methods, we define the features ourselves and then train a model on them. The features can be defined by:

using a dictionary of manually defined keywords,
creating a ‘bag of words’,
using the TF-IDF strategy.

Using a dictionary of manually defined keywords is based on the assumption that we know which words are typically associated with positive and negative emotions. For example, if we are going to classify movie reviews, we expect to find words such as “great”, “super”, and “love” in positive comments and words like “hate”, “bad”, and “awful” in negative comments. We can count the number of occurrences of every selected word in a comment to define its feature vector. Then, we can train a sentiment analysis classifier on the feature vectors of labeled comments.
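The keyword-counting idea can be sketched in a few lines of plain Python. This is a minimal illustration that assumes the six example words above make up the whole dictionary; the helper names `tokenize` and `keyword_features` are hypothetical, and the simple regular expression stands in for real pre-processing.

```python
import re
from collections import Counter

# Hand-picked sentiment keywords (the examples from the text above).
POSITIVE = ["great", "super", "love"]
NEGATIVE = ["hate", "bad", "awful"]
KEYWORDS = POSITIVE + NEGATIVE


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_features(text):
    """Return one count per keyword, in the fixed order of KEYWORDS."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in KEYWORDS]


print(keyword_features("Great cast, great story. I love it!"))
```

Each comment becomes a vector with one entry per keyword, so every comment has the same length regardless of how long the original text was.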

This approach restricts you to manually defined words, and it is unlikely that every possible word for each sentiment will be thought of and added to the dictionary. This is where a bag of words comes in. Instead of counting only words selected by domain experts, we can count the occurrences of every word in our language (or, more practically, every word that occurs at least once in our data). This makes our vectors much longer, but we can be sure that we will not miss any word in our data that is important for predicting sentiment.
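A bag of words can be sketched the same way, with the vocabulary built from the data rather than chosen by hand. The three sample documents and the helper names `build_vocabulary` and `bag_of_words` are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Collect every distinct word that occurs at least once in the data."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


docs = ["I love this film", "I hate this film", "Great film, great cast"]
vocabulary = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

print(vocabulary)
for vector in vectors:
    print(vector)
```

In practice, a library class such as Scikit-Learn's `CountVectorizer` does the same job and stores the mostly zero vectors in a sparse format.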

We can think about TF-IDF (term frequency-inverse document frequency) as a modified version of the bag of words. Instead of treating every word equally, we normalize the number of occurrences of a word in a document (comment, review, etc.) by the number of words in that document, and then down-weight words that occur in many documents across the whole data set. This means that our model will be less sensitive to occurrences of common words like “and”, “or”, “the”, “opinion” etc., and will focus on the words that are valuable for analysis.
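To make the weighting concrete, here is a minimal sketch of one textbook TF-IDF variant: term frequency is the word count divided by the document length, and inverse document frequency is the logarithm of the number of documents divided by the number of documents containing the word. The function name `tf_idf` and the sample documents are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {word: weight} dictionary per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # tf = share of the document taken up by the word;
            # idf = log(N / df), which is 0 for words found in every document.
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


docs = ["the film is great", "the film is awful", "the cast is great"]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

Words that appear in every document, such as “the” and “is” here, receive a weight of zero, while a word that appears in only one document receives the highest weight. Library implementations such as Scikit-Learn's `TfidfVectorizer` use smoothed and normalized variants of this formula, so their numbers will differ.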

No matter how you prepare your feature vectors, the second step is choosing a model to make predictions. The range of models is wide: support vector machines (SVMs), decision trees, random forests, and simple neural networks are all viable options. Different models work better in different cases, and a full investigation into the potential of each is very valuable, but it is beyond the scope of this article.

You can create feature vectors and train sentiment analysis models using the Python library Scikit-Learn. There are also other libraries like NLTK, which is very useful for pre-processing data (for example, removing stopwords) and also has its own ready-made sentiment analyzer.
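The two steps, building feature vectors and training a model, might look like this with Scikit-Learn. This is only a sketch: the six training reviews and their labels are invented for illustration, and a real model would need far more data, a held-out test set, and a neutral class.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Tiny hand-written training set (an assumption made for this example).
train_texts = [
    "I love this movie, it is great",
    "super film with great acting",
    "great story and I love the cast",
    "I hate this movie, it is awful",
    "bad film with awful acting",
    "awful story and I hate the cast",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

# Step 1: TF-IDF feature vectors. Step 2: a linear SVM classifier.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
model.fit(train_texts, train_labels)

predictions = model.predict(["great acting, I love it", "awful acting, I hate it"])
print(list(predictions))
```

Wrapping both steps in one pipeline means new text is vectorized with the vocabulary learned during training. Swapping `SVC` for another Scikit-Learn classifier, such as `RandomForestClassifier`, only changes the second argument.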

Sentiment analysis using transformers

Currently, transformers and other deep learning models seem to dominate the world of natural language processing.

In contrast to classical methods, sentiment analysis with transformers means you don’t have to define features manually, as is the case with all deep learning models. You just need to tokenize the text data and process it with the transformer model. Hugging Face’s Transformers is an easy-to-use Python library that provides a lot of pre-trained transformer models and their tokenizers.
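As a hedged sketch of what this looks like in practice, the code below wraps the `pipeline` helper from Hugging Face's `transformers` package, which handles tokenization and inference in one call. It assumes the package and a backend such as PyTorch are installed, and that a default pre-trained model can be downloaded on first use. The helpers `classify` and `to_three_way`, and the 0.75 confidence threshold used to derive a "neutral" label, are hypothetical additions for this example.

```python
def classify(texts):
    """Run a pre-trained sentiment pipeline on a list of strings."""
    # Imported lazily because loading the pipeline downloads a model
    # the first time it is called.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    # Each result is a dictionary with a "label" and a "score" key.
    return classifier(texts)


def to_three_way(result, min_confidence=0.75):
    """Turn one pipeline result into positive / neutral / negative.

    Treating low-confidence predictions as neutral is an assumption
    made for illustration, not a feature of the library.
    """
    if result["score"] < min_confidence:
        return "neutral"
    return result["label"].lower()
```

Calling `classify(["The new lens is fantastic"])` returns one dictionary per input text, which `to_three_way` can then reduce to a single label. A model trained with an explicit neutral class would make the threshold unnecessary.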

If you prefer to create your own model or to customize those provided by Hugging Face, PyTorch and TensorFlow are libraries commonly used for writing neural networks.

Getting started with sentiment analysis in NLP

Sentiment analysis is easy to implement using Python, because a variety of suitable methods and libraries are available. It remains an interesting and valuable way of analyzing textual data for businesses of all kinds, and provides a good entry point for developers getting started with natural language processing. Its value for businesses reflects the importance of emotion across all industries: customers are driven by feelings and respond best to businesses that understand them.

This is only the tip of the iceberg for sentiment analysis.