Natural Language Processing: What is Text Polarity?

Natural Language Processing (NLP) and its applications will be huge in the 2020s. A lot of my blogging is about text processing and the techniques that go with it, such as Named Entity Recognition and Part of Speech Tagging. Text polarity is a basic text processing technique that gives us insight into how positive or negative a text is. The polarity of a text is essentially its "sentiment" rating on a scale from -1 to 1.

Overview of Text Polarity

In this post we’ll cover:

  • What is Text Polarity?
  • How to Get Text Polarity with spaCy
  • How to Get Text Polarity with NLTK
  • How to Get Text Polarity with a web API
  • Why are these Text Polarity Numbers so Different?

What is Text Polarity?

In short, text polarity is a measure of how negative or how positive a piece of text is. Polarity is the measure of the overall combination of the positive and negative emotions in a sentence. It's notoriously hard for computers to predict this; in fact, it's hard even for people to predict this over text. Check out the following Key and Peele video for an example of what I mean.

Most of the time, NLP models can label simple positive or negative words and phrases quite well. For example, the words "amazing", "superb", and "wonderful" can easily be labeled as highly positive. The words "bad", "sad", and "mad" can easily be labeled as negative. However, we can't judge polarity purely at the level of individual words; it's important to take the larger context into account when evaluating total polarity. For example, the word "bad" may be negative, but what about the phrase "not bad"? Is that neutral? Or is that the opposite of bad? At this point we're getting into linguistics and semantics rather than natural language processing.
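To make the negation problem concrete, here's a toy lexicon-based scorer. This is purely illustrative (it is not how spaCy, NLTK, or The Text API actually work, and the lexicon values are made up); it just shows why a model has to look past individual words. It flips and dampens a word's polarity when "not" precedes it, which is roughly how many lexicon-based tools handle simple negation:

```python
# Toy lexicon with made-up polarity values -- illustration only.
LEXICON = {"amazing": 0.9, "superb": 0.9, "wonderful": 0.8,
           "bad": -0.7, "sad": -0.6, "mad": -0.6}

def toy_polarity(text):
    words = text.lower().split()
    scores = []
    for i, word in enumerate(words):
        if word in LEXICON:
            score = LEXICON[word]
            # Naive negation handling: flip and dampen the score,
            # so "not bad" comes out mildly positive instead of negative.
            if i > 0 and words[i - 1] == "not":
                score *= -0.5
            scores.append(score)
    # Average over the sentiment-bearing words; 0.0 if there are none.
    return sum(scores) / len(scores) if scores else 0.0

print(toy_polarity("bad"))      # -0.7
print(toy_polarity("not bad"))  # 0.35
```

A word-by-word model would call "not bad" negative; even this two-word window gets it closer to how a person reads it.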

Due to the nature of language and how words around each other can modify their meaning and polarity, when I personally implemented text polarity for The Text API, I used a combination of total text polarity and the polarity of individual phrases in it. The two biggest open source libraries for NLP in Python are spaCy and NLTK, and both of these libraries measure polarity on a normalized scale of -1 to 1. The Text API measures, combines, and normalizes values on both the polarity of the overall text, individual sentences, and individual phrases. This returns a better picture of the relative polarities of texts by not penalizing longer sentences that are expressing positive or negative emotion at scale but also contain neutral phrases. Let’s take a look at how we can implement text polarity with the libraries and API I mentioned above!
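As an illustrative sketch only (this is not The Text API's actual algorithm, and the `neutral_band` threshold is an assumption I made up for the example), here's one way combining sentence-level polarities can avoid penalizing long texts that contain neutral sentences alongside strongly positive or negative ones:

```python
# Illustrative sketch -- NOT The Text API's real implementation.
# Idea: average only the non-neutral sentence polarities, so a long
# text full of neutral filler isn't dragged toward zero.
def combined_polarity(sentence_polarities, neutral_band=0.1):
    # Keep only sentences that express some sentiment
    opinionated = [p for p in sentence_polarities if abs(p) > neutral_band]
    if not opinionated:
        return 0.0
    return sum(opinionated) / len(opinionated)

# Three positive sentences plus two neutral ones:
print(combined_polarity([0.8, 0.6, 0.7, 0.0, 0.05]))  # ~0.7, not diluted
```

A plain average of all five values would be about 0.43, so the text would look much less positive simply because it also contains neutral sentences.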

How to Get Text Polarity with spaCy

To get started with spaCy we’ll need to download two spaCy libraries with pip in our terminal as shown below:

pip install spacy spacytextblob

We’ll also need to download a model. As usual we’ll download the `en_core_web_sm` model to get started. Run the below command in the terminal after the pip installs are finished:

python -m spacy download en_core_web_sm

Now that we've downloaded our libraries and model, let's get started with our code. We'll need to import `spacy` and `SpacyTextBlob` from `spacytextblob.spacytextblob`. SpacyTextBlob is the pipeline component we'll be using to get polarity. We'll start our program by loading the model we downloaded earlier and then adding the `spacytextblob` pipe to the `nlp` pipeline. Notice that we never explicitly call the `SpacyTextBlob` class, but rather pass it in as a string to `nlp.add_pipe`. If you're using VSCode, `SpacyTextBlob` will be grayed out like it's not being used, but don't be fooled: we need this import to register the pipeline component even though we don't call it directly.

Next we'll choose a text to process. For this example, I simply wrote two decently positive sentences about The Text API, which we'll cover in more detail later. Then all we have to do is send the text through our `nlp` object to get a document and check its polarity score.

import spacy
from spacytextblob.spacytextblob import SpacyTextBlob  # needed to register the pipe
 
# Load the small English model and add the spacytextblob component
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe('spacytextblob')

text = "The Text API is super easy to use and super useful for anyone who needs to do text processing. It's the best Text Processing web API and allows you to do amazing NLP without having to download or manage any models."
doc = nlp(text)
 
# Polarity is exposed as a custom extension attribute on the Doc
print(doc._.polarity)

Our spaCy model predicted our text’s polarity score at 0.5. It’s hard to really judge how “accurate” the polarity of something is, so we’ll go through the other two methods and I’ll comment on this later.

Text Polarity from spaCy

How to Get Text Polarity with NLTK

Now that we’ve covered how to get polarity via spaCy, let’s check out how to get polarity with the Natural Language Toolkit. As always, we’ll start out by installing the library and dependencies we’ll need.

pip install nltk

Once we install NLTK, we'll fire up an interactive Python shell in the command line to download the NLTK resources that we need with the commands below.

python
>>> import nltk
>>> nltk.download(["averaged_perceptron_tagger", "punkt", "vader_lexicon"])

Averaged Perceptron Tagger handles part of speech tagging. It's the best tagger in the NLTK library at the time of writing, so you'll probably use it for other things as well as polarity. Punkt is NLTK's pre-trained sentence tokenizer, used to split text into sentences. I know what you're thinking:

Vader

But no, the VADER lexicon actually stands for "Valence Aware Dictionary and sEntiment Reasoner". It provides the sentiment analysis tool we need. Once we have all of this installed, it's pretty simple to use. We import the `SentimentIntensityAnalyzer` class, instantiate it, and call its `polarity_scores` method on our text.

from nltk.sentiment import SentimentIntensityAnalyzer
 
# VADER-based sentiment analyzer
sia = SentimentIntensityAnalyzer()
text = "The Text API is super easy to use and super useful for anyone who needs to do text processing. It's the best Text Processing web API and allows you to do amazing NLP without having to download or manage any models."
# Returns a dict of negative, neutral, positive, and compound scores
scores = sia.polarity_scores(text)
print(scores)

We should get a print out like the one below.

NLTK Text Polarity

This result tells us that none of the text is negative, 61.8% is neutral, and 38.2% of it is positive. Compound is a normalized sentiment score whose calculation you can see in the VADER package on GitHub: it's derived from the raw sum of word valences rather than from the negative, neutral, and positive proportions, and it represents the overall polarity of the text on a -1 to 1 scale. So NLTK has scored our text as very positive.
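The compound score's normalization is simple enough to sketch. This mirrors the `normalize()` function in the VADER source (with its default `alpha` of 15): the raw sum of word valences is squashed into the open range (-1, 1).

```python
import math

# VADER-style normalization: squash an unbounded valence sum into (-1, 1).
# alpha=15 is the default smoothing constant in the VADER source.
def normalize(raw_valence_sum, alpha=15):
    return raw_valence_sum / math.sqrt(raw_valence_sum ** 2 + alpha)

print(normalize(0))    # 0.0 -- no net sentiment
print(normalize(5))    # strongly positive, just under 0.8
print(normalize(-5))   # the mirror image, strongly negative
```

This is why compound scores cluster well inside -1 and 1: even a very large valence sum only approaches the endpoints asymptotically.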

How to Get Text Polarity with The Text API

Finally, let's take a look at how to get a text polarity score from The Text API. A major advantage of using a web API like The Text API to do text processing is that you don't need to download any machine learning libraries or maintain any models. All you need is the requests library, which you can install with the pip command below if you don't have it already, and a free API key from The Text API website.

pip install requests

When you land on The Text API’s homepage you should scroll all the way down and you’ll see a button that you can click to sign up for your free API key. 

Once you log in, your API key will be right at the top of the page. Now that we're all set up, let's dive into the code. All we're going to do is set up a request: headers that tell the server we're sending JSON and pass the API key, a body with the text we want to analyze, and the URL endpoint we're going to hit (in this case "https://app.thetextapi.com/text/text_polarity"). Then we send the request and parse the response.

import requests
import json
from config import apikey  # config.py holds your API key
 
text = "The Text API is super easy to use and super useful for anyone who needs to do text processing. It's the best Text Processing web API and allows you to do amazing NLP without having to download or manage any models."
# Tell the server we're sending JSON and pass our API key
headers = {
    "Content-Type": "application/json",
    "apikey": apikey
}
body = {
    "text": text
}
url = "https://app.thetextapi.com/text/text_polarity"
 
response = requests.post(url, headers=headers, json=body)
# Parse the polarity score out of the JSON response
polarity = json.loads(response.text)["text polarity"]
print(polarity)

Once we send off our request we’ll get a response that looks like the following:

The Text API Text Polarity

The Text API scored my praise of The Text API at roughly 0.575 polarity, which translates to something like ~79% AMAZING (if 1 is AMAZING).
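The "~79%" figure is just back-of-the-envelope arithmetic: a linear rescaling of the -1 to 1 polarity range onto 0 to 100.

```python
# Rescale a polarity score in [-1, 1] to a percentage in [0, 100]:
# -1 maps to 0%, 0 maps to 50%, and 1 maps to 100%.
def polarity_to_percent(polarity):
    return (polarity + 1) / 2 * 100

print(round(polarity_to_percent(0.575)))  # 79
```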

Why Are These Polarities So Different?

Earlier I mentioned that we’d discuss the different polarity scores at the end so here we are. We used three different methods to get the polarity of the same document of text, so why were our polarity scores so different? The obvious answer is that each method used a) a different model and b) a different way to calculate document polarity. However, there’s also another underlying factor at play here.

Remember that Key and Peele video earlier? It's hard even for people to judge the polarity of comments, even with context. Machines don't have the ability to understand context yet. Also, a range of -1 to 1 is hard to interpret without reference examples of what a 1 or a -1 actually looks like. However, all three methods at least agree that the text is quite positive in general. Of course there are ways to improve the interpretability of these results, but that will be in a coming post!


Learn More

To learn more, feel free to reach out to me @yujian_tang on Twitter, connect with me on LinkedIn, and join our Discord. Remember to follow the blog to stay updated with cool Python projects and ways to level up your Software and Python skills! If you liked this article, please Tweet it, share it on LinkedIn, or tell your friends!


Yujian Tang

I started my professional software career interning for IBM in high school after winning ACSL two years in a row. I got into AI/ML in college where I published a first author paper to IEEE Big Data. After college I worked on the AutoML infrastructure at Amazon before leaving to work in startups. I believe I create the highest quality software content so that’s what I’m doing now. Drop a comment to let me know!
