Ask NLP: The Media on the Obama Presidency Over Time

Recently we’ve used NLP to explore the media’s portrayal of Obama in two parts: one based on the most common phrases used in headlines about him, and one based on an AI summary of those headlines. We also explored the who/what/when/where of the article headlines we got when we pulled the Obama-related headlines from the NY Times. In this post, we’ll look at the sentiment surrounding his presidency over time.

To follow along with this tutorial you’ll need API keys from the NY Times and The Text API. You’ll also need to use your package manager to install the `requests` and `matplotlib` modules. You can install them by running the line below in your terminal.

pip install requests matplotlib

Setting Up the API Request

We’ve been here many times before. Every program starts with the imports. As with many of our prior programs, we’ll use the `requests` and `json` libraries for the API request and parsing, and `matplotlib.pyplot` to plot the sentiment over time. Once again, I’m using the `sys` library purely because I stored my API keys in a parent directory and we need access to them for this project. I also import the base URL from the config; this is the API endpoint, “https://app.thetextapi.com/text/”.

# import libraries
import requests
import json
import matplotlib.pyplot as plt
import sys
sys.path.append("../..")
from nyt.config import thetextapikey, _url
 
# set up request headers and URL
headers = {
    "Content-Type": "application/json",
    "apikey": thetextapikey
}
polarity_by_sentence_url = _url + "polarity_by_sentence"

Getting the Sentiments for Each Headline in Each Year

Everything’s set up, so let’s get the actual sentiments. Just like in the last post about running Named Entity Recognition, we’re going to set up a function that loops through all the years. Alternatively, you could set up a function that handles only one year and then call that function in a loop over the years. We’ll take that second approach for the plotting function later on.

In each iteration, we’ll open the `txt` file we downloaded when we got the Obama headlines and read it into a list. Then we’ll join the list into a single string to send to the endpoint. This request body has no extra parameters to adjust, so we’ll just send in the text. After we send in the text, we’ll parse the response.
The response contains a list of lists under the `polarity by sentence` key. To save it to a `txt` file, we’ll loop through each element in that list and write the second element, followed by a colon, followed by the first. Why the second element and then the first? As outlined in the documentation, each entry is returned as the polarity and then the sentence. To make our document more readable, we want to put the sentence first.

# loop through each year
def get_polarities():
    for i in range(2008, 2018):
        with open(f"obama_{i}.txt", "r") as f:
            headlines = f.readlines()
        # combine list of headlines into one text
        text = "".join(headlines)
       
        # set up request body
        body = {
            "text": text
        }
        # send the request and parse the response
        response = requests.post(url=polarity_by_sentence_url, headers=headers, json=body)
        _dict = json.loads(response.text)
        # save to text file (the obama/ directory must already exist)
        with open(f"obama/{i}_sentence_polarities.txt", "w") as f:
            for entry in _dict["polarity by sentence"]:
                # each entry is [polarity, sentence]; write one "sentence : polarity" per line
                f.write(f"{entry[1].strip()} : {entry[0]}\n")
 
get_polarities()
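
To make the reordering step easier to see on its own, here is a small sketch that formats entries without calling the API. The helper name `format_polarity_lines` and the sample entries are invented for illustration and are not real API output; the sketch only assumes each entry is a `[polarity, sentence]` pair as described above.

```python
def format_polarity_lines(pairs):
    """Turn [polarity, sentence] pairs into 'sentence : polarity' lines."""
    lines = []
    for polarity, sentence in pairs:
        # put the sentence first so the saved file is easier to read
        lines.append(f"{sentence.strip()} : {polarity}\n")
    return lines


# made-up sample entries, in the same order the response uses
sample = [[0.5, "Obama Wins Election"], [-0.25, "Critics Question Plan"]]
for line in format_polarity_lines(sample):
    print(line, end="")
```

Each formatted line ends with a newline, so reading the file back with `readlines()` gives one headline per list item.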

Plotting the Sentiments For Each Year

Now that we’ve gotten all the polarity values, we’re ready to plot them. As I said above, this function takes one parameter, the `year`, and handles a single year at a time. We will open that year’s file, read the entries in as a list, and then iterate through them to get their polarity values. Notice that I wrap the splitting of each entry in a `try/except` block. This is just in case the file we wrote earlier contains any errors or anomalies from the original data.

As we loop through, we’ll add each polarity value to a list. After looping through all of the titles and their sentiments, we’ll create a second list of the same length as the list of sentiment values. This one will contain the values from 0 up to one less than the number of headlines we processed, and serves as the x-axis. We run this function on each year from 2008 to 2017, and the plots are below.

# plot each datapoint
def plot_polarities(year):
    with open(f"obama/{year}_sentence_polarities.txt", "r") as f:
        entries = f.readlines()
    ys = []
    for entry in entries:
        try:
            # split on the last " : " in case the headline itself contains one
            _entry = entry.rsplit(" : ", 1)
            ys.append(float(_entry[1]))
        except (IndexError, ValueError):
            # skip lines that are malformed or have no numeric polarity
            continue
    xs = list(range(len(ys)))
    plt.plot(xs, ys)
    plt.title(f"Obama Sentiments, {year}")
    plt.xlabel("Headline Number")
    plt.ylabel("Average Polarity")
    plt.show()
   
# plot each year
for year in range(2008, 2018):
    plot_polarities(year)
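
If you also want a single number per year to go with each plot, the same parsing can feed a simple average. This is a minimal sketch and not part of the original program; the helper names `parse_polarities` and `average_polarity` are made up here, and it assumes the saved lines look like `sentence : polarity`.

```python
def parse_polarities(lines):
    """Pull the polarity value out of each 'sentence : polarity' line."""
    values = []
    for line in lines:
        # split on the last " : " so colons inside the headline are left alone
        parts = line.rsplit(" : ", 1)
        if len(parts) != 2:
            continue
        try:
            values.append(float(parts[1]))
        except ValueError:
            # skip lines whose last field is not a number
            continue
    return values


def average_polarity(lines):
    """Return the mean polarity, or None if no line could be parsed."""
    values = parse_polarities(lines)
    return sum(values) / len(values) if values else None
```

You could call `average_polarity(f.readlines())` on each year’s file and compare the results across 2008 to 2017.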

Sentiment of Each Obama Headline from 2008 to 2017
