Find the Most Common Named Entities by Type

Natural Language Processing (NLP) techniques have come a long way in the last decade, and the field will become even more important in the coming decade as the volume of unstructured text data grows. One of the basic applications of NLP is Named Entity Recognition (NER). However, NER by itself just gives us the names of entities…
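As a rough illustration of what "most common named entities by type" could look like, here is a minimal sketch. It assumes the NER step has already produced a list of `(text, label)` pairs; the `sample_entities` list and the `most_common_by_type` helper are hypothetical and made up for this example, not output from any particular library or the post's actual code.

```python
from collections import Counter, defaultdict


def most_common_by_type(entities, top_n=3):
    """Group (text, label) pairs by label and return the top_n texts per label."""
    counts = defaultdict(Counter)
    for text, label in entities:
        counts[label][text] += 1
    # Counter.most_common returns (text, count) pairs sorted by count, descending
    return {label: counter.most_common(top_n) for label, counter in counts.items()}


# Hypothetical NER output: (entity text, entity type) pairs
sample_entities = [
    ("Obama", "PERSON"), ("Starbucks", "ORG"), ("Obama", "PERSON"),
    ("Twitter", "ORG"), ("Starbucks", "ORG"), ("Biden", "PERSON"),
]

result = most_common_by_type(sample_entities, top_n=2)
print(result)
```

Keeping one `Counter` per entity type means each type gets its own ranking, instead of a single ranking dominated by whichever type appears most often.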

Create Your Own AI Content Moderator – Part 3

As content on the web increases, content moderation becomes more and more important to protect sensitive groups such as children and people who have suffered from trauma. We’re going to learn how to create our own AI content moderator using Python, Selenium, Beautiful Soup 4, and The Text API. Our AI content moderator will be…

Ask NLP: The Media on the Obama Presidency Over Time

Recently we’ve used NLP to explore the media’s portrayal of Obama in two parts: one based on the most common phrases used in headlines about him, and one based on an AI summary of those headlines. We also explored the who/what/when/where of the article headlines that we got when we pulled the Obama…

Twitter Sentiment for Stocks? Starbucks 11/29/21

Updated 6:19pm PST 11/29/2021 – Our sentiment prediction was right! The next step is to predict how much it’ll go up. Recently I’ve been playing around a lot with sentiment analysis on Tweets. I discovered the Twitter API over the Thanksgiving holidays and it’s like Christmas came early. Sort of like how Christmas comes earlier to…

What is Lemmatization and How can I do It?

Lemmatization is an important part of Natural Language Processing. Other NLP topics we’ve covered include Text Polarity, Named Entity Recognition, and Summarization. Lemmatization is the process of turning a word into its lemma, the “canonical form” of the word. A lemma is usually the dictionary version of a word; it’s picked by…
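To make the idea of a lemma concrete, here is a toy sketch that maps a few inflected words to their dictionary forms with a hand-written lookup table. The `LEMMA_TABLE` dictionary and the `lemmatize` function are assumptions made up for illustration; real lemmatizers rely on much larger vocabularies and on part-of-speech information.

```python
# Hypothetical, hand-written lookup table: inflected form -> lemma
LEMMA_TABLE = {
    "running": "run",
    "ran": "run",
    "mice": "mouse",
    "better": "good",
    "was": "be",
}


def lemmatize(word):
    """Return the lemma for a word, falling back to the lowercased word itself."""
    key = word.lower()
    return LEMMA_TABLE.get(key, key)


lemmas = [lemmatize(w) for w in "The mice ran".split()]
print(lemmas)  # ['the', 'mouse', 'run']
```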

The Best Way to do Named Entity Recognition (NER)

Named Entity Recognition (NER) is a common Natural Language Processing technique. It’s used so often that it comes in the basic pipeline for spaCy. NER can help us quickly parse a document for named entities of many different types. For example, if we’re reading an article, we can use named entity recognition…

Natural Language Processing: Part of Speech Tagging

Part of Speech (POS) Tagging is an integral part of Natural Language Processing (NLP). The first step in most state-of-the-art NLP pipelines is tokenization: separating text into “tokens”. Tokens are generally regarded as individual pieces of language – words, whitespace, and punctuation. Once we tokenize our text we…
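The notion of tokens as words, whitespace, and punctuation can be sketched with a small regular expression. This is a simplified illustration using only Python's standard `re` module, not how any particular NLP library tokenizes; the `tokenize` function name is an assumption for this example.

```python
import re

# One alternative per token kind: a run of word characters, a run of
# whitespace, or a single punctuation character.
TOKEN_PATTERN = re.compile(r"\w+|\s+|[^\w\s]")


def tokenize(text):
    """Split text into word, whitespace, and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("Hello, world!")
print(tokens)  # ['Hello', ',', ' ', 'world', '!']
```

Because every character matches one of the three alternatives, joining the tokens back together reproduces the original text. A simple pattern like this also splits contractions such as "we're" at the apostrophe, which real tokenizers handle with extra rules.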