Recently, I posted an article about Natural Language Processing written by (our own!) AI. I’ll be honest, I don’t think that was a great explanation of Natural Language Processing (NLP). So here’s an article, written by a human, that will explain Natural Language Processing in plain English.
What is a “Natural Language”?
Before we can understand Natural Language Processing, we need to understand what a natural language is. There are two broad categories of language: “natural languages” and “formal languages”. A formal language is built around a particular set of rules, while natural languages arise without a formal set of rules. The difference is not just in the rules but also in the way that the language comes about.
Formal languages are generally used in computation. Some examples include Python, Java, and C++. These kinds of languages represent axiomatic states. At a high level, math is a formal language. All formal languages are created and constructed according to a set of rules.
Natural languages are not constructed. Natural languages are the languages that naturally arise from human interaction. Some examples include English, Chinese, and Spanish. Natural languages are constantly evolving, and you do not necessarily have to understand a language’s rules to use it. For example, if you’re a native English speaker I would bet that you don’t know what a gerund is, but you sure know how to use one. I’ve dropped the definition and an example in the appendix if you’re curious what a gerund is. I’ve also dropped an example of the constant evolution of natural languages, using English, in the appendix.
What is “Processing”?
Alright, so natural languages are languages that arise from human interaction and the need for communication. Now let’s look at the third part of natural language processing: what is processing? Processing is (a gerund!) the act of turning one form of data or material into another form. That’s it. It is simply a change in the form of information.
Before we move on, let’s put it all together. Now that we know what “natural language” and “processing” mean, what does natural language processing mean? It’s the act of changing information in the form of natural language into another form. The form that the information is transformed into may still be natural language, as in language generation, text summarization, or named entity recognition.
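To make “changing the form of information” concrete, here is a minimal sketch in Python that turns a piece of English text into a table of word frequencies. The function name `word_counts` and the sample sentence are made up for illustration, and the tokenization (lowercasing, then grabbing runs of letters and apostrophes) is a deliberately crude assumption, not how a real NLP library splits text.

```python
import re
from collections import Counter


def word_counts(text):
    """Turn natural-language text into a table of word frequencies."""
    # Crude tokenization (an assumption for this sketch): lowercase the text,
    # then treat every run of letters and apostrophes as one word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


counts = word_counts("The cat sat on the mat. The mat was flat.")
print(counts.most_common(2))  # [('the', 3), ('mat', 2)]
```

The input is natural language and the output is a count table, which is a different form of the same information.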
A Brief History of Natural Language Processing
Now that we’ve got an overview of what natural language processing is, let’s go over a brief history. There have been three major “ages” or “zeitgeists” of natural language processing. The first age was symbolic NLP, the second was statistical NLP, and the current one is “neural” NLP, or NLP using neural networks.
Symbolic Natural Language Processing
Symbolic Natural Language Processing was the start of NLP back in the early 1950s. It was the zeitgeist of NLP up until the 1980s. Symbolic NLP focuses on using a set of rules to formulate the underlying computer “understanding”.
The first famous NLP project, the Georgetown-IBM experiment, was produced in 1954. It translated Russian sentences into English. The creators claimed that language translation would be a solved problem within a few years. Well, it’s 2021 and we still can’t translate languages perfectly. The constant evolution of natural languages has made that problem quite hard.
The 1960s and 1970s saw systems such as SHRDLU (a program that conversed about a simple “blocks world”) and ELIZA (a psychotherapy chatbot). The 1980s saw a rise in NLP research with a focus on areas such as generative grammar, morphology, and semantics. Continued development of and research into chatbots such as Racter and Jabberwacky helped in the rise of statistical NLP.
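To give a feel for what “using a set of rules” means, here is a minimal sketch of a rule-based chatbot in Python. It is only in the spirit of ELIZA, not a reconstruction of it: the three patterns, the reply templates, and the names `RULES`, `DEFAULT_REPLY`, and `respond` are all invented for this example.

```python
import re

# Hypothetical hand-written rules: a regex pattern plus a reply template.
# The first rule that matches the message wins.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(message):
    """Return the reply for the first matching rule, else a default reply."""
    # Drop surrounding whitespace and trailing punctuation before matching.
    text = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return DEFAULT_REPLY


print(respond("I feel tired today."))   # Why do you feel tired today?
print(respond("The weather is nice."))  # Please go on.
```

Nothing here is learned from data. Every behavior has to be written by hand, which is roughly why purely symbolic systems struggle to keep up with a constantly evolving natural language.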
Statistical Natural Language Processing
After the age of symbolic NLP, the main focus of the field shifted to a statistical variant. This shift had multiple causes: not just the continued development and research of chatbots, but also the increased processing power of machines and a shift away from transformational grammar in the field of linguistics. The increased processing power of machines opened the door for the introduction of machine learning techniques to natural language processing.
Although perceptrons and simple neural networks (the current-day zeitgeist of NLP) existed at this time, they were not yet practical. Due to power and processing constraints, statistical NLP research and development was restricted to large companies such as IBM. IBM pioneered the machine translation space in the 1990s by training on large text corpora from Canada and the EU.
Starting in the mid-1990s and continuing today, a huge amount of unlabeled text has become available through the internet. This has allowed natural language processing practitioners to branch out from supervised learning techniques alone to unsupervised or semi-supervised learning techniques.
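For contrast with hand-written rules, here is a minimal sketch of the statistical idea in Python: count which word follows which in raw, unlabeled text, then use those counts to guess the next word. This is a toy bigram model on a made-up three-sentence corpus, and the names `train_bigrams` and `most_likely_next` are hypothetical. It is not a description of what IBM or anyone else actually built.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for every word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair each word with the word right after it.
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# A tiny made-up corpus; no labels are needed, only the raw text.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # cat
print(most_likely_next(model, "sat"))  # on
```

The “knowledge” comes entirely from counting, so more text generally means better guesses. That is why the flood of text on the internet mattered so much.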
Current Day Natural Language Processing
Just as increased processing power helped shift natural language processing from symbolic to statistical in the 1980s, increased processing power was a huge factor in establishing new techniques in the 21st century. Increased processing power revived machine learning from a dying field in the 1990s to the most dominant subfield of computer science today. In the 2010s, research papers showing how to do NLP with deep neural networks shifted the main NLP focus from statistical models to representational models.
A 2014 paper published by Google, focusing on sequence-to-sequence techniques using deep neural networks and long short-term memory, was pivotal in this change. Since then, NLP has evolved far, far beyond the exclusive reach of large companies. Today, open source libraries such as spaCy and NLTK allow us regular folk to do NLP on our own computers.
State of the Art NLP Techniques
Now that we’ve covered what NLP is and briefly gone over the history of it, let’s check out some state of the art NLP techniques. Starting in the early to mid 2010s, we began seeing widespread use of Recurrent Neural Networks and Long Short Term Memory networks. In the late 2010s we saw the rise of Connectionist Temporal Classification, and now in the early 2020s we’re seeing the application of transformer models.
Recurrent Neural Networks (RNNs)
RNNs are just neural networks in which hidden layers (that means nodes not in the input or output layer) “recur” or feed into themselves. The image below shows an example.
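In code, the “feeds into itself” part looks something like the sketch below. It is a bare-bones recurrent step in plain Python using the common formulation h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h). The weights are hand-picked for illustration rather than trained, and the names `rnn_step` and `run_rnn` are made up for this example.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h)."""
    h_new = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(w * xi for w, xi in zip(w_xh[i], x))       # input term
        total += sum(w * hi for w, hi in zip(w_hh[i], h_prev))  # recurrent term
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(inputs, w_xh, w_hh, b_h):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = [0.0] * len(b_h)  # start from an all-zero hidden state
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
        states.append(h)
    return states


# Illustrative, hand-picked weights (assumptions, not trained values).
w_xh = [[0.5, -0.3], [0.1, 0.8]]
w_hh = [[0.2, 0.0], [-0.4, 0.6]]
b_h = [0.0, 0.1]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states = run_rnn(sequence, w_xh, w_hh, b_h)
for t, h in enumerate(states):
    print(t, [round(v, 3) for v in h])
```

The hidden state `h` computed at one step is passed back in at the next step, so each output depends on everything the network has seen so far.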
Long Short Term Memory (LSTM)
LSTMs are a special form of RNN. Like other RNNs, they have feedback connections in addition to feedforward ones. An LSTM uses a “cell” with three gates: an input gate, an output gate, and a forget gate. The gates control what the cell stores, forgets, and exposes, which lets the network process entire sequences of data as a sequence instead of as individual points. Here’s an example image of an LSTM architecture.
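Here is a rough sketch of how the three gates fit together, written as a single LSTM step in plain Python. To keep it readable, the input, hidden state, and cell state are single numbers instead of vectors. That is a simplification, and the parameter values and the name `lstm_step` are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    `params` maps each gate name to a (w_x, w_h, b) tuple.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new info to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # the new info itself
    c = f * c_prev + i * g                      # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c


# Illustrative, hand-picked parameters (assumptions, not trained values).
params = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.6, -0.2, 0.0),
    "output": (0.4, 0.3, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}

h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, params)
    print(round(h, 3), round(c, 3))
```

The cell state `c` is the long-term memory. The forget gate scales what is kept from the previous step, the input gate scales what gets added, and the output gate decides how much of the memory shows up in `h`.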
Connectionist Temporal Classification (CTC)
CTC is a technique that makes it possible to do real time speech recognition. It is typically used as the output layer and scoring function of a recurrent neural network when the alignment between the input and the output labels is unknown. The network produces an output at every time step, and CTC only cares about the resulting sequence of labels (merging repeats and ignoring blank labels). Here’s a picture of how a CTC works.
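The “only cares about the sequence of the labels” part is easiest to see in the decoding step. Below is a minimal sketch in Python of greedy CTC decoding: pick the most likely label for each time step, merge consecutive repeats, then drop the blanks. The blank symbol `_`, the tiny alphabet, and the per-frame probabilities are all made up for illustration, and real systems often use beam search instead of this greedy shortcut.

```python
from itertools import groupby

BLANK = "_"  # stand-in for CTC's special blank label (an assumption here)


def ctc_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blank labels."""
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


def ctc_greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Take the most likely label per frame, then collapse the result."""
    best = [
        alphabet[max(range(len(probs)), key=probs.__getitem__)]
        for probs in frame_probs
    ]
    return "".join(ctc_collapse(best, blank))


# Thirteen frames of per-frame labels collapse to a five-letter word.
print("".join(ctc_collapse(list("hh_e_ll_lloo_"))))  # hello

# Made-up probabilities over the alphabet ["_", "a", "b"] for five frames.
alphabet = ["_", "a", "b"]
frame_probs = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
]
print(ctc_greedy_decode(frame_probs, alphabet))  # aab
```

Notice that the blank between the two runs of “l” is what lets a double letter survive the merge. Without it, “ll” would collapse into a single “l”.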
Transformer Models
Finally, we come to transformer models, the most recent applied development in the NLP space. Transformer models use a technique called “self-attention”. Attention lets the model weigh every word in the text against every other word, so it can process the text within context and not necessarily in order. Here’s a picture of a transformer model from Analytics Vidhya.
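Here is a stripped-down sketch of self-attention in plain Python. It uses the scaled dot-product form (softmax of q·k / sqrt(d)), but with one big simplification: the word vectors are used directly as queries, keys, and values, whereas a real transformer first passes them through learned projection matrices. The three 2-dimensional “word vectors” and the function names are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention over a list of word vectors.

    Simplification: queries, keys, and values are the input vectors
    themselves (no learned projections).
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against every word in the text, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all the word vectors.
        mixed = [
            sum(w * value[j] for w, value in zip(weights, vectors))
            for j in range(d)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Made-up 2-dimensional vectors standing in for three words.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(vectors)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights says how much one word “pays attention” to every other word. Nothing in the computation depends on a word’s position, which is the sense in which the text is processed in context and not necessarily in order.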
Modern Day NLP Applications
Computers fully being able to do natural language processing would be quite the feat. While that is not currently possible, we still use NLP in many modern-day technologies. Some examples of modern-day application of NLP include transcribing speech to text, translating between languages, and text processing.
As a field, speech to text has grown rapidly since the mid 2010s, with many startups creating web APIs in this space. Translating between languages was the original focus of NLP, starting with that 1954 project translating Russian into English, and it continues to be a big part of the field today. Recently, Google came out with a much more robust and accurate model for its translator. Text processing has yet to reach as significant a usage as either of these. However, products like The Text API, which makes text processing techniques like text polarity, detecting similar sentences, and named entity recognition ready to use, aim to change that.
Natural language processing is a huge field that’s been around for about 70 years now and will go on to be around for quite a while longer. Having computers do translation, transcription, and text analysis is extremely valuable, and there’s so much research going on in this field. Who knows what the future of NLP will bring? To learn more, feel free to reach out to me @yujian_tang on Twitter, connect with me on LinkedIn, and join our Discord. Remember to follow the blog to stay updated with cool Python projects and ways to level up your Python skills!
Appendix
- Gerunds are verb forms ending in “-ing” that are used as nouns. Example: “asking” in “do you mind me asking you?”
Examples of English evolving:
- “Lit”, meaning “great”, “amazing”, “cool”
- “Swag”, y’all already know what this means, don’t pretend like you don’t