Natural Language Processing Using TextBlob

Introduction to TextBlob

TextBlob is a Python library that offers a simple API for common Natural Language Processing (NLP) tasks. It is built on top of the NLTK and Pattern libraries but exposes a simpler interface. That simplicity makes TextBlob a good library to start with if you are new to NLP, and a convenient one for experimenting with text analytics in Python. In the following sections, we will look at TextBlob and its main features.

Installing TextBlob

TextBlob can be easily installed using ‘pip’ by typing the following in the command line. 

pip install textblob

You can also use conda to install TextBlob. If you have installed Anaconda, run the following command in the Anaconda Prompt.

conda install -c conda-forge textblob

Many TextBlob features, such as tokenization, POS tagging, and noun phrase extraction, rely on NLTK natural language data sets called corpora. They can be downloaded with the following command.

python -m textblob.download_corpora
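
To confirm that the installation worked before going further, a quick check like the one below can help. It is a minimal sketch that uses only the standard library; the helper name textblob_installed is made up for this example and is not part of TextBlob.

```python
import importlib.util


def textblob_installed() -> bool:
    """Return True if the interpreter can locate the textblob package."""
    # find_spec returns None when a top-level package cannot be found.
    return importlib.util.find_spec("textblob") is not None


if textblob_installed():
    print("TextBlob is available.")
else:
    print("TextBlob not found; run: pip install textblob")
```

Note that this only checks for the package itself. It does not verify that the corpora have been downloaded.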

Creating a TextBlob

Before we start coding, we need to import the TextBlob class from the textblob package into our Python file.

from textblob import TextBlob

When we are working with TextBlob, our texts will be stored as instances of the TextBlob class. Let’s create our first TextBlob with a simple paragraph.

firstText=TextBlob("If it is your first step in NLP, TextBlob is the perfect library for you to get hands-on with. The best way to go through this article is to follow along with the code and perform the tasks yourself.")

Once we create our TextBlob, we can try out different TextBlob features using it. 

Tokenization

Tokenization breaks the text into tokens, which can be either words or sentences, for further analysis. We will use the "words" and "sentences" attributes to tokenize the TextBlob we created earlier.

>>> firstText.words
 
Result:
 
WordList(['If', 'it', 'is', 'your', 'first', 'step', 'in', 'NLP', 'TextBlob', 'is', 'the', 'perfect', 'library', 'for', 'you', 'to', 'get', 'hands-on', 'with', 'The', 'best', 'way', 'to', 'go', 'through', 'this', 'article', 'is', 'to', 'follow', 'along', 'with', 'the', 'code', 'and', 'perform', 'the', 'tasks', 'yourself'])
 
>>> firstText.sentences
 
Result:
[Sentence("If it is your first step in NLP, TextBlob is the perfect library for you to get hands-on with."), Sentence("The best way to go through this article is to follow along with the code and perform the tasks yourself.")]
 
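Once the text is tokenized, the tokens behave like ordinary Python strings, so standard tools work on them. As a small illustration, the sketch below counts word frequencies with collections.Counter. To stay runnable without TextBlob, it uses a plain list copied from the first sentence of the words output above; with TextBlob installed, firstText.words could be passed in instead.

```python
from collections import Counter

# Tokens copied from the first sentence of the firstText.words output above.
words = ['If', 'it', 'is', 'your', 'first', 'step', 'in', 'NLP', 'TextBlob',
         'is', 'the', 'perfect', 'library', 'for', 'you', 'to', 'get',
         'hands-on', 'with']

# Lowercase every token so that, for example, 'The' and 'the' share one count.
counts = Counter(word.lower() for word in words)

print(counts['is'])            # how often 'is' occurs in this sentence
print(counts.most_common(3))   # the three most frequent tokens
```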

Part Of Speech (POS) Tagging

POS tagging labels each word with a tag that identifies its grammatical function in the given context. In TextBlob, it can be done using the tags attribute.

>>> firstText.tags
Result:
[('If', 'IN'), ('it', 'PRP'), ('is', 'VBZ'), ('your', 'PRP$'), ('first', 'JJ'), ('step', 'NN'), ('in', 'IN'), ('NLP', 'NNP'), ('TextBlob', 'NNP'), ('is', 'VBZ'), ('the', 'DT'), ('perfect', 'JJ'), ('library', 'NN'), ('for', 'IN'), ('you', 'PRP'), ('to', 'TO'), ('get', 'VB'), ('hands-on', 'JJ'), ('with', 'IN'), ('The', 'DT'), ('best', 'JJS'), ('way', 'NN'), ('to', 'TO'), ('go', 'VB'), ('through', 'IN'), ('this', 'DT'), ('article', 'NN'), ('is', 'VBZ'), ('to', 'TO'), ('follow', 'VB'), ('along', 'RB'), ('with', 'IN'), ('the', 'DT'), ('code', 'NN'), ('and', 'CC'), ('perform', 'VB'), ('the', 'DT'), ('tasks', 'NNS'), ('yourself', 'PRP')]

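All POS tags are printed in abbreviated form. The abbreviations follow the Penn Treebank tag set (for example, NN is a singular noun and VBZ is a third-person singular present verb), and you can refer to that tag set to see the full form of each tag.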
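
Because tags is a list of (word, tag) pairs, it can be filtered with a plain list comprehension. The sketch below picks out nouns and adjectives. It assumes the Penn Treebank convention that noun tags start with "NN" and adjective tags start with "JJ", and it uses pairs copied from the start of the output above so that it runs without TextBlob.

```python
# (word, tag) pairs copied from the start of the firstText.tags output above.
tags = [('If', 'IN'), ('it', 'PRP'), ('is', 'VBZ'), ('your', 'PRP$'),
        ('first', 'JJ'), ('step', 'NN'), ('in', 'IN'), ('NLP', 'NNP'),
        ('TextBlob', 'NNP'), ('is', 'VBZ'), ('the', 'DT'), ('perfect', 'JJ'),
        ('library', 'NN')]

# Noun tags (NN, NNS, NNP, NNPS) all begin with "NN".
nouns = [word for word, tag in tags if tag.startswith('NN')]

# Adjective tags (JJ, JJR, JJS) all begin with "JJ".
adjectives = [word for word, tag in tags if tag.startswith('JJ')]

print(nouns)
print(adjectives)
```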

Noun Phrase Extraction

Noun phrase extraction pulls out the phrases in the text that are built around a noun. In TextBlob, this is done using the noun_phrases attribute.

>>> firstText.noun_phrases
Result:
WordList(['nlp', 'textblob', 'perfect library'])

Word Inflection and Lemmatization

Inflection changes the form of a word, typically by adding characters to its end, to express a grammatical property such as number or tense. As an example, we will pluralize a selected word in the TextBlob we created earlier.

First, we need to break the paragraph into words using the words attribute. Each item in the resulting WordList is a Word object.

>>> firstText.words
WordList(['If', 'it', 'is', 'your', 'first', 'step', 'in', 'NLP', 'TextBlob', 'is', 'the', 'perfect', 'library', 'for', 'you', 'to', 'get', 'hands-on', 'with', 'The', 'best', 'way', 'to', 'go', 'through', 'this', 'article', 'is', 'to', 'follow', 'along', 'with', 'the', 'code', 'and', 'perform', 'the', 'tasks', 'yourself'])
 

Then we can access an individual Word object by its index and call its pluralize() method. Here, index 5 is the word 'step'.

>>> firstText.words[5].pluralize()
'steps'

You can also create Word objects directly by importing the Word class from textblob. Let's look at an example using the lemmatize() method, which reduces a word to its root form.

>>> from textblob import Word
>>> word1=Word("easier")
>>> word1.lemmatize("a")
'easy'

Note that by passing the parameter "a", we tell the method to treat the word as an adjective. By default, all words are treated as nouns.

n-Grams

An n-gram is a sequence of n consecutive words from the text. In TextBlob, the ngrams() method returns a list of WordList objects, each containing n words; the value of n is passed as an argument.

>>> for ngram in firstText.ngrams(4):
...     print(ngram)
...

Result (first six lines of the output):
['If', 'it', 'is', 'your']
['it', 'is', 'your', 'first']
['is', 'your', 'first', 'step']
['your', 'first', 'step', 'in']
['first', 'step', 'in', 'NLP']
['step', 'in', 'NLP', 'TextBlob']
  
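To make the sliding-window idea behind n-grams concrete, here is a minimal plain-Python sketch. It is not TextBlob's implementation; the function name sliding_ngrams is made up for this illustration, and it works on an ordinary list of words.

```python
def sliding_ngrams(words, n):
    """Return every run of n consecutive words as a list."""
    if n <= 0:
        return []
    # The window starts at each index that still leaves room for n words.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


# First seven tokens of the example text used in this article.
words = ['If', 'it', 'is', 'your', 'first', 'step', 'in']

for gram in sliding_ngrams(words, 4):
    print(gram)
```

A list of k words yields k - n + 1 n-grams, so the seven words here produce four 4-grams, and a list shorter than n produces none.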

Sentiment Analysis

Sentiment analysis determines the emotion or opinion that a text expresses, and it can be obtained using the sentiment attribute of TextBlob. This returns a named tuple of two values called polarity and subjectivity. Polarity lies in the range -1 to 1, where negative values indicate a negative statement and positive values indicate a positive statement.

Subjectivity lies in the range 0 to 1, where lower values mean the statement is more objective and higher values mean it is more subjective. Let's create a sample TextBlob with a customer review and obtain its sentiment.

>>> sampleComment = TextBlob("It feels odd! There’s something wrong with it. Do not order from here.")
>>> sampleComment.sentiment
Sentiment(polarity=-0.35416666666666663, subjectivity=0.575)
  

The above values show that the polarity is negative, which means the review reads as negative. The subjectivity of 0.575 sits just above the middle of the range, so the text is scored as moderately subjective.
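
In practice, the two scores are often turned into rough labels. The sketch below shows one way to do that. The function name describe_sentiment, the neutral band of 0.1 around zero, and the 0.5 subjectivity cut-off are all illustrative assumptions, not values defined by TextBlob.

```python
def describe_sentiment(polarity, subjectivity, neutral_band=0.1):
    """Map polarity and subjectivity scores to rough text labels.

    neutral_band and the 0.5 subjectivity cut-off are illustrative
    assumptions chosen for this example.
    """
    if polarity > neutral_band:
        tone = "positive"
    elif polarity < -neutral_band:
        tone = "negative"
    else:
        tone = "neutral"
    stance = "subjective" if subjectivity >= 0.5 else "objective"
    return tone, stance


# Scores copied from the sampleComment.sentiment result above.
print(describe_sentiment(-0.35416666666666663, 0.575))
```

With TextBlob installed, the scores can be read from the result as sampleComment.sentiment.polarity and sampleComment.sentiment.subjectivity and passed to the function.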

Conclusion

TextBlob is an excellent library to learn if you are a beginner in NLP, and it is becoming popular in the field of data science. This article showed how TextBlob's straightforward API can be used to implement different NLP functionalities.

Apart from the functionalities we have covered, TextBlob offers other features, such as spelling correction, text classification, parsing, and WordNet integration, to name a few. Language detection and translation were available in older releases but have since been deprecated, so check the documentation for the version you have installed. Considering all these, learning TextBlob is a good stepping stone to learning NLP, and it can serve as a foundation for building more complex systems such as chatbots, machine translators, and advanced search engines.