Natural Language Processing Using TextBlob
Introduction to TextBlob
TextBlob is a Python library with a simple API for common Natural Language Processing (NLP) tasks. It is built on top of the NLTK and Pattern libraries but exposes a simpler interface. This simplicity makes TextBlob a good library to start with if you are new to NLP, and a convenient one for experimenting with text analytics in Python. In the following sections, we will get a better understanding of TextBlob and its functionalities.
TextBlob can be easily installed using ‘pip’ by typing the following in the command line.
pip install textblob
You can also use conda to install TextBlob. If you have installed Anaconda, run the following command in the Anaconda Prompt.
conda install -c conda-forge textblob
Some TextBlob features, such as tokenization, POS tagging, and noun phrase extraction, depend on NLTK natural language data sets called corpora. They can be downloaded with the following command.
python -m textblob.download_corpora
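After running the commands above, it can be useful to confirm that the package is visible to your Python interpreter. The following is a minimal sketch that uses only the standard library; the helper name installed_version is made up for this illustration and is not part of TextBlob.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper: report whether textblob is available in this environment.
version = installed_version("textblob")
if version is None:
    print("textblob is not installed; run: pip install textblob")
else:
    print(f"textblob {version} is installed")
```

If the check reports that the package is missing, make sure pip and python point to the same environment.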
Creating a TextBlob
Before we start coding, we need to import the TextBlob package into our Python file.
from textblob import TextBlob
When we are working with TextBlob, our texts will be stored as instances of TextBlob. Let’s create our first TextBlob with a simple paragraph.
firstText=TextBlob("If it is your first step in NLP, TextBlob is the perfect library for you to get hands-on with. The best way to go through this article is to follow along with the code and perform the tasks yourself.")
Once we create our TextBlob, we can try out different TextBlob features using it.
Using the tokenization feature, you can break the text into tokens, which can be either words or sentences, for further analysis. We will use the "words" and "sentences" attributes to tokenize the TextBlob we created earlier.
> firstText.words
Result: WordList(['If', 'it', 'is', 'your', 'first', 'step', 'in', 'NLP', 'TextBlob', 'is', 'the', 'perfect', 'library', 'for', 'you', 'to', 'get', 'hands-on', 'with', 'The', 'best', 'way', 'to', 'go', 'through', 'this', 'article', 'is', 'to', 'follow', 'along', 'with', 'the', 'code', 'and', 'perform', 'the', 'tasks', 'yourself'])
> firstText.sentences
Result: [Sentence("If it is your first step in NLP, TextBlob is the perfect library for you to get hands-on with."), Sentence("The best way to go through this article is to follow along with the code and perform the tasks yourself.")]
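To make the idea of tokenization concrete, here is a rough sketch of a naive tokenizer written with regular expressions. It is not how TextBlob works internally (TextBlob relies on NLTK tokenizers), and the function names naive_words and naive_sentences are made up for this illustration. On this simple paragraph it happens to split the text into the same words and sentences.

```python
import re

text = ("If it is your first step in NLP, TextBlob is the perfect library "
        "for you to get hands-on with. The best way to go through this article "
        "is to follow along with the code and perform the tasks yourself.")


def naive_sentences(text):
    # Split wherever ., ! or ? is followed by whitespace (an assumption that
    # breaks on abbreviations such as "e.g.").
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def naive_words(text):
    # Runs of letters or digits, optionally joined by hyphens ("hands-on").
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


words = naive_words(text)
sentences = naive_sentences(text)
print(len(words), words[:6])
print(len(sentences))
```

Real text contains abbreviations, contractions, and quotations that a pattern like this handles poorly, which is why a library tokenizer is usually the better choice.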
Part Of Speech (POS) Tagging
POS tagging labels each word with its grammatical role (noun, verb, adjective, and so on) in the given context. In TextBlob, it can be done using the tags attribute.
> firstText.tags
Result: [('If', 'IN'), ('it', 'PRP'), ('is', 'VBZ'), ('your', 'PRP$'), ('first', 'JJ'), ('step', 'NN'), ('in', 'IN'), ('NLP', 'NNP'), ('TextBlob', 'NNP'), ('is', 'VBZ'), ('the', 'DT'), ('perfect', 'JJ'), ('library', 'NN'), ('for', 'IN'), ('you', 'PRP'), ('to', 'TO'), ('get', 'VB'), ('hands-on', 'JJ'), ('with', 'IN'), ('The', 'DT'), ('best', 'JJS'), ('way', 'NN'), ('to', 'TO'), ('go', 'VB'), ('through', 'IN'), ('this', 'DT'), ('article', 'NN'), ('is', 'VBZ'), ('to', 'TO'), ('follow', 'VB'), ('along', 'RB'), ('with', 'IN'), ('the', 'DT'), ('code', 'NN'), ('and', 'CC'), ('perform', 'VB'), ('the', 'DT'), ('tasks', 'NNS'), ('yourself', 'PRP')]
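All POS tags are printed in abbreviated form. The abbreviations come from the Penn Treebank tag set; for example, NN is a singular noun, JJ is an adjective, and VBZ is a third-person singular present-tense verb.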
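Because the tags come back as plain (word, tag) pairs, they are easy to filter. The sketch below uses a few pairs copied from the firstText.tags output and groups words by tag prefix; the helper name words_with_prefix is made up for this illustration.

```python
# A few (word, tag) pairs copied from the firstText.tags output.
tags = [
    ("If", "IN"), ("it", "PRP"), ("is", "VBZ"), ("your", "PRP$"),
    ("first", "JJ"), ("step", "NN"), ("in", "IN"), ("NLP", "NNP"),
    ("TextBlob", "NNP"), ("is", "VBZ"), ("the", "DT"), ("perfect", "JJ"),
    ("library", "NN"),
]


def words_with_prefix(tagged, prefix):
    """Return the words whose tag starts with `prefix` (e.g. "NN" covers NN, NNS, NNP)."""
    return [word for word, tag in tagged if tag.startswith(prefix)]


nouns = words_with_prefix(tags, "NN")
adjectives = words_with_prefix(tags, "JJ")
verbs = words_with_prefix(tags, "VB")

print("Nouns:", nouns)
print("Adjectives:", adjectives)
print("Verbs:", verbs)
```

Matching on the prefix rather than the full tag keeps related tags together, so plural and proper nouns are collected along with singular ones.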
Noun phrase extraction
This extracts the noun phrases from the text. It can be done simply by using the noun_phrases attribute in TextBlob.
> firstText.noun_phrases
Result: WordList(['nlp', 'textblob', 'perfect library'])
Words Inflection and Lemmatization
Inflection changes the form of a word, usually by adding characters to its end, to express a different grammatical meaning such as number or tense. For example, we will pluralize a selected word in the TextBlob we created earlier.
Initially, we need to break the paragraph into words using the words attribute of the TextBlob. Each item in the resulting WordList is a Word object.
> firstText.words
WordList(['If', 'it', 'is', 'your', 'first', 'step', 'in', 'NLP', 'TextBlob', 'is', 'the', 'perfect', 'library', 'for', 'you', 'to', 'get', 'hands-on', 'with', 'The', 'best', 'way', 'to', 'go', 'through', 'this', 'article', 'is', 'to', 'follow', 'along', 'with', 'the', 'code', 'and', 'perform', 'the', 'tasks', 'yourself'])
Then we can access each word object and apply the function “pluralize()”
> firstText.words[5].pluralize()
'steps'
Similarly, you can create Word objects directly by importing the Word class from textblob. Let's look at an example using the lemmatize() method, which reduces a word to its base form (lemma).
> from textblob import Word
> word1 = Word("easier")
> word1.lemmatize("a")
'easy'
Note that, through passing the parameter "a", we tell the method to treat the word as an adjective since, by default, all words are considered as nouns.
An n-gram is a sequence of n consecutive words taken from the text. In TextBlob, we get them with the ngrams() method, which returns a list of WordList objects, each holding "n" words (n defaults to 3).
> for ngram in firstText.ngrams(4):
      print(ngram)
Result (first six n-grams):
['If', 'it', 'is', 'your']
['it', 'is', 'your', 'first']
['is', 'your', 'first', 'step']
['your', 'first', 'step', 'in']
['first', 'step', 'in', 'NLP']
['step', 'in', 'NLP', 'TextBlob']
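The sliding-window idea behind n-grams can be sketched in a few lines of plain Python. This is an illustration of the concept, not TextBlob's own implementation, and the function below is a stand-alone helper that only shares its name with the TextBlob method.

```python
def ngrams(words, n=3):
    """Return every run of n consecutive words as a list of lists."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A window of size n can start at positions 0 .. len(words) - n.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


# The first nine words of the paragraph used throughout this article.
words = ["If", "it", "is", "your", "first", "step", "in", "NLP", "TextBlob"]

for gram in ngrams(words, 4):
    print(gram)
```

A list of k words yields k - n + 1 n-grams, so nine words give six 4-grams, and a list shorter than n gives none.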
Sentiment analysis determines the emotion or opinion that a text expresses, and it can be obtained using the sentiment attribute of TextBlob. This returns a named tuple with two values, polarity and subjectivity. Polarity lies in the range -1 to 1, where negative values indicate a negative statement, positive values indicate a positive statement, and values near 0 indicate a neutral one.
Subjectivity lies in the range 0 to 1, where lower values mean the statement is more objective, and higher values mean it is more subjective. Let's create a sample TextBlob with a customer review and obtain its sentiment.
> sampleComment = TextBlob("It feels odd! There’s something wrong with it. Do not order from here.")
> sampleComment.sentiment
Sentiment(polarity=-0.35416666666666663, subjectivity=0.575)
The above values show that the polarity is negative, which means the review is negative, and that the subjectivity sits near the middle of the scale, leaning slightly toward the subjective side.
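In practice, the two numbers are often turned into simple labels. The following is a minimal sketch of one way to do that. The Sentiment tuple here is a stand-in with the same two fields as the result above, and the describe function and its cut-off values (0.1 and 0.5) are arbitrary assumptions for illustration, not part of TextBlob.

```python
from collections import namedtuple

# Stand-in with the same two fields as the sentiment result shown above.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def describe(sentiment, neutral_band=0.1, subjective_cutoff=0.5):
    """Map polarity and subjectivity to labels using assumed thresholds."""
    if sentiment.polarity > neutral_band:
        tone = "positive"
    elif sentiment.polarity < -neutral_band:
        tone = "negative"
    else:
        tone = "neutral"
    style = "subjective" if sentiment.subjectivity >= subjective_cutoff else "objective"
    return tone, style


# Values copied from the customer review example.
review = Sentiment(polarity=-0.35416666666666663, subjectivity=0.575)
print(describe(review))
```

The thresholds should be tuned on your own data, since a suitable neutral band depends on the kind of text being analyzed.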
TextBlob is an excellent library to learn if you are a beginner to NLP, and it is becoming widely popular in the field of Data Science. This article shows how TextBlob can be useful to implement different functionalities of NLP using its straightforward API.
Apart from the functionalities we have mentioned, TextBlob offers several other features, such as spelling correction, parsing, word and phrase frequency counts, and text classification, to name a few. Older releases also included language detection and translation. Considering all these, learning TextBlob is a good stepping stone to learning NLP, and it could be the foundation for building more complex systems such as chatbots, machine translators, and advanced search engines.