We can count bigrams in NLTK using nltk.FreqDist(). First, we tokenize the raw text and convert the tokens into bigrams. Then we pass the bigrams to nltk.FreqDist(). In this article, we will implement the solution step by step.
Count bigrams in nltk (Stepwise) –
This is a multi-step process. We will explain each step one by one.
Step 1: Importing the packages-
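To count bigrams in NLTK, we need the nltk Python package and the 'punkt' tokenizer data that nltk.word_tokenize() relies on. Recent NLTK releases may ask for a differently named resource (for example 'punkt_tab'). If nltk.word_tokenize() raises a LookupError, download the resource named in the error message.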
import nltk
nltk.download('punkt')
Step 2: Tokenize the input text-
In this step, we will define the input text and then tokenize it.
text = "This is the best place to learn Data Science Learner"
tokens = nltk.word_tokenize(text)
The nltk.word_tokenize() function splits the text into a list of tokens.
Step 3: Generate the Bigrams –
In this step, we will generate the bigram pairs from the tokens. Here is the code for extracting bigram pairs from the tokens.
bigrams = nltk.bigrams(tokens)
The nltk.bigrams() function creates the bigrams from the tokens we created in the previous step. It returns a generator that yields each pair of adjacent tokens as a tuple.
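To make the idea of a bigram concrete, here is a minimal plain-Python sketch that pairs each token with the one that follows it. It is only an illustration of the kind of output to expect, not NLTK's own implementation, and the short token list is a made-up example.

```python
# A hand-written token list (assumption: stands in for the output of Step 2).
tokens = ["This", "is", "the", "best", "place"]

# Pair every token with its successor; n tokens give n - 1 bigrams.
pairs = list(zip(tokens, tokens[1:]))

for pair in pairs:
    print(pair)
```

Each printed tuple is one bigram, so a list of five tokens produces four pairs.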
Step 4: Counting the Bigrams-
In the above steps, we extracted the bigrams from the text in the form of a generator. Now in this section, we will pass that generator to nltk.FreqDist() to count how often each bigram occurs.
frequency = nltk.FreqDist(bigrams)
for key, value in frequency.items():
    print(key, value)
Once we have the frequencies, we can iterate over the key-value pairs, where each key is a bigram and each value is its count.
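One detail worth knowing: nltk.bigrams() returns a generator, so it can be consumed only once. The sketch below shows this, and also uses most_common(), which FreqDist inherits from collections.Counter, to list the most frequent bigrams first. The sample sentence is made up, and str.split() is used here as an assumption in place of nltk.word_tokenize() so the example runs without downloaded tokenizer data.

```python
import nltk

# Assumption: whitespace splitting stands in for nltk.word_tokenize().
tokens = "the cat sat on the mat and the cat slept".split()

# The generator is exhausted after FreqDist has read it once.
bigram_generator = nltk.bigrams(tokens)
frequency = nltk.FreqDist(bigram_generator)
leftover = list(bigram_generator)  # nothing is left to iterate over

# Store the bigrams in a list if you need to reuse them later.
bigram_list = list(nltk.bigrams(tokens))

# most_common(n) returns the n most frequent bigrams with their counts.
top = frequency.most_common(1)
print(top)
print(frequency[("the", "cat")])
```

If a second pass over the bigrams returns nothing, an exhausted generator is the likely cause.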
Complete Code –
Let’s combine the code pieces from each step. Now run the consolidated code.
import nltk
nltk.download('punkt')
text = "This is the best place to learn Data Science Learner"
tokens = nltk.word_tokenize(text)
bigrams = nltk.bigrams(tokens)
frequency = nltk.FreqDist(bigrams)
for key, value in frequency.items():
    print(key, value)
The above code counts the occurrences of each bigram in the corpus. We have used a very small corpus here, in which every bigram occurs only once. You may replace it with a bigger one.
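For a bigger corpus, one possible approach is to update a single FreqDist line by line instead of building one long token list. The following is a rough sketch under a few assumptions: count_bigrams is a hypothetical helper name, the sample lines are invented, and lowercasing plus str.split() replaces nltk.word_tokenize() so the example needs no downloaded data. Note that counting per line means bigrams never span a line break.

```python
import nltk


def count_bigrams(lines):
    """Accumulate bigram counts over an iterable of text lines (hypothetical helper)."""
    frequency = nltk.FreqDist()
    for line in lines:
        # Assumption: lowercase + whitespace split instead of nltk.word_tokenize().
        tokens = line.lower().split()
        # update() adds the counts of this line's bigrams to the running totals.
        frequency.update(nltk.bigrams(tokens))
    return frequency


# Made-up sample lines; in practice this could be an open file object.
sample_lines = [
    "Data Science is fun",
    "Learn data science here",
    "data science learner",
]

frequency = count_bigrams(sample_lines)
for pair, count in frequency.most_common(3):
    print(pair, count)
```

With repeated phrases in the input, the counts rise above one, which makes the frequency distribution more informative than in the single-sentence example.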
There are various ways to count bigrams in raw text, and we have implemented one of the simplest. Most of the steps are self-explanatory, but we have still tried to explain each one. If you have any doubt related to this topic or article, please let us know in the comment box below.
Data Science Learner Team