I have done the following
import nltk
words = nltk.corpus.brown.words()
freq = nltk.FreqDist(words)
and I am able to find the frequency of certain words in the Brown corpus, like
freq["the"]
62713
But now I want to be able to find the Frequency Distribution of specific bigrams. So then I tried
bigrams = nltk.bigrams(words)
freqbig = nltk.FreqDist(bigrams)
But for every bigram that I enter, I always get 0. For example,
freqbig["the man"]
0
What am I doing wrong?
CodePudding user response:
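nltk.bigrams yields each bigram as a tuple of two words, so freqbig expects a tuple as its key, not a str. A key that is not present simply counts as 0, which is why "the man" returns 0 instead of raising an error: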
freqbig[("the", "man")]
OUTPUT
128
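To see why the string lookup returns 0, here is a minimal sketch on a toy token list. It uses collections.Counter and zip as stand-ins for nltk.FreqDist and nltk.bigrams (FreqDist is a Counter subclass, and zipping the list against itself shifted by one gives the same adjacent pairs), so it runs without downloading the corpus. The names tokens and bigram_counts are made up for illustration.

```python
from collections import Counter

# Toy token list standing in for nltk.corpus.brown.words() (assumption for illustration).
tokens = ["the", "man", "saw", "the", "man", "run"]

# zip(tokens, tokens[1:]) yields the same adjacent pairs that nltk.bigrams(tokens) would.
bigram_counts = Counter(zip(tokens, tokens[1:]))

print(bigram_counts[("the", "man")])  # 2
print(bigram_counts["the man"])       # 0: a str never equals a tuple key
print(list(bigram_counts)[:2])        # [('the', 'man'), ('man', 'saw')]
```

Printing a few keys like this is a quick way to check what shape of key a frequency table expects before querying it.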
If you want to pass strings, you could create an auxiliary function which takes care of it:
def get_frequency(my_string):
    # Turn "the man" into ("the", "man") so it matches the tuple keys.
    return freqbig[tuple(my_string.split())]
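Another option, if string lookups are the common case, is to build the table with string keys in the first place by joining each pair before counting. This is only a sketch on toy data, again with collections.Counter and zip standing in for nltk.FreqDist and nltk.bigrams; string_keyed is a hypothetical name.

```python
from collections import Counter

# Toy data, not the Brown corpus (assumption for illustration).
tokens = ["the", "man", "saw", "the", "man", "run"]

# Join each adjacent pair into one string so the keys are "w1 w2" instead of (w1, w2).
string_keyed = Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))

print(string_keyed["the man"])  # 2
```

The trade-off is that the tuple keys are gone, so ("the", "man") would now be the lookup that returns 0.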