word cloud counting double words instead of one

Time:12-28

I am trying to build a word cloud in Python with the wordcloud package and matplotlib. Instead of counting single words like "will", it is counting two-word phrases like "i will". I have looked at the word cloud documentation and there doesn't seem to be anything that produces this. Could my input be causing it?

My code looks like the following:

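import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

fields = ['comments']

# Load only the comments column from the CSV file
text = pd.read_csv('comments.csv', usecols=fields)

# Extend the package's default stopwords with custom entries
stopwords = ["https", "RT"] + list(STOPWORDS)

print(' '.join(text['comments'].tolist()))

wordcloud = WordCloud(stopwords=stopwords, background_color="white").generate(' '.join(text['comments'].tolist()))

plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()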

CodePudding user response:

The documentation (https://amueller.github.io/word_cloud/generated/wordcloud.WordCloud.html) states:

collocations: bool, default=True

Whether to include collocations (bigrams) of two words. Ignored if using generate_from_frequencies.

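Because collocations defaults to True, two-word phrases such as "i will" can appear in the cloud. Passing collocations=False in the parameters to WordCloud should restrict it to single words.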
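
To make the difference concrete, here is a minimal standard-library sketch of counting single words versus adjacent word pairs (bigrams). It is only an illustration of the idea: the helper names and the sample sentence are made up, and the wordcloud package does its own tokenizing and filtering, so its counts will not match this exactly.

```python
from collections import Counter


def count_unigrams(text):
    """Count each lowercased word on its own (hypothetical helper)."""
    return Counter(text.lower().split())


def count_bigrams(text):
    """Count each pair of adjacent lowercased words (hypothetical helper)."""
    words = text.lower().split()
    return Counter(" ".join(pair) for pair in zip(words, words[1:]))


# Made-up sample text, not taken from the asker's data
sample = "I will go and I will stay"

unigrams = count_unigrams(sample)
bigrams = count_bigrams(sample)

print(unigrams["will"])    # occurrences of the single word "will"
print(bigrams["i will"])   # occurrences of the pair "i will"
```

In the question's code, the single-word behavior would correspond to calling WordCloud(stopwords=stopwords, background_color="white", collocations=False) before generate.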
