Cannot Generate Word Cloud - Python

Time:05-23

I am trying to create a wordcloud using the frequencies of words from a Pandas column. I have a dataframe like so:

PageNumber  Top_words_only
1           people trees like instagram ...
2           people yellow like flickrioapp people level water...
...
78          teatree instagram water leith circuits...

I have calculated the word frequencies from the Top_words_only column and put them into a tuple of (word, count) pairs so that wordcloud can turn the data into a visualisation, like so:

tuples = tuple([tuple(x) for x in df.Top_words_only.str.split(expand=True).stack().value_counts().reset_index().values])
print(tuples)

Output:
(('instagram', 3), ('plant', 3), ('shadow', 3), ('rise', 3), .... ('hibs', 1), ('bud', 1), ('insect', 1),
('warriston', 1), ('garage', 1))

from wordcloud import WordCloud
import matplotlib.pyplot as plt

wordcloud = WordCloud()
wordcloud.generate_from_frequencies(tuples)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()

However, this raises an AttributeError:

AttributeError: 'tuple' object has no attribute 'items'

Does anyone know what is wrong with the code I have?

CodePudding user response:

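generate_from_frequencies expects a dictionary mapping each word to its frequency and calls .items() on it, which is why a tuple fails. Use a dictionary instead: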

from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Build a {word: count} dictionary from the column
d = dict([tuple(x) for x in df.Top_words_only.str.split(expand=True).stack().value_counts().reset_index().values])

wordcloud = WordCloud()
wordcloud.generate_from_frequencies(d)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
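
If the tuple of (word, count) pairs from the question already exists, it probably does not need to be recomputed, since dict() accepts any iterable of two-item pairs. A minimal sketch, using a few made-up pairs in place of the real data:

```python
# Illustrative data only: a short stand-in for the tuple built in the question.
tuples = (("instagram", 3), ("plant", 3), ("shadow", 3), ("garage", 1))

# A tuple has no .items() method, which is what the AttributeError reports.
print(hasattr(tuples, "items"))  # False

# dict() accepts any iterable of (key, value) pairs.
frequencies = dict(tuples)
print(hasattr(frequencies, "items"))  # True
print(frequencies["instagram"])       # 3
```

The resulting frequencies dictionary can then be passed to generate_from_frequencies in place of d.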

Alternatively, build the dictionary with collections.Counter:

from collections import Counter

d = Counter(w for x in df['Top_words_only'] for w in x.split())
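
Counter is a subclass of dict, so it has the .items() method that generate_from_frequencies needs. A small sketch, with a plain list of strings standing in for df['Top_words_only'] (the sample rows are made up for illustration):

```python
from collections import Counter

# Made-up rows standing in for df['Top_words_only'].
rows = [
    "people trees like instagram",
    "people yellow like people water",
    "teatree instagram water",
]

# Split each row on whitespace and count every word across all rows.
d = Counter(w for x in rows for w in x.split())

print(isinstance(d, dict))  # True
print(d["people"])          # 3
print(d["instagram"])       # 2
```

Because the counting happens in plain Python, this version skips the intermediate split/stack/value_counts steps in pandas.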