My dataset has 10 columns, one of which holds text as lists of strings.
Dataset:
Col1 Col2 Col3 Text
... ... ... ['I','have', 'a','dream']
... ... ... ['My', 'mom', 'is','Spanish']
The code
wordcloud = WordCloud(stopwords=stopwords, max_font_size=50, max_words=100, background_color="white").generate(' '.join(df['Text']))
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
raises the following error:
TypeError: sequence item 0: expected str instance, list found
It is clear that join expects strings, not lists. How can I convert the lists in the Text column into strings?
CodePudding user response:
You can first concatenate the lists in the column df['Text'] with .sum(), then join:
combined_text = ' '.join(df['Text'].sum())
wordcloud = (
    WordCloud(stopwords=stopwords,
              max_font_size=50,
              max_words=100,
              background_color="white")
    .generate(combined_text)
)
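On a large column, summing lists may get slow, because each addition builds a new list. A minimal sketch of a single-pass alternative, assuming every row holds a list of strings; `flatten_to_text` is a hypothetical helper name, and the plain list `rows` stands in for df['Text']:

```python
from itertools import chain


def flatten_to_text(rows):
    """Join every token from an iterable of token lists into one string."""
    # chain.from_iterable walks each inner list once, without building
    # intermediate lists the way repeated list addition does.
    return ' '.join(chain.from_iterable(rows))


# Stand-in for df['Text']; a pandas Series of lists can be passed the same way,
# since iterating a Series yields its values.
rows = [['I', 'have', 'a', 'dream'], ['My', 'mom', 'is', 'Spanish']]
combined_text = flatten_to_text(rows)
print(combined_text)  # I have a dream My mom is Spanish
```

The resulting string can then be passed to .generate(combined_text) exactly as above.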
CodePudding user response:
Since the values in that column are lists, try exploding them first:
wordcloud = (WordCloud(stopwords=stopwords,
                       max_font_size=50,
                       max_words=100,
                       background_color="white")
             .generate(' '.join(df['Text'].explode())))
Or join them first:
wordcloud = (WordCloud(stopwords=stopwords,
                       max_font_size=50,
                       max_words=100,
                       background_color="white")
             .generate(' '.join(df['Text'].agg(' '.join))))
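Both variants assume every row is a list of strings. If some rows might be missing (NaN, for instance), joining them would fail, so a guard may help. A rough sketch under that assumption; `join_token_lists` is a hypothetical helper name, and the plain list `rows` stands in for df['Text']:

```python
def join_token_lists(rows):
    """Join each row's tokens, then join the rows, skipping non-list rows."""
    # Rows that are not lists (e.g. NaN for missing text) are ignored
    # instead of raising a TypeError inside ' '.join.
    row_texts = [' '.join(tokens) for tokens in rows if isinstance(tokens, list)]
    return ' '.join(row_texts)


# Stand-in for df['Text'], with one missing row in the middle.
rows = [['I', 'have', 'a', 'dream'], float('nan'), ['My', 'mom', 'is', 'Spanish']]
combined_text = join_token_lists(rows)
print(combined_text)  # I have a dream My mom is Spanish
```

Skipping silently is a design choice; if missing rows point to a data problem, raising an error may be more appropriate.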