Hi guys, I have a problem. I wrote a Twitter scraper for my thesis in order to collect some texts and hashtags to process. The problem is the following: in the hashtag column, every row looks like this:
['covid19', 'croazia', 'slovenia']
Now, in order to cluster this text data, I want to join the hashtags in each row into a single string, to get something like this:
covid19 croazia slovenia
Since these hashtags are in a pandas column called "Hashtag", I used this line of code to do that:
df["Hashtag_united"] = df["Hashtag"].apply(lambda x: " ".join(x))
But this did not give me the rows I expected. Instead I got:
[ ' c o v i d 1 9 ' , ' c r o a z i a ' , ' s l o v e n i a ' ]
What do I have to do to get what I want? Thank you for your time. I apologize for the stupid question. Have a good day!
Answer:
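Since the values in your Hashtag column are strings such as "['covid19', 'croazia', 'slovenia']" rather than actual lists, " ".join(x) iterates over the characters of the string, which is why you get a space between every character. You can parse each string into a list first: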
import ast
df["Hashtag_united"] = df["Hashtag"].apply(lambda x: " ".join(ast.literal_eval(x)))
The ast.literal_eval(x) call parses the string representation of the list into an actual list of strings, and " ".join(...) then joins its elements into a single space-separated string.
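To see the difference without a DataFrame, here is a minimal sketch using only the standard library. The helper name join_hashtags is hypothetical (it does not come from the question), and it assumes each cell is either a real list of strings or the string representation of one.

```python
import ast


def join_hashtags(value):
    """Return the hashtags in `value` as one space-separated string.

    Hypothetical helper: assumes `value` is a list of strings or the
    string representation of such a list.
    """
    # A cell read back from CSV is usually a string like "['a', 'b']",
    # so parse it into a real list before joining.
    if isinstance(value, str):
        value = ast.literal_eval(value)
    return " ".join(value)


raw = "['covid19', 'croazia', 'slovenia']"

# Joining the raw string iterates over its characters, which
# reproduces the spaced-out output from the question.
spaced = " ".join(raw)

# Parsing first gives the intended result.
joined = join_hashtags(raw)

print(spaced)
print(joined)  # covid19 croazia slovenia
```

With pandas, the same helper could be applied as df["Hashtag_united"] = df["Hashtag"].apply(join_hashtags), which would also work if some rows already contain real lists.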