I am still new to Python/Jupyter. I have an Excel file that I have imported into Python with two columns: one is a binary 1/0 label (1 for spam, 0 for non-spam) and the other is the text message. I am trying to create two word clouds, one for spam and one for non-spam. How can I separate my texts into spam and non-spam? Screenshot of my spreadsheet for clarity
CodePudding user response:
Sort by the binary column, either ascending (0 first) or descending (1 first), so that the spam and non-spam rows are grouped together.
Once done, save the Excel file, then import it as you usually do.
CodePudding user response:
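Assume your CSV file is called test.csv and has two columns called 'text' and 'label' (for an Excel file, pd.read_excel can be used in place of pd.read_csv).
import pandas as pd
df = pd.read_csv("test.csv")
# Boolean masks keep only the rows whose label matches
df_spam = df[df["label"] == 1]
df_no_spam = df[df["label"] == 0]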
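A word cloud is usually built from one long string, so after filtering it may help to join each group's messages together. The following is only a minimal sketch: the sample rows, the column names 'label' and 'text', and the helper joined_text are all made up for illustration and stand in for the real spreadsheet.

```python
import pandas as pd

# Hypothetical sample data standing in for the imported spreadsheet;
# the column names 'label' and 'text' are assumptions.
df = pd.DataFrame({
    "label": [1, 0, 1, 0],
    "text": [
        "win a free prize now",
        "see you at lunch",
        "free cash offer",
        "meeting moved to noon",
    ],
})

def joined_text(frame, label_value, label_col="label", text_col="text"):
    """Return all messages with the given label joined into one string."""
    messages = frame.loc[frame[label_col] == label_value, text_col]
    # astype(str) guards against non-string cells such as numbers or NaN
    return " ".join(messages.astype(str))

spam_text = joined_text(df, 1)      # every spam message, space-separated
no_spam_text = joined_text(df, 0)   # every non-spam message, space-separated
```

If the third-party wordcloud package is installed, each string can then be passed to WordCloud().generate(...), for example WordCloud().generate(spam_text), to get one cloud per group.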