Home > Software design >  Convert column with more than one string into rows with Pandas
Convert column with more than one string into rows with Pandas

Time:08-14

I'm collecting some data from Google Sheets, but I need to convert one column with more than one string into a row. Each cell can have more than one string, and each string I want to put in one row

I have this dataframe

import pandas as pd
df = pd.DataFrame({'Name': ['Bob', 'John', 'Ric'], 
                           'Submitted At': ['2022/08/12 23:56:42', '2022/08/12 23:56:42', '2022/08/12 23:56:42'], 
                           'Class': ['Math,English,History', 'English,History', 'Math, Chemistry']})
    
print(df)
   Name         Submitted At                 Class
0   Bob  2022/08/12 23:56:42  Math,English,History
1  John  2022/08/12 23:56:42       English,History
2   Ric  2022/08/12 23:56:42       Math, Chemistry

I want to transform it into this

final

I'm trying to do this, but it did not work

df2 = df.join(df['Class'].str.split(',', expand=True).add_prefix('Class'))

print(df2.melt(id_vars=['Submitted At', 'Name'], var_name=df2.iloc[:, 2:6]))

ValueError: Unable to coerce to Series, length must be 4: given 2

Could you help me please?

CodePudding user response:

You can try explode the split list Class column

out = (df.assign(Class=df['Class'].str.split(','))
       .explode('Class', ignore_index=True))
print(out)

   Name         Submitted At       Class
0   Bob  2022/08/12 23:56:42        Math
1   Bob  2022/08/12 23:56:42     English
2   Bob  2022/08/12 23:56:42     History
3  John  2022/08/12 23:56:42     English
4  John  2022/08/12 23:56:42     History
5   Ric  2022/08/12 23:56:42        Math
6   Ric  2022/08/12 23:56:42   Chemistry
  • Related